This was a very busy and fun semester. I worked on two main projects. The first was with the Wine Agent team and Evan Patton, continuing what I started last semester: a web interface that lets restaurants and wineries enter data about their wines and food. This semester I got a lot of work done on the backend of the system. The platform I’m using is Ruby on Rails, which uses the ActiveRecord database model to represent objects. I had to create my own object model backed by a triple store instead of a regular relational database. Evan helped set up a test Joseki triple store for me to test against. I created a SPARQL 1.1 Update module in Ruby that can send INSERT and DELETE queries to a SPARQL endpoint through simple POST requests. Then I had to convert my ActiveRecord Wine model into a custom triple store model. Most of the time was spent creating the right object members so that the Rails framework treats my Wine model as an ActiveRecord model, while in reality it is using HTTP requests and SPARQL queries to pull and push all of its data from the Joseki triple store. The cool part of the work was taking Ruby object members and automatically converting them into parts of a SPARQL query, based on some prior metadata about what they represented.
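The attribute-to-SPARQL conversion described above can be sketched roughly as follows. This is a minimal illustration and not the actual module: the `http://example.org/` namespace, the predicate names, and every method name here are assumptions made up for the example, and the `update` form parameter follows the SPARQL 1.1 Protocol, so a particular Joseki setup may expect something different.

```ruby
require "net/http"
require "uri"

# Hypothetical metadata: maps each Ruby attribute to an RDF predicate.
# The namespace and property names are invented for this sketch.
WINE_NS = "http://example.org/wine#"
PREDICATES = {
  name:    "#{WINE_NS}hasName",
  vintage: "#{WINE_NS}hasVintage",
  maker:   "#{WINE_NS}hasMaker"
}.freeze

# Attributes whose values are URIs rather than literals (assumed).
URI_ATTRIBUTES = [:maker].freeze

# Escape backslashes, quotes and newlines for a SPARQL string literal.
def escape_literal(value)
  value.to_s
       .gsub("\\") { "\\\\" }
       .gsub('"') { '\\"' }
       .gsub("\n") { "\\n" }
end

# Render one attribute value as a SPARQL term, using the metadata above.
def to_term(attribute, value)
  if URI_ATTRIBUTES.include?(attribute)
    "<#{value}>"
  elsif value.is_a?(Integer)
    value.to_s
  else
    "\"#{escape_literal(value)}\""
  end
end

# Turn a hash of object members into one triple per attribute.
def triples_for(subject_uri, attributes)
  attributes.map do |attribute, value|
    "<#{subject_uri}> <#{PREDICATES.fetch(attribute)}> #{to_term(attribute, value)} ."
  end
end

# Wrap the triples in a SPARQL 1.1 Update INSERT DATA operation.
def insert_query(subject_uri, attributes)
  "INSERT DATA {\n  #{triples_for(subject_uri, attributes).join("\n  ")}\n}"
end

# POST the update as a URL-encoded form field named "update".
def send_update(endpoint_url, query)
  Net::HTTP.post_form(URI.parse(endpoint_url), "update" => query)
end
```

Calling `insert_query("http://example.org/wine/1", { name: "Merlot", vintage: 2009 })` builds the query text without touching the network, which keeps the generation step easy to test; `send_update` would then be pointed at whatever update URL the test store exposes.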
When I had that set up and finished, I was amazed at the simple fact that this new Wine model was no longer constrained by a regular database! Its entities are subgraphs on the triple store, which means that even after an entity is created it can have extra metadata annotated onto it, and onto all other entities in the same fashion, with simple SPARQL queries. All the people who have their URIs attached to the wines are also on the semantic graph. This means that if you find a person, you can immediately find all the wines they own, all in the same graph space! There’s no need to traverse multiple database tables; it’s all elegantly stored through semantic web technologies! After I got the upload to the triple store working correctly, I moved on to getting editing and deletion working. Deletion was easy because I already had code to automatically generate a SPARQL query for a Wine entity; all I had to do was shift it to a different query structure based on DELETE from the SPARQL 1.1 Update specification. Editing was a little more troublesome because Rails had odd routing issues when trying to access specific entities. I eventually just made each Wine auto-generate a unique URI for access on the website, based on the Wine’s own URI. After that, editing ran but did not yet behave correctly, because I had not put enough validations on the data. Although it is the end of the semester, I will still be working on this project over winter break and will hopefully finish it. All that is left is the correct Wine validations and then mirroring the Wine model into a Dish model with its own forms. The Dish model should probably be easier, since its vocabulary in the RDF file puts looser restrictions on the Dish entity. The Dish form might cause a little trouble because it has a tree-like hierarchy that represents a menu, but that’s just more web design that I need to pick up.
After the Wine and Dish models are set, I just have to make the overall site much prettier and the project should be finished!
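The deletion step and the per-Wine access URI mentioned above might look something like this. It is only a sketch under assumptions: `update_query` and `route_key` are hypothetical names, the example URIs are made up, and hashing the Wine's URI is just one way to get a URL-safe token, not necessarily the scheme the site uses.

```ruby
require "digest"

# Wrap already-rendered triples in a SPARQL 1.1 Update operation.
# Uploading and deleting an entity differ only in the keyword.
def update_query(operation, triples)
  keyword = { insert: "INSERT DATA", delete: "DELETE DATA" }.fetch(operation)
  "#{keyword} {\n  #{triples.join("\n  ")}\n}"
end

# Hypothetical route key: a short hex token derived from the Wine's URI,
# so the site URL contains no slashes, dots or colons.
def route_key(wine_uri)
  Digest::SHA1.hexdigest(wine_uri)[0, 12]
end

# Made-up example entity.
wine_uri = "http://example.org/wine/1"
wine_triples = [
  "<#{wine_uri}> <http://example.org/wine#hasName> \"Merlot\" ."
]

puts update_query(:delete, wine_triples)
puts "/wines/#{route_key(wine_uri)}"
```

Because the token is a hash, it cannot be decoded back into the Wine's URI, so a design like this would also need to look the entity up by its key, for example by storing the key as one more triple on the Wine.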
The second project I worked on over the semester was doing visualizations with Orgpedia and LittleSis data for Xian Li and John Erickson. I wrote a very detailed account of how I made the visualizations in my last blog post. It got featured along with Alexei’s post on PlanetRDF! As a quick summary, I made two visualizations. The first was a mashup of data from Orgpedia and Google Finance, done using the Google Visualization Toolkit in R, which is a great mathematical programming language with a great community and suites of libraries. I found the financial sectors of the US and their companies through a couple of SPARQL queries and then pulled in all the financial information for the companies through a helpful R package called Quantmod. Then I put all the data together neatly into one big dataset and passed it along to the Motion Charts library, which generated an HTML page with all the data in a very awesome visual form. The final visualization is a motion chart of US sectors, and you can switch between different financial properties to compare them by. Alexei Bulazel and I made the second visualization, in which we pulled in and parsed data about board members of various companies using the LittleSis API. Then I turned the data into a graph, did some pruning, and pushed it into a D3 force-directed graph visualization. After a bunch of visual tweaks, the final visualization showed a physics-based graph of a subset of company board members clustered together, with members who sit on multiple boards having multiple colors on their nodes. Doing the visualizations was really fun and much easier than I thought. I also picked up R along the way, and it is a tool I will definitely be using in the future for its ease of use and heavy community support.
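The step of turning board memberships into a graph for a force layout can be sketched as below. The sketch is in Ruby only to stay consistent with the rest of this post; the sample names are invented, `build_board_graph` is a hypothetical helper, and the pruning rule (dropping boards with fewer than two known members) is an assumption, since the pruning actually used is described in the earlier post rather than here.

```ruby
require "json"

# memberships: array of [person, company] pairs.
def build_board_graph(memberships, min_board_size: 2)
  # Group people by company and drop very small boards (assumed pruning rule).
  boards = memberships
           .group_by { |_person, company| company }
           .transform_values { |pairs| pairs.map(&:first).uniq }
           .select { |_company, people| people.size >= min_board_size }

  # Record which of the remaining boards each person sits on.
  boards_of = Hash.new { |hash, person| hash[person] = [] }
  boards.each do |company, people|
    people.each { |person| boards_of[person] << company }
  end

  # One node per person; a node with several boards can get several colors.
  nodes = boards_of.map { |person, companies| { name: person, boards: companies } }
  index = nodes.each_with_index.to_h { |node, i| [node[:name], i] }

  # Link every pair of people who share a board, using node indices.
  links = boards.flat_map do |company, people|
    people.combination(2).map do |a, b|
      { source: index[a], target: index[b], board: company }
    end
  end

  { nodes: nodes, links: links }
end

# Made-up sample data.
sample = [
  ["Alice", "Acme"], ["Bob", "Acme"],
  ["Alice", "Globex"], ["Carol", "Globex"],
  ["Dave", "Initech"]
]
puts JSON.pretty_generate(build_board_graph(sample))
```

The output uses a `nodes` array plus `links` that refer to nodes by index, a shape that D3 force layouts can consume once it is serialized to JSON; the `boards` list on each node is what a renderer could use to give multi-board members more than one color.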
This was a really fun and busy semester. I got a lot of work done here at the TWC and in the work I was doing at the Cognitive Robotics Lab. I also picked up a lot of really neat tools and frameworks this semester that I will definitely use again on future projects. SPARQL is one of these tools, and it will be very helpful in the future when I need access to open datasets for data mining projects. I had a great experience working in the lab and received a lot of help from the people working there. The TWED talks were also a pretty awesome way to see all the branches of semantic web technologies. I’d recommend that anyone reading this article go check out the videos from TWED. I’m also very happy that today’s TWED talk deals with Semantic Web Agents, which was the initial reason I became interested in the research done at the TWC. What a perfect way to end the semester!