1,741 times faster than dial-up!

I am checking out the grand new offices for the RAMCloud project at Stanford. The Internet here is 1,741 times faster than 56k dial-up! Ironically, storage speeds have not grown at the same rate as Internet speeds; that's what RAMCloud hopes to solve (in the data center).

This is insane bandwidth. Oh, and every PC here gets a public IP! I'm in Internet heaven.
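
For a sense of scale, here is a quick back-of-the-envelope sketch of what that multiplier works out to. It assumes the nominal 56 kbit/s modem rate and decimal megabits; the measured speed itself is not recorded above, so the function name and constants are just for illustration.

```python
# Rough conversion of "1,741 times faster than 56k dial-up" into Mbit/s.
# Assumption: the baseline is the nominal 56,000 bit/s modem rate.
DIAL_UP_BITS_PER_SECOND = 56_000
SPEEDUP = 1_741  # factor quoted in the post


def implied_bandwidth_mbps(speedup: int, base_bps: int = DIAL_UP_BITS_PER_SECOND) -> float:
    """Return the bandwidth in Mbit/s implied by a speedup over a base rate."""
    return speedup * base_bps / 1_000_000


if __name__ == "__main__":
    print(f"{implied_bandwidth_mbps(SPEEDUP):.1f} Mbit/s")
```

Under those assumptions the figure comes out at a little under 100 Mbit/s.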


Open Government Data and JasperReports

Major geek out this evening. I was supposed to go to the gym with Diego but he passed out on the couch from jet lag and I decided to hit the open source reporting space instead (the obvious choice!). I'm evaluating JasperReports, an open source reporting engine. I'm very impressed with the results.

Installation was a breeze - I fired up a Debian 5 VPS on my home server, and after downloading the JasperServer package I had my Tomcat, MySQL and Java installs all done for me. I had the option to configure each component separately, but I opted to have Jasper do all that for me this time. I also decided to give JasperReports-Pro a try under its time-limited trial. There's also the completely free open-source edition.

Data Source:
Once I was able to log into the web interface I had to find some data! I decided to try data.gov. I found some "raw data" on adoptions by United States families in 2008, broken down by the child's country of origin. The data came in the form of an .xls file that I cleaned up, turned into a .csv and imported into a separate MySQL server that I had running.
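
The cleanup step could look something like the sketch below. This is not the exact process I followed, just a minimal illustration using Python's standard csv module. The two-column layout (country, adoptions) and the sample rows are made-up placeholders, not the real data.gov figures.

```python
import csv
import io


def clean_rows(csv_text: str) -> list[tuple[str, int]]:
    """Parse (country, count) rows, skipping blank lines and non-numeric counts."""
    reader = csv.reader(io.StringIO(csv_text))
    next(reader, None)  # skip the header row
    rows = []
    for record in reader:
        if len(record) < 2:
            continue  # blank or malformed line
        country = record[0].strip()
        count = record[1].strip().replace(",", "")  # drop thousands separators
        if not country or not count.isdigit():
            continue  # e.g. a "Total" line with no usable number
        rows.append((country, int(count)))
    return rows


# Placeholder input (hypothetical values, NOT the real dataset).
SAMPLE = '''country,adoptions
Country A ,"1,234"
Country B,56

Total,n/a
'''

if __name__ == "__main__":
    print(clean_rows(SAMPLE))
```

The resulting list of tuples is in a shape that a database driver's bulk insert (for example a DB-API `executemany`) can take directly.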

Configuring JasperReports:
Configuring JasperReports was not too bad. There's definitely a learning curve but within a couple of hours I was able to get everything plugged in nicely and a report generated. The basic steps were:

1) Create a JDBC connection to your database (in this case it was another host on my LAN with a MySQL server listening for connections). Jasper calls this a JDBC Data Source.

2) Create an abstraction of the Data Source and the queries. This can get extremely complicated - you can pull in several tables and several data sources and manipulate them in any way you like (yes, graphically if you like). The result is what Jasper calls a Domain. My Domain was very simple: it just pulls in the tables from my data source. In the real world this step would be done by an experienced database administrator.

3) Finally, you can set up a report using the data from the Domain created in step two. The idea is that your end users create reports using the JasperServer GUI; they really don't need to worry about the JDBC data source.
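
For step 1, the data source boils down to a JDBC connection URL plus credentials. As a rough sketch, the usual MySQL URL shape is `jdbc:mysql://host:port/database`, with 3306 as MySQL's default port. The helper name, host address and database name below are hypothetical, not the values from my setup.

```python
def mysql_jdbc_url(host: str, database: str, port: int = 3306) -> str:
    """Build a MySQL JDBC URL of the form jdbc:mysql://host:port/database."""
    return f"jdbc:mysql://{host}:{port}/{database}"


if __name__ == "__main__":
    # Hypothetical LAN host and schema name, for illustration only.
    print(mysql_jdbc_url("192.0.2.10", "adoptions"))
```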

The Results:
After all this work you have a report. A sweet report that is linked to your data source. If I upload data into my database the reports are updated instantly. This is very cool. Since I already have all the data from my table in my Domain, creating other reports and mash-ups is simple.
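
The reason the report stays current is that it re-runs its query against the database rather than keeping its own copy of the data. Here is a minimal sketch of that idea, using Python's built-in sqlite3 as a stand-in for my MySQL server. The table name, column names and rows are made-up placeholders, and this is not how JasperServer is implemented internally.

```python
import sqlite3


def top_countries(conn: sqlite3.Connection, limit: int = 3) -> list[tuple[str, int]]:
    """Run the 'report' query: countries ordered by count, highest first."""
    return conn.execute(
        "SELECT country, total FROM adoptions ORDER BY total DESC LIMIT ?",
        (limit,),
    ).fetchall()


# In-memory database with placeholder rows (hypothetical values).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE adoptions (country TEXT, total INTEGER)")
conn.executemany(
    "INSERT INTO adoptions VALUES (?, ?)",
    [("Country A", 10), ("Country B", 5)],
)

before = top_countries(conn)

# New data arrives; the next run of the same query picks it up.
conn.execute("INSERT INTO adoptions VALUES (?, ?)", ("Country C", 20))
after = top_countries(conn)

if __name__ == "__main__":
    print(before)
    print(after)
```

Running the same query before and after the insert gives different results, which is all "updated instantly" really means here.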

The "real-world" application may not be obvious from my example here, but if you work in an enterprise where data is needed on a constant basis then JasperReports is for you. Reports can be run on demand by end users or scheduled and e-mailed on a recurring basis. Here's what my report ended up looking like. It's super basic and just tells me that Guatemala was the most common country of origin for adopted children in 2008.

If you put together all the information available from the government on sites like data.gov and combine it with the power of a tool like JasperReports, having incredibly useful information at your fingertips becomes a reality.