Charles Engelke's Blog

May 9, 2011

Handling Large Data

Filed under: Uncategorized — Charles Engelke @ 8:48 pm

This is my final session at Google IO Bootcamp this year, and the one I know least about going in.  We’re working with larger datasets and trying to get meaning from them, so there’s a lot of potential for us here.

This is the only session I’ve been in that wasn’t packed.  There are plenty of folks here, but there are empty seats and nobody sitting on the floors.  I’m sure people don’t find this as sexy as Android, Chrome, or App Engine.

We’re starting with the setup, which is pretty complicated.  We have to sign in as a special Google user, then join a group, then download and unpack some files, then go to some web pages…  And I’ve done all that.  Now I guess I’m ready.

We start with Google Storage for Developers, which is geared toward developers, not end users.  You can store unlimited objects within a whole lot of buckets, via a RESTful API.

We do an exercise where we create a text file, upload it, and make it readable to the world.  Then we fetch our neighbor’s file.
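For anyone who wants to try the "fetch a neighbor's file" step, here is a minimal Python sketch. It assumes the object has already been uploaded and made world-readable, and that public objects are served from the storage.googleapis.com host in the path style shown; the session itself may have used different tools or endpoints. The bucket and object names are made up for illustration.

```python
from urllib.parse import quote
from urllib.request import urlopen


def public_url(bucket: str, object_name: str) -> str:
    """Build the URL for a world-readable object.

    Assumes the path-style form https://storage.googleapis.com/<bucket>/<object>.
    """
    # quote() percent-encodes spaces and similar characters but keeps "/"
    # so object names that look like paths stay readable.
    return f"https://storage.googleapis.com/{quote(bucket)}/{quote(object_name)}"


def fetch_public_object(bucket: str, object_name: str, timeout: float = 10.0) -> str:
    """Download a public object and return its contents as text (assumes UTF-8)."""
    with urlopen(public_url(bucket, object_name), timeout=timeout) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    # Hypothetical names; substitute a real public bucket and object.
    print(public_url("neighbors-bucket", "hello.txt"))
```

Calling fetch_public_object with a real bucket and object name should return the file's text, as long as the owner really did make it readable to the world.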

Next, on to BigQuery, which, for me, is a disaster.  Getting the tools set up and working is a mess under Windows, even with instructions.  And the meaning of the data we’re querying isn’t clear, making the exercise difficult.  But I got a few things to work.

Finally, we’ll use the Prediction API.  As for the exercises, I’ll try each one once, then give up if it doesn’t work.  Messing with the installation and configuration takes my attention away from the actual tools.  Well, I think I’ve set it all up; it says it’s running.  I learned a lot of mechanics here, but I don’t really understand what’s going on.  It should take about 5 minutes to do the prediction run I’m trying, and then I’ll see if I can make sense of the result.

Well, the result was “error” after 10 minutes of crunching.  I guess I’ll try it again, perhaps from a Linux box, someday.

That concludes IO Bootcamp this year.  All in all, it was well worth attending, even though I already knew some of the material.

