Day one’s schedule was so packed, and ran so late into the night, that I had no time to write anything as it happened. So today it’ll be just a quick recap.
The day started with keynotes. These were the now-standard O’Reilly conference “keynotes”: 10-15 minute presentations, some intrinsically interesting, some little more than infomercials the speakers’ companies paid O’Reilly for. I dislike the format (can you tell?) and would flat-out boycott them, but there are always a few moments of value in there. The first day’s keynotes were no exception:
- Hilary Mason (who participated in the prior day’s Data Bootcamp) of bit.ly opened with a breezy, interesting talk. It’s available online now, too. Nothing very deep in only ten minutes, of course.
- James Powell of Thomson Reuters gave a talk that wasn’t very interesting, but it was certainly okay.
- Mark Madsen’s talk was fine and light. I liked Hilary Mason’s a lot better.
- Werner Vogels of Amazon gave a short, informative talk about their web services. It wasn’t a sales pitch; it was actually interesting for its content.
- Zane Adams of Microsoft presented the most blatant commercial, including a video from Microsoft’s marketing group that was simply embarrassing.
- That was followed by a panel discussion on “Delivering Big Data”. There’s a video available. I didn’t think much of the session; you can’t have a worthwhile panel in ten minutes.
- The closing talk wasn’t announced ahead of time. Anthony Goldbloom of Kaggle talked about the $3 million prize for producing a good model for predicting who will need to go to the hospital in the coming year. They acted as if this were the announcement of the prize, but it had been publicized at least a few days before.
Overall, the keynotes simply weren’t worth the time they took. The sessions later in the day were better. I’m not going to talk about them all, just a few highlights (to my mind):
- MAD Skills: a Magnetic, Agile and Deep Approach to Scalable Analytics exposed me to a lot of new tools and techniques. I’ll be following up to learn how to apply them to my own problems.
- Small is the New Big, by Kim Rees of Periscopic, piqued my interest. Unfortunately, a lot of her graphics were designed to be viewed on your own screen, not on a distant projector, so I didn’t get the full effect. I hope she posts her slides so I can see the details better. (Later: she did post them!)
- New Developments in Large Data Techniques by Joseph Turian of MetaOptimize was excellent, though it made me realize that most of my data problems just aren’t at the scale that most of those tools address.
- Google Cloud for Data Storage could have been considered a pitch for a bunch of Google products and tools, but I don’t care (and it didn’t seem that others did, either). The tools are extremely useful, affordable or free, and really accessible to users new to these areas, as I am. I thought this was a great talk.
- Building Data Products with Hadoop by Sam Shah of LinkedIn had every reason to lose my interest. It was late at night, and I felt the sessions shouldn’t have been scheduled so late. I was tired and almost skipped it. I’m glad I didn’t: it was very interesting and well presented.
That’s a bit more than half the sessions I attended. The others weren’t bad, but just not as useful or interesting to me as the ones above. I’ll update this later today or tomorrow with links to the material as I get a chance.