Google Conference on Scalability

The Google Conference on Scalability is over and was very interesting.

The organization was first rate, and the food was very good too (I am an unapologetic foodie). The presentations were also very good and very informative. Most attendees were from the Seattle and San Francisco areas, some were from the East Coast, and some came from as far away as England.

The first keynote, “MapReduce, BigTable, and Other Distributed System Abstractions for Handling Large Datasets” by Jeff Dean, Google, Inc., was interesting. All the material presented was already out there in the form of other presentations and papers, but Jeff really tied things together and provided some good insight into the tools and how they are used at Google. One very cogent point he made was that giving developers powerful tools makes them much more productive and also lets them take on challenges that they otherwise could not. Both points are very important, and the first one really resonated with me: the less you have to worry about infrastructure as a developer, the more you can focus on the problems at hand.

The session on the Lustre file system, “Lustre File System” by Peter Braam, Founder and President, Cluster File Systems, Inc., covered really big file systems and how to make them scale in the face of heterogeneity and unreliability.

Barry Brumitt’s presentation “Using MapReduce on Large Geographic Datasets” was entertaining and interesting, providing insight into the technology and processes that went into building Google Maps. It got me thinking about whether you could adapt the technology to build maps of the sky. I am sure you could, and I think it would be a very worthwhile project.

Reza Behforooz’s talk “Lessons in Building Scalable Systems” (not listed) provided insight into the engineering process at Google, including how they test scaling and how they deploy systems. This was very interesting, as I had not seen this information before.

The second keynote, by Marissa Mayer, Vice President, Search Products & User Experience, Google, Inc. (listed as “Topic TBD”), provided insight into user testing at Google, covering how they tested various user interfaces and the work they did on Universal Search. There was nothing really new here, but there was much more content in the Q&A session afterwards.

The final talk I went to was “Challenges in Building an Infinite Scalable Datastore” by Swami Sivasubramanian and Werner Vogels, Amazon.com, which proved to be very entertaining. Werner provided some interesting and amusing insights into the situation he found at Amazon when he got there, and how he addressed it. He draws a lot on biology for inspiration, which makes sense, as a lot of the issues we run into with scaling were solved a long time ago by biological organisms. Swami talked about some of the ways they dealt with scaling and reliability, which are detailed in an upcoming paper. He made the point that elegant academic solutions can be very difficult to implement, and that complex engineering is needed to make things work. Werner was a little miffed at the not-so-hidden recruiting pitches that Google was making, so he decided to make his own, very overt, recruiting pitch as well.

All the sessions were videotaped and are going to be put on Google Video and/or YouTube in the near future, at which point I will post a link to them.

As for the recruiting pitches, both Google and Amazon sound like very interesting companies to work for. It is a shame that neither has a strong presence in the Boston area.

Updated June 27th, 2007: Robin Harris of StorageMojo has posted some notes too.

4 thoughts on “Google Conference on Scalability”

  1. Indeed it does. I checked the job listings before posting and found one engineering position open, and I think there were thirteen jobs open in total.

    At first I thought that one engineering position open was a bit light, but perhaps this is a ‘generic’ engineering position and any and all resumes are looked at, and good candidates hired.
