On Clouds and Data
I’m sitting in SFO tonight, awaiting my return trip to Hurricane-Pending Maryland. (As a former Floridian, I must of course scoff at any notion that this hurricane is significant.) Walking through the airport, I noticed a large billboard about “Big Data and the Cloud”. This is the kind of billboard you only see in Silicon Valley; I don’t see signs like that in Portland or Ottawa, and certainly not when I had to change flights in Detroit this year.
Anyway, these two buzzwords aren’t a local phenomenon; they are actually taking the tech world by storm. Big Data has become serious enough that there are now multiple conferences for folks interested in the topic. And cloud, well, it’s perhaps harder to define, but more and more businesses are moving to the cloud every day. The problem here is that most of the traditional ideas on big data run entirely counter to the ideas that work well in the cloud.
Last spring I moderated a panel at PGEast in New York that focused on Postgres in the cloud. As someone who works on multi-terabyte systems, and someone who deals with cloud servers on at least a semi-regular basis, I tried to prod and poke my panelists into sharing their take on Postgres’s role in the cloud. Not too surprisingly, the idea of “Big Data” on Postgres in the cloud was not a particularly popular one. The tools you need to do the job effectively with Postgres just aren’t there. That’s not to say you can’t try, but so far I haven’t seen many wild successes.
Next month at Surge, though, I’m going to be involved in another panel, focusing on “Pushing Big Data To The Cloud”. This time I’m turning over moderating duties to Baron Schwartz, a long-time thought leader in the MySQL community. Joining me on the panel are several folks who all have a stake in the idea of Big Data in the cloud: John Hugg and Philip Wickline from VoltDB and Hadapt, respectively, two new database vendors built with scale-out in mind; Bryan Cantrill, VP of Engineering at Joyent, a cloud provider with their own strong opinions on dealing with data in the clouds; and Kate Matsudaira, someone who is currently managing those multi-TB databases, all in the cloud, over at SEOMoz. This should be a really good mix of people using different technologies, with different biases about the problems involved. If you’re looking to work on Big Data in The Cloud, I hope you’ll join us; it should be a lot of fun.