Sunday, October 14, 2012

Big Piles Of Data - Facebook, LinkedIn etc

I've had the chance to work onsite at Facebook for a few weeks with their BI group doing a POC for Informatica's ETL tools against their platforms.

It was an interesting experience and I enjoyed the great food and hospitality of my Facebook colleagues. While I'm still not a fan of Facebook as a user, I do admire the energy and fast-paced environment. It is the type of place that breeds creativity and out-of-the-box thinking. The Menlo Park campus vibrates with excitement.

Some of the mottos stenciled everywhere are "Move fast and break things!" and "Done is better than perfect." "HACK" is plastered everywhere and is the paradigm for their development mentality. Think of it like Agile after downing a couple of Red Bulls!

They typically show up at 10am and work until ... it gets done!

With all the innovation and creativity, Facebook still has to wrestle with down-to-earth problems. How do I get data from point A to B? What does the data mean? Is the quality of the data good enough to make decisions and spend resources on?

The scale that they work on is LARGE. How large? The main HDFS is pushing 110PB and growing fast. They just passed 1 billion active users and have plans to expand the usage as much as possible across the globe. Zuckerberg gave every employee a little Red Book with some thoughts on this milestone. The basic theme is "1% is not done" and we've only just started with 1% of the population.

Back to reality: they have discovered that Big Data is really just "Big Piles Of Data," which is totally useless until you extract value from it: relationships, likes, dislikes, needs, desires, and dreams.

And the fact that everyone is distracted by the hype around Big Data is not lost on some of them. I think that companies like Cloudera, Hortonworks, MapR, and other wannabes for "The Hadoop Standard" are the future Sybase, Informix, and Borland of this wave of technology.

Yes, there is room for these folks to make money and some will. They will leverage their VC money and gain market share, then cash out and move to the Next Gig. But I doubt their products will have a lasting legacy. Market consolidation is not only a certainty; it has already begun.

Ultimately, as the Facebook folks realize, Hadoop is nothing but a cheap commodity technology to store Big Piles of data. As more businesses come off the euphoria of Big Data Hype and realize it is NOT the silver bullet to solve their problems, more traditional software companies like Teradata / Aster, Oracle, IBM, Terracotta/Software AG and other data analytics companies will make large inroads by supplying software and systems to perform useful business analytics.

Watch this trend in the next 12 to 18 months.  I predict it will be an exciting time on the other side of the Big Data Wave.

Cheers,

Dave