I wrote about Hadoop years ago, when it was a young project from Yahoo. Over the years the framework has evolved to become very important to a number of companies, important enough for Microsoft to make some substantial investments in Hadoop and SQL Server.
Why use Hadoop? What’s the power of this framework? It’s designed to work with disparate data sets, both structured and unstructured, from a variety of sources, and to perform complex analysis of that data using clusters of inexpensive hardware. In other words, it scales out very nicely.
There are plenty of large companies using Hadoop, which is why Microsoft made its investment in the technology, but there is an interesting use I ran across from a Utah bank. Its security department is using Hadoop in place of traditional data warehouses on relational platforms to analyze security data and proactively make decisions. That’s pretty cool to me, especially since security departments seem to have smaller budgets than their mandate would suggest.
I’m not sure how many places I’ve worked where Hadoop would have been useful, but I suspect that as our data sets grow larger and come from more sources, and as we need to analyze the data before we put it in a data warehouse, more and more of us will end up using Hadoop alongside our SQL Servers.