Can Your Data Warehouse Survive Big Data? Yes, if You Do It Right.

Thirty years ago (really?), when Devlin and Murphy coined the term “data warehouse” (DW) and introduced the fundamentals of its design, it was a revolutionary concept.[1] It had the potential to change the way business was done. And it did, but over the years data warehousing has had its rough times, and in the IT lexicon the term has often been synonymous with “boondoggle” and “spectacular failure.”

Yet there are scores of companies out there (think Walmart and Amazon) that have succeeded beyond their wildest expectations because they built data warehouses that enabled them to understand their business better and chart a profitable course based on the insights they gained. What’s more, they’ve managed to harness the power of the most important IT development, well, ever: Big Data.

Big Changes with Big Data

The inundation of data, especially the unstructured variety, has put a strain on enterprise DWs like never before. No doubt, there’s a data tsunami out there, but the real story with Big Data isn’t the size; it’s the type: unstructured data. Traditional DWs simply aren’t built to accommodate it, and there are Henny-Penny pundits galore who are predicting the demise of the venerable data warehouse.

To be sure, Big Data has spurred a need to bolster traditional DW environments with new capabilities that extend the existing architecture into a seamless, comprehensive one. Adding these capabilities will help you leverage Big Data to truly fulfill the purpose of your DW: if it’s failing, rescue it; if it’s adequate, make it great.

Here’s what you need to build your comprehensive DW architecture:

  1. The ability to process unstructured data. You need to be able to pull information from many sources (customer call logs, emails, images, web pages, audio files), basically anything that can’t be readily loaded into a traditional database. Data lakes are one example of this type of capability. They use technologies like Hadoop to aggregate and store data in its raw form and prepare it at query time rather than at load time, which facilitates less formal data exploration. Traditional DWs don’t allow this, so these technologies are indispensable.
  2. New ways to connect to the data. Traditional ETL technologies weren’t designed for these new data needs. Instead, you need specialized connectors that provide on-demand access to the data stored in data lakes built using Hadoop (or similar) technologies.
  3. Flexible data storage capabilities. You’ll most likely employ both cloud and traditional storage, but make sure that you don’t go too far in hybridizing your storage. Keep your DW data in one type of store, whichever you pick, to ensure that your speed of access isn’t compromised.
  4. More off-the-shelf, pre-built analytics models to facilitate quicker, more uniform analysis. That flood of data will bog down your analysis speed, so you’ll also want to avail yourself of a tool that comes with pre-built analytics models that can be customized to meet your specific needs. Why re-invent the wheel if you just need a new set of rims?
  5. Better systems management and governance. This is critical. With Big Data comes the need to build really large and maddeningly complex architectures. Automation and governance are a must. Automate where you can: build self-monitoring and self-tuning capabilities into your architecture. Also, implement an enterprise-wide data governance function to ensure that your data is consistent, no matter who’s querying it.
  6. Democratization of analysis capabilities via a presentation layer that enables natural-language queries instead of making everyone learn SQL. Queries should read like, “Find all customers who clicked on multiple sub-categories within a web page, and multiple products, but didn’t add anything to the shopping cart.”
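
To make items 1 and 6 a little more concrete, here is a minimal Python sketch of what that sample question might boil down to once a presentation layer has translated it. Everything in it is an assumption made up for illustration: the event fields (customer, action, subcategory, product), the sample records, and the function name. The raw records are kept as plain JSON lines and only interpreted when the question is asked, which is the "prepare on query, not on load" idea in miniature.

```python
import json
from collections import defaultdict

# Hypothetical raw clickstream records, stored as-is (one JSON object per line).
RAW_EVENTS = """\
{"customer": "c1", "action": "view", "subcategory": "shoes", "product": "p1"}
{"customer": "c1", "action": "view", "subcategory": "hats", "product": "p2"}
{"customer": "c2", "action": "view", "subcategory": "shoes", "product": "p1"}
{"customer": "c2", "action": "view", "subcategory": "hats", "product": "p3"}
{"customer": "c2", "action": "add_to_cart", "product": "p3"}
{"customer": "c3", "action": "view", "subcategory": "shoes", "product": "p1"}
"""


def browsers_without_cart(raw_lines):
    """Customers who viewed 2+ sub-categories and 2+ products but never added to cart."""
    subcategories = defaultdict(set)
    products = defaultdict(set)
    added_to_cart = set()

    for line in raw_lines.splitlines():
        if not line.strip():
            continue
        event = json.loads(line)  # structure is applied here, at query time
        customer = event["customer"]
        if event["action"] == "add_to_cart":
            added_to_cart.add(customer)
        elif event["action"] == "view":
            subcategories[customer].add(event.get("subcategory"))
            products[customer].add(event.get("product"))

    return sorted(
        c for c in subcategories
        if len(subcategories[c]) > 1
        and len(products[c]) > 1
        and c not in added_to_cart
    )


print(browsers_without_cart(RAW_EVENTS))  # prints ['c1']
```

In a real deployment the records would live in a data lake and the filter would run on a query engine rather than in a loop, but the shape of the question stays the same.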

What’s in it for You?

I won’t lie to you; it won’t be cheap or easy to build a comprehensive DW architecture that will upgrade your data warehouse and provide all the add-ons it takes to accommodate Big Data, but you really don’t have a choice if you want to stay competitive. If you do it right, however, the benefits will be well worth the cost.

Building a comprehensive DW architecture will enable you to see what’s happening at both a more granular and a panoramic level at the same time. You’ll be able to better understand customer behavior, identify behavioral triggers, and take action to encourage or mitigate them. You’ll also get deeper insight into your interactions with customers so you can match offers to needs, improve customer service, and reduce churn. I could go on forever; the benefits are too numerous to list them all.

The bottom line is that you’ll have the insights you need to truly understand and grow your business like never before—and crush your competition. Who doesn’t want that?

I’d love to hear what you think. Please comment or DM me on Twitter, and please follow me!

You can also message me on LinkedIn, or email me at

[1] Devlin, B. A.; Murphy, P. T. “An architecture for a business and information system”. IBM Systems Journal. 27: 60–80. 1988.
