So how can organizations tackle this challenge? The answer lies in Hadoop.

One of the major opportunities to breathe new life into data warehouses is migrating heavy ETL processing to Hadoop, which yields faster processing times and lower costs. First, raw data from source systems is loaded as-is into Hadoop. Organizations can then leverage Hadoop's cluster processing to transform the data into the required data models. Finally, the transformed data is loaded from Hadoop into the existing data warehouse(s). (A minimal code sketch of this pattern appears at the end of this post.)

Another opportunity is moving "cold data" – infrequently used, inactive, or dormant data – from the data warehouse into Hadoop. This frees up capacity and improves performance in the current data warehouse while still keeping the cold data available for queries as needed. The offloaded cold data can even be mined for additional insights or combined with other data. And because Hadoop storage costs are much lower than typical EDW storage costs, this approach also saves money compared with adding capacity to the existing EDW infrastructure. The infographic below illustrates how to offload "cold data" from a data warehouse to Hadoop.

As a side benefit, organizations that optimize their data warehouses with Hadoop can retain their data for longer and can mine and analyze it for more advanced and impactful insights. Hadoop-based data can also be combined with logs, social media content, and other unstructured data to gain new analytical insights.

Of course, there is a certain amount of effort and reengineering required to make either of these opportunities a reality. That's where ETL tools come in. By providing a codeless, visual means to port ETL streams to Hadoop without significant redevelopment, they make the process that much easier and faster. There are even BI vendor tools that support Hadoop, allowing you to migrate analytic models and processes with ease.

Ed. Note: This blog post has been updated from the original in 2014 to reflect additional content.
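To make the ETL-offload pattern described above more concrete, here is a minimal PySpark sketch: land raw source data on the Hadoop cluster, do the heavy transformation there, keep a cheap copy on Hadoop storage, and push only the conformed result into the warehouse. The file paths, column names, table name, and JDBC connection details are hypothetical placeholders, not part of the original post; a real offload would use your own sources, warehouse connector, and (more likely) a visual ETL tool that generates an equivalent job.

```python
# Minimal sketch of the ETL-offload pattern (assumed paths and credentials).
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("etl-offload-sketch").getOrCreate()

# 1. Load raw source extracts as-is into Hadoop (hypothetical landing path).
raw_orders = spark.read.csv("hdfs:///landing/orders/", header=True, inferSchema=True)

# 2. Use the cluster to do the heavy transformation work that would
#    otherwise run inside the warehouse: filter, derive, aggregate.
daily_sales = (
    raw_orders
    .filter(F.col("status") == "COMPLETE")
    .withColumn("order_date", F.to_date("order_ts"))
    .groupBy("order_date", "region")
    .agg(F.sum("amount").alias("total_sales"),
         F.countDistinct("customer_id").alias("unique_customers"))
)

# 3a. Keep a full copy on low-cost Hadoop storage for later mining.
daily_sales.write.mode("overwrite").parquet("hdfs:///curated/daily_sales/")

# 3b. Load only the transformed result into the existing data warehouse
#     over JDBC (connection details are placeholders).
(daily_sales.write
    .format("jdbc")
    .option("url", "jdbc:postgresql://dw-host:5432/edw")
    .option("dbtable", "analytics.daily_sales")
    .option("user", "etl_user")
    .option("password", "etl_password")
    .mode("append")
    .save())

spark.stop()
```

The cold-data opportunity is essentially the same pattern in reverse: read infrequently used partitions out of the warehouse (for example, over JDBC with a date filter), write them to Parquet on Hadoop, and drop them from the EDW once verified, so they remain queryable at much lower storage cost.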