A few short weeks ago at Strata+Hadoop World in San Jose, our team surveyed attendees walking the expo floor. We asked them five simple questions to get some sense of the adoption of Hadoop, the business drivers behind it, and the challenges organizations are facing. The five questions were:
1. Has your organization invested in a Hadoop platform?
2. If yes, which best describes your Hadoop investment?
3. What business justification drove or is driving your implementation of Hadoop?
4. What is the most challenging aspect of big data and your ability to derive value from it?
5. What sources of data are you analyzing or plan to analyze?
Not surprisingly for an event all about Hadoop, nearly 4 out of 5 respondents stated that their organization has invested in a Hadoop platform. That is certainly higher than in the general company population. I have seen stats cited that say 3 out of 5 companies have invested or are considering investing in the next 12 months. Less than two years ago I was speaking to audiences across the US about the use cases for Big Data, and my own experience in informal polls back then was that Big Data and Hadoop were very new, with low adoption rates. The subtle message at the time, though, was that Big Data is inevitable and that, for competitive reasons, every organization needs a big data strategy. The attendees at this show are early adopters and leaders in taking advantage of Hadoop.
For the second and third questions, we wanted to get some sense of how tactical or strategic an investment the respondents who said yes had made, and the business reasons behind it. Surprisingly, 3 out of 5 organizations that responded yes said they were realizing real business value from Hadoop. I had a misperception that most of these organizations were experimenting, or using Hadoop as a tactical solution to lower the cost of ETL. Certainly Hadoop transforms the economics of data storage and computing by making it much more cost-effective to store big data. But that wasn’t the number one answer. In fact, nearly half said Hadoop was being used to increase revenues and accelerate operational insights. Business reasons we see through conversations with our customers include:
- Instill and improve trust through identification of inefficiencies and dissatisfied patients in healthcare
- Gain new insights into what drives effective retail campaigns by modeling locations, events, and products as well as customer relationships
- Identify everything from dissatisfied customers to fraud in banks
- Increase public trust by identifying service problems and dissatisfied citizens in the public sector
- Give insurance brokers a broader picture of the customer experience, providing an opportunity to uncover upsell opportunities
Finally, with questions 4 and 5 we tried to understand the challenges with big data and the types of data these organizations were storing and analyzing in Hadoop. We see the majority of organizations dealing with transaction data, customer data, and log data. Not far behind are clickstream and social media data. The number one challenge cited was the inability to integrate or link the data. Organizations struggle to connect the data across their many systems in an effective manner. This makes the promised insights and hidden connections they seek more costly and time-consuming than expected, if possible at all. Organizations have structured, semi-structured, and unstructured data. Of course it’s the unstructured and semi-structured data that has been the most difficult to get at, both internally and externally. But it’s all three working together that holds the greatest potential for insight. The problem is: how do you link all of this together in a sound, repeatable way that can deal with dirty and fragmented data?
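To make the linking problem concrete, here is a minimal sketch of matching customer records from two sources despite inconsistent formatting. It is an illustration only, not Novetta's method: the field names (`id`, `name`, `email`), the sample records, and the 0.85 threshold are all assumptions, and the fuzzy comparison simply uses Python's standard `difflib`.

```python
from difflib import SequenceMatcher


def normalize(value: str) -> str:
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    cleaned = "".join(ch if ch.isalnum() or ch.isspace() else " " for ch in value.lower())
    return " ".join(cleaned.split())


def similarity(a: str, b: str) -> float:
    """Similarity in [0, 1] between two normalized strings."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def link_records(left, right, threshold=0.85):
    """Pair each left record with its best-scoring right record, if good enough.

    An exact match on a non-empty, case-insensitive email scores 1.0;
    otherwise the score is the fuzzy similarity of the names.
    The threshold is a hypothetical value chosen for this example.
    """
    links = []
    for a in left:
        best, best_score = None, 0.0
        for b in right:
            email_a = a.get("email", "").strip().lower()
            email_b = b.get("email", "").strip().lower()
            if email_a and email_a == email_b:
                score = 1.0
            else:
                score = similarity(a["name"], b["name"])
            if score > best_score:
                best, best_score = b, score
        if best is not None and best_score >= threshold:
            links.append((a["id"], best["id"], round(best_score, 2)))
    return links


# Hypothetical records from two systems describing some of the same people.
crm = [
    {"id": "crm-1", "name": "Jon Smith", "email": "J.Smith@Example.com"},
    {"id": "crm-2", "name": "Maria  Garcia", "email": ""},
    {"id": "crm-3", "name": "Wei Chen", "email": ""},
]
web = [
    {"id": "web-7", "name": "J. Smith", "email": "j.smith@example.com "},
    {"id": "web-8", "name": "MARIA GARCIA.", "email": ""},
]

links = link_records(crm, web)
for left_id, right_id, score in links:
    print(left_id, "->", right_id, score)
```

Even this toy version shows why the problem is hard at scale: it compares every record against every other record and relies on a hand-picked threshold, neither of which holds up across billions of rows and many schemas.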
For Novetta, Strata+Hadoop World in San Jose was an excellent event for having great conversations with leaders in Big Data and Hadoop and for discussing the value we add from an entity resolution and analysis perspective. Our value proposition gets at the heart of the benefits organizations are trying to achieve with Hadoop and the challenges they are facing. We accelerate operational insights by constructing complete 360-degree views of a customer, organization, location, product, or event, at any volume and from any source, whether structured or unstructured. We also help organizations increase revenues by creating unified customer profiles and linking them to products and services, which improves cross-sell and up-sell opportunities. And we do this by helping solve the biggest challenge organizations face: we make Hadoop data useful to anyone, using an adaptive process to unify all types of data, regardless of schema, and we allow analysts to look at and connect their data in entirely new ways.
Thank you to all of the respondents to this survey. See you at the next Strata+Hadoop World in New York.
Thinking of building an entity resolution and analysis application yourself? Register for the first webcast in Novetta’s O’Reilly-hosted series, Entity Resolution On Hadoop: The Pitfalls of Building It Yourself.