
The 4th V of Big Data?


Janani Dumbleton · 3 minute read · Data quality

The data world is buzzing with articles, posts and presentations asking organisations where they are with their big data strategy. It seems the world has barely started taking data quality seriously for basic master and reference data, and already we are out to conquer the next data challenge: lots of data!

A Gartner survey revealed that 64% of organisations had invested, or planned to invest, in big data in 2013. This is a very positive trend, as it means more and more businesses are comfortable in the race to treat data as an enterprise asset. The most prevalent topic being discussed is the technological challenge of managing big data, typically classified under the three “Vs”: Volume, Variety and Velocity. While this is a valid approach, let’s not forget that the whole reason we are hungry for big data is to produce precious insight that can influence strategy and drive positive growth.

Big data opens up opportunities for insight that was not available before, certainly not on the scale that has now exploded through the growth of social media, industrial automation and the improved technology and infrastructure that supports very large volumes of data. However, insight that can be trusted requires not only the quantity of data needed to predict trends but also quality, specifically in the master and reference data used to apply context to any strategic decision. We call this the fourth “V” of big data, Veracity, which is underpinned by the principle of having good quality data that is accurate, complete and appropriate enough to represent truth.

This was most eloquently put by Jonathan Krebbers, VP of Architecture at Shell, speaking about big data at a recent SAS business leadership event: “If you want a step-change in efficiency, you need to have better quality of data”. Since the majority of big data is machine generated, such as web logs or equipment readings, the quality of the big data itself may not be the issue. Rather, it is the quality of the usual suspects, domain data such as customers, products, locations, employees, price lists and suppliers, that can detract from the value of the insight generated. For example, web page hits and click-throughs may not mean much unless they are accurately linked to a product or service. Likewise, smart meters providing multiple electricity readings a day may not tell you a lot unless each reading is linked to the right customer, location and energy tariff.
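To make the smart meter example concrete, here is a minimal sketch of joining machine-generated readings to customer master data. All identifiers and values are hypothetical; the point is that a single defect in the master data (a missing tariff) silently removes a reading from any tariff-level analysis, even though the reading itself is perfectly good.

```python
# Hypothetical example: machine-generated smart meter readings.
readings = [
    {"meter_id": "M1", "kwh": 12.4},
    {"meter_id": "M2", "kwh": 8.1},
    {"meter_id": "M3", "kwh": 15.0},
]

# Customer master data. M3's tariff is missing -- a master data
# quality defect, not a problem with the big data itself.
customers = {
    "M1": {"customer": "A. Smith", "tariff": "standard"},
    "M2": {"customer": "B. Jones", "tariff": "economy"},
    "M3": {"customer": "C. Brown", "tariff": None},
}

usage_by_tariff = {}
unusable = []
for r in readings:
    master = customers.get(r["meter_id"])
    tariff = master.get("tariff") if master else None
    if tariff is None:
        # The reading cannot contribute to tariff-level insight.
        unusable.append(r["meter_id"])
    else:
        usage_by_tariff[tariff] = usage_by_tariff.get(tariff, 0.0) + r["kwh"]

print(usage_by_tariff)  # {'standard': 12.4, 'economy': 8.1}
print(unusable)         # ['M3'] -- 15.0 kWh excluded by bad master data
```

Scaled up to millions of readings, even a small percentage of such defects in the master data can meaningfully skew the trends the big data was collected to reveal.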

It is critical that any company planning its big data strategy understands how the components of its “small and medium” data can influence the big data being collected and the insight being derived. Data quality analysis, correction and monitoring of master data should be incorporated into your big data programme alongside the collection, collation and aggregation of big data, in order to produce strategic insight. Quality data instils trust in insight, and we at Experian see this as we work with large volumes of data on a day-to-day basis in our efforts to produce trustworthy reference data and insight for our customers.

So, to the businesses out there trying to monetise the big data they generate: make data quality part of your big data initiative.