According to recent research, organisations that proactively manage data as an asset, with a joined-up approach to data quality, are the ones that reap its full strategic value. Furthermore, although 93% say they are actively trying to find and resolve data quality issues, they also feel that a massive 23% of revenue is still being wasted due to poor-quality data.
The good news is that I believe technology can provide a solution, helping organisations achieve better results more quickly by learning from experience in other areas of Information Management.
Whereas Business Intelligence and Analytics capabilities have evolved and are now in the hands of business people, approaches to Data Quality remain in something of a time-warp - technical and relatively disconnected from the rest of the business.
The simple fact is that data quality is still not easy enough in practice. If we are going to change this, we need a better understanding of how to put a value, or a level of importance, on what’s going wrong in the data. Although this has similarities with Analytics, I’d give it its own name: Data Quality Analytics. In the rest of this article I’ll explain what I mean and how to obtain a solution.
As with Analytics, Data Quality needs to be addressed by business users, because they are the people who understand the relevance of the data. You’ve probably heard that concept before; what’s important here, however, is not just who the people are, but how they need to carry out their work. I believe that results in Data Quality are disappointing to 83% of organisations because the tools currently in use are both technically difficult to master and based on an outdated question/response approach to analysis.
Often, an analyst hands questions to a technical person, who writes each question in the form of a program or script and, after a suitable wait, provides a precise but limited answer. Too much time is spent translating the analyst’s questions into programs and correcting misunderstandings, so the process is slow and frustrating for both parties.
In reality, analysts usually set out with expectations of the data and some questions to which they want answers. However, they formulate more questions, and often follow unexpected trains of thought, as they find out more about the data along the way. This is metaphorically referred to as “following the rabbit”, in reference to Lewis Carroll’s Alice’s Adventures in Wonderland.
Of course this is comparable to ad-hoc analytics, where the best results are often obtained because the business person is driving the tool. Data Quality analysis should be driven by business people in exactly the same way. And we should provide pre-packaged Data Quality reports and proactively provide any insight we can, to give analysts a head start in exactly the same way tools do in Analytics.
Context is critical, because Data Quality statistics in isolation lack relevance. The quality of data is visible and meaningful when it is viewed alongside other associated and relevant data. This is why it’s difficult to capitalise on the traditional question/response approach, which simply provides statistics disconnected from the data itself.
By showing a range of statistics and Data Quality answers beside the data, business focus and relevance are maintained, improving insight, increasing productivity and accelerating conclusions. So it’s not simply about providing answers; we need the data and the answers together on the screen.
Finally, business people need the technical capability to go from raw data quality information (pass/fail per rule and per record) to information about the state of the data within the context of the business. This means analytics that let them interactively carry out ad-hoc slicing and dicing of the data, its associated data quality statistics and cost/benefit estimates. Provided we have maintained the context, as described previously, only some of the information being analysed is about quality; the rest is about the business, allowing analysts to understand the potential impact of an issue: its scope, its severity and therefore its importance. The ability to focus ad-hoc analysis on only the issues allows users to determine their provenance and root cause, and to estimate approaches to resolution, with associated costs and thus priorities. In fact, software able to provide this capability should also be able to calculate some potential costs and benefits based on the findings so far.
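To make that step from raw pass/fail results to business context a little more concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the customer records, the field names (`region`, `annual_value`), the three rules and the two helper functions are hypothetical, and it does not represent how any particular product works. It simply keeps each record's failed rules beside the record, then slices the results by a business dimension and totals the value attached to the failing records.

```python
from collections import defaultdict

# Hypothetical customer records; field names and values are illustrative only.
records = [
    {"id": 1, "region": "North", "email": "a@example.com", "postcode": "AB1 2CD", "annual_value": 1200.0},
    {"id": 2, "region": "North", "email": "", "postcode": "AB1 2CD", "annual_value": 800.0},
    {"id": 3, "region": "South", "email": "c@example.com", "postcode": "", "annual_value": 500.0},
    {"id": 4, "region": "South", "email": "not-an-email", "postcode": "", "annual_value": 2000.0},
]

# Hypothetical rules: each function returns True when the record passes.
rules = {
    "email_present": lambda r: bool(r["email"]),
    "email_has_at": lambda r: "@" in r["email"],
    "postcode_present": lambda r: bool(r["postcode"]),
}


def evaluate(records, rules):
    """Raw information: pass/fail per rule and per record, kept beside the record."""
    return [
        {**r, "failed_rules": [name for name, check in rules.items() if not check(r)]}
        for r in records
    ]


def summarise_by(evaluated, dimension, value_field):
    """Slice by a business dimension and total the value of records with any failure."""
    summary = defaultdict(lambda: {"records": 0, "failing": 0, "value_at_risk": 0.0})
    for r in evaluated:
        s = summary[r[dimension]]
        s["records"] += 1
        if r["failed_rules"]:
            s["failing"] += 1
            s["value_at_risk"] += r[value_field]
    return dict(summary)


evaluated = evaluate(records, rules)
by_region = summarise_by(evaluated, "region", "annual_value")
for region, stats in sorted(by_region.items()):
    print(region, stats)
```

Because the failed rules stay attached to the records, the same summary could be re-run by any other business field, which is the kind of ad-hoc slicing described above. Treating the whole `annual_value` of a failing record as "at risk" is a deliberately crude cost estimate; a real assessment would weight each rule by its actual business impact.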
And of course it’s got to be easy, quick, collaborative and interactive, in the same way as the best of the analytics platforms. Unfortunately, organisations cannot simply turn to their analytics tools, because these are unable to provide the specialised Data Quality capabilities required; something more focussed is needed.
Data Quality analysis and statistics are a means to an end, not ends in themselves. When Alice asked the Cheshire cat
“Would you tell me, please, which way I ought to go from here?”
the cat replied
“That depends a good deal on where you want to get to.”
In the same way, organisations need to associate these activities and information with business destinations or outcomes, and by combining people, context and technical capability, Data Quality Analytics provides the most effective way of doing so.
The Experian Pandora software product allows our customers to carry out such Data Quality Analytics to improve their data and therefore achieve better business outcomes, such as improved sales, reduced costs and the avoidance of business risks. A free data profiling version of the software is available to those wanting to experience it for themselves, and includes features such as automatic analysis, instant root-cause analysis and What-If profiling.