Experian's global data quality research carried out in 2015 showed that the strategic importance of data in business is recognised by most organisations, principally in the realm of improved decision-making. Organisations are now looking to “monetise” their data assets in more explicit and measurable ways in order to prioritise where investment will drive greatest returns, reduce risk and minimise loss.
The monetisation of data is not a new concept: for some time now, retailers have been using information about online searches to provide targeted, real-time customer advertising. However, technology (such as Experian Pandora) is now making other forms of data monetisation accessible to every organisation. This article examines the place of data quality in this emerging trend and describes a pragmatic four-step process for putting a value on data issues.
Experian’s research also showed that while 93% of organisations are actively trying to improve their data quality, they still believe that poor data quality is costing them about 23% of their revenue. Organisations believe that the three most important enablers of data quality improvement are better business collaboration, the quantification of value to the business and the use of appropriate software tools.
In the past few years, technical teams have often been able to provide organisations with dashboards showing error counts and even percentage scores for data quality. Unfortunately, such measurements are considered next to useless because they are not actionable; they don’t indicate impact on the bottom line, priority, cost to fix, or root cause. Prioritising different quality measurements, weighting measurements by business unit or calculating average quality scores per system are all steps towards associating business value, but they fall short because they don’t account for the value of specific customers, products, transactions and so on.
The monetisation of (non) quality can bring some welcome objectivity to this area, and based on my own experience, I can recommend using a method composed of three simple measurement tasks which then feed into a fourth task where we calculate the return on investment and make objective decisions.
First, the data issues must be found and given a high-level categorisation to establish responsibility and an initial level of importance. The three broad areas of impact are revenue, cost and risk, and I will consider the potential impact of poor quality asset data to illustrate this. Incorrect location information could lead to missed revenue because the sales team believed a necessary asset was unavailable. Missing or incorrect asset data could lead to early or extra maintenance, or to missed maintenance with the risk of breakdown, which would result in repair costs as well as safety issues or fines from regulators due to disruption to customers.
When it is not easy to establish an accurate monetary amount for the impact of bad data, a relative business weighting can be used, provided it is specified by subject matter experts. How important would it be, for example, to know when there are profane comments or unprotected credit card details in your customer system? If you ask a business person, they always know which records, which system or which information is most important, and by asking the right questions this valuable prioritisation can be formalised and automated as rules in a software tool.
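To illustrate how such a weighting might be formalised, here is a minimal sketch in Python. It is not how any particular tool (including Experian Pandora) works; the rule names, the weights, the placeholder word list, the card-number pattern and the record fields (`notes`, `postcode`) are all hypothetical and would in reality come from subject matter experts.

```python
import re

# Hypothetical pattern: 13 to 16 digits, optionally separated by spaces or hyphens.
CARD_PATTERN = re.compile(r"\b(?:\d[ -]?){13,16}\b")

# Placeholder word list; a real one would be supplied by the business.
PROFANITY = {"damn", "hell"}


def has_card_number(record):
    return bool(CARD_PATTERN.search(record.get("notes") or ""))


def has_profanity(record):
    words = re.findall(r"[a-z]+", (record.get("notes") or "").lower())
    return any(word in PROFANITY for word in words)


def missing_postcode(record):
    return not record.get("postcode")


# (rule name, relative business weight, test). The weights are illustrative only.
RULES = [
    ("unprotected card number", 10, has_card_number),
    ("profane comment", 3, has_profanity),
    ("missing postcode", 1, missing_postcode),
]


def score(record):
    """Return the total weight and the names of the rules a record breaks."""
    hits = [(name, weight) for name, weight, test in RULES if test(record)]
    return sum(weight for _, weight in hits), [name for name, _ in hits]


records = [
    {"id": 1, "notes": "card 4111 1111 1111 1111 on file", "postcode": None},
    {"id": 2, "notes": "customer said damn slow", "postcode": "AB1 2CD"},
    {"id": 3, "notes": "", "postcode": ""},
]

# Highest weighted records first, so the most important issues surface at the top.
ranked = sorted(records, key=lambda r: score(r)[0], reverse=True)
```

Sorting by total weight gives a relative ranking rather than a monetary value, which is the point of this fallback: it orders the issues by business importance when an amount cannot be estimated.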
Second, the cost to fix the data must be estimated. This can only be done by analysing the issues in detail. The most reliable estimates are based on fixes which either use the existing business applications or are prototyped for each error type, using tools which let their users transform and cleanse data interactively in an iterative data workshop. Analysing the bad data can often suggest a plan for fixes, for example tackling the issues by geography, by product line or by distribution channel. The approach to fixes must remain pragmatic, and often they involve a mixture of 95% automated and 5% manual work.
Fixing the data you have won’t eliminate the issues with current business processes which caused the data issues in the first place. To ensure on-going data quality, the cost to fix these processes must also be estimated, and once again this is based on detailed analysis of the issues. This time, the focus is on the different types of problems and their common characteristics and trends, in order to establish where and why things have gone wrong, that is, their root cause. Fixing the process is likely to focus more on people than technology, and solutions vary. Examples I have seen include a combination of changes to computer systems, changes to manual business processes and staff training. As expected, pragmatic trade-offs are common: for example, improved staff training may be chosen as the best way to avoid future data issues instead of costly and lengthy changes to a legacy computer system.
Finally, the measurements from these three tasks are brought together for each data issue, providing estimated costs and return on investment for each. Occasionally, there is no value to be gained from fixing the data. For example, if a contract only allows for billing within 3 months of delivering a customer service, there is no point fixing the billing data from last year. Even in such situations, however, it may well be worth fixing the business process. Decisions can now be made on prioritisation, and the amount of resources assigned to such activities can be based on business importance and direct comparison with other potential projects. By automating the measurements, progress can be monitored, the impact of the fixes measured, and the project spend evaluated and justified.
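The arithmetic of this fourth step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the issue names and all monetary figures are invented, the `DataIssue` class is hypothetical, and return on investment is taken to be the simple ratio (impact avoided minus total fix cost) divided by total fix cost, which is one common definition rather than a prescribed one.

```python
from dataclasses import dataclass


@dataclass
class DataIssue:
    name: str
    impact: float            # task 1: estimated value lost to the issue
    data_fix_cost: float     # task 2: estimated cost to fix the existing data
    process_fix_cost: float  # task 3: estimated cost to fix the process

    @property
    def total_cost(self) -> float:
        return self.data_fix_cost + self.process_fix_cost

    @property
    def roi(self) -> float:
        # Assumes total_cost is greater than zero.
        return (self.impact - self.total_cost) / self.total_cost


# Invented figures, for illustration only.
issues = [
    DataIssue("stale billing data", 5_000, 15_000, 5_000),
    DataIssue("wrong asset location", 120_000, 20_000, 10_000),
    DataIssue("missing maintenance dates", 80_000, 30_000, 10_000),
]

# Task 4: rank the issues so investment goes where the return is greatest.
ranked = sorted(issues, key=lambda issue: issue.roi, reverse=True)

for issue in ranked:
    print(f"{issue.name}: cost {issue.total_cost:,.0f}, ROI {issue.roi:.0%}")
```

A negative ROI, as for the invented billing issue here, flags a fix that would cost more than it returns. Keeping the data and process costs as separate fields makes it possible to evaluate them separately where, as in the billing example above, only the process fix is worthwhile.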
Following these four steps will enable organisations to get a more accurate view of the value of data issues and therefore provide important insight into where to focus data quality initiatives for greatest return. In the second part of this series I will go on to consider an example scenario which demonstrates how this can work in practice.