Without convincing, tangible evidence of data improvement, data quality practitioners risk living out the famous adage of “running around like headless chickens”. Data quality management offers a form of reassurance in the data quality threshold, typically calculated as the percentage of poor-quality records against the total number of records in a data entity. However, if we merely analyse volumes of data to determine whether we stay green or cross into the red on our data quality dashboards, we come ever so close to turning data quality into a bean-counting exercise rather than a valuable business improvement technique.
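To make the threshold concrete, here is a minimal sketch of the calculation described above; the entity figures and the five per cent dashboard threshold are invented for illustration:

```python
def poor_quality_pct(poor_records: int, total_records: int) -> float:
    """Percentage of poor-quality records in a data entity."""
    return 100.0 * poor_records / total_records

# A dashboard typically turns "red" once the percentage crosses an
# agreed threshold; the value here is purely illustrative.
THRESHOLD_PCT = 5.0

pct = poor_quality_pct(poor_records=1_200, total_records=60_000)
status = "red" if pct > THRESHOLD_PCT else "green"
print(f"{pct:.1f}% poor quality -> {status}")  # prints: 2.0% poor quality -> green
```

The point of the article is precisely that this number alone tells you very little: the same two per cent can mean wildly different things on different entities.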
As a data quality manager, how do you decide that two per cent poor quality measured on one data entity is more worthy of your limited budget than the same two per cent measured on another? And what makes a data entity with 40 per cent poor data quality a more deserving improvement cause than one with 20 per cent? Is it just a battle of numbers? What defines the threshold at which data is no longer good enough, or fit for purpose?
Prioritising data quality is not as simple as counting beans; we need to apply a little more thought and process to make it a science, and we can draw inspiration from best practice in other industries. Taking examples from Lean Manufacturing and Agile Software Development, we can evolve from reactive record counting and cleansing to proactive management and control of quality. This article explores some of these principles and how they can be adapted for data quality management.
Wikipedia defines “Lean” as a production practice that considers the expenditure of resources for any goal other than the creation of value for the end customer to be wasteful, and thus a target for elimination. When applied in the data quality world, lean principles ensure that you are not running around like headless chickens and know exactly why certain data quality issues are tackled before others and why some data is worth so much more than others. You understand the true business impact of poor data quality, that even two per cent poor data quality in one data entity can be the difference between life and death compared to 40 per cent poor quality on another data entity.
Similarly, Wikipedia states that “Agile” Software Development promotes adaptive planning, evolutionary development and delivery, a time-boxed iterative approach, and encourages rapid and flexible response to change.
Agile principles ensure that you can deliver value through improvement in a quick and cost-effective manner. Agile means you have the necessary means readily available to demonstrate what good data can look like, and requires collaboration and innovation in process and technology.
While Lean and Agile may sound like popular buzzwords, the ethos they represent is all about justification and efficiency. There are many documented data quality measures and metrics, and peer and industry pressure can tempt teams to implement them all at once, or simply to implement whatever their data quality technology provides. Lean and Agile principles can help already stretched data quality teams sift out what is critical to the organisation and deliver value through data improvement in a quick and adaptive manner.
The following three-step process can help you embark on a more Lean and Agile data quality journey. While it will not make you an expert in the principles of Lean and Agile, I hope you can start reaping the benefits simply by thinking differently about data quality.
Step 1. Justify the Why
Before you implement any data quality measurement programme, can you justify why you are measuring data quality? Each data quality metric should be linked to a business objective, which could be regulatory, growth or operationally based. The justification behind any data quality measurement is important, as it will help you determine the priority for fixing data quality problems as and when they occur. Answering the question ‘why’ can reveal that accurate emails underpin a multichannel growth strategy that supports new business and retention, or that having accurate dates of birth helps you comply with banking regulations such as “Know Your Customer”.
Step 2. Quantify the What
Can you quantify what each percentage of poor data quality means to you as a business, beyond counts of records or data elements? For example, the cost of poor email data might be linked to average sales for existing customers, but to the propensity to buy through email channels for prospect data. One size does not fit all, and it is critical to link business KPIs and measures with data quality to derive the maximum business benefit. Even a small number such as two per cent poor data quality could represent a missed growth opportunity or, worse, be linked to a huge regulatory fine.
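The idea of weighting each entity's poor-quality percentage by its business cost can be sketched as follows. Every entity name, rate and unit cost below is hypothetical, invented purely to show how a two per cent problem can outrank a 40 per cent one:

```python
# entity: (poor_pct, total_records, assumed_cost_per_poor_record_gbp)
entities = {
    "customer_email": (2.0, 500_000, 12.50),   # e.g. lost cross-sell revenue
    "prospect_email": (40.0, 80_000, 0.40),    # e.g. lower propensity to buy
    "customer_dob":   (2.0, 500_000, 50.00),   # e.g. KYC/regulatory exposure
}

def business_cost(poor_pct: float, records: int, unit_cost: float) -> float:
    """Estimated exposure = poor records x assumed cost per poor record."""
    return records * (poor_pct / 100.0) * unit_cost

# Prioritise by estimated business cost, not by raw percentage.
ranked = sorted(entities.items(),
                key=lambda kv: business_cost(*kv[1]),
                reverse=True)
for name, args in ranked:
    print(f"{name}: £{business_cost(*args):,.0f} estimated exposure")
```

With these invented figures, the date-of-birth entity at two per cent poor quality tops the list, ahead of the prospect email entity at 40 per cent, which is exactly the kind of prioritisation the raw percentages hide.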
By justifying the why and quantifying data quality through business measures, you become LEANER in prioritising data quality.
Step 3. Plan the How
Provide your staff with the right processes and technologies that enable quicker routes to improvement. Having understood, through the Lean principles, which data is important to you, data quality teams should be able to proactively detect known data quality issues and prototype what it takes to fix poor-quality data. You do not want to turn data quality into a repetitive chore for known issues.
Collaborative data prototyping technology is one of the rarely used weapons in the battle against poor data quality. Equip your data quality teams with the tools and reference data that help them design data validation and transformation rules that can then be implemented in your live systems. Your data quality teams can then spend their time on the more critical discovery and analysis of unexpected issues.
By investing in preventative solutions for poor data quality through collaborative prototyping, data quality teams can be more AGILE in delivering their data improvement targets.
Experian Data Quality are presenting a half-day workshop on 4th November 2013 at the IRM Enterprise Data and Business Intelligence Conference. Delegates will learn: