
Finding the right 'frames' to establish data quality context

It's well established at this point that if you want your business to take significant steps forward in efficiency and productivity through analytics, you'd be best served by focusing on good, clean data. For retail organizations, for example, using data can be a tremendous boon, as it can help salespeople fine-tune their approach and sell consumers the products they demand. But if you're working with bad data - addresses misspelled, information outdated, elements poorly transcribed - you can be steered in the wrong direction.

That's why data quality is important. For companies that lack it, there's a constant risk of major missteps that can cost them serious time and money. But what exactly does it mean to have high-quality data? What elements need to be clean and accurate? When do they need to be verified? What level of quality is considered high enough?

There's no one answer to any of these questions that's set in stone. Each company that works with data needs to set its own standards, and these rules may vary depending on the business's specific needs. In other words, everyone should strive for better data quality, but they first need to establish a good frame of reference.

Data quality guru Jim Harris is of the opinion that having the right frame of reference is crucial. According to the OCDQ Blog, he believes that without context, it's difficult to make any headway through the use of data.

"Frames of reference communicate the requirements of all data users, allowing impact analysis to be performed and stronger business cases to be built for data quality improvements," Harris explained. "The bottom line is even when real-world alignment makes data fit for the purpose of every use, you still need to keep track of, and track changes in, each use. To keep the data supporting all your business objectives in context, get framed for data quality."

To that end, there are a few ground rules that need to be set when striving for data quality.

What data elements are most important?
Depending on your specific operations, your business is probably collecting a great deal of customer data. It may include information about technology use, spending habits or communication practices. At the very least, you're mining for basic information such as people's phone numbers and email addresses. It's important to evaluate which of these elements matter most. If your goal is to improve engagement with online customers, then email verification might be your No. 1 target. If snail mail is more your speed, you may focus more on street addresses. It all depends on your objectives.

How high are your standards?
Say you've already figured out which data needs fixing, be it addresses or emails or anything else. The next question is: How good is good enough? Will you be content with getting your data clusters to the point of 90 percent accuracy, or is 95 percent a better goal? Setting standards is important because it helps define your future plans. If you hit your target, you can be content with the status quo, but if you fall short, you know it's time to invest additional money and manpower.
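As a rough illustration of measuring data against a target, here is a minimal Python sketch. The function names, the sample emails and the pattern are all hypothetical, and the pattern is only a loose syntax check, not true email verification.

```python
import re

# Hypothetical, deliberately loose syntax check; real email verification
# would also confirm that the mailbox actually exists.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def looks_like_email(value):
    """Return True when the value has the basic shape of an email address."""
    return bool(EMAIL_PATTERN.match(value))


def accuracy_rate(values, is_valid):
    """Return the share of values that pass the is_valid check (0.0 to 1.0)."""
    values = list(values)
    if not values:
        return 0.0
    passed = sum(1 for value in values if is_valid(value))
    return passed / len(values)


def meets_target(values, is_valid, target=0.95):
    """Return True when the measured accuracy reaches the chosen target."""
    return accuracy_rate(values, is_valid) >= target


# Made-up sample data for illustration only.
emails = ["ana@example.com", "bob@example", "cy@example.org", "not an email"]
rate = accuracy_rate(emails, looks_like_email)
print(f"accuracy: {rate:.0%}")
print("meets 90% target:", meets_target(emails, looks_like_email, target=0.90))
```

Passing the target in as a parameter keeps the "how good is good enough" decision separate from the measurement itself, so the same check can serve a 90 percent goal or a 95 percent one.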

Where are your essential checkpoints?
You need to map out a specific plan that defines exactly how you'll check for data quality. Some prefer to verify their information immediately at the point of collection. Others prefer to check data in transit, validating entries as they're transferred from one system to another. Still others prefer periodic spot-checks every few months.
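The three checkpoints could be sketched in Python roughly as follows. Everything here is an assumption made for illustration: the field names, the completeness rule and the function names are hypothetical, and a real system would apply its own validation rules.

```python
import random


def is_complete(record, required=("email", "street_address")):
    """Hypothetical rule: every required field must be present and non-empty."""
    return all(str(record.get(field) or "").strip() for field in required)


def check_at_collection(record):
    """Checkpoint 1: reject a bad record at the moment it is entered."""
    if not is_complete(record):
        raise ValueError(f"incomplete record: {record}")
    return record


def check_in_transit(records):
    """Checkpoint 2: split records into accepted and rejected during a transfer."""
    accepted, rejected = [], []
    for record in records:
        (accepted if is_complete(record) else rejected).append(record)
    return accepted, rejected


def spot_check(records, sample_size, seed=None):
    """Checkpoint 3: estimate quality from a random sample of stored records."""
    rng = random.Random(seed)
    sample = rng.sample(records, min(sample_size, len(records)))
    if not sample:
        return 0.0
    return sum(is_complete(record) for record in sample) / len(sample)


# Made-up records for illustration only.
records = [
    {"email": "ana@example.com", "street_address": "1 Main St"},
    {"email": "", "street_address": "2 Oak Ave"},
]
accepted, rejected = check_in_transit(records)
print(len(accepted), "accepted;", len(rejected), "rejected")
print("spot-check pass rate:", spot_check(records, sample_size=2, seed=0))
```

Because all three checkpoints share the same `is_complete` rule, they can be combined without the definition of a good record drifting between systems.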

The best strategy may well be some combination of all of the above. In any event, starting with a specific framework will set you on the path to success.