Data quality, explained in a 3-step process

Richard Jones

As technology improves and analytics becomes a bigger focus for every business, corporate leaders are beginning to realize the importance of data quality. It's obviously vital - if you're going to great lengths to analyze your customers and figure out how to meet their needs, what good will it do if all your data is wrong?

An error in quality can be any number of things - it might be a misspelled name or a wrong address, or it might be a particular demographic or financial detail that's incorrect. Any of the above can lead to companies making faulty assumptions about their patrons, which can only lead to bad news down the road.

It's therefore clear that something needs to be done about "dirty data." But what? If you ask the data curators in many corporate offices, they'll tell you that a reactive strategy is the easiest way to go. Shuffle through your existing data, find the mistakes, correct them and move on. That's the obvious solution, and on paper it makes sense.

Except there's an issue. According to Business 2 Community, that's not the most efficient way to go. Data management expert Michael Farrington called attention to a recent study from SiriusDecisions about the way companies handle their customer relationship management records. The research revealed that fixing a bad CRM record tends to cost around $10. The cost of preventing that bad record from being created in the first place? Far lower - more like $1.

This makes intuitive sense. After all, fixing an erroneous piece of information is a lot of work. It requires exhaustively searching through databases, finding the mistakes, proving that they are in fact mistakes, going back and verifying the correct info, and then resubmitting it all. All that time adds up, and companies are going to pay handsomely for the labor.

Preventing a mistake before it happens, though? Far easier. There's no tiresome searching involved - it's just collect and clean.

Here's a recommended three-step process for efficiently ensuring data quality:

Clean the data
The key is to gather information from people and cleanse it at the initial point of collection - but the tricky part is that different users' points of collection may vary. One person might send information in the mail, using a handwritten form, while another might submit it online and a third might use a mobile device. They all might use different formats and abbreviations to spell out their information. A good way to clean data is to convert everything, as it's collected, into a single standardized format so it can be easily read and used later.
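As a rough illustration, point-of-collection cleansing can be as simple as normalizing each incoming field before it's stored. The field names and normalization rules below are assumptions for the sake of the example, not a standard:

```python
import re

# Illustrative lookup for expanding spelled-out state names (assumed, partial).
STATE_ABBREVIATIONS = {"new york": "NY", "california": "CA", "texas": "TX"}

def clean_record(raw: dict) -> dict:
    """Normalize one submitted record into a single standard format."""
    record = {}
    # Collapse stray whitespace and standardize capitalization of names.
    record["name"] = " ".join(raw.get("name", "").split()).title()
    # Email addresses are case-insensitive, so store them lowercased.
    record["email"] = raw.get("email", "").strip().lower()
    # Map spelled-out states to two-letter codes; uppercase anything else.
    state = raw.get("state", "").strip().lower()
    record["state"] = STATE_ABBREVIATIONS.get(state, state.upper())
    # Keep digits only, so "(555) 123-4567" and "555.123.4567" match.
    record["phone"] = re.sub(r"\D", "", raw.get("phone", ""))
    return record

print(clean_record({"name": "  jane  DOE ", "email": " Jane@Example.COM ",
                    "state": "new york", "phone": "(555) 123-4567"}))
```

Whatever channel a submission comes from - paper form, website, or mobile device - it passes through the same normalizer, so every record lands in the database in the same shape.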

Protect the data
Data quality is more than a one-time thing, though. Once information has been collected, it needs to be protected down the road. This means regularly reviewing records and updating anything that has changed. If someone moves, that's a new address. If they get married, it might mean a new last name. Pieces of data are becoming outdated and inaccurate all the time. The challenge is in keeping up.
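One simple way to keep up is to track when each record was last verified and flag anything that hasn't been confirmed within a set interval. This is a minimal sketch, assuming a `last_verified` field and a one-year review cycle, both of which are illustrative choices:

```python
from datetime import date, timedelta

# Assumed review interval: re-confirm every record at least once a year.
REVERIFY_AFTER = timedelta(days=365)

def records_due_for_review(records, today=None):
    """Return the records whose last verification is older than the interval."""
    today = today or date.today()
    return [r for r in records if today - r["last_verified"] > REVERIFY_AFTER]

customers = [
    {"name": "Jane Doe", "last_verified": date(2014, 1, 15)},
    {"name": "John Roe", "last_verified": date(2015, 6, 1)},
]
print(records_due_for_review(customers, today=date(2015, 7, 1)))
```

Running a check like this on a schedule turns data protection from an occasional scramble into a routine task: stale records surface on their own, and staff only re-verify what's actually overdue.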

Enrich the data
Thirdly: Is it really enough just to have data that's free from mistakes? Simply having an entry without typos is a fairly low standard to settle for. A superior strategy is this: Once you have an accurate file on a customer, seek to enhance it. Reel in more information that will give you a greater understanding of their complete profile. Include an email address, a Twitter handle, a few purchase history details that might better inform you. Every little piece of knowledge helps, so a bit of data enrichment can take you a long way.
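In code, enrichment can amount to merging supplemental fields into a clean base record without disturbing what's already been verified. The field names here (`twitter`, `email`, purchase details) are assumptions for illustration:

```python
def enrich(record: dict, extra: dict) -> dict:
    """Merge supplemental fields into a record without overwriting it."""
    enriched = dict(record)
    for key, value in extra.items():
        # Only add new facts; never overwrite verified base fields.
        if key not in enriched and value:
            enriched[key] = value
    return enriched

base = {"name": "Jane Doe", "state": "NY"}
extra = {"twitter": "@janedoe", "email": "jane@example.com", "name": "J. Doe"}
print(enrich(base, extra))
# "name" stays "Jane Doe"; the Twitter handle and email are added.
```

The design choice worth noting is the "never overwrite" rule: enrichment sources are usually less trustworthy than data you've already cleaned and verified, so they should only ever fill gaps, not replace confirmed values.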