
Data scientists need good metrics for assessing quality

Paul Newman Archive

It would be difficult for anyone in the business world - especially in the areas of marketing or customer support - to disagree that data quality is important. Companies need to gather accurate information about their customers and the economy as a whole. Doing so can help them fine-tune a variety of practices.

The challenge, though, is finding the right metrics to gauge that quality. Everyone wants to have accurate knowledge of the market, but how accurate is accurate enough? How is that measured?

Recent research has shed some light on that very topic. Data experts are looking to get to the bottom of the quality issue, asking the right questions and probing for answers.

Searching for completeness
According to TechTarget, today's data experts are looking for "completeness" in their pursuits. They want the information they gather to tell the whole story in a way that's timely, valid and consistent with all other verifiable information that's out there. The news source spoke with Laura Sebastian-Coleman, author of "Measuring Data Quality for Ongoing Improvement" and creator of Optum Insight's Data Quality Assessment Framework, for some background on how businesses can improve their data quality strategies.

"What does it mean to measure the completeness - or timeliness, validity, consistency or integrity - of data at the beginning of data processing?" Sebastian-Coleman asked. "During data processing? After processing? What should measurement results look like? How would measurements detect when the data was not in the desired condition?"

These are the questions that business leaders need to answer as they move forward. One of the most practical concerns they face is where exactly in the data lifecycle to measure quality. Measurement can happen as data is collected, as it's transferred from place to place, or simply at regular intervals along the way.
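As a rough illustration of what such measurements might look like, here is a minimal Python sketch that scores a small batch of records for completeness, validity and timeliness. It is not drawn from Sebastian-Coleman's framework or from TechTarget; the record fields, the email rule, the 30-day freshness window and all function names are assumptions made up for this example.

```python
from datetime import date

# Hypothetical customer records; the field names are assumptions for illustration.
records = [
    {"id": 1, "email": "ann@example.com", "updated": date(2024, 1, 10)},
    {"id": 2, "email": None, "updated": date(2024, 1, 12)},
    {"id": 3, "email": "not-an-email", "updated": date(2023, 6, 1)},
    {"id": 4, "email": "bob@example.com", "updated": None},
]


def completeness(rows, field):
    """Share of rows in which the field is populated (not None)."""
    if not rows:
        return 0.0
    return sum(1 for r in rows if r.get(field) is not None) / len(rows)


def validity(rows, field, is_valid):
    """Share of populated values that pass the supplied rule."""
    values = [r[field] for r in rows if r.get(field) is not None]
    if not values:
        return 0.0
    return sum(1 for v in values if is_valid(v)) / len(values)


def timeliness(rows, field, as_of, max_age_days):
    """Share of rows whose date field falls within max_age_days of as_of."""
    if not rows:
        return 0.0
    fresh = sum(
        1
        for r in rows
        if r.get(field) is not None and (as_of - r[field]).days <= max_age_days
    )
    return fresh / len(rows)


def looks_like_email(value):
    # Deliberately crude rule, assumed for the example; real checks would be stricter.
    return "@" in value and "." in value.split("@")[-1]


report = {
    "email_completeness": completeness(records, "email"),
    "email_validity": validity(records, "email", looks_like_email),
    "update_timeliness": timeliness(records, "updated", date(2024, 1, 31), 30),
}
print(report)
```

The same three functions could be run at collection, after a transfer, or on a schedule, and the scores compared from one stage to the next to see where quality drops.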

Applying knowledge in the real world
The next step, of course, is finding ways for organizations to apply the lessons they learn from the high-quality data they're working with.

Sebastian-Coleman noted that specific strategies for applying data vary depending on the industry in question. In health, it's a matter of communicating the right information to many different parties - the doctors and nurses, the insurance companies and the patients. In retail, it's a matter of using data to assess how merchants can beat out their rivals.

For all organizations, though, finding high-quality information should be a priority. People must know how to find good data and measure just how good it is.
