When the information technology industry first caught wind that data was rapidly gaining interest, mass and speed, professionals may have been so focused on the potential that they didn't see the ways in which it could possibly get off track.
In a recent article for TDWI, professor Gian Di Loreta recounts a conversation he had in the late 1990s, when a liberal arts professor warned that this would turn into the age-old debate of quantity versus quality. At the time, Di Loreta dismissed the warning, assuming that the technical schools of thought would build in the solutions needed to eliminate quality issues. Fast-forward 15 years, and data quality is a major concern for all users. It is gaining attention as big data becomes closely tied to corporate success.
From insurance companies that are beginning to offer online services to marketing firms looking to launch highly targeted email campaigns, businesses across sectors are cracking open their massive databases and hoping to spot golden nuggets of actionable information. However, they must confirm that all of the information stored in their platforms is correct before they reach in and expect to pull out helpful insights.
The following are four ways to weed out "dirty data" and leave behind the content that's clean and ripe for the picking.
1. Get your information squeaky clean
If companies want to know how to improve their data, they must first make an honest assessment of the information they already have. CRM Search suggests that businesses use data quality tools to check their content, such as address management programs to verify email addresses, names and contact data.
Information that shows up as incomplete, inaccurate or outdated should be fixed or ditched. In an era that's focused on hoarding data, it might seem counterintuitive to get rid of content. However, holding onto dirty data will only create opportunities for further contamination.
Once data users have a clean slate, they can take the steps necessary to identify and amend the root causes of the errors.
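As a rough illustration of that first assessment, the Python sketch below sorts contact records into "incomplete," "inaccurate," "outdated" or "clean." It is not how any particular data quality tool works. The Contact fields, the loose email pattern and the one-year cutoff for "outdated" are all assumptions made for the example.

```python
import re
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


# Hypothetical record layout; real CRM fields will differ.
@dataclass
class Contact:
    name: str
    email: str
    last_verified: Optional[date]


# Deliberately loose pattern: it catches obvious typos, not every invalid address.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

# Assumed cutoff for treating a record as outdated.
MAX_AGE = timedelta(days=365)


def triage(contact: Contact, today: date) -> str:
    """Label a record 'incomplete', 'inaccurate', 'outdated' or 'clean'."""
    if not contact.name.strip() or not contact.email.strip():
        return "incomplete"
    if not EMAIL_PATTERN.match(contact.email.strip()):
        return "inaccurate"
    if contact.last_verified is None or today - contact.last_verified > MAX_AGE:
        return "outdated"
    return "clean"
```

Records that come back with anything other than "clean" are the candidates to fix or ditch.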
2. Create data standards
Individuals throughout the United States and around the world have different standards for recording information. Some write out the word "Street" when they are asked for address information, while others use the "St." abbreviation, the source adds. This can increase variation and cause confusion within a system.
It might be easy to figure out what a person means in those circumstances, but there are more significant differences that can be problematic. For instance, oil companies in Russia record gamma ray data in the opposite way from those in western countries, according to the Digital Energy Journal. Creating and communicating standards can eliminate some of the issues that crop up as a result of these divergences.
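A standard only helps if it is applied consistently, and the "Street" versus "St." case is simple enough to sketch in Python. The example assumes a house rule of spelling street types out in full. The abbreviation table is illustrative and far from complete, and only the last word of the address is expanded so that names such as "St. James" are left alone.

```python
# Assumed house standard: spell out street-type words in full.
# This mapping is illustrative only.
SUFFIXES = {
    "st": "Street",
    "ave": "Avenue",
    "rd": "Road",
    "blvd": "Boulevard",
}


def standardize_address(raw: str) -> str:
    """Collapse extra whitespace and expand a trailing street-type abbreviation."""
    words = raw.split()
    if not words:
        return ""
    # Compare the final word without its trailing period, ignoring case.
    key = words[-1].rstrip(".").lower()
    if key in SUFFIXES:
        words[-1] = SUFFIXES[key]
    return " ".join(words)
```

With this rule, "12 Main St." and "12  Main Street" are stored as the same value, so they no longer look like two different addresses.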
3. Perform email verification at the source
When companies confirm data quality at the source, there is less of a chance that end users will recognize flaws and need to spend time tracing back the cause and making fixes, CRM Search reports. Websites that feature double opt-ins ask that consumers confirm their information before sending it through.
This is an effective way to catch incorrectly typed titles, email addresses and other basic information before it's added to contact lists and customer relationship management (CRM) databases.
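The double opt-in idea can be sketched in a few lines of Python. This is a minimal in-memory illustration, not a production design: the SignupList class and its method names are hypothetical, the email pattern is intentionally loose, and actually sending the confirmation message is left out.

```python
import re
import secrets

# Loose syntax check: it rejects obvious typos, not every invalid address.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


class SignupList:
    """Hypothetical double opt-in list: addresses count only once confirmed."""

    def __init__(self) -> None:
        self.pending = {}       # confirmation token -> email address
        self.confirmed = set()  # addresses whose owners confirmed them

    def request_signup(self, email: str) -> str:
        """Validate the address and return a token to send in a confirmation email."""
        email = email.strip()
        if not EMAIL_PATTERN.match(email):
            raise ValueError(f"not a plausible email address: {email!r}")
        token = secrets.token_urlsafe(16)
        self.pending[token] = email
        return token

    def confirm(self, token: str) -> bool:
        """Move an address to the confirmed set; unknown or reused tokens fail."""
        email = self.pending.pop(token, None)
        if email is None:
            return False
        self.confirmed.add(email)
        return True
```

Only addresses in the confirmed set would be passed on to the contact list or CRM database, so a mistyped address that never receives its confirmation message never gets in.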
4. Assign a point person
Most organizations will be hard-pressed to find an employee who is willing to take the blame for poor data quality that has been weakening the company's bottom line. Rather than assigning any blame, decision-makers can assign responsibility instead. Picking a data quality point person and holding him or her accountable is often an effective method for reducing errors and improving accuracy. That individual will need to enforce data standards, audit use of quality assurance tools and secure entry points.