Your data’s diet is more important than you think

We’ve all heard the saying “you are what you eat” in reference to our morning donut and coffee. But did you ever consider that the same principle applies to your organization’s data collection processes? Much like the saturated fats from bad-for-you food that clog up your arteries, unchecked bad data can enter your database and compromise your ability to draw on that information in the future. That’s why it’s important to put measures in place to validate and correct bad data at every point of entry. Before your database has a cardiac event (and increases your stress levels), let’s talk about your data’s diet.

Is your data consuming locally-sourced and organic ingredients?

This may seem like a tongue-in-cheek question, but the general idea here is to identify whether you know where your data is coming from. Do you have insight into all of the various channels that bring data into your organization? These channels might include online forms that your marketing team uses to collect customer information, e-commerce transactions that include payment and shipping data, and call centers that track customer service requests.

While these are a few examples, your organization might have several dozen channels through which data flows into the organization, including third-party data suppliers. In order to prevent bad data from entering your systems, you will want to deploy data validation tools on each and every one of these channels. This means using email, address, and telephone validation tools on your website, in your call centers, and anywhere else data is collected.
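To make the idea of checking data at the point of entry concrete, here is a minimal sketch in Python. The function name `validate_contact` and the field names `email`, `phone`, and `postal_code` are assumptions made for illustration, and the checks only look at format. A real validation tool would go further, for example confirming that an address is deliverable.

```python
import re

# Loose format check only: something@something.something, no spaces.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def validate_contact(record):
    """Return the names of the fields that fail a basic format check.

    `record` is a dict of strings; the field names are hypothetical.
    """
    problems = []

    email = record.get("email", "").strip()
    if not EMAIL_RE.match(email):
        problems.append("email")

    # Strip punctuation and spaces, then check for a plausible digit count.
    digits = re.sub(r"\D", "", record.get("phone", ""))
    if not 10 <= len(digits) <= 15:
        problems.append("phone")

    if not record.get("postal_code", "").strip():
        problems.append("postal_code")

    return problems


if __name__ == "__main__":
    incoming = {"email": "pat@example", "phone": "555-0100", "postal_code": "02110"}
    print(validate_contact(incoming))
```

The same function could sit behind a web form, a call-center screen, or a supplier file import, so every channel applies the same rules before a record is stored.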

If you don’t know exactly where your data is coming from, you’ll want to work with various departments within your organization to understand the full picture before you can even think about addressing quality issues. After all, you wouldn’t eat something that just appeared on your desk, would you?

Is your data digestible?

Much like food that goes straight to your gut, all of the data that enters your organization needs a place to go. Typically, your data enters a storage repository such as a data warehouse or a data lake. A data warehouse stores structured data so that business users can access it at any time. In a data lake, however, structured and unstructured data can coexist, and IT is usually needed to generate reports from the data. While data warehouses are ideal for business users, they’re typically more expensive to maintain in the long run and don’t handle the volume of data that you can store in a data lake.

Whichever repository your organization uses to store its data assets, you’ll want to ensure that the information stored there is digestible. Because you’ve taken the step to ensure all data entering your system from outside channels is accurate and actionable, you can turn your attention to profiling, transforming, and monitoring the information in your repository. This will provide a second layer of defense against bad data lurking in your system. If your organization uses a data lake, for instance, it can be very difficult to get a holistic view of your datasets, making it hard to tell the good from the bad. By running a full-volume analysis on your database, you’ll be able to profile massive amounts of data at once to identify anomalies in your records.
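As a rough illustration of what profiling looks like, the sketch below counts missing and distinct values for each column in a set of records. The `profile` function and the sample column names are hypothetical, and a dedicated profiling tool would run this kind of summary across the full repository rather than an in-memory list.

```python
from collections import Counter


def profile(rows):
    """Summarize each column: missing values, distinct values, most common value.

    `rows` is a list of dicts. Values are compared case-insensitively
    after trimming whitespace.
    """
    columns = {}
    for row in rows:
        for name, value in row.items():
            stats = columns.setdefault(name, {"missing": 0, "values": Counter()})
            if value is None or str(value).strip() == "":
                stats["missing"] += 1
            else:
                stats["values"][str(value).strip().lower()] += 1

    return {
        name: {
            "missing": stats["missing"],
            "distinct": len(stats["values"]),
            "most_common": stats["values"].most_common(1),
        }
        for name, stats in columns.items()
    }


if __name__ == "__main__":
    sample = [
        {"state": "MA", "email": "a@x.com"},
        {"state": "ma", "email": ""},
        {"state": "Mass.", "email": None},
    ]
    for column, summary in profile(sample).items():
        print(column, summary)
```

A column with many missing values, or with more distinct spellings than you would expect (such as "MA" and "Mass." for the same state), is a good candidate for closer review.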

Once you’ve identified the records you’d like to fix, you can involve the appropriate stakeholders at your business who would be best equipped to resolve the discrepancies. In addition, you can transform the data at this stage to meet your business requirements for standardization. Lastly, to ensure the quality of your data is upheld over time, you’ll want to set up business-defined rules to monitor your data quality and to alert you to any sudden dips. Any drop in data quality can signal a failure of validation at one of your channels of data capture, which should be investigated immediately.
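A monitoring rule can be as simple as comparing today's share of valid records with a recent average. The sketch below assumes you already track that share as a number between 0 and 1. The function name `quality_alert` and the default threshold of five percentage points are illustrative choices, not recommendations.

```python
def quality_alert(previous_rates, current_rate, max_drop=0.05):
    """Return True when the current valid-record rate dips too far below the recent average.

    `previous_rates` is a list of past rates (0.0 to 1.0) and
    `max_drop` is the largest dip tolerated before alerting.
    """
    if not previous_rates:
        # No baseline yet, so there is nothing to compare against.
        return False
    baseline = sum(previous_rates) / len(previous_rates)
    return (baseline - current_rate) > max_drop


if __name__ == "__main__":
    last_week = [0.98, 0.97, 0.99]
    if quality_alert(last_week, 0.90):
        print("Data quality dipped: check validation on each capture channel.")
```

Running a check like this per channel, rather than only on the combined total, makes it easier to see which point of entry has stopped validating properly.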

Is your data taking any supplements?

Just as a healthy dose of vitamins can help ward off illness and put a little pep in your step, supplementing your data with third-party information can unlock a world of insight. Commonly referred to as data enrichment, the process is actually quite simple. Working with a third-party vendor, you can append hundreds of attributes (such as occupation, estimated household income, marital status, and more) to your existing customer contact records. By leveraging data from self-reported information, public records, and historical retail purchases, data enrichment can help you to discover the attitudes, values, and motivations that drive your customers’ purchasing decisions.

By supplementing your data using third-party supplied information, you can make better and faster marketing decisions, target your most valuable customers, and more accurately predict their future behaviors. Sounds like one heck of a vitamin, right? It can be when leveraged appropriately. Like anything else, you need to watch out for low-quality data that these vendors might supply. Always work with trusted enrichment vendors who have solid reputations for accuracy, and always incorporate this third-party information into your data quality ecosystem to ensure it remains accurate.
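Here is a small sketch of how appended attributes might be merged into existing records while keeping an eye on quality. It assumes vendor records can be matched on an `email` field, which is a simplification, and the `enrich` function and field names are hypothetical. The sketch never overwrites a value you already hold, and it reports the match rate so that a weak vendor file is easy to spot.

```python
def _normalize(value):
    """Trim and lowercase a match key so 'A@x.com ' and 'a@x.com' compare equal."""
    return str(value or "").strip().lower()


def enrich(customers, vendor_records, key="email"):
    """Append vendor attributes to customer records and return (records, match_rate)."""
    lookup = {}
    for rec in vendor_records:
        norm = _normalize(rec.get(key))
        if norm:
            lookup[norm] = rec

    enriched, matched = [], 0
    for customer in customers:
        merged = dict(customer)
        match = lookup.get(_normalize(customer.get(key)))
        if match:
            matched += 1
            for field, value in match.items():
                # Keep first-party values; only fill fields the customer record lacks.
                if field != key and value not in (None, "") and not merged.get(field):
                    merged[field] = value
        enriched.append(merged)

    match_rate = matched / len(customers) if customers else 0.0
    return enriched, match_rate


if __name__ == "__main__":
    customers = [{"email": "A@x.com", "name": "Ann"}, {"email": "b@x.com", "name": "Bob"}]
    vendor = [{"email": "a@x.com", "occupation": "Teacher"}]
    records, rate = enrich(customers, vendor)
    print(records, rate)
```

Once merged, the appended fields can go through the same profiling and monitoring as the rest of your data, which is what keeps third-party information from quietly degrading.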

We have the tools to get your data into shape. Let us help you!
