
Quality for unstructured data 'remains unclassified'

The data quality characteristics of unstructured data remain largely ambiguous and unclassified, according to a sector commentator.

While there are well-established quality dimensions for structured and textual data, such as currency, completeness and timeliness, these do not translate readily into good data quality practices for other kinds of information, said David Loshin, who raised the question on the DataFlux data governance blog.

He went on to say: "The typical dimensions don't really apply in the same way - for example, how do you gauge the completeness of a video? On the other hand, one can consider accuracy and precision when it comes to the content concepts embedded within unstructured data.

"For example, does the photograph accurately represent what is presumed to be pictured (or has Photoshop been used to alter the veracity of the image)? You can also consider precision, such as the clarity of the sound recording, or the visibility of the video image."

Last month, Stuart Johnson, the managing director of Experian QAS UK, remarked that data quality is key to successful direct marketing strategies.