What is a data quality dimension?
A data quality dimension is a term used to describe a data quality measure that can relate to multiple data elements, including an attribute, record, table or system, or to more abstract groupings such as a business unit, company or product range.
There is no widespread agreement on a definitive list of data quality dimensions, but most practitioners recognise the importance of six core dimensions: accuracy, completeness, uniqueness, timeliness (often referred to as currency), validity and consistency.
Data quality dimensions are a useful measurement approach for comparing data quality levels across different systems (or tables/business functions) over time.
A data quality dimension is typically presented as a percentage or a total count. For example, 97% of equipment codes were valid or 123,722 patient records were incomplete.
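As a rough illustration of those two presentation styles, the sketch below computes a percentage for one rule and a total count for another. The sample data, the "EQ-" code format and the function names are all hypothetical, chosen only to mirror the examples in the text.

```python
def validity_percentage(values, is_valid):
    """Percentage of values that pass the rule; None if there is nothing to measure."""
    values = list(values)
    if not values:
        return None
    passed = sum(1 for v in values if is_valid(v))
    return 100.0 * passed / len(values)


def incomplete_count(records, required_fields):
    """Number of records with at least one required field empty or None."""
    return sum(
        1
        for record in records
        if any(record.get(f) in (None, "") for f in required_fields)
    )


# Hypothetical sample data; the "EQ-" prefix is an assumed code format.
equipment_codes = ["EQ-001", "EQ-002", "bad", "EQ-104"]
patients = [
    {"id": 1, "name": "Ada", "dob": "1990-01-01"},
    {"id": 2, "name": "", "dob": "1985-06-30"},
    {"id": 3, "name": "Lin", "dob": None},
]

valid_pct = validity_percentage(equipment_codes, lambda c: c.startswith("EQ-"))
missing = incomplete_count(patients, ["name", "dob"])

print(f"{valid_pct:.0f}% of equipment codes were valid")
print(f"{missing} patient records were incomplete")
```

Which form to report is a judgement call: a percentage is easier to compare across tables of different sizes, while a count shows the volume of records needing attention.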
A single data quality dimension may require a number of data quality rules to be created in order for a measure to be processed.
For example, measuring completeness might start with a rule that counts empty values, but ‘missing values’ may require a further set of data quality rules to produce a comprehensive measure. Someone may type in ‘N/A’ or ‘Unknown’, which still equates to a missing value, so we would need a processing rule to discover ‘hidden blanks’ within an attribute.
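A minimal sketch of such a ‘hidden blanks’ rule follows. The placeholder list is an assumption made for illustration; in practice it would be built by profiling the attribute in question.

```python
# Assumed placeholder strings that should count as missing (illustrative only).
PLACEHOLDERS = {"n/a", "na", "unknown", "none", "null", "-"}


def is_missing(value):
    """True for None, empty or whitespace-only text, and known placeholders."""
    if value is None:
        return True
    text = str(value).strip().lower()
    return text == "" or text in PLACEHOLDERS


def completeness_pct(values):
    """Percentage of values that are genuinely populated; None if no values."""
    values = list(values)
    if not values:
        return None
    present = sum(1 for v in values if not is_missing(v))
    return 100.0 * present / len(values)


# Hypothetical attribute values containing two hidden blanks.
phone_numbers = ["555-0100", "", "N/A", None, " unknown ", "555-0199"]

# A check for None or empty strings alone misses the hidden blanks.
naive_present = sum(1 for v in phone_numbers if v not in (None, ""))

print(f"naive: {naive_present} of {len(phone_numbers)} populated")
print(f"with hidden-blank rule: {completeness_pct(phone_numbers):.1f}% complete")
```

Here the naive check reports four of six values as populated, while the hidden-blank rule finds only two real phone numbers.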
Due to the complexity and processing logic required to manage and control the usage of data quality dimensions, most organisations rely on data quality management software. This allows complex data quality rules to be consolidated into data quality dimensions. These can then be reused and applied across the whole organisation.
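The idea of consolidating rules into a reusable dimension can be sketched in a few lines. This is not how any particular data quality product works; the `Dimension` class, the rule functions and the system names are hypothetical and only show the general shape of the approach.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, List

Rule = Callable[[object], bool]


@dataclass
class Dimension:
    """A named dimension backed by a list of rules (hypothetical structure)."""
    name: str
    rules: List[Rule] = field(default_factory=list)

    def score(self, values: Iterable[object]) -> float:
        """Percentage of values that pass every rule in the dimension."""
        values = list(values)
        if not values:
            raise ValueError("cannot score an empty set of values")
        passed = sum(1 for v in values if all(rule(v) for rule in self.rules))
        return 100.0 * passed / len(values)


def not_null(value):
    return value is not None


def not_blank(value):
    return str(value).strip() != ""


def not_placeholder(value):
    # Assumed placeholder list, for illustration only.
    return str(value).strip().lower() not in {"n/a", "unknown"}


# One dimension definition, reused across two hypothetical systems.
completeness = Dimension("completeness", [not_null, not_blank, not_placeholder])

crm_emails = ["a@example.com", "N/A", None, "b@example.com"]
billing_emails = ["c@example.com", "", "d@example.com", "e@example.com"]

scores = {
    "crm": completeness.score(crm_emails),
    "billing": completeness.score(billing_emails),
}
print(scores)
```

Because both systems are scored against the same rule set, the resulting percentages are directly comparable, which is the property that makes dimensions useful for tracking quality across systems over time.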