What Does Data Quality Mean?
Data quality (DQ) is the degree to which a given dataset meets a user’s needs. Data quality is an important criterion for ensuring that data-driven decisions are as accurate as possible.
High-quality data is of sufficient quantity, and has sufficient detail, to meet its intended uses. It is consistent with other sources, presented in appropriate ways, and has a high degree of completeness. Other key data quality components include:
- Accuracy — The extent to which data represents real-world events accurately.
- Credibility — The extent to which data is considered trustworthy and true.
- Timeliness — The extent to which data meets the user’s current needs.
- Consistency — The extent to which the same data occurrences have the same value in different datasets.
- Integrity — The extent to which all data references have been joined accurately.
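Some of these components can be expressed as simple ratios. The following is a minimal sketch, not a standard method: it scores completeness, consistency, and integrity for a few invented records. The function names, field names, and sample data are all assumptions made for illustration. Accuracy and credibility are left out because they require a trusted reference or human judgment.

```python
from typing import Any


def completeness(records: list[dict[str, Any]], fields: list[str]) -> float:
    """Share of required field values that are present (not None or empty)."""
    total = len(records) * len(fields)
    if total == 0:
        return 1.0
    filled = sum(
        1 for r in records for f in fields if r.get(f) not in (None, "")
    )
    return filled / total


def consistency(a: dict[str, dict], b: dict[str, dict], field: str) -> float:
    """Share of keys found in both datasets whose `field` values agree."""
    shared = a.keys() & b.keys()
    if not shared:
        return 1.0
    agree = sum(1 for k in shared if a[k].get(field) == b[k].get(field))
    return agree / len(shared)


def integrity(children: list[dict], parent_ids: set, fk: str) -> float:
    """Share of child records whose reference resolves to a known parent."""
    if not children:
        return 1.0
    resolved = sum(1 for c in children if c.get(fk) in parent_ids)
    return resolved / len(children)


# Hypothetical sample data, invented for illustration only.
customers = [
    {"id": "c1", "email": "a@example.com", "country": "DE"},
    {"id": "c2", "email": "", "country": "FR"},
    {"id": "c3", "email": "c@example.com", "country": None},
]
crm = {"c1": {"country": "DE"}, "c2": {"country": "FR"}}
billing = {"c1": {"country": "DE"}, "c2": {"country": "ES"}, "c9": {"country": "IT"}}
orders = [
    {"order_id": 1, "customer_id": "c1"},
    {"order_id": 2, "customer_id": "c7"},  # no such customer
]

completeness_score = completeness(customers, ["email", "country"])
consistency_score = consistency(crm, billing, "country")
integrity_score = integrity(orders, {c["id"] for c in customers}, "customer_id")

print(f"completeness={completeness_score:.2f}")
print(f"consistency={consistency_score:.2f}")
print(f"integrity={integrity_score:.2f}")
```

Ratios like these are only a starting point. What counts as an acceptable score depends on the intended use of the data.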
Currently, there is no universally adopted standard for evaluating and verifying data quality. Instead, most organizations approach data quality improvement on an organizational or project-by-project basis, using policies and frameworks to ensure data is properly collected, handled and processed at all stages of the information lifecycle.
Data Quality Guidelines
Extracting reliable and useful information from a large quantity of data requires the data to be as complete and error-free as possible. Unreliable data can lead to poor decisions and wasted budget. If poor-quality data is being used to make decisions about an online advertising campaign, for example, it’s likely that valuable advertising dollars will be spent on consumers who do not belong to the target audience.
The quality of data should be assessed and reassessed in an iterative fashion to ensure that appropriate levels of quality are sustained in an acceptable and transparent manner. This requires organizations to establish data quality guidelines for data managers, data stewards, and other stakeholders who use the data. These guidelines include:
- Assessing data quality early and often.
- Adopting a framework for evaluating data quality in order to ensure that all aspects of data quality are evaluated and verified consistently. Data quality assessments (DQAs) can help managers understand how much confidence they should have in specific datasets.
- Periodically reviewing data quality policies to ensure they support regulatory compliance.
- Hiring a neutral third party to monitor data quality. Look for a partner who has both the expertise and wherewithal to identify which datasets are high quality and privacy compliant, and which are inherently flawed and will raise concerns.
Internal data quality policies should include guidelines for data entry, edit checking, validating and auditing data, correcting data errors, and removing the root causes of data contamination. Guidelines should also include policies and procedures for change-control, standardizing data formats and resolving data disputes.
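As an illustration of edit checking at the point of data entry, the sketch below validates a single record against three rules and returns a list of problems that could feed an audit log. The field names (`customer_id`, `email`, `signup_date`), the rules, and the deliberately loose email pattern are assumptions chosen for the example, not part of any particular policy.

```python
import re
from datetime import date

# Deliberately loose pattern: something@something.something, no spaces.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def check_record(record: dict) -> list[str]:
    """Return a list of edit-check failures for one record (empty if clean)."""
    errors = []

    # Rule 1: the identifier is mandatory.
    if not record.get("customer_id"):
        errors.append("customer_id is required")

    # Rule 2: the email must look like an email address.
    if not EMAIL_RE.match(record.get("email") or ""):
        errors.append("email is not well formed")

    # Rule 3: dates use one standard format (ISO 8601) and are not in the future.
    try:
        signup = date.fromisoformat(record.get("signup_date"))
        if signup > date.today():
            errors.append("signup_date is in the future")
    except (TypeError, ValueError):
        errors.append("signup_date must be an ISO date (YYYY-MM-DD)")

    return errors


# Hypothetical records, invented for illustration only.
clean = {"customer_id": "c1", "email": "a@example.com", "signup_date": "2023-01-15"}
dirty = {"customer_id": "", "email": "not-an-email", "signup_date": "15/01/2023"}

print(check_record(clean))
print(check_record(dirty))
```

Returning every failure, rather than stopping at the first, makes it easier to correct a record in one pass and to count which rules fail most often when looking for root causes.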
Techopedia Explains Data Quality
There is an increasing number of factors to consider when it comes to data quality, and those using data are often left to evaluate data quality in a disjointed, ad hoc manner. It’s important for organizations to engage stakeholders from all relevant areas of a business to agree on the following:
- How will data quality be monitored?
- What goals and objectives can be achieved better if data quality is improved?
- How will efforts to improve data quality be prioritized?
- What are the risks of poor data quality regarding cost, compliance and productivity?
- Who will lead and coordinate improvement efforts?
- How will data quality improvements be measured, analyzed, and reported on over time?
If processes, methods, and procedures are developed independently for each effort, the organization risks:
- A lack of awareness on the part of business staff about quality needs across the data lifecycle;
- Undue effort and duplicative costs; and
- Inefficient implementations (for example, repeatedly cleansing data in a downstream data store while not improving data quality at the source).
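One way to avoid repeated downstream cleansing is to apply a single, shared set of checks where data enters the organization. The sketch below is one possible shape for such a gate, under stated assumptions: the `ingest` function, the check names, and the sample records are hypothetical. Records that fail any check are set aside with the reasons, so the problem can be fixed at the source instead of in each downstream store.

```python
from typing import Callable, Iterable

Record = dict
Check = Callable[[Record], bool]


def ingest(
    records: Iterable[Record], checks: dict[str, Check]
) -> tuple[list[Record], list[tuple[Record, list[str]]]]:
    """Split records into accepted ones and quarantined ones with failure reasons."""
    accepted = []
    quarantined = []
    for record in records:
        failed = [name for name, check in checks.items() if not check(record)]
        if failed:
            quarantined.append((record, failed))
        else:
            accepted.append(record)
    return accepted, quarantined


# Hypothetical shared checks, defined once and reused by every data feed.
checks = {
    "has_id": lambda r: bool(r.get("id")),
    "amount_is_positive": lambda r: isinstance(r.get("amount"), (int, float))
    and r["amount"] > 0,
}

# Hypothetical incoming records, invented for illustration only.
incoming = [
    {"id": "t1", "amount": 25.0},
    {"id": "", "amount": 10.0},
    {"id": "t3", "amount": -5},
]

accepted, quarantined = ingest(incoming, checks)
print("accepted:", accepted)
for record, reasons in quarantined:
    print("quarantined:", record, "failed:", reasons)
```

Keeping the checks in one shared place also addresses the duplicated effort noted above, since each project reuses the same rules rather than writing its own.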
Data Quality and Compliance
Data quality plays an important role in privacy compliance. Regulations such as GDPR and COPPA are designed to ensure that consumer data is collected transparently and all personally identifiable information (PII) is handled in a safe manner. Poor data quality practices, such as collecting data without appropriate consumer consent, can result in stiff regulatory fines for non-compliance.
When businesses purchase data from a data broker, they do not have visibility into how the data was collected or how it is being stored and secured. When an organization is more transparent about how data is sourced and stored, and highlights overall data quality, the employees using the data will have more trust in the results.
This is an important business consideration because a lack of trust can damage a company’s reputation. It is more important now than ever for companies to be able to demonstrate that they collect data in a transparent, safe manner and that they appropriately secure data in transit and at rest after collection.
Data quality is an evolving space. As privacy regulations continue to expand, data quality verification will become an even more critical component of business operations. How long strategic planning for data quality takes depends on the size of the organization: in a large organization, planning efforts may extend over a few weeks, while in a small organization, planning may be completed in a few brief meetings.