High-quality data is essential for organizations to achieve accurate insights and informed decision-making, ultimately leading to improved operations and cost savings. However, only 20% of organizations have a centralized data quality program, resulting in disconnected strategies that contribute to poor data quality, which is estimated to cost U.S. businesses over $6 billion annually.
Driving Data Quality Improvement Across Your Organization
High-quality data provides an accurate, comprehensive view of an organization’s business processes, enabling data-driven decisions that optimize operations and reduce costs. By improving data accuracy and reliability, organizations can benefit from more informed decisions and streamlined operations.
The Current State of Data Quality
According to a leading data quality report, only 20% of organizations have a centralized data quality program. Most companies rely on fragmented, departmental strategies, leading to ineffective data quality efforts and poorer data management. The result? U.S. businesses lose over $6 billion annually due to data quality issues such as duplicate records, data errors, and lack of data accuracy.
The Need for Centralized Data Quality Management
Every organization can benefit from having a dedicated team focused on the centralized management of data assets. Research shows that businesses that implement data quality improvement initiatives within an enterprise data governance framework:
Enjoy increased profitability
Reduce operational costs
Leverage reliable data for data-driven decision-making
In contrast, organizations with a decentralized approach often face challenges like data silos and ineffective data quality processes, which hinder their ability to use customer data and other critical data sources as valuable assets.
Key Steps for Enhancing Data Quality
Create a Data Quality Team: Establish a centralized team responsible for developing and overseeing data quality programs. This team should include:
Data stewards and metadata management professionals
Business users, technology professionals, data scientists, and analytics experts
Develop a Data Quality Improvement Process: Focus on the following:
Address data quality challenges by setting measurement metrics and defining acceptable data quality standards
Implement data validation rules and data quality checks to correct errors and ensure data remains accurate
Foster a data quality culture by providing data quality training and continuous data quality improvement programs
Leverage Data Quality Tools: Use data profiling tools and data validation software to enhance data accuracy and reliability across the organization.
Monitor and Measure Data Quality: Continuously monitor data quality using well-defined measurement metrics. Ensure that data values are up-to-date and accurate by implementing regular data quality checks.
Implement Continuous Training: Provide data quality training to ensure that business users and data teams understand the importance of maintaining high-quality data. This helps to sustain a data-driven culture and ensures data access is reliable.
Address Pressing Data Quality Issues: Regularly review and update data quality standards to address emerging issues like data security, data silos, and data governance.
By following these steps, organizations can enhance data quality, ensure customer data and other critical data sources are reliable, and foster a data-driven culture that drives operational success.
With these foundational steps in place, organizations can move forward by defining specific strategies that align with their unique goals and business processes. The next phase involves identifying business objectives and assessing the current state of data quality to ensure continuous improvement.
1. Develop business objectives for data quality
Data quality means different things to different organizations. For some, it means ensuring that customer contact data is accurate, so that all orders are received as placed and payments are made smoothly. For others, it means having complete product inventory information to support sales and replenishment.
Many organizations discover several business objectives for improving data quality, and different lines of business using the same data may apply different standards and therefore hold different expectations for the data quality improvement program. Ultimately, data quality is about data being fit for its intended purpose.
To determine the organization’s business objectives for data quality, consider the following factors:
The organization’s business goals
How the organization measures progress towards goals (informal, formal, no measurement)
What data is collected (and reasons/definitions/usage) and where it is stored
How this data will be used and analyzed for data quality
What characteristics of data quality are most important to the organization and reasons for the choice
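One way to make "fit for purpose" concrete is to record each line of business's minimum standard and compare it with the scores measured for the shared data. The sketch below is only illustrative: the metric names, scores, thresholds, and the fit_for_purpose helper are all hypothetical.

```python
# Hypothetical measured scores for one shared customer dataset (0.0 to 1.0).
measured = {"address_completeness": 0.92, "email_validity": 0.97}

# Hypothetical minimum standards per line of business.
requirements = {
    "billing": {"address_completeness": 0.99},
    "marketing": {"email_validity": 0.95, "address_completeness": 0.80},
}


def fit_for_purpose(measured, required):
    """Return the metrics that fall short of a line of business's standard."""
    return {
        metric: (measured.get(metric, 0.0), minimum)
        for metric, minimum in required.items()
        if measured.get(metric, 0.0) < minimum
    }


for business, required in requirements.items():
    gaps = fit_for_purpose(measured, required)
    print(business, "fit" if not gaps else f"gaps: {gaps}")
```

With these illustrative numbers, the same dataset meets the marketing standard but falls short of the billing standard for address completeness, which is why objectives need to be defined per use.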
2. Assess the current data state
Before implementing any data quality improvement plan, assess the current state of data quality and of any existing data management efforts.
This assessment helps identify the next steps in the process and informs the organization about its strengths and challenges concerning data and its management. Data quality sophistication is often described in four stages, as Figure 1 shows.
Figure 1: Four stages of Data Quality Maturity
https://media.edq.com/48d959/globalassets/blog-images/dq-sophistication-curve.png
Over 60% of organizations fall into either the unaware or the reactive stage, leaving significant room for data quality improvement.
Data quality assessment should be a continuous process. The most successful businesses periodically assess the quality of their data and the effectiveness of their current data quality management plan. Regular reassessment allows a business to respond to areas of concern and make improvements when necessary.
3. Profile existing data
Data profiling is the process of reviewing source data, understanding its structure, content, and relationships, and identifying potential challenges for using the data in projects.
It is not a one-time effort and should be done regularly on all critical data. Data quality specialists are skilled in data profiling activities and should be able to create and implement a data profiling effort.
Data profiling is a crucial part of:
Data warehouse and business intelligence (DW/BI) projects: data profiling can uncover data quality issues in data sources and show what must be corrected in the Extraction-Transformation-Loading (ETL) processes.
Data conversion and migration projects: data profiling can identify data quality issues, which can be addressed in scripts and through data integration tools that copy data from source to target. It can also uncover new requirements for the target system due to hidden business rules.
Source system data quality projects: data profiling can highlight data that suffers from serious or numerous quality issues, and the source of the issues (e.g., user inputs, errors in interfaces, data corruption).
Data profiling involves:
Identifying a relevant sample of data from a dataset/database
Collecting descriptive statistics (e.g., min, max, count, and sum) against critical data in the dataset – including statistics on missing or incomplete data
Collecting data types, length, and recurring patterns for identified critical data
Tagging data with keywords, descriptions, or categories to confirm meaning and usage
Discovering metadata and assessing its accuracy against expectations
Performing data quality assessment processes to evaluate results
Identifying distributions, key candidates, foreign-key candidates, functional dependencies, embedded value dependencies, and performing inter-table analysis (advanced profiling)
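As a rough illustration of the basic profiling steps above, the following Python sketch collects counts, missing values, data types, lengths, recurring patterns, and a simple key-candidate check for each column. The sample rows, the column names, and the helpers value_pattern and profile_column are hypothetical and use only the standard library; dedicated profiling tools go much further.

```python
from collections import Counter
import re

# Hypothetical sample rows; in practice these would be drawn from the source system.
rows = [
    {"customer_id": "C001", "postal_code": "30301", "order_total": 120.50},
    {"customer_id": "C002", "postal_code": "3030", "order_total": 75.00},
    {"customer_id": "C003", "postal_code": None, "order_total": None},
    {"customer_id": "C004", "postal_code": "AB12C", "order_total": 19.99},
]


def value_pattern(value):
    """Reduce a value to its shape: digits become 9, letters become A."""
    text = re.sub(r"[0-9]", "9", str(value))
    return re.sub(r"[A-Za-z]", "A", text)


def profile_column(rows, column):
    """Collect basic descriptive statistics for one column."""
    values = [row.get(column) for row in rows]
    present = [v for v in values if v is not None and v != ""]
    numeric = [v for v in present
               if isinstance(v, (int, float)) and not isinstance(v, bool)]
    profile = {
        "count": len(values),
        "missing": len(values) - len(present),
        "distinct": len(set(present)),
        "types": sorted({type(v).__name__ for v in present}),
        "max_length": max((len(str(v)) for v in present), default=0),
        "patterns": Counter(value_pattern(v) for v in present),
        # A column with no gaps and no repeats is a candidate key.
        "key_candidate": len(present) == len(values)
        and len(set(present)) == len(values),
    }
    if numeric:
        profile.update(min=min(numeric), max=max(numeric), sum=sum(numeric))
    return profile


for column in ("customer_id", "postal_code", "order_total"):
    print(column, profile_column(rows, column))
```

In this sample, the pattern counts for postal_code expose a four-digit value and a mixed alphanumeric value next to the expected five-digit shape, which is the kind of finding a profiling pass is meant to surface for review.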
4. Cleanse existing data
Inaccuracies or inconsistencies in data reduce its quality and prevent its use for effective decision-making or streamlining operations. Data cleansing is the process of correcting incomplete or inaccurate information, fixing formatting issues, and providing accurate metadata. Issues that require data cleansing include:
Typos and other errors in data entries
Incomplete data, such as missing values or partially completed records
Inconsistent formatting of addresses or contact information
Inconsistent use of reference data – different depending on the application or department
Incomplete or outdated contact information
Incomplete transaction records
Duplicate data entries (full duplication or partial)
Resolving (cleansing) data quality errors supports more confident use of data for operations and decisions. A variety of tools are available to cleanse data, depending on the technical platform and the business objectives, but the focus should be on consistent processes for managing data rather than on tools alone. Identify and correct data quality issues in source data before moving it into a target database. Only move confirmed “clean” data into any target file or database.
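The idea of cleansing source data and loading only confirmed clean records can be sketched in a few lines of Python. This is a minimal example under stated assumptions: the record layout, the choice of email as the duplicate key, and the helpers cleanse, is_complete, and prepare_for_load are hypothetical, and real cleansing rules depend on the business objectives.

```python
import re

# Hypothetical raw records showing formatting, duplication, and completeness issues.
raw_records = [
    {"name": "  Ana Lopez ", "email": "Ana.Lopez@Example.com", "phone": "(404) 555-0101"},
    {"name": "ana lopez", "email": "ana.lopez@example.com ", "phone": "404-555-0101"},
    {"name": "Raj Patel", "email": "", "phone": "404.555.0199"},
]


def cleanse(record):
    """Return a copy of the record with consistent formatting."""
    return {
        "name": " ".join(record["name"].split()).title(),
        "email": record["email"].strip().lower(),
        "phone": re.sub(r"\D", "", record["phone"]),  # keep digits only
    }


def is_complete(record):
    """A record is complete when every field holds a value."""
    return all(record.values())


def prepare_for_load(records):
    """Cleanse, drop duplicates by email, and set incomplete records aside."""
    clean, rejected, seen = [], [], set()
    for record in map(cleanse, records):
        if not is_complete(record):
            rejected.append(record)  # route to a data steward for correction
        elif record["email"] not in seen:
            seen.add(record["email"])
            clean.append(record)
    return clean, rejected


clean, rejected = prepare_for_load(raw_records)
print(len(clean), "clean;", len(rejected), "rejected")
```

Here the two Ana Lopez entries collapse into one record once formatting is normalized, and the record with no email is held back rather than loaded, so only the clean list would move to the target.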
5. Establish a data collection plan for improved data quality management
Because data collection issues are nearly inevitable, develop a clear plan to reduce their occurrence and to address them when they occur. The data quality team should work with data governance and the appropriate technology teams to identify the process steps, metrics, and guidelines that prevent poor-quality data from entering the environment, and to regularly assess the quality of the data currently stored. An effective data collection plan should include:
Using tools and user supports to ensure accuracy upon entry, including creating required fields on forms, using consistent reference data, and providing data definitions at data entry
Regular, periodic data cleansing processes performed by the data quality team and supported by business data stewards to catch errors or identify inaccurate/incomplete information
Providing appropriate, regular training to all staff on the concepts, processes, and guidelines for supporting high-quality data collection and usage
A clear data collection and data quality management plan saves valuable time and money (fewer mistakes, less time spent on resolving data quality issues) and increases confidence in data for operational and strategic purposes.
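Checks at the point of entry, such as required fields and consistent reference data, might look like the following sketch. The field names, the country reference list, and the validate_entry function are hypothetical and stand in for whatever form or application actually captures the data.

```python
# Hypothetical reference data and required fields for an order-entry form.
REQUIRED_FIELDS = ("customer_id", "country", "quantity")
VALID_COUNTRIES = {"US", "CA", "MX"}


def validate_entry(entry):
    """Return a list of problems; an empty list means the entry is accepted."""
    problems = []
    for field in REQUIRED_FIELDS:
        if entry.get(field) in (None, ""):
            problems.append(f"{field} is required")
    country = entry.get("country")
    if country and country not in VALID_COUNTRIES:
        problems.append(f"country {country!r} is not in the reference list")
    quantity = entry.get("quantity")
    if quantity not in (None, "") and (not isinstance(quantity, int) or quantity <= 0):
        problems.append("quantity must be a positive whole number")
    return problems


print(validate_entry({"customer_id": "C001", "country": "US", "quantity": 2}))
print(validate_entry({"customer_id": "", "country": "UK", "quantity": 0}))
```

Returning every problem at once, rather than stopping at the first, lets the person entering the data correct the whole record in one pass.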
6. Develop a plan for data quality maintenance and improvement
Maintaining high data quality and enhancing data management capabilities requires a well-structured plan that prioritizes continuous assessment and improvement. Data-driven operations rely on evolving enterprise strategies for data governance, metadata management, and data quality initiatives.
A robust data quality plan should include:
Data profiling, data cleansing, and measuring the effectiveness of data quality improvement efforts
Presenting data quality results through a dynamic data quality dashboard, which provides clear insights for all stakeholders
Regular updates to the dashboard to highlight data quality improvements and data issues that need attention
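The figures behind such a dashboard can start from a few simple ratios. The sketch below computes completeness, uniqueness, and validity for a small table. These three metrics, the sample records, the simplified email pattern, and the function names are assumptions chosen for illustration, not a prescribed set.

```python
import re

# Deliberately simple email shape check; an assumption, not a full validator.
EMAIL_SHAPE = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")

# Hypothetical snapshot of a customer table.
records = [
    {"id": 1, "email": "ana@example.com"},
    {"id": 2, "email": "raj@example"},
    {"id": 3, "email": None},
    {"id": 3, "email": "li@example.com"},
]


def populated(records, field):
    """Values of the field that are neither missing nor empty."""
    return [r[field] for r in records if r.get(field) not in (None, "")]


def completeness(records, field):
    """Share of records where the field holds a value."""
    return len(populated(records, field)) / len(records) if records else 1.0


def uniqueness(records, field):
    """Share of distinct values among the populated values."""
    values = populated(records, field)
    return len(set(values)) / len(values) if values else 1.0


def validity(records, field, pattern):
    """Share of populated values that fully match the expected pattern."""
    values = populated(records, field)
    matches = sum(1 for v in values if pattern.fullmatch(v))
    return matches / len(values) if values else 1.0


dashboard_row = {
    "email completeness": completeness(records, "email"),
    "id uniqueness": uniqueness(records, "id"),
    "email validity": validity(records, "email", EMAIL_SHAPE),
}
for metric, score in dashboard_row.items():
    print(f"{metric}: {score:.0%}")
```

Recomputing the same ratios on a schedule and storing each result gives the dashboard a trend line, so stakeholders can see whether improvement efforts are working.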
To foster a data quality culture, assign data quality responsibilities to business data stewards and provide them with specialized training. Data stewards, working with data quality specialists, play a critical role in ensuring that measurement metrics are accurately defined and data remains accessible and trustworthy.
A successful plan for improving data quality requires buy-in from leadership, technical teams, and all business units. Key strategies should include:
Define acceptable data quality standards and develop techniques to correct errors and manage data silos.
Ensure data ownership is clearly assigned, with all users understanding their role in maintaining accurate data.
Regularly assess and refine data quality techniques to adapt to new data issues and challenges.
Encouraging a culture of continuous improvement, where data quality challenges are openly addressed, is crucial for long-term business success. By acknowledging problems like flawed data collection or inconsistent monitoring, and proactively managing data governance activities, organizations can ensure improved data accuracy and develop a system that supports high data quality for all decision-making processes.
A well-executed plan ensures the organization can meet its objectives and use data as a valuable asset for growth and efficiency.
How to Improve Data Quality Across the Organization
Implementing data quality management processes throughout an organization is essential for achieving accurate insights and effective decision-making. By aligning data quality initiatives with data governance efforts, companies can improve data quality, enforce consistent data standards, and ensure data is reliable and actionable.
A well-defined strategy includes the ongoing participation of data stewards to ensure:
Data is accessible to all relevant users
Measurement metrics are accurately defined to track data quality progress
Data silos are effectively managed to prevent fragmentation and ensure data consistency
This approach helps to maintain data quality and empowers organizations to use high-quality data for streamlined operations and informed decisions.
Ultimately, attention to improving data quality fosters an environment where data is more valuable, ensuring better outcomes for decision-makers and operational efficiency across the entire business.