High-quality data is essential for organizations to achieve accurate insights and informed decision-making, ultimately leading to improved operations and cost savings. However, only 20% of organizations have a centralized data quality program, resulting in disconnected strategies that contribute to poor data quality, which is estimated to cost U.S. businesses over $6 billion annually.

Driving Data Quality Improvement Across Your Organization

High-quality data provides an accurate, comprehensive view of an organization’s business processes, enabling data-driven decisions that optimize operations and reduce costs. By improving data accuracy and reliability, organizations can benefit from more informed decisions and streamlined operations.

The Current State of Data Quality

According to a leading data quality report, only 20% of organizations have a centralized data quality program. Most companies rely on fragmented, departmental strategies, leading to ineffective data quality efforts and poorer data management. The result? U.S. businesses lose over $6 billion annually due to data quality issues such as duplicate records, data errors, and lack of data accuracy.

The Need for Centralized Data Quality Management

Every organization can benefit from having a dedicated team focused on the centralized management of data assets. Research shows that businesses that implement data quality improvement initiatives within an enterprise data governance framework:

  • Enjoy increased profitability
  • Reduce operational costs
  • Leverage reliable data for data-driven decision-making

In contrast, organizations with a decentralized approach often face challenges like data silos and ineffective data quality processes, which hinder their ability to use customer data and other critical data sources as valuable assets.

Key Steps for Enhancing Data Quality

  1. Create a Data Quality Team: Establish a centralized team responsible for developing and overseeing data quality programs. This team should include:
    • Data stewards and metadata management professionals
    • Business users, technology professionals, data scientists, and analytics experts
  2. Develop a Data Quality Improvement Process: Focus on the following:
    • Address data quality challenges by setting measurement metrics and defining acceptable data quality standards
    • Implement data validation rules and data quality checks to correct errors and ensure data remains accurate
    • Foster a data quality culture by providing data quality training and continuous data quality improvement programs
  3. Leverage Data Quality Tools: Use data profiling tools and data validation software to enhance data accuracy and reliability across the organization.
  4. Monitor and Measure Data Quality: Continuously monitor data quality using well-defined measurement metrics. Ensure that data values are up-to-date and accurate by implementing regular data quality checks.
  5. Implement Continuous Training: Provide data quality training to ensure that business users and data teams understand the importance of maintaining high-quality data. This helps to sustain a data-driven culture and ensures data access is reliable.
  6. Address Pressing Data Quality Issues: Regularly review and update data quality standards to address emerging issues like data security, data silos, and data governance.

By following these steps, organizations can enhance data quality, ensure customer data and other critical data sources are reliable, and foster a data-driven culture that drives operational success.

With these foundational steps in place, organizations can move forward by defining specific strategies that align with their unique goals and business processes. The next phase involves identifying business objectives and assessing the current state of data quality to ensure continuous improvement.

1. Develop business objectives for data quality

Data quality means something different in each organization. For some, it means ensuring that customer contact data is accurate, so that all orders are received as placed and payments are processed smoothly. For others, it means keeping complete product inventory information to support sales and meet replenishment requirements.

Because an organization often has several business objectives for improving data quality, different lines of business using the same data may hold it to different standards and therefore expect different things from the data quality improvement program. Ultimately, data quality means that data is fit for its intended purpose.

To determine the organization’s business objectives for data quality, consider the following factors:

  • The organization’s business goals
  • How the organization measures progress towards goals (informal, formal, no measurement)
  • What data is collected (and reasons/definitions/usage) and where it is stored
  • How this data will be used and analyzed for data quality
  • What characteristics of data quality are most important to the organization and reasons for the choice

2. Assess the current data state

Before implementing any data quality improvement plan, assess the current state of the data and of any existing data management efforts.

This assessment identifies the next steps in the process and shows the organization its strengths and challenges in managing data. The four stages of data quality sophistication are often depicted as shown in Figure 1.

Figure 1: Four stages of Data Quality Maturity
https://media.edq.com/48d959/globalassets/blog-images/dq-sophistication-curve.png

Over 60% of organizations fall into either the unaware or the reactive stage, leaving significant room for data quality improvement.

Data quality assessment should be a continuous process. The most successful businesses periodically assess both the quality of their data and the effectiveness of their current data quality management plan. Regular reassessment lets a business respond to areas of concern and make improvements when necessary.

3. Profile existing data

Data profiling is the process of reviewing source data, understanding its structure, content, and relationships, and identifying potential challenges for using the data in projects.  

Profiling is not a one-time effort; it should be done regularly on all critical data. Data quality specialists are skilled in data profiling activities and should be able to design and carry out a data profiling effort.

Data profiling is a crucial part of:

  • Data warehouse and business intelligence (DW/BI) projects: data profiling can uncover data quality issues in data sources and show what must be corrected in the extract, transform, load (ETL) processes.
  • Data conversion and migration projects: data profiling can identify data quality issues, which can be addressed in scripts and through the data integration tools that copy data from source to target. It can also uncover new requirements for the target system that stem from hidden business rules.
  • Source system data quality projects: data profiling can highlight data that suffers from serious or numerous quality issues, along with the source of those issues (e.g., user inputs, errors in interfaces, data corruption).

Data profiling involves:

  • Identifying a relevant sample of data from a dataset/database
  • Collecting descriptive statistics (e.g., min, max, count, and sum) against critical data in the dataset – including statistics on missing or incomplete data
  • Collecting data types, length, and recurring patterns for identified critical data
  • Tagging data with keywords, descriptions, or categories to confirm meaning and usage
  • Discovering metadata and assessing its accuracy against expectations
  • Performing data quality assessment processes to evaluate results
  • Identifying distributions, key candidates, foreign-key candidates, functional dependencies, embedded value dependencies, and performing inter-table analysis (advanced profiling)
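As a rough illustration of the basic profiling activities listed above, the following Python sketch collects counts, missing values, length ranges, and recurring patterns for a single column, and tests whether a column could serve as a key candidate. The function names, the pattern notation (9 for a digit, A for a letter), and the sample postal codes are assumptions made for this example; dedicated profiling tools do considerably more.

```python
import re
from collections import Counter


def _present(values):
    """Keep only values that are neither None nor blank."""
    return [v for v in values if v is not None and str(v).strip() != ""]


def value_pattern(value):
    """Map every digit to 9 and every letter to A so recurring formats stand out."""
    pattern = re.sub(r"[0-9]", "9", value)
    return re.sub(r"[A-Za-z]", "A", pattern)


def profile_column(values):
    """Collect basic descriptive statistics for one column (hypothetical helper)."""
    present = _present(values)
    lengths = [len(str(v)) for v in present]
    return {
        "count": len(values),
        "missing": len(values) - len(present),
        "distinct": len(set(present)),
        "min_length": min(lengths) if lengths else None,
        "max_length": max(lengths) if lengths else None,
        "patterns": Counter(value_pattern(str(v)) for v in present).most_common(3),
    }


def is_key_candidate(values):
    """A column is a key candidate when every value is present and unique."""
    present = _present(values)
    return len(present) == len(values) and len(set(present)) == len(values)


# Illustrative sample data, not taken from any real system.
postal_codes = ["90210", "10001", "", "1000A", None, "90210"]
report = profile_column(postal_codes)
print(report)
print("key candidate:", is_key_candidate(postal_codes))
```

In this sample, the pattern counts make the single malformed code easy to spot, and the missing and duplicate values rule the column out as a key candidate.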

4. Cleanse existing data

Inaccuracies or inconsistencies in data will affect data quality and prevent its use for effective decision-making or streamlining operations. Data cleansing is the process of correcting incomplete or inaccurate information, fixing formatting issues, providing accurate metadata, etc. Issues that require data cleansing include:

  • Typos and other errors in data entries
  • Incomplete data, such as missing values or partially completed records
  • Inconsistent formatting of addresses or contact information
  • Inconsistent use of reference data that varies by application or department
  • Incomplete or outdated contact information
  • Incomplete transaction records
  • Duplicate data entries (full duplication or partial)

Resolving (cleansing) data quality errors supports more confident use of data for operations and decisions. A variety of tools are available to cleanse data, depending on the technical platform and the business objectives, but the focus should be on consistent processes for managing data rather than on tools alone. Identify and correct data quality issues in source data before moving it into a target database, and move only confirmed “clean” data into any target file or database.
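A minimal Python sketch of such a cleansing step is shown below. It normalizes formatting, sets aside incomplete and duplicate records, and passes only the clean rows on toward the target. The field names (name, email, phone), the choice of the normalized email address as the duplicate key, and the sample records are all assumptions for illustration; real matching rules depend on the business objectives.

```python
import re


def normalize_contact(record):
    """Return a copy of the record with consistent name, email, and phone formatting."""
    cleaned = dict(record)
    cleaned["name"] = " ".join((record.get("name") or "").split())
    cleaned["email"] = (record.get("email") or "").strip().lower()
    cleaned["phone"] = re.sub(r"\D", "", record.get("phone") or "")
    return cleaned


def cleanse(records):
    """Split source records into clean rows and rows that need review."""
    clean, rejected, seen_emails = [], [], set()
    for record in records:
        row = normalize_contact(record)
        if not row["name"] or "@" not in row["email"]:
            rejected.append((row, "incomplete"))
        elif row["email"] in seen_emails:
            # Assumption: two rows with the same normalized email are duplicates.
            rejected.append((row, "duplicate"))
        else:
            seen_emails.add(row["email"])
            clean.append(row)
    return clean, rejected


# Illustrative source data, not taken from any real system.
source = [
    {"name": "Ada  Lovelace", "email": " Ada@Example.com ", "phone": "(555) 010-1234"},
    {"name": "Ada Lovelace", "email": "ada@example.com", "phone": "555-010-1234"},
    {"name": "", "email": "nobody@example.com", "phone": ""},
]
clean_rows, rejected_rows = cleanse(source)
print(clean_rows)
print([reason for _, reason in rejected_rows])
```

Rejected rows are returned with a reason rather than discarded, so data stewards can review and correct them at the source.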

5. Establish a data collection plan for improved data quality management

Because data collection issues are nearly inevitable, develop a clear plan to reduce their occurrence and to address them when they occur. The data quality team should work with data governance and the appropriate technology teams to identify the process steps, metrics, and guidelines that prevent poor-quality data from entering the environment, and to consistently assess the quality of the data already stored. An effective data collection plan includes:

  • Using tools and user supports to ensure accuracy upon entry, including creating required fields on forms, using consistent reference data, and providing data definitions at data entry
  • Regular, periodic data cleansing processes performed by the data quality team and supported by business data stewards to catch errors or identify inaccurate/incomplete information
  • Providing appropriate, regular training to all staff on the concepts, processes, and guidelines for supporting high-quality data collection and usage

A clear data collection and data quality management plan saves valuable time and money (fewer mistakes, less time spent on resolving data quality issues) and increases confidence in data for operational and strategic purposes.
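The entry-time controls in such a plan, required fields and consistent reference data, can be sketched in a few lines of Python. The field names and the list of country codes below are hypothetical placeholders; an actual plan would take them from the organization's own data definitions and reference data.

```python
# Hypothetical required fields and reference data, for illustration only.
REQUIRED_FIELDS = ("customer_id", "country", "email")
COUNTRY_CODES = {"US", "CA", "GB"}


def validate_entry(record):
    """Return a list of problems found in a record at the point of entry."""
    errors = []
    for field in REQUIRED_FIELDS:
        if not str(record.get(field) or "").strip():
            errors.append(f"{field}: required field is missing")
    country = str(record.get("country") or "").strip().upper()
    if country and country not in COUNTRY_CODES:
        errors.append(f"country: '{country}' is not in the reference list")
    return errors


accepted = validate_entry({"customer_id": "C1", "country": "us", "email": "a@b.com"})
refused = validate_entry({"customer_id": "", "country": "XX"})
print(accepted)
print(refused)
```

Returning every problem at once, rather than stopping at the first, lets a data entry form show the user all the corrections needed in a single pass.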

6. Develop a plan for data quality maintenance and improvement

Maintaining high data quality and enhancing data management capabilities requires a well-structured plan that prioritizes continuous assessment and improvement. Data-driven operations rely on evolving enterprise strategies for data governance, metadata management, and data quality initiatives.

A robust data quality plan should include:

  • Data profiling, data cleansing, and measuring the effectiveness of data quality improvement efforts
  • Presenting data quality results through a dynamic data quality dashboard, which provides clear insights for all stakeholders
  • Regular updates to the dashboard to highlight data quality improvements and data issues that need attention
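One way to feed such a dashboard, sketched here in Python under simple assumptions, is to compute a few metrics on each run and compare them with the previous run and with agreed targets. The two metrics (completeness and uniqueness), the target values, the previous-run figures, and the sample records are illustrative assumptions, not recommended standards.

```python
def _filled(records, field):
    """Return the non-blank values of a field as stripped strings."""
    return [str(r.get(field)).strip() for r in records if str(r.get(field) or "").strip()]


def completeness(records, field):
    """Share of records with a non-empty value for the field (0.0 to 1.0)."""
    return len(_filled(records, field)) / len(records) if records else 0.0


def uniqueness(records, field):
    """Share of non-empty values that are distinct (0.0 to 1.0)."""
    values = _filled(records, field)
    return len(set(values)) / len(values) if values else 0.0


def dashboard_rows(current, previous, targets):
    """Build one dashboard row per metric with its change and status."""
    rows = []
    for metric, value in current.items():
        change = value - previous.get(metric, value)
        status = "ok" if value >= targets[metric] else "needs attention"
        rows.append({
            "metric": metric,
            "value": round(value, 3),
            "change": round(change, 3),
            "status": status,
        })
    return rows


# Illustrative records, previous-run values, and targets.
records = [{"email": "a@x.com"}, {"email": "a@x.com"}, {"email": ""}, {"email": "b@x.com"}]
current = {
    "email_completeness": completeness(records, "email"),
    "email_uniqueness": uniqueness(records, "email"),
}
previous = {"email_completeness": 0.70, "email_uniqueness": 0.80}
targets = {"email_completeness": 0.70, "email_uniqueness": 0.95}
rows = dashboard_rows(current, previous, targets)
for row in rows:
    print(row)
```

The change column shows whether a metric improved since the last update, and the status column flags metrics that fall below their target and need attention.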

To foster a data quality culture, assign data quality responsibilities to business data stewards and provide them with specialized training. Data stewards, working with data quality specialists, play a critical role in ensuring that measurement metrics are accurately defined and data remains accessible and trustworthy.

A successful plan for improving data quality requires buy-in from leadership, technical teams, and all business units. Key strategies should include:

  1. Define acceptable data quality standards and develop techniques to correct errors and manage data silos.
  2. Ensure data ownership is clearly assigned, with all users understanding their role in maintaining accurate data.
  3. Regularly assess and refine data quality techniques to adapt to new data issues and challenges.

Encouraging a culture of continuous improvement, where data quality challenges are openly addressed, is crucial for long-term business success. By acknowledging problems like flawed data collection or inconsistent monitoring, and proactively managing data governance activities, organizations can ensure improved data accuracy and develop a system that supports high data quality for all decision-making processes.

A well-executed plan ensures the organization can meet its objectives and use data as a valuable asset for growth and efficiency.

How to Improve Data Quality Across the Organization

Implementing data quality management processes throughout an organization is essential for achieving accurate insights and effective decision-making. By aligning data quality initiatives with data governance efforts, companies can improve data quality, enforce consistent data standards, and ensure data is reliable and actionable.

A well-defined strategy includes the ongoing participation of data stewards to ensure:

  • Data is accessible to all relevant users
  • Measurement metrics are accurately defined to track data quality progress
  • Data silos are effectively managed to prevent fragmentation and ensure data consistency

This approach helps to maintain data quality and empowers organizations to use high-quality data for streamlined operations and informed decisions.

Ultimately, attention to improving data quality fosters an environment where data is more valuable, ensuring better outcomes for decision-makers and operational efficiency across the entire business.