Foundations of Enterprise Data Management

Enterprise Data Management (EDM) is the global function that facilitates the management of data as a valuable asset of an enterprise.  Its components align to provide the capabilities to manage data as an organizational resource.  EDM includes:

  • Data Governance – planning, oversight, and control over management of data and the use of data and data-related resources; development and implementation of policies and decision rights over the use of data.
  • Data Architecture – the overall structure of data and data-related resources as an integral part of the enterprise architecture
  • Data Operations – storage, deployment, and management of structured physical data assets
  • Data Security – ensuring privacy, confidentiality and appropriate access to data
  • Data Integration & Development – acquisition, extraction, transformation, movement, delivery, replication, federation, virtualization and operational support
  • Documents & Content Management – storing, protecting, indexing, and enabling access to data found in unstructured sources (electronic files and physical records), and making this data available for integration and interoperability with structured (database) data.
  • Reference & Master Data – Managing shared data to reduce redundancy and ensure better data quality through standardized definition and use of common data values.
  • Data Warehousing & Business Intelligence – managing analytical data processing and enabling access to decision support data for reporting and analysis
  • Metadata – collecting, categorizing, maintaining, integrating, controlling, managing, and delivering the context of data (definitions, calculations, descriptions, sources, etc.)
  • Data Quality – defining, monitoring, and maintaining data integrity, and improving the accuracy, completeness, validity, timeliness, and consistency of data

Data is a corporate asset that must be managed to maximize its value, and its management must be enabled as a common capability to ensure that this asset is used properly.  Leveraging data to meet corporate information needs is very difficult when the definitions of fields, formats, and codes are not consistent, and the task becomes harder still when data resides in multiple applications.  EDM addresses these issues by taking a corporate-wide view of the data and providing the methods and processes that treat data as a shared resource, adding value for the enterprise and all its business units.  Because EDM provides a connective layer that should be implemented across the enterprise, not just for a single application or IT project, a business that implements a proven EDM framework can respond more quickly to changes in the marketplace.

Just as a building is held together by its supporting frame, frameworks support the understanding and implementation of related concepts.  In enterprise data management, a framework provides principles and practices for creating and using a description of how data and information are managed as organizational resources.  It structures stakeholders’ thinking by dividing the totality of enterprise data management into domains, layers, or views, and offers a variety of models for documenting each view.

There are several recognized enterprise data / information management frameworks in the industry, some from product vendors and others from vendor-neutral organizations.  It is important to use an industry standard and vendor-neutral framework to understand and implement enterprise data management effectively.

EWSolutions EDM framework

EDM Objectives

The management of data within an enterprise should adhere to the same goals and objectives that apply to the management of other enterprise resources.  Common objectives for enterprise data management include:

  1. To understand the data and information needs of the company for all stakeholders
  2. To capture, store, protect and ensure the integrity of the data and information needed for operational and decision-making activities
  3. To continually improve the availability and quality of data and information across the company
  4. To promote consistent understanding of the meaning and context of data
  5. To prevent inappropriate use of and access to data and information
  6. To provide reliable, accurate, timely and consistent data to support effective business processes and informed decision making
  7. To leverage use of data and information assets to their full value while controlling costs appropriately
  8. To promote a wider and deeper understanding of the value of data and information assets to the company’s success
  9. To manage data and information consistently across the company with appropriate policies, standards and guidelines that are communicated and implemented uniformly
  10. To align the information management infrastructure with business requirements according to an accepted enterprise strategy for managing data and information

Components of EDM – Overview

Following are short overviews of each of the parts of an enterprise data / information management program as defined by most industry experts.

Data Governance and Data Stewardship

Data governance provides the principles, processes and organizational definition to preserve and enhance the data asset. This function is performed by data management / data governance professionals.

Data stewardship enables the organization to make sound, consistent decisions by drawing on business people’s knowledge of the data they use daily.  Stewards work within industry standards and proven best practices for data management, supported by the guidance of data management and data governance professionals.

Data governance is a foundational part of enterprise data / information management, and contributes to the success of all the other component disciplines of EDM.  In addition, all the other component disciplines support data governance in various ways. 

Data Architecture

Data architecture is a set of rules, policies, standards, and models that govern and define the type of data collected and how it is used, stored, managed and integrated within an organization.  It provides a formal approach to creating and managing the flow of data and how it is processed across an organization’s IT systems and applications.  Data Architecture Management is the process of defining and maintaining specifications that:

  • Provide a standard, common business vocabulary
  • Express strategic data requirements that reflect the business needs
  • Outline high level integrated designs to meet these requirements
  • Align with enterprise strategy and related business architecture

Data Integration and Development

Data integration and development is the analysis, design, implementation, deployment, and maintenance of data solutions to maximize the value of the data resources to the enterprise.

Data development is the subset of project activities within the system development lifecycle (SDLC) focused on defining data requirements, designing the data solution components, and implementing these components.  The primary data solution components are databases and other data structures. Other data solution components include information products (screens and reports) and data access interfaces.

Data implementation consists of data management activities that support system building, testing, and deployment, including:

  • Database implementation and change management in the development and test environments.
  • Test data creation, including any data security procedures, such as obfuscation.
  • Development of data migration and conversion programs, both for project development through the SDLC and for business situations like consolidations or divestitures.
  • Validation of data quality requirements.
  • Creation and delivery of user training.
  • Contribution to the development of effective documentation.
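
As an illustration of the test data obfuscation mentioned in the list above, the following is a minimal sketch in Python, not a prescribed method.  The field names, the salt value, and the functions obfuscate_email and obfuscate_record are hypothetical, and a real program would follow the organization's own data security procedures.

```python
import hashlib


def obfuscate_email(email: str, salt: str = "test-env-salt") -> str:
    """Replace the local part of an email with a stable, salted hash token.

    The salt and token format are illustrative assumptions only.
    """
    local, _, domain = email.partition("@")
    digest = hashlib.sha256((salt + local).encode("utf-8")).hexdigest()[:10]
    return f"user_{digest}@{domain}"


def obfuscate_record(record: dict) -> dict:
    """Return a copy of the record with personal fields masked for test use."""
    masked = dict(record)  # leave the production row untouched
    masked["name"] = "REDACTED"
    masked["email"] = obfuscate_email(record["email"])
    return masked


# Hypothetical production row copied into a test environment.
production_row = {
    "customer_id": 1001,
    "name": "Jane Doe",
    "email": "jane.doe@example.com",
}
test_row = obfuscate_record(production_row)
```

Because the token is derived from a salted hash, the same source value always maps to the same masked value, so joins between test tables on the masked field still work while the original value is hidden.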

Data Operations Management

Data operations management is the development, maintenance, and support of structured data to maximize the value of the data resources to the enterprise.  Data governance supports data operations through the processes and standards that are the basis of the data governance function, making the activities of data operations more efficient, more consistent, and more effective across the organization.

Data operations management includes two sub-functions:

  • Database support
  • Data technology management

Database support is one of the oldest parts of data management; the function is delivered by database administrators (DBAs).  The role of DBA is the most established and most widely adopted data professional role, and database administration practices are perhaps the most mature of all data management practices since data management started with the physical administration of data assets.

Data Security Management

Data Security Management is the planning, development, and execution of security policies and procedures to provide proper authentication, authorization, access, and auditing of data and information assets.

Effective data security policies and procedures ensure that the right people can use and update data in the right way, and that all inappropriate access and updates are restricted. Data governance policies can establish data security requirements, in conjunction with data security professionals, to ensure the application of appropriate levels of security to each critical data object.

Reference and Master Data Management

Reference and Master Data Management is the continuing reconciliation and maintenance of reference data and master data, which provide the context for transaction and analytical data.  Reference and master data management is a combined discipline, with two related components:

  • Reference Data Management is control over defined domain values (also known as vocabularies)
  • Master Data Management is control over common data values from recognized subject areas,  enabling consistent, shared, contextual service for data used in multiple applications or operations

Master data are the critical nouns of a business and fall generally into four groupings: people, things, places, and concepts.  Since master data is used by multiple applications, an error in master data can cause errors in all the applications that use it.  For example, an incorrect address in the customer master might mean orders, bills, and marketing literature are sent to the wrong address (resulting in large unnecessary expenses for mailing and re-mailing, printing and re-printing, error identification and correction, etc.).

Reference data are those forms of data that are used as code values and other types of connection to actual data (state codes, product hierarchies, units of measure, etc.).  In contrast to master data, reference data usually consists only of a list of permissible values and attached textual descriptions.
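
To make the idea of a list of permissible values concrete, here is a small Python sketch.  The unit-of-measure codes, the order lines, and the function invalid_codes are invented for illustration and do not come from any particular system.

```python
# Hypothetical reference data: permissible codes with textual descriptions.
UNIT_OF_MEASURE = {
    "EA": "Each",
    "KG": "Kilogram",
    "L": "Liter",
}


def invalid_codes(rows, field, reference):
    """Return the rows whose value for `field` is not a permissible code."""
    return [row for row in rows if row.get(field) not in reference]


# Hypothetical transaction data that uses the reference codes.
order_lines = [
    {"line": 1, "uom": "EA"},
    {"line": 2, "uom": "KGS"},  # not in the reference list
    {"line": 3, "uom": "L"},
]

rejected = invalid_codes(order_lines, "uom", UNIT_OF_MEASURE)
```

Keeping the permissible values in one shared list, rather than in each application, is what lets every system apply the same check.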

Data Warehousing and Business Intelligence

A data warehouse is a central storage facility for the analytical data that an organization needs for making business decisions.  Data warehousing collects data from multiple sources, both internal and external, so that users across the organization can report on and analyze current and historical trends.  The data warehouse maintains the history and lineage of the data that is considered critical to the analytical and reporting needs of the enterprise, even if the source systems do not store history.  Many data warehouses re-format or re-structure data so that it is useful and intelligible to business users engaged in analysis.  This activity requires that the data warehouse maintain significant metadata concerning the source data and the data as it is stored in the data warehouse, which means that successful data warehouses rely heavily on formal data governance programs to define the proper source of data, its accepted definition and usage, and valid contexts.

A Data Warehouse (DW) consists of two main components:

  • Integrated decision support database
  • Related software applications used to manage data from a variety of sources (internal operational, and possibly external)

Document and Content Management (Unstructured Content Management)

Document and Content Management is the control over capture, storage, access, and use of data and information stored outside relational databases.  Document and Content Management focuses on integrity and access, making it the equivalent of data operations management for relational databases.

Since most unstructured data has a direct relationship to data stored in structured files and relational databases, the management decisions should provide consistency across all three areas.  However, Document and Content Management looks beyond an operational focus to interact with other data management functions in addressing the need for data governance, architecture, security, managed metadata, and data quality for unstructured data.

As its name implies, Document and Content Management includes two sub-functions:

Document management is the storage, inventory, and control of electronic and paper documents.  Document management encompasses the processes, techniques, and technologies for controlling and organizing documents and records, whether stored electronically or on paper.

Content management refers to the processes, techniques, and technologies for organizing, categorizing, and structuring access to information content, resulting in effective retrieval and reuse. Content management is particularly important in developing websites and portals, but the techniques of indexing based on keywords, and organizing based on taxonomies, can be applied across technology platforms.  Sometimes, content management is referred to as Enterprise Content Management (ECM), implying the scope of content management is across the entire enterprise.

Metadata Management

Metadata is “data about data,” but what exactly does this commonly used definition mean?  Metadata is generated whenever data is created, acquired, changed, read / accessed, moved, or used across any part of the enterprise.  Metadata gives the context to the data’s content.

Metadata Management is the set of processes that ensure proper creation, storage, integration, and control to support associated usage of metadata.

Business metadata includes the business names and definitions of subject and concept areas, entities, and attributes; attribute data types and other attribute properties; range descriptions; calculations; algorithms and business rules; and valid domain values and their definitions.  Business metadata relates the business perspective to the metadata user.

Examples of business metadata include, but are not limited to:

  • Business data definitions, including calculations
  • Business rules and algorithms, including hierarchies
  • Data lineage and impact analysis
  • Data model: enterprise level conceptual and logical
  • Business names and definitions of subject and concept areas, entities, and attributes; attribute data types and other attribute properties

Technical and operational metadata provide developers and technical users with information about their systems.  Technical metadata includes physical database table and column names, column properties, other database object properties, and data storage.  The database administrator needs to know users’ patterns of access, frequency, and report / query execution time.  Capture this metadata using routines within a DBMS or other software.

Operational metadata is targeted at IT operations users’ needs, including information about data movement, source and target systems, batch programs, job frequency, schedule anomalies, recovery and backup information, archive rules, and usage.
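
One way to picture how business, technical, and operational metadata attach to a single data element is the Python sketch below.  The class names, fields, and sample values are hypothetical and cover only a few of the properties described above; they do not represent any specific metadata repository design.

```python
from dataclasses import dataclass, field


@dataclass
class BusinessMetadata:
    name: str                 # business name of the attribute
    definition: str           # agreed business definition
    valid_values: list = field(default_factory=list)


@dataclass
class TechnicalMetadata:
    table: str                # physical table name
    column: str               # physical column name
    data_type: str


@dataclass
class OperationalMetadata:
    source_system: str
    load_job: str             # batch program that moves the data
    job_frequency: str


@dataclass
class DataElement:
    business: BusinessMetadata
    technical: TechnicalMetadata
    operational: OperationalMetadata

    def describe(self) -> str:
        """Combine all three kinds of metadata into one line of context."""
        return (
            f"{self.business.name} "
            f"({self.technical.table}.{self.technical.column}, "
            f"{self.technical.data_type}) "
            f"loaded {self.operational.job_frequency} "
            f"from {self.operational.source_system}"
        )


# Hypothetical element used only to show the structure.
element = DataElement(
    BusinessMetadata("Customer Status", "Current standing of the customer", ["A", "I"]),
    TechnicalMetadata("customer", "status_cd", "CHAR(1)"),
    OperationalMetadata("CRM", "load_customer", "daily"),
)
summary = element.describe()
```

If any one of the three parts were missing, the summary could not be built, which mirrors the point that each form of metadata contributes part of the context.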

All forms of metadata are crucial for an organization to have and use to understand the context of its data and information.  The absence of one of these forms of metadata reduces the ability to understand any element in any source, rendering that data less reliable, perhaps useless.

Data Quality Management

Data quality is a perception or an assessment of data’s fitness to serve its purpose in a given context.  It is an essential characteristic that determines the credibility of data for decision-making and operational effectiveness.  Data governance is the main vehicle by which high quality data can be delivered in an organization, through the development and implementation of policies, standards, processes, and practices that instill a desire to achieve and sustain high quality data across the enterprise.

Aspects (also known as dimensions) of data quality include: 

  • Accuracy – The extent to which the data are free of identifiable errors.
  • Completeness – All required data items are included. Ensures that the entire scope of the data is collected with intentional limitations documented
  • Timeliness – Concept of data quality that involves whether the data is up-to-date and available within a useful time period
  • Relevance – The extent to which data are useful for the purposes for which they were collected
  • Consistency across data sources – The extent to which data are reliable and the same across applications.
  • Reliability – Data definitions are important to data quality. Data users must understand what the data mean and represent when they are using the data. Each data element should have a precise meaning or significance.
  • Appropriate presentation – Data is presented in a manner and format that is consistent with its intended use and audience expectations
  • Accessibility – Data items that are easily obtainable and legal to access with strong protections and controls built into the process
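
Some of these dimensions can be expressed as simple measurements.  The Python sketch below scores completeness and timeliness for a handful of invented customer rows; the field names, the 30-day threshold, and the scoring rules are assumptions chosen for illustration, and each organization would define its own rules through data governance.

```python
from datetime import date


def completeness(rows, required_fields):
    """Share of rows in which every required field is present and non-empty."""
    if not rows:
        return 0.0
    complete = sum(
        all(row.get(name) not in (None, "") for name in required_fields)
        for row in rows
    )
    return complete / len(rows)


def timeliness(rows, date_field, as_of, max_age_days):
    """Share of rows updated within `max_age_days` of the `as_of` date."""
    if not rows:
        return 0.0
    fresh = sum((as_of - row[date_field]).days <= max_age_days for row in rows)
    return fresh / len(rows)


# Hypothetical customer rows.
customers = [
    {"id": 1, "email": "a@example.com", "updated": date(2024, 6, 1)},
    {"id": 2, "email": "", "updated": date(2024, 5, 30)},
    {"id": 3, "email": "c@example.com", "updated": date(2023, 1, 15)},
    {"id": 4, "email": "d@example.com", "updated": date(2024, 6, 10)},
]

completeness_score = completeness(customers, ["id", "email"])
timeliness_score = timeliness(customers, "updated", date(2024, 6, 15), 30)
```

In this sample, three of the four rows have every required field and three of the four were updated within 30 days, so both scores are 0.75.  Tracking such scores over time is one way to show whether quality is improving.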

Blended and new data sources, master data management efforts, and data integration initiatives can all create a need for data quality management.  All of these efforts have a common goal of improved data quality for organizational use, and the data quality process should result in continuing enhancement of each of the data quality characteristics for the data that is cleansed.

Benefits of Enterprise Data Management

The disciplines of enterprise data / information management are aligned and should be connected in any organization that wants to reap the benefits of improved enterprise data:

  • Improved access to organized, properly defined data through data governance and metadata management
  • Improved quality of data for decision making and operations – faster operations and faster, more accurate decisions
  • Multi-user data access where appropriate
  • Improved reporting and analytic capabilities on both enterprise and local scales, for accurate results
  • Improved data security and privacy, with access governed by standards and procedures applied consistently
  • Integration of data across sources according to standards and using a consistent architecture framework, for ease of integration and access

Conclusion

Enterprise Data Management is an essential function of every organization, providing the structure and capabilities for managing data and information as assets.  Effective enterprise data management enables an organization to realize the benefits of data and information that are fit for use for the right purposes, at the right times, in the right processes, to ensure the optimal decisions and outcomes.

Dr. David P. Marco, LinkedIn Top BI Voice, IDMMA Data Mgt. Professional of the Year, Fellow IIM, CBIP, CDP

Dr. David P. Marco, PhD, Fellow IIM, CBIP, CDP is best known as the world’s foremost authority on data governance and metadata management. He is an internationally recognized expert in the fields of CDO, data management, data literacy, and advanced analytics. He has earned many industry honors, including Crain’s Chicago Business “Top 40 Under 40” and being named by DePaul University as one of their “Top 14 Alumni Under 40”, and he is a Professional Fellow in the Institute of Information Management. In 2022, CDO Magazine named Dr. Marco one of the Top Data Consultants in North America and IDMMA named him their Data Management Professional of the Year. In 2023 he earned LinkedIn’s Top BI Voice. Dr. Marco won the prestigious BIG Innovation award in 2024. David Marco is the author of the two widely acclaimed, top-selling books in metadata management history, “Universal Meta Data Models” and “Building and Managing the Meta Data Repository” (available in multiple languages). In addition, he is a co-author of numerous books and has published hundreds of articles, some of which have been translated into Mandarin, Russian, Portuguese, and other languages. He has taught at the University of Chicago and DePaul University. DMarco@EWSolutions.com

© Since 1997 to the present – Enterprise Warehousing Solutions, Inc. (EWSolutions). All Rights Reserved
