Metadata Management Fundamentals

Metadata management is one of the foundational components of an enterprise data management initiative. Many organizations struggle with its incorporation into their data management processes.

Metadata Management’s Relationship to Data Management

Data Management is one of the hottest topics in our industry as Global 2000 companies and large government agencies are beginning to understand that without accurate, timely, and well-understood data they cannot realize the benefits of advanced analytics, big data, mobile analytics, data lakes, and the vast reservoir of data opportunities from the internet of things (IoT).

The practice of metadata management is foundational to every aspect of data management. Imagine trying to build a sustainable data governance practice without metadata management. Data stewards spend most of their time working with metadata and a smaller amount of time on data. Without proper metadata management, these stewards would be limited to working with only SharePoint, Excel spreadsheets, Word documents, and a collection of non-automated processes to accomplish their vital tasks.

Master data management needs metadata management as much as data governance does. In a master data management program, we need to have automated and accurate metadata on the systems, data movement, data stores, and golden records in our information technology (IT) environment. The Data Administration Management Association (DAMA) correctly states that every component of enterprise data management has deep connections to metadata and its management, including:

  1. Business analytics: Data definitions, reports, users, usage, performance.
  2. Business architecture: Roles and organizations, goals and objectives.
  3. Business definitions: The business terms and explanations for a particular concept, fact, or other item found in an organization.
  4. Business rules: Standard calculations and derivation methods.
  5. Data governance: Policies, standards, procedures, programs, roles, organizations, stewardship assignments.
  6. Data integration: Sources, targets, transformations, lineage, ETL workflows, EAI (enterprise application integration), EII (enterprise information integration), migration / conversion.
  7. Data quality: Definitions, defects, metrics, ratings.
  8. Document content management: Unstructured data, documents, taxonomies, ontologies, name sets, legal discovery, search engine indexes.
  9. Information technology infrastructure: Platforms, networks, configurations, licenses.
  10. Logical data models: Entities, attributes, relationships and rules, business names and definitions.
  11. Physical data models: Files, tables, columns, views, business definitions, indexes, usage, performance, change management.
  12. Process models: Functions, activities, roles, inputs / outputs, workflow, business rules, timing, stores.
  13. Systems portfolio and IT governance: Databases, applications, projects and programs, integration roadmap, change management.
  14. Service-oriented architecture (SOA) information: Components, services, messages, master data.
  15. System design and development: Requirements, designs and test plans, impact analysis.
  16. Systems management: Data security, licenses, configuration, reliability, service levels.

Understanding Metadata Management

The field of metadata management can be intimidating. There are many best practices and terms that need to be understood to function effectively in this line of work. The goal of this article is to present the key concepts and the basic best practices of metadata management so that you have a solid foundation on this valuable topic.

Metadata Defined

The classic definition of metadata is “data about data.” Unfortunately, this definition is limiting as metadata is about much more than “data about data”.

Metadata is a type of data that digitally describes the who, what, when, where, why, and how of an organization’s data, processes, applications, assets, business concepts, and/or other things of interest.

More simply we can say that metadata provides the context to the content of our digital assets.

From this definition, we can see that metadata is a type of data. Like data, metadata is a set of digitized values, facts or information that provides knowledge. This knowledge looks to answer the who, what, when, where, why, and how. The 5 Ws & 1 H (who, what, when, where, why, and how) table provides definitions for each type of knowledge.

| Knowledge Type | Description |
| --- | --- |
| Who | Refers to person(s). |
| What | Used as a request for specific information; to inquire about the character, occupation, etc. of a person or thing; to inquire as to the origin, identity, etc. of something; to inquire as to the worth, usefulness, force, or importance of something; or to refer to how much something is. |
| When | Refers to time, a time period, how long ago, how soon, under what circumstances, or upon what occasion. |
| Where | Refers to a thing or person that is in or at a place, part, point, etc. |
| Why | Used to ask for what reason, cause, or purpose. |
| How | Used to refer to what way or manner; by what means; to what extent, degree, etc.; in what state or condition; for what reason; to what effect; with what meaning; also a question concerning the way or manner in which something is done or achieved, or a way or manner of doing something. |

Table 1: 5 W’s & 1 H Description

It’s important to understand that metadata management has two distinct but related uses. It is valuable to the information technology (IT) department of a company and to the business side of the organization as well.

| Knowledge Type | Technical Example | Business Example |
| --- | --- | --- |
| Who | Who is the programmer responsible for a specific data movement process? | Who is the chief data steward of the CUSTOMER subject area? |
| What | What is the data lineage between our customer system and our enterprise data warehouse? | What field on our analytics shows the profitability of our products? |
| When | When do the extraction, transformation and load (ETL) jobs run and what are each job’s dependencies? | When was the data that I am analyzing last refreshed? |
| Where | Where in our IT environment are there servers operating at less than 40% of capacity? | Where do we have a report that shows our social media analytics by marketing campaign? |
| How | How do we set up security privileges for a new analyst? | How do we calculate the fields on our key report? |
| Why | Why are we experiencing more errors in the quality of our data? | Why are we missing some customers in our analytical reports? |

Table 2: 5 W’s & 1 H Business and Technical Examples

Managed Metadata Environment

The managed metadata environment (MME) represents the architectural components and processes that are required to properly and systematically gather, retain, and disseminate metadata throughout the enterprise. The MME encapsulates the concepts of metadata repositories, catalogs, data dictionaries, and any other term that people have used to refer to the systematic management of metadata.

Figure 1: Managed Metadata Environment (MME)

There is a great deal more to explain about the MME and I encourage you to read my detailed article on it.

Meta Model & Metadata Model

The terms meta model and metadata model are synonymous. They both refer to the physical model that is designed to store the metadata. A meta model looks much like the data models that most of us are familiar with. It has elements (metadata elements instead of data elements), tables, and relationships. In fact, almost all the best modeling practices for data are equally applicable to metadata modeling. Even with these similarities, the modeling of metadata does pose some challenges that very few data modelers are familiar with.

More detail about the MME is provided in my book, “Universal Metadata Models” (David Marco, J. Wiley, 2001).

The 4 Characteristics of a Meta Model

A great meta model has 4 key characteristics. It is generic, integrated, current and historical.


Generic

Generic means that the physical meta model stores metadata by metadata subject area rather than by application. For example, a generic meta model will have an attribute named “DATABASE_PHYS_NAME” that holds the physical database names (a metadata subject area) within the company. An application-specific meta model would name this same attribute “ORACLE_ACCTREC_PHYS_NAME”. The problem with application-specific meta models is that metadata subject areas expand their scope and can even change over time. To return to our example, today Oracle may be our company’s database standard. Tomorrow we may switch the standard to SQL Server for cost or compatibility advantages. This situation would cause needless changes to the physical meta model. Further, we should not have application-specific names such as ACCTREC (i.e., Accounts Receivable) in the meta model. An accounts receivable application has inputs (data coming in), processes, and outputs (data coming out) just like any other system. Therefore, there is no reason for our meta model to use application-specific names for its attributes or tables; doing so is limiting and a poor meta modeling practice.
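To make the naming point concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table name `database_instance`, the column `dbms_vendor`, and the sample rows are assumptions made up for illustration; only DATABASE_PHYS_NAME comes from the example above. The idea is that the vendor and the application show up as values in rows, never as part of a column name.

```python
import sqlite3

# Hypothetical generic meta model table: one row per physical database,
# whatever the vendor or the owning application happens to be.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE database_instance (
        database_id        INTEGER PRIMARY KEY,
        database_phys_name TEXT NOT NULL,  -- generic attribute name
        dbms_vendor        TEXT NOT NULL   -- the vendor is data, not schema
    )
""")

# Illustrative rows only; these database names are invented.
conn.executemany(
    "INSERT INTO database_instance (database_phys_name, dbms_vendor) VALUES (?, ?)",
    [("ACCTREC_PROD", "Oracle"), ("SALES_DW", "SQL Server")],
)

# Switching the company standard to another vendor means inserting or
# updating rows; the physical meta model itself does not change.
vendors = [
    row[0]
    for row in conn.execute(
        "SELECT DISTINCT dbms_vendor FROM database_instance ORDER BY dbms_vendor"
    )
]
print(vendors)
```

With an attribute named ORACLE_ACCTREC_PHYS_NAME, the second row would have no sensible home without altering the table.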


Integrated

A meta model provides an integrated view of the enterprise’s major metadata subject areas. Suppose we need a meta model that holds business definitions for our data elements and captures technical data lineage. Meta modelers often make the mistake of putting the business metadata (definitions) in one set of tables and the technical metadata in a different set of tables, without any relationships between them.

As a result, if the business is considering adding a new “customer type”, the metadata team can’t query the data lineage related metadata in the model to see what data elements would be impacted by this business decision. This severely limits the power that metadata management can provide.

The best practice of having an integrated meta model is missed by the vast majority of organizations, as they choose to implement many smaller metadata management solutions rather than an enterprise-wide metadata management effort.
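The “customer type” impact question can be sketched as a query once business and technical metadata are related. The following is a minimal illustration, not a prescribed design: the table names (`business_term`, `data_element`, `element_lineage`), the physical element names, and the one-hop lineage are all assumptions invented for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Business metadata: terms and their definitions (hypothetical schema).
    CREATE TABLE business_term (
        term_id    INTEGER PRIMARY KEY,
        term_name  TEXT NOT NULL,
        definition TEXT NOT NULL
    );
    -- Technical metadata: data elements, related to business terms.
    CREATE TABLE data_element (
        element_id        INTEGER PRIMARY KEY,
        element_phys_name TEXT NOT NULL,
        term_id           INTEGER REFERENCES business_term(term_id)
    );
    -- Technical metadata: element-to-element data lineage.
    CREATE TABLE element_lineage (
        source_element_id INTEGER REFERENCES data_element(element_id),
        target_element_id INTEGER REFERENCES data_element(element_id)
    );
    INSERT INTO business_term VALUES
        (1, 'Customer Type', 'Classification of a customer');
    INSERT INTO data_element VALUES
        (10, 'CRM.CUST.CUST_TYPE_CD', 1),
        (11, 'EDW.DIM_CUSTOMER.CUSTOMER_TYPE', NULL),
        (12, 'EDW.FACT_SALES.SALE_AMT', NULL);
    INSERT INTO element_lineage VALUES (10, 11);
""")


def impacted_elements(term_name):
    """Elements tied to a business term, plus their direct lineage targets."""
    rows = conn.execute(
        """
        SELECT e.element_phys_name
        FROM business_term t
        JOIN data_element e ON e.term_id = t.term_id
        WHERE t.term_name = ?
        UNION
        SELECT tgt.element_phys_name
        FROM business_term t
        JOIN data_element src ON src.term_id = t.term_id
        JOIN element_lineage l ON l.source_element_id = src.element_id
        JOIN data_element tgt ON tgt.element_id = l.target_element_id
        WHERE t.term_name = ?
        ORDER BY 1
        """,
        (term_name, term_name),
    ).fetchall()
    return [r[0] for r in rows]


print(impacted_elements("Customer Type"))
```

If the definitions and the lineage lived in unrelated sets of tables, the join from `business_term` to `element_lineage` would not be possible, which is exactly the limitation described above.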


Current

A fundamentally sound meta model contains metadata that relates to both the current environment and the future/planned environment. Metadata management is very valuable in understanding and managing our current business and technical landscape; however, it can also play a central role in our organization’s future plans. For example, let’s assume that our company is considering a migration to a new ERP (enterprise resource planning) vendor. It would be very valuable to query the meta model to see how many of our current data elements are already available in the new ERP vendor’s solution.
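The ERP question can be sketched as a simple set comparison, assuming each metadata record carries a flag saying whether it describes the current or the planned environment. The `ElementRecord` class, the `environment` values, and the sample element names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ElementRecord:
    business_name: str
    environment: str  # assumed values: "current" or "planned"


# Illustrative metadata only; a real MME would load this from the repository.
records = [
    ElementRecord("Customer Name", "current"),
    ElementRecord("Customer Type", "current"),
    ElementRecord("Loyalty Tier", "current"),
    ElementRecord("Customer Name", "planned"),
    ElementRecord("Customer Type", "planned"),
]


def coverage(records):
    """Split current elements into those the planned system covers and those it lacks."""
    current = {r.business_name for r in records if r.environment == "current"}
    planned = {r.business_name for r in records if r.environment == "planned"}
    return sorted(current & planned), sorted(current - planned)


covered, missing = coverage(records)
print(f"{len(covered)} of {len(covered) + len(missing)} current elements are covered")
print("Not yet available:", missing)
```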


Historical

Lastly, meta models are historical: a good meta model will include historical views of the metadata, even as it changes over time. This allows a corporation to understand how its business has evolved over the years. This is especially critical if the MME is supporting an application that contains historical data, like a data warehouse or an advanced analytics application. For example, suppose the business metadata definition for “customer” is “anyone that has purchased a product from our company within one of our stores, website or through our catalog”. A year later a new distribution channel is added to the strategy. The company now allows customers to order products through an app on their phone. At that point in time, the business metadata definition for customer would be modified to “anyone that has purchased a product from our company within one of our stores, on our website, through our mail order catalog or through our company app”. A fundamentally sound meta model stores both definitions because each has validity, depending on what data you are analyzing (and the age of that data).
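One common way to keep both definitions is to store each version with an effective date range and look up the version that applied on a given date. The sketch below assumes that approach; the `DefinitionVersion` class, the field names, and the specific dates are invented for illustration and are not taken from any particular repository product.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class DefinitionVersion:
    term: str
    definition: str
    effective_from: date
    effective_to: Optional[date]  # None means the version is still in effect


# The dates are hypothetical; the definitions paraphrase the example above.
versions = [
    DefinitionVersion(
        "customer",
        "anyone that has purchased a product within one of our stores, "
        "on our website or through our catalog",
        date(2020, 1, 1),
        date(2021, 1, 1),
    ),
    DefinitionVersion(
        "customer",
        "anyone that has purchased a product within one of our stores, on our "
        "website, through our mail order catalog or through our company app",
        date(2021, 1, 1),
        None,
    ),
]


def definition_as_of(term, as_of):
    """Return the definition in effect on as_of, or None if there was none."""
    for v in versions:
        still_open = v.effective_to is None or as_of < v.effective_to
        if v.term == term and v.effective_from <= as_of and still_open:
            return v.definition
    return None


print(definition_as_of("customer", date(2020, 6, 30)))
print(definition_as_of("customer", date(2022, 6, 30)))
```

An analyst looking at warehouse data from the first year would be shown the first definition, while current reports would use the second.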

Metadata Repository

This is the industry’s first widespread term to refer to the metadata management system. The term refers to the meta model and, typically, a management software package that may have been purchased. It is one of the six components of the MME.

Types of Metadata

There are two types of metadata that the MME will contain: technical and business.

Technical Metadata

Technical metadata provides developers, DBAs (database administrators), technical users, and other IT staff members with the metadata they need to maintain, grow, and effectively manage an organization’s IT environment.

Technical metadata is absolutely critical for the ongoing maintenance and growth of the warehouse. Without technical metadata the task of analyzing and implementing changes to a decision support system is significantly more difficult and time consuming.

Examples of Technical Metadata

Physical, logical, and conceptual data element names, domain values, data quality rules, and formats

Physical, logical, and conceptual table/file names, keys, and indexes

Physical, logical, and conceptual table/entity relationships

The physical flow of data within an IT environment

Physical, logical, and conceptual data models

Data tags

NoSQL structures

Audit controls and balancing information

The structure of data

Application names and boundaries

Mappings, extractions, transformations, and loads of data between applications

Encoding/reference table conversions

The relationship between models

History of data extractions and replications

User access patterns, frequency, and execution time of reports/queries

Table/data element access patterns

Technical business rules

Subject areas

Application archiving (e.g. data warehouse, data lake)

Job dependencies

Program and job names and descriptions

Version maintenance

Security criteria, constraints, and measures

Purge criteria


Table 3: Examples of Technical Metadata

The table above gives us a solid set of examples of technical metadata. The following table gives more examples of technical metadata, this time with an IT portfolio management focus.

Examples of Technical Metadata for IT Portfolio Management

Hardware assets (mainframes, Unix, servers, PCs, etc.)

Hardware configurations (disk space, memory, processor types, etc.)

Hardware locations

Hardware costs (purchase price, leasing fees, maintenance fees, etc.)

Software licenses

Software license expiration dates

Software installations

Software costs (purchase price, leasing fees, maintenance fees, etc.)

Installed software patches

System listings

System technology rankings (E = evolving, S = stable, A = aging, O = obsolete/unsupported)

System purposes

System inputs/outputs

Project listings

Project costs (estimates and actuals) (internal, consulting, hardware, and software)

Project success rates

Project staffing (internal, vendors, temporary, etc.)

Project estimated date of completion, actual date of completion, etc.

Project business justification and scope

Project status

Network links

Contact person

Table 4: Examples of Technical Metadata for IT Portfolio Management

Operational Metadata

It is important to note that the DAMA-DMBOK© (Data Management Body of Knowledge) does list a third type of metadata called operational metadata. Operational metadata refers to metadata that a data warehouse team may add during the ETL process and which is designed to help the ETL process. Examples of operational metadata include ETL Load Date, Update Date, Load Cycle Identifier, Current Flag Indicator, Operational System(s) Identifier, Active in Operational System Flag, and Confidence Level Indicator. Operational metadata should not be classified as a third type of metadata, as it is a type of technical metadata.
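As a rough sketch of how an ETL step might attach a few of these fields to each row it loads, consider the function below. The function name `stamp_rows`, the column names, and the sample row are assumptions made for illustration; real ETL tools expose this in their own ways.

```python
from datetime import datetime, timezone


def stamp_rows(rows, load_cycle_id, source_system, load_time=None):
    """Return copies of rows with operational metadata columns added."""
    load_time = load_time or datetime.now(timezone.utc)
    return [
        {
            **row,
            "etl_load_date": load_time.isoformat(),  # ETL Load Date
            "load_cycle_id": load_cycle_id,          # Load Cycle Identifier
            "operational_system_id": source_system,  # Operational System Identifier
            "current_flag": True,                    # Current Flag Indicator
        }
        for row in rows
    ]


# Hypothetical extract from a source system.
extracted = [{"customer_id": 1, "customer_name": "Acme"}]
loaded = stamp_rows(extracted, load_cycle_id=42, source_system="CRM")
print(loaded[0])
```

The input rows are left untouched, so the same extract can be stamped again under a different load cycle if a job is rerun.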

Business Metadata

Business metadata is the link between IT applications and business users as it provides the semantic layer that helps business professionals locate, understand, and effectively utilize the organization’s data. Business metadata provides these users with a roadmap for access to the data in the data warehouse, analytical engines, sales systems, ERP applications, data lakes, big data stores, websites, and all the other applications in the IT environment.

Examples of Business Metadata

Other Key Metadata Management Terms and Concepts

Business Glossary

A business glossary provides a listing, and sometimes a hierarchy, of the key business concepts of an organization in a common vocabulary. Often it will contain definitions, rules, and policies.

Table 6: Simple Business Glossary Example 

Data Dictionary/Data Glossary

A data dictionary (aka data glossary) can be said to be a business glossary designed for an organization’s IT staff. It would show a listing of the key business concepts and their associated technical instantiations in a common vocabulary.

Table 7: Simple Data Dictionary Example

Data Heritage

Data heritage represents the metadata about the original source of the data. For example, the data heritage of a business element called “Customer Name” could be “a sales person types in the customer name in the Salesforce system”.

Data Lineage

Data lineage represents information about everything that has “happened” to the data within an organization’s environment. Whether the data was moved from one system to another, transformed, aggregated, etc., ETL (extraction, transformation, and load) tools can capture this metadata electronically.
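Captured lineage can be thought of as a graph whose edges record each movement or transformation. The sketch below walks such a graph backwards from a target element to list everything upstream of it. The edge list, element names, and transformation labels are hypothetical; in practice they would be harvested from ETL tool metadata.

```python
from collections import deque

# Each edge: (source element, target element, what happened to the data).
lineage_edges = [
    ("crm.customer.name", "staging.customer.name", "copy"),
    ("staging.customer.name", "edw.dim_customer.customer_name", "trim and uppercase"),
    ("edw.dim_customer.customer_name", "report.sales_by_customer.customer", "join"),
]


def upstream(target):
    """List (source, transformation) pairs feeding target, nearest first."""
    parents = {}
    for src, tgt, transform in lineage_edges:
        parents.setdefault(tgt, []).append((src, transform))

    seen, ordered, queue = set(), [], deque([target])
    while queue:
        node = queue.popleft()
        for src, transform in parents.get(node, []):
            if src not in seen:  # guard against revisiting shared sources
                seen.add(src)
                ordered.append((src, transform))
                queue.append(src)
    return ordered


for source, transform in upstream("report.sales_by_customer.customer"):
    print(f"{source} ({transform})")
```

Reversing the direction of the lookup (indexing by source instead of target) would give the downstream view used for impact analysis.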


I hope that this article has given you a solid understanding of the foundational concepts that are used in the enterprise-wide management of metadata in an organization. Follow these concepts and keep them in mind as you build out your data management program.


Dr. David P. Marco, PhD, Fellow IIM, CDMP Master, CBIP, CDP

Dr. David P. Marco, PhD, Fellow IIM, CDMP Master, CBIP, CDP is an internationally recognized expert in the field of data warehousing, business intelligence, enterprise data management, data governance, and is the industry’s leading authority on metadata. Mr. Marco is founder and President of Enterprise Warehousing Solutions, Inc. (EWS), a Chicago-based enterprise data management consultancy dedicated to providing clients with best-in-class solutions. Author of several books and hundreds of articles and a Certified Data Management Professional, Mr. Marco is also a well-known speaker in his areas of expertise at conferences and symposia.

© Since 1997 to the present – Enterprise Warehousing Solutions, Inc. (EWSolutions). All Rights Reserved
