Master and reference data management is much more than a technical solution. It is a combination of data integration, metadata management, data governance, and enterprise data architecture that provides information about an organization’s core business concepts.
Master data management (MDM) has been identified as a technology that will help organizations standardize their data about core business entities (e.g., customer, product, employee). Sounds good, but wait. Didn’t organizations try to achieve these benefits with enterprise resource planning (ERP), data warehousing (DW), and customer relationship management (CRM), just to name a few? What went wrong? All too often, these data integration initiatives put too much emphasis on technology and not enough on basic data management principles, such as data governance, enterprise information management, enterprise data modeling, data administration principles, and master metadata management.
Silver Bullet Technology Solutions
In the 1990s, a major data integration attempt on the operational side was enterprise resource planning (ERP). The ERP vendors sold their products with the promise to integrate operational data, eliminate (or at least reduce) redundancy, increase consistency of reporting through increased data quality, and make reporting and system maintenance easier. In reality, ERP conversions turned out to be less than perfect. Most project teams had less than 24 months to study the complexities of their ERP product and to convert their legacy systems. In addition, project teams were staffed with technicians who had neither the time for nor the knowledge of the disciplines, methods, policies, procedures, and infrastructure necessary for true data integration. As a result, ERP initiatives were simply traditional system conversions with no cross-organizational integration activities and no data management principles applied during the conversion process.
During the same time, decision support people threw themselves into data warehousing (DW). Many DW teams were pressed into solving the reporting problems of their ERP products because the DW was supposed to focus on data integration and elimination (or at least reduction) of data redundancy. Another DW promise was to provide consistent historical data and ad-hoc reporting capabilities in addition to faster data delivery and faster data access for the business people. In reality, many DW practitioners abandoned the data management aspect of data warehousing and concentrated only on the data delivery aspect. They performed no cross-organizational data integration activities because it was too time-consuming, and they did little or no data cleansing because it was too costly. This resulted in hundreds of silo data marts, which only increased data redundancy and data inconsistency.
While the DW community was struggling with their data integration efforts, many organizations decided to start yet another data integration initiative around the subject area of customer relationship management (CRM). The promise of the vendors was to provide data integration between the core business entities of customers and the products they purchase or the services they receive. The promise included non-redundant customer data and high data quality to support new business capabilities, such as increased customer satisfaction, knowledge of a customer’s wallet share, customized product pricing, geographic market potential, and so on. In reality, organizations purchased different CRM modules from different vendors and ended up with more silo applications and more departmental views because the purchased packages were not integrated. Once again, most project teams did not take the time to implement disciplines, methods, policies, procedures, and infrastructure to ensure true data integration. Another problem was that the various business departments did not cooperate with each other and did not coordinate their separate CRM activities.
The lesson should be obvious. Technology alone will not solve the decades-old data problems. Therefore, organizations must stop fooling themselves into believing that all they have to do is buy the next “silver bullet” technology solution and their data problems will vanish. Data is a very important asset of an organization, especially in the information age. Like all other assets, data must be managed centrally. Data must be standardized, inventoried, and reused. This applies to all data, and especially to master data. As previously mentioned, information management principles must be applied, such as data governance, enterprise information management, enterprise data modeling, metadata management principles, and master metadata management.
Data Governance
Industry experts define data governance as the “authority over the management of data assets” and assigning “accountability for the quality of your organization’s data.” Having authority over data assets is the function of data ownership. Being accountable for the quality of these data assets is the function of data stewardship.
Data is a business asset, and business assets are controlled by business people. Therefore, data owners and data stewards should be business people. They must be careful not to manage their data within the narrow focus of their own business unit (department or division); instead, they must ensure that their data is managed from an enterprise perspective so that it can be used and shared by all business units. An enterprise view is the hallmark of successful data governance programs.
Enterprise Information Management
Enterprise information management (EIM) is about the administration of data. One industry expert describes EIM as “a function, typically dedicated to an organization in IT, for maintaining, cataloging, and standardizing corporate data.” This is done with the help of data stewards under the umbrella of a data strategy, and by establishing data-related standards, policies, and procedures.
Enterprise information management has its origins in data administration (DA), which is a formal discipline for managing data as a business asset. The DA function was formalized in 1980. Since then, DA has also been known as data resource management (DRM), information resource management (IRM), and enterprise information management (EIM). More recently, some DA functions have also been appearing in groups called information center of excellence (ICOE), integration competency center (ICC), or BI competency center (BICC). However, all too often, these new groups (whose members frequently do not have a DA background) do not know the technique of enterprise data modeling and do not apply DA principles with the same rigor as a trained EIM group.
Enterprise Data Modeling
Enterprise data modeling (EDM) is still the most effective method for defining and standardizing data. An EDM does not have to be constructed all at once, nor is it a prerequisite for MDM projects. Instead, the EDM function evolves over time and may never be completed. It does not need to be completed because the objective of this process is not to produce a finished model but to discover and resolve data discrepancies among different views and implementations of the same data.
An EDM often starts out as a high-level conceptual data model showing the core business entities and their data relationships. In this model, all entities are core business entities, and thus are master data. Since the conceptual data model is a high-level business model, it is very important to capture the significant supertype/subtype structures.
The logical data model is normalized, refined, and fully attributed. During the refinement process, some core business entities expand into larger subject areas of master data, and some data relationships result in additional entities of transaction data. For MDM purposes, it is not important to fully refine and attribute the transaction data entities. However, it is important to consider transaction data in this normalization process because many business rules and data quality rules for master data can only be discovered by modeling the business transactions.
One of the most important benefits from EDM is the conscious and purposeful application of data quality rules. Most published definitions for MDM strongly advocate the need for improving the quality of master data. This can be achieved by applying the data quality rules during the modeling process.
Metadata Management Principles
The ultimate value of EDM comes from applying stringent metadata management principles during the modeling process.
One fundamental principle is the formalized process for creating data definitions. A definition should be short, precise, and meaningful. Michael Brackett offers examples of a poor data definition and a good data definition for the data element “Well Depth Feet.” The definition “The depth of the well in feet” is very poor because it is not clear how the depth is measured. Is it the total depth of the drilled well? The depth of the casing? The depth to high water? The depth to low water? A much better definition is “The total depth of the well in feet from the surface of the surrounding ground to the deepest point dug or drilled regardless of the depth of the well casing.” Data definitions should be reviewed regularly by business people to ensure that they remain current and correct, and that they are understandable and agreeable to all business people.
Another important principle is applying a formalized structure to the data naming process. Using “favorite” data names or blindly copying informal names from existing systems is not advisable. Without formal data names, data elements cannot be readily recognized, or they may be misidentified and therefore misused. There are numerous data naming conventions, the most popular being the “prime – qualifiers – class word” convention. It prescribes that every data element must have one prime word, one or more qualifiers, and one class word. An example of a standardized data element name is Checking Account Monthly Average Balance. The main component (prime word) is “Account” which is further qualified by the word “Checking” to indicate the type of account. The class word indicating the type of data values (domain) contained in this data element is “Balance,” which is further qualified by the words “Monthly” and “Average” to indicate the type of balance.
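The “prime – qualifiers – class word” convention can be illustrated with a small sketch. The class below is hypothetical and not taken from any published standard or tool; it only assumes, as described above, that a name has exactly one prime word, exactly one class word, and at least one qualifier attached to either of them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataElementName:
    """Hypothetical model of a 'prime - qualifiers - class word' name."""

    prime_word: str                 # main component, e.g. "Account"
    class_word: str                 # type of data values (domain), e.g. "Balance"
    prime_qualifiers: tuple = ()    # qualify the prime word, e.g. ("Checking",)
    class_qualifiers: tuple = ()    # qualify the class word, e.g. ("Monthly", "Average")

    def __post_init__(self):
        # The convention requires one prime word and one class word.
        if not self.prime_word or not self.class_word:
            raise ValueError("a prime word and a class word are both required")
        # It also requires one or more qualifiers.
        if not (self.prime_qualifiers or self.class_qualifiers):
            raise ValueError("at least one qualifier is required")

    def full_name(self) -> str:
        # Qualifiers precede the word they qualify.
        parts = [
            *self.prime_qualifiers,
            self.prime_word,
            *self.class_qualifiers,
            self.class_word,
        ]
        return " ".join(parts)


example = DataElementName(
    prime_word="Account",
    class_word="Balance",
    prime_qualifiers=("Checking",),
    class_qualifiers=("Monthly", "Average"),
)
print(example.full_name())  # Checking Account Monthly Average Balance
```

Keeping the components separate, rather than storing only the assembled string, makes it possible to check a proposed name against the convention before it is published.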
Metadata management principles also include a formalized process for creating technical column names. The first rule states that if approved and published abbreviations exist, then those abbreviations must be used for the column names. Another rule states that a column name should contain all name components (abbreviated) unless the column name is too long. In that case, the name component that is the least significant qualifier can be eliminated from the column name, provided that doing so does not create ambiguity, which could lead to misunderstanding and misuse of the column. If the column name is still too long, the name component that is the second least significant qualifier can be dropped, and so on.
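The column naming rules can be sketched as a short routine. Everything specific here is an assumption made for illustration: the abbreviation table, the underscore separator, the length limits, and the use of a collision with an existing column name as a crude stand-in for the ambiguity check. The caller supplies the qualifiers in order from least to most significant, because that ranking is a business judgment.

```python
# Hypothetical list of approved and published abbreviations.
APPROVED_ABBREVIATIONS = {
    "Checking": "CHKG",
    "Account": "ACCT",
    "Monthly": "MTH",
    "Average": "AVG",
    "Balance": "BAL",
}


def build_column_name(components, droppable_qualifiers, max_length,
                      existing_names=frozenset()):
    """Build a column name from the components of a data element name.

    components: all name components, in order.
    droppable_qualifiers: qualifiers ordered from least to most significant.
    max_length: assumed length limit of the target database.
    existing_names: column names already in use (ambiguity proxy).
    """
    remaining = list(components)
    to_drop = list(droppable_qualifiers)
    while True:
        # Rule 1: use the approved abbreviation whenever one exists.
        name = "_".join(
            APPROVED_ABBREVIATIONS.get(word, word.upper()) for word in remaining
        )
        if len(name) <= max_length:
            # A shortened name must not collide with another column.
            if name in existing_names:
                raise ValueError(f"{name!r} would be ambiguous")
            return name
        if not to_drop:
            raise ValueError(f"{name!r} is too long and nothing can be dropped")
        # Rule 2: drop the least significant remaining qualifier and retry.
        remaining.remove(to_drop.pop(0))


words = ["Checking", "Account", "Monthly", "Average", "Balance"]
least_to_most_significant = ["Monthly", "Average", "Checking"]
print(build_column_name(words, least_to_most_significant, max_length=30))
print(build_column_name(words, least_to_most_significant, max_length=18))
```

With the assumed limit of 30 characters every component is kept; with a limit of 18 the least significant qualifier, “Monthly,” is dropped first.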
Another metadata principle states that all data elements must be atomic and cannot be further decomposed. Every data element must also be unique and must not be known under other names (synonyms). Every data element name must be unique (assigned to only one data element) and must not be reused for another data element (homonyms). Furthermore, every data element must be supported by business metadata, such as data definition, business rules, data type and length, data owner, data source, etc.
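The uniqueness rules for synonyms and homonyms lend themselves to a mechanical check. The following is a minimal sketch that assumes each data element already carries some stable identifier and that names are registered as (identifier, name) pairs; the identifiers and names shown are invented for illustration.

```python
from collections import defaultdict


def find_synonyms_and_homonyms(registrations):
    """Report naming violations in (element_id, name) pairs.

    Synonyms: one data element known under more than one name.
    Homonyms: one name assigned to more than one data element.
    """
    names_by_element = defaultdict(set)
    elements_by_name = defaultdict(set)
    for element_id, name in registrations:
        names_by_element[element_id].add(name)
        elements_by_name[name].add(element_id)

    synonyms = {
        element_id: sorted(names)
        for element_id, names in names_by_element.items()
        if len(names) > 1
    }
    homonyms = {
        name: sorted(element_ids)
        for name, element_ids in elements_by_name.items()
        if len(element_ids) > 1
    }
    return synonyms, homonyms


# Hypothetical registrations: E1 has two names, and one name covers E2 and E3.
registrations = [
    ("E1", "Customer Birth Date"),
    ("E1", "Client Birth Date"),
    ("E2", "Account Balance Amount"),
    ("E3", "Account Balance Amount"),
    ("E4", "Product Unit Price Amount"),
]
synonyms, homonyms = find_synonyms_and_homonyms(registrations)
print(synonyms)
print(homonyms)
```

A check like this only finds violations among elements that have already been identified as the same or different; deciding that two differently named elements are in fact the same business concept remains a modeling and stewardship task.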
Master Metadata Management
Metadata is the “DNA” of all data standardization and integration initiatives because it serves three functions: documentation, navigation, and administration of data assets. Metadata is relevant to both master data and transaction data. The term master metadata management refers to the management of metadata that pertains only to master data.
Metadata practices have been evolving over decades, but organizations are slow to adopt them, mainly because they do not understand the importance of metadata. The value chain of metadata reaches into the actual business processes in the business community because business processes are reflected in the data relationships and in the transactions between core business entities, and core business entities contain master data.
Implementing MDM requires more than just technology. It requires organizations to acknowledge that data is an important business asset and that it must be managed accordingly. Managing data as an asset requires changes in the organization. These changes include new practices, new disciplines, new methods (and some old ones resurrected), new tools and techniques, new accountability, and new policies and procedures. Implementing these changes must be systemic and holistic, not isolated and sporadic. This requires new executive leadership that is dedicated to business integration. This new leadership position is not a technical position. It is a strategic business position that will oversee enterprise-wide data standardization initiatives, such as MDM.