Every effective enterprise data model shares some common components, starting with the subject area and conceptual models. These are foundational parts of every enterprise data architecture.
A popular credit card commercial asks, “What’s in YOUR wallet?” One can ask a similar question of data architects – “What’s in YOUR data architecture?”
It has been argued that the question should instead be, “What specification artifacts should be in your target enterprise data architecture?” Asking it this way helps identify three major categories of enterprise data architecture artifacts. One artifact found in every effective enterprise data architecture is the enterprise data model, so it is important to understand its common components.
Figure 1. Enterprise Architecture Artifacts
Enterprise Data Model: Core of Enterprise Data Architecture
The core of any enterprise data architecture is an enterprise data model (EDM). No enterprise data architecture exists without an enterprise data model. The EDM is an integrated subject-oriented data model defining the essential data produced and consumed across an entire organization.
- Essential means the data that is critical to the effective operation and decision-making of the organization. Few (if any) enterprise data models define all the data within an enterprise. Decisions must be made (and revisited) about the scope of enterprise data modeling efforts. “Essential” does not mean “common” or “shared.” Essential data requirements may or may not be common to multiple applications and projects. Some data defined in the enterprise data model may be shared by multiple systems, but other data may be critically important yet created and used within a single system. Over time, the enterprise data model should define all data of importance to the enterprise.
- Integrated means that all of the entities, attributes and rules in the model are defined once, without redundancy. The concepts in the model fit together as the CEO sees the enterprise, not reflecting separate and limited functional or departmental views. There is only one version of the Customer entity, one Order entity, etc. Every data element has a single name and definition. The data model may identify common synonyms and important distinctions between different sub-types of the same common business entity.
- Subject-oriented means the model is divided into commonly recognized subject areas that span multiple business processes and application systems. Subject areas are focused on the most essential business entities. Most organizations have between 12 and 25 subject areas; having more usually means the subjects are too narrowly defined for effective business understanding and usage.
The goals of the enterprise data model are:
- To capture at a high level the collective data requirements of the enterprise
- To align information systems and data management efforts with business strategy
- To guide data integration
- To guide continual improvement of data quality
- To build deeper business understanding and wiser interpretation of data
- To enable and organize data stewardship and to support data governance
Subject Area Model
The enterprise data model is an integrated set of data specifications (metadata), viewable through reports and subject area diagrams. Each subject area diagram depicts business entities and the relationships between these entities. Business entities are classes of things and concepts of interest to the enterprise. The model captures data about specific concepts of business entities. The model includes an official name and business definition for each entity, often complemented by common synonyms, instance examples and related business rules. Additionally, the model defines each relationship between two entities, usually as a bi-directional pair of verb phrases, with business rules that govern the numeric relationships (cardinality) between instances of each entity. Other relationships identify one business entity as a kind (sub-type) of another entity.
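As a rough illustration of the metadata described above, the sketch below records business entities (name, definition, synonyms, optional super-type) and a relationship read in both directions. The class names, field names and sample entities are assumptions made for this example; they do not come from any particular modeling tool.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class BusinessEntity:
    """A class of things or concepts of interest to the enterprise."""
    name: str                        # official name
    definition: str                  # business definition
    synonyms: tuple = ()             # common synonyms, if any
    supertype: Optional[str] = None  # set when this entity is a kind of another


@dataclass(frozen=True)
class Relationship:
    """A relationship between two entities, with a verb phrase each way."""
    source: str
    target: str
    forward_phrase: str      # read from source to target
    reverse_phrase: str      # read from target to source
    target_cardinality: str  # how many targets relate to one source
    source_cardinality: str  # how many sources relate to one target

    def sentences(self) -> list:
        """Render the relationship as two business-readable rules."""
        return [
            f"Each {self.source} {self.forward_phrase} "
            f"{self.target_cardinality} {self.target}.",
            f"Each {self.target} {self.reverse_phrase} "
            f"{self.source_cardinality} {self.source}.",
        ]


# Hypothetical sample content, for illustration only.
customer = BusinessEntity(
    "Customer", "A party that buys goods or services.", synonyms=("Client",)
)
retail_customer = BusinessEntity(
    "Retail Customer", "A customer who is an individual consumer.",
    supertype="Customer",
)
order = BusinessEntity(
    "Order", "A request by a customer to purchase goods or services."
)
places = Relationship(
    "Customer", "Order", "places", "is placed by",
    target_cardinality="zero or more", source_cardinality="exactly one",
)

for rule in places.sentences():
    print(rule)
```

Reading a relationship aloud in both directions is a simple way for data stewards to confirm the business rule before it is drawn on a diagram.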
The scope of each subject area includes 5-30 business entities and their relationships. Each subject area is described with an entity relationship diagram depicting business entities as boxes and business relationships as lines connecting the boxes. Several different modeling styles are commonly used to depict business relationships in entity relationship diagrams. The scope of a given subject area overlaps with the scope of other subject areas, so that a business entity and its relationships may be included in more than one subject area. The collective scope of the subject areas in the enterprise data model should cover all the essential interests of the enterprise.
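The scoping guidelines in this paragraph can be checked mechanically. The following is a minimal sketch, assuming subject areas are held as a mapping from area name to a set of entity names; the function name, the result keys and the sample data are hypothetical.

```python
def review_subject_areas(subject_areas, essential_entities,
                         min_size=5, max_size=30):
    """Report size, overlap and coverage findings for a set of subject areas.

    subject_areas: dict mapping subject area name -> set of entity names
    essential_entities: entity names the enterprise model is expected to cover
    """
    # Subject areas whose entity count falls outside the 5-30 guideline.
    out_of_range = {
        area: len(entities)
        for area, entities in subject_areas.items()
        if not (min_size <= len(entities) <= max_size)
    }

    # Which subject areas each entity appears in (overlap is allowed).
    membership = {}
    for area, entities in subject_areas.items():
        for entity in entities:
            membership.setdefault(entity, []).append(area)
    shared = {
        entity: sorted(areas)
        for entity, areas in membership.items()
        if len(areas) > 1
    }

    # Essential entities that no subject area includes yet.
    uncovered = set(essential_entities) - set(membership)

    return {"out_of_range": out_of_range, "shared": shared,
            "uncovered": uncovered}


# Hypothetical sample data, for illustration only.
areas = {
    "Customer": {"Customer", "Account", "Contact", "Address", "Segment"},
    "Sales": {"Order", "Order Line", "Customer", "Product"},
}
essential = {"Customer", "Order", "Product", "Invoice"}

findings = review_subject_areas(areas, essential)
print(findings)
```

In this sample, "Sales" is flagged as smaller than the guideline, "Customer" is reported as shared by both subject areas (which is acceptable), and "Invoice" is reported as not yet covered.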
The subject area model taxonomy enables people to access and navigate their way through the subject areas of most interest to them in the enterprise data model. It is also an essential organizational structure for data governance and stewardship. Furthermore, most enterprise data models are developed iteratively and incrementally, focusing on higher priority subject areas first. For all these reasons, it is very important to define a practical and commonly acceptable taxonomy / structure of subject areas from the very start.
Conceptual Data Model
The conceptual views of business entities and business relationships do not include any data attributes. The conceptual views model business semantics (the meaning of business terms) and are in fact more accurately described as semantic models, also known as ontologies. Non-technical people are often surprised to discover that these conceptual models have so little to do with technology.
The enterprise data model is often organized into three layers of abstraction: the subject area model, the enterprise conceptual data model and the enterprise logical data model. The subject area model is simply a list or hierarchy of the subject areas within the enterprise data model. It serves as an introduction to the model and an index to the conceptual and logical views. Sometimes subject areas are depicted graphically in a sort of conceptual picture or map of the enterprise.
Figure 2. Enterprise Data Model Layers
Additions to Enterprise Data Model
Some enterprise data models include essential data attributes, shown in more detailed “logical” views of the same business entities and relationships (either in the same subject areas or smaller subsets). An enterprise data model does not attempt to identify all the data attributes required by the enterprise. The model identifies the data attributes of most importance to the operation and management of the enterprise. The model depicts these attributes independent of any specific usage or application context. These “application neutral” logical views are quite different from application-specific logical data models. The enterprise data model is only partially normalized; no “data entities” are created to resolve many-to-many relationships. Including essential data attributes enables the enterprise data model to address its objectives better – to identify enterprise data requirements and to guide data integration.
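To make the "partially normalized" point concrete, here is a small sketch that contrasts the two treatments of a many-to-many relationship. In the enterprise view the relationship is recorded once, as is; the helper shows how an application-specific logical model would typically resolve it with an associative entity. All class names, the helper function and the Product/Supplier example are assumptions for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class LogicalEntity:
    name: str
    essential_attributes: list = field(default_factory=list)


@dataclass
class LogicalRelationship:
    left: str
    right: str
    left_max: str   # "one" or "many": most `left` instances per `right`
    right_max: str  # "one" or "many": most `right` instances per `left`

    def is_many_to_many(self) -> bool:
        return self.left_max == "many" and self.right_max == "many"


def resolve_for_application(rel):
    """Resolve a many-to-many relationship the way an application-specific
    model typically would: one associative entity, two one-to-many links."""
    if not rel.is_many_to_many():
        raise ValueError("only many-to-many relationships need resolving")
    link = LogicalEntity(f"{rel.left} {rel.right}")
    return link, [
        LogicalRelationship(rel.left, link.name, "one", "many"),
        LogicalRelationship(rel.right, link.name, "one", "many"),
    ]


# Enterprise (application-neutral) view: essential attributes only, and the
# many-to-many relationship is left unresolved. Sample content is hypothetical.
product = LogicalEntity("Product", ["Product Name", "Unit Price"])
supplier = LogicalEntity("Supplier", ["Supplier Name"])
supplied_by = LogicalRelationship("Product", "Supplier", "many", "many")
enterprise_view = {
    "entities": [product, supplier],
    "relationships": [supplied_by],
}

# Application-specific view: the same relationship, resolved.
link_entity, app_relationships = resolve_for_application(supplied_by)
print(link_entity.name, len(app_relationships))
```

Keeping the enterprise view unresolved keeps the diagram close to how the business describes the relationship, while leaving the associative entity to the application models that need it.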
Some enterprise data models are extended to include:
- A more complete business glossary, expanded beyond the definition of business entities to include other terms (including processes, roles and organizations).
- Data stewardship responsibility assignments – who is accountable for the quality of metadata in the model and the actual data in the enterprise, either for a subject area or for a business entity. The metadata attributes in the data model could be extended to include these assignments.
- Data quality requirements for essential data attributes, covering specific dimensions of data quality, either in any context or in the most common contexts, such as:
  - “Is this a required (mandatory, non-nullable) attribute?”
  - “How current must the data be?”
  - “How accurate and precise must the data be?”
- Entity life cycle states, shown as state transition diagrams, depict the trigger events that change the status of particularly important business entities. These diagrams are not supported by all data modeling tools, but the diagrams are relatively simple, so many organizations maintain a supplemental set of diagrams in another tool.
- Reference data value sets for particularly important data attributes, which may be defined externally or internally. While small value sets (domains with fewer than 20 values) may be listed in the data model itself, large reference data value sets are likely to be maintained outside the data model. Of course, all reference data value sets should be maintained in some form of master data management or code management application.
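The entity life cycle states mentioned in the list above are simple enough to capture as a transition table when a modeling tool does not support state diagrams. The sketch below assumes a hypothetical Order life cycle; the state names, trigger events and function names are invented for illustration.

```python
# (current state, trigger event) -> next state. Hypothetical Order life cycle.
ORDER_LIFECYCLE = {
    ("Draft", "submit"): "Submitted",
    ("Submitted", "approve"): "Approved",
    ("Submitted", "reject"): "Draft",
    ("Approved", "ship"): "Shipped",
    ("Shipped", "close"): "Closed",
}


def next_state(lifecycle, state, event):
    """Return the state reached from `state` when `event` occurs."""
    try:
        return lifecycle[(state, event)]
    except KeyError:
        raise ValueError(
            f"event {event!r} is not allowed in state {state!r}"
        ) from None


def replay(lifecycle, start, events):
    """Apply a sequence of trigger events and return the final state."""
    state = start
    for event in events:
        state = next_state(lifecycle, state, event)
    return state


final_state = replay(ORDER_LIFECYCLE, "Draft", ["submit", "approve", "ship"])
print(final_state)
```

A table like this doubles as documentation: each row is one arrow in the state transition diagram, and any event missing from the table is by definition not a valid trigger in that state.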
Enterprise Data Model Tools and Standards
The choice of data modeling tool used to capture and maintain the enterprise data model will dictate to some extent how the model is structured. Some organizations keep the enterprise conceptual and logical data models in one integrated data model, while other organizations synchronize two separate data model files. Any graphical depiction of the subject area model is likely to be maintained separately, outside the data modeling tool itself, and so its contents and structure must be synchronized with the data model as the data model evolves.
The enterprise data model is guided by modeling standards, especially naming conventions for entities and attributes. Each subject area view in the enterprise data model is developed collaboratively with data stewards and other subject matter experts. Data architects facilitate and coordinate these efforts through workshops and review sessions. The data model is developed and refined iteratively over time.
Although it is an essential part of an enterprise data architecture, the enterprise data model by itself is not enough. The model is part of the overall enterprise architecture. It is critical to understand how data relates to business strategy, process, organization, application systems and technology infrastructure. This is done through information value chain analysis and related data delivery architecture.