Real-World Decision Support (RWDS) Journal, April 2003 - Volume 1, Issue 19

Use of meta data entities - Part 4 of a series on abstraction

by Steve Hoberman

In the first article in this series, I explained that abstraction is a tool that lets an artist efficiently capture and represent complex topics in a generic way. As data modelers, we are all artists to an extent, and abstraction is an efficient tool at our disposal as well. The second article explained when to use abstraction, the third discussed where to use it, and this article explains the value of meta data entities when we choose to abstract. We will use an example from my last article to show first the value of abstracting, and then how meta data entities can add even more value by retaining important business information within the design. After explaining what meta data entities are, we discuss their benefits and their drawback.

This article is part of a series on abstraction. Here are the future topics that will be covered:

  • Reusable abstract entities. I will share the abstract entities I use most often.
  • Reusable abstract relationships. I will share the abstract relationships I use most often.
  • Reusable abstract data elements. I will share the abstract data elements I use most often.

If you have questions on abstraction or if there are other areas within flexible design strategies you would like me to address, please let me know. I can be reached at me@stevehoberman.com. For more on abstraction and other modeling techniques, please refer to my book, The Data Modeler's Workbench, Tools and Techniques for Analysis & Design. I've also just opened up for registration a brand new Data Modeling Master Class, which contains 4 days of comprehensive modeling training including an in-depth section on abstraction.

An example highlighting the main benefits and drawbacks to using abstraction

Remember that the biggest obstacle to using abstraction is that business concepts are no longer represented on the model. We lose a level of business understanding in the design. Models can become more difficult to interpret, and data quality problems can occur because the relational database enforces less integrity. For example, in our last article we discussed the abstract design for a meta data repository.

For this example, a subset of the unabstracted meta data repository design is similar to what is shown in Figure 1.

Figure 1 Meta data repository unabstracted model

Figure 1 contains only a portion of a true meta data model, but we can recognize at a minimum the concepts we have on our model and underlying database. A Data Model contains Entities, Relationships, and Data Elements. An Entity can appear on multiple models (e.g. Customer can appear on the data warehouse model and the sales data mart model), contain many data elements, be connected to other entities through many relationships, and eventually correspond to one or more tables in the underlying database. Logical Relationships and physical Constraints also have a relationship with each other. A Database Schema can contain many Tables, Constraints, and Data Elements, and Tables can contain many Data Elements.

Many rules are enforced in this design (e.g. you can't have a Relationship without Entities), and the entity names in this model give the reader a very good idea of what is contained inside (e.g. I know the entity Data Element contains the data elements). However, how much flexibility do we have here? When new types of meta data come along, we will need to update our design. For example, if we suddenly want to store stored procedures in our repository, where would we put them? We would need to expand our data model and therefore expand the underlying database and application.
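To make this trade-off concrete, here is a minimal Python sketch of the unabstracted style. It is only an illustration: the class and field names are my own assumptions, loosely following the entity names in Figure 1, and it covers just three of the concepts.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DataElement:
    name: str


@dataclass
class Entity:
    name: str
    data_elements: List[DataElement] = field(default_factory=list)


@dataclass
class Relationship:
    # A Relationship cannot be constructed without supplying
    # both a parent and a child (no default values are given).
    name: str
    parent: Entity
    child: Entity


customer = Entity("Customer", [DataElement("Customer Last Name")])
sales_rep = Entity("Sales Representative")
assigned = Relationship("Customer to Sales Representative", customer, sales_rep)

# There is no place here for a stored procedure. Supporting one
# would mean adding a new class (and, in a database, a new table).
```

Each concept has its own named structure, so the design is easy to read, but every new kind of meta data requires a structural change.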

So there appears to be value in creating an abstract structure for this repository application. Let's assume we add abstraction, evolving our design into what appears in Figure 2.

Figure 2 Meta data repository abstract model (Basic)

This is a very abstract model that can contain any type of meta data in our organizations (actually, it is probably generic enough to use for anything we can think of). The Object entity contains the actual meta data value, such as Customer Last Name, Product Classification, Vendor, Customer to Sales Representative relationship, and the Order Data Model. Objects can relate to other objects, hence the recursive relationship. For example, the Customer Last Name data element can belong to the Customer entity. The Characteristic entity contains all of the descriptive information about this meta data value, such as the actual definition (The Customer Last Name is the surname of the customer), format (Customer Last Name is 30 characters in length), nullability (Customer Last Name cannot be NULL), and version information (Order Data Model last changed on March 2nd, 2003, 5 PM by Bob Jones).
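As a rough illustration of how generic this structure is, the two entities can be sketched in Python as follows. The class and field names are assumptions made for this sketch, and the recursive relationship on Characteristic is left out to keep it short.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Characteristic:
    # Free-form descriptive information about a meta data value.
    value: str


@dataclass
class Object:
    # The meta data value itself, e.g. "Customer Last Name".
    value: str
    # Recursive relationship: an Object can belong to another Object.
    parent: Optional["Object"] = None
    characteristics: List[Characteristic] = field(default_factory=list)


customer = Object("Customer")
last_name = Object("Customer Last Name", parent=customer)
last_name.characteristics += [
    Characteristic("The Customer Last Name is the surname of the customer"),
    Characteristic("30 characters in length"),
    Characteristic("Cannot be NULL"),
]

order_model = Object("Order Data Model")
order_model.characteristics.append(
    Characteristic("Last changed on March 2nd, 2003, 5 PM by Bob Jones")
)
```

Entities, data elements, and data models all fit into the same two structures, which is exactly the flexibility we were after.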

If our model were simply these two entities, Object and Characteristic, we would still be able to capture all of the types of meta data in our organizations. However, what would stop us from making the following incorrect relationships using only these two entities?

  • The Order Data Model is Not Null
  • The Customer entity is 30 characters long
  • The Customer Last Name is a supertype

To take it a step further, we can see that the recursive relationships on these two entities also appear to give us free rein to relate anything to anything else. We can relate two Objects that make no sense to relate (the Customer Last Name contains the Customer entity), and two Characteristics that make no sense to relate (the format 30 characters has a specific definition).

In other words, there appears to be nothing stopping us from making any possible relationships we would like between the Object and Characteristic entities. The three relationships that connect these two entities can let us relate anything to anything else, even if it makes no sense. We can still use this design, realizing its flaws, or we can think of another design approach which lets us retain more of the original business meanings and integrity on the design.
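The problem can be demonstrated with a small, simplified sketch that uses Python's built-in sqlite3 module. The table and column names are assumptions, and the recursive relationship on Characteristic is omitted. The database enforces that every row points at a real Object, but nothing more.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves enforcement off by default
conn.executescript("""
    CREATE TABLE object (
        object_id INTEGER PRIMARY KEY,
        value     TEXT NOT NULL,
        parent_id INTEGER REFERENCES object(object_id)
    );
    CREATE TABLE characteristic (
        characteristic_id INTEGER PRIMARY KEY,
        object_id         INTEGER NOT NULL REFERENCES object(object_id),
        value             TEXT NOT NULL
    );
""")

conn.execute("INSERT INTO object VALUES (1, 'Customer', NULL)")
conn.execute("INSERT INTO object VALUES (2, 'Customer Last Name', 1)")
conn.execute("INSERT INTO object VALUES (3, 'Order Data Model', NULL)")

insert = "INSERT INTO characteristic (object_id, value) VALUES (?, ?)"
conn.execute(insert, (2, "30 characters in length"))  # sensible
conn.execute(insert, (3, "Not Null"))                 # nonsense, yet accepted
conn.execute(insert, (1, "30 characters in length"))  # nonsense, yet accepted

# "The Customer Last Name contains the Customer entity" is accepted too.
conn.execute("UPDATE object SET parent_id = 2 WHERE object_id = 1")

accepted = conn.execute("SELECT COUNT(*) FROM characteristic").fetchone()[0]
```

All three characteristic rows are stored, because the only rule the database knows is that a Characteristic must belong to some Object.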

What are meta data entities?

Meta data entities are entities that contain as values meta data for business entities on our design. They do not contain business values like the other entities we deal with (Customer, Employee, etc.). If our model contains Customer information, a meta data entity might contain the definition of a customer and the relationships a customer can participate in. Meta data entities become extremely important when we abstract because they allow us to put some of the meta data we lost by abstracting back into our design.

An example of a design with meta data entities is shown in Figure 3. Figure 3 is the same as Figure 2 with the addition of two entities, Object Type and Characteristic Type, which can provide all of the integrity and business values we had in Figure 1.

Figure 3 Meta data repository abstract model

What saves us from incorrectly relating Objects and Characteristics in this example are the Object Type and Characteristic Type entities. The Object Type entity categorizes the meta data; examples are entity, data element, relationship, and data model. Object Types can also have relationships to each other, shown by the recursive relationship on Object Type. For example, Entities can contain Data Elements. Characteristic Types can also have relationships to each other, shown by their own recursive relationship. Characteristic Types contain the categories of descriptive information about the characteristics, such as Definition, Format, and Version.

If we define the Object Type Data Model to only have relationships to Characteristic Types of Version and Name, then the example where the "Order Data Model is Not Null" can never happen, because Nullability is not one of the Characteristic Types we allow the Data Model object to possess. I encourage the reader to spend a few minutes checking whether the Object Type and Characteristic Type entities really have the ability to keep the data in check and enforce a level of data integrity.
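One way to check this is with a small sketch that uses Python's built-in sqlite3 module. It is one possible physical reading of Figure 3, not the only one: the table and column names are assumptions, the recursive relationships are omitted, and the characteristic types allowed for Data Element are illustrative. The allowed pairings of Object Type and Characteristic Type live in their own table, and each Characteristic row must match one of them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE object_type (name TEXT PRIMARY KEY);
    CREATE TABLE characteristic_type (name TEXT PRIMARY KEY);
    CREATE TABLE allowed_characteristic (
        object_type         TEXT NOT NULL REFERENCES object_type(name),
        characteristic_type TEXT NOT NULL REFERENCES characteristic_type(name),
        PRIMARY KEY (object_type, characteristic_type)
    );
    CREATE TABLE object (
        object_id   INTEGER PRIMARY KEY,
        object_type TEXT NOT NULL REFERENCES object_type(name),
        value       TEXT NOT NULL,
        UNIQUE (object_id, object_type)
    );
    CREATE TABLE characteristic (
        object_id           INTEGER NOT NULL,
        object_type         TEXT NOT NULL,
        characteristic_type TEXT NOT NULL,
        value               TEXT NOT NULL,
        FOREIGN KEY (object_id, object_type)
            REFERENCES object(object_id, object_type),
        FOREIGN KEY (object_type, characteristic_type)
            REFERENCES allowed_characteristic(object_type, characteristic_type)
    );
""")
conn.executemany("INSERT INTO object_type VALUES (?)",
                 [("Data Model",), ("Data Element",)])
conn.executemany("INSERT INTO characteristic_type VALUES (?)",
                 [("Version",), ("Name",), ("Nullability",), ("Format",)])
conn.executemany("INSERT INTO allowed_characteristic VALUES (?, ?)", [
    ("Data Model", "Version"), ("Data Model", "Name"),
    ("Data Element", "Nullability"), ("Data Element", "Format"),
])
conn.execute("INSERT INTO object VALUES (1, 'Data Model', 'Order Data Model')")
conn.execute("INSERT INTO object VALUES (2, 'Data Element', 'Customer Last Name')")

# Allowed: a Data Element may have a Nullability characteristic.
conn.execute("INSERT INTO characteristic VALUES "
             "(2, 'Data Element', 'Nullability', 'Not Null')")

# Not allowed: "The Order Data Model is Not Null".
try:
    conn.execute("INSERT INTO characteristic VALUES "
                 "(1, 'Data Model', 'Nullability', 'Not Null')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
```

The second insert fails with an integrity error because (Data Model, Nullability) is not among the allowed pairings. The rule is now stored as data in the type tables rather than built into the table structure.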

One important caveat, however: I am assuming that the values in the meta data entities have been set up correctly. If the Data Model type were accidentally assigned the Nullability characteristic type, then actual data with this incorrect relationship could exist.
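A tiny Python sketch shows how thin this protection is. The dictionary below is a hypothetical in-memory stand-in for the type tables, and the characteristic types listed for Data Element are illustrative.

```python
# Hypothetical stand-in for the Object Type / Characteristic Type set-up.
allowed = {
    "Data Model": {"Version", "Name"},
    "Data Element": {"Definition", "Format", "Nullability"},
}


def is_allowed(object_type: str, characteristic_type: str) -> bool:
    """Return True if this object type may have this characteristic type."""
    return characteristic_type in allowed.get(object_type, set())


before = is_allowed("Data Model", "Nullability")

# A set-up mistake in the meta data itself...
allowed["Data Model"].add("Nullability")

# ...and the very same check now lets the nonsense through.
after = is_allowed("Data Model", "Nullability")
```

Here `before` is `False` and `after` is `True`: the check is only as good as the values maintained in the meta data entities.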

Also note that besides maintaining the data integrity, the two meta data helper entities also provide valuable meta data to our users and reporting tools. For example, we can look up the types of anything in our repository.
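As a simple illustration of this kind of lookup, the sketch below groups a few of the repository values mentioned earlier by their Object Type. The list of pairs is a hypothetical stand-in for rows in the Object entity.

```python
from collections import defaultdict

# (meta data value, object type) pairs, as they might be stored in Object.
objects = [
    ("Customer", "Entity"),
    ("Customer Last Name", "Data Element"),
    ("Customer to Sales Representative", "Relationship"),
    ("Order Data Model", "Data Model"),
]

# Look up the type of a single value.
type_of = dict(objects)
last_name_type = type_of["Customer Last Name"]

# Group everything in the repository by type, e.g. for a report.
by_type = defaultdict(list)
for value, object_type in objects:
    by_type[object_type].append(value)
```

In the basic two-entity design of Figure 2, neither question can be answered from the data, because the type of each value is not recorded anywhere.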

The benefits and drawback to using meta data entities

We can see from our example that meta data entities allow us to enforce business rules that we lost when abstracting, as well as provide additional information to the technical team, users and reporting tools.

The drawback to using these tables, however, is that someone needs to maintain them. Who do you think that would be? More than likely, it will be the person who designed them, namely the data modeler. That puts the data modeler into an interesting situation, as they are now responsible not just for the design, but also for maintaining some of the data. If a new Characteristic Type comes along, guess what? The data modeler is going to need to add the extra value. Alternatively, the development team can put a process in place for the users to maintain the tables themselves. This frees the data modeler from maintaining these tables and takes them off the critical path, so they no longer hold up the business from completing its processes. However, having the users maintain the data means there is probably a bigger risk that the users will use these meta data tables for more than originally intended. For example, if a user does not know what Nullability means, they might add an extra value called "Empty", which means pretty much the same thing, and now we have redundancy built into this type table. The point here is that if you are going to use meta data entities, make sure you have a viable process in place to maintain the data in these tables, ideally one that does not depend on you, the data modeler, to support it.

This article explains meta data entities through an example illustrating their usefulness and potential drawback. Our next article will focus on some trends and patterns I've noticed in abstracting entities.

About the Author

Steve Hoberman is an expert in the fields of data modeling and data warehousing, and teaches several data modeling courses throughout the year, including a brand new Data Modeling Master Class. He is currently a global reference data expert for Mars, Inc. He has been data modeling since 1990 across industries as diverse as telecommunications, finance, and manufacturing. Steve speaks regularly for the Data Warehousing Institute. He is the author of The Data Modeler's Workbench, Tools and Techniques for Analysis & Design. Steve specializes in data modeling training, design strategy, and in creating techniques to improve the data modeling process and deliverables. He enjoys reviewing data models and is the founder of Design Challenges, a discussion group which tackles complex data modeling scenarios. To learn more about his data model reviews and to add your email address to the Design Challenge distribution list, please visit his web site at www.stevehoberman.com. He can be reached at me@stevehoberman.com.