Creating and maintaining business and technical metadata definitions are two common data stewardship activities. Doing them well is essential to data stewardship success.
Data stewards perform many common activities, including developing and maintaining data domain values, creating and managing business data rules, and supporting the data quality rules, validation and resolution processes. However, the most common activities performed by business data stewards are the development and management of business and technical metadata definitions.
Business Metadata Definitions
One of the key tasks for the business stewards is to define the business metadata definitions for the attributes of a company. A couple of years ago a friend of mine shared with me the following quote by Daniel Davenport: “The greatest problem in communication is the illusion that it has been accomplished.”
I don’t know anything at all about Mr. Davenport or if he even knows what metadata means. However, his quote succinctly sums up the need for business metadata definitions.
Typically, it is wise to begin by having the business stewards define the main subject areas of their company. “Subject areas” are the “nouns” of the corporation, e.g. customer, product, sale, policy, logistics, manufacturing, finance, and marketing. In general, companies will have between 25 and 30 subject areas, depending on their industry. Once business data stewards have defined their subject areas, each area can be refined further. For example, for product a company may want to distinguish between different lines of business or between subsidiaries. Within each subject area an enterprise will have a collection of entities (generally tables), and within each entity there will be key attributes (data elements) that require definitions. Most organizations do not define every attribute; they concentrate first on the elements that provide the most value to each entity.
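The subject area, entity, and attribute hierarchy described above can be sketched as a small data structure. This is only an illustration: the class names, the `undefined_attributes` helper, and the sample customer attributes are assumptions made for the example, not part of any particular repository product.

```python
from dataclasses import dataclass, field


@dataclass
class Attribute:
    name: str
    definition: str = ""  # empty until a business steward writes a definition


@dataclass
class Entity:
    name: str
    attributes: list[Attribute] = field(default_factory=list)


@dataclass
class SubjectArea:
    name: str
    entities: list[Entity] = field(default_factory=list)


def undefined_attributes(area: SubjectArea) -> list[str]:
    """Return 'ENTITY.ATTRIBUTE' names that still lack a definition."""
    return [
        f"{entity.name}.{attr.name}"
        for entity in area.entities
        for attr in entity.attributes
        if not attr.definition.strip()
    ]


# Hypothetical sample data: one subject area, one entity, two key attributes.
customer = SubjectArea(
    name="customer",
    entities=[
        Entity(
            "CUSTOMER",
            [
                Attribute("CUSTOMER_ID", "Unique identifier assigned to a customer."),
                Attribute("CUSTOMER_SEGMENT"),
            ],
        )
    ],
)

pending = undefined_attributes(customer)
print(pending)
```

A listing like this gives the stewards a simple worklist of key attributes that still need definitions.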
Some data elements will require calculation formulas. For example, a company may have a data attribute called “NET_REVENUE”. NET_REVENUE may be calculated by subtracting “gross costs” from “gross revenues.” Any calculation formulas or algorithms must be included in the business metadata definitions.
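As a rough sketch of what “including the formula in the definition” might look like, the example below stores the NET_REVENUE formula as text alongside the description and pairs it with a function that performs the same subtraction. The `BusinessDefinition` class, its field names, and the sample figures are hypothetical.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class BusinessDefinition:
    attribute: str
    description: str
    formula: str  # human-readable formula kept with the definition


NET_REVENUE = BusinessDefinition(
    attribute="NET_REVENUE",
    description="Revenue remaining after gross costs are subtracted.",
    formula="GROSS_REVENUES - GROSS_COSTS",
)


def net_revenue(gross_revenues: Decimal, gross_costs: Decimal) -> Decimal:
    """Apply the documented formula: gross revenues minus gross costs."""
    return gross_revenues - gross_costs


# Illustrative figures only.
result = net_revenue(Decimal("1250000.00"), Decimal("830000.00"))
print(NET_REVENUE.formula, "=", result)
```

Decimal is used rather than float so that currency amounts are not subject to binary rounding.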
Once the key data elements have been identified, the business data stewards can begin writing metadata definitions for those attributes. The process for capturing these definitions needs to be supported by a metadata repository (MME). The repository would contain metadata tables with attributes to hold the business metadata definitions. In addition, the business stewards would be given a web-based front end for keying in the definitions. The repository would capture and track these metadata definitions historically. This historical tracking is accomplished by placing “from” and “to” dates on each of the metadata records.
Also a metadata status code will be needed on each row of this metadata. This status code would show if the business metadata definition is “approved,” “deleted” or “pending approval.” This code is important because it is always poor policy to delete metadata rows. The second a row of metadata is deleted, the user will want it back.
When the first business metadata definitions are entered, it is common to mark them as “pending approval.” This allows the business data stewards to gain consensus on these elements before moving their status to “approved.”
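The “from” and “to” dates and the status code can be illustrated with a minimal in-memory sketch. It assumes that every change closes the current row and appends a new one, so nothing is ever physically deleted; the `DefinitionRecord` class, the `change` and `current` helpers, and the sample dates are all hypothetical, and a real repository would hold these rows in database tables.

```python
from dataclasses import dataclass, replace
from datetime import date
from typing import Optional

VALID_STATUSES = {"pending approval", "approved", "deleted"}


@dataclass(frozen=True)
class DefinitionRecord:
    attribute: str
    definition: str
    status: str
    from_date: date
    to_date: Optional[date] = None  # None marks the current row


def current(history, attribute):
    """Return the open (current) row for an attribute, or None."""
    for row in history:
        if row.attribute == attribute and row.to_date is None:
            return row
    return None


def change(history, attribute, on, *, definition=None, status=None):
    """Close the current row and append a new one; rows are never removed."""
    if status is not None and status not in VALID_STATUSES:
        raise ValueError(f"unknown status: {status}")
    for i, row in enumerate(history):
        if row.attribute == attribute and row.to_date is None:
            history[i] = replace(row, to_date=on)  # end-date the old row
            new_row = DefinitionRecord(
                attribute=attribute,
                definition=row.definition if definition is None else definition,
                status=row.status if status is None else status,
                from_date=on,
            )
            history.append(new_row)
            return new_row
    raise KeyError(attribute)


history = [
    DefinitionRecord(
        "NET_REVENUE",
        "Gross revenues minus gross costs.",
        "pending approval",
        date(2024, 1, 10),
    )
]
change(history, "NET_REVENUE", date(2024, 2, 1), status="approved")
change(history, "NET_REVENUE", date(2024, 6, 1), status="deleted")

for row in history:
    print(row.status, row.from_date, row.to_date)
```

Because a “deletion” is just another status row, the earlier approved definition remains in the history and can be restored when a user asks for it back.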
Technical Metadata Definitions
The technical stewards are responsible for creating the technical metadata definitions for the attributes of a company. It is important to understand that technical metadata definitions will differ fundamentally in form from business metadata definitions. Whereas business metadata definitions are targeted at business users, technical metadata definitions are targeted at an organization’s IT staff. Therefore it is perfectly acceptable to include SQL code and physical file or database locations in the technical metadata definitions.
Usually it is too much work to have the technical stewards list every physical attribute within the company. It is wiser to begin by having them list their key data attributes. Focusing first on the core data attributes lets the IT department fully clarify the technical metadata definitions for its most important data. Once the technical stewards have defined these initial physical attributes, they can start working on the remaining attributes.
The process for capturing these technical data definitions will be a mirror image of the process to capture business metadata; in fact the web-based user screens should look very similar. In addition, the same functionality which I described in the previous section (from & to dates, status codes, etc.) should also be included.
Once both the business and technical stewards have written their metadata definitions, discrepancies between the definitions will invariably be discovered. For example, the business stewards may define “product” as any product that a customer has purchased. The technical stewards may define “product” as a product that is marked as active. These two definitions are clearly different. Under the business stewards’ definition, any product (active or inactive) that is currently on an open order for a customer would be valid. Obviously, the IT staff will want to work with the business users to repair these hidden system differences.
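The “product” discrepancy can be made concrete by expressing each definition as a query and comparing the results. The sketch below uses an in-memory SQLite database; the `product` and `order_line` tables, their columns, and the sample rows are invented for the illustration, and the two queries are simplified readings of the definitions above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE product (product_id INTEGER PRIMARY KEY, name TEXT, is_active INTEGER);
    CREATE TABLE order_line (order_id INTEGER, product_id INTEGER);
    INSERT INTO product VALUES (1, 'Widget', 1), (2, 'Gadget', 0), (3, 'Gizmo', 1);
    INSERT INTO order_line VALUES (100, 1), (101, 2);
    """
)

# Business definition (assumed reading): any product on a customer order.
BUSINESS_SQL = "SELECT DISTINCT product_id FROM order_line"
# Technical definition (assumed reading): any product flagged as active.
TECHNICAL_SQL = "SELECT product_id FROM product WHERE is_active = 1"


def product_ids(sql):
    """Run a definition's query and return the set of product ids it selects."""
    return {row[0] for row in conn.execute(sql)}


business = product_ids(BUSINESS_SQL)
technical = product_ids(TECHNICAL_SQL)

only_business = sorted(business - technical)   # ordered but not active
only_technical = sorted(technical - business)  # active but never ordered

print("Only in business definition:", only_business)
print("Only in technical definition:", only_technical)
```

Any product that appears in only one of the two sets is a candidate for the business and technical stewards to review together.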
Business metadata and technical metadata are essential to a complete understanding of the organization’s data and information. Having the data stewards, both business and technical, create and maintain the metadata definitions for the critical elements ensures that they are performing one of the most common activities of data stewardship.