Many organizations lack a basic component for success in the information age: an enterprise data strategy. The risks of not having and implementing one are immense.
The topic of a data strategy is almost as old as the data management profession itself. Yet, it is amazing how many companies, young and old, have no cohesive, enforceable data strategy. These companies get involved with data warehousing, business intelligence, customer relationship management, master data management, and other new technology initiatives without the framework of a data strategy. Years into their efforts, many business executives are still frustrated over their inability to trust their company’s data. They have spent millions of dollars on new technologies and “silver bullet” solutions, only to find that the state of their data assets has in many cases deteriorated rather than improved.
The explosion of tools and capabilities to create, manipulate, and access data is frightening without a data strategy. If we thought we had redundancy problems in the past, tracking items by RFID and incorporating unstructured data into our information portfolio have made those problems considerably worse. And while we are drowning in data, we glorify technology that automates our decision-making.
As a result, our decisions are often plain wrong. To make things worse, we are reducing, in some cases disabling, direct contact with our customers who are affected by our bad decisions, and who are frustrated that they have lost the personal connection to their service representative who could straighten things out. What makes business executives believe that this non-strategy is working?
A data strategy can be seen as a survival guide for the information age. Like other core company assets, such as financial assets, real estate assets, fixed assets, and so on, data assets should be controlled based on a strategy. A data strategy is a strategic plan for enterprise-wide data management that explains a company’s policies, procedures, roles, and responsibilities for standardizing its data, ratifying its business rules, controlling data redundancy, managing its master data, integrating structured with unstructured data, storing and using its data, as well as protecting its data. Note that it is essential to understand all the components of enterprise data management and include them in developing a data strategy.
Data standardization and integration
Data standardization and integration go hand in hand – you cannot have one without the other. This means that data redundancy has to be addressed. Redundant data must be identified, cataloged, resolved, and ultimately reduced. This is much easier said than done because it requires time-consuming human analysis. While there are data profiling tools that can help identify potential duplicates and master data management products that help manage core business data, it still takes the knowledge of a business person to understand the semantics of each data element. It also takes a willingness and commitment from the business people to negotiate and agree on how to standardize the data.
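To make the profiling step concrete, here is a minimal sketch of how a tool might flag potential duplicates for a business person to review. The record layout, the sample customer names, and the normalize and duplicate_candidates helpers are all hypothetical, and matching on a normalized name is only one simple heuristic, not how any particular product works.

```python
from collections import defaultdict


def normalize(name: str) -> str:
    """Lowercase, drop punctuation, and collapse runs of whitespace."""
    kept = "".join(ch for ch in name.lower() if ch.isalnum() or ch.isspace())
    return " ".join(kept.split())


def duplicate_candidates(records):
    """Group record ids whose normalized names are identical.

    The result is a list of candidate groups only; a business person
    still has to decide whether each group is a true duplicate.
    """
    groups = defaultdict(list)
    for record_id, name in records:
        groups[normalize(name)].append(record_id)
    return [ids for ids in groups.values() if len(ids) > 1]


# Hypothetical sample data: (record id, customer name)
customers = [
    (1, "Acme Corp."),
    (2, "ACME  corp"),
    (3, "Globex Inc"),
]
candidates = duplicate_candidates(customers)
```

The sketch only narrows the search. Whether records 1 and 2 really describe the same customer remains a semantic question for the business.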
Everyone seems to be getting on the data quality bandwagon. The vendors talk about it, and much is written about it. Yet, in many companies, a lack of management awareness and support for data quality continues to be a problem. Once again, there is no “silver bullet” solution that can automagically turn your dirty data into good-quality data. While there are data cleansing tools that can help, it still takes the knowledge of a business person to define the intended meaning and the intended content of each data element. These have to be documented as business rules before they can be fed to any data cleansing tool or business rules engine.
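As an illustration of what “documented as business rules” could look like once the business has agreed on them, the sketch below pairs each rule's plain-language wording with a check a tool can run. The field names, the rules themselves, and the violations helper are assumptions made up for this example.

```python
# Each entry pairs the wording agreed with the business and a check
# that returns True when a record satisfies the rule.
# Field names and rules are hypothetical.
RULES = [
    ("order_total must not be negative",
     lambda r: r["order_total"] >= 0),
    ("status must be one of OPEN, SHIPPED, CLOSED",
     lambda r: r["status"] in {"OPEN", "SHIPPED", "CLOSED"}),
]


def violations(record):
    """Return the descriptions of every rule the record breaks."""
    return [description for description, check in RULES if not check(record)]


good = {"order_total": 120.0, "status": "OPEN"}
bad = {"order_total": -5.0, "status": "OPEN"}
```

Keeping the agreed wording next to the check means a failed audit can be reported in the business's own terms rather than in code.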
Metadata is one of the vehicles for achieving data standardization and integration. Metadata is contextual information about IT assets, such as data, processes, programs, and so on. Metadata components for data assets include business definitions, domains (valid values), data formats (type and length), business rules for creating the data, transformation and aggregation rules, security requirements, ownership, sources (operational files and databases), timeliness, and applicability, just to name a few. Not many companies capture all of these metadata components, and those that do often don’t make effective use of them.
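A few of the metadata components listed above can be sketched as a simple record. This is an illustrative structure only: the DataElementMetadata class, its field names, and the sample customer_status element are hypothetical, and a real metadata repository would capture far more.

```python
from dataclasses import dataclass


@dataclass
class DataElementMetadata:
    """Hypothetical metadata record for a single data element."""
    name: str
    business_definition: str
    data_type: str                         # data format: type
    length: int                            # data format: maximum length
    valid_values: frozenset = frozenset()  # domain; empty means unrestricted
    owner: str = ""                        # accountable business owner
    sources: tuple = ()                    # operational files and databases

    def accepts(self, value: str) -> bool:
        """Check a value against the recorded length and domain."""
        if len(value) > self.length:
            return False
        return not self.valid_values or value in self.valid_values


# Sample element; every value here is made up for illustration.
customer_status = DataElementMetadata(
    name="customer_status",
    business_definition="Current standing of the customer relationship.",
    data_type="char",
    length=1,
    valid_values=frozenset({"A", "I", "P"}),
    owner="Customer Service",
    sources=("crm.customers",),
)
```

Once the definition, domain, and format are recorded in one place, the same record can serve both as documentation and as something tools can check data against.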
Metadata is no longer the dirty “D” word: documentation. It is now the nice “N” word: navigation. Documentation is often considered to be an IT overhead; after all, programmers can read the code, which is usually more reliable than any documentation anyway. However, navigation cannot be dismissed so easily because it is an essential tool for business people to navigate through their BI/DW/analytics environment.
Data modeling and data architecture
To most IT technicians, data modeling is synonymous with database design. When I ask, “Who invented entity-relationship modeling?” I get the answer: Dr. Codd. Wrong. Entity-relationship modeling, the first data modeling technique, was invented in the mid-1970s by Dr. Peter Chen. Dr. Codd introduced normalization for relational databases and published the famous 12 rules of relational databases (actually 13 rules when counting Rule Zero), but he did not invent the entity-relationship modeling technique.
Data modeling originated as a business modeling technique before it became a database design technique. In the early days of relational databases (early 1980s), the business data model, known as the logical data model, and the database data model, known as the physical data model, looked very similar because we did not de-normalize heavily for operational systems. In the BI/DW world, those two models are quite dissimilar, especially when we store the data multi-dimensionally.
Regardless of what database design schema you ultimately choose for storing the data, you should still create business data models for understanding the semantics and the business rules of the data first. Without understanding the semantics and the business rules of the data, it is impossible to standardize the data.
Data ownership and stewardship
A data strategy is a strategic plan for enterprise-wide data management and data governance. Who in the organization is, or should be, most interested in enterprise-wide data governance? IT staff or business people? Clearly, the business people. Thus, instituting enterprise-wide data governance is a business responsibility. This responsibility starts with data ownership and extends to data stewardship. The data owners are usually the originators of the data, or they are the primary users of the data. In either case, data owners are senior business people who have the authority to set policies and to create business rules for the data elements under their control. Data stewards also come from the business side, not from IT. They do not have authority to set policies or to create business rules, only to communicate and to enforce those policies and business rules. Data stewards also perform data audits and help resolve data disputes. Without data stewards and data governance professionals, it would be very difficult to fully implement a data strategy.
In summary, a data strategy is an essential and fundamental building block for every organization. Adding more applications to our IT portfolios without a data strategy is like building a skyscraper without first pouring a foundation. The risks are obvious and can cause significant damage.