Independent data marts create stranded, unusable data.  Independent data marts can spread like a disease through an organization and must be eradicated.

There is a severe disease that has spread to epidemic proportions throughout our society. This disease is particularly dangerous because its effects are not readily identifiable at the time of infection. However, if this condition goes untreated, it can be debilitating and even terminal. This disease is not hepatitis, but rather “independent” data marts. While this imagery may seem a bit dramatic, unfortunately it reflects the reality in many companies. This has been the condition of many EWSolutions’ clients at the beginning of a data warehouse / business intelligence / analytics engagement.

This article is the first of a two-part series on migrating from independent data marts to an architected solution.  This installment will address the characteristics of independent data marts, the flaws in their architecture and the reasons why they exist.  Part two (Data Mart Migration)  will address specifically how a company can migrate from the independent data mart architecture to an architected solution.

Characteristics of Independent Data Marts

Independent data marts are characterized by several traits.  First, each data mart is sourced directly from the operational systems without an enterprise data warehouse to supply the architecture necessary to sustain and grow the data marts. Second, these data marts are typically built independently from one another by autonomous teams. Typically, these teams will utilize varying tools, software, hardware and processes.

[Figure 1: “spaghetti” chart of independent data marts]

Possibly the most visually descriptive trait of a company that has constructed independent data marts is that once it maps out a schema of its decision support / analytics systems (DSSs), the schema resembles a “spaghetti” chart (see Figure 1). What is most disturbing is the number of companies that have said this chart resembles their current DSS architecture.

Obviously, this architecture is not an architecture at all.  Instead it is a series of “stovepipe” DSS systems.  This architecture greatly differs from that of an architected data warehouse (see Figure 2).

[Figure 2: architected data warehouse]

The purpose of this article is to discuss independent data marts and the process for migrating to an architected solution. It will touch only briefly on the topic of DSS / analytics architecture. It will not go into a detailed discussion of top-down versus bottom-up approaches, except to say that the “classic” top-down approach is a more scalable and logical approach for constructing a DSS or analytics system. It is surprising how often the top-down methodology is mistaken for a “galactic” approach. This is a misunderstanding, as the top-down approach is best used iteratively and incrementally to build the DSS. When used in this fashion, the cost of building a data warehouse that feeds “dependent” data marts becomes highly comparable to the cost of building independent data marts.

Problems with Independent Data Marts

Redundant Data:

As the number of independent data marts grows, the amount of redundant data begins to grow uncontrollably across the enterprise.  This redundancy occurs because each of the independent data marts requires its own, typically duplicated copy of the detailed corporate data. Often a great deal of this detailed data is not required in the data marts, which typically provide summarized views.

It would be enlightening if a study were conducted to calculate the costs of maintaining unnecessary redundant data for Fortune 1000 companies. The end total would likely be in the billions of dollars in expenses and lost opportunity.

Redundant Processing:

A data warehouse provides the architecture to centralize integration and cleansing activities common to all of the data marts of a company. Without the data warehouse, all of these integration and cleansing processes need to be duplicated for all of the independent data marts. This greatly increases the number of support staff required to maintain the DSS / analytics system, creating a particularly disastrous situation for most companies in light of today’s IT staffing shortage.
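The idea of centralizing cleansing can be sketched in a few lines of Python. This is only an illustration: the record layout, the cleansing rules, and the names `cleanse_customer`, `warehouse_customers`, `sales_mart` and `marketing_mart` are all hypothetical and do not come from any particular tool. The point is that the rule set is written and maintained once, and each dependent data mart reuses its output.

```python
def cleanse_customer(record):
    """Hypothetical shared rule set, maintained in one place (the warehouse)."""
    return {
        # Collapse stray whitespace and normalize capitalization.
        "name": " ".join(record["name"].split()).title(),
        # Normalize the state code to trimmed upper case.
        "state": record["state"].strip().upper(),
    }


# Raw rows as they might arrive from an operational system (made-up data).
raw_customers = [
    {"name": "  acme   corp ", "state": "il "},
    {"name": "BIRCH SUPPLY", "state": " Wi"},
]

# The warehouse cleanses each row exactly once.
warehouse_customers = [cleanse_customer(r) for r in raw_customers]

# Dependent data marts read the already-cleansed rows; they carry no
# cleansing logic of their own.
sales_mart = [c for c in warehouse_customers if c["state"] == "IL"]
marketing_mart = [c["name"] for c in warehouse_customers]
```

With independent data marts, each team would have to write and maintain its own version of `cleanse_customer`, and nothing guarantees that the versions would agree.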

Separate teams will typically build each of the independent data marts in isolation. As a result, these teams do not leverage one another’s standards, processes, knowledge and lessons learned. This results in a great deal of rework.

These autonomous teams will commonly select different tools, software and hardware. This forces the enterprise to retain skilled employees to support each of these technologies. In addition, a great deal of financial savings is lost, as standardization on these tools doesn’t occur. Often a software, hardware or tool contract can be negotiated to provide considerable discounts for enterprise licenses. These economies of scale can provide tremendous cost savings to the organization.

Scalability:

Independent data marts directly read operational system files and/or tables, which greatly limits the DSS’s ability to scale. For example, if a company has five independent data marts, it is likely that each data mart would require customer information. Therefore, there would be five separate extracts pulled from the same customer tables in the operational system of record. Most operational systems have limited batch windows and cannot support this number of extracts. With a data warehouse, only one extract is required from the operational system of record.
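The five-mart example can be made concrete with a small Python sketch. Everything here is hypothetical: `OperationalSystem` merely counts how many times it is read, and the mart names are invented for illustration. It is a toy model of the extract load, not a description of any real extract tool.

```python
from dataclasses import dataclass


@dataclass
class OperationalSystem:
    """Hypothetical system of record that counts each extract run against it."""
    customers: list
    extract_count: int = 0

    def extract_customers(self):
        # Every call stands for one batch extract competing for the batch window.
        self.extract_count += 1
        return list(self.customers)


MARTS = ["sales", "marketing", "finance", "service", "risk"]  # made-up names


def load_independent(source, marts):
    # Each independent mart pulls its own copy straight from the source.
    return {mart: source.extract_customers() for mart in marts}


def load_dependent(source, marts):
    # The warehouse extracts once; the marts are fed from the warehouse copy.
    warehouse = source.extract_customers()
    return {mart: list(warehouse) for mart in marts}


independent_source = OperationalSystem(customers=["Acme", "Birch"])
load_independent(independent_source, MARTS)

dependent_source = OperationalSystem(customers=["Acme", "Birch"])
load_dependent(dependent_source, MARTS)

print(independent_source.extract_count, dependent_source.extract_count)
```

In this model the number of extracts against the operational system grows with the number of independent marts, while the warehouse-fed design stays at one extract no matter how many marts are added.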

Non-Integrated:

As previously discussed, each independent data mart is built by an autonomous team, typically working for a separate department. As a result, these data marts are not integrated and none of them contains an enterprise view of the corporation. Therefore, if the CEO asks the IT department to provide a “listing of our most profitable customers,” each data mart will offer a different answer. Having worked with companies that have experienced this exact situation, I can attest that the CIO is rarely pleased to have to explain why his department cannot answer this seemingly simple question.
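A toy Python example shows how this can happen even when every mart holds the same customers. The figures and the three departmental definitions of “profitable” below are invented for illustration; the assumption is simply that each team defined the measure on its own.

```python
# Made-up customer figures shared by all three hypothetical marts.
customers = [
    {"customer": "Acme", "revenue": 1000, "cost_of_goods": 700, "support_cost": 250},
    {"customer": "Birch", "revenue": 800, "cost_of_goods": 400, "support_cost": 50},
    {"customer": "Cedar", "revenue": 900, "cost_of_goods": 450, "support_cost": 420},
]


def top_customer(rows, measure):
    """Return the customer that ranks highest under one mart's measure."""
    return max(rows, key=measure)["customer"]


# Each department encoded its own meaning of "most profitable".
mart_definitions = {
    "sales": lambda r: r["revenue"],
    "finance": lambda r: r["revenue"] - r["cost_of_goods"],
    "service": lambda r: r["revenue"] - r["cost_of_goods"] - r["support_cost"],
}

answers = {
    mart: top_customer(customers, measure)
    for mart, measure in mart_definitions.items()
}
print(answers)
```

Here the sales mart names Acme, the finance mart names Cedar, and the service mart names Birch. An enterprise data warehouse forces the organization to agree on one definition before the dependent marts are built.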

One of the chief phenomena facing corporations today is the current merger and acquisition craze.  Interestingly enough, one of the key factors fueling this movement is these companies’ desire to reduce their IT spending. In light of this situation, the costs associated with independent data marts become even more magnified as companies continue to focus on controlling their ever-growing IT costs.

It is important to note that many companies that have built independent data marts are currently in the process of migrating away from them. The cost of the migration, in both dollars and time, is not trivial.

Why Do Independent Data Marts Exist?

With all of these architectural flaws, it would seem surprising that so many companies have built their DSS systems around this architecture. There are several reasons why this aberration has occurred.

DSSs Are Complex:

When the decision support craze spread, most companies were looking to build a data warehouse of their own. Unfortunately, the task of building a well-architected and scalable business intelligence system is complicated and requires sophisticated software, expensive hardware and a highly skilled and experienced team. Finding data warehouse architects and project leaders who truly understand data warehouse architecture is a daunting challenge, both in the corporate and consulting ranks.

In order to construct a data warehouse, an organization must truly come to terms with its data and the business procedures that the data represents. While this task is challenging, it is a necessary step and one from which the true value of the DSS / analytics process is derived.

Independent Data Mart Shortcut:

Building independent data marts is less expensive up front than building architected decision support systems. In addition, independent data marts can be constructed fairly quickly and, unlike a data warehouse, do not require a company to really understand its data beyond that of individual departments. These points have been effectively used to sell the concept of constructing independent data marts. Unfortunately, it is this lack of thorough analysis and long-term planning that prevents independent data marts from being an effective business intelligence system.

Inappropriate Vendor Messages:

Many vendors have developed tools that are effective at building small, departmental independent data marts. These companies, in their rush to market with these tools, have worked very hard at selling the independent data mart concept (of course, it is never worded like this). The reasons are obvious. These companies can significantly reduce their sales cycles because only one department is involved in the software purchasing decision. In addition, their software requires much less sophistication because it merely needs to build a standalone data store.

The current vendor buzzword is “turnkey.” Everyone seems to offer a “turnkey” DSS solution. Unfortunately, merely purchasing a “turnkey” solution does not alleviate the task of learning and understanding a corporation’s data and its business processes. Integration of data from disparate systems requires a careful analysis and an understanding of business processes and the data that represents them. There isn’t a “magic bullet” or “turnkey” solution that alleviates this task.

Conclusion

Building independent data marts does not solve the challenges of users’ needs for business intelligence or decision support / analytics data.  Rather, it creates a collection of stranded islands of data that must be navigated manually or through costly integration techniques.  Part two (Data Mart Migration) will address specifically how a company can migrate from the independent data mart architecture to an architected solution.


Dr. David P. Marco, LinkedIn Top BI Voice, IDMMA Data Mgt. Professional of the Year, Fellow IIM, CBIP, CDP

Dr. David P. Marco, PhD, Fellow IIM, CBIP, CDP is best known as the world’s foremost authority on data governance and metadata management. He is an internationally recognized expert in the fields of CDO, data management, data literacy, and advanced analytics. He has earned many industry honors, including being named to Crain’s Chicago Business “Top 40 Under 40” and being named by DePaul University as one of their “Top 14 Alumni Under 40”, and he is a Professional Fellow in the Institute of Information Management. In 2022, CDO Magazine named Dr. Marco one of the Top Data Consultants in North America and IDMMA named him their Data Management Professional of the Year. In 2023 he earned LinkedIn’s Top BI Voice. Dr. Marco won the prestigious BIG Innovation award in 2024. David Marco is the author of the two widely acclaimed, top-selling books in metadata management history, “Universal Meta Data Models” and “Building and Managing the Meta Data Repository” (available in multiple languages). In addition, he is a co-author of numerous books and has published hundreds of articles, some of which are translated into Mandarin, Russian, Portuguese, and other languages. He has taught at the University of Chicago and DePaul University. DMarco@EWSolutions.com

© Since 1997 to the present – Enterprise Warehousing Solutions, Inc. (EWSolutions). All Rights Reserved
