Fully architected data warehouse / analytics solutions, not independent data marts, are the right way to secure all the benefits of integrated data.
The need for an architected solution for decision support / analytics data, as opposed to the proliferation of independent data marts, is becoming increasingly apparent. The spread of independent data marts has created “islands of data” that resist connection and analytical use, making them agents of costly disease instead of sources of support for business intelligence and decision-making.
“Stranded on Islands of Data,” Part One of this two-part series, covered the characteristics of independent data marts, the flaws in their architecture and the reasons why they exist. This article, Part Two, focuses on the approaches to migration, initial planning, and how to identify a migration path and implement an architected solution for decision support / analytics. It also presents a fictitious case study, drawn from EWSolutions clients, that illustrates how a corporation can migrate from independent data marts to an architected solution of an enterprise data warehouse with multiple dependent data marts.
| Approach | Advantages | Disadvantages | When to Use |
|---|---|---|---|
| Big Bang | 1. Provides the fastest path for migration. 2. Allows immediate economies of scale. | 1. Labor intensive. 2. Requires tremendous coordination. 3. Complex parallel testing. 4. Most risk. | Best used when the independent data mart problem is not very pervasive. |
| Iterative | 1. Reduces risk. 2. Lessons learned are leveraged. 3. Does provide eventual economies of scale. | 1. Migration time is elongated. 2. Multiple development efforts need to be managed and coordinated. | Best used when the independent data mart problem is large and complex. |

Table 1: Big Bang versus Iterative Approach
Approaches to Migration
There are two general approaches for migration: “Big Bang” and “Iterative.” Table 1 summarizes the advantages and disadvantages of each approach.
Big Bang Approach:
As the name implies, all of the independent data marts are re-engineered simultaneously into a structured DSS architecture. This approach has two main advantages. First, it can provide the fastest path for migration. Companies often need to change their DSS architecture as quickly as possible in order to implement additional DSS projects that promise a high ROI. Second, it delivers economies of scale immediately, rather than attaining them slowly as an iterative approach does. The disadvantages are that it is labor intensive, requires tremendous coordination and involves complex parallel testing. As the more complex of the two approaches to implement, “Big Bang” also carries the most risk.
This approach is best suited to situations where the independent data mart problem is relatively small and not highly complex. When the problem is large, the complexity of the migration grows at a tremendous rate.
Iterative Approach:
This approach re-engineers the independent data marts in manageable phases, one or two data marts at a time. It has several advantages. First, it allows a company to manage and reduce the risk involved in a migration effort, because migrating in phases increases the probability of the project’s success. Second, as each project phase is executed, lessons are learned and leveraged in subsequent phases. This is very valuable: once the first phase is completed, the follow-up phases typically run much more smoothly. Third, it still provides economies of scale, although they arrive gradually.
The major disadvantage to this approach is that it takes longer to fully complete the migration. This approach is best used when the independent data mart problem is large and too complex to tackle all at once.
Many companies fail in their migration efforts well before they start. The chief reason is a lack of initial planning and sponsorship. Attaining executive sponsorship is one of the most important tasks at the outset of the project, because autonomous teams in different corporate departments have typically constructed each independent data mart. A project champion who has cross-departmental authority is therefore critical for dealing with the political challenges that are commonplace in these migration efforts.
During the initial planning phases, it is important to plan on implementing a meta data repository that can support future DSS development efforts and that will provide a semantic layer between the business users and the DSS system. The data mart migration provides an outstanding opportunity to implement the meta data repository. Before the data mart migration begins, it is best to standardize the data naming nomenclature for the DSS system. Implementing standard data naming nomenclature will aid in the DSS system’s maintenance and provide cleaner and more understandable meta data.
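A naming standard is easier to enforce when it can be checked automatically as marts are migrated. The sketch below is illustrative only: the convention it checks (lowercase snake_case names ending in an approved class word such as `id` or `amt`) and the names `CLASS_WORDS`, `violations` and `audit` are assumptions made for this example, not a standard prescribed by this article.

```python
import re

# Hypothetical standard: lowercase snake_case ending in an approved class word.
CLASS_WORDS = {"id", "amt", "dt", "nm", "cd", "qty"}
NAME_PATTERN = re.compile(r"^[a-z][a-z0-9]*(_[a-z0-9]+)*$")


def violations(column_name):
    """Return a list of naming-standard problems for one column name."""
    problems = []
    if not NAME_PATTERN.match(column_name):
        problems.append("not lowercase snake_case")
    # The class word is the last underscore-separated token of the name.
    suffix = column_name.lower().rsplit("_", 1)[-1]
    if suffix not in CLASS_WORDS:
        problems.append("missing approved class word suffix")
    return problems


def audit(columns):
    """Map each non-compliant column name to its list of problems."""
    report = {}
    for name in columns:
        problems = violations(name)
        if problems:
            report[name] = problems
    return report


# Example column names are invented for illustration.
for name, problems in audit(["customer_id", "order_amt", "OrderDate"]).items():
    print(name, "->", "; ".join(problems))
```

Running a check like this against each mart's column list before migration gives an early view of how much renaming the meta data repository will have to absorb.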
[Table 2: Independent Data Mart Migration. Column heading: Independent Data Mart Research]
A great deal of research should be conducted on the independent data marts before a migration is possible. (Table 2 summarizes these tasks.) The most important research activity is to understand the business needs that each independent data mart is meeting. Typically multiple independent data marts will exist to meet the same or similar business needs. These situations are common and do suggest a path for migration. The results of this research will identify the independent data marts that will be the most difficult to migrate.
The independent data mart migration is an excellent time to standardize on hardware and software for the DSS project. For each differing software or hardware platform, a company needs trained personnel to support it. By limiting redundant software and hardware, the corporation reduces the support strain on its IT staff and can also achieve purchasing economies of scale.
The central covenant of any independent data mart migration effort is to “never deliver less functionality to the business users than what they have today.” Generally business users do not react well to spending money on infrastructure because they don’t initially see its value. The key business users need to understand that a bad system architecture leads to a non-scalable and non-flexible system that will eventually need to be rewritten at a very high cost. Therefore, during migration the users must be assured that they will not receive less functionality (information, ease of use and response time) than they are currently receiving.
There are several activities that need to be conducted before a migration path will be evident.
First, diagram the current DSS architecture. This is critical for identifying which legacy systems are feeding which independent data marts (See Figure 1).
Often independent data marts will be sourced from the same legacy systems. By targeting independent data marts with the same source data, multiple independent data marts often can be removed with minimal effort. Identifying redundant data often suggests a migration path.
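Once the architecture diagram exists, the overlap analysis described above can be expressed as a simple set comparison. The following is a minimal sketch under stated assumptions: the mart-to-source inventory in `MART_SOURCES` is invented for illustration (it is not the content of Figure 1), and the function names are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical inventory: data mart -> legacy source systems that feed it.
MART_SOURCES = {
    "marketing": {"order_entry_old", "order_entry_new"},
    "finance": {"order_entry_old", "order_entry_new", "general_ledger"},
    "quality_control": {"logistics"},
}


def shared_sources(mart_sources):
    """Return {(mart_a, mart_b): common sources} for every overlapping pair."""
    overlaps = {}
    for a, b in combinations(sorted(mart_sources), 2):
        common = mart_sources[a] & mart_sources[b]
        if common:
            overlaps[(a, b)] = common
    return overlaps


def marts_by_source(mart_sources):
    """Invert the inventory: legacy source -> set of marts it feeds."""
    index = defaultdict(set)
    for mart, sources in mart_sources.items():
        for source in sources:
            index[source].add(mart)
    return dict(index)


for pair, common in shared_sources(MART_SOURCES).items():
    print(pair, "share", sorted(common))
```

Marts that appear together in `shared_sources` are candidates for migration in the same phase, since their extracts from the common legacy systems can be built once in the warehouse.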
Identify Paths of Least Resistance
It is important to target those independent data marts whose data will most likely be used in future DSS efforts. Targeting these data marts first eases the task of keeping all new DSS development activity within the newly architected environment.
The next step is to identify those data marts whose transformation rules are known and documented. Understand that even the best-documented transformation rules will have gaps. Moreover, even marts that have been built using ETL (extraction/transformation/load) tools have meta data (documentation) gaps. For example, ETL tools often provide the ability to call user exits, which are hand-coded programs. The processing performed by these user exits is not captured in the ETL tool’s meta data stores. If documentation does not exist for a mart, programmers will need to analyze the ETL program’s code manually to extract the transformation rules, which is a very time-consuming and expensive activity.
It will be critical to obtain support from the current independent data mart IT teams and business users. Identify those data mart teams most likely to work cooperatively with the centralized DSS team. Recognize the strengths and weaknesses of those teams that can and will provide the most aid. If particular data mart teams/business users are not willing to assist with the migration effort, it is best to delay the migration of their particular data mart. If this is not an option, utilize your executive sponsorship to “motivate” this group to provide their support.
Strengths and weaknesses:
Keep in mind that any team will have stronger and weaker areas of knowledge. As much as possible, keep teams’ areas of weakness off the critical path. Any mission-critical team weaknesses need to be shored up with members from the other data mart teams or from outside vendors.
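The criteria discussed in this section (likely future use of the data, documented transformation rules, and team cooperation) can be recorded per mart and used to propose a migration order. This is a rough sketch only: the equal weighting of the three criteria, the `MartAssessment` type and the sample assessments are all assumptions made for illustration, and a real ranking would rest on the research described earlier.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MartAssessment:
    name: str
    future_reuse: bool      # data likely to feed future DSS efforts
    rules_documented: bool  # transformation rules known and documented
    team_cooperative: bool  # IT team and business users support the migration

    def score(self) -> int:
        # Assumption: each criterion counts equally (True adds 1).
        return sum((self.future_reuse, self.rules_documented, self.team_cooperative))


def migration_order(marts):
    """Sort marts from least to most resistance; ties keep their input order."""
    return sorted(marts, key=lambda m: m.score(), reverse=True)


# Invented assessments, for illustration only.
candidates = [
    MartAssessment("quality_control", False, True, False),
    MartAssessment("marketing", True, True, True),
    MartAssessment("acquired_marketing", False, False, False),
    MartAssessment("finance", True, False, True),
]

for mart in migration_order(candidates):
    print(mart.score(), mart.name)
```

A score like this is only a starting point for discussion; a mart with an uncooperative team may still need to move early if executive sponsorship requires it.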
Table 3: DSS Background
The following case study puts the concepts discussed above into action. This case study, drawn from various EWSolutions clients, illustrates the iterative approach to independent data mart migration.
The XYZ company is a Fortune 500 consumer electronics firm. XYZ recently acquired a smaller company (Acme Electronics) that has a single marketing data mart about which little is known. In addition, XYZ is standardizing on a new order entry system in five years, and existing batch windows for the legacy systems have reached their limit. XYZ’s management team is stable, well organized and fully supports the migration effort. Table 3 lists the DSS specific details.
Phase One: The data shows that XYZ’s marketing and finance data marts share two common data sources (the old and new order entry systems). The XYZ marketing data mart also has a strong end-user community that will be highly supportive of the migration effort. In addition, the business users of both the marketing and finance data marts have agreed to freeze their requests for additional functionality during phase one of the migration.
Phase one does not include migrating the XYZ quality control data mart or the Acme marketing data mart, because of the lack of support for the quality control mart and all the unknowns associated with the Acme marketing mart.
Phase Two: During this phase, the operational logistical system’s data is brought into the data warehouse, and the quality control data mart is sourced directly from the enterprise data warehouse. The marketing and finance teams’ change requests that were frozen during phase one are developed. Lastly, a new dependent accounting data mart is sourced from the data warehouse.
Phase Three: This phase merges the functionality of the former Acme Electronics marketing data mart into the existing dependent marketing data mart. Also, additional data marts are continuing to appear (e.g., CEO data mart). Figure 2 illustrates all three phases of the DSS architecture.
It is important to understand that migrating from an independent architecture is a costly proposition that will only get more expensive and difficult as time goes on. Remember, as with any disease, the earlier it is detected and treatment begins, the sooner the patient will become healthy. If treatment is delayed, the patient’s condition will worsen and eventually become terminal. In technical terms, the sooner an organization recognizes the continuing challenges and lack of benefits of staying with an independent data mart approach to decision support / analytics system design, and moves to a properly architected solution as described here, the sooner it can realize all the benefits and start to reduce the growing costs of “stranded islands of data.”