When and How to Federate
By Douglas Hackney
In previous articles we've explored the reasons why a single, monolithic data warehouse (DW) architecture is incapable of supporting the reality of today's heterogeneous, speed driven market (past columns are available at www.egltd.com and www.dmreview.com). This month, we'll examine when it is appropriate or required to move to a federated BI architecture.
As the rapid growth and diversification of the business intelligence (BI) market shows, our current and future reality is driven by a wide panoply of business-driven solutions focused on solving domain-specific problems. There is little doubt that the BI market has outgrown its humble and relatively simple technologist and architecture roots. In fact, the market research company survey.com projects that the business intelligence market will be worth $148 billion by 2003. In a market of this size, complexity and impact, a simple, monolithic architecture is desperately inadequate.
This current and future BI market is built on the foundation of a modern BI infrastructure, consisting of a federated BI architecture accommodating all the components of a contemporary BI system: packaged/turnkey DWs and data marts (DM), packaged/turnkey analytical applications (AA), custom built DWs and DMs, custom built AAs, data mining, online analytical processing (OLAP) tools, query and reporting (Q&R) tools, production reporting tools, data quality tools, extraction transformation and load (ETL) tools, system management tools, information delivery tools, enterprise information portals, reporting systems, knowledge management systems, database systems, etc. The federated BI architecture is the "big tent" that provides the foundation and environment to facilitate and enable business information flow, analysis and decision making in the typical organization's heterogeneous environment.
When to Federate
Sites that are just embarking on the design and implementation of a BI system will most likely choose a federated architecture by default, as their system will be composed of a mixture of packaged and custom solutions. Sites that have started by attempting to develop a single, monolithic system are often highly stressed when faced with business driven scenarios that their old-think architecture is not optimized to support.
You should design and develop a federated BI architecture when you are faced with the following scenarios:
Mergers and Acquisitions (M&A)
The wave of M&A activity across all industry segments has presented the BI professional with a very challenging scenario. Often, teams that are happily humming along building their custom, hub-and-spoke monolithic system awake one day to learn that they are the proud owners of a new company or two, along with their accompanying BI systems. It very rarely makes sense to flush a fully functioning data warehouse infrastructure down the toilet, so these teams are forced to adopt a federated BI architecture.
Turnkey ERP DW(s) + Custom DW(s)
Today's market is quickly moving from the "build" stage of evolution to a "buy" mode. Just as we all used to develop general ledger and order entry systems, but would never even consider it today, BI is moving from hand-built, custom systems to packaged offerings. Teams that have a partially completed monolithic data warehouse system (which means 99.99% of the systems on this planet) are often presented with a packaged data warehouse as an extension to the ERP or OLTP system in which the business has just invested $20-40 million. In this case, it is not conducive to career development to reject this turnkey system. Don't look a gift horse in the mouth, as it were, and at a minimum consider it to be a wonderful way to extract data from the arcane innards of the ERP package.
Multiple DW / DM Systems
Organizations of any size, especially those that have been involved with data warehousing for more than a year or two, all have multiple DW / DM systems. A typical large scale site has between 10 and 40 data warehouse systems and countless scores of data mart systems of varying degrees of architecture. These sites must adopt a federated BI architecture to achieve any semblance of data integration across the multiple systems.
Turnkey analytical applications
Turnkey analytical applications (AA) at extremely low price points ($5,000 - $50,000) are sweeping the marketplace. These systems are sold directly to the business users and are usually dumped on the doorstep of IT as soon as the purchaser is promoted or moves on to greener pastures. Most of these systems are non-architected and many are only marginally capable of being integrated at all. This scenario requires a federated BI architecture to accommodate these high-power, high-impact, politically sensitive solutions.
Speed to market driven solutions
There is nothing more important in today's business world than speed to market. It will trump architecture every time. In a survey of over 1,500 BI professionals, I've found only one example of a business delaying an income-adding or margin-affecting BI solution to allow for architecture. In order to be part of the solution, and to facilitate speed to market, you must accommodate the time requirements of the business in your federated BI architecture.
Be ready for these inevitable scenarios, and be part of the solution instead of part of the problem, by laying the foundation of a federated BI architecture today.
How to Federate
- Create a communication forum for the data warehouse (DW) and data mart (DM) teams in your enterprise. A typical large organization has ten to forty DW systems and teams, and scores of DM systems and teams, all of various levels of architecture and functionality. The most important thing for you to do in your efforts to federate these stovepipe DW systems is to create an opportunity, structure and forum for cross team communication. Your goals are to identify and understand the data content of the systems, the key stakeholders of the systems and the political drivers & sponsorships of the systems. The business is very sensitive to duplication of effort and to penetration of the various data fiefdoms. You need to pay particular attention to who the political "owners" of the various systems are, and to their level of sensitivity to data sharing across the enterprise of "their" data.
- Document your existing DW/DM systems via a high-level enterprise data warehouse architecture (EDWA). The highest level is an entity-level diagram showing the various systems and any existing cross data flow and meta data exchange between them. (See figure one.) This investigation and documentation process is accomplished via presentations by each team at your communication forum and by follow-on interviews with each team. During this process, you will discover that many of these systems would not pass any sort of industry standard test to qualify as a true "Data Warehouse," but the teams are very wedded to this term for political and budget reasons. You must be sensitive to these issues, and not be derisive or negative in any way about their efforts and their systems. Documenting all the multiple DW/DM systems takes a significant investment of time and resources, and should not be taken lightly. It is also a prerequisite to achieving the goal of the federated BI architecture. You cannot identify and share critical data across the multiple systems without identifying and documenting it.
- Document each of the existing DW/DM systems at the data flow level. This level includes data flow from each data source, any transformation and integration steps and meta data repositories. (see figure two) In this stage of documentation, you will be identifying each of the data source systems associated with each of the DW/DM systems in your enterprise. Rate each major data element in terms of quality, availability and ease of access. You will also be documenting each of the ETL stages, intermediate data access stages, and systems that use the resulting integrated data. It is important to clearly identify the state of quality, aggregation and utilization of the data at this step. You will need to identify data that is "in-process data," or data that is being extracted while it is in the midst of an OLTP process. Often "in-process" data requires special business rules to enable utilization by business users. This is also an excellent opportunity to rate or tag each of the utilization stages as to the political sensitivity of use or sharing with other groups in the organization.
- In conjunction with your users, determine what data offers value-add and impact across multiple systems. This step requires the involvement and commitment of business resources, which is often a challenging commodity. You must have this commitment of dedicated resources to accomplish your goal, so don't start down the federated BI architecture path without an up-front commitment by your business user community and key top-level stakeholders. The critical characteristic you are looking for is high-impact unique data integration. This is data that is formed by the integration of two or more data sources (in this case, DW/DM systems around the business), is not available anywhere else in your enterprise and provides politically meaningful impact to the business.
- Collect the various build phase candidates that derive from step 4 and analyze them to determine impact and viability. (To help with this analysis there is a free build phase candidate automated assessment in the resource library at www.EGLtd.com.) Pick the candidate that provides the best balance between business impact and risk. One of the most important elements to examine is whether the phase candidate advances the strategic agenda(s) of the business and the top-level stakeholders. If your phase candidate is providing nothing more than an interesting data set, or something important to a mid-level manager or VP, it will lack the political "critical mass" to survive the budgeting process and other aspects of the Machiavellian world of corporate politics.
- Implement an enterprise class ETL tool that supports a common, global meta data repository across the multiple DW/DM systems in your enterprise. An enterprise class ETL tool is required to provide the backbone of data extraction, transformation and integration required for a sustainable federated BI architecture. The most critical elements you gain from the implementation of an ETL tool are a fighting chance to maintain what you have built and some amount of automated meta data population and maintenance. If you attempt to build your federated BI architecture ETL processes by hand, you will probably be able to get them built with your team of the "best and the brightest," but you will struggle to maintain them with the much lower talent and productivity level of a maintenance level resource.
- Build a small, focused iteration of the federated BI architecture based on the winning candidate from step 5. Document and publicize success to establish and sustain the political will required for future iterations. Your keys to success in this step are to keep your iterations very small in scope, very focused on meaningful business pain, measurable (you must be able to measure your success) and marketable. It does no good to build a life-saving system for the business if you cannot measure and demonstrate that success through internal marketing and advertising efforts. The goal of the federated BI architecture process, as in all DW initiatives, is to engender and sustain political will in the enterprise. Political will translates into resources and funding for your project(s). If you don't invest as much time in managing the politics as you do the technologies, you won't survive very long. In your federated BI architecture iterations, stay small and stay focused on the measurable relief of politically significant business pain.
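The documentation and selection steps above (the second through fifth) lend themselves to a simple, machine-readable inventory. What follows is a minimal Python sketch under stated assumptions, not a prescribed method: every class, field, system name and rating is hypothetical, the 1-to-5 rating scales are an assumption, and scoring a candidate as impact minus risk is only one possible way to approximate "the best balance between business impact and risk."

```python
from dataclasses import dataclass, field
from itertools import combinations
from typing import Dict, List, Set, Tuple


@dataclass
class DataElement:
    name: str
    quality: int          # assumed scale: 1 (poor) to 5 (excellent)
    availability: int     # same assumed 1-5 scale
    ease_of_access: int   # same assumed 1-5 scale
    in_process: bool = False  # extracted mid-OLTP process; needs special business rules


@dataclass
class System:
    name: str
    owner: str                 # political "owner" of the system
    sources: Set[str]          # data source systems feeding this DW/DM
    elements: List[DataElement] = field(default_factory=list)


@dataclass
class Candidate:
    name: str
    systems: Tuple[str, ...]   # DW/DM systems the build phase would integrate
    impact: int                # business impact, assumed 1-5
    risk: int                  # delivery and political risk, assumed 1-5
    strategic: bool            # advances a top-level stakeholder's agenda


def shared_sources(systems: List[System]) -> Dict[Tuple[str, str], Set[str]]:
    """Map each pair of systems to the source systems they both draw on."""
    overlap = {}
    for a, b in combinations(systems, 2):
        common = a.sources & b.sources
        if common:
            overlap[(a.name, b.name)] = common
    return overlap


def weak_elements(system: System, floor: int = 3) -> List[str]:
    """Names of elements rated below `floor` on any dimension, or flagged in-process."""
    return [
        e.name
        for e in system.elements
        if e.in_process or min(e.quality, e.availability, e.ease_of_access) < floor
    ]


def rank_candidates(candidates: List[Candidate]) -> List[Candidate]:
    """Keep strategic candidates spanning two or more systems; best balance first."""
    viable = [c for c in candidates if c.strategic and len(c.systems) >= 2]
    # Assumption: "balance" is approximated as impact minus risk.
    return sorted(viable, key=lambda c: c.impact - c.risk, reverse=True)


# Hypothetical inventory, for illustration only.
finance = System("finance_dw", "CFO", {"erp", "gl"}, [
    DataElement("invoice_amount", 5, 4, 4),
    DataElement("open_order_total", 3, 4, 2, in_process=True),
])
sales = System("sales_dm", "VP Sales", {"erp", "crm"}, [
    DataElement("customer_id", 4, 5, 5),
])

candidates = [
    Candidate("customer_profitability", ("finance_dw", "sales_dm"), 5, 2, True),
    Candidate("regional_dashboard", ("sales_dm",), 3, 1, True),
    Candidate("full_ledger_merge", ("finance_dw", "sales_dm"), 4, 5, True),
]

print(shared_sources([finance, sales]))
print(weak_elements(finance))
print([c.name for c in rank_candidates(candidates)])
```

In this sketch the single-system candidate is dropped because it integrates nothing across systems, and the remaining candidates are ordered so that the high-impact, low-risk one comes first. A real assessment would also weigh the political sensitivity you tagged during documentation, which is left out here for brevity.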
This article appeared in the May - July/August 2000 issues of DM Review magazine (www.DMReview.com).
Note: Douglas Hackney's articles and white papers are available on the www.EGLtd.com website.
About the Author
Douglas Hackney is president of Enterprise Group Ltd., a consulting company specializing in business intelligence (BI). EGL helps organizations understand, plan, design, implement and sustain BI systems to better manage and use business information. Mr. Hackney has over 20 years of experience in business management and in designing and implementing business intelligence solutions for Global 2000 organizations. He offers clients practical knowledge of the challenges and critical success factors involved in building and managing BI systems across a variety of industries and business applications. His approach is distinguished by his ability to discover and understand the business needs of the organization, then answer those needs with information technology solutions.
Mr. Hackney is a frequent and highly rated speaker at international industry conferences, including participation at DCI's Data Warehousing / Business Intelligence Conferences, DCI Customer Relationship Management (CRM) Conferences and The Data Warehouse Institute. He also speaks at private and public events, industry user conferences and at industry, educational and product seminars and conferences around the globe. Mr. Hackney is a founding board member of the International Data Warehouse Association, and often serves as a judge for industry awards, such as the Excellence in Business Information Award, the RealWare Data Warehousing Award and the Industry Solution for Data Warehousing Award.
Mr. Hackney is the author of Understanding and Implementing Successful Data Marts. He is also a contributing editor and monthly columnist for DM Review, and he contributes to and is often quoted in other industry publications such as eWeek, Computerworld, Enterprise Systems Journal and Forbes.