The Challenges of Going Global
By Sharon Allen
The Situation:
Many of us in Information Technology receive data requests from our business units to answer questions about revenue, cost, and events from a corporate level. Business analysts from multiple departments and regions want a wider view of the corporation. We puzzle over how to process those requests in a quick, comprehensive, and satisfactory manner.
For years now, we have set up ad hoc reporting systems to satisfy requests and put out fires with little thought to the future. Now the business analysts are clamoring for links between those little ‘emergency fix’ systems. They want comparisons and overviews. They want to group answers together and bundle in other valuable information. Through the best of intentions, we find ourselves backed into a reporting corner. We need a community-reporting environment with a global perspective. This is a simple conclusion, but by no means a simple task.
Imagine multinational teams of restoration specialists, librarians, and archeologists faced with a mountain of potsherds. Each team is eager to help the researchers and enthusiasm abounds. But no one knows where to start. It seems elementary; just match the broken pieces back together and organize them so that researching is easy.
Our business analysts think that IT knows the correct system to pull their requested data from. They assume that we understand the dependency rules, consistency issues, and scope of data they might really need (as opposed to what they ask for) when we fulfill the request. But what IT really knows is that the regional systems (even if they started as the same software used the same way) have been updated, customized, retrofitted, and forced to solve local problems. IT knows that even when the same problem challenged more than one region, software and data solutions were decentralized and creatively evolved by different teams over time.
Actual data in those regional systems also developed in a decentralized manner, allowing the same customer record, place record, or product record to be input and maintained across the locales in slightly different forms. All those valuable pieces of our history are organized in slightly different ways, supporting slightly different business rules in different environments, languages, and time zones.
Data is fractured, fragmented, and redundant from an enterprise point of view. Yet the answers to the questions about our universe are buried in that mass of bits and bytes. The mountain of replicated, slightly worn data pieces awaits the experts from a multitude of disciplines to sift, sort, and rebuild everything back to museum-quality perfection.
We are faced with the understanding that the data is as much a shared natural resource to our company as the ocean is a shared natural resource to the world. Our data Global Village is being rocked with the awareness, for perhaps the first time, that local data decisions have had, and will continue to have, far-reaching impacts. Therefore, the only path to success is acknowledging a shared problem and agreeing on a shared solution.
Which Methodology
We can’t buy a plan from the corner market or make assumptions about project norms in this multi-cultural, multi-language, multi-organization, multi-time zone, multi-system effort. We must go back to the beginning and ask questions like:
- How are we going to organize this effort? It might seem silly to ask, but there are parts of the team all over the world with different experiences and expectations about everything from project management to data warehousing methods. Then, just to add to the challenge, everyone expresses those experiences and expectations in a different language.
- What communication links do we need to put in place: for the people? for the data? for the research materials? for the tools?
- How do we set up: debates? discussions? summit meetings?
- How are we going to deal with the 24/7 time problems of: developing globally? access to each other? access to data? access to tool support? global software licensing? Help desk post-production?
- How is the team going to deal with language challenges: in the data? with each other? in the documentation? in the training sessions?
- How do we capture and manage: issues? resolutions? options?
- How do we establish quality checks: on our strategies? on the local data? on the combined data?
- How do we develop a global schedule? manage to it? adjust it?
- How do we configure the whole thing without massive bottlenecks: in network performance? in release procedures? in backups? in security measures?
- And then there are the questions about style and general understanding of the Data Warehousing project itself. What are our ground rules about: the basic design approach? tools? infrastructure? deployment structure? documentation? front-end look and feel? security?
Target a Business Need:
We can’t buy a scope either. Yet one of the major success factors in any effort is a well-defined objective, one that is meaningful yet manageable and adds business value to the company.
Business analysts have been satisfying business needs since the company started. They constantly acquire, reformat, calculate, and organize data to create information-rich reports. They do it creatively, using whatever sources they can lay their hands on. A good business analyst is a great data detective.
All of those efforts identify a business need. If we got even one copy of each of those products, we would be buried under a landslide of very clearly defined needs. Identifying what is needed is not nearly as difficult as choosing which need to pursue first.
Plus, we need to deal with the important issue of dependency. To satisfy a well-defined business need, we have to make very sure the answer is not dependent on being able to calculate and combine the answers to a dozen other questions. For example, we can’t pick "Profitability" without having the complete picture of cost and revenue, or "Utilization" without being able to answer questions about use and capacity.
We need to know:
- What are our potential business need targets?
- What are the dependencies?
- What are the risk factors?
- How do we prioritize and weight everything?
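The dependency question above can be made concrete with a small sketch. It assumes each candidate business need is listed alongside the other needs it relies on; the map below reuses the article's "Profitability" and "Utilization" examples, while the function names and the ranking rule (fewest prerequisites first) are illustrative assumptions, not a prescribed method.

```python
# Hypothetical dependency map: each business need -> the needs it relies on.
# "Profitability" and "Utilization" come from the text; the structure is assumed.
DEPENDS_ON = {
    "Profitability": {"Cost", "Revenue"},
    "Utilization": {"Use", "Capacity"},
    "Cost": set(),
    "Revenue": set(),
    "Use": set(),
    "Capacity": set(),
}


def all_prerequisites(need, depends_on):
    """Return every need that must be answerable before `need`,
    whether it is a direct or an indirect dependency."""
    seen = set()
    stack = list(depends_on.get(need, ()))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(depends_on.get(current, ()))
    return seen


def rank_by_dependency_count(depends_on):
    """Order needs so that those with the fewest prerequisites come first.
    Ties are broken alphabetically to keep the result stable."""
    return sorted(
        depends_on,
        key=lambda need: (len(all_prerequisites(need, depends_on)), need),
    )


ranking = rank_by_dependency_count(DEPENDS_ON)
```

A ranking like this only addresses dependencies. Risk factors and business value would still need their own weights, which the team has to agree on.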
Plan for Conforming Dimensions:
One of the most valuable decisions we can make is to build conformed dimensions for the global data warehouse or operational stores.
A conformed dimension is a dimension that means the same thing with every possible fact table to which it can be joined.
Without a strict adherence to conformed dimensions, the data warehouse cannot function as an integrated whole. (The Data Warehouse Lifecycle Toolkit, Kimball)
Conforming the dimensions ensures data consistency. Time parameters would mean the same to all data records. Geography would have the same values. All customers would be found in a master set of customers and neither duplicated nor misidentified. Master sets of data would very likely be populated from a variety of enterprise systems, or even outside sources.
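As a minimal sketch of what a master set of customers might involve, the example below folds regional customer records into one master list and keeps a crosswalk from each regional identifier to its master key. The region codes, identifiers, company names, and the name-matching rule are all invented for illustration. Real matching would likely need more evidence than a name (addresses, tax identifiers) and language-aware handling, since this simple rule discards non-ASCII characters.

```python
import re


def normalize_name(name):
    """Collapse case, punctuation, and spacing so that near-identical
    spellings of a name compare equal. Deliberately simplistic."""
    cleaned = re.sub(r"[^a-z0-9 ]", " ", name.lower())
    return " ".join(cleaned.split())


def build_master(regional_records):
    """Assign one master key per distinct normalized customer name.

    regional_records: iterable of (region, local_id, customer_name).
    Returns (master, crosswalk):
      master    maps master_key -> normalized customer name
      crosswalk maps (region, local_id) -> master_key
    """
    key_by_name = {}
    master = {}
    crosswalk = {}
    for region, local_id, name in regional_records:
        norm = normalize_name(name)
        if norm not in key_by_name:
            key = len(key_by_name) + 1  # simple surrogate key
            key_by_name[norm] = key
            master[key] = norm
        crosswalk[(region, local_id)] = key_by_name[norm]
    return master, crosswalk


# Hypothetical records: the same customer entered two ways in two regions.
records = [
    ("EMEA", "C-17", "Acme Freight, Ltd."),
    ("APAC", "0042", "ACME FREIGHT LTD"),
    ("AMER", "9001", "Globex Shipping"),
]
master, crosswalk = build_master(records)
```

Here the two "Acme Freight" spellings resolve to the same master key, so facts loaded from either region would join to a single customer row.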
Conforming doesn’t just involve structure. It is not just the definitions of the columns, their sizes, and table names that are important; the data values themselves must agree as well. Conformed dimensions would be a significant step in sharing the data globally.
But how do we manage that goal? For a dimension to be truly conformed, there must be a central publishing agency for that specific dimension. The publishing agency must take responsibility for the quality of the master set by updating, inserting, and inspecting the data in that dimension for use by our Global Data Village.
In order to leave the local view behind and really plan for global we need to decide:
- How are we going to set up the task of conforming data?
- Who takes on the ‘Central Design Team’ task?
- How do we do global data validation prior to publishing?
- How do we resolve the errors?
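One possible shape for validation prior to publishing is a check that runs over a candidate dimension and reports every problem it finds, so the publishing agency can route errors back to the owning region. The sketch below is an assumption about what such a check might cover (duplicate keys, missing required fields, values outside an agreed master list). The field names and sample rows are hypothetical.

```python
def validate_dimension(rows, key_field, required_fields, allowed_values=None):
    """Return a list of (row_index, problem) tuples.
    An empty list means no problems were found by these checks."""
    allowed_values = allowed_values or {}
    problems = []
    first_seen = {}  # key value -> index of the row where it first appeared
    for index, row in enumerate(rows):
        key = row.get(key_field)
        if key is not None:
            if key in first_seen:
                problems.append(
                    (index, f"duplicate key {key!r} (first seen at row {first_seen[key]})")
                )
            else:
                first_seen[key] = index
        # Required fields must be present and non-empty.
        for field in required_fields:
            if row.get(field) in (None, ""):
                problems.append((index, f"missing {field}"))
        # Coded fields must use a value from the agreed master list.
        for field, allowed in allowed_values.items():
            value = row.get(field)
            if value not in (None, "") and value not in allowed:
                problems.append((index, f"unknown {field} {value!r}"))
    return problems


# Hypothetical candidate rows for a customer dimension.
candidate_rows = [
    {"customer_key": 1, "name": "Acme Freight", "country": "US"},
    {"customer_key": 1, "name": "Acme Freight Ltd", "country": "US"},
    {"customer_key": 2, "name": "", "country": "ZZ"},
]
problems = validate_dimension(
    candidate_rows,
    key_field="customer_key",
    required_fields=["customer_key", "name", "country"],
    allowed_values={"country": {"US", "GB", "JP"}},
)
```

Collecting all problems, rather than stopping at the first, gives each region a complete list to work through before the next publishing cycle.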
Agreeing on Vocabulary and Definitions:
Finally, we must find a way to agree on words and meanings. We toss around terminology in our discovery sessions as though we understand everything that is being said. It is important to stop and verify what we think we know and then document it for everybody.
We need to agree on:
- What is meant by: gross, net, completed, customer, and the myriad other concepts in our work?
- How do we document and share the definition of the tables, columns, data, formulas, business rules/constraints, aggregates, and facts/measures?
- What kind of forum can we set up to encourage questions?
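To illustrate how disagreements over vocabulary might be surfaced, the sketch below collects definitions submitted by different teams and reports any term that has more than one distinct definition. The terms "net" and "customer" come from the list above. The source names and definition texts are invented placeholders, not real business definitions.

```python
from collections import defaultdict


def find_conflicts(definitions):
    """Group submitted definitions by term and return only the terms
    that have more than one distinct definition.

    definitions: iterable of (term, source, definition_text).
    Returns {term: {definition_text: [sources, ...]}}.
    """
    by_term = defaultdict(lambda: defaultdict(list))
    for term, source, text in definitions:
        normalized_term = term.strip().lower()
        normalized_text = " ".join(text.split()).lower()
        by_term[normalized_term][normalized_text].append(source)
    return {
        term: dict(variants)
        for term, variants in by_term.items()
        if len(variants) > 1
    }


# Hypothetical submissions from two regional teams.
submitted = [
    ("Net", "Region A finance", "gross minus discounts"),
    ("net", "Region B finance", "gross minus discounts and taxes"),
    ("Customer", "Region A sales", "party that is billed"),
    ("customer", "Region B sales", "Party that is billed"),
]
conflicts = find_conflicts(submitted)
```

In this sample only "net" is reported, because the two "customer" definitions differ only in capitalization. The terms flagged this way are natural agenda items for the forum where questions get raised.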
Conclusions
What do we have to do to be successful? To quote a great film: "Choose, but choose wisely." Recognizing what must be done can create a strong framework for the success of a global project. So we have to be focused and deliberate in our approach; we have to develop a unified corporate data vision; and we have to invent ways to make this work.
About the Author
Sharon Allen is the data warehouse architect for a global transportation company. She has eleven years of data modeling experience in a variety of industries including aerospace, entertainment, manufacturing, and transportation. Her commitment to problem solving and compassion for the quirks of the real world lend themselves well to the challenges of trying to organize data that is basically "A sock between two puppies". She can be reached at (949) 720-2634 or via email at SLAllen@irvineco.com