Data Warehouse Architectures for Multinational Organizations: Part 3

Look to organizational culture and strategy before technology when deciding on a data warehouse model

Big Data Industry Architect, IBM

This three-part series for the Big Data Architects column was written by Tommy Eunice, software architect on the IBM Big Data Industry team.

The first two installments of this series introduced the complexities and challenges of setting up data warehouses for multinational organizations and described the decentralized, distributed, and centralized data warehouse models. This concluding installment focuses on the value of these models and the benefits they can provide multinational organizations. When choosing a model, these organizations in particular should weigh the organization’s culture and strategy more heavily than the technology.

Comparing and contrasting data warehouse models

The decentralized data warehouse model is the de facto standard because many organizations grow through acquisitions and leave management of the analytics platforms to the existing regional operations staff. As long as there is no intention to manage by using detail data across regional boundaries, this hands-off approach can be successful. Central management can use spreadsheets to look at overall costs, profitability, and revenue for each region and leave it to the regions to meet the targets.

There is often resistance to efforts compelling a successful regional team to be bound by centralized designs and policies. Regional staffs believe they understand their markets and systems better than a central organization and can manage successfully if left alone. While that may be true in many cases, central management is often in a strategic position to observe when one region is more efficient or more profitable than another. It is also often well positioned to question the veracity of claims made by the regions. Headquarters can ask the regions to improve performance or risk the extra burden of being managed centrally and more closely by headquarters’ staff. As long as the regions can meet the targets set by management, there may not be a need for consolidated, centralized reporting.

The distributed data warehouse model allows aggregated data to be analyzed directly by headquarters. This aggregation may be a daily data upload to headquarters after regional nightly processing is completed. Or the analysis can be accomplished with federated query technology. Headquarters sets standards for data consolidation to be implemented by the regions. For example, it creates a set of accounts and subaccounts to collect the financial data from each region and sets criteria for each account to help ensure consistency.
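As a rough illustration of the account-based consolidation described above, the following Python sketch rolls regional ledger rows up to a headquarters chart of accounts. The account codes, region names, and the consolidate function are all hypothetical and chosen only for the example. Amounts are assumed to be whole numbers in a single reporting currency.

```python
from collections import defaultdict

# Hypothetical headquarters chart of accounts: account code -> description.
HQ_ACCOUNTS = {
    "4000": "Revenue",
    "5000": "Cost of goods sold",
    "6000": "Operating expenses",
}

# Hypothetical per-region mappings from local account codes to HQ accounts.
REGION_MAPPINGS = {
    "EMEA": {"R-100": "4000", "C-200": "5000", "X-300": "6000"},
    "APAC": {"SALES": "4000", "COGS": "5000", "OPEX": "6000"},
}


def consolidate(region, rows):
    """Roll regional (local_account, amount) rows up to HQ accounts.

    Returns a (totals, rejected) pair. Rows whose local account has no
    mapping to a known HQ account are rejected rather than guessed at,
    so that data quality gaps surface instead of being hidden.
    """
    mapping = REGION_MAPPINGS[region]
    totals = defaultdict(int)
    rejected = []
    for local_account, amount in rows:
        hq_account = mapping.get(local_account)
        if hq_account not in HQ_ACCOUNTS:
            rejected.append((local_account, amount))
            continue
        totals[hq_account] += amount
    return dict(totals), rejected
```

Returning the rejected rows alongside the totals is one possible design choice. It gives headquarters and the region a concrete list to discuss when a local account does not meet the agreed criteria.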

The distributed model can be implemented with federation technology such as IBM® InfoSphere® Federation Server or IBM InfoSphere Data Explorer software. Using this technology lets the enterprise data warehouse (EDW) technology remain in the regions, and only when required—at query time—will the data be pulled across to headquarters. This approach helps greatly reduce the cost of a solution while providing the advantage of getting the relevant information to headquarters quickly.
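The query-time pattern can be sketched in a few lines of Python. This is a toy illustration of the idea only, not the API of InfoSphere Federation Server or any other product: in-memory SQLite databases stand in for the regional warehouses, and the sales table, its columns, and the function names are assumptions made for the example.

```python
import sqlite3


def make_region_db(rows):
    """Create an in-memory stand-in for one regional warehouse."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (account TEXT, amount INTEGER)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    conn.commit()
    return conn


def federated_totals(regions):
    """Run the same aggregate in each region at query time and merge results.

    Only the aggregated rows travel to headquarters; the detail rows
    stay in the regional databases.
    """
    combined = {}
    for name, conn in regions.items():
        cursor = conn.execute(
            "SELECT account, SUM(amount) FROM sales GROUP BY account"
        )
        for account, total in cursor:
            combined.setdefault(account, {})[name] = total
    return combined


# Hypothetical regional data, keyed by region name.
regions = {
    "EMEA": make_region_db([("4000", 1000), ("4000", 500), ("5000", 400)]),
    "APAC": make_region_db([("4000", 700), ("5000", 300)]),
}
result = federated_totals(regions)
```

Because each region computes its own aggregate, the amount of data crossing to headquarters depends on the number of accounts rather than the number of detail rows.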

The distributed data warehouse model carries the risk that regional data warehouse environments will diverge over time and the cross-market view will be lost. However, this risk can be mitigated through strong governance procedures. While some rigid standards should be in place to help ensure the distributed data warehouse is adequately maintained, allowing regional decision making helps regions feel as though their system provides specific local value. This autonomy helps ease the challenge of conforming to headquarters-based standards and accommodates regional variations.
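One way a governance procedure might detect divergence is to compare each regional schema against a headquarters standard on a regular schedule. The sketch below assumes schemas are available as simple column-to-type dictionaries. The standard schema, the column names, and the schema_drift function are hypothetical.

```python
# Hypothetical headquarters standard: column name -> expected type.
STANDARD_SCHEMA = {"account": "TEXT", "amount": "INTEGER", "posted_on": "DATE"}


def schema_drift(regional_schema, standard=STANDARD_SCHEMA):
    """Compare one region's schema against the headquarters standard.

    Missing or mismatched columns break the cross-market view. Extra
    columns are reported separately as local extensions, because the
    region may keep them for its own needs.
    """
    missing = sorted(set(standard) - set(regional_schema))
    mismatched = sorted(
        column
        for column in standard
        if column in regional_schema and regional_schema[column] != standard[column]
    )
    local_extensions = sorted(set(regional_schema) - set(standard))
    return {
        "missing": missing,
        "mismatched": mismatched,
        "local_extensions": local_extensions,
    }
```

Separating local extensions from real violations reflects the balance described above: the standard columns are rigid, while regions remain free to add what they need.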

The centralized data warehouse model, which consolidates all data in one place, can be well suited for organizations that have strong central staffing functions. This type of business typically has a culture that favors a single governance and policy-making process across all lines of business. From experience, the most effective centralized analytical systems are paired with centralized operational and front-office systems. If the operational systems are aligned and managed centrally, then the data from those systems is already well organized for analytics and requires much less governance for accurate consolidation than data from varied and disparate local operational systems. This pre-consolidation of local data helps reduce the overall cost of building the centralized analytical systems. Standardized operational systems also can reduce the risk that data governance creeps away from central policy over time and degrades the value of the consolidated system.

Determining an advantageous architecture alternative

Discussions with regional offices on how to consolidate data can be valuable. Not only will each region see how the other regions work, but the process should reveal data quality practices—or a lack thereof. Once rules for combining the data are hammered out and policies and processes are in place to maintain those rules, the organization can begin managing the regions with enhanced consistency.

Starting the process can be challenging, and entrenched local policies and staff may resist attempts to understand their sometimes arcane data governance rules. Fortunately, software tools such as IBM InfoSphere Business Information Exchange can capture the rules and implement them in technology to help ensure the rules are consistent, transparent, accessible, understandable, and adhered to after initial discovery and implementation.

No single architecture is perfect for every organization. But having a plan for organizing analytics across all regions can at least address the challenges inherent in analysis across regions and point to a possible solution that is appropriate for the organization. Weigh each of the options against the technical capabilities of the business before opting for an architecture. The most challenging aspects are often getting regions to conform to governance policies and to accept mandates from headquarters. Be prepared with tangible benefits to help convince management to decide on the data warehouse model that is well suited to the needs of the organization.

Please share thoughts or questions in the comments.