The Intersection of Data Virtualization and Enterprise Data Warehouses
Eight Best Practices - Part 1
By: Robert Eve
Nov. 26, 2009 08:00 PM
Large enterprises and government agencies are drowning in data. IT teams deploy a myriad of data warehouse-centric solutions - BI, predictive analytics, data and content mining, portals and dashboards - to harness and deliver data for intelligent decision-making.
Yet, large enterprises are also expected to act like start-ups: nimble, agile and flexible to adapt to ever-changing market conditions. Impossible? By examining the best practices of their successful peers and adapting these to their own enterprises, data teams and enterprise architects can contribute to their corporate initiatives.
The intersection of data virtualization and enterprise data warehouses represents corporate best practices for combining the rich data assets available in the enterprise data warehouse with the myriad sources of data now available outside it. In Part One of this two-part series, I examine four ways to use data virtualization to improve data warehouse effectiveness. By extending the warehouse schema to include additional data, data virtualization delivers greater business value from existing warehouse and non-warehouse data assets. Part Two will target data warehouse efficiency, showing four best practices where data virtualization, used alongside data warehouses, saves time and money.
1. Data Warehouse Extension
Data virtualization federates data warehouse contents with additional information sources. The resulting complementary views add current data to historical warehouse data, detailed data to summarized warehouse data, and external data to internal warehouse data.
In Figure 1, data virtualization middleware hosts new views that integrate additional RDBMS and web service data sources as well as extend existing sources such as packaged applications.
To analyze the effectiveness of its pharmaceutical sales and marketing programs, a life sciences leader uses data virtualization to federate externally sourced competitor sales data from an industry data services provider with internal prescription sales data from its sales data warehouse. This total market view provides the sales team with the intelligence required for more effective sales and marketing programs that increase revenues.
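The total market view described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the life sciences company's implementation: two SQLite databases stand in for the internal sales warehouse and the external provider's feed, and the table names (rx_sales, competitor_sales), the view name (total_market) and all sample rows are invented.

```python
import sqlite3


def build_total_market_view():
    """Return a connection exposing a federated 'total_market' view.

    Assumption: SQLite stands in for both the internal sales warehouse
    ('main') and the external provider's data ('ext'). All names and
    rows are hypothetical.
    """
    conn = sqlite3.connect(":memory:", isolation_level=None)
    # Attach a second, independent database to play the external source.
    conn.execute("ATTACH DATABASE ':memory:' AS ext")

    conn.execute(
        "CREATE TABLE main.rx_sales (product TEXT, region TEXT, units INTEGER)"
    )
    conn.executemany(
        "INSERT INTO main.rx_sales VALUES (?, ?, ?)",
        [("OurDrug", "East", 120), ("OurDrug", "West", 80)],
    )
    conn.execute(
        "CREATE TABLE ext.competitor_sales (product TEXT, region TEXT, units INTEGER)"
    )
    conn.executemany(
        "INSERT INTO ext.competitor_sales VALUES (?, ?, ?)",
        [("RivalDrug", "East", 200), ("RivalDrug", "West", 100)],
    )

    # The view stores no data; it is resolved against both sources at query
    # time. It is TEMP because SQLite only lets temporary views span
    # attached databases.
    conn.execute(
        """
        CREATE TEMP VIEW total_market AS
        SELECT 'internal' AS source, product, region, units FROM main.rx_sales
        UNION ALL
        SELECT 'external' AS source, product, region, units FROM ext.competitor_sales
        """
    )
    return conn


def market_share_by_region(conn):
    """Compute the internal share of total units per region from the view."""
    rows = conn.execute(
        """
        SELECT region,
               SUM(CASE WHEN source = 'internal' THEN units ELSE 0 END) AS ours,
               SUM(units) AS total
        FROM total_market
        GROUP BY region
        ORDER BY region
        """
    ).fetchall()
    return {region: ours / total for region, ours, total in rows}
```

Neither source is copied into the other. Consumers query total_market as if it were one table, which is the essence of extending the warehouse schema virtually.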
2. MDM Hub Extension
In the integration pattern shown in Figure 2, the data virtualization middleware hosts new complementary views that integrate additional RDBMS and Web service data sources as well as extend existing sources, such as packaged applications.
A global investment bank federates its HR master data with myriad internal benefits and compensation systems as well as external payroll services to provide its employees with a 360° view of their total compensation through a self-service employee benefits portal. Securely exposing this information improves retention and lessens HR staff workload.
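As a rough sketch of this hub-extension idea, the master record can supply the key and identity, while each additional system is consulted only when a view is requested. Everything here is hypothetical: the function name, the shape of the master data and the source callables are assumptions made for illustration, not the bank's design.

```python
from typing import Callable, Dict

# Hypothetical source signature: given an employee ID, return named amounts.
Source = Callable[[str], Dict[str, float]]


def total_compensation_view(
    employee_id: str,
    master: Dict[str, Dict[str, str]],
    sources: Dict[str, Source],
) -> dict:
    """Assemble a combined view keyed on the master record.

    The master hub supplies identity; each extra source is called at
    request time, so no compensation data is copied into the hub.
    """
    if employee_id not in master:
        raise KeyError(f"unknown employee: {employee_id}")

    components = {name: fetch(employee_id) for name, fetch in sources.items()}
    total = sum(
        amount for figures in components.values() for amount in figures.values()
    )
    return {
        "employee_id": employee_id,
        "name": master[employee_id]["name"],
        "components": components,
        "total": total,
    }
```

A portal could call it like this, with lambdas standing in for the real benefits and payroll systems:

```python
master = {"E1": {"name": "A. Example"}}
sources = {
    "benefits": lambda emp_id: {"health": 5000.0},
    "payroll": lambda emp_id: {"salary": 90000.0, "bonus": 10000.0},
}
view = total_compensation_view("E1", master, sources)
```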
3. Data Warehouse Federation
Most large enterprises run several data warehouses and data marts, and optimizing business performance requires data from across all of them. Physically combining multiple marts and warehouses into a complete enterprise-wide data warehouse is often too costly and time-consuming.
Data virtualization federates multiple physical warehouses. Two examples include combining data from sales and financial data warehouses, or combining two sales data warehouses after a corporate merger or acquisition. This approach creates an integrated view by using abstraction to rationalize the different schema designs, and thereby achieves a logical consolidation of multiple warehouses.
In the federation pattern shown in Figure 3, the data virtualization middleware hosts federated warehouse views that logically integrate both data warehouses.
One of the world's leading pharmaceutical companies uses data virtualization to enable research scientists to access and analyze data from research, clinical trial, FDA submission and other data warehouses. Scientists use this data to accelerate time-to-market for new compounds and drugs, thereby increasing revenues in an otherwise lengthy and costly development process.
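The schema rationalization step in the merger case can be illustrated with a small sketch. It assumes two sales warehouses whose column names and currency units differ; SQLite stands in for both, and every table name, column name and row is invented for the example.

```python
import sqlite3


def build_federated_sales():
    """Return a connection with an 'all_sales' view over two warehouses.

    Assumption: 'main' is the acquirer's warehouse (amounts in dollars) and
    'acquired' is the merged company's warehouse (amounts in cents, with
    different column names). All names and rows are hypothetical.
    """
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("ATTACH DATABASE ':memory:' AS acquired")

    conn.execute(
        "CREATE TABLE main.sales_fact (cust_id INTEGER, sale_date TEXT, amount_usd REAL)"
    )
    conn.executemany(
        "INSERT INTO main.sales_fact VALUES (?, ?, ?)",
        [(1, "2009-10-01", 100.0), (2, "2009-11-02", 50.5)],
    )
    conn.execute(
        "CREATE TABLE acquired.orders (customer_no INTEGER, order_dt TEXT, total_cents INTEGER)"
    )
    conn.executemany(
        "INSERT INTO acquired.orders VALUES (?, ?, ?)",
        [(7, "2009-10-01", 2500), (8, "2009-11-03", 1050)],
    )

    # The view is the abstraction layer: it renames columns and converts
    # units so both warehouses present one logical schema.
    conn.execute(
        """
        CREATE TEMP VIEW all_sales AS
        SELECT 'A' AS warehouse, cust_id AS customer_id, sale_date, amount_usd
        FROM main.sales_fact
        UNION ALL
        SELECT 'B', customer_no, order_dt, total_cents / 100.0
        FROM acquired.orders
        """
    )
    return conn


def monthly_revenue(conn):
    """Total revenue per month across both warehouses, in dollars."""
    rows = conn.execute(
        """
        SELECT substr(sale_date, 1, 7) AS month, ROUND(SUM(amount_usd), 2)
        FROM all_sales
        GROUP BY month
        ORDER BY month
        """
    ).fetchall()
    return dict(rows)
```

The two physical schemas remain untouched. Only the view knows that one warehouse stores cents and the other dollars, so the consolidation is logical rather than physical.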
4. Data Warehouse in Enterprise Architectures
Data virtualization, which is included in IaaS, integrates data warehouses into a unified enterprise information architecture, as shown in Figure 4. IT teams use data virtualization middleware to form an enterprise data virtualization layer that is home to a consistent and complete logical schema covering multiple consolidated and virtual sources. When designing the enterprise information architecture, developers use data virtualization design tools to develop semantic abstractions in the form of web services or relational views. At runtime, end-user applications, reports or mash-ups call these web data services on demand, and the services query, federate, abstract and deliver the requested data to information consumers.
Several government agencies are using data virtualization to create a common information layer that spans agency information databases and enables intelligence analysts to better control threats. Agencies involved include the Drug Enforcement Administration (DEA) and the Immigration and Naturalization Service (INS). The common information layer delivers access to passenger, crew and manifest data from a U.S. Coast Guard port arrivals data warehouse, for example.
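The split between design time and runtime in such a layer can be sketched as a small registry of named data services. This is a hypothetical illustration, not the API of any data virtualization product: the class DataServiceLayer, its publish and call methods, and the sample arrivals table are all invented, and SQLite stands in for the underlying sources.

```python
import json
import sqlite3


class DataServiceLayer:
    """Hypothetical registry of named data services over a SQL source."""

    def __init__(self, conn):
        self._conn = conn
        self._services = {}

    def publish(self, name, sql):
        """Design time: register a semantic abstraction under a service name."""
        self._services[name] = sql

    def call(self, name, **params):
        """Runtime: run the service on demand and return its rows as JSON."""
        if name not in self._services:
            raise KeyError(f"no such data service: {name}")
        cursor = self._conn.execute(self._services[name], params)
        columns = [col[0] for col in cursor.description]
        return json.dumps([dict(zip(columns, row)) for row in cursor.fetchall()])


def build_sample_layer():
    """Create a layer over an invented port arrivals table."""
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("CREATE TABLE arrivals (vessel TEXT, port TEXT, crew_count INTEGER)")
    conn.executemany(
        "INSERT INTO arrivals VALUES (?, ?, ?)",
        [("Alpha", "Miami", 12), ("Bravo", "Seattle", 20), ("Charlie", "Miami", 8)],
    )
    layer = DataServiceLayer(conn)
    layer.publish(
        "arrivals_by_port",
        "SELECT vessel, crew_count FROM arrivals WHERE port = :port ORDER BY vessel",
    )
    return layer
```

A consuming report would then request data by service name only, for example `build_sample_layer().call("arrivals_by_port", port="Miami")`, without knowing which tables or sources sit behind it.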