Service-Oriented Architecture Using Data Services to Build Functional Services
Data management infrastructure cuts costs and accelerates development
By: Vivek Singhal
Apr. 26, 2006 03:30 PM
In almost every significant SOA deployment, a few services have advanced requirements that force them to manage the data they use intelligently. For example, a fault-tolerant service might be deployed on a cluster of machines, which means that the instances of the service must share data across several machines.
In a SOA, services act as the building blocks for implementing business processes. Each service, hereafter called a functional service, offers a set of operations. The implementations of these operations usually involve querying and updating data from one or more data sources.

Data services are the next layer in a SOA. Data services support functional services by acting as a high-level abstraction for data: rather than directly exposing functional services to the complexity of data replication, data transformation, and data federation, data services hide those details and present a simple view of enterprise data. Data services expose high-level data manipulation operations, whereas functional services expose business domain-oriented operations.

Numerous commercial data management infrastructure products are valuable for writing data services. These infrastructure products reduce the cost, improve the reliability, and accelerate the development of data services.

This article describes several categories of data management infrastructure available today that can be used to build data services. A case study from a major hotel chain illustrates how data services were used in a real-world SOA. We'll also discuss the next generation of data management infrastructure, which blends independent data management capabilities into a cohesive platform that delivers consistent, reliable, and pervasive access to data.
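The layering described above, with a functional service built on top of a data service, can be sketched in a few lines of Python. This is only an illustration: the names Customer, CustomerDataService, InMemoryCustomerDataService, and LoyaltyService are hypothetical and don't come from any product discussed in this article.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class Customer:
    customer_id: int
    name: str
    loyalty_points: int


class CustomerDataService(Protocol):
    """High-level data operations; hides where and how the data is stored."""

    def get_customer(self, customer_id: int) -> Customer: ...

    def save_customer(self, customer: Customer) -> None: ...


class InMemoryCustomerDataService:
    """Stand-in implementation; a real one might sit on replication or federation."""

    def __init__(self) -> None:
        self._rows = {}  # customer_id -> Customer

    def get_customer(self, customer_id: int) -> Customer:
        return self._rows[customer_id]

    def save_customer(self, customer: Customer) -> None:
        self._rows[customer.customer_id] = customer


class LoyaltyService:
    """Functional service: exposes a business operation, not a data operation."""

    def __init__(self, data: CustomerDataService) -> None:
        self._data = data

    def award_points(self, customer_id: int, points: int) -> int:
        customer = self._data.get_customer(customer_id)
        updated = Customer(
            customer.customer_id, customer.name, customer.loyalty_points + points
        )
        self._data.save_customer(updated)
        return updated.loyalty_points
```

Because LoyaltyService depends only on the data service's operations, the in-memory stand-in could be swapped for a different implementation without touching the functional service.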
Data Access Infrastructure

A data access infrastructure can simplify the use of heterogeneous data sources by providing a view of the data that's independent of the underlying data source type. For example, data might originate from relational databases or mainframe resources, but data services could present that data to functional services as Java objects. As data services use new types of data sources, there's little or no impact on the functional services. The functional service simply gets richer data from the data services.

Several vendors offer infrastructure for accessing data sources. For example, there are products that provide a JDBC interface to mainframe applications, so that data services can use a familiar interface to query or update the mainframe. Similarly, XQuery products make it easier to manipulate data stored in XML files. Data services can exploit these data access products to connect to a broad range of data source types.
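As a rough illustration of presenting relational data as objects, the sketch below uses Python's built-in sqlite3 module as a stand-in for a relational source. The rooms table, the Room class, and RoomDataService are assumptions made up for this example; the article's products expose Java objects and JDBC rather than this API.

```python
import sqlite3
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Room:
    room_id: int
    room_type: str
    nightly_rate: float


class RoomDataService:
    """Presents rows from a relational source as plain objects."""

    def __init__(self, connection: sqlite3.Connection) -> None:
        self._conn = connection

    def rooms_of_type(self, room_type: str) -> List[Room]:
        # Callers never see SQL, column names, or cursor handling.
        cursor = self._conn.execute(
            "SELECT id, type, rate FROM rooms WHERE type = ? ORDER BY id",
            (room_type,),
        )
        return [Room(row[0], row[1], row[2]) for row in cursor]


def demo_connection() -> sqlite3.Connection:
    """Builds a throwaway in-memory database for the example."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE rooms (id INTEGER PRIMARY KEY, type TEXT, rate REAL)")
    conn.executemany(
        "INSERT INTO rooms (id, type, rate) VALUES (?, ?, ?)",
        [(101, "king", 189.0), (102, "double", 149.0), (103, "king", 199.0)],
    )
    return conn
```

A functional service would call rooms_of_type and receive Room objects; if the rooms later came from a different kind of source, only the body of RoomDataService would need to change.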
Data Replication Infrastructure

Making a local copy of data is easy. Ensuring that the copy stays up-to-date is more challenging. A data replication infrastructure can automate the replication of data so that both the initialization and the subsequent updating of the local copy occur automatically.

A data replication infrastructure can offer a spectrum of "qualities of service" (QoS) that meets the varied requirements of data services. For example, a local copy might simply correspond to a snapshot of a data source that's periodically refreshed to reflect recent changes. Or the local copy might be continually updated via distributed transactions so that the copy is guaranteed to be identical to the original data source. Another QoS is whether the local copy is writeable or read-only. Yet another QoS is whether the local copy is recoverable after process failure (because the local copy is backed up to disk) or not (because the local copy is stored in volatile memory).

Not only does replication infrastructure unburden a data service from the drudgery of synchronizing data, it also provides the data service with a high-level abstraction for managing data. The data service merely declares its QoS requirements and the replication infrastructure hides the complexity of synchronization in accordance with those requirements. These powerful abstractions accelerate the development of data services and ensure that data services have reliable access to replicated data.

Data warehouse technology is commonly used for disk-based data replication. Traditional data warehouses use a batch-oriented approach to initialize and update the local copy. Real-time data warehouses support continuous incremental updates to the local copy, which means that the local copy is nearly synchronized with the original data source. Both traditional and real-time data warehouses produce a read-only local copy of data: updates to the copy are disallowed because they wouldn't be propagated back to the original data sources.

Distributed in-memory caching, which automatically synchronizes data across a group of high-speed caches, is another example of replication infrastructure. Each cache is typically deployed directly within an application process, which gives the application instant access to the cache's data. This infrastructure accelerates the performance of data services by provisioning data directly into the application address space, but with the limitation that the local copy isn't fault-tolerant.

Another data replication technology can provision data for disconnected mobile applications. This represents a powerful abstraction because a data service simply relies on the replication technology to manage the complexity of synchronizing data whenever a mobile computer is connected to the network. The data service is mostly unaware of whether the machine on which it's deployed is connected to the network, because the burden of data management shifts to the replication infrastructure.
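To make the idea of declaring QoS requirements concrete, here is a minimal sketch of a periodically refreshed, optionally read-only snapshot. ReplicationQoS and SnapshotReplica are hypothetical names invented for this illustration, not the interface of any replication product; real infrastructure would also handle durability, transactions, and write propagation, which this sketch leaves out.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class ReplicationQoS:
    refresh_interval_s: float  # how stale the snapshot may become
    writable: bool = False     # read-only copies reject local updates


class SnapshotReplica:
    """Local copy of a source, reloaded once the snapshot is too old."""

    def __init__(
        self,
        load_source: Callable[[], Dict[str, int]],
        qos: ReplicationQoS,
        clock: Callable[[], float],
    ) -> None:
        self._load_source = load_source
        self._qos = qos
        self._clock = clock
        self._copy = dict(load_source())  # initialization of the local copy
        self._loaded_at = clock()

    def get(self, key: str) -> int:
        if self._clock() - self._loaded_at >= self._qos.refresh_interval_s:
            self._copy = dict(self._load_source())  # subsequent refresh
            self._loaded_at = self._clock()
        return self._copy[key]

    def put(self, key: str, value: int) -> None:
        if not self._qos.writable:
            raise PermissionError("local copy is read-only under this QoS")
        # Assumption: propagating the write back to the source is out of
        # scope here, so a later refresh would overwrite this local value.
        self._copy[key] = value
```

The data service only constructs a ReplicationQoS and reads through get; when and how the copy is refreshed stays inside the replica. Passing the clock in as a function (for example time.monotonic) keeps the refresh logic easy to exercise without waiting.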
Data Integration Infrastructure

Consider a more complex example than presenting a single relational source as objects. Suppose a data service needs to aggregate data from multiple relational databases, and suppose that those databases use different schemas. In this example, not only must relational data be converted to an object representation, but the schema differences have to be reconciled too. Data originating from the different data sources has to be integrated into a common representation that's easy for the data service to use.

Infrastructure for data integration and data federation can address these challenges. For example, enterprise information integration products provide a data service with a tailored view of data, where the data originates from distributed data sources with different schemas and data types. This infrastructure is valuable for building data services because it abstracts the underlying data sources and presents a unified data representation that's independent of the format and location of data.
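The reconciliation step can be sketched as follows, again with sqlite3 standing in for two relational databases. Both schemas, the Guest class, and GuestDataService are invented for illustration; a commercial integration product would typically let such mappings be declared rather than hand-coded.

```python
import sqlite3
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Guest:
    """Common representation shared by both sources."""
    full_name: str
    email: str


def make_source_a() -> sqlite3.Connection:
    # Hypothetical schema A: the name is split across two columns.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE guests (first_name TEXT, last_name TEXT, email TEXT)")
    conn.execute("INSERT INTO guests VALUES ('Ada', 'Lovelace', 'ada@example.com')")
    return conn


def make_source_b() -> sqlite3.Connection:
    # Hypothetical schema B: one name column and a differently named email column.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customer (name TEXT, contact_email TEXT)")
    conn.execute("INSERT INTO customer VALUES ('Alan Turing', 'ALAN@EXAMPLE.COM')")
    return conn


class GuestDataService:
    """Aggregates both sources and reconciles their schema differences."""

    def __init__(self, source_a: sqlite3.Connection, source_b: sqlite3.Connection) -> None:
        self._a = source_a
        self._b = source_b

    def all_guests(self) -> List[Guest]:
        guests = [
            Guest(f"{first} {last}", email.lower())
            for first, last, email in self._a.execute(
                "SELECT first_name, last_name, email FROM guests"
            )
        ]
        guests += [
            Guest(name, email.lower())
            for name, email in self._b.execute(
                "SELECT name, contact_email FROM customer"
            )
        ]
        return sorted(guests, key=lambda g: g.full_name)
```

Callers of all_guests see one Guest type with normalized email addresses and have no way to tell which database a record came from.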
Case Study

To build the data services that supported the Availability and Permissibility functional services, the hotel chain needed data infrastructure that solved several problems. There was a mismatch between the relational representation used to store data and the object-oriented index data structures used for computation. Data had to be synchronized across a cluster of processes that implemented each functional service. And the functional services required instant access to data to deliver fast responses to requests.

To build its data services, the hotel chain selected Progress ObjectStore Enterprise from Progress Software. ObjectStore acts as a high-performance distributed durable cache for object-oriented data. It offers transparent storage of object-oriented data structures; it delivers data automatically to distributed in-memory caches; it guarantees strong transactional consistency semantics; and it provides instant access to data.

One key challenge overcome by the data services was to transform data from a representation optimized for storage into a different representation optimized for computation. The relational representation of data was normalized and semantically complete, but very inconvenient to manipulate. To answer a typical query about room availability, an expensive multi-table join operation was required. In response, the hotel chain developed an optimized index structure that contained the same semantic information as the relational database, but in a different format that could answer queries efficiently.

The initial construction of the index structure required a full scan of the relational database, a process that took several hours. Once the index structure was built, updates to the relational database required corresponding incremental updates to the index structure. It was mandatory to back up the index structure to disk so that a temporary failure of the data service wouldn't result in a lengthy outage while the index was reconstructed.
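The general pattern described here, a full scan to build a computation-friendly index followed by incremental updates, can be illustrated with a toy example. The article doesn't describe the hotel chain's actual schema or index, so everything below (the rooms and bookings tables, AvailabilityIndex, and its methods) is a hypothetical stand-in, and the disk backup that ObjectStore provided is not modeled.

```python
import sqlite3
from typing import Dict, Set


def make_database() -> sqlite3.Connection:
    """Hypothetical normalized schema: availability needs rooms plus bookings."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE rooms (room_id INTEGER PRIMARY KEY, room_type TEXT);
        CREATE TABLE bookings (room_id INTEGER, night TEXT);
        INSERT INTO rooms VALUES (101, 'king'), (102, 'king'), (201, 'double');
        INSERT INTO bookings VALUES (101, '2006-05-01');
        """
    )
    return conn


class AvailabilityIndex:
    """In-memory structure that answers availability lookups without a join."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self._rooms_by_type: Dict[str, Set[int]] = {}
        self._booked_by_night: Dict[str, Set[int]] = {}
        # Initial construction: one full scan of each table.
        for room_id, room_type in conn.execute("SELECT room_id, room_type FROM rooms"):
            self._rooms_by_type.setdefault(room_type, set()).add(room_id)
        for room_id, night in conn.execute("SELECT room_id, night FROM bookings"):
            self.record_booking(room_id, night)

    def record_booking(self, room_id: int, night: str) -> None:
        """Incremental update mirroring a new row in the bookings table."""
        self._booked_by_night.setdefault(night, set()).add(room_id)

    def available(self, room_type: str, night: str) -> Set[int]:
        rooms = self._rooms_by_type.get(room_type, set())
        return rooms - self._booked_by_night.get(night, set())
```

After construction, each new booking written to the database is mirrored with a single record_booking call instead of a rebuild, and available is answered from two dictionary lookups and a set difference.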
The use of a data management infrastructure enabled the hotel chain to build data services that supported the operational requirements of the Availability and Permissibility functional services. Although the hotel chain contemplated building its own data management infrastructure in-house, it soon determined that buying a commercial infrastructure product was faster, more cost-effective, and less risky.