The Problem with XML-Based Storage
By: Israel Hilerio
Feb. 22, 2002 12:00 AM
As Java and XML continue as the de facto standard for developing enterprise applications, issues arise in using these technologies. Two examples are the need to store XML data and the criteria for selecting the appropriate repository. Here's a real-world example of how applying XML to the wrong problem can lead to the wrong solution.

I inherited a system last year from a team that had to build it quickly and didn't have all the requirements outlined for them. They knew some of the basic structures and relationships between data elements, but they were convinced that the initial schema wasn't going to be the final one. Most development teams face this problem, but this team was also expected to release new functionality on a weekly basis.

The problem involved the development and maintenance of a B2B fulfillment application. The main components of the fulfillment system were customer catalogs and orders. The team chose XML as the mechanism for defining customer catalogs and expressing orders. Initially this worked out great for catalogs because they were able to extend and add relationships between existing and new elements on a near real-time basis. It also worked out great for order objects because they needed to share data between multiple systems. Integration with other systems was achieved through a combination of Java servlet channels and XML messages.

They ended up selecting a company with an innovative XML repository. The partnership blossomed until we encountered some operational and growth issues. Some dealt with the maturity of the product. Others concerned the performance of writes as opposed to small reads, as is the case with most nonrelational databases. Large reads became a problem because there was no caching or cursor mechanism built into the engine. HTTP, the mandated protocol for querying the database and managing data, isn't the most efficient protocol, as many of you know. While it makes sense from a Web-accessibility perspective, access to the database was always from servlets within the internal hosted network, never from the Web.

We also discovered a space reclamation problem. Documents marked for deletion were never actually removed. This meant that document replacement took twice as much space as document creation. To reclaim space we needed to run a defrag utility every few weeks that would search through the system and delete the marked documents. Furthermore, the instability of the database forced us to reindex it every time it was brought down for backups. An additional side effect was that, in order to reduce the number of persistence engines, some RDBMS behavior was replicated inside the XML repository. While some of these problems were unique to the vendor the team chose, others are relevant to most nonrelational XML repository solutions.

At the time, we didn't consider RDBMS tools; we believed our schema had to be adaptable and were fearful of the limitations associated with a rigid schema. Also, only recently have RDBMS products added credible XML support to their environments. At the request of our customer, and to alleviate some of our operational issues, we decided to look into porting the system to an RDBMS with XML support. Let me outline the lessons we learned:
In summary, don't assume. You know what they say!