XML Schemas And DTDs Working Together
By: Paul Grosso
May. 22, 2001 12:00 AM
This article compares and contrasts the broad functionality of XML Schemas (whose approval by the World Wide Web Consortium is imminent) with that of document type definitions, currently part of the XML 1.0 Recommendation.
Let's begin with some background to describe the need for schemas and DTDs. XML holds out the promise of sharing all kinds of information effortlessly, with automated and loss-free exchanges among disparate information systems. XML defines how to mark up content to give it both structure and additional meta-information. This increases the worth of the data many times by allowing a wide variety of applications to make greater use of the content. However, as with any structured information exchange, all parties must adhere to a mutual agreement on the syntax and semantics of the data or chaos will ensue. The XML 1.0 specification fulfills the need for such an agreement by providing for a DTD, which describes the tags and hierarchy that the XML data may include.
For example, consider an engine manufacturer that provides service information to its OEM customers. Because XML provides a media-neutral and processable data format, the company can deliver its service information in XML so its OEM customers can automatically incorporate this data into their own service manuals. This allows each OEM the freedom to apply its own formatting and behaviors so the service information automatically appears as if the OEM had produced it. To achieve this kind of exchange, the engine manufacturer and the OEM must agree on which tags will be used, and how. That's where a DTD or schema comes in.
The XML 1.0 specification provides the mechanism of a document type declaration, which defines a set of tags and attributes and describes how they can be put together to form a valid document. Most of these declarations are usually put into a file separate from the document itself; this external subset of the document type declaration is usually called the DTD, a term this article uses informally in place of "document type declaration." (The declaration can also have an internal subset, which is contained at the top of the document itself.)
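As a minimal sketch of the two parts of a document type declaration, the hypothetical service bulletin below names an external subset and adds one declaration in an internal subset. The file name `service.dtd` and every element and attribute name are assumptions made for illustration; the sketch also assumes that `service.dtd` declares the `bulletin`, `title`, and `para` elements.

```xml
<?xml version="1.0"?>
<!-- The SYSTEM identifier points to the external subset (hypothetical file). -->
<!DOCTYPE bulletin SYSTEM "service.dtd" [
  <!-- Internal subset: a declaration that travels with this one document.
       It adds an optional attribute to an element declared externally. -->
  <!ATTLIST bulletin reviewer CDATA #IMPLIED>
]>
<bulletin reviewer="J. Doe">
  <title>Oil filter replacement</title>
  <para>Replace the filter at every scheduled service.</para>
</bulletin>
```

Declarations in the internal subset are read before those in the external subset, which is why an internal subset is often used for small, document-specific additions.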
To be merely well formed, an XML document doesn't need to be associated with a DTD. However, for most if not all applications to be able to process an XML document as intended, that document must conform to some set of expectations about what tags can appear where, what attributes can be on which tags, which kinds of values the attributes can have, and so on. A DTD describes the expectations or constraints desired for a given application. If a well-formed XML document satisfies all the constraints defined in its DTD, it's said to be valid. The kinds of constraints a DTD can define include:
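A few such constraints can be sketched in one small, self-contained document. The vocabulary here (`bulletin`, `title`, `step`, and the attributes) is hypothetical; the declarations are placed in the internal subset only so that the example fits in a single file.

```xml
<?xml version="1.0"?>
<!DOCTYPE bulletin [
  <!-- Content model: a title, then one or more steps, in that order. -->
  <!ELEMENT bulletin (title, step+)>
  <!ELEMENT title (#PCDATA)>
  <!ELEMENT step (#PCDATA)>
  <!-- Which attributes may appear on which element, their types,
       and whether they are required, optional, or defaulted. -->
  <!ATTLIST bulletin
            id     ID              #REQUIRED
            status (draft | final) "draft">
  <!ATTLIST step
            tool   CDATA           #IMPLIED>
]>
<bulletin id="b1001" status="final">
  <title>Oil filter replacement</title>
  <step tool="strap wrench">Remove the old filter.</step>
  <step>Install the new filter.</step>
</bulletin>
```

This document is valid against its own declarations. Note that the text inside `title` and `step` is constrained only to be character data; the DTD says nothing more about what that text may contain.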
DTDs in XML have a different syntax from that of XML documents; DTDs use a "declaration" syntax instead of XML tags. In other words, a DTD isn't an XML document, so parsing it requires additional technology. For a number of reasons, XML experts believe there are benefits to replacing the special syntax of a DTD with the now-standard XML syntax. Both the need for stronger data typing and the desire to develop an XML-based description of constraint declarations led to the development of an XML Schema language.
The XML Schema Working Group of the W3C has developed a "Proposed Recommendation" for a schema language that provides a means for defining the constraints on the structure and content of an XML document. This Proposed Recommendation is likely to become a full-fledged standard in the near future.
Both DTDs and schemas support the validation of a document's structure; that is, both specify valid elements, their content models, valid attributes, valid attribute types, and default attribute values. Schemas offer a couple of significant additional capabilities:
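Setting those additions aside for a moment, the shared structural ground can be sketched in schema syntax. The fragment below is an illustration, not a definitive schema: it assumes a hypothetical `bulletin` vocabulary (a `title` followed by one or more `step` elements, with `id` and `status` attributes) and uses the `http://www.w3.org/2001/XMLSchema` namespace of the XML Schema specification.

```xml
<?xml version="1.0"?>
<!-- Hypothetical vocabulary; the schema itself is an ordinary XML document. -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="bulletin">
    <xs:complexType>
      <!-- Content model: a title, then one or more steps. -->
      <xs:sequence>
        <xs:element name="title" type="xs:string"/>
        <xs:element name="step" maxOccurs="unbounded">
          <xs:complexType>
            <xs:simpleContent>
              <xs:extension base="xs:string">
                <!-- Optional attribute on step. -->
                <xs:attribute name="tool" type="xs:string"/>
              </xs:extension>
            </xs:simpleContent>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
      <!-- Required attribute with an ID type. -->
      <xs:attribute name="id" type="xs:ID" use="required"/>
      <!-- Enumerated attribute with a default value. -->
      <xs:attribute name="status" default="draft">
        <xs:simpleType>
          <xs:restriction base="xs:NMTOKEN">
            <xs:enumeration value="draft"/>
            <xs:enumeration value="final"/>
          </xs:restriction>
        </xs:simpleType>
      </xs:attribute>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

Because the schema is itself well-formed XML, the same parser that reads the documents can read the constraints.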
(A point of controversy that hasn't been settled is the claim that schemas fall short compared to DTDs because they lack support for parameter entities, which allow fine-grained customization and modularization of DTDs. Because of this, some have argued that maintaining complex constraint declarations will be more difficult as schemas than as DTDs. Even if this concern proves valid, it will affect only those who create and maintain DTDs, not the vast majority of users whose job is to create and maintain information.)
The primary advantage of schemas over DTDs is their support for validating element content. While DTDs allow for very basic constraints on attribute values, schemas not only greatly strengthen the data-typing constraints that can be applied to attribute values, but also allow the same strong level of data-typing constraints to be applied to element content. This means that your schema can require, for example, that the content of your "NumberOfDependents" element is a valid integer between 0 and 30, or that the content of your "Telephone" element is a string that matches a certain pattern (e.g., 3-3-4) that you have determined all phone numbers must match.
The degree to which content validation provides an added benefit is application-specific. One could make a general argument that document-oriented applications are likely to benefit relatively less from data typing than data-oriented applications. However, these generalizations won't apply universally, so existing applications will migrate at different speeds.
Migration from DTDs to schemas will take place over many years. Even after schemas become an approved standard, they will coexist with - but not replace - DTDs. As a part of XML 1.0, DTDs will always be supported by validating XML 1.0 parsers, and for many applications, DTDs can supply all the constraints they need. Even for those who find some benefit in migrating to schemas, parts of DTDs may still remain.
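The "NumberOfDependents" and "Telephone" examples above can be sketched as schema declarations. This is an illustration under stated assumptions: the 3-3-4 pattern is read here as three groups of digits separated by hyphens, and the two elements are declared globally only to keep the fragment short.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <!-- Element content must be an integer from 0 to 30 inclusive. -->
  <xs:element name="NumberOfDependents">
    <xs:simpleType>
      <xs:restriction base="xs:integer">
        <xs:minInclusive value="0"/>
        <xs:maxInclusive value="30"/>
      </xs:restriction>
    </xs:simpleType>
  </xs:element>

  <!-- Element content must match a 3-3-4 digit pattern such as 555-010-0199.
       The hyphen separator is an assumption, not taken from the article. -->
  <xs:element name="Telephone">
    <xs:simpleType>
      <xs:restriction base="xs:string">
        <xs:pattern value="[0-9]{3}-[0-9]{3}-[0-9]{4}"/>
      </xs:restriction>
    </xs:simpleType>
  </xs:element>
</xs:schema>
```

With these declarations, `<NumberOfDependents>3</NumberOfDependents>` would be accepted, while `<NumberOfDependents>thirty-one</NumberOfDependents>` or `<Telephone>5550100199</Telephone>` would be rejected. A DTD could declare both elements only as `#PCDATA` and would accept all three.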
One reason for that is that the declaration of entities - something that can be done as part of a DTD - isn't supported in the current version of XML Schemas. If entity declarations are needed, they still must be made using standard XML 1.0 declarations in either the internal subset (which goes at the top of the document) or the external subset (i.e., within the DTD). A given document may thus have both a DTD and an XML Schema that provides additional constraints.
Companies that currently use DTDs will be able to convert to schemas later, but will never be required to do so. Validating XML processors will always have to support DTDs. Tools for converting XML DTDs to schemas include Arbortext's Epic Architect, a developer's kit that bundles TIBCO Extensibility's schema development tool, which automates many aspects of DTD-to-schema conversions. (Conversion of SGML DTDs to schemas may present additional obstacles.)
Regardless of the use of DTDs and schemas, the specifications that describe the semantics of XML documents will remain unchanged, and therefore XML data will remain interoperable. There will be no significant issues with regard to the coexistence of DTDs and schemas. Companies that use DTDs for validation may exchange documents with companies that use schemas. The only risk is that documents created in a DTD-based application may contain content that fails to comply with the potentially more highly constrained data-typing specifications of the related schema. There are two basic ways to deal with this:
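Returning to the earlier point that a given document may have both a DTD and an XML Schema, the combination might look like the sketch below. The entity name, the element names, and the schema file `bulletin.xsd` are all hypothetical. The sketch also assumes a processor that reads the internal subset only to expand entities and performs validation against the schema; a DTD-validating parser would additionally expect element declarations in the DTD.

```xml
<?xml version="1.0"?>
<!DOCTYPE bulletin [
  <!-- Entities still come from XML 1.0 declarations, not from the schema. -->
  <!ENTITY oem "Example OEM, Inc.">
]>
<!-- The xsi attribute is a hint that tells a schema processor where to
     find the schema (hypothetical file) for elements in no namespace. -->
<bulletin xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xsi:noNamespaceSchemaLocation="bulletin.xsd">
  <title>Oil filter replacement</title>
  <step>Return the used filter to &oem;.</step>
</bulletin>
```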
You can locate the W3C drafts and reports from the index at www.w3.org/TR/. Although the XML Schema Recommendation, which comes in two parts, is a substantial bit of reading, you can get a good overview by reading the XML Schema Primer, available at www.w3.org/TR/xmlschema-0/. A public page on the W3C site at www.w3.org/XML/Schema provides a good set of pointers to various schema-related resources.