Open Source Database Special Feature: An Introduction to Berkeley DB XML
Basic concepts, the shell commands, and beyond

Eager vs Lazy Evaluation
You may have noticed that after each query, BDB XML prints out the number of entries and the query evaluation method:

436 objects returned for eager expression

Eager is the default query evaluation method. When a query is evaluated eagerly, BDB XML computes all of the results during query execution and stores them in a data structure, so they are available as soon as the query returns. Lazy evaluation works differently: the database does not keep the results in a data structure. It knows how to get them (using pointers), but it does no work until the results are actually retrieved.

Results are stored in sets. To get all of the results we have to iterate through the result set using the next operator. This is what happens internally when we use the "print" command in the dbxml shell: it iterates through the entire set and retrieves every element. Thus, after an eagerly evaluated query the result set is completely filled, whereas after a lazily evaluated query the result set is either empty or holds only some of the results, never all of them.
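The difference can be illustrated with a minimal Python sketch. This is a conceptual model only, not the BDB XML API: the functions query_eager and query_lazy, the sample entries, and the examined list are all assumptions made up for illustration.

```python
from typing import Callable, Iterable, Iterator, List

def query_eager(entries: Iterable[str], matches: Callable[[str], bool]) -> List[str]:
    """Eager: evaluate the whole query now and store every result."""
    return [e for e in entries if matches(e)]

def query_lazy(entries: Iterable[str], matches: Callable[[str], bool]) -> Iterator[str]:
    """Lazy: return an iterator; no entry is examined until next() is called."""
    return (e for e in entries if matches(e))

# Hypothetical stand-in for the dictionary entries in the container.
entries = ["the hockey stick", "a football", "the hockey puck"]
examined = []  # records every entry the predicate has looked at

def contains_hockey(entry: str) -> bool:
    examined.append(entry)
    return "the hockey" in entry

eager_results = query_eager(entries, contains_hockey)
work_after_eager = len(examined)   # every entry was examined up front

examined.clear()
lazy_results = query_lazy(entries, contains_hockey)
work_after_lazy = len(examined)    # nothing has been examined yet
first = next(lazy_results)         # the "next" operator triggers the work
work_after_next = len(examined)
```

In this model the eager query has already examined every entry when it returns, while the lazy query examines entries only as next is called on its result.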

It may sound as though lazy query evaluation is never useful, but this is not the case. If you do not need all of the objects returned by the query, using lazy evaluation makes more sense. You can see this with the following query:

dbxml> setLazy on
Lazy evaluation on

dbxml> query 'collection("xbench.dbxml")/dictionary/e[contains(. , "the hockey")]/hwg/hw'

Query - Starting query execution
Lazy expression 'collection("xbench.dbxml")/dictionary/e[contains(. , "the hockey")]/hwg/hw' completed

Note that the execution time for this query is negligible (the database prints no execution time information). That's because the actual results have not been retrieved yet; BDB XML retrieves them only after the "print" command. We know that this query returns 436 objects. Instead of getting all of the results, let's get only the first eight by using the "print n 8" command.
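Why retrieving only a few results is cheap under lazy evaluation can be sketched in Python. This is a model of the idea, not of how the dbxml shell is implemented: lazy_matches, first_n, and the synthetic 1,000-entry data set are hypothetical.

```python
from itertools import islice

def lazy_matches(entries, needle, scanned):
    """Yield matching entries one at a time, recording each entry scanned."""
    for entry in entries:
        scanned.append(entry)
        if needle in entry:
            yield entry

def first_n(results, n):
    """Rough analogue of retrieving only the first n results."""
    return list(islice(results, n))

# Synthetic data: every second entry contains the search phrase.
entries = [
    f"entry {i}: the hockey term" if i % 2 == 0 else f"entry {i}: other"
    for i in range(1000)
]

scanned = []
results = lazy_matches(entries, "the hockey", scanned)  # no scanning happens here
top_eight = first_n(results, 8)
```

With this data, only the first 15 of the 1,000 entries are scanned to produce the eight results; the rest are never touched unless more results are requested.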

XML Schema Validation
One of the new and cool features of BDB XML is its ability to validate XML documents against an XML Schema. First we need to create a container with XML Schema validation enabled. Listing 8 shows the XML sample (the 10MB XML sample with XML Schema; see the first entry in the References section) that I am going to put into this container.

This document is assigned an XML Schema. The part that shows this assignment is:

<dictionary xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:noNamespaceSchemaLocation="http://www.cs.umb.edu/~smimarog/xmlsample/TCSD1.xsd">

This schema is located at www.cs.umb.edu/~smimarog/xmlsample/TCSD1.xsd.
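As an illustration of how this schema assignment can be read programmatically, here is a minimal Python sketch using only the standard library. It is not part of BDB XML; the function name schema_location and the shortened sample document are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Namespace bound to the xsi prefix in the document above.
XSI_NS = "http://www.w3.org/2001/XMLSchema-instance"

def schema_location(xml_text):
    """Return the xsi:noNamespaceSchemaLocation of the root element, or None."""
    root = ET.fromstring(xml_text)
    # ElementTree exposes namespaced attributes as "{namespace}localname".
    return root.get(f"{{{XSI_NS}}}noNamespaceSchemaLocation")

# Shortened stand-in for the real 10MB document: only the root element.
sample = """<dictionary xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:noNamespaceSchemaLocation="http://www.cs.umb.edu/~smimarog/xmlsample/TCSD1.xsd">
</dictionary>"""

print(schema_location(sample))
```

A document whose root element carries no such attribute makes the function return None, which corresponds to a document that is not assigned to any schema.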

An XML Schema and the XML documents that use it can be located on the same machine or on different machines. In this example, the XML Schema and the XML data are located on two different machines.

dbxml> createContainer validate_xbench.dbxml d validate
Creating document storage container, with validation

dbxml> openContainer validate_xbench.dbxml

dbxml> putDocument dict_10_valid C:\dictionary10_schema.xml f
Document added, name = dict_10_valid

A natural question is whether it's possible to add an XML document to this container without validating it. Validation in Berkeley DB XML is very fast, which is a big time-saver; I have found that validating a document in BDB XML takes much less time than in some commercial XML editors. However, validating every document may still be costly when documents are huge. Besides, sometimes XML documents are not assigned to any schema. Listing 9 shows the XML sample (the 10MB XML sample in the References section) that I am going to put into this container next.

Within this container we now have two documents, named dict_10_valid and dict_10; the first document is validated, but the second is not. In some cases it's desirable to restrict queries to a specific document in the collection. We can achieve this by using the "doc" function.

dbxml> query 'doc("validate_xbench.dbxml/dict_10")//hwg'
733 objects returned for eager expression
'doc("validate_xbench.dbxml/dict_10")//hwg'

Using doc("validate_xbench.dbxml/dict_10") restricts the query to the dict_10 document only.
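The difference between querying the whole collection and querying a single document can be modeled in a few lines of Python. This is only a sketch of the idea: the dictionary standing in for the container, the tiny sample documents, and the helper names query_collection and query_doc are hypothetical, and no BDB XML API is used.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for a container: document name -> parsed XML.
container = {
    "dict_10_valid": ET.fromstring(
        "<dictionary><e><hwg><hw>a</hw></hwg></e></dictionary>"
    ),
    "dict_10": ET.fromstring(
        "<dictionary><e><hwg><hw>b</hw></hwg></e>"
        "<e><hwg><hw>c</hw></hwg></e></dictionary>"
    ),
}

def query_collection(container, tag):
    """Roughly like collection(...)//tag: search every document."""
    return [node for root in container.values() for node in root.iter(tag)]

def query_doc(container, name, tag):
    """Roughly like doc(".../name")//tag: search only the named document."""
    return list(container[name].iter(tag))

all_hwg = query_collection(container, "hwg")
dict_10_hwg = query_doc(container, "dict_10", "hwg")
```

Here all_hwg holds the hwg elements of both documents, while dict_10_hwg holds only those found in dict_10.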

Indexing
Indexing XML documents is very important for good query performance; in fact, it is arguably the most important task for the user. BDB XML has limited automatic XML indexing features, but indexing is best done manually by the programmer. In this section I will introduce you to the basics of XML indexing in BDB XML. Here is the format of an index:

[unique]-{path type}-{node type}-{key type}-{syntax type}

Apart from the optional uniqueness flag, an index in BDB XML is composed of four parts:

  • Path Types
  • Node Types
  • Key Types
  • Syntax Types

Uniqueness
Uniqueness indicates that the value being indexed must be unique within the container. For example, in an employee data set, the employee number will be unique, as will the social security number.
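
To make the index format concrete, the following Python sketch splits an index string into the parts named above. It is an illustration only, not a BDB XML function: IndexSpec and parse_index_spec are hypothetical names, the sample strings are assumed examples, and the sketch does not check whether each part is a value that BDB XML accepts.

```python
from typing import NamedTuple

class IndexSpec(NamedTuple):
    unique: bool
    path_type: str
    node_type: str
    key_type: str
    syntax_type: str

def parse_index_spec(spec: str) -> IndexSpec:
    """Split '[unique]-{path}-{node}-{key}-{syntax}' into its parts."""
    parts = spec.split("-")
    unique = parts[0] == "unique"
    if unique:
        parts = parts[1:]  # drop the optional uniqueness flag
    if len(parts) != 4:
        raise ValueError(f"expected four parts after the optional flag: {spec!r}")
    return IndexSpec(unique, *parts)

# Assumed example strings, used only to exercise the parser.
plain = parse_index_spec("node-element-equality-string")
flagged = parse_index_spec("unique-node-element-equality-string")
```

In this sketch, plain.unique is False and flagged.unique is True, while the remaining four fields are identical for both strings.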

About Selim Mimaroglu
Selim Mimaroglu is a PhD candidate in computer science at the University of Massachusetts in Boston. He holds an MS in computer science from that school and has a BS in electrical engineering.

Open Source Database Special Feature: An Introduction to Berkeley DB XML. In this article I am going to introduce you to the latest version of the Berkeley DB XML, version 2.2.8. Berkeley DB XML (BDB XML) is built on top of the well-known Berkeley Database (BDB). BDB XML is an open source, native XML database. Like its ancestor, BDB, it's an embedded database. It provides APIs for the Java, C++, Perl, Python, PHP, and Tcl languages. It supports the popular XML query languages XQuery and XPath 2.0. I will show you how to use BDB XML in two ways. This month I will introduce the BDB XML shell, and next month we will explore using BDB XML with Java. BDB XML has a lot of features, and I will try to cover the most important ones.
