Working with Xindice
By: Roy Hoobler
Oct. 24, 2002 12:00 AM
In my free time, I've been working on a CMS/portal application using Java and XML. I was glad to discover some of the XML database tools that are now available; as more and more data is stored and transmitted in XML format, XML databases are worth considering. Moving an XML application to Xindice (pronounced zin-dee-chay) is an interesting experience. Xindice is a "new" open-source database engine, so a lot of issues must be resolved by brute force. Getting started with the Java API was fairly easy, though, and putting a test class together takes only an hour or two. Xindice also has a built-in HTTP server that runs on port 4080, but the query mechanism over HTTP didn't work on my machine, so I started to build my own servlet that could search documents stored in Xindice.
Why Switch to Xindice?
Working with Xindice requires setting it up and building a Java package that includes a few classes to query the database. I also built a servlet class so the database could be queried over HTTP. This is new technology, so be sure to test and design well before deciding whether it's the right solution.
Setting Up Xindice
Documents are stored in collections on the server. For my articles, I created an "articles" collection using the xindiceadmin tool: ./bin/xindiceadmin add_collection -c /db -n articles. Collections are a repository for XML documents. The fact that collections can contain other collections separates them from relational models, and the document database itself forms a human-readable hierarchy. For now, all my articles are placed in an "articles" collection, but I could (and probably will) separate this into "articles/xml", "articles/msdotnet", and "articles/java" collections.
To begin with, you may want to add a few documents using the command-line utility. Documents are stored in Xindice in a special format to which indexes can be added to increase performance. From the command line, adding a document is simple: xindice add_document -c /db/articles -f fx102.xml -n SSL1. The -n parameter is the unique key for the document. Even though it's an optional parameter, it's a good idea to supply your own. Later, if you build a mechanism to retrieve a single document, looking it up by the generated key (e.g., 0625df60001a5d4000bc49d00060bf5) won't be very convenient. SSL1 or SSL09-2002 is a lot easier to type in and retrieve later. Beware: Xindice currently allows you to store documents with duplicate keys. Querying will return the results from both documents, but you'll only be allowed to retrieve the first document added.
Before diving into the Java classes, test to make sure everything is running well through the command-line interface. One note about the command line: double quotes need to be placed around XPath expressions. XPath itself doesn't normally require this, and it isn't needed when using the Java classes. Use bin/xindice retrieve_document -c /db/articles -f fx102.xml -n SSL1 or bin/xindice xpath -c /db/articles -q /article["contains(title, 'SSL')"]/title. The documentation that comes with Xindice explains everything.
Check the User Guide, Developer Guide, and Administrator Guide - the information you need may be in any one of these guides (you can find the documentation at http://localhost:4080).
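Since the same XPath expression is passed unquoted from Java, it can be handy to try it against a local document before running it on the server. The following is only a sketch of that idea: it uses the JDK's own javax.xml.xpath package rather than the Xindice API, and the class name XPathCheck and the sample article are made up for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class XPathCheck {
    // Hypothetical sample document, not one of the articles stored above.
    static final String SAMPLE =
        "<article><title>Configuring SSL</title><para>Body text.</para></article>";

    /** Evaluates an XPath expression against an XML string and returns the text of each match. */
    static List<String> matches(String xml, String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(
                expression, new InputSource(new StringReader(xml)), XPathConstants.NODESET);
            List<String> out = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                out.add(nodes.item(i).getTextContent());
            }
            return out;
        } catch (XPathExpressionException e) {
            // Covers both a malformed expression and an unparsable document.
            throw new IllegalArgumentException("Could not evaluate: " + expression, e);
        }
    }

    public static void main(String[] args) {
        // Same expression as the command-line example, without the extra double quotes.
        System.out.println(matches(SAMPLE, "/article[contains(title, 'SSL')]/title"));
        // prints [Configuring SSL]
    }
}
```

An expression that behaves as expected here should need no changes when passed to the Java classes; only the command-line form needs the extra quoting.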
Creating a Java Class to Search DocBook Articles
The xmlQuery class shown in Listing 1 is based on the examples provided with Xindice. The terminology isn't the same as that of a relational database class, but there are enough similarities that this type of connecting and querying should be very familiar to most developers. Instead of connections, tables, and queries, it is managers, collections, and services that provide the interface to get data from the system. The concept of a collection may be more familiar to developers who have previously worked with content management tools.
In the xmlQuery class, the getResults method builds a nonvalid XML document as a string from the Results object's getContent() method. In the future these results may be appended to a valid XML document or sent to the client as a stream of results. Xindice returns a complete XML document with an XML declaration for each result, so the first line (the XML declaration) is stripped from the results. Since Xindice is open source, the source code can also be modified to return result streams as one document. The JavaDoc documentation, which explains other classes and other methods to use, is also included with the distribution.
The second class, searchParams.java (see Listing 2), encapsulates the parameters needed to search for documents. At first the package used a parameter list, but after it grew to about five parameters, switching to a class made more sense. To keep things simple, I created a public class with six public fields. For a real implementation, more than one class may be involved, using either a bean model or something different.
The third class, docBookSearch, does most of the work (see Listing 3), creating a valid XPath query for searching documents from the parameters passed in from the servlet. The best thing about this class is that it can be tested from a standard Java class with a "main" method and, after testing, called from the servlet class below.
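To give a feel for the kind of query building described here, the following is a minimal sketch and not the code from Listing 2 or Listing 3. The class names, the three fields, and the /article root are assumptions chosen to match the command-line example earlier; the real searchParams class has six fields. Like the class described above, it can be exercised from a plain main method.

```java
class SearchParamsSketch {
    // Hypothetical fields; the original class exposes six public fields.
    public String searchString;          // text to look for (required)
    public String element = "title";     // child element of <article> to search in
    public String resultPath = "title";  // what to return for each matching article
}

class XPathBuilderSketch {
    /** Quotes a string as an XPath 1.0 literal, which has no escape characters. */
    static String literal(String s) {
        if (!s.contains("'")) return "'" + s + "'";
        if (!s.contains("\"")) return "\"" + s + "\"";
        // Both quote styles present: split on single quotes and rejoin with concat().
        StringBuilder sb = new StringBuilder("concat(");
        String[] parts = s.split("'", -1);
        for (int i = 0; i < parts.length; i++) {
            if (i > 0) sb.append(", \"'\", ");
            sb.append("'").append(parts[i]).append("'");
        }
        return sb.append(")").toString();
    }

    /** Builds an XPath query of the form /article[contains(element, 'text')]/resultPath. */
    static String build(SearchParamsSketch p) {
        if (p.searchString == null || p.searchString.isEmpty()) {
            throw new IllegalArgumentException("searchString is required");
        }
        return "/article[contains(" + p.element + ", " + literal(p.searchString) + ")]/"
            + p.resultPath;
    }

    public static void main(String[] args) {
        SearchParamsSketch p = new SearchParamsSketch();
        p.searchString = "SSL";
        System.out.println(build(p));
        // prints /article[contains(title, 'SSL')]/title
    }
}
```

Quoting the search text through a helper matters because it comes straight from the request; a stray apostrophe would otherwise produce an invalid expression.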
The logic in this class needs to be refined and will become more complex to handle different types of queries and produce different results. The fourth class (see Listing 4), dbSearchServlet, is the servlet itself. The following code retrieves the results from the docBookSearch class and passes them to the browser:
response.getOutputStream().println
This class could function with less code, but some parameter checking has been included. After compiling the class, it should be in the ./WEB-INF/classes directory. To make things easier, I put everything into a package, added it to the ./WEB-INF/lib directory, and then added the code shown in Listing 5 to the ./WEB-INF/web.xml file.
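Separately from the listings, the string handed to println is worth a closer look, since each result comes back with its own XML declaration. The sketch below shows one possible way to strip those declarations and wrap the fragments in a single root element so the response is well formed. The class name ResultAssembler, the results root element, and the regular expression are assumptions for illustration, not the approach taken in xmlQuery.

```java
import java.util.Arrays;
import java.util.List;

class ResultAssembler {
    /** Removes a leading XML declaration such as <?xml version="1.0"?> if one is present. */
    static String stripDeclaration(String xml) {
        // Requires whitespace after "xml" so that other processing instructions are left alone.
        return xml.replaceFirst("^\\s*<\\?xml\\s[^>]*\\?>\\s*", "");
    }

    /** Concatenates result fragments under one root element (the name is hypothetical). */
    static String wrap(List<String> fragments) {
        StringBuilder sb = new StringBuilder("<results>");
        for (String fragment : fragments) {
            sb.append(stripDeclaration(fragment));
        }
        return sb.append("</results>").toString();
    }

    public static void main(String[] args) {
        System.out.println(wrap(Arrays.asList(
            "<?xml version=\"1.0\"?>\n<title>Configuring SSL</title>",
            "<?xml version=\"1.0\"?>\n<title>SSL and Tomcat</title>")));
        // prints <results><title>Configuring SSL</title><title>SSL and Tomcat</title></results>
    }
}
```

A single root element makes the output easier to consume from XSLT or any other XML parser, at the cost of building the whole response in memory first.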
Finishing Up
The goal, however, is to make the "searchString" parameter the only required parameter. Another servlet class will be "getArticle", which will take an article key as a parameter. Using the example given earlier, an entire article will be accessible at the URL http://localhost:8080/examples/xArticle?SSL1. On the portal, the search will be invoked from an XSLT document. This is another advantage of having an XML database incorporated with a servlet. In a previous article I wrote about searching through one XML document. With Xindice, implementing a variety of searches will be much simpler. The XSLT code looks something like Listing 6. Of course, the search terms will be replaced by parameters or variables. The database can also be queried by other Web applications and programs. A few more parameters could produce results in RSS (Rich Site Summary) format as well; an XML-based application provides a lot of possibilities and a very open system. With XML architecture, the same type of system could be achieved. With Xindice as a datastore, the problem of storing and querying large numbers of XML documents is solved, enabling more XML-driven applications.