Level: Intermediate
Arun Gaikwad (agaikwat@comcast.net), Independent Software Consultant
01 Sep 2002
This article is an introduction to an Open Source Native XML Database System, called Xindice (pronounced zeen-dee-chay). It is also an introduction to Native XML Database concepts.
Introduction
Native XML Database Systems (NXDs) are becoming increasingly relevant as XML itself is becoming more popular (Web services being the latest application of XML). Many questions arise at the mention of the words "XML Database" including:
- Why another type of database technology?
- How is it different from Relational Database Management Systems (RDBMSs)?
- When should I use a Native XML Database System or a Relational Database System?
- What kind of Native XML Database products are available?
I will try to answer some of these questions with the help of a Native XML Database System called Xindice. Xindice is an Open Source Native XML Database System. In this article, you will learn how to:
- Install Xindice
- Create and delete collections
- Insert and delete documents into these collections
- Use XPath to query these documents
You can perform these operations on the command line or embed them in Java programs using the Java API. You will also learn to use the Java API to write JDBC style programs to communicate with Xindice.
Why native XML database systems?
An XML Database System is something you may think is unnecessary, but once you start using it, you wonder how you would survive without it. I say this from personal experience. When I first heard of Native XML Database Systems about two years ago, I completely ignored them, thinking that they were just hype. At that time, I was involved in the development of a project for a large financial brokerage company. We were using XML to send and receive financial feed data. It was necessary to save the feed data in some kind of permanent storage. As a Relational Database programmer, my first choice was to use a Relational Database System to save these XML documents. I decided to use CLOBs (Character Large Objects) with a modern RDBMS to save these documents. Since the RDBMS supported a Java API to insert and retrieve CLOBs, this was a very easy task.
As our project evolved, I found that this approach had a major drawback: it was nothing but DIDO (Document In, Document Out). Retrieving partial documents or nodes from a DOM tree was not possible. A tool that saved the XML documents, performed database-like queries on nodes, and retrieved partial or full documents would have been very useful. This is where NXDs come into the picture. If I had to do this project all over again, I would definitely use an NXD. If you need simple DIDO functionality, you might want to use an RDBMS to save your documents, but for extended functionality such as Query and Update you should consider an NXD.
Sometimes people try to save XML documents into Normalized Relational Database tables by mapping the document nodes into Relational format. This is not always easy. It is relatively easy to build an XML document from RDBMS tables, but not to store one in them, because XML documents are hierarchical and almost free format. Some of the potential applications suitable for NXDs include:
- Membership databases
- Catalogues
- Corporate portals
- Parts databases
- Business to business document exchange
- Web services
Many database products available today claim to be Native XML Database Systems. While I was evaluating some of these products, two made it to the top of my list. One was a commercial product and the other was Xindice from Apache. The commercially available product is quite powerful and comes with GUI Interfaces to perform administration while Xindice is an Open Source project that is still evolving. Xindice comes with a command line interface to administer the database as well as a Java API. I chose Xindice to write this article due to its simplicity, open source nature, and powerful collection management. Many other products, case studies, and performance benchmarks are covered in a forthcoming book titled XML Data Management edited by Chaudhri, Rashid, and Zicari (see Resources). Now let's look at how to set up and use Xindice.
Xindice installation
Installation and configuration are very easy. I installed and configured Xindice within 15 minutes. Here are the required steps:
- Download the zip file from www.xindice.org or apache.org.
- Unzip the file into a suitable directory.
- Add the Xindice bin directory into your path. For example, C:\Xindice\xml-xindice-1.0\bin
- Set the JAVA_HOME environment variable to point to the location where the JDK is installed. Remember that you need Sun's Java SDK version 1.3 or higher.
- Set the XINDICE_HOME environment variable to point to the location where Xindice is installed. For example, C:\Xindice\xml-xindice-1.0
- Add the Xindice.jar file to your CLASSPATH. For example, C:\Xindice\xml-xindice-1.0\java\lib\Xindice.jar
- Change to the Xindice directory. For example, C:\Xindice\xml-xindice-1.0
- Type startup on the command line to start the server running on port 4080.
- Go to http://localhost:4080 from your Web browser and view the Xindice Web page.
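Before starting the server, it can help to confirm that the two environment variables from the steps above are actually visible to Java. The following is a minimal sketch under stated assumptions: the class XindiceEnvCheck and its method are hypothetical helpers written for this article, not part of Xindice, and the code assumes a current JDK rather than the 1.3 release mentioned above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper (not part of Xindice): reports which of the
// environment variables named in the installation steps are missing.
final class XindiceEnvCheck {
    static final String[] REQUIRED = {"JAVA_HOME", "XINDICE_HOME"};

    // Returns the names of required variables that are absent or blank.
    static List<String> missingVariables(Map<String, String> env) {
        List<String> missing = new ArrayList<>();
        for (String name : REQUIRED) {
            String value = env.get(name);
            if (value == null || value.trim().isEmpty()) {
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        List<String> missing = missingVariables(System.getenv());
        if (missing.isEmpty()) {
            System.out.println("JAVA_HOME and XINDICE_HOME are both set.");
        } else {
            System.out.println("Not set: " + missing);
        }
    }
}
```

The check only looks at whether the variables exist. It does not verify that they point at a working JDK or Xindice installation.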
Xindice administration
Now, you'll perform some basic administration work such as creating a collection. A collection is similar to a table in an RDBMS. A collection could be schema based or non-schema based. Schema based collections need a well-defined XML Schema to begin with. Xindice collections do not need XML Schema. Let's create a collection called partsdb. At the command line, type:
xindiceadmin ac -c /db -n partsdb
The reply is:
If you try to do the command again, the reply is:
ERROR : Collection Duplicated
You have successfully created a collection. Now, list the collections by typing:
The reply is:
To delete a collection, you can use:
xindiceadmin dc -c /db -n partsdb
Collection management
In this section, I show you how to manage a collection. To add a new document called parts.xml (see Listing 1) into the collection, use the command:
xindice ad -c /db/partsdb -f C:\xml\parts.xml -n parts
parts is a key. If you omit the key (which is the -n option), Xindice creates its own key.
Listing 1. parts.xml
<?xml version="1.0"?>
<parts>
  <part sku="101">
    <desc>Ball Bearing</desc>
    <maker>S.K.F.</maker>
    <instock>Yes</instock>
    <price>$20.00</price>
  </part>
  <part sku="102">
    <desc>Gasket</desc>
    <maker>A.B.C.</maker>
    <instock>Yes</instock>
    <price>$2.00</price>
  </part>
</parts>
When you add a document, it is validated for correct XML syntax. Now you can retrieve this document by issuing the command:
xindice rd -c /db/partsdb -n parts -f <your file>
The document is saved in the file name that you supplied. If you omit the file name, the document is displayed. Similarly, you can delete the document by using:
xindice dd -c /db/partsdb -n parts
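The syntax check performed when a document is added amounts to asking whether the XML is well-formed. To see what that means in practice, here is a rough sketch that uses the XML parser bundled with the JDK. It is not Xindice's own code, and the class name WellFormedCheck is made up for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: checks well-formedness with the JDK parser.
// This illustrates the idea only; Xindice performs its own check.
final class WellFormedCheck {
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing them.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // the parser rejected the document
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String good = "<parts><part sku=\"101\"><desc>Ball Bearing</desc></part></parts>";
        String bad = "<parts><part sku=\"101\"></parts>"; // part is never closed
        System.out.println(isWellFormed(good));
        System.out.println(isWellFormed(bad));
    }
}
```

A document that fails this kind of check, such as the second string with its unclosed part element, is the sort of input you should expect to be rejected when you try to add it.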
Querying a document
You can perform some queries on the document. Xindice uses XPath as its query language. If you want to retrieve information about a part with SKU 101, enter:
xindice xpath -c /db/partsdb -q /parts/part[@sku="101"]
Listing 2 shows the query results in a document.
Listing 2. Query results
<?xml version="1.0"?>
<part sku="101" xmlns:src="http://xml.apache.org/xindice/Query"
      src:col="/db/partsdb" src:key="parts">
  <desc>Ball Bearing</desc>
  <maker>S.K.F.</maker>
  <instock>Yes</instock>
  <price>$20.00</price>
</part>
Note that Xindice has added some new attributes in the part tag. These are very helpful and can be used to discover the origin of the document, such as the collection (src:col) and key (src:key).
Command line operations are certainly helpful but the real power of any tool is programming with the API. Xindice provides a Java API to develop applications. I look briefly at how to write Java Programs to access Xindice.
Application development using Java
Listing 3 shows a Parts.java program to access a parts document stored under partsdb:
Listing 3. Parts.java program
import org.xmldb.api.base.*;
import org.xmldb.api.modules.*;
import org.xmldb.api.*;
public class Parts {
    public static void main(String[] args) throws Exception {
        Collection col = null;
        try {
            String driver = "org.apache.xindice.client.xmldb.DatabaseImpl";
            Class c = Class.forName(driver);
            Database database = (Database) c.newInstance();
            DatabaseManager.registerDatabase(database);
            col = DatabaseManager.getCollection("xmldb:xindice:///db/partsdb");
            String xpath = "//parts/part[@sku='101']";
            XPathQueryService service =
                (XPathQueryService) col.getService("XPathQueryService", "1.0");
            ResourceSet resultSet = service.query(xpath);
            ResourceIterator results = resultSet.getIterator();
            while (results.hasMoreResources()) {
                Resource res = results.nextResource();
                System.out.println((String) res.getContent());
            }
        }
        catch (XMLDBException e) {
            System.err.println("XML:DB Exception occurred " + e.errorCode);
        }
        finally {
            if (col != null) {
                col.close();
            }
        }
    }
}
After you compile and execute the Parts.java program, it outputs an XML document similar to Listing 2. The Java code is self-explanatory, but let's look at some of the details. Note that this program assumes that the Xindice server is running on the same machine as the program, but it need not be; the server can also be accessed remotely. If the server is running on a different machine, for example myhost, the URI is:
xmldb:xindice://myhost:4080/db/partsdb
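The local and remote forms of the collection URI differ only in the host and port part. As a small sketch, assuming the remote form keeps the same xmldb:xindice: prefix that Listing 3 uses for the local case, a hypothetical helper (XindiceUri is not an Xindice class) could build either form:

```java
// Hypothetical helper: builds a collection URI in the style used by
// Listing 3. An empty or null host yields the local form.
final class XindiceUri {
    static String collectionUri(String host, int port, String collectionPath) {
        if (collectionPath == null || !collectionPath.startsWith("/")) {
            throw new IllegalArgumentException("collection path must start with /");
        }
        String authority =
            (host == null || host.isEmpty()) ? "" : host + ":" + port;
        return "xmldb:xindice://" + authority + collectionPath;
    }

    public static void main(String[] args) {
        // Local server, as in Listing 3.
        System.out.println(collectionUri(null, 4080, "/db/partsdb"));
        // Remote server named myhost on the default port 4080.
        System.out.println(collectionUri("myhost", 4080, "/db/partsdb"));
    }
}
```

The resulting string is what you would pass to DatabaseManager.getCollection in place of the hard-coded URI.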
Now, look into the details of the code. The first three lines are import statements for required Java classes.
import org.xmldb.api.base.*;    // Base API module; required by all applications using XML:DB.
import org.xmldb.api.*;         // DatabaseManager class.
import org.xmldb.api.modules.*; // XPathQueryService is a module defined in the API.
The next step is to register the database driver as done by the following code:
String driver = "org.apache.xindice.client.xmldb.DatabaseImpl";
Class c = Class.forName(driver);
Database database = (Database) c.newInstance();
DatabaseManager.registerDatabase(database);
Once the driver is registered, you can access and retrieve collections by invoking the getCollection method and providing it the database URI as follows:
col = DatabaseManager.getCollection("xmldb:xindice:///db/partsdb");
Next, get a reference to the service called XPathQueryService, run the query, and obtain an iterator over the results as follows:
String xpath = "//parts/part[@sku='101']";
XPathQueryService service =
    (XPathQueryService) col.getService("XPathQueryService", "1.0");
ResourceSet resultSet = service.query(xpath);
ResourceIterator results = resultSet.getIterator();
The final step is straightforward. Just walk through the result set and print it:
while (results.hasMoreResources()) {
    Resource res = results.nextResource();
    System.out.println((String) res.getContent());
}
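If you want to experiment with the XPath expression from Listing 3 without a running Xindice server, the JDK's own javax.xml.xpath package can evaluate it against an in-memory copy of parts.xml. This is only a stand-in for XPathQueryService, sketched under that assumption. The class LocalXPathDemo is hypothetical, and the results are plain DOM nodes without the src:col and src:key attributes that Xindice adds.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical stand-in for a server-side query: evaluates an XPath
// expression against the parts document from Listing 1 using the JDK.
final class LocalXPathDemo {
    static final String PARTS_XML =
        "<parts>"
        + "<part sku=\"101\"><desc>Ball Bearing</desc><maker>S.K.F.</maker>"
        + "<instock>Yes</instock><price>$20.00</price></part>"
        + "<part sku=\"102\"><desc>Gasket</desc><maker>A.B.C.</maker>"
        + "<instock>Yes</instock><price>$2.00</price></part>"
        + "</parts>";

    // Returns the desc text of every part element the expression selects.
    // The expression is assumed to select part elements only.
    static List<String> descriptions(String partXPath) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        try {
            NodeList parts = (NodeList) xpath.evaluate(
                partXPath,
                new InputSource(new StringReader(PARTS_XML)),
                XPathConstants.NODESET);
            List<String> result = new ArrayList<>();
            for (int i = 0; i < parts.getLength(); i++) {
                Element part = (Element) parts.item(i);
                result.add(part.getElementsByTagName("desc").item(0).getTextContent());
            }
            return result;
        } catch (XPathExpressionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(descriptions("//parts/part[@sku='101']"));
    }
}
```

The same expression string can then be handed to service.query in Listing 3 once you are satisfied that it selects the nodes you expect.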
Database update using XUpdate
So far, you have seen how to store and retrieve XML documents in Xindice. But how do you update these stored documents? The answer is XUpdate. You can use XUpdate to add and delete elements as well as to change the contents of the elements. The best way to update Xindice is to embed XUpdate statements into Java programs. In this section, you will see some important features of XUpdate and an example embedded XUpdate Java program.
Note: The XUpdate specifications are still under development.
You can perform the following (and other) important operations using XUpdate.
- xupdate:insert-before inserts a new node before the selected node.
Insert a new element named model before the element maker in the partsdb XML document as follows:
<xupdate:modifications version="1.0" xmlns:xupdate="http://www.xmldb.org/xupdate">
  <xupdate:insert-before select="/parts/part[@sku='101']/maker">
    <xupdate:element name="model">BAB-101</xupdate:element>
  </xupdate:insert-before>
</xupdate:modifications>
This updates the partsdb XML document. Listing 4 shows the new document.
Listing 4. New document
<?xml version="1.0"?>
<parts>
  <part sku="101">
    <desc>Ball Bearing</desc>
    <model>BAB-101</model>
    <maker>S.K.F.</maker>
    <instock>Yes</instock>
    <price>$20.00</price>
  </part>
  <part sku="102">
    <desc>Gasket</desc>
    <maker>A.B.C.</maker>
    <instock>Yes</instock>
    <price>$2.00</price>
  </part>
</parts>
- xupdate:insert-after inserts a new element after the selected node.
insert-after is similar to insert-before but adds the element after the selected node. You could insert element model as follows:
<xupdate:modifications version="1.0" xmlns:xupdate="http://www.xmldb.org/xupdate">
  <xupdate:insert-after select="/parts/part[@sku='101']/desc">
    <xupdate:element name="model">BAB-101</xupdate:element>
  </xupdate:insert-after>
</xupdate:modifications>
- xupdate:update replaces all the child nodes of the selected node.
Use xupdate:update to change the price of the part with SKU 102 to $31.00 as follows:
<xupdate:modifications version="1.0" xmlns:xupdate="http://www.xmldb.org/xupdate">
  <xupdate:update select="/parts/part[@sku='102']/price">$31.00</xupdate:update>
</xupdate:modifications>
- xupdate:append appends a node as a child of the selected node.
Append is very much like insert, except that the new node becomes a child of the selected node rather than a sibling.
- xupdate:remove removes the selected node.
You can delete the element called model as follows:
<xupdate:modifications version="1.0" xmlns:xupdate="http://www.xmldb.org/xupdate">
  <xupdate:remove select="/parts/part[@sku='101']/model"/>
</xupdate:modifications>
- xupdate:rename renames the selected node.
You can rename element maker to manufacturer as follows:
<xupdate:modifications version="1.0" xmlns:xupdate="http://www.xmldb.org/xupdate">
  <xupdate:rename select="/parts/part/maker">manufacturer</xupdate:rename>
</xupdate:modifications>
- xupdate:variable defines a variable.
Use xupdate:variable to define a variable and assign a selected node to it. For example, suppose you want to move the instock node after price. You can do this with xupdate:variable, xupdate:remove, and xupdate:insert-after as follows:
<xupdate:modifications version="1.0" xmlns:xupdate="http://www.xmldb.org/xupdate">
  <xupdate:variable name="v_instock" select="/parts/part/instock"/>
  <xupdate:remove select="/parts/part/instock"/>
  <xupdate:insert-after select="/parts/part/price">
    <xupdate:value-of select="$v_instock"/>
  </xupdate:insert-after>
</xupdate:modifications>
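To make the effect of xupdate:insert-before from the list above concrete, the following sketch performs the equivalent change by hand on an in-memory DOM tree using only JDK classes. It does not use XUpdate or Xindice at all. The class InsertBeforeDemo is hypothetical, and the document is a one-part excerpt of Listing 1.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical illustration: a plain DOM equivalent of inserting a
// model element before maker for the part with SKU 101.
final class InsertBeforeDemo {
    static final String PART_XML =
        "<parts><part sku=\"101\"><desc>Ball Bearing</desc>"
        + "<maker>S.K.F.</maker><instock>Yes</instock>"
        + "<price>$20.00</price></part></parts>";

    // Applies the insert and returns the child element names of the part.
    static List<String> childNamesAfterInsert() {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(PART_XML)));
            // Same select expression as the XUpdate example.
            Node maker = (Node) XPathFactory.newInstance().newXPath().evaluate(
                "/parts/part[@sku='101']/maker", doc, XPathConstants.NODE);
            Element model = doc.createElement("model");
            model.setTextContent("BAB-101");
            maker.getParentNode().insertBefore(model, maker);

            List<String> names = new ArrayList<>();
            for (Node n = maker.getParentNode().getFirstChild();
                 n != null; n = n.getNextSibling()) {
                if (n.getNodeType() == Node.ELEMENT_NODE) {
                    names.add(n.getNodeName());
                }
            }
            return names;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Prints [desc, model, maker, instock, price]
        System.out.println(childNamesAfterInsert());
    }
}
```

The resulting element order matches Listing 4. With XUpdate, the database applies the change for you, so the client never has to load and rewrite the whole document.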
You can see that XUpdate is a very simple language used to update XML content. You can also use the Java API to achieve these modifications. Listing 5 shows a simple Java program that performs the xupdate:insert-before operation as described earlier. This program is similar to the Parts.java program in Listing 3 that prints the XML content from the database.
Listing 5. Java program to perform xupdate:insert-before
import org.xmldb.api.base.*;
import org.xmldb.api.modules.*;
import org.xmldb.api.*;
public class PartsXupdate {
    public static void main(String[] args) throws Exception {
        Collection col = null;
        try {
            String driver = "org.apache.xindice.client.xmldb.DatabaseImpl";
            Class c = Class.forName(driver);
            Database database = (Database) c.newInstance();
            DatabaseManager.registerDatabase(database);
            col = DatabaseManager.getCollection("xmldb:xindice:///db/partsdb");
            // Double quotes inside the Java string are escaped, and the
            // XPath predicate uses single quotes inside the select attribute.
            String xupd = "<xupdate:modifications version=\"1.0\" " +
                "xmlns:xupdate=\"http://www.xmldb.org/xupdate\">" +
                "<xupdate:insert-before select=\"/parts/part[@sku='101']/maker\">" +
                "<xupdate:element name=\"model\">BAB-101</xupdate:element>" +
                "</xupdate:insert-before>" +
                "</xupdate:modifications>";
            /* We are using XUpdateQueryService */
            XUpdateQueryService service =
                (XUpdateQueryService) col.getService("XUpdateQueryService", "1.0");
            service.update(xupd);
        }
        catch (XMLDBException e) {
            System.err.println("XML:DB Exception occurred " + e.errorCode);
        }
        finally {
            if (col != null) {
                col.close();
            }
        }
    }
}
Note that instead of using XPathQueryService, I used XUpdateQueryService.
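Quoting is the fiddly part of embedding XUpdate in a Java string, because the select attribute, the XPath predicate, and the Java literal all use quote characters. One way to reduce mistakes is to assemble the XUpdate text with a small helper that escapes attribute values, and then parse the result back as a sanity check. The sketch below assumes nothing beyond the JDK. XUpdateBuilder and its methods are hypothetical names, not part of Xindice or the XML:DB API.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical helper: builds an xupdate:insert-before document as a
// string, escaping values so the result stays well-formed XML.
final class XUpdateBuilder {
    static final String XUPDATE_NS = "http://www.xmldb.org/xupdate";

    // Minimal escaping for text and double-quoted attribute values.
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;");
    }

    static String insertBefore(String select, String elementName, String text) {
        return "<xupdate:modifications version=\"1.0\" xmlns:xupdate=\""
            + XUPDATE_NS + "\">"
            + "<xupdate:insert-before select=\"" + escape(select) + "\">"
            + "<xupdate:element name=\"" + escape(elementName) + "\">"
            + escape(text) + "</xupdate:element>"
            + "</xupdate:insert-before>"
            + "</xupdate:modifications>";
    }

    // Parses the XUpdate text and returns the select attribute as the
    // parser sees it, which confirms the string is well-formed.
    static String parsedSelect(String xupdate) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xupdate)));
            Element op = (Element) doc
                .getElementsByTagNameNS(XUPDATE_NS, "insert-before").item(0);
            return op.getAttribute("select");
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String select = "/parts/part[@sku=\"101\"]/maker";
        String xupd = insertBefore(select, "model", "BAB-101");
        System.out.println(xupd);
        System.out.println(parsedSelect(xupd).equals(select));
    }
}
```

The string returned by insertBefore could be passed to service.update in place of a hand-concatenated literal. The helper covers only the insert-before operation and would need extending for the others.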
Conclusions
In this article, you have seen how Native XML Database Systems can be used to build XML-based applications. NXDs are not going to replace RDBMSs. In most cases, NXDs and RDBMSs are used together. NXDs are definitely going to play a major role in the future.
Resources
- Download Xindice from www.xindice.org or apache.org.
- Find information about XML and Databases on Ron Bourret's Web site.
- Check out XML Data Management, edited by Akmal B. Chaudhri, Awais Rashid, and Roberto Zicari (Addison-Wesley, ISBN 0201844524). This book is full of useful information on XML database product architectures, case studies, and performance benchmarks.
- If you are interested in storing XML in a Relational Database System, look at Designing XML Databases by Mark Graves (Prentice-Hall, ISBN 0130889016). This useful book describes several approaches and provides Java code examples for a number of modern RDBMSs.
- Explore Dr. Richard Edwards' work on XML Repositories and minx, his generic architecture for storing XML Documents in a Relational Database.
About the author
Arun Gaikwad is an independent software consultant. He has more than 11 years of experience in design, development, and implementation of large software projects. He can be reached at agaikwat@comcast.net.