Level: Intermediate
Brett McLaughlin (brett@oreilly.com), Author, O'Reilly and Associates
17 Apr 2003
This tip introduces the XMLWriter class, a specialized SAX filter that handles output of stream-based XML. The tip also examines DataWriter, a subclass of XMLWriter that offers even more output capabilities. Both classes are examined in the context of handling the output of large XML documents and datasets.
After the last two tips, you should be quite comfortable using the XMLFilter interface.
If you haven't read those two tips, you should go back and work through them now; in particular, familiarize yourself
with the various examples from the last tip. This tip introduces two new classes, both of which are implementations
of the XMLFilter interface. These classes will expand your ability to manipulate XML data using the SAX API, and begin to lay the foundation for the output of large XML documents, which is of course the focus of this series.
The XMLWriter class
As you'll recall from the last two tips, XMLFilterImpl provides a basic, default implementation
of the XMLFilter interface. Most filters actually extend XMLFilterImpl
to get this basic behavior for free, without having to code up a lot of filler
methods. The first class I look at in
this tip follows that pattern: XMLWriter is an extension of
XMLFilterImpl. This specific filter, as its name implies, allows you to use SAX to do
something it really wasn't intended to do -- write XML.
Before getting into the details of how XMLWriter works, though, you need to
set up your environment. First, visit David Megginson's Web site (see Resources), and follow the
"Other Free Software" link. Then download the XMLWriter package; as of this writing, I'm using version 0.2.
Extract the downloaded archive, and add xml-writer.jar to your Java classpath. You can use the
-cp argument to the java command, the
-classpath argument to the javac command, or the
CLASSPATH environment variable. That .jar file gives you access to the relevant classes.
Actually using XMLWriter is a snap. The constructor used here takes two parameters: the first,
as with all filters, is an instance of XMLReader, and the second is an instance of
java.io.Writer. This writer should be attached to an output sink of your choosing,
such as a file. You then insert the XMLWriter into your filter chain: the writer wraps its parent
XMLReader, and any further filter wraps the writer. Listing 1 shows a simple example.
Listing 1. Using XMLWriter
XMLReader reader = XMLReaderFactory.createXMLReader(vendorParserClass);
XMLWriter writer = new XMLWriter(reader, new FileWriter("outputFile.xml"));
RemoveNamespaceFilter filter =
    new RemoveNamespaceFilter(writer, "http://www.ibm.com/developer");
ContentHandler contentHandler =
    new ApplicationContentHandler(someDataParameter);
ErrorHandler errorHandler =
    new ApplicationErrorHandler(someOutputStream, someErrorThreshold);
filter.setContentHandler(contentHandler);
filter.setErrorHandler(errorHandler);
InputSource inputSource = new InputSource(myXmlUri);
filter.parse(inputSource);
Plugging in the writer like this doesn't change the input XML, the behavior of the other filters, or the events that
your ContentHandler ultimately receives. So you can insert writers into your chain
any time you wish, without affecting the processing you have already set up.
What the writer does do, though, is allow for a quick snapshot of the XML data at the point in the chain where the writer
sits. SAX events flow outward from the XMLReader through each filter that wraps it. In Listing 1, the XMLReader parses
the input and reports events to the XMLWriter, which writes them to the supplied Writer (attached to a file in this case)
and then passes them on unchanged to the RemoveNamespaceFilter. That filter removes the specified namespace and hands
the result to the registered ContentHandler. The file therefore captures the XML as it looked before the namespace was
removed; to capture the filtered XML instead, construct the RemoveNamespaceFilter around the XMLReader and wrap the
XMLWriter around that filter. The net effect of all this is a snapshot of your XML data, which can be a very useful thing;
recall that at no point in this process is the XML stored in memory. Instead, the SAX API is just streaming data, and
one filter happens to be writing that data as it is streamed.
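Where a writer sits in the chain determines what it captures. The following is a minimal sketch of that idea using only classes that ship with the JDK, so it runs without xml-writer.jar. TagLogFilter is a hypothetical, much-simplified stand-in for XMLWriter (it records element tags only), and UpperCaseFilter is a hypothetical filter invented purely for illustration; neither is part of David Megginson's package.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.Writer;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.XMLFilterImpl;

class SnapshotDemo {

    /** Hypothetical stand-in for XMLWriter: records each tag, then forwards the event. */
    static class TagLogFilter extends XMLFilterImpl {
        private final Writer out;

        TagLogFilter(XMLReader parent, Writer out) {
            super(parent);
            this.out = out;
        }

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts)
                throws SAXException {
            write("<" + qName + ">");
            super.startElement(uri, localName, qName, atts); // pass the event on unchanged
        }

        @Override
        public void endElement(String uri, String localName, String qName) throws SAXException {
            write("</" + qName + ">");
            super.endElement(uri, localName, qName);
        }

        private void write(String s) throws SAXException {
            try {
                out.write(s);
            } catch (IOException e) {
                throw new SAXException(e);
            }
        }
    }

    /** Hypothetical filter that upper-cases element names before forwarding them. */
    static class UpperCaseFilter extends XMLFilterImpl {
        UpperCaseFilter(XMLReader parent) {
            super(parent);
        }

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts)
                throws SAXException {
            super.startElement(uri, localName.toUpperCase(), qName.toUpperCase(), atts);
        }

        @Override
        public void endElement(String uri, String localName, String qName) throws SAXException {
            super.endElement(uri, localName.toUpperCase(), qName.toUpperCase());
        }
    }

    /** Returns { snapshot taken before the filter, snapshot taken after the filter }. */
    static String[] capture(String xml) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            XMLReader reader = factory.newSAXParser().getXMLReader();

            StringWriter before = new StringWriter();
            StringWriter after = new StringWriter();

            // Events flow: reader -> rawLog -> upper -> filteredLog
            TagLogFilter rawLog = new TagLogFilter(reader, before);
            UpperCaseFilter upper = new UpperCaseFilter(rawLog);
            TagLogFilter filteredLog = new TagLogFilter(upper, after);

            // Calling parse() on the outermost filter drives the whole chain.
            filteredLog.parse(new InputSource(new StringReader(xml)));
            return new String[] { before.toString(), after.toString() };
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String[] snapshots = capture("<a><b/></a>");
        System.out.println("before filter: " + snapshots[0]);
        System.out.println("after filter:  " + snapshots[1]);
    }
}
```

The log that wraps the reader directly sees the element names as parsed, while the log that wraps UpperCaseFilter sees them after the filter has run. The same placement choice applies when you decide where to put a real XMLWriter.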
The DataWriter class
While XMLWriter is useful in the cases where you are reading in XML and immediately outputting
it, or are outputting a filtered set of the data, at times you want to output data that is not already
in an XML format. This is the case where you have data pulled from a database, or user input on Web forms, or some
other programmatically obtained information that you need to turn into XML for storage purposes. Using
a filter doesn't help much. However, the DataWriter class is perfect for these cases.
Note: Many of the methods detailed here are actually inherited by DataWriter
from its parent class, XMLWriter. However, DataWriter adds some
important methods of its own to the mix, and I urge you to always use DataWriter when
you don't need filter capability.
First, you need to create a new instance of DataWriter. You can pass into the constructor
a java.io.Writer, such as a FileWriter. You can then use the
variety of output methods that DataWriter has to emit XML, without having to worry
about syntax. Listing 2 shows an example.
Listing 2. Using DataWriter
DataWriter w = new DataWriter(new FileWriter("guitars.xml"));
w.startDocument();
w.startElement("guitars");
w.startElement("guitar");
w.dataElement("brand", "Collings");
w.dataElement("model", "D2HAV custom");
w.endElement("guitar");
w.startElement("guitar");
w.dataElement("brand", "Martin");
w.dataElement("model", "D-28 custom");
w.endElement("guitar");
w.endElement("guitars");
w.endDocument();
This is pretty basic, and would output the XML shown in Listing 3.
Listing 3. Output from Listing 2
<?xml version="1.0" standalone="yes"?>
<guitars>
<guitar>
<brand>Collings</brand>
<model>D2HAV custom</model>
</guitar>
<guitar>
<brand>Martin</brand>
<model>D-28 custom</model>
</guitar>
</guitars>
Easy enough, isn't it? Because DataWriter implements the SAX ContentHandler callbacks, you can use any of them to emit XML;
for example, you can create elements with attributes by using
startElement(String uri, String localName, String qName, Attributes atts).
DataWriter also provides several convenience methods that don't require attributes, like those shown in Listing 2, as well as methods for empty elements.
One other extremely useful method is setIndentStep(), which sets the number of spaces used for each level of
indentation. Listing 4 is simply Listing 2 with this extra method call.
Listing 4. Using DataWriter and indentation
DataWriter w = new DataWriter(new FileWriter("guitars.xml"));
w.startDocument();
w.setIndentStep(2);
w.startElement("guitars");
w.startElement("guitar");
w.dataElement("brand", "Collings");
w.dataElement("model", "D2HAV custom");
w.endElement("guitar");
w.startElement("guitar");
w.dataElement("brand", "Martin");
w.dataElement("model", "D-28 custom");
w.endElement("guitar");
w.endElement("guitars");
w.endDocument();
The result is the XML shown in Listing 5, which is nicely formatted and indented.
Listing 5. Output from Listing 4
<?xml version="1.0" standalone="yes"?>
<guitars>
  <guitar>
    <brand>Collings</brand>
    <model>D2HAV custom</model>
  </guitar>
  <guitar>
    <brand>Martin</brand>
    <model>D-28 custom</model>
  </guitar>
</guitars>
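As mentioned above, the full SAX startElement(String uri, String localName, String qName, Attributes atts) callback lets you emit elements that carry attributes. The sketch below shows the shape of such a call. So that it runs with the JDK alone, it swaps in the JDK's TransformerHandler as the serializer rather than DataWriter; the attribute name and value ("type", "acoustic") are made up for illustration. Since DataWriter is also a SAX ContentHandler, the startElement, characters, and endElement calls should carry over, but treat that as an assumption to check against the package's own documentation.

```java
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.helpers.AttributesImpl;

class AttributeDemo {

    /** Emits a small document through SAX callbacks and returns it as a string. */
    static String build() {
        try {
            // The JDK's identity TransformerHandler serializes the SAX events it receives.
            SAXTransformerFactory factory =
                (SAXTransformerFactory) TransformerFactory.newInstance();
            TransformerHandler handler = factory.newTransformerHandler();
            handler.getTransformer().setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");

            StringWriter out = new StringWriter();
            handler.setResult(new StreamResult(out));

            // Attribute list for <guitar>; "type" is a hypothetical attribute.
            AttributesImpl guitarAtts = new AttributesImpl();
            guitarAtts.addAttribute("", "type", "type", "CDATA", "acoustic");

            // Elements without attributes still need a non-null Attributes object.
            AttributesImpl noAtts = new AttributesImpl();

            handler.startDocument();
            handler.startElement("", "guitar", "guitar", guitarAtts);
            handler.startElement("", "brand", "brand", noAtts);
            char[] text = "Martin".toCharArray();
            handler.characters(text, 0, text.length);
            handler.endElement("", "brand", "brand");
            handler.endElement("", "guitar", "guitar");
            handler.endDocument();

            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(build());
    }
}
```

AttributesImpl comes from org.xml.sax.helpers, which is part of SAX itself, so the attribute-building code does not depend on which serializer sits behind the callbacks.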
Wrapping up
As in the second tip of this series, which introduced XML filters, this article has really just given you the ground rules
for a new set of utility classes. At this point, you should take some time to get familiar with these new classes, and
make sure you're comfortable with both XMLWriter and DataWriter. Then,
in the next tip, I'll begin to illustrate some practical uses of these two writer classes. I'll look at several examples
of how to completely avoid the overhead of keeping a document in memory while still outputting your XML, regardless of the
size of the underlying dataset. Until then, have fun with the API detail in this tip, and I'll see you online.
Resources
About the author
Brett McLaughlin has been working in computers since the Logo days (remember the little triangle?). He currently specializes in building application infrastructure using Java-related technologies. He has spent the last several years implementing these infrastructures at Nextel Communications and Allegiance Telecom, Inc. Brett is one of the co-founders of the Java Apache project Turbine, which builds a reusable component architecture for Web application development using Java servlets. He is also a contributor to the EJBoss project, an open source EJB application server, and Cocoon, an open source XML Web-publishing engine.