Level: Introductory Rodolfo Raya (rmraya@maxprograms.com), Consultant, developer, co-owner, Maxprograms
01 Feb 2001 This article explains how to generate printable reports from a database using Java- and XML-related technologies like Xalan, Xerces, and FOP. It demonstrates how Java and XML make it possible to develop reporting tools that work under different operating systems using the same source code.
My company has a problem. One of our clients wants to use
Linux, and we need to port a large set of database applications from Windows to
Linux. Those applications were written in non-portable C++, and we will rewrite
them in Java. We selected Java because we need applications that work in both
operating systems using the same source code to simplify maintenance. But we
don't like the options we currently have for database reporting in Java. What my company needs is a way to generate reports from the data stored in the database, while
keeping the report format and content outside the application. Over the years
we've discovered that a high percentage of requests for changes are related to
printed reports. If we keep the report outside the application, we don't have to
recompile anything when we receive a request like "move this title to the left."
There are commercial products that make it possible to do this, but after
testing and evaluating several report tools for Java, we decided to create our
own. In this article I will describe the reporting tool we are working on. What we have today
A key piece in the set of Windows applications
is a report generator program I wrote using C++ Builder. That tool creates a
visual composition of the report layout, making reports maintenance an easy
task. Reports are currently stored in a proprietary binary format, and without
the report generator no one can edit them. This program only runs in Windows,
and because it uses third-party Delphi components without source code, I'm unable to
port it to Linux. We need either a replacement tool to edit reports in Windows
and Linux, or we need to write the reports in plain text so we can use any editor. We also have a set of Delphi components that we use in the rest of our
Windows applications. These components are used to read the report definition
from disk, open the database, extract the data, and show a preview or print a
nice report. We can't use these components in Linux, which means we can't use our
reporting tools in that OS. The tool I wrote using C++ Builder works great when we have a local copy of
the database available. Most of the time,
unfortunately, data is stored in servers outside of our office. We use Telnet to
administer remote sites, but when we are working with a Linux server we simply
can't run the report editor remotely. To download a copy of the database for
local use generates another problem: the database usually is too large.
What we want tomorrow
As stated before, we need a new way to
generate database reports. Our wish list for the new reporting system is very
long, but at minimum we would like the following:
- To store reports in plain ASCII text, allowing us to use any editor to
update them. We use Telnet to administer remote Linux boxes, and it should be
possible to use character-based editors like vi.
- A set of JavaBeans that we can use in our applications to read and
display/print reports.
- A stand-alone application that we can use in Windows and Linux to
preview/print any report, especially from the command line under Linux.
Proposed solution
The solution I'm working on involves storing reports in XML format. An XML
document is beautiful plain text, and the tools I'll need for development (Java
compilers and IDEs) are available in both target platforms, Linux and Windows.
Since a picture is worth a thousand words, I have included Figure 1 to provide a visual description of the planned solution.
Figure 1. The solution

Report definitions have to describe the data that will be presented to
the final reader and the appearance that the document should have. The report
type we use is called a "banded report" -- a set of data sources plus a page
layout composed of different band types. For the sake of simplicity, this
article will focus only on three band types: page header, page footer, and
detail band (the zone where the data will be listed). Figure 2 shows a sample of a report layout (left side) and a sample of the output
that can be generated (right side). In the example, bands are filled with static
text (labels) and data fields; in the real world, bands include graphics, bar
codes, calculations, and other things we will leave aside. XML will be used by our new reporting system to describe the source of the
data, the size and content of the header and footer of each page, and the origin
and distribution of the data being reported.
Figure 2. A sample report layout and output

Processing a report can consume a lot of database resources, so we must be
sure we are dealing with a valid document before we start. It's not funny to
discover that the report definition is incomplete after fetching thousands of
records from the server. Remember that sometimes we will use text editors that
lack parsing capabilities, and we may make mistakes easily. A missing XML tag
can leave the report processor in an infinite loop. To validate an XML file, we
need to check if it complies with a Document Type Definition (DTD). As we didn't
have a DTD ready to use, I had to write one from scratch, as displayed in the
next section.
Generating a DTD
The DTD should model the report completely. Using
a DTD editor or a text editor (see Resources), we
start defining the root element for every report, as shown in Listing
1:
Listing 1: Defining a root element for
reports
Basic items in a report:
- Page size.
- SQL statement for getting data.
Additional attributes for the report:
- Report name
- Report title
- DTD version number
| <!ELEMENT Report
(Page,SQLSatement)><br /><!ATTLIST Report<br />Name ID
#REQUIRED<br />Title CDATA #REQUIRED<br />Version CDATA #FIXED
"1.0"<br />> |
Now our document has page formatting and one SQL statement. We use an SQL
query to get the data displayed in the detail band, and sometimes we use
additional queries to summarize results in the last page or to show specific
information in header bands. This sample is limited to one query only. The attribute Version will be used in future versions to ensure
backward compatibility. The report processor bean (see Figure 1) should check what version of the DTD was used to generate the report.
Defining a page
Before we define what a page is in XML we need to
understand why we need a page at all.
A page is the canvas on which we will design our report. It has
size (Height and Width ) and contains one or more report
bands. In XML terms we define our page as: <!ELEMENT Page (ReportBand+)>
<!ATTLIST Page
Height CDATA #REQUIRED
Width CDATA #IMPLIED
>
|
The definition of ReportBand is straightforward: It is a
placeholder for all the elements (ReportElements ) we want to print.
A ReportBand has a band type attribute that is used to select the
position of the band when rendering the final view (header, footer or detail
band). The ReportElements we will use are restricted to Labels
and DataFields (see the DTD for more details). Data for the report will be gathered from JDBC sources, and we need to
specify this, as shown in Listing 2: <!ELEMENT SQLStatement EMPTY>
<!ATTLIST SQLStatement
Name ID #REQUIRED
Username CDATA #REQUIRED
Password CDATA #REQUIRED
Statement CDATA #REQUIRED
Driver CDATA #REQUIRED
URL CDATA #REQUIRED
> |
The DTD should also describe additional elements, such as:
- ReportElement
- Label
- DataField
- ReportElementType
- Font
The above elements are defined in a simplified version of the real DTD. Please take some time to browse the DTD; it will be easier for you to understand the report model after you read it.
Writing an XML report definition
Now that we have a DTD that
defines our documents, we can use any editor to write a test report. It doesn´t
matter if the editor is a validating one or not. We will use the DTD with our
processor bean to verify the report. Our sample report begins with something
like:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE Report SYSTEM
"file://localhost/C:/MXReports/MXReports.dtd">
<Report Title="Currencies" Name="ID0" Version="1.0">
|
The first line lets the reader (human or application) know the version of XML
and the character set that are used in the document. The second line explains
the type of root node being used, and shows the DTD that should be used for
validation (make sure you use the right location of the DTD when you write an
XML document). The last line defines attribute values for the root node. The RDBMS behind this project is InterBase, and the sample SQL statement for
this article will use InterBase syntax and Employee.gdb, an example database
included in the standard distribution. Don't worry if you don't have InterBase;
any database with JDBC driver can be used if you want to test these ideas. Employee.gdb (the example database that we will use) includes a table named
COUNTRY that has two fields: COUNTRY and CURRENCY. We will write an XML report
listing the contents of those fields. The SQL statement to retrieve the data
from the database is:
SELECT Country, Currency FROM Country ORDER BY Country
|
This translates to:
<SQLStatement Username="SYSDBA" Password="masterkey"
Driver="interbase.interclient.Driver"
URL="jdbc:interbase://localhost/c:\mxreports\employee.gdb"
Statement="SELECT Country, Currency FROM Country ORDER BY Country"
Name="ST1"/>
|
The sample report, which is available separately (see Resources), includes a header band with a title label and a detail band with two data fields (country + currency).
The report must have a detail band, or it will be useless. The DTD that
is provided asks for at least one band but does not require it to be a
DetailBand. Extracting data from the database
At this stage we have a database, a sample report, and a DTD. We need something to combine those
elements, generating something more useful. There is a box in Figure
1 that represents a report processor bean. That bean is used in a Java
application to generate an intermediate document also in XML format. The steps to generate the intermediate document are:
- Read and parse the XML report
- Establish a connection to the database and execute the
SELECT statement
specified in the report
- Generate the content of the first header band
- Iterate through all the rows of the query's result set generating content
for the detail bands
- Check the number of detail bands generated, inserting a new footer/header
pair when the page size demands it
The code in Listing 3 illustrates the first part of the
process listed above: Listing 3: Reading and parsing the
report
// create a parser and read the document
String reportFileName = args[0];
FileInputStream file = new FileInputStream(reportFileName);
DocumentBuilderFactory dFactory = DocumentBuilderFactory.newInstance();
dFactory.setValidating( true );
DocumentBuilder docBuilder = dFactory.newDocumentBuilder();
Document doc = docBuilder.parse( file );
Element root = doc.getDocumentElement();
|
// get page size from the document
NodeList nl = root.getElementsByTagName("Page");
Element pg = (Element)nl.item(0);
pageWidth = new Float( pg.getAttribute("Width") );
pageHeight = new Float( pg.getAttribute("Height") );
|
// read SQL related stuff from the document
NodeList nl2 = root.getElementsByTagName("SQLStatement");
Element sql = (Element)nl2.item(0);
driver = sql.getAttribute("Driver");
url = sql.getAttribute("URL");
statement = sql.getAttribute("Statement");
username = sql.getAttribute("Username");
password = sql.getAttribute("Password");
|
// open the database and get the data
Class.forName ( driver );
DriverManager.setLogStream(System.out);
Connection con = DriverManager.getConnection( url, username, password );
Statement stmt = con.createStatement ();
ResultSet rs = stmt.executeQuery ( statement ); |
The code shown in Listing 3 allows us to load a report from disk and store it in
memory as a validated XML tree. It shows you how to get data stored in the
report, such as page size and all the parameters needed to establish a
connection to the database. The variable doc in the example holds a complete version of the
report in what is called a DOM (Document Object Model) tree. We can keep
the whole report definition in memory because, as is always the case, it's a
small document. The data used to generate the report will be fetched one record
at a time, and it doesn't matter how large the data set is. The rest of the source code used to read the report into memory is available
for download (see Resources). For the
sake of brevity, I have omitted all exception catching and several variable
declarations belonging to this excerpt.
Generating the final view
Once we are able to read the XML report
definition and get the data from the database with our program, we face a
problem concerning what we should do with what we have. I've played with the
following options:
- Generate a set of graphics, each of them representing a page, and render
them with a special viewer.
- Generate an intermediate XML document and use XSL transformations to
obtain different output formats.
The first option is exactly what our current C++ software does. It works for us and
is enough for the moment. However, the second option provides additional benefits for the
future. If you check Figure 1 again, you will see that option 2 was the one we selected for the new reporting system. My first idea was to use a style sheet to transform the intermediate XML document into HTML, but then I remembered that we need printed output. HTML would make it hard to use footer bands in every page,
and HTML printing varies with every browser.
FOP versus RTF
Instead of generating HTML, there's the option of
generating PDF files using FOP (Formating Objects Processor), a utility
developed by Apache.org. According to Apache's FOP documentation: "FOP is the
world's first print formatter driven by XSL formatting objects. It is a Java
application that reads a formatting object tree and then turns it into a PDF
document. The formatting object tree can be in the form of an XML document ... or
can be passed in memory as a DOM document ... or SAX events" (see Resources). An issue that concerns me about using FOP is the possible need to have Acrobat Reader installed on the computer for viewing the result. Another option is to generate RTF files from our intermediate document using
a specialized style sheet. Microsoft has made available the whole definition of
RTF files. While I would prefer to use RTF, it may be simpler to use FOP because
developing a style sheet to generate RTF could be a difficult task, and I need
this project working not now but yesterday. While drafting the code used to read and parse the report, I decided to use
Xerces and Xalan from Apache. Because, like InterBase, they are open-source and
multiplatform projects, I figured I wouldn't have to be tied to proprietary,
unportable code again. There is a problem now, as I'm unable to compile the
latest FOP release using the latest version of Xalan. (Xalan, Xerces, and FOP
are different projects and they evolve at different rates.) However, the latest
version of FOP for beginners includes compatible versions of Xalan and Xerces.
Using the set of JAR files that were in the distribution for beginners, I can
keep my work moving forward. So it's clear that I decided to go with FOP. Using Xalan with the code in the
previous section, we built a tree that represents our report definition, and we
have a connection to the database ready. Now we have to do something with that
tree and the database. We should modify the report processor bean and make it
generate the intermediate XML document. The intermediate XML document that we have to generate must have a header,
and we write one to the output device with the code in Listing 4. Listing 4: Writing the header
System.out.println("<?xml version=\"1.0\" encoding=\"UTF-8\"?>");.
System.out.println("<report xmlns:fo=\"http://www.w3.org/1999/XSL/Format\">");
|
I write to the output device, and you should redirect it if you want to save
the resulting document to a file for later reuse. Notice that I don't use
XMLReports.DTD anymore but use the DTD used by FOP instead. After writing the header, I need to process all report elements that belong
to the HeaderBand. For each element in the HeaderBand branch of our tree, I have
to:
- Check its type (Label or DataField)
- If it is a DataField, get its value from the data source
- Specify font, size, and position to use when rendering
- Write everything to the output device using XML format
When I loaded the report into memory, I obtained the page size and the
heights of all bands. With a simple calculation, I know how many detail rows fit
in our canvas: Listing 5: Calculating the detail
rows
numberOfRows = (pageHeight - headerHeight - footerHeight) / detailHeight; |
After processing the header band, I have to read and process
NumberOfRows records from the database using the same method used to
process the elements of the header. When I am done reading all data that fits in
one page, I have to process the FooterBand and, if there still are records
available, write a new page header starting over and over. Seems easy, doesn't it? Doing this repetitive work, I created an XML document
that ideally would serve as input to FOP. The ideal document should contain only
a set of fo (formating objects) as specified in FOP's documentation.
But the hard part is ahead. I need an XSL style sheet that FOP can use to
transform my data into the desired PDF file.
How to use FOP
If you download the FOP version for beginners, you
will get a lot of samples that are really easy to follow. After unzipping FOP,
you will find xml2pdf.bat and xml2pdf.xsl at the root directory. The latter is
the XSL style sheet to use if you want to continue working with this project. So
the style sheet is right there and ready to use. When you have an XML document ready and the right XSL style sheet (name it
xml2pdf.xsl), all you have to do is run xml2pdf.bat. Simply provide your XML
document, style sheet, and output PDF file as parameters for the batch file. Once
you have FOP properly installed, go to FOP's root directory and test it
with:
xml2pdf test.xml xml2pdf.xsl test.pdf
|
The sample program does not generate a document that you can use to
successfully build a PDF file using xml2pdf.xsl. Remember that this is just an explanation of a new
approach, and that here we are dealing with a small subset of the options a
report could have. Looking at xml2pdf.xsl, you will be able to identify the
missing pieces in this puzzle.
Alternative paths
When we started researching a solution to our
reporting problem, we found several options that we still keep under
consideration, like the generation of RTF files that could also be used in any
word processor or spreadsheet. Applying XSL on the intermediate document to generate a large HTML document as
an alternative reporting method creates a problem with page footers, but having
a set of HTML pages could be a good option. Sun developed Java Help, a help
viewer similar to the help system in Windows 98/2000 and Internet Explorer
4.x/5.x. Java Help may be used to compile and preview/print a help document
that would meet our requirements. Note: This would be the easiest path to take
if you don't care too much about the differences between the previewed and the
printed version.
Additional enhancements
So far I have described how you can
generate a report in PDF format from the data stored in a RDMS. The method I
explained uses several tools to achieve the desired goal. With our current tool
(the one I wrote in C++), we can do design and print or preview a report using
only one program. My dreams for the future tool include a WYSIWYG editor that
can use the DTD and help build SQL queries visually and preview or print the
final report, all functions in one program.
Conclusion
In this article I tried to explain how Java and XML are
helping my company replace a database reporting system that is tied to one
operating system. Searching for a cross-platform solution, I found promising
tools like Xalan and FOP. There's still a lot of development and testing to do
before we release a new reporting system, but the results obtained so far
justify all the effort invested. I invite you to send me your comments about the article. And more: I invite you to send me your ideas to improve this project. I would like to see a
professional database reporting tool emerge from this sketch.
Download | Name | Size | Download method |
|---|
| x-dbrep.zip | 3KB | HTTP |
Resources - Visual DTD, now incorporated into the XML and Web Services Development Environment from alphaWorks.
- Windows Notepad. It might look like a joke, but I use it because it is a
nice tool.
-
Xeena
I found it the only usable editor when working with large DTDs like
DocBook. Providing the right XSL style sheet you can perform a transformation
and preview your document. However, you will only see a tree representation
when editing.
-
Morphon
A commercial XML editor that allows you to define cascaded style sheets to
preview the document while writing. A nice candidate as a WYSIWYG editor to
use in the future.
-
JDK 1.2
-
Xalan 2
- Download x-dbrep.zip to work with the following code files:
- MXReports.DTD
This is really an HTML file and not a DTD to avoid confusing your browser.
-
ReportLoader.java This
program loads a report in a DOM tree and establishes a connection to a
database using parameters defined in the report.
-
Currency.xml is a
sample report easy to understand. This is also an HTML file to view in your browser.
- A database with suitable JDBC driver. If you use InterBase you will need
jndi.jar and jdbc2_0-stdext.jar that are available at java.sun.com
About the author  | 
|  |
Rodolfo M. Raya works for Maxprograms as consultant/developer/co-owner, trying to find the magic formula to fit 8 days in a week. No one knows why, after he spent 13 years
programming in C/C++, he is dealing now with Java, XML, and all those new
technologies that take time to understand and digest. This article is dedicated
to Martín, his little baby who makes sleeping at night another difficult task.
You can contact Rodolfo at rmraya@maxprograms.com. |
Rate this page
|