Leverage XSLT to build applications

Moving beyond format transformations to multi-tiered solutions

Level: Intermediate

Chen Shu (chenshu@us.ibm.com), Software Engineer, IBM
Nianjun Zhou (jzhou@us.ibm.com), Advisory Software Engineer, IBM
Dikran Meliksetian (Dikran_Meliksetian@us.ibm.com), Senior Technical Staff Member, IBM

25 Mar 2003

This article describes a methodology for building an XML-based, end-to-end, multi-tiered solution by leveraging XSLT technology. The authors introduce this methodology through an example application in which XSLT is not only used in the transformation at the presentation layer, but also in retrieving data from heterogeneous data repositories and generating data-centric XML documents at the back-end. This application also provides data computation, such as statistical analysis in the middle tier.

As a widely accepted standard data format, XML lets multiple application components integrate seamlessly. Here is how the W3C describes XSL:

XSL is a language for expressing stylesheets. It consists of three parts: XSL Transformations (XSLT): a language for transforming XML documents, the XML Path Language (XPath), an expression language used by XSLT to access or refer to parts of an XML document. (XPath is also used by the XML Linking specification). The third part is XSL Formatting Objects (XSL-FO): an XML vocabulary for specifying formatting semantics. An XSL stylesheet usually specifies the presentation of a class of XML documents by describing how an instance of the class is transformed into an XML document that uses the formatting vocabulary.

However, XSLT can be used to perform additional tasks within an application that uses XML as its main data representation model. (See Resources for more information on XSLT, XPath, and XSL-FO.)

In this article, we demonstrate how to build an application where XSLT is used beyond its traditional role of format transformation. Within the example application, we leverage XSLT to accomplish the following tasks:

  • Transform the repository data from a relational representation to XML
  • Perform statistical analysis of the data represented as an XML document
  • Generate an XML document based on a particular business logic
  • Render the XML as HTML, WML, and VXML

This application demonstrates that you can easily create search applications using legacy information and serve the results in multiple output formats by adding the proper XSL transformation scripts. The methodology can be used in a variety of applications that need simple data analysis and data format transformation.

This article is organized as follows. In the next section, we briefly introduce the example application and the requirements upon which we built it. In the following section, we describe the architecture of our solution, followed by a detailed description of the XML transformations required within the application. Finally, we conclude by stressing the versatility of the solution with respect to changes in requirements, as well as input and output data formats.

The example

The example application is a search framework that attempts to minimize the sequence of questions and answers required to find a searched object. The system is called Guided Adaptive Search Framework (GASF). Figure 1 depicts the process flow of the GASF system.


Figure 1. Flow of GASF

To illustrate this process, let's consider a particular implementation of GASF: an employee directory search system for a large corporation with many branches in multiple cities. An end user wants to find specific information, such as the mailing address of a person named John Smith who works in New York City. The system prompts the user with a list of questions (the questionnaire), which might include first name, last name, city, telephone number, and whether this person is a manager. The user selects a question to answer (for example, first name) and gives the answer (let's say "John"). The system performs a repository search and finds a large number of results. The system then creates a second questionnaire by removing the question that was already answered and reordering the remaining questions based on the statistical distribution of their possible values. The user is presented with this second questionnaire. This sequence is repeated until either the person is found or it is determined that no such person exists. One objective of the process and the statistical analysis is to minimize the number of question-answer cycles.
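The question-answer cycle described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the in-memory `records` list stands in for the real repository, and the names `search` and `next_step` are hypothetical, not part of GASF.

```python
from typing import Dict, List

Record = Dict[str, str]


def search(records: List[Record], answers: Dict[str, str]) -> List[Record]:
    """Keep the records that match every answer given so far."""
    return [r for r in records
            if all(r.get(name) == value for name, value in answers.items())]


def next_step(matches: List[Record]) -> str:
    """Decide how the cycle continues for the current result set."""
    if not matches:
        return "not-found"   # no such person exists
    if len(matches) == 1:
        return "found"       # the searched object is identified
    return "ask-again"       # build another questionnaire


# Hypothetical sample data standing in for the employee repository.
records = [
    {"firstname": "John", "lastname": "Smith", "city": "New York"},
    {"firstname": "John", "lastname": "Smith", "city": "Miami"},
    {"firstname": "Jane", "lastname": "Doe", "city": "Miami"},
]

print(next_step(search(records, {"firstname": "John"})))                   # ask-again
print(next_step(search(records, {"firstname": "John", "city": "Miami"})))  # found
```

Each answer narrows the result set, and the loop ends as soon as the set has zero or one member.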

The requirements

The requirements for the GASF can be summarized as follows:

  1. The data repository can be implemented with either a relational database or an LDAP directory with arbitrary schemas. Although data retrieved from RDBMS or LDAP repositories have vastly different formats, the GASF presents the data to the application in a uniform format irrespective of the origin.
  2. The system filters out the questions that are irrelevant or unnecessary for the search in progress. For example, if in the current state of the search it is determined that a certain attribute for all the prospective results has the same value, that attribute is irrelevant and no question based on that attribute should be generated.
  3. The system minimizes the number of questionnaire preparation-answer processing cycles.
  4. The type of device that the request is coming from dictates how the output is rendered. For example, if a phone is used to request the search, the questionnaire is presented using VoiceXML. If a PDA is used, then the questionnaire is rendered as WML to accommodate the small screen of a PDA.
  5. Finally, we are targeting the development of a framework to be used in building similar applications with potentially different input/output requirements. We want to minimize the coding effort by externalizing the application-specific requirements. We accomplish this by making the framework initialize itself with application-specific configuration files. For example, the data repository location, the list of attributes, and the output formats are all specified in the configuration files.




Architecture of the solution

To satisfy the requirements for GASF, we have come up with the following XML-based end-to-end solution (which is illustrated in Figure 2):

  1. Use XML Integrator (XI Engine -- see Resources) to interface to the data repository. A script specifies the search and converts the retrieved data into XML.
  2. Use an Analysis Engine to calculate the distribution of the values of the retrieved attributes, and to determine what questions to ask and in which order. For example, if a particular attribute of all the retrieved instances has the same value, that attribute is dropped from the next set of questions. The attribute that has the widest distribution of values is considered the best candidate to be the first question asked in the next cycle. The questionnaire is created based on the XML representing this statistical information that is generated by the Analysis Engine.
  3. Use a Transformation Engine to transform the questionnaire and present it to the user based on the type of device being used.

Figure 2. Architecture of GASF

This methodology yields a middleware solution that is independent of both the underlying data repository schema and the presentation layer. Its advantages are that the transformation-specific code is separated from the application logic, and that changes in either the database schema or the XML structure can be handled without changing application code.





Details of the solution

In this section, we describe the various XSL style sheets and the XML format used in a simplified example of the employee directory search application. Let's assume that the employee data are stored in a relational database with the following schema, which uses IBM DB2 as an example:


Listing 1. Schema of employee database
CONNECT TO EMPLOYEE;
-- DDL Statements for table "EMPLOYEE"."EMPLOYEE"
CREATE TABLE "EMPLOYEE"."EMPLOYEE" (
       "ID" CHAR(6) NOT NULL , 
       "LASTNAME" VARCHAR(100) , 
       "CITY" VARCHAR(100) NOT NULL , 
       "FIRSTNAME" VARCHAR(100) NOT NULL ,
       "PHONE" VARCHAR(100) NOT NULL , 
       "ADDRESS" VARCHAR(100) NOT NULL , 
       "ISMANAGER" CHAR(1) NOT NULL )
       IN "USERSPACE1";

COMMIT WORK;
CONNECT RESET;
TERMINATE;

XI Engine

The XML Integrator (XI) Engine is a tool for converting data between XML and structured data formats such as relational or LDAP data. The XI Engine is based on a script representing the relationship between the two information structures. It is available on IBM alphaWorks. More details on XI are available in Resources. Currently, XI supports two notations, DTDSA and XRT, to specify this relationship.

In our example, we use the XRT notation, which is a loose extension of XSL that ties together query statements with an XSL transformation. Listing 2 shows the XRT script that is used to retrieve data from the Employee database and create the intermediate data XML shown in Listing 3.

The XRT script contains two parts. The first part defines how the data is retrieved from the data repository, which includes the location of the data source, the SQL queries, and the relationship of the queries. The second part defines an XML transformation template. The execution of the XRT script first creates an internal XML representation that abides by a standard XRT schema; this XML representation is then transformed using the template that's defined by the second part of the XRT script.

For our application, the second part of the script (the transformation template) is the same for all searches, but the queries themselves change dynamically: each answer from the user adds another search constraint.
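The way search constraints accumulate from one cycle to the next can be sketched as follows. This is a minimal Python illustration, not the XI Engine's own mechanism: the function name `build_query` is hypothetical, and the `?` parameter markers are an assumption made to keep user input out of the SQL text (the XRT script in Listing 2 embeds literal values instead).

```python
from typing import Dict, List, Tuple

# Column names taken from Listing 1. They double as an allow-list so that
# only known column names are ever spliced into the SQL text.
COLUMNS = ["lastname", "firstname", "city", "address", "phone", "ismanager"]


def build_query(answers: Dict[str, str]) -> Tuple[str, List[str]]:
    """Return the SQL text and parameter list for the answers given so far."""
    unknown = set(answers) - set(COLUMNS)
    if unknown:
        raise ValueError(f"unknown attribute(s): {sorted(unknown)}")
    sql = "select " + ",".join(COLUMNS) + " from employee"
    names = [c for c in COLUMNS if c in answers]
    if names:
        # One equality constraint per answered question.
        sql += " where " + " and ".join(f"{n} = ?" for n in names)
    return sql, [answers[n] for n in names]


sql, params = build_query({"firstname": "John", "lastname": "Smith"})
print(sql)
print(params)
```

With no answers the function returns the unconstrained query; every further answer appends one more `and` clause, which mirrors how the query in the script grows while the template stays fixed.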


Listing 2. XRT script for XI data retrieval
<?xml version="1.0" encoding="UTF-8"?>
<xrt:xrt xmlns:xrt="http://www.xrt.org" 
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
    <xrt:rdbms2xml>
        <xrt:locator xrt:name="d" xrt:url="jdbc:db2:employee"
          xrt:driver="com.ibm.jdbc.app.DB2Driver" xrt:userid="foo"
           xrt:password="bar"/>
        <xrt:sqlsearch xrt:qid="q1">
            <xrt:query>select lastname,firstname,city,address,phone,
                ismanager from employee
                where firstname = 'John' and lastname = 'Smith'</xrt:query>
        </xrt:sqlsearch>
    </xrt:rdbms2xml>
    <xrt:xml2xml>
        <xsl:stylesheet version="1.0" xmlns:xrt="http://www.xrt.org"
         xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
            <xsl:template match="store2xml">
                <xsl:element name="entries">
                    <xsl:apply-templates select="q1" />
                </xsl:element>
            </xsl:template>
            <xsl:template match="q1">
                <xsl:element name="entry">
                    <xsl:element name="attr">
                        <xsl:attribute name="name">lastname</xsl:attribute>
                        <xsl:attribute name="value">
                            <xsl:value-of select="LASTNAME/@value" />
                        </xsl:attribute>
                    </xsl:element>
                    <xsl:element name="attr">
                        <xsl:attribute name="name">firstname</xsl:attribute>
                        <xsl:attribute name="value">
                            <xsl:value-of select="FIRSTNAME/@value" />
                        </xsl:attribute>
                    </xsl:element>
                    <xsl:element name="attr">
                        <xsl:attribute name="name">city</xsl:attribute>
                        <xsl:attribute name="value">
                            <xsl:value-of select="CITY/@value" />
                        </xsl:attribute>
                    </xsl:element>
                    <xsl:element name="attr">
                        <xsl:attribute name="name">address</xsl:attribute>
                        <xsl:attribute name="value">
                            <xsl:value-of select="ADDRESS/@value" />
                        </xsl:attribute>
                    </xsl:element>
                    <xsl:element name="attr">
                        <xsl:attribute name="name">phone</xsl:attribute>
                        <xsl:attribute name="value">
                            <xsl:value-of select="PHONE/@value" />
                        </xsl:attribute>
                    </xsl:element>
                    <xsl:element name="attr">
                        <xsl:attribute name="name">ismanager</xsl:attribute>
                        <xsl:attribute name="value">
                            <xsl:value-of select="ISMANAGER/@value" />
                        </xsl:attribute>
                    </xsl:element>
                </xsl:element>
            </xsl:template>
        </xsl:stylesheet>
    </xrt:xml2xml>
</xrt:xrt>




Listing 3. Intermediate data XML generated from XI
<?xml version="1.0" encoding="UTF-8"?>
<entries>
    <entry>
        <attr name="lastname" value="Smith" />
        <attr name="firstname" value="John" />
        <attr name="city" value="New York" />
        <attr name="address" value="18 Broadway, New York, NY12000" />
        <attr name="phone" value="123-456-9012" />
        <attr name="ismanager" value="N" />
    </entry>
    <entry>
        <attr name="lastname" value="Smith" />
        <attr name="firstname" value="John" />
        <attr name="city" value="Miami" />
        <attr name="address" value="123 Flagler St., Palm Beach, FL23000" />
        <attr name="phone" value="234-567-9012" />
        <attr name="ismanager" value="N" />
    </entry>
    <!--48 more entries down here -->
</entries>
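To make the target format of Listing 3 concrete, here is a small Python sketch that builds the same `entries`/`entry`/`attr` structure from rows held as dictionaries. It only illustrates the shape of the intermediate XML; it is not how the XI Engine produces it, and the helper name `rows_to_entries` is hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List


def rows_to_entries(rows: List[Dict[str, str]]) -> ET.Element:
    """Build the uniform intermediate XML: one <entry> per row,
    one <attr name=... value=...> per column."""
    entries = ET.Element("entries")
    for row in rows:
        entry = ET.SubElement(entries, "entry")
        for name, value in row.items():
            ET.SubElement(entry, "attr", {"name": name, "value": value})
    return entries


# Hypothetical rows; in GASF they would come from the RDBMS or LDAP search.
rows = [
    {"lastname": "Smith", "firstname": "John", "city": "New York"},
    {"lastname": "Smith", "firstname": "John", "city": "Miami"},
]
xml_text = ET.tostring(rows_to_entries(rows), encoding="unicode")
print(xml_text)
```

Because every column becomes a generic `attr` element, later stages do not need to know the repository schema.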

Analysis Engine

Having retrieved the data as an XML document that contains 50 entries (see Listing 3), the Analysis Engine performs a statistical analysis on the XML data by applying the XSL shown in Listing 4. This style sheet can be used in all the search applications as long as the XML data complies with the common format as in Listing 3.


Listing 4. XSL for statistical analysis
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    version="1.0">
    <xsl:output method="xml" />
    <!-- Index attr elements by name, and by name plus value -->
    <xsl:key name="attr-by-name" match="attr" use="@name" />
    <xsl:key name="attr-by-name-value" match="attr"
        use="concat(@name, '+', @value)" />
    <xsl:template match="entries">
        <attributelist instances="{count(entry)}">
            <!-- Visit the first attr of each distinct name -->
            <xsl:for-each select="entry/attr[generate-id() =
                generate-id(key('attr-by-name', @name)[1])]">
                <attribute name="{@name}">
                    <!-- Visit the first attr of each distinct value
                         and count how often that value occurs -->
                    <xsl:for-each select="key('attr-by-name', @name)
                        [generate-id() = generate-id(key('attr-by-name-value',
                        concat(@name, '+', @value))[1])]">
                        <instance value="{@value}"
                            occurrence="{count(key('attr-by-name-value',
                            concat(@name, '+', @value)))}" />
                    </xsl:for-each>
                </attribute>
            </xsl:for-each>
        </attributelist>
    </xsl:template>
</xsl:stylesheet>

The outcome of the statistical analysis is the XML document shown in Listing 5. It reveals the following about the retrieved entries:

  • The total number of instances is 50
  • Each instance has a different phone number
  • All instances have the same lastname ("Smith") and the same firstname ("John")
  • None of them is a manager
  • The number of different city locations and addresses is five

Listing 5. XML document with statistical analysis results
<?xml version="1.0" encoding="UTF-8"?>
<attributelist instances="50">
    <attribute name="lastname">
        <instance occurrence="50" value="Smith"/>
    </attribute>
    <attribute name="firstname">
        <instance occurrence="50" value="John"/>
    </attribute>
    <attribute name="city">
        <instance occurrence="8" value="WASHINGTON"/>
        <instance occurrence="10" value="DALLAS"/>
        <instance occurrence="15" value="AUSTIN"/>
        <instance occurrence="1" value="New York"/>
        <instance occurrence="16" value="Miami"/>
    </attribute>
    <attribute name="phone">
        <instance occurrence="1" value="1234567"/>
        <instance occurrence="1" value="2345671"/>
        <!--48 more instances -->
    </attribute>
    <attribute name="address">
        <instance occurrence="8" value="20 Burr Road "/>
        <instance occurrence="10" value="1024 24ST"/>
        <instance occurrence="15" value="3901 110Ave"/>
        <instance occurrence="1" value="18 Broadway "/>
        <instance occurrence="16" value="123 Flagler St."/>
    </attribute>
    <attribute name="ismanager">
        <instance occurrence="50" value="N"/>
    </attribute>
</attributelist>
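The counting that produces this kind of result can be mirrored in Python as a cross-check on the grouping logic. This is a sketch under stated assumptions: it reads the intermediate format of Listing 3, the function name `analyze` is hypothetical, and the sample document is made up for illustration.

```python
import xml.etree.ElementTree as ET
from collections import Counter
from typing import Dict


def analyze(entries_xml: str) -> Dict[str, Counter]:
    """For each attribute name, count how often each value occurs."""
    root = ET.fromstring(entries_xml)
    stats: Dict[str, Counter] = {}
    for attr in root.iter("attr"):
        stats.setdefault(attr.get("name"), Counter())[attr.get("value")] += 1
    return stats


# Made-up sample in the format of Listing 3.
sample = """<entries>
  <entry><attr name="city" value="Miami"/><attr name="ismanager" value="N"/></entry>
  <entry><attr name="city" value="Miami"/><attr name="ismanager" value="N"/></entry>
  <entry><attr name="city" value="New York"/><attr name="ismanager" value="N"/></entry>
</entries>"""

stats = analyze(sample)
for name, counts in stats.items():
    print(name, dict(counts))
```

An attribute whose counter holds a single value (here `ismanager`) cannot narrow the search any further, while an attribute with many distinct values is a strong candidate for the next question.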

From the instance value distribution of attributes shown in Listing 5, GASF determines that it has to create another questionnaire to further drill down for the searched object. The Analysis Engine generates the next questionnaire by reorganizing the Initial Attribute List XML document shown in Listing 6, which defines the initial mapping between attributes and corresponding questions for different media types.


Listing 6. Initial Attribute List XML document
<?xml version="1.0" encoding="UTF-8"?>
<gasf>
    <attribute name="lastname">
        <html><question>Enter the last name</question></html>
    </attribute>
    <attribute name="firstname">
        <html><question>Enter the first name</question></html>
    </attribute>
    <attribute name="city">
        <html><question>Enter the city</question></html>
    </attribute>
    <attribute name="address">
        <html><question>Enter the mailing address</question></html>
    </attribute>
    <attribute name="phone">
        <html><question>Enter the phone number</question></html>
    </attribute>
    <attribute name="ismanager">
        <html><question>Is this person a manager?</question></html>
    </attribute>
</gasf>

During this reorganizing process, attributes that have already been specified by the user, such as lastname and firstname, are eliminated from the questionnaire; attributes that have a constant value for all instances, such as ismanager, are dropped; the remaining attributes are ordered so that the attribute with the widest distribution becomes the first question in the questionnaire list. Listing 7 shows the generated questionnaire in XML format.


Listing 7. Questionnaire XML document
<?xml version="1.0" encoding="UTF-8"?>
<gasf>
    <attribute name="phone">
        <html><question>Enter the phone number</question></html>
    </attribute>
    <attribute name="address">
        <html><question>Enter the mailing address</question></html>
    </attribute>
    <attribute name="city">
        <html><question>Enter the city</question></html>
    </attribute>
</gasf>
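The reorganizing rules (drop answered attributes, drop constant attributes, order the rest by distribution) can be sketched in Python as follows. Two assumptions are made here and are not stated in the article: "widest distribution" is interpreted as the largest number of distinct values, and ties keep their input order. The name `order_questions` is hypothetical.

```python
from typing import Dict, List, Set


def order_questions(stats: Dict[str, Dict[str, int]],
                    answered: Set[str]) -> List[str]:
    """Return the attribute names to ask about next, best candidate first."""
    candidates = [
        name for name, counts in stats.items()
        if name not in answered   # already specified by the user
        and len(counts) > 1       # constant attributes cannot narrow the search
    ]
    # Assumption: more distinct values means a wider distribution.
    # sorted() is stable, so ties keep their input order.
    return sorted(candidates, key=lambda n: len(stats[n]), reverse=True)


# Made-up value counts for three retrieved entries.
stats = {
    "lastname": {"Smith": 3},
    "firstname": {"John": 3},
    "city": {"Miami": 2, "New York": 1},
    "phone": {"1234567": 1, "2345671": 1, "3456712": 1},
    "ismanager": {"N": 3},
}

print(order_questions(stats, answered={"lastname", "firstname"}))  # ['phone', 'city']
```

Other measures of distribution width, such as entropy, would fit the same function by changing only the sort key.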

Transformation Engine

Finally, the questionnaire XML document is rendered to the user according to the device being used, by applying the appropriate XSL transformation style sheet. For example, if the request originates from a PC Web browser, then the XSL style sheet shown in Listing 8 is used to render an HTML Web page.


Listing 8. Questionnaire XSL for transforming XML to HTML
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
    <xsl:output method="html"/>
    <xsl:param name="url"/>
    <xsl:template match="gasf">
        <html>
            <form method="post" action="{$url}search">
                <p>Choose one of the following questions: </p>
                <xsl:apply-templates select="attribute"/>
                <p><input type="submit" name="submit"/></p>
            </form>
        </html>
    </xsl:template>
    <xsl:template match="attribute">
        <p>
            <input type="radio" name="attribute" value="{@name}"/>
            <xsl:value-of select="html/question"/> 
        </p>
    </xsl:template>
</xsl:stylesheet>
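To show what the rendering step produces, the following Python sketch builds the same kind of HTML form directly from a questionnaire document. It is a stand-in for illustration only, since GASF performs this step with the XSL style sheet; the function name `render_html` and the example URL are hypothetical.

```python
import xml.etree.ElementTree as ET


def render_html(questionnaire_xml: str, url: str) -> str:
    """Render a questionnaire as an HTML form with one radio button per question."""
    gasf = ET.fromstring(questionnaire_xml)
    html = ET.Element("html")
    form = ET.SubElement(html, "form", {"method": "post", "action": url + "search"})
    ET.SubElement(form, "p").text = "Choose one of the following questions: "
    for attribute in gasf.findall("attribute"):
        p = ET.SubElement(form, "p")
        radio = ET.SubElement(p, "input", {
            "type": "radio", "name": "attribute", "value": attribute.get("name")})
        # The question text follows the radio button inside the paragraph.
        radio.tail = attribute.findtext("html/question", default="")
    submit = ET.SubElement(form, "p")
    ET.SubElement(submit, "input", {"type": "submit", "name": "submit"})
    return ET.tostring(html, encoding="unicode", method="html")


questionnaire = """<gasf>
  <attribute name="phone"><html><question>Enter the phone number</question></html></attribute>
  <attribute name="city"><html><question>Enter the city</question></html></attribute>
</gasf>"""

page = render_html(questionnaire, "http://example.test/")
print(page)
```

The value of each radio button is the attribute name, so the next request tells the system which question the user chose to answer.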





Extensibility

It's easy to add more rendering formats with this system -- you just need to do two things:

  1. Add the media type and its question content to the Initial Attribute List XML. For instance, to add a WML format for PDAs, add a wml subelement to each attribute element in Listing 6 and put the question text for that medium in it. In the same fashion, if the user interacts with the system by phone, add a vxml subelement that holds the question along with any voice-specific markup, such as a grammar. Listing 9 shows an example of such an extended attribute element.
  2. Add a transformation XSL style sheet similar to that in Listing 8 to transform the questionnaire XML in Listing 7 for the new media type. Listing 10 and Listing 11, respectively, show sample transformation XSL style sheets for XML to WML and XML to VXML.

Listing 9. Example of Extended Initial Attribute
<attribute name="phone">
  <html><question>Enter the phone number</question></html>
  <wml><question>Tap phone number</question></wml>
  <vxml option="1">
    <question>press or say the phone number</question>
    <type>digits</type>
    <grammar src="builtin:grammar/digits?minlength=1;maxlength=7" mode="dtmf"></grammar>
    <catch event="noinput nomatch">
      <reprompt/>
    </catch>
  </vxml>
</attribute> 
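The lookup that each style sheet performs on such an element (for example, `wml/question` in the WML style sheet) can be sketched in Python. The fallback to the html question when a media type is missing is an assumption added for illustration, not behavior described in the article, and the name `question_for` is hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def question_for(attribute: ET.Element, media: str) -> Optional[str]:
    """Return the question text for a media type such as 'html', 'wml', or 'vxml'."""
    node = attribute.find(f"{media}/question")
    if node is None:
        # Assumed fallback: reuse the html wording when the media type is absent.
        node = attribute.find("html/question")
    return node.text if node is not None else None


# Subset of the extended attribute element from Listing 9.
attribute = ET.fromstring("""<attribute name="phone">
  <html><question>Enter the phone number</question></html>
  <wml><question>Tap phone number</question></wml>
  <vxml option="1"><question>press or say the phone number</question></vxml>
</attribute>""")

for media in ("html", "wml", "vxml"):
    print(media, "->", question_for(attribute, media))
```

Keeping all wordings under one attribute element means that a new output format only needs a new subelement plus a matching style sheet.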



Listing 10. Sample questionnaire XSL for transforming XML to WML
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
    <!-- Let the serializer emit the WML DOCTYPE declaration -->
    <xsl:output method="xml"
        doctype-public="-//WAPFORUM//DTD WML 1.1//EN"
        doctype-system="http://www.wapforum.org/DTD/wml_1.1.xml"/>
    <xsl:param name="url"/>
    <xsl:template match="gasf">
        <wml>
            <card id="questionary" title="Questionnaire">
                <do type="accept">
                    <go method="post" href="{$url}">
                        <postfield name="attribute" value="$(attribute)" />
                    </go>
                </do>
                <p>Choose one of the following questions: </p>
                <p>
                    <select name="attribute">
                        <xsl:apply-templates select="attribute"/>
                    </select>
                </p>
            </card>
        </wml>
    </xsl:template>
    <xsl:template match="attribute">
        <option value="{@name}">
            <xsl:value-of select="wml/question"/>
        </option>
    </xsl:template>
</xsl:stylesheet>



Listing 11. Sample questionnaire XSL for transforming XML to VoiceXML for phone
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
    <xsl:output method="xml" />
    <xsl:param name="url" />
    <xsl:template match="gasf">
        <vxml version="1.0">
            <meta name="Content-Type" content="text/x-vxml" />
            <property name="caching" value="safe" />
            <form id="main" scope="dialog">
                <block>
                <prompt>Choose one of the following questions:
                    <break msecs="10" />
                </prompt>
                <goto next="#check_searchable_attribute" />
                </block>
            </form>
            <menu id="check_searchable_attribute" scope="document" dtmf="false">
                <prompt>
                    <enumerate>
                        <break msecs="10" />
                            Press<value expr="_dtmf" mode="tts" />or say
                        <value expr="_prompt" mode="tts" />
                    </enumerate>
                </prompt>
                <xsl:apply-templates select="attribute" />
                <choice dtmf="9" next="#check_searchable_attribute">
                    nine for Repeating the menu</choice>
                <choice dtmf="0" next="#quit">
                    zero to Exit the Search System
                    <grammar type="application/x-jsgf">quit | exit | goodbye
                    </grammar>
                </choice>
                <noinput>
                    Please select at least one option
                    <reprompt />
                </noinput>
                <nomatch>
                    Sorry, that is not an option. Try again
                    <reprompt />
                </nomatch>
                <catch event="error.badfetch">
                    <prompt>Something went wrong somewhere. Let's try again
                    </prompt>
                    <goto next="#check_searchable_attribute" />
                </catch>
            </menu>
            <form id="quit" scope="document">
                <block>
                    <prompt>Thank You for using the Voice Search System, Goodbye
                    </prompt>
                </block>
                <block>
                    <exit />
                </block>
            </form>
        </vxml>
    </xsl:template>
    <xsl:template match="attribute">
        <choice dtmf="{vxml/@option}" next="{$url}?attribute={@name}">
            <xsl:value-of select="vxml/@option" /> for
            <xsl:value-of select="vxml/question" />
        </choice>
    </xsl:template>
</xsl:stylesheet>

It is also very easy to use GASF to build other search systems using various existing data sources. Again, two steps are required:

  1. Create the script that contains a data source definition similar to that in Listing 2.
  2. Create the Initial Attribute List XML similar to that in Listing 6, which contains all the attributes to search on.




Conclusion

The methodology demonstrated by GASF can be used in a variety of applications -- such as Web content management systems, knowledge management systems, and business-to-business transactions -- where you have the need to compose an XML object from various data sources, and then process and render it on the fly. We believe that as XSLT technology matures, this can be performed more efficiently and extensively.

The primary advantages of leveraging XSLT to enable applications are flexibility and low development cost. For applications that do not need to support high-volume transactions, XSL transformation can provide a quick, easy, and cost-saving solution.



Resources



About the authors

Chen Shu is a software engineer with the IBM Internet Technology group, where she plays an active role in e-business application prototyping using emerging Internet technologies. Her areas of interest include XML, Web services, and pervasive computing. You can contact Chen at chenshu@us.ibm.com.


Nianjun Zhou is an advisory software engineer with the IBM Internet Technology group. He has worked on several projects related to Grid computing, XML-based content management, XML, and relational database/LDAP transformation. His interests include using computer technologies to develop new applications that can enhance the efficiency of knowledge sharing and information management in general. You can contact him at jzhou@us.ibm.com.


Dikran S Meliksetian is a senior technical staff member with the IBM Internet Technology group. He has previously been involved in the development of Content Management solutions, and is currently involved in a number of Grid Computing projects. You can contact him at Dikran_Meliksetian@us.ibm.com.



