Level: Introductory
Gavin Bong (gavinb@eutama.com), Software Engineer, eUtama Sdn. Bhd.
01 Mar 2002
SOAP specifies an encoding to represent common types found in databases, programming languages (for example, Java programming language), and data repositories. Apache SOAP's toolkit supports encoding by supplying a base set of (de)serializers; classes that do the grunt work of mapping Java types to serialized XML representations. Part 1 of this two-part series explored the use of these (de)serializers. Here in Part 2, Gavin Bong shows you how to write your own (de)serializers when none from the toolkit suit your needs. He also provides an example application that demonstrates many of the concepts explored in this series.
In the first article in this series, you saw how SOAP maps data types to XML, and learned how to use the serializers and deserializers (hereafter referred to collectively as (de)serializers) included in the Apache SOAP toolkit. In this installment, I'll walk you through a cookbook that will show you how to write your own (de)serializers. I would advise you to have the sources of some of the base (de)serializers available for reference.
You may also want to reread the "Type Mapping Pattern" section in Part 1 to refresh your memory on how type mappings are resolved internally.
Once I've finished with the cookbook, I will present a simple application that implements schema-constrained SOAP. The application will describe an interaction in which a purchase order document that's purposely noncompliant with Section 4 encoding is sent using SOAP.
Root and normal (de)serializers
If none of the (de)serializers bundled with the Apache SOAP toolkit will work with
your Java class, you may need to write a custom (de)serializer yourself. First, you need to make the distinction between what I call root (de)serializers
and normal (de)serializers. The initial
bootstrapping of the serialization and deserialization of RPC
parameters and responses is handled by the root (de)serializers. The three root
(de)serializers in Apache SOAP are listed in Table 1.
Table 1. Root (de)serializers
encodingStyle | (de)serializer
Section 5 | ParameterSerializer
Literal XML | XMLParameterSerializer
XMI | XMIParameterSerializer
The appropriate root (de)serializer is dispatched based on two things:
encodingStyle and the class Parameter
(for serialization) or the QName SOAP-ENV:Parameter (for deserialization).
To see how the actual dispatching takes place, you need to understand the chain
of events that leads to serialization of a Java type at the client side.
Call call = new Call();
...
resp = call.invoke(url, ""); //1
|
When the invoke method of the Call class is called
at line 1 above, the Call class iterates through its associated
parameters.
org.apache.soap.rpc.RPCMessage:
...
Serializer ser = xjmr.querySerializer(Parameter.class, actualParamEncStyle); //2
ser.marshall(...);
...
|
For each parameter, the type mapping registry is queried and the marshall() method
subsequently called on the returned serializer. The returned serializer in line 2
is a root serializer.
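The lookup described above can be pictured with a small, self-contained sketch. It is not Apache SOAP code: RootSerializer, ParameterStandIn, and Registry are hypothetical stand-ins, and the literal XML URI is a placeholder. It only illustrates how a registry keyed on encodingStyle plus the Parameter class can hand back a different root serializer for each encoding.

```java
import java.util.HashMap;
import java.util.Map;

class RootDispatchDemo {
    // Hypothetical stand-in for a root serializer; not the Apache SOAP Serializer interface.
    interface RootSerializer {
        String marshall(Object src);
    }

    // Hypothetical stand-in for org.apache.soap.rpc.Parameter.
    static final class ParameterStandIn {
        final String name;
        final Object value;

        ParameterStandIn(String name, Object value) {
            this.name = name;
            this.value = value;
        }
    }

    // Hypothetical registry that resolves a serializer from (encodingStyle, Java type).
    static final class Registry {
        private final Map<String, RootSerializer> serializers = new HashMap<>();

        private static String key(String encodingStyle, Class<?> javaType) {
            return encodingStyle + " " + javaType.getName();
        }

        void mapTypes(String encodingStyle, Class<?> javaType, RootSerializer serializer) {
            serializers.put(key(encodingStyle, javaType), serializer);
        }

        RootSerializer querySerializer(Class<?> javaType, String encodingStyle) {
            RootSerializer found = serializers.get(key(encodingStyle, javaType));
            if (found == null) {
                throw new IllegalArgumentException("No serializer for " + javaType.getName()
                        + " using encoding style '" + encodingStyle + "'");
            }
            return found;
        }
    }

    static final String SECTION_5 = "http://schemas.xmlsoap.org/soap/encoding/";
    static final String LITERAL = "urn:example:literal-xml"; // placeholder URI, an assumption

    static Registry newRegistry() {
        Registry registry = new Registry();
        registry.mapTypes(SECTION_5, ParameterStandIn.class,
                src -> "section5-root(" + ((ParameterStandIn) src).name + ")");
        registry.mapTypes(LITERAL, ParameterStandIn.class,
                src -> "literal-root(" + ((ParameterStandIn) src).name + ")");
        return registry;
    }

    // Mirrors the two-step pattern in the text: query by Parameter class, then marshall.
    static String serialize(Registry registry, ParameterStandIn param, String encodingStyle) {
        return registry.querySerializer(ParameterStandIn.class, encodingStyle).marshall(param);
    }

    public static void main(String[] args) {
        Registry registry = newRegistry();
        ParameterStandIn param = new ParameterStandIn("po", "some value");
        System.out.println(serialize(registry, param, SECTION_5));
        System.out.println(serialize(registry, param, LITERAL));
    }
}
```

The same Java type yields a different root serializer depending only on the encoding style, which is the behavior Table 1 summarizes.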
Let us now turn our attention to the deserialization process at the Web service.
On the server side, the listener for SOAP RPC messages is implemented as a
servlet. The doPost method retrieves the SOAP request
and attempts to reconstruct the Call object from it.
org.apache.soap.rpc.RPCMessage:
...
Bean paramBean = smr.unmarshall(actualParamEncStyle, RPCConstants.Q_ELEM_PARAMETER
...); //3
Parameter param = (Parameter)paramBean.value;
...
|
In line 3, RPCConstants.Q_ELEM_PARAMETER resolves to SOAP-ENV:Parameter. It is here at
line 3 that the dispatching to the root deserializer occurs, when the unmarshall() method is
called.
The root (de)serializer in turn will query the type mapping registry for the next
normal (de)serializer to be called. This call stack will start to unravel only when
the registry returns (de)serializers for primitive Java types (during serialization) or when
the XML elements are purely containers for simple types (during deserialization).
The bulk of the supplied normal (de)serializers -- which can be found in the
package org.apache.soap.encoding.soapenc -- work on Section 5 encoding, as do most of the helper classes in Apache SOAP. This is the reason why I
will be focusing the cookbook solely on Section 5-encoding (de)serializers.
Figure 1. APIs for writing (de)serializers

Serializer cookbook
A serializer implements org.apache.soap.util.xml.Serializer
and realizes a single method:
void marshall(
java.lang.String inScopeEncStyle,
java.lang.Class javaType,
java.lang.Object src,
java.lang.Object context,
java.io.Writer sink,
NSStack nsStack,
XMLJavaMappingRegistry xjmr,
SOAPContext ctx)
throws java.lang.IllegalArgumentException,
java.io.IOException
|
Let us investigate each of marshall()'s parameters in turn:
- inScopeEncStyle: The encodingStyleURI as specified in the enclosing Call or Response object.
- javaType: The run-time type of the object that is to be serialized.
- src: A reference to the Java object to be serialized.
- context: A String denoting the accessor name. If this serializer is invoked by the ParameterSerializer, then the context value is equivalent to the name property of the Parameter class (as declared on the SOAP client), or is hardcoded to "return" if this is a SOAP server. It must be non-null.
- sink: The destination sink to which the SOAP XML instance will be written.
- nsStack: A data structure that implements a stack of the namespace declarations that are currently in scope.
- xjmr: The type mapping registry (typically the SOAPMappingRegistry, or smr), which you'll use to query for the serializer to use next based on the Java type. You will also invoke the xjmr marshall method to delegate to other serializers based on javaType and encodingStyle -- you'd do this for compound structures like Hashtable or Vector, for instance.
- ctx: This is used to pass in things like javax.servlet.http.HttpServletRequest and javax.servlet.http.HttpSession from the servlet context.
The cookbook for the marshall() method is as follows:
Step 1: Create new namespace scope
Use the NSStack class to track the scope of XML namespace
declarations. Later on in the method, NSStack can be used to
add a new namespace and search through the stack for the prefix
given a URI.
Step 2: Check constraints on object argument
Two conditions need to be satisfied in order for
serialization to happen: the serializer must be given a
supported type and the object to be serialized must be non-null. The following code snippet demonstrates
how these constraints are enforced in VectorSerializer.
if ( (src != null) &&
!(src instanceof Vector) &&
!(src instanceof Enumeration))
throw new IllegalArgumentException("Tried to pass a '" +
src.getClass().toString() + "' to VectorSerializer");
|
Several built-in serializers actually compare the javaType parameter
with the expected type, like this:
if(!javaType.equals(Foo.class)) ...
|
I wouldn't recommend using this technique, however, as it is susceptible to the impostor type bug (see Resources).
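To make the difference between the two checks concrete, here is a minimal sketch using only JDK classes. It compares a check on the declared javaType with a check on the object itself. Whether this is exactly the scenario the linked "impostor type" resource describes is an assumption; the sketch only shows that the declared type and the actual object can disagree in both directions.

```java
import java.util.Stack;
import java.util.Vector;

class TypeCheckDemo {
    // Check based on the declared type token, as in javaType.equals(Foo.class).
    static boolean acceptsByDeclaredType(Class<?> javaType) {
        return javaType.equals(Vector.class);
    }

    // Check based on the object itself, as VectorSerializer does with instanceof.
    static boolean acceptsByInstance(Object src) {
        return src instanceof Vector;
    }

    public static void main(String[] args) {
        // A subclass of Vector: the equality check rejects it, instanceof accepts it.
        Object stack = new Stack<String>();
        System.out.println(acceptsByDeclaredType(stack.getClass())); // false
        System.out.println(acceptsByInstance(stack));                // true

        // A declared type of Vector paired with an unrelated object:
        // the equality check passes even though the object cannot be serialized as a Vector.
        Object notAVector = "just a string";
        System.out.println(acceptsByDeclaredType(Vector.class));     // true
        System.out.println(acceptsByInstance(notAVector));           // false
    }
}
```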
Step 3: Generate a null accessor
If the object argument is null, you need to generate
a null accessor for the type.
SoapEncUtils.generateNullStructure(inScopeEncStyle,
javaType,
context,
sink,
nsStack,
xjmr);
|
Step 4: Serialize the object
Serializing the object into a Section 5-compliant SOAP XML document involves
three steps: generating the opening element for the accessor,
serializing the value of the object, and closing the element. The first step is easily achieved by calling the following utility method:
SoapEncUtils.generateStructureHeader(inScopeEncStyle,
javaType,
context,
sink,
nsStack,xjmr);
|
This code will call queryElementType
to find the mapped QName for javaType:
<context xsi:type="QName">
|
If the object argument, src, is a simple type, then the second step, serializing the value of the object, is
a simple matter of calling the src.toString() method
and writing that out. Otherwise, you will need to identify the constituent
parts of the object and individually pass them to more primitive serializers.
If you investigate the source for the built-in serializers, you'll notice
that these constituent parts can be identified in many ways:
- Java reflection (for example, BeanSerializer)
- Iterating through a List data structure (for example, ArraySerializer)
- Direct access via a priori knowledge of the class (that is, you know in advance that your serializer only works for one specific class)
Having identified the other serializers, you can delegate to them by calling:
xjmr.marshall(inScopeEncStyle,
componentType,
componentValue,
accessorName,
sink, nsStack, ctx);
|
Here, componentType and componentValue are representative of the run-time
type and object reference, respectively, for any constituent parts of the
original src parameter. The marshall() method actually calls querySerializer to retrieve the associated serializer and subsequently calls the marshall() method of the associated serializer. Obviously, this will only work if you've registered the serializers for all components in the type mapping registry.
The last step, closing the element, is completed by simply writing out the closing tag for the accessor.
sink.write("</" + context + '>');
|
Finally, you must clean up after yourself by leaving the current namespace scope.
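The four steps can be strung together in one place. The following is a minimal sketch in plain Java, not code that runs against the toolkit: ScopeStack and Point are hypothetical stand-ins, the ns1:point type name is invented for illustration, and the XML is written by hand rather than through SoapEncUtils. Namespace declarations are omitted, so the output is a fragment only.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

class MarshallCookbookSketch {
    // Hypothetical stand-in for a namespace scope stack; not the NSStack API.
    static final class ScopeStack {
        private final Deque<Map<String, String>> scopes = new ArrayDeque<>();
        void pushScope() { scopes.push(new HashMap<>()); }
        void popScope() { scopes.pop(); }
        int depth() { return scopes.size(); }
    }

    // Hypothetical class that this serializer is tailored for.
    static final class Point {
        final int x;
        final int y;
        Point(int x, int y) { this.x = x; this.y = y; }
    }

    static void marshall(Object src, String context, Writer sink, ScopeStack nsStack)
            throws IOException {
        nsStack.pushScope();                                  // Step 1: new namespace scope
        try {
            if (src != null && !(src instanceof Point)) {     // Step 2: check constraints
                throw new IllegalArgumentException("Tried to pass a '"
                        + src.getClass() + "' to MarshallCookbookSketch");
            }
            if (src == null) {                                // Step 3: null accessor
                sink.write("<" + context + " xsi:null=\"true\"/>");
                return;
            }
            Point p = (Point) src;                            // Step 4: header, value, close
            sink.write("<" + context + " xsi:type=\"ns1:point\">");
            sink.write("<x xsi:type=\"xsd:int\">" + p.x + "</x>");
            sink.write("<y xsi:type=\"xsd:int\">" + p.y + "</y>");
            sink.write("</" + context + ">");
        } finally {
            nsStack.popScope();                               // clean up: leave the scope
        }
    }

    // Convenience wrapper so callers do not have to deal with IOException.
    static String toXml(Object src, String context, ScopeStack nsStack) {
        StringWriter sink = new StringWriter();
        try {
            marshall(src, context, sink, nsStack);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toString();
    }

    public static void main(String[] args) {
        ScopeStack nsStack = new ScopeStack();
        System.out.println(toXml(new Point(3, 4), "origin", nsStack));
        System.out.println(toXml(null, "origin", nsStack));
    }
}
```

Putting the scope cleanup in a finally block is a design choice of this sketch: it keeps the namespace stack balanced even when the type check throws or the null branch returns early.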
Deserializer cookbook
A deserializer implements org.apache.soap.util.xml.Deserializer
and realizes a single method:
Bean unmarshall(
java.lang.String inScopeEncStyle,
QName elementType,
org.w3c.dom.Node src,
XMLJavaMappingRegistry xjmr,
SOAPContext ctx)
|
The purpose of the unmarshall() method is to reconstruct the parameters as Java objects.
In order to do that, you need to process the XML fragment
contained by the src DOM node. The preferred programming model
to achieve this is to use the DOM wrapper methods in org.apache.soap.util.xml.DOMUtils in
conjunction with the type mapping registry. In general, DOMUtils is
deserialization's counterpart to serialization's SoapEncUtils.
It is important to note that the XML contained in src is
guaranteed to be free of multireference values. All multireference hrefs
have been resolved back to the actual value by the root deserializer,
ParameterSerializer.
Thus, the deserialization cookbook is as follows:
Step 1: Check for null
It is advisable to check for the nullability attribute, like so:
Element root = (Element)src;
if (SoapEncUtils.isNull(root))
{
return new Bean(Your.class, null);
}
|
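For readers who want to see what such a null check amounts to at the DOM level, here is a sketch that uses only the JDK's XML APIs. It is an assumption about the general idea, not the SoapEncUtils.isNull implementation: it looks for xsi:null in the 1999 schema-instance namespace and xsi:nil in the 2001 one, since either may appear depending on which schema version a peer uses.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class NullAccessorSketch {
    static final String XSI_1999 = "http://www.w3.org/1999/XMLSchema-instance";
    static final String XSI_2001 = "http://www.w3.org/2001/XMLSchema-instance";

    // Parses a string into a namespace-aware DOM and returns the root element.
    static Element parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    private static boolean isTrue(String value) {
        return "true".equals(value) || "1".equals(value);
    }

    // getAttributeNS returns an empty string when the attribute is absent.
    static boolean isNull(Element element) {
        return isTrue(element.getAttributeNS(XSI_1999, "null"))
                || isTrue(element.getAttributeNS(XSI_2001, "nil"));
    }

    public static void main(String[] args) {
        Element nullPo = parse("<po xmlns:xsi=\"" + XSI_1999 + "\" xsi:null=\"true\"/>");
        Element emptyPo = parse("<po/>");
        System.out.println(isNull(nullPo));  // true
        System.out.println(isNull(emptyPo)); // false
    }
}
```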
Step 2: Reconstruct the Java object
The process of reconstructing a Java object varies depending on the category its data type falls into. (For more on these type categories, see Part 1.)
Simple type
If you're deserializing a simple type,
just use DOMUtils.getChildCharacterData(Element)
to retrieve the string value of src and optionally preprocess it
(for example, map the string "NaN" to Float.NaN in
FloatDeserializer) before using it to initialize the object
that's to be returned.
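As an illustration of the simple-type case, the sketch below rebuilds a Float from an element's character data using only JDK classes. The childCharacterData helper is a hypothetical stand-in for DOMUtils.getChildCharacterData, and the handling of INF and -INF is an addition of this sketch (they are XML Schema's spellings for infinity, which Float.valueOf does not accept); only the NaN mapping comes from the text above.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class SimpleTypeSketch {
    // Parses a string into a DOM and returns the root element.
    static Element parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    // Hypothetical stand-in for DOMUtils.getChildCharacterData:
    // concatenates the text and CDATA children of the element.
    static String childCharacterData(Element element) {
        StringBuilder text = new StringBuilder();
        for (Node n = element.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.TEXT_NODE || n.getNodeType() == Node.CDATA_SECTION_NODE) {
                text.append(n.getNodeValue());
            }
        }
        return text.toString();
    }

    // Preprocesses the special lexical forms, then initializes the returned object.
    static Float unmarshallFloat(Element element) {
        String text = childCharacterData(element).trim();
        switch (text) {
            case "NaN":  return Float.NaN;
            case "INF":  return Float.POSITIVE_INFINITY;
            case "-INF": return Float.NEGATIVE_INFINITY;
            default:     return Float.valueOf(text);
        }
    }

    public static void main(String[] args) {
        System.out.println(unmarshallFloat(parse("<price>1.5</price>")));
        System.out.println(unmarshallFloat(parse("<price>NaN</price>")));
    }
}
```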
Compound type
Compound types fall into two major categories. The first comprises types
with a homogeneous structure of repeating elements; examples include Java arrays and classes implementing java.util.List and
java.util.Map. The other category is representative of all other Java
classes that exhibit arbitrary structure. The deserialization process, then,
boils down to the navigation of the XML structure to identify relevant
descendant elements and the subsequent delegation of deserialization
responsibilities to more primitive deserializers, as follows:
- Navigating the DOM. If you're dealing with a compound type from the first category, you may use
DOMUtils.getFirstChildElement() and
DOMUtils.getNextSiblingElement() to navigate through all its
repeating members. Otherwise, use the DOM API to identify the
elements that represent member properties.
- Delegate deserialization to other deserializers. First, you must extract the SOAP type:
QName soapType = SoapEncUtils.getTypeQName(rootElement);
|
Next, delegate to more primitive deserializers:
xjmr.unmarshall(inScopeEncStyle, soapType, rootElement, ctx);
|
xjmr.unmarshall internally calls queryDeserializer and then
invokes unmarshall on the returned deserializer. The two steps above
are better collapsed into one by delegating deserialization to
ParameterSerializer. This is done because, in situations where the xsi:type
attribute is missing, we would like to invoke xjmr.unmarshall() with the
soapType set to the QName {""}/X, where X is the root element's tagName.
Since the code to achieve this is already conveniently packaged inside
ParameterSerializer.unmarshall(), the shortened version of the process becomes:
Bean paramBean = xjmr.unmarshall(inScopeEncStyle,
RPCConstants.Q_ELEM_PARAMETER,
rootElement, ctx);
|
- Initialize the target object. The target object is the object instance you're reconstructing.
As member properties get deserialized, you can restore their values by invoking the
mutator methods on your target object, as follows:
Foo foo = new Foo();
foo.setS( paramBean.value );
|
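The navigation and delegation steps in the list above can be sketched with the JDK's DOM API alone. Everything here is a stand-in: firstChildElement and nextSiblingElement imitate the DOMUtils helpers, typeOf imitates the xsi:type lookup with the fallback to an empty-namespace QName built from the tag name, and the DESERIALIZERS map plays the role of the type mapping registry. None of it is Apache SOAP code.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import javax.xml.namespace.QName;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class CompoundTypeSketch {
    static final String XSI = "http://www.w3.org/2001/XMLSchema-instance";
    static final String XSD = "http://www.w3.org/2001/XMLSchema";

    // Hypothetical registry: SOAP type QName -> function that rebuilds a value from text.
    static final Map<QName, Function<String, Object>> DESERIALIZERS = new HashMap<>();
    static {
        DESERIALIZERS.put(new QName(XSD, "int"), text -> Integer.valueOf(text));
        DESERIALIZERS.put(new QName(XSD, "string"), text -> text);
    }

    static Element parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    // Skips text and comment nodes until an element (or the end) is reached.
    private static Element firstElementFrom(Node n) {
        while (n != null && n.getNodeType() != Node.ELEMENT_NODE) {
            n = n.getNextSibling();
        }
        return (Element) n;
    }

    static Element firstChildElement(Element parent) {
        return firstElementFrom(parent.getFirstChild());
    }

    static Element nextSiblingElement(Element element) {
        return firstElementFrom(element.getNextSibling());
    }

    // Resolves xsi:type to a QName; falls back to {""}tagName when the attribute is missing.
    static QName typeOf(Element element) {
        String raw = element.getAttributeNS(XSI, "type");
        if (raw.isEmpty()) {
            return new QName("", element.getTagName());
        }
        int colon = raw.indexOf(':');
        String prefix = colon < 0 ? null : raw.substring(0, colon);
        String uri = element.lookupNamespaceURI(prefix);
        return new QName(uri == null ? "" : uri, raw.substring(colon + 1));
    }

    // Walks the repeating members and delegates each one to a more primitive deserializer.
    static List<Object> unmarshallItems(Element root) {
        List<Object> values = new ArrayList<>();
        for (Element c = firstChildElement(root); c != null; c = nextSiblingElement(c)) {
            QName type = typeOf(c);
            Function<String, Object> deserializer = DESERIALIZERS.get(type);
            if (deserializer == null) {
                throw new IllegalArgumentException("No deserializer for " + type);
            }
            values.add(deserializer.apply(c.getTextContent().trim()));
        }
        return values;
    }

    public static void main(String[] args) {
        Element root = parse("<items xmlns:xsi=\"" + XSI + "\" xmlns:xsd=\"" + XSD + "\">"
                + "<item xsi:type=\"xsd:int\">7</item>"
                + "<item xsi:type=\"xsd:string\">seven</item></items>");
        System.out.println(unmarshallItems(root));
    }
}
```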
Step 3: Return the reconstructed object
The Bean class encapsulates the run-time type
and the actual returned instance. The deserializer knows
what class it should be returning because in most cases it has been tailored for a specific class. For generic deserializers like
BeanSerializer and ArraySerializer, the javaType property in the type mapping
conveys the type to be returned:
return(new Bean(Foo.class, foo));
|
Registering root (de)serializers
I've already mentioned that if you intend to introduce custom encodingStyles,
then you must write root (de)serializers. Root (de)serializers are implemented the same way as normal (de)serializers
except for one small difference: all root (de)serializers
are registered into the type mapping registry with a specially designated QName and
Java type that will tell Apache SOAP to bootstrap the (de)serialization
process based on the encodingStyle property. In the sample code below,
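take note of the QName and Java type values (RPCConstants.Q_ELEM_PARAMETER and Parameter.class on the client; x:Parameter and org.apache.soap.rpc.Parameter on the server), which you must use when registering root (de)serializers.
[Client]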
smr.mapTypes(customEncURI,
RPCConstants.Q_ELEM_PARAMETER,
Parameter.class,
customSerializer,
null);
[Server]
<isd:map encodingStyle="customEncURI"
xmlns:x="http://schemas.xmlsoap.org/soap/envelope/" qname="x:Parameter"
javaType="org.apache.soap.rpc.Parameter"
java2XMLClassName="foo.customSerializer" />
|
Schema-constrained SOAP
In this section, I'll walk you through an alternative solution
to BeanSerializer for (de)serializing complex types. This technique,
which I'll call schema-constrained SOAP, uses an XML Schema to
describe the literal XML structure of the RPC parameter(s). Here, we're
agreeing to interoperate strictly on the format of the message, without
caring about the data model on the client and server. To avoid confusion,
it should be noted that the RPC invocation is still encoded using Section 5,
but the parameter(s) are not.
I'll illustrate this technique with an example application; you can download the full code from Resources below. A client sends a purchase order to a Web service, and the service responds with an acknowledgement string. The method signature exported by the Web service is thus:
public String eatPo (PurchaseOrder p);
|
In order for this technique to work, we need an XML/object data
binding framework. For this example, I chose to use Exolab's Castor toolkit.
(See the Resources section below for links to Castor and a list of other serialization frameworks, like JSX, JAXB, and Schema2Java.)
The steps for this technique are as follows:
- Agree on the XML format for
PurchaseOrder.
- Generate the Java classes using Castor.
- Write a custom (de)serializer.
- Write type mappings for the client and server pieces.
Step 1: Agree on the XML format for PurchaseOrder
For this use case, I removed the order details section from my PurchaseOrder
schema to keep things simple. Also, note that the PONumber attribute makes this
schema noncompliant with Section 5 encoding.
Figure 2. PurchaseOrder.xsd

Step 2: Generate the Java classes using Castor
Run Castor's SourceGenerator command-line tool to generate Java classes
that implement the schema in PurchaseOrder.xsd:
java org.exolab.castor.builder.SourceGenerator
-i PurchaseOrder.xsd
-package com.raverun.po.castor
|
The SourceGenerator tool only recognizes the latest schema namespace -- http://www.w3.org/2001/XMLSchema.
Next, compile the set of Java classes. Note that you'll need to use the -deprecation option, as the
generated files use SAX 1.0 APIs. To circumvent this manual compilation,
Exolab is working on an Ant taskdef to automate it.
Step 3: Write a custom (de)serializer
You will now implement the (de)serialization methods in PurchaseOrderSerializer by
using the counterpart methods exposed by the Castor-generated PurchaseOrder class. For serialization,
PurchaseOrder can marshal to a java.io.Writer or an
org.xml.sax.DocumentHandler. As shown in Listing 1, you delegate the
serialization to PurchaseOrder's marshal() method. One caveat: the XML
stream generated by the marshal() method contains the XML prolog. PurchaseOrderSerializer strips off this prolog
by wrapping sink with FilterXmlProlog, a java.io.FilterWriter.
Listing 2 contains some exceptional cases that might arise during the deserialization process.
Listing 1. Extract from marshall() method in PurchaseOrderSerializer
----o<---------
SoapEncUtils.generateStructureHeader(inScopeEncStyle,
javaType,
context,
sink,
nsStack,
xjmr);
PurchaseOrder po = (PurchaseOrder)src;
try{
po.marshal( new FilterXmlProlog(sink) );
}catch(Exception e){
throw (new java.io.IOException("Castor: Error marshalling"));
}
sink.write( StringUtils.lineSeparator );
sink.write("</" + context + '>');
----o<---------
|
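The FilterXmlProlog class itself ships with the downloadable example and is not reproduced here. As a rough idea of how such a filter could work, the following sketch shows a java.io.FilterWriter that drops a leading XML declaration and passes everything else through. The class name PrologStrippingWriter and its buffering strategy are assumptions of this sketch, not the author's implementation.

```java
import java.io.FilterWriter;
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;

// Hypothetical prolog-stripping writer; buffers only until the declaration has been seen.
class PrologStrippingWriter extends FilterWriter {
    private static final String DECL = "<?xml";
    private final StringBuilder pending = new StringBuilder();
    private boolean decided = false;

    PrologStrippingWriter(Writer out) {
        super(out);
    }

    @Override
    public void write(int c) throws IOException {
        write(new char[] {(char) c}, 0, 1);
    }

    @Override
    public void write(String s, int off, int len) throws IOException {
        write(s.toCharArray(), off, len);
    }

    @Override
    public void write(char[] buf, int off, int len) throws IOException {
        if (decided) {
            out.write(buf, off, len);      // prolog already handled: pass through
            return;
        }
        pending.append(buf, off, len);
        decide(false);
    }

    // Decides whether the buffered head is an XML declaration and, if so, drops it.
    private void decide(boolean atEnd) throws IOException {
        String head = pending.toString();
        boolean couldBeDecl = head.length() <= DECL.length()
                ? DECL.startsWith(head)
                : head.startsWith(DECL) && Character.isWhitespace(head.charAt(DECL.length()));
        if (couldBeDecl && !atEnd) {
            int end = head.indexOf("?>");
            if (head.length() <= DECL.length() || end < 0) {
                return;                    // declaration not complete yet: keep buffering
            }
            head = head.substring(end + 2);
        }
        decided = true;
        pending.setLength(0);
        out.write(head);
    }

    @Override
    public void close() throws IOException {
        if (!decided) {
            decide(true);                  // emit whatever is still buffered, unchanged
        }
        super.close();
    }

    // Convenience helper: writes the chunks through the filter and returns the result.
    static String strip(String... chunks) {
        StringWriter sink = new StringWriter();
        try (PrologStrippingWriter writer = new PrologStrippingWriter(sink)) {
            for (String chunk : chunks) {
                writer.write(chunk);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toString();
    }

    public static void main(String[] args) {
        System.out.println(strip("<?xml version=\"1.0\"?>", "<purchaseOrder/>"));
    }
}
```

One limitation of this sketch: a declaration that is started but never completed stays buffered until close() is called, so a caller that never closes the filter would lose that partial text.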
Listing 2. Exceptional cases during deserialization in PurchaseOrderSerializer
(b1) Null PO.
---------------------------
<po
xmlns:ns2="urn:raverun"
xsi:type="ns2:po"
xsi:null="true"/>
---------------------------
(b2) Non null but nothing submitted in the body.
---------------------------
<po
xmlns:ns2="urn:raverun"
xsi:type="ns2:po" />
---------------------------
(b3) PO that violates the schema.
---------------------------
<po
xmlns:ns2="urn:raverun"
xsi:type="ns2:po">
<foo bar="123"/>
</po>
---------------------------
|
Step 4: Write type mappings for the client and server pieces
Lastly, you need to declare the type mappings to reference your custom (de)serializer.
It might surprise you to see Section 5 specified as the encoding for PurchaseOrder.
This is done for convenience's sake, as it grants you the ability to use ParameterSerializer
to bootstrap the deserialization process and also to use SoapEncUtils in the
serialization code.
[Client]
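SOAPMappingRegistry smr = new SOAPMappingRegistry();
// pos is an instance of the custom (de)serializer, registered here as the serializer
PurchaseOrderSerializer pos = new PurchaseOrderSerializer();
smr.mapTypes(Constants.NS_URI_SOAP_ENC,
new QName("urn:raverun", "po"),
PurchaseOrder.class, pos, null);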
[Server]
<isd:map
encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"
xmlns:x="urn:raverun" qname="x:po"
javaType="com.raverun.po.castor.PurchaseOrder"
xml2JavaClassName="com.raverun.po.PurchaseOrderSerializer" />
|
Potential problems with this solution
You should keep the following issues in mind while examining the schema-constrained SOAP example:
- In order to be standards compliant, you must turn off any claims about
Section 5 encoding in the <po> element. (See Listing 3 for a more compliant
SOAP XML instance.) The SOAP 1.1 specification (see Resources)
describes this requirement as follows:
A value of the zero-length URI ("")
explicitly indicates that no claims are made for the encoding style of
contained elements.
An alternative to the null encodingStyle is to introduce
a custom encodingStyleURI, tailored to your communication needs.
- There are some bugs to watch out for in Castor, but all have workarounds.
If you're using a version of Castor older than 0.9.3, schema validation does not work as
expected. The solution is to upgrade to the latest release. On the other hand, Castor 0.9.3
(the version I used) generates a spurious message to the standard output stream.
The message I encountered:
Warning : preserved is a bad entry for the whiteSpace value.
|
The latest version of Castor, 0.9.3.9, suppresses this warning.
- PurchaseOrderSerializer does not serialize to multireference values.
However, it will deserialize them correctly. This is not a feature of
PurchaseOrderSerializer per se, but of ParameterSerializer.
Listing 3. A more compliant SOAP instance
<ns1:eatPo
xmlns:ns1="urn:poservice"
SOAP-ENV:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">
<po
xmlns:ns2="urn:raverun"
xsi:type="ns2:po"
SOAP-ENV:encodingStyle="">
<purchaseOrder xmlns="http://www.example.com/PO1">
<header PONumber="9999-1212">
<Date>2001-09-25T14:40:13.453</Date>
</header>
...
</purchaseOrder>
</po>
</ns1:eatPo>
|
Latest interface changes
The last official release of Apache SOAP (version 2.2) came out in May 2001. Although
development focus has shifted to Axis (currently at beta 1), bug fixes are
continually being added. While we await a 2.3 release (if there is one), users of the
official release should be aware that there have been major updates in the codebase,
especially in SOAPMappingRegistry and its related
classes. Existing code may need some changes to interoperate with the fixes.
Here is the list of notable changes:
- Schema namespaces now reference the 2001 recommendation namespace by default.
Version 2.2 referenced the 1999 namespace.
- As a corollary, if you instantiate
SOAPMappingRegistry with its no-arg constructor,
a 2001 namespace-aware instance is returned.
- Instance creation for
SOAPMappingRegistry has been redesigned according to the
static factory pattern.
Thus, you now should use the factory method getBaseRegistry(schemaURI)
instead of the overloaded constructor SOAPMappingRegistry(schemaURI):
public static SOAPMappingRegistry getBaseRegistry (String schemaURI);
|
- Version 2.2 offers the ability to chain registries. These methods were recently added:
public SOAPMappingRegistry(SOAPMappingRegistry parent);
public SOAPMappingRegistry(SOAPMappingRegistry parent, String schemaURI);
public SOAPMappingRegistry getParent();
public String getSchemaURI();
|
The resolution of type mappings will percolate up the chain until a match is found.
- The DeploymentDescriptor class treats the qname attribute as optional in type mapping declarations.
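The registry chaining mentioned in the list above, where resolution percolates up the chain until a match is found, can be modelled in a few lines. This is a sketch of the lookup idea only: ChainedRegistrySketch is a hypothetical class, it stores serializer names as plain strings, and it does not reproduce the SOAPMappingRegistry API beyond the getParent() accessor named in the text.

```java
import java.util.HashMap;
import java.util.Map;

class ChainedRegistrySketch {
    private final ChainedRegistrySketch parent;
    private final Map<Class<?>, String> serializerNames = new HashMap<>();

    // A null parent marks the top of the chain.
    ChainedRegistrySketch(ChainedRegistrySketch parent) {
        this.parent = parent;
    }

    ChainedRegistrySketch getParent() {
        return parent;
    }

    void mapType(Class<?> javaType, String serializerName) {
        serializerNames.put(javaType, serializerName);
    }

    // Looks in this registry first, then walks up the parents until a match is found.
    String querySerializer(Class<?> javaType) {
        for (ChainedRegistrySketch r = this; r != null; r = r.parent) {
            String name = r.serializerNames.get(javaType);
            if (name != null) {
                return name;
            }
        }
        throw new IllegalArgumentException("No serializer found for " + javaType.getName());
    }

    public static void main(String[] args) {
        ChainedRegistrySketch base = new ChainedRegistrySketch(null);
        base.mapType(String.class, "StringSerializer");
        base.mapType(Integer.class, "IntSerializer");

        ChainedRegistrySketch custom = new ChainedRegistrySketch(base);
        custom.mapType(Integer.class, "MyIntSerializer"); // overrides the parent's mapping

        System.out.println(custom.querySerializer(Integer.class)); // MyIntSerializer
        System.out.println(custom.querySerializer(String.class));  // StringSerializer
    }
}
```

A child registry therefore only needs to hold the mappings that differ from its parent's.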
Conclusion
I hope that the examples in this article have made clear the theoretical concepts outlined in the first article in this series. If Web services operating across many machines on the network are to become a widespread reality, developers must understand how programmatic objects are transmitted from one machine to another. A better understanding of SOAP's type mapping abilities should help you build better distributed applications and services.
Resources
About the author
Gavin Bong is a Java developer from Kuala Lumpur, Malaysia.
His areas of interest include service-oriented architectures and wireless Java. You can contact Gavin at gavinb@eutama.com.