Level: Intermediate
Benoit Marchal (bmarchal@pineapplesoft.com), Consultant, Pineapplesoft
13 Feb 2004
In this tip, Benoît discusses the different solutions available for passing binary data (typically files) to a Web service.
The evolution of Web service protocols has gone from supporting very simple
requests with simple parameters to fully supporting modern,
object-oriented languages. XML-RPC, arguably one of the earliest forms of
Web services, only supported simple types -- strings, integers, booleans, and
the like. SOAP took this one step further with its encoding rules for
objects. The last step -- efficient support for binary data -- came with SOAP
with attachments.
SOAP with attachments was originally introduced as
an extension to SOAP 1.1, and it is supported by the major SOAP kits.
Although SOAP 1.2, the official W3C release, does not support attachments
yet, work is under way to include them in the (ideally) near future.
Web services and binary data
I have little doubt that XML's success in
application integration comes from its reliance on a textual encoding (as
opposed to binary protocols such as CORBA, an object-oriented RPC standard, or
RMI, a Java-specific RPC standard). Textual encoding is preferable for several
reasons, the most critical of which may be that text is easier to
debug and that it is easier to write a special-purpose implementation when the need arises.
Still, the reliance on textual encoding has a darker
side: XML offers no efficient solution for including binary data.
According to the W3C XML Schema specification,
binary data should be encoded in base 64 or
hexadecimal. Unfortunately, base 64-encoded data is about one third larger than
non-encoded data (every 3 bytes become 4 characters). Hexadecimal encoding doubles the size. This overhead
is acceptable for small pieces of binary data, but it is clearly an issue
for larger sets.
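The size cost of each encoding is easy to check. The following is a minimal sketch, not code from the original tip: the class and method names are hypothetical, and it assumes a JDK that ships java.util.Base64 (Java 8 or later).

```java
import java.util.Base64;

// Illustrative helper (hypothetical, not part of the original tip):
// compares the size of binary data before and after textual encoding.
final class EncodingOverhead {

    // Base 64 emits 4 characters for every group of 3 bytes (the last group is padded).
    static int base64Length(int rawBytes) {
        return 4 * ((rawBytes + 2) / 3);
    }

    // Hexadecimal emits 2 characters for every byte.
    static int hexLength(int rawBytes) {
        return 2 * rawBytes;
    }

    public static void main(String[] args) {
        byte[] data = new byte[3000]; // stand-in for a small binary file
        String encoded = Base64.getEncoder().encodeToString(data);
        System.out.println("raw bytes:     " + data.length);
        System.out.println("base 64 chars: " + encoded.length());
        System.out.println("hex chars:     " + hexLength(data.length));
    }
}
```

For 3000 bytes of input, the base 64 text is 4000 characters long and the hexadecimal text is 6000 characters long.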
Binary data is useful in many applications. For example:
- Security applications need keys, hashes, certificates, and the encrypted data itself.
- Multimedia applications work with photos, music, and movies.
- In some applications, an XML representation of the data is deemed too inefficient -- CAD/CAM comes to mind.
- Thousands of file formats predate XML: word processing, spreadsheets,
fonts, vector graphics, genealogy, and many others.
While it is possible to create XML versions of these file formats (similar to SVG for
vector graphics), binary data has been around for a long time and
will likely remain popular.
Finally, there is the issue of XML itself! It
is not trivial to include an XML document inside another XML document (the
syntactically correct solution relies on CDATA sections and character
escaping).
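To make the two techniques concrete, here is a rough sketch of both ways to embed one XML document in another. The class and method names are hypothetical, and the escaping covers element content only (not attribute values).

```java
// Illustrative helper (hypothetical, not part of the original tip).
final class XmlEmbedding {

    // Option 1: escape the markup characters so the inner document
    // travels as plain character data.
    static String escape(String xml) {
        return xml.replace("&", "&amp;")   // must come first
                  .replace("<", "&lt;")
                  .replace(">", "&gt;");
    }

    // Option 2: wrap the inner document in a CDATA section. CDATA sections
    // cannot nest, so any "]]>" in the content is split across two sections.
    static String cdata(String xml) {
        return "<![CDATA[" + xml.replace("]]>", "]]]]><![CDATA[>") + "]]>";
    }

    public static void main(String[] args) {
        String inner = "<note>1 & 2</note>";
        System.out.println(escape(inner));
        System.out.println(cdata(inner));
    }
}
```

Either way, the receiver gets a string that it must parse a second time, which is part of why this is not trivial.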
MIME and base 64
To clear up a frequent source of confusion: MIME does not mandate base 64 encoding.
Specifically, HTTP implementations send attachments unencoded; only mail
clients encode attachments, to work around limitations in SMTP (so
mail attachments offer no size gain over base 64 in XML).
To address the needs of all these applications, Web
services must support binary data efficiently. The proposed solution is
SOAP with attachments which, in a nutshell, removes binary
information from the XML payload and stores it directly in the HTTP
request as multipart/related MIME content.
Your options, when designing a Web service that works with binary data, are:
- If the dataset is small, you might consider base 64 encoding within the XML
payload; the overhead is less of a problem with small datasets.
- If the dataset is larger, an attachment is the only practical option.
Listing 1 is a SOAP request with a base 64-encoded parameter. Note the address element.
Listing 1. base 64-encoded parameter
POST /ws/retrieve HTTP/1.0
Content-Type: text/xml; charset=utf-8
Accept: application/soap+xml, multipart/related, text/*
Host: localhost:8080
SOAPAction: ""
Content-Length: 540

<?xml version="1.0" encoding="UTF-8"?>
<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/"
                  xmlns:xsd="http://www.w3.org/2001/XMLSchema"
                  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
   <soapenv:Body>
      <ps:retrieve
            soapenv:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"
            xmlns:ps="http://psol.com/2004/ws/retrieve">
         <address xsi:type="xsd:base64Binary">d3d3Lm1hcmNoYWwuY29t</address>
      </ps:retrieve>
   </soapenv:Body>
</soapenv:Envelope>
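On the receiving side, the content of the address element in Listing 1 must be decoded back to bytes. The following is a minimal sketch with a hypothetical class name. It assumes java.util.Base64 (Java 8 or later) and assumes the decoded bytes are UTF-8 text, which the listing does not state.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Illustrative helper (hypothetical, not part of the original tip).
final class AddressDecoder {

    // Decodes an xsd:base64Binary value and interprets the bytes as UTF-8 text.
    // Treating the bytes as text is an assumption made for this example only.
    static String decode(String base64Text) {
        byte[] raw = Base64.getDecoder().decode(base64Text);
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // value of the address element in Listing 1
        System.out.println(decode("d3d3Lm1hcmNoYWwuY29t")); // prints www.marchal.com
    }
}
```

In practice a SOAP toolkit performs this decoding for you when it maps xsd:base64Binary to a byte array.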
Implementing attachments
Attachments are available to Java developers through both JAX-RPC (the Java API for
XML-based RPC) and SAAJ (SOAP with Attachments API for Java). Don't
let the SAAJ acronym fool you: JAX-RPC supports attachments (see Resources
for an example). The difference between JAX-RPC and SAAJ is the level of
abstraction, not the capabilities.
JAX-RPC is a high-level API that's more abstract than SAAJ.
It hides most of the protocol-oriented aspects of SOAP
behind an RMI layer. The developer works on Java objects and the
pre-processor turns them into SOAP nodes. JAX-RPC uses the java.awt.Image and javax.activation.DataHandler
classes to represent attachments.
SAAJ is closer to the protocol. It takes more work to create a SOAP message with SAAJ
than with JAX-RPC (and furthermore it offers no automatic link to WSDL), so in most cases you will
want to use JAX-RPC. Still, the low-level aspects of SAAJ make it more suitable
for illustrating how attachments really work. Listing
2 is a SOAP request with an attachment. The request asks the server to
resize a photo; because photo files are large, an attachment is more
efficient.
Listing 2. Attachment parameter
POST /ws/resize HTTP/1.0
Content-Type: multipart/related; type="text/xml";
    start="<EB6FC7EDE9EF4E510F641C481A9FF1F3>";
    boundary="----=_Part_0_7145370.1075485514903"
Accept: application/soap+xml, multipart/related, text/*
Host: localhost:8080
SOAPAction: ""
Content-Length: 1506005

------=_Part_0_7145370.1075485514903
Content-Type: text/xml; charset=UTF-8
Content-Transfer-Encoding: binary
Content-Id: <EB6FC7EDE9EF4E510F641C481A9FF1F3>

<?xml version="1.0" encoding="UTF-8"?>
<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/"
                  xmlns:xsd="http://www.w3.org/2001/XMLSchema"
                  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
   <soapenv:Body>
      <ps:resize
            soapenv:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"
            xmlns:ps="http://psol.com/2004/ws/resize"
            xmlns:SOAP-ENC="http://schemas.xmlsoap.org/soap/encoding/">
         <source href="cid:E1A97E9D40359F85CA19D1B8A7C52AA3"/>
         <percent>20</percent>
      </ps:resize>
   </soapenv:Body>
</soapenv:Envelope>
------=_Part_0_7145370.1075485514903
Content-Type: image/jpeg
Content-Transfer-Encoding: binary
Content-Id: <E1A97E9D40359F85CA19D1B8A7C52AA3>

note: binary data deleted...
------=_Part_0_7145370.1075485514903--
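In Listing 2, the href attribute of the source element and the Content-Id header of the image part carry the same identifier in two notations: the cid URL drops the angle brackets. The following sketch shows the mapping. The class and method names are hypothetical, and it deliberately ignores percent-escaping in the URL, which a complete implementation would have to undo.

```java
// Illustrative helper (hypothetical, not part of the original tip).
final class ContentIdResolver {

    private static final String SCHEME = "cid:";

    // Turns a cid URL such as "cid:abc" into the Content-Id header value "<abc>".
    // Simplification: percent-escapes in the URL are not decoded.
    static String toContentIdHeader(String href) {
        if (!href.startsWith(SCHEME)) {
            throw new IllegalArgumentException("not a cid URL: " + href);
        }
        return "<" + href.substring(SCHEME.length()) + ">";
    }

    public static void main(String[] args) {
        // href value from Listing 2
        System.out.println(toContentIdHeader("cid:E1A97E9D40359F85CA19D1B8A7C52AA3"));
    }
}
```

A receiver can use the resulting value to find the MIME part whose Content-Id header matches.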
Listing 3 illustrates the
creation of the SOAP request. The request asks a server
to resize an image. The procedure is as follows:
- Create SOAP connection and SOAP message objects through factories.
- Retrieve the message body from the message object
(intermediary steps: retrieve the SOAP part and envelope).
- Create a new XML element to represent the request and set the encoding style.
- Create the attachment and initialize it with a DataHandler object.
- Create more elements to represent the two parameters (source and percent).
- Associate the attachment with the first parameter by adding an href attribute. The attachment is referred to through a cid (content-id) URL.
- Set the value of the second parameter directly as text and call the service.
The service replies with
the resized image, again as an attachment. To retrieve it, first
test for a SOAP fault (which indicates an error). If there is no fault,
retrieve the attachment as a file and process it.
Listing 3. Using SAAJ
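import java.io.File;
import java.util.Iterator;
import javax.activation.DataHandler;
import javax.activation.FileDataSource;
import javax.xml.soap.AttachmentPart;
import javax.xml.soap.MessageFactory;
import javax.xml.soap.SOAPBody;
import javax.xml.soap.SOAPBodyElement;
import javax.xml.soap.SOAPConnection;
import javax.xml.soap.SOAPConnectionFactory;
import javax.xml.soap.SOAPElement;
import javax.xml.soap.SOAPEnvelope;
import javax.xml.soap.SOAPException;
import javax.xml.soap.SOAPMessage;
import javax.xml.soap.SOAPPart;

public File resize(String endPoint, File file)
   throws SOAPException
{
   // create the connection and an empty message through factories
   SOAPConnection connection =
      SOAPConnectionFactory.newInstance().createConnection();
   SOAPMessage message = MessageFactory.newInstance().createMessage();
   // walk down to the body: SOAP part, then envelope, then body
   SOAPPart part = message.getSOAPPart();
   SOAPEnvelope envelope = part.getEnvelope();
   SOAPBody body = envelope.getBody();
   // the element that represents the request
   SOAPBodyElement operation =
      body.addBodyElement(
         envelope.createName("resize",
                             "ps",
                             "http://psol.com/2004/ws/resize"));
   operation.setEncodingStyle("http://schemas.xmlsoap.org/soap/encoding/");
   // the attachment wraps the file through a DataHandler
   DataHandler dh = new DataHandler(new FileDataSource(file));
   AttachmentPart attachment = message.createAttachmentPart(dh);
   SOAPElement source = operation.addChildElement("source", ""),
               percent = operation.addChildElement("percent", "");
   message.addAttachmentPart(attachment);
   // the first parameter points to the attachment through a cid URL
   source.addAttribute(envelope.createName("href"),
                       "cid:" + attachment.getContentId());
   // the second parameter is plain text
   percent.addTextNode("20");
   SOAPMessage result = connection.call(message, endPoint);
   part = result.getSOAPPart();
   envelope = part.getEnvelope();
   body = envelope.getBody();
   // a fault indicates an error; otherwise read the first attachment
   if(!body.hasFault())
   {
      Iterator iterator = result.getAttachments();
      if(iterator.hasNext())
      {
         dh = ((AttachmentPart)iterator.next()).getDataHandler();
         String fname = dh.getName();
         if(null != fname)
            return new File(fname);
      }
   }
   return null;
}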
Listing 3 makes it clear that the attachment lives outside the XML message.
This is what makes attachments efficient: the binary data is never encoded as text.
Speaking of efficiency, take a look at Listing 4, which illustrates the more common (and dramatically
shorter) JAX-RPC version of Listing 3. The JAX-RPC precompiler generates
a stub that greatly simplifies coding.
You pass a DataHandler object as a method parameter and
JAX-RPC automatically generates the attachment.
Listing 4. The more efficient JAX-RPC
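import java.io.File;
import java.rmi.RemoteException;
import javax.activation.DataHandler;
import javax.activation.FileDataSource;
import javax.xml.rpc.ServiceException;

// AttachmentService, AttachmentServiceLocator, and AttachmentTip
// are classes generated by the JAX-RPC precompiler
public File resize(File file)
   throws ServiceException, RemoteException
{
   AttachmentService service = new AttachmentServiceLocator();
   AttachmentTip port = service.getAttachmentTip(); // get stub
   DataHandler dh = new DataHandler(new FileDataSource(file));
   DataHandler result = port.resize(dh, 20);
   return new File(result.getName());
}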
Conclusion
Choice is good, and SOAP gives you a choice when working with binary data: You
can either encode it as base 64 within the XML payload, which is good for
small datasets, or you can attach larger binary files, unencoded, to the request.
Resources