Tip: Make your CGI scripts available via XML-RPC

Providing a programmatic interface to Web services

Level: Introductory

David Mertz (mertz@gnosis.cx), Interfacer, Gnosis Software, Inc.

30 Apr 2003

For a large class of CGI scripts, it is both easy and useful to provide an alternate XML-RPC interface to the same calculation or lookup. If you do this, other developers can quickly utilize the information you provide within their own larger applications. This tip shows you how.

Many CGI scripts are, at their heart, just a form of remote procedure call. A user specifies some information, perhaps in an HTML form, and your Web server returns a formatted page that contains an answer to their inquiry. The data on this return page is surrounded by some HTML markup, but basically it is the data that is of interest. Examples of data-oriented CGI interfaces are search engines, stock price checks, weather reports, personal information lookup, catalog inventory, and so on.

A Web browser is a fine interface for human eyes, but a returned HTML page is not an optimal format for integration within custom applications. What programmers often do to utilize the data that comes from CGI queries is screen-scraping of returned pages -- that is, they look for identifiable markup and contents, and pull data elements from the text. But screen-scraping is error-prone; page layout might change over time or might be dependent on the specific results. A more formal API is better for programmatic access to your CGI functionality.

XML-RPC is specifically designed to enable application access to queryable results over an HTTP channel. Its sibling, SOAP, can do a similar job, but the XML format of SOAP is more complicated than most purposes require. An ideal system is one where people can make queries in a Web browser, while custom applications can make the same queries using XML-RPC. The underlying server can do almost exactly the same thing in either case.
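To make the "over an HTTP channel" part concrete, here is a small sketch of what an XML-RPC call looks like before it is sent as the body of an HTTP POST. It assumes Python 3, where the client library is named xmlrpc.client (the 2003-era listings in this tip use the older xmlrpclib name). The method name and argument values are borrowed from the example later in this tip; nothing is sent over the network.

```python
import xmlrpc.client

# Encode a call to a method named "anonym" with two string arguments.
# No network traffic happens here; dumps() only builds the XML payload.
request_xml = xmlrpc.client.dumps(
    ("perm", "mertz@gnosis.cx"), methodname="anonym"
)
print(request_xml)

# Decoding the payload recovers the parameters and the method name,
# which is what a server-side library does on receipt.
params, method = xmlrpc.client.loads(request_xml)
print(method, params)
```

The printed payload is a short methodCall document with one param element per argument, which is a good deal less to read than an equivalent SOAP envelope.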

An example

I have created a service within my Web site that enables users to send e-mail to anonymized recipients. Rather than a traceable address, recipients can create a local anonym where they can get mail. You can read about the goals and architecture of Gnosis-Anon at its home page (see Resources). At the same URL, you can enter a query into an HTML form, and in return be presented with an HTML page informing you of an anonym. From there, you need to either write down the information or cut-and-paste the information into a tool other than your Web browser.

Suppose you want to utilize the anonym automatically in an application such as a Mail User Agent (MUA) or Mail Transport Agent (MTA). You might do some screen-scraping like the following:


Listing 1. get-anonym-cgi.py
                
    #!/usr/bin/env python

    from urllib import urlencode, urlopen
    from sys import argv

    base_url = 'http://gnosis.cx/cgi-bin/encode_address.cgi'
    query = urlencode({'duration':argv[1], 'email':argv[2]})
    html_answer = urlopen(base_url+'?'+query).readlines()
    result = "NO ANONYM FOUND!"
    for line in html_answer:
        if line.find("<dt>Anonym:</dt>") >= 0:
            start = line.find('<dd>')+4
            end = line.find('</dd>')
            result = line[start:end]
            break
    print result

You can run this with a command line like the following:


Listing 2. Running get-anonym-cgi
    % get-anonym-cgi.py perm mertz@gnosis.cx
    .rNCOolqsVQYu@gnosis.cx

This works if I do not change the format of the HTML -- but that's a big if. A more robust (and simpler) client application might look like this:


Listing 3. get-anonym-xmlrpc.py
                
    #!/usr/bin/env python

    import sys, xmlrpclib

    server = xmlrpclib.Server("http://gnosis.cx:8000")
    print server.anonym(sys.argv[1], sys.argv[2])

This XML-RPC application behaves exactly the same as the CGI-based one -- except that it will not break if the layout of the Web page changes slightly.
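To see how easily the scraping approach of Listing 1 can break, consider the sketch below. It reuses the same search logic on two pages that carry the same data. Both HTML snippets are hypothetical: the first is only a guess at markup the scraper would accept, and the second imagines a redesign that moves the result into a table. The sketch is written for Python 3.

```python
def scrape_anonym(html_lines):
    """Pull the anonym out of a page using the same logic as Listing 1."""
    for line in html_lines:
        if line.find("<dt>Anonym:</dt>") >= 0:
            start = line.find("<dd>") + 4
            end = line.find("</dd>")
            return line[start:end]
    return "NO ANONYM FOUND!"

# Hypothetical markup: label and value sit together on one line.
page_v1 = [
    "<dl>",
    "<dt>Anonym:</dt><dd>.rNCOolqsVQYu@gnosis.cx</dd>",
    "</dl>",
]

# Hypothetical redesign: the same data, now presented in a table.
page_v2 = [
    "<table>",
    "<tr><th>Anonym</th><td>.rNCOolqsVQYu@gnosis.cx</td></tr>",
    "</table>",
]

print(scrape_anonym(page_v1))  # .rNCOolqsVQYu@gnosis.cx
print(scrape_anonym(page_v2))  # NO ANONYM FOUND!
```

The data never changed, only its presentation, yet the scraper silently stops finding it. An XML-RPC client does not depend on the page layout at all.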





Setting up the XML-RPC server

Writing an XML-RPC server is not much different from writing a CGI script. The actual calculation or lookup code is identical; you only need to change the format of the output and do a little extra work parsing the inputs for CGI. My CGI script looks something like this:


Listing 4. encode_address.py
                
    #!/usr/bin/env python

    import cgi

    query = cgi.FieldStorage()
    email = query.getvalue('email','test@test.lan')
    duration = query.getvalue('duration', 'Unknown')
    anonym = FIND_THE_ANONYM(duration, email)
    html_template = open('template').read()
    html = html_template % (email, anonym, duration)
    print "Content-Type: text/html"
    print
    print html

This leaves out the details of how FIND_THE_ANONYM() works and what the HTML template looks like, but those details are unimportant here. An XML-RPC server is even easier to program:


Listing 5. anonym-xmlrpc-server.py
                
    #!/usr/bin/env python

    from SimpleXMLRPCServer import SimpleXMLRPCServer

    class Anonym:
        def anonym(self, duration, email):
            return FIND_THE_ANONYM(duration, email)
        def container_test(self):
            return {'spam':'eggs', 'bacon':'toast'}

    server = SimpleXMLRPCServer(('', 8000))
    server.register_instance(Anonym())
    server.serve_forever()

As you can see, the same lookup function is used; its return value is what a remote call to the .anonym() method receives. On the wire, return values are encoded as XML-RPC, but Python's xmlrpclib module automatically translates XML-RPC encoded structures back into native data structures, as do analogous libraries in other languages. The method .container_test() in Listing 5 can be called remotely as well, in which case a Python client will see a dictionary.
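For readers on Python 3, where the modules are named xmlrpc.server and xmlrpc.client rather than SimpleXMLRPCServer and xmlrpclib, a self-contained sketch of Listing 5 together with a client might look like the following. Several things here are assumptions made only so the sketch can run on one machine: find_the_anonym() is a hypothetical stand-in for the real lookup (which this tip does not show), the server binds to 127.0.0.1 on a port chosen by the operating system, and it runs in a background thread instead of as its own process.

```python
import threading
import xmlrpc.client
from xmlrpc.server import SimpleXMLRPCServer

def find_the_anonym(duration, email):
    # Hypothetical stand-in for FIND_THE_ANONYM(); not the real algorithm.
    return "anonym-for-%s-%s" % (duration, email)

class Anonym:
    def anonym(self, duration, email):
        return find_the_anonym(duration, email)

    def container_test(self):
        return {'spam': 'eggs', 'bacon': 'toast'}

# Port 0 asks the operating system for any free port (an assumption made
# for this demo; Listing 5 uses the fixed port 8000).
server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
server.register_instance(Anonym())
port = server.server_address[1]

# Serve in a daemon thread so the client below can run in the same script.
thread = threading.Thread(target=server.serve_forever, daemon=True)
thread.start()

client = xmlrpc.client.ServerProxy("http://127.0.0.1:%d" % port)
anonym = client.anonym("perm", "mertz@gnosis.cx")
container = client.container_test()
print(anonym)
print(container)  # an ordinary dict on the client side

server.shutdown()
server.server_close()
```

The client never sees XML: the string and the dictionary arrive as native Python values, which is the automatic translation described above.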





A few notes

These code examples use Python, but implementations of both XML-RPC clients and servers exist for a large number of programming languages. Moreover, XML-RPC itself is completely language-neutral; multiple clients written in different languages can call the same server, and none of them will care what language the server was written in.
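The language neutrality comes from the fact that return values travel as a small set of plain XML types. As an illustration, the sketch below (Python 3, using xmlrpc.client) encodes the dictionary returned by .container_test() in Listing 5 as a response document and then decodes it again. It only builds and parses the XML; no server is involved.

```python
import xmlrpc.client

# The dictionary returned by .container_test() in Listing 5.
result = {'spam': 'eggs', 'bacon': 'toast'}

# methodresponse=True wraps the value in a <methodResponse> document;
# the value has to be passed as a one-element tuple.
response_xml = xmlrpc.client.dumps((result,), methodresponse=True)
print(response_xml)

# Any XML-RPC library can turn the <struct> back into its own native
# mapping type; in Python that is a dict.
decoded, method = xmlrpc.client.loads(response_xml)
print(decoded[0])
```

The dictionary appears in the payload as a struct element with named members, a form that a client library in another language would map to its own hash or map type.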

There is a difference in the way a CGI script runs and the way this XML-RPC server runs. The XML-RPC server is its own long-running process (and uses its own port). CGI scripts, on the other hand, are launched on demand by a general-purpose HTTP server. Both kinds of request still travel over HTTP (or HTTPS), so any issues with firewalls, statefulness, and the like remain identical. Moreover, some general-purpose HTTP servers support XML-RPC internally. But if, like me, you do not control the configuration of your Web host, it is easier to write a stand-alone XML-RPC server like the short one in Listing 5.



Resources



About the author

David Mertz knows a little bit about a lot of things, but a lot about fewer things than he once did. The smooth overcomes the striated. David can be reached at mertz@gnosis.cx; his life pored over at http://gnosis.cx/dW/.



