Proofing Web applications for performance and scalability

An XML-based scripting language that calls on Java objects can prove Web software reliable

Level: Introductory

Frank Cohen (fcohen@inclusion.net), Cofounder, Inclusion.net

05 Jun 2001

Testing for performance and scalability using a combination of XML and Java technologies is essential in the fast-paced world of Web-application development. In this article, Frank Cohen outlines a conceptual framework for building Web software, explains why the combination of Java objects and an XML-based scripting language works well for testing, offers such practical Web-software test methods as state and boundary testing, and introduces an open-source set of tools and a scripting language called Load to help with your testing.

The benefits of being able to test Web-oriented applications under real-life stresses and situations should be evident at this point in the Internet revolution.

From the user's perspective, a Web application that works well enough to keep users coming back is one that works well whether the user is accessing all components or various combinations of components. It is an application that works well whether one user is accessing it or 10,000 users are putting it through its paces.

And the business/organization perspective goes hand-in-hand with the user perspective because users are the core, the draw for any online venture, whether the model is subscription-based or advertising-driven, whether the payoff is in cash, data, or community-building.

More important for the purposes of this article, though, is the perspective of the Java developer. Here the benefit lies in determining whether the developer's hard work has produced an application that works well in the world (for instance, one that offers seamless operation throughout the system and predictable behavior under any conditions). The payoff for the Java developer comes in the form of salary, invaluable experience, and career advancement.

Regardless of the perspective that is most important to you, a single theme emerges -- an application that is available to the world and the Web must be tested under the stresses and conditions that the world would impose upon it. With that in mind, I'd like to discuss strategies to effectively and efficiently perform those tests and suggest where the Java language fits into that testing strategy, as well as offer a hybrid open-source set of tools and scripting language called Load that can help with load-based testing.

A structure: The Flapjacks architecture

An emerging conceptual design practice for Web-application architectures consists of deploying many small servers that are accessed through a load balancer. Together, these servers provide a front end to a powerful database server.

The analogy goes like this:

Hungry patrons (Web users) show up for breakfast at your diner. The more that come for breakfast, the more flapjacks (servers) you have to toss on the griddle.

Some of the patrons want banana pancakes, some want blueberry, some want plain with just a hint of vanilla, and many will want to sample several flavors. Likewise, some users will require access to servers running Java-language servlets, some will need application servers, and others will be looking for CGI servers, but most will favor a combination.

All the flapjacks are dished out from the same batter. In this analogy, the batter functions as a single database for persistent storage, search, indexing, etc.

The waitress takes orders (user requests) and passes them to the appropriate cook (say, the blueberry pancake chef), much like a load balancer routes Web browsers to the appropriate server (or an alternate, if necessary).

I call this the Flapjacks architecture. And it is becoming popular.

Users tend to get faster performance from the small, inexpensive servers. Java engineers find debugging less complex because fewer threads are typically running at any time. Company financial managers like the architecture because they can buy lots of small, inexpensive servers and avoid the giant system purchases. Network managers like Flapjacks for the flexibility all those small servers give.



Testing Flapjacks Web applications

User expectations for high quality and availability are on the rise, too, somewhat paralleling the growing use of the Flapjacks architecture. Java developers should keep the following checklist in mind when planning to test a Web application that uses the Flapjacks architecture:

  1. State machine testing. If I put something into a shopping basket, is the product still there when I check out?

  2. Really long session testing. If I sign in at 9:15 a.m., what happens when I request a function at 3:15 p.m. using the same session identifier?

  3. Hordes of savage users testing. The Huns are at the gates. Will your application survive?

  4. Privileged testing. What happens when the everyday user tries to access a control that is authorized only for administrators?

  5. Speed testing. Is your application taking too long to complete a task?

These are fairly common tests for any software application. Since this is a Web application, though, the testing arena expands into a matrix, as the following table of a Web-application test suite describes.

More users means testing proceeds on a matrix

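Test matrix

+-----------------------------+--------------------------+
| Test                        |     Concurrent users     |
|                             +--------+--------+--------+
|                             |    1   |   50   |  100   |
+-----------------------------+--------+--------+--------+
| State machine test          |        |        |        |
| Really long session test    |        |        |        |
| Hordes of savage users test |        |        |        |
| Privileged functions test   |        |        |        |
| Speed test                  |        |        |        |
+-----------------------------+--------+--------+--------+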
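
One way to keep track of the matrix is to generate every test and load combination in code, so that no cell is skipped. The following is only a sketch: the class name TestMatrix and its methods are hypothetical and are not part of Load.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Load: enumerates the cells of the test matrix.
class TestMatrix {
    // The five test types from the checklist.
    static final String[] TESTS = {
        "State machine", "Really long session", "Hordes of savage users",
        "Privileged functions", "Speed"
    };

    // The concurrent-user levels from the matrix columns.
    static final int[] USER_LEVELS = {1, 50, 100};

    // Returns one entry per matrix cell: every test at every user level.
    static List<String> cells() {
        List<String> cells = new ArrayList<>();
        for (String test : TESTS) {
            for (int users : USER_LEVELS) {
                cells.add(test + " test, concurrent users: " + users);
            }
        }
        return cells;
    }

    public static void main(String[] args) {
        for (String cell : cells()) {
            System.out.println(cell);
        }
    }
}
```

Five tests at three load levels give 15 cells, each of which needs its own run and its own recorded result.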


The road from Java-based to script languages

Traditional arrangements in software testing use lots of "beta" testers to uncover bugs and other undocumented features. Eventually, though, you run out of friends and family. So developing and using automated test suites is a must.

My first attempt at writing test suites resulted in a fairly robust set of Java classes that could issue HTTP requests to a Web server and do some simple analysis of the results. Writing a test suite consisted of writing Java code that sequentially called the correct Java object.

For example, a suite that read through an online discussion forum message base looked like this:

  1. Read the first page of the site (which included URLs to the discussion messages).
  2. Randomly read a discussion message.
  3. If the message has a reply, then read the reply message.
  4. Repeat step 3 until there are no more reply messages.

Every new test suite brought me back to writing Java code. While the test objects -- getting Web pages, signing in, testing for the correct return values -- stayed the same from test suite to test suite, the calling sequence was always different.
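
To make the problem concrete, here is a rough sketch of what such a hard-coded suite can look like. It is not the original test code: the PageReader interface and the in-memory "site" are hypothetical stand-ins for an object that issues real HTTP requests.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

// Hypothetical sketch: a forum-reading suite whose calling sequence is
// hard-coded in Java. Changing the sequence means changing this code.
class ForumReaderSuite {
    // Stand-in for a real HTTP test object; returns the links found on a page.
    interface PageReader {
        List<String> linksOn(String url);
    }

    // Runs steps 1-4 and returns the number of pages read.
    static int run(PageReader reader, String homeUrl, Random random) {
        int pagesRead = 1;                                  // step 1: read the first page
        List<String> messages = reader.linksOn(homeUrl);
        if (messages.isEmpty()) {
            return pagesRead;
        }
        // Step 2: pick a discussion message at random.
        String current = messages.get(random.nextInt(messages.size()));
        while (current != null) {
            pagesRead++;                                    // read the message or reply
            List<String> replies = reader.linksOn(current); // step 3: look for a reply
            current = replies.isEmpty() ? null : replies.get(0); // step 4: repeat
        }
        return pagesRead;
    }

    public static void main(String[] args) {
        Map<String, List<String>> site = new HashMap<>();
        site.put("/home", List.of("/msg1"));
        site.put("/msg1", List.of("/reply1"));
        site.put("/reply1", List.of());
        PageReader inMemory = url -> site.getOrDefault(url, List.of());
        System.out.println("pages read: " + run(inMemory, "/home", new Random()));
    }
}
```

The test object (PageReader) would be reusable as is, but the sequence inside run() has to be rewritten for every new suite.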

The Java language does not provide a common way to define the calling sequence of Java objects. Every programmer has to invent a way to call objects in a particular sequence. As a result, many programmers invent a simple scripting language inside their code. Some of the more advanced applications even expose a scripting language to users. In the whole of computing, there are literally thousands of applications with their own scripting languages.

Looking at my test suites, it seemed like developing a scripting language to assemble test suites built from Java objects would be beneficial. The Java objects would perform the individual test routines, routines that rarely need to change. And a scripting language -- such as an XML-based one -- would be used to assemble and control the parameters of the objects, a scripting language that would be easier to alter for different levels and types of Web application testing.

The benefit (for me and for other Java programmers): A programmer could write the more labor-intensive Java-code objects only once and adapt the scripting language as needed. The scripting language enables engineers who are more comfortable with scripts than with Java code to write their own test suites. The XML-based scripts are readable by mere human beings and can easily be shared and swapped with others.

The key to making Web application test suites in this form lies with the ability of XML to standardize scripting languages -- to provide a common reference for the way notations, variables, and program flow are noted.
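
The division of labor can be sketched in a few lines of Java. This is an illustration only, not Load's implementation: the tag names (suite, signin, fetch) and the class ScriptRunner are made up for the example. The XML supplies the calling sequence, and the Java code decides what each tag does.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical sketch: walks an XML script and records one call per tag.
class ScriptRunner {
    // Parses the script and returns the calls in the order the script lists them.
    static List<String> run(String xml) {
        Element root;
        try {
            root = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("script is not well-formed XML", e);
        }
        List<String> calls = new ArrayList<>();
        visit(root, calls);
        return calls;
    }

    private static void visit(Element element, List<String> calls) {
        // A real runner would invoke a Java test object here; this sketch only records.
        switch (element.getTagName()) {
            case "signin":
                calls.add("sign in as " + element.getAttribute("user"));
                break;
            case "fetch":
                calls.add("fetch " + element.getAttribute("page"));
                break;
            default:
                break; // container tags such as <suite> do no work themselves
        }
        NodeList children = element.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child instanceof Element) {
                visit((Element) child, calls);
            }
        }
    }

    public static void main(String[] args) {
        String script = "<suite><signin user=\"alice\"/>"
                + "<fetch page=\"/forum/home.html\"/></suite>";
        run(script).forEach(System.out::println);
    }
}
```

Reordering or repeating the tags in the script changes the calling sequence without touching the Java code.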



How XML standardizes scripting languages

Software tools always seem to reach a point where a macro or scripting language makes sense. I take a scripting language as a sign that a tool is becoming mature.

However, the downside to scripting languages is that each exists in its own little world. Every scripting language has its own way of noting instructions, variables, and program flow. This individuality can undermine a scripting language's usefulness in efficiently building Web application test suites.

But XML has the potential to standardize scripting languages. XML is already a widely adopted standard for defining properties files, storing short runs of data, and exchanging data between servers.

XML is also easily applied to building scripting languages. For example, here is a way to model the online discussion forum reader that I discussed earlier in XML:


Listing 1. A real-world, XML-based scripting example

<load>
  <script>
    <session host="http://www.inbuilders.com">
      <sought_item value="mid=" terminator="&amp;"/>
      <repeat>
        <get path="GET /inbuilders/home.html?rid=R4 HTTP/1.0"/>
        <repeat>
          <get path="GET /inbuilders/home.html?rid=R4&amp;mid=$founditem[$random[$foundcount;];]; HTTP/1.0"/>
        </repeat>
      </repeat>
    </session>
  </script>
</load>

XML is a nice language because it is readable by human beings. Every XML tag has a corresponding closing tag -- <load> and </load> -- or a closing "/" at the end of the opening tag -- <sought_item ... />.

This script is processed from the top down and then in sections. The first two tags are simply a preamble that tells us to expect script tags defined and handled by any program that knows the load script format.

The <session> tag identifies the Web server this script will interact with. In this case it is the INBuilders.com site, a real Internet site. You can go there and look at the HTML source of the home.html page. In the source you should find HTML reference tags like this:

<a href="/inbuilders/home.html?mid=M383&rid=R4">

This reference is a link to a reply message. There are going to be as many of these as there are messages and replies on the home.html page.

<sought_item> identifies a search phrase. This phrase searches for the mid=M383 value for a reply message. When a Web page is received through the <get> tag, any sought_items are stored in a list of found items. The second or inner <get> tag then uses the found items to get a random reply message.

$founditem[$random[$foundcount;];];

$foundcount; returns the number of found items from the first <get> tag. $random[10]; returns a random number between 1 and 10. The combination returns one of the found items randomly. Put this into the second <get> tag and the script will randomly step through the replies to the message.
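
In Java terms, the search and the random selection might look like the sketch below. This is not Load's code: the class FoundItems is hypothetical, the second link (M384) is made-up sample data, and the sketch assumes that item numbers run from 1 to the found count.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical sketch of the <sought_item> search and the random item lookup.
class FoundItems {
    // Collects the text that follows each occurrence of `value`, up to `terminator`.
    static List<String> find(String page, String value, String terminator) {
        if (value.isEmpty()) {
            throw new IllegalArgumentException("value must not be empty");
        }
        List<String> found = new ArrayList<>();
        int start = page.indexOf(value);
        while (start >= 0) {
            int from = start + value.length();
            int end = page.indexOf(terminator, from);
            if (end < 0) {
                break; // no terminator after this match: stop searching
            }
            found.add(page.substring(from, end));
            start = page.indexOf(value, end);
        }
        return found;
    }

    // Picks one found item at random, mirroring the nested script expression.
    static String randomItem(List<String> found, Random random) {
        if (found.isEmpty()) {
            throw new IllegalArgumentException("no found items");
        }
        int foundCount = found.size();               // plays the role of $foundcount;
        int number = 1 + random.nextInt(foundCount); // plays the role of $random[n]; (assumed 1..n)
        return found.get(number - 1);                // plays the role of $founditem[number];
    }

    public static void main(String[] args) {
        String page = "<a href=\"/inbuilders/home.html?mid=M383&rid=R4\">"
                + "<a href=\"/inbuilders/home.html?mid=M384&rid=R4\">";
        List<String> found = find(page, "mid=", "&");
        System.out.println(found); // prints [M383, M384]
        System.out.println(randomItem(found, new Random()));
    }
}
```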

Why develop a scripting language with XML?

There are some advantages to developing scripting languages using XML:

  • XML includes a definition language (DTD) in which you describe what is acceptable within a script. When writing a script compiler or evaluator, you don't have to write your own code to find most syntax errors. A validating XML parser catches them by checking the script against the DTD.

  • XML is widely adopted, and mature tools for editing and working with XML scripts abound.

  • XML is shareable. Just like in the early days of the Web when HTML authors would share tips and techniques by looking at the HTML source of a Web page, now XML script authors are able to share their scripts.
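
The first advantage can be tried out with the XML parser that ships with the JDK. In the sketch below, the DTD is invented for illustration and is not the DTD that Load uses; it covers only the tags needed to fetch a page. The class name ScriptValidator is also hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical sketch: lets a validating parser find script errors.
class ScriptValidator {
    // An invented DTD for a small subset of script tags (an assumption).
    static final String DTD =
          "<!DOCTYPE load [\n"
        + "  <!ELEMENT load (script)>\n"
        + "  <!ELEMENT script (session+)>\n"
        + "  <!ELEMENT session (get*)>\n"
        + "  <!ATTLIST session host CDATA #REQUIRED>\n"
        + "  <!ELEMENT get EMPTY>\n"
        + "  <!ATTLIST get path CDATA #REQUIRED>\n"
        + "]>\n";

    // Returns true when the script is well formed and valid against the DTD.
    static boolean isValid(String script) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(true); // turn on DTD validation
            DocumentBuilder builder = factory.newDocumentBuilder();
            builder.setErrorHandler(new DefaultHandler() {
                @Override
                public void error(SAXParseException e) throws SAXParseException {
                    throw e; // validity errors are only reported by default, so rethrow
                }
            });
            builder.parse(new InputSource(new StringReader(DTD + script)));
            return true;
        } catch (Exception e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String good = "<load><script><session host=\"http://www.example.com\">"
                + "<get path=\"GET / HTTP/1.0\"/></session></script></load>";
        String missingHost = "<load><script><session>"
                + "<get path=\"GET / HTTP/1.0\"/></session></script></load>";
        System.out.println(isValid(good));
        System.out.println(isValid(missingHost));
    }
}
```

The script evaluator itself contains no syntax checks here; the parser rejects both malformed XML and scripts that break the DTD's rules.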

Java objects for Web application testing

The following list describes the objects available in the Load open-source software to build Web-application test suites.

HTTP object. Models Web pages, handles cookies and log-in sessions, supports HTTP 1.0 and 1.1, identifies hosts and pages, identifies a list of search phrases, and issues GET and POST commands.

Datasource object. Provides file and URL data input, row/field data, and cursor movement through data.

Lingo. Creates dummy text that is used to post sample messages and comments.

Random. Used to randomly navigate through Web URLs.

OnError. Exception and error-handling objects.

Script. Loads the scripts that a Script object references from a file or URL, for remote operation and multiple concurrent sessions.

Log. Logs errors and status to file and URL destinations.

Extensible. An easily extended interface for you to add your own test objects.

Why not use Java code for everything?

If you are a skilled, experienced Java engineer with years of coding experience, you might be wondering if using XML-based scripting with Java objects is worthwhile.

Let me point out that you are one of fewer than 500,000 people in the world today who know how to write production-quality Java code. There are 100 times as many engineers who know how to compose shell scripts. Imagine enabling the script engineers of the world to use your application through a simple XML-based scripting language. Suddenly, your Java-based objects are available to a tremendously wider audience of technical users.

Additionally, your application is going to need a way to schedule and call objects anyway. Your choice of XML to schedule objects is a choice that makes XML even more of a standard.



Load: An XML-script, Java-object combo in action

So far I've described the advantages of building an XML-based scripting language. The scripts make calls to Java objects that actually do the heavy lifting. To demonstrate this in action, I'd like to introduce Load, an open-source set of tools and a scripting language.

Load is a utility program written in the Java language that simulates Web browser users. Load is currently in version 2 (Alpha 5). The script library includes the XML-based Load script and 30 Java-based objects. It also comes with a set of scripts that you can customize for your own test suites.

I recently used Load to test the time it took a Web application to dynamically create a Web page containing a long list of messages posted to a Web-based bulletin board. The test focused on determining how long the pages would take to create under the simulated use of 50, 500, and 1,000 concurrent users. I wrote a Load script to sign in to the bulletin board and randomly read through a thread of discussions. The script simulated the various levels of concurrent users.

Load is licensed under terms similar to those of the Apache Web server. You download the Load program and all its source code. You can change Load to fix bugs and add new features. The license even lets you build commercially exploitable new products based on the code.

Now let's examine some of the Java objects necessary to implement test suites for Web applications.

HTTP object

These are test suites for Web applications, so we begin with an object that represents an HTTP Web page. This object has methods:

  • To define the host and page
  • To identify a list of search phrases
  • To handle cookies
  • To support HTTP 1.0 and 1.1 protocols for GET and POST operations

The <session> script tag instantiates an HTTP object. Listing 2 shows an example.


Listing 2. GETting a Web page

<session host="http://www.inbuilders.com">
  <get path="GET /inbuilders/home.html?rid=R4 HTTP/1.0"/>
</session>
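
Note that the path attribute in Listing 2 holds a complete HTTP request line: a method, a resource, and a protocol version. How Load's HTTP object handles that line internally is not shown here; the sketch below is one hypothetical way a test object could split it and combine the resource with the session host. The class name RequestLine is an assumption.

```java
import java.net.URI;

// Hypothetical sketch: splits a request line and resolves it against the host.
class RequestLine {
    final String method;   // for example GET or POST
    final URI target;      // the host combined with the requested resource
    final String version;  // for example HTTP/1.0

    RequestLine(String method, URI target, String version) {
        this.method = method;
        this.target = target;
        this.version = version;
    }

    // host is the <session> host attribute; path is the <get> path attribute.
    static RequestLine parse(String host, String path) {
        String[] parts = path.trim().split("\\s+");
        if (parts.length != 3) {
            throw new IllegalArgumentException("expected: METHOD resource VERSION");
        }
        return new RequestLine(parts[0], URI.create(host).resolve(parts[1]), parts[2]);
    }

    public static void main(String[] args) {
        RequestLine line = parse("http://www.inbuilders.com",
                "GET /inbuilders/home.html?rid=R4 HTTP/1.0");
        System.out.println(line.method + " " + line.target + " " + line.version);
    }
}
```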

Multiple sessions

The <script> tag instantiates an object that runs the entire script. The tag takes an optional sessions attribute that instantiates multiple simultaneous copies of the script. Listing 3 instantiates five copies of the script. Each copy runs concurrently in a separate thread.


Listing 3. Five users GETting a Web page

<script sessions="5">
  <session host="http://www.inbuilders.com">
    <get path="GET /inbuilders/home.html?rid=R4 HTTP/1.0"/>
  </session>
</script>
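
The idea of running several copies of a script at once, each in its own thread, can be sketched with the standard java.util.concurrent classes. This is not how Load is implemented; the class ConcurrentSessions is hypothetical, and the script body is reduced to a Runnable.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: runs N copies of a script body, one thread per copy,
// and reports how long each copy took in milliseconds.
class ConcurrentSessions {
    static List<Long> run(int sessions, Runnable script) {
        ExecutorService pool = Executors.newFixedThreadPool(sessions);
        try {
            Callable<Long> timedCopy = () -> {
                long start = System.nanoTime();
                script.run();
                return (System.nanoTime() - start) / 1_000_000;
            };
            // Submit every copy first so that they run concurrently.
            List<Future<Long>> pending = new ArrayList<>();
            for (int i = 0; i < sessions; i++) {
                pending.add(pool.submit(timedCopy));
            }
            // Then wait for each copy and collect its elapsed time.
            List<Long> elapsedMillis = new ArrayList<>();
            for (Future<Long> copy : pending) {
                elapsedMillis.add(copy.get());
            }
            return elapsedMillis;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for sessions", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a session failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // Five simulated users; the "script" here only prints a line.
        List<Long> times = run(5, () -> System.out.println("GET /inbuilders/home.html?rid=R4"));
        System.out.println("elapsed ms per session: " + times);
    }
}
```

Recording the elapsed time of each copy is what turns a concurrency run into the kind of speed measurement described earlier.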

Expression evaluator

An expression evaluator object provides a simple lexicon of parameter tags to work with the data you receive during the test. For example, $random[10]; is evaluated and replaced with a random whole integer between 0 and 10. All told, more than 20 expressions are built into the evaluator object, including:

  • $foundcount;, which returns the number of items found during the previous GET operation.

  • $repeat;, which returns the whole integer number of iterations from a <repeat> tag.

  • $derive[ value_sought, search_string ];, which looks through the search string for a series of name=value pairs. When the name equals the value sought, derive returns the value found. For example:

    <variable name="igid_var" value="$derive[ page, 5 ];"/>

  • $lingomessage; and $lingosubject;, which return a random English-like sample message and subject line.

Variables

A variable object provides simple creation, assignment, and retrieval of variables. For example, to define the variable found_var, you would use this:

<variable name="found_var" value="100"/>

Later, you can reference $found_var; to retrieve the value of this variable (which is currently set to 100).
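
A variable store with $name; substitution takes only a few lines of Java. The sketch below is not Load's evaluator: the class Variables is hypothetical, and it handles only plain references such as $found_var;, not expressions that take arguments in brackets.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: stores variables and expands simple $name; references.
class Variables {
    // Matches a dollar sign, a name made of word characters, and a semicolon.
    private static final Pattern REFERENCE = Pattern.compile("\\$(\\w+);");

    private final Map<String, String> values = new HashMap<>();

    // Equivalent in spirit to <variable name="..." value="..."/>.
    void set(String name, String value) {
        values.put(name, value);
    }

    // Replaces each $name; with its value; unknown names are left untouched.
    String expand(String text) {
        Matcher matcher = REFERENCE.matcher(text);
        StringBuffer out = new StringBuffer();
        while (matcher.find()) {
            String value = values.get(matcher.group(1));
            String replacement = (value != null) ? value : matcher.group();
            matcher.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        matcher.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Variables variables = new Variables();
        variables.set("found_var", "100");
        System.out.println(variables.expand("count=$found_var;")); // prints count=100
    }
}
```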

Log object

The log object provides a mechanism to see the results of your script. The log object enables results to be displayed on the screen or saved to a text file.

The log object enables you to set the detail of the results based on a scale from 0 to 6.

  • 0 = Don't log any messages.
  • 1 = Log critical error messages.
  • 2 = Log error messages.
  • 3 = Log warning messages.
  • 4 = Log informational messages.
  • 5 = Log obnoxious, tedious, and debugging messages.
  • 6 = Log the next level -- Ludicrous. (Spaceballs fans will recall Ludicrous speed as being even better than hyperspeed.)

For example, <property name="log_level" value="3"/> will output critical, error, and warning messages.
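
The threshold rule behind log_level is simple to express in Java. The sketch below is an illustration rather than Load's log object: the class LevelLog is hypothetical, and it keeps messages in memory instead of writing them to the screen or a file.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: keeps a message when its level is at or below log_level.
class LevelLog {
    private final int logLevel;
    private final List<String> lines = new ArrayList<>();

    LevelLog(int logLevel) {
        if (logLevel < 0 || logLevel > 6) {
            throw new IllegalArgumentException("log_level must be between 0 and 6");
        }
        this.logLevel = logLevel;
    }

    // Level 1 is critical, 2 is error, 3 is warning, and so on up to 6.
    void log(int level, String message) {
        if (level >= 1 && level <= logLevel) {
            lines.add(level + ": " + message);
        }
    }

    // Returns the messages that passed the threshold, in order.
    List<String> lines() {
        return lines;
    }

    public static void main(String[] args) {
        LevelLog log = new LevelLog(3);
        log.log(1, "critical error");
        log.log(3, "warning");
        log.log(4, "informational"); // dropped: 4 is above log_level 3
        log.lines().forEach(System.out::println);
    }
}
```

With a log_level of 0 nothing passes the check, which matches the "don't log any messages" setting.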



Conclusion

Programming and delivering production-quality Web applications is made easier and faster when you test quality under the stress of multiple concurrent users. After all, once out in the real world, conditions will not hesitate to test your Web application under the toughest loads possible.

In the code, every engineer needs to invent a mechanism for calling objects in a sequence unique to each Web application. XML is an ideal means to standardize the code driving the calling sequence of your objects, not to mention it makes your objects available to a wider range of developers.

The philosophy behind the open-source utility Load -- combining XML, scripting, and Java test objects -- can offer a way to make your work more productive when testing Web applications.



About the author

Frank Cohen is a software entrepreneur who has contributed to the worldwide success of personal computers since 1975. He combines skills in software design, technology architecture, and graphic arts to visualize new products and then to lead a business to produce the finished marketable work. He began by writing operating systems for microcomputers, helped to establish video games as an industry, helped establish the Norton Utilities franchise, led Apple's efforts into middleware and Internet technologies, helped establish one of the first e-commerce businesses, and most recently, was cofounder and chief technology officer for Inclusion.net, a company that builds easy, fast, and efficient collaborative workplaces for business users. Contact Frank at fcohen@inclusion.net.



