Level: Introductory
Robert Williamson (robbiew@us.ibm.com), Software Engineer, Linux Technology Center, IBM
30 Jun 2004
Automating software testing allows you to run the same tests over a period of time, ensuring that you are really comparing apples to apples and oranges to oranges. In this article, Linux Test Project team
members share their methodology and rationale, as well as the scripts and tools they use to stress-test the Linux® kernel.
In testing the stability of Linux kernel
releases, there is a need to clearly state and document why the
release is stable or unstable. And yet no documented and proven, system-wide
stress test exists currently that can test the stability of the Linux kernel in its
entirety. This article provides a method for creating
a system-wide Linux stress test and proving the legitimacy of the results.
Different Linux developers, users, and distributions use their
own methods for testing kernel stability. However, information about
how they decide which tests to run, which kernel code is covered, and
what stress levels are attained is unpublished, which greatly reduces
the value of the results.
Using lab machines and tests available for Linux from the Linux Test
Project test suite, we developed a combination of tests, based on system
resource utilization statistics, to adequately stress the system. We analyzed this
combination test to determine which sections of the Linux
kernel get exercised during test execution. Afterwards, we modified the combination
test to allow the highest percentage of code coverage, while
maintaining the high level of system stress desired. The final result is
a stress test that covers enough of the Linux kernel to be useful for
stability statements, and that has the system usage and kernel code coverage
data to support it.
The four steps to this combination test method are: test selection,
system resource utilization evaluation, kernel code coverage analysis, and
final stress test evaluation.
Selecting tests
Test selection involves selecting tests that accomplish two things:
- The tests should allow the attainment of high-resource utilization
levels for main kernel areas, such as the CPU(s), memory, I/O, and
networking.
- The tests should adequately cover the kernel code to help support
the stability statement produced from their results.
Whenever possible, use tests that are automated or
easily modified to support automation. Automation allows for quicker and
repeatable testing, and helps reduce the risk of human error. Using
applications that allow free publication of results is another
consideration when selecting suitable tests. It is good to choose tests and test suites
that adhere to the open source
methodology and/or GPL to help ensure an easy publication process.
Evaluating system resource utilization
The combination of selected tests must adequately stress
the system's resources. Four primary areas of the Linux kernel
can affect system response and execution time:
- CPU: Time spent processing data on the CPU(s) of the
machine
- Memory: Time spent reading and writing data to and from real
memory
- I/O: Time spent reading and writing data to and from disk
storage
- Networking: Time spent reading and writing data to and from
the network
Test designers should use the following two well-known and widely used open
source Linux resource monitoring tools to evaluate the resource
utilization levels. (For links to download both
of these tools, please see Resources later in this article.)
- top: An open source tool maintained by Albert D. Cahalan. This tool
is included in most Linux
distributions and works on the current 2.4 and 2.6 kernels.
- sar: Another open source tool; this one is maintained by
Sebastien Godard. This tool is also included in most Linux distributions
and works on the current 2.4 and 2.6 kernels.
This system resource utilization evaluation phase of the method usually
requires multiple attempts at getting the right combination of tests that
will achieve the desired level of utilization. Over-utilization is always
a concern when deciding on the combination of tests. For example,
choosing a combination that is too I/O bound can create poor results for
the CPU, and vice versa. This part of the method consists primarily of a
large amount of trial and error, until the desired levels for all
resources are attained.
The top tool is useful for quickly determining
which resources (CPU, memory, or I/O) each test affects and how much of
them it utilizes in a real-time fashion. The sar
tool is useful for gathering network utilization statistics and recording
snapshots of all utilization data to a file over a period of time.
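As a rough illustration of recording snapshots with sar, the following Python sketch only builds the command lines; it does not run them. The function names, the file name, and the sampling interval are assumptions made for this example. The options used (-o to save samples, -f to read them back, -u, -r, -b, and -n DEV to select a report) are standard sar options, but check the man page for the version installed on your system.

```python
import shlex


def sar_record_command(outfile, interval_s, count):
    """Build a sar command that samples every interval_s seconds, count times,
    and saves the collected data to outfile (-o)."""
    return ["sar", "-o", outfile, str(interval_s), str(count)]


def sar_report_command(infile, report):
    """Build a sar command that reads saved data (-f) and prints one report."""
    flags = {
        "cpu": ["-u"],             # CPU utilization
        "memory": ["-r"],          # memory utilization
        "io": ["-b"],              # I/O and transfer rates
        "network": ["-n", "DEV"],  # per-interface network statistics
    }
    return ["sar", "-f", infile] + flags[report]


if __name__ == "__main__":
    # Hypothetical values: one sample every 120 seconds, 720 samples (24 hours).
    print(shlex.join(sar_record_command("stress.sar", 120, 720)))
    for name in ("cpu", "memory", "io", "network"):
        print(shlex.join(sar_report_command("stress.sar", name)))
```

Keeping the recording step separate from the reporting step means one saved file can be queried for each of the four resources after the run.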
Once a
combination is chosen, a test must be run for an extended amount of
time to accurately evaluate the resource utilization. The amount of time
to run the test depends on the length of each test. Assuming that
multiple tests are being executed concurrently, the amount of time must be
long enough to allow the longest of all these tests to complete. The sar
tool should also be running during this evaluation. At the conclusion of
the evaluation run, you should gather and evaluate the utilization levels
for all four resources.
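The requirement that the run last until the longest concurrent test completes can be sketched in Python as follows. This is a minimal sketch, not the LTP harness: run_concurrently is a name invented here, and the demo commands are stand-ins for real test programs.

```python
import subprocess
import sys
import time


def run_concurrently(commands):
    """Start every command at once and wait until the slowest one finishes.

    Returns (exit_codes, elapsed_seconds)."""
    start = time.monotonic()
    procs = [subprocess.Popen(cmd) for cmd in commands]
    exit_codes = [p.wait() for p in procs]  # returns only after all have ended
    return exit_codes, time.monotonic() - start


if __name__ == "__main__":
    # Placeholder "tests" of different lengths; substitute real test commands.
    demo = [
        [sys.executable, "-c", "import time; time.sleep(0.2)"],
        [sys.executable, "-c", "import time; time.sleep(0.5)"],
    ]
    codes, elapsed = run_concurrently(demo)
    print(codes, round(elapsed, 1))
```

The elapsed time is governed by the slowest command, which is why the evaluation window has to be sized to the longest test in the combination.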
The following example shows sar output for CPU, memory, and network
utilization:
Listing 1. Example output from sar
10:48:27  CPU   %user  %nice  %system  %iowait   %idle
10:48:28  all    0.00   0.00     0.00     0.00  100.00
10:48:29  all    3.00   0.00     1.00     0.00   96.00
10:48:30  all  100.00   0.00     0.00     0.00    0.00
10:48:31  all  100.00   0.00     0.00     0.00    0.00

02:27:31  kbmemfree  kbmemused  %memused  kbswpfree  kbswpused  %swpused
02:29:31     200948      53228     20.94     530104          0      0.00
02:31:31     199136      55040     21.65     530104          0      0.00
02:33:31     198824      55352     21.78     530104          0      0.00
02:35:31     199200      54976     21.63     530104          0      0.00

02:27:31  IFACE  rxpck/s  txpck/s   rxbyt/s    txbyt/s
02:29:31   eth0   738.79   741.66  76025.55  136941.85
02:31:31   eth0   743.30   744.97  76038.82  136907.77
02:33:31   eth0   744.80   745.02  76135.53  136901.38
02:35:31   eth0   742.35   744.34  75947.45  136864.77
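To turn output like Listing 1 into utilization figures, a small parser is enough. The Python sketch below is one possible approach, with function names chosen for this example. It assumes whitespace-separated columns and 24-hour timestamps, as in the listing; real sar output may also contain "Average:" rows or AM/PM markers, which this sketch does not handle.

```python
def parse_sar_block(text):
    """Parse one sar report (a header row plus data rows) into a list of dicts.

    The first column is the timestamp; the other header tokens name the columns."""
    lines = [ln.split() for ln in text.strip().splitlines() if ln.strip()]
    names = lines[0][1:]
    rows = []
    for fields in lines[1:]:
        row = {"time": fields[0]}
        for name, value in zip(names, fields[1:]):
            try:
                row[name] = float(value)
            except ValueError:
                row[name] = value  # non-numeric columns such as "all" or "eth0"
        rows.append(row)
    return rows


def mean(rows, column):
    """Average a numeric column over all rows."""
    return sum(r[column] for r in rows) / len(rows)


# The CPU and memory samples from Listing 1.
CPU = """
10:48:27 CPU %user %nice %system %iowait %idle
10:48:28 all 0.00 0.00 0.00 0.00 100.00
10:48:29 all 3.00 0.00 1.00 0.00 96.00
10:48:30 all 100.00 0.00 0.00 0.00 0.00
10:48:31 all 100.00 0.00 0.00 0.00 0.00
"""
MEM = """
02:27:31 kbmemfree kbmemused %memused kbswpfree kbswpused %swpused
02:29:31 200948 53228 20.94 530104 0 0.00
02:31:31 199136 55040 21.65 530104 0 0.00
02:33:31 198824 55352 21.78 530104 0 0.00
02:35:31 199200 54976 21.63 530104 0 0.00
"""

cpu_rows = parse_sar_block(CPU)
mem_rows = parse_sar_block(MEM)
print("average CPU busy %:", 100.0 - mean(cpu_rows, "%idle"))
print("average memory used %:", mean(mem_rows, "%memused"))
```

Here CPU utilization is taken as 100 minus the %idle column, which is a common convention rather than something sar reports directly.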
Analyzing kernel code coverage
Achieving adequate kernel coverage is another responsibility of a system stress test. Although the chosen combination of tests extensively
utilizes the four main resources, it may only be executing a small subset
of the kernel. Thus, you should analyze coverage to ensure that the
combination lends itself to being a system stress test, and not a system
load generator. Currently, two open source tools can help in
code coverage analysis of the Linux kernel:
- gcov: An open source tool maintained by the Linux Test Project.
This tool analyzes the coverage of the kernel, and reports what lines,
functions, and branches are covered and how many times they were hit.
- lcov: An open source tool developed by IBM and maintained by the
Linux Test Project. This tool consists of a set of Perl scripts that
build on the text-based gcov output to implement HTML-based output. The
output includes coverage percentages, graphs, and overview pages that
allow quick browsing of coverage data.
You can find both tools at the Linux Test Project (LTP) home page (see
Resources for a link).
After the gcov module is loaded, all tests run in the system stress test
combination must be executed. Although the original system stress test
can and should have concurrent executions, this run should be iterative.
Each test should be run once to completion, one after another, without
repetition of any test. The single, iterative run is an attempt to reduce
the amount of unpredictable and untargeted kernel code executions that
result from the kernel's attempt to load balance the multiple, concurrent
runs of the system stress test. You should run the gcov analysis after
the conclusion of the final test run. As the final step in formulating the data for analysis,
run the lcov tool and unload the gcov module.
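The single, iterative pass described above can be sketched in Python as follows. This is an illustration only: run_once_in_sequence is a name invented here, the demo commands are placeholders for real tests, and loading the gcov module, running lcov, and unloading the module are left out because they depend on the local setup.

```python
import subprocess
import sys


def run_once_in_sequence(tests):
    """Run each named test exactly once, one after another.

    tests: list of (name, argv) pairs. Raises ValueError if a name repeats."""
    names = [name for name, _ in tests]
    if len(set(names)) != len(names):
        raise ValueError("each test must appear only once in the coverage run")
    results = {}
    for name, argv in tests:
        # subprocess.run blocks until the test ends, so nothing overlaps.
        results[name] = subprocess.run(argv).returncode
    return results


if __name__ == "__main__":
    # Placeholder tests; substitute the commands from the stress combination.
    demo = [
        ("first", [sys.executable, "-c", "pass"]),
        ("second", [sys.executable, "-c", "pass"]),
    ]
    print(run_once_in_sequence(demo))
```

Rejecting repeated names enforces the "without repetition" rule, so each test contributes to the hit counts only once.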
The lcov tool generates an entire HTML tree that contains every line of
code in the kernel and data on how many times, if any, each line was
executed. The tool quantifies the coverage data and generates coverage
percentage numbers for each section and file of the kernel. The following
example shows a sample code coverage output:
Figure 1. Example of gcov output
The threshold for "adequate coverage" (shown in green) was defined by
the lcov maintainers, so the rating in the lcov example is only an
opinion. However, the included raw data allows any reviewer to make his
or her own judgment.
The test creator can now make changes to the combination of tests after
reviewing the coverage analysis, to change and/or increase the amount of
code covered.
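Reviewers who want to apply their own threshold to the raw data can compute the percentages themselves. The Python sketch below reads lcov's text tracefile format, in which SF: names a source file, each DA: record gives a line number and its execution count, and end_of_record closes the file's section. The function names, the sample data, and the 50 percent threshold are all made up for illustration.

```python
def coverage_by_file(tracefile_text):
    """Return {source_file: (lines_hit, lines_found)} from lcov tracefile text."""
    result = {}
    current = None
    hit = found = 0
    for line in tracefile_text.splitlines():
        line = line.strip()
        if line.startswith("SF:"):
            current, hit, found = line[3:], 0, 0
        elif line.startswith("DA:") and current is not None:
            # DA:<line number>,<execution count>
            count = int(line[3:].split(",")[1])
            found += 1
            if count > 0:
                hit += 1
        elif line == "end_of_record" and current is not None:
            result[current] = (hit, found)
            current = None
    return result


def percent(hit, found):
    """Line coverage as a percentage; 0.0 for a file with no instrumented lines."""
    return 100.0 * hit / found if found else 0.0


def files_below(coverage, threshold):
    """List the files whose line coverage is under the reviewer's threshold."""
    return sorted(f for f, (h, n) in coverage.items() if percent(h, n) < threshold)


# Made-up sample data, not real kernel coverage results.
SAMPLE = """
SF:kernel/fork.c
DA:10,5
DA:11,0
DA:12,3
DA:13,1
end_of_record
SF:mm/mmap.c
DA:1,0
DA:2,0
DA:3,7
DA:4,0
end_of_record
"""

coverage = coverage_by_file(SAMPLE)
for name, (hit, found) in sorted(coverage.items()):
    print(name, percent(hit, found))
print("below 50%:", files_below(coverage, 50.0))
```

Files that fall under the chosen threshold are natural candidates for adding or swapping tests in the combination.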
Evaluating the final stress test
The purpose of this final step is to verify the system stress test
itself. Execute the stress test on a
kernel believed to be stable; usually the kernel included in a
distribution will fill this requirement, but not always. Execute the stress test
over an extended period (minimum of 24 hours
recommended), with the sar tool running as well, for two reasons:
- The extended run will help find any problems within the combination
that would otherwise have gone unnoticed in a short "sniff test."
- The data produced from sar forms your baseline for comparison in
future test runs.
After the conclusion of the extended run, you are now able to
decide, based on all the data gathered, whether or not this test
combination is a good candidate for system stress testing.
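One way to use the sar baseline in later runs is to flag metrics that drift beyond a chosen tolerance. The Python sketch below is an assumption about how such a comparison might look, not part of the LTP tools; the metric values and the 10 percent tolerance are invented for the example.

```python
def compare_to_baseline(baseline, current, tolerance_pct):
    """Return {metric: (baseline_value, current_value)} for every metric that
    deviates from its baseline by more than tolerance_pct percent."""
    drifted = {}
    for metric, base in baseline.items():
        value = current[metric]
        if base == 0:
            changed = value != 0  # relative change is undefined for a zero baseline
        else:
            changed = abs(value - base) / abs(base) * 100.0 > tolerance_pct
        if changed:
            drifted[metric] = (base, value)
    return drifted


# Invented averages for a baseline run and a later run.
baseline = {"%memused": 21.5, "rxpck/s": 742.31, "%idle": 49.0}
later_run = {"%memused": 21.9, "rxpck/s": 520.0, "%idle": 50.0}
print(compare_to_baseline(baseline, later_run, 10.0))
```

In this made-up data only the network receive rate moves by more than 10 percent, so it is the only metric reported.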
Figure 2. Summary of design process
The Linux Test Project used this design method when designing the Linux
kernel stress test script ltpstress.sh. This application combines
multiple tests from different areas of LTP's test suite, along with
memory and network traffic load generators. Before executing, the test
adjusts its total memory usage according to how much real and virtual
memory exist on the system. This test script is available through the LTP
test suite (see Resources). The script was
created under controlled laboratory conditions to ensure the accuracy of
the results.
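How ltpstress.sh sizes its memory load is not reproduced here. As a loosely related sketch, the following Python code shows one way a test could discover the installed memory and swap on Linux, by parsing the MemTotal and SwapTotal fields of /proc/meminfo. Treating swap as the "virtual" part is an assumption of this example, and the sample text is made up to match the totals implied by Listing 1.

```python
def memory_totals_kb(meminfo_text):
    """Extract MemTotal and SwapTotal (in kB) from /proc/meminfo-style text."""
    totals = {}
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        if key in ("MemTotal", "SwapTotal"):
            totals[key] = int(rest.split()[0])
    return totals


# Sample text; on a Linux system, read the contents of /proc/meminfo instead.
SAMPLE_MEMINFO = """MemTotal:       254176 kB
MemFree:        200948 kB
SwapTotal:      530104 kB
SwapFree:       530104 kB
"""

totals = memory_totals_kb(SAMPLE_MEMINFO)
print(totals["MemTotal"], totals["SwapTotal"])
```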
The IBM Linux
Technology Center Test department uses this stress test, along with other
tools and tests, as a relatively quick and easy way to help validate the
stability of Linux kernel releases. Tests are conducted under
laboratory conditions, as well as under simulated customer scenarios, to help
ensure adequate coverage.
Resources
- Download the stress test shell script and a passel of other useful tests
at the Linux Test Project home page.
- The mission of the IBM Linux Technology Center is to work directly with
the Linux development community with a shared vision of making Linux
succeed.
- The OSDL's Linux Kernel Scalable Test Platform (STP) provides a
framework where developers can test kernel patches against an online
performance and scalability suite.
- The LTP stress test makes use of the utilities top (part of the procps
package) and sar (part of sysstat).
- Also, the LTP stress test makes good use of the GNU test coverage
program gcov and its Perl-based gcov results HTMLizer lcov.
- Kernel comparison: Improvements in kernel development from 2.4 to 2.6
(developerWorks, February 2004) takes a look at the tools, tests, and
techniques that helped make 2.6 a better kernel than any that have come
before it.
- Kernel comparison: Web serving on 2.4 and 2.6 (developerWorks, February
2004) presents results from the IBM Linux Technology Center's Web serving
testing efforts.
- In Improving Linux kernel performance and scalability (developerWorks,
January 2003), the Linux Technology Center Linux Kernel Performance team
discusses how to quantify Linux performance for the purpose of comparing
test results over time.
- Putting Linux reliability to the test (developerWorks, December 2003)
documents the test results and analysis of the Linux kernel and other core
OS components by the IBM Linux Technology Center.
- Inside the Linux kernel debugger (developerWorks, June 2003) shows you
how to trace kernel execution and examine its memory and data structures.
- Find more resources for Linux developers in the developerWorks Linux
zone.
- Browse for books on these and other technical topics.
- Develop and test your Linux applications using the latest IBM tools
and middleware with a developerWorks Subscription: you get IBM software from
WebSphere®, DB2®, Lotus®, Rational®, and Tivoli®, and a license to use the
software for 12 months, all for less money than you might think.
- Download no-charge trial versions of selected developerWorks
Subscription products that run on Linux, including WebSphere Studio Site
Developer, WebSphere SDK for Web services, WebSphere Application Server,
DB2 Universal Database Personal Developers Edition, Tivoli Access Manager,
and Lotus Domino Server, from the Speed-start
your Linux app section of developerWorks. For an even speedier start,
help yourself to a product-by-product collection of how-to articles and
tech support.
About the author
Robbie Williamson is a Staff Software Engineer in the IBM Linux
Technology Center. He graduated from the University of Texas with a B.A.
in Computer Science in 2000. During his career, he has worked as a support
technician, verification engineer, and developer for various
implementations of UNIX. Robbie is currently one of the maintainers for
the Linux Test Project and can be reached at robbiew@us.ibm.com.
Note: This article represents the view of the author and does not necessarily
represent the view of IBM. The findings discussed in this article are based on a solution that was created
and tested under laboratory conditions. These findings may not be realized
in all customer environments, and implementation in such environments may
require additional steps, configurations, and performance analysis. The
information herein is provided AS IS with no warranties, express or
implied. This information does not constitute a specification or form
part of the warranty for any IBM products. Implementation and
certification of the solution rests on the implementation team.