Zap bugs with PMD

Find bugs before they bite with this handy static analysis tool

Level: Intermediate

Elliotte Rusty Harold (elharo@metalab.unc.edu), Adjunct Professor, Polytechnic University

07 Jan 2005

PMD, an open source static analysis tool, can be a worthwhile addition to your bug-zapping arsenal. Elliotte Rusty Harold explains how to use PMD's built-in rules and your own custom rule sets to improve the quality of your Java code.

Tom Copeland's PMD is an open source (BSD license) tool that analyzes Java source code to find potential bugs. It's similar in general purpose to tools like FindBugs and Lint4j (see Resources). However, all of these tools tend to find different bugs, so it's profitable to run each of them across any given code base. In this article, I'll explain how to use PMD and show you what to expect from it. This article explores PMD's command-line interface. You can also integrate PMD with Ant for automatic source-code checking, and plug-ins exist for most major IDEs and programmer's editors.

Installing and running PMD

PMD is written in the Java programming language and requires JDK version 1.3 or later. If you're comfortable using the command line, PMD is straightforward to install and run. Download the zip file (see Resources) and unzip it to any convenient location, such as /usr or your home directory. For this article's purposes, assume you've unzipped it into /usr.

The easiest way to run PMD is to invoke the pmd.sh script (Unix/Linux) or the pmd.bat script (Windows). Unconventionally, you'll find these scripts in the pmd-2.1/etc directory rather than a bin directory. Each script takes three command-line arguments:

  • The path to the .java file to check
  • The keyword html or xml to indicate the output format
  • The names of the rule sets to run

For example, this command checks the ImageGrabber.java file using the naming rule set and produces XML output:

$ /usr/pmd-2.1/etc/pmd.sh ImageGrabber.java xml rulesets/naming.xml
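If you would rather launch the script from a Java program than from a shell, the three arguments can be assembled as in the following minimal sketch. The PmdCommand class and its build method are hypothetical helpers, not part of PMD; the script path assumes the /usr/pmd-2.1 install location used in this article, and multiple rule sets are joined with commas as the script expects.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper (not part of PMD) that assembles the pmd.sh command line. */
class PmdCommand {

    /** Returns the argument vector: script, target, output format, comma-joined rule sets. */
    static List<String> build(String script, String target, String format, List<String> ruleSets) {
        List<String> command = new ArrayList<>();
        command.add(script);
        command.add(target);
        command.add(format);
        command.add(String.join(",", ruleSets));
        return command;
    }

    public static void main(String[] args) throws Exception {
        List<String> command = build("/usr/pmd-2.1/etc/pmd.sh", "ImageGrabber.java",
                "xml", Arrays.asList("rulesets/naming.xml"));
        // inheritIO() sends PMD's report to this process's standard output.
        Process process = new ProcessBuilder(command).inheritIO().start();
        System.out.println("PMD exited with status " + process.waitFor());
    }
}
```

Passing the arguments as a list rather than one concatenated string avoids shell-quoting problems when a path contains spaces.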




Analyzing the results

The output from the above command, sent to System.out by default, looks like the report in Listing 1:


Listing 1. A PMD XML report
<?xml version="1.0"?>
<pmd>
<file name="/Users/elharo/src/ImageGrabber.java">
<violation line="32" rule="ShortVariable" 
           ruleset="Naming Rules" priority="3">
Avoid variables with short names like j
</violation>
<violation line="105" rule="VariableNamingConventionsRule" 
           ruleset="Naming Rules" priority="1">
Variables that are not final should not contain underscores 
(except for underscores in standard prefix/suffix).
</violation>
</file>
<error filename="/Users/elharo/src/ImageGrabber.java" 
  msg="Error while processing /Users/elharo/ImageGrabber.java"/>
</pmd>

You can see in Listing 1 that PMD has found two problems: a short variable name in line 32 of ImageGrabber.java and a name that contains an underscore in line 105. These might seem like minor issues, but the results can be surprising. In this case, the underscore in line 105 was just some easy-to-fix detritus from 10-year-old code. But inspecting the first problem led to the realization that I could eliminate the j variable completely, because it was duplicating the functionality of another variable that was being incremented separately. The program worked, but it was a lot more brittle than it should have been against future changes. Every line of code you eliminate is one less place for bugs to sneak in.
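Because the report is plain XML, it is easy to post-process. The following sketch reads a report shaped like Listing 1 with the JDK's DOM parser and produces one summary line per violation. The PmdReportReader class is a hypothetical helper, and it assumes only the element and attribute names visible in Listing 1 (violation, line, rule, priority).

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

/** Hypothetical helper that summarizes a PMD XML report shaped like Listing 1. */
class PmdReportReader {

    /** Returns one "line N [rule, priority P]: message" string per violation element. */
    static List<String> summarize(String reportXml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(
                    new ByteArrayInputStream(reportXml.getBytes(StandardCharsets.UTF_8)));
            NodeList violations = doc.getElementsByTagName("violation");
            List<String> result = new ArrayList<>();
            for (int i = 0; i < violations.getLength(); i++) {
                Element violation = (Element) violations.item(i);
                // Collapse the line breaks inside the message text into single spaces.
                String message = violation.getTextContent().trim().replaceAll("\\s+", " ");
                result.add("line " + violation.getAttribute("line")
                        + " [" + violation.getAttribute("rule")
                        + ", priority " + violation.getAttribute("priority") + "]: " + message);
            }
            return result;
        } catch (Exception ex) {
            // Wrap parser and I/O failures so callers need not handle checked exceptions.
            throw new IllegalArgumentException("Could not read PMD report", ex);
        }
    }
}
```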

You can redirect the PMD output to a file or pipe it into an editor in the usual way. I often prefer to generate the output in HTML and load it into a Web browser, as shown in Figure 1.


Figure 1. PMD output in rendered HTML
[Image: PMD sample output]

Sending the output to a file is especially helpful when you're checking a source tree. If you pass a directory name, zip file, or JAR archive file as the first argument, then PMD recursively checks every .java file in that directory or archive. The sheer amount of output can be a bit intimidating, especially when PMD generates a large number of false positives. For instance, when I run PMD on the XOM code base (see Resources), it constantly reports that I should "Avoid variables with short names like in." I happen to think that "in" is a perfectly fine name for a variable that points to an InputStream. Nonetheless, if you inspect the output in a decent text editor, you'll normally find it easy to recognize and delete the most frequent false positives, because they tend to be quite similar; then you can fix any remaining issues.
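The manual clean-up described above can also be scripted. The sketch below drops violation messages that match an ignore list you have already reviewed and keeps everything else. FalsePositiveFilter is a hypothetical helper, not a PMD feature, and it matches whole messages exactly so that ignoring "like in" does not also hide other short names.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical helper that drops violation messages already judged to be false positives. */
class FalsePositiveFilter {

    /** Keeps only the messages that are not exactly equal to an ignored message. */
    static List<String> filter(List<String> messages, List<String> ignoredMessages) {
        Set<String> ignored = new HashSet<>(ignoredMessages);
        List<String> kept = new ArrayList<>();
        for (String message : messages) {
            // Exact matching after trimming avoids accidentally hiding similar messages.
            if (!ignored.contains(message.trim())) {
                kept.add(message);
            }
        }
        return kept;
    }
}
```

Keeping the ignore list in a file under version control gives you a record of which warnings were dismissed deliberately, which is useful if you later change your mind about one of them.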

One feature that's sorely lacking in PMD is the ability to add a "lint comment" to the source code to indicate that you really did mean to perform some apparently dangerous operation. On the other hand, maybe this is a feature, not a bug. On more than one occasion I've changed my mind about what's really a false positive and decided that PMD was right all along. For instance, for a long time a try-catch block like this one occurred in various places in XOM:

try {
  this.data = data.getBytes("UTF8");
}
catch (UnsupportedEncodingException ex) {
  // All VMs support UTF-8
}

PMD flagged this as an empty catch block. This seemed like a nonissue until I discovered that some VMs in fact don't recognize the UTF-8 encoding, even though that makes them nonconformant. So I changed the blocks to this, and PMD stopped complaining:

try {
  this.data = data.getBytes("UTF8");
}
catch (UnsupportedEncodingException ex) {
  throw new RuntimeException("Broken VM: Does not support UTF-8");
}
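A variant of the fix above that also keeps the original exception as the cause might look like the following sketch. The Utf8Encoder class is hypothetical and is not XOM's actual code; it assumes Java 1.4 or later, where RuntimeException accepts a cause.

```java
import java.io.UnsupportedEncodingException;

/** Hypothetical helper showing a non-empty catch block that preserves the cause. */
class Utf8Encoder {

    /** Encodes the string as UTF-8, failing loudly on a VM that lacks the encoding. */
    static byte[] encode(String data) {
        try {
            return data.getBytes("UTF-8");
        } catch (UnsupportedEncodingException ex) {
            // Chain the original exception so its stack trace is not lost.
            throw new RuntimeException("Broken VM: Does not support UTF-8", ex);
        }
    }
}
```

On Java 7 and later, data.getBytes(java.nio.charset.StandardCharsets.UTF_8) declares no checked exception, so the try-catch block disappears altogether.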




Available rules

What name to pass?

The names of the rule sets to pass on the command line aren't particularly well documented. Sometimes a little trial and error is required to figure them out. The names I've given here in parentheses work.

PMD includes 16 rule sets that cover various common issues in Java code, some more controversial than others:

  • Basic (rulesets/basic.xml) -- A grab bag of rules that most developers are unlikely to disagree with: catch blocks shouldn't be empty, override hashCode() anytime you override equals(), etc.

  • Naming (rulesets/naming.xml) -- Tests for the standard Java naming conventions: variable names should not be too short; method names should not be too long; class names should begin with an uppercase letter; method and field names should begin with a lowercase letter; etc.

  • Unused code (rulesets/unusedcode.xml) -- Looks for private fields and local variables that are never read, unreachable statements, private methods that are never called, and the like.

  • Design (rulesets/design.xml) -- Checks a variety of principles of good design, such as: switch statements should have default blocks, deeply nested if blocks should be avoided, parameters should not be reassigned, and doubles should not be compared for equality.

  • Import statements (rulesets/imports.xml) -- Checks for minor issues with import statements, such as importing the same class twice or importing a class from java.lang.

  • JUnit tests (rulesets/junit.xml) -- Looks for specific issues with test cases and test methods, such as correct spelling of method names and whether suite() methods are static and public.

  • Strings (rulesets/string.xml) -- Identifies common problems that crop up when you work with strings, such as duplicate string literals, calling the String constructor, and calling toString() on String objects.

  • Braces (rulesets/braces.xml) -- Checks whether for, if, while, and else statements use braces.

  • Code size (rulesets/codesize.xml) -- Tests for overly long methods, classes with too many methods, and similar candidates for refactoring.

  • JavaBeans (rulesets/javabeans.xml) -- Inspects JavaBeans components for violations of JavaBeans coding conventions, such as unserializable bean classes.

  • Finalizers -- Because finalize() methods are so uncommon in the Java language (it's been years since I needed to write one), the rules for their use, while detailed, are relatively unfamiliar. This group of checks seeks out various issues with finalize() methods, such as empty finalizers, finalize() methods that call other methods, explicit calls to finalize(), and so forth.

  • Clone (rulesets/clone.xml) -- A few rules for clone() methods: classes that override clone() must implement Cloneable, clone() methods should call super.clone(), and clone() methods should be declared to throw CloneNotSupportedException even if they don't actually throw it.

  • Coupling (rulesets/coupling.xml) -- Looks for signs of excessive coupling between classes, such as too many imports, using a subclass type where the supertype or interface would suffice, and the sheer number of fields, variables, and return types within a class.

  • Strict exceptions (rulesets/strictexception.xml) -- A few more tests for exceptions: methods should not be declared to throw java.lang.Exception, exceptions should not be used for flow control, Throwable should not be caught, and the like.

  • Controversial (rulesets/controversial.xml) -- Some of PMD's rules are ones any competent Java programmer would accept. But a few are at least arguable. This rule set contains some of the more questionable checks, including assigning null to a variable, multiple return points from a method, and importing from the sun packages.

  • Logging (rulesets/logging-java.xml) -- Searches for dodgy uses of java.util.logging.Logger, including nonfinal, nonstatic loggers, and more than one logger in a class.
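To make a few of these rule descriptions concrete, the sketch below shows a small class written to follow several of them: equals() and hashCode() overridden together (Basic), braces on every if statement (Braces), a switch with a default block, and doubles compared with a tolerance rather than with == (Design). The Temperature class is hypothetical and has not been run through PMD itself, so treat it as an illustration of the descriptions above rather than as a verified clean report.

```java
/** Hypothetical class written to follow several of the rule descriptions above. */
class Temperature {
    private final int tenthsOfDegree;

    Temperature(int tenthsOfDegree) {
        this.tenthsOfDegree = tenthsOfDegree;
    }

    // Basic: equals() and hashCode() are overridden together.
    @Override
    public boolean equals(Object other) {
        if (!(other instanceof Temperature)) {
            return false;
        }
        return tenthsOfDegree == ((Temperature) other).tenthsOfDegree;
    }

    @Override
    public int hashCode() {
        return tenthsOfDegree;
    }

    // Design: compare doubles against a tolerance instead of using ==.
    static boolean nearlyEqual(double a, double b, double tolerance) {
        return Math.abs(a - b) <= tolerance;
    }

    // Design: the switch statement has a default block.
    static String describe(int code) {
        switch (code) {
            case 0:
                return "cold";
            case 1:
                return "warm";
            default:
                return "unknown";
        }
    }
}
```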

You can check with several rule sets at once by separating the names with commas on the command line:

$ /usr/pmd-2.1/etc/pmd.sh ~/Projects/XOM/src html \
 rulesets/design.xml,rulesets/naming.xml,rulesets/basic.xml




Building your own rule sets

If you frequently check with particular collections of rules, you might wish to combine them in your own rule-set file, as shown in Listing 2. This rule set imports the basic, naming, and design rules:


Listing 2. A rule set that imports the basic, naming, and design rules
<?xml version="1.0"?>
<ruleset name="customruleset">

  <description>
  Sample ruleset for developerWorks article
  </description>

  <rule ref="rulesets/design.xml"/>
  <rule ref="rulesets/naming.xml"/>
  <rule ref="rulesets/basic.xml"/>

</ruleset>

If you need more granularity, you can pick and choose the individual rules you want to include from each set. For example, Listing 3 shows a custom rule set that selects 11 specific rules from three of the built-in sets. Because checking a large code base can take a noticeable amount of time, even on fast hardware, this can also help you find specific problems you're looking for more quickly.


Listing 3. A rule set that imports 11 specific rules
<?xml version="1.0"?>
<ruleset name="specific rules">

  <description>
  Sample ruleset for developerWorks article
  </description>

  <rule ref="rulesets/design.xml/AvoidReassigningParametersRule"/>
  <rule ref=
    "rulesets/design.xml/ConstructorCallsOverridableMethod"/>
  <rule ref="rulesets/design.xml/FinalFieldCouldBeStatic"/>
  <rule ref="rulesets/design.xml/DefaultLabelNotLastInSwitchStmt"/>
  <rule ref="rulesets/naming.xml/LongVariable"/>
  <rule ref="rulesets/naming.xml/ShortMethodName"/>
  <rule ref="rulesets/naming.xml/VariableNamingConventions"/>
  <rule ref="rulesets/naming.xml/MethodNamingConventions"/>
  <rule ref="rulesets/naming.xml/ClassNamingConventions"/>
  <rule ref="rulesets/basic.xml/EmptyCatchBlock"/>
  <rule ref="rulesets/basic.xml/EmptyFinallyBlock"/>

</ruleset>

You can also include most of the rules in a set but exclude a few specific ones that you disagree with or ones that lead to large numbers of false positives. For instance, XOM often uses switch statements without default blocks when doing table lookups. I can keep most of the design rules but turn off the checks for missing default blocks by adding an <exclude name="SwitchStmtsShouldHaveDefault"/> child element to the rule element that imports the design rules, as shown in Listing 4:


Listing 4. A rule set that excludes the design rule that switch statements should have defaults
<?xml version="1.0"?>
<ruleset name="dW rules">

  <description>
  Sample ruleset for developerWorks article
  </description>

  <rule ref="rulesets/design.xml">
    <exclude name="SwitchStmtsShouldHaveDefault"/>
  </rule>

</ruleset>

(But, come to think of it, maybe PMD is right, and I should be adding the default blocks instead.)
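If you maintain several such files, you could generate them instead of editing them by hand. The sketch below builds a rule-set document in the format of Listing 4 from a rule-set reference and a list of rule names to exclude. RuleSetBuilder is a hypothetical helper, not part of PMD, and it assumes the names contain no characters that need XML escaping.

```java
import java.util.List;

/** Hypothetical helper that writes a rule-set file in the format of Listing 4. */
class RuleSetBuilder {

    /** Builds a rule set that imports one built-in set and excludes the named rules. */
    static String build(String name, String ref, List<String> excludedRules) {
        StringBuilder xml = new StringBuilder();
        xml.append("<?xml version=\"1.0\"?>\n");
        xml.append("<ruleset name=\"").append(name).append("\">\n");
        xml.append("  <description>Generated rule set</description>\n");
        if (excludedRules.isEmpty()) {
            // Nothing to exclude: import the whole set with an empty element.
            xml.append("  <rule ref=\"").append(ref).append("\"/>\n");
        } else {
            xml.append("  <rule ref=\"").append(ref).append("\">\n");
            for (String rule : excludedRules) {
                xml.append("    <exclude name=\"").append(rule).append("\"/>\n");
            }
            xml.append("  </rule>\n");
        }
        xml.append("</ruleset>\n");
        return xml.toString();
    }
}
```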

You're not limited to the built-in rules. You can add new rules, either by writing Java code and recompiling PMD or, a little more simply, by writing XPath expressions that are resolved against each Java class's abstract syntax tree.
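To give a feel for the XPath approach without depending on PMD's own classes, the following sketch evaluates an XPath expression against a toy syntax tree written as XML, using the JDK's javax.xml.xpath package. The element names (Class, Method, Switch, Case, Default) are invented for illustration and are not PMD's actual node names, and XPathRuleSketch is a hypothetical class; a real PMD rule would be written against the node types PMD's parser produces.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Hypothetical demo of an XPath "rule" over a toy tree; not PMD's API or node names. */
class XPathRuleSketch {

    // Toy stand-in for a syntax tree: one switch lacks a default block, one has it.
    static final String TOY_TREE =
          "<Class name='Lookup'>"
        + "<Method name='find'><Switch line='12'><Case/><Case/></Switch></Method>"
        + "<Method name='parse'><Switch line='40'><Case/><Default/></Switch></Method>"
        + "</Class>";

    /** Counts the nodes in the tree that the XPath expression selects. */
    static int countMatches(String treeXml, String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList hits = (NodeList) xpath.evaluate(expression,
                    new InputSource(new StringReader(treeXml)), XPathConstants.NODESET);
            return hits.getLength();
        } catch (XPathExpressionException ex) {
            throw new IllegalArgumentException("Bad XPath or tree: " + expression, ex);
        }
    }
}
```

Here the expression //Switch[not(Default)] plays the role of a rule: it selects every Switch element in the toy tree that has no Default child, which is the same idea as a check for switch statements without default blocks.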




Conclusion

Even with just its built-in rules (which are quite extensive), PMD will almost certainly find some real problems in your code. Some of them will be minor, but some won't be. It won't find every bug -- you still need to do unit testing and acceptance testing -- nor is PMD a substitute for a good debugger when you're hunting down a known bug. However, it really shines in finding bugs you didn't know you had. I've yet to see a code base that PMD couldn't find some problems in. It's a cheap, easy, fun way to improve your programs. If you haven't used PMD before, you owe it to yourself and your customers to try it.



Resources

  • Visit the PMD home page to learn more about PMD. You can download PMD from the PMD project page.

  • Read Tom Copeland's article "Static Analysis with PMD" for an introduction to PMD.

  • Learn how to write custom PMD rules from Tom Copeland's article "Custom PMD Rules".

  • FindBugs is another open source static code analysis tool. It differs from PMD in analyzing the compiled byte code rather than the source code, and in offering an optional GUI.

  • Lint4j is a free (as in beer), closed source static code analysis tool that detects some of the same issues as PMD and FindBugs -- and some unique ones.

  • Checkstyle is an open source code analysis tool that focuses on adherence to Java coding conventions such as line length and indentation rather than bug patterns.

  • AppPerfect is a payware static code analysis tool that has the unique ability to check JSP pages as well as conventional Java source code.

  • "FindBugs, Part 1: Improve the quality of your code" (developerWorks, May 2004) and "FindBugs, Part 2: Writing custom detectors" (developerWorks, May 2004) look at how FindBugs can help improve the quality of your code and eliminate bugs.

  • "Diagnosing Java code: Unit tests and automated code analysis working together" (developerWorks, May 2004) examines the relationship between unit testing and static analysis and covers how each can leverage the other. Check out the entire Diagnosing Java code column series.

  • XOM, referred to in some of this article's examples, is an open source, tree-based API for processing XML with Java.

  • You'll find articles about every aspect of Java programming in the developerWorks Java technology zone.

  • Browse for books on these and other technical topics.


About the author

Elliotte Rusty Harold is originally from New Orleans, to which he returns periodically in search of a decent bowl of gumbo. However, he resides in the Prospect Heights neighborhood of Brooklyn with his wife Beth and cats Charm (named after the quark) and Marjorie (named after his mother-in-law). He's an adjunct professor of computer science at Polytechnic University, where he teaches Java and object-oriented programming. His Cafe au Lait Web site has become one of the most popular independent Java sites on the Internet, and his spin-off site Cafe con Leche has become one of the most popular XML sites. His books include Effective XML, Processing XML with Java, Java Network Programming, and The XML 1.1 Bible. He's currently working on the XOM API for processing XML and the XQuisitor GUI query tool.



