Level: Intermediate
Teodor Zlatanov (tzz@bu.edu), Programmer, Gold Software Systems
01 Jun 2000
Getting the job done in Perl is easy. The language was designed to make simple tasks easy, and hard tasks possible. But the built-in simplicity of the language can become a trap. Programmers are by nature averse to documenting or designing the architecture of their programs. The excitement of writing pure code lies in the direct connection to the machine, telling it exactly what to do. Teodor Zlatanov presents techniques to improve the reliability and maintainability of Perl programs through increasing clarity of the code. His tips are intended for the beginner or intermediate Perl programmer, with a stronger emphasis on establishing good standards rather than on changing particular coding styles.
You don't have to revolutionize your thinking (or coding!) to get increased clarity with Perl. Although complicated tasks are difficult to write in Perl, it can be done, and it can be done neatly. You don't have to be the only one who can understand and maintain your program once it's written. Using these nine tips, you can keep using Perl, keep your style, and still have an accessible and stable program.
"No room for improvement"
You may be thinking: "I have years of C/C++/Ada/Assembler/Pascal/LISP/Java, and
my code is perfect. Don't talk to me about improvements." My response is that there is no perfection in programming, only the pursuit of perfection.
Good programmers learn something new every day and improve their
technique constantly.
Perl as a language is very malleable. For example, you can print out your environment (starting with version 5.005) like this:
Printing the contents of %ENV with a one-liner
print "$_ => $ENV{$_}\n" foreach(sort keys %ENV);
Or you can do it like this:
Printing the contents of %ENV, broken up
foreach (sort keys %ENV)
{
    print "$_ => $ENV{$_}\n";
}
Or you can even use the Data::Dumper module:
Printing the contents of %ENV with Data::Dumper
use Data::Dumper;
print Dumper(\%ENV);
Documentation from the Perl distribution
Perl documentation, for example the "xyz" page, comes with the Perl distribution. You can access these pages by typing "perldoc xyz" at the command line. They are also available as HTML documents.
Every one of these approaches does the same thing in a debugging
context. But which one is easiest to understand, document, and
maintain? The third one, of course. If you have never used
Data::Dumper, you should read the documentation
("perldoc Data::Dumper") and try it in your programs.
Speed is not the only measure of improvement of a program's code. Ease of testing, documentation, and maintenance should be kept in mind as well in any software project. A language as flexible as Perl facilitates good coding in every stage of the software project, except the
pre-coding stages (requirements gathering and architecture design).
Writing comments: when your script flashes before your eyes
There is no such thing as too much documentation. Being clear often means repeating yourself. Think of your
code as something you present to the world. There are a lot
of people in the world. The one comment you thought was redundant
could make someone's day a little easier. It could be your day, five
years from now, when you are adding a new feature.
Use good planning when writing your programs. You don't have to determine every detail in advance. But you should break
up the program into component parts, and use comments to fill in the gaps.
Let's take a small example: a program that reverses all its input
except for names found in the /etc/hosts file.
The input reverser, first version
#!/usr/bin/perl -w
# author, date, revision, etc.
# brief summary of program
# modules used summary
# pragmas used summary
# function A summary
# function B summary
# main loop summary
# POD documentation
Next, we elaborate on each piece:
The input reverser, second version
#!/usr/bin/perl -w
# author, date, revision, etc.
# This program will process user input with two functions.
# Command-line arguments will be treated as input file sources.
# No flags are allowed.
# modules used: none
# pragmas used: strict
use strict;
# function A: return a parameter with the letters reversed
# function B: return true if a word is in the /etc/hosts file
# main loop: go through input lines, passing each word to B
# and, if B returns false, to A
__END__
POD documentation
And finally we have the last version, with code and comments. See the script:
reverser.pl.
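The linked script is not reproduced here. As a rough sketch of how the outline above might be filled in, the two functions and the per-line step could look something like the following. This is an illustration under stated assumptions, not the author's reverser.pl: the names reverse_word, host_names, is_host_name, and process_line are hypothetical, and the hosts data is passed in as plain lines so that the example runs without reading /etc/hosts.

```perl
use strict;

# function A: return the parameter with its letters reversed
sub reverse_word
{
    my ($word) = @_;
    return scalar reverse $word;
}

# helper (hypothetical): collect the names found in hosts-format lines,
# for example the lines of /etc/hosts, into a hash for quick lookup
sub host_names
{
    my (@lines) = @_;
    my %names;
    foreach my $line (@lines)
    {
        $line =~ s/#.*//;                           # drop trailing comments
        my ($address, @aliases) = split ' ', $line; # first field is the address
        foreach my $name (@aliases)
        {
            $names{$name} = 1;
        }
    }
    return \%names;
}

# function B: return true if a word is one of the known host names
sub is_host_name
{
    my ($word, $names) = @_;
    return exists $names->{$word};
}

# main loop body: pass each word to B and, if B returns false, to A
sub process_line
{
    my ($line, $names) = @_;
    my @output;
    foreach my $word (split ' ', $line)
    {
        push @output, is_host_name($word, $names) ? $word : reverse_word($word);
    }
    return join ' ', @output;
}

my $names = host_names("127.0.0.1 localhost loopback # local names");
print process_line("ping localhost now", $names), "\n";   # gnip localhost won
```

In a complete program, the hosts lines would be read from /etc/hosts and process_line would be called for each line of input.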
Writing loops that make sense
Loops can be categorized as one-liners and multi-line loops. Avoid the temptation to squeeze everything into one-line loops,
even though it may seem more natural. Compare:
A one-line loop to print all the environment variables in
lowercase, with the first letter uppercase, and sorted
print ucfirst lc $_, "\n" foreach (sort keys %ENV);
With:
A multi-line loop to print all the environment variables in
lowercase, with the first letter uppercase, and sorted
foreach (sort keys %ENV)
{
    print ucfirst lc $_, "\n";
}
The second loop does the exact same job as the first. Which one would you rather
see in production code that you have to maintain? Right. Multi-line loops are much easier to read.
Try to forget the bad things about C-based languages. This is good
advice in general. But regarding loops in particular, you should not
have to negate the whole condition. And you should use the 'for'
statement as little as possible.
Transform if(not condition) into unless(condition).
If becomes unless
print "Yes" if not $false;
print "Yes" if !$false;
# becomes...
print "Yes" unless $false;
Likewise, while(not condition) is equivalent to until(condition).
while becomes until
print "No signal" while !$signal;
print "No signal" while not $signal;
# becomes...
print "No signal" until $signal;
The 'for' loop (meaning the C-style, three-part form) is a bad idea. It condenses three things into one line: initial
condition, increment, and ending condition. While this
was necessary before Perl came along, the 'foreach' loop has now eliminated most of the need for the 'for' loop. For example, going from 1 to 10 with the variable $i:
Iterating with foreach
foreach my $i (1 .. 10)
{
    # do something
}
Compare this with the equivalent 'for' loop:
Iterating with for
for (my $i=1; $i <= 10; $i++)
{
    # do something
}
In some instances, the 'for' loop can be useful. But I suggest heavily documenting
its occurrence, and providing a reason for why 'foreach' was not sufficient. For a beginner or intermediate Perl programmer, it can be
more harmful than beneficial to use the 'for' construct.
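As a sketch of what such a documented exception might look like, the loop below doubles its counter on every pass, which a foreach over a simple range does not express directly. The variable names are made up for the illustration.

```perl
use strict;

my @sizes;

# A C-style 'for' is used deliberately here: the step is a doubling rather
# than +1, so a foreach over a simple range such as (1 .. 64) does not fit.
for (my $size = 1; $size <= 64; $size *= 2)
{
    push @sizes, $size;
}

print "@sizes\n";   # 1 2 4 8 16 32 64
```

The comment states why 'foreach' was not sufficient, which is the piece of documentation a later maintainer will be looking for.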
The road to cleanliness
If you miss lint and -Wall from your C days, there is still hope. Run your script with the -w flag, and put the 'use strict' pragma
at the top of your programs to make them more legible, better structured, and more reliable. You won't magically become a better programmer, but it
will significantly improve your code. For example, variables
won't be silently created on their first use. Variables have to be
declared, and you'll avoid many common bugs, such as a misspelled variable name. (Hash keys are still created on first assignment; strict does not check them.) See reverser.pl for an example of how to use
strict and -w.
Don't define constants as variables or functions. The 'use constant'
directive does a better job. If your Perl doesn't support it,
upgrade.
Use constant example
use constant TIMESLICE => 15;
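To see what strict buys you, here is a minimal sketch of the kind of typo it catches. The variable names are invented for the example, and the string eval is there only so the demonstration can report the compile error and keep running; in a real script the program would simply refuse to compile.

```perl
use strict;

my $total = 10;

# The misspelled $totl below is never declared. Under strict, that is a
# compile-time error instead of a silently created global variable.
my $compiled = eval q{ $totl = $total + 1; 1 };
my $error    = $@;

print "strict caught the typo: $error" unless $compiled;
```

Without 'use strict', the same assignment would quietly create a new variable named $totl, and $total would never change.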
Use prototypes for your functions. Don't make up your functions'
usage as you go along. Plan before you code, and your work will be all the
better for the planning.
Be aware of the differences between scalars, numbers, and strings.
Read the FAQ and the Programming Perl (or Learning Perl) book. In general, do
your homework on this one. Perl makes it easy to convert numbers into
strings and vice versa, but that can create subtle bugs that take
hours or days to find.
The topic of cleanliness can never be exhausted. You should always strive to produce
clean code, learning as you go along. These are only the very basics
of good usage in Perl syntax. Look at the perlsyn and perlstyle
pages as well. They contain pertinent information on this topic.
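The number-versus-string warning earlier in this section is easy to underestimate, so here is a small sketch of two places where the difference bites: the default sort order and the choice between == and eq. The data is invented for the example.

```perl
use strict;

my @sizes = (10, 9, 100, 2);

my @as_strings = sort @sizes;                # default sort compares as strings
my @as_numbers = sort { $a <=> $b } @sizes;  # explicit numeric comparison

print "@as_strings\n";   # 10 100 2 9
print "@as_numbers\n";   # 2 9 10 100

# == compares numerically, eq compares as strings
my $same_number = ("1.0" == 1)   ? "yes" : "no";
my $same_string = ("1.0" eq "1") ? "yes" : "no";

print "same number: $same_number, same string: $same_string\n";
```

The last line prints "same number: yes, same string: no", which is exactly the kind of mismatch that hides in a comparison for days.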
Functional Programmers Anonymous meets at 9 PM tonight
For the programmers who want Perl to be a more functional language,
map and grep can satisfy the craving. Map and grep evaluate a block
or an expression for each element of a list. They can be surprisingly
useful with the block syntax. We will look at some more advanced
examples of that syntax in a minute. Map and grep set $_ to the current element
they are examining inside the block.
The often-quoted Schwartzian transform is really just a way of sorting a
temporary array by one of its fields. Here is an example:
The Schwartzian transform
@sorted_list =
    map { $_->[0] }
    sort { $a->[1] <=> $b->[1] } # note that this is a numeric sort
    map { [$_, index_function($_)] }
    @unsorted_list;
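As a concrete instance of the listing above, the sketch below sorts a few words by length. The word list is made up, and index_function is assumed here to be nothing more than the string length; any function that returns a numeric key would do.

```perl
use strict;

my @unsorted_list = qw(pear fig banana plums);

# index_function: for this illustration, the sort key is the string length
sub index_function
{
    my ($item) = @_;
    return length $item;
}

my @sorted_list =
    map  { $_->[0] }                    # 3. keep only the original element
    sort { $a->[1] <=> $b->[1] }        # 2. sort the pairs by their key
    map  { [$_, index_function($_)] }   # 1. pair each element with its key
    @unsorted_list;

print "@sorted_list\n";   # fig pear plums banana
```

Read the statement from the bottom up: the list flows through the last map first, then the sort, then the first map. The key is computed once per element rather than once per comparison.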
The unsorted list can contain any data you like. Write an
index_function that will generate a key that can be used for sorting each element of
the unsorted list. The transform will sort the elements by those
keys numerically.
If you do not understand this code listing, you should read the
documentation for the map and sort functions carefully. Learn to understand
the Perl notation for anonymous array references and dereferencing
array elements. The Effective Perl Programming site (see Resources) has
a detailed explanation of the Schwartzian transform.
Map and grep allow a lot of neat techniques. For example:
Mapping an array to uppercase
@uc = map { uc } @values;
This will convert an array to uppercase by mapping the uc function to each
element in the array. Have we seen that before? Yes! The loop to print
environment variables could benefit from this approach to increase legibility:
A one-line loop to print all the environment variables in
lowercase, with the first letter uppercase, and sorted, using map
print $_, "\n" foreach (map {ucfirst lc } sort keys %ENV);
Even better code, because it is clearer and separates form from function, would be:
A multi-line loop to print all the environment variables in
lowercase, with the first letter uppercase, and sorted, using map
foreach (map {ucfirst lc } sort keys %ENV)
{
    print $_, "\n";
}
Grep works much like map, except that it returns only the elements for which the expression or
block returns true. Map lets
everything through, returning the result of the block for each element. You can use grep to extract every array element
that's a valid file, for example:
Filter array of filenames for validity
@valid = grep { -f } @names;
The -f operator returns true only if the file exists and is a plain file (not a directory, for example).
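To make the difference between the two concrete, here is a small sketch that runs map and grep over the same list. The numbers are arbitrary.

```perl
use strict;

my @values = (1, 2, 3, 4);

# map returns the result of the block for every element
my @doubled = map { $_ * 2 } @values;

# grep returns the original elements for which the block is true
my @even = grep { $_ % 2 == 0 } @values;

print "@doubled\n";   # 2 4 6 8
print "@even\n";      # 2 4
```

Map changes what comes out; grep only decides which elements come out.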
When modules knock at your door
If you ever find more than one program using a particular functionality, it's
time to put it in a module. You don't have to use object orientation
in your module. Simply saying:
Using the Exporter module
package My::Package;
require Exporter;
@ISA = qw(Exporter);
@EXPORT = qw(my_function); # symbols to export by default
sub my_function()
{
}
1;
is enough to create your module My::Package and export the function
my_function from it. Now, when other programs say "use My::Package"
they will automatically import the function "my_function."
There is a lot more to modules. Read the perlmod documentation and
consult the Resources section of this article. I strongly recommend
learning about modules and making an effort to use them. They will
make your life easier. As a general guideline, if you use a function
in more than one program, it should be in a module.
Object-oriented programming and other unsightly habits
Object-oriented programming (OOP) is a wonderful tool. Don't try to
make it a religion, though. OOP does not:
- Solve the world's problems
- Shorten development schedules halfway through the project
- Require expensive tools
- Make planning unnecessary, because "objects take care of themselves"
OOP does:
- Make life easier when used correctly
- Allow abstract concepts to take shape quickly
- Interface nicely with Perl's modules
In short, use OOP, but don't rely on it to do everything. For an
introduction to OOP with Perl, see Resources and the
perltoot page. OOP can be as simple as data abstraction, or as
complex as a company-wide methodology. An example of a simple object can be found in the PSI::ESP module:
ESP.pm.
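If the linked module is not at hand, the data-abstraction end of that range can be sketched in a few lines. The Counter class below is a hypothetical illustration, not the PSI::ESP module: a hash reference holds the data, bless ties it to a package, and the methods are ordinary functions that receive the object as their first argument.

```perl
use strict;

package Counter;   # hypothetical class, for illustration only

# constructor: accepts an optional start => N argument
sub new
{
    my ($class, %args) = @_;
    my $self = { count => $args{start} || 0 };
    return bless $self, $class;
}

# add one to the counter and return the new value
sub increment
{
    my ($self) = @_;
    return ++$self->{count};
}

# accessor: callers never touch the hash directly
sub count
{
    my ($self) = @_;
    return $self->{count};
}

package main;

my $counter = Counter->new(start => 5);
$counter->increment;
print $counter->count, "\n";   # 6
```

Because callers go through count and increment, the internal representation can change later without breaking the programs that use the class.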
ptkdb and other consonants
Get the ptkdb debugger from CPAN (see Resources) as soon as you can. It requires a
few modules, but it's well worth it. It will let you view data and
code in your programs and in the loaded modules. ptkdb has saved me
from hours of frustration and early hair loss many times.
The PSI::ESP module
In the Perl community, the PSI::ESP module is famous. Unfortunately, only a few have real
access to its wisdom. The rest of us have to help each other
as best we can without psychic abilities. If you need help with a
Perl problem, or with your Perl style, remember to include a complete
description of the situation, what you have already tried, and your complete
code in the SOS.
I will now provide a preliminary version of the ESP module. As long as the Earth's magnetic field is aligned correctly, this
will solve every Perl problem you may encounter. I hope you find it useful. You can use the module with the following syntax as long as you place the module in a PSI subdirectory of your current directory:
ESP.pm:
perl -I. -MPSI::ESP -e 'use PSI::ESP; $p = new PSI::ESP; print $p->reason'
Conclusion
I hope this article has helped you improve your Perl skills a bit. Much of the
information presented here is complemented by what you will find in Resources. Enjoy your journey with Perl. I hope you find
it to be a constant source of insight and excitement!
Resources
- Read Ted's other Perl articles in the "Cultured Perl" series on developerWorks.
- reverser.pl contains the final version of the script mentioned in this article, with code and comments
- An example of a simple object can be found in the PSI::ESP module
- Effective Perl Programming, by Joseph Hall and Randal Schwartz (Addison Wesley, 1998) is the definitive source of Perl tips and tricks in book form, relating to the language rather than specific tasks
- Object Oriented Perl, by Damian Conway (Manning Publications, 2000) is an excellent guide to modules and object orientation
- Programming Perl, 2nd Edition, by Larry Wall, Tom Christiansen, and Randal L. Schwartz (O'Reilly, 1996) is the best guide to Perl today, but a little outdated with 5.005 and 5.6.0 out now
- The Perl Journal
- The USENET comp.lang.perl.misc newsgroup is a wonderful place to learn. Please read the articles for a while before you post, check the FAQs, and generally behave nicely when you participate in the c.l.p.misc community. Look at section 9 in particular.
About the author
Teodor Zlatanov graduated with an M.S. in Computer Engineering from Boston University in 1999. He has worked as a programmer since 1992 using Perl, Java, C and C++. His interests are in open source work on text parsing, 3-tier client-server database architectures, UNIX system administration, CORBA, and project management. He can be contacted at tzz@bu.edu.