Level: Introductory
Hollis Blanchard (hollisb@us.ibm.com), Software Developer, IBM
18 Jan 2005
This article introduces PowerPC emulation and cross-compiling for developers without access to real hardware. It is intended for developers familiar with computer architecture who own an x86-based workstation but are interested in experimenting with PowerPC.
Some developers may not have access to a PowerPC® Linux™ system to play
with (although you can buy one for less than US$200 at the time of this
writing). For the curious x86 Linux user, emulation is a convenient and
inexpensive alternative. There are at least three open source PowerPC
emulators available, two of which are quite new.
Accuracy
Some emulators, particularly those used by processor developers, are
cycle-accurate, meaning that a particular instruction in a given context
will take exactly as many cycles to run as it would on real hardware.
These emulators emulate not just the instruction set, but also the
internal pipelines and caches of the processor. They are particularly
useful during development before real silicon exists, and they can also
yield more insight into performance bottlenecks than can be gleaned from
hardware performance counters. However, these emulators have some severe
limitations. Because they document so much intellectual property and
hardware tricks, their internals are almost never free for examination or
modification. Instead, the processor designer will make binaries
available, sometimes for no cost, often for a very restricted range of
hosts. Another problem for higher-level software developers is that
because they emulate large amounts of processor internals, they are very
slow. Finally, they may not be as accurate as real hardware. For reasons
of speed or complexity, even a cycle-accurate emulator can omit cache or
IO emulation, yielding skewed results. They're probably pretty close for
most situations, but the fact remains that an emulator is only emulating
the hardware, and its behavior can diverge.
None of the emulators discussed here are cycle-accurate. In fact, they
probably aren't even fully behavior-accurate. (When an emulator's behavior
diverges from the hardware's, it's called a bug, and it will usually end up
being squashed... eventually.)
Emulating user mode
One very convenient feature for the casual developer is user-mode
emulation. If an emulator emulates only the processor and IO (such as a
network device), a Linux kernel would need to be booted (and emulated)
first, then the emulated application on top of that. That's certainly
important for more serious work, but it's much more convenient for simple
experimentation to avoid dealing with kernels entirely. If the emulator
can emulate not just the processor but also the operating system kernel,
that makes it much easier to run little programs that don't depend on many
kernel services, such as those that only need to use the write and exit system
calls.
Ordinarily, when an emulator encounters a PowerPC system call
instruction, it emulates the exception by storing the address of the
following instruction into the SRR0 register, setting some
architecture-defined bits in SRR1, and transferring control to physical
address 0xC00. (Some PowerPC variants allow more control over this
behavior, but this is the traditional PowerPC model.) The emulated kernel
has its system call exception handler at 0xC00, just like on hardware, and
so the kernel takes control of the processor.
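The full-system path described above can be sketched in a few lines of C. This is only an illustration: the `ppc_cpu` structure and the `take_syscall_exception` function are hypothetical names, not taken from any of the emulators discussed here, and the handling of SRR1 and the MSR is deliberately simplified. It assumes SRR0 receives the address of the instruction after the system call, so that returning from the exception resumes there.

```c
#include <stdint.h>

/* Hypothetical emulator CPU state; the field names are illustrative only. */
struct ppc_cpu {
    uint32_t pc;    /* address of the instruction being emulated */
    uint32_t msr;   /* machine state register */
    uint32_t srr0;  /* save/restore register 0 */
    uint32_t srr1;  /* save/restore register 1 */
};

#define SYSCALL_VECTOR 0x00000C00u  /* traditional system call vector */

/* Emulate the exception raised by a system call instruction at cpu->pc. */
static void take_syscall_exception(struct ppc_cpu *cpu)
{
    /* Resume point: the instruction after the 4-byte system call. */
    cpu->srr0 = cpu->pc + 4;

    /* Simplification: copy the whole MSR. Hardware saves only the
     * architecture-defined bits and then also modifies the MSR itself,
     * which this sketch leaves out. */
    cpu->srr1 = cpu->msr;

    /* The emulated kernel's handler runs next. */
    cpu->pc = SYSCALL_VECTOR;
}
```

After this function returns, the emulator's main loop would simply carry on fetching instructions, now from the emulated kernel's handler.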
When an emulator supporting user-mode emulation encounters a system call
instruction, on the other hand, it does not transfer control to the
emulated exception handler; instead it interprets the system call itself.
The easiest examples are system calls like read
and write: these can be almost directly
converted into real system calls made by the emulator. The glue layer to
translate between emulated system calls made by the emulated application
and real system calls made by the emulator may have other functionality,
such as logging all system calls made by the emulated application.
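A minimal sketch of such a glue layer might look like the following. It assumes the PowerPC Linux convention of passing the system call number in r0 and the arguments starting at r3, with the result returned in r3. The `guest` structure, the flat guest memory model, and the `emulate_syscall` function are hypothetical and not drawn from any real emulator's source. Error reporting is reduced to returning -1.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

/* PowerPC Linux system call numbers used in this sketch. */
#define PPC_NR_exit  1
#define PPC_NR_write 4

/* Hypothetical guest state: 32 GPRs plus one flat block of guest memory,
 * where a guest address is simply an offset into mem. */
struct guest {
    uint32_t gpr[32];
    uint8_t *mem;
    size_t mem_size;
};

/* Service one emulated system call on the host; the result goes in r3. */
static void emulate_syscall(struct guest *g)
{
    switch (g->gpr[0]) {
    case PPC_NR_write: {
        uint32_t fd = g->gpr[3], buf = g->gpr[4], len = g->gpr[5];

        /* Refuse buffers that fall outside guest memory. */
        if (buf > g->mem_size || len > g->mem_size - buf) {
            g->gpr[3] = (uint32_t)-1;
            break;
        }
        /* Almost a direct pass-through to the host's write(). */
        g->gpr[3] = (uint32_t)write((int)fd, g->mem + buf, len);
        break;
    }
    case PPC_NR_exit:
        exit((int)g->gpr[3]);
    default:
        g->gpr[3] = (uint32_t)-1;  /* system call not handled here */
        break;
    }
}
```

A logging facility of the kind mentioned above would fit naturally at the top of `emulate_syscall`, before the `switch`.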
In addition to bypassing the complexity of building a kernel to emulate
and a file system image to boot into, and configuring a virtual network
device for IO, this shortcut also speeds up emulation, as the reams of
kernel instructions that would have run to handle the system call -- from
the exception handler through the VFS and the device driver -- are bypassed.
However, it should be clear that not running the kernel inside the
emulator means the overall behavior could be quite different indeed. In
the worst case, a bug in the emulator's system call glue could make it
seem as though the emulated application is buggy, even though it would run
perfectly on a real kernel. This worst case remains pretty rare, though,
and these tools are generally production-ready.
Just In Time
Just In Time (JIT) compilation is a method by which interpreted
bytecode (for example, an emulated instruction stream) is translated into
native instructions on the fly. Rather than simply interpreting and
emulating each instruction in turn, whole sequences of instructions are
converted to their native equivalents and cached so that the translation
need not occur for subsequent executions of the sequence. Accordingly,
tight CPU-bound loops of interpreted code should execute at near-native
speeds, since the native code is kept in the cache. On the other hand,
code with few loops would not see much speed improvement. JIT compilers
are extremely common for Java™ virtual machines, and they can be used to
great effect in emulated virtual machines as well.
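The caching half of this idea can be sketched without generating any real native code. In the C fragment below, every name (`tblock`, `translate_block`, `lookup_or_translate`) is hypothetical, and `translate_block` is only a placeholder that counts how often the expensive translation step would run. It is not how any particular emulator implements its cache.

```c
#include <stddef.h>
#include <stdint.h>

#define CACHE_SLOTS 256

/* One cached translation; host_code stands in for generated native code. */
struct tblock {
    uint32_t guest_pc;
    int valid;
    const void *host_code;
};

static struct tblock cache[CACHE_SLOTS];
static unsigned long translations;  /* times the slow path has run */

/* Placeholder for the expensive step: decode the guest instructions
 * starting at guest_pc and emit equivalent host instructions. */
static const void *translate_block(uint32_t guest_pc)
{
    (void)guest_pc;
    translations++;
    return cache;  /* dummy non-NULL pointer; nothing is really generated */
}

/* Return the translation for guest_pc, translating only on a miss. */
static const void *lookup_or_translate(uint32_t guest_pc)
{
    /* Instructions are 4 bytes wide, so drop the low two bits. */
    struct tblock *tb = &cache[(guest_pc >> 2) % CACHE_SLOTS];

    if (!tb->valid || tb->guest_pc != guest_pc) {
        tb->host_code = translate_block(guest_pc);
        tb->guest_pc = guest_pc;
        tb->valid = 1;
    }
    return tb->host_code;
}
```

Calling `lookup_or_translate` a thousand times with the same address runs the slow path once, which is why tight loops benefit the most.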
Qemu
Qemu, which is relatively new, uses dynamic translation like a Java Just In Time (JIT) compiler to achieve good performance; in this case, good
performance is about 4x to 10x slower than native hardware, depending on
the benchmark. It supports a few different hosts and targets, but all
we'll worry about is x86 host and PowerPC target, which fortunately is one
of the supported configurations. Qemu also supports a remote GDB (GNU
Debugger) connection, which is very valuable for debugging. Unfortunately,
qemu does not support GDB connections in user-mode emulation, only in
full-system mode. Qemu does not support AltiVec™ vector-processing
instructions.
PearPC
PearPC is another new emulator that can use JIT dynamic translation,
but only on an x86 host with a PowerPC target -- however, that environment
is the goal of this article. Its performance isn't as good as qemu's,
being roughly 15x slower than the host system. Unfortunately, PearPC does
not support a user environment, so a kernel and basic file system would be
needed as well (Linux, Darwin, and Mac OS X are currently supported).
PearPC does not support a GDB connection, nor yet does it support AltiVec
vector-processing instructions (although the developers plan to add them
in a future release).
PSIM
PSIM (PowerPC simulator) is the granddaddy of PowerPC emulation: it was written in 1994 and
assisted in some of the initial ports of Linux and NetBSD to the then-new
PowerPC architecture. PSIM was integrated with the GDB sources, and
amazingly, although it hasn't seen development since 1996, it still builds
and works. Being integrated with GDB, PSIM also supports GDB connections,
including user mode. Because it predates AltiVec, PSIM does not support
AltiVec vector-processing instructions.
Choosing an emulator
For the reasons discussed above, this article uses qemu; the same basic
issues apply with the others, but qemu is the simplest to build for the purposes of this article.
Download and extract the latest qemu tarball (see Resources), then:
Listing 1: Building qemu
$ ./configure --target-list=ppc-user
$ make
This will produce ./ppc-user/qemu-ppc, which
will be used later to execute PowerPC binaries.
Cross-compiling
The second key ingredient in cross-development is a cross-compiler. A
cross-compiler is a compiler that runs on one architecture but produces
binary code for another. This is very convenient if the deployment system
is significantly underpowered relative to the development system, as is
usually the case in embedded system development. A cross-compiler does not
overwrite the system's native compiler or interact with it in any way.
Crosstool
Building a GNU cross-compiler can be pretty easy depending on the
architectures involved, but sometimes build breaks do happen. It can also
require several stages of builds to get all the right components built for
each other in the right way. To remove the guesswork and automate the
process, Dan Kegel has developed a very useful build script called
crosstool.
Download and extract the latest version of crosstool (see Resources). Then:
Listing 2: Building crosstool
$ sudo mkdir /opt/crosstool
$ sudo chown $USER /opt/crosstool
$ sh demo-ppc750.sh
That will run for a while, and when it finishes, binutils, GCC, and
glibc will be installed for cross-compiling in /opt/crosstool. Have a look
at the directory structure there, and consider adding it to the PATH
environment variable to save typing later.
Hello, world
Now that an emulator and cross-compiler have been built, it is time to
put them together and test the new environment. Put the following source
into hello.c:
Listing 3: A strangely familiar program
#include <stdio.h>

int main(int argc, char *argv[])
{
    printf("Hello, world.\n");
    return 0;
}
For now, use static linking to avoid worrying about how to install
PowerPC shared libraries on the x86 host system. To produce a 32-bit
PowerPC ELF executable named "hello", run the following:
Listing 4: Cross-compiling with GCC
$ powerpc-750-linux-gnu-gcc -static hello.c -o hello
To verify that it is the expected format, you can use this command:
Listing 5: Checking file type
$ file hello
hello: ELF 32-bit MSB executable, PowerPC or cisco 4500, version 1 (SYSV),
for GNU/Linux 2.4.3, statically linked, not stripped
And finally, run the executable under qemu:
Listing 6: Running an executable under qemu
$ ./ppc-user/qemu-ppc hello
"Hello, world." should be output to the terminal.
What now?
Now you know you can build C code into PowerPC executables and run
them. You can also experiment with the simple assembly example given in
the "Introduction to assembly on the PowerPC" article, which is listed in Resources. (Note that although you could use the
cross-assembler directly, it's a lot easier to continue to use the
compiler instead.) Once you're satisfied with that, you can move on to
bigger and more interesting examples, perhaps including shared libraries
(read the qemu documentation -- which is also listed in Resources -- for
help with that).
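One possible next experiment is to have a program report what it was built for. The two helper functions below are a sketch with hypothetical names. They rely on the architecture macros that GCC predefines (`__powerpc__`, `__i386__`, `__x86_64__`) and on a simple byte-order probe.

```c
#include <stdint.h>

/* Name of the architecture the compiler is generating code for. */
static const char *target_arch(void)
{
#if defined(__powerpc__) || defined(__PPC__)
    return "powerpc";
#elif defined(__x86_64__)
    return "x86_64";
#elif defined(__i386__)
    return "i386";
#else
    return "unknown";
#endif
}

/* Returns 1 when the most significant byte is stored first. */
static int is_big_endian(void)
{
    const uint16_t probe = 0x0102;

    /* Inspect the byte at the lowest address of the 16-bit value. */
    return *(const uint8_t *)&probe == 0x01;
}
```

If both values are printed from `main`, a binary built with the cross-compiler and run under qemu should report powerpc and big-endian, matching the "MSB" in the output of the file command. The same source built with the native compiler should report the x86 host and little-endian.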
64-bit PowerPC
Although crosstool can produce ppc64 toolchains just as easily, there
is unfortunately no open source emulator for 64-bit PowerPC, so you would
need real hardware to experiment. Of course, ppc32 executables run just as
well on ppc64 hardware (but the reverse is not true).
Conclusion
An emulator will never be as fast as native hardware; the biggest
reason functionality is implemented in hardware is speed. An emulator will
also never be as accurate as real hardware, especially when the hardware
itself could contain errata that can be triggered by subtle timing
interactions of internal components. However, an emulator can be very
valuable for development and even general-purpose computing. Virtual PC, a
commercial emulator, is used by a large number of Macintosh® owners to run
Windows® applications. It may not be as fast as hardware, but it's cheaper
and easier to maintain. When developing low-level operating system code,
an emulator can provide that needed glimpse into the system's state to
reveal a hardware-crippling bug. In fact, during hardware development, an emulator might be the only development platform available!
The emulators above have been and are being used for operating system
development, which proves some measure of robustness. But don't let that
stop you from trying them out just to experience having 32 general-purpose
registers, or from going out of your way to try to support a PowerPC user of
software you've written. With an unbeatable price tag and convenient
environment, what do you have to lose?
Resources
- PearPC is maintained on
SourceForge. See also the PearPC
documentation.
- You can get qemu from the qemu home page. See
also the QEMU
CPU Emulator User Documentation.
- The PSIM model of the
PowerPC Architecture is written in extended ANSI-C and hosted at Red Hat.
It too has ample
documentation.
- Downloads and documentation for crosstool may be found on the
project home page.
- Once you are up and running, you can play around with some of the
code from Introduction
to assembly on the PowerPC (developerWorks, July 2002).
- Did you know you can get a PowerPC Linux kit for as little as US$200?
We're talking, of course, about the Kuro (about which more later). See
also this review of the
Kuro from Penguin PPC.
- Can't wait for 64 bits? There is no need to -- here is some
information about 64-bit
PowerPC to get you started (developerWorks, October 2004).
- If you're emulating a whole system, these performance
tools will help you get something done (developerWorks, June 2004).
- You can learn more about Just-in-time
compilation from Wikipedia.
- See also A
developer's guide to Linux emulators and how they operate
(developerWorks, December 2004) and Emulate
legacy operating systems on Linux: From CP/M to OpenVMS, Linux does it
all (developerWorks, June 2003).
- Have experience you'd be willing to share with Power Architecture zone
readers? Article submissions on all aspects of Power Architecture technology from authors inside and outside
IBM are welcomed. Check out the Power Architecture author
FAQ to learn more.
- Have a question or comment on this story, or
on Power Architecture technology in general?
Post it in the Power Architecture technical forum
or send in a letter to the editors.
- All things Power are chronicled in the developerWorks Power
Architecture editors' blog, which is just one of many developerWorks
blogs.
- Find more articles and resources on Power Architecture
technology and all things
related in the developerWorks Power
Architecture technology content area.
- Download an IBM PowerPC 405 Evaluation Kit to demo a SoC in a simulated
environment, or just to explore the fully licensed version of
Power Architecture technology. This and other fine Power Architecture-related downloads are listed in
the developerWorks Power Architecture technology content area's downloads section.
About the author
Hollis Blanchard started learning about the PowerPC architecture and
the Linux kernel in 1998. He works in the IBM Linux Technology Center,
where he's developed for embedded PowerPC, pSeries servers, and x86
systems. He's also one of the core contributors to penguinppc.org.