Building a Simple Java Profiler

Profiling Java programs has always been a bit of an adventure. For a long time, the only profiling support was built into Sun’s JVM, and the proper command-line incantations would unleash floods of profiling data from the VM. Of course, then the fun part was wading through all of that information to try to make sense of it! It was quite an adventure, profiling programs that way.

After a while it got easier, with the introduction of the Java Native Interface (JNI), and then the Java VM Profiler Interface (JVMPI). But, the cost of building your own profiler was still really high, because you primarily had to work in C or C++! A few commercial products like JProbe and OptimizeIt were available, and actually made profiling reasonably painless. The primary pain was in your wallet, unless you were able to convince your company to pick up some licenses for developers to use.

But then Java 1.5 (or 5.0, sigh) came along, and suddenly it got really easy to implement your own Java profiler! Java 1.5/5.0 introduces two incredibly useful features that make profiler-construction a breeze:

  • Pure-Java support for high-resolution timers, via the System.nanoTime() function.
  • Pure-Java support for transforming or instrumenting classes at load-time, via the java.lang.instrument package.

With these two new features, it becomes possible to write all kinds of neat tools, not just Java profilers! For example, with the new annotation support in Java 1.5, you can specify preconditions and postconditions on your class-methods, and invariants on your classes, and then incorporate them right into your classes at load-time so that they are automatically tested as your code runs. Or, you can implement a debug tool that will instrument all of your classes with debug-trace output. This is very neat stuff!

Of course, this is only part of the picture, because what the java.lang.instrument package really provides for you is the opportunity to implement this interface called ClassFileTransformer, which has one method that takes, among other things, a byte-array containing a class’ bytecodes! Your mission, if you choose to accept it, is to return another byte-array, containing the modified version of that class. Hopefully in a state that will actually work.

So, obviously there is going to be a significant amount of bytecode manipulation going on, and obviously a lot of things could go wrong, many of which will cause the JVM to become very cross with you. This is why the smart thing to do is to let somebody else worry about the hard part, and employ one of the many Java bytecode-manipulation libraries that are available. I chose the JBoss Javassist package, because it seemed to be quite straightforward to use, and powerful enough to do everything I wanted, and more! One of the neatest features of the Javassist library is that it is able to transform Java syntax into bytecodes, which makes it much easier to manipulate classes and methods. In fact, with Javassist, I don’t really ever deal with bytecodes. I know they’re there, but they’re nicely out of sight.

And, hopefully, the fact that Javassist is from the JBoss group won’t be a problem anytime in the future. There always seem to be interesting little issues, spats, and project-forks when it comes to JBoss. But I digress.

Starting at the Beginning

The obvious place to start is to get a simple Java instrumentation agent up and running. This task is quick and easy, and you get a pretty instantaneous reward for pretty minimal efforts. And that is always the name of the game when you are building a project in your spare time!

The details on how to do this are in the package-documentation for java.lang.instrument, but the quick summary is that a Java instrumentation agent has to provide one entry-method, similar to what a Java application provides:

    public static void premain(String agentArgs, Instrumentation inst);

The only tricky part of this is that Java instrumentation-agents get a single String argument containing all of the configuration properties, instead of a nice array of Strings. But that is easy enough to deal with. My personal preference is to format my agent-arguments like URL-arguments:

    name1=value1&name2=value2&…

The first thing your agent can do is to cache that Instrumentation reference - you want to be able to use that later! The second thing to do is to add a class-file transformer to the Instrumentation instance. This transformer will be called for just about every class that is loaded by the Java VM, and it will be able to do whatever it wants to each class, before its bytecodes are handed over to the VM.

So, here is a simple example of a Java instrumentation agent called PerfMonAgent:

package net.dp3.perfmon;

import java.lang.instrument.Instrumentation;
import java.lang.instrument.ClassFileTransformer;

public class PerfMonAgent
{
  /**
   * A reference to the {@link java.lang.instrument.Instrumentation} instance
   * passed to this agent's {@link #premain} method.  This way we can keep using
   * the Instrumentation functionality!
   **/
  static private Instrumentation inst = null;

  /**
   * This method is called before the application’s main-method is called, when
   * this agent is specified to the Java VM.
   **/
  public static void premain(String agentArgs, Instrumentation _inst)
  {
    System.out.println(”Donnie’s Agent.premain() was called.”);

    // Initialize the static variables we use to track information.
    inst = _inst;

    // Set up the class-file transformer.
    ClassFileTransformer trans = new PerfMonXformer();
    System.out.println(”Adding a PerfMonXformer instance to the JVM.”);
    inst.addTransformer(trans);
  }
}

And, here is the first cut of the class-file transformer, PerfMonXformer:

package net.dp3.perfmon;

import java.io.ByteArrayInputStream;
import java.io.IOException;

import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.IllegalClassFormatException;
import java.lang.instrument.Instrumentation;

import java.security.ProtectionDomain;

public class PerfMonXformer implements ClassFileTransformer
{
  public byte[] transform(ClassLoader loader, String className,
    Class<?> classBeingRedefined, ProtectionDomain protectionDomain,
    byte[] classfileBuffer)
    throws IllegalClassFormatException
  {
    byte[] transformed = null;

    System.out.println(”Transforming ” + className);

    return transformed;
  }
}

Yeah, it’s pretty bare-bones so far, but soon it’ll get far more interesting.

Packaging the Agent

Once you get these files written up and compiling, the next major challenge is to package up the Java instrumentation agent into a JAR-file that can be specified on the command-line. This task is just like any other JAR-building task, except that you need to specify a slightly different set of attributes in the JAR’s manifest:

  • Instead of Main-Class, you must specify a Premain-Class attribute that gives the full class-name of the class that implements premain() in your agent. For my agent, this is simple:

        Premain-Class: net.dp3.perfmon.PerfMonAgent
  • If your agent uses any class-libraries, then you should specify the Boot-Class-Path attribute, since your instrumentation agent will need its libraries to be visible from the bootstrap classloader. If you don’t do this, you will probably have to use the wacky -Xbootclasspath/a:... argument to the JVM. The command-line argument is a decent way to get things going, but you want to use this attribute in the long-run, because the command-line is already growing from having to specify the Java instrumentation agent. Might as well keep it as simple as possible.
  • Finally, there is the Can-Redefine-Classes attribute. If it is set to true, then the Java instrumentation agent is able to redefine the classes that the agent itself uses. This is a pretty odd circumstance, and certainly won’t arise in the performance-monitor agent, so we will leave this one alone.

Once you get all of this worked out (Hint: Use Ant!), you are ready to try out your new Java instrumentation agent!

Running the Agent

Java instrumentation agents are pretty easy to run in Java 1.5. All you need to do is to specify the -javaagent argument to java, and everything just magically happens from there. For example:

C:\projects\perfmon>java -javaagent:perfmon.jar net.dp3.perfmon.Example
Donnie's Agent.premain() was called.
Adding a PerfMonXformer instance to the JVM.
Transforming net/dp3/perfmon/Example
...

Sweet. Now that the basics are taken care of, it will be a lot easier to enhance the Java agent with additional features, such as instrumenting classes for performance-monitoring.

Instrumenting Java Classes

There are two major kinds of profiling that are done on programs. The first kind, CPU-profiling, is the most common, and simply involves finding out which parts of your program are taking the most time to run. The second kind is called heap-profiling, and involves finding out how your program uses memory. This is also very important, especially in languages like Java. Yes yes, I know that Java has garbage-collection, and memory-management isn’t supposed to be a concern, and all that. But, the reality is that you can still run into issues, even with Java’s simplified memory-management model:

  • Even Java programs can leak memory, if object-references aren’t cleaned up properly.
  • Java classes incur some additional memory-overhead, and this can add up quickly if you have a lot of small classes. For example, a simple class that contains an int and two object-references can take up 24 bytes, even though integers are 4 bytes, and references are also (probably) 4 bytes. That overhead can add up quickly if you have a lot of these objects!

A heap-profiler will tell you how your program’s memory is being used, and then you can figure out how to make your programs more efficient.

CPU Profiling

Instrumenting a class for CPU-profiling is straightforward, especially now that Java 1.5 provides the System.nanoTime() method. At the simplest level, all you do is store a timestamp at the start of the code you are profiling, and then at the end, use that initial timestamp to figure out how much time has gone by. Something like this:

long __stime = System.nanoTime();

... // Code being measured.

recordElapsedTime(System.nanoTime() - __stime);

There are many variations on this theme too, such as recording the amount of time spent entirely in a particular method, excluding any other method-calls or constructor-calls. This measurement is almost as easy to calculate as the overall time, but you need two variables instead of just one. One variable stores the timestep of a particular interval of the method’s code, and the other variable stores the running count. Like this:

long __sitime;
long __local = 0L;

__sitime = System.nanoTime();

... // Code that doesn't include methods or constructor-calls.

__local += System.nanoTime() - __sitime;

someMethod();

__sitime = System.nanoTime();

... // More code that doesn't include methods or constructor-calls.

__local += System.nanoTime() - __sitime;

recordLocalElapsedTime(__local);

Heap Profiling

Instrumenting a class for heap-profiling actually turns out to be pretty simple too. The technique is a bit brute-force, but it is very straightforward to implement, and gives useful results. The main statistic that we want to track is the number of objects that are created for any particular class, and how many are reclaimed. This can be done easily by instrumenting each constructor to report an instantiation of the class, and also instrumenting the class’ finalize() method (i.e. its finalizer) to report a destruction of the class. Now, most classes don’t have a finalize() method! So, in the process of instrumenting classes for heap-profiling, in most cases you are going to have to add a finalize() method to the class! And, savvy Java developers know that when you implement finalize(), you must call the superclass’ finalize() method, so the “fake” finalize() method does exactly this:

protected void finalize() throws Throwable
{
  super.finalize();
}

So, when instrumenting a class for heap-profiling, we can add in a finalize() method if one doesn’t already exist, and then we can instrument our fake finalizer.

The last useful bit of information we would like to have is the actual size of a particular object in memory. This is where the java.lang.instrument.Instrumentation instance comes in handy! The Instrumentation interface has this useful method called getObjectSize(Object objectToSize), and it will return an “implementation-specific approximation of the amount of storage consumed by the specified object.” In my experience, this estimate is dependent on the class, and not on which instance of the class it is called on. This is because the method only calculates the space used by the object itself - any other objects that the object refers to are excluded from the estimate. So, if you are interested in more sophisticated measurements then you have to do a little extra footwork. But, for a simple Java profiler, the information provided by the getObjectSize() method is quite useful.

The Javassist Library

more soon…