SGI's JDK 1.1.5: More Than A Port

by
Hank Shiffman
Strategic Technologist
Silicon Graphics, Inc.
January, 1998

A product's release number tells us something about what to expect from it. A change from version 1 to version 2 is dramatic, with huge changes in features, implementation and performance. The move from version 1 to 1.1 is less earthshaking but still significant. It identifies a product that has some significant new features and enhancements but is mostly unchanged from the earlier version. A dot.dot release like 1.1.1 is the least dramatic of all; this kind of increment is used for bug fix releases.

So what does this tell us about Silicon Graphics' new release of the Java Development Environment version 3.1, based on Sun's 1.1.5 JDK? From Sun's numbering we see that this release consists of bug fixes and some small performance improvements, but little of note or concern to the Java programmer. But SGI's numbering reveals that more is going on here than simply a port of the latest Sun source. Hidden in this new Java environment are some important capabilities. These new features don't change the code that a Java programmer can write; instead they improve the performance of that code and the ways it can interact with the rest of the system.

Improved Performance: N32

Java 3.1 includes a version of the Java runtime environment built using the MIPS N32 binary interface. N32 provides better performance by taking advantage of 64-bit instructions and registers available in every MIPS processor since the R4000 but unused by the earlier O32 ABI. For Java, N32 means faster operations using the long and double data types, as well as faster procedure calling by passing more arguments in registers. Applications using the new runtime can expect to run about 10% faster on average because of this change. Pendragon Software's CaffeineMark benchmark shows a 15% improvement in the Float test.

Java 3.1 also includes an updated version of the old O32 runtime system. This version will be important to developers who have already integrated Java with their own native code using the Java Native Interface. An application can be built using O32 or N32 but not a combination of the two. As a result, developers of native interfaces to Java have until now been restricted by the O32-built Java runtime to using O32 for their own code, which limited them to the language and performance features of the older ucode compilers. New developers of native code for Java will want to take advantage of N32 to get better performance and new language features; those with existing O32 code can keep using it with the updated O32 runtime.

By default, invoking the Java runtime using the appletviewer or java commands will make use of the N32 version. A runtime switch on either command can be used to select the O32 runtime.

Improved Performance: Thread Synchronization

Because Java is a multithreaded language, its core libraries must be thread-safe. This means that they must allow for the possibility that more than one thread in a program will be using them at any given moment. We make our classes thread-safe using synchronization; we place the synchronized keyword on any method or block which must not be run concurrently from multiple threads. When a thread enters a synchronized block it attempts to acquire a lock on the relevant object or class. (We identify an object explicitly when we write a synchronized block. For a synchronized method we lock the object on which the method is invoked, or the entire class for a static method.) If another thread already holds the lock, this thread is suspended until the lock is released. In this way we prevent multiple threads from interfering with each other.
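The three forms of synchronization described above can be sketched in a few lines of Java. This is an illustration only: the Counter class, its fields and the runDemo helper are hypothetical names invented for the example, not part of any SGI or Sun library.

```java
// Hypothetical example class; none of these names come from the JDK.
class Counter {
    private static int instances = 0;
    private final Object auditLock = new Object();
    private int count = 0;
    private int audits = 0;

    Counter() { noteInstance(); }

    // Static synchronized method: the lock belongs to the Counter class.
    static synchronized void noteInstance() { instances++; }
    static synchronized int instances() { return instances; }

    // Synchronized instance method: the lock belongs to the object
    // on which the method is invoked (this).
    synchronized void increment() { count++; }
    synchronized int get() { return count; }

    // Synchronized block: we name the object to lock explicitly.
    void audit() {
        synchronized (auditLock) { audits++; }
    }

    // Runs several threads against one Counter and returns the final count.
    static int runDemo(int threads, final int perThread) {
        final Counter c = new Counter();
        Thread[] pool = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            pool[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    c.increment();
                    c.audit();
                }
            });
            pool[t].start();
        }
        for (Thread th : pool) {
            try {
                th.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return c.get();
    }
}
```

Calling Counter.runDemo(4, 1000) should return 4000 every time. Without the synchronized keyword on increment, some updates could be lost when two threads read and write count at the same moment.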

Multithreading (MT) is fundamental to the design of Java, and locking is fundamental to multithreading. So the performance of locking is vital to the performance of Java in general. Even single-threaded Java programs are affected, since they depend on class libraries which may also be used by MT code. And unfortunately Java's design calls for a complex and slow mechanism for dealing with locks.

In Java a lock is associated either with an object or a class. Although any object may have a lock assigned to it, most will never need one. So rather than store a lock directly inside an object, increasing the size of every object in the system, Java maintains this information in a hash table. Before a procedure can acquire a lock it must hand the associated object to the hash table; the hash routine returns the lock object. One further complication is that the hash table itself is changing as objects and their locks are created and destroyed. It would be a disaster if multiple threads were to access and/or modify the hash table at the same time, which means that we need to protect the hash table with its own lock.

So the process of synchronization goes something like this: When we enter a synchronized block we first attempt to acquire the global lock on the hash table of locks. If it's already locked we wait until it becomes free. Then we hand the hash table the object we want to lock; it returns the lock structure. We then release the lock on the hash table and attempt to acquire the lock on our object. Once we get it we can start executing our block of code, releasing the lock when we're done.
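To make the two-step process concrete, here is a rough model of it written in Java itself. It is a sketch under loud assumptions, not the runtime's actual implementation: MonitorTable, enter, exit and demo are hypothetical names, Java's own locks stand in for the runtime's internal primitives, and entries are never removed from the table (a real runtime would reclaim them).

```java
import java.util.IdentityHashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical model of a table that maps objects to their locks.
class MonitorTable {
    // The global lock that protects the table itself.
    private final Object tableLock = new Object();
    // Keyed by object identity, so equals() and hashCode() are never called.
    private final Map<Object, ReentrantLock> monitors = new IdentityHashMap<>();

    // Acquire the table lock, find (or create) the object's lock,
    // then release the table lock on the way out.
    private ReentrantLock lookup(Object obj, boolean create) {
        synchronized (tableLock) {
            ReentrantLock m = monitors.get(obj);
            if (m == null && create) {
                m = new ReentrantLock();
                monitors.put(obj, m);
            }
            return m;
        }
    }

    // Entering a synchronized block: table lookup first, object lock second.
    void enter(Object obj) {
        lookup(obj, true).lock();
    }

    // Leaving the block: find the lock again and release it.
    void exit(Object obj) {
        ReentrantLock m = lookup(obj, false);
        if (m == null) {
            throw new IllegalMonitorStateException("object was never locked");
        }
        m.unlock();
    }

    // Several threads increment one shared total under the modeled lock.
    static int demo(int threads, final int perThread) {
        final MonitorTable table = new MonitorTable();
        final Object guarded = new Object();
        final int[] total = new int[1];
        Thread[] pool = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            pool[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    table.enter(guarded);
                    try {
                        total[0]++;
                    } finally {
                        table.exit(guarded);
                    }
                }
            });
            pool[t].start();
        }
        for (Thread th : pool) {
            try {
                th.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return total[0];
    }
}
```

Note that every enter and exit pays for two lock acquisitions: one on the table and one on the object. That doubled cost is the overhead this section is about.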

(Note that locking in languages like C and C++ can be a lot simpler and faster. Lacking this implicit association between objects and locks, they permit the programmer to go directly to the acquisition of the lock. They assume that the programmer will manage the relationship between locks and the objects they protect, which is a bit more work but a lot less overhead.)

Locking is both fundamental to Java and a source of much of its poor performance. For example, a surprising amount of the overhead associated with the String class can be attributed to synchronization. By reducing the time it takes to find and acquire a lock, we can improve the performance both of MT programs and of single-threaded code that uses thread-safe classes like String.

(Sun uses a clever hack to speed up locking in the green thread library it ships with its JDK. Since green threads implement MT on top of a single-threaded process, they can lock-protect the hash table by simply freezing the scheduler. Obviously, if no other thread can run then no one can upset the table while we're using it. This technique falls apart when we attempt to move Java to a multiprocess or multiprocessor implementation.)

Through some sophisticated monitor caching and other efficiencies, Java 3.1 reduces locking overhead by roughly two thirds. The time to process a null synchronized method (no real code; just acquire and release the lock) has dropped from about 2 microseconds to 600 nanoseconds. The CaffeineMark String test shows a speed improvement of 62% over Java 3.0.1 using the O32 runtime and a doubling of performance using the N32 default, most of which can be attributed to faster locking. The overall CaffeineMark rating for this release shows an improvement of 11% using O32 and 21% with N32.
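A null synchronized method is easy to time for yourself. The sketch below is one way to do it, not the harness that produced the figures above: NullSyncTimer and nanosPerCall are hypothetical names, and System.nanoTime is a later addition to Java than JDK 1.1. Treat the result with suspicion on a modern runtime, where the translator may simplify or remove an uncontended lock entirely.

```java
// Hypothetical micro-benchmark; names are invented for this example.
class NullSyncTimer {
    // No real code: calling this only acquires and releases the lock.
    synchronized void nullSync() { }

    // Returns the average cost of one call in nanoseconds.
    static double nanosPerCall(int iterations) {
        NullSyncTimer t = new NullSyncTimer();
        // Warm-up pass so that translation cost is not part of the timing.
        for (int i = 0; i < iterations; i++) {
            t.nullSync();
        }
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            t.nullSync();
        }
        long elapsed = System.nanoTime() - start;
        return (double) elapsed / iterations;
    }
}
```

For example, NullSyncTimer.nanosPerCall(1000000) returns a per-call average that can be compared across runtimes. The loop overhead is included in the figure, so the number is an upper bound on the cost of the lock itself.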

POSIX Threads

By default, Java 3.1 uses Sun's green thread library to implement Java's MT capabilities. But this release provides another option: MT implemented on top of the POSIX-standard pthread library supported by IRIX 6.3 and later releases and available as a patch to IRIX 6.2. Using this option, Java threads are layered on top of pthreads. Pthreads can distribute themselves across the CPUs in a multiprocessor system, giving a single Java application the potential for better performance than is available from a single processor. This option also makes it possible for developers to take advantage of MT in their native code and integrate that code with multithreaded Java. (Probably not an exercise for the faint of heart.)
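The Java source is the same whichever thread library sits underneath. As a small illustration, here is a computation split across several threads; ParallelSum and its sum method are hypothetical names made up for this example. Under green threads the workers share one processor; layered on pthreads, the same code has the potential to spread across the CPUs of a multiprocessor.

```java
// Hypothetical example; names are invented for illustration.
class ParallelSum {
    // Adds the integers 1..n using the given number of worker threads.
    static long sum(final long n, final int workers) {
        final long[] partial = new long[workers];
        Thread[] pool = new Thread[workers];
        for (int w = 0; w < workers; w++) {
            final int id = w;
            pool[w] = new Thread(() -> {
                long s = 0;
                // Worker id handles id+1, id+1+workers, id+1+2*workers, ...
                for (long i = id + 1; i <= n; i += workers) {
                    s += i;
                }
                partial[id] = s;   // each worker writes only its own slot
            });
            pool[w].start();
        }
        long total = 0;
        for (int w = 0; w < workers; w++) {
            try {
                pool[w].join();    // wait for the worker before reading its slot
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
            total += partial[w];
        }
        return total;
    }
}
```

Because each worker writes to its own array slot and the main thread reads a slot only after joining that worker, this example needs no synchronized blocks at all.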

Pthreads are more sophisticated than green threads. As a result, they are heavier in their system demands. For many MT programs, the pthread-based environment will produce somewhat lower performance. (This difference will be felt most strongly in programs that generate large numbers of threads.) A good rule of thumb is to use the pthread environment where you want to use threads in your native code or where you want better performance on an MP system. If neither of these applies, you might want to try the pthread environment to see what effect it has on performance.

As with switching from the N32 Java runtime to O32, you can invoke the pthread runtime using a switch on the appletviewer or java command. Both green threads and pthreads are supported with N32 or O32.

Native Code Translation

Earlier releases of Java on IRIX supported conversion of Java byte code to MIPS native instructions for much improved performance. This was available both as a just-in-time (JIT) translator, which converted each Java class as it was first used, and as the batch-mode javat command. Java 3.1 eliminates the javat command; the overhead of on-the-fly translation is so low and the disk requirements for batch-translated native code so high as to remove any justification for the existing javat. (Batch translation is likely to return at some point in the form of a true optimizing compiler for Java. High quality optimization is too time-consuming for a just-in-time translator to accomplish.)

Java 3.1 introduces an important and all but invisible change to the way JIT translation is done: instead of translating all of the methods of a class at once, the translator now converts just the method being invoked. This reduces translation time, especially for classes with a large number of methods, reduces perceived translation time (assuming you can detect it in the first place) and reduces the memory footprint of the application.
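The idea of translating one method at a time can be modeled in ordinary Java. This is a toy sketch and says nothing about how the SGI translator is actually built: LazyTranslator, define, invoke and translationCount are hypothetical names, a method is represented by a simple function on int, and "translation" is just a counter. The model is also not thread-safe.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.IntUnaryOperator;

// Hypothetical model of per-method, on-demand translation.
class LazyTranslator {
    // Stand-in for the untranslated byte code of each method in a class.
    private final Map<String, IntUnaryOperator> byteCode = new HashMap<>();
    // Methods that have already been translated.
    private final Map<String, IntUnaryOperator> translated = new HashMap<>();
    private int translations = 0;

    void define(String name, IntUnaryOperator body) {
        byteCode.put(name, body);
    }

    // Translate a method the first time it is invoked, then reuse the result.
    int invoke(String name, int arg) {
        IntUnaryOperator code = translated.get(name);
        if (code == null) {
            code = translate(name);
            translated.put(name, code);
        }
        return code.applyAsInt(arg);
    }

    private IntUnaryOperator translate(String name) {
        IntUnaryOperator body = byteCode.get(name);
        if (body == null) {
            throw new IllegalArgumentException("no such method: " + name);
        }
        translations++;   // a real translator would emit native code here
        return body;
    }

    int translationCount() {
        return translations;
    }
}
```

If a class defines three methods and the program only ever calls one of them, translationCount() stays at 1 however many times that method is invoked. A per-class translator would have paid for all three up front.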

(The earlier per-class translation was designed on the assumption that per-method translation would be too expensive, since each newly-translated method would require the invalidation of the processor's instruction cache. Later experiments showed that this wasn't so; the overhead was more than compensated by the reduction in the number of methods to be translated and the reduced working set size of the application. Which demonstrates an important lesson: with a dynamic language like Java the only things we know for certain are the things we have measured. And we can't be entirely sure about them.)

As with the other features of this release, the changes to native code translation apply both to the O32 and N32 versions of the Java runtime.

In Conclusion

Java 3.1 also contains a number of smaller performance and reliability enhancements in addition to those provided by Sun as part of JDK 1.1.5. It supports IRIX 6.2 and later releases and will be the standard JDK for IRIX 6.5. It is available now on the Silicon Graphics web site.

Sun has scheduled the release of the Java Development Kit 1.2 for early summer of 1998. This release includes many new features, including the Java-based Java Foundation Classes that replace the AWT window toolkit. Sun's plan is for all Java licensees to release their 1.2-compliant environments at the same time. Silicon Graphics will release our 1.2 product in line with that plan.
