Making Sense of Java

There is as much misinformation about Java as there is information. On this page I have listed some common claims and beliefs about Java, along with a description of how accurate the claims are and where they go astray:

* Java is a language for writing web pages; it's like HTML and VRML
* Java is easy to learn and use, unlike C, C++ and other programming languages
* Java code is portable, where C and C++ are not
* Java solves the problem of cross-platform application development
* Java can be extended to do anything the machine can do
* Java is suitable for building large applications
* Java's performance problems are temporary; it'll soon be as fast as C++
* Java is interpreted; Basic is interpreted; Java = Basic
* Java in a browser eliminates the need for CGI scripts and programs
* Netscape's JavaScript is related to Java
* Java will replace C++ as the language of choice

This is not a tutorial on Java; at best it's an effort to respond to the wild claims made about Java in the press and companies' marketing literature. For more information and commentary, take a look at Marty Hall's Java resource site.

Java is a language for writing web pages; it's like HTML and VRML

Java isn't a page description language like HTML. It's a programming language. Description languages specify content and placement; programming languages describe a process for generating a result. Where there is generally a direct mapping between an HTML description of a document and the result, the relationship between a Java program and its result is likely to be more complex. It's a little like the difference between a list of the square roots of the numbers from one to 10 and a program to calculate that list.

Here's an HTML table of square roots:

sqrt(1) = 1
sqrt(2) = 1.41421
sqrt(3) = 1.73205
sqrt(4) = 2
sqrt(5) = 2.23607
sqrt(6) = 2.44949
sqrt(7) = 2.64575
sqrt(8) = 2.82843
sqrt(9) = 3
sqrt(10) = 3.16228

And here's the result of a Java applet:

You need a Java-aware browser

This is the code that specifies the Java code to run:

<APPLET CODEBASE="java" CODE="SqrtList" WIDTH=160 HEIGHT=162>
  <EM>You need a Java-aware browser</EM>
</APPLET> 

The <APPLET> tag specifies the class to load (the CODE= field), URL information (the CODEBASE= field) and the size of the region the applet will own. Notice that Java doesn't exactly integrate with the rest of the page. Within that region of the page Java is king: it decides background color and fonts and does all the mouse and keyboard handling. Contrast this behavior with the JavaScript example later in this document.

Parameters to the applet are placed in <PARAM> tags between the <APPLET> and </APPLET> tags. A Java-aware browser ignores anything else between these tags. It's common to include some information here for display by browsers that don't know about Java, since they'll ignore the <APPLET> and <PARAM> tags and display whatever else they find there.
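The source of the SqrtList applet isn't reproduced on this page. As a rough idea of what "a program to calculate the list" looks like, here is a minimal sketch of the calculation alone, written as a plain class rather than an applet. The class name SqrtListSketch and the method lines are my own inventions for illustration, and the number format is a guess chosen to match the table above.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical stand-in for the applet's calculation; the real SqrtList
// source is not shown on this page.
class SqrtListSketch {
    // Builds one "sqrt(n) = value" line for each n from 'from' to 'to' inclusive.
    static List<String> lines(int from, int to) {
        // At most five decimal places, trailing zeros dropped, '.' as the separator.
        DecimalFormat fmt = new DecimalFormat(
                "0.#####", DecimalFormatSymbols.getInstance(Locale.ROOT));
        List<String> out = new ArrayList<>();
        for (int n = from; n <= to; n++) {
            out.add("sqrt(" + n + ") = " + fmt.format(Math.sqrt(n)));
        }
        return out;
    }

    public static void main(String[] args) {
        for (String line : lines(1, 10)) {
            System.out.println(line);
        }
    }
}
```

The HTML table has to be retyped to cover a different range; the program only needs different arguments.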

Java is easy to learn and use, unlike C, C++ and other programming languages

Make no mistake about it: Java is a programming language. If you find Pascal hard, you won't care for Java. Writing in Java may be different in degree from C or C++, but it is not different in kind.

Is Java easy to learn? It may be somewhat easier than C or C++. Not because its syntax is any simpler, but more because there are fewer surprises. (Try explaining the relationship between C pointers and arrays some time. And C++ adds lots of its own peculiarities, like temporary variables that hang around long after the function that created them has terminated.)

Is Java easier to use? Again the answer is a firm maybe, possibly, perhaps. It eliminates explicit pointer dereferences and memory allocation/reclamation. These two features are the source of many of the hardest-to-find bugs C programmers have to deal with. And Java does add array bounds checking, so out-of-range subscripts are easy to find. It's too soon to tell whether Java is really easier or just seems that way because no one is writing anything truly complex with it.
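To make the bounds checking point concrete, here is a small sketch. The class and method names are made up for the example. An out-of-range subscript raises ArrayIndexOutOfBoundsException at the offending line instead of quietly reading whatever happens to sit next to the array in memory.

```java
// Illustrative only: BoundsCheckDemo and readAt are hypothetical names.
class BoundsCheckDemo {
    // Returns a description of values[index], or of the failure if the
    // index is outside the array.
    static String readAt(int[] values, int index) {
        try {
            return "value " + values[index];
        } catch (ArrayIndexOutOfBoundsException e) {
            // The run time checks every subscript and reports the bad one.
            return "caught out-of-range index " + index;
        }
    }

    public static void main(String[] args) {
        int[] values = {10, 20, 30};
        System.out.println(readAt(values, 1));
        System.out.println(readAt(values, 3));
    }
}
```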

Java code is portable, where C and C++ are not

Java source code is a little more portable than source code written in C or C++. In C and C++, each implementation decides the precision and storage requirements for basic data types (short, int, float, double, etc.). This is a major source of porting problems when moving from one kind of system to another, since changes in numeric precision can affect calculations and assumptions about the size of structs can be violated. Java defines the size of basic types for all implementations; an int on one system is the same size (and can represent the same range of values) as on every other system. Java also does not permit arbitrary pointer arithmetic, so assumptions about struct packing and sizes can't lead to non-portable coding practices.
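A short sketch of what fixed type sizes buy you. The class name is hypothetical; Integer.SIZE, Long.SIZE and Integer.MAX_VALUE are standard library constants. Because an int is 32 bits everywhere, even the overflow behavior is the same on every system.

```java
// Illustrative only: FixedWidthTypes is a hypothetical class name.
class FixedWidthTypes {
    // Number of bits in an int; the language fixes this at 32.
    static int bitsInInt() {
        return Integer.SIZE;
    }

    // Number of bits in a long; the language fixes this at 64.
    static int bitsInLong() {
        return Long.SIZE;
    }

    // Adding one to the largest int wraps around to the smallest int,
    // and does so identically on every implementation.
    static int largestIntPlusOne() {
        int max = Integer.MAX_VALUE;
        return max + 1;
    }

    public static void main(String[] args) {
        System.out.println("int bits: " + bitsInInt());
        System.out.println("long bits: " + bitsInLong());
        System.out.println("MAX_VALUE + 1 = " + largestIntPlusOne());
    }
}
```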

(One reader of this page points out that while storage requirements for float and double are defined by Java, precision during calculation is not. This means that a program that uses floating point arithmetic can produce different answers on different systems, with the degree of difference increasing with the number of calculations a particular value goes through. This is true of floating point in general, not just in Java, and explains why the Cobol world continues to rely on bizarre data types like COMPUTATIONAL-3 (binary coded decimal) for calculations where accuracy matters.)
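The following sketch doesn't show results differing between systems. It illustrates the related point about why decimal types survive: binary floating point can't represent a value like 0.1 exactly, so error creeps in as calculations accumulate. It uses java.math.BigDecimal as a rough analogue of a decimal type like COMPUTATIONAL-3; the comparison is mine, and the class name is hypothetical.

```java
import java.math.BigDecimal;

// Illustrative only: DecimalVsBinary is a hypothetical class name.
class DecimalVsBinary {
    // Adds 0.1 ten times using binary floating point.
    static double doubleSum() {
        double sum = 0.0;
        for (int i = 0; i < 10; i++) {
            sum += 0.1;
        }
        return sum;
    }

    // Adds 0.1 ten times using decimal arithmetic.
    static BigDecimal decimalSum() {
        BigDecimal sum = BigDecimal.ZERO;
        BigDecimal tenth = new BigDecimal("0.1");
        for (int i = 0; i < 10; i++) {
            sum = sum.add(tenth);
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println("double sum equals 1.0?  " + (doubleSum() == 1.0));
        System.out.println("decimal sum equals 1?   "
                + (decimalSum().compareTo(BigDecimal.ONE) == 0));
    }
}
```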

Where Java is more portable than other languages is in its object code. Most language compilers generate the native code for the target computer, which then runs at the best speed of which the system is capable. Java compiles to an object code for a theoretical machine; the Java interpreter emulates that machine. This means that Java code compiled on one kind of computer will run on every other kind of computer with a Java interpreter. The tradeoff is in performance: the interpreter adds a significant level of overhead to the program.
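To give a feel for what "object code for a theoretical machine" means, here is a toy interpreter for an invented stack machine. The four opcodes below are made up for this sketch and are not the real Java byte code set. The point is only that the same array of numbers produces the same answer wherever the interpreter runs, and that every instruction pays for a trip through the dispatch loop.

```java
// Illustrative only: a toy machine with invented opcodes, not the JVM.
class ToyStackMachine {
    static final int PUSH = 0;  // push the next code word onto the stack
    static final int ADD = 1;   // pop two values, push their sum
    static final int MUL = 2;   // pop two values, push their product
    static final int HALT = 3;  // stop and return the top of the stack

    static int run(int[] code) {
        int[] stack = new int[16];
        int sp = 0;  // index of the next free stack slot
        int pc = 0;  // index of the next code word to execute
        while (true) {
            int op = code[pc++];
            switch (op) {
                case PUSH:
                    stack[sp++] = code[pc++];
                    break;
                case ADD: {
                    int b = stack[--sp];
                    int a = stack[--sp];
                    stack[sp++] = a + b;
                    break;
                }
                case MUL: {
                    int b = stack[--sp];
                    int a = stack[--sp];
                    stack[sp++] = a * b;
                    break;
                }
                case HALT:
                    return stack[sp - 1];
                default:
                    throw new IllegalArgumentException("unknown opcode " + op);
            }
        }
    }

    public static void main(String[] args) {
        // Computes (2 + 3) * 4.
        int[] program = {PUSH, 2, PUSH, 3, ADD, PUSH, 4, MUL, HALT};
        System.out.println(run(program));
    }
}
```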

Note that this extra overhead can be reduced considerably by just-in-time compilation techniques. When the Java interpreter receives a chunk of code to execute, it could convert it from Java object code into the native code of the machine and then execute the real code. This adds some overhead during the translation process but permits the resulting code to run at close to native speeds. Java is still likely to be slower than C or C++, due to some features of the language intended to ease development. It's hard to know how close well-optimized native Java code can get to the best C or C++. But a range of 50% to 200% slower (1.5x to 3x the execution time) seems a reasonable guess.

But it's important to note that an application written in Java is still not 100% portable. An application written on one kind of system will still need to be tested on every platform before one can say with certainty that there are no problems. Even if the Java code itself were 100% portable (and it isn't; just compare the peculiarities of the Sun implementation of threads with Netscape's), every time the code goes out to native runtime code it encounters incompatibilities: the window toolkit and networking support are riddled with such problems.

Java solves the problem of cross-platform application development

Thanks to its portable byte code, the same Java applet will run anywhere the Java Virtual Machine runs. This leads to the logical conclusion that Java is the perfect language for writing applications that need to run across multiple platforms, especially the kind of lightweight enterprise-level applications that IS departments spend much of their time developing.

Java, coupled with a database connectivity package like JDBC, is a good language for things like database front ends and other lightweight applications. It's far more cross-platform than current solutions like PowerBuilder, Delphi or Visual Basic, easier to manage (no installation; just point at a web page) and potentially much higher performance than all but Delphi (which is based on compiled Pascal). But it doesn't solve all the problems of cross-platform development, as a few days reading any of the Java newsgroups will show. There are three major limitations to Java's ability to do clean cross-platform execution:

The best cross-platform development and delivery environment I have ever seen was the ParcPlace Smalltalk environment. Every implementation on every supported platform was identical from the programmer's and the program's perspective. Every program behaved identically and looked identical no matter where it ran. Of course, there was a cost associated with this uniformity: although every Smalltalk program looked like every other Smalltalk program, they didn't look at all like any other application running on the same machine. Smalltalk programs on the Macintosh looked like Smalltalk programs; they didn't look like Macintosh programs.

Until and unless we reach a point where every system looks and behaves like every other (a point Microsoft appears to be praying for with great devotion), it will not be possible to write applications that look and feel like others on our development platform and on every other platform on which they run. Not, at least, without some serious work on the part of the developer.

Java can be extended to do anything the machine can do

In theory a Java applet can do anything: manipulate 3D VRML models, play movies, make sounds, you name it. In practice it's a lot more limited. We've already seen one important limitation: an applet has control of its region of the page but no ability to operate elsewhere on the page. But here's a more serious one: an applet can do only what the run time environment allows. Today that's limited to some low level graphics, user interfaces (buttons, menus, text fields and the like), reading and writing files (under strict security guidelines), playing a sound file and some network capabilities.

What's missing? There is no way today to control a VRML model. And what if we want to do more to a sound file than just play it? What if we want to change the volume? Or do a fade in? Or add a reverb? None of these effects exist today in Java's official bag of tricks. (Some are available through undocumented classes that Sun ships with the JDK. Anything undocumented is risky, since there's no support, no guarantee of compatible behavior across platforms and no guarantee that these interfaces won't change. Caveat Emptor.) Anything that Java doesn't support would need to be written in a fully compiled language like C and then made available to the Java run time environment.

And therein lies the real limitation. To do more than Java can do today requires that we do two things: write new libraries that can be used by the Java interpreter; and then make each of those libraries available on every single system that might try to use these new capabilities. An applet is only usable if the capabilities on which it depends are available wherever we want to run it. Although we can download applets at the moment we want to run them, we can't do the same with the underlying libraries. Java's built-in security makes downloading an applet low in risk; the same can't be said for arbitrary code libraries which do the low level work.
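Here is a minimal sketch of how that dependency shows up in code. The native keyword and System.loadLibrary are standard Java; the addReverb method and the library name used in main are hypothetical, invented for this example. If the platform library hasn't been installed on the machine running the code, the load fails and the capability simply isn't there.

```java
// Illustrative only: addReverb and the library name are hypothetical.
class NativeLibraryCheck {
    // Declared in Java, implemented in C inside a platform-specific library.
    // It is never called here, so no library is needed to load this class.
    static native void addReverb(byte[] samples);

    // Reports whether the named native library can be loaded on this system.
    static boolean isAvailable(String libraryName) {
        try {
            System.loadLibrary(libraryName);
            return true;
        } catch (UnsatisfiedLinkError e) {
            // The library was never installed here, so the feature is missing.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("reverb library present: "
                + isAvailable("hypothetical_reverb_library"));
    }
}
```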

So Java is limited by the pervasiveness of support libraries. We need general 2D and 3D graphics, sound and video manipulation and other multimedia capabilities on every system with a Java-enabled browser. Then we won't be quite so limited. This is the plan for SGI's Cosmo 3D and Sun's Java Media, cross-platform libraries that will extend Java into 3D graphics, sound, video and animation.

Java is suitable for building large applications

For this point, we need to distinguish between Java the programming language (the description of syntax and semantics) and Java as it is implemented today. As a language, Java may be perfectly suitable for big projects. Its object orientation supports integration of large numbers of object classes. By eliminating explicit pointers, it allows programmers to write more maintainable code. So Java as a language is likely to be a better choice than C and probably better than C++ for large applications. Of course, we won't know until someone actually tries it! We are now seeing descriptions of a few large Java development projects, most of which seem sketchy or self-serving enough to make one want to wait for further documentation before accepting their claims.

But while the Java language may be appropriate for big programs, Java as it is implemented in web browsers is not. With a fully compiled language like C, all of the compiled code is combined into an executable program as part of a link process. References to symbols in one module are resolved to their definitions in another.

Java may also turn out to be unsuitable for big applications, rather than just applets. Part of the problem is likely in the way Java deals with memory; none of the Java environments handle large memory spaces at all well. (A speaker at the 1997 Java Internet Business Expo made an interesting comment on his attempts to benchmark Java: that in taking his C++ benchmarks to Java he had to reduce the data size by a factor of ten before any of the Java environments could run the programs to completion.)

But there's another potential problem that is inherent in the dynamic rapid prototyping style of development Java and its advocates encourage. Good prototypes tend to become very bad applications. As we learned (or at least should have learned) from our brush with Lisp and Expert Systems in the 80's, there's a world of difference between prototyping an application and producing a piece of production quality code. It's far more than a matter of fixing bugs and smoothing out the rough edges. The very process of designing as we code leads to applications that don't meet the requirements of stability, reliability, maintainability and extensibility we demand of professional software.

Java resolves all symbols when an applet is loaded into the browser. Each class mentioned in the applet class is loaded to the browser and all the symbolic references are resolved. Inheritance relationships among classes are also resolved at this time; where C++ decides the location of each class member at compile time, Java defers this decision until you attempt to load and run the class.

The upshot of all this is that the equivalent of program linking occurs when you run the code in a class. The larger the class, the larger the number of classes and the more complex the inheritance tree, the longer all this will take.
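The run-time nature of this linking can be seen from inside a program. In the sketch below (class and method names are mine), a class is named only by a string, so nothing about it is resolved until the code runs. Class.forName asks the run time to find, load and link the class at that moment, and a missing class is discovered only then.

```java
// Illustrative only: LoadOnDemand is a hypothetical class name.
class LoadOnDemand {
    // Loads the named class at run time and creates an instance using its
    // no-argument constructor.
    static Object create(String className) {
        try {
            Class<?> cls = Class.forName(className);
            return cls.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("could not load " + className, e);
        }
    }

    // Reports whether the named class can be found and loaded.
    static boolean canLoad(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(create("java.util.ArrayList").getClass().getName());
        System.out.println(canLoad("no.such.package.Missing"));
    }
}
```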

In addition to dynamic linking, Java performs one other important task before it can begin running a class: validating the code to prevent it from doing anything dangerous. This requires a scan of all of the loaded code to look for funny operations and attempts to break out of the restrictions placed on untrusted applets. Again, the more code you have the longer it will take to process the code before it can begin to run.

Another concern with using Java for large applications is its reliance on stop-and-copy garbage collection. Whenever the application begins running low on memory, everything stops while the GC determines what objects are available for reclamation. Objects still in use are copied to a new area of memory to allow a large contiguous area of free space. Once the GC finishes, the program is free to continue execution.

Right now garbage collection is quick, taking perhaps one or two tenths of a second. But imagine what happens when the size of the Java code and its storage requirements increase by a factor of ten or one hundred. Suddenly we will see our program stop for seconds or even minutes while the garbage collector goes about its work. To solve this problem (as Lisp and Smalltalk systems have had to do) will require a much more sophisticated approach to garbage collection, using a generational scheme or a reference counting model. Either technique will add complexity and overhead to the Java run time environment.
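For readers who haven't met stop-and-copy collection, here is a toy model of the idea. It is not how any Java run time is actually implemented; the Cell type, the list standing in for a memory space and all the names are invented for this sketch. Everything reachable from the roots is copied into a fresh space, references are redirected to the copies, and whatever was left behind is garbage. Note that the work done is proportional to the amount of live data, which is why pauses grow as programs do.

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: a toy model of a copying collector, not a real one.
class ToyCopyingHeap {
    static final class Cell {
        final String name;
        final List<Cell> refs = new ArrayList<>();
        Cell(String name) { this.name = name; }
    }

    private List<Cell> space = new ArrayList<>();  // the current space
    final List<Cell> roots = new ArrayList<>();    // references the program holds

    Cell allocate(String name) {
        Cell cell = new Cell(name);
        space.add(cell);
        return cell;
    }

    int size() { return space.size(); }

    // Copies every cell reachable from the roots into a new space and
    // returns the number of cells that were left behind.
    int collect() {
        List<Cell> toSpace = new ArrayList<>();
        Map<Cell, Cell> forwarded = new IdentityHashMap<>();
        for (int i = 0; i < roots.size(); i++) {
            roots.set(i, copy(roots.get(i), toSpace, forwarded));
        }
        // Scan the copies in order; copying a cell's referents appends
        // them to toSpace, so the scan reaches them in turn.
        for (int scan = 0; scan < toSpace.size(); scan++) {
            Cell cell = toSpace.get(scan);
            for (int i = 0; i < cell.refs.size(); i++) {
                cell.refs.set(i, copy(cell.refs.get(i), toSpace, forwarded));
            }
        }
        int reclaimed = space.size() - toSpace.size();
        space = toSpace;
        return reclaimed;
    }

    // Returns the new copy of a cell, making it on the first visit. The
    // copy's refs still point at old cells until the scan redirects them.
    private static Cell copy(Cell old, List<Cell> toSpace, Map<Cell, Cell> forwarded) {
        Cell moved = forwarded.get(old);
        if (moved == null) {
            moved = new Cell(old.name);
            moved.refs.addAll(old.refs);
            forwarded.put(old, moved);
            toSpace.add(moved);
        }
        return moved;
    }

    // Builds a and b referring to each other plus an unreachable c, then collects.
    static ToyCopyingHeap demo() {
        ToyCopyingHeap heap = new ToyCopyingHeap();
        Cell a = heap.allocate("a");
        Cell b = heap.allocate("b");
        heap.allocate("c");
        a.refs.add(b);
        b.refs.add(a);
        heap.roots.add(a);
        heap.collect();
        return heap;
    }
}
```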

Note that the first commercial Java applets don't use Java for everything. Applix's Java-based spreadsheet, for example, uses Java for the user interface. All the real processing, including loading and saving spreadsheets, is done in CGI code on the server. This is probably the best model for using Java in sophisticated applications. Once there are fully compiled Java implementations, of course, all the rules change.
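I have no knowledge of how Applix's product actually talks to its server, so treat the following purely as a sketch of the general division of labor: the Java side gathers what the user asked for and packages it as a form-encoded request, and a CGI program on the server does the real work. The field names action and sheet are invented for the example; URLEncoder is part of the standard library (the Charset overload used here needs Java 10 or later).

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Illustrative only: the class, method and field names are hypothetical.
class CgiRequestSketch {
    // Builds the form-encoded body a Java front end might send to a CGI
    // program that performs the requested action on the named spreadsheet.
    static String formBody(String action, String sheetName) {
        return "action=" + URLEncoder.encode(action, StandardCharsets.UTF_8)
                + "&sheet=" + URLEncoder.encode(sheetName, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(formBody("save", "Q1 budget & plan"));
    }
}
```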

The commentary on Java continues on page two.


Comments to: Hank Shiffman, Mountain View, California

Copyright © 1996, 1997, 1998 Harris Shiffman