Wednesday, June 16, 2010

My Short List of Key Missing JVM Features

I mused today on Twitter that there are just a few small things the JVM/JDK needs to become a truly awesome platform for all sorts of development. Since so many people asked for more details, I'm posting a quick list here. There are obviously other things, but these are the ones on my mind today.

Cold Performance

Current JVMs start up pretty fast, and there are changes coming in Hotspot in Java 7 that will make them even better. Usually this comes from combinations of pre-verifying bytecode (or providing verification hints), sharing class data across processes, and run-of-the-mill tweaks to make the loading and linking processes more efficient. But for many apps, this doesn't do anything to solve the biggest startup hit of all: cold execution performance. Because the JVM doesn't save off the jitted products of each run, it must start "cold" every time, running everything in the bytecode interpreter until it gets hot enough to compile. Even on JVMs that don't have an interpreter, the initial cost of compiling everything to not-particularly-optimized assembly also causes a major startup hit (try running command-line stuff on JRockit or J9).

There are a few things people have suggested, and they're all hard:
  • Tiered compilation - compile earlier using the fastest-possible, least-optimizing compiler, but decorate the compiled code with appropriate profiling logic to do a better job later. Hotspot in Java 7 may ship a tiered compiler, but there have been some resource setbacks that delayed its development.
  • Save off compilation or optimization artifacts - this is theoretically possible, but the deeper you go the harder it is to save it. Usually the in-memory results of optimization and compilation depend on the layout of memory. Saving them to disk means you need to scrub out anything that might be different in a new process like memory addresses and class identities. But .NET can do this, though it largely *just* does static compilation. Happy medium?
  • Keep a JVM process running and toss it new work. We do this in JRuby with the Nailgun library, but it has some problems. First off, it can leave various aspects of the JVM in a dirty state, like system properties and memory footprint. Second, it can't kill off rogue threads that don't terminate, so they can accumulate over time. And the server process isn't actually running at the console, so a lot of console things you'd do normally don't work.
This is probably the biggest unsolvable problem for JRuby right now, and the one we most often have to apologize for. JRuby is, at times, very fast...and getting faster every day. But not during the first 5 seconds, and so everyone gets the same bad impression.
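For what it's worth, the workarounds available today are mostly startup flags. A hedged sketch of the usual stopgaps (flag support varies by JVM vendor and version, and `myapp.jar` is just a placeholder):

```shell
# Typical cold-start mitigation flags on Hotspot (none of them fix
# cold *execution* performance; they only trim load/link/verify time):
java -client \          # prefer the lighter, faster-starting JIT
     -Xverify:none \    # skip full bytecode verification -- trusted code only
     -Xshare:auto \     # reuse the class-data-sharing archive when available
     -jar myapp.jar
```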

Better Console/Terminal Support

There are endless blogs out there complaining about how the standard IO streams you get from the JVM are crippled in various ways. You can't select on them, for example, which is the source of a few unfixable bugs in JRuby. You can't pass them along to subprocesses, which is perhaps more a failing of the process-launching APIs in the JDK than standard IO itself. There's no direct terminal support in any of Java's APIs, so people end up yanking in libraries like jline just to support line editing. If the JDK shipped with some nice terminal and process APIs, a lot of the hassles developers have writing command-line tools in Java would melt away.

There's some light at the end of the tunnel. NIO2, scheduled to be part of Java 7, will bring better process launching APIs (with inherited standard IO streams, if you desire), a broader range of selectable channels, and much more. Hopefully it will be enough.
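As a sketch of what NIO2-era process launching buys us: Java 7's ProcessBuilder gains an inheritIO() mode that hands the parent's stdin/stdout/stderr straight to the child, something the old Runtime.exec() never allowed. (This is an illustrative toy, not JRuby code, and it assumes a Unix-like `echo` on the PATH.)

```java
import java.io.IOException;

public class InheritDemo {
    public static void main(String[] args) throws IOException, InterruptedException {
        // The child writes directly to the parent's console; no stream-pumping
        // threads, no buffering mismatches, no lost output on exit.
        Process p = new ProcessBuilder("echo", "hello from the child")
                .inheritIO()
                .start();
        int exit = p.waitFor();
        System.out.println("child exited with " + exit);
    }
}
```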

Fix the Busted APIs

JDBC is broken. Why? Because you have to register your driver in a global hard-referencing hash, and you have to unregister it from the same classloader or it will leak. That means that if you're loading JDBC drivers from within a webapp or EE application, *your entire application remains in memory* because the driver references it and that map references the driver. This is the primary reason why most Java web application servers leak memory on undeploy, and it's another "unfixable" from JRuby's perspective.
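To make the leak concrete, here's roughly the cleanup dance a webapp has to perform on undeploy. This is only a sketch (class and method names here are illustrative, not a standard API), but the shape of the problem is real: DriverManager's static registry outlives the webapp.

```java
import java.sql.Driver;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.util.Enumeration;

public class DriverCleanup {
    // Deregister every driver loaded by this webapp's classloader.
    // Skipping this step leaves DriverManager's global map holding a hard
    // reference to the driver, which references the webapp classloader,
    // which pins the entire application in memory after undeploy.
    public static void deregisterAll(ClassLoader webappLoader) throws SQLException {
        Enumeration<Driver> drivers = DriverManager.getDrivers();
        while (drivers.hasMoreElements()) {
            Driver d = drivers.nextElement();
            if (d.getClass().getClassLoader() == webappLoader) {
                DriverManager.deregisterDriver(d);
            }
        }
    }

    public static void main(String[] args) throws SQLException {
        deregisterAll(DriverCleanup.class.getClassLoader());
        System.out.println("drivers deregistered for this loader");
    }
}
```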

Object serialization is broken. Why? Because it plays all sorts of tricks to get your classloader, reflectively access fields (if you're going to reflectively access them anyway, why not just break encapsulation if security allows it), and construct object instances without allowing you the opportunity to initialize them appropriately yourself. You have to provide no-arg constructors, have to un-final fields so they can be set up outside of construction, and heaven forbid you use default serialization: it's dead slow.
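A minimal sketch of what the faster path looks like: Externalizable skips the slow reflective field walk, but notice everything it forces on you in exchange (a public no-arg constructor, non-final fields, every field written and read by hand). The Point class here is purely illustrative.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;

public class Point implements Externalizable {
    private int x;  // can't be final: it must be settable after construction
    private int y;

    public Point() {}                       // required: public no-arg constructor
    public Point(int x, int y) { this.x = x; this.y = y; }

    public void writeExternal(ObjectOutput out) throws IOException {
        out.writeInt(x);                    // every field serialized by hand
        out.writeInt(y);
    }

    public void readExternal(ObjectInput in) throws IOException {
        x = in.readInt();
        y = in.readInt();
    }

    public static void main(String[] args) throws Exception {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(new Point(3, 4));
        }
        try (ObjectInputStream ois = new ObjectInputStream(
                new ByteArrayInputStream(bos.toByteArray()))) {
            Point p = (Point) ois.readObject();
            System.out.println("roundtrip: " + p.x + "," + p.y);
        }
    }
}
```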

Reflection is too slow and there's no way around it. Not only do you end up calling through many extra levels of logic for reflective invocation, you have to box your argument lists, box your numerics, and wrap everything in exception-handling. And it doesn't have to be this way. The invokedynamic work brings along with it method handles, which are fast, direct pointers to methods. This should have been added long ago, but thankfully it's on the way in Java 7. Until then, projects like JRuby will have to continue eating the cost of reflection...or generate method handles by hand. We do both.
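A toy illustration of the difference (the add method is just a stand-in for any invocation target): the reflective call allocates an Object[] and boxes both ints on every invocation, while the Java 7 method handle call is a direct, typed invocation with neither cost.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.reflect.Method;

public class ReflectCost {
    public static int add(int a, int b) { return a + b; }

    public static void main(String[] args) throws Throwable {
        // Reflective path: Object[] varargs, two Integer boxes, boxed return.
        Method m = ReflectCost.class.getMethod("add", int.class, int.class);
        Object boxed = m.invoke(null, 1, 2);
        System.out.println("reflective: " + boxed);

        // Method handle path (Java 7): a direct pointer to the method,
        // invoked with primitive arguments and a primitive return.
        MethodHandle mh = MethodHandles.lookup().findStatic(
                ReflectCost.class, "add",
                MethodType.methodType(int.class, int.class, int.class));
        int direct = (int) mh.invokeExact(1, 2);
        System.out.println("method handle: " + direct);
    }
}
```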

Regular expressions are broken. Why? Because simple alternations can blow the Java stack when fed especially large input. The current Sun-created regex implementation recurses for things like alternation, making it easy for it to fail to match on large input. The problem is so bad that we've actually switched regular expression engines in JRuby *four times*, including two implementations we wrote ourselves. Nobody can say we haven't bled for our users.
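A hedged demonstration of the failure mode: whether a given input actually overflows depends on the JDK version and the thread's stack size, so this sketch reports either outcome rather than assuming one. The point is that plain, non-pathological input to a trivial alternation can exhaust the Java stack purely because the matcher recurses per character.

```java
import java.util.regex.Pattern;

public class RegexStack {
    // Match a simple alternation against n characters of well-formed input.
    static String tryMatch(int n) {
        StringBuilder sb = new StringBuilder(n);
        for (int i = 0; i < n; i++) sb.append(i % 2 == 0 ? 'a' : 'b');
        try {
            return Pattern.compile("(a|b)+").matcher(sb).matches()
                    ? "matched" : "no-match";
        } catch (StackOverflowError e) {
            // The recursive matcher ran out of Java stack on plain input.
            return "stack-overflow";
        }
    }

    public static void main(String[] args) {
        System.out.println(tryMatch(1_000_000));
    }
}
```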

And there's numerous other examples. Some are relics of Java 1.0 that never got corrected (because old APIs don't die, they just get deprecated...or ignored). Some are relics of the idea that gigantic monolithic servers hosting dozens of apps (and leaking memory when they undeploy, or else contending for basic resources that separate processes would not) are a good idea, when in actuality running multiple JVMs that each only host one or a few apps works far better. Making a real effort to smooth these bad APIs would go a long way.

Better Support for Native Libraries and POSIX Features

As of Java 6, there's still no support for working with symlinks and only limited support for setting file permissions. Process launching is absolutely terrible. You can't select on all channels...only on sockets. If you want to use a native library, you have to write JNI code to do it, even though there are libraries like JNA and JFFI in the wild that do an outstanding job of dynamically loading and binding those libraries.

Missing the POSIXy features is basically inexcusable today. Most of the system-level APIs in the JDK are still based on a lowest common denominator somewhere near Windows 95, even though all modern operating systems provide at least a substantial subset of those APIs. NIO2 will bring many improvements, but it's almost certain that some parts of POSIX won't be exposed, either because there's not enough resources to spec out the Java APIs for them or because they end up being too system-specific.
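As one example of what NIO2 brings, symlinks finally get first-class treatment in java.nio.file.Files. This sketch (using a throwaway temp directory) shows the Java 7 API; there is simply no Java 6 equivalent, which is exactly the gap described above. It assumes a filesystem and OS that permit symlink creation.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class SymlinkDemo {
    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("nio2-demo");
        Path target = Files.createFile(dir.resolve("target.txt"));
        Path link = dir.resolve("link.txt");

        // Throws UnsupportedOperationException or IOException on platforms
        // that don't support symlinks (or lack the privilege to create them).
        Files.createSymbolicLink(link, target);

        System.out.println("is symlink: " + Files.isSymbolicLink(link));
        System.out.println("points to: " + Files.readSymbolicLink(link).getFileName());
    }
}
```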

As for loading native libraries...this is again something that should have been rolled into the JDK a long time ago. Many people will cry foul..."pure Java!" they'll shout...and I agree. But there are times when some functionality simply doesn't exist in a Java library, or doesn't scale well as an out-of-process call. For these times, you just have to use the native library...and the barrier to entry for doing that on the Java platform is just too high. Roll JNA or JFFI or something similar into the JDK, so we grown-ups can choose when we want to use native code.

The Punchline

The punchline is that for most of these things, we've solved or worked around them at *great expense*. I'd go so far as to say we've done more to work around the JVM's and the JDK's shortcomings than any other project. We've gone out of our way to improve startup by any means possible. We ship beautiful, solidly-performing libraries for binding native libs without writing a line of C code. We generate a large amount of code at compile and runtime to avoid using reflection as much as possible. We maintain and ship native POSIX layers for a dozen platforms. We've (i.e. one of our champions, Marcin Mielzynski) implemented our own regular expression engine (i.e. a port of Oniguruma). We've pulled every trick in the book to get process launching to work nicely and behave like Ruby users expect. And so on and so forth.

But it's not sustainable. We can't continue to patch around the JVM and JDK forever, even if we've done a great job so far. Hopefully this will serve as a wake-up call for JVM and JDK implementers around the world: If you don't want the Java platform to be a server-only, large-app-only, long-running-only, headless-only platform, it's time to fix these things. I'm standing by to help coordinate those efforts :)


Unknown said...

"Missing the POSIXy features is basically inexcusable today."

Agreed. However, analogues are acceptable: properties files instead of environment variables, et cetera.

Either way, it's a VM world, and I'd prefer to continue to work with the JVM over other options.

Greg said...

Hey Charles, as you said, you guys have done at least as much work as any team on trying to squeeze the most out of the JVM. It seems like maybe it'd be good for you guys to document some of these workarounds more comprehensively since other JVM language implementors are likely to fight the same battles. Have you guys thought about starting some type of documentation project where those who are pushing the limits of the JVM can come and contribute code samples and documentation about how they got around some of these flaws, in detail? I'm sure it would really help people, I could even see compiler construction kits coming out of such an effort a la LLVM.

Toby said...

Great post Charles!

One thing that comes to mind with your list: A good number of these features probably couldn't be implemented without changing the JVM itself (e.g. POSIX integration). I'm sure a number of developers are hesitant about making changes at such a low level, as it would invariably lead to runtime fragmentation and jeopardize one of the biggest advantages of the JVM - the ability of a wide range of libraries to run on compatible JVMs regardless of the underlying system.

What are your thoughts?

Phil said...

I was surprised not to see Fixnums here. That would be a huge perf win across the JVM.

One wishlist "unfixable" item would be getting rid of UTF-16. What a short-sighted decision we're still paying off.

Anonymous said...

"JDBC is broken"

Unfortunately I have to agree with you. In my defense, I fought long and hard to remove the DriverManager from JDBC 3 / JDK 1.4. Needless to say, arguments for backward compatibility won out... as they probably should.

If there had been an obvious means to change the implementation without changing the behaviour it'd be there.

Anonymous said...

I would also agree that JVM shouldn't be limited to server side headless applications. One irreversible limitation is the use of 32 bit signed ints for array indexing.

Alex said...

Charles, you are a warrior. Thank you and your fellow contributors for JRuby, it's awesome. Having lost hard on Sun stock after buying into the latent potential and vision in 2006-2008, I feel entitled to say this: if everyone at Sun from top to bottom approached work like you the outcome would have been very, very different.

Daniel Larsson said...

What do you use to generate code? Profiling code that uses, for example, the Javassist API is very slow, comparable to reflection.

AlBlue said...

You have a point about JDBC - but it's not the only one. For example, URL has the same problems with its schema providers.

OSGi has had these issues for a while; but it has a URL handler which registers a sacrificial proxy class that has a mutable field. So even though the proxy never gets unregistered, it can be disconnected from the real class and so eject the classes.

Support for JDBC has been added in the enterprise OSGi specs recently.

Finally, LLVM is showing promise, but it lacks compiled binaries for many platforms, which means the adoption rate outside OS X is limited.

Incidentally there is a project working on a generic VM infrastructure for JVM and CLR - but the dynamic JIT (or lack thereof) means that it is not performant in itself at the moment.

Artur Biesiadowski said...

Java serialization DOES NOT require defining no-arg constructors and it DOES allow final fields.

It is perfectly possible that you have chosen some third party serialization framework which imposes those restrictions, but don't blame JVM for that.

Hongli said...

Your team has done some amazing work on JRuby so far.

Ismael Juma said...


Good list. There are some MLVM items that would also make a big difference (value types, tail calls, fixnums, etc.).

"But .NET can do this, though it largely *just* does static compilation. Happy medium?"

The IBM JRE also has some AOT support:


gambistics said...

If you could fork the complete JVM like with Dalvik that would be an interesting solution for improving startup speed.

A quick look shows that forking is probably not too easy to achieve:
1.) you basically can only fork the running thread, but the Hotspot VM runs >10 threads after start-up of a simple program (finalizer, GC, etc.), so you would have to coordinate those threads to be in a valid state when forking so that you can restart them on the child process. That's probably no simple change
2.) no forking on Windows, so it would not be easily portable

Mark Essel said...

This sounds like Ruby on the JVM is a rough fit, always an uphill battle.

Any alternatives to VMs or interpreters for fast Ruby? Is V8 an option? The kinship of JavaScript & Ruby feels more connected. I haven't seen much new work for RubyJS.

Best of luck regardless. I can respect you and your team's frustration.

Unknown said...

Another missing feature is annotations like @Inline to transfer knowledge from dynamic language runtime to JIT


Sakuraba said...

Amen to that.

There was a time when I considered 'being on the JVM' to be a good thing. This generally is still true, but I am not sure whether a new language should really be implemented on this old, crufted legacy baggage that the JDK has become over the years.

Why would I run e.g. Django on Jython or Rails on JRuby if it was not for some special Java integration feature request? I would never accept the slowdown that is perceived every time I run a Jython script compared to CPython unless I had a very good reason to do so.

Rodrigo Lopes said...

I really want to see OpenMP support in the JVM and also in the Java language. OpenMP is a far better way to do concurrent optimization. Unfortunately, you can use it only with C++ or Fortran.

Charles Oliver Nutter said...

Greg: Sounds like a good idea, along with some nice Java (and other language) examples of using what we've created. I also would like to see a compiler construction kit similar to LLVM or .NET's DLR, rather than having so many one-off, stitched-together solutions on a per-language basis.

Toby: Great things don't come easily! A lot of these things do require changes at the JVM level, and some even at the JVM specification level. That's a hard pill to swallow, especially considering backward-compatibility. But a platform that stagnates is a platform on its way to the grave. We must not be afraid.

Phil: More generally, value types would be a great addition to the JVM. But this also falls into the column of "really hard to add without breaking backward compatibility". In the short term, we'll have to rely on dynamic optimization at both the VM and language levels (see my dynopt work in JRuby) to escape the boxed numeric prison. FWIW, I would love to have value types, but I think it's possible to dynamically optimize JRuby to not need them in local scopes.

itllallendintears: I'd love to have a conversation about that :) The JDBC thing causes us no end of grief.

Daniel Larsson: I largely generate JVM bytecode directly (i.e. by hand) using ASM as the toolkit. I don't have much experience with libraries that generate code for you, like Javassist.

Artur: I should have been more clear; serialization requires those things to perform reasonably. And in fact, you basically need to go all the way to Externalizable to get even remotely usable speed out of serialization, since otherwise there's a tremendous amount of reflective access happening.

Ismael Juma: Thanks for that link, I did not know about J9 supporting AOT compilation. It doesn't appear to allow that AOT-compiled code to get re-optimized once it runs, that's too bad.

Gambistics: I *almost* included fork on this list, but it's almost totally infeasible with mainstream JVMs. I can see the benefits for mobile devices, where you want fast-as-possible spinup and as much shared memory as possible, but the challenges of getting signal handling, file descriptors, user threads, and VM threads all coordinated through the fork of a mainstream JVM...that's more than I can reasonably ask for :)

Mark Essel: The JVM is actually a wonderful host for Ruby the language. It's Ruby's libraries that demand more than the JVM can provide, like POSIX support, native libraries (and a Ruby-compatible C API), and so on. We've only just begun to optimize Ruby on the JVM, and already we're doing extremely well on performance. I would bank on a Ruby impl atop JVM long before I'd bank on one atop V8 or any other language-specific VMs.

Rémi: Yes, I know you and I would both like that :) Along the same lines: call site and method-body specialization per trace path, so that using closures doesn't introduce un-inlinable megamorphic sites.

Sakuraba: That's a large part of my frustration. Applications *do* run faster on JRuby than on the standard Ruby implementations, but the startup slowdown usually means people don't even give JRuby a fair chance. Why would you want to use JRuby or Jython over their native equivalents? Because they either are better now or have the potential to be better very soon. Fast dynamic optimizations. Native threads. Vastly superior memory management. Even without Java libraries, the JVM is a great target for languages. If only they could fix these few problems...

Lukas Bradley said...

The trend of calling something "broken" because it doesn't act exactly the way you want needs to stop. Your criticisms are wonderful, but JDBC and serialization obviously work.

Charles Oliver Nutter said...

Lukas Bradley: I consider something broken if it doesn't work as advertised in the environments where it's recommended. JDBC at least falls into that category. As for serialization...sure, I suppose it does function as defined. But I also consider something to be broken if it's so unusably slow that it's useless to me. Default Java object serialization falls into that category.

If more projects treated such situations as being "broken", perhaps we wouldn't be suffering so long.

Mark Essel said...

Thanks for the response Charles. The language verse the library split helps someone newer to these types of problems get a feel for what the issues are.

My mistake is blending the language and the rich library into one entity. The question I'm left with: what is Ruby without its powerful default lib?

Who actively develops and works the issues you describe with the JVM, are these open source groups?

J said...

I've said this a million times in the #clojure channel: JLine is terrible. Don't use it. Forget it exists.

***Use rlwrap instead***

rlwrap is amazing, and it gives you a clojure command prompt (or any command, really) with *FULL* GNU readline support.

Osvaldo Doederlein said...

On JDBC: This is just one of the reasons why you must never deploy a database driver inside application modules. JDBC drivers must be installed in your appserver's /lib, and all DataSources should be provided by the appserver. It's basic J2EE admin good practice to impose a policy like, "any developer who puts a JDBC driver jar inside his EAR/WAR should be shot in the head".

Osvaldo Doederlein said...

On Serialization: Granted it's not wicked-fast, but it's not that sluggish either; and no, it doesn't suffer from excessive overhead of [public] reflection APIs. At least in modern Sun/Oracle JDKs, serialization will use the Unsafe operations to read and write fields directly, without using APIs like java.lang.reflect.Field. In fact, the standard serialization is so fast that providing naive read/writeObject() methods (that basically do the same operations the default serializer would do) will make you lose performance, not gain it. You only start winning performance when you do higher-level opts like avoiding internalization of objects that you know won't be shared in the serialized graph, or creating optimized wire representations of some objects, etc. Such optimizations are by definition app-specific and impossible to do in the runtime API.

I guess Java serialization's design is old enough that it should carry the weight of some old screwups and compatibility issues, so a new fresh attempt could deliver better performance. But that would gain us some incremental performance, nothing earth-shattering. You will always need custom serialization code if you want the very best performance, and this is true in any language that I know. (But anybody correct me if I'm wrong: which language or VM has a standard serialization engine that can't be beat by custom app code?)

Charles Oliver Nutter said...

John Cromartie: That would work fine if it were just the UI aspect we need, but Ruby provides a Readline library as part of its core API set, and most tools that do console-like applications use this API directly. We can't do that with a simple wrapper.

Mark Essel: I should be even more specific: *certain* Ruby libraries hit these problems with the JVM. The majority of them are not particularly unusual or difficult to implement, like strings, arrays, hashes. It's only the built-in libraries that expose OS-level features the JVM doesn't support that get tricky.

Anonymous said...

The Java platform has been around for *15 years* and it still has problems like these. Has moving to an open source model sped up development?
Seriously, I'm not a Java developer, just fascinated by how slowly some software projects move. (Now, where's that GIMP mailing list, I hear they're still happy editing photos in 8-bit color...)

Charles Oliver Nutter said...

Anonymous: I think it has. In OSS, several things have happened to Hotspot that would not have happened before:

* Wider platform support, including native builds for the BSDs and OS X
* Wilder feature experiments like coroutines and tail calls that have been developed entirely outside of Sun/Oracle

Some of the issues I listed are sort of "stuck" in the Java SE standard, like the mechanisms by which serialization and JDBC driver discovery happen. Those are hard to change. The cold performance issues would be easier to tackle (if you like wrangling VM internals), but they're certainly less sexy. But in theory, anyone could attempt to fix any of the above issues; they just would take a while to get into standard Java.

Anonymous said...

You are simply tilting at windmills.


Your foolish and shortsighted decision to leave Sun put you outside. You lost close ties to the JVM team and now are just another person trying to force change by whining publicly.

JRuby has been moving towards being only JRails, except JRuby sucks with Rails 3.

It is a shame really. So much potential wasted.