Thursday, January 04, 2007

New JRuby Compiler: Progress Updates

I've been cranking away on the new compiler. I'm a bit tired and planning to get some sleep, but I've gotten the following working:

  • all three kinds of calls
  • local variables
  • string, fixnum, array literals
  • 'def' for simple methods and arg lists
  • closures
Now, that last item comes with a big caveat: I have no way to pass the closures I create. The closure body does compile, basically into a Closure class that you initialize with local variables and then invoke. But since block management in JRuby is still heavily dependent on ThreadContext nonsense, there's no easy way to hand that object to the method being called. So the next step toward making closures work in the compiler is to start passing them along the call path as a parameter, much like we already do for ThreadContext.
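
Just to illustrate the shape of it, here's a minimal sketch of that idea. None of these class or method names are JRuby's actual ones; they're stand-ins showing a closure compiled to a class that captures locals and then gets passed explicitly alongside the receiver:

    // Illustrative only -- not the real JRuby classes.
    // A closure compiles to a small class that captures the defining
    // method's local variables and exposes a single call() entry point.
    class CompiledClosure {
        private final Object[] locals;   // locals shared with the defining scope

        CompiledClosure(Object[] locals) {
            this.locals = locals;
        }

        // The compiled block body reads and writes the captured locals directly.
        Object call(Object arg) {
            locals[1] = arg;             // e.g. assigning the block parameter
            return locals[1];
        }
    }

    class CallPathSketch {
        // The missing piece: the closure gets passed explicitly along the call
        // path, the same way ThreadContext is threaded through today.
        static Object each(Object self, CompiledClosure block) {
            Object result = null;
            for (int i = 0; i < 3; i++) {
                result = block.call(i);  // "yield" to the block
            }
            return result;
        }
    }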

I've managed to keep the compiler fairly well isolated between node walking and bytecode generation, though the bytecode generator implementation I have currently is getting a little large and cumbersome. It's commented to death, but it's pushing 900 LOC and needs some heavy refactoring. However, it sits behind a fairly straightforward interface, so the node-walking code never sees the ugliness. I believe that separation will make the compiler much easier to maintain, and it's certainly easier to follow.
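
For the curious, the split looks roughly like the sketch below. The interface and method names are invented for this example; the point is just that the AST walker talks to a narrow emitter interface and never touches ASM directly:

    // Invented names, for illustration only.
    // The node walker sees this interface and nothing else...
    interface BytecodeEmitter {
        void loadLocal(int index);
        void pushFixnum(long value);
        void invokeDynamic(String name, int argCount);
    }

    // ...so the AST traversal stays free of bytecode details, and the ugly
    // ASM-driven implementation can be refactored behind the interface.
    class NodeWalkerSketch {
        private final BytecodeEmitter emitter;

        NodeWalkerSketch(BytecodeEmitter emitter) {
            this.emitter = emitter;
        }

        // Compiling something like: foo(x, 1)
        void compileCall(String name, int localIndexOfX) {
            emitter.loadLocal(localIndexOfX);   // push the local variable argument
            emitter.pushFixnum(1);              // push the literal
            emitter.invokeDynamic(name, 2);     // emit the dynamic call
        }
    }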

In general, things are moving along well. I'm skipping edge cases for some nodes at the moment in order to get bulk code compiling. As this fills out and handles more code, it could start to be wired in as a JIT: since it can fail gracefully when it can't compile an AST, we'd just drop back to interpreted mode in those cases.
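
Roughly, that JIT wiring could look like the following sketch. It's hypothetical (none of these types are JRuby's), but it shows the compile-or-fall-back idea:

    // Hypothetical sketch of the fallback idea, not JRuby's actual API.
    class JitSketch {
        interface Node {}                                   // stand-in for an AST node
        interface CompiledBody { Object call(Object self, Object[] args); }

        static Object execute(Node ast, Object self, Object[] args) {
            CompiledBody body = tryCompile(ast);
            return body != null
                    ? body.call(self, args)                 // run generated bytecode
                    : interpret(ast, self, args);           // graceful fallback
        }

        static CompiledBody tryCompile(Node ast) {
            try {
                return compile(ast);                        // fails on unsupported nodes
            } catch (UnsupportedOperationException notCompilable) {
                return null;
            }
        }

        // Placeholders so the sketch is self-contained.
        static CompiledBody compile(Node ast) { throw new UnsupportedOperationException(); }
        static Object interpret(Node ast, Object self, Object[] args) { return null; }
    }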

So that's it.

...

Ok, ok, here are the performance numbers. Twist my arm, why don't you.

(best times only)

The new method dispatch benchmark tests 100M calls to a simple no-arg method that returns 'self', in this case Fixnum#to_i. The first part of the test is a control run that just does 100M local variable lookups.

method dispatch, control (var access only):
  • interpreted, client VM: 1.433
  • interpreted, server VM: 1.429
  • ruby 1.8.5: 0.552
  • compiled, client VM: 0.093
  • compiled, server VM: 0.056

Much better. The compiler handles local variable lookups using a plain array, rather than going through ThreadContext to get a DynamicScope object (there's a rough sketch of that idea after the next set of numbers). It's much faster, and HotSpot optimizes it heavily. At worst it takes about 0.223s, so it's faster than Ruby even before HotSpot gets ahold of it. The second part of the test adds in the actual method calls.

method dispatch, test (with method calls):
  • interpreted, client VM: 5.109
  • interpreted, server VM: 3.876
  • ruby 1.8.5: 1.294
  • compiled, client VM: 3.167
  • compiled, server VM: 1.932

Better than interpreted, but slow method lookup and dispatch are still getting in the way. Once we settle on a single fast way to do dynamic dispatch, I think this number will improve a lot.
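
As an aside, the array-based local variable handling from the control run above boils down to something like this; the names here are illustrative only, not the real JRuby types:

    // Illustrative comparison, not the actual JRuby classes.
    class LocalsSketch {
        interface ThreadContextish { Scopeish getCurrentScope(); }
        interface Scopeish { Object getValue(int index); }

        // Interpreted style: fetch the scope object from the context,
        // then ask it for the variable.
        static Object slowRead(ThreadContextish context, int index) {
            return context.getCurrentScope().getValue(index);
        }

        // Compiled style: the method keeps its locals in a plain Java array,
        // so a variable read is a single array load that HotSpot optimizes well.
        static Object fastRead(Object[] locals, int index) {
            return locals[index];
        }
    }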

So then, on to the good old fib tests.

recursive fib:
  • interpreted, client VM: 6.902
  • interpreted, server VM: 5.426
  • ruby 1.8.5: 1.696
  • compiled, client VM: 3.721
  • compiled, server VM: 2.463

Looking a lot better, and showing more improvement over interpreted mode than the previous version of the compiler. It's not as fast as Ruby yet: the client VM is a bit over 2x slower and the server VM is in the 1.5x range. Our heavyweight Fixnum and method dispatch issues are to blame for the remaining performance gap.

iterative fib:
  • interpreted, client VM: 17.865
  • interpreted, server VM: 13.284
  • ruby 1.8.5: 17.317
  • compiled, client VM: 17.549
  • compiled, server VM: 12.215

Finally the compiler shows some improvement over the interpreted version for this benchmark! Of course this one's been faster than Ruby in server mode for quite a while, and it's more a test of Java's BigInteger support than anything else, but it's a fun one to try.

All the benchmarks are available in test/bench/compiler, and you can just run them directly. If you like, you can open them up and see how to use the compiler yourself; it's pretty easy. I will be continuing to work on this after I get some sleep, but any feedback is welcome.

10 comments:

Jim Baker said...

This is good news. I have been reviewing the Jython compiler in the dev trunk as part of preparing for our forthcoming sprint on Jython in Boulder. (You are credited for starting this by Eric Dobbs, by the way.)

So it will be useful to compare notes on bytecode generation. FWIW, the roughly equivalent CodeCompiler.java is 2476 LOC.

Anonymous said...

Do you think it is possible to use Java's homogeneous arrays as an Array backend in the generated code in some cases?

halukag said...

This is indeed good news. IMHO, without a decent bytecode compiler, a JRuby interpreter written in Java is not a good enough solution, since it will always be playing catch-up on many fronts with the original Ruby interpreter written in C. To be successful, the integration of JRuby and Java should be really seamless. We should be able to drop any valid Ruby source file into any Java package and get it compiled by the Java compiler without any extra effort or housekeeping. Finally, Java and Ruby objects should be able to call each other without any syntactic overhead. I know this is not easy at all, but it shouldn't be impossible either, since small extensions to both the JVM and the Java compiler could provide this level of abstraction, which is the most important quality of any properly engineered system. Since Groovy and Scala were successful in compiling to Java bytecode without any help from Sun, you should be able to do even better :-)). Congratulations on the progress you've achieved with the JRuby project so far, and good luck for the future.

Anonymous said...

I think Groovy and Scala are contaminated by the Java type system too much. This makes Java integration much harder in JRuby, but it's worth it.

halukag said...

It is definitely worth it. This is exactly what the Ruby people are trying to do with Ruby 2.0: compiling it to bytecode and running it on a virtual machine which will most probably be called the Ruby Virtual Machine (RVM). But why reinvent the wheel? Sun and IBM have rock-solid, mature JVMs that are proven and accepted by the industry. Why not come together with the Ruby people and deliver a rock-solid Ruby 2 on an enhanced JVM much quicker than the Ruby people could possibly manage alone? If they also delivered good Ruby plug-in support (as good as the Java plug-ins) for Eclipse and NetBeans, Microsoft wouldn't know what hit them. This way Java would extend its shelf life by another 10 years, which would also be a real service to the computing industry. But they wouldn't do that, would they? Instead they will produce the best garbage (EJB, SOAP, WSDL, SOA, etc.) that they can possibly come up with and turn Java into a camel designed by a committee via hopeless add-ons: annotations, generics, closures...

Anonymous said...

I think your "hopeless add-ons" including annotations, generics and closures are nice :) Especially closures; I like them so much in Ruby!!

Instead they will produce the best garbage..

Well, Sun is working on JRuby now.
Do you imply that JRuby is "the best garbage"?

No language lives forever. Long live Assembly ~~

Anonymous said...

JRuby is a good thing, thank you all.

Anonymous said...

Recently I upgraded my JRuby installation from 0.9.2 to trunk.
I was quite surprised that I had to add ASM to the classpath to get it to work. :)
First of all, I tried my own quick microbenchmark: 1000 complex activerecord-jdbc operations. Every iteration retrieves the last 10 records from an embedded H2 database, sorts them in reverse order, remaps every ActiveRecord object to a set of Java objects, and packs them into a LinkedList.

Running under JRuby 0.9.2 takes about 30 seconds; the current trunk JRuby takes about 23 seconds. Great improvement!

Charles Oliver Nutter said...

jim: Thanks for the comment, and I agree we need to work together. I'm using ASM for my compiler, and you guys are considering using ASM...so there's a lot to be shared there.

lopex: Very possible, if we can determine the scope of the array and guarantee that Array behavior hasn't been overridden in incompatible ways. We have done experiments with this and gotten absurd performance gains, though our version was not "safe".

haluk: We had targeted compatibility primarily to get to a point where we were confident that Ruby apps would run as well as they do on Ruby. After that, we focused on getting interpreted-mode execution as fast as possible, to help us clean up the core runtime and sort out what Ruby is actually doing. Now it's time for compilation. It's a natural progression, really. And we agree that the barrier between Ruby and Java should be as slim as possible; ideally as slim as in Groovy.

mm: I have no idea what you just said :)

ratislav: and that's without any compilation and without any of the new dynamic dispatch optimizations. We have not yet begun to fight!

Anonymous said...

Did you consider implementing a PIC (polymorphic inline cache; see http://smallthought.com/avi/?p=16, "20 year old techniques")?

Regards,
Markus (performance blog)