Wednesday, November 01, 2006

Compiler Progress: MultiStub and Full-Script Compilation

I spent today hacking on the Ruby to Java compiler and made some good progress. Here are the highlights:

  • It now parses a full script rather than just single method bodies.
  • The toplevel of the script is given its own method, and defined methods get theirs.
  • It uses MultiStub to implement the methods, so it will be faster than reflection.
  • It's more aware of incoming arguments, rather than assuming a single argument as in the previous revision.
  • It generates slightly faster code, perhaps a 5-10% improvement.
MultiStub is our way of implementing many relocatable methods without generating a class-per-method or using reflection. We're using it today in our Enumerable implementation, and it works very well. Some numbers comparing our two reflection-based method binding techniques with MultiStub:

Invocation of a noop "test" method:
t = Time.now; 10000000.times { test }; puts Time.now - t
# test is alternately implemented using each of the three techniques
Control run (with no call to test in the block):
4.33s
ReflectionCallback-based (like Kernel methods today):
20.7s
ReflectedMethod-based (like most methods in normal classes):
19.3s
MultiStub-based (like Enumerable today):
14.9s
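So simply switching to MultiStub trims a bit over 20% off this benchmark compared with ReflectedMethod (19.3s down to 14.9s). Subtracting the 4.33s control run, which is the cost of doing the 10M block invocations themselves, the remaining time drops from roughly 15.0s to 10.6s, which is closer to the 30% range. We're looking to start using MultiStub more in core classes. Anyway, back on topic...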
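
For readers who haven't seen the technique, here's a rough Ruby analogue of the MultiStub idea described above: one object carries several method bodies, and each bound method is just a reference to that object plus an integer slot. This is only a sketch under my own assumptions; the class names (EnumerableStub, StubMethod), the slot numbering, and the name table are all hypothetical and are not JRuby's actual classes or API.

```ruby
# Hypothetical illustration only; none of these names come from JRuby.

# One "stub" object holds many method bodies, selected by integer index,
# so there is no class per method and no reflective lookup per call.
class EnumerableStub
  def call(index, receiver, args)
    case index
    when 0 then receiver.first              # body for "first"
    when 1 then receiver.include?(args[0])  # body for "include?"
    when 2 then receiver.length             # body for "size"
    else raise ArgumentError, "no method at index #{index}"
    end
  end
end

# A bound method is just (stub, index).
class StubMethod
  def initialize(stub, index)
    @stub = stub
    @index = index
  end

  def call(receiver, *args)
    @stub.call(@index, receiver, args)
  end
end

# Binding method names to stub slots (assumed layout, for illustration).
stub = EnumerableStub.new
bindings = {
  "first"    => StubMethod.new(stub, 0),
  "include?" => StubMethod.new(stub, 1),
  "size"     => StubMethod.new(stub, 2),
}

puts bindings["first"].call([3, 1, 2])        # prints 3
puts bindings["include?"].call([3, 1, 2], 1)  # prints true
puts bindings["size"].call([3, 1, 2])         # prints 3
```

The point of the shape is that adding a method means adding a slot, not a class, and dispatch is a plain integer switch rather than a reflective call.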

What still needs to be done on the compiler:
  • I didn't implement any additional nodes, so it only handles perhaps 20% of them.
  • The toplevel method should define the contained methods as they're encountered. I'm wiring them all up manually in the test script right now.
  • It doesn't have any smarts for binding Ruby method names to the generated MultiStub methods yet.
It's a big leap in the right direction though, since you can pass it a script and it will try to compile the whole thing to Java code. Here are the results of the recursive fib benchmark with the new compiler (calculating fib(30)):
Time for bi-recursive, interpreted: 14.859
Time for bi-recursive, compiled: 9.825
Ruby 1.8.5:
Time for bi-recursive, interpreted: 1.677
This was in the mid 10-second range previously, so this is the first time we've dropped below 10 seconds. This puts the compiled code around 6x as slow as Ruby for this benchmark, which is very method-call intensive. Still, it's a solid 33% improvement over the interpreted version...probably an even larger percentage improvement if we don't count method-call overhead. Now on to iterative results, which are very light on interpretation (calculating fib(500000)):
Time for iterative, interpreted: 58.681
Time for iterative, compiled: 58.345
JRuby sans ObjectSpace support:
Time for iterative, interpreted: 47.638
Time for iterative, compiled: 47.563
Ruby 1.8.5:
Time for iterative, interpreted: 50.770461
For the iterative benchmark, compiling makes almost no difference, and we're still roughly on par with Ruby (about 15% slower with ObjectSpace on), because very little interpretation is involved and Java's BigInteger is faster than Ruby's Bignum. When ObjectSpace is turned off (it's pure overhead for us), the iterative version runs about 6% faster in JRuby than in Ruby. Once we eliminate some method overhead, things should improve more.
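
For reference, the two benchmarks discussed above can be sketched roughly like this. The exact benchmark script isn't shown in this post, so the method names (fib_recursive, fib_iterative, time_it) and the bodies are my assumptions about a typical version; the inputs are scaled down so the sketch finishes quickly.

```ruby
# Assumed shape of the benchmarks; not the actual script used for the numbers above.

# Bi-recursive Fibonacci: almost all of the time goes into method calls.
def fib_recursive(n)
  n < 2 ? n : fib_recursive(n - 1) + fib_recursive(n - 2)
end

# Iterative Fibonacci: a simple loop; for large n the cost is mostly
# big-integer addition rather than method dispatch.
def fib_iterative(n)
  a, b = 0, 1
  n.times { a, b = b, a + b }
  a
end

# Tiny timing helper in the same style as the one-liner earlier in the post.
def time_it(label)
  start = Time.now
  result = yield
  puts "Time for #{label}: #{Time.now - start}"
  result
end

# The post used fib(30) and fib(500000); smaller inputs are used here.
time_it("bi-recursive") { fib_recursive(20) }
time_it("iterative")    { fib_iterative(5_000) }
```

The recursive version stresses exactly the call overhead the compiler and MultiStub are meant to reduce, while the iterative version mostly measures the big-integer implementation, which is why compiling barely changes its time.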

Moving right along.

1 comment:

Erik van Oosten said...

When you write 'compile to java code', am I right to assume you mean 'compile to java byte code'?