Sunday, July 15, 2007
The Ever-Evolving JVM
John Rose, lead of the "invokedynamic" effort (Java Specification Request 292), has posted some exciting articles about the future of the JVM and a number of changes potentially coming in the next Java version. Among these are, of course, the dynamic invocation efforts, but the entries also cover non-local returns from closures, tail calls, and tuple support. I'm excited to have John as a co-worker and to be helping out the invokedynamic effort in my own small way by working on JRuby's compiler. John has also provided guidance on how to make a dynamic language fast on current JVMs, which has informed much of my compiler's design.
Check out his articles:
- Longjumps Considered Inexpensive
- tail calls in the VM
- tuples in the VM
It's going to be a fun year for language developers on the JVM!
More Compiler Strategy: Call Adapters and Stack-based Methods
Compilers are hard. But not so hard as people would have you believe.
I've committed an update that installs a CallAdapter for every compiled call site. CallAdapter is basically a small object that stores the following:
- method name
- method index
- call type (normal, functional, variable)
The end result is that while compiled class init is a bit larger (it needs to load adapters for all call sites), compiled method size has dropped substantially: in compiling bench_method_dispatch.rb, the two main tests went from roughly 4000 and 3500 bytes of code down to roughly 1500 and 1000 bytes. And simpler code means HotSpot has a better chance to optimize.
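To make the idea concrete, here is a minimal sketch of what such a per-call-site adapter could look like. The class shape and names are my own illustration, not JRuby's actual code, and the real dispatch logic is elided.

// Illustrative sketch only -- not JRuby's actual CallAdapter. It just shows
// the three pieces of state a call site needs and the single entry point
// the compiled method body would invoke.
public class CallAdapter {
    public enum CallType { NORMAL, FUNCTIONAL, VARIABLE }

    private final String methodName;  // the Ruby method being called, e.g. "foo"
    private final int methodIndex;    // precomputed index for fast method lookup
    private final CallType callType;  // how the call was written at the call site

    public CallAdapter(String methodName, int methodIndex, CallType callType) {
        this.methodName = methodName;
        this.methodIndex = methodIndex;
        this.callType = callType;
    }

    // A compiled call site only pushes the receiver and arguments and invokes
    // this method; all of the dispatch machinery lives behind it.
    public Object call(Object receiver, Object[] args) {
        throw new UnsupportedOperationException("dispatch elided in this sketch");
    }
}

Since the adapters are loaded once in the class initializer and reused, each call site in the generated bytecode presumably collapses to little more than a field load plus a single invocation, which is where the size reduction comes from.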
Here are the latest numbers for the bench_method_dispatch_only test, which just measures time to call a Ruby-implemented method a bunch of times:
Test interpreted: 100k loops calling self's foo 100 times
2.383000 0.000000 2.383000 ( 2.383000)
2.691000 0.000000 2.691000 ( 2.691000)
1.775000 0.000000 1.775000 ( 1.775000)
1.812000 0.000000 1.812000 ( 1.812000)
1.789000 0.000000 1.789000 ( 1.789000)
1.776000 0.000000 1.776000 ( 1.777000)
1.809000 0.000000 1.809000 ( 1.809000)
1.779000 0.000000 1.779000 ( 1.781000)
1.784000 0.000000 1.784000 ( 1.784000)
1.830000 0.000000 1.830000 ( 1.830000)
And Ruby 1.8.6 for reference:

Test interpreted: 100k loops calling self's foo 100 times
2.160000 0.000000 2.160000 ( 2.188087)
2.220000 0.010000 2.230000 ( 2.237414)
2.230000 0.010000 2.240000 ( 2.248185)
2.180000 0.010000 2.190000 ( 2.218540)
2.240000 0.010000 2.250000 ( 2.259535)
2.220000 0.010000 2.230000 ( 2.241170)
2.150000 0.010000 2.160000 ( 2.178414)
2.240000 0.010000 2.250000 ( 2.259772)
2.260000 0.000000 2.260000 ( 2.285141)
2.230000 0.010000 2.240000 ( 2.252396)

Note that these are JIT numbers rather than fully precompiled numbers, so this is 100% real-world safe. Fully precompiled is just a bit faster, since there's no interpreted step or DefaultMethod wrapper to go through.
I have also made a lot of progress on adapting the compiler to create stack-based methods when possible. Basically, this involves inspecting the code for anything that would require access to local variables from outside the body of the method: things like eval, closures, and so on. At the moment it works well and passes all tests, but I know that methods similar to gsub, which modify $~ or $_, are not working right. It's disabled for now, pending more work, but here are the method dispatch numbers with stack-based method compilation enabled:
Test interpreted: 100k loops calling self's foo 100 times
1.735000 0.000000 1.735000 ( 1.738000)
1.902000 0.000000 1.902000 ( 1.902000)
1.078000 0.000000 1.078000 ( 1.078000)
1.076000 0.000000 1.076000 ( 1.076000)
1.077000 0.000000 1.077000 ( 1.077000)
1.086000 0.000000 1.086000 ( 1.086000)
1.077000 0.000000 1.077000 ( 1.077000)
1.084000 0.000000 1.084000 ( 1.084000)
1.090000 0.000000 1.090000 ( 1.090000)
1.083000 0.000000 1.083000 ( 1.083000)

This seems like very promising work. I hope I'll be able to turn it on soon.
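For the curious, the eligibility check described above boils down to walking a method's body and looking for anything that needs its locals (or frame globals like $~ and $_) to be reachable from outside the method's own stack frame. Here is a rough sketch using a made-up AST interface, not JRuby's real node classes, and with only a token set of disqualifying constructs:

import java.util.List;

// Illustrative only: a simplified AST walk that decides whether a method's
// local variables can live on the JVM stack. JRuby's real node types and
// the full set of disqualifying constructs differ.
interface Node {
    String type();            // e.g. "eval-call", "block", "gsub-call", "local-asgn"
    List<Node> children();
}

final class StackEligibility {
    // Returns true only if nothing in the body requires a heap-allocated
    // scope: no eval, no closures, no calls that implicitly set $~ or $_.
    static boolean canUseStackLocals(Node node) {
        if (node.type().equals("eval-call")) return false;  // eval can see locals
        if (node.type().equals("block")) return false;       // closures capture locals
        if (node.type().equals("gsub-call")) return false;   // implicitly sets $~ / $_
        for (Node child : node.children()) {
            if (!canUseStackLocals(child)) return false;
        }
        return true;
    }

    private StackEligibility() {}
}

If the check fails, the compiler would simply fall back to the existing heap-scoped method generation, which is also what makes it safe to keep the optimization switched off for now.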
Oh, and for those who always need a fib fix, here's fib with both optimizations turned on:
~ $ jruby -J-server bench_fib_recursive.rb
1.258000 0.000000 1.258000 ( 1.258000)
0.990000 0.000000 0.990000 ( 0.989000)
0.925000 0.000000 0.925000 ( 0.926000)
0.927000 0.000000 0.927000 ( 0.928000)
0.924000 0.000000 0.924000 ( 0.925000)
0.923000 0.000000 0.923000 ( 0.923000)
0.927000 0.000000 0.927000 ( 0.926000)
0.928000 0.000000 0.928000 ( 0.929000)
And MRI:

~ $ ruby bench_fib_recursive.rb
1.760000 0.010000 1.770000 ( 1.775660)
1.760000 0.010000 1.770000 ( 1.776360)
1.760000 0.000000 1.760000 ( 1.778413)
1.760000 0.010000 1.770000 ( 1.776767)
1.760000 0.010000 1.770000 ( 1.777361)
1.760000 0.000000 1.760000 ( 1.782798)
1.770000 0.010000 1.780000 ( 1.794562)
1.760000 0.010000 1.770000 ( 1.777396)

These numbers went down a bit because the call adapter is currently just generic code, and generic code that calls lots of different methods causes HotSpot to stumble a bit. The next step for the compiler is to generate custom call adapters for each call site that handle arity correctly (avoiding IRubyObject[] all the time) and call directly to the most-likely target methods.
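As a sketch of what those per-site, arity-aware adapters might look like (again, purely illustrative names, with the real dispatch and caching logic elided):

// Illustrative sketch of an arity-specialized adapter. A one-argument call
// site gets a call(receiver, arg) entry point, so the compiler never has to
// box arguments into an Object[] just to pass them along.
public class OneArgCallAdapter {
    private final String methodName;

    public OneArgCallAdapter(String methodName) {
        this.methodName = methodName;
    }

    // Fixed-arity fast path emitted for call sites like foo(x).
    public Object call(Object receiver, Object arg0) {
        // A real implementation would try a cached "most likely" target first
        // and only fall back to generic lookup on a miss; here it simply
        // delegates to the slow path.
        return callGeneric(receiver, new Object[] { arg0 });
    }

    // Generic, any-arity fallback equivalent to the current adapter.
    private Object callGeneric(Object receiver, Object[] args) {
        throw new UnsupportedOperationException("dispatch elided in this sketch");
    }
}

The generated fast path could then be specialized per call site, which is the point where the generic-code penalty described above should go away.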