Headius

Thursday, July 19, 2007

Alioth Numbers for JRuby 1.0

Someone pointed out to me the other day that the Alioth "Compuer Language Benchmarks Game" (as they call it now) had started to include JRuby 1.0 in the results. And how do we fare? We're slower than Ruby 1.8.6! Hooray!

But not by much. Depending on your definition of "on par", I'd say we're safely in the range of being on par with Ruby 1.8.6 performance, at least for these too-short, too-small measurements.

Alioth benchmarks are regularly panned by language enthusiasts...or at least enthusiasts on the "slow" end of the benchmarks. In the case of JVM-based language implementations, the problem is the same old excuse: not enough time for the JVM JIT to work its magic. This case isn't all that different, but it's fair to say that Alioth isn't testing straight-line performance in a hot application, but getting from point A to point B without a running start. And in that domain, it provides a reasonable window into implementation performance.

So then, the JRuby 1.0 numbers. You can click the link to see them, but they break down about like this:

Performance:

Four tests are equal or faster in JRuby--usually not more than 2x faster, but "4.8x" faster in one case. That fastest one involves "concurrency", and I haven't studied it enough to know whether it's meaningful or not.
Eight tests are less than 2x slower in JRuby.
The remaining four tests are greater than 2x slower in JRuby.
Startup time is considerably worse; it's listed as 305x slower for JRuby (!!!) but it's not a particularly useful ratio. We take a couple seconds to get going compared to Ruby's hundredths. That's life.

Memory:

All except one worse in JRuby
Most more than 2x worse in JRuby
Several more then 10x worse in JRuby
Does anyone care?

(Update: Commenters have made it clear that they do care...but of course I was being a little bit glib here :) We continue every release to do what we can to improve performance AND memory usage, and other than some additional overhead from the JVM itself our memory usage is pretty reasonable.)

So I guess that's all I've got to point out. We've been working on performance since 1.0, and there's a number of major improvements planned for the 1.1 release. And considering that "beating Ruby performance" wasn't a primary goal for JRuby 1.0, I think our "roughly on par" numbers here are pretty damn good. Granted, Ruby 1.8.x isn't the fastest implementation in the world, but we're pretty happy to have improved performance by an order of magnitude in the past year.

Now, onward to the future!

Wednesday, July 18, 2007

Groovy and JRuby Cooperating?

Can it be true? Of course it can!

Graeme Rocher blogs about a new feature in upcoming Groovy 1.1 beta 3: support for Ruby-style "method missing" hooks!

In heavily metaprogrammed applications, it's very common to define methods on the fly in response to calls made. For example, in Rails ActiveRecord, no "find by" methods are defined out of the box, but you may still call "find_by_x_and_y_and_z" to run your specific query. In order to reduce the overhead of handling these "missing methods" (usually through parsing the called name), it's typical to generate new methods on demand, binding them to the called class. So if you call find_by_name and it doesn't exist, two things happen: the query is generated and run and a new find_by_name method is dynamically created for future queries.

After Graeme blogged about the typical way to dynamically create methods in Groovy, I tapped him on IM and we started to talk about how Ruby handles the same use case. We agreed that the overhead of overriding invokeMethod (Groovy's entry point for all dynamic calls) was problematic...it's already got enough to do without adding the method-generation logic. As it turns out, "method missing" was a perfect remedy, and pretty easy to add into Groovy. Graeme's already started work to modify Grails to take advantage of methodMissing and I'm sure it will catch on like wildfire in the rest of Groovydom. And as a result you, the JVM dynamic language community, gain the benefit of a little Groovy/JRuby cross-pollination.

So for all you haters out there who just love to pit language teams against one another I say this:

Nyah.

Sunday, July 15, 2007

The Ever-Evolving JVM

John Rose, lead of the "invokedynamic" effort (Java Specification Request 292), has posted some exciting articles about the future of the JVM and a number of changes potentially for the next Java version. Among these is, of course, the dynamic invocation efforts, but these entries include information on non-local returns from closures, tail calls, and tuple support. I'm excited to have John as a co-worker and to be helping out the invokedynamic effort in my own small way by working on JRuby's compiler. John has also provided guidance on how to make a dynamic language fast on current JVMs, which has informed much of my compiler's design.

Check out his articles:

Longumps Considered Inexpensive
tail calls in the VM
tuples in the VM

It's going to be a fun year for language developers on the JVM!

More Compiler Strategy: Call Adapters and Stack-based Methods

Compilers are hard. But not so hard as people would have you believe.

I've committed an update that installs a CallAdapter for every compiled call site. CallAdapter is basically a small object that stores the following:

method name
method index
call type (normal, functional, variable)

As well as providing overloaded call() implementations for 1, 2, 3, n arguments and block or no block. The basic goal with this class is to provide a call adapter (heh) that makes calling a Ruby method in compiled code as similar to (and simple as) calling any Java method.

The end result is that while compiled class init is a bit larger (needs to load adapters for all call sites), compiled method size has dropped substantially; in compiling bench_method_dispatch.rb, the two main tests went from 4000 and 3500 bytes of code down to 1500 and 1000 bytes (roughly). And simpler code means HotSpot has a better chance to optimize.

Here's the latest numbers for the bench_method_dispatch_only test, which just measures time to call a Ruby-implemented method a bunch of times:

Test interpreted: 100k loops calling self's foo 100 times
 2.383000   0.000000   2.383000 (  2.383000)
 2.691000   0.000000   2.691000 (  2.691000)
 1.775000   0.000000   1.775000 (  1.775000)
 1.812000   0.000000   1.812000 (  1.812000)
 1.789000   0.000000   1.789000 (  1.789000)
 1.776000   0.000000   1.776000 (  1.777000)
 1.809000   0.000000   1.809000 (  1.809000)
 1.779000   0.000000   1.779000 (  1.781000)
 1.784000   0.000000   1.784000 (  1.784000)
 1.830000   0.000000   1.830000 (  1.830000)

And Ruby 1.8.6 for reference:

Test interpreted: 100k loops calling self's foo 100 times
2.160000   0.000000   2.160000 (  2.188087)
2.220000   0.010000   2.230000 (  2.237414)
2.230000   0.010000   2.240000 (  2.248185)
2.180000   0.010000   2.190000 (  2.218540)
2.240000   0.010000   2.250000 (  2.259535)
2.220000   0.010000   2.230000 (  2.241170)
2.150000   0.010000   2.160000 (  2.178414)
2.240000   0.010000   2.250000 (  2.259772)
2.260000   0.000000   2.260000 (  2.285141)
2.230000   0.010000   2.240000 (  2.252396)

Note that these are JIT numbers rather than fully precompiled numbers, so this is 100% real-world safe. Fully precompiled is just a bit faster, since there's no interpreted step or DefaultMethod wrapper to go through.

I have also made a lot of progress on adapting the compiler to create stack-based methods when possible. Basically, this involved inspecting the code for anything that would require access to local variables outside the body of the call. Things like eval, closures, etc. At the moment it works well and passes all tests, but I know methods similar to gsub which modify $~ or $_ are not working right. It's disabled at the moment, pending more work, but here's the method dispatch numbers with stack-based method compilation enabled:

Test interpreted: 100k loops calling self's foo 100 times
 1.735000   0.000000   1.735000 (  1.738000)
 1.902000   0.000000   1.902000 (  1.902000)
 1.078000   0.000000   1.078000 (  1.078000)
 1.076000   0.000000   1.076000 (  1.076000)
 1.077000   0.000000   1.077000 (  1.077000)
 1.086000   0.000000   1.086000 (  1.086000)
 1.077000   0.000000   1.077000 (  1.077000)
 1.084000   0.000000   1.084000 (  1.084000)
 1.090000   0.000000   1.090000 (  1.090000)
 1.083000   0.000000   1.083000 (  1.083000)

It seems very promising work. I hope I'll be able to turn it on soon.

Oh, and for those who always need a fib fix, here's fib with both optimizations turned on:

~ $ jruby -J-server bench_fib_recursive.rb      
 1.258000   0.000000   1.258000 (  1.258000)
 0.990000   0.000000   0.990000 (  0.989000)
 0.925000   0.000000   0.925000 (  0.926000)
 0.927000   0.000000   0.927000 (  0.928000)
 0.924000   0.000000   0.924000 (  0.925000)
 0.923000   0.000000   0.923000 (  0.923000)
 0.927000   0.000000   0.927000 (  0.926000)
 0.928000   0.000000   0.928000 (  0.929000)

And MRI:

~ $ ruby bench_fib_recursive.rb
1.760000   0.010000   1.770000 (  1.775660)
1.760000   0.010000   1.770000 (  1.776360)
1.760000   0.000000   1.760000 (  1.778413)
1.760000   0.010000   1.770000 (  1.776767)
1.760000   0.010000   1.770000 (  1.777361)
1.760000   0.000000   1.760000 (  1.782798)
1.770000   0.010000   1.780000 (  1.794562)
1.760000   0.010000   1.770000 (  1.777396)

These numbers went down a bit because the call adapter is currently just generic code, and generic code that calls lots of different methods causes HotSpot to stumble a bit. The next step for the compiler is to generate custom call adapters for each call site that handle arity correctly (avoiding IRubyObject[] all the time) and call directly to the most-likely target methods.

Friday, July 13, 2007

To Keyword Or Not To Keyword

One of the most attractive aspects of Ruby is the fact that it has relatively few sacred keywords. In most cases, things you'd expect to be keywords are actually methods, and you can wrap or hook their behavior and create amazing potential.

One perfect example of this is require. Because require is just a method, you can define your own version that wraps its behavior. This is exactly how RubyGems does its magic...rather than immediately calling the default require, it can modify load paths based on your installed gems, allowing for a dynamically-expanding load path and the pluggability we've all come to know and love.

But all such keyword-like methods are not so well behaved. Many methods make runtime changes that are otherwise impossible to do from normal Ruby code. Most of these are on Kernel. I propose that several of these methods should actually be keywords.

Update: Evan Phoenix of Rubinius (EngineYard), Wayne Kelly of Ruby.NET (Queensland University), and John Lam of IronRuby (Microsoft) have voiced their agreement on this interesting ruby-core mailing list thread. Have you shared your thoughts?

Justifying Keywords

There's a number of (in my opinion, very strong) justifications for this:

Many Kernel methods manipulate runtime state in ways no other methods can. For example: local_variables requires access to the caller's variable scope; block_given? requires access to the block/iter stacks (in MRI code); eval requires access to just about everything having to do with a call; and there are others, see below.
Because many of these methods manipulate normally-inaccessible runtime state, it is not possible to implement them in Ruby code. Therefore, even if someone wanted to override them (the primary reason for them to be methods) they could not duplicate their behavior in the overridden version. Overriding only destroys their utility.
These methods are exactly the ones that complicate optimizing Ruby in all implementations, including Ruby 1.9, Rubinius, JRuby, Ruby.NET, and others. They confound a compiler's efforts to optimize calls by always leaving open questions about the behavior of a method. Will it need access to a heap-allocated scope? Will it save off a binding or the current call frame? No way to know for sure, since they're methods.

In short, there appears to be no good reason to keep them as methods, and many reasons to make them keywords. What follows is a short list of such methods and why they ought to be keywords:

*eval - requires implicit access to the caller's binding
block_given?/iterator? - requires access to block/iter information
local_variables - requires access to caller's scope
public/private/protected - requires access to current frame's visibility

There may be others, but these are definitely the biggest offenders. The three points above were used to compile this list, but my criteria for a keyword could be the following more straightforward points. A feature should be implemented (or converted to) a keyword if it fits either of the following criteria:

It manipulates runtime state in ways impossible from user-created code
It can't be implemented in user-created code, and therefore could not reasonably be overridden or hooked to provide additional behavior

As an alternative, if modifications could be made to ensure these methods were not overridable, Ruby implementations could safely treat them as keywords; searching for calls to "eval" in a given context would be guaranteed to mean an eval would take place in that context.

What do we gain from doing all this?

I can at least give a JRuby perspective. I expect others can give their perspectives.

In JRuby, we could greatly optimize method invocations if, for example, we knew we could just use Java's local variables (on Java's stack) rather than always heap-allocating a scoping structure. We could also avoid allocating a frame or binding when they are not needed, just allowing Java's call frame to be "enough" for us. We can already detect if there are closures in a given context, which helps us learn that a heap-allocated scope will be necessary, but we can't safely detect eval, block_given?, etc. As a result of these methods-that-would-be-keywords, we're forced to set up and tear down every method in the most expensive manner.

Other implementations&emdash;including Ruby 1.9/2.0 and Rubinius&emdash;would probably be able to make similar optimizations if we could calculate ahead of time whether these keyword operations would occur.

For what it's worth, I may go ahead and implement JRuby's compiler to treat these methods as keywords, only falling back on the "method" behavior when we detect in the rest of the system that the keyword has been overridden. But that situation is far from ideal...we'd like to see all implementations adopt this behavior and so benefit equally.

As an example, here's an early demonstration of the performance change in our old friend fib() when we can know ahead of time if any of these keywords are called (fib calls none of them). This example shows the performance today and the performance when we can safely just use Java local variables and scoping constructs. We could additionally omit heap-allocated frames for each call, giving a further boost.

I've included Ruby 1.8.6 to provide a reference value.

Current JRuby:

~ $ jruby -J-server bench_fib_recursive.rb
1.323000   0.000000   1.323000 (  1.323000)
1.118000   0.000000   1.118000 (  1.119000)
1.055000   0.000000   1.055000 (  1.056000)
1.054000   0.000000   1.054000 (  1.054000)
1.055000   0.000000   1.055000 (  1.054000)
1.055000   0.000000   1.055000 (  1.055000)
1.055000   0.000000   1.055000 (  1.055000)
1.049000   0.000000   1.049000 (  1.049000)

~ $ jruby -J-server bench_method_dispatch_only.rb
Test interpreted: 100k loops calling self's foo 100 times
3.901000   0.000000   3.901000 (  3.901000)
4.468000   0.000000   4.468000 (  4.468000)
2.446000   0.000000   2.446000 (  2.446000)
2.400000   0.000000   2.400000 (  2.400000)
2.423000   0.000000   2.423000 (  2.423000)
2.397000   0.000000   2.397000 (  2.397000)
2.399000   0.000000   2.399000 (  2.399000)
2.401000   0.000000   2.401000 (  2.401000)
2.427000   0.000000   2.427000 (  2.428000)
2.403000   0.000000   2.403000 (  2.403000)

Using Java's local variables instead of a heap-allocated scope:

~ $ jruby -J-server bench_fib_recursive.rb
2.360000   0.000000   2.360000 (  2.360000)
0.818000   0.000000   0.818000 (  0.818000)
0.775000   0.000000   0.775000 (  0.775000)
0.773000   0.000000   0.773000 (  0.773000)
0.799000   0.000000   0.799000 (  0.799000)
0.771000   0.000000   0.771000 (  0.771000)
0.776000   0.000000   0.776000 (  0.776000)
0.770000   0.000000   0.770000 (  0.769000)

~ $ jruby -J-server bench_method_dispatch_only.rb
Test interpreted: 100k loops calling self's foo 100 times
3.100000   0.000000   3.100000 (  3.100000)
3.487000   0.000000   3.487000 (  3.487000)
1.705000   0.000000   1.705000 (  1.706000)
1.684000   0.000000   1.684000 (  1.684000)
1.678000   0.000000   1.678000 (  1.678000)
1.683000   0.000000   1.683000 (  1.683000)
1.679000   0.000000   1.679000 (  1.679000)
1.679000   0.000000   1.679000 (  1.679000)
1.681000   0.000000   1.681000 (  1.681000)
1.679000   0.000000   1.679000 (  1.679000)

Ruby 1.8.6:

~ $ ruby bench_fib_recursive.rb     
1.760000   0.010000   1.770000 (  1.775304)
1.750000   0.000000   1.750000 (  1.770101)
1.760000   0.010000   1.770000 (  1.768833)
1.750000   0.010000   1.760000 (  1.782908)
1.750000   0.010000   1.760000 (  1.774193)
1.750000   0.000000   1.750000 (  1.766951)
1.750000   0.010000   1.760000 (  1.777814)
1.750000   0.010000   1.760000 (  1.782449)

~ $ ruby bench_method_dispatch_only.rb
Test interpreted: 100k loops calling self's foo 100 times
2.240000   0.000000   2.240000 (  2.268611)
2.160000   0.010000   2.170000 (  2.187729)
2.280000   0.010000   2.290000 (  2.292342)
2.210000   0.010000   2.220000 (  2.250331)
2.190000   0.010000   2.200000 (  2.210965)
2.230000   0.000000   2.230000 (  2.260737)
2.240000   0.010000   2.250000 (  2.256210)
2.150000   0.010000   2.160000 (  2.173298)
2.250000   0.010000   2.260000 (  2.271438)
2.160000   0.000000   2.160000 (  2.183670)

What do you think? Is it worth it?

Wednesday, July 11, 2007

Finding a JVM compilation strategy for Ruby's dynamic nature

In JRuby, we have a number of things we "decorate" the Java stack with for Ruby execution purposes. Put simply, we pass a bunch of extra context on the call stack for most method calls. At its most descriptive, making a method call passes the following along:

a ThreadContext object, for accessing JRuby call frames and variable scopes
the receiver object
the metaclass for the receiver object
the name of the method
a numeric index for the method, used for a fast dispatch mechanism
an array of arguments to the method
the type of call being performed (functional, normal, or variable)
any block/closure being passed to the method

Additionally there are a few places where we also pass the calling object, to use for visibility checks.

The problem arises when compiling Ruby code into Java bytecode. The case I'm looking at involves one of our benchmarks where a local variable is accessed and "to_i" is invoked on it a large number of times:

puts Benchmark.measure {
a = 5;
i = 0;
while i < 1000000
  a.to_i; a.to_i; a.to_i; a.to_i; a.to_i; a.to_i; a.to_i; a.to_i; a.to_i; a.to_i;
  a.to_i; a.to_i; a.to_i; a.to_i; a.to_i; a.to_i; a.to_i; a.to_i; a.to_i; a.to_i;
  a.to_i; a.to_i; a.to_i; a.to_i; a.to_i; a.to_i; a.to_i; a.to_i; a.to_i; a.to_i;
  a.to_i; a.to_i; a.to_i; a.to_i; a.to_i; a.to_i; a.to_i; a.to_i; a.to_i; a.to_i;
  a.to_i; a.to_i; a.to_i; a.to_i; a.to_i; a.to_i; a.to_i; a.to_i; a.to_i; a.to_i;
  a.to_i; a.to_i; a.to_i; a.to_i; a.to_i; a.to_i; a.to_i; a.to_i; a.to_i; a.to_i;
  a.to_i; a.to_i; a.to_i; a.to_i; a.to_i; a.to_i; a.to_i; a.to_i; a.to_i; a.to_i;
  a.to_i; a.to_i; a.to_i; a.to_i; a.to_i; a.to_i; a.to_i; a.to_i; a.to_i; a.to_i;
  a.to_i; a.to_i; a.to_i; a.to_i; a.to_i; a.to_i; a.to_i; a.to_i; a.to_i; a.to_i;
  a.to_i; a.to_i; a.to_i; a.to_i; a.to_i; a.to_i; a.to_i; a.to_i; a.to_i; a.to_i;
  i += 1;
end
}

(that's 100 accesses and calls in a 1 million loop)

The block being passed to Benchmark.measure gets compiled into its own Java method on the resulting class, called something like closure0. This gets further bound into a CompiledBlock adapter which is what's eventually called when the block gets invoked.

Unfortunately all the additional context and overhead required in the compiled Ruby code seems to be causing trouble for hotspot.

In this case, the pieces causing the most trouble are obviously the "a.to_i" bits. I'll break that down.

"a" is a local variable in the same lexical scope, so we go to a local variable in closure0 that holds an array of local variable values.

 aload(SCOPE_INDEX)
ldc([index of a])
aaload

But for Ruby purposes we must also make sure a Java null is replaced with "nil" so we have an actual Ruby object

 dup
ifnonnull(ok)
pop
aload(NIL_INDEX) # immediately stored when method is invoked
label(ok)

So every variable access is at least seven bytecodes, since we need to access them from an object that can be shared with contained closures.

Then there's the to_i call. This is where it starts to get a little ugly. to_i is basically a "toInteger" method, and in this case, calling against a Ruby Fixnum, it doesn't do anything but return "self". So it's a no-arg noop for the most part.

The resulting bytecode to do the call ends up being uncomfortably long:

(assumes we already have the receiver, a Fixnum, on the stack)

 dup # dup receiver
invokevirtual "getMetaClass"
invokevirtual "getDispatcher" # a fast switch-based dispatcher
swap # dispatcher under receiver
aload(THREADCONTEXT)
swap # threadcontext under receiver
dup # dup receiver again
invokevirtual "getMetaClass" # for call purposes
ldc(methodName)
ldc(methodIndex)
getstatic(IRubyObject.EMPTY_ARRAY) # no args
ldc(call type)
getstatic(Block.NULL_BLOCK) # no closure
invokevirtual "Dispatcher.callMethod..."

So we're looking at roughly 15 operations to do a single no-arg call. If we were processing argument lists, it would obviously be more, especially since all argument lists eventually get stuffed into an IRubyObject[]. Summed up, this means:

100 a.to_i calls * (7 + 15 ops) = 2200 ops

That's 2200 operations to do 100 variable accesses and calls, where in Java code it would be more like 200 ops (aload + invokevirtual). An order of magnitude more work being done.

The closure above when run through my current compiler generates a Java method of something like 4000 bytes. That may not sound like a lot, but it seems to be hitting a limit in HotSpot that prevents it being JITed quickly (or sometimes, at all). And the size and complexity of this closure are certainly reasonable, if not common in Ruby code.

There's a few questions that come out of this, and I'm looking for more ideas too.

How bad is it to be generating large Java methods and how much impact does it have on HotSpot's ability to optimize?
This code obviously isn't optimal (two calls to getMetaClass, for example), but the size of the callMethod signature means even optimal code will still have a lot of argument loading to do. Any ideas on how to get around this in a general way? I'm thinking my only real chance is to find simpler signatures to invoke, such as arity-specific (so there's no requirement for an array of args), avoiding passing context that usually isn't needed (an object knows its metaclass already), and reverting back to a ThreadLocal to get the ThreadContext (though that was a big bottleneck for us before...).
Is the naive approach of breaking methods in two when possible "good enough"?

It should be noted that HotSpot eventually does JIT this code, it's substantially faster than the current general release of Ruby 1.8. But I'm worried about the complexity of the bytecode and actively looking for ways to simplify.

Friday, July 06, 2007

Understanding the JVM JIT and helping it along

I must apologize to my readers. I have been remiss in my blogging duties. I will be posting updates on the various events of the past month or so along with updates on JRuby progress and future events very soon. But for now, a technical divergence after a night of hacking.

--

I finally understand what we should be going for in our compiled code, and how we can really kick JRuby into the next level of performance.

The JVM, at least in HotSpot, gets a lot of its performance from its ability to inline code at runtime, and ultimately compile a method plus its inlined calls as a whole down to machine code. The benefit in doing this is the ability to do compiler optimizations across a much larger call path, essentially compiling all the logic for a method and its calls (and possibly their calls, ad infinatum) into a single optimized segment of machine code.

HotSpot is able to do this in a two main ways:

If it's obvious there's only ever one implementation of a given signature on a given type hierarchy
If it can determine at runtime that one (or a few) implementations are the only ones ever being called

The first one allows code to be optimized fairly quickly, because HotSpot can discover early on that there's only one implementation. In general, if there's a single implementation of a given signature, it will get inlined pretty quickly.

The second one is trickier. HotSpot tracks the actual types being called against for the various calls, and eventually can come up with a best guess at the method or methods to inline. It also can include a slow path for the rare future cases where the receiver does not match the target types, and it can deoptimize later to back down optimizations when situations change, such as when a new class is loaded into the system.

So in the end, inlining is one of the most powerful optimizations. Unfortunately in JRuby (and most other dynamic language implementations on the JVM), we're making inlining difficult or impossible in the most performance-sensitive areas. I believe this is a large part of our performance woes.

Consider that all method calls against any object must pass through an implementation of IRubyObject.callMethod. There's not too many callMethod implementations, and actually now there's only one implementation of each specific signature. So callMethod gets inlined pretty fast.

Consider also that almost all method calls within callMethod are to very specific methods and will also be inlined quickly. So callMethod is looking pretty good so far.

Now we look at the last step in callMethod...DynamicMethod.call. DynamicMethod is the top-level type for all our method objects in the system. The call method has numerous implementations, all of them different. And no one implementation stands out as the most frequently called. So we're already complicating matters for HotSpot, even though we know (based on the incoming method name) exactly the piece of code we *want* to call.

Let's continue on, assuming HotSpot is smart enough to work around our half-dozen or so DynamicMethod.call implementations.

DefaultMethod is the DynamicMethod implementation for interpreted Ruby code, so it calls directly into the evaluator. So at that point, DefaultMethod.call will inline the evaluator code and that looks pretty good. But there's also the JIT located in DefaultMethod. It generates a JVM bytecode version of the Ruby code and from then on DefaultMethod calls that. Now that's certainly a good thing on one hand, since we've eliminate the interpreter, but on the other hand we've essentially made it impossible for HotSpot to inline that generated code. Why? Because we generate a Java method for every JITable Ruby method. Hundreds, and eventually thousands of possible implementations. Making a decision to inline any of them into DefaultMethod.call is basically impossible. We've broken the chain.

To make matters worse, we also have the set of Java-wrapping DynamicMethod implementations, *CallbackMethod (used for binding Java code to Ruby method names) and CompiledMethod (used in AOT-compiled code).

The CallbackMethods all wrap another piece of generated code that implements Callback and calls the Java method in question. So we generate nice little wrappers for all the pre-existing methods we want to call, but we also make it impossible for the *CallbackMethod.call implementations to inline any of those calls. Broken chain again.

CompiledMethod is slightly better in this regard, since there's a new CompiledMethod subclass for every AOT-compiled Ruby method, but we still have a single implementaiton of DynamicMethod.call that all of those subclasses share in common. To make matters worse, even if we had separate DynamicMethod.call implementations, that may actually *hurt* our ability to inline code way back in IRubyObject.callMethod, since we've now added N possible DynamicMethod.call implementations to the system. And the chain gets broken even earlier.

So the bottom line here is that in order to continue improving performance, we need to do everything possible to move the call site and the call target closer together. There are a couple standard ways to do it:

Hard-coded special-case code for specific situations, much like YARV does for simple ops (+, -, <, >, etc). In these cases, the compiler would check that the target implements an appropriate type to do a direct call to the operation in question. In Fixnum's case, we'd first confirm it's a RubyFixnum, and then invoke e.g. RubyFixnum.plus directly. That skips all the chain breakage, and allows the compiled code to inline RubyFixnum.plus straight into the call site.
Dynamic generated method adapters that can be swapped out and that learn from previous calls to make direct invocations earlier in the chain. Basically, this would involve preparing call site caches that point at call adapters. Initially, the call adapters would be of some generic type that can use the slow path. But as more and more calls come in, more and more of the call sites would be replaced with specialized implementations that invoke the appropriate target code directly, allowing HotSpot a direct line from call site to call target.

The second version is obviously the ultimate goal, and essentially would mimic what the state-of-the-art JITs do (i.e. this is how HotSpot works under the covers). The first version is easily testable with some simple hackery.

I created a small patch that includes a trivial, unsafe change to the compiler to make Fixnum#+, Fixnum#-, and Fixnum#< direct calls when
possible. They're unsafe because they don't check to see if any of those
operations have been overridden...but of course you'd have to be a mad
fool to override them anyway.

To demonstrate a bit of the potential performance gains, here are some
numbers for JRuby trunk and trunk + patch. Note that Fixnum#+, Fixnum#-, and Fixnum#< are all already STI methods, which does a lot to speed up their invocation (STI uses a table of switch values to bypass dynamic method lookup). But this simple change of compiling direct calls completely blows the STI performance out of the water, and that's without similar direct calls to the fib_ruby method itself.

test/bench/bench_fib_recursive.rb

JRuby trunk without patch:
1.675000   0.000000   1.675000 (  1.675000)
1.244000   0.000000   1.244000 (  1.244000)
1.183000   0.000000   1.183000 (  1.183000)
1.173000   0.000000   1.173000 (  1.173000)
1.171000   0.000000   1.171000 (  1.170000)
1.178000   0.000000   1.178000 (  1.178000)
1.170000   0.000000   1.170000 (  1.170000)
1.169000   0.000000   1.169000 (  1.169000)

JRuby trunk with patch:
1.133000   0.000000   1.133000 (  1.133000)
0.922000   0.000000   0.922000 (  0.922000)
0.865000   0.000000   0.865000 (  0.865000)
0.862000   0.000000   0.862000 (  0.863000)
0.859000   0.000000   0.859000 (  0.859000)
0.859000   0.000000   0.859000 (  0.859000)
0.864000   0.000000   0.864000 (  0.863000)
0.859000   0.000000   0.859000 (  0.860000)

Ruby 1.8.6:
1.750000   0.010000   1.760000 (  1.760206)
1.760000   0.000000   1.760000 (  1.764561)
1.760000   0.000000   1.760000 (  1.762009)
1.750000   0.010000   1.760000 (  1.760286)
1.760000   0.000000   1.760000 (  1.759367)
1.750000   0.000000   1.750000 (  1.761763)
1.760000   0.010000   1.770000 (  1.798113)
1.760000   0.000000   1.760000 (  1.760355)

That's an improvement of over 25%, with about 20 lines of code. It would be even higher with a dynamic adapter for the fib_ruby call. And we can take this further...modify our Java integration code to do direct calls to Java types, modify compiled code to adapt to methods as they are redefined or added to the system, and so on and so forth. There's a ton of potential here.

I will continue working along this path.

Saturday, June 23, 2007

Camping at O'Reilly

Rumors of my demise were greatly exaggerated!

I've been mostly MIA the past couple weeks, largely due to the Ruby Kaigi and related events in Tokyo and a short family trip this past week to Lake Michigan's eastern shore (warm sandy beaches and a much-welcomed rest). But I figured I'd check in so everyone knows I'm still around and the wheels are still turning.

I'm at Foo Camp this weekend, and so far it's been a great time. Lots of good discussions, games of Werewolf, and staying up too late (I think I got about 3.5 hrs of sleep...more than enough!) Today I'm going to try to talk about something outside my usual subjects, mostly as a discussion facilitator, but with a bit of opinion tossed in for good measure. Here's the description I posted to the Foo calendar:

The Transformation Age

This is a collective presentation. There are no attendees; only nodes in the matrix. Expect to share.

Part 1: Individuals to collectives, isolated to connected, The Next Evolution.

As the exchange of information has become progressively more rapid, the communication boundaries more transparent, we are transforming into a new kind of collective entity. The swing from isolated individuals or isolated communities with their own distinct stores of knowledge toward a single global community with all stores of knowledge available to everyone is more than just an interesting trend...it represents a new evolutionary direction humanity is taking. It is the dawn of the transformation age for humankind.

We will explore what it means to be more and more a collective being in this new era, what it will mean for our children and grandchildren and nth grandchildren as pervasive interconnection becomes the norm. As all minds become so intimately connected that being disconnected from the global consciousness is like losing an arm...or one's own identity. And we'll look at how this new evolution is affecting and has affected successful and unsuccessful technologies, business models, and governments over the past several decades.

Part 2 (time permitting): Knowledge represented as transformation of information rather than as desperate attempts to snapshot or categorize information at a given point in time.

The idea of categorizing information has served us fairly well when information itself was slow to change, slow to be communicated, and largely static. Books can be sorted in a categorization scheme because the information they contain does not change over time; the categories remain as valid as when they were assigned. But what happens when the entirety of humanity's knowledge is now not only instantly available, but rapidly changing and evolving along with us? Is not categorization of changing information inherently flawed when information snapshots are almost immediately out of date? What can we do to allow everyone in this age of transformation to participate in information sharing in scalable way?

I would propose that when information itself has become so fluid it defies static categorization, that the transformation of knowledge is the new information we need to track. Already you see the signs of this:

Wikis are updated far more often than they have completely new entries added; the success of wiki is in transformation of knowledge over time, and in continuous evolution of the information contained therein.
Scientists in all fields build new ideas upon old; there are no new ideas that don't synthesize existing parts, transforming our understanding of those parts into a new entity.
Agile development emphasizes small, rapid, interactive transformations of data (a software algorithm + user interfaces on the large) rather than as big bang snapshots and releases.
Open source licenses reduce the barriers to accessing data effectively to zero explicitly to allow for rapid transformation of that data into today's entity. "Build from trunk" is becoming a normal recommendation for rapidly transforming projects, since snapshots are immediately out of date.

We'll talk about whether this all makes sense, whether moment-in-time snapshots of information are becoming more or less relevant than the changes over time to that information, and what can be done to both facilitate transformation and adapt our traditional information management ideals to this constantly changing information sea.

If you're interested in hearing more about all this, about how the session goes, or just would like to talk sometime along these lines, gimme a shout.

Sunday, June 10, 2007

JRuby 1.0 Released!

We have finally released JRuby 1.0, based on the last release candidate, RC3. And what more is there to say? Not really a whole lot...It's almost entirely RC3, with one or two minor fixes added in. But it's really turned out to be an outstanding release, and already reports are coming in of folks trying it out en masse. We're very happy.

So I'll do a little recap here. JRuby 1.0 was focused almost entirely on one goal: Ruby 1.8.x compatibility. To that end, we are now the only alternative Ruby implementation that can reasonably claim we're "compatible". It's no longer a question of whether we can run Ruby applications or not...we've proven that again and again. The issues people run into now are those requiring minor behavioral tweaks, minor parser tweaks, and occasionally exploration of some peculiar threading or memory concerns. It's been a long time coming, but the compatibility issue is largely answered.

Now we start looking toward the future. Once you have a working Ruby implementation, what next? I believe we've shown that the correct way to approach Ruby is to get it right first. The next step is making it run as well as possible. Ok, so we've cheated a bit along the way, introducing interpreter optimizations and a JIT compiler, but that's all fun stuff. The heavy lifting for performance and scalability are coming up, and there's a lot of low-hanging fruit we can start to tackle with 1.0 safely behind us. So action item #1 for the future of JRuby is plucking that low-hanging performance fruit.

The second item for any working implementation would have to be platform integration. Our platform is Java (and of course we've cheated a bit here by doing additional work to make Java integration really useful and usable), so integration involves making Ruby a better citizen of the Java platform. In this area, expect to see us further reduce the disconnect between Ruby and Java, allowing you to construct Ruby objects directly from Java code, define real Java classes using Ruby (classes you can then compile Java code to call), and minimizing to as large an extent possible the performance impact of calling from dynamically typed Ruby code into statically typed Java code. Java integration is action item #2.

And then what? Well, there are many options, all very attractive. I personally would like to see time spent implementing potential Ruby 1.9/2.0 features, to provide a second testbed where people can try those features out. I would like to find ways to share code with Rubinius (beyond tests), either by implementing just enough Rubinius logic to run its pure-Ruby core-class implementations or by implementing a full Rubinius kernel on the JVM. I would like to see JRuby expand to the rest of the Java world, bringing the Ruby Way to Java EE. But most importantly, I want to reach out to the Ruby community and find ways for them to be a part of the JRuby process.

Up to now, JRuby has really existed in its own, separate world. There was the Ruby community and also the JRuby community, and although members might identify themselves with both, there was always this perceived boundary.

We need to break that boundary down...not only for JRuby but for the other Ruby implementations as well.

Ruby is coming of age. Multiple implementations shows that Ruby has really matured as a language...and also shows it has a lot of maturing left to do. Ask Evan (of Rubinius) or me how we feel about retry behavior or block argument processing or thread event processing or SAFE levels and tainting and you'll start to understand some of the ugly, hidden bits of Ruby. We need to make sure that Ruby development proceeds as a whole...not necessarily as a single project or a single codebase, but as an open, direct exchange of concerns, complaints, solutions and ideas. We need to start treating the Rubinius community and the JRuby community and any other implementations' communities as part of the whole...different facets of the same gem.

If you have never tried an alternative implementation of Ruby: do it today. Pick out your favorite app, library, or framework and build your next app using something other than Matz's Ruby. Start running your continuous integration tests against JRuby trunk (or Rubinius trunk, if it runs your code ok; those guys sorely need more CI hits). Make sure your gem releases work on JRuby too, and if you have native (i.e. C) code, explore what it would take to do a JRuby port. Show that you welcome diversity into the Ruby world...that you recognize that diversity is an essential part of language evolution.

JRuby has always been a community project, and only by absorbing the JRuby community into the Ruby community will JRuby continue to be successful. If we can make that happen, I see wonderful things in Ruby's future...regardless of which implementation you use.

Monday, June 04, 2007

A Response to Ola's IronRuby Post

I'm not up for creative titles tonight. Hopefully you'll see this in your feed reader and click for a second opinion. Granted, I agree with Ola's IronRuby post on most points, but I disagree on a few key items. So let's dive in, shall we?

You managed to blog this viewpoint before me, Ola, but you know I agree with almost everything. However, to pour a bit more oil on the fire, I'll take it a step further.

Having run the Ruby gauntlet and brought JRuby from not running anything to running Rails almost 100% perfectly (in just over a year, I might add), I will confidently say there's no way with current specs and tests that anyone could create an implementation of Ruby from scratch that will run Rails unless they can look at the existing implementations. I simply do not believe it's possible.

And I'll throw another couple curve balls too:

I don't believe tests and specs will be where they need to be within the next 6-12 months unless there's a major effort put behind them. Even if that effort happens, I still don't think we'll ever have specs enough for someone to implement Ruby "in the dark" for at least a year, if it ever happens.
IronPython did a great job getting pretty close to 100% compatible. But Jim Hugunin had implemented an almost 100% Python-compatible implementation in Jython before going to Microsoft; he didn't need to look at the Python source. I don't believe John Lam has the same level of experience with Ruby, so he's at a severe disadvantage (and to John: I really feel for you, man...this has got to be difficult).
This is a good friend's belief, but he's won me over: we don't believe Microsoft would ever willingly allow IronRuby to get to the point of running Rails, since that would directly compete with their ASP.NET server, software, and tool offerings. What would be the benefit to them of a free runtime running a free language implementation that runs a free web framework? Probably zero. And as Martin Fowler and others have blogged about NUnit, Microsoft hasn't exactly been lovey-dovey with OSS projects that impact (or are perceived to impact) their bottom line.
Given these facts and the current situation, I'd say it's a better bet for us as a community to get behind the Gardens Point Ruby.NET Compiler project, which is already much farther along than IronRuby...plus it's real open source (you can contribute) and they can look at Ruby's source (and have admitted to doing so for at least the parser). I was wary/skeptical of Ruby.NET last year, but they now seem like the current best hope for Ruby on the CLR. That link again is Gardens Point Ruby.NET Compiler. And to the QUT guys working on Ruby.NET, you need to do three things right now to save your project from historical obscurity: set up a public source repository, accept a few external contributors, and start blogging and emailing up a freaking storm.

That about sums up what I have to say about IronRuby, Microsoft, and the future of Ruby on the CLR. The rest of Ola's points I agree with...diversity is important, a spec and test kit are needed immediately, and Microsoft needs to change their OSS attitude pretty quick or risk becoming irrelevant.

Sunday, June 03, 2007

JRuby 1.0.0RC3 Released - And This Is It!

Tom posted the announcements already, but JRuby 1.0.0RC3 is out in the wild! This release is our most important yet, because we intend for this release to become JRuby 1.0. The only things that will change from now until a 1.0 final release later this week would be any showstopping bugs that are extremely low-impact to fix. In general, RC3 should be nearly identical to 1.0 final.

So what does this mean for JRuby? We've taken the approach of calling the JRuby 1.0 release the "Ruby compatible" release. Basically, all known application bugs caused by JRuby incompatibilities with Matz's Ruby (MRI) have been resolved. This doesn't mean there aren't more compatibility bugs; there probably always will be, and we have probably 50-80 in the bug repository right now (and another 80 bugs or enhancements that are JRuby-specific). But it does mean that most applications should "just work" out of the box, and any application failures we knew of have been resolved. In general, the remaining post 1.0 bugs are edge cases and minor correctness issues, like obscure forms of core methods or missing error conditions.

We've tried very hard to make this both a solid release and a stepping stone to JRuby's future. With this release, we can confidently say that we're Ruby 1.8.5 compatible. Any additional fixes we need to make (like the known issues) are in the last 1% of Ruby features we can potentially support. We also recognize that this is a big moment for JRuby. People can start counting on JRuby to run their Ruby applications correctly. Of course many in the JRuby community have already been doing this for many months, but the 1.0 moniker says to the world we feel like we're ready for prime time.

So what comes after 1.0? That breaks down into a few areas:

Performance. JRuby's performance has come a long way in the past year. We've increased speed by at least an order of magnitude, and have enabled the JIT compiler for the 1.0 release. This means that for many cases where JRuby can compile Ruby code, it will perform faster than Ruby 1.8.5. The general case is still a little muddy, though. We've had anecdotal reports that JRuby on Rails in a Java application server performance extremely well, and other reports that general Ruby applications perform somewhat poorly. The general performance situation is not well-understood, but we all agree there's a lot more work to be done. And the good news is that we have a big list of optimizations remaining to continue improving JRuby's specific and general performance.
Java Integration. JRuby does an excellent job of fitting into the Java platform. In almost all cases, you can call libraries, implement interfaces, and extend classes with no difficulty. But there are edge cases--usually nonstandard or antipattern Java--that JRuby doesn't behave as nicely. The code to enable calling Java is also far more complicated than we'd like. To solve both issues, we'll be taking the existing Java integration syntax and API and backing it with a redesigned library. Expect to see something of this work in a 1.1 release later this year. Our goal is to achieve excellent integration with Java, on par with the most tightly-integrated JVM languages available today. And we're not far off.
Ruby 2.0 and Rubinius. We intend to start supporting Ruby 2.0 and Rubinius bytecode execution soon, though the Ruby 2.0 work is much farther along. We intend to catch up the Rubinius work as well as to try adopting some of Rubinius's pure Ruby implementations of core class functionality. We also plan to enable the use of Ruby 2.0 features through configuration and command-line switches to JRuby. Specify something like -J-Djruby.string.version=2 and all strings in the system will be treated as Ruby 2.0 strings, with full character semantics. Or optionally turn on experimental Ruby 2.0 syntax for lambdas and named parameters. Basically we want to bring JRuby up to the bleeding edge of Ruby features, to provide another platform where people can get a feel for what Ruby 2.0 might look like. More exposure for these features will help Matz and Co. decide the best way to move forward. And that's good for everyone.

And of course we'll continue fixing compatibility bugs, Java integration bugs, and expanding JRuby's reach on the Java platforms with more DSLs and API wrappers around all your favorite Java APIs.

But there's one more area I should talk about: how you can get involved.

JRuby is a community project. The core committers play traffic cops as much as developers, routing patches, examining bugs, documenting features. The only way a project like JRuby succeeds is through mass community involvement. We've been extremely lucky to have a constant flow of interested developers into our community, even when OSS community attrition has been very high. The community may look completely different from month to month, but the flow of patches and bugs has steadily increased.

For folks just getting started, I've written a few articles on the JRuby wiki that should help you understand the development process. They're linked from the front page under "Getting Involved".

For easy bugs, I'd recommend looking in the JRuby JIRA for bugs with "rubinius" or bugs reported by Daniel Berger. The Rubinius bugs are almost all problems we've had running Rubinius's excellent specs, and usually represent minor incompatibilities with C Ruby. Daniel's tests are a lot of those edge cases I mentioned...he's running through his own test suite and reporting any incorrect behavior that's come up. Another option would be to just pick your favorite Ruby app and start running its test cases with JRuby. You're sure to find something interesting to report, and it may be an easy fix.

For non-Java-coders (or folks afraid of hacking on JRuby), stop by the RubySpec Wiki and update or author an article. RubySpec is an effort to build a community-driven specification of Ruby that all users and implementers can freely reference. It is linked from the RubyDoc site and is fast becoming a standard way for the community to record language and library behaviors. I believe this is the best and fastest way for us to form a complete specification of Ruby's behavior...and I believe such a specification is becoming extremely important, what with there now being 5-10 different implementations of Ruby, all guessing at what "correct" is. Have you written RubySpec today?

Well that's about it for this release report. If all goes as well as I think it will, JRuby 1.0 final should be released some time this week. Give it a try, I think you'll be pleasantly surprised!

Friday, June 01, 2007

Creating a Field-Initializing 'new' Method

One thing often touted as a missing feature in Ruby is the lack of a constructor form that initializes fields. A few other languages have this feature, including for example Groovy, another JVM dynamic language. The general idea is that if you want to construct an object and initialize a number of fields, you often want to do it in one shot. Rather than modify the class to have additional initializers for all the fields you want to set, there's another option.

Because Ruby is so cool, you can add this feature yourself to all classes at the same time.


class Class
  def new!(*args, &block)
    # make sure we have arguments
    if args && args.size > 0
      # if it's not a Hash, perform a normal "new"
      return new(*args, &block) unless Hash === args[-1]

      # grab the last arg in the list
      last_arg = args.pop

      # make sure all fields actually exist
      last_arg.each_key {|key|
        unless public_instance_methods.include?("#{key}=") do
          raise ArgumentError.new(
                           "No attr setter for name: #{key}") 
        end
      }

      # create the object and set its fields
      new_obj = new(*args, &block)
      last_arg.each {|key, value|
        new_obj.send "#{key}=", value
      }
    else
      # no args, just do a normal "new" with any block passed
      new_obj = new(&block)
    end
    new_obj
  end
end

So with such a simple piece of code, we now have a new! method on all classes that accepts a final parameter--a hash of field names and values--that can be given using Ruby's named-parameter-like syntax. Given a simple class, like the following:


class MyObject
  attr_accessor :foo
  attr_accessor :bar
  
  def initialize(msg)
    puts msg
  end
end

No additional work is needed to use our new! method:


x = MyObject.new!("yippee",
                    :foo => "hello", :bar => "goodbye") 
 => "yippee"
p [x.foo, x.bar]
 => ["hello", "goodbye"]
y = MyObject.new!("blah", :yuck => "baz")
 => error: "No attr setter for name: yuck"

The reason this works is that all classes are instances of the Class class. So the MyObject class definition above is roughly equivalent to saying:


MyObject = Class.new {
  # class def logic here
}

This means that instances of Class, like MyObject, inherit methods defined on Class, like new!. Since all classes in the system are Class objects, all classes instantly gain a new! method.

This is a perfect example of why Ruby is such a powerful language, and why it's so easy in Ruby to use the coolest metaprogramming tricks. And it's a primary reason why frameworks like Rails have been able to do such amazing things. With a language that's this powerful and this easy, you can imagine what else is possible.

Are we having fun yet?

Sunday, May 27, 2007

Adding Annotations to JRuby Using Ruby

I love how meta-programmable Ruby is.

JRuby doesn't support annotations because Ruby doesn't support annotations. So what! We can extend Ruby to add something like annotations:


class JPABean
def self.inherited(clazz)
  @@annotations = {}
end
def self.anno(annotation)
  @@last_annotation = annotation
end

def self.method_added(sym)
  @@annotations[sym] = @@last_annotation
end

def self.get_annotation(sym)
  return @@annotations[sym]
end
end

Given this, we can do things like the following:


class MyBean :sql => "SELECT * FROM stuff WHERE name = 'foo'"
def foo; end

anno :field => :hello_id
attr :hello

anno :field => :bar_in_table
attr_accessor :bar
end

And here's some output from the resulting class:


p MyBean.get_annotation(:foo)
p MyBean.get_annotation(:hello)
p MyBean.get_annotation(:bar=)

=>

{:sql=>"SELECT * FROM stuff WHERE name = 'foo'"}
{:field=>:hello_id}
{:field=>:bar_in_table}

So as you can see we really do have all the necessary requirements to annotate classes. Now what if we just had MyBean be a java.lang.Object extension and stuffed the annotations into the resulting generated class? We can already create real Java classes in this way, but with the above syntax and a little magic in the JRuby Java integration later, we've got annotation support in on Java classes too. This should enable things like Hibernate, JPA, and JUnit 4 to work with JRuby's Ruby-based classes. Or at least, I believe it to be possible. It just requires a little work to add annotation information to the resulting generated classes.

I've planted the seed here and on the JRuby dev list...let's see if it germinates.

JVM Languages: The Future

I've started a Google Group for all those interested in the future of alternative languages (i.e. not Java) on the JVM. A number of you knew this was coming, and I've already invited folks that expressed an interest in my RedMonk Unconference session at JavaOne. The rest of you are also invited.

Please plan to talk shop about language implementation strategies, pain points on the JVM, and what we can do to build a common set of libraries, frameworks, and patterns to ease and improve the Java platform's support for many languages. The time for action is now, my friends, and we have a wealth of information across all our language projects to draw from. We must come together to build a stronger base for everyone to stand upon.

I welcome you to the future.

Saturday, May 26, 2007

Foo Camp 2007

I've been invited to this year's Foo Camp. Since it sounds like a great time and there's a bunch of other folks going I'd like to talk to, I'm accepting the invitation. My purpose posting this entry is to let other campers know I'll be there and try to hook up and chat a bit beforehand. So drop me a line...firstname.lastname@sun.com.

FYI, if any other campers are coming up from Oakland, I'm renting a car. My times are listed on the "Rides Offered" page.

I'm looking forward to a fun weekend!

Tuesday, May 22, 2007

The Final Bugs

We're within weeks of a final JRuby 1.0. We've pared down the bugs that we think we can or must fix for 1.0 and this email is a summary of the ones we need help with. So whatever time you can spare, please have a look and help resolve these.

In order of decreasing priority in JIRA:

JRUBY-820: Net::HTTP.get behaves differently form MRI, failing to get UTF8 properly
-and-
JRUBY-828: UTF-8 regular expressions aren't working

I believe these are both largely the same problem, and it's probably the most visible remaining bug we need to fix. Regular expressions in JRuby currently do not work well with unicode strings, both as incoming match strings and as the regular expression strings themselves. REXML uses /u regular expressions, which is why I believe these issues are closely related. Wes Nakamura has commented that he's looking into 828, but more eyes will help ensure this is fixed.
I believe these are true blocking bugs for 1.0, and I'm not comfortable doing a release if they are not resolved.

JRUBY-971: Make it clear which Java method maps to each equality method

I think we have subtle equality bugs remaining largely because we don't use a consistent naming convention for the Java implementations of Ruby methods like ==, ===, eql? and so on. We should make a quick sweep through the system and make sure all core JRuby code is using the same Java method names for all of these, and binding them in the same way. Note: This is actually the cause of some set_trace_func bugs, since the default === impl should actually invoke ==, causing trace events for both.

JRUBY-969: Add rake and rspec to standard JRuby distribution for 1.0

We have decided to ship both Rake and RSpec with JRuby 1.0, since they are both well-accepted and widely used (moreso for Rake, but RSpec has gained a lot of acceptance the past year). But the mechanism of including them is unclear. I do not want to commit installed gems to SVN; I would prefer to just install them as needed for testing and when building the JRuby distribution. Thoughts?

JRUBY-672: java.lang.Class representation of Ruby class not retrievable

This bug is basically just looking for a way to get at the actual java.lang.Class representing a Ruby-based extension of a Java class. There may have been an API added with Bill's work, or this could be simple to add.

JRUBY-914: JRuby's BigDecimal needs to round

Stu submitted a patch for this, but it unfortunately depends on behavior only present in Java 5 and higher. Since most of us are pretty unfamiliar with BigDecimal, we don't know of a good workaround to support it correctly in Java 1.4.

JRUBY-822: jruby fails to report process exit status correctly

I think Nick may have a grip on this one, but if anyone can offer suggestions as to why the exit codes are apparently shifted in this way, please let us know.

JRUBY-966: Various issues with LocalJumpError not being created early enough
-and-
JRUBY-767: next in an eval should produce a local jump error

These are likely going to be up to Tom and I since they involve pretty deep runtime/interpreter work. But we're also looking for any remaining known issues with LocalJumpError/JumpException that are still out there. I believe we've fixed all such externally-reported issues at the moment.

JRUBY-888: Make regular distro/build create the all-in-one jar by default

We waffle back and forth on this. What's the answer? An all-in-one jar by default, or on request?

JRUBY-873: ant test thread tests sometimes run forever

I managed to get a trace on this one. It appears that some of the tests cause a deadlock between stopping a thread and other operations. But I don't believe anyone's seen it in normal execution yet (or at least nobody's reported it). I'm probably the only one familiar enough with the threading subsystem to fix it, but I'd like to know if anyone's seen other issues.

JRUBY-98: allow "/" as absolute path in Windows
-and-
JRUBY-36: Dir['...'] incompatibilities between Ruby and accross platforms
-and-
JRUBY-61: IO CRLF compatibility with cruby on Windows
-and-
JRUBY-644: Windows line delimiter (\r\n) cause position errors

These blasted Windows pathing and newline bugs just seem to hang on forever. Nobody wants to fix them. There must be someone out there that can help resolve these, yes?

JRUBY-884: Create and consolidate extension and non-standard options

There are a number of configuration options for JRuby. Some have their own flags, like -O to disable ObjectSpace and -C to force compilation of a target script. But many others are only configurable via Java system properties. What we really could use from the community is some idea which settings are worth adding special flags for. We can handle the rest. My personal opinion is that the new -J flag that allows passing flags to the JVM is actually enough.

By my estimation, the rest of the bugs are either almost done or are trivial enough they could be punted to a post 1.0 release. But if we can get these issues knocked down soon (starting with those top couple), 1.0 is going to be an extremely solid release.

Monday, May 14, 2007

Big Plans

I just came across this amusing entry from an old, now-defunct blog of mine on LiveJournal.

The last line pretty much says it all. I must have been really frustrated with my job at the time:

Someone should pay me to sit at home and do great things.

(If you don't get the joke: I now work for Sun Microsystems, at home, on JRuby...so if JRuby's ever considered a "great thing", the prophecy will be fulfilled.)

And yes, I've seen the Microsoft news. I'd hate to be an OSS developer or apologist at Microsoft today. If Sun did something like this I'd resign.

Thursday, May 10, 2007

JSR-292 Summit Wrap-Up

Today Tom and I met with fellow language implementers Oti Humbel, Charlie Groves, and Tobias Ivarsson to start discussions about JSR-292 (invokedynamic) with John Rose, longtime VM implementer and new spec lead. It was an excellent discussion, with John listening patiently to the implementation challenges we've had on JRuby and Jython and offering suggestions. He took copious notes and posted them here.

We have the second edition of the JRuby on Rails talk tomorrow morning, so I'm signing out now. But check out John's post, it's a great first look at a promising start on 292 work.

Monday, May 07, 2007

"the solution is JRuby"

Big news today from ThoughtWorks Studios! Their collaborative development project management solution Mingle (think SourceForge, but pretty and usable) will be the first commercially-distributed Rails application to run on JRuby. Jon Tirsen summarizes the justification for this move in a Studios blog post.

The reasons are obvious ones to those of us who've poured our hearts and souls into making JRuby succeed, but they're excellent points to call out for the Ruby community at large. Mingle 1.0 will be released running with a Jetty web front-end and a Derby database back-end. Mingle 1.1 will be released as a WAR file (Rails-in-a-WAR courtesy of GoldSpike) targeting any app server and any database.

This is a big day for JRuby. ThoughtWorks is one of the most respected names in the Ruby world, and we're proud that they've chosen JRuby as their enterprise deployment solution. We're happy to have ThoughtWorkers like Ola Bini on the team, and look forward to collaborating with ThoughtWorks to continue improving JRuby into the future.

Wednesday, May 02, 2007

JRuby: The T-Shirt!

Yes, I've managed to secure a small order of JRuby t-shirts for giveaway at JavaOne and RailsConf. And they're pretty sweet, have a look at the logo on the front:

And the back will have in white the words:

include Java

How cool is that? The logo was designed and contributed by "Liz" (Liz, let me know if you want me to post contact info or anything...great work) and arranged by Damian Steer, longtime JRuby contributor. I handled the legwork of arranging the shirts, and managed to secure a little funding to get them paid for.

So! How does one get one of these slicko t-shirts? Well, the first round of giveaways goes to folks who've made contributions in the past year. And by contributions I mean either patches or some serious research and legwork on a bug. Obviously there's a core group of folks that should automatically get shirts, so contact me about it if you think you qualify and I'll try to set one aside.

Otherwise...You've got the next week or two to fix a bug, post a patch, and otherwise make yourself useful to the JRuby project. I'm gonna be pretty strict on this...if you want the goods, you've got to help out. May is 1.0 cleanup time, with an RC coming out soon and hopefully a final not too long after. So if you'd been considering helping out, now's the time!

Visit the JRuby homepage, check out the JRuby bug tracker, and grab JRuby source. The details of contributing are linked from the front page of the JRuby wiki.

Update: Oh yeah...how will you get the shirts. The easiest way will be to hunt down Tom and I at the JRuby booth (in Sun's section) between 1PM and 3PM every day. We'll verify that you're not a slacker and make the arrangements. We're looking forward to meeting contributors in person. Otherwise, email me and we'll try to make other arrangements.

For RailsConf, I'll just have the shirts at the hotel, and we'll figure something out. Maybe I'll just bring the box to our training and session there.