Headius

Thursday, September 27, 2007

The Compiler Is Complete

It is a glorious day in JRuby-land, for the compiler is now complete.

Tom and I have been traveling in Europe the past two weeks, first for RailsConf EU in Berlin and currently in Århus, Denmark for JAOO (which was an excellent conference, I highly recommend it). And usually, that would mean getting very little work done...but time is short, and I've been putting in full days for almost the entire trip.

Let's recap my compiler-related commits made on the road:

r4330, Sep 16: ClassVarDeclNode and RescueNode compiling; all tests pass...we're in the home stretch now!
r4339, Sep 17: Fix for return in rescue not bubbling all the way out of the synthetic method generated to wrap it.
r4341, Sep 17: Adding super compilation, disabled until the whole findImplementers thing is tidied up in the generated code.
r4342, Sep 17: Enable super compilation! I had to disable an assertion to do it, but it doesn't seem to hurt things and I have a big fixme on it.
r4355, Sep 19: zsuper is getting closer, but still not enabled.
r4362, Sep 20: Enabled a number of additional flow-control syntax within ensure bodies, and def within blocks.
r4363, Sep 20: Re-disabling break in ensure; it caused problems for Rails and needs to be investigated in more depth.
r4367, Sep 21: Removing the overhead of constructing ISourcePosition objects for every line in the compiler; I moved construction to be a one-time cost and perf numbers went back to where they were before.
r4368, Sep 21: Small optz for literal string perf and memory use: cache a single bytelist per literal in the compiled class and share it across all literal strings constructed.
r4370, Sep 22: Enable compilation of multiple assignment with args (rest args)
r4375, Sep 24: Total refactoring of zsuper argument processing, and zsuper is now enabled in the compiler. We still need more/better tests and specs for zsuper, unfortunately.
r4377, Sep 24: Compile the remaining part of case/when, for when *whatever (appears as nested when nodes in the ast...why?)
r4388, Sep 25: Add compilation of global and constant assignment in masgn/block args
r4392, Sep 25: Compilation of break within ensurified sections; basically just do a normal breakjump instead of Java jumps
r4400, Sep 25: Fix for JRUBY-1388, plus an additional fix where it wasn't scoping constants in the right module.
r4401, Sep 25: Retry compilation!
r4402, Sep 26: Multiple additional cleanups, fixes, to the compiler; expand stack-based methods to include those with opt/rest/block args, fix a problem with redo/next in ensure/rescue; fix an issue in the ASTInspector not inspecting opt arg values; shrink the generated bytecode by offloading to CompilerHelpers in a few places. Ruby stdlib now compiles completely. Yay!
r4404, Sep 26: Add ASM "CheckClass" adapter to command-line (class file dumping) part of compiler.
r4405, Sep 26: A few additional fixes for rescue method names and reduced size for the pre-allocated calladapters, strings, and positions.
r4410, Sep 27: A number of additional fixes for the compiler to remedy inconsistent stack issues, and a whole slew of work to make apps run correctly with AOT-compiled stdlib. Very close to "complete" in my eyes.
r4412, Sep 27: Fixes to top-level scoping for AOT-compiled methods, loading sequence, and some minor compiler tweaks to make rubygems start up and run correctly with AOT-compiled stdlib.
r4413, Sep 27: Fixed the last known bug in the compiler. It is now complete.
r4414, Sep 27: Ok, now the compiler is REALLY complete. I forgot about BEGIN and END nodes. The only remaining node that doesn't compile is OptN, whichwe won't put in the compiled output (we'll just wrap execution of scripts with the appropriate logic). It's a good day to be alive!

I think I've done a decent job proving you can get some serious work done on the road, even while preparing two talks and hob-nobbing with fellow geeks. But of course this is an enormous milestone for JRuby in general.

For the first time ever, there is a complete, fully-functional Ruby 1.8 compiler. There have been other compilers announced that were able to handle all Ruby syntax, and perhaps even compile the entire standard library. But they have never gotten to what in my eyes is really "complete": being able to dump the stdlib .rb files and continue running nontrivial applications like IRB or RubyGems. I think I'm allowed to be a little proud of that accomplishment. JRuby has the first complete and functional 1.8-semantics compiler. That's pretty cool.

What's even more cool is that this has all been accomplished while keeping a fully-functional interpreter working in concert. We've even made great strides in speeding up interpreted mode to almost as fast as the C implementation of Ruby 1.8, and we still have more ideas. So for the first time, there's a mixed-mode Ruby runtime that can run interpreted, compiled, or both at the same time. Doubly cool. This also means that we don't have to pay a massive compilation cost for 'eval' and friends, and that we can be deployed in a security-restricted environment where runtime code-generation is forbidden.

I will try to prepare a document soon about the design of the compiler, the decisions made, and what the future holds. But for now, I have at least one teaser for you to chew on: there is a second compiler in the works, this time for creating real Java classes you can construct and invoke directly from Java-land. Yes, you heard me.

Compiler #2

Compiler #2 will basically take a Ruby class in a given file (or multiple Ruby classes, if you so choose) and generate a normal Java type. This type will look and feel like any other Java class:

You can instantiate it with a normal new MyClass(arg1, arg2) from Java code
You can invoke all its methods with normal Java invocations
You can extend it with your own Java classes

The basic idea behind this compiler is to take all the visible signatures in a Ruby class definition, as seen during a quick walk through the code, and turn them into Java signatures on a normal class. Behind the scenes, those signatures will just dynamically invoke the named method, passing arguments through as normal. So for example, a piece of Ruby code like this:

class MyClass
 def initialize(arg1, arg2); end
 def method1(arg1); end
 def method2(arg1, arg2 = 'foo', *arg3); end
end

Might produce a Java class equivalent to this:

public class MyClass extends RubyObject {
   public MyClass(Object arg1, Object arg2) {
       callMethod("initialize", arg1, arg2);
   }

   public Object method1(Object arg1) {
       return callMethod("method1", arg1);
   }

   public Object method2(Object arg1, Object... optAndRest) {
       return callMethod("method2", arg1, optAndRest);
   }
}

It's a pretty trivial amount of code to generate, but it completes that "last mile" of Java integration, being directly callable from Java and directly integrated into Java type hierarchies. Triply cool?

Of course the use of Object everywhere is somewhat less than ideal, so I've been thinking through implementation-independent ways to specify signatures for Ruby methods. The requirement in my mind is that the same code can run in JRuby and any other Ruby without modification, but in JRuby it will gain additional static type signatures for calls from Java. The syntax I'm kicking around right now looks something like this:

class MyClass
...
 {String => [Integer, Array]}
 def mymethod(num, ary); end
end

If you're unfamiliar with it, this is basically just a literal hash syntax. The return type, String, is associated with the types of the two method arguments, Integer and Array. In any normal Ruby implementation, this line would be executed, a hash constructed, and execution would proceed with the hash likely getting quickly garbage collected. However Compiler #2 would encounter these lines in the class body and use them to create method signatures like this:

    public String mymethod(int num, List ary) {
       ...
   }

The final syntax is of course open for debate, but I can assure you this compiler will be far easier to write than the general compiler. It may not be complete before JRuby 1.1 in November, but it won't take long.

So there you have it, friends. Our work on JRuby has shown that it is possible to fully compile Ruby code for a general-purpose VM, and even that Ruby can be made to integrate as a first-class citizen on the Java platform, fitting in wherever Java code may be used today.

Are you as excited as I am?

Wednesday, September 19, 2007

Are Authors Technological Poseurs?

Recently, the JRuby team has gone through the motions of getting a definitive JRuby book underway. We've talked through outlines, some some chapter assignments, and discussed the overall feel of a book and how it should progress. I believe one or two of us may have started writing. However the entire exercise has made one thing abundantly clear to me:

Good authors do not have time to be good developers.

Think of your favorite technical author, perhaps one of the more insightful, or the one who takes the most care in their authoring craft. Now tell me one serious, nontrivial contribution they've made in the form of real code. It's hard, isn't it?

Of course I don't intend to paint with too wide a brush. There are, without a doubt, good authors that manage to keep a balance between words and code. But I'm increasingly of the opinion that it's not practical or perhaps even possible to maintain a serious dedication to both writing and coding.

What brings me to this conclusion is the growing realization that working on a JRuby book would--for me--mean a good bit less time spent working on JRuby and related projects. I fully intend to make a large contribution to the eventual JRuby book, but JRuby as a passion has meant most of my waking hours are spent working on real, difficult problems (and in some cases, really difficult problems). It physically pains me to consider taking time away from that work.

And I do not believe it's from a lack of interest in writing. I have long wanted to write a book, and as most of my blog posts should attest, I love putting my thoughts and ideas into a written form. I enjoy crafting English prose almost as much as I enjoy crafting excellent code. But at the end of the day, I am still a coder, and that is where my heart lies. I suspect I am not alone.

When I decided to write this post, I tried to think of concrete examples. Though many came to mind, I could not think of a way to name names without seeming malicious or disrespectful. So I leave it as an exercise for the reader. Am I totally off base? Or is there a direct correlation between the quality and breadth of an author's work and a suspicious (or obvious) lack of real, concrete development?

Friday, September 14, 2007

The End Is Near For Mongrel

At conferences and online, Tom and I have long been talking about a mystery deployment option coming soon from the GlassFish team. It would combine the agile command-line-friendly model of Mongrel with the power and simplicity of deploying to a Java application server. We have shown a few quick demos, but they have only been very simple and very limited. Plus there was no public release we could give people.

Until Now.

Today, you can finally download the GlassFish-Rails preview release gem.

So what is this tasty little morsel? Well, it's a 2.9MB Ruby gem containing the GlassFish server and the Grizzly connector for JRuby on Rails. It installs a "glassfish_rails" script in JRuby's bin directory, and you're done.

Witness!

~ $ gem install glassfish-gem-10.0-SNAPSHOT.gem
Successfully installed GlassFish, version 10.0.0
~ $ glassfish_rails testapp
Sep 14, 2007 3:00:45 PM com.sun.enterprise.v3.services.impl.GrizzlyAdapter postConstruct
INFO: Listening on port 8080
Sep 14, 2007 3:00:46 PM com.sun.enterprise.v3.services.impl.DeploymentService postConstruct
INFO: Supported containers : phobos,php,web,jruby
Sep 14, 2007 3:00:46 PM com.sun.grizzly.standalone.StaticResourcesAdapter <init>
INFO: New Servicing page from: /Users/headius/testapp/public
/Users/headius/NetBeansProjects/jruby/lib/ruby/gems/1.8/gems/actionmailer-1.3.3/lib/action_mailer.rb:50 warning: already initialized constant MAX_LINE_LEN
Sep 14, 2007 3:00:53 PM com.sun.enterprise.v3.server.AppServerStartup run
INFO: Glassfish v3 started in 9233 ms

That's all there is to it...you've got a production-ready server. Oh, did I mention you only have to run one instance? No more managing a dozen mongrel processes, ensuring they stay running, starting and stopping them all. One command, one process.

Of course this is a preview...we expect to see bug reports and find issues with it. For example, it currently deploys under a context rather than at the root of the server, so my app above would be available at http://localhost:8080/testapp instead of http://localhost:8080/. That's going to be fixed soon (and configurable) but for now you'll want to set the following in environment.rb:

ActionController::AbstractRequest.relative_url_root = "/<app name>/"
ActionController::CgiRequest.relative_url_root = "/<app name>/"

And of course, you're going to be running JRuby, so you'll need to take that into consideration. JRuby's general Rails performance still needs more tweaking and work to surpass Mongrel + Ruby, but out of the box you already get stellar static-file performance with the GlassFish gem...something like 2500req/s for the testapp index page on my system. The remaining JRuby performance is continuing to improve as well...we'll get there soon.

So! Give it a try, report bugs on the GlassFish issue tracker, and let us know on the GlassFish mailing lists what you'd like to see improved.

Mongrel...your days are numbered.

Thursday, September 13, 2007

How Easy Is It To Contribute To JRuby?

Answer: Very Easy!

Get the Code

svn co http://svn.codehaus.org/jruby/trunk/jruby
cd jruby
ant

Run Your Built JRuby

export PATH=$PATH:$PWD/bin
jruby my_script.rb

Install Gems

gem install somegem

jruby -S gem install somegem

Run the Benchmarks

jruby -J-server -O test/bench/bench_method_dispatch.rb

(-J-server uses the "server" JVM and -O disables ObjectSpace)

Build and Test Your Changes

ant clean test

Look For Bugs To Fix or Report Your Own

JRuby JIRA (bug tracker)

Join the Mailing Lists

JRuby Mailing Lists

Join Us on IRC

Channel #jruby on FreeNode IRC

We're waiting to hear from you!

Wednesday, September 12, 2007

InfoWorld Bossies Close to my Heart

A couple interesting "Bossies" were awarded by InfoWorld this week:

Best Open Source Programming Language - Summary: Ruby gets mad props for a vibrant community and a diverse range of implementations (e.g. JRuby), and then they go squishy and say "and these other languages are great too!"
Best Open Source IDE - InfoWorld heaps praise on NetBeans largely because it has *not* gone the way of an amorphous, all-encompassing "platform" and has continued to take risks to improve the overall experience of the IDE. Before the work on NetBeans 6 (don't download M10; use a daily build) I was a nay-sayer myself; having used NetBeans 6 for many months now, I can honestly say it's far better than NetBeans 5.5 and has caught up or passed Eclipse in many ways. There's more to do, but I'm extremely impressed with the progress.

I try not to be too much of a Sun marketing shill, but both Bossies are pretty close to my heart. JRuby has been my passion for three years now, and NetBeans has taken a serious about face with greatly improved Java support and best-in-class Ruby support. It's good to see some recognition for hard work.

Tuesday, September 11, 2007

JRuby Compiler Update, and a Nice Peformance Milestone

Hello again friends! It's time to update you on the status of the JRuby compiler.

Compiler Status

I've been working feverishly for the past several weeks to get the rest of the compiler complete. Currently, it's able to handle the majority of Ruby syntax. Here's a list of the remaining language features that do not compile:

"rescue" blocks; exception handling in Ruby is rather complicated, and there's some particularly odd uses of rescue that will be a bit tricky to support with normal Java exception-handling.
"class var declaration" is not yet supported. This is when you declare a class variable (@@foo) from within the body of a class or module. This primarily affects compiling class bodies, so although it prevents AOT compilation of some scripts, it doesn't usually affect individual methods.
"opt n" execution. This is specifying "-n" to the Ruby runtime, and it loops the provided script as though it were surrounded by "while gets(); ... end". It's useful for line-by-line processing of stdin.
"post execution" blocks. Post exe blocks are when you specify an END { ... } block somewhere in your script. These blocks are saved up and executed at the end of the script execution, regardless of where they appear in the script. They're a bit like Kernel#at_exit blocks.
"retry". Tell me friends, do you know what "retry" actually does? Retry is used within a block/closure, and it causes the method containing the closure to be re-called anew. And as an interesting quirk, the original arguments to the method are re-evaluated, so if you call foo(bar()) and a retry is triggered within foo(), bar() will get invoked again for the retried call to foo(). Weird, eh? Update: I didn't explain this well. Here's another attempt: if you have the following code:
```
def foo(x = bar()); 1.times {retry}; end
```
And you call foo with no arguments, allowing the default argument logic to fire, retry will cause that logic to fire again and again. It's essentially re-entering the method anew with the original arguments, but causing *argument processing* to be revisited. I'm not sure why you'd want this behavior, since it could frequently result in default arguments to re-call methods that might only be valid the first time.
Some non-local flow control is not yet complete. Non-local flow control happens any time you return, break, or next from within a block (when not immediately inside a normal loop construct). Much of non-local flow control is working, but I need to flush out any remaining cases that aren't running correctly.

It's a pretty short list, eh? Obviously "rescue" is the biggest and trickiest item here. Without exception handling, it's hard to say the compiler is near completion. The complications I mentioned involve the ability to embed rescue processing into arbitrary expressions. Here's a good example:

a = [1, 2, (begin; raise; rescue; 3; end)]

When this code is compiled, it turns into a local variable assignment. The value assigned is a literal array construction with three elements: a Fixnum 1, a Fixnum 2, and a rescued block of code. The typical way to construct the array then is to follow these steps:

Construct an array of the appropriate size
Dup the array reference
Push a constant integer zero
Push Fixnum 1
Insert Fixnum 1 at index zero in the array. This consumes the dup'ed array, the index, and the Fixnum1.
Dup the array reference again
Push a constant integer one
Push Fixnum 2
Insert Fixnum 2
Dup the array reference again
Push a constant integer two
Now it gets complicated; we must recurse in the compiler to handle the rescue block
The rescue block is compiled and a "raise" is triggered in the code
The exception raised is handled, resulting in the whole rescue leaving a Fixnum 3 on the stack
Insert the Fixnum 3
Construct a RubyArray object with the remaining object array

Now that seems simple enough. However there's a sneaky complication at steps 13 and 14: catching an exception clears the operand stack, and the original created array, its duplicated reference, and the integer two disappear as a result. The value "returned" from the rescue section therefore has nowhere to go.

We will likely have to solve this complication in one of two ways:

We could save off the stack when entering code that might trigger exception handling
We could put exception-handling logic in a separate method and invoke it in-place, thereby protecting our executing stack from clearage.

It remains to be seen which mechanism will work out to be simplest to compile and most performant.

A Nice Performance Milestone

And on the topic of performance, the recent compiler work has allowed us to reach a new milestone: we now exceed Ruby 1.8.6's performance on M. Edward (Ed) Borasky's MatrixBenchmark.

Some months back, after the Mountain West RubyConf in Salt Lake City, Ed posted an interesting blog entry where he professed a lot of confidence in JRuby's future. We emailed a bit offline, and he pointed me to this matrix benchmark he'd been using to measure the relative performance of Ruby 1.8.6 and Ruby 1.9 (YARV). I told him I'd give it a try.

Originally, we were perhaps 50% to 100% slower than Ruby 1.8.6. This was back when hardly anything was compiling, and there had been few serious efforts to optimize the JRuby runtime. Performance slowly crept up as time went on. But as recent as a week ago, JRuby performance was still roughly 20-25% slower than 1.8.6.

So last week, I dug into it a bit more. I turned on JRuby's JIT logging (-J-Djruby.jit.logging=true) and verbose logging (-J-Djruby.jit.logging.verbose=true) to log compiling and non-compiling methods, respectively. As it turned out, the "inverse_from" method in matrix.rb was not yet compiling...and it was where the bulk of MatrixBenchmark's work was happening.

The final sticking point in the compiler for this method was "operator element assignment" syntax, or basically anything that looks like a[0] += 5. It's a little involved to compile; you have to retrieve the element, calculate the value, call the operator method, and reassign all in one operation. For the ||= or &&= versions, you have to perform a boolean check against the element to see if you should proceed to the assignment. A good bit of compiler code, but it had to be done.

So then, with "OpElementAsgn" compiling, it was time to re-run the numbers. And finally, finally, we were comfortably exceeding Ruby 1.8.6 performance:

Ruby 1.8.6:
Hilbert matrix of dimension 128 times its inverse = identity? true
586.110000   5.710000 591.820000 (781.251569)

JRuby trunk, Java 6 server, ObjectSpace disabled:
Hilbert matrix of dimension 128 times its inverse = identity? true
372.950000   0.000000 372.950000 (372.950000)

Or should I say vastly exceeding? By my calculation this is an easy 2x performance increase, and perhaps a 70% improvement just by getting this one extra method to compile.

On Beyond Zebra

I believe we're pretty well on-target to have the compiler completed by RubyConf in November. I'm about to embark on a refactoring adventure to prepare for the stack-juggling I'll have to do to support rescue blocks. That will mean minimal progress on adding to the compiler until the end of the month, but ideally the refactoring will make it easy to get rescue compilation complete. The others are just a matter of spending some time.

Once the JRuby compiler is complete, we will start testing in earnest against a fully pre-compiled Ruby stdlib. Along with that, we'll wire in support for pre-compiling RubyGems as they install and pre-compiling Ruby scripts as they are executed and loaded. Much of this works already in prototype form, but it waits for the completion of the compiler to go into general use.

I also have plans for a "static" compiler for JRuby that enable compiling Ruby classes into normal, instantiable, callable, static Java classes. This would bring us on par with other compiled languages on the JVM, and allow you to directly instantiate and invoke JRuby/Ruby objects from within your Java code.

Beyond all this work, Tom and I have been discussing a whole raft of performance improvements we could make to the underlying JRuby runtime. There's a lot more performance to be had, and it's just around the corner.

Exciting times, friends. Exciting times.

Sunday, September 02, 2007

Java Native Access + JRuby = True POSIX

I know, I know. It's got native code in it, and that's bad. It's using JNI, and that's bad. But damn, for the life of me I couldn't see enough negatives to using Java Native Access that would outweigh the incredible positives. For example, the new POSIX interface I just committed:

import com.sun.jna.Library;
public interface POSIX extends Library {
  public int chmod(String filename, int mode);
  public int chown(String filename, int owner, int group);
}

Given this library, it takes a whopping *one line of code* to wire up the standard C library on a given platform:

import com.sun.jna.Native
POSIX posix = (POSIX)Native.loadLibrary("c", POSIX.class);

That's. It.

So the idea behind JNA (jna.dev.java.net) is to use a foreign function interface library (libffi in this case) to wire up C libraries programmatically. This allows loading a library, inspecting its contents, and providing interfaces for those functions at runtime. They get bound to the appropriate places in the Java interface implementation, and JNA does some magic under the covers to handle converting types around. And so far, it appears to do an outstanding job.

With this code in JRuby, we now have fully complete chmod and chown functionality. Before, we had to either shell out for chmod or use Java 6 APIs that wouldn't allow setting group bits, and we always had to shell out for chown because there's no Java API for it. Now, both work *flawlessly*.

The potential of this is enormous. Any C function we haven't been able to emulate, we can just use directly. Symlinks, fcntl, file stats and tests, process spawning and control, on and on and on. And how's this for a wild idea: with a bit of trickery, we could actually wire up Ruby C extensions, just by providing JNI/JRuby-aware implementations of the Ruby C API functions.

So there it is. I went ahead and committed a trunk build of JNA, and I'm working with the JNA guys to get a release out. We'll have to figure out the platform-specific bits, probably by just shipping a bunch of pre-built libraries (not a big deal; the JNI portion is about 26k), but the result will be true POSIX functionality in JRuby...the "last mile" of compatibility will largely be solved.

Many thanks to Timothy Wall, who's been nursemaiding me through getting JNA working well, and who appears to be the primary driver of the JNA project right now. If you're interest in this stuff, check out JNA, start using it, make it popular. As far as I'm concerned this is how native access is supposed to be.

Update: My mistake, Wayne Meissner has also been pouring a metric buttload of effort into JNA. Credit where credit's due!

Sunday, August 26, 2007

JRuby 1.0.1 and Jython 2.2 Released

I'm in the airport, but I wanted to add to the blogs reporting that JRuby 1.0.1 and Jython 2.2 have been released.

JRuby 1.0.1 is basically just a maintenance release to 1.0. It includes various compatibility fixes that have filtered in since the release, and doesn't do much to improve performance. JRuby 1.1 will come out by November, and should have all the latest research and perf work, including a complete compiler to JVM bytecode.

Jython 2.2 is much bigger news. After years of sleepily crawling along, the 2.2 release of Jython has finally been released. With 2.2 behind them, the Jython team can now be more ambitious about adding Python 2.3, 2.4, and 2.5 features. Much of that work has already started. Also coming up is a new JVM bytecode compiler. If you're a Python guy, give Jython a try today...those guys have put a lot of time and effort in.

I also understand that Groovy is supposed to have their 1.1 release out some time in October. Happy times for dynlangs on the JVM!

Tuesday, August 14, 2007

NetBeans Ruby Support is the BOMB

OMG NetBeans Ruby support is so awesome. I just picked up some of the recent dailies, and it does stuff I just can't believe. But don't take my word for it.

Today I stumbled back into the NetBeans Ruby Wiki, expecting to just find the same old "how to download", "how to install", and "how to build" instructions. Instead I find a nicely-organized set of pages describing (with screenshots!) all of the really awesome features.

Refactoring? Check. Debugging? Check. YAML and RHTML editing? Check. Test running? Check.

It just goes on and on.

If you haven't given it a try, and you're a Ruby programmer, you owe it to yourself to check it out. And if you ph34r a gigantic IDE download with support for Java and UML and BPEL and other stuff you'll never used, there's a Ruby-only IDE available too; check the Installation page. Incredible.

Direct links to feature pages:

RubyEditing describes the editing features available.
RubyRefactoring describes the refactoring features available.
RubyProjects describes the project features available.
RubyOnRails describes the Ruby On Rails features available.
RubyDebugging describes the debugging features available.
RubyTesting describes the testing features available.
RubyCodeTemplates describes the live code templates feature.
RubyHints describes the quickfixes and hints feature.
RubyShortcuts lists keyboard shortcuts relevant to Ruby

Sunday, August 05, 2007

Widening the JVM Languages Group: We Need You!

We need diversity in the JVM Languages group, and it's been brought to my attention that some popular/key/interesting languages may not have representation. So we need to change that.

If you are interested in the future of non-Java languages on the JVM, you should be on this list. Yes, we talk about a lot of JVM language implementation challenges, we discuss compilers and stack frames and call-site optimizations, but we also talk about features peripheral to language implementation like package indexing and retrofitting Java 5+ code. We need your help.

Once you've joined, or if you're already member, you have a second task

I respectfully request that each of you search out one individual you think would be interested in the list and try to get them involved. Toss them a quick email, invite them to describe their project or language or implementation to us, and promise them they're joining a very interesting and entertaining community. History will thank you, and so will I.

A Business Case for Supporting Jython

The facts:

Sun and many other organizations have started considering moves to Mercurial. In Sun's case, it's a mandate for all Sun-managed OSS projects (OpenSolaris, OpenJDK, etc).
Moving projects to Mercurial frequently (usually?) requires IDE/tool support.
Sun's IDE/tools and those of many other organizations are Java-based (NetBeans, Eclipse, and so on).
Mercurial is written in Python.
Jython is an implementation of Python for the JVM.

Therefore, Jython Mercurial support would be an excellent vector to Mercurial adoption within Java organizations (Sun included, I'd wager). Corollary: getting Mercurial running on Jython is an excellent business case for contributing time and resources to Jython.

Put simply: if you want to become an OSS rockstar tomorrow, get Mercurial running on Jython.

(credit for this idea goes to OpenJDK ambassador Tom Marble...I'm just trying to needle the OSS world into making it happen)

Thursday, July 19, 2007

Alioth Numbers for JRuby 1.0

Someone pointed out to me the other day that the Alioth "Compuer Language Benchmarks Game" (as they call it now) had started to include JRuby 1.0 in the results. And how do we fare? We're slower than Ruby 1.8.6! Hooray!

But not by much. Depending on your definition of "on par", I'd say we're safely in the range of being on par with Ruby 1.8.6 performance, at least for these too-short, too-small measurements.

Alioth benchmarks are regularly panned by language enthusiasts...or at least enthusiasts on the "slow" end of the benchmarks. In the case of JVM-based language implementations, the problem is the same old excuse: not enough time for the JVM JIT to work its magic. This case isn't all that different, but it's fair to say that Alioth isn't testing straight-line performance in a hot application, but getting from point A to point B without a running start. And in that domain, it provides a reasonable window into implementation performance.

So then, the JRuby 1.0 numbers. You can click the link to see them, but they break down about like this:

Performance:

Four tests are equal or faster in JRuby--usually not more than 2x faster, but "4.8x" faster in one case. That fastest one involves "concurrency", and I haven't studied it enough to know whether it's meaningful or not.
Eight tests are less than 2x slower in JRuby.
The remaining four tests are greater than 2x slower in JRuby.
Startup time is considerably worse; it's listed as 305x slower for JRuby (!!!) but it's not a particularly useful ratio. We take a couple seconds to get going compared to Ruby's hundredths. That's life.

Memory:

All except one worse in JRuby
Most more than 2x worse in JRuby
Several more then 10x worse in JRuby
Does anyone care?

(Update: Commenters have made it clear that they do care...but of course I was being a little bit glib here :) We continue every release to do what we can to improve performance AND memory usage, and other than some additional overhead from the JVM itself our memory usage is pretty reasonable.)

So I guess that's all I've got to point out. We've been working on performance since 1.0, and there's a number of major improvements planned for the 1.1 release. And considering that "beating Ruby performance" wasn't a primary goal for JRuby 1.0, I think our "roughly on par" numbers here are pretty damn good. Granted, Ruby 1.8.x isn't the fastest implementation in the world, but we're pretty happy to have improved performance by an order of magnitude in the past year.

Now, onward to the future!

Wednesday, July 18, 2007

Groovy and JRuby Cooperating?

Can it be true? Of course it can!

Graeme Rocher blogs about a new feature in upcoming Groovy 1.1 beta 3: support for Ruby-style "method missing" hooks!

In heavily metaprogrammed applications, it's very common to define methods on the fly in response to calls made. For example, in Rails ActiveRecord, no "find by" methods are defined out of the box, but you may still call "find_by_x_and_y_and_z" to run your specific query. In order to reduce the overhead of handling these "missing methods" (usually through parsing the called name), it's typical to generate new methods on demand, binding them to the called class. So if you call find_by_name and it doesn't exist, two things happen: the query is generated and run and a new find_by_name method is dynamically created for future queries.

After Graeme blogged about the typical way to dynamically create methods in Groovy, I tapped him on IM and we started to talk about how Ruby handles the same use case. We agreed that the overhead of overriding invokeMethod (Groovy's entry point for all dynamic calls) was problematic...it's already got enough to do without adding the method-generation logic. As it turns out, "method missing" was a perfect remedy, and pretty easy to add into Groovy. Graeme's already started work to modify Grails to take advantage of methodMissing and I'm sure it will catch on like wildfire in the rest of Groovydom. And as a result you, the JVM dynamic language community, gain the benefit of a little Groovy/JRuby cross-pollination.

So for all you haters out there who just love to pit language teams against one another I say this:

Nyah.

Sunday, July 15, 2007

The Ever-Evolving JVM

John Rose, lead of the "invokedynamic" effort (Java Specification Request 292), has posted some exciting articles about the future of the JVM and a number of changes potentially for the next Java version. Among these is, of course, the dynamic invocation efforts, but these entries include information on non-local returns from closures, tail calls, and tuple support. I'm excited to have John as a co-worker and to be helping out the invokedynamic effort in my own small way by working on JRuby's compiler. John has also provided guidance on how to make a dynamic language fast on current JVMs, which has informed much of my compiler's design.

Check out his articles:

Longumps Considered Inexpensive
tail calls in the VM
tuples in the VM

It's going to be a fun year for language developers on the JVM!

More Compiler Strategy: Call Adapters and Stack-based Methods

Compilers are hard. But not so hard as people would have you believe.

I've committed an update that installs a CallAdapter for every compiled call site. CallAdapter is basically a small object that stores the following:

method name
method index
call type (normal, functional, variable)

As well as providing overloaded call() implementations for 1, 2, 3, n arguments and block or no block. The basic goal with this class is to provide a call adapter (heh) that makes calling a Ruby method in compiled code as similar to (and simple as) calling any Java method.

The end result is that while compiled class init is a bit larger (needs to load adapters for all call sites), compiled method size has dropped substantially; in compiling bench_method_dispatch.rb, the two main tests went from 4000 and 3500 bytes of code down to 1500 and 1000 bytes (roughly). And simpler code means HotSpot has a better chance to optimize.

Here's the latest numbers for the bench_method_dispatch_only test, which just measures time to call a Ruby-implemented method a bunch of times:

Test interpreted: 100k loops calling self's foo 100 times
 2.383000   0.000000   2.383000 (  2.383000)
 2.691000   0.000000   2.691000 (  2.691000)
 1.775000   0.000000   1.775000 (  1.775000)
 1.812000   0.000000   1.812000 (  1.812000)
 1.789000   0.000000   1.789000 (  1.789000)
 1.776000   0.000000   1.776000 (  1.777000)
 1.809000   0.000000   1.809000 (  1.809000)
 1.779000   0.000000   1.779000 (  1.781000)
 1.784000   0.000000   1.784000 (  1.784000)
 1.830000   0.000000   1.830000 (  1.830000)

And Ruby 1.8.6 for reference:

Test interpreted: 100k loops calling self's foo 100 times
2.160000   0.000000   2.160000 (  2.188087)
2.220000   0.010000   2.230000 (  2.237414)
2.230000   0.010000   2.240000 (  2.248185)
2.180000   0.010000   2.190000 (  2.218540)
2.240000   0.010000   2.250000 (  2.259535)
2.220000   0.010000   2.230000 (  2.241170)
2.150000   0.010000   2.160000 (  2.178414)
2.240000   0.010000   2.250000 (  2.259772)
2.260000   0.000000   2.260000 (  2.285141)
2.230000   0.010000   2.240000 (  2.252396)

Note that these are JIT numbers rather than fully precompiled numbers, so this is 100% real-world safe. Fully precompiled is just a bit faster, since there's no interpreted step or DefaultMethod wrapper to go through.

I have also made a lot of progress on adapting the compiler to create stack-based methods when possible. Basically, this involved inspecting the code for anything that would require access to local variables outside the body of the call. Things like eval, closures, etc. At the moment it works well and passes all tests, but I know methods similar to gsub which modify $~ or $_ are not working right. It's disabled at the moment, pending more work, but here's the method dispatch numbers with stack-based method compilation enabled:

Test interpreted: 100k loops calling self's foo 100 times
 1.735000   0.000000   1.735000 (  1.738000)
 1.902000   0.000000   1.902000 (  1.902000)
 1.078000   0.000000   1.078000 (  1.078000)
 1.076000   0.000000   1.076000 (  1.076000)
 1.077000   0.000000   1.077000 (  1.077000)
 1.086000   0.000000   1.086000 (  1.086000)
 1.077000   0.000000   1.077000 (  1.077000)
 1.084000   0.000000   1.084000 (  1.084000)
 1.090000   0.000000   1.090000 (  1.090000)
 1.083000   0.000000   1.083000 (  1.083000)

It seems very promising work. I hope I'll be able to turn it on soon.

Oh, and for those who always need a fib fix, here's fib with both optimizations turned on:

~ $ jruby -J-server bench_fib_recursive.rb      
 1.258000   0.000000   1.258000 (  1.258000)
 0.990000   0.000000   0.990000 (  0.989000)
 0.925000   0.000000   0.925000 (  0.926000)
 0.927000   0.000000   0.927000 (  0.928000)
 0.924000   0.000000   0.924000 (  0.925000)
 0.923000   0.000000   0.923000 (  0.923000)
 0.927000   0.000000   0.927000 (  0.926000)
 0.928000   0.000000   0.928000 (  0.929000)

And MRI:

~ $ ruby bench_fib_recursive.rb
1.760000   0.010000   1.770000 (  1.775660)
1.760000   0.010000   1.770000 (  1.776360)
1.760000   0.000000   1.760000 (  1.778413)
1.760000   0.010000   1.770000 (  1.776767)
1.760000   0.010000   1.770000 (  1.777361)
1.760000   0.000000   1.760000 (  1.782798)
1.770000   0.010000   1.780000 (  1.794562)
1.760000   0.010000   1.770000 (  1.777396)

These numbers went down a bit because the call adapter is currently just generic code, and generic code that calls lots of different methods causes HotSpot to stumble a bit. The next step for the compiler is to generate custom call adapters for each call site that handle arity correctly (avoiding IRubyObject[] all the time) and call directly to the most-likely target methods.

Friday, July 13, 2007

To Keyword Or Not To Keyword

One of the most attractive aspects of Ruby is the fact that it has relatively few sacred keywords. In most cases, things you'd expect to be keywords are actually methods, and you can wrap or hook their behavior and create amazing potential.

One perfect example of this is require. Because require is just a method, you can define your own version that wraps its behavior. This is exactly how RubyGems does its magic...rather than immediately calling the default require, it can modify load paths based on your installed gems, allowing for a dynamically-expanding load path and the pluggability we've all come to know and love.

But all such keyword-like methods are not so well behaved. Many methods make runtime changes that are otherwise impossible to do from normal Ruby code. Most of these are on Kernel. I propose that several of these methods should actually be keywords.

Update: Evan Phoenix of Rubinius (EngineYard), Wayne Kelly of Ruby.NET (Queensland University), and John Lam of IronRuby (Microsoft) have voiced their agreement on this interesting ruby-core mailing list thread. Have you shared your thoughts?

Justifying Keywords

There's a number of (in my opinion, very strong) justifications for this:

Many Kernel methods manipulate runtime state in ways no other methods can. For example: local_variables requires access to the caller's variable scope; block_given? requires access to the block/iter stacks (in MRI code); eval requires access to just about everything having to do with a call; and there are others, see below.
Because many of these methods manipulate normally-inaccessible runtime state, it is not possible to implement them in Ruby code. Therefore, even if someone wanted to override them (the primary reason for them to be methods) they could not duplicate their behavior in the overridden version. Overriding only destroys their utility.
These methods are exactly the ones that complicate optimizing Ruby in all implementations, including Ruby 1.9, Rubinius, JRuby, Ruby.NET, and others. They confound a compiler's efforts to optimize calls by always leaving open questions about the behavior of a method. Will it need access to a heap-allocated scope? Will it save off a binding or the current call frame? No way to know for sure, since they're methods.

In short, there appears to be no good reason to keep them as methods, and many reasons to make them keywords. What follows is a short list of such methods and why they ought to be keywords:

*eval - requires implicit access to the caller's binding
block_given?/iterator? - requires access to block/iter information
local_variables - requires access to caller's scope
public/private/protected - requires access to current frame's visibility

There may be others, but these are definitely the biggest offenders. The three points above were used to compile this list, but my criteria for a keyword could be the following more straightforward points. A feature should be implemented (or converted to) a keyword if it fits either of the following criteria:

It manipulates runtime state in ways impossible from user-created code
It can't be implemented in user-created code, and therefore could not reasonably be overridden or hooked to provide additional behavior

As an alternative, if modifications could be made to ensure these methods were not overridable, Ruby implementations could safely treat them as keywords; searching for calls to "eval" in a given context would be guaranteed to mean an eval would take place in that context.

What do we gain from doing all this?

I can at least give a JRuby perspective. I expect others can give their perspectives.

In JRuby, we could greatly optimize method invocations if, for example, we knew we could just use Java's local variables (on Java's stack) rather than always heap-allocating a scoping structure. We could also avoid allocating a frame or binding when they are not needed, just allowing Java's call frame to be "enough" for us. We can already detect if there are closures in a given context, which helps us learn that a heap-allocated scope will be necessary, but we can't safely detect eval, block_given?, etc. As a result of these methods-that-would-be-keywords, we're forced to set up and tear down every method in the most expensive manner.

Other implementations&emdash;including Ruby 1.9/2.0 and Rubinius&emdash;would probably be able to make similar optimizations if we could calculate ahead of time whether these keyword operations would occur.

For what it's worth, I may go ahead and implement JRuby's compiler to treat these methods as keywords, only falling back on the "method" behavior when we detect in the rest of the system that the keyword has been overridden. But that situation is far from ideal...we'd like to see all implementations adopt this behavior and so benefit equally.

As an example, here's an early demonstration of the performance change in our old friend fib() when we can know ahead of time if any of these keywords are called (fib calls none of them). This example shows the performance today and the performance when we can safely just use Java local variables and scoping constructs. We could additionally omit heap-allocated frames for each call, giving a further boost.

I've included Ruby 1.8.6 to provide a reference value.

Current JRuby:

~ $ jruby -J-server bench_fib_recursive.rb
1.323000   0.000000   1.323000 (  1.323000)
1.118000   0.000000   1.118000 (  1.119000)
1.055000   0.000000   1.055000 (  1.056000)
1.054000   0.000000   1.054000 (  1.054000)
1.055000   0.000000   1.055000 (  1.054000)
1.055000   0.000000   1.055000 (  1.055000)
1.055000   0.000000   1.055000 (  1.055000)
1.049000   0.000000   1.049000 (  1.049000)

~ $ jruby -J-server bench_method_dispatch_only.rb
Test interpreted: 100k loops calling self's foo 100 times
3.901000   0.000000   3.901000 (  3.901000)
4.468000   0.000000   4.468000 (  4.468000)
2.446000   0.000000   2.446000 (  2.446000)
2.400000   0.000000   2.400000 (  2.400000)
2.423000   0.000000   2.423000 (  2.423000)
2.397000   0.000000   2.397000 (  2.397000)
2.399000   0.000000   2.399000 (  2.399000)
2.401000   0.000000   2.401000 (  2.401000)
2.427000   0.000000   2.427000 (  2.428000)
2.403000   0.000000   2.403000 (  2.403000)

Using Java's local variables instead of a heap-allocated scope:

~ $ jruby -J-server bench_fib_recursive.rb
2.360000   0.000000   2.360000 (  2.360000)
0.818000   0.000000   0.818000 (  0.818000)
0.775000   0.000000   0.775000 (  0.775000)
0.773000   0.000000   0.773000 (  0.773000)
0.799000   0.000000   0.799000 (  0.799000)
0.771000   0.000000   0.771000 (  0.771000)
0.776000   0.000000   0.776000 (  0.776000)
0.770000   0.000000   0.770000 (  0.769000)

~ $ jruby -J-server bench_method_dispatch_only.rb
Test interpreted: 100k loops calling self's foo 100 times
3.100000   0.000000   3.100000 (  3.100000)
3.487000   0.000000   3.487000 (  3.487000)
1.705000   0.000000   1.705000 (  1.706000)
1.684000   0.000000   1.684000 (  1.684000)
1.678000   0.000000   1.678000 (  1.678000)
1.683000   0.000000   1.683000 (  1.683000)
1.679000   0.000000   1.679000 (  1.679000)
1.679000   0.000000   1.679000 (  1.679000)
1.681000   0.000000   1.681000 (  1.681000)
1.679000   0.000000   1.679000 (  1.679000)

Ruby 1.8.6:

~ $ ruby bench_fib_recursive.rb     
1.760000   0.010000   1.770000 (  1.775304)
1.750000   0.000000   1.750000 (  1.770101)
1.760000   0.010000   1.770000 (  1.768833)
1.750000   0.010000   1.760000 (  1.782908)
1.750000   0.010000   1.760000 (  1.774193)
1.750000   0.000000   1.750000 (  1.766951)
1.750000   0.010000   1.760000 (  1.777814)
1.750000   0.010000   1.760000 (  1.782449)

~ $ ruby bench_method_dispatch_only.rb
Test interpreted: 100k loops calling self's foo 100 times
2.240000   0.000000   2.240000 (  2.268611)
2.160000   0.010000   2.170000 (  2.187729)
2.280000   0.010000   2.290000 (  2.292342)
2.210000   0.010000   2.220000 (  2.250331)
2.190000   0.010000   2.200000 (  2.210965)
2.230000   0.000000   2.230000 (  2.260737)
2.240000   0.010000   2.250000 (  2.256210)
2.150000   0.010000   2.160000 (  2.173298)
2.250000   0.010000   2.260000 (  2.271438)
2.160000   0.000000   2.160000 (  2.183670)

What do you think? Is it worth it?

Wednesday, July 11, 2007

Finding a JVM compilation strategy for Ruby's dynamic nature

In JRuby, we have a number of things we "decorate" the Java stack with for Ruby execution purposes. Put simply, we pass a bunch of extra context on the call stack for most method calls. At its most descriptive, making a method call passes the following along:

a ThreadContext object, for accessing JRuby call frames and variable scopes
the receiver object
the metaclass for the receiver object
the name of the method
a numeric index for the method, used for a fast dispatch mechanism
an array of arguments to the method
the type of call being performed (functional, normal, or variable)
any block/closure being passed to the method

Additionally there are a few places where we also pass the calling object, to use for visibility checks.

The problem arises when compiling Ruby code into Java bytecode. The case I'm looking at involves one of our benchmarks where a local variable is accessed and "to_i" is invoked on it a large number of times:

puts Benchmark.measure {
a = 5;
i = 0;
while i < 1000000
  a.to_i; a.to_i; a.to_i; a.to_i; a.to_i; a.to_i; a.to_i; a.to_i; a.to_i; a.to_i;
  a.to_i; a.to_i; a.to_i; a.to_i; a.to_i; a.to_i; a.to_i; a.to_i; a.to_i; a.to_i;
  a.to_i; a.to_i; a.to_i; a.to_i; a.to_i; a.to_i; a.to_i; a.to_i; a.to_i; a.to_i;
  a.to_i; a.to_i; a.to_i; a.to_i; a.to_i; a.to_i; a.to_i; a.to_i; a.to_i; a.to_i;
  a.to_i; a.to_i; a.to_i; a.to_i; a.to_i; a.to_i; a.to_i; a.to_i; a.to_i; a.to_i;
  a.to_i; a.to_i; a.to_i; a.to_i; a.to_i; a.to_i; a.to_i; a.to_i; a.to_i; a.to_i;
  a.to_i; a.to_i; a.to_i; a.to_i; a.to_i; a.to_i; a.to_i; a.to_i; a.to_i; a.to_i;
  a.to_i; a.to_i; a.to_i; a.to_i; a.to_i; a.to_i; a.to_i; a.to_i; a.to_i; a.to_i;
  a.to_i; a.to_i; a.to_i; a.to_i; a.to_i; a.to_i; a.to_i; a.to_i; a.to_i; a.to_i;
  a.to_i; a.to_i; a.to_i; a.to_i; a.to_i; a.to_i; a.to_i; a.to_i; a.to_i; a.to_i;
  i += 1;
end
}

(that's 100 accesses and calls in a 1 million loop)

The block being passed to Benchmark.measure gets compiled into its own Java method on the resulting class, called something like closure0. This gets further bound into a CompiledBlock adapter which is what's eventually called when the block gets invoked.

Unfortunately all the additional context and overhead required in the compiled Ruby code seems to be causing trouble for hotspot.

In this case, the pieces causing the most trouble are obviously the "a.to_i" bits. I'll break that down.

"a" is a local variable in the same lexical scope, so we go to a local variable in closure0 that holds an array of local variable values.

 aload(SCOPE_INDEX)
ldc([index of a])
aaload

But for Ruby purposes we must also make sure a Java null is replaced with "nil" so we have an actual Ruby object

 dup
ifnonnull(ok)
pop
aload(NIL_INDEX) # immediately stored when method is invoked
label(ok)

So every variable access is at least seven bytecodes, since we need to access them from an object that can be shared with contained closures.

Then there's the to_i call. This is where it starts to get a little ugly. to_i is basically a "toInteger" method, and in this case, calling against a Ruby Fixnum, it doesn't do anything but return "self". So it's a no-arg noop for the most part.

The resulting bytecode to do the call ends up being uncomfortably long:

(assumes we already have the receiver, a Fixnum, on the stack)

 dup # dup receiver
invokevirtual "getMetaClass"
invokevirtual "getDispatcher" # a fast switch-based dispatcher
swap # dispatcher under receiver
aload(THREADCONTEXT)
swap # threadcontext under receiver
dup # dup receiver again
invokevirtual "getMetaClass" # for call purposes
ldc(methodName)
ldc(methodIndex)
getstatic(IRubyObject.EMPTY_ARRAY) # no args
ldc(call type)
getstatic(Block.NULL_BLOCK) # no closure
invokevirtual "Dispatcher.callMethod..."

So we're looking at roughly 15 operations to do a single no-arg call. If we were processing argument lists, it would obviously be more, especially since all argument lists eventually get stuffed into an IRubyObject[]. Summed up, this means:

100 a.to_i calls * (7 + 15 ops) = 2200 ops

That's 2200 operations to do 100 variable accesses and calls, where in Java code it would be more like 200 ops (aload + invokevirtual). An order of magnitude more work being done.

The closure above when run through my current compiler generates a Java method of something like 4000 bytes. That may not sound like a lot, but it seems to be hitting a limit in HotSpot that prevents it being JITed quickly (or sometimes, at all). And the size and complexity of this closure are certainly reasonable, if not common in Ruby code.

There's a few questions that come out of this, and I'm looking for more ideas too.

How bad is it to be generating large Java methods and how much impact does it have on HotSpot's ability to optimize?
This code obviously isn't optimal (two calls to getMetaClass, for example), but the size of the callMethod signature means even optimal code will still have a lot of argument loading to do. Any ideas on how to get around this in a general way? I'm thinking my only real chance is to find simpler signatures to invoke, such as arity-specific (so there's no requirement for an array of args), avoiding passing context that usually isn't needed (an object knows its metaclass already), and reverting back to a ThreadLocal to get the ThreadContext (though that was a big bottleneck for us before...).
Is the naive approach of breaking methods in two when possible "good enough"?

It should be noted that HotSpot eventually does JIT this code, it's substantially faster than the current general release of Ruby 1.8. But I'm worried about the complexity of the bytecode and actively looking for ways to simplify.

Friday, July 06, 2007

Understanding the JVM JIT and helping it along

I must apologize to my readers. I have been remiss in my blogging duties. I will be posting updates on the various events of the past month or so along with updates on JRuby progress and future events very soon. But for now, a technical divergence after a night of hacking.

--

I finally understand what we should be going for in our compiled code, and how we can really kick JRuby into the next level of performance.

The JVM, at least in HotSpot, gets a lot of its performance from its ability to inline code at runtime, and ultimately compile a method plus its inlined calls as a whole down to machine code. The benefit in doing this is the ability to do compiler optimizations across a much larger call path, essentially compiling all the logic for a method and its calls (and possibly their calls, ad infinatum) into a single optimized segment of machine code.

HotSpot is able to do this in a two main ways:

If it's obvious there's only ever one implementation of a given signature on a given type hierarchy
If it can determine at runtime that one (or a few) implementations are the only ones ever being called

The first one allows code to be optimized fairly quickly, because HotSpot can discover early on that there's only one implementation. In general, if there's a single implementation of a given signature, it will get inlined pretty quickly.

The second one is trickier. HotSpot tracks the actual types being called against for the various calls, and eventually can come up with a best guess at the method or methods to inline. It also can include a slow path for the rare future cases where the receiver does not match the target types, and it can deoptimize later to back down optimizations when situations change, such as when a new class is loaded into the system.

So in the end, inlining is one of the most powerful optimizations. Unfortunately in JRuby (and most other dynamic language implementations on the JVM), we're making inlining difficult or impossible in the most performance-sensitive areas. I believe this is a large part of our performance woes.

Consider that all method calls against any object must pass through an implementation of IRubyObject.callMethod. There's not too many callMethod implementations, and actually now there's only one implementation of each specific signature. So callMethod gets inlined pretty fast.

Consider also that almost all method calls within callMethod are to very specific methods and will also be inlined quickly. So callMethod is looking pretty good so far.

Now we look at the last step in callMethod...DynamicMethod.call. DynamicMethod is the top-level type for all our method objects in the system. The call method has numerous implementations, all of them different. And no one implementation stands out as the most frequently called. So we're already complicating matters for HotSpot, even though we know (based on the incoming method name) exactly the piece of code we *want* to call.

Let's continue on, assuming HotSpot is smart enough to work around our half-dozen or so DynamicMethod.call implementations.

DefaultMethod is the DynamicMethod implementation for interpreted Ruby code, so it calls directly into the evaluator. So at that point, DefaultMethod.call will inline the evaluator code and that looks pretty good. But there's also the JIT located in DefaultMethod. It generates a JVM bytecode version of the Ruby code and from then on DefaultMethod calls that. Now that's certainly a good thing on one hand, since we've eliminate the interpreter, but on the other hand we've essentially made it impossible for HotSpot to inline that generated code. Why? Because we generate a Java method for every JITable Ruby method. Hundreds, and eventually thousands of possible implementations. Making a decision to inline any of them into DefaultMethod.call is basically impossible. We've broken the chain.

To make matters worse, we also have the set of Java-wrapping DynamicMethod implementations, *CallbackMethod (used for binding Java code to Ruby method names) and CompiledMethod (used in AOT-compiled code).

The CallbackMethods all wrap another piece of generated code that implements Callback and calls the Java method in question. So we generate nice little wrappers for all the pre-existing methods we want to call, but we also make it impossible for the *CallbackMethod.call implementations to inline any of those calls. Broken chain again.

CompiledMethod is slightly better in this regard, since there's a new CompiledMethod subclass for every AOT-compiled Ruby method, but we still have a single implementaiton of DynamicMethod.call that all of those subclasses share in common. To make matters worse, even if we had separate DynamicMethod.call implementations, that may actually *hurt* our ability to inline code way back in IRubyObject.callMethod, since we've now added N possible DynamicMethod.call implementations to the system. And the chain gets broken even earlier.

So the bottom line here is that in order to continue improving performance, we need to do everything possible to move the call site and the call target closer together. There are a couple standard ways to do it:

Hard-coded special-case code for specific situations, much like YARV does for simple ops (+, -, <, >, etc). In these cases, the compiler would check that the target implements an appropriate type to do a direct call to the operation in question. In Fixnum's case, we'd first confirm it's a RubyFixnum, and then invoke e.g. RubyFixnum.plus directly. That skips all the chain breakage, and allows the compiled code to inline RubyFixnum.plus straight into the call site.
Dynamic generated method adapters that can be swapped out and that learn from previous calls to make direct invocations earlier in the chain. Basically, this would involve preparing call site caches that point at call adapters. Initially, the call adapters would be of some generic type that can use the slow path. But as more and more calls come in, more and more of the call sites would be replaced with specialized implementations that invoke the appropriate target code directly, allowing HotSpot a direct line from call site to call target.

The second version is obviously the ultimate goal, and essentially would mimic what the state-of-the-art JITs do (i.e. this is how HotSpot works under the covers). The first version is easily testable with some simple hackery.

I created a small patch that includes a trivial, unsafe change to the compiler to make Fixnum#+, Fixnum#-, and Fixnum#< direct calls when
possible. They're unsafe because they don't check to see if any of those
operations have been overridden...but of course you'd have to be a mad
fool to override them anyway.

To demonstrate a bit of the potential performance gains, here are some
numbers for JRuby trunk and trunk + patch. Note that Fixnum#+, Fixnum#-, and Fixnum#< are all already STI methods, which does a lot to speed up their invocation (STI uses a table of switch values to bypass dynamic method lookup). But this simple change of compiling direct calls completely blows the STI performance out of the water, and that's without similar direct calls to the fib_ruby method itself.

test/bench/bench_fib_recursive.rb

JRuby trunk without patch:
1.675000   0.000000   1.675000 (  1.675000)
1.244000   0.000000   1.244000 (  1.244000)
1.183000   0.000000   1.183000 (  1.183000)
1.173000   0.000000   1.173000 (  1.173000)
1.171000   0.000000   1.171000 (  1.170000)
1.178000   0.000000   1.178000 (  1.178000)
1.170000   0.000000   1.170000 (  1.170000)
1.169000   0.000000   1.169000 (  1.169000)

JRuby trunk with patch:
1.133000   0.000000   1.133000 (  1.133000)
0.922000   0.000000   0.922000 (  0.922000)
0.865000   0.000000   0.865000 (  0.865000)
0.862000   0.000000   0.862000 (  0.863000)
0.859000   0.000000   0.859000 (  0.859000)
0.859000   0.000000   0.859000 (  0.859000)
0.864000   0.000000   0.864000 (  0.863000)
0.859000   0.000000   0.859000 (  0.860000)

Ruby 1.8.6:
1.750000   0.010000   1.760000 (  1.760206)
1.760000   0.000000   1.760000 (  1.764561)
1.760000   0.000000   1.760000 (  1.762009)
1.750000   0.010000   1.760000 (  1.760286)
1.760000   0.000000   1.760000 (  1.759367)
1.750000   0.000000   1.750000 (  1.761763)
1.760000   0.010000   1.770000 (  1.798113)
1.760000   0.000000   1.760000 (  1.760355)

That's an improvement of over 25%, with about 20 lines of code. It would be even higher with a dynamic adapter for the fib_ruby call. And we can take this further...modify our Java integration code to do direct calls to Java types, modify compiled code to adapt to methods as they are redefined or added to the system, and so on and so forth. There's a ton of potential here.

I will continue working along this path.

Saturday, June 23, 2007

Camping at O'Reilly

Rumors of my demise were greatly exaggerated!

I've been mostly MIA the past couple weeks, largely due to the Ruby Kaigi and related events in Tokyo and a short family trip this past week to Lake Michigan's eastern shore (warm sandy beaches and a much-welcomed rest). But I figured I'd check in so everyone knows I'm still around and the wheels are still turning.

I'm at Foo Camp this weekend, and so far it's been a great time. Lots of good discussions, games of Werewolf, and staying up too late (I think I got about 3.5 hrs of sleep...more than enough!) Today I'm going to try to talk about something outside my usual subjects, mostly as a discussion facilitator, but with a bit of opinion tossed in for good measure. Here's the description I posted to the Foo calendar:

The Transformation Age

This is a collective presentation. There are no attendees; only nodes in the matrix. Expect to share.

Part 1: Individuals to collectives, isolated to connected, The Next Evolution.

As the exchange of information has become progressively more rapid, the communication boundaries more transparent, we are transforming into a new kind of collective entity. The swing from isolated individuals or isolated communities with their own distinct stores of knowledge toward a single global community with all stores of knowledge available to everyone is more than just an interesting trend...it represents a new evolutionary direction humanity is taking. It is the dawn of the transformation age for humankind.

We will explore what it means to be more and more a collective being in this new era, what it will mean for our children and grandchildren and nth grandchildren as pervasive interconnection becomes the norm. As all minds become so intimately connected that being disconnected from the global consciousness is like losing an arm...or one's own identity. And we'll look at how this new evolution is affecting and has affected successful and unsuccessful technologies, business models, and governments over the past several decades.

Part 2 (time permitting): Knowledge represented as transformation of information rather than as desperate attempts to snapshot or categorize information at a given point in time.

The idea of categorizing information has served us fairly well when information itself was slow to change, slow to be communicated, and largely static. Books can be sorted in a categorization scheme because the information they contain does not change over time; the categories remain as valid as when they were assigned. But what happens when the entirety of humanity's knowledge is now not only instantly available, but rapidly changing and evolving along with us? Is not categorization of changing information inherently flawed when information snapshots are almost immediately out of date? What can we do to allow everyone in this age of transformation to participate in information sharing in scalable way?

I would propose that when information itself has become so fluid it defies static categorization, that the transformation of knowledge is the new information we need to track. Already you see the signs of this:

Wikis are updated far more often than they have completely new entries added; the success of wiki is in transformation of knowledge over time, and in continuous evolution of the information contained therein.
Scientists in all fields build new ideas upon old; there are no new ideas that don't synthesize existing parts, transforming our understanding of those parts into a new entity.
Agile development emphasizes small, rapid, interactive transformations of data (a software algorithm + user interfaces on the large) rather than as big bang snapshots and releases.
Open source licenses reduce the barriers to accessing data effectively to zero explicitly to allow for rapid transformation of that data into today's entity. "Build from trunk" is becoming a normal recommendation for rapidly transforming projects, since snapshots are immediately out of date.

We'll talk about whether this all makes sense, whether moment-in-time snapshots of information are becoming more or less relevant than the changes over time to that information, and what can be done to both facilitate transformation and adapt our traditional information management ideals to this constantly changing information sea.

If you're interested in hearing more about all this, about how the session goes, or just would like to talk sometime along these lines, gimme a shout.

Sunday, June 10, 2007

JRuby 1.0 Released!

We have finally released JRuby 1.0, based on the last release candidate, RC3. And what more is there to say? Not really a whole lot...It's almost entirely RC3, with one or two minor fixes added in. But it's really turned out to be an outstanding release, and already reports are coming in of folks trying it out en masse. We're very happy.

So I'll do a little recap here. JRuby 1.0 was focused almost entirely on one goal: Ruby 1.8.x compatibility. To that end, we are now the only alternative Ruby implementation that can reasonably claim we're "compatible". It's no longer a question of whether we can run Ruby applications or not...we've proven that again and again. The issues people run into now are those requiring minor behavioral tweaks, minor parser tweaks, and occasionally exploration of some peculiar threading or memory concerns. It's been a long time coming, but the compatibility issue is largely answered.

Now we start looking toward the future. Once you have a working Ruby implementation, what next? I believe we've shown that the correct way to approach Ruby is to get it right first. The next step is making it run as well as possible. Ok, so we've cheated a bit along the way, introducing interpreter optimizations and a JIT compiler, but that's all fun stuff. The heavy lifting for performance and scalability are coming up, and there's a lot of low-hanging fruit we can start to tackle with 1.0 safely behind us. So action item #1 for the future of JRuby is plucking that low-hanging performance fruit.

The second item for any working implementation would have to be platform integration. Our platform is Java (and of course we've cheated a bit here by doing additional work to make Java integration really useful and usable), so integration involves making Ruby a better citizen of the Java platform. In this area, expect to see us further reduce the disconnect between Ruby and Java, allowing you to construct Ruby objects directly from Java code, define real Java classes using Ruby (classes you can then compile Java code to call), and minimizing to as large an extent possible the performance impact of calling from dynamically typed Ruby code into statically typed Java code. Java integration is action item #2.

And then what? Well, there are many options, all very attractive. I personally would like to see time spent implementing potential Ruby 1.9/2.0 features, to provide a second testbed where people can try those features out. I would like to find ways to share code with Rubinius (beyond tests), either by implementing just enough Rubinius logic to run its pure-Ruby core-class implementations or by implementing a full Rubinius kernel on the JVM. I would like to see JRuby expand to the rest of the Java world, bringing the Ruby Way to Java EE. But most importantly, I want to reach out to the Ruby community and find ways for them to be a part of the JRuby process.

Up to now, JRuby has really existed in its own, separate world. There was the Ruby community and also the JRuby community, and although members might identify themselves with both, there was always this perceived boundary.

We need to break that boundary down...not only for JRuby but for the other Ruby implementations as well.

Ruby is coming of age. Multiple implementations shows that Ruby has really matured as a language...and also shows it has a lot of maturing left to do. Ask Evan (of Rubinius) or me how we feel about retry behavior or block argument processing or thread event processing or SAFE levels and tainting and you'll start to understand some of the ugly, hidden bits of Ruby. We need to make sure that Ruby development proceeds as a whole...not necessarily as a single project or a single codebase, but as an open, direct exchange of concerns, complaints, solutions and ideas. We need to start treating the Rubinius community and the JRuby community and any other implementations' communities as part of the whole...different facets of the same gem.

If you have never tried an alternative implementation of Ruby: do it today. Pick out your favorite app, library, or framework and build your next app using something other than Matz's Ruby. Start running your continuous integration tests against JRuby trunk (or Rubinius trunk, if it runs your code ok; those guys sorely need more CI hits). Make sure your gem releases work on JRuby too, and if you have native (i.e. C) code, explore what it would take to do a JRuby port. Show that you welcome diversity into the Ruby world...that you recognize that diversity is an essential part of language evolution.

JRuby has always been a community project, and only by absorbing the JRuby community into the Ruby community will JRuby continue to be successful. If we can make that happen, I see wonderful things in Ruby's future...regardless of which implementation you use.

Monday, June 04, 2007

A Response to Ola's IronRuby Post

I'm not up for creative titles tonight. Hopefully you'll see this in your feed reader and click for a second opinion. Granted, I agree with Ola's IronRuby post on most points, but I disagree on a few key items. So let's dive in, shall we?

You managed to blog this viewpoint before me, Ola, but you know I agree with almost everything. However, to pour a bit more oil on the fire, I'll take it a step further.

Having run the Ruby gauntlet and brought JRuby from not running anything to running Rails almost 100% perfectly (in just over a year, I might add), I will confidently say there's no way with current specs and tests that anyone could create an implementation of Ruby from scratch that will run Rails unless they can look at the existing implementations. I simply do not believe it's possible.

And I'll throw another couple curve balls too:

I don't believe tests and specs will be where they need to be within the next 6-12 months unless there's a major effort put behind them. Even if that effort happens, I still don't think we'll ever have specs enough for someone to implement Ruby "in the dark" for at least a year, if it ever happens.
IronPython did a great job getting pretty close to 100% compatible. But Jim Hugunin had implemented an almost 100% Python-compatible implementation in Jython before going to Microsoft; he didn't need to look at the Python source. I don't believe John Lam has the same level of experience with Ruby, so he's at a severe disadvantage (and to John: I really feel for you, man...this has got to be difficult).
This is a good friend's belief, but he's won me over: we don't believe Microsoft would ever willingly allow IronRuby to get to the point of running Rails, since that would directly compete with their ASP.NET server, software, and tool offerings. What would be the benefit to them of a free runtime running a free language implementation that runs a free web framework? Probably zero. And as Martin Fowler and others have blogged about NUnit, Microsoft hasn't exactly been lovey-dovey with OSS projects that impact (or are perceived to impact) their bottom line.
Given these facts and the current situation, I'd say it's a better bet for us as a community to get behind the Gardens Point Ruby.NET Compiler project, which is already much farther along than IronRuby...plus it's real open source (you can contribute) and they can look at Ruby's source (and have admitted to doing so for at least the parser). I was wary/skeptical of Ruby.NET last year, but they now seem like the current best hope for Ruby on the CLR. That link again is Gardens Point Ruby.NET Compiler. And to the QUT guys working on Ruby.NET, you need to do three things right now to save your project from historical obscurity: set up a public source repository, accept a few external contributors, and start blogging and emailing up a freaking storm.

That about sums up what I have to say about IronRuby, Microsoft, and the future of Ruby on the CLR. The rest of Ola's points I agree with...diversity is important, a spec and test kit are needed immediately, and Microsoft needs to change their OSS attitude pretty quick or risk becoming irrelevant.

Sunday, June 03, 2007

JRuby 1.0.0RC3 Released - And This Is It!

Tom posted the announcements already, but JRuby 1.0.0RC3 is out in the wild! This release is our most important yet, because we intend for this release to become JRuby 1.0. The only things that will change from now until a 1.0 final release later this week would be any showstopping bugs that are extremely low-impact to fix. In general, RC3 should be nearly identical to 1.0 final.

So what does this mean for JRuby? We've taken the approach of calling the JRuby 1.0 release the "Ruby compatible" release. Basically, all known application bugs caused by JRuby incompatibilities with Matz's Ruby (MRI) have been resolved. This doesn't mean there aren't more compatibility bugs; there probably always will be, and we have probably 50-80 in the bug repository right now (and another 80 bugs or enhancements that are JRuby-specific). But it does mean that most applications should "just work" out of the box, and any application failures we knew of have been resolved. In general, the remaining post 1.0 bugs are edge cases and minor correctness issues, like obscure forms of core methods or missing error conditions.

We've tried very hard to make this both a solid release and a stepping stone to JRuby's future. With this release, we can confidently say that we're Ruby 1.8.5 compatible. Any additional fixes we need to make (like the known issues) are in the last 1% of Ruby features we can potentially support. We also recognize that this is a big moment for JRuby. People can start counting on JRuby to run their Ruby applications correctly. Of course many in the JRuby community have already been doing this for many months, but the 1.0 moniker says to the world we feel like we're ready for prime time.

So what comes after 1.0? That breaks down into a few areas:

Performance. JRuby's performance has come a long way in the past year. We've increased speed by at least an order of magnitude, and have enabled the JIT compiler for the 1.0 release. This means that for many cases where JRuby can compile Ruby code, it will perform faster than Ruby 1.8.5. The general case is still a little muddy, though. We've had anecdotal reports that JRuby on Rails in a Java application server performance extremely well, and other reports that general Ruby applications perform somewhat poorly. The general performance situation is not well-understood, but we all agree there's a lot more work to be done. And the good news is that we have a big list of optimizations remaining to continue improving JRuby's specific and general performance.
Java Integration. JRuby does an excellent job of fitting into the Java platform. In almost all cases, you can call libraries, implement interfaces, and extend classes with no difficulty. But there are edge cases--usually nonstandard or antipattern Java--that JRuby doesn't behave as nicely. The code to enable calling Java is also far more complicated than we'd like. To solve both issues, we'll be taking the existing Java integration syntax and API and backing it with a redesigned library. Expect to see something of this work in a 1.1 release later this year. Our goal is to achieve excellent integration with Java, on par with the most tightly-integrated JVM languages available today. And we're not far off.
Ruby 2.0 and Rubinius. We intend to start supporting Ruby 2.0 and Rubinius bytecode execution soon, though the Ruby 2.0 work is much farther along. We intend to catch up the Rubinius work as well as to try adopting some of Rubinius's pure Ruby implementations of core class functionality. We also plan to enable the use of Ruby 2.0 features through configuration and command-line switches to JRuby. Specify something like -J-Djruby.string.version=2 and all strings in the system will be treated as Ruby 2.0 strings, with full character semantics. Or optionally turn on experimental Ruby 2.0 syntax for lambdas and named parameters. Basically we want to bring JRuby up to the bleeding edge of Ruby features, to provide another platform where people can get a feel for what Ruby 2.0 might look like. More exposure for these features will help Matz and Co. decide the best way to move forward. And that's good for everyone.

And of course we'll continue fixing compatibility bugs, Java integration bugs, and expanding JRuby's reach on the Java platforms with more DSLs and API wrappers around all your favorite Java APIs.

But there's one more area I should talk about: how you can get involved.

JRuby is a community project. The core committers play traffic cops as much as developers, routing patches, examining bugs, documenting features. The only way a project like JRuby succeeds is through mass community involvement. We've been extremely lucky to have a constant flow of interested developers into our community, even when OSS community attrition has been very high. The community may look completely different from month to month, but the flow of patches and bugs has steadily increased.

For folks just getting started, I've written a few articles on the JRuby wiki that should help you understand the development process. They're linked from the front page under "Getting Involved".

For easy bugs, I'd recommend looking in the JRuby JIRA for bugs with "rubinius" or bugs reported by Daniel Berger. The Rubinius bugs are almost all problems we've had running Rubinius's excellent specs, and usually represent minor incompatibilities with C Ruby. Daniel's tests are a lot of those edge cases I mentioned...he's running through his own test suite and reporting any incorrect behavior that's come up. Another option would be to just pick your favorite Ruby app and start running its test cases with JRuby. You're sure to find something interesting to report, and it may be an easy fix.

For non-Java-coders (or folks afraid of hacking on JRuby), stop by the RubySpec Wiki and update or author an article. RubySpec is an effort to build a community-driven specification of Ruby that all users and implementers can freely reference. It is linked from the RubyDoc site and is fast becoming a standard way for the community to record language and library behaviors. I believe this is the best and fastest way for us to form a complete specification of Ruby's behavior...and I believe such a specification is becoming extremely important, what with there now being 5-10 different implementations of Ruby, all guessing at what "correct" is. Have you written RubySpec today?

Well that's about it for this release report. If all goes as well as I think it will, JRuby 1.0 final should be released some time this week. Give it a try, I think you'll be pleasantly surprised!

Friday, June 01, 2007

Creating a Field-Initializing 'new' Method

One thing often touted as a missing feature in Ruby is the lack of a constructor form that initializes fields. A few other languages have this feature, including for example Groovy, another JVM dynamic language. The general idea is that if you want to construct an object and initialize a number of fields, you often want to do it in one shot. Rather than modify the class to have additional initializers for all the fields you want to set, there's another option.

Because Ruby is so cool, you can add this feature yourself to all classes at the same time.


class Class
  def new!(*args, &block)
    # make sure we have arguments
    if args && args.size > 0
      # if it's not a Hash, perform a normal "new"
      return new(*args, &block) unless Hash === args[-1]

      # grab the last arg in the list
      last_arg = args.pop

      # make sure all fields actually exist
      last_arg.each_key {|key|
        unless public_instance_methods.include?("#{key}=") do
          raise ArgumentError.new(
                           "No attr setter for name: #{key}") 
        end
      }

      # create the object and set its fields
      new_obj = new(*args, &block)
      last_arg.each {|key, value|
        new_obj.send "#{key}=", value
      }
    else
      # no args, just do a normal "new" with any block passed
      new_obj = new(&block)
    end
    new_obj
  end
end

So with such a simple piece of code, we now have a new! method on all classes that accepts a final parameter--a hash of field names and values--that can be given using Ruby's named-parameter-like syntax. Given a simple class, like the following:


class MyObject
  attr_accessor :foo
  attr_accessor :bar
  
  def initialize(msg)
    puts msg
  end
end

No additional work is needed to use our new! method:


x = MyObject.new!("yippee",
                    :foo => "hello", :bar => "goodbye") 
 => "yippee"
p [x.foo, x.bar]
 => ["hello", "goodbye"]
y = MyObject.new!("blah", :yuck => "baz")
 => error: "No attr setter for name: yuck"

The reason this works is that all classes are instances of the Class class. So the MyObject class definition above is roughly equivalent to saying:


MyObject = Class.new {
  # class def logic here
}

This means that instances of Class, like MyObject, inherit methods defined on Class, like new!. Since all classes in the system are Class objects, all classes instantly gain a new! method.

This is a perfect example of why Ruby is such a powerful language, and why it's so easy in Ruby to use the coolest metaprogramming tricks. And it's a primary reason why frameworks like Rails have been able to do such amazing things. With a language that's this powerful and this easy, you can imagine what else is possible.

Are we having fun yet?

Sunday, May 27, 2007

Adding Annotations to JRuby Using Ruby

I love how meta-programmable Ruby is.

JRuby doesn't support annotations because Ruby doesn't support annotations. So what! We can extend Ruby to add something like annotations:


class JPABean
def self.inherited(clazz)
  @@annotations = {}
end
def self.anno(annotation)
  @@last_annotation = annotation
end

def self.method_added(sym)
  @@annotations[sym] = @@last_annotation
end

def self.get_annotation(sym)
  return @@annotations[sym]
end
end

Given this, we can do things like the following:


class MyBean :sql => "SELECT * FROM stuff WHERE name = 'foo'"
def foo; end

anno :field => :hello_id
attr :hello

anno :field => :bar_in_table
attr_accessor :bar
end

And here's some output from the resulting class:


p MyBean.get_annotation(:foo)
p MyBean.get_annotation(:hello)
p MyBean.get_annotation(:bar=)

=>

{:sql=>"SELECT * FROM stuff WHERE name = 'foo'"}
{:field=>:hello_id}
{:field=>:bar_in_table}

So as you can see we really do have all the necessary requirements to annotate classes. Now what if we just had MyBean be a java.lang.Object extension and stuffed the annotations into the resulting generated class? We can already create real Java classes in this way, but with the above syntax and a little magic in the JRuby Java integration later, we've got annotation support in on Java classes too. This should enable things like Hibernate, JPA, and JUnit 4 to work with JRuby's Ruby-based classes. Or at least, I believe it to be possible. It just requires a little work to add annotation information to the resulting generated classes.

I've planted the seed here and on the JRuby dev list...let's see if it germinates.

JVM Languages: The Future

I've started a Google Group for all those interested in the future of alternative languages (i.e. not Java) on the JVM. A number of you knew this was coming, and I've already invited folks that expressed an interest in my RedMonk Unconference session at JavaOne. The rest of you are also invited.

Please plan to talk shop about language implementation strategies, pain points on the JVM, and what we can do to build a common set of libraries, frameworks, and patterns to ease and improve the Java platform's support for many languages. The time for action is now, my friends, and we have a wealth of information across all our language projects to draw from. We must come together to build a stronger base for everyone to stand upon.

I welcome you to the future.

Saturday, May 26, 2007

Foo Camp 2007

I've been invited to this year's Foo Camp. Since it sounds like a great time and there's a bunch of other folks going I'd like to talk to, I'm accepting the invitation. My purpose posting this entry is to let other campers know I'll be there and try to hook up and chat a bit beforehand. So drop me a line...firstname.lastname@sun.com.

FYI, if any other campers are coming up from Oakland, I'm renting a car. My times are listed on the "Rides Offered" page.

I'm looking forward to a fun weekend!

Tuesday, May 22, 2007

The Final Bugs

We're within weeks of a final JRuby 1.0. We've pared down the bugs that we think we can or must fix for 1.0 and this email is a summary of the ones we need help with. So whatever time you can spare, please have a look and help resolve these.

In order of decreasing priority in JIRA:

JRUBY-820: Net::HTTP.get behaves differently form MRI, failing to get UTF8 properly
-and-
JRUBY-828: UTF-8 regular expressions aren't working

I believe these are both largely the same problem, and it's probably the most visible remaining bug we need to fix. Regular expressions in JRuby currently do not work well with unicode strings, both as incoming match strings and as the regular expression strings themselves. REXML uses /u regular expressions, which is why I believe these issues are closely related. Wes Nakamura has commented that he's looking into 828, but more eyes will help ensure this is fixed.
I believe these are true blocking bugs for 1.0, and I'm not comfortable doing a release if they are not resolved.

JRUBY-971: Make it clear which Java method maps to each equality method

I think we have subtle equality bugs remaining largely because we don't use a consistent naming convention for the Java implementations of Ruby methods like ==, ===, eql? and so on. We should make a quick sweep through the system and make sure all core JRuby code is using the same Java method names for all of these, and binding them in the same way. Note: This is actually the cause of some set_trace_func bugs, since the default === impl should actually invoke ==, causing trace events for both.

JRUBY-969: Add rake and rspec to standard JRuby distribution for 1.0

We have decided to ship both Rake and RSpec with JRuby 1.0, since they are both well-accepted and widely used (moreso for Rake, but RSpec has gained a lot of acceptance the past year). But the mechanism of including them is unclear. I do not want to commit installed gems to SVN; I would prefer to just install them as needed for testing and when building the JRuby distribution. Thoughts?

JRUBY-672: java.lang.Class representation of Ruby class not retrievable

This bug is basically just looking for a way to get at the actual java.lang.Class representing a Ruby-based extension of a Java class. There may have been an API added with Bill's work, or this could be simple to add.

JRUBY-914: JRuby's BigDecimal needs to round

Stu submitted a patch for this, but it unfortunately depends on behavior only present in Java 5 and higher. Since most of us are pretty unfamiliar with BigDecimal, we don't know of a good workaround to support it correctly in Java 1.4.

JRUBY-822: jruby fails to report process exit status correctly

I think Nick may have a grip on this one, but if anyone can offer suggestions as to why the exit codes are apparently shifted in this way, please let us know.

JRUBY-966: Various issues with LocalJumpError not being created early enough
-and-
JRUBY-767: next in an eval should produce a local jump error

These are likely going to be up to Tom and I since they involve pretty deep runtime/interpreter work. But we're also looking for any remaining known issues with LocalJumpError/JumpException that are still out there. I believe we've fixed all such externally-reported issues at the moment.

JRUBY-888: Make regular distro/build create the all-in-one jar by default

We waffle back and forth on this. What's the answer? An all-in-one jar by default, or on request?

JRUBY-873: ant test thread tests sometimes run forever

I managed to get a trace on this one. It appears that some of the tests cause a deadlock between stopping a thread and other operations. But I don't believe anyone's seen it in normal execution yet (or at least nobody's reported it). I'm probably the only one familiar enough with the threading subsystem to fix it, but I'd like to know if anyone's seen other issues.

JRUBY-98: allow "/" as absolute path in Windows
-and-
JRUBY-36: Dir['...'] incompatibilities between Ruby and accross platforms
-and-
JRUBY-61: IO CRLF compatibility with cruby on Windows
-and-
JRUBY-644: Windows line delimiter (\r\n) cause position errors

These blasted Windows pathing and newline bugs just seem to hang on forever. Nobody wants to fix them. There must be someone out there that can help resolve these, yes?

JRUBY-884: Create and consolidate extension and non-standard options

There are a number of configuration options for JRuby. Some have their own flags, like -O to disable ObjectSpace and -C to force compilation of a target script. But many others are only configurable via Java system properties. What we really could use from the community is some idea which settings are worth adding special flags for. We can handle the rest. My personal opinion is that the new -J flag that allows passing flags to the JVM is actually enough.

By my estimation, the rest of the bugs are either almost done or are trivial enough they could be punted to a post 1.0 release. But if we can get these issues knocked down soon (starting with those top couple), 1.0 is going to be an extremely solid release.

Monday, May 14, 2007

Big Plans

I just came across this amusing entry from an old, now-defunct blog of mine on LiveJournal.

The last line pretty much says it all. I must have been really frustrated with my job at the time:

Someone should pay me to sit at home and do great things.

(If you don't get the joke: I now work for Sun Microsystems, at home, on JRuby...so if JRuby's ever considered a "great thing", the prophecy will be fulfilled.)

And yes, I've seen the Microsoft news. I'd hate to be an OSS developer or apologist at Microsoft today. If Sun did something like this I'd resign.