Tuesday, November 20, 2007

Bytecode Tools in Ruby: A Low-level DSL

I've been toying with the idea of rewriting the JRuby compiler in Ruby, or at least writing the appropriate plumbing that would allow someone to do something similar. Migrating the JRuby compiler may or may not be worth it, since the existing Java compiler is basically done and working well, and a conversion would be sure to introduce bugs here and there. But it would certainly be a show of faith to give it a try.

As part of this effort, I've built up some basic utility code and a simple JVM bytecode builder that could act as the lowest level of such a compiler. I'm looking for input on the syntax at this point, while I take a break from it to explore JRuby Java integration improvements I think should be done before 1.1.

So here's the Ruby source of the builder, as contained within a test case:

require 'test/unit'
require 'compiler/builder'
require 'compiler/signature'

class TestBuilder < Test::Unit::TestCase
  import java.lang.String
  import java.util.ArrayList
  import java.lang.Void
  import java.lang.Object
  import java.lang.Boolean

  include Compiler::Signature

  def test_class_builder
    cb = Compiler::ClassBuilder.build("MyClass", "MyClass.java") do
      field :list, ArrayList

      constructor(String, ArrayList) do
        aload 0
        invokespecial Object, "<init>", Void::TYPE
        aload 0
        aload 1
        aload 2
        invokevirtual this, :bar, [ArrayList, String, ArrayList]
        aload 0
        swap
        putfield this, :list, ArrayList
        returnvoid
      end

      static_method(:foo, this, String) do
        new this
        dup
        aload 0
        new ArrayList
        dup
        invokespecial ArrayList, "<init>", Void::TYPE
        invokespecial this, "<init>", [Void::TYPE, String, ArrayList]
        areturn
      end

      method(:bar, ArrayList, String, ArrayList) do
        aload 1
        invokevirtual(String, :toLowerCase, String)
        aload 2
        swap
        invokevirtual(ArrayList, :add, [Boolean::TYPE, Object])
        aload 2
        areturn
      end

      method(:getList, ArrayList) do
        aload 0
        getfield this, :list, ArrayList
        areturn
      end

      static_method(:main, Void::TYPE, String[]) do
        aload 0
        ldc_int 0
        aaload
        invokestatic this, :foo, [this, String]
        invokevirtual this, :getList, ArrayList
        aprintln
        returnvoid
      end
    end

    cb.write("MyClass.class")
  end
end

For those of you who don't speak bytecode, here's roughly the Java code that this would produce:
import java.util.ArrayList;

public class MyClass {
    public ArrayList list;

    public MyClass(String a, ArrayList b) {
        list = bar(a, b);
    }

    public static MyClass foo(String a) {
        return new MyClass(a, new ArrayList());
    }

    public ArrayList bar(String a, ArrayList b) {
        b.add(a.toLowerCase());
        return b;
    }

    public ArrayList getList() {
        return list;
    }

    public static void main(String[] args) {
        System.out.println(foo(args[0]).getList());
    }
}

The general idea is that fairly clean-looking Ruby code can be used to generate real Java classes, providing a readable base for code generation tools like compilers.

There are a couple of things to notice here:
  • Everything is public. I have not wired in visibility and other modifiers mainly because it starts to look cluttered no matter how I try. Suggestions are welcome.
  • The bytecode, while clean looking, is pretty raw. This interface also doesn't save you from yourself; if you're not ordering your bytecodes right, you'll end up with an unverifiable class file.
  • It's not apparent just from looking at the code which of the specified types are return values and which are argument types. Something more explicit could be useful here; one hypothetical possibility is sketched below.
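For example (and this is purely hypothetical--the builder does not support anything like it today), Ruby 1.8 hash arguments could separate the return type from the argument types without changing the overall shape of the DSL:

method(:bar, :returns => ArrayList, :args => [String, ArrayList]) do
  # ...same body as in the example above...
end

static_method(:foo, :returns => this, :args => [String]) do
  # ...same body as in the example above...
end

Whether the extra verbosity is worth it is an open question.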
I'd like to continue this work. The above code, run against JRuby trunk and the lib/ruby/site_ruby/1.8/compiler library I'm working on, will produce a working MyClass class file:
~/NetBeansProjects/jruby $ jruby test/compiler/test_builder.rb
Loaded suite test/compiler/test_builder
Started
.
Finished in 0.096 seconds.

1 tests, 0 assertions, 0 failures, 0 errors
~/NetBeansProjects/jruby $ java -cp . MyClass foo
[foo]

So it's actually emitting the appropriate bytecode for this class.

Comments? Thoughts for improvement?

Monday, November 19, 2007

Have You Written RubySpec Today?

You know Ruby. You've been coding up Ruby apps for a while now. You've seen the power and the magic that Ruby offers developers, and you've seen some of the weirder, wilder, and perhaps uglier sides of Ruby. You think you're pretty well-versed in the core classes, and you can hold your own when metaprogramming.

You're a Ruby Programmer. Now Prove it.

RubySpec is a wiki-based Ruby specification, aimed at forming an English-language, community-driven spec for Ruby 1.8 (and in the future, Ruby 1.9). There's a lot of content there already, and a lot of stubbed articles and missing details. And it's your turn to contribute.

You'll be in good company. Ruby oldschoolers like why the lucky stiff and Matz himself have contributed updates and fixes. Ryan Davis contributed the content of his Ruby Quickref. Those of us on the JRuby team try to update it when we find peculiarities in the Ruby language, classes, or runtime. And many folks use it as a convenient reference.

RubySpec is connected with the Ruby Documentation Project, which also hosts Ruby 1.8 and 1.9 RDoc-generated documentation directly from the C source. The spec is designed to fill in the gaps, explaining more details about the language, the runtime, and the implementations that would be useful to folks interested in a deeper look at Ruby.

So...have you written RubySpec today?

The First Ruby Mailing List Translator

In response to my post about the Ruby community needing an autotranslator for the key mailing lists, due to the language barrier between English-speaking and Japanese-speaking folks, I received quite a bit of interest. A few people are looking into automatic solutions, manual solutions, and various solutions in between, but I believe we now have a first attempt to meet the challenge.

Jason Toy has set up a mailing list translator site for the Ruby community. It provides autotranslated text of many Ruby mailing lists (both directions, and far more than I expected anyone to tackle), and even better it provides the original text and invites bilingual Rubyists to submit better translations for individual posts.

Jason commented on the previous post, saying his site still needs some work, but it's definitely on the right track. The obvious missing feature is a way to subscribe to the translated lists, either via feeds or mailing lists. Anyone who feels like lending a hand can contact Jason at (I think) admin@translator.rubynow.com. And if anyone feels like doing this one better, maybe by setting up an army of human translators to proxy information across the divide, don't let this stop you :)

Saturday, November 17, 2007

RejectConf 4 Calls to Action: Ruby specs and JRuby failures

For those of you who don't know, RejectConf started last year at RubyConf 2006 in Denver because Ryan Davis (zenspider; author of many of the ParseTree-based tools like flog and heckle, part of the vlad team, and so on) had a rejected presentation and still wanted to show it off. It turned out to be one of the most entertaining parts of the conference, with dozens of folks taking 5-15 minutes to demo something they thought would be cool. Some were met with applause, some were booed down Apollo-style, and everyone had a good time.

RejectConf 4 was at RubyConf 2007 earlier this month. I took the opportunity to toss out two calls to action:

  1. Help contribute to Rubinius by adding specs, and if you're especially lazy just copy any of the many tests in JRuby (that are not specs, but which we'd just as soon see migrate to a single suite).
  2. Report failures in JRuby running the specs, and possibly fix them if you feel like helping out even more.
Confreaks has all the RejectConf 4 videos up now, and my call to action is there with the others. Check them out, and consider helping out either the spec project or JRuby.

Monday, November 05, 2007

Ruby Community Seeks Autotranslator

As many of you know, Ruby was created in Japan by Yukihiro Matsumoto, and most of the core development team is still Japanese to this day. This has posed a serious problem for the Ruby community, since the language barrier between the Japanese core team and community and the English-speaking community is extremely high. Only a few members of the core team can speak English comfortably, so discussions about the future of Ruby, bug fixes, and new features happen almost entirely on the Japanese ruby-dev mailing list. That leaves those of us English speakers on the ruby-core mailing list out in the cold.

We need a two-way autotranslator.

Yes, we all know that automated translation technology is not perfect, and that for East Asian languages it's often barely readable. But even having partial, confusing translations of the Japanese emails would be better than having nothing at all, since we'd know that certain topics are being discussed. And English-to-Japanese translators do a bit better than the reverse direction, so core team members interested in ruby-core emails would get the same benefit.

I imagine this is also part of the reason Rails has not taken off as quickly in Japan as it has in the English-speaking world: the Rails core team is peopled primarily by English speakers, and the main Rails lists are all in English. Presumably, an autotranslating gateway would be useful for many such communities.

But here's the problem: I know of no such service.

There are multiple translation services, for free and for pay, that can handle Japanese to some level. Google Translate and Babelfish are the two I use regularly. But these only support translating a block of text or a URL entered into a web form. There also does not appear to be a Google API for Translate, so screen-scraping would be the only option at present.

The odd thing about this is that autotranslators are good enough now that there could easily be a generic translation service for dozens of languages. Enter in source and target languages, source and target mailing lists, and it would busily chew through mail. For closely-related European languages, autotranslators do an extremely good job. And just last night I translated a Chinese blog post using Google Translate that ended up reading as almost perfect English. The time is ripe for such a service, and making it freely available could knock down some huge barriers between international communities.

So, who's going to set it up first and grab the brass ring (or is there a service I've overlooked)?

Updated Alioth Numbers for JRuby 1.1b1

Oh yes, you all know you love the Alioth Shootout.

Isaac Gouy has updated the JRuby numbers, and modified the default comparison to be with Ruby 1.8.6 rather than with Groovy as it was before. And true to form, JRuby is faster than Ruby on 14 out of 18 benchmarks.

There are reasons for each of the four slower benchmarks:

  • pidigits is simply too short for JRuby to hit its full stride. Alioth runs it with n = 2500, which on my system (using a simple "time") has JRuby taking 11 seconds to Ruby's 5. If I bump that up to 5000, JRuby takes 27 seconds to Ruby's 31.
  • regex-dna and reverse-complement are both hitting the Regexp performance problem we have in JRuby 1.0.x and in the 1.1 beta. We expect to have that resolved for 1.1 final, and Ola Bini and Marcin Mielczynski are each developing separate Regexp engines to that end.
  • startup, beyond being a touch unfair for a JVM-based language right now, is actually about half our fault and half the JVM's. The JVM half is the unpleasantly high cost of classloading, and specifically the cost of generating many small temporary classes into their own classloaders, as we have to do in JRuby to avoid leaking classes as we JIT and bind methods at runtime. The JRuby half is the fact that we're loading and generating so many classes, most of them too far ahead of time or that will never be used. So there's blame to go around, but we'll never have Ruby's time for this.
Standard disclaimer applies about reading too much into these kinds of benchmarks, but it's certainly a nice set of numbers to wake up to.

Closures Prototype Applied to JRuby Compiler

A bit ago, I was catching up on my feeds and noticed that Neal Gafter had announced the first prototype of Java closures. I've been a fan of the BGGA proposal, so I thought I'd catch up on the current status and try applying it to a pain point in the JRuby source: the compiler.

The current compiler is made up of two halves: the AST walker and the bytecode emitter. The AST walker recursively walks the AST, calling appropriate methods on a set of interfaces into the bytecode emitter. The bytecode emitter, in turn, spits out appropriate bytecodes and calls back to the AST walker. Back and forth, the AST is traversed and all nested structures are assembled appropriately into a functional Java method.

This back and forth is key to the structure and relative simplicity of the compiler. Take for example the following method in ASTCompiler.java, which compiles a Ruby "next" keyword (similar to Java's "continue"):

public static void compileNext(Node node, MethodCompiler context) {
    context.lineNumber(node.getPosition());

    final NextNode nextNode = (NextNode) node;

    ClosureCallback valueCallback = new ClosureCallback() {
        public void compile(MethodCompiler context) {
            if (nextNode.getValueNode() != null) {
                ASTCompiler.compile(nextNode.getValueNode(), context);
            } else {
                context.loadNil();
            }
        }
    };

    context.pollThreadEvents();
    context.issueNextEvent(valueCallback);
}

First, the "lineNumber" operation is called on the MethodCompiler, my interface for primary bytecode emitter. This emits bytecode for line number information based on the parsed position in the Ruby AST.

Then we get a reference to the NextNode passed in.

Now here's where it gets a little tricky. The "next" operation can be compiled in one of two ways. If it occurs within a normal loop, and the compiler has an appropriate jump location, it will compile as a normal Java GOTO operation. If, on the other hand, the "next" occurs within a closure (and not within an immediately-enclosing loop), we must initiate a non-local branch operation. In short, we must throw a NextJump.

In Ruby, unlike in Java, "next" can take an optional value. In the simple case, where "next" is within a normal loop, this value is ignored. When a "next" occurs within a closure, the given value becomes the local return from that invocation of the closure. The idea is that you might write code like this, where you want to do an explicit local return from a closure rather than let the return value "fall off the end":
def foo
puts "still going" while yield
end

a = 0
foo {next false if a > 4; a += 4; true}

...which simply prints "still going" twice.

The straightforward way to compile this non-local "next" would be to evaluate the argument, construct a NextJump object, swap the two so we can call the NextJump(IRubyObject value) constructor with the given value, and then raise the exception. But that requires us to juggle values around all the time. This simple case doesn't seem like such a problem, but imagine the hundreds or thousands of nodes the compiler will handle for a given method, all spending at least part of their time juggling stack values around. It would be a miserable waste.

So the compiler constructs a poor-man's closure: an anonymous inner class. The inner class implements our "ClosureCallback" interface which has a single method "compile" accepting a single MethodCompiler parameter "context". This allows the non-local "next" bytecode emitter to first construct the NextJump, then ask the AST compiler to continue processing AST nodes. The compiler walks the "value" node for the "next" operation, again causing appropriate bytecode emitter calls to be made, and finally we have our value on the stack, exactly where we want it. We continue constructing the NextJump and happily toss it into the ether.

The final line of the compileNext method initiates this process.

So what would this look like with the closure specification in play? We'll simplify it with a function object.
public static void compileNext(Node node, MethodCompiler context) {
    context.lineNumber(node.getPosition());

    final NextNode nextNode = (NextNode) node;

    ClosureCallback valueCallback = { MethodCompiler context =>
        if (nextNode.getValueNode() != null) {
            ASTCompiler.compile(nextNode.getValueNode(), context);
        } else {
            context.loadNil();
        }
    };

    context.pollThreadEvents();
    context.issueNextEvent(valueCallback);
}

That's starting to look a little cleaner. Gone is the explicit "new"ing of a ClosureCallback anonymous class, along with the superfluous "compile" method declaration. We're also seeing a bit of magic outside the function type: closure conversion. Our little closure that accepts a MethodCompiler parameter is being coerced into the appropriate interface type for the "valueCallback" variable.

How about a more complicated example? Here's a much longer method from JRuby that handles "operator assignment", or any code that looks like a += b:
public static void compileOpAsgn(Node node, MethodCompiler context) {
    context.lineNumber(node.getPosition());

    // FIXME: This is a little more complicated than it needs to be;
    // do we see now why closures would be nice in Java?

    final OpAsgnNode opAsgnNode = (OpAsgnNode) node;

    final ClosureCallback receiverCallback = new ClosureCallback() {
        public void compile(MethodCompiler context) {
            ASTCompiler.compile(opAsgnNode.getReceiverNode(), context); // [recv]
            context.duplicateCurrentValue(); // [recv, recv]
        }
    };

    BranchCallback doneBranch = new BranchCallback() {
        public void branch(MethodCompiler context) {
            // get rid of extra receiver, leave the variable result present
            context.swapValues();
            context.consumeCurrentValue();
        }
    };

    // Just evaluate the value and stuff it in an argument array
    final ArrayCallback justEvalValue = new ArrayCallback() {
        public void nextValue(MethodCompiler context, Object sourceArray,
                int index) {
            compile(((Node[]) sourceArray)[index], context);
        }
    };

    BranchCallback assignBranch = new BranchCallback() {
        public void branch(MethodCompiler context) {
            // eliminate extra value, eval new one and assign
            context.consumeCurrentValue();
            context.createObjectArray(new Node[]{opAsgnNode.getValueNode()}, justEvalValue);
            context.getInvocationCompiler().invokeAttrAssign(opAsgnNode.getVariableNameAsgn());
        }
    };

    ClosureCallback receiver2Callback = new ClosureCallback() {
        public void compile(MethodCompiler context) {
            context.getInvocationCompiler().invokeDynamic(
                    opAsgnNode.getVariableName(), receiverCallback, null,
                    CallType.FUNCTIONAL, null, false);
        }
    };

    if (opAsgnNode.getOperatorName() == "||") {
        // if lhs is true, don't eval rhs and assign
        receiver2Callback.compile(context);
        context.duplicateCurrentValue();
        context.performBooleanBranch(doneBranch, assignBranch);
    } else if (opAsgnNode.getOperatorName() == "&&") {
        // if lhs is true, eval rhs and assign
        receiver2Callback.compile(context);
        context.duplicateCurrentValue();
        context.performBooleanBranch(assignBranch, doneBranch);
    } else {
        // eval new value, call operator on old value, and assign
        ClosureCallback argsCallback = new ClosureCallback() {
            public void compile(MethodCompiler context) {
                context.createObjectArray(new Node[]{opAsgnNode.getValueNode()}, justEvalValue);
            }
        };
        context.getInvocationCompiler().invokeDynamic(
                opAsgnNode.getOperatorName(), receiver2Callback, argsCallback,
                CallType.FUNCTIONAL, null, false);
        context.createObjectArray(1);
        context.getInvocationCompiler().invokeAttrAssign(opAsgnNode.getVariableNameAsgn());
    }

    context.pollThreadEvents();
}

Gods, what a monster. And notice my snarky comment at the top about how nice closures would be (it's really there in the source, see for yourself). This method obviously needs to be refactored, but there's a key goal here that isn't addressed easily by currently-available Java syntax: the caller and the callee must cooperate to produce the final result. And in this case that means numerous closures.

I will spare you the walkthrough on this, and I will also spare you the one or two other methods in the ASTCompiler class that are even worse. Instead, we'll jump to the endgame:
public static void compileOpAsgn(Node node, MethodCompiler context) {
    context.lineNumber(node.getPosition());

    // FIXME: This is a little more complicated than it needs to be;
    // do we see now why closures would be nice in Java?

    final OpAsgnNode opAsgnNode = (OpAsgnNode) node;

    ClosureCallback receiverCallback = { MethodCompiler context =>
        ASTCompiler.compile(opAsgnNode.getReceiverNode(), context); // [recv]
        context.duplicateCurrentValue(); // [recv, recv]
    };

    BranchCallback doneBranch = { MethodCompiler context =>
        // get rid of extra receiver, leave the variable result present
        context.swapValues();
        context.consumeCurrentValue();
    };

    // Just evaluate the value and stuff it in an argument array
    ArrayCallback justEvalValue = { MethodCompiler context, Object sourceArray, int index =>
        compile(((Node[]) sourceArray)[index], context);
    };

    BranchCallback assignBranch = { MethodCompiler context =>
        // eliminate extra value, eval new one and assign
        context.consumeCurrentValue();
        context.createObjectArray(new Node[]{opAsgnNode.getValueNode()}, justEvalValue);
        context.getInvocationCompiler().invokeAttrAssign(opAsgnNode.getVariableNameAsgn());
    };

    ClosureCallback receiver2Callback = { MethodCompiler context =>
        context.getInvocationCompiler().invokeDynamic(
                opAsgnNode.getVariableName(), receiverCallback, null,
                CallType.FUNCTIONAL, null, false);
    };

    // eval new value, call operator on old value, and assign
    ClosureCallback argsCallback = { MethodCompiler context =>
        context.createObjectArray(new Node[]{opAsgnNode.getValueNode()}, justEvalValue);
    };

    if (opAsgnNode.getOperatorName() == "||") {
        // if lhs is true, don't eval rhs and assign
        receiver2Callback.compile(context);
        context.duplicateCurrentValue();
        context.performBooleanBranch(doneBranch, assignBranch);
    } else if (opAsgnNode.getOperatorName() == "&&") {
        // if lhs is true, eval rhs and assign
        receiver2Callback.compile(context);
        context.duplicateCurrentValue();
        context.performBooleanBranch(assignBranch, doneBranch);
    } else {
        context.getInvocationCompiler().invokeDynamic(
                opAsgnNode.getOperatorName(), receiver2Callback, argsCallback,
                CallType.FUNCTIONAL, null, false);
        context.createObjectArray(1);
        context.getInvocationCompiler().invokeAttrAssign(
                opAsgnNode.getVariableNameAsgn());
    }

    context.pollThreadEvents();
}

There are two things I'd like you to notice here. First, it's a bit shorter as a result of the literal function objects and closure conversion. It's also a bit DRYer, which naturally plays into code reduction. Second, there's far less noise to contend with. Rather than having a minimum of five verbose lines to define a one-line closure (for example), we now have three terse ones. We've managed to tighten the focus to the lines of code we're actually interested in: the bodies of the closures.

Of course this quick tour doesn't get into the much wider range of features that the closures proposal contains, such as non-local returns. It also doesn't show closures being invoked because with closure conversion many existing interfaces can be represented as function objects automatically.

I'll be looking at the closure proposal a bit more closely, and time permitting I'll try to get a simple JRuby prototype compiler wired up using the techniques above. I'd recommend you give it a try too, and offer Neal your feedback.

Sunday, November 04, 2007

Ruby Continues to Climb on TIOBE

I've posted about TIOBE here before.

The TIOBE Programming Community index gives an indication of the popularity of programming languages. The index is updated once a month. The ratings are based on the world-wide availability of skilled engineers, courses and third party vendors. The popular search engines Google, MSN, Yahoo!, and YouTube are used to calculate the ratings. Observe that the TIOBE index is not about the best programming language or the language in which most lines of code have been written.

I noticed this month that Ruby has moved up to #9 in the list, passing JavaScript. Also noted in the November Newsflash is that Ruby is currently the front runner to win "programming language of the year" for the second year in a row, closely followed by D and C#.

TIOBE Programming Community Index

Is Werewolf Killing the Conference Hackfest?

The latest craze at conferences, especially those associated with O'Reilly or Ruby, is the game Werewolf (historically known as Mafia, but Werewolf has become more popular). The premise of the game is simple: (from Wikipedia's Mafia article)

Mafia (also known under the variant Werewolf or Vampire) is a party game modeling a battle between an informed minority and an uninformed majority. Mafia is usually played in groups with at least five players. During a basic game, players are divided into two teams: 'Mafia members', who know each other; and 'honest people', who generally know the number of Mafia amongst them. The goal of both teams is to eliminate each other; in more complicated games with multiple factions, this generally becomes "last side standing".
Substitute "villagers" for "honest people" and "werewolves" for "mafia members" and you get the general idea. They're the same game.

Again from Wikipedia:
Mafia was created by Dimma Davidoff at the Psychological Department of Moscow State University, in spring of 1986, and the first players were playing in classrooms, dorms, and summer camps of Moscow University. [citation needed] The game became popular in other Soviet colleges and schools and in the 1990s it began to be played in Europe (Hungary, Poland, England, Norway) and then the United States. Mafia is considered to be one of "the 50 most historically and culturally significant games published since 1800" by about.com.
So you get the idea.

It's a fun game. I first played it at Foo Camp 2007, and I've played at each Ruby-related conference since then. It's a great way to get to know people, and an interesting study in social dynamics. I have my own opinions about strategy and in particular a fatal flaw in the game, but I'll leave those for another day. Instead, I have a separate concern:

Is Werewolf killing the conference hackfest?

Last year at RubyConf, Nick Sieger and I sat up until 4AM with Eric Hodel, Evan Phoenix, Zed Shaw and others hacking on stuff. Eric showed a little Squeak demo for those of us who hadn't used it, Zed and I talked about getting a JRuby-compatible Mongrel release out (which finally happened a year later) and I think we all enjoyed the time to hack with a few others who code for the love of coding.

As another example, RubyGems, the definitive and now official packaging system for Ruby apps and libraries, was written during a late-night hackfest at an early RubyConf (2002?). I'm not remembering my history so well, but I believe Rake had a similar start, as well as several other projects including the big Rails 1.0 release's final hours.

This year, and for the past several conferences, there's been a significant drop in such hackfests. And it's because of Werewolf.

Immediately after Matz's keynote last night, many of the major Ruby players sequestered themselves in isolated groups to play Werewolf for hours (and yes, I know many did not). They did not write the next RubyGems. They did not plant the seeds of the next great Ruby web framework. They did not advance the Ruby community. They played a game.

Don't get me wrong...I have been tempted to join every Werewolf game I can. I enjoy the game, and I feel like I'm at least competent at it. And I can appreciate wanting to blow off steam and play a game after a long conference day. I often feel the same way.

But I'm worried about the general trend. Not only does Werewolf seem to be getting more popular, it seems to draw in many of the best and brightest conference attendees, attendees who might otherwise be just as happily hacking on the "hard problems" we still face in the software world.

So what do you think? Is this a passing trend, or is it growing and spreading? Does it mean the doom of the late-night, post-conference hackfest and the inspirational products it frequently produces? Or is it just a harmless pastime for a few good folks who need an occasional break?

Friday, November 02, 2007

Top Five Questions I Get Asked

I'm at RubyConf this weekend, the A-list gathering of Ruby dignitaries. Perhaps one day I'll be considered among them.

At any rate, here's the top five questions I get asked. Perhaps this will help people move directly on to more interesting subjects :)

#1: Are you Charles Nutter?

Yes, I am. I look like that picture up in the corner of my blog.

#2: How's life been at Sun/How's Sun been treating you?

Coming to Sun has been the best professional experience of my life. When they took us in over a year ago, many worried that they'd somehow ruin the project or steer it in the wrong direction (and I had my occasional concerns as well). The truth is that they've remained entirely hands-off with regards to JRuby while simultaneously putting a ton of effort into NetBeans Ruby support and committing mission-critical internal projects to running on JRuby. Beyond that, I've become more and more involved in the growing effort to support many languages on the JVM, through changes to Java and the JVM specification, public outreach, and cultivating and assisting language-related projects. It's been amazing.

Along with this question I often get another: "What's it like to work at Sun?" Unfortunately, this one is a little hard for me to answer. Tom Enebo and I work from our homes in Minneapolis, so "working at Sun" basically means hacking on stuff in my basement for 16 hours a day. In that regard, it's like...great.

#3: What's next for JRuby?

Usually this comes from people who haven't really been following JRuby much, and that's certainly understandable. We had our 1.0 release just five months ago, and we've had two maintenance releases since then plus a major release coming out in beta this weekend. So there's a lot to keep track of.

JRuby 1.0 was released in RC form at JavaOne 2007 and in final form at Ruby Kaigi 2007 in Tokyo. It was focused largely on compatibility, but had early bits of the Ruby-to-bytecode compiler (perhaps 25% complete). Over the summer, we simultaneously worked on improving compatibility (adding missing features, fixing bugs) and preparing for another major release (with me spending several months working on the compiler and various performance optimizations). That leads us to JRuby 1.1, coming out in beta form this weekend. JRuby 1.1 represents a huge milestone for JRuby and for Ruby in general. It's now safe to say it's the fastest way to run Ruby code in a 1.8-compatible environment. It includes the world's first Ruby compiler for a general-purpose VM, and the first complete and working 1.8-compatible compiler of any kind. And it has finally reached our largest Rails milestone to date: it runs Rails faster than Ruby 1.8.x does.

What comes after JRuby 1.1? Well, we need to finally clean up our association/integration with the other half of JRuby: Java support. Java integration works great right now--probably good enough for 90% of users. But JRuby still doesn't quite fit into the Java ecosystem as well as we'd like. Along with continuing performance and compatibility work, JRuby 2.0 should finally run the last mile of Java integration. Then, maybe we'll be "done"?

#4: What do you think of [Ruby implementation X]?

I usually hear this as "what bad things do you want to say about the other implementations", even though people probably don't intend it that way. So, for the record, here's my current position on the big five alternative Ruby implementations.

Rubinius: Evan and I talk quite a bit about implementation design, challenges of implementing Ruby, and so on. JRuby uses Rubinius's specs for testing, and Rubinius borrows/ports many of our tests for their specs. I've even made code contributions to Rubinius and Sun sponsored the Ruby Hackfest in Denver this past September. We're quite cozy.

Technical opinion? It's a great idea to implement Ruby in Ruby...I'd rather use Ruby than Java in JRuby whenever possible. However the challenge of making it perform well is a huge one. When all your core classes are implemented in Ruby, you need your normal Ruby execution to be *way* faster than Ruby 1.8 to have comparable application performance...like orders of magnitude faster. Think about it this way: if each core class method makes ten calls to more core class methods, and each of those methods calls ten native methods (finally getting to the VM kernel), you've got 100x more method invocations than an implementation with a "thicker" kernel that's immediately a native call. Now the situation probably isn't that bad in Rubinius, and I'm sure Evan and company will manage to overcome performance challenges, but it's a difficult problem to solve writing a VM from scratch.

We all like difficult problems, don't we?

XRuby: XRuby is the "other" implementation of Ruby for the JVM. Xue Yong Zhi took the approach of writing a compile-only implementation, but that's probably the only major difference from the JRuby approach. XRuby has a similar core class structure to JRuby, and when I checked a few months ago, it could run some benchmarks faster than JRuby. Although it's still very incomplete, it's a good implementation that's probably not getting enough attention.

Xue and I talk occasionally on IM about implementation challenges. I believe we've both been able to help each other.

Ruby.NET: Ruby.NET is the first implementation of Ruby for the CLR, and by far the most complete. Wayne Kelly and John Gough from the University of Queensland ran this Microsoft-funded project when it was called "Gardens Point Ruby.NET Compiler", and Wayne has continued leading it in its "truly open source" project form on Google Code. I exchanged emails many times with Wayne about strategies for opening up the project, and I involved myself in many of the early discussions on their mailing list. I'm also, oddly enough, listed as a project owner on the project page. I'd like to see them succeed...more Ruby options means Ruby gains more validation on all platforms, and that means JRuby will be more successful. Ironic, perhaps, that long-term Ruby and JRuby success depends on all implementations doing well...but I believe that to be true.

IronRuby: IronRuby is the current in-progress implementation of Ruby for the CLR, primarily being developed by Microsoft's John Lam and his team. IronRuby is still very early in their implementation, but they've already achieved one big milestone: they have their Ruby core classes available as a public, open-source project on RubyForge, and you, dear reader, can actually contribute. In my book, it qualifies as the first "real" open source project out of Microsoft since it finally recognizes that OSS is a two-way street.

Technically, IronRuby is also an interesting project because it's stretching Microsoft's DLR in some pretty weird directions. DLR was originally based on IronPython, and it sounds like there's been growing pains supporting Ruby. Ruby is a much more complicated language to implement than Python, and from what I've heard the IronRuby team has had to diverge from DLR in a few areas (perhaps temporarily). But it also sounds like IronRuby is helping to expand DLR's reach, which is certainly good for MS and for folks interested in the DLR.

John and I talk at conferences, and I monitor and participate in the IronRuby mailing list and IRC channel. We Rubyists are a close-knit bunch.

Ruby 1.9.1/YARV: Koichi Sasada has done the impossible. He's implemented a bytecode VM and compiler for Ruby without altering the memory model, garbage collector, or core class implementations. And he's managed to deliver targeted performance improvements for Ruby that range from two times to ten times faster. It's an amazing achievement.

Ruby 1.9.1 will finally come out this Christmas with Koichi's YARV as the underlying execution engine. I talk with Koichi on IRC frequently, and I've been studying his optimizations in YARV for ideas on how to improve JRuby performance in the future. I've also offered suggestions on Koichi's Fiber implementation, resulting in him specifying a core Fiber API that can safely be implemented in JRuby as well. Officially, Ruby 1.9.1 is the standard upgrade path for Rubyists interested in moving on from Ruby 1.8 (and not concerned about losing a few features). JRuby will support Ruby 1.9 execution in the near future, and we already have a partial YARV VM implementation. JRuby and YARV will continue to evolve together, and I expect we'll be cooperating closely in the future.

#5: How's JRuby's Performance?

This is the first question that finally gets into interesting details about JRuby. JRuby's performance is great, and improving every day. Straight-line execution is now typically 2-4x better than Ruby 1.8.6. Rails whole-app performance ranges from just slightly slower to slightly better, though there have been reports of specific apps running many times faster. There's more work to do across the board, but I believe we've finally gotten to a point where we're comfortable saying that JRuby is the fastest complete Ruby 1.8 implementation available.

See also my previous posts, and posts by Ola Bini and Nick Sieger, for more detail on performance.

Saturday, October 27, 2007

Easy JRuby 1.0.2 Bugs

For those of you looking to start helping JRuby along, here's a few open 1.0.2 bugs that would be pretty easy for a newb to look at. They'd also help you get your first taste of JRuby internals. Good eatin!

This should help get you started:

FYI: Most of these are also 1.1 bugs, so you'll be helping two releases at once (provided you fix them and provide patches for both branches, of course!)

This is fairly simple, isolated core class code. Have a look at src/org/jruby/RubyArray.java

There's a patch here that looks pretty clean, but needs updating to apply cleanly to trunk and 1.0 branch.

We have File.sysopen defined, but for some reason we don't have IO.sysopen. If you can't fix it, at least do some research as to *why*.

This one throws a NullPointerException; NPEs are always really ugly, really broken, and usually really easy to figure out and fix. The spec in question is our (oldish) copy in test/externals/rubinius/spec/core.

This one just needs proper error handling, and there's a preliminary patch provided.

I'm not sure if we can support these, but if we can we should. If we can't, we should raise an error. Either way it's probably not hard to resolve.

There's a patch here that just needs a little more work.

I don't even understand what's happening in this one, because I keep falling asleep reading the description. I think it's probably easy.

Again, there's a patch here that appears to just need a bit more work.

This one has all the background work done on the JRuby side. We're simply not using the right algorithm for Range#step over character ranges. A quick peek at MRI source or a little insight on how to "just fix it" is all that's needed here, and the code is pretty simple. (Update: Fixed, thanks to Mathias Biilmann Christensen.)
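For reference, this is the behavior in question under MRI 1.8: stepping over a character range just walks the range with succ and yields every nth element. A minimal illustration (not the fix itself):

('a'..'k').step(3) { |c| print c, ' ' }   # prints: a d g j
puts   # finish the line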

This has a patch that's been applied to trunk, but it didn't apply cleanly to the 1.0 branch. Tidy it up, and it's done.

...but Kernel#trap does. Huh? (Update: Fixed! Ola had already repaired it, but not closed the bug.)

This is pretty straightforward...we just don't have Q/q implemented. (Update: Fixed, thanks to Riley Lynch.)

Thursday, October 25, 2007

JRuby 1.1 on Rails: More Performance Previews

Nick Sieger, a Sun coworker and fellow JRuby core team member, has posted the results of benchmarking his team's large Rails-based project on MRI and the codebase soon to be released as JRuby 1.1 (trunk). Read it here:


Now there are a couple of things I get out of this post:
  1. JRuby on Rails is starting, at least for this app, to surpass MRI + Mongrel for simple serial-request benchmarks. And this is a week before the 1.1 beta release, with all the continuing performance enhancements we know are still out there. I think we're gonna make it, folks. If this continues, there's no doubt in my mind JRuby 1.1 will run Rails fastest.
  2. JRuby on Rails will perform at least as well as MRI + Mongrel for the app Nick and his teammates are building. Several months ago they committed to eventually deploying on JRuby, and if you knew what I know about this app, you'd know why that scared the dickens out of me. But I'm glad the team had faith in JRuby and the JRuby community, and I'm glad we've been able to deliver.
I hope we're approaching a time when people believe that JRuby is the real deal--an excellent choice for running Rails now and potentially the best choice in the near future. There's always more work to do, but it's good to see the effort paying off.

Give it a try today, why not?

Tuesday, October 23, 2007

Help Us Complete JRuby 1.1 and 1.0.2 Releases

We're looking to do releases of a 1.1 beta and 1.0.2 over the next couple weeks, and we're hoping to pull in help from the community to turn over as many bugs as possible. A lot of the bugs we'd like to fix for each release wouldn't be very difficult for folks to get into, even if you only have a bit of Ruby experience and a good Java background. Currently we're looking at the following counts:

  • JRuby 1.0.2: 54 scheduled - These are almost exclusively compatibility issues, though there's a few minor performance items and several stability fixes.
  • JRuby 1.1: 147 scheduled - There's almost all the 1.0.2 bugs in here plus many additional performance enhancements and bug fixes that can't be cleanly/easily backported to 1.0.2.
  • Post 1.1: 7 JRuby 1.x, 71 unscheduled - These bugs are in danger of getting punted to a future release, so if you have a bug in these lists or you see something you think you'd like to/want to fix, now's the time.
The 1.0.2 release is basically a maintenance release to the 1.0 branch, and as such includes primarily compatibility and stability fixes. There are a couple of minor perf issues that will be resolved, generally only where they were considered so slow as to be "broken".

The big story is going to be 1.1. It will include a massive number of performance enhancements and bug fixes. It will include a complete Ruby-to-Java bytecode compiler, which is responsible for much of the performance improvements. And it should also include a completely reworked regular expression subsystem that eliminates one of the last really major bottlenecks running Rails code. We may not have that done for the 1.1 beta release at RubyConf, but it will be there for 1.1 final around a month later.

So if you've been looking for a chance to get involved in JRuby, hop on the mailing lists or start browsing the bug lists above, and feel free to email me directly (firstname.lastname@sun.com) if you have questions about any specific bug. JRuby needs your support!

Wednesday, October 17, 2007

Another Performance Discovery: REXML

I've discovered a really awful bottleneck in REXML processing.

Look at these results for parsing our build.xml:

read content from stream, no DOM
2.592000 0.000000 2.592000 ( 2.592000)
1.326000 0.000000 1.326000 ( 1.326000)
0.853000 0.000000 0.853000 ( 0.853000)
0.620000 0.000000 0.620000 ( 0.620000)
0.471000 0.000000 0.471000 ( 0.471000)
read content once, no DOM
5.323000 0.000000 5.323000 ( 5.323000)
5.328000 0.000000 5.328000 ( 5.328000)
5.209000 0.000000 5.209000 ( 5.209000)
5.173000 0.000000 5.173000 ( 5.173000)
5.138000 0.000000 5.138000 ( 5.138000)
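The benchmark script itself isn't shown here; a minimal approximation of it--assuming REXML's stream parser with a no-op listener, and our build.xml as the input--would look something like this:

require 'benchmark'
require 'rexml/document'
require 'rexml/streamlistener'

class NullListener
  include REXML::StreamListener   # all callbacks default to no-ops
end

xml = File.read('build.xml')

puts 'read content from stream, no DOM'
5.times do
  File.open('build.xml') do |io|
    puts Benchmark.measure { REXML::Document.parse_stream(io, NullListener.new) }
  end
end

puts 'read content once, no DOM'
5.times do
  puts Benchmark.measure { REXML::Document.parse_stream(xml, NullListener.new) }
end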

When reading from a stream, the content is read in chunks, with each chunk being matched in turn. Because our current regexp engine uses char[] instead of byte[], each chunk must be decoded into UTF-16 characters, matched, and encoded back into UTF-8 bytes.

For small chunks, like those read off a stream, this decode/encode cycle is fairly quick. Here, the streamed numbers are pretty close to MRI. However, when an XML string is parsed from memory, the process goes like this:
  1. set buffer to entire string
  2. match against the buffer
  3. set buffer to post match (remainder of the string)
Now this is obviously a little inefficient, since it creates a lot of extra strings, but a copy-on-write String implementation helps a lot. However in our case it also means that we decode/encode the entire remaining XML string for every element match. For any nontrivial file, this is *terrible* overhead.
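In rough Ruby terms, that in-memory loop behaves like the following (the regexp and names here are placeholders, not rexml's actual internals); in JRuby, the entire remaining buffer gets decoded and re-encoded on every pass through the loop:

xml = "<a/>" * 10_000                      # stand-in for a large document held in memory
buffer = xml.dup                           # 1. buffer starts as the entire string
tokens = 0
while (md = buffer.match(/\A<[^>]*>/))     # 2. match against the whole buffer
  tokens += 1
  buffer = md.post_match                   # 3. the remainder becomes the new buffer
end
puts tokens   # => 10000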

So what's the fix? Here's the same second benchmark using a StringIO object passed to the parser instead, with a simple change to rexml/source.rb:

Diff:
Index: lib/ruby/1.8/rexml/source.rb
===================================================================
--- lib/ruby/1.8/rexml/source.rb (revision 4596)
+++ lib/ruby/1.8/rexml/source.rb (working copy)
@@ -1,4 +1,5 @@
require 'rexml/encoding'
+require 'stringio'

module REXML
# Generates Source-s. USE THIS CLASS.
@@ -8,7 +9,7 @@
# @return a Source, or nil if a bad argument was given
def SourceFactory::create_from arg#, slurp=true
if arg.kind_of? String
- Source.new(arg)
+ IOSource.new(StringIO.new(arg))
elsif arg.respond_to? :read and
arg.respond_to? :readline and
arg.respond_to? :nil? and

New numbers:
read content once, no DOM
0.640000 0.000000 0.640000 ( 0.640000)
0.693000 0.000000 0.693000 ( 0.693000)
0.542000 0.000000 0.542000 ( 0.542000)
0.349000 0.000000 0.349000 ( 0.349000)
0.336000 0.000000 0.336000 ( 0.336000)

This is a perfect indication of why JRuby's Rails performance is currently nowhere near what it will be. We continue to find these little gems...and there's no telling how many more are out there. With recent execution performance numbers looking extremely solid and recent Rails performance getting closer and closer, the upcoming 1.1 release ought to be amazing.

Friday, October 12, 2007

Performance Update

As some of you may know, I've been busily migrating all method binding to use Java annotations. The main reasons for this are to simplify binding and to provide end-to-end metadata that can be used for optimizing methods. It has enabled using a single binding generator for 90% of methods in the system (and increasing). And today that has enabled making some impressive perf improvements.

The method binding ends up looking like this:

@JRubyMethod(name = "[]", name2 = "slice",
required = 1, optional = 1)
public IRubyObject aref(IRubyObject[] args) {
...

This binds the aref Java method to the two Ruby method names [] and slice and enforces a minimum of one argument and a maximum of two. And it does this all automatically; no manual arity checking or method binding is necessary. Neat. But that's not the coolest result of the migration.
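From the Ruby side, the effect is just the familiar behavior--shown here with String as the example receiver, though the same pattern applies to whichever core class this particular binding actually lives on:

s = "foo bar"
s[4, 3]         # => "bar"   (bound under the name "[]")
s.slice(4, 3)   # => "bar"   (bound under the name "slice")
s.slice         # raises ArgumentError (wrong number of arguments)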

The first big step I took today was migrating all annotation-based binding to directly generate unique DynamicMethod subclasses rather than unique Callback subclasses that would then be wrapped in a generic DynamicMethod implementation. This moves generated code closer to the actual calls.

The second step was to completely disable STI dispatch. STI, we shall miss you.

So, benchmarks. Of course Fibonacci numbers are indicative of only a very narrow range of performance, but I think they're a good indicator of where general performance will go in the future, as we're able to expand these optimizations to a wider range of methods.
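bench_fib_recursive.rb itself isn't reproduced here; the usual shape of that benchmark (the exact argument and iteration count are just placeholders) is simply:

require 'benchmark'

def fib(n)
  n < 2 ? n : fib(n - 1) + fib(n - 2)
end

8.times { puts Benchmark.measure { fib(30) } }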

JRuby before the changes:
$ jruby -J-server -O bench_fib_recursive.rb
1.039000 0.000000 1.039000 ( 1.039000)
1.182000 0.000000 1.182000 ( 1.182000)
1.201000 0.000000 1.201000 ( 1.201000)
1.197000 0.000000 1.197000 ( 1.197000)
1.208000 0.000000 1.208000 ( 1.208000)
1.202000 0.000000 1.202000 ( 1.202000)
1.187000 0.000000 1.187000 ( 1.187000)
1.188000 0.000000 1.188000 ( 1.188000)

JRuby after:
$ jruby -J-server -O bench_fib_recursive.rb
0.864000 0.000000 0.864000 ( 0.863000)
0.640000 0.000000 0.640000 ( 0.640000)
0.637000 0.000000 0.637000 ( 0.637000)
0.637000 0.000000 0.637000 ( 0.637000)
0.642000 0.000000 0.642000 ( 0.642000)
0.643000 0.000000 0.643000 ( 0.643000)
0.652000 0.000000 0.652000 ( 0.652000)
0.637000 0.000000 0.637000 ( 0.637000)

This is probably the largest performance boost since the early days of the compiler, and it's by far the fastest fib has ever run. Here's MRI (Ruby 1.8) and YARV (Ruby 1.9) numbers for comparison:

MRI:
$ ruby bench_fib_recursive.rb
1.760000 0.010000 1.770000 ( 1.813867)
1.750000 0.010000 1.760000 ( 1.827066)
1.760000 0.000000 1.760000 ( 1.796172)
1.760000 0.010000 1.770000 ( 1.822739)
1.740000 0.000000 1.740000 ( 1.800645)
1.750000 0.010000 1.760000 ( 1.751270)
1.750000 0.000000 1.750000 ( 1.778388)
1.740000 0.000000 1.740000 ( 1.755024)

And YARV:
$ ./ruby -I lib bench_fib_recursive.rb
0.390000 0.000000 0.390000 ( 0.398399)
0.390000 0.000000 0.390000 ( 0.412120)
0.400000 0.010000 0.410000 ( 0.424013)
0.400000 0.000000 0.400000 ( 0.415217)
0.400000 0.000000 0.400000 ( 0.409039)
0.390000 0.000000 0.390000 ( 0.415853)
0.400000 0.000000 0.400000 ( 0.415201)
0.400000 0.000000 0.400000 ( 0.504051)

What I think is really awesome is that I'm comfortable showing YARV's numbers, since we're getting so close--and YARV has a bunch of additional integer math optimizations we don't currently support and thought we'd never be able to compete with. Well, I guess we can.

However a more reasonable benchmark is the "pentomino" benchmark in the YARV suite. We've always been slower than MRI...much slower some time ago when nothing compiled. But times they are a-changin'. Here's JRuby before the changes:
$ time jruby -J-server -O sbench/bm_app_pentomino.rb

real 1m50.463s
user 1m49.990s
sys 0m1.131s

And after:
$ time jruby -J-server -O bench/bm_app_pentomino.rb

real 1m25.906s
user 1m26.393s
sys 0m0.946s

MRI:
$ time ruby test/bench/yarv/bm_app_pentomino.rb

real 1m47.635s
user 1m47.287s
sys 0m0.138s

And YARV:
$ time ./ruby -I lib bench/bm_app_pentomino.rb

real 0m49.733s
user 0m49.543s
sys 0m0.104s

Again, keep in mind that YARV is optimized around these benchmarks, so it's not surprising it would still be faster. But with these recent changes--general-purpose changes that are not targeted at any specific benchmark--we're now less than 2x slower.

My confidence has been wholly restored.

Thursday, September 27, 2007

The Compiler Is Complete

It is a glorious day in JRuby-land, for the compiler is now complete.

Tom and I have been traveling in Europe the past two weeks, first for RailsConf EU in Berlin and currently in Århus, Denmark for JAOO (which was an excellent conference, I highly recommend it). And usually, that would mean getting very little work done...but time is short, and I've been putting in full days for almost the entire trip.

Let's recap my compiler-related commits made on the road:

  • r4330, Sep 16: ClassVarDeclNode and RescueNode compiling; all tests pass...we're in the home stretch now!
  • r4339, Sep 17: Fix for return in rescue not bubbling all the way out of the synthetic method generated to wrap it.
  • r4341, Sep 17: Adding super compilation, disabled until the whole findImplementers thing is tidied up in the generated code.
  • r4342, Sep 17: Enable super compilation! I had to disable an assertion to do it, but it doesn't seem to hurt things and I have a big fixme on it.
  • r4355, Sep 19: zsuper is getting closer, but still not enabled.
  • r4362, Sep 20: Enabled a number of additional flow-control syntax within ensure bodies, and def within blocks.
  • r4363, Sep 20: Re-disabling break in ensure; it caused problems for Rails and needs to be investigated in more depth.
  • r4367, Sep 21: Removing the overhead of constructing ISourcePosition objects for every line in the compiler; I moved construction to be a one-time cost and perf numbers went back to where they were before.
  • r4368, Sep 21: Small optz for literal string perf and memory use: cache a single bytelist per literal in the compiled class and share it across all literal strings constructed.
  • r4370, Sep 22: Enable compilation of multiple assignment with args (rest args)
  • r4375, Sep 24: Total refactoring of zsuper argument processing, and zsuper is now enabled in the compiler. We still need more/better tests and specs for zsuper, unfortunately.
  • r4377, Sep 24: Compile the remaining part of case/when, for when *whatever (appears as nested when nodes in the ast...why?)
  • r4388, Sep 25: Add compilation of global and constant assignment in masgn/block args
  • r4392, Sep 25: Compilation of break within ensurified sections; basically just do a normal breakjump instead of Java jumps
  • r4400, Sep 25: Fix for JRUBY-1388, plus an additional fix where it wasn't scoping constants in the right module.
  • r4401, Sep 25: Retry compilation!
  • r4402, Sep 26: Multiple additional cleanups, fixes, to the compiler; expand stack-based methods to include those with opt/rest/block args, fix a problem with redo/next in ensure/rescue; fix an issue in the ASTInspector not inspecting opt arg values; shrink the generated bytecode by offloading to CompilerHelpers in a few places. Ruby stdlib now compiles completely. Yay!
  • r4404, Sep 26: Add ASM "CheckClass" adapter to command-line (class file dumping) part of compiler.
  • r4405, Sep 26: A few additional fixes for rescue method names and reduced size for the pre-allocated calladapters, strings, and positions.
  • r4410, Sep 27: A number of additional fixes for the compiler to remedy inconsistent stack issues, and a whole slew of work to make apps run correctly with AOT-compiled stdlib. Very close to "complete" in my eyes.
  • r4412, Sep 27: Fixes to top-level scoping for AOT-compiled methods, loading sequence, and some minor compiler tweaks to make rubygems start up and run correctly with AOT-compiled stdlib.
  • r4413, Sep 27: Fixed the last known bug in the compiler. It is now complete.
  • r4414, Sep 27: Ok, now the compiler is REALLY complete. I forgot about BEGIN and END nodes. The only remaining node that doesn't compile is OptN, which we won't put in the compiled output (we'll just wrap execution of scripts with the appropriate logic). It's a good day to be alive!
I think I've done a decent job proving you can get some serious work done on the road, even while preparing two talks and hob-nobbing with fellow geeks. But of course this is an enormous milestone for JRuby in general.

For the first time ever, there is a complete, fully-functional Ruby 1.8 compiler. There have been other compilers announced that were able to handle all Ruby syntax, and perhaps even compile the entire standard library. But they have never gotten to what in my eyes is really "complete": being able to dump the stdlib .rb files and continue running nontrivial applications like IRB or RubyGems. I think I'm allowed to be a little proud of that accomplishment. JRuby has the first complete and functional 1.8-semantics compiler. That's pretty cool.

What's even more cool is that this has all been accomplished while keeping a fully-functional interpreter working in concert. We've even made great strides in speeding up interpreted mode to almost as fast as the C implementation of Ruby 1.8, and we still have more ideas. So for the first time, there's a mixed-mode Ruby runtime that can run interpreted, compiled, or both at the same time. Doubly cool. This also means that we don't have to pay a massive compilation cost for 'eval' and friends, and that we can be deployed in a security-restricted environment where runtime code-generation is forbidden.

I will try to prepare a document soon about the design of the compiler, the decisions made, and what the future holds. But for now, I have at least one teaser for you to chew on: there is a second compiler in the works, this time for creating real Java classes you can construct and invoke directly from Java-land. Yes, you heard me.

Compiler #2

Compiler #2 will basically take a Ruby class in a given file (or multiple Ruby classes, if you so choose) and generate a normal Java type. This type will look and feel like any other Java class:
  • You can instantiate it with a normal new MyClass(arg1, arg2) from Java code
  • You can invoke all its methods with normal Java invocations
  • You can extend it with your own Java classes
The basic idea behind this compiler is to take all the visible signatures in a Ruby class definition, as seen during a quick walk through the code, and turn them into Java signatures on a normal class. Behind the scenes, those signatures will just dynamically invoke the named method, passing arguments through as normal. So for example, a piece of Ruby code like this:
class MyClass
def initialize(arg1, arg2); end
def method1(arg1); end
def method2(arg1, arg2 = 'foo', *arg3); end
end
Might produce a Java class equivalent to this:
public class MyClass extends RubyObject {
public MyClass(Object arg1, Object arg2) {
callMethod("initialize", arg1, arg2);
}

public Object method1(Object arg1) {
return callMethod("method1", arg1);
}

public Object method2(Object arg1, Object... optAndRest) {
return callMethod("method2", arg1, optAndRest);
}
}
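
To make the three bullets above concrete, here's a minimal sketch of what using that generated class from plain Java might look like. This is purely illustrative: the caller class, the argument values, and the subclass name are all my own, and the final generated API may differ.

// Hypothetical Java code using the generated MyClass above.
public class MyClassCaller {
    public static void main(String[] args) {
        // Instantiate it with a normal constructor call...
        MyClass obj = new MyClass("one", "two");

        // ...and invoke its methods with normal Java invocations.
        Object result = obj.method1("arg");
        System.out.println(result);
    }
}

// It can also be subclassed like any other Java type.
class MySubclass extends MyClass {
    public MySubclass(Object arg1, Object arg2) {
        super(arg1, arg2);
    }
}

The point is that, from the caller's perspective, nothing here looks like reflection or an embedding API; it's just Java.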
It's a pretty trivial amount of code to generate, but it completes that "last mile" of Java integration, being directly callable from Java and directly integrated into Java type hierarchies. Triply cool?

Of course the use of Object everywhere is somewhat less than ideal, so I've been thinking through implementation-independent ways to specify signatures for Ruby methods. The requirement in my mind is that the same code can run in JRuby and any other Ruby without modification, but in JRuby it will gain additional static type signatures for calls from Java. The syntax I'm kicking around right now looks something like this:
class MyClass
...
{String => [Integer, Array]}
def mymethod(num, ary); end
end
If you're unfamiliar with it, this is basically just a literal hash syntax. The return type, String, is associated with the types of the two method arguments, Integer and Array. In any normal Ruby implementation, this line would be executed, a hash constructed, and execution would proceed, with the hash likely getting quickly garbage collected. However, Compiler #2 would encounter these lines in the class body and use them to create method signatures like this:
    public String mymethod(int num, List ary) {
...
}
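
For illustration, here's one plausible way the body behind that signature could be filled in inside the generated class (which, like the earlier example, would extend RubyObject and inherit something like callMethod). The toRuby and toJava helpers are hypothetical stand-ins for whatever argument and return-value conversion ends up being used; treat this as a sketch, not the actual generated code.

    public String mymethod(int num, List ary) {
        // Convert the Java arguments into Ruby objects (hypothetical helpers),
        // dispatch dynamically to the Ruby method by name,
        // then convert the Ruby result back to the declared Java return type.
        Object result = callMethod("mymethod", toRuby(num), toRuby(ary));
        return (String) toJava(result, String.class);
    }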

The final syntax is of course open for debate, but I can assure you this compiler will be far easier to write than the general compiler. It may not be complete before JRuby 1.1 in November, but it won't take long.

So there you have it, friends. Our work on JRuby has shown that it is possible to fully compile Ruby code for a general-purpose VM, and even that Ruby can be made to integrate as a first-class citizen on the Java platform, fitting in wherever Java code may be used today.

Are you as excited as I am?

Wednesday, September 19, 2007

Are Authors Technological Poseurs?

Recently, the JRuby team has gone through the motions of getting a definitive JRuby book underway. We've talked through outlines and some chapter assignments, and discussed the overall feel of a book and how it should progress. I believe one or two of us may have started writing. However, the entire exercise has made one thing abundantly clear to me:

Good authors do not have time to be good developers.

Think of your favorite technical author, perhaps one of the more insightful, or the one who takes the most care in their authoring craft. Now tell me one serious, nontrivial contribution they've made in the form of real code. It's hard, isn't it?

Of course I don't intend to paint with too wide a brush. There are, without a doubt, good authors who manage to keep a balance between words and code. But I'm increasingly of the opinion that it's not practical, or perhaps even possible, to maintain a serious dedication to both writing and coding.

What brings me to this conclusion is the growing realization that working on a JRuby book would--for me--mean a good bit less time spent working on JRuby and related projects. I fully intend to make a large contribution to the eventual JRuby book, but JRuby as a passion has meant most of my waking hours are spent working on real, difficult problems (and in some cases, really difficult problems). It physically pains me to consider taking time away from that work.

And I do not believe it's from a lack of interest in writing. I have long wanted to write a book, and as most of my blog posts should attest, I love putting my thoughts and ideas into a written form. I enjoy crafting English prose almost as much as I enjoy crafting excellent code. But at the end of the day, I am still a coder, and that is where my heart lies. I suspect I am not alone.

When I decided to write this post, I tried to think of concrete examples. Though many came to mind, I could not think of a way to name names without seeming malicious or disrespectful. So I leave it as an exercise for the reader. Am I totally off base? Or is there a direct correlation between the quality and breadth of an author's work and a suspicious (or obvious) lack of real, concrete development?

Friday, September 14, 2007

The End Is Near For Mongrel

At conferences and online, Tom and I have long been talking about a mystery deployment option coming soon from the GlassFish team. It would combine the agile command-line-friendly model of Mongrel with the power and simplicity of deploying to a Java application server. We have shown a few quick demos, but they have only been very simple and very limited. Plus there was no public release we could give people.

Until Now.

Today, you can finally download the GlassFish-Rails preview release gem.

So what is this tasty little morsel? Well, it's a 2.9MB Ruby gem containing the GlassFish server and the Grizzly connector for JRuby on Rails. It installs a "glassfish_rails" script in JRuby's bin directory, and you're done.

Witness!

~ $ gem install glassfish-gem-10.0-SNAPSHOT.gem
Successfully installed GlassFish, version 10.0.0
~ $ glassfish_rails testapp
Sep 14, 2007 3:00:45 PM com.sun.enterprise.v3.services.impl.GrizzlyAdapter postConstruct
INFO: Listening on port 8080
Sep 14, 2007 3:00:46 PM com.sun.enterprise.v3.services.impl.DeploymentService postConstruct
INFO: Supported containers : phobos,php,web,jruby
Sep 14, 2007 3:00:46 PM com.sun.grizzly.standalone.StaticResourcesAdapter <init>
INFO: New Servicing page from: /Users/headius/testapp/public
/Users/headius/NetBeansProjects/jruby/lib/ruby/gems/1.8/gems/actionmailer-1.3.3/lib/action_mailer.rb:50 warning: already initialized constant MAX_LINE_LEN
Sep 14, 2007 3:00:53 PM com.sun.enterprise.v3.server.AppServerStartup run
INFO: Glassfish v3 started in 9233 ms

That's all there is to it...you've got a production-ready server. Oh, did I mention you only have to run one instance? No more managing a dozen mongrel processes, ensuring they stay running, starting and stopping them all. One command, one process.

Of course this is a preview...we expect to see bug reports and find issues with it. For example, it currently deploys under a context rather than at the root of the server, so my app above would be available at http://localhost:8080/testapp instead of http://localhost:8080/. That's going to be fixed soon (and configurable) but for now you'll want to set the following in environment.rb:
ActionController::AbstractRequest.relative_url_root = "/<app name>/"
ActionController::CgiRequest.relative_url_root = "/<app name>/"

And of course, you're going to be running JRuby, so you'll need to take that into consideration. JRuby's general Rails performance still needs more tweaking and work to surpass Mongrel + Ruby, but out of the box you already get stellar static-file performance with the GlassFish gem...something like 2,500 req/s for the testapp index page on my system. JRuby performance in general is continuing to improve as well...we'll get there soon.

So! Give it a try, report bugs on the GlassFish issue tracker, and let us know on the GlassFish mailing lists what you'd like to see improved.

Mongrel...your days are numbered.

Thursday, September 13, 2007

How Easy Is It To Contribute To JRuby?

Answer: Very Easy!

Get the Code

svn co http://svn.codehaus.org/jruby/trunk/jruby
cd jruby
ant

Run Your Built JRuby
export PATH=$PATH:$PWD/bin
jruby my_script.rb

Install Gems
gem install somegem
OR
jruby -S gem install somegem

Run the Benchmarks
jruby -J-server -O test/bench/bench_method_dispatch.rb
(-J-server uses the "server" JVM and -O disables ObjectSpace)


Build and Test Your Changes
ant clean test

Look For Bugs To Fix or Report Your Own

JRuby JIRA (bug tracker)


Join the Mailing Lists

JRuby Mailing Lists


Join Us on IRC

Channel #jruby on FreeNode IRC


We're waiting to hear from you!

Wednesday, September 12, 2007

InfoWorld Bossies Close to my Heart

A couple of interesting "Bossies" were awarded by InfoWorld this week:

  • Best Open Source Programming Language - Summary: Ruby gets mad props for a vibrant community and a diverse range of implementations (e.g. JRuby), and then they go squishy and say "and these other languages are great too!"
  • Best Open Source IDE - InfoWorld heaps praise on NetBeans largely because it has *not* gone the way of an amorphous, all-encompassing "platform" and has continued to take risks to improve the overall experience of the IDE. Before the work on NetBeans 6 (don't download M10; use a daily build) I was a naysayer myself; having used NetBeans 6 for many months now, I can honestly say it's far better than NetBeans 5.5 and has caught up with or passed Eclipse in many ways. There's more to do, but I'm extremely impressed with the progress.
I try not to be too much of a Sun marketing shill, but both Bossies are pretty close to my heart. JRuby has been my passion for three years now, and NetBeans has done a serious about-face with greatly improved Java support and best-in-class Ruby support. It's good to see some recognition for hard work.