Headius

Saturday, December 08, 2007

Upcoming Events: Dec 2007, Jan/Feb 2008

JavaPolis 2007 - Antwerp, Belgium - December 10-14 - Sounds like a great event this year, with claims of over 3200 registrations so far. I'll be sharing the JRuby/NetBeans tutorial with Brian Leonard on the 10th and the JRuby/Rails talk with Ola Bini on the 12th. Outside of that, I'll probably be hacking in the main area. Come say hi.

Microsoft Lang.NET Symposium - Redmond, Washington - January 28-30 - I'll be there to get ideas about building a language platform, sharing war stories with fellow language implementers, and probably contributing a bit to John Rose's talk on the Multi-Language VM project. Oughta be a fun time...though it feels a bit weird making my first trip to Microsoft.

acts_as_conference - Orlando, Florida - February 8-9 - Robert Dempsey of Rails For All invited me to come talk about JRuby and Rails...though I'll be doing things a bit differently this time (not showing how to build a Rails app, but showing purely how JRuby improves the Rails ecosystem). Who could pass up a trip to Florida from Minnesota at this time of year?

FOSDEM 2008 - Brussels, Belgium - February 23-24 - FOSDEM invited me to present on the OSS languages track. I've got some great ideas for how to tackle this one. Given that it's an OSS conference, I think it's finally time to show how JRuby has evolved in the past three years from a slow, partial interpreter and runtime to the fastest Ruby 1.8-compatible implementation around. It's been a hell of a ride, and it's gotta qualify as an OSS success story.

Outside these four events, I've had invitations for plenty of others (I could probably just do conferences...but how would I ever get anything done?) so I'm sure there will be more to come. You can also count on JavaOne in San Francisco this spring, Ruby Kaigi in Tokyo this summer, RubyConf Europe in Prague some time between April and July, and maybe RailsConf 2008 in Portland (though there's a good chance I won't be presenting).

Friday, December 07, 2007

Groovy 1.5 Released!

The Groovy team has kicked out their second major production release, Groovy 1.5...and skipped straight from 1.0 to 1.5. Why? Perhaps because they added generics, enums, static imports, annotations, fully dynamic metaclasses, improved performance, ... and much more. I think the move to 1.5 was certainly warranted, and we've been considering making the next JRuby release 1.5 for the same reasons.

Congratulations to the Groovy team! I'm looking forward to seeing 1.6 and 2.0 in the future!

OpenJDK Migration to Mercurial is Complete!

I'm really excited about this one. Kelly O'Hair reports that OpenJDK source has been fully migrated to Mercurial! This means that daily development on OpenJDK (eventually to produce Java 7 and other great things) will happen on the same repository that you, dear reader, can access from home. And it's using Mercurial, one of the two big Distributed SCM apps, so you can pull off an entire repo and maintain your own OpenJDK workshop at home. Excellent news...I now finally have an excuse to learn Hg, and I can finally put in the effort to get OpenJDK building with the knowledge that I'll safely be able to "pull" changes as they happen. Thank you to the OpenJDK migration team!

See also Kelly O'Hair's sun.com blog for articles on the OpenJDK Mercurial layout and how to work with it.

Wednesday, December 05, 2007

Groovy in Ruby: Implement Interface with a Map

Some of you may know I participate in the Groovy community as well. I'm hoping to start contributing some development time to the Groovy codebase, but for now I've mostly been monitoring their progress. One thing the Groovy team has more experience with is integrating with Java.

Now if you ask the Groovy team, they'll make some claim like "it's all Java objects" or "Groovy integrates seamlessly with Java" but neither of those is entirely true. Groovy does integrate extremely well with Java, but it's because of a number of features they've added over time to make it so...many of them not directly part of the Groovy language but features of their core libraries and portions of their runtime.

Since Ruby and Groovy seem to be the two most popular (or noisiest) non-Java JVM languages these days, I thought I'd start a series of posts showing how to add Groovy features missing from Ruby to JRuby. But there's a catch: I'll use only Ruby code to do this, and what I show will work on any unmodified JRuby release. That's the beauty of Ruby: the language is so flexible and fluid, you can implement many features from other languages without ever modifying the implementation.

First up, Groovy's ability to implement an interface from a Map.

1. impl = [
2.   i: 10,
3.   hasNext: { impl.i > 0 },
4.   next: { impl.i-- },
5. ]
6. iter = impl as Iterator
7. while ( iter.hasNext() )
8.   println iter.next()

Ok, this is Groovy code. The brackety thing assigned to 'impl' shows Groovy's literal Map syntax (a Hash to you Rubyists). Instead of providing literal strings for the keys, Groovy automatically turns whatever token is in the key position into a Java String. So 'i' becomes a String key referencing 10, 'hasNext' becomes a String key referencing a block of code that checks if impl.i is greater than zero, and so on.

The magic comes on line 6, where the newly-constructed Map is coerced into a java.util.Iterator implementation. The resulting object can then be passed to other code that expects Iterator, such as the while loop on lines 7 and 8, and the values from the Map will be used as the code for the implemented methods.

To be honest, I find this feature a bit weird. In JRuby, you can implement a given interface on any class, add methods to that class at will, and get most of this functionality without ever touching a Hash object. But it's pretty simple to implement this in JRuby:

module InvokableHash
  def as(java_ifc)
    java_ifc.impl {|name, *args| self[name].call(*args)}
  end
end

Here we have one of Ruby's wonderful modules, which I appreciate more each day. This InvokableHash module provides only a single method 'as' which accepts a Java interface type and produces an implementation of that type that looks up each called method's name in the hash and invokes the proc stored there. That's really all there is to it. So by reopening the Hash class, we gain this functionality:

class Hash
  include InvokableHash
end

And we're done! Let's see the fruits of our labor in action:

1. impl = {
2.   :i => 10,
3.   :hasNext => proc { impl[:i] > 0 },
4.   :next => proc { impl[:i] -= 1 }
5. }
6. iter = impl.as java.util.Iterator
7. while (iter.hasNext)
8.   puts iter.next
9. end

Our final Ruby code looks roughly like the Groovy code. On lines 1 through 5 we construct a literal Hash. Notice that instead of automatically turning identifier tokens into Strings, Ruby uses the exact object you specify for the key, and so here we use Ruby Symbols as our hash keys (they're roughly like interned Strings, and highly recommended for hash keys). On line 6, we coerce our Hash into an Iterator instance (and we could have imported Iterator above to avoid the long name). And then lines 7 through 9 use the new Iterator impl in exactly the same way as the Groovy code.
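The same dispatch idea can be tried without a JVM. The sketch below is plain Ruby and involves no Java interface at all: DuckHash and as_duck are hypothetical names invented for illustration, standing in for the java_ifc.impl call that JRuby provides. It only shows how a hash of procs can back an object's methods.

```ruby
# Plain-Ruby sketch of hash-backed dispatch. DuckHash and as_duck are
# hypothetical names for illustration; they are not part of JRuby.
module DuckHash
  # Build an object whose methods are backed by the callable values
  # stored in this hash. Non-callable entries (like :i) stay plain data.
  def as_duck
    table = self
    duck = Object.new
    each do |name, value|
      next unless value.respond_to?(:call)
      duck.define_singleton_method(name) { |*args| table[name].call(*args) }
    end
    duck
  end
end

# Count down from `start` using a hash of procs, as in the post.
def countdown(start)
  impl = {
    :i => start,
    :hasNext => proc { impl[:i] > 0 },
    :next => proc { impl[:i] -= 1 }
  }
  impl.extend(DuckHash)   # extend one hash instead of reopening Hash
  iter = impl.as_duck
  seen = []
  seen << iter.next while iter.hasNext
  seen
end

p countdown(3)   # prints [2, 1, 0]
```

Note that `-= 1` returns the already-decremented value, so this counts 2, 1, 0 rather than 3, 2, 1.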

You've gotta love a language this flexible, especially with JRuby's magic Java integration features to back it up.

Thursday, November 29, 2007

Coming Soon

$ jruby -J-Djruby.compat.version=ruby1_9 -v
ruby 1.9.1 (2007-11-29 rev 4842) [i386-jruby1.1b1]

Wednesday, November 28, 2007

Tab Sweep November 28

WiiHear - Streaming audio for the Wii. Very cool. It was a little spotty for me, but your results may be better. And in related news: Super Mario Galaxy is the best Mario ever. I'll post a review later.

Easy(?) JRuby bugs from Rubinius specs - Vladimir Sizikov has been contributing new/fixed specs to Rubinius while running all the specs against JRuby. He's reported several new bugs as a result, which could potentially be easy API edge cases. Give them a shot. I'll try to post an "easy bug round-up" soon too.

File IO in Different Languages - A quick comparison of (very) basic file I/O in TCL, Python, PHP, Ruby, Groovy, Scala, and JavaScript. The poster then goes on to compare startup times for the Java implementations of these languages, showing an area where JRuby has some issues (slowest startup time, at around 1.6s). We know about the problem, and hopefully NailGun plus other improvements in the future will help. Largely, it's a JVM issue; classes are slow to load and verify, and we create literally hundreds of tiny classes.

Minneapolis - I live in Richfield, an "urban suburb" of Minneapolis (often dubbed Minneapolis Junior), but I generally just say I'm from Minneapolis, since most people recognize it. The Wikipedia article on Minneapolis is excellent (as is the one on the Twin Cities area). A few choice facts:

Minneapolis has more per-capita theater seats than any US city other than New York
The Twin Cities is one of the warmest places in Minnesota, with an average annual temperature of 45.4 °F. Monthly average daily high temperatures range from 21.9 °F (-5.6 °C) in January to 83.3 °F (28.5 °C) in July; the average daily minimum temperatures for the two months are 4.3 °F (-15.4 °C) and 63.0 °F (17 °C) respectively. So it gets cold in the winter and hot in the summer, but neither too hot nor too cold. It's nice to have four seasons.
Every home in Minneapolis is no more than six blocks away from a wooded park. In the summer, Minneapolis from the air looks like a bunch of skyscrapers surrounded by a bed of trees.

It's a nice place to live. I'd be hard pressed to find somewhere better.

Oniguruma Regular Expression Syntax version 5.6 - This is the library Marcin ported. I believe all of this is supported in Joni. Of course we wrap it so normal Ruby regular expressions work exactly as they would under Ruby 1.8, but Marcin will release Joni as a standalone library too.

Dr. Nic Installs Mingle with Capistrano - A nice walkthrough.

JavaPolis 2007 - I will be attending and co-presenting a JRuby/NetBeans tutorial with Brian Leonard and a JRuby/Rails session with Ola Bini. Only two weeks and I'll be in the land of beer, chocolate, and diamonds.

acts_as_conference 2008 - I will be attending and presenting something JRubyish and Railsy. Perhaps my last Rails presentation before handing such things off to people who actually know Rails? This conference also wins my vote for "most cumbersome title".

RailsConf 2008 CFP - I haven't decided if I plan to present anything this year. There are many, many other folks doing real-world JRuby on Rails work that could probably do a better job.

FOSDEM 2008 - I have been invited and will be attending. Back to the land of beer, chocolate and diamonds.

Tuesday, November 27, 2007

REXML Numbers With Joni

As Ola reported earlier today, we've merged Joni, Marcin Mielczynski's port of Oniguruma, to JRuby trunk. Here's the description from the Oniguruma home page:

Oniguruma is a regular expressions library.
The characteristics of this library is that different character encoding
for every regular expression object can be specified.

The benefit for us is avoiding the encode/decode we previously had to do for every regular expression match, since Ruby uses byte[]-based strings and all Java regular expression engines work with char[]. You can imagine the overhead all that array churn introduced.
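To make the mismatch concrete, here is a small plain-Ruby sketch (it needs Ruby 2.0 or later for String#b, which postdates this post). It shows that the same match sits at a different offset depending on whether the string is viewed as characters or as bytes. The helper name char_and_byte_offset is made up for illustration; this only demonstrates why a char[]-based engine forces a conversion step, and is not how JRuby or Joni handle it internally.

```ruby
# Hypothetical helper: where does `needle` start, counted in characters
# and counted in bytes? Requires Ruby 2.0+ for String#b.
def char_and_byte_offset(text, needle)
  [text.index(needle), text.b.index(needle.b)]
end

text = "na\u00EFve caf\u00E9"   # contains two 2-byte UTF-8 characters
p text.length                        # 10 characters
p text.bytesize                      # 12 bytes
p char_and_byte_offset(text, "caf")  # [6, 7]
```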

After running through a series of basic optimizations, most of the key expressions we worried about were performing as well as or much better than JRegex, so Ola went through with the conversion over the past couple days. Marcin is continuing to work on various optimizations, but both Ola and I have been playing with the new code. And it's looking great.

You may remember I reported recently about how the regexp bottleneck impacted XML parsing with REXML. Here are the numbers run against JRuby immediately before merging Joni:

read content from stream, no DOM
3.362000   0.000000   3.362000 (  3.362000)
1.232000   0.000000   1.232000 (  1.232000)
0.887000   0.000000   0.887000 (  0.887000)
1.009000   0.000000   1.009000 (  1.010000)
0.801000   0.000000   0.801000 (  0.801000)
read content once, no DOM
9.869000   0.000000   9.869000 (  9.869000)
9.779000   0.000000   9.779000 (  9.779000)
9.786000   0.000000   9.786000 (  9.786000)
9.655000   0.000000   9.655000 (  9.655000)
9.601000   0.000000   9.601000 (  9.601000)
read content from stream, build DOM
1.368000   0.000000   1.368000 (  1.368000)
1.297000   0.000000   1.297000 (  1.297000)
1.192000   0.000000   1.192000 (  1.192000)
1.131000   0.000000   1.131000 (  1.131000)
0.812000   0.000000   0.812000 (  0.812000)
read content once, build DOM
10.595000   0.000000  10.595000 ( 10.595000)
9.489000   0.000000   9.489000 (  9.488000)
9.947000   0.000000   9.947000 (  9.947000)
9.821000   0.000000   9.821000 (  9.821000)
9.414000   0.000000   9.414000 (  9.415000)

And here are the performance numbers today, with Joni:

read content from stream, no DOM
2.309000   0.000000   2.309000 (  2.308000)
1.217000   0.000000   1.217000 (  1.217000)
0.776000   0.000000   0.776000 (  0.776000)
0.825000   0.000000   0.825000 (  0.825000)
0.637000   0.000000   0.637000 (  0.637000)
read content once, no DOM
0.370000   0.000000   0.370000 (  0.369000)
0.415000   0.000000   0.415000 (  0.415000)
0.288000   0.000000   0.288000 (  0.288000)
0.260000   0.000000   0.260000 (  0.260000)
0.254000   0.000000   0.254000 (  0.254000)
read content from stream, build DOM
1.455000   0.000000   1.455000 (  1.455000)
0.916000   0.000000   0.916000 (  0.916000)
0.887000   0.000000   0.887000 (  0.888000)
0.827000   0.000000   0.827000 (  0.827000)
0.607000   0.000000   0.607000 (  0.607000)
read content once, build DOM
0.630000   0.000000   0.630000 (  0.630000)
0.664000   0.000000   0.664000 (  0.664000)
0.680000   0.000000   0.680000 (  0.680000)
0.553000   0.000000   0.553000 (  0.553000)
0.650000   0.000000   0.650000 (  0.650000)

Marcin's being modest about the work, but we're all absolutely amazed by it.

So finally the last really gigantic performance bottleneck in JRuby is gone, and it appears that JRuby's slow regexp era has come to a close. Next targets: the remaining issues with IO and Java integration performance.

Java 6 Port for OS X (Tiger and Leopard)

I just stumbled across this little gem today:

Landon Fuller's JDK 6 Port for OS X

Who Landon Fuller is I don't know. But I find it incredibly impressive that he's managed to get the base JDK 6 ported to OS X and working. Talk about showing the value of an open-source JDK...Landon Fuller FTW. Apple, are you hiring? Perhaps this guy can kick the Apple JDK process in the ass.

So naturally when I'm confronted with a final JDK 6 release for OS X one thing immediately springs to mind: performance.

We'd always suspected that the early preview version of JDK 6 on OS X was not showing us the true awesome performance we could expect from a final version.

We were right.

So I'll give you two sets of numbers, one that's specious and unreliable and the other that's a more real-world test.

fib numbers

Yes, good old fib. A constant in benchmarking. It shows practically nothing, and yet people use it to demonstrate perf. And in the case of Ruby 1.9, they've specifically optimized for integer-math-heavy benchmarks like this.
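The benchmark source isn't shown here, so the following is only a guess at its shape: a naive recursive fib timed with Ruby's standard Benchmark.measure, whose output has the same four-column format as the timings in this post. The argument (25) and the run count (5) are assumptions chosen to keep the run short, not the values used for the numbers reported here.

```ruby
require 'benchmark'

# Naive doubly-recursive Fibonacci: nothing but method dispatch and
# integer math, which is why it flatters integer-optimized runtimes.
def fib(n)
  n < 2 ? n : fib(n - 1) + fib(n - 2)
end

# Time several runs so warm-up effects (such as a JIT kicking in) show
# up as a slower first line. Returns the Benchmark::Tms results.
def time_fib(n, runs)
  (1..runs).map { Benchmark.measure { fib(n) } }
end

time_fib(25, 5).each { |tms| puts tms }
```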

Ruby 1.9:

  0.400000   0.000000   0.400000 (  0.413737)
 0.420000   0.010000   0.430000 (  0.421622)
 0.400000   0.000000   0.400000 (  0.411591)
 0.410000   0.000000   0.410000 (  0.411593)
 0.400000   0.000000   0.400000 (  0.410080)
 0.410000   0.000000   0.410000 (  0.408836)
 0.400000   0.000000   0.400000 (  0.408572)
 0.410000   0.000000   0.410000 (  0.408114)
 0.400000   0.000000   0.400000 (  0.410374)
 0.400000   0.000000   0.400000 (  0.413096)

Very nice numbers, especially considering Ruby 1.8 benchmarks at about 1.7s on my system. Ruby 1.9 contains several optimizations for integer math, including the use of "tagged integers" for Fixnum values (saving object costs) and fast math opcodes in the 1.9 bytecode specification (avoiding method dispatch). JRuby does neither of these, representing Fixnums as a normal Java object containing a wrapped Long and dispatching as normal for all numeric operations.
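As a rough illustration of the tagging idea (a sketch of the general technique, not code from either implementation): C Ruby stores a small integer n directly in the machine word as 2n + 1, so the low bit marks "this is an integer, not a pointer" and no object has to be allocated. The method names below are invented for the example.

```ruby
# Encode an integer the way a tagged-word VM might: shift left one bit
# and set the low bit as the "immediate integer" marker.
def tag_fixnum(n)
  (n << 1) | 1
end

# A word with the low bit set is an immediate integer, not a pointer.
def tagged_fixnum?(word)
  (word & 1) == 1
end

# Recover the integer with an arithmetic right shift.
def untag_fixnum(word)
  word >> 1
end

word = tag_fixnum(21)
p word                   # 43
p tagged_fixnum?(word)   # true
p untag_fixnum(word)     # 21
```

The cost is one bit of range; the gain is that integer values never touch the heap.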

JRuby trunk:

  0.783000   0.000000   0.783000 (  0.783000)
 0.510000   0.000000   0.510000 (  0.510000)
 0.510000   0.000000   0.510000 (  0.510000)
 0.506000   0.000000   0.506000 (  0.506000)
 0.505000   0.000000   0.505000 (  0.504000)
 0.507000   0.000000   0.507000 (  0.507000)
 0.510000   0.000000   0.510000 (  0.510000)
 0.507000   0.000000   0.507000 (  0.507000)
 0.508000   0.000000   0.508000 (  0.508000)
 0.510000   0.000000   0.510000 (  0.510000)

This is improved from numbers in the 0.68s range under the Apple JDK 6 preview. Pretty damn hot, if you ask me. I love being able to sit back and do nothing while performance numbers improve. It's a nice change from 16 hour days.

Anyway, back to performance. JRuby also supports an experimental frameless execution mode that omits allocating and initializing per-call frame information. In Ruby, frames are used for such things as holding the current method visibility, the current "self", the arguments and block passed to a method, and so on. But in many cases, it's safe to omit them entirely. I haven't got it running 100% safe in JRuby yet, and probably won't before 1.1 final comes out...but it's on the horizon. So then...numbers.

JRuby trunk, frameless execution:

  0.627000   0.000000   0.627000 (  0.627000)
 0.409000   0.000000   0.409000 (  0.409000)
 0.401000   0.000000   0.401000 (  0.401000)
 0.402000   0.000000   0.402000 (  0.402000)
 0.403000   0.000000   0.403000 (  0.403000)
 0.403000   0.000000   0.403000 (  0.403000)
 0.404000   0.000000   0.404000 (  0.405000)
 0.401000   0.000000   0.401000 (  0.401000)
 0.403000   0.000000   0.403000 (  0.403000)
 0.405000   0.000000   0.405000 (  0.405000)

Hello hello? What do we have here? JRuby actually executing fib faster than an optimized Ruby 1.9? Can it truly be?

Pardon my snarkiness, but we never thought we'd be able to match Ruby 1.9's integer math performance without seriously stripping down Fixnum and introducing fast math operations into the compiler. I guess we were wrong.
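To make the framed-versus-frameless distinction concrete, here is a toy model in plain Ruby. It is not JRuby's implementation: ToyFrame and ToyRuntime are hypothetical, and the only point is that the framed path allocates and pushes an object on every call while the frameless path does not.

```ruby
# Toy model only; these names are hypothetical and do not mirror JRuby
# internals. The fields follow the frame contents described in the post.
ToyFrame = Struct.new(:self_obj, :visibility, :args, :block)

class ToyRuntime
  attr_reader :frames_allocated

  def initialize
    @stack = []
    @frames_allocated = 0
  end

  # Framed call: allocate a frame, push it, run the body, pop it.
  def call_framed(receiver, *args, &body)
    @stack.push(ToyFrame.new(receiver, :public, args, nil))
    @frames_allocated += 1
    body.call(*args)
  ensure
    @stack.pop
  end

  # Frameless call: run the body with no per-call bookkeeping.
  def call_frameless(_receiver, *args, &body)
    body.call(*args)
  end

  # Current depth of the toy frame stack.
  def depth
    @stack.size
  end
end

rt = ToyRuntime.new
p rt.call_framed(:main, 2, 3) { |a, b| a + b }     # 5
p rt.call_frameless(:main, 2, 3) { |a, b| a + b }  # 5
p rt.frames_allocated                              # 1
```

A frameless call is only safe when the body never needs what the frame would have held, which is why it can't simply be switched on everywhere.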

M. Ed Borasky's MatrixBenchmark

I like Borasky's matrix benchmark because it's a non-trivial piece of code, and pulls in a Ruby standard library (matrix.rb) as well. It basically inverts a matrix of a particular size and multiplies the original by the inverse. I show here numbers for a 64x64 matrix, since it's long enough to show the true benefit of JRuby but short enough I don't get bored waiting.

Ruby 1.9:

Hilbert matrix of dimension 64 times its inverse = identity? true
21.630000   0.110000  21.740000 ( 21.879126)

JRuby trunk:

Hilbert matrix of dimension 64 times its inverse = identity? true
14.780000   0.000000  14.780000 ( 14.780000)

This is down from 16-17s under the Apple JDK 6 preview, and it takes roughly a third less wall-clock time than Ruby 1.9 (14.78s versus 21.88s).
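For anyone who wants to check the comparison, the relative difference can be worked out from the two wall-clock figures printed above. The helper names here are made up for the example.

```ruby
# Percentage of wall-clock time saved by `candidate` relative to `baseline`.
def percent_less_time(baseline, candidate)
  ((baseline - candidate) / baseline * 100).round(1)
end

# How many times faster `candidate` is than `baseline`.
def speedup(baseline, candidate)
  (baseline / candidate).round(2)
end

ruby19 = 21.879126  # wall-clock seconds reported above for Ruby 1.9
jruby  = 14.780000  # wall-clock seconds reported above for JRuby trunk
p percent_less_time(ruby19, jruby)  # 32.4
p speedup(ruby19, jruby)            # 1.48
```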

So what have we learned today?

Sun's JDK 6 provides frigging awesome performance
Apple users are crippled without a JDK 6 port. Apple, I hope you're paying attention.
Landon Fuller is my hero of the week. I know Landon will just point at the excellent work to port JDK 6 to FreeBSD and OpenBSD...but give yourself some credit, you did what none of the other Leopard whiners did.
JRuby rocks

Note: You have to be a Java Research License licensee to legally download the binary or source versions of Landon's port. That or complain to Apple about some dude making a working port before they did. Landon mentions on his blog that he plans to contribute this work to OpenJDK soon...which would quickly result in a buildable GPLed JDK for OS X. Awesome.

Friday, November 23, 2007

Oracle Mix Proving JRuby is "The Best Way"?

I have a tendency to write intentionally inflammatory posts whose responses usually end up evenly divided between "yea" and "nay". But this time, I've got Rich Manalang's post from the JRuby on Rails front lines to back me up.

Rich is one of the primaries (Rich: THE primary?) behind Oracle's new Mix site, the first highly-visible public site based on JRuby on Rails. And after the experience, he's convinced that JRuby is "the best way" to deploy Rails.

Read the whole article, but I think Rich's final paragraph sums it up pretty well:

This was an amazing project to be a part of. And one thing I’ll say is that for anyone working in a Java EE environment where you have to use the stack that’s there, the future is bright and it’s all because of jRuby, Rails, and the speed and agility at which you can build applications on that framework. I’m convinced that jRuby is [the] best way to deploy a Rails app if you need performance and flexibility. My prediction: next year will be the year for jRuby’s rise into the mainstream.

Thursday, November 22, 2007

JRuby on ME Devices?

Roy Hayun presented his "JRuby on ME" talk this past JavaOne and got a pretty solid response. He ported a pre-1.0 JRuby to CDC by incrementally stripping out libraries and functionality that couldn't be supported. He succeeded, and yesterday delivered to us a buildable version of his JRubME 0.1 (that name was a typo on his original JavaOne submission...but the missing 'y' produced a comically good name).

In the JRuby download "research" area, you can find the results of his work plus some docs and a short presentation on JRubyME. As near as I can tell he based it on JRuby 0.9.8. The same work could probably be done to produce a stripped version of current JRuby, and with a bit more work we could probably tweak and reconfigure the source to support ME execution as a normal build target. Both are exercises for you all until there's more time (or Ruby on ME becomes a primary goal rather than performance and compatibility on SE :) )

There you go! You wanted JRuby on ME, now you have a damn good start! Run with it!

Wednesday, November 21, 2007

GlassFish Gem Build Instructions

Hopefully by now you've heard about the GlassFish Gem. It's a roughly 3MB gem that includes only the pieces of GlassFish necessary to launch a production JRuby on Rails server. Instead of using WAR deployment, you just run "glassfish_rails" and point at your Rails dir. The result is a multi-request production-ready server.

But there are things that could be improved. It deploys apps under a subcontext, rather than at the root context. It doesn't appear to route static content correctly. It doesn't provide options for configuring the deployed app, like for connection pools, number of JRuby runtimes to spin up, and so on. Basically, it needs people to try it out and provide improvement suggestions.

Now you can build the GlassFish Gem yourself: Arun has provided step-by-step instructions on how to get all the necessary files and generate your own GlassFish Gem. I believe this could be the best way to deploy JRuby on Rails apps for both production and development use, so I really hope you'll give it a try and offer suggestions and patches back to the GlassFish team (or funnel them through me if you like).

Tab Sweep

Inspired by Tim Bray's periodic tab sweeps (Tim, do you collect unread tabs into the dozens like I do?) here's the first of hopefully many tab sweeps from me.

Kindle by Amazon - I've been waiting for an ePaper-based reader, but I don't think this is going to be the one. No cables or syncing? That says to me everything I read has to go through Amazon, and my existing PDFs will be worthless. Black and white. Ugly as sin ("Hello 1996? I found something that belongs to you"). Flop.

Neal Ford on JRuby and Ruby versus others (podcast) - Neal does a great job explaining JRuby to the masses while understanding the deeper reasons why it's one of the better languages for the JVM. One note to Neal: JRuby is far from being a simple port of the MRI C code; it's grown far, far beyond that now and is considerably more advanced in design and implementation.

Tim Bray's two-question Ruby tools survey - Tim's pretty good at keeping his finger on the pulse of the dev community. Guess that's why he's a director and distinguished engineer.

John Rose's report on his and my meeting with the PyPy team - John summed it up pretty well, but he's interested for different reasons than I. He's interested in finding ways to evolve the JVM to support the sorts of optimizations the PyPy JIT is performing (or will perform, where it's not complete yet). I'm interested in the possibility of a generic language toolchain that allows you to build your language of choice in a subset of your language of choice; effectively a way to quickly bootstrap a language's most rabid users directly into the implementation process, rather than forcing them to use a new language like Java or C#. If it can be done, my money's on that approach beating all "Language Runtimes" that saddle implementers with exactly the sorts of languages they don't want to use.

Glimmer - Despite sharing its name with the ill-fated Whitney Houston movie, Glimmer appears to be an interesting yet-another-JRuby-GUI-framework based on SWT and data binding. Is it obvious yet that GUI development in the C Ruby world is sorely lacking?

Greg Haygood mixes traditional JSP webapp and Rails in the same WAR - Perfectly valid. Now to start blurring the lines between JRuby on Rails and the various Java web frameworks and technologies.

Oracle Mix, the first big-time public JRuby on Rails site - A top-level oracle.com site running JRuby on Rails. That's hardcore. And I don't think they're even running JRuby 1.1, with all its performance glory. And I know about upcoming sites you don't. Future is bright.

David Bock with another OS X Java rant - But he's right about one thing...until Java 6 comes out on OS X, JRuby users will have to be content running the slower Java 5 or the somewhat-flaky Java 6 developer preview (no longer available for download). Write your congressman.

Tor Norbye's NetBeans Ruby features for last week - The guy's a machine. Awesome stuff, and more to come.

JRuby CafePress Store - Any proceeds will go...somewhere, I dunno. I just put it up because I got tired of people whining that they wanted JRuby t-shirts. I didn't bump up any prices, so who knows if there will even be any profit from them. If there is, I'll leave it be until there's something to use it for. If not, so be it. BTW: Don't get the logo on a black shirt; it won't look right.

Experiments with frameless execution - (site is down at the moment; try later) JRuby, like many other language implementations on general-purpose VMs, suffers from the overhead of heap-allocated "frame objects" that hold information about the current method call. They're needed because various languages often need per-call information on an easily accessible stack, or need to be able to tuck call frames away for future use (in closures, for example). They're overhead because the JVM already is allocating (and frequently optimizing away) call frames for the underlying Java code, and there's no way to get at those frames or overload them with new duties. IronPython is a notable example of a language impl that has opted to avoid frame objects, at the cost of some Python features. JRuby, in an effort to support all Ruby features, still has frame objects; but it may be possible to optimize them away in certain cases. The link has microbenchmark numbers for framed and frameless execution. Both are faster than Ruby 1.9; frameless by several times.

Paul Brannan's Ludicrous JIT compiler for Ruby - Promising work, and it gets some decent gains over Ruby 1.9.

Tuesday, November 20, 2007

Bytecode Tools in Ruby: A Low-level DSL

I've been toying with the idea of rewriting the JRuby compiler in Ruby, or at least writing the appropriate plumbing that would allow someone to do something similar. Migrating the JRuby compiler may or may not be worth it, since the existing Java compiler is basically done and working well, and a conversion would be sure to introduce bugs here and there. But it would certainly be a show of faith to give it a try.

As part of this effort, I've built up some basic utility code and a simple JVM bytecode builder that could act as the lowest level of such a compiler. I'm looking for input on the syntax at this point, while I take a break from it to explore JRuby Java integration improvements I think should be done before 1.1.

So here's the Ruby source of the builder, as contained within a test case:

require 'test/unit'
require 'compiler/builder'
require 'compiler/signature'

class TestBuilder < Test::Unit::TestCase
  import java.lang.String
  import java.util.ArrayList
  import java.lang.Void
  import java.lang.Object
  import java.lang.Boolean

  include Compiler::Signature

  def test_class_builder
    cb = Compiler::ClassBuilder.build("MyClass", "MyClass.java") do
      field :list, ArrayList

      constructor(String, ArrayList) do
        aload 0
        invokespecial Object, "<init>", Void::TYPE
        aload 0
        aload 1
        aload 2
        invokevirtual this, :bar, [ArrayList, String, ArrayList]
        aload 0
        swap
        putfield this, :list, ArrayList
        returnvoid
      end

      static_method(:foo, this, String) do
        new this
        dup
        aload 0
        new ArrayList
        dup
        invokespecial ArrayList, "<init>", Void::TYPE
        invokespecial this, "<init>", [Void::TYPE, String, ArrayList]
        areturn
      end

      method(:bar, ArrayList, String, ArrayList) do
        aload 1
        invokevirtual(String, :toLowerCase, String)
        aload 2
        swap
        invokevirtual(ArrayList, :add, [Boolean::TYPE, Object])
        aload 2
        areturn
      end

      method(:getList, ArrayList) do
        aload 0
        getfield this, :list, ArrayList
        areturn
      end

      static_method(:main, Void::TYPE, String[]) do
        aload 0
        ldc_int 0
        aaload
        invokestatic this, :foo, [this, String]
        invokevirtual this, :getList, ArrayList
        aprintln
        returnvoid
      end
    end

    cb.write("MyClass.class")
  end
end

For those of you who don't speak bytecode, here's roughly the Java code that this would produce:

import java.util.ArrayList;

public class MyClass {
  public ArrayList list;

  public MyClass(String a, ArrayList b) {
      list = bar(a, b);
  }

  public static MyClass foo(String a) {
      return new MyClass(a, new ArrayList());
  }

  public ArrayList bar(String a, ArrayList b) {
      b.add(a.toLowerCase());
      return b;
  }

  public ArrayList getList() {
      return list;
  }

  public static void main(String[] args) {
      System.out.println(foo(args[0]).getList());
  }
}

The general idea is that fairly clean-looking Ruby code can be used to generate real Java classes, providing a readable base for code generation tools like compilers.

There are a couple of things to notice here:

Everything is public. I have not wired in visibility and other modifiers mainly because it starts to look cluttered no matter how I try. Suggestions are welcome.
The bytecode, while clean looking, is pretty raw. This interface also doesn't save you from yourself; if you're not ordering your bytecodes right, you'll end up with an unverifiable class file.
It's not apparent just from looking at the code which types specified are return values and which are argument values. Something more explicit could be useful here.
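For readers wondering how bare words like "aload 0" can work as Ruby at all, one common way to get that style is instance_eval: the block runs with the builder as self, so each opcode is just a method call. Whether the compiler/builder library works this way isn't stated here, so treat the following as a plain-Ruby toy. ToyMethodBuilder and its opcode list are hypothetical, and it only records instructions rather than emitting bytecode.

```ruby
# Hypothetical toy recorder; NOT the compiler/builder library from the
# post. It collects instructions as arrays instead of emitting bytecode.
class ToyMethodBuilder
  # Small, assumed subset of opcodes, just enough for the demo.
  OPCODES = [:aload, :getfield, :putfield, :swap, :areturn, :returnvoid]

  attr_reader :instructions

  def initialize
    @instructions = []
  end

  # Run the block with self set to a fresh builder, so bare words such
  # as `aload 0` resolve to the opcode methods defined below.
  def self.build(&block)
    builder = new
    builder.instance_eval(&block)
    builder
  end

  # Define one recording method per opcode.
  OPCODES.each do |op|
    define_method(op) do |*operands|
      @instructions << [op, *operands]
      nil
    end
  end
end

body = ToyMethodBuilder.build do
  aload 0
  getfield :this, :list, "java.util.ArrayList"
  areturn
end

body.instructions.each { |insn| p insn }
```

Defining the opcodes explicitly, rather than catching everything with method_missing, means a misspelled opcode raises a NameError instead of being silently recorded.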

I'd like to continue this work. The above code, run against JRuby trunk and the lib/ruby/site_ruby/1.8/compiler library I'm working on, will produce a working MyClass class file:

~/NetBeansProjects/jruby $ jruby test/compiler/test_builder.rb
Loaded suite test/compiler/test_builder
Started
.
Finished in 0.096 seconds.

1 tests, 0 assertions, 0 failures, 0 errors
~/NetBeansProjects/jruby $ java -cp . MyClass foo
[foo]

So it's actually emitting the appropriate bytecode for this class.

Comments? Thoughts for improvement?

Monday, November 19, 2007

Have You Written RubySpec Today?

You know Ruby. You've been coding up Ruby apps for a while now. You've seen the power and the magic that Ruby offers developers, and you've seen some of the weirder, wilder, and perhaps uglier sides of Ruby. You think you're pretty well-versed in the core classes, and you can hold your own when metaprogramming.

You're a Ruby Programmer. Now Prove it.

RubySpec is a wiki-based Ruby specification, aimed at forming an English-language, community-driven spec for Ruby 1.8 (and in the future, Ruby 1.9). There's a lot of content there already, and a lot of stubbed articles and missing details. And it's your turn to contribute.

You'll be in good company. Ruby oldschoolers like why the lucky stiff and Matz himself have contributed updates and fixes. Ryan Davis contributed the content of his Ruby Quickref. Those of us on the JRuby team try to update it when we find peculiarities in the Ruby language, classes, or runtime. And many folks use it as a convenient reference.

RubySpec is connected with the Ruby Documentation Project, which also hosts Ruby 1.8 and 1.9 RDoc-generated documentation directly from the C source. The spec is designed to fill in the gaps, explaining more details about the language, the runtime, and the implementations that would be useful to folks interested in a deeper look at Ruby.

So...have you written RubySpec today?

The First Ruby Mailing List Translator

In response to my post about the Ruby community needing an autotranslator for the key mailing lists, due to the language barrier between English-speaking and Japanese-speaking folks, I received quite a bit of interest, and a few people are looking into automatic solutions, manual solutions, and various solutions in between. But I believe we have a first attempt to meet the challenge.

Jason Toy has set up a mailing list translator site for the Ruby community. It provides autotranslated text of many Ruby mailing lists (both directions, and far more than I expected anyone to tackle), and even better it provides the original text and invites bilingual Rubyists to submit better translations for individual posts.

Jason commented on the previous post, saying his site still needs some work, but it's definitely on the right track. The obvious missing feature is a way to subscribe to the translated lists, either via feeds or mailing lists. Anyone who feels like lending a hand can contact Jason at (I think) admin@translator.rubynow.com. And if you feel like doing this one better, maybe by setting up an army of human translators to proxy information across the divide, don't let this stop you :)

Saturday, November 17, 2007

RejectConf 4 Calls to Action: Ruby specs and JRuby failures

For those of you that don't know, RejectConf started last year at RubyConf 2006 in Denver because Ryan Davis (zenspider; author of the many parsetree-based tools like flog and heckle, part of the vlad team, and so on) had a rejected presentation and still wanted to show it off. It turned out to be one of the most entertaining parts of the conference, with dozens of folks taking 5-15 minutes to demo something they thought would be cool. Some were met with applause, some were booed down Apollo-style, and everyone had a good time.

RejectConf 4 was at RubyConf 2007 earlier this month. I took the opportunity to toss out two calls to action:

Help contribute to Rubinius by adding specs, and if you're especially lazy just copy any of the many tests in JRuby (that are not specs, but which we'd just as soon see migrate to a single suite).
Report failures in JRuby running the specs, and possibly fix them if you feel like helping out even more.

Confreaks have all the RejectConf 4 videos up now, and my call to action is there with the others. Check it and the others out, and consider helping out either the spec project or JRuby.

Monday, November 05, 2007

Ruby Community Seeks Autotranslator

As many of you know, Ruby was created in Japan by Yukihiro Matsumoto, and most of the core development team is still Japanese to this day. This has posed a serious problem for the Ruby community, since the language barrier between the Japanese core team and community and the English-speaking community is extremely high. Only a few members of the core team can speak English comfortably, so discussions about the future of Ruby, bug fixes, and new features happen almost entirely on the Japanese ruby-dev mailing list. That leaves those of us English speakers on the ruby-core mailing list out in the cold.

We need a two-way autotranslator.

Yes, we all know that automated translation technology is not perfect, and that for East Asian languages it's often barely readable. But even having partial, confusing translations of the Japanese emails would be better than having nothing at all, since we'd know that certain topics are being discussed. And English to JP translators do a bit better than the reverse direction, so core team members interested in ruby-core emails would get the same benefit.

I imagine this is also part of the reason Rails has not taken off as quickly in Japan as it has in the English-speaking world: the Rails core team is peopled primarily by English speakers, and the main Rails lists are all in English. Presumably, an autotranslating gateway would be useful for many such communities.

But here's the problem: I know of no such service.

There are multiple translation services, for free and for pay, that can handle Japanese to some level. Google Translate and Babelfish are the two I use regularly. But these only support translating a block of text or a URL entered into a web form. There also does not appear to be a Google API for Translate, so screen-scraping would be the only option at present.

The odd thing about this is that autotranslators are good enough now that there could easily be a generic translation service for dozens of languages. Enter in source and target languages, source and target mailing lists, and it would busily chew through mail. For closely-related European languages, autotranslators do an extremely good job. And just last night I translated a Chinese blog post using Google Translate that ended up reading as almost perfect English. The time is ripe for such a service, and making it freely available could knock down some huge barriers between international communities.

So, who's going to set it up first and grab the brass ring (or is there a service I've overlooked)?

Updated Alioth Numbers for JRuby 1.1b1

Oh yes, you all know you love the Alioth Shootout.

Isaac Gouy has updated the JRuby numbers, and modified the default comparison to be with Ruby 1.8.6 rather than with Groovy as it was before. And true to form, JRuby is faster than Ruby on 14 out of 18 benchmarks.

There are reasons for all four benchmarks that are slower:

pidigits is simply too short for JRuby to hit its full stride. Alioth runs it with n = 2500, which on my system doing a simple "time" results in JRuby taking 11 seconds and Ruby taking 5. If I bump that up to 5000, JRuby takes 27 seconds to Ruby's 31.
regex-dna and reverse-complement are both hitting the Regexp performance problem we have in JRuby 1.0.x and in the 1.1 beta. We expect to have that resolved for 1.1 final, and Ola Bini and Marcin Mielczynski are each developing separate Regexp engines to that end.
startup, beyond being a touch unfair for a JVM-based language right now, is actually about half our fault and half the JVM's. The JVM half is the unpleasantly high cost of classloading, and specifically the cost of generating many small temporary classes into their own classloaders, as we have to do in JRuby to avoid leaking classes as we JIT and bind methods at runtime. The JRuby half is the fact that we're loading and generating so many classes, most of them too far ahead of time or that will never be used. So there's blame to go around, but we'll never have Ruby's time for this.
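
If you want to see the warm-up effect behind the pidigits point for yourself, timing several iterations in the same process is more telling than a single "time" run. Here's a minimal sketch; workload and time_iterations are stand-in names I made up, and the loop body is just a placeholder for whatever benchmark you actually care about.

```ruby
require 'benchmark'

# Placeholder workload (hypothetical); swap in the real benchmark body.
def workload(n)
  (1..n).inject(0) { |sum, i| sum + i * i }
end

# Run the workload several times in one process and keep each wall-clock time,
# so early (cold) iterations can be compared against later (warmed-up) ones.
def time_iterations(iterations, n)
  (1..iterations).map do
    Benchmark.realtime { workload(n) }
  end
end

iteration_times = time_iterations(5, 200_000)
iteration_times.each_with_index do |seconds, i|
  puts "iteration #{i + 1}: #{'%.4f' % seconds}s"
end
```

The actual numbers will depend entirely on your machine and runtime, so treat this as a way to look at the shape of the curve rather than as a benchmark in its own right.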

Standard disclaimer applies about reading too much into these kinds of benchmarks, but it's certainly a nice set of numbers to wake up to.

Closures Prototype Applied to JRuby Compiler

A bit ago, I was catching up on my feeds and noticed that Neal Gafter had announced the first prototype of Java closures. I've been a fan of the BGGA proposal, so I thought I'd catch up on the current status and try applying it to a pain point in the JRuby source: the compiler.

The current compiler is made up of two halves: the AST walker and the bytecode emitter. The AST walker recursively walks the AST, calling appropriate methods on a set of interfaces into the bytecode emitter. The bytecode emitter, in turn, spits out appropriate bytecodes and calls back to the AST walker. Back and forth, the AST is traversed and all nested structures are assembled appropriately into a functional Java method.
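
To make the back and forth concrete, here's a toy version of the same shape in Ruby. None of this is the real JRuby compiler: SketchBranchEmitter, SketchWalker, the array-based AST, and the instruction strings are all made up for illustration. The walker knows about nodes, the emitter knows about labels and jumps, and lambdas carry control from one to the other.

```ruby
# Toy emitter: records instruction strings instead of writing bytecode.
class SketchBranchEmitter
  attr_reader :ops

  def initialize
    @ops = []
  end

  def emit(op)
    @ops << op
  end

  # The emitter owns the jump structure and calls back into the walker
  # whenever it needs one of the sub-expressions compiled.
  def boolean_branch(cond, then_branch, else_branch)
    cond.call(self)
    emit "ifeq else"
    then_branch.call(self)
    emit "goto done"
    emit "else:"
    else_branch.call(self)
    emit "done:"
  end
end

# Toy walker over an AST made of nested arrays,
# e.g. [:if, cond, then_body, else_body] or [:lit, value].
module SketchWalker
  def self.compile(node, emitter)
    case node.first
    when :lit
      emitter.emit "push #{node[1].inspect}"
    when :if
      _, cond, then_body, else_body = node
      emitter.boolean_branch(
        lambda { |e| compile(cond, e) },
        lambda { |e| compile(then_body, e) },
        lambda { |e| compile(else_body, e) })
    else
      raise ArgumentError, "unknown node #{node.first}"
    end
  end
end

branch_emitter = SketchBranchEmitter.new
SketchWalker.compile([:if, [:lit, true], [:lit, 1], [:lit, 2]], branch_emitter)
puts branch_emitter.ops
```

In Ruby the callbacks are one-liners, which is exactly the convenience the Java version lacks.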

This back and forth is key to the structure and relative simplicity of the compiler. Take for example the following method in ASTCompiler.java, which compiles a Ruby "next" keyword (similar to Java's "continue"):

public static void compileNext(Node node, MethodCompiler context) {
  context.lineNumber(node.getPosition());

  final NextNode nextNode = (NextNode) node;

  ClosureCallback valueCallback = new ClosureCallback() {
      public void compile(MethodCompiler context) {
          if (nextNode.getValueNode() != null) {
              ASTCompiler.compile(nextNode.getValueNode(), context);
          } else {
              context.loadNil();
          }
      }
  };

  context.pollThreadEvents();
  context.issueNextEvent(valueCallback);
}

First, the "lineNumber" operation is called on the MethodCompiler, my interface for the primary bytecode emitter. This emits bytecode for line number information based on the parsed position in the Ruby AST.

Then we get a reference to the NextNode passed in.

Now here's where it gets a little tricky. The "next" operation can be compiled in one of two ways. If it occurs within a normal loop, and the compiler has an appropriate jump location, it will compile as a normal Java GOTO operation. If, on the other hand, the "next" occurs within a closure (and not within an immediately-enclosing loop), we must initiate a non-local branch operation. In short, we must throw a NextJump.

In Ruby, unlike in Java, "next" can take an optional value. In the simple case, where "next" is within a normal loop, this value is ignored. When a "next" occurs within a closure, the given value becomes the local return from that invocation of the closure. The idea is that you might write code like this, where you want to do an explicit local return from a closure rather than let the return value "fall off the end":

def foo
  puts "still going" while yield
end

a = 0
foo {next false if a > 4; a += 4; true}

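...which simply prints "still going" twice: the block returns true for a = 0 and a = 4, and on the third call a is 8, so "next false" ends the loop.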
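
If the "local return" behavior is new to you, here's a separate, smaller illustration of a value passed to "next" becoming the result of that one invocation of the block. The collect_three helper is just something I made up for the example.

```ruby
# The value given to "next" becomes the block's result for that element.
evens_doubled = [1, 2, 3, 4].map do |x|
  next 0 if x % 2 == 1
  x * 2
end
p evens_doubled   # => [0, 4, 0, 8]

# Hypothetical helper that yields three times and keeps each result.
def collect_three
  [yield(1), yield(2), yield(3)]
end

results = collect_three { |n| next :skipped if n == 2; n * 10 }
p results         # => [10, :skipped, 30]
```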

The straightforward way to compile this non-local "next" would be to evaluate the argument, construct a NextJump object, swap the two so we can call the NextJump(IRubyObject value) constructor with the given value, and then raise the exception. But that requires us to juggle values around all the time. This simple case doesn't seem like such a problem, but imagine the hundreds or thousands of nodes the compiler will handle for a given method, all spending at least part of their time juggling stack values around. It would be a miserable waste.

So the compiler constructs a poor-man's closure: an anonymous inner class. The inner class implements our "ClosureCallback" interface which has a single method "compile" accepting a single MethodCompiler parameter "context". This allows the non-local "next" bytecode emitter to first construct the NextJump, then ask the AST compiler to continue processing AST nodes. The compiler walks the "value" node for the "next" operation, again causing appropriate bytecode emitter calls to be made, and finally we have our value on the stack, exactly where we want it. We continue constructing the NextJump and happily toss it into the ether.

The final line of the compileNext method initiates this process.
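
Here's a toy comparison of the two orderings, again in Ruby and again not the real emitter: SketchEmitter and both method names are made up, and the instruction strings only approximate what the JVM sequence would look like. The point is where the value lands relative to the NextJump object.

```ruby
# Toy emitter that records instruction names instead of writing bytecode.
class SketchEmitter
  attr_reader :ops

  def initialize
    @ops = []
  end

  def emit(op)
    @ops << op
  end

  # Value first: the stack then has to be shuffled into constructor order.
  def next_with_juggling
    yield self                            # stack: [value]
    emit "new NextJump"                   # stack: [value, jump]
    emit "dup_x1"                         # stack: [jump, value, jump]
    emit "swap"                           # stack: [jump, jump, value]
    emit "invokespecial NextJump.<init>"  # stack: [jump]
    emit "athrow"
  end

  # Callback ordering: build the object first, then ask for the value.
  def next_with_callback
    emit "new NextJump"                   # stack: [jump]
    emit "dup"                            # stack: [jump, jump]
    yield self                            # stack: [jump, jump, value]
    emit "invokespecial NextJump.<init>"  # stack: [jump]
    emit "athrow"
  end
end

juggled = SketchEmitter.new
juggled.next_with_juggling { |e| e.emit "load value" }

direct = SketchEmitter.new
direct.next_with_callback { |e| e.emit "load value" }

puts "juggling: #{juggled.ops.join(', ')}"
puts "callback: #{direct.ops.join(', ')}"
```

One extra instruction doesn't look like much here, but it's paid at every node that needs its operands in a particular order.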

So what would this look like with the closure specification in play? We'll simplify it with a function object.

public static void compileNext(Node node, MethodCompiler context) {
  context.lineNumber(node.getPosition());

  final NextNode nextNode = (NextNode) node;

  ClosureCallback valueCallback = { MethodCompiler context =>
      if (nextNode.getValueNode() != null) {
          ASTCompiler.compile(nextNode.getValueNode(), context);
      } else {
          context.loadNil();
      }
  };

  context.pollThreadEvents();
  context.issueNextEvent(valueCallback);
}

That's starting to look a little cleaner. Gone is the explicit "new"ing of a ClosureCallback anonymous class, along with the superfluous "compile" method declaration. We're also seeing a bit of magic outside the function type: closure conversion. Our little closure that accepts a MethodCompiler parameter is being coerced into the appropriate interface type for the "valueCallback" variable.

How about a more complicated example? Here's a much longer method from JRuby that handles "operator assignment", or any code that looks like a += b:

public static void compileOpAsgn(Node node, MethodCompiler context) {
  context.lineNumber(node.getPosition());

  // FIXME: This is a little more complicated than it needs to be;
  // do we see now why closures would be nice in Java?

  final OpAsgnNode opAsgnNode = (OpAsgnNode) node;

  final ClosureCallback receiverCallback = new ClosureCallback() {
      public void compile(MethodCompiler context) {
          ASTCompiler.compile(opAsgnNode.getReceiverNode(), context); // [recv]
          context.duplicateCurrentValue(); // [recv, recv]
      }
  };

  BranchCallback doneBranch = new BranchCallback() {
      public void branch(MethodCompiler context) {
          // get rid of extra receiver, leave the variable result present
          context.swapValues();
          context.consumeCurrentValue();
      }
  };

  // Just evaluate the value and stuff it in an argument array
  final ArrayCallback justEvalValue = new ArrayCallback() {
      public void nextValue(MethodCompiler context, Object sourceArray,
              int index) {
          compile(((Node[]) sourceArray)[index], context);
      }
  };

  BranchCallback assignBranch = new BranchCallback() {
      public void branch(MethodCompiler context) {
          // eliminate extra value, eval new one and assign
          context.consumeCurrentValue();
          context.createObjectArray(new Node[]{opAsgnNode.getValueNode()}, justEvalValue);
          context.getInvocationCompiler().invokeAttrAssign(opAsgnNode.getVariableNameAsgn());
      }
  };

  ClosureCallback receiver2Callback = new ClosureCallback() {
      public void compile(MethodCompiler context) {
          context.getInvocationCompiler().invokeDynamic(
                  opAsgnNode.getVariableName(), receiverCallback, null,
                  CallType.FUNCTIONAL, null, false);
      }
  };

  if (opAsgnNode.getOperatorName() == "||") {
      // if lhs is true, don't eval rhs and assign
      receiver2Callback.compile(context);
      context.duplicateCurrentValue();
      context.performBooleanBranch(doneBranch, assignBranch);
  } else if (opAsgnNode.getOperatorName() == "&&") {
      // if lhs is true, eval rhs and assign
      receiver2Callback.compile(context);
      context.duplicateCurrentValue();
      context.performBooleanBranch(assignBranch, doneBranch);
  } else {
      // eval new value, call operator on old value, and assign
      ClosureCallback argsCallback = new ClosureCallback() {
          public void compile(MethodCompiler context) {
              context.createObjectArray(new Node[]{opAsgnNode.getValueNode()}, justEvalValue);
          }
      };
      context.getInvocationCompiler().invokeDynamic(
              opAsgnNode.getOperatorName(), receiver2Callback, argsCallback,
              CallType.FUNCTIONAL, null, false);
      context.createObjectArray(1);
      context.getInvocationCompiler().invokeAttrAssign(opAsgnNode.getVariableNameAsgn());
  }

  context.pollThreadEvents();
}

Gods, what a monster. And notice my snarky comment at the top about how nice closures would be (it's really there in the source, see for yourself). This method obviously needs to be refactored, but there's a key goal here that isn't addressed easily by currently-available Java syntax: the caller and the callee must cooperate to produce the final result. And in this case that means numerous closures.

I will spare you the walkthrough on this, and I will also spare you the one or two other methods in the ASTCompiler class that are even worse. Instead, we'll jump to the endgame:

public static void compileOpAsgn(Node node, MethodCompiler context) {
  context.lineNumber(node.getPosition());

  // FIXME: This is a little more complicated than it needs to be;
  // do we see now why closures would be nice in Java?

  final OpAsgnNode opAsgnNode = (OpAsgnNode) node;

  ClosureCallback receiverCallback = { MethodCompiler context =>
      ASTCompiler.compile(opAsgnNode.getReceiverNode(), context); // [recv]
      context.duplicateCurrentValue(); // [recv, recv]
  };

  BranchCallback doneBranch = { MethodCompiler context =>
      // get rid of extra receiver, leave the variable result present
      context.swapValues();
      context.consumeCurrentValue();
  };

  // Just evaluate the value and stuff it in an argument array
  ArrayCallback justEvalValue = { MethodCompiler context, Object sourceArray, int index =>
      compile(((Node[]) sourceArray)[index], context);
  };

  BranchCallback assignBranch = { MethodCompiler context =>
      // eliminate extra value, eval new one and assign
      context.consumeCurrentValue();
      context.createObjectArray(new Node[]{opAsgnNode.getValueNode()}, justEvalValue);
      context.getInvocationCompiler().invokeAttrAssign(opAsgnNode.getVariableNameAsgn());
  };

  ClosureCallback receiver2Callback = { MethodCompiler context =>
      context.getInvocationCompiler().invokeDynamic(
          opAsgnNode.getVariableName(), receiverCallback, null,
          CallType.FUNCTIONAL, null, false);
  };

  // eval new value, call operator on old value, and assign
  ClosureCallback argsCallback = { MethodCompiler context =>
      context.createObjectArray(new Node[]{opAsgnNode.getValueNode()}, justEvalValue);
  };

  if (opAsgnNode.getOperatorName() == "||") {
      // if lhs is true, don't eval rhs and assign
      receiver2Callback.compile(context);
      context.duplicateCurrentValue();
      context.performBooleanBranch(doneBranch, assignBranch);
  } else if (opAsgnNode.getOperatorName() == "&&") {
      // if lhs is true, eval rhs and assign
      receiver2Callback.compile(context);
      context.duplicateCurrentValue();
      context.performBooleanBranch(assignBranch, doneBranch);
  } else {
      context.getInvocationCompiler().invokeDynamic(
              opAsgnNode.getOperatorName(), receiver2Callback, argsCallback,
              CallType.FUNCTIONAL, null, false);
      context.createObjectArray(1);
      context.getInvocationCompiler().invokeAttrAssign(
              opAsgnNode.getVariableNameAsgn());
  }

  context.pollThreadEvents();
}

There are two things I'd like you to notice here. First, it's a bit shorter as a result of the literal function objects and closure conversion. It's also a bit DRYer, which naturally plays into code reduction. Second, there's far less noise to contend with. Rather than having a minimum of five verbose lines to define a one-line closure (for example), we now have three terse ones. We've managed to tighten the focus to the lines of code we're actually interested in: the bodies of the closures.

Of course this quick tour doesn't get into the much wider range of features that the closures proposal contains, such as non-local returns. It also doesn't show closures being invoked because with closure conversion many existing interfaces can be represented as function objects automatically.

I'll be looking at the closure proposal a bit more closely, and time permitting I'll try to get a simple JRuby prototype compiler wired up using the techniques above. I'd recommend you give it a try too, and offer Neal your feedback.

Sunday, November 04, 2007

Ruby Continues to Climb on TIOBE

I've posted about TIOBE here before.

The TIOBE Programming Community index gives an indication of the popularity of programming languages. The index is updated once a month. The ratings are based on the world-wide availability of skilled engineers, courses and third party vendors. The popular search engines Google, MSN, Yahoo!, and YouTube are used to calculate the ratings. Observe that the TIOBE index is not about the best programming language or the language in which most lines of code have been written.

I noticed this month that Ruby has moved up to #9 in the list, passing JavaScript. Also noted in the November Newsflash is that Ruby is currently the front runner to win "programming language of the year" for the second year in a row, closely followed by D and C#.

TIOBE Programming Community Index