Monday, August 04, 2008

libdl _dl_debug_initialize problem solved

I'm posting this to make sure it gets out there, so nobody else spends a couple days trying to fix it.

Recently, after some upgrade, I started getting the following message during JRuby's run of RubySpecs:

Inconsistency detected by ld.so: dl-open.c: 623: _dl_open: Assertion
`_dl_debug_initialize (0, args.nsid)->r_state == RT_CONSISTENT' failed!

I narrowed it down to a call to getgrnam in the 'etc' library, which we provide using JNA. Calling that function through any means caused this error.
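
For anyone hitting the same thing: the trigger was as simple as a plain group lookup through the 'etc' library. A minimal reproduction looks something like this (the group name here is just an example; any lookup that reaches the native getgrnam(3) call will do):

    require 'etc'

    # In JRuby this goes through JNA down to the native getgrnam(3),
    # which is what tripped the ld.so assertion on the affected system.
    group = Etc.getgrnam('daemon')
    puts group.name
    puts group.gid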

My debugging skills are pretty poor when it comes to C libraries, so I mostly started poking around trying to upgrade or downgrade stuff. Most other posts about this error seemed to do the same thing, but weren't very helpful about which libraries or applications were changed. Eventually I got back to looking at libc6 versions, to see if I could downgrade and hopefully eliminate the problem, when I saw this text in the libc6-i686 package:
    WARNING: Some third-party binaries may not work well with these
    libraries. Most notably, IBM's JDK. If you experience problems with
    such applications, you will need to remove this package.

Nothing else seemed to depend on libc6-i686, so I removed it. And voila, the problem went away.

This was on a pretty vanilla install of Ubuntu 7.10 desktop with nothing special done to it. I'm not sure where libc6-i686 came from.

On a side note, sorry to my regular readers for not blogging the past month. I've been hard at work on a bunch of new JRuby stuff for an upcoming 1.1.4 release. I'll tell you all about it soon.

Friday, June 27, 2008

JRuby Japanese Tour 2008 Wrap-Up!

Whew! I survived the insanity of the JRuby Japanese Tour 2008, and now it's time to report on it. This post will mostly be a blow-by-blow account of the trip, and I'll try to post more in-depth thoughts later. I am still in Tokyo and need to repack my luggage, so this will be brief.

Day 1

  • Left my warm, sunny vacation in Michigan to board flight #1 from Chicago to Minneapolis
  • First-class upgrade for Chicago-Minneapolis flight. Whoopie...it's like an hour flight.
  • Just enough time in Minneapolis airport to change some money. Hopefully 20k ¥ will be enough. Compatible cash machines are hard to come by in Japan.
  • Twelve-hour flight #2 from Minneapolis to Narita. Glad I moved to a window seat facing a bulkhead, since I was able to stretch out and sleep most of the flight.
  • Arrived in Narita without event; purchased bus ticket to Tsukuba and rented a cell phone.
  • Bus ride to Tsukuba went through some very nice countryside.
  • Arrived in Tsukuba, walked about five minutes to the hotel and ran into Chad Fowler, Evan Phoenix, Rich Kilmer and others gathering in the lobby.
  • Checked into my room, dropped off my stuff, went back down to lobby to find other American rubyists had taken off for dinner. Teh suck. Ate unimpressive dinner alone in hotel restaurant.
Day 2
  • Ruby Kaigi day one.
  • Sun folks provided a JRuby t-shirt, which was great because I forgot to pack one of mine!
  • Delivered JRuby presentation, and it went very well. Demos almost all worked perfectly, lots of questions showed people were impressed.
  • Met up with Sun guys at Kaigi booth for a bit. They were giving away little Duke+Ruby candies! Awesome!
  • At evening event, lots of discussion about JRuby, and I got to show off Duby a little bit too.
Day 3
  • Ruby Kaigi day two.
  • Some good talks, but definitely a "day two" slate.
  • Especially liked Naoto Takai and Koichiro Ohba's enterprise Ruby talk. Very pragmatic, hopefully very helpful for .jp Rubyists.
  • Met up with ko1 and Prof Kakehi from Tokyo University to discuss progress of MVM collaboration.
  • Had to leave before Reject Kaigi to transfer to a hotel near Haneda Airport.
  • Dinner at Haneda Airport with Takashi Shitamichi of Sun KK.
  • Stayed at nearby comfortable JAL Hotel, basically JAL's airport hotel.
Day 4
  • Quick breakfast at Haneda Airport; "morning set" included a half-boiled egg, toast, salad, coffee.
  • Flight #3 from Haneda to Izumo. Upgraded for 1000¥ to first class.
  • Takashi and I met Matsue folks at Izumo airport for a short ride to Matsue.
  • Met up with NaCl folks including Matz, Shugo Maeda, and others.
  • Listened to a presentation in English about a large Ruby app migrated from an old COBOL mainframe app. JRuby was used to interface Ruby with reporting solutions.
  • Lunch box at NaCl after presentation.
  • Played Shogi with Shugo after lunch. Shugo beat me pretty handily. He said I was strong (but I think he was just being polite).
  • Played Igo (Go) with Hideyuki Yasuda. He played with a 9-stone handicap and was winning when I had to leave. He invited me to join NaCl Igo club on KGS and said he'd like to continue the game online.
  • Delivered a lecture on JRuby with Takashi at Shimane University. Received some good questions, but it was a tough crowd (kids just starting out in CS).
  • Took a quick tour of Matsue Castle with Takashi and a personal guide. Basically ran to the top, looked around, and ran back down. Back to work!
  • Evening event at office of the "Ruby City Matsue" project. While everything was being set up, demonstrated JRuby/JVM/HotSpot optimizations for NaCl folks.
  • Delivered the opening toast for the evening event. Barely had time to eat between questions from folks. Showed off Duby and more HotSpot JIT-logging coolness.
  • Post-event trip to local Irish pub. Many Guinness were drunk. Many Rubyists were drunk.
  • Walked back to hotel earlier than others to get some sleep.
Day 5
  • Took a bus from Matsue back to Izumo Airport.
  • Flights #4 and #5 took me from Izumo to Haneda and Haneda to Fukuoka. Saw Mount Fuji from the air, poking just above the clouds.
  • Takashi and I missed flight #5, so we were delayed about an hour.
  • Arrived in Fukuoka, immediately raced over to Ruby Business Commons event. Delivered presentation to an extremely receptive crowd.
  • Post RBC event included beer, various dried fishy things, and lots of photo-taking and card-exchanging. Showed off Duby again, and met authors of an upcoming Japanese book on JRuby!
  • Invited out for more drinking (famous Fukuoka Shouchu), but Takashi wisely told them we were very tired.
Day 6
  • Breakfast at Izumo airport. "Honey toast" and coffee. Basically a thick slice of toast with butter and honey.
  • Flight back to Tokyo (Haneda) from Izumo.
  • Checked into Cerulean Tower Hotel in Shibuya, my final home for the trip.
  • Off to Shinagawa to present JRuby at Rakuten's offices. Across the street from Namco/Bandai! Lots of great questions...I think they were impressed.
  • Back to Shibuya with Takashi for Shabu-Shabu dinner. Mid-range Japanese beef...truly excellent. Ate way too much.
Day 7
  • Woke up a couple times in the night with indigestion. Why oh why did I eat so much beef?
  • Off to Sun KK offices in Yoga for a public techtalk event.
  • JRuby presentation wowed attendees...lots of questions after and great discussions.
  • Traditional Japanese-style dinner with Takai-san and Ohba-san, plus the excellent Sun KK JRuby enthusiasts.
  • Witnessed registration of new jruby-users.jp site and discussed new mascot ideas for JRuby. NekoRuby, perhaps?
  • Many new consumption firsts: Hoppy (cheap beer plus shouchu on ice), raw beef, and raw horse. I can check horse off my list.
  • Ramen in Shibuya to close out the night.
Day 8
  • Internal presentation on JRuby at Sun KK offices. Slim attendance...there were apparently HR training sessions at the same time. But still fun.
  • Big sigh of relief at being done presenting. Lunch at a Chinese restaurant with Takashi and said goodbye. どうもありがとうございます (thank you very much), Shitamichi-san!
  • Stopped into hotel to arrange remaining plans for the day.
  • Visited Edo-Tokyo Museum in Ryogoku, to see images and artifacts of the old capital. Choked up a bit watching videos of the incendiary bombing raids by the US on Tokyo.
  • Headed to Akihabara to meet up with ko1. Wandered around for about an hour before heading up to his "Sasada-lab".
  • Mini "JRuby Kaigi" at Sasada-lab. Showed off Duby, JRuby optimizations, and Ruby-Processing demos (plus applets!).
  • Off to "Meido Kissa" ("Maid Cafe") with ko1 and other members of ruby-core. Very unusual experience, but entertaining.
  • Back to hotel. Considered a trip to my favorite Belgian beer bar in Shibuya ("Belgo"), but I'm plumb tuckered out. Catching up on email, IRC, and writing this post.
Day 9
  • All that remains is getting to Narita and flying home. Hopefully all will go smoothly.
Well, that's about it. Any names I've excluded I'll hopefully include in more detailed posts later. To all the Japanese Rubyists (and JRubyists): thank you so much for making me feel so welcome, and I hope we can work together to make JRuby the best Ruby implementation it can possibly be. Please feel free to email me (Japanese is ok!) or find me on IRC (Japanese is ok!) and please try to help the jruby-users.jp site be a success when it is ready. I think there's a tremendous future for JRuby in Japan!

Monday, June 02, 2008

Inspiration from RailsConf

RailsConf 2008 is over, and it was by far better than last year. I'm not one for drawn-out conference wrap-up posts, so here's a summary of my most inspiring moments and, where applicable, how they're going to affect JRuby going forward.

  • IronRuby and Rubinius both running Rails has inspired me to finally knock out the last Rails bottlenecks in JRuby. Look for a release sometime this summer or later this fall to be accompanied by a whole raft of numbers proving better performance under JRuby than any other options. Oh, and huge congratulations to both teams, and I wish you the best of luck on the road to running larger apps.
  • Phusion's Passenger (formerly mod_rails) has made some excellent incremental improvements to MRI for running Rails. It's nothing revolutionary, but judging by the graphs they've managed 10-20% memory and perf improvements over the next best MRI-based option. We're going to try to match them by more aggressively sharing immutable runtime data across JRuby instances such as parsed Ruby code (which on some measurements accounts for almost 40% of a freshly-started app's memory use). We'd like to be able to say that JRuby is also the most memory-efficient way to run Rails in the near future.
  • The Maglev presentation inspired me to dive back into performance. For the most part, we stopped really working hard on performance once we started to be generally as fast as Ruby 1.9. Now we'll start pulling out all the stops and really kick JRuby into high gear.
  • Wilson Bilkovich impressed me most when he used the historically-correct "drinking the Flavor-Ade" instead of the incorrect but more popular "drinking the Kool-Aid".
  • Ezra's talk on Vertebra, Engine Yard's upcoming Erlang-based XMPP routing engine, almost inspired me to try out Erlang a bit. Almost. At any rate it sounds awesome...I am all set to write an agent plugin for JRuby when it's released and the protocol is published.
  • My keynote was generally pretty well received, but I had several people say I should have smiled more, and that it came off as a bit defensive. I think a lot of that had to do with getting only 10 minutes for the whole thing and trying to jam too much in, but I'll definitely pay attention to that in the future.
  • This was my first US-based Ruby-related conference where I did not play Werewolf. I don't expect to play much (or maybe at all) in the future. I've decided I don't really want to play a game where the best players are the ones who can learn to lie most convincingly. It seems like a crucial flaw in the game, and if I ever do play again I will try to make a strong case that to win, you should kill the most experienced players first. They'll never be a net good, because if they're good villagers with strong deductive skills, they're also likely to be good werewolves, with strong lying skills. Eject them immediately.
All told, a great conference. I'm looking forward to RailsConf EU 2008 and RubyConf 2008.

Sunday, June 01, 2008

Maglev

Of course anyone who reads my blog expected I'd have something to say about Maglev once it was made public. I've previously performed what I thought was a fair analysis of the various Ruby implementations, and Maglev was mostly a sidebar. With their coming out at RailsConf, they're now fair game for some level of analysis.

Avi Bryant and Bob Walker talked about Maglev, a new Ruby VM based on Gemstone's Smalltalk VM, at RailsConf this weekend. And there's been an explosion of coverage about it.

First off, they demonstrated its distributed object database automatically synchronizing globally-reachable state across multiple VMs. It's an amazing new idea that the world has never really seen...

except that it isn't. This is based on existing OODB technology that Gemstone and others have been promoting for better than a decade. It's cool stuff, no doubt, but it's been available in Gemstone's Smalltalk product and in their Java product for years, and hasn't seen widespread adoption. Maybe it's on the rise, I really don't know. It's certainly cool, but it's certainly not new.

The duo eventually moved on to show off some performance numbers. And please pardon me if I don't have these numbers exactly right. They showed our old friend fib running something like 15x faster. Method dispatch something like 30x faster. While loops 100x faster. Amazing results.

Except that these are results reported entirely in a vacuum. Whether this is fib following the "rules" of Ruby is entirely an open question. Whether this is method dispatch adhering to Ruby's call logic is entirely an open question. Whether this is a while loop using all method calls for its condition and increment steps is an open question. Because the Maglev guys haven't started running Ruby tests yet. Is it Ruby?

I don't want to come off as too defensive here, and I don't want to appear as though I'm taking shots at another implementation. I've certainly launched my share of controversial commentary at Rubinius and IronRuby over the past few months, and while some of it may perhaps have slipped over the edge of polite commentary, I always thought I was being at least honest.

But there's an entirely new situation with Maglev. Maglev has begun to publish glowing performance numbers well in advance of actually running anything at all. They haven't started running the RubySpecs and have no compatibility story today. You can't actually get Maglev yet and run anything on it. It's worse than Vaporware, it's Presentationware. Go to Gemstone's site and download Maglev (you can't). Pull the source (you can't). Build it yourself and investigate what it does (you can't). You start to understand what I mean. And this is what the "Ruby media" is calling the most disruptive new Ruby technology. Dudes, come on. Were you born yesterday?

It's time for a confession. I've been too hard on IronRuby and Rubinius. Both teams are working really hard on their respective implementations, and both teams have really tried to stay true to Ruby ideals in everything they do. Guess what...IronRuby runs Rails. Rubinius runs Rails. And if they're not production ready now, they will be soon. And that's a good thing for Ruby. Sure, I still believe both teams may have made unreasonable claims about what they'd be able to accomplish in a given period of time, but we've all made those claims. If they haven't delivered on all milestones, they've delivered on most of the important ones. And it's those milestones I think deserve some credit now.

My sin is pride. I'm proud of what we've accomplished with JRuby. And when new implementers come along saying they're going to do it in half the time, I feel like it belittles the effort we've put in. IronRuby has done it. Rubinius has done it. And while I've occasionally lashed out at them as a result, I've always been right there trying to help them...answering questions, contributing specs, suggesting strategies and even committing code. In the end it's the cockiness...the attitude...the belief that "I know better than you do" that irritates me, and I'm too sensitive to it. Color me human. But it's time for me and others to understand another side of IronRuby and Rubinius in light of this new contender.

The Rubinius and IronRuby teams have always considered compatibility the primary goal. If you can't run Ruby apps, you're not Ruby, right? And so every step of the way, as they published performance results AND compatibility metrics, they've always been honest about the future.

IronRuby has managed to get great performance on several benchmarks by leveraging the DLR and the excellent language implementation folks on the DLR and IronPython teams at Microsoft. So if nothing else, they've proven many of the "fast-bootstrapping" claims they've made about the DLR. And they've always been balanced in reporting results...John Lam has shown a couple slow benchmarks along with fast benchmarks at every talk, not to mention showing spec results with pass/fail rates clearly spelled out. That honesty has not gone unnoticed, and it shows a realism and humility that will ensure IronRuby's future; a realism that will ensure Ruby users who really want or need a .NET implementation will receive an excellent one.

Rubinius has taken an entirely new approach to implementing Ruby by attempting to write as much as possible in Ruby itself. Maybe they have a lot of C/C++ code right now, but it's not that big a deal...and I was perhaps too pedantic to focus on this ratio in previous posts. What's important is that Rubinius has always tried to be an entirely open, community-driven project. Their successes and failures are immediately accessible to anyone who wants to pull the source; and anyone who wants to pull the source can probably become a Rubinius contributor within a short amount of time. They've had performance ups and downs, but again they've been honest about both the good and the bad. And like IronRuby, if they haven't trumpeted the bad side of things, it's because they're already proving that the Ruby-in-Ruby approach absolutely can work. The bad side will lessen over time until it completely disappears.

Then there's Maglev. Like the other impls, I'm excited that there's a new possibility for Ruby to succeed. A high performance, "scalable" Ruby implementation is certainly what this community needs. But unlike most of the other implementations, it seems like Maglev is pushing performance numbers without compatibility metrics; marketing before reality. Am I far off here?

Let's take a step back. Maglev will probably be amazing. It will probably be fast, maybe on some order approaching the numbers they've reported. Maybe this will happen some day along with support for existing Ruby code. And hell, maybe I'll use it too...I want to be able to write applications in Ruby and have insane performance so I can just write code the way I want to write code. So do you.

But we're talking theory here. So let's do an experiment using JRuby briefly.

Maglev published fib numbers as being around 15x MRI performance. That's very impressive. So let's check MRI perf on my machine (keeping in mind, as I've stated previously, that fib is far from indicative of any real-world performance):

Ruby 1.8.6, fib(34), best of 10: 6.56s
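
For reference, the benchmark in question is just the usual doubly-recursive fib. The timing harness amounts to something like the following (the numbers above came from my own scripts, so the exact loop may differ slightly):

    def fib(n)
      n < 2 ? n : fib(n - 1) + fib(n - 2)
    end

    best = (1..10).map {
      start = Time.now
      fib(34)
      Time.now - start
    }.min

    puts "fib(34), best of 10: #{best}s"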

Now let's try stock JRuby, with full compatibility:

JRuby 1.1.2, fib(34), best of 10: 1.735s (3.8x faster)

Not bad, but certainly not up to Maglev speeds, right? Well...perhaps. JRuby, like IronRuby and Rubinius, has always focused first on compatibility. This means we're bending over backwards to make normal Ruby code run. So in many cases, we're doing more work than we need to, because compatibility has always been the primary goal. IronRuby and Rubinius will report the same process. Make it work, then make it fast. And both IronRuby and Rubinius are now starting to run Rails, so I think we've proven at least three times that this is the right approach.

But let's say we could tweak JRuby to run with some "future" optimizations, optimizations that might not be quite "Ruby" but which would still successfully run these benchmarks.

First, we'll turn off first-class frame object allocation/initialization, since it's not needed here:

JRuby 1.1.2, fib(34), no frames: 1.273s (5.15x faster than MRI)

Now we'll turn off thread checkpointing needed to implement operations like Thread#kill and Thread#raise, as well as turning off artificial line-position updates:

JRuby 1.1.2, fib(34), -frames, -checkpoints, -positions: 1.25s (5.24x faster)

Now we'll add in some fast integer operations like Ruby 1.9 includes, where Fixnum#+, -, etc are specially-handled by the compiler. And we'll simultaneously omit some last framing overhead that's still around to handle backtrace information:

JRuby 1.1.2, fib(34), "fastest" mode: 0.984s (6.67x faster)

So just by tweaking a few things we've gained another 3x performance over MRI. Are we having fun yet? Should we extrapolate to optimizations X, Y, Z that bring JRuby performance another half-dozen times faster than MRI? If we can run the benchmarks, it shouldn't matter that we can't run Ruby code, right?

The truth is that not all of these optimizations are kosher right now. Removing the ability to override Fixnum#+ certainly makes it easier to optimize addition, but it's not in the spirit of Ruby. Removing frames may be legal in some cases (like this one) but it's not legal in all cases. And of course I've blogged about how Thread#kill and Thread#raise are broken, but we have to support them anyway. On and on we can go through lots of optimizations you might make in the first 100 days of your implementation, only to back out later when you realize you're actually breaking features people depend on.
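
To make the Fixnum#+ point concrete, here's the kind of thing plain Ruby allows (a contrived example of my own, written against the 1.8-era Rubies of the day), which is exactly why an implementation can't blindly compile integer math down to raw machine adds without some kind of guard:

    # Reopen Fixnum and change what + means. A compiler that specialized
    # 1 + 2 into a bare machine add would silently ignore this redefinition.
    class Fixnum
      alias_method :original_plus, :+
      def +(other)
        puts "adding #{self} and #{other}"
        original_plus(other)
      end
    end

    puts 1 + 2   # prints the message, then 3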

This all adds up to a very different picture of Ruby implementation. Rather than wishing for a rose-colored world where anyone with a new VM can swoop in and post magic performance numbers, perhaps we as Ruby community members should be focusing on whether this is going to help us actually run today's apps any better; whether these results are repeatable in ways that actually help us get shit done. Perhaps we should be focusing on the compatibility story over bleeding-edge early performance numbers; focusing on tangible steps toward the future rather than the "furs and gold rings" that David warned about in his keynote. Maybe we should think more about the effect that broadcasting vaporware performance numbers will have on the community, rather than rushing to be the first to republish the latest numbers on the latest slides. Maybe it's worth taking all this microbenchmark nonsense with a grain of salt and trying it out ourselves (if, of course, that's even possible) before serving as the mouthpiece for others' commercial ventures.

Am I wrong? Am I being unfair? Am I taking an unreasonable shot at Maglev?

Wednesday, May 21, 2008

JRuby Pre-RailsConf Hackfest on Thursday

Hey there JRubyists and JRubyists-to-be...if you're planning to be in Portland the evening of Thursday, May 29, there's going to be a sponsored JRuby hackfest for you!

The event is going to be at the McMenamins Brewpub/Restaurant at the Kennedy School. We'll have a room set aside for around 40ish people, so please RSVP via email or comments. There will be a taco bar and beers/beverages provided!


JRuby Hackfest!
Thursday, May 29, 6:30PM - whenever
McMenamins at the Kennedy School
5736 N.E. 33rd Ave.
Portland, OR 97211

We're pushing this event as a real hackfest, so bring your laptops and apps you are running or want to run on JRuby. At least Tom Enebo, Nick Sieger, Ola Bini and I will be there, so we'll probably be able to get you up and running. Otherwise, if you don't have an app, stop by and we'll give you a walkthrough of JRuby and maybe there will be a bug or piece of code you can help out with.

I want to thank LinkedIn for initiating this event...Tom and I and the other JRubyists are too busy to set these sorts of things up, so it's great to have community members take this initiative. LinkedIn is also sponsoring, along with Joyent and Sun Microsystems.

Remember, please RSVP in comments or by email so we can get an idea of headcount. And if you have any suggestions for things you'd like to see or hear about or work on at the 'fest, let us know!

Monday, May 19, 2008

JRuby on Rails Fighting Infectious Disease

A new JRuby on Rails venture was just publicly announced. It's a collaboration between Collaborative Software Initiative and the State of Utah:

Portland, Ore., May 19, 2008 - Collaborative Software Initiative (CSI), the company that brings like-minded organizations together to work on collaborative software at a fraction of the cost, today announced the release of the first open source, web-based infectious disease reporting and management system.
The application is basically a system for reporting, investigating, and managing outbreaks of communicable disease. So if some kid at a local school contracts bacterial meningitis, this is the sort of system that would record the event and track related cases or contacts with that kid. Seems like a great application for JRuby on Rails, and a potential to see wide use.

Technical details are still a little sparse on the announcement and on the project site, but it's JRuby on Rails based, and "friend of JRuby" Mike Herrick is quoted in the article saying that they "look forward to rolling this out and talking to other states about how to implement it and improve the health and safety of their citizens."

Mike has promised me more information, but from our discussions with him he's very happy using JRuby on Rails. I think he's going to be at RailsConf next week, so if you're interested in talking to him you might drop an email his way.

It seems like JRuby is picking up speed.

Sunday, May 18, 2008

"Ask the Experts" Session: NetBeans 6 Ruby Support

Sun is running an "Ask the Experts" session on Ruby/JRuby support in NetBeans 6(.1) with myself, Tor Norbye, and Brian Leonard. If you've had questions about NetBeans Ruby support you'd like answered, but haven't had a chance to ask...here's your opportunity.

The main page is here: Ask the Experts

Fire away! You have all this week to get your questions in.

Saturday, May 17, 2008

The Road to Babel

It's Saturday and I'm going to be waxing poetic again. This time, it's stream-of-consciousness mind-dumping. Enjoy, or not.

The Game of Life

Every so often you have to take a step back from the daily grind and consider your position in the world. I've done this a few times during my career.

After a "checking in" manager discussion at the University of Minnesota in 1997, my conclusion was that I had been grossly undervalued in my last review. The solution was to strike out on my own, finally, for a job in the private sector. So I fell into my first post-University job earning twice what I got as a full-time member of the U of M Web Team.

I think I stuck around there for about 1.5 years. It wasn't a bad gig, but for whatever reason I really wanted to move up. When people made bad decisions, I wanted to be able to veto them or force a reevaluation rather than sitting by as projects failed. At the time, it seemed like management was the only path I could take, so I took some of the requisite courses, involved myself in more meetings and discussions, and generally tried to unofficially steer various projects in directions I thought would be more successful. Finally, I expressed my interest in moving into the management chain. "We're sure you'd be entirely capable, but we don't think it would be good for the team if you were leading folks who had been here much longer." Great. Age discrimination. So what if I was 23...I was fucking right, and I could have handled the job. Time to jump ship.

Thus began my dark foray into consulting. I hooked up with a local firm, consulting at a large Manufacturing and Mining company in Minnesota. Talk about stooges. These guys didn't know which way was up. They had some gigantic CORBA-based back-end for their product "store" they wanted to wire up to an ATG Dynamo front-end (which I *still* get job opportunities for). It was slow. They couldn't figure out why. Oh, is it a problem that all our web services, which are based at one end of the city, need to call back to a CORBA server at the other end of the city for every damn field? Nah, it couldn't be that. I think we need more hardware. Probably about a month before my contract ended, I got tired of arguing the case. BAIL.

At this point, I think I just got tired of trying. I landed a comfy job as some sort of Java EE architect, and never actually did any real work. I led a team of two, provided some basic architectural guidance (which was probably wrong; I was totally uninterested by then), and things moved along reasonably well. Unfortunately, instead of hitching my wagon to a gravy train, I ended up with the Donner Party. About nine months after coming on board, the company crumbled under a high-level management embezzling scam, and the remaining employees started to cannibalize one another. I was laid off.

This was a rough time for everyone. It was 2001, and the shit was starting to hit the fan. Of course "me and my friends" all saw this coming...the nonsense prices people were paying for bullshit companies, the SuperBowl spots for web sites we'd never heard of and had no intention of using...all of it added up to a big fat crash. Not a great time to fall out of a job. Luckily, I had some old friends pulling for me.

Revelation

Enter Kelly Nawrocke. Kelly and I had worked for a while at the University, on the Web Team. Kelly was never really devoted to the position...while I was desperately fiddling with the latest stupid AWT GUI they'd tasked me to write for no apparent reason, Kelly would be toying with a digital theremin, echoing sci-fi whines and whistles all the way down the hall. I sometimes hated his flippancy, but I sure wouldn't have shared an office with anyone else. Anyway, Kelly and I had been out of touch for a few years when I started my Java EE architect job. While working there, I started studying Chinese "just because" and ran into Kelly on the U of M campus. After describing the work I was doing, Kelly replied with five words that have weighed on my conscience ever since, and are probably responsible for where I am now:

"I expected better from you."

Of course he said it with a smile. He was joking around, probably poking fun at my nonchalance about that EE job. But shit, he was right. What the hell was I doing with my life? Is this what I wanted to do?

So the bottom fell out and I needed a job. Kelly came to the rescue, getting me an interview and then vouching for me at another local shop. This time, it was a spin off of another Minnesota chemical company, building Java-based (but not necessarily EE-based) storefronts for both the original company and new ones interested in the same service. And this was probably a couple months before I would have had to bail on a mortgage and start over. Kelly really came to the rescue.

The new place paid pretty well, but it was totally doomed. I got there well after the happy times, when execs would fly off to Rome for a business lunch with prospects and "hired gun" consultants would organize daily poker games while they waited for work. When I arrived, we were heads-down working on stuff, sometimes late into the evening.

I think this job really started to sculpt where I would go in the future. I started out as just another "senior software engineer" in a company full of "senior software engineers". I helped build out a few apps, did a whole crapload of catch-up learning on the latest web and enterprise frameworks, and generally found the whole experience really enlightening. Sure, it wasn't the most compelling technology, but it was a fair day's work and a great team to hang out with. So I was reasonably happy.

A Return to Creative Exploration

At some point, they decided to look into a new suite of software from TIBCO. The managers-of-the-day had decided that it was foolish to always be building our own toolchain, workflows, portals, and messaging tiers and mandated that we start using TIBCO's suite of COTS products instead. A couple folks started to play with the workflow engine. Others got to look at the messaging tier. Me, I got the portal. Yick.

The portal, at the time, was a badly stitched-together collection of damn near every open source Java project. There was Apache stuff in there, Sun stuff, arbitrary third-party projects...a whole mess of them. Several were not properly attributed...and none of them came with source. I was forced to actually unpack and decompile pieces just to figure out what version of Apache Commons it was running or which version of Velocity they had included. It was a bloody nightmare, but actually a hell of a fun puzzle to put together.

The end result was that we had the ability to run the portal on servers of our choosing, largely because I did the late-night homework to figure out what it actually used. So we largely disassembled the TIBCO suite and put it back together in a way that actually made sense for our app. It worked.

But remember I said this company was doomed. No amount of portal-wrangling was going to save it, so eventually it got absorbed back into its parent company, several million dollars poorer. Most of the staff were laid off or left, but those of us "real employees" got introduced back into the matrix. And I was in corporate stooge hell once again. Sure, I stuck around for a bit...I'm not one to jump ship at the drop of a hat. But a few quarters later, after no bonuses, no pay increases, and our numbers slowly dwindling under increasing demands...it was time to bail again.

Never Give Up Hope

Oddly enough, it was my old consulting friends that bailed me out. They had a gig with a government services consulting operation out in northern Virginia. I probably wouldn't have considered it if it were just a straight-up hourly consulting gig. But this one was contract-to-hire, with six months commuting to the DC area. I was intrigued, and signed on.

Here I found a brand new project. I arrived at perhaps the perfect time (or as perfect as it can get). They had just spent a couple *years* assembling a set of requirements documents for a gigantic new project, a piece of software intended to replace the key mainframe-based database that powered a multi-BILLION-dollar US government program. And the guv had just recently signed off on the requirements and funded three years of development. Basically a clean slate as far as the software went.

I was just hired on as an extra hand, but it quickly became apparent they needed a lot more than that. The design was flimsy at best...the folks in charge had a general idea of the technology needed to make it work, but they'd obviously never done this before. What's worse, many of them didn't seem to have the analytical skills necessary to make this project succeed. So I was presented with a choice. I could either toe the line and let those in charge make bad decisions...or I could stand up and say "you know what, you're wrong...and I'm going to prove it." I chose the latter.

Let's fast-forward about a year. My tour of duty in Virginia had long since been over, and I was now working as the on-site lead at the Fed's office in Minneapolis, Minnesota. I was back home, I was busing to work, and I was in charge of the leading edge of the app, where we were just a few months away from going to production. I had basically taken over the entire architect role by now. The original build and deployment processes were irreparably broken; I rewrote them in a 36-hour, three-day epic battle with Ant. The SCM workflow was useless, focusing on tagging what was currently in a given environment and moving those tags around; I replaced it with version numbers, release schedules, and tags and branches for actual project snapshots. And the production environment was a total wreck. I worked with the highly-competent guv IT staff to produce a ground-up automated setup and deployment script; they could now take a machine from bare server to fully-operational production box in about 20 minutes, including EE server installs, EE server and Apache configuration, and application staging and deployment. All by running one command.

We went live in October of 2004. A multi-million-dollar, 750kloc, fully "Java EE" project, deployed on time, under budget, and without any production snags. Yes, my friends, it can be done. Yes, you did it wrong.

Given that success, I think it was reasonable that I stuck around for another couple years. I became the lead architect for the project, guiding additional feature work, spinning off a subset of the application into a framework for future projects, and generally acting as part of the guv's team, rather than as an outside consultant. Things were pretty good...I had the run of the show.

Never Get Comfortable

Why did I leave?

"I expected better from you."

It was too comfortable. I started to realize that I was getting soft. I stopped tracking the latest web and persistence frameworks. I stopped caring about the fate of the project, since we'd largely automated it into a self-sufficient dynamo. I got tired of vetoing decisions...tired of calling idiots idiots and dealing with the fallout. I got 5/5 on my reviews in all but one area: interpersonal relationships...I made one dude cry because he restarted the UAT environment without asking me, and I had to explain to the guv why their server was down. It wasn't fun anymore.

Enter JRuby.

Find Something You Love

I started working casually on JRuby in about 2004, shortly after that big production release. I'd arranged to make a trip out to the home office in Tyson's Corner, VA, so I could attend RubyConf 2004 in Reston. It was Kelly Nawrocke who had turned me on to Ruby, even though I'd never actually looked at the language or written anything in it. Kelly was working in New York City, and promised to join me at the conference. While there, I saw David Heinemeier Hansson present Rails publicly for the first time, saw Koichi Sasada unveil his work on YARV, and met a number of already-old-school Rubyists who thought they'd found the answer to all life's problems. But none of them really grabbed me. What struck me most was that without even knowing Ruby, I could look at their code and their slides and actually understand what was going on. I was hooked.

Naturally, being stuck on the then-crufty Java platform, I needed an in. So I started poking around to see if there might be a Ruby implementation on the JVM. It turned out JRuby had already existed for a couple years and my old U of MN Web Team coworker Tom Enebo was the current project lead.

Isn't it funny how fate works sometimes? I end up in a series of progressively more-challenging jobs because of a well-respected friend's offhand comment. I stumble upon JRuby where another respected coworker is lead developer. Weird.

Anyway...like I say, I only casually contributed to JRuby for about a year, but around the summer of 2005 I really started to take an interest. I think I'd reached a tipping point, where I understood enough of JRuby to make larger changes, and I'd become bored enough with my daily job to start working on night projects. I rejiggered the interpreter, hoping to move it toward a stackless design. And what do you know, it worked. I went to RubyConf 2005 and presented JRuby for the first time, running "fib" recursively to 100_000 with ease and showing how basic functions all worked great in JRuby. It was a lot of fun. And damn was that interpreter slow. But it really planted the seed in me.

Hell Breaks Loose

Over the next year, the shit hit the fan. Around January 2006, we got IRB working. About a month later, apps like Rake and RubyGems started to come online. Ola Bini came on the project around then and helped get them fully working. Then we heard from Tim Bray that he could spare a few Sun machines for us to work on, and if we got Rails running by JavaOne there could be bigger things in store. We did...and were hired four months later by Sun to work on JRuby full time. Since then we've had JRuby 1.0 and 1.1 releases, numerous production JRuby on Rails deployments, a new compiler and "generally better" performance than other impls. And today, we're one of the best options for deploying Rails, with ever-improving performance, native threading, and a great software stack backing us up.

Now it's time to take a step back and examine where we really are. JRuby, by most accounts, is fighting a war on two fronts.

Fighting on Two Fronts

On the JVM side, there's a renewed interest in languages. Where two years ago JRuby was really the only big language story (or at least the only one getting press), now both Groovy and Scala are talking about how they're "just as good" or "better than" Ruby. And in many cases, they're right...Groovy still integrates better with Java and Scala mops the floor with both JRuby and Groovy in the performance arena. So no matter what success we've had with JRuby, there's a constant arms race going on. We need to demonstrate features to compare with Scala (or evangelism to show that Ruby still has an edge). We'll trade performance back-and-forth with Groovy (who now have really decent performance in 1.6...many kudos to them). And Jython is coming online, already performing better than C Python with many, many enhancements yet to be made. So JRuby's got a tough fight on the JVM to remain one of the lead competitors. Or is it just a part of a big happy family?

On the Ruby side of the world, situations have changed just as drastically. In 2006, when JRuby was starting to run basic Rails apps, still deep in compatibility hell, and only beginning to look at performance, there were still really only two implementations: C Ruby (MRI) and JRuby. Now, two years later, there are five, six, or maybe eight, depending on the day and how you count implementations. And as of yesterday, JRuby and C Ruby are joined by Rubinius in being able to route basic Rails requests...so the other implementations are most certainly snapping at our heels now. They're going to move fast. It may take some time, but they're going to start presenting compatibility and performance milestones comparable to JRuby. Already, Chad Fowler of the Ruby community has chosen his horse and claimed that within a year or two Rubinius will be on its way to becoming the "de facto standard" Ruby implementation. Others have put their bets on Rubinius or Gemstone's MagLev. Everyone has a favorite contender in the Ruby implementation arena. Whither JRuby among this crop of promising upstarts?

So it makes me think. Here I sit, on a Saturday night which could be a relaxing evening at home. It's a nice warm night in Minnesota. There's a few good movies on TV. I've got a couple nice beers in the fridge.

But I'm here writing a blog post and considering what next major enhancement to make to JRuby. Why am I doing this? Why do I labor day and night to improve JRuby, or push JVM languages forward, or try to show people why the JVM is such an awesome platform to target? Why suffer under the pain of a two-front battle to keep JRuby relevant?

"I expected better from you."

Kelly's words still ring in my ears. What is better? Is better making a great Ruby implementation that runs on the JVM and solves most of the scaling and "enterprisey" problems with the original? Perhaps success is bringing an off-JVM language to the JVM, and arguably making it the *best* choice for a large subset of users? Does better mean constantly chasing a dream of staying on top and fighting performance and compatibility wars and looking out for number 2 until the end of days?

I think there's something missing here.

The Ruby Side


Chad claims that Rubinius will be the de facto standard Ruby implementation in a year or two. Of course he's totally wrong...in a year or two things are going to be just as fucked up and confusing as they are now, but they'll be fucked up and confusing in altogether different ways.

JRuby will by then be convincingly the fastest way to run Ruby webapps, buoyed by some combination of continuing JRuby performance work, Rails multithreading enhancements, a new JVM version, or a crop of new web frameworks with Merb leading the way. And probably a majority of Rubyists (including several key "thought leaders" in the Ruby community) still won't care because they've always wanted Ruby to "kill Java" in some way. Poor guys...prejudiced and short-sighted.

Rubinius will be running Rails well enough to do production deployment, but without multithreading, memory reductions, or performance improvements it won't present a much more compelling story than MRI. Of course, it's probably going to receive most of those improvements during the year, and with five or six full-time folks working on it it's not infeasible for it to be better than MRI for Rails deployments by then.

IronRuby might be running Rails, might even be running it well, but will probably be getting most of its press running some proprietary Microsoft RIA or MVC framework. John Lam will still be fighting the good fight to make MS an Open Source company...and maybe even succeeding.

MacRuby will be released, probably in an official OS X release, and folks will be using it for various awesome mostly-GUI apps...but it probably won't be a substantial contender in the Ruby web arena.

Ruby 1.9.x will be more stable than today, but not enough of an improvement for most people to move off the 1.8 line; people will be waiting for the mythical Ruby 2.0 that brings all the promised features left out of 1.9.

MagLev will be making Avi Bryant's dream of Ruby on Smalltalk come true, maybe even running Rails and other web frameworks. And...well, nobody will really want to pay for it, even if it can be ten times faster and has the best persistence architecture in the world.

What will I be doing?

I don't expect I'll spend the next five years working on JRuby. I don't necessarily expect I'll spend the next year working on JRuby. There's too much else out there.

...And Beyond

With JRuby, we've shown it's possible to bring a language, a set of libraries, and popular frameworks from another world onto the JVM and make them run even better. We've shown that there's real value in pushing the multilanguage JVM meme over "One Java to Rule Them All". Never before has there been such support for polyglots. Job postings once again list three or four or five languages they'd like candidates to know. Libraries and frameworks boast support for Groovy, JRuby, Jython as part of their feature list. Real money is being spent to turn the existing OpenJDK into a dynamic language powerhouse, led by efforts like the Da Vinci Machine, JRuby, and Groovy. And new static-typed languages like Scala are showing where the Java language needs to evolve (or not evolve) into the future. Ruby is just one part of a great new adventure.

I want to be a part of that. A year ago I started up the JVM Languages Google group to bring JVM language implementers together, and it's been a great success. You can post a question about polymorphic inline caches one day and read about call frame reification or compiler strategies the next. The Da Vinci Machine project, led by John Rose, has started to incorporate all those crazy features we dynlang implementers have really wanted into OpenJDK. Projects like the Maxine VM are starting to show that self-hosting works just as well (or even better) with Java and the JVM than other languages and platforms. These are exciting times.

No, dear readers, I don't mean to say I won't be working on JRuby. JRuby is the gateway drug for me into many different arenas of software. I've learned more about languages and compilers and parsers and VMs and runtimes and libraries and unicode and so on from working on JRuby than I ever learned from any of my jobs. My work on JRuby will continue long into the future.

But it's time to look to bigger things. JRuby has been successful not because of what magic Tom or I or anyone else were able to work...it has been successful because of the JVM, that fantastic piece of engineering that enables top-notch implementations of dozens of languages already. But it's too hard to make languages run well on the JVM right now, and I'll attest to that. We need to make it easier to get languages performing on the JVM. We need to make it easier to build tools for them. In short, we need to open the JVM up to a much larger audience...an audience that might have written Java off as a dead technology. And we need your help.

The Road To Babel

So I'm publicly announcing that we at Sun are hosting a JVM Language Summit. It's long overdue in my opinion...we should have been having these events ten years ago. But it's happening now. We're calling all language and VM implementers to come talk about their projects. It doesn't have to be something running on the JVM...we want very much to hear from folks on the Rubinius project, Parrot project, LLVM project, CLR and DLR projects, and any other language and runtime you can think of. This is the chance to get together with a group of your peers to discuss topics you can usually only explore over email or IRC. It's your chance to say what you want the JVM to do for you...or else to say why your platform does it better. It's a meeting of the minds...a first step toward building more open platforms, better runtimes, and completely Free software stacks that all languages can take advantage of.

And this is just the first step. Over the next year, I'm going to be actively working with others in the JVM Languages community to build out a library of tools and frameworks we can all use to better our implementations. That process has already started...Attila Szegedi has been working on a standard MetaObject protocol for the JVM. John Rose has been working on the DVM and its set of dynlang features. I've been working on various backports of those features and similar libraries for JRuby. The Groovy guys have been working on code-generated call site optimizations. The list of independent projects goes on and on, and now we need to bring these efforts together.

I did a talk at CommunityOne this year about "Bringing the JVM Language Implementers Together", and I really meant it. It's happening right now, and it's about time. It's bigger than Ruby, bigger than Groovy, bigger than any one project for sure. It's about a real platform for real users, users that have different tastes and want different tools for the job. It's about you and those projects you might have put aside. It's about those biases you might have against anything Java-related, as though somehow any project with Java involved is an automatic FAIL. Of course you know it's foolish, but old habits die hard. It's time for you to get involved. It's time for you to cast off those prejudices and help push this platform in the right direction. And I'm going to be here to help...I really want this to happen. But it depends on you. Are you up to the challenge?

"I expected better from you."

No doubt. We should all be doing better. And this is your chance.

Wednesday, May 14, 2008

The Great JRuby Japanese Tour

Yes, friends, it's time once again for a JRuby tour. This trip, we're localizing to the islands of Japan. Do I have any Japanese readers out there?

Here's our route for this trip...it's going to be a crazy ten days:



June 19: Tsukuba, Ruby Kaigi

Tom Enebo and I will be presenting JRuby at the Ruby Kaigi 2008 in Tsukuba this year, as well as meeting up with other Ruby implementers making the trip and communing with the locals. The Kaigi was great last year...lots of fun, great company, and an excellent community. We're both really excited about it. JRuby has come a long way, so we'll be doing a whirlwind tour of performance, Rails, GUI support, and finishing off with something special. It should be a great conference again this year.

June 23: Matsue, Lecture at Shimane University, Meetup with Matz and Locals

We were invited to present JRuby at Shimane University, as part of a lecture series they're doing on Ruby. And since Matz is based in Matsue, we'll certainly meet up with him to talk a bit offline...I expect he'll be pulled in many directions at the Kaigi.

June 24: Fukuoka, Ruby Business Commons

Tom did a keynote last year for the Ruby Business Commons, a group of Rubyists proactively trying to bring Ruby to the business community. I suspect we'll deliver another talk or just meet up with them and see how things have progressed in the past year. At any rate, Tom enjoyed the trip to Fukuoka, so I'm looking forward to this.

June 25-27: Tokyo, meetup with local partners, universities, Rubyists?

Whenever Tom and I travel to Japan or meet up with Japanese associates, we put ourselves entirely in the hands of Sun Japan, and specifically our excellent friend and guide Takashi Shitamichi. So far, the Tokyo leg of our trip has a few embedded question marks, but I'm sure Shitamichi-san will be able to fill our days from dawn until dusk with events. Hopefully we'll have a little time to take a breath and poke around Tokyo again, but either way it will be a great end to the tour.

Spreading the Word

I'd really like to jam in as many Ruby and JRuby meetups, discussions, and talks as possible on this trip, so feel free to reblog (and perhaps translate) this entry, or contact me, Tom, and Takashi directly...especially if you know of any good events while we're in Tokyo. And if you're press, Takashi can certainly hook you up if there's time in the tour (for emails...use firstname.lastname@sun.com).

JRuby is ready for the Japanese Ruby community, and we're coming to town to help send it off!

Tuesday, May 13, 2008

RubySpec: Bringing Ruby Test Suites Together

Hooray! The RubySpec Project, a collection of runnable specifications for Ruby 1.8.6ish behavior, has graduated into its own domain. Finally there's a lively, fast-moving, independent project to create a Ruby specification and test kit. And it's already well on its way.

Better documentation on how to pull the specs, update them, and use them for your own Ruby implementation (you do have a Ruby implementation, don't you?) is still being ironed out, but the repository is already available at the RubySpec github address, so you can pull them and start reading and running them. Also see the MSpec github for a lightweight (lighter than RSpec) tool to run the specs with.

But this post is not just about the RubySpec project...Brian Ford is putting together an official announcement for that as we speak. This post is a call to action.

JRuby currently encompasses something like 6 separate test suites:

We don't want to run these tests forever...we would rather just run the RubySpec. So this is where we need help.

Many of these tests are already encompassed in the RubySpec specs. BFTS, for example, focuses only on a very few core classes, which have been heavily covered in RubySpec. In many cases, these test suites even overlap each other, meaning that our 3 minute test run could probably be a lot shorter. If we could just replace our test suite with the RubySpec (modulo JRuby-specific bits like Java integration), we'd be very happy.

But we can't afford to do that unless we know we're not throwing away good tests. The RubySpec is a work in progress, and there are always going to be gaps. It would be folly to throw away our tests without consideration. So that's where you come in.

We need to start at A and work our way through Z, porting over any test cases that aren't covered in the RubySpec.

I started the process tonight, adding a number of missing cases from our test_array.rb script and deleting everything I ported and everything that was already covered. It took perhaps an hour to go through, and it was of a reasonable size. Many other scripts will be much smaller, some will be larger.
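
To give a feel for the target format, a ported case ends up as an ordinary mspec-style spec. Something like this (a made-up illustration, not one of the actual cases I moved over):

    describe "Array#compact" do
      it "returns a new array with all nil elements removed" do
        [1, nil, 2, nil, 3].compact.should == [1, 2, 3]
      end

      it "does not modify the receiver" do
        a = [1, nil, 2]
        a.compact
        a.should == [1, nil, 2]
      end
    end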

The benefits extend far, far beyond JRuby of course. By adding missing test cases, we're going to ensure that all new implementations have a complete spec to go on. We're going to make sure there aren't a lot of incompatibilities you users have to deal with. And we're going to show all those other languages (who are still laughing at our lack of a spec) that we can do this in our own Ruby way.

So what are you waiting for? Contact Brian Ford and get access to the specs (perhaps after paying a one-patch toll)...have a look at the JRuby test repository...pick a file, and start comparing. Tell your friends, email your favorite Ruby list, blog and reblog this effort. The time is now to pull together all the disparate suites into one. RubySpec is ready!

Saturday, May 03, 2008

The Power of the JVM

In the past couple days, a new project release was announced that has shown once again the potential of the Java platform. Shown how the awesome JVM has not yet begun to flex its muscles and really hit its stride in this project's domain. Made clear that even projects with serious issues can correct them, harnessing much more of the JVM with only a modest amount of rework. And demonstrated there's a lot more around the corner.

That project wasn't JRuby this time. It was Groovy.

Groovy's Problem

Groovy 1.6 beta 1 was released a couple days ago. This release was focused largely on performance, rather than fixing bugs and adding features like the 1.5 series did. You see, in 1.5 and earlier, Groovy had become basically feature-complete, and was starting to hit its stride. Most of the capabilities they desired were in the language and working. Their oft-touted Java integration had caught up to most Java 5 features. And Grails recently had its 1.0 release; finally there's a framework that can show Groovy at its best. But there was a problem: Groovy was still slow, one of the slowest languages on the JVM.

This doesn't really make a lot of sense, especially compared to languages like JRuby, which have a more complicated feature set to support. JRuby's performance regularly exceeded Groovy's, even though several Ruby features require us, for example, to allocate a synthetic call frame for *every* Ruby method invocation and most block invocations. And JRuby had only received serious work for about 1.5 years. The problem was not that Groovy was an inherently slow language...the problem was the huge amount of code that calls had to pass through to reach their target. Groovy's call path was fat.

A few months back I measured the number of frames between a call and the actual receiver code in Groovy and JRuby. JRuby, which has received a lot of work to shorten and simplify that call path, took only about four stack frames between calls. Groovy, on the other hand, took nearly 15. Some of these frames were due to Groovy still using Java reflection to hold "method objects", but the majority of those frames were Groovy internals. Calls had to dig through several layers of dispatch logic before they would reach a reflected method object, and then there were a few more layers before the target method was actually executed. Oh, and next time you call that method? Start over from scratch.

A Standard Solution

Early in the JRuby 1.1 dev cycle, we shortened the call path in two ways:

  • Rather than use reflection for the core Ruby classes' methods, we generate small stub methods ("method handles") that invoke the target directly for us. This avoids all the argument boxing and overhead of reflection entirely. It's only applicable to the core classes, but a very high percentage of any JRuby app--even one that calls Java classes--depends on core classes being fast. So it made a big difference.
  • When compiling Ruby code to Java bytecode, we employed what's called a call site cache, a tiny slot in the calling method where the previously looked-up method handle can be stored. If, when we return to that call site, the class associated with that method hasn't changed and we're again invoking against it...we can skip the lookup. That drastically reduces the overhead of making dynamic calls, since most of the time we don't have to start over.
It is the call site mechanism that gave us our largest performance boost back in November (though I blogged a bit about the technique way back in June and July of 2007...boy was I naïve back then!).
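Since a picture helps: here's a tiny, self-contained Java sketch of a monomorphic call site cache. The class names are made up for illustration (JRuby's real machinery also has to invalidate the cache when classes are reopened or methods redefined), but the core trick is exactly this: remember the last lookup, and reuse it whenever the receiver's class hasn't changed.

```java
import java.util.HashMap;
import java.util.Map;

// Minimal sketch of a monomorphic call site cache. Names here (MetaClass,
// Obj, Method) are invented for illustration, not JRuby's actual classes.
public class CallSiteSketch {
    interface Method { Object invoke(Obj self, Object... args); }

    static class MetaClass {
        final Map<String, Method> methods = new HashMap<>();
        Method search(String name) { return methods.get(name); }
    }

    static class Obj {
        final MetaClass metaClass;
        Obj(MetaClass metaClass) { this.metaClass = metaClass; }
    }

    static class CallSite {
        private final String name;
        private MetaClass cachedType;  // class seen on the last call
        private Method cachedMethod;   // method resolved for that class

        CallSite(String name) { this.name = name; }

        Object call(Obj self, Object... args) {
            MetaClass type = self.metaClass;
            if (type == cachedType) {
                // Cache hit: the receiver's class hasn't changed, skip the lookup.
                return cachedMethod.invoke(self, args);
            }
            // Cache miss (or first call): do the slow lookup, then remember it.
            Method m = type.search(name);
            cachedType = type;
            cachedMethod = m;
            return m.invoke(self, args);
        }
    }

    public static void main(String[] args) {
        MetaClass stringClass = new MetaClass();
        stringClass.methods.put("length", (self, a) -> 5);

        CallSite site = new CallSite("length");
        Obj receiver = new Obj(stringClass);
        System.out.println(site.call(receiver)); // 5, populates the cache
        System.out.println(site.call(receiver)); // 5, served from the cache
    }
}
```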

It's certainly not a new technique. There are scads of papers out there (some really old) about how to build call site caches, either monomorphic (like JRuby's and Groovy's) or polymorphic (like most of the high-performance JVMs). Until we put them in place in JRuby, they weren't commonly used for languages built on top of the JVM. But that's all changing...now Groovy 1.6 has the same optimizations in place.

What's the result? A tremendous improvement in performance, similar to what we saw in JRuby last fall. According to Guillaume Laforge, Groovy project lead, the boost on the "Alioth" benchmarks can range anywhere from 150% faster to 560% faster. And the latest Benchmarks Game results prove it out: Groovy 1.6 has drastically improved, and even surpasses JRuby for most of those benchmarks. And while JRuby and Groovy will probably spend the next few months one-upping each other, we've both proven something far more important: the JVM is an *excellent* platform for dynamic languages. Don't let anyone tell you it's not.

Why It Works

The reason call site optimizations work so well for both JRuby and Groovy is twofold.

Firstly, eliminating all that extra dispatch logic whenever possible reduces overhead and speeds up method calls. That's a no-brainer, and any dynamic language can get that boost with the simplest of caches.

But it's the second reason that not only shows the benefit of running on the JVM but gives us a direction to take the JVM in the future. Call site optimizations allow the JVM to actually inline dynamic invocations into the calling method.

The JVM is basically a dynamic language runtime. Because all calls in Java are virtual (meaning subclass methods of the same name and parameters always override parent class methods), and because new code can be loaded into the system at any time, the JVM must deal with nearly-dynamic call paths all the time. In order to make this perform, the JVM always runs code through an interpreter for a short time, very much like JRuby does. While interpreting, it gathers information about the calls being made, 'try' blocks that immediately wrap throws, null checks that never fail, and so on. And when it finally decides to JIT that bytecode into native machine code, it makes a bunch of guesses based on that profiled information; methods can be inlined, throws can be turned into jumps, null checks can be eliminated (with appropriate guards elsewhere)...on and on the list of optimizations goes (and I've heard from JVM engineers that they've only started to scratch the surface).

This is where the call site optimizations get their second boost. Because JRuby's and Groovy's call sites now move the target of the invocation much closer to the site where it's being invoked, the JVM can actually inline a dynamic call right into the calling method. Or in Groovy's case, it can inline much of the reflected call path, maybe right up to the actual target. So because Groovy has now added the same call site optimization we use in JRuby, it gets a double boost from both eliminating the dispatch overhead and making it easier for the JVM to optimize.

Of course there's a catch. Even if you call a given method on type A a thousand times, somewhere down the road you may get passed an instance of type B that extends and overrides methods from A. What happens if you've already inlined A's method when B comes along? Here again the JVM shines. Because the JVM is essentially a dynamic language runtime under the covers, it remains ever-vigilant, watching for exactly these sorts of events to happen. And here's the really cool part: when situations change, the JVM can deoptimize.

This is a crucial detail. Many other runtimes can only do their optimization once. C compilers must do it all ahead of time, during the build. Some allow you to profile your application and feed that into subsequent builds, but once you've released a piece of code it's essentially as optimized as it will ever get. Other VM-like systems like the CLR do have a JIT phase, but it happens early in execution (maybe before the system even starts executing) and doesn't ever happen again. The JVM's ability to deoptimize and return to interpretation gives it room to be optimistic...room to make ambitious guesses and gracefully fall back to a safe state, to try again later.
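If you want to picture the combination of guarded inlining and deoptimization, here's a hand-written approximation in plain Java. This is not real JIT output, just the shape of the transformation: assume profiling has shown the receiver at this call site is always a String.

```java
// A hand-written approximation of what a JIT can do with a profiled virtual
// call: guard on the expected receiver class, inline the hot path, and keep a
// slow path that (in the real VM) triggers deoptimization back to the interpreter.
public class GuardedInlineSketch {
    static int charCount(CharSequence receiver) {
        // What we wrote:  return receiver.length();
        // What the JIT can effectively emit once profiling says receiver is always a String:
        if (receiver.getClass() == String.class) {
            return ((String) receiver).length(); // virtual dispatch gone; body inlinable
        }
        // "Uncommon trap": rarely or never taken; in the real VM this is where
        // compiled code gives up and deoptimizes so the guess can be revised.
        return receiver.length();
    }

    public static void main(String[] args) {
        System.out.println(charCount("hello"));                  // fast, guarded path
        System.out.println(charCount(new StringBuilder("hi")));  // falls back to the slow path
    }
}
```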

Only The Beginning

So where do we go from here? Well ask me or the Groovy guys about putting these optimizations in place and we'll tell you the same thing: it's hard. Maybe too hard, but I managed to do it and I don't really know anything. It took the Groovy guys quite a while too. At any rate, it's not easy enough, and because we have to wire it together by hand (meaning we can only present a finite set of call paths) we're still not giving the JVM enough opportunity to optimize. Sure, we'll all continue to improve what we have for existing JVMs, and our performance will get better and better (probably a lot better than it is now). But we're also looking to the future. And the future holds another key to making the JVM an even better dynamic language runtime: JSR-292.

JSR-292 is basically called the "invokedynamic" JSR. The original idea for 292 was that a new bytecode could be added to the JVM to allow invoking methods dynamically against a target object, without actually knowing the type of the object or signature of the target method. And though that sounds like it might be useful, it turns out to be worthless in practice. Most dynamic languages don't even use standard Java class structures to represent types, so invokedynamic against a target object wouldn't accomplish anything. The methods don't live there. And it turns out there's a political side to it too: getting a new bytecode added to the JVM is *super hard*. So we needed a better way.

John Rose is in charge of the HotSpot optimizing compiler (the "server" compiler) at the heart of Sun's JVM. HotSpot is an amazing piece of software...it does all the optimizations I listed above plus hundreds of others that may or may not make your ears bleed. It has two different JIT compilers for different needs (soon to be merged into a single three-stage optimization pipeline), probably half a dozen different garbage collectors (a few weeks ago I met a guy in charge of one generation of one collector...crazy), and probably a thousand tweakable execution and optimization flags. It can make most Java run as fast as equivalent C++, even while the HotSpot engineers recommend you "just write normal code". In short, HotSpot has balls of steel.

John took over JSR-292 about this time last year. Not much work had been done on it, and it looked like it was moving toward a dead-end; most of the dynamic language projects agreed it wouldn't help them. Around that time, it was becoming apparent that JRuby would be able to make Ruby run really well (aka "fast") on the JVM, but it was taking a lot of work to do it. Tom and I talked with John a few times about strategies, many of which we've put in place over the past year, and they were all rather tricky to implement. Largely, they moved toward making the call path as fast as possible, by both shortening it and making the number and type of parameters match the target all the way through.

In order to reduce this workload for language implementers, John has been working on several features leading up to "invokedynamic". Here's the rough overview of how it will fit together.
  1. The first feature is already working in John's multi-language VM "Da Vinci Machine" project: anonymous classloading. JRuby first improved invocation performance by avoiding reflection and generating little wrapper classes, but those classes incur a very high cost. Each one has to be generated, classloaded, named, stored, and eventually dereferenced and garbage-collected independently. You can't do that with a single class or a single classloader, so we had a class per method, and a classloader per class. That's a crapload of memory used just to get around the JVM's bent toward plain old Java types. Anonymous classloading aims to eliminate that overhead in two ways: first, it will not require hard references or names for these tiny loaded classes, allowing them to easily garbage collect when the code is no longer in use; and second, it will allow you to generate a template class once, then create duplicates of it with only small constant pool changes. Lost? Keep up with me...it leads into the next one.
  2. The second feature John hopes to have done real soon now: lightweight method handles. Method handles are essentially like java.lang.reflect.Method objects, except that they exactly represent the target method's parameter list and they take up far less memory...about 1/10 that of Method by John's estimate. Here's where the anonymous classloading comes in. Because all methods that have a given signature can be invoked with basically the same code, we only need to generate that handle once. So to support the broad range of classes and method names we'll want to invoke with that handle, we just patch the handle's constant pool. It's like saying "now I want a handle that invokes the same way, but against the 'bar' method in type B". Ahh, now anonymous classloading starts to make sense. We have one copy of the code with several patched instances. It makes me giddy just to think about it, because of how it would help JRuby. Because all our core classes just accept IRubyObject as arguments, we'd have to generate exactly ten primary handles instead of the thousand or more we generate now. And that means we can get even more specific.
  3. Method handles feed into the big daddy itself: dynamic invocation. Because handles are so close to the metal, and because the JVM understands what the hell they are (rather than having to perform lots of nasty tricks to optimize reflection), we can start to feed handles straight back into the JVM's optimization logic. So once we present our dynamic types to the JVM's dynamic lookup logic, we simply have to toss it method handles (there's a rough sketch just after this list). And because the JVM can now connect the caller with the callee using standard mechanisms, our call site optimizations get chucked in the bin. The JVM can now treat our dynamic call like any other virtual call. All we need to do is provide the trigger that tells the JVM that the old handle is no longer correct, and it will come back for a new one. And we get to delete half the JRuby codebase that deals with making dynamic invocation fast. WOW.
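Here's roughly what dispatching through a method handle looks like. The java.lang.invoke names below are the ones that eventually shipped in JDK 7, so treat this as an illustration of the idea rather than the exact API under discussion at the time.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

// Sketch of method-handle-based dispatch using java.lang.invoke as it
// eventually shipped in JDK 7; details were still in flux when this was written.
public class HandleSketch {
    public static void main(String[] args) throws Throwable {
        MethodHandles.Lookup lookup = MethodHandles.lookup();

        // A handle that invokes String#length on whatever receiver it's given.
        MethodHandle length = lookup.findVirtual(
                String.class, "length", MethodType.methodType(int.class));

        // The handle is reusable for any receiver of the right type, and the
        // JVM understands handles natively, so it can optimize through them.
        System.out.println((int) length.invokeExact("hello")); // 5

        // "Re-pointing" dispatch at a different method is just another lookup,
        // not a freshly generated, classloaded, and named stub class.
        MethodHandle upper = lookup.findVirtual(
                String.class, "toUpperCase", MethodType.methodType(String.class));
        System.out.println((String) upper.invokeExact("hello")); // HELLO
    }
}
```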
Of course this is not there yet and won't be until JDK7 (fingers crossed!). We want to continue to support pre-JDK7 JVMs with languages like JRuby and Groovy, so an important component of this work will be backported libraries to do it "as well as possible" without the above features. That work will probably grow out of JRuby, Groovy, Jython, Rhino, and any other dynamic JVM languages, since we're the primary consumers right now and we're making it happen today. But I'll tell you, friends...you don't know what you've been missing on the JVM. Groovy's performance improvement from simply adding call site caches amazes me, even though we received the same boost in JRuby last year. The techniques we're both planning for our next versions will keep performance steadily increasing. And we've got invokedynamic right around the corner to really take us the last mile.

The future is definitely looking awesome for dynamic languages on the JVM. And languages like Groovy and JRuby are proving it.

Thursday, May 01, 2008

Culling the Herd

I've been on Twitter for a while, but only recently started using it in earnest. I've got 'rific, have my Growl notifications up, and run a few Tweetscans through my feed reader to rush to the rescue of JRubyists in trouble. But I try to do something it seems most tweeters don't: I try not to crapflood my followers with useless bullshit.

Don't get me wrong, I'm all for stream-of-consciousness information flow. I do it myself, either when hanging out with people (where it might form the beginning of a conversation) or when on IRC (where it often doesn't, but is easily ignored). But I dunno, it seems to me tweets ought to be something at least a little more substantial.

So I'm keeping my list of followees small and tight. Around 20-25 seems like a pretty good range.

That's meant unfollowing people that tweet nothing but their travel schedules, where in the house they happen to be sitting, what great new product their company just released, and a load of other nonsense. That's meant not following everyone that follows me, especially if their primary topics include walking the dog and taking out the trash. To me, the value of Twitter is both in keeping track of what people I respect are working on or find interesting and as a sort of micro-feed, a little forced 2-second thought break to help me step back from hard problems. Whether you buttered your toast on the bottom or found an unrecognizable lump of once-food in the refrigerator is worthless to me...so if that's the tweets you're inflicting on the world, why should I begin or continue to follow you?

Of course there are a few folks that have a few very insightful tweets sprinkled in with others I don't find interesting...not necessarily a signal-to-noise problem, but a relevant-to-irrelevant thing. It's a judgment call, so if you're not immediately followed by me don't take offense. We just have different interests.

Tuesday, April 29, 2008

Apple Chose...Poorly

So after a long wait, Apple finally released Java 6 for OS X. It's for Leopard only. And Leopard is apparently going to be the only Java-supporting OS without a 32-bit Java 6.


I can accept that it only runs in Leopard. They're moving forward, and with Java shipped as part of the OS it's a lot more hassle to backport it to Tiger. Plus, I've got Leopard so I'd be fine.

But the 32-bit thing really burns me.

It's not like there are only 64-bit Javas out there and Apple would have to do all the heavy lifting to support them on 32-bit machines. The vast majority of installed Java distributions are the 32-bit versions, and Sun ships 32- and 64-bit JDKs for both Linux and Windows. Hell, Landon Fuller even got the FreeBSD JDK 6 patchset to successfully build on Mac. It's missing the "last mile" of OS X integration like a Cocoa UI (X11 only right now) and sound support, but hell, it's there and it runs. So it's not even the JVM bits standing in their way.

Could it possibly be the OS integration? I don't buy that. Unless there's some serious problem with their libraries on 32 versus 64-bit systems, it oughta be a recompile. Even if it's a little more work than that, there's a lot of 32-bit Intel Macs out there.

Of course the fanboys are just going to tell me "welcome to the club." Yes, I know Apple regularly holds back features to encourage people to upgrade OS or hardware. And this is probably one of those cases, since there certainly doesn't seem to be a good technical reason for it. But seriously...this one just seems dumb, since they could have put the same bits in Landon's port and essentially had it working.

Maybe I'm too naive. Is this just standard operating procedure at Apple? Anything we can do to convince them?

The Rubyists are Wrong

There's something that's been bugging me for a long time that I need to get off my chest. Some of you may hate me for it, but perhaps there are others out there with the same complaint, silently in agony, wishing for death to take the pain away. It's time to set the record straight, and prove once and for all that the Rubyists are wrong.

Rubies are almost NEVER cut like this:


The cut shown here is what's called a "brilliant" cut (though it's not faceted enough...artistic license), or more specifically a "round brilliant". Brilliant is typically a diamond cut, and something like 75% of the world's diamonds are cut this way. The shape and angles of the facets are all mathematically designed to refract as much light as possible out the top of the diamond, resulting in the "brilliant" sparkling you see. This is possible because the most popular diamonds are CLEAR. Get it? It's clear so light passes through it. It's not RED. So it's cut in a way that takes advantage of it being CLEAR.


Rubies, on the other hand, are generally cut into ovals or "cushions" but also into some other cuts like "emeralds" (the most common cut for emeralds, in case it wasn't obvious), rectangles, or hearts (ugh). If they show an asterism (a "star" of four or six points due to the ruby's crystal structure) they're usually cut into cabochons, which are shaped but not faceted. Rubies are NOT typically cut into "brilliant" shapes.


See that one in the JRuby logo? That was a public-domain SVG graphic of a diamond that Tom Enebo colored red. A DIAMOND. Oh yeah, and we're wrong too. BTW, here's the Wikipedia article on ruby (the gemstone). Search for "brilliant". Yeah, I didn't think so.


Even the Ruby Association (of whose board Matz himself is chairman) has made the mistake of choosing this brilliant-shaped logo.


We can't really tell how the ruby in the RubyForge logo is cut, since it seems to just be a red hexagon. But I bet it's a hexagonal BRILLIANT cut again.


I don't even know what that is. It's like a brilliant cut with a dome on top of it. Maybe it was designed that way so you could fit "Ruby Inside" inside it. But it's definitely not something you're going to see in a jewelry store.


And there are lots more examples. Check your favorite Ruby project. If they have a ruby in the logo, it's probably a "brilliant" cut. And I think that proves once and for all that we Rubyists aren't as brilliant as we think.

Update: It turns out the Pythonistas are wrong too. Is there no sanity left in the world?

Sunday, April 27, 2008

Promise and Peril for Alternative Ruby Impls

My how things have changed in a couple short years.

Two years ago, in 2006, there were essentially two viable Ruby implementations: Matz's Ruby 1.8.x codebase, and JRuby. At the time, JRuby was just barely starting to run Rails. I consider that a sort of "singularity" in the lifetime of an implementation, the inflection point at which it becomes more than a toy. (OT: these days, I consider the ability to run Rails faster than Matz's Ruby a better inflection point, but we've had the Rails thing going for two years). Cardinal (Ruby on Parrot) was mostly dead, or at least on its way to being dead. YARV, eventually to become the Ruby 1.9 VM, was perhaps only half completed and was not yet officially marked to be the next Ruby. Rubinius had not really been started, or at least had not officially been named and could not be considered anywhere near viable. IronRuby was still Wilco Bauer's IronRuby, a doomed codebase and project name eventually to be adopted by Microsoft's later Ruby implementation effort. So there was very little competition, and most people still considered JRuby to be a big joke. Ha ha.

Fast forward to Spring 2008. Ruby 1.8.x has mostly been put in maintenance mode, but remains by far the most widely-deployed Ruby implementation, despite its relatively poor performance. Largely, this is because only Ruby 1.8.x is 100% compatible with Ruby 1.8.x, and because it's already packaged and shipped on a number of OSes, including Rubyist favorite OS X. But the rest of the field has become a lot more muddled. There are now around six implementations that have passed the Rails inflection point or will soon, a handful of others likely to fade into obscurity (after contributing their own genetics to the Ruby ecosystem, surely), a few mysterious up-and-comers...and Cardinal is still dead.

Let's review the promise, peril, and status of all the implementations. Note, this is largely a mix of facts and my opinions. Corrections for the facts are welcome. Corrections for the opinions...well...let's take it offline.

Ruby 1.8

Matz's venerable code base (usually called MRI for "Matz's Ruby Interpreter", MatzRuby or Ruby 1.8 in this article) still has a death-grip on the Ruby world. For 99% of Ruby developers, MatzRuby is still the king of the hill. This is in the face of poor relative performance (slower than JRuby and 1.9 for sure, and slower than most of the others for many cases), poor memory management (conservative, non-compacting GC), and many upstart implementations which solve these problems. Why is this?

It's not hard to answer. It's compatibility and status quo. If all my apps run fine on MatzRuby, and MatzRuby is already installed, and I'm satisfied with the performance characteristics of MatzRuby...why would I run anything else? Because MatzRuby is largely not *that bad* for most uses, especially as relates to simple system scripting work, it's unlikely to go away any time soon. And since all but one of the alternative impls is targeting Ruby 1.8 features and compatibility, the still-in-development Ruby 1.9 is not gaining much traction yet (which is probably a good thing).

You should all know the specs and status for MatzRuby. Current official release is 1.8.6 patchlevel 114. There's a 1.8.7 preview 2 out now that backports a whole bunch of features from 1.9 and breaks some compatibility. The jury's still out on whether it will go final as-is. MatzRuby is a simple AST-walking interpreter, with a conservative GC, minimal/cumbersome Unicode support, and a large library of third-party native extensions. MatzRuby is still the gold standard for what Ruby 1.8 "is".

The peril for Ruby 1.8 right now involves keeping that 99% of Ruby users interested in Ruby while 1.9 bakes without breaking compatibility. Ruby 1.8.7 pre1 introduced some Ruby 1.9 features but also broke a bunch of stuff (most notably Rails), and pre2 doesn't pass the specs 1.8.6 does. We had a design meeting last week where it was decided that we folks working on the Ruby specs need to help the Ruby core team get involved, so they can start running the full suite as part of their development process. That's going to happen soon, but until it does 1.8 releases can break basically anything from day to day and nobody will know about it.

Beyond compatibility and keeping the masses happy, Ruby 1.8 could use a little performance, scaling, and memory lovin' too. Unfortunately almost all that effort is going toward Ruby 1.9 now, leaving the vast majority of Ruby users stuck on one of the slowest implementations. That's good for us alternative implementers, since it means we're gaining users every day; but it's not good for the MatzRuby lineage because they're losing mindshare. It's hard to deny that the future of Ruby lies with the "excellent" implementations, if not with the "best" ones, and the definition of "excellent" is moving forward every day. Ruby 1.8 is not.

Ruby 1.9

Ruby 1.9 is the merging of the Ruby 1.8 class library and memory model with a large number of new features and a bytecode-based execution engine. It represents the work of Koichi Sasada, who first announced his YARV ("Yet Another Ruby VM") project at RubyConf 2004. YARV took a bit longer than he and many others expected to be completed, but as a result of his tireless efforts it is now the official Ruby 1.9 VM.

Ruby 1.9 introduces many new features, like a character-aware (Unicode and any other encoding) String implementation, Enumerator/Enumerable enhancements, and numerous refinements and additions to the rest of the class library I won't attempt to list here. Ruby 1.9 is defining, in essence, what the rest of the implementations will soon have to implement. For all the debates, most of the additions in Ruby 1.9 have been well-received by the community, though the exposure level is still extremely low.

Ruby 1.9 has, I believe, reached the Rails singularity. With some work over the past few months, Rails has moved closer and closer to running on Ruby 1.9. Last I heard, there was only one bug that needed patching in the 1.9 codebase for Rails trunk to run unmodified. Expect to see an announcement about Ruby 1.9 and Rails at RailsConf next month.

The interesting thing about Ruby 1.9 is that it will mark only the second non-MatzRuby implementation to run Rails, JRuby being the first. This is due in large part to the massive effort required to implement a bytecode VM for Ruby (1.9 was only just released this past December), and to the fact that Ruby 1.9 is still very much a moving target. APIs are being added and refined, optimizations are being tossed about at the VM level, and memory and GC improvements are being considered. So while it's unlikely that anyone will be moving Rails apps to Ruby 1.9 in the near future, Ruby 1.9 is certainly viable...and represents the most-likely future evolution of the Ruby language itself.

What worries me about Ruby 1.9 is that its performance doesn't seem "better enough" to change the future of Ruby. While Koichi's own benchmarks show it's much faster than Ruby 1.8, on general application benchmarks it's usually less than a 50% improvement. Many times, JRuby is able to exceed Ruby 1.9 performance, even without similar optimizations and feature removals. It is certainly a "better" performance story than Ruby 1.8, but is it enough?

Ruby 1.9 also took the first steps toward concurrency by making threads native, but it encumbered them with a giant lock a la Python. That means you still can't get concurrent execution of Ruby code on Ruby 1.9, something JRuby's been able to do all along, and you can't scale Ruby 1.9 any better on wide systems than you could with Ruby 1.8. There are plans to solve this by adding fine-grained locks to most internal data structures, but that's a hard problem to solve, on par with the JRuby 2.0 challenges I'll talk about in a minute. And without something like the JVM to really optimize that locking, performance will take a hit.

It's also unclear if people really *want* all of what Ruby 1.9 has to offer. Sure, people love the idea of a real encoding-aware String, but response to the rest of what Ruby 1.9 offers--including performance--has been a collective "meh." People are not flocking to Ruby 1.9 in droves, and many are contemplating whether their future Ruby work will simply be a lateral move to one of the 1.8-compatible implementations. And on the JRuby project, we've received almost no requests to implement Ruby 1.9 features, so we've only added a few tiny ones. Whither Ruby 2.0?

JRuby

Ahh, JRuby. How you have changed my life.

JRuby is a Java-based implementation of Ruby, or if you prefer not to speak the word "Java", it's Ruby for the JVM. JRuby was started in 2002 by Jan Arne Petersen, and though it had a couple good years of activity it never really got to a compatible-enough level to run real Ruby applications. Jan Arne moved on at some point and efforts were largely picked up by Thomas Enebo, current co-lead of JRuby. He was especially active when JRuby was being updated from Ruby 1.6 compatibility to Ruby 1.8.4 compatibility, a task which today is largely complete. I joined the project in fall 2004 after attending RubyConf 2004, and at the time I did not know Ruby. Now I know Ruby in a deeper way than I ever really wanted to...but that's a discussion for another day. I wasn't a hugely active contributor until late 2005, when I started working on a new interpreter and refactoring JRuby internals. I presented JRuby for the first time at RubyConf 2005, and then in early 2006 milestones started dropping like flies: IRB ran, then RubyGems, then Rake...and then Rails. We were still dog slow at the time, but we were viable.

JRuby reached the Rails singularity in time for JavaOne 2006, an event that led Sun to hire Tom and me in fall 2006. We demonstrated on stage JRuby running a simple Rails application. It was cobbled together, running under either Tomcat or simply WEBrick at the time, but we had proved it was possible to have an alternative Ruby implementation compatible enough to run Rails. JRuby was no longer a joke.

Over the past two years (man, has it really been two years?) we've essentially rewritten almost all of JRuby a piece at a time. We've been through three interpreters, one prototype compiler and one complete compiler, multiple Regexp engines, and at least two implementations of the key core classes. I wrote the compiler for JRuby during summer 2007, completing it around RailsConf EU 2007. My first compiler! We now run faster than Ruby 1.8 in both interpreted and compiled modes, with interpreted being perhaps 15-20% faster and compiled being at least a few times faster, generally on par with Ruby 1.9. Of course since we're based on the JVM, we share its object model, garbage collector, binary representation. So JRuby is certainly a "mini-VM" but we leave the nasty bits to JVM implementers to handle. Pragmatism, friends, pragmatism.

Perhaps the most notable result of JRuby's existence is that there are now so many Ruby implementations. If we had not shown the promise, many of the others might not have risked the peril. Oh, and we've ended up with a cracker-jack implementation of Ruby on the JVM too...I suppose that's worth a little something.

Perils...always perils. JRuby has managed to surmount most of the perils that await other implementations. And being on the other side of the chasm, I can tell you now it doesn't get easier.

Compatibility is *hard*. I'm not talking a little hard, I'm talking monumentally hard. Ruby is a very flexible, complicated language to implement, and it ships with a number of very flexible, complicated core class implementations. Very little exists in the way of specifications and test kits, so what we've done with JRuby we've done by stitching together every suite we could find. And after all this time, we still have known bugs and weekly reports of minor incompatibilities. I don't think an alternative implementation can ever truly become "compatible" as much as "more compatible". We're certainly the most compatible alternative impl, and even now we've got our hands full fixing bugs. Then there's Ruby 1.9 support, coming up probably in JRuby 1.2ish. Another adventure.

Performance is also hard, but maybe not *hard* hard. JRuby is lucky to run on one of the fastest VMs in existence. The JVM, in its many incarnations, has been so refined and the JVM implementation arena so competitive that we get a lot of performance for free. But by "for free" I mean Java performance. Java's far easier to write and maintain than C, and on the JVM we know we won't pay a performance penalty for not writing C code. But making Ruby fast on the JVM is where it gets tricky. JVMs are optimized for Java and Java-like languages. JRuby has to include all sorts of tricks and subsystems to make the runtime and compiled Ruby code "feel" a bit more like Java to the JVM. This has involved, in many cases, implementing our own "mini-VM" on top of the JVM, with mixed-mode execution (interpreted, then JITed to JVM bytecode), call site caches (to speed method lookup), and code alterations that sometimes improve performance at the cost of LOC and readability. The challenge for us going forward is to continue improving performance without making JRuby a jumbled mess. John Rose's work on the Da Vinci Machine and dynamic invocation for JDK 7 will help us rip a lot of code out, but only a small subset of JRuby users will see the benefits in the near term. So expect to see us spend a lot more time on performance for Java 5/6 compatible JVMs.

The final big peril for us relates to JRuby 2.0's Java integration support. JRuby currently has a split object model, where Ruby types are all "IRubyObject" implementations and the runtime only understands how to deal with "IRubyObject". This means that in order for us to call methods on non-Ruby Java types, we must wrap them with an IRubyObject wrapper. This is partially to attach our meta-object protocol to those objects, but mostly because every bloody method in the system accepts only IRubyObject as a parameter or return type. In order for us to achieve the "last mile" of Java integration, we need to make the entire system accept "Object" and act appropriately; we've been calling this approach "lightweights", since it would enable using normal Java objects for several core classes like Fixnum and Float. Support for "Object" lightweights would then feed into JRuby 2.0's reworked Java integration layer, eliminating most of the overhead (and code) associated with calling Java methods today. It would also fit better into the invokedynamic work. It's a big job I wouldn't expect to be complete until later this fall, and we need to mercilessly write specs and tests for the current behavior to avoid regressing features we support today. But we're going to do it; we've already started.
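To make the split concrete, here's an illustrative Java sketch. The interface name IRubyObject is real, but the method signatures below are simplified stand-ins rather than JRuby's actual ones; the point is just the difference between wrapping everything versus letting plain Objects flow through the runtime.

```java
// Illustrative sketch of the split object model; simplified stand-in
// signatures, not JRuby's real ones.
public class ObjectModelSketch {
    interface IRubyObject {
        IRubyObject callMethod(String name);
    }

    // Today: every runtime method speaks IRubyObject, so a plain Java object
    // crossing the integration boundary must be wrapped first.
    static class JavaObjectWrapper implements IRubyObject {
        final Object wrapped;
        JavaObjectWrapper(Object wrapped) { this.wrapped = wrapped; }
        public IRubyObject callMethod(String name) {
            // (real JRuby dispatches through the Ruby meta-object protocol here)
            return new JavaObjectWrapper(wrapped.toString());
        }
    }

    static IRubyObject callToday(Object javaObject) {
        // one wrapper allocation per object entering Ruby, plus one per result
        return new JavaObjectWrapper(javaObject).callMethod("to_s");
    }

    // The "lightweights" direction: runtime entry points accept plain Object,
    // so ordinary Java objects (and eventually core types like Fixnum/Float)
    // can flow through the system without any wrapper at all.
    static Object callLightweight(Object anyObject) {
        if (anyObject instanceof IRubyObject) {
            return ((IRubyObject) anyObject).callMethod("to_s");
        }
        return anyObject.toString(); // dispatch directly against the Java object
    }

    public static void main(String[] args) {
        System.out.println(callToday(42).getClass().getSimpleName()); // JavaObjectWrapper
        System.out.println(callLightweight(42));                      // 42, no wrapper
    }
}
```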

Rubinius

Evan Phoenix's Rubinius project is an effort to implement Ruby using as much Ruby code as possible. It is not, as professed, "Ruby in Ruby" anymore. Rubinius started out as a 100% Ruby implementation of Ruby that bootstrapped and ran on top of MatzRuby. Over time, though the "Ruby in Ruby" moniker has stuck, Rubinius has become more or less half C and half Ruby. It boasts a stackless bytecode-based VM (compare with Ruby 1.9, which does use the C stack), a "better" generational, compacting garbage collector, and a good bit more Ruby code in the core libraries, making several of the core methods easier to understand, maintain, and implement in the first place.

The promise of Rubinius is pretty large. If it can be made compatible, and made to run fast, it might represent a better Ruby VM than YARV. Because a fair portion of Rubinius is actually implemented in Ruby, being able to run Ruby code fast would mean all code runs faster. And the improved GC would solve some of the scaling issues Ruby 1.8 and Ruby 1.9 will face.

Rubinius also brings some other innovations. The one most likely to see general visibility is Rubinius's Multiple-VM API. JRuby has supported MVM from the beginning, since a JRuby runtime is "just another Java object". But Evan has built simple MVM support in Rubinius and put a pretty nice API on it. That API is the one we're currently looking at improving and making standard for user-land MVM in JRuby and Ruby 1.9. Rubinius has also shown that taking a somewhat more Smalltalk-like approach to Ruby implementation is feasible.
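As an aside, "just another Java object" really is the whole trick on the JRuby side. Something like the following (assuming JRuby's org.jruby.Ruby.newInstance and evalScriptlet entry points; the exact embedding API has shifted across versions) gives you two fully isolated Ruby worlds in one JVM process:

```java
import org.jruby.Ruby;

// Sketch of multiple isolated JRuby runtimes in a single JVM process.
// Assumes the low-level embedding entry points Ruby.newInstance and
// evalScriptlet; higher-level embedding APIs exist as well.
public class TwoRuntimes {
    public static void main(String[] args) {
        Ruby runtimeA = Ruby.newInstance();
        Ruby runtimeB = Ruby.newInstance();

        // A global variable set in one runtime is invisible to the other.
        runtimeA.evalScriptlet("$app_name = 'app A'");
        runtimeB.evalScriptlet("$app_name = 'app B'");

        System.out.println(runtimeA.evalScriptlet("$app_name")); // app A
        System.out.println(runtimeB.evalScriptlet("$app_name")); // app B
    }
}
```

Evan's API aims to make that kind of isolation a standard, user-visible Ruby feature rather than a host-language trick.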

But here be dragons.

In the 1.5 years since Rubinius was officially named and born into the Ruby world, it has not yet met any of these promises. It is not generally faster than Ruby 1.8, though it performs pretty well on some low-level microbenchmarks. It is not implemented in Ruby: the current VM is written in C and the codebase hosts as much C code as it does Ruby code. Evan's work on a C++ rewrite of the VM will make Rubinius the first C++-based Ruby implementation. It has not reached the Rails singularity yet, though they may achieve it for RailsConf (probably in the same cobbled-together state JRuby did at JavaOne 2006...or maybe a bit better). And the second Rails inflection point--running Rails faster than Ruby 1.8--is still far away.

Compatibility is not going to be a problem for Rubinius. They've worked very hard from the beginning to match Ruby behavior, even launching a Ruby specification suite project to officially test that behavior using Ruby 1.8 as the standard. I have no doubt Rubinius will be able to run Rails and most other Ruby apps people throw at it. And despite Evan's frequent cowboy attitude to language compatibility (such as his early refusal to implement left-to-right evaluation ordering, a fatal decision that led to the current VM rework), compatibility is likely to be a simple matter of time and effort, driven by the spec suite and by actual applications, as people start running real code on Rubinius.

Performance is going to be a much harder problem for Rubinius. In order for Rubinius to perform well, method invocation must be extremely fast. Not just faster than Ruby 1.8 or Ruby 1.9, but perhaps an order of magnitude faster than the fastest Ruby implementations. The simple reason for this is that with so much of the core classes implemented in Ruby, Rubinius is doing many times more dynamic invocations than any other implementation. If a given String method represents one or two dynamic calls in JRuby or Ruby 1.8, it may represent twenty in Rubinius...and sometimes more. All that dispatch has a severe cost, and on most benchmarks involving heavily Ruby-based classes Rubinius has absolutely dismal performance--even with call-site optimizations that finally pushed JRuby's performance to Ruby 1.9 levels. A few benchmarks I've run from JRuby's suite must be ratcheted down a couple orders of magnitude to even complete.

And the Rubinius team knows this. Over the past few months, more and more core methods have been reimplemented in C as "primitives", sometimes because they have to be to interact with C-level memory and VM constructs, but frequently for performance reasons. So the "Ruby in Ruby" implementation has evolved away from that ideal rather than towards it, and performance is still not acceptable for most applications. In theory, none of this should be insurmountable. Smalltalk VMs run significantly faster than most Ruby implementations and still implement all or most of the core in Smalltalk. Even the JVM, largely associated with the statically-typed Java language, is essentially an optimized dynamic language VM, and the majority of Java's core is implemented in Java...often behind interfaces and abstractions that require a good dynamic runtime. But these projects have hundreds of man-years behind them, where Rubinius has only a handful of full-time and part-time enthusiastic Rubyists, most with no experience in implementing high-performance language runtimes. And Evan is still primarily responsible for everything at the VM level.

Of course, it would be folly to suggest that the Rubinius team should focus on performance before compatibility. The "Ruby in Ruby" meme needs to die (seriously!), but other than that Rubinius is an extremely promising implementation of Ruby. Its performance is terrible for most apps, but not all that much worse than JRuby's performance was when we reached the Rails singularity ourselves. And its design is going to be easier to evolve than comparable C implementations, assuming that people other than Evan learn to really understand the VM core. I believe the promise of Rubinius is certainly great enough to continue the project, even if the perils are going to present some truly epic challenges for Evan and company to overcome.

Update: The Rubinius team has had a few things to say about this as well.

Evan Phoenix argues that the 50/50 split between C and Ruby really translates into a lot more logic in Ruby, because of course we all know about Ruby's legendary terseness and density. And he's got a good point; even at 50/50 Rubinius is easily "mostly" Ruby. But nobody would claim that the JVM is Java in Java, even though the ratio of C/C++ code to Java code is vastly in Java's favor. Rubinius has a C core...Rubinius's VM is written in C. It might be 100% "Ruby in Ruby" some day, but for now it's not.

Brian Ford posted further about the "Ruby in Ruby" meme, arguing rightly that "Ruby in Ruby" is an ideal we should all be striving for, and that ideal should never die. I agree wholeheartedly on that point. The meme I referred to is the growing idea that Rubinius is automatically going to be a better implementation than all others simply because it's written in Ruby. So far that hasn't been the case. And that meme has also been used as a club, claiming other implementations (even Matz's own implementation) are not for Ruby programmers as much as Rubinius is.

Rubinius is, and always has been, a great project and a great idea. I talk with Evan and Brian and all the others on a daily basis, I contribute specs whenever I find gaps or fix bugs in JRuby, and I secretly harbor a desire to implement a JRuby/JVM backend for the Rubinius kernel. I'm sure we'll see great things from Rubinius in the future.

IronRuby

I've had a love/hate relationship with Microsoft over the years. Recently, it's been more "I love to hate them", but there are some shining stars over there. And IronRuby is certainly one of them.

Microsoft's IronRuby project (or perhaps "Microsoft IronRuby") is the current most-viable .NET-based Ruby implementation. It is led by John Lam, formerly of RubyCLR fame, and a small team of folks in Microsoft's languages group. IronRuby really has its roots in the Ruby.NET project from Queensland University of Technology, and like JRuby the real seed for both projects was the implementation of a Ruby 1.8-compatible parser. IronRuby is based on Microsoft's Dynamic Language Runtime, a collection of libraries to make dynamic languages easier to implement and to work around the performance constraints of CLR's strong preference for static-typed languages. In recent months it's probably safe to say that IronRuby has been driving DLR work, since by my estimation it represents the most difficult-to-implement dynamic language Microsoft is currently working on, and IronPython is mostly done, other than ongoing performance work.

I call IronRuby a shining star not because of the implementation, which is fairly mundane, or because of the DLR, which is perhaps clever in places but certainly "just plain necessary" for CLR dynlang performance in others. IronRuby is a shining star because it's the first Microsoft project under the Microsoft Permissive License, a "truly OSS" license approved by OSI. It represents the first project at Microsoft that I've thought gives the company any real hope for the future, because John Lam and company could truly show the advantage of more openness and closer community cooperation. So the OSS thing is certainly part of the promise of IronRuby, but it's not really a Ruby thing.

The main promise of IronRuby is a compatible, performant implementation of Ruby that runs on the CLR (and by extension, runs well on Windows). IronRuby currently mostly uses normal CLR types for all the core classes, building Ruby's String on CLR's StringBuilder, Ruby's Array on CLR's ArrayList, and so on. They've also made Ruby objects and CLR objects largely indistinguishable from one another as far as call dispatch goes, where in JRuby all non-Ruby Java objects entering the system must be wrapped with a Ruby-aware dispatcher (to be remedied in JRuby 2.0ish, as mentioned above). Of course IronRuby boasts advantages similar to JRuby, since it can leverage the CLR garbage collector, memory model, performance. And of course, having Microsoft backing your project should count for something.

IronRuby could also provide a Rails deployment option that Windows folks will actually want to use. Windows support in Ruby has always lagged behind UNIX support, partially because Windows "sucks" for various definitions of "sucks", and partially because most Rubyists don't use Windows. The ones that do use Windows have often felt abandoned, leading to projects like Daniel Berger's Ruby fork Sapphire, which counts among its features "Better support for MS Windows". IronRuby on .NET on Windows would in theory integrate very well with other Windows/.NET properties like IIS, ADO (or whatever it's called now), and Microsoft's new MVC layer. So for Windows users, IronRuby ought to be a big win, and they're understandably excited about it. Set up a tweetscan for "ironruby" and you'll see what I mean...there are nearly as many anticipatory tweets about IronRuby as there are practical tweets about JRuby.

But there's some peril here too. IronRuby is largely still being developed in a vacuum. Perhaps in order to have secrets to announce at "the next big conference" or perhaps because Microsoft's own policies require it, IronRuby's development process proceeds largely from all-internal commits, all-internal discussions, and all-internal emails that periodically result in a blob of code tossed over the fence to external contributors. The OSS story has improved, since those of us on the outside can actually get access to the code, but the necessary two-way street still isn't there. That's going to slow progress, and eventually could make it impossible for IronRuby to keep up as resources are moved to other projects at Microsoft. JRuby has managed to sustain for as long as it has with only two fulltime developers entirely because of our community and openness, and indeed JRuby would never have been possible without a fully OSS process.

IronRuby is also going to have trouble running Rails in its current form. Rails 2.x is still hindered by its inability to process concurrent requests in parallel on a single process. Because of various thread-unsafeties in the Rails libraries, concurrent requests must be shunted off to separate processes, or in the case of JRuby to separate JRuby instances in the same JVM process. There is work underway to improve this, some of it through GSOC, but it's still going to be a while before you can run your entire app on a single process with any of the C implementations. Even then, if you want to run many Rails apps, you'll still need multiple processes with Ruby 1.8. And this is where IronRuby is going to get burned.

As I understand it, currently there is no way to provide multiple isolated execution environments in IronRuby as you can in JRuby or Rubinius. The reasons for this are beyond me, but many language implementations on top of a general-purpose VM avoid the complexity of "multiple runtimes" to better integrate with the rest of the system. JRuby's MVM support, for example, makes serializing Ruby objects as though they were normal Java objects nearly impossible, because upon deserialization there's no way to know which JRuby instance to attach the object to. In IronRuby's case, any inability to run multiple environments in the same process will mean IronRuby users must launch multiple processes to run Rails, just like the Ruby 1.8 and 1.9 users do. And that would be, in my opinion, an unacceptable state of affairs.

I also believe that the IronRuby team does not yet understand the scope of what's necessary to run Rails. John Lam has been quoted at several events saying they hope to run a "hello world" Rails app at RailsConf, but IronRuby can't run IRB, RubyGems, or Rake yet. John has also been tweeting periodic updates on IronRuby's spec-passing rate, even though Rubinius passes most of those specs and still can't run Rails (and JRuby passes more than Rubinius along with another 48000 assertions in our own test suite, only some of which have equivalents in the specs). As far as time spent on implementation, IronRuby really only has about a year of progress in, since Silverlight integration, demos, and presentations often pull them away for weeks at a time.

There's also a final peril the IronRuby team will have to deal with: Microsoft would never back an OSS web framework like Rails in preference to its own. John Lam has repeatedly said that IronRuby will run Rails, and I believe him. But that goal is almost certainly not a Microsoft priority, since they have their own proprietary technologies to push. If John's able to do it, it will have to come from his small team and community contributors, leading back to the OSS peril above.

Be that as it may, an implementation of Ruby for the CLR is certainly going to happen, and I believe it's necessary for the Ruby ecosystem to survive for there to be a CLR Ruby. IronRuby is going to be that project, and it's already driving competition in the Ruby implementation world. John Lam and I talk at conferences, exchange tweets and emails, and I've been building and running IronRuby occasionally to check on their progress. I also know the IronRuby team realizes they could be more open and probably wants to be more open. And perhaps if they read this article they'll start to realize there's a lot more work involved in reaching the Rails singularity than running specs and having a working String implementation. There's pain involved, and they've not yet begun to feel it.

MacRuby

Ruby 1.9 fixes some of MatzRuby's issues, but not all of them. Though it brings a much-faster bytecode VM, improved method-dispatch cost, and a number of other execution-related performance tweaks, it does not solve problems with Ruby's memory model and garbage collector. And partially for this reason Apple's Laurent Sansonetti has been working on MacRuby, a forked rework of Ruby 1.9 targeting the Objective C runtime.

Laurent is famous for his past work on RubyCocoa, bindings to allow Ruby to deliver beautiful, top-notch UIs on OS X. I don't know how long he's been working on MacRuby, since some of its life was spent in secret, but it's been open-source for a few months now. Currently Laurent has been working on converting the core classes from C implementations over to using ObjC equivalents. Part of the goal of MacRuby is to provide a Ruby that can interact directly with ObjC: Ruby objects are ObjC objects and vice versa; Ruby can call methods on ObjC objects and vice-versa. So in this sense, MacRuby is perhaps more similar to JRuby or IronRuby than to Rubinius or the MatzRuby lineage. It is an implementation of Ruby for a general-purpose runtime.

MacRuby promises one thing for certain: tight integration with ObjC and by extension with much of OS X. Because ObjC is the language of choice for development on OS X, MacRuby users will certainly have the cleanest, tightest integration with OS libraries and primitives, far better than any other implementation can provide.

There's also a chance that ObjC's core classes (which essentially are repurposed as MacRuby's core classes) will have better performance characteristics than the hand-written C impls in MatzRuby. Because MacRuby is based on Ruby 1.9, it shares Ruby 1.9's bytecode-based execution engine. This means that most performance gains will come from better core class implementations. Already Laurent has been tweeting some impressive microbenchmark numbers showing e.g. Array performance can be substantially better than Ruby 1.9. And there are performance gains to be had calling ObjC code, of course, since dispatch is now essentially ObjC dispatch directly rather than passing through Ruby's own dispatch logic.

MacRuby may also eventually represent a better way to run Rails on OS X. Because of YARV, it should have good performance. Because of ObjC, it should have a good memory model. And because it's "MacRuby" it should fit well into the rest of the system, likely leading to a simpler, better-integrated deployment story for Rails.

The biggest peril for MacRuby is pretty obvious: Why Ruby 1.9? Ruby 1.9 is still under active development, and APIs are being added and tweaked as we speak. Laurent's fork is going to get further and further away, even if he's able to keep portions up-to-date; that's going to make compatibility a serious challenge. Choosing Ruby 1.9 certainly makes sense from a performance perspective, but many Ruby apps out there don't run on it, and most developers aren't targeting it because it's still a moving target. Even on JRuby, where we've got prototype implementations of both YARV's and Rubinius's bytecode engines, we've opted not to hit 1.9 features hard yet. Ruby 1.9 is in progress, and that will mean a lot more effort required to keep MacRuby up to date and to make it an attractive option.

There's also a chance that Ruby implemented on top of Objective C isn't going to perform that much better, on the whole, than Ruby 1.9's all-C approach. While some of Laurent's benchmarks have been impressively faster than Ruby 1.9, many others have been equally slower. And all benchmarks I've run, which are a little less "micro", have been slower on MacRuby than either Ruby 1.9 or MatzRuby. That's not very promising.

Regardless of the peril, MacRuby seems like a great idea. Objective C is a solid runtime, and the fact that it's so heavily used on OS X means MacRuby will have an excellent integration story. MacRuby may not add much value in the runtime/performance/execution department, but will potentially teach us all lessons about integrating with a general-purpose runtime. And of course MacRuby may eventually be shipped with OS X, making Ruby a first-class language for writing Mac apps. That alone is probably worth it.

Fading Implementations

It's worth spending a few words on the "no longer viable" implementations here as well.

XRuby, product of Xue Yong Zhi and a few others, was the first Ruby implementation to have a full JVM bytecode compiler. Performance early on looked very good, but the lack of a compatible set of core classes and resource limitations caused its development to lag terribly. The most recent release was 0.3.3 on March 24, and since then there have been only two commits. Xue has admitted he has no time to work on the project, and without a heavy, long-term time investment from someone XRuby is likely to fade away.

Ruby.NET is a similar story. Started with a research grant from Microsoft at Queensland University of Technology, Ruby.NET is the product of John Gough and Wayne Kelly. The project was a proof-of-concept for Ruby on CLR, to show it could be done and work through some of the early challenges in getting there. It was officially made into an open source project last year, and for a while there were many interested contributors. But although the Ruby.NET parser lives on in IronRuby, the project has largely ground to a halt since Wayne officially threw his support behind Microsoft's project. There have been two commits in the past month, and the last two really active mailing list threads were titled "The future of Ruby.NET" and "Has IronRuby killed off Ruby.NET".

And Cardinal is still pining for the fjords.

New Contenders

I don't know much about these projects, but I'm sure they'll all bring their own flavor to the Ruby soup.

HotRuby is an implementation of Ruby in JavaScript. It implements Ruby 1.9's bytecode engine and all core classes using JavaScript types. Performance looks good on their site, but they can't run anything yet and haven't even begun to discuss compatibility. It may show promise in time, but it's an interesting toy for now.

MagLev from GemStone appears to be something Ruby-related. Not much (anything?) has been publicly said about it. It's mysterious. It has a nice viral front page. Rails may be involved.

IronMonkey is an effort to port IronPython and IronRuby to the Tamarin VM Adobe donated to the Mozilla project. It's being led by Seo Sanghyeon, though from what I hear he hasn't had a lot of time to work on it. Python and Ruby in the browser, cross-platform, without vendor lock-in. Could be interesting.

Final Thoughts

Have you started working on your Ruby implementation yet? All the cool kids are doing it. It's remarkable how many implementations of Ruby are in the works right now. It remains to be seen whether the ecosystem can support such diversity in the long term, but at the very least they're introducing splendid variation. And there's a lot more to do with Ruby in terms of performance, scaling, and "getting things done". Ruby's future is looking bright, in no small part due to the many implementations. How's your favorite language looking?

Update: Vladimir Sizikov has a nice short article on the value of the RubySpecs project, and since it's so important for the future of these alternative implementations I thought it deserved a mention. He also includes links to his RubySpecs quickstart guide and the current RubySpecs overview page. If you haven't contributed to the specs yet, you should feel guilty.