Sunday, April 27, 2008

Promise and Peril for Alternative Ruby Impls

My how things have changed in a couple short years.

Two years ago, in 2006, there were essentially two viable Ruby implementations: Matz's Ruby 1.8.x codebase, and JRuby. At the time, JRuby was just barely starting to run Rails. I consider that a sort of "singularity" in the lifetime of an implementation, the inflection point at which it becomes more than a toy. (OT: these days, I consider the ability to run Rails faster than Matz's Ruby a better inflection point, but we've had the Rails thing going for two years). Cardinal (Ruby on Parrot) was mostly dead, or at least on its way to being dead. YARV, eventually to become the Ruby 1.9 VM, was perhaps only half completed and was not yet officially marked to be the next Ruby. Rubinius had not really been started, or at least had not officially been named and could not be considered anywhere near viable. IronRuby was still Wilco Bauer's IronRuby, a doomed codebase and project name eventually to be adopted by Microsoft's later Ruby implementation effort. So there was very little competition, and most people still considered JRuby to be a big joke. Ha ha.

Fast forward to Spring 2008. Ruby 1.8.x has mostly been put in maintenance mode, but remains by far the most widely-deployed Ruby implementation, despite its relatively poor performance. Largely, this is because only Ruby 1.8.x is 100% compatible with Ruby 1.8.x, and because it's already packaged and shipped on a number of OSes, including Rubyist favorite OS X. But the rest of the field has become a lot more muddled. There are now around six implementations that have passed the Rails inflection point or will soon, a handful of others likely to fade into obscurity (after contributing their own genetics to the Ruby ecosystem, surely), a few mysterious up-and-comers...and Cardinal is still dead.

Let's review the promise, peril, and status of all the implementations. Note, this is largely a mix of facts and my opinions. Corrections for the facts are welcome. Corrections for the opinions...well...let's take it offline.

Ruby 1.8

Matz's venerable code base (usually called MRI for "Matz's Ruby Interpreter", MatzRuby or Ruby 1.8 in this article) still has a death-grip on the Ruby world. For 99% of Ruby developers, MatzRuby is still the king of the hill. This is in the face of poor relative performance (slower than JRuby and 1.9 for sure, and slower than most of the others for many cases), poor memory management (conservative, non-compacting GC), and many upstart implementations which solve these problems. Why is this?

It's not hard to answer. It's compatibility and status quo. If all my apps run fine on MatzRuby, and MatzRuby is already installed, and I'm satisfied with the performance characteristics of MatzRuby...why would I run anything else? Because MatzRuby is largely not *that bad* for most uses, especially as relates to simple system scripting work, it's unlikely to go away any time soon. And since all but one of the alternative impls is targeting Ruby 1.8 features and compatibility, the still-in-development Ruby 1.9 is not gaining much traction yet (which is probably a good thing).

You should all know the specs and status for MatzRuby. Current official release is 1.8.6 patchlevel 114. There's a 1.8.7 preview 2 out now that backports a whole bunch of features from 1.9 and breaks some compatibility. The jury's still out on whether it will go final as-is. MatzRuby is a simple AST-walking interpreter, with a conservative GC, minimal/cumbersome Unicode support, and a large library of third-party native extensions. MatzRuby is still the gold standard for what Ruby 1.8 "is".
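
For anyone unsure what "AST-walking interpreter" means in practice, here is a minimal sketch of the technique. The node format (nested arrays like [:add, left, right]) and the walk method are made up for illustration; MatzRuby's real node structures are C structs and look nothing like this.

```ruby
# Hypothetical mini-AST: each node is [:type, *children].
# Illustrates the technique only; this is not MatzRuby's node layout.
def walk(node, env = {})
  type, *args = node
  case type
  when :lit then args[0]                                   # literal value
  when :var then env.fetch(args[0])                        # variable lookup
  when :add then walk(args[0], env) + walk(args[1], env)   # recurse on both sides
  when :mul then walk(args[0], env) * walk(args[1], env)
  when :if  then walk(args[0], env) ? walk(args[1], env) : walk(args[2], env)
  else raise ArgumentError, "unknown node type: #{type.inspect}"
  end
end

# (x + 2) * 3
tree = [:mul, [:add, [:var, :x], [:lit, 2]], [:lit, 3]]
puts walk(tree, { x: 4 })   # prints 18
```

Every evaluation re-traverses the tree and re-dispatches on node type, which is part of why this style tends to be slower than compiling to bytecode first.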

The peril for Ruby 1.8 right now involves keeping that 99% of Ruby users interested in Ruby while 1.9 bakes without breaking compatibility. Ruby 1.8.7 pre1 introduced some Ruby 1.9 features but also broke a bunch of stuff (most notably Rails), and pre2 doesn't pass the specs 1.8.6 does. We had a design meeting last week where it was decided that we folks working on the Ruby specs need to help the Ruby core team get involved, so they can start running the full suite as part of their development process. That's going to happen soon, but until it does 1.8 releases can break basically anything from day to day and nobody will know about it.

Beyond compatibility and keeping the masses happy, Ruby 1.8 could use a little performance, scaling, and memory lovin' too. Unfortunately almost all that effort is going toward Ruby 1.9 now, leaving the vast majority of Ruby users stuck on one of the slowest implementations. That's good for us alternative implementers, since it means we're gaining users every day; but it's not good for the MatzRuby lineage because they're losing mindshare. It's hard to deny that the future of Ruby lies with the "excellent" implementations, if not with the "best" ones, and the definition of "excellent" is moving forward every day. Ruby 1.8 is not.

Ruby 1.9

Ruby 1.9 is the merging of the Ruby 1.8 class library and memory model with a large number of new features and a bytecode-based execution engine. It represents the work of Koichi Sasada, who first announced his YARV ("Yet Another Ruby VM") project at RubyConf 2004. YARV took a bit longer than he and many others expected to be completed, but as a result of his tireless efforts it is now the official Ruby 1.9 VM.

Ruby 1.9 introduces many new features, like a character-aware (Unicode and any other encoding) String implementation, Enumerator/Enumerable enhancements, and numerous refinements and additions to the rest of the class library I won't attempt to list here. Ruby 1.9 is defining, in essence, what the rest of the implementations will soon have to implement. For all the debates, most of the additions in Ruby 1.9 have been well-received by the community, though the exposure level is still extremely low.

Ruby 1.9 has, I believe, reached the Rails singularity. With some work over the past few months, Rails has moved closer and closer to running on Ruby 1.9. Last I heard, there was only one bug that needed patching in the 1.9 codebase for Rails trunk to run unmodified. Expect to see an announcement about Ruby 1.9 and Rails at RailsConf next month.

The interesting thing about Ruby 1.9 is that it will mark only the second non-MatzRuby implementation to run Rails, JRuby being the first. This is due in large part to the massive effort required to implement a bytecode VM for Ruby (1.9 was only just released this past December), and to the fact that Ruby 1.9 is still very much a moving target. APIs are being added and refined, optimizations are being tossed about at the VM level, and memory and GC improvements are being considered. So while it's unlikely that anyone will be moving Rails apps to Ruby 1.9 in the near future, Ruby 1.9 is certainly viable...and represents the most-likely future evolution of the Ruby language itself.

What worries me about Ruby 1.9 is that its performance doesn't seem "better enough" to change the future of Ruby. While Koichi's own benchmarks show it's much faster than Ruby 1.8, on general application benchmarks it's usually less than a 50% improvement. Many times, JRuby is able to exceed Ruby 1.9 performance, even without similar optimizations and feature removals. It is certainly a "better" performance story than Ruby 1.8, but is it enough?

Ruby 1.9 also took the first steps toward concurrency by making threads native, but it encumbered them with a giant lock a la Python. That means you still can't get concurrent execution of Ruby code on Ruby 1.9, something JRuby's been able to do all along, and you can't scale Ruby 1.9 any better on wide systems than you could with Ruby 1.8. There are plans to solve this by adding fine-grained locks to most internal data structures, but that's a hard problem to solve, on par with the JRuby 2.0 challenges I'll talk about in a minute. And without something like the JVM to really optimize that locking, performance will take a hit.
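
To make the giant-lock point concrete, here is a small sketch of CPU-bound work split across threads. The count_primes helper and the slice boundaries are my own illustration. The program is identical on every implementation; what differs is whether the four threads can actually execute Ruby code at the same time (they can on JRuby) or take turns holding one lock.

```ruby
# Hypothetical helper: counts primes in a range by trial division.
def count_primes(range)
  range.count { |n| n > 1 && (2..Math.sqrt(n)).none? { |d| (n % d).zero? } }
end

# Split the work into four slices, one thread per slice.
slices  = [(1..25), (26..50), (51..75), (76..100)]
threads = slices.map { |r| Thread.new { count_primes(r) } }

# Thread#value joins the thread and returns the block's result.
total = threads.map(&:value).sum
puts total   # prints 25
```

The answer is the same either way; only the wall-clock time on a multi-core box tells you whether the threads ran in parallel.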

It's also unclear if people really *want* all of what Ruby 1.9 has to offer. Sure, people love the idea of a real encoding-aware String, but response to the rest of what Ruby 1.9 offers--including performance--has been a collective "meh." People are not flocking to Ruby 1.9 in droves, and many are contemplating whether their future Ruby work will simply be a lateral move to one of the 1.8-compatible implementations. And on the JRuby project, we've received almost no requests to implement Ruby 1.9 features, so we've only added a few tiny ones. Whither Ruby 2.0?

JRuby

Ahh, JRuby. How you have changed my life.

JRuby is a Java-based implementation of Ruby, or if you prefer not to speak the word "Java", it's Ruby for the JVM. JRuby was started in 2002 by Jan Arne Petersen, and though it had a couple good years of activity it never really got to a compatible-enough level to run real Ruby applications. Jan Arne moved on at some point and efforts were largely picked up by Thomas Enebo, current co-lead of JRuby. He was especially active when JRuby was being updated from Ruby 1.6 compatibility to Ruby 1.8.4 compatibility, a task which today is largely complete. I joined the project in fall 2004 after attending RubyConf 2004, and at the time I did not know Ruby. Now I know Ruby in a deeper way than I ever really wanted to...but that's a discussion for another day. I wasn't a hugely active contributor until late 2005, when I started working on a new interpreter and refactoring JRuby internals. I presented JRuby for the first time at RubyConf 2005, and then in early 2006 milestones started dropping like flies: IRB ran, then RubyGems, then Rake...and then Rails. We were still dog slow at the time, but we were viable.

JRuby reached the Rails singularity in time for JavaOne 2006, an event that led Sun to hire Tom and me in fall 2006. We demonstrated on stage JRuby running a simple Rails application. It was cobbled together, running under either Tomcat or simply WEBrick at the time, but we had proved it was possible to have an alternative Ruby implementation compatible enough to run Rails. JRuby was no longer a joke.

Over the past two years (man, has it really been two years?) we've essentially rewritten almost all of JRuby a piece at a time. We've been through three interpreters, one prototype compiler and one complete compiler, multiple Regexp engines, and at least two implementations of the key core classes. I wrote the compiler for JRuby during summer 2007, completing it around RailsConf EU 2007. My first compiler! We now run faster than Ruby 1.8 in both interpreted and compiled modes, with interpreted being perhaps 15-20% faster and compiled being at least a few times faster, generally on par with Ruby 1.9. Of course since we're based on the JVM, we share its object model, garbage collector, binary representation. So JRuby is certainly a "mini-VM" but we leave the nasty bits to JVM implementers to handle. Pragmatism, friends, pragmatism.

Perhaps the most notable result of JRuby's existence is that there are now so many Ruby implementations. If we had not shown the promise, many of the others might not have risked the peril. Oh, and we've ended up with a cracker-jack implementation of Ruby on the JVM too...I suppose that's worth a little something.

Perils...always perils. JRuby has managed to surmount most of the perils that await other implementations. And being on the other side of the chasm, I can tell you now it doesn't get easier.

Compatibility is *hard*. I'm not talking a little hard, I'm talking monumentally hard. Ruby is a very flexible, complicated language to implement, and it ships with a number of very flexible, complicated core class implementations. Very little exists in the way of specifications and test kits, so what we've done with JRuby we've done by stitching together every suite we could find. And after all this time, we still have known bugs and weekly reports of minor incompatibilities. I don't think an alternative implementation can ever truly become "compatible" as much as "more compatible". We're certainly the most compatible alternative impl, and even now we've got our hands full fixing bugs. Then there's Ruby 1.9 support, coming up probably in JRuby 1.2ish. Another adventure.

Performance is also hard, but maybe not *hard* hard. JRuby is lucky to run on one of the fastest VMs in existence. The JVM, in its many incarnations, has been so refined and the JVM implementation arena so competitive that we get a lot of performance for free. But by "for free" I mean Java performance. Java's far easier to write and maintain than C, and on the JVM we know we won't pay a performance penalty for not writing C code. But making Ruby fast on the JVM is where it gets tricky. JVMs are optimized for Java and Java-like languages. JRuby has to include all sorts of tricks and subsystems to make the runtime and compiled Ruby code "feel" a bit more like Java to the JVM. This has involved, in many cases, implementing our own "mini-VM" on top of the JVM, with mixed-mode execution (interpreted, then JITed to JVM bytecode), call site caches (to speed method lookup), and code alterations that sometimes improve performance at the cost of LOC and readability. The challenge for us going forward is to continue improving performance without making JRuby a jumbled mess. John Rose's work on the Da Vinci Machine and dynamic invocation for JDK 7 will help us rip a lot of code out, but only a small subset of JRuby users will see the benefits in the near term. So expect to see us spend a lot more time on performance for Java 5/6 compatible JVMs.
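
For the curious, a call site cache boils down to remembering the result of the last method lookup at a given spot in the code. The sketch below is a toy, single-entry version written in Ruby; the CallSite class and its hit/miss counters are invented for illustration and are not JRuby's actual classes, which live in Java and also have to invalidate the cache when a class is modified.

```ruby
# Hypothetical single-entry ("monomorphic") call-site cache.
# Assumption: classes are never reopened, so no invalidation is modeled here.
class CallSite
  attr_reader :hits, :misses

  def initialize(method_name)
    @method_name   = method_name
    @cached_class  = nil
    @cached_method = nil
    @hits = 0
    @misses = 0
  end

  def call(receiver, *args)
    klass = receiver.class
    if klass.equal?(@cached_class)
      @hits += 1                      # fast path: reuse the cached lookup
    else
      @misses += 1                    # slow path: full lookup, then cache it
      @cached_class  = klass
      @cached_method = klass.instance_method(@method_name)
    end
    @cached_method.bind(receiver).call(*args)
  end
end

site = CallSite.new(:upcase)
%w[a b c].each { |s| site.call(s) }   # one miss, then two hits
site.call(:sym)                       # different receiver class: miss
puts "hits=#{site.hits} misses=#{site.misses}"   # prints hits=2 misses=2
```

Most call sites see the same receiver class over and over, which is why even a one-entry cache like this pays off.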

The final big peril for us relates to JRuby 2.0's Java integration support. JRuby currently has a split object model, where Ruby types are all "IRubyObject" implementations and the runtime only understands how to deal with "IRubyObject". This means that in order for us to call methods on non-Ruby Java types, we must wrap them with an IRubyObject wrapper. This is partially to attach our meta-object protocol to those objects, but mostly because every bloody method in the system accepts only IRubyObject as a parameter or return type. In order for us to achieve the "last mile" of Java integration, we need to make the entire system accept "Object" and act appropriately; we've been calling this approach "lightweights", since it would enable using normal Java objects for several core classes like Fixnum and Float. Support for "Object" lightweights would then feed into JRuby 2.0's reworked Java integration layer, eliminating most of the overhead (and code) associated with calling Java methods today. It would also fit better into the invokedynamic work. It's a big job I wouldn't expect to be complete until later this fall, and we need to mercilessly write specs and tests for the current behavior to avoid regressing features we support today. But we're going to do it; we've already started.

Rubinius

Evan Phoenix's Rubinius project is an effort to implement Ruby using as much Ruby code as possible. It is not, as professed, "Ruby in Ruby" anymore. Rubinius started out as a 100% Ruby implementation of Ruby that bootstrapped and ran on top of MatzRuby. Over time, though the "Ruby in Ruby" moniker has stuck, Rubinius has become more or less half C and half Ruby. It boasts a stackless bytecode-based VM (compare with Ruby 1.9, which does use the C stack), a "better" generational, compacting garbage collector, and a good bit more Ruby code in the core libraries, making several of the core methods easier to understand, maintain, and implement in the first place.

The promise of Rubinius is pretty large. If it can be made compatible, and made to run fast, it might represent a better Ruby VM than YARV. Because a fair portion of Rubinius is actually implemented in Ruby, being able to run Ruby code fast would mean all code runs faster. And the improved GC would solve some of the scaling issues Ruby 1.8 and Ruby 1.9 will face.

Rubinius also brings some other innovations. The one most likely to see general visibility is Rubinius's Multiple-VM API. JRuby has supported MVM from the beginning, since a JRuby runtime is "just another Java object". But Evan has built simple MVM support in Rubinius and put a pretty nice API on it. That API is the one we're currently looking at improving and making standard for user-land MVM in JRuby and Ruby 1.9. Rubinius has also shown that taking a somewhat more Smalltalk-like approach to Ruby implementation is feasible.

But here be dragons.

In the 1.5 years since Rubinius was officially named and born into the Ruby world, it has not yet met any of these promises. It is not generally faster than Ruby 1.8, though it performs pretty well on some low-level microbenchmarks. It is not implemented in Ruby: the current VM is written in C and the codebase hosts as much C code as it does Ruby code. Evan's work on a C++ rewrite of the VM will make Rubinius the first C++-based Ruby implementation. It has not reached the Rails singularity yet, though they may achieve it for RailsConf (probably in the same cobbled-together state JRuby did at JavaOne 2006...or maybe a bit better). And the second Rails inflection point--running Rails faster than Ruby 1.8--is still far away.

Compatibility is not going to be a problem for Rubinius. They've worked very hard from the beginning to match Ruby behavior, even launching a Ruby specification suite project to officially test that behavior using Ruby 1.8 as the standard. I have no doubt Rubinius will be able to run Rails and most other Ruby apps people throw at it. And despite Evan's frequent cowboy attitude to language compatibility (such as his early refusal to implement left-to-right evaluation ordering, a fatal decision that led to the current VM rework), compatibility is likely to be a simple matter of time and effort, driven by the spec suite and by actual applications, as people start running real code on Rubinius.

Performance is going to be a much harder problem for Rubinius. In order for Rubinius to perform well, method invocation must be extremely fast. Not just faster than Ruby 1.8 or Ruby 1.9, but perhaps an order of magnitude faster than the fastest Ruby implementations. The simple reason for this is that with so much of the core classes implemented in Ruby, Rubinius is doing many times more dynamic invocations than any other implementation. If a given String method represents one or two dynamic calls in JRuby or Ruby 1.8, it may represent twenty in Rubinius...and sometimes more. All that dispatch has a severe cost, and on most benchmarks involving heavily Ruby-based classes Rubinius has absolutely dismal performance--even with call-site optimizations that finally pushed JRuby's performance to Ruby 1.9 levels. A few benchmarks I've run from JRuby's suite must be ratcheted down a couple orders of magnitude to even complete.
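
Here is a toy illustration of how that multiplication happens. TinyString, its two "primitives", and the dispatch counter are all hypothetical and have nothing to do with Rubinius's actual kernel code; the point is only that one call from the user's perspective fans out into many calls once the core method is itself written in Ruby.

```ruby
# Hypothetical string-like class with two counted "primitives"
# and one "core method" written in Ruby on top of them.
class TinyString
  attr_reader :dispatches

  def initialize(str)
    @str = str
    @dispatches = 0
  end

  # Primitives: the only operations that touch the underlying storage.
  def size
    @dispatches += 1
    @str.size
  end

  def char_at(i)
    @dispatches += 1
    @str[i]
  end

  # Core method implemented in Ruby: every loop iteration costs two
  # primitive calls (plus <, == and +, which are not counted here).
  def index(ch)
    i = 0
    while i < size
      return i if char_at(i) == ch
      i += 1
    end
    nil
  end
end

s = TinyString.new("rubinius")
pos = s.index("n")
puts "index=#{pos} dispatches=#{s.dispatches}"   # prints index=4 dispatches=10
```

A C or Java implementation of the same method would be a single dispatch followed by a tight native loop.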

And the Rubinius team knows this. Over the past few months, more and more core methods have been reimplemented in C as "primitives", sometimes because they have to be to interact with C-level memory and VM constructs, but frequently for performance reasons. So the "Ruby in Ruby" implementation has evolved away from that ideal rather than towards it, and performance is still not acceptable for most applications. In theory, none of this should be insurmountable. Smalltalk VMs run significantly faster than most Ruby implementations and still implement all or most of the core in Smalltalk. Even the JVM, largely associated with the statically-typed Java language, is essentially an optimized dynamic language VM, and the majority of Java's core is implemented in Java...often behind interfaces and abstractions that require a good dynamic runtime. But these projects have hundreds of man-years behind them, where Rubinius has only a handful of full-time and part-time enthusiastic Rubyists, most with no experience in implementing high-performance language runtimes. And Evan is still primarily responsible for everything at the VM level.

Of course, it would be folly to suggest that the Rubinius team should focus on performance before compatibility. The "Ruby in Ruby" meme needs to die (seriously!), but other than that Rubinius is an extremely promising implementation of Ruby. Its performance is terrible for most apps, but not all that much worse than JRuby's performance was when we reached the Rails singularity ourselves. And its design is going to be easier to evolve than comparable C implementations, assuming that people other than Evan learn to really understand the VM core. I believe the promise of Rubinius is certainly great enough to continue the project, even if the perils are going to present some truly epic challenges for Evan and company to overcome.

Update: The Rubinius team has had a few things to say about this as well.

Evan Phoenix argues that the 50/50 split between C and Ruby really translates into a lot more logic in Ruby, because of course we all know about Ruby's legendary terseness and density. And he's got a good point; even at 50/50 Rubinius is easily "mostly" Ruby. But nobody would claim that the JVM is Java in Java, even though the ratio of C/C++ code to Java code is vastly in Java's favor. Rubinius has a C core...Rubinius's VM is written in C. It might be 100% "Ruby in Ruby" some day, but for now it's not.

Brian Ford posted further about the "Ruby in Ruby" meme, arguing rightly that "Ruby in Ruby" is an ideal we should all be striving for, and that ideal should never die. I agree wholeheartedly on that point. The meme I referred to is the growing idea that Rubinius is automatically going to be a better implementation than all others simply because it's written in Ruby. So far that hasn't been the case. And that meme has also been used as a club, claiming other implementations (even Matz's own implementation) are not for Ruby programmers as much as Rubinius is.

Rubinius is, and always has been, a great project and a great idea. I talk with Evan and Brian and all the others on a daily basis, I contribute specs whenever I find gaps or fix bugs in JRuby, and I secretly harbor a desire to implement a JRuby/JVM backend for the Rubinius kernel. I'm sure we'll see great things from Rubinius in the future.

IronRuby

I've had a love/hate relationship with Microsoft over the years. Recently, it's been more "I love to hate them", but there are some shining stars over there. And IronRuby is certainly one of them.

Microsoft's IronRuby project (or perhaps "Microsoft IronRuby") is the current most-viable .NET-based Ruby implementation. It is led by John Lam, formerly of RubyCLR fame, and a small team of folks in Microsoft's languages group. IronRuby really has its roots in the Ruby.NET project from Queensland University of Technology, and like JRuby the real seed for both projects was the implementation of a Ruby 1.8-compatible parser. IronRuby is based on Microsoft's Dynamic Language Runtime, a collection of libraries to make dynamic languages easier to implement and to work around the performance constraints of CLR's strong preference for static-typed languages. In recent months it's probably safe to say that IronRuby has been driving DLR work, since by my estimation it represents the most difficult-to-implement dynamic language Microsoft is currently working on, and IronPython is mostly done, other than ongoing performance work.

I call IronRuby a shining star not because of the implementation, which is fairly mundane, or because of the DLR, which is perhaps clever in places but certainly "just plain necessary" for CLR dynlang performance in others. IronRuby is a shining star because it's the first Microsoft project under the Microsoft Permissive License, a "truly OSS" license approved by OSI. It represents the first project at Microsoft that I've thought gives the company any real hope for the future, because John Lam and company could truly show the advantage of more openness and closer community cooperation. So the OSS thing is certainly part of the promise of IronRuby, but it's not really a Ruby thing.

The main promise of IronRuby is a compatible, performant implementation of Ruby that runs on the CLR (and by extension, runs well on Windows). IronRuby currently mostly uses normal CLR types for all the core classes, building Ruby's String on CLR's StringBuilder, Ruby's Array on CLR's ArrayList, and so on. They've also made Ruby objects and CLR objects largely indistinguishable from one another as far as call dispatch goes, where in JRuby all non-Ruby Java objects entering the system must be wrapped with a Ruby-aware dispatcher (to be remedied in JRuby 2.0ish, as mentioned above). Of course IronRuby boasts advantages similar to JRuby, since it can leverage the CLR garbage collector, memory model, performance. And of course, having Microsoft backing your project should count for something.

IronRuby could also provide a Rails deployment option that Windows folks will actually want to use. Windows support in Ruby has always lagged behind UNIX support, partially because Windows "sucks" for various definitions of "sucks", and partially because most Rubyists don't use Windows. The ones that do use Windows have often felt abandoned, leading to projects like Daniel Berger's Ruby fork Sapphire, which counts among its features "Better support for MS Windows". IronRuby on .NET on Windows would in theory integrate very well with other Windows/.NET properties like IIS, ADO (or whatever it's called now), and Microsoft's new MVC layer. So for Windows users, IronRuby ought to be a big win, and they're understandably excited about it. Set up a tweetscan for "ironruby" and you'll see what I mean...there are nearly as many anticipatory tweets about IronRuby as there are practical tweets about JRuby.

But there's some peril here too. IronRuby is largely still being developed in a vacuum. Perhaps in order to have secrets to announce at "the next big conference" or perhaps because Microsoft's own policies require it, IronRuby's development process proceeds largely from all-internal commits, all-internal discussions, and all-internal emails that periodically result in a blob of code tossed over the fence to external contributors. The OSS story has improved, since those of us on the outside can actually get access to the code, but the necessary two-way street still isn't there. That's going to slow progress, and eventually could make it impossible for IronRuby to keep up as resources are moved to other projects at Microsoft. JRuby has managed to sustain for as long as it has with only two fulltime developers entirely because of our community and openness, and indeed JRuby would never have been possible without a fully OSS process.

IronRuby is also going to have trouble running Rails in its current form. Rails 2.x is still hindered by its inability to process concurrent requests in parallel on a single process. Because of various thread-unsafeties in the Rails libraries, concurrent requests must be shunted off to separate processes, or in the case of JRuby to separate JRuby instances in the same JVM process. There is work underway to improve this, some of it through GSOC, but it's still going to be a while before you can run your entire app on a single process with any of the C implementations. Even then, if you want to run many Rails apps, you'll still need multiple processes with Ruby 1.8. And this is where IronRuby is going to get burned.

As I understand it, currently there is no way to provide multiple isolated execution environments in IronRuby as you can in JRuby or Rubinius. The reasons for this are beyond me, but many language implementations on top of a general-purpose VM avoid the complexity of "multiple runtimes" to better integrate with the rest of the system. JRuby's MVM support, for example, makes serializing Ruby objects as though they were normal Java objects nearly impossible, because upon deserialization there's no way to know which JRuby instance to attach the object to. In IronRuby's case, any inability to run multiple environments in the same process will mean IronRuby users must launch multiple processes to run Rails, just like the Ruby 1.8 and 1.9 users do. And that would be, in my opinion, an unacceptable state of affairs.

I also believe that the IronRuby team does not yet understand the scope of what's necessary to run Rails. John Lam has been quoted at several events saying they hope to run a "hello world" Rails app at RailsConf, but IronRuby can't run IRB, RubyGems, or Rake yet. John has also been tweeting periodic updates on IronRuby's spec-passing rate, even though Rubinius passes most of those specs and still can't run Rails (and JRuby passes more than Rubinius along with another 48000 assertions in our own test suite, only some of which have equivalents in the specs). As far as time spent on implementation, IronRuby really only has about a year of progress in, since Silverlight integration, demos, and presentations often pull them away for weeks at a time.

There's also a final peril IronRuby will have to deal with: Microsoft would never back an OSS web framework like Rails in preference to its own. John Lam has repeatedly said that IronRuby will run Rails, and I believe him. But that goal is almost certainly not a Microsoft priority, since they have their own proprietary technologies to push. If John's able to do it, it will have to come from his small team and community contributors, leading back to the OSS peril above.

Be that as it may, an implementation of Ruby for the CLR is certainly going to happen, and I believe it's necessary for the Ruby ecosystem to survive for there to be a CLR Ruby. IronRuby is going to be that project, and it's already driving competition in the Ruby implementation world. John Lam and I talk at conferences, exchange tweets and emails, and I've been building and running IronRuby occasionally to check on their progress. I also know the IronRuby team realizes they could be more open and probably wants to be more open. And perhaps if they read this article they'll start to realize there's a lot more work involved in reaching the Rails singularity than running specs and having a working String implementation. There's pain involved, and they've not yet begun to feel it.

MacRuby

Ruby 1.9 fixes some of MatzRuby's issues, but not all of them. Though it brings a much-faster bytecode VM, improved method-dispatch cost, and a number of other execution-related performance tweaks, it does not solve problems with Ruby's memory model and garbage collector. And partially for this reason Apple's Laurent Sansonetti has been working on MacRuby, a forked rework of Ruby 1.9 targeting the Objective C runtime.

Laurent is famous for his past work on RubyCocoa, bindings to allow Ruby to deliver beautiful, top-notch UIs on OS X. I don't know how long he's been working on MacRuby, since some of its life was spent in secret, but it's been open-source for a few months now. Currently Laurent is working on converting the core classes from C implementations over to using ObjC equivalents. Part of the goal of MacRuby is to provide a Ruby that can interact directly with ObjC: Ruby objects are ObjC objects and vice versa; Ruby can call methods on ObjC objects and vice versa. So in this sense, MacRuby is perhaps more similar to JRuby or IronRuby than to Rubinius or the MatzRuby lineage. It is an implementation of Ruby for a general-purpose runtime.

MacRuby promises one thing for certain: tight integration with ObjC and by extension with much of OS X. Because ObjC is the language of choice for development on OS X, MacRuby users will certainly have the cleanest, tightest integration with OS libraries and primitives, far better than any other implementation can provide.

There's also a chance that ObjC's core classes (which essentially are repurposed as MacRuby's core classes) will have better performance characteristics than the hand-written C impls in MatzRuby. Because MacRuby is based on Ruby 1.9, it shares Ruby 1.9's bytecode-based execution engine. This means that most performance gains will come from better core class implementations. Already Laurent has been tweeting some impressive microbenchmark numbers showing e.g. Array performance can be substantially better than Ruby 1.9. And there are performance gains to be had calling ObjC code, of course, since dispatch is now essentially ObjC dispatch directly rather than passing through Ruby's own dispatch logic.

MacRuby may also eventually represent a better way to run Rails on OS X. Because of YARV, it should have good performance. Because of ObjC, it should have a good memory model. And because it's "MacRuby" it should fit well into the rest of the system, likely leading to a simpler, better-integrated deployment story for Rails.

The biggest peril for MacRuby is pretty obvious: Why Ruby 1.9? Ruby 1.9 is still under active development, and APIs are being added and tweaked as we speak. Laurent's fork is going to get further and further away, even if he's able to keep portions up-to-date; that's going to make compatibility a serious challenge. Choosing Ruby 1.9 certainly makes sense from a performance perspective, but many Ruby apps out there don't run on it, and most developers aren't targeting it because it's still a moving target. Even on JRuby, where we've got prototype implementations of both YARV's and Rubinius's bytecode engines, we've opted not to hit 1.9 features hard yet. Ruby 1.9 is in progress, and that will mean a lot more effort required to keep MacRuby up to date and to make it an attractive option.

There's also a chance that Ruby implemented on top of Objective C isn't going to perform that much better, on the whole, than Ruby 1.9's all-C approach. While some of Laurent's benchmarks have been impressively faster than Ruby 1.9, many others have been equally slower. And all benchmarks I've run, which are a little less "micro", have been slower on MacRuby than either Ruby 1.9 or MatzRuby. That's not very promising.

Regardless of the peril, MacRuby seems like a great idea. Objective C is a solid runtime, and the fact that it's so heavily used on OS X means MacRuby will have an excellent integration story. MacRuby may not add much value in the runtime/performance/execution department, but will potentially teach us all lessons about integrating with a general-purpose runtime. And of course MacRuby may eventually be shipped with OS X, making Ruby a first-class language for writing Mac apps. That alone is probably worth it.

Fading Implementations

It's worth spending a few words on the "no longer viable" implementations here as well.

XRuby, product of Xue Yong Zhi and a few others, was the first Ruby implementation to have a full JVM bytecode compiler. Performance early on looked very good, but the lack of a compatible set of core classes and resource limitations caused its development to lag terribly. The most recent release was 0.3.3 on March 24, and since then there have been only two commits. Xue has admitted he has no time to work on the project, and without a heavy, long-term time investment from someone XRuby is likely to fade away.

Ruby.NET is a similar story. Started from a research grant from Microsoft at Queensland University of Technology, Ruby.NET is the product of John Gough and Wayne Kelly. The project was a proof-of-concept for Ruby on CLR, to show it could be done and work through some of the early challenges in getting there. It was officially made into an open source project last year, and for a while there were many interested contributors. But although the Ruby.NET parser lives on in IronRuby, the project has largely ground to a halt since Wayne officially threw his support behind Microsoft's project. There have been two commits in the past month, and the last two really active mailing list threads were titled "The future of Ruby.NET" and "Has IronRuby killed off Ruby.NET".

And Cardinal is still pining for the fjords.

New Contenders

I don't know much about these projects, but I'm sure they'll all bring their own flavor to the Ruby soup.

HotRuby is an implementation of Ruby in JavaScript. It implements Ruby 1.9's bytecode engine and all core classes using JavaScript types. Performance looks good on their site, but they can't run anything yet and haven't even begun to discuss compatibility. It may show promise in time, but it's an interesting toy for now.

MagLev from GemStone appears to be something Ruby-related. Not much (anything?) has been publicly said about it. It's mysterious. It has a nice viral front page. Rails may be involved.

IronMonkey is an effort to port IronPython and IronRuby to the Tamarin VM Adobe donated to the Mozilla project. It's being led by Seo Sanghyeon, though from what I hear he hasn't had a lot of time to work on it. Python and Ruby in the browser, cross-platform, without vendor lock-in. Could be interesting.

Final Thoughts

Have you started working on your Ruby implementation yet? All the cool kids are doing it. It's remarkable how many implementations of Ruby are in the works right now. It remains to be seen whether the ecosystem can support such diversity in the long term, but at the very least they're introducing splendid variation. And there's a lot more to do with Ruby in terms of performance, scaling, and "getting things done". Ruby's future is looking bright, in no small part due to the many implementations. How's your favorite language looking?

Update: Vladimir Sizikov has a nice short article on the value of the RubySpecs project, and since it's so important for the future of these alternative implementations I thought it deserved a mention. He also includes links to his RubySpecs quickstart guide and the current RubySpecs overview page. If you haven't contributed to the specs yet, you should feel guilty.

38 comments:

apeiros said...

Excellent article - as always. One of the big gains in having you in the ruby community is the ever interesting articles you write. Kudos.
What I think got a bit lost in your focus on the implementations is the side effects of having many ruby implementations. The two that come to my mind are firstly rubyspec, which in my opinion will have a tremendously positive impact on ruby in the long term. Secondly the design meetings. This one is a double-edged sword. If done right it can and will have a huge positive impact on ruby. But I see it having potential to cause damage too. That potential lies in possible disagreements and the inability to resolve those. Since that depends entirely on the participants of the design meetings it's pretty difficult to predict - and entirely impossible if you don't know the participants, for which reason I won't make any prediction :)
Another interesting topic might be forks, like e.g. imperators sapphire.

Anonymous said...

Cardinal, and anything that has to do with Parrot, should never be mentioned again. The Perl community is made up of vaporware and code obfuscation, Parrot is getting old and it still has no serious programming language fully implemented for it.

joachimm said...

Thanks for an excellent article, I thought I should add my own observations.

I agree that Evan's talk about "ruby in ruby" is quite tiresome when the ruby/c split is 50/50 and, as you say, more and more of the performance-intensive parts seem to be moving to c/c++. That said, I think that the "ruby in ruby" promise has always been about the future, rather than the present, even though Evan has (intentionally?) glossed over that fact. Furthermore, the promise was about the core being written not in Ruby, but in a subset of Ruby.

When looking into the source of the c++ branch, I see very little that makes me think that it would be impossible to write a mini-ruby interpreter that outputs c++, with source that looks like what is currently in the c++ branch. Whether the hypothetical ruby written in such a language would look like idiomatic ruby code, I have no idea.

The fact that more and more functionality gets moved from Ruby to c/c++ for performance reasons is probably unavoidable. This doesn't stop anyone from writing these parts in mini-ruby as well. Again looking into the future, the "small" core in a Ruby-like language turns out to be a "large" one. That is regrettable but hardly a disaster. My only worry is that this subset of Ruby will continue to be "the future" indefinitely.

Now even if that turns out to be true, we will still get another "sane" implementation of Ruby. Which can only be good.

Bob Aman said...

Lots of good points, but regarding MacRuby, frankly, Rails is a non-issue, and by extension, so is Ruby 1.9. The sweet-spot for MacRuby is likely to be GUI desktop applications rather than web apps. Given that OS X deployment involves bundling all the libraries you use in the app within the app itself, Ruby 1.9 compatibility won't be a major problem, at least for smaller libraries anyways. Rails, on the other hand, is a long way from 1.9 compatibility. If anything, the choice to target 1.9 will be an incentive for library writers to actually produce code that's 1.9 compatible. The MacRuby project definitely made the right decision.

chromatic said...

I think the Parrot hackers would be happy to help anyone interested in improving Cardinal. There are no technical barriers to doing so -- only time and resources.

Kjetil Ødegaard said...

Great to see you posting again, Charles. Interesting stuff as always.

Ruby core doesn't seem to have much focus on fleshing out Ruby specs... or is that going to change now?

Rubinius has been hyped heavily but so far delivered very little.

Charles Oliver Nutter said...

apeiros: I didn't mean to exclude or lessen the impact of the Ruby spec project, but I didn't want the article to turn into a spec love-fest. It's mentioned several times, and I agree it's very important for Ruby's future.

anonymous: Parrot to me is one of those projects you still hope to see some day, like Duke Nukem Forever or Ruby 2.0. Some day!

joachimm: The "ruby in ruby" statement about Rubinius has certainly been about the future, I don't disagree there. But Rubinius is sold at every conference, in every interview, and in every article as being "ruby in ruby" now, which is certainly not true. I would not have a problem allowing them some slack if they didn't use "ruby in ruby" as a club to beat on other implementations, as if they deserved some elevated status because of that goal. They could promise to eventually make Rubinius out of pure gold, but as long as it's half lead they don't get to claim such a title.

bob aman: Perhaps that's the case for MacRuby developers, but just about any Rubyist interested in MacRuby is going to try running Rails on it. If it doesn't work, you're right, it probably won't matter. And I agree that MacRuby is likely to find its sweet spot in the GUI development arena. But you can't know *what* Ruby developers might want to use MacRuby for ahead of time, and for folks coming from Ruby 1.8 the choice of Ruby 1.9 is still an odd one. And I'm not entirely sure creating 1.9-compatible libraries is even a good idea right now...since 1.9 isn't entirely 1.9-compatible from day to day. What is Ruby 1.9? Do we have a spec? A test suite? Anything? If the answer is no, it's not ready.

chromatic: There's a third non-technical barrier: nobody seems to care if Cardinal exists or not. That's hard to overcome.

Kjetil Ødegaard: At the Ruby Design Meeting last week I think everyone agreed in principle that Ruby core will need to start looking at using and contributing to the specs, including several Ruby core members themselves. I think we're on the right path now, but it won't happen overnight.

Vladimir Sizikov said...

Apeiros, I fully agree. The RubySpecs is an incredibly important piece of the puzzle.

Inspired by this blog entry,
I just posted my thoughts on the subject of compatibility/RubySpecs:
The value of RubySpecs

wuputah said...

I'm not sure what their goals are, but Phusion, who released Passenger aka mod_rails, is also working on Ruby Enterprise Edition.

steven said...

I don't know to what extent threads are a unique case but I found that in developing threaded programs, making things run on multiple implementations was probably the single best way to flush out timing issues. Feels a little weird to say, but I hope we don't lose that anytime too soon.

Michael Koziarski said...

@Bob Aman:

> Rails, on the other hand, is a long way from 1.9 compatibility.

As far as we're aware there's one remaining issue outstanding for 1.9 compatibility with edge rails, a bug in 1.9. There are a few issues which will cause some gotchas, specifically String#chars has different semantics. Basically everything else has mostly been addressed to our knowledge.

Daniel Berger said...

You and the other implementors keep talking about the specs and compatibility, Charles, but you've never stopped to ask yourself one fundamental question.

What if the spec itself is flawed?

MRI is a straightjacket, Charles. Evolve or die.

Charles Oliver Nutter said...

wuputah: My understanding of Phusion's "Ruby Enterprise Edition" is just Ruby 1.8 with some GC improvements. I could certainly be wrong about that, but either way I don't think it qualifies as a separate Ruby implementation.

steven: Threads are definitely one area where implementation details can show through, and testing previously "clean" threaded apps on JRuby often shows they're not as threadsafe as people believe.

daniel berger: I love you man. Yes, MRI is a limiting definition of "Ruby", but it runs the apps. Anyone who wants to run the apps needs to meet a minimum level of compatibility. The choice here is between making a "better Ruby" and a "better Ruby platform". Once we have the latter, we'll be in a position to help drive the former.

izidor said...

About "better Ruby" - this is the largest danger ahead. Having multiple incompatible forks, due to different understanding of "better".

My view of "better" is that you should be compatible with Matz, however hard that may be for you Java and Microsoft and other folks.

You can have internal mechanics of your own, but leave the language alone, please.

His vision of the language as "friendly" may not be the easiest to implement, but don't change the language so you can have faster implementation. Go code in Java instead.

I write programs in Ruby because of the Ruby language (and Rails :-), not because of compile/execution speed (ditto for Rails). And since Ruby (and Rails) is already very popular despite these "huge failings" you describe, I would say changing the language is not so very urgent as some people would want everybody to think.

Fuzzyman said...

It is at best inaccurate to say that IronRuby is "the first Microsoft project under the Microsoft Permissive License".

IronPython (and possibly other projects) were licensed under the 'Microsoft Permissive License', which was renamed to the Microsoft Public License, long before IronRuby. (long being relative of course...)

sambo99 said...

Charlie,

You write fascinating articles!

One thing that I think is worth mentioning here about rubinius is the new trend for really open projects. It's not enough just to have your source open.

The whole github + lighthouse + downloadable open irc logs and a high quality spec suite lowers the bar for contributions. I like this trend, it's good.

Erlang's actor pattern in ruby can really change the way we think about scalability. That's good.

And the dream of Ruby in Ruby, well, that is really good, and I hope it comes good one day.

I keep on thinking that there must be a way to get the runtime to watch over what's going on and have it inline chunks that have not been performing well. Lots of small methods seems like a best practice when it comes to code maintainability; I think our VMs just need to get smarter when it comes to invocation. But this is hellishly hard on a dynamic runtime.

Cheers
Sam

Joachim (München) said...

> It's also unclear if people really *want* all of what Ruby 1.9 has to offer.

For me, just _one_ offering is enough to make me move to 1.9: To pass human-readable data through YAML_load / modify / YAML_dump cycles, I need hash insertion order preservation.

Not a spectacular new feature - yet for my little application a decisive advantage of Ruby over other languages.
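The load / modify / dump cycle described here can be sketched roughly as follows. This is only an illustration, assuming a Ruby where Hash preserves insertion order (1.9 or later) and the standard `yaml` library; the keys and values are made up for the example.

```ruby
require "yaml"

# Build a hash whose keys are deliberately not in alphabetical order.
settings = {}
settings["zebra"] = 1
settings["apple"] = 2
settings["mango"] = 3

# Dump to human-readable YAML, load it back, modify it, and dump again.
dumped   = YAML.dump(settings)
reloaded = YAML.load(dumped)
reloaded["kiwi"] = 4

# With insertion-ordered hashes the keys come back in the order they were
# written, so the re-dumped file keeps its layout for a human reader.
puts YAML.dump(reloaded)
```

On an implementation without ordered hashes, the keys could come out in an arbitrary order after each cycle, which is what makes hand-edited files hard to keep readable.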

jherber said...

excellent state of the union charles! bravo!

great to hear after all your incredible work that jruby still has quite a few more tricks up the sleeve. do you think dynamic invocation work in open jdk will have a big impact on performance? do you think the move to java.lang.Object as ruby base will also affect memory footprint?

Charles Oliver Nutter said...

fuzzyman: Ah, thank you, noted. I think I must have assumed it was the first since it was the most visible in my realm.

sambo99: For what it's worth, JRuby has had that same level of openness for at least the 3.5 years I've been involved. And we've used existing community-driven test suites. The creation of the rubyspec is a great consolidation of testing, but it's certainly not the first such effort.

We would also like to move toward more Ruby code, but have opted for now to avoid the performance hit we'd incur doing so. Ruby can be made fast, certainly, and perhaps fast enough to build an entire runtime with, but the techniques and strategies necessary might hold back practical uses of JRuby. So we're pragmatic...JRuby works, and it runs fast. If it's not written in Ruby today, maybe it will be later.

jherber: The move to Object should reduce memory consumption, since core types will have fewer wrapping layers and Java integration will not require wrappers at all. And the dynamic invocation work for JDK7 will most certainly help: it will allow us to delete a lot of code and reduce substantially what code we generate at runtime, for a leaner Ruby; it will also mean we can leverage the JVM's much-better optimizations, rather than wiring our own. The future looks bright :)

Fuzzyman said...

Hi Charles (good article by the way - a very interesting read).

You're quite rude about the IronRuby development process. I must say that from being on the mailing list your description certainly doesn't *seem* right.

I think John might have some things to say about the technical points you raise - but the DLR has supported multiple engines for quite a while now.

Charles Oliver Nutter said...

fuzzyman: I don't believe I was rude at all, or even incorrect. As far as I can tell from monitoring the IronRuby list, the IronRuby team is having no design or dev discussions in the open. You can't run an OSS project that way. And there are plenty of historical examples of how damaging a private "trunk" repository can be to an OSS project. Tossing SVN bundles over the fence every so often does not foster a daily give-and-take with the community.

Ask yourself...doesn't it bother you that you can't update your working copy to the same sources the IronRuby team sees in the morning? Doesn't it bother you that they're having all their substantive conversations behind closed doors, leaving you to guess at progress?

These problems can certainly be remedied, but pretending they don't exist won't help anyone...and will damage IronRuby's long-term OSS prospects.

Jayme Edwards said...

Microsoft's support for ruby - what's the strategy?

Evan Phoenix said...

I've posted a short retort concerning Rubinius at http://blog.fallingsnow.net/2008/04/28/rubinius-retort/.

Anonymous said...

I understand the issues (somewhat), but I think the biggest advantage with rubinius really is that it is a community project. Yes, it may be that Evan writes most of it anyway ;) but the key really was that basically if you contribute, you do have the chance to influence.

And while I personally think that the rubinius project has been lagging a tiny bit the last few months, I do believe that there is no REAL argument why this should not work. Just look at smalltalk and how much squeak is touted and loved (even though smalltalk probably no longer has the many coders that python and ruby have these days)

Charles Oliver Nutter said...

Joachim: Is hash ordering really the main feature in 1.9 you're interested in? JRuby already has insertion-ordered hashes in 1.1, so that shouldn't prevent you from giving it a try.

Eloy Duran said...

Excellent article Charles, thanks for your time!
I just wanted to chime in on the MacRuby part.

At least one good reason imo to use 1.9 is because of the native threads, which has been a long standing problem in RubyCocoa, it can be worked around but better is.... well better :)

And also, like said before, MacRuby is about being able to use Cocoa, not so much about existing Ruby code like Rails.
RubyCocoa is the same in this aspect, except it's for 1.8.

Eloy

Laurent Sansonetti said...

Charles, thanks for the kind words and the nice summary of MacRuby. I do agree with your point of view regarding MacRuby.

However, as others suggested, please note that running Rails is not a top priority at the moment. We want to focus on delivering a high quality solution to develop efficient Mac OS X applications, firstly.

As you also expected, I do not think that MacRuby _in its current shape_ will be faster than 1.9 generally speaking. Nevertheless, we also have a new GC which performs faster collections in a separate thread, and even if pure object allocation micro benchmarks turn out to be slower, this will probably change the typical behavior of a Rails app regarding memory usage (once 1.9 can run Rails of course).

And we have lots of ideas for the project, including the rewrite of more critical parts of the 1.9 source base. This is just the beginning :)

Bob Aman said...

@Michael Koziarski


> As far as we're aware there's one remaining issue outstanding for 1.9 compatibility with edge rails, a bug in 1.9.


Wasn't really the crux of my argument, but if that's the case, then my bad, obviously a lot of progress has been made since the last time I took a look. That said, in my experience, it still takes awhile for bugs of any significance to get fixed in Ruby itself. So while it might not be Rails's fault, it may very well still end up being quite awhile before Rails can boast of Ruby 1.9 compatibility. Not all that familiar with the details of the bug in question though.

@Charles Oliver Nutter


> Perhaps that's the case for MacRuby developers, but just about any Rubyist interested in MacRuby is going to try running Rails on it.


Maybe I'm misunderstanding your statement, not sure, but I think this is really far off the mark. If my purpose for using MacRuby is to write a GUI, I'm going to test that MacRuby meets my needs by writing a simple GUI app, not by trying to run Rails on top of it. Besides that, if what Michael said above is true, then Rails will most likely have achieved 1.9 compatibility long before MacRuby is done.


> And I'm not entirely sure creating 1.9-compatible libraries is even a good idea right now... since 1.9 isn't entirely 1.9-compatible from day to day. What is Ruby 1.9? Do we have a spec? A test suite? Anything? If the answer is no, it's not ready.


Did Ruby 1.8 have a spec to start off with? Until Rubinius and friends came along, no. Don't get me wrong, a spec is a good thing. But the presence of a spec is not a prerequisite for usefulness. I'm not advocating that people try to maintain two versions of a library, one for 1.9 and one for 1.8. That would be insane. I would, however, like to see people making use of simple capabilities checks anywhere in a library that there's a known compatibility issue. For example, if you call [] on a string, you should be checking the return value and making sure you got what you expected. With careful short-circuiting in the checks, there shouldn't be any measurable performance penalty for doing so, and the gains are significant.
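A minimal sketch of the kind of capability check described above, using the `String#[]` case: on Ruby 1.8 `"abc"[0]` returns an Integer character code, while on 1.9 it returns a one-character String. The helper name `first_char` is hypothetical and not part of any library.

```ruby
# Hypothetical helper: return the first character of a string as a
# one-character String, whichever form String#[] hands back.
def first_char(str)
  c = str[0]
  return nil if c.nil? # empty string: nothing to return

  # Ruby 1.8 yields an Integer character code here; 1.9 yields a String.
  # The check short-circuits on 1.9, so the extra cost is one type test.
  c.is_a?(Integer) ? c.chr : c
end

puts first_char("ruby")
```

The check looks at what came back rather than comparing version strings, so the same code should keep working on an implementation that follows either behavior.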

Charles Oliver Nutter said...

eloy: Part of the problem with that is the fact that Ruby 1.9 doesn't allow its native threads to run concurrently. Perhaps it's a goal for MacRuby to eliminate the giant lock?

laurent: That certainly seems reasonable. I'm looking forward to trying MacRuby for OS X app development too, so hurry up :) And I suppose the 1.9/Rails thing will be solved by the time you have a release. I just hope it won't be too difficult for you to keep aligned with "real" 1.9's growing features and bug fixes.

bob aman: I think my concern with 1.9 specs is more that it's changing too fast. Sure, Ruby 1.8 has had a set of specs written after-the-fact, but it's also mostly frozen (ignoring 1.8.7 for the moment). 1.9 is still under active development. And my point about Rails on MacRuby was just that anyone who's interested in Rails and interested in MacRuby will try to bring the two together. You're right, anyone interested in MacRuby but not Rails on MacRuby will probably not try it :)

Dan Moore said...

Is there an equivalent to MS's DLR for the JVM? If not, could parts of JRuby's implementation be abstracted out to make one?

Charles Oliver Nutter said...

dan moore: There's no one library that serves that purpose, but there are dozens of libraries written to fill some or all of that purpose. Some are written during the course of a language's implementation and never repurposed. Some are built with the express intent of building several languages on top of them. So although it would be nice to have a "one library to rule them all", there have been DLR equivalents of varying completeness on the JVM for years.

I'd love to abstract out as much as possible from JRuby to help add another tool to that toolbox.

Phil said...

> People are not flocking to Ruby 1.9 in droves

Perhaps that has less to do with "we're not interested" and more to do with--oh, I don't know--maybe the fact that they are *actively discouraged* from using it in production? =)

David Majda said...

One addition to the "New Contenders" section: The Ruby.PHP project, which is a Ruby to PHP compiler that I am writing as my Master's thesis.

As I write in the FAQ at the project website, it is in the early stage of development (no "inflection point" in the near future :-), however the compiler is already able to translate simple Ruby scripts into (somewhat ugly) PHP and run them.

There is an online demo of the compiler on the web.

Eloy Duran said...

@charles: I'm sorry I should have been clearer. I was talking about *yet another* problem with ruby's green threads :)

Namely that communication between ruby's green threads and POSIX threads is hard to do right, and so in 1.8, threads are almost useless in combination with UI stuff. Luckily for us Cocoa provides lots of nice APIs that can work around this.

For more info see: http://trac.macosforge.org/projects/ruby/wiki/WhyMacRuby

Cheers,
Eloy

chromatic said...

For what it's worth, Stephen Weeks has just resumed work on Cardinal.

LacKac said...

Nice overview, thanks Charles. MacRuby surprised me, I missed the news about it.

Roger Pack said...

The assertion that "you can't scale Ruby 1.9 any better on wide systems than you could with Ruby 1.8" is hopefully going to be overcome by the use of asynchronous database drivers [1] [2] plus fibers. We can only hope :)
[1]http://oldmoe.blogspot.com/2008/07/faster-io-for-ruby-with-postgres.html
[2]http://github.com/tqbf/asymy/tree/master
[3] rails works on 1.9 http://blog.codefront.net/2008/05/25/living-on-the-edge-of-rails-22-pre-railsconf-2008-edition/#comment-664800

I'd also say that after the release of 1.9.1 people will start to adopt it more and it will become the new de facto standard.

Roger Pack said...

regarding rails and ruby 1.9, I'm not sure but I believe it to be compatible now.
http://blog.codefront.net/2008/05/25/living-on-the-edge-of-rails-22-pre-railsconf-2008-edition/