Headius

Saturday, June 23, 2007

Camping at O'Reilly

Rumors of my demise were greatly exaggerated!

I've been mostly MIA the past couple weeks, largely due to the Ruby Kaigi and related events in Tokyo and a short family trip this past week to Lake Michigan's eastern shore (warm sandy beaches and a much-welcomed rest). But I figured I'd check in so everyone knows I'm still around and the wheels are still turning.

I'm at Foo Camp this weekend, and so far it's been a great time. Lots of good discussions, games of Werewolf, and staying up too late (I think I got about 3.5 hrs of sleep...more than enough!) Today I'm going to try to talk about something outside my usual subjects, mostly as a discussion facilitator, but with a bit of opinion tossed in for good measure. Here's the description I posted to the Foo calendar:

The Transformation Age

This is a collective presentation. There are no attendees; only nodes in the matrix. Expect to share.

Part 1: Individuals to collectives, isolated to connected, The Next Evolution.

As the exchange of information has become progressively more rapid, the communication boundaries more transparent, we are transforming into a new kind of collective entity. The swing from isolated individuals or isolated communities with their own distinct stores of knowledge toward a single global community with all stores of knowledge available to everyone is more than just an interesting trend...it represents a new evolutionary direction humanity is taking. It is the dawn of the transformation age for humankind.

We will explore what it means to be more and more a collective being in this new era, what it will mean for our children and grandchildren and nth grandchildren as pervasive interconnection becomes the norm. As all minds become so intimately connected that being disconnected from the global consciousness is like losing an arm...or one's own identity. And we'll look at how this new evolution is affecting and has affected successful and unsuccessful technologies, business models, and governments over the past several decades.

Part 2 (time permitting): Knowledge represented as transformation of information rather than as desperate attempts to snapshot or categorize information at a given point in time.

The idea of categorizing information has served us fairly well when information itself was slow to change, slow to be communicated, and largely static. Books can be sorted in a categorization scheme because the information they contain does not change over time; the categories remain as valid as when they were assigned. But what happens when the entirety of humanity's knowledge is now not only instantly available, but rapidly changing and evolving along with us? Is not categorization of changing information inherently flawed when information snapshots are almost immediately out of date? What can we do to allow everyone in this age of transformation to participate in information sharing in scalable way?

I would propose that when information itself has become so fluid it defies static categorization, that the transformation of knowledge is the new information we need to track. Already you see the signs of this:

Wikis are updated far more often than they have completely new entries added; the success of wiki is in transformation of knowledge over time, and in continuous evolution of the information contained therein.
Scientists in all fields build new ideas upon old; there are no new ideas that don't synthesize existing parts, transforming our understanding of those parts into a new entity.
Agile development emphasizes small, rapid, interactive transformations of data (a software algorithm + user interfaces on the large) rather than as big bang snapshots and releases.
Open source licenses reduce the barriers to accessing data effectively to zero explicitly to allow for rapid transformation of that data into today's entity. "Build from trunk" is becoming a normal recommendation for rapidly transforming projects, since snapshots are immediately out of date.

We'll talk about whether this all makes sense, whether moment-in-time snapshots of information are becoming more or less relevant than the changes over time to that information, and what can be done to both facilitate transformation and adapt our traditional information management ideals to this constantly changing information sea.

If you're interested in hearing more about all this, about how the session goes, or just would like to talk sometime along these lines, gimme a shout.

Sunday, June 10, 2007

JRuby 1.0 Released!

We have finally released JRuby 1.0, based on the last release candidate, RC3. And what more is there to say? Not really a whole lot...It's almost entirely RC3, with one or two minor fixes added in. But it's really turned out to be an outstanding release, and already reports are coming in of folks trying it out en masse. We're very happy.

So I'll do a little recap here. JRuby 1.0 was focused almost entirely on one goal: Ruby 1.8.x compatibility. To that end, we are now the only alternative Ruby implementation that can reasonably claim we're "compatible". It's no longer a question of whether we can run Ruby applications or not...we've proven that again and again. The issues people run into now are those requiring minor behavioral tweaks, minor parser tweaks, and occasionally exploration of some peculiar threading or memory concerns. It's been a long time coming, but the compatibility issue is largely answered.

Now we start looking toward the future. Once you have a working Ruby implementation, what next? I believe we've shown that the correct way to approach Ruby is to get it right first. The next step is making it run as well as possible. Ok, so we've cheated a bit along the way, introducing interpreter optimizations and a JIT compiler, but that's all fun stuff. The heavy lifting for performance and scalability are coming up, and there's a lot of low-hanging fruit we can start to tackle with 1.0 safely behind us. So action item #1 for the future of JRuby is plucking that low-hanging performance fruit.

The second item for any working implementation would have to be platform integration. Our platform is Java (and of course we've cheated a bit here by doing additional work to make Java integration really useful and usable), so integration involves making Ruby a better citizen of the Java platform. In this area, expect to see us further reduce the disconnect between Ruby and Java, allowing you to construct Ruby objects directly from Java code, define real Java classes using Ruby (classes you can then compile Java code to call), and minimizing to as large an extent possible the performance impact of calling from dynamically typed Ruby code into statically typed Java code. Java integration is action item #2.

And then what? Well, there are many options, all very attractive. I personally would like to see time spent implementing potential Ruby 1.9/2.0 features, to provide a second testbed where people can try those features out. I would like to find ways to share code with Rubinius (beyond tests), either by implementing just enough Rubinius logic to run its pure-Ruby core-class implementations or by implementing a full Rubinius kernel on the JVM. I would like to see JRuby expand to the rest of the Java world, bringing the Ruby Way to Java EE. But most importantly, I want to reach out to the Ruby community and find ways for them to be a part of the JRuby process.

Up to now, JRuby has really existed in its own, separate world. There was the Ruby community and also the JRuby community, and although members might identify themselves with both, there was always this perceived boundary.

We need to break that boundary down...not only for JRuby but for the other Ruby implementations as well.

Ruby is coming of age. Multiple implementations shows that Ruby has really matured as a language...and also shows it has a lot of maturing left to do. Ask Evan (of Rubinius) or me how we feel about retry behavior or block argument processing or thread event processing or SAFE levels and tainting and you'll start to understand some of the ugly, hidden bits of Ruby. We need to make sure that Ruby development proceeds as a whole...not necessarily as a single project or a single codebase, but as an open, direct exchange of concerns, complaints, solutions and ideas. We need to start treating the Rubinius community and the JRuby community and any other implementations' communities as part of the whole...different facets of the same gem.

If you have never tried an alternative implementation of Ruby: do it today. Pick out your favorite app, library, or framework and build your next app using something other than Matz's Ruby. Start running your continuous integration tests against JRuby trunk (or Rubinius trunk, if it runs your code ok; those guys sorely need more CI hits). Make sure your gem releases work on JRuby too, and if you have native (i.e. C) code, explore what it would take to do a JRuby port. Show that you welcome diversity into the Ruby world...that you recognize that diversity is an essential part of language evolution.

JRuby has always been a community project, and only by absorbing the JRuby community into the Ruby community will JRuby continue to be successful. If we can make that happen, I see wonderful things in Ruby's future...regardless of which implementation you use.

Monday, June 04, 2007

A Response to Ola's IronRuby Post

I'm not up for creative titles tonight. Hopefully you'll see this in your feed reader and click for a second opinion. Granted, I agree with Ola's IronRuby post on most points, but I disagree on a few key items. So let's dive in, shall we?

You managed to blog this viewpoint before me, Ola, but you know I agree with almost everything. However, to pour a bit more oil on the fire, I'll take it a step further.

Having run the Ruby gauntlet and brought JRuby from not running anything to running Rails almost 100% perfectly (in just over a year, I might add), I will confidently say there's no way with current specs and tests that anyone could create an implementation of Ruby from scratch that will run Rails unless they can look at the existing implementations. I simply do not believe it's possible.

And I'll throw another couple curve balls too:

I don't believe tests and specs will be where they need to be within the next 6-12 months unless there's a major effort put behind them. Even if that effort happens, I still don't think we'll ever have specs enough for someone to implement Ruby "in the dark" for at least a year, if it ever happens.
IronPython did a great job getting pretty close to 100% compatible. But Jim Hugunin had implemented an almost 100% Python-compatible implementation in Jython before going to Microsoft; he didn't need to look at the Python source. I don't believe John Lam has the same level of experience with Ruby, so he's at a severe disadvantage (and to John: I really feel for you, man...this has got to be difficult).
This is a good friend's belief, but he's won me over: we don't believe Microsoft would ever willingly allow IronRuby to get to the point of running Rails, since that would directly compete with their ASP.NET server, software, and tool offerings. What would be the benefit to them of a free runtime running a free language implementation that runs a free web framework? Probably zero. And as Martin Fowler and others have blogged about NUnit, Microsoft hasn't exactly been lovey-dovey with OSS projects that impact (or are perceived to impact) their bottom line.
Given these facts and the current situation, I'd say it's a better bet for us as a community to get behind the Gardens Point Ruby.NET Compiler project, which is already much farther along than IronRuby...plus it's real open source (you can contribute) and they can look at Ruby's source (and have admitted to doing so for at least the parser). I was wary/skeptical of Ruby.NET last year, but they now seem like the current best hope for Ruby on the CLR. That link again is Gardens Point Ruby.NET Compiler. And to the QUT guys working on Ruby.NET, you need to do three things right now to save your project from historical obscurity: set up a public source repository, accept a few external contributors, and start blogging and emailing up a freaking storm.

That about sums up what I have to say about IronRuby, Microsoft, and the future of Ruby on the CLR. The rest of Ola's points I agree with...diversity is important, a spec and test kit are needed immediately, and Microsoft needs to change their OSS attitude pretty quick or risk becoming irrelevant.

Sunday, June 03, 2007

JRuby 1.0.0RC3 Released - And This Is It!

Tom posted the announcements already, but JRuby 1.0.0RC3 is out in the wild! This release is our most important yet, because we intend for this release to become JRuby 1.0. The only things that will change from now until a 1.0 final release later this week would be any showstopping bugs that are extremely low-impact to fix. In general, RC3 should be nearly identical to 1.0 final.

So what does this mean for JRuby? We've taken the approach of calling the JRuby 1.0 release the "Ruby compatible" release. Basically, all known application bugs caused by JRuby incompatibilities with Matz's Ruby (MRI) have been resolved. This doesn't mean there aren't more compatibility bugs; there probably always will be, and we have probably 50-80 in the bug repository right now (and another 80 bugs or enhancements that are JRuby-specific). But it does mean that most applications should "just work" out of the box, and any application failures we knew of have been resolved. In general, the remaining post 1.0 bugs are edge cases and minor correctness issues, like obscure forms of core methods or missing error conditions.

We've tried very hard to make this both a solid release and a stepping stone to JRuby's future. With this release, we can confidently say that we're Ruby 1.8.5 compatible. Any additional fixes we need to make (like the known issues) are in the last 1% of Ruby features we can potentially support. We also recognize that this is a big moment for JRuby. People can start counting on JRuby to run their Ruby applications correctly. Of course many in the JRuby community have already been doing this for many months, but the 1.0 moniker says to the world we feel like we're ready for prime time.

So what comes after 1.0? That breaks down into a few areas:

Performance. JRuby's performance has come a long way in the past year. We've increased speed by at least an order of magnitude, and have enabled the JIT compiler for the 1.0 release. This means that for many cases where JRuby can compile Ruby code, it will perform faster than Ruby 1.8.5. The general case is still a little muddy, though. We've had anecdotal reports that JRuby on Rails in a Java application server performance extremely well, and other reports that general Ruby applications perform somewhat poorly. The general performance situation is not well-understood, but we all agree there's a lot more work to be done. And the good news is that we have a big list of optimizations remaining to continue improving JRuby's specific and general performance.
Java Integration. JRuby does an excellent job of fitting into the Java platform. In almost all cases, you can call libraries, implement interfaces, and extend classes with no difficulty. But there are edge cases--usually nonstandard or antipattern Java--that JRuby doesn't behave as nicely. The code to enable calling Java is also far more complicated than we'd like. To solve both issues, we'll be taking the existing Java integration syntax and API and backing it with a redesigned library. Expect to see something of this work in a 1.1 release later this year. Our goal is to achieve excellent integration with Java, on par with the most tightly-integrated JVM languages available today. And we're not far off.
Ruby 2.0 and Rubinius. We intend to start supporting Ruby 2.0 and Rubinius bytecode execution soon, though the Ruby 2.0 work is much farther along. We intend to catch up the Rubinius work as well as to try adopting some of Rubinius's pure Ruby implementations of core class functionality. We also plan to enable the use of Ruby 2.0 features through configuration and command-line switches to JRuby. Specify something like -J-Djruby.string.version=2 and all strings in the system will be treated as Ruby 2.0 strings, with full character semantics. Or optionally turn on experimental Ruby 2.0 syntax for lambdas and named parameters. Basically we want to bring JRuby up to the bleeding edge of Ruby features, to provide another platform where people can get a feel for what Ruby 2.0 might look like. More exposure for these features will help Matz and Co. decide the best way to move forward. And that's good for everyone.

And of course we'll continue fixing compatibility bugs, Java integration bugs, and expanding JRuby's reach on the Java platforms with more DSLs and API wrappers around all your favorite Java APIs.

But there's one more area I should talk about: how you can get involved.

JRuby is a community project. The core committers play traffic cops as much as developers, routing patches, examining bugs, documenting features. The only way a project like JRuby succeeds is through mass community involvement. We've been extremely lucky to have a constant flow of interested developers into our community, even when OSS community attrition has been very high. The community may look completely different from month to month, but the flow of patches and bugs has steadily increased.

For folks just getting started, I've written a few articles on the JRuby wiki that should help you understand the development process. They're linked from the front page under "Getting Involved".

For easy bugs, I'd recommend looking in the JRuby JIRA for bugs with "rubinius" or bugs reported by Daniel Berger. The Rubinius bugs are almost all problems we've had running Rubinius's excellent specs, and usually represent minor incompatibilities with C Ruby. Daniel's tests are a lot of those edge cases I mentioned...he's running through his own test suite and reporting any incorrect behavior that's come up. Another option would be to just pick your favorite Ruby app and start running its test cases with JRuby. You're sure to find something interesting to report, and it may be an easy fix.

For non-Java-coders (or folks afraid of hacking on JRuby), stop by the RubySpec Wiki and update or author an article. RubySpec is an effort to build a community-driven specification of Ruby that all users and implementers can freely reference. It is linked from the RubyDoc site and is fast becoming a standard way for the community to record language and library behaviors. I believe this is the best and fastest way for us to form a complete specification of Ruby's behavior...and I believe such a specification is becoming extremely important, what with there now being 5-10 different implementations of Ruby, all guessing at what "correct" is. Have you written RubySpec today?

Well that's about it for this release report. If all goes as well as I think it will, JRuby 1.0 final should be released some time this week. Give it a try, I think you'll be pleasantly surprised!

Friday, June 01, 2007

Creating a Field-Initializing 'new' Method

One thing often touted as a missing feature in Ruby is the lack of a constructor form that initializes fields. A few other languages have this feature, including for example Groovy, another JVM dynamic language. The general idea is that if you want to construct an object and initialize a number of fields, you often want to do it in one shot. Rather than modify the class to have additional initializers for all the fields you want to set, there's another option.

Because Ruby is so cool, you can add this feature yourself to all classes at the same time.


class Class
  def new!(*args, &block)
    # make sure we have arguments
    if args && args.size > 0
      # if it's not a Hash, perform a normal "new"
      return new(*args, &block) unless Hash === args[-1]

      # grab the last arg in the list
      last_arg = args.pop

      # make sure all fields actually exist
      last_arg.each_key {|key|
        unless public_instance_methods.include?("#{key}=") do
          raise ArgumentError.new(
                           "No attr setter for name: #{key}") 
        end
      }

      # create the object and set its fields
      new_obj = new(*args, &block)
      last_arg.each {|key, value|
        new_obj.send "#{key}=", value
      }
    else
      # no args, just do a normal "new" with any block passed
      new_obj = new(&block)
    end
    new_obj
  end
end

So with such a simple piece of code, we now have a new! method on all classes that accepts a final parameter--a hash of field names and values--that can be given using Ruby's named-parameter-like syntax. Given a simple class, like the following:


class MyObject
  attr_accessor :foo
  attr_accessor :bar
  
  def initialize(msg)
    puts msg
  end
end

No additional work is needed to use our new! method:


x = MyObject.new!("yippee",
                    :foo => "hello", :bar => "goodbye") 
 => "yippee"
p [x.foo, x.bar]
 => ["hello", "goodbye"]
y = MyObject.new!("blah", :yuck => "baz")
 => error: "No attr setter for name: yuck"

The reason this works is that all classes are instances of the Class class. So the MyObject class definition above is roughly equivalent to saying:


MyObject = Class.new {
  # class def logic here
}

This means that instances of Class, like MyObject, inherit methods defined on Class, like new!. Since all classes in the system are Class objects, all classes instantly gain a new! method.

This is a perfect example of why Ruby is such a powerful language, and why it's so easy in Ruby to use the coolest metaprogramming tricks. And it's a primary reason why frameworks like Rails have been able to do such amazing things. With a language that's this powerful and this easy, you can imagine what else is possible.

Are we having fun yet?

Sunday, May 27, 2007

Adding Annotations to JRuby Using Ruby

I love how meta-programmable Ruby is.

JRuby doesn't support annotations because Ruby doesn't support annotations. So what! We can extend Ruby to add something like annotations:


class JPABean
def self.inherited(clazz)
  @@annotations = {}
end
def self.anno(annotation)
  @@last_annotation = annotation
end

def self.method_added(sym)
  @@annotations[sym] = @@last_annotation
end

def self.get_annotation(sym)
  return @@annotations[sym]
end
end

Given this, we can do things like the following:


class MyBean :sql => "SELECT * FROM stuff WHERE name = 'foo'"
def foo; end

anno :field => :hello_id
attr :hello

anno :field => :bar_in_table
attr_accessor :bar
end

And here's some output from the resulting class:


p MyBean.get_annotation(:foo)
p MyBean.get_annotation(:hello)
p MyBean.get_annotation(:bar=)

=>

{:sql=>"SELECT * FROM stuff WHERE name = 'foo'"}
{:field=>:hello_id}
{:field=>:bar_in_table}

So as you can see we really do have all the necessary requirements to annotate classes. Now what if we just had MyBean be a java.lang.Object extension and stuffed the annotations into the resulting generated class? We can already create real Java classes in this way, but with the above syntax and a little magic in the JRuby Java integration later, we've got annotation support in on Java classes too. This should enable things like Hibernate, JPA, and JUnit 4 to work with JRuby's Ruby-based classes. Or at least, I believe it to be possible. It just requires a little work to add annotation information to the resulting generated classes.

I've planted the seed here and on the JRuby dev list...let's see if it germinates.

JVM Languages: The Future

I've started a Google Group for all those interested in the future of alternative languages (i.e. not Java) on the JVM. A number of you knew this was coming, and I've already invited folks that expressed an interest in my RedMonk Unconference session at JavaOne. The rest of you are also invited.

Please plan to talk shop about language implementation strategies, pain points on the JVM, and what we can do to build a common set of libraries, frameworks, and patterns to ease and improve the Java platform's support for many languages. The time for action is now, my friends, and we have a wealth of information across all our language projects to draw from. We must come together to build a stronger base for everyone to stand upon.

I welcome you to the future.

Saturday, May 26, 2007

Foo Camp 2007

I've been invited to this year's Foo Camp. Since it sounds like a great time and there's a bunch of other folks going I'd like to talk to, I'm accepting the invitation. My purpose posting this entry is to let other campers know I'll be there and try to hook up and chat a bit beforehand. So drop me a line...firstname.lastname@sun.com.

FYI, if any other campers are coming up from Oakland, I'm renting a car. My times are listed on the "Rides Offered" page.

I'm looking forward to a fun weekend!

Tuesday, May 22, 2007

The Final Bugs

We're within weeks of a final JRuby 1.0. We've pared down the bugs that we think we can or must fix for 1.0 and this email is a summary of the ones we need help with. So whatever time you can spare, please have a look and help resolve these.

In order of decreasing priority in JIRA:

JRUBY-820: Net::HTTP.get behaves differently form MRI, failing to get UTF8 properly
-and-
JRUBY-828: UTF-8 regular expressions aren't working

I believe these are both largely the same problem, and it's probably the most visible remaining bug we need to fix. Regular expressions in JRuby currently do not work well with unicode strings, both as incoming match strings and as the regular expression strings themselves. REXML uses /u regular expressions, which is why I believe these issues are closely related. Wes Nakamura has commented that he's looking into 828, but more eyes will help ensure this is fixed.
I believe these are true blocking bugs for 1.0, and I'm not comfortable doing a release if they are not resolved.

JRUBY-971: Make it clear which Java method maps to each equality method

I think we have subtle equality bugs remaining largely because we don't use a consistent naming convention for the Java implementations of Ruby methods like ==, ===, eql? and so on. We should make a quick sweep through the system and make sure all core JRuby code is using the same Java method names for all of these, and binding them in the same way. Note: This is actually the cause of some set_trace_func bugs, since the default === impl should actually invoke ==, causing trace events for both.

JRUBY-969: Add rake and rspec to standard JRuby distribution for 1.0

We have decided to ship both Rake and RSpec with JRuby 1.0, since they are both well-accepted and widely used (moreso for Rake, but RSpec has gained a lot of acceptance the past year). But the mechanism of including them is unclear. I do not want to commit installed gems to SVN; I would prefer to just install them as needed for testing and when building the JRuby distribution. Thoughts?

JRUBY-672: java.lang.Class representation of Ruby class not retrievable

This bug is basically just looking for a way to get at the actual java.lang.Class representing a Ruby-based extension of a Java class. There may have been an API added with Bill's work, or this could be simple to add.

JRUBY-914: JRuby's BigDecimal needs to round

Stu submitted a patch for this, but it unfortunately depends on behavior only present in Java 5 and higher. Since most of us are pretty unfamiliar with BigDecimal, we don't know of a good workaround to support it correctly in Java 1.4.

JRUBY-822: jruby fails to report process exit status correctly

I think Nick may have a grip on this one, but if anyone can offer suggestions as to why the exit codes are apparently shifted in this way, please let us know.

JRUBY-966: Various issues with LocalJumpError not being created early enough
-and-
JRUBY-767: next in an eval should produce a local jump error

These are likely going to be up to Tom and I since they involve pretty deep runtime/interpreter work. But we're also looking for any remaining known issues with LocalJumpError/JumpException that are still out there. I believe we've fixed all such externally-reported issues at the moment.

JRUBY-888: Make regular distro/build create the all-in-one jar by default

We waffle back and forth on this. What's the answer? An all-in-one jar by default, or on request?

JRUBY-873: ant test thread tests sometimes run forever

I managed to get a trace on this one. It appears that some of the tests cause a deadlock between stopping a thread and other operations. But I don't believe anyone's seen it in normal execution yet (or at least nobody's reported it). I'm probably the only one familiar enough with the threading subsystem to fix it, but I'd like to know if anyone's seen other issues.

JRUBY-98: allow "/" as absolute path in Windows
-and-
JRUBY-36: Dir['...'] incompatibilities between Ruby and accross platforms
-and-
JRUBY-61: IO CRLF compatibility with cruby on Windows
-and-
JRUBY-644: Windows line delimiter (\r\n) cause position errors

These blasted Windows pathing and newline bugs just seem to hang on forever. Nobody wants to fix them. There must be someone out there that can help resolve these, yes?

JRUBY-884: Create and consolidate extension and non-standard options

There are a number of configuration options for JRuby. Some have their own flags, like -O to disable ObjectSpace and -C to force compilation of a target script. But many others are only configurable via Java system properties. What we really could use from the community is some idea which settings are worth adding special flags for. We can handle the rest. My personal opinion is that the new -J flag that allows passing flags to the JVM is actually enough.

By my estimation, the rest of the bugs are either almost done or are trivial enough they could be punted to a post 1.0 release. But if we can get these issues knocked down soon (starting with those top couple), 1.0 is going to be an extremely solid release.

Monday, May 14, 2007

Big Plans

I just came across this amusing entry from an old, now-defunct blog of mine on LiveJournal.

The last line pretty much says it all. I must have been really frustrated with my job at the time:

Someone should pay me to sit at home and do great things.

(If you don't get the joke: I now work for Sun Microsystems, at home, on JRuby...so if JRuby's ever considered a "great thing", the prophecy will be fulfilled.)

And yes, I've seen the Microsoft news. I'd hate to be an OSS developer or apologist at Microsoft today. If Sun did something like this I'd resign.

Thursday, May 10, 2007

JSR-292 Summit Wrap-Up

Today Tom and I met with fellow language implementers Oti Humbel, Charlie Groves, and Tobias Ivarsson to start discussions about JSR-292 (invokedynamic) with John Rose, longtime VM implementer and new spec lead. It was an excellent discussion, with John listening patiently to the implementation challenges we've had on JRuby and Jython and offering suggestions. He took copious notes and posted them here.

We have the second edition of the JRuby on Rails talk tomorrow morning, so I'm signing out now. But check out John's post, it's a great first look at a promising start on 292 work.

Monday, May 07, 2007

"the solution is JRuby"

Big news today from ThoughtWorks Studios! Their collaborative development project management solution Mingle (think SourceForge, but pretty and usable) will be the first commercially-distributed Rails application to run on JRuby. Jon Tirsen summarizes the justification for this move in a Studios blog post.

The reasons are obvious ones to those of us who've poured our hearts and souls into making JRuby succeed, but they're excellent points to call out for the Ruby community at large. Mingle 1.0 will be released running with a Jetty web front-end and a Derby database back-end. Mingle 1.1 will be released as a WAR file (Rails-in-a-WAR courtesy of GoldSpike) targeting any app server and any database.

This is a big day for JRuby. ThoughtWorks is one of the most respected names in the Ruby world, and we're proud that they've chosen JRuby as their enterprise deployment solution. We're happy to have ThoughtWorkers like Ola Bini on the team, and look forward to collaborating with ThoughtWorks to continue improving JRuby into the future.

Wednesday, May 02, 2007

JRuby: The T-Shirt!

Yes, I've managed to secure a small order of JRuby t-shirts for giveaway at JavaOne and RailsConf. And they're pretty sweet, have a look at the logo on the front:

And the back will have in white the words:

include Java

How cool is that? The logo was designed and contributed by "Liz" (Liz, let me know if you want me to post contact info or anything...great work) and arranged by Damian Steer, longtime JRuby contributor. I handled the legwork of arranging the shirts, and managed to secure a little funding to get them paid for.

So! How does one get one of these slicko t-shirts? Well, the first round of giveaways goes to folks who've made contributions in the past year. And by contributions I mean either patches or some serious research and legwork on a bug. Obviously there's a core group of folks that should automatically get shirts, so contact me about it if you think you qualify and I'll try to set one aside.

Otherwise...You've got the next week or two to fix a bug, post a patch, and otherwise make yourself useful to the JRuby project. I'm gonna be pretty strict on this...if you want the goods, you've got to help out. May is 1.0 cleanup time, with an RC coming out soon and hopefully a final not too long after. So if you'd been considering helping out, now's the time!

Visit the JRuby homepage, check out the JRuby bug tracker, and grab JRuby source. The details of contributing are linked from the front page of the JRuby wiki.

Update: Oh yeah...how will you get the shirts. The easiest way will be to hunt down Tom and I at the JRuby booth (in Sun's section) between 1PM and 3PM every day. We'll verify that you're not a slacker and make the arrangements. We're looking forward to meeting contributors in person. Otherwise, email me and we'll try to make other arrangements.

For RailsConf, I'll just have the shirts at the hotel, and we'll figure something out. Maybe I'll just bring the box to our training and session there.

One JRuby on Rails Talk Already Full!

Tom and I are giving our JRuby on Rails talk twice at JavaOne next week, and while wrestling with ScheduleBuilder I discovered something surprising: the thursday session is already FULL! Perhaps I underestimated how many people would be interested in seeing us prattle on about the future of Java web development with Rails. I expected that two sessions was extremely optimistic of the scheduling committee. It appears I was quite wrong!

If you haven't signed up for the JRoR session yet, you better do it quick. I'd hate for any of my regular readers to miss out on attending the session. It's gonna be a great show!

Tuesday, May 01, 2007

Dynamic Languages Event at CommunityOne RedMonk Unconference

Ok folks, this is the beginning. I'm going to try, in the midst of JavaOne Day Zero chaos, to run the first big *open* JVM dynamic languages get-together.

It will be part of RedMonk's Unconference running parallel to CommunityOne on Monday, May 7th. It is a *FREE* event. Yes, I said *FREE*.

We're going to have a number of Sun engineers there, including Tom Enebo and myself, John Rose from the JVM, Alex Buckley, and more. There will be representatives from JRuby (besides Tom and I), Jython, SISC, Rhino, Groovy, and others. It's going to be our big opportunity to make sure the JVM continues on a path of solid support for dynamic (and other) languages. And lest ye forget, these discussions will directly influence Java 7...so it's a unique opportunity we must not waste.

Why now? Because interest in dynamic languages for general-purpose application development--especially on top of mainstream virtual machines like the JVM--has simply exploded in the past year. At JavaOne, there's an entire track devoted to Tools and Languages, where the party line in the past has always been "Java Java Java". There are more language-related talks than anyone could attend, where before they were few and far between. And perhaps most importantly, Microsoft today showed they're doing the exact same thing we want to do, announcing their CLR-based Dynamic Language Runtime.

But we have a different palette with which to paint. The JVM is truly open source. All the language implementations are truly open source. We have vast communities in the Java world and strong, energetic communities around each dynamic language project. And all those communities have deep roots in the open source world. Our world is tailor-made for collaboration, and now is the time.

So, on Monday, May 7th (good lord, less than a week away!) I invite language implementers and alternative JVM language enthusiasts to join us. The time has not been decided yet, but I've got two other sessions at 4:00 and 5:00, so it will be before that. Perhaps during the early afternoon period? There's also the possibility of making arrangements for a second discussion that week, maybe co-opting a BOF or a BOF room.

For those of you unable to make it, please try to find a proxy that can represent your interests, or make those interests known to me over email. And know also that I'll be in San Francisco from the 3rd until the 15th, and in Portland, Oregon for RailsConf from then until the 21st. I'd love to get together and discuss language implementation and the future of the JVM. And I love a good beer.

Friends, dynamic languages have truly arrived on the JVM. Let's get together to ensure they feel welcome.

Watch this space for more details.

Wednesday, April 25, 2007

What Would I (Will I?) Change About Ruby

The latest Ruby Blogging Contest hits close to home: What changes would make Ruby into a better language without making it into something that isn't Ruby?

As you might guess, I've got some pretty strong thoughts here. I'm not a heavy Rails user, and I'm not as heavy a Ruby user as I'd like to be. But implementing a Ruby interpreter and now compiler has taught me a few things about what's right and what's wrong with Ruby. I'm not going to complain about performance, whine that the C code is too hard to follow, or even attack C-based extensions. Those may be important issues, but they're all fixable in the long term without breaking anything that works today (or by providing reasonable substitutes). I'm also not going to go into language design ideas...I have mine, you have yours, Matz has his. But my money's on Matz to do the "right thing" with regards to actual language design.

What I'm talking about are a few really important changes to the Ruby runtime, libraries, and ecosystem. Take these as my educated opinions...and don't think too hard about whether I'll be working to change these things in JRuby and in the wider Ruby world.

1. Threading

This more than any other area probably means the most visible changes to Ruby. Ruby currently is green-threaded, as most of you know. JRuby implements native threads mainly because Java uses native threads...we just piggyback off the excellent work of the JVM engineers. And the developing Ruby 1.9, the future successor to the current version 1.8 C implementation, provides something in the middle: native threads with a giant lock, so threads won't run concurrently.

So in general, Ruby is trending toward support for native threads. But there's a problem...some of Ruby's current APIs are impossible to do safely with native threads (and in general, impossible to really do safely with green threads...Ruby just does them anyway). Threading needs to be improved, with support for concurrent execution and removal of operations that prevent that.

Specifically, the following operations and features are inherently unsafe, and are not supported by any mature threaded system:

Thread#kill: Killing one thread from another may leave its locks and resources in an unpredictable state. JRuby currently implements this by setting a kill flag on the target thread and waiting for it to die--basically asking the thread to "please die yourself"--but it's not deterministic and the thread could fail to die.
Thread#raise: Forcing another thread to raise an exception can have the same effect as kill, since the thread may not expect to handle the given exception and may not be able to release locks or tidy up resources. JRuby handles this similar to kill, by setting a field to contain the exception a target thread should "please raise", but again it's not deterministic and there's no way to guarantee the target thread will raise.
Thread#critical=: There is no way to deterministically force true concurrent threads to stop and wait for the the current thread, not to mention the horrendous race conditions that can result when locks are involved. As a result of the many critical problems with critical=, it is already slated to be removed in Ruby 1.9/2.0.

In order for Ruby to survive in a parallel-processing era, unsafe threading operations need to go, and any libraries or apps that depend on them need to find new ways to solve these problems. Sorry folks, these aren't my rules. I understand why people like these features...I like them too. But you can't have your concurrency and eat it too.

2. ObjectSpace

ObjectSpace is Ruby's view into the garbage-collected heap. You can use it to iterate over all objects of a particular type, attach finalizers to any object, look an object up by its object ID, and so on. In Ruby, it's a pretty low-cost heap-walker, able to dig up objects matching particular criteria for you on a whim. It sounds like it might be pretty useful, but it's used by very few libraries...and most of those uses can be implemented in other (potentially more efficient) ways.

JRuby implements ObjectSpace by keeping a separate linked list in memory of weak references to created objects. This means that for every ObjectSpace-aware object that's created, a weakref is added to this list. When the object is collected, the weakref removes itself from the list. Walking all objects of a particular type just involves walking that list. Reconstituting an object ID into the object it references is supported by a separate weak list (again, more memory overhead).

There are no plans currently for ObjectSpace to be removed from Ruby in a future version. But there's a problem...in addition to being pure overhead in JRuby (which you can turn off completely by using the -O flag), ObjectSpace limits evolving development of the Ruby garbage collector, breaks heap and memory transparency, and poses yet more problems for threading.

There are many issues here. First off, the JRuby thing. By having to add ObjectSpace governors for all objects in the system, JRuby pays a very large penalty. We're forced to do this because the JVM (and most other advanced garbage-collecting VMs) does not allow you to traverse in-memory objects nor retrieve the object that is associated with a given ID. In general this is because the JVM does all sorts of wonderful and magical things with objects and memory behind the scenes, and the ability to ask for all objects of a given type or pull an object based on some ID number at any time cripples many of these tricks.

The threading issues are perhaps more important. Imagine if you will a true concurrent VM, with many threads creating objects, maybe one or more threads collecting garbage, and synchronizing all this to guarantee the integrity and efficiency of the heap and garbage collector. There is absolutely no room in this scenario for those multiple threads to request lists of specifically-typed objects at any time, nor to provide an ID and expect its object to be presented to you. These features break encapsulation across threads, they violate security restrictions from thread to thread, and they require whole new levels of locking to ensure that while reading from the heap no other thread produces new objects and no garbage collection occurs. As a result, ObjectSpace harms Ruby by limiting the flexibility of its garbage collecting and threading subsystems, and should be eliminated.

3. $SAFE and tainting

Safe levels are a fairly simple, straightforward way to set a "security level" that governs what operations are possible for a given thread. By setting the safe level to various values, you can limit modification of Object, prevent IO, disallow creation of new methods or classes, and so on. Added to this is the ability to "taint" or "untaint" objects. Tainted objects are considered "unsafe", and so certain security levels will cause errors to be thrown when those objects are passed to safe-only operations.

JRuby has safe level and tainting checks in place, but it's almost assured they're not working correctly. We have never tested them, largely because practically no tests (or perhaps literally no tests) use safe levels or tainting, and we've had *exactly one* bug report relating to safe levels, just a couple weeks ago. And to further kill the possibility of JRuby ever supporting safe levels and tainting correctly, my work tonight to fix some safe level issues revealed that doing so would add a tremendous amount of overhead to almost all critical operations like method creation, module/class mutation, and worst of all, object creation.

At this point, safe levels will probably remain in their current half-implemented state for 1.0, but I think it's almost decided for us that safe levels and tainting will simply not be supported in JRuby. In their place, we'll do two things (which I'd recommend the C implementation consider as well:

Recommend that people who really want "safe" environments use an approach like whytheluckystiff's Sandbox, which takes a more JVM-like approach to safety: it runs code in a true sandboxed sub-runtime with only "safe" operations even defined. In other words, not only is it disallowed to load in files or hit the network, it's physically *impossible* to do so. What makes this even better is that Sandbox is already supported in JRuby (gem "javasand") and JRuby out of the box allows a fine granularity of operations to be disabled in new runtimes.
Implement safe levels like Java handles security restrictions, which we get to leverage since they're already being checked and enforced at the JVM level. We will not be able to map everything...for obvious reasons, checking tainted strings all the time or limiting class and method creation are unlikely to ever happen, but we can limit those operations that the JVM allows us to limit, like loading remote code, opening sockets, accessing local files, and so on. So it's highly likely JRuby's implementation of safe levels will map to clearly-defined sets of Java security restrictions in the near future.

4. Direction

Ruby is a very free-form community. Matz is the most benevolent dictator I've had the pleasure to work with, and most of the community are true free-thinking artists. It's like the hippie commune of the language world. Peace out, man.

But there's a problem here. Ruby needs guidance beyond VM and language design or the loose meanderings of its more vocal community members. It boils down to a few simple points:

Ruby needs a spec. Anyone who believes this isn't true isn't paying attention. Now I'm not talking about a gold-standard legal document signed in blood by Matz and the chief stakeholders of the Ruby community. An officially sponsored, widely supported, and massively publicized community spec would work fine--and probably fit the community and the language better. But something needs to done quickly, since Ruby's "bus number" is dangerously low. A spec is not something to be feared...it's a guarantee that Ruby will live on into the future, that alternative implementations (like JRuby) can't intentionally introduce nasty incompatibilities (or at least, that they'd be easy to discover and easy to document), and perhaps most importantly...that the full glory and beauty of Ruby is published forever for all to see and explore, rather than dangerously trapped in very few minds.
Ruby needs a non-profit governing body. I'm not necessarily talking about a council of elders here, I'm just talking about some legal entity to which OSS copyrights can be assigned, donations can be made, and from which projects and initiatives can be funded. Maybe this would be RubyCentral, maybe this would be some other (new) organization...I don't know that. But it would be a great help to the community and Ruby's future if there were some official organization that could act as caretaker for Ruby's future. I'm all set to sign over any JRuby copyrights I have to such an organization, to protect the future of Ruby on the JVM just like the future of the C implementation. How about you?
Ruby needs you. Granted, this isn't really a change as such. You probably wouldn't be reading this if Ruby didn't already have you. But the Ruby community is at a big point in its lifetime...at risk of losing its identity, being eclipsed by newer projects, or even slipping deep, deep into the trough of disillusionment. What will prevent that happening is the community showing its strong ties, coming together to support official organizations and official documents, and above all, continuing to pour all our hearts into creating newer and better applications and libraries in Ruby, pushing the boundaries of what people think is possible.

Monday, April 16, 2007

Paving the Road to JRuby 1.0: Performance

Since it looks like Antonio Cangiano is going to delay the next running of the bulls until Ubuntu 7.04 is released, I figured the next JRuby 1.0 update could be about performance.

Performance is such a tricky area for Ruby. Folks outside the Ruby community happily malign its performance, fueled by both FUD and by some truths. Folks within the Ruby community either aren't affected by Ruby's performance (it's "fast enough") or they simply don't care (it's slow, but I still love it too much to leave). A small part of the Ruby community takes what is in my opinion a rather anti-Ruby stance: "write it in C" as a targeted solution for identified bottlenecks. I suppose reality lies somewhere inbetween all these views, with Ruby's performance certainly not being stellar in the general case, but reasonable and sometimes surprisingly good for specific cases.

Ruby 1.9 has raised the promise of a new bytecode-based interpreter engine--a Ruby "virtual machine" by some reckonings--with the goal of improving performance foremost on the minds of Ruby's developers. And again the performance question is rather complicated. Antonio's recent shootout, running only Ruby 1.9's chosen benchmarks against all other implementations, shows it doing extremely well. It comes out many times faster than Ruby 1.8 in almost every test, and no other implementation even comes close. The truth however doesn't change very much; for non-synthetic benchmarks (like running Rails) the situation only improves by about 15% for some tests, and for many other tests performance actually degrades. Of course Ruby 1.9 is still under heavy development, and many more improvements are ahead, but the wide range of results demonstrates again that benchmarking must be taken with a grain of salt.

Et tu, JRuby?

So then there's us and JRuby. JRuby's professed goal has never been to be a better Ruby; at best, we're trying to build the best Ruby possible on top of what we believe is the best VM in existence. And toward that end we've put most of our time into compatibility and correctness above all else, aiming for the goal of complete Ruby language compatibility and near-complete builtin-class compatibility, hoping for JRuby to someday be treated as "just another Ruby implementation" on which people can run their apps and design their clever frameworks and libraries. But in the past year, it's become apparent that we could actually exceed Ruby's performance for specific cases in the near term, and for general cases over time. So the target market for JRuby seems to be changing from "Ruby users that must use Java VMs, libraries, and servers" to "Ruby users that want a better-performing, more scalable implementation". And our noble quest for near-complete compatibility gets muddled with all these fiddly performance details.

But that's life, right? The goals you set out for yourself and your projects rarely align perfectly with the goals others set out for you. The trick is achieving a balance between what you want to do with your life and what others (like your community members or your employers) want you to do. Perhaps the successful developer is the one who can derive pleasure from both tasks.

On the road to JRuby 1.0, we've done our best to balance compatibility and performance. We are now nearing the end of the compatibility road, with Ruby language features nearly 100% and builtin classes almost as complete as we can make them on the JVM. The real reason for a JRuby 1.0 now is that we believe we're finally approaching "Ruby compatibility" for some high measure of compatibility, such that the vast majority of platform-agnostic Ruby code should run successfully on JRuby. And that's certainly no small feat, given that just a year ago we celebrated a mostly-broken cobbled-together Rails 1.1 app slowly handling CRUD operations. Today, people are deploying JRuby on Rails apps in production, and the game has only gotten more interesting.

So then, performance. You're all wondering what the answer is to this performance thing, aren't you? Is JRuby going to blow away all competition, including the nascent Ruby 1.9 and mid-term projects like XRuby, Rubinius, and Ruby.NET? It's certainly possible, but it's not our goal. Is JRuby going to be faster than Ruby 1.8 when 1.0 is released? For specific cases, I'd say yes...there's plenty of areas we already perform better than Ruby 1.8. For the general cases, it's hard to say. We perform well serving up Rails requests today, but only about 50-70% of Ruby 1.8's performance. And though we know where most of the bottlenecks lie, we're a little resource limited trying to fix them. Do we believe that we'll be faster than Ruby 1.8 in all general cases in the near future? Yes, we strongly believe that will happen.

Now of course I could ramble on and on about performance and put you all to sleep, but actual numbers will probably keep your interest better than my droning.

The Test

Like the shootout, I'm just running the Ruby 1.9 benchmarks here. We have not done any targeted optimization for these tests; there's no Fixnum magic or anything like that under the covers. What we have done is implement multiple general-purpose optimizations to speed method and block invocation, object creation, and interpretation. We've also spent a little more time getting these tests to compile successfully, but of course any work done on the compiler is generally applicable as well.

These results are all based on JRuby trunk code, revision 3480. I'm running Java 6 on a MacBook Pro 2.16GHz Core Duo, and all code was compiled to target Java 6.

For the first set of results the JRuby command executed basically amounts to the following:

JAVA_OPTS=-Xverify:none jruby SERVER -O [script.rb]

JAVA_OPTS=-Xverify:none specifies not to verify classes on startup; this is a large part of the speed hit Java applications have when starting. We turn it off here to remove a little of that overhead from the benchmarks, since most of them are very short runs to begin with.
SERVER specifies that JRuby should use the "server" VM, which takes a bit longer to optimize a bit more heavily when JITting Java code into native instructions. JRuby generally performs best under the server VM, though using it impacts startup time.
-O disables ObjectSpace in JRuby. This may seem like cheating, but the truth is that when ObjectSpace is enabled we pay double or triple the object creation cost in JRuby since we have to track all objects separately. Ruby's ObjectSpace is essentially zero-cost...it's just a window into the memory manager. Since we don't have a low or zero-cost way to implement ObjectSpace, it falls into what I categorize as "optional incompatibility". If you don't need it, turn it off and JRuby performance will improve. You can call it cheating if you like...the truth is that practically no code actually depends on ObjectSpace.

And then there's the standard disclaimer for any Java application: these times include startup, about 1.0 to 1.3 seconds. I know you all are picky about how benchmarks are run, and you love to include startup time even though the vast majority of the world's work is not done in the first few seconds of execution, but if you'll forgive the disabling of ObjectSpace I'm willing to meet you half way. Startup time is included.

TEST                    MRI     JRuby
--------------------------------------
app_answer              0.584   2.239
app_factorial           ERROR   4.248
app_fib                 7.126   10.549
app_mandelbrot          2.346   10.300
app_raise               2.587   4.441
app_strconcat           1.829   2.141
app_tak                 9.711   13.345
app_tarai               7.529   11.050
loop_times              5.475   6.903
loop_whileloop          9.982   11.797
loop_whileloop2         2.009   3.292
so_ackermann            13.726  26.132
so_array                7.257   8.801
so_concatenate          2.063   3.546
so_count_words          0.507   4.491
so_exception            4.342   8.346
so_lists                1.238   2.744
so_matrix               2.258   4.241
so_nested_loop          5.609   7.898
so_object               7.050   6.496
so_random               2.139   4.643
so_sieve                0.740   2.240
vm1_block               23.604  27.405
vm1_const               16.774  20.650
vm1_ensure              16.546  15.800
vm1_length              21.210  21.899
vm1_rescue              13.170  16.197
vm1_simplereturn        21.091  31.376
vm1_swap                25.114  17.949
vm2_array               6.049   5.690
vm2_method              13.528  17.759
vm2_poly_method         16.886  24.956
vm2_poly_method_ov      4.576   6.972
vm2_proc                7.060   7.797
vm2_regexp              4.421   9.353
vm2_send                4.332   8.198
vm2_super               4.992   7.944
vm2_unif1               3.838   6.095
vm2_zsuper              5.409   8.452
vm3_thread_create_join  0.019   1.592

Some of these are rather surprising results. This is JRuby running in plain old interpreted mode, with no compilation involved. The majority of the tests still have JRuby slower than Ruby 1.8, but the gap has narrowed an incredible amount since last year. In only a few tests are we more than twice as slow as MRI, and on a couple we're almost twice as fast. If you'll imagine startup time removed from these numbers, and believe me when I say Java takes far more than 30-60 seconds to rev up to full speed, then the situation looks even better. What's more, we've got a good several weeks before the first 1.0ish release is scheduled (something betaish or RCish that proudly proclaims it's "done") and a bunch of great committers and community members eyeing performance metrics.

Ok, you may be asking "what about JRuby's compiler?" Yes, there is a compiler in the works. It's primarily been my job to build out the compiler, though Ola has jumped in a few times to offer his excellent help. And progress has been slow but steady. You have to remember that in all the world, there's no known 100% complete Ruby compiler for a general-purpose VM. There's Ruby 1.9, but its bytecodes have been custom designed around Ruby. There's XRuby and Ruby.NET, but it's still unclear how complete they really are. So this is an open area of research and development. But the results are looking great so far.

For this test, both the ahead-of-time (AOT) compiler and the just-in-time (JIT) compiler modes are activated. This forces the target script to be compiled before execution and also compiles any methods hit heavily once execution gets going. The command amounts to the following:

JAVA_OPTS="-Djruby.jit.enabled=true -Xverify:none" /
jruby SERVER -O -C [script.rb]

-Djruby.jit.enabled=true enables the JIT compiler. The default threshold at which a (compilable) method gets compiled is 50 invocations.
-C tells JRuby to compile the target script before executing it. If the script can't be compiled, JRuby bombs out with an error.

The compiler can't handle all the Ruby 1.9 tests yet. Specifically, it doesn't handle multiple assignment (e.g. vm1_swap), exception handling (anything involving rescue or ensure), or full class definitions. But it compiles the majority of the tests.

TEST                    MRI     JRuby
--------------------------------------
app_factorial           0.029   3.459
app_fib                 7.094   5.093
app_mandelbrot          2.340   9.011
app_strconcat           1.827   2.391
app_tak                 9.714   5.394
app_tarai               7.515   4.642
loop_times              5.428   2.942
loop_whileloop          10.016  6.027
loop_whileloop2         2.012   2.191
so_ackermann            13.610  11.254
so_concatenate          2.043   2.396
so_lists                1.250   2.141
so_matrix               2.256   2.394
so_nested_loop          5.614   4.167
so_random               2.158   3.291
so_sieve                0.741   1.798
vm1_block               23.392  12.397
vm1_const               16.980  8.908
vm1_length              21.094  10.901
vm1_simplereturn        21.252  9.345
vm2_array               6.025   3.041
vm2_method              13.049  7.794
vm2_regexp              4.468   7.647
vm2_unif1               3.855   3.293
vm3_thread_create_join  0.017   1.389

Ahh, now things look a bit different! In almost every case, JRuby performs better than Ruby 1.8. In the long running cases, the difference is even more obvious. Short runs still put Ruby 1.8 ahead, but I'm totally ok with that. Java, and by extension JRuby, has never had stellar short-run and startup performance. But we don't really have to if the heavy, long-running apps people actually use end up running faster.

Note also that this is the first real compiler we've had; it's not doing any optimization like Ruby 1.9's and Rhino's compilers, and it's almost certainly far from being efficient. I'm no compiler expert, and this is my first real attempt. These numbers already looking so good demonstrates that there's a grand adventure ahead of us: Ruby really can be made to perform well on the JVM. It just requires a little confidence and a little more effort.

That pretty much wraps up this installment. The bottom line: As we approach JRuby 1.0, our performance is better than it's ever been--faster than Ruby 1.8 for many specific cases, with stellar across-the-board performance right around the corner. And with other implementations like Ruby 1.9, XRuby, Rubinius, and Ruby.NET rapidly coming of age, Ruby's future is looking extremely solid.

script type="text/ruby"

Remember way back when the IRB applet was first thrust upon the world, and I promised that you truly could now use Ruby in the browser? At the time, it was still a little cumbersome, and generally you could only use Ruby by passing it directly into the applet.

Well Dion Almaer has taken it to the next awesome step:

<script type="text/ruby">

Check out his blog entry on how to enable true Ruby scripting on the client side, as well as the link to his demo page (still loads up the applet in the background, but that's a minor implementation detail).

Awesome.

Friday, April 06, 2007

Paving the Road to JRuby 1.0: Unicode

Well friends, the countdown has begun. Within the next several weeks, we're looking at getting a 1.0 release out in either beta or RC form. Of course most of you OSS folks know how arbitrary release numbers are...the day after we branch 1.0 we're going to start committing to 1.1, and the cycle will continue. But it seems like the last year is coming to a very solid milestone, and these stabilization points are necessary once in a while. So it comes to pass.

As part of "closing the door" on 1.0, we're weighing features and bugs and wishes to decide what makes the cut and what must wait for a future release. To help get the word out, I'll be doing a series of entries on where we are, where we'll be for 1.0, and where we're going in the future. First off, one of the biggest issues for Ruby--Unicode support--and what it will look like in JRuby 1.0.

Unicode

Straddling two worlds is complicated. Ruby is dynamically typed, with open classes, redefinable methods, eval...a very different sort of world than Java. If those really obvious items weren't tricky enough, Ruby has another wrench to throw into the works: it cares not for Unicode.

Of all issues, this is probably the most problematic to solve. Java, as you know, is always unicode, representing strings internally as UTF-16 and supporting most external encodings you could ever possibly want. Ruby, on the other hand, pays little attention to encodings most of the time, preferring to just treat strings as an array of bytes. If it's encoded, so be it. If not, so be it. The majority of Ruby's string operations work with bytes (though there are a few smart exceptions), so multibyte-encoded strings can be damaged if you have high expectations.

We've spent the past year weighing options. We want JRuby to be compatible with the C implementation, since it's widely deployed and apps depend on its sometimes quirky string behavior. We also want JRuby to integrate well with Java, and the boundary to be as seamless as possible. How does one accomplish that? We've gone through a couple options:

At first, JRuby's string was represented with a String/StringBuffer internally. This allowed the original JRuby authors the most trivial path to a fully-working string implementation; almost all the heavy lifting was done for them, and for simple cases it worked fine. But the APIs did not conform to what Ruby applications expected, frequently returning 16 bit values for individual characters and reporting incorrect byte lengths for strings that couldn't encode into all 8-bit characters. It was broken, as far as Ruby code was concerned.
The second option, which is what we currently have, is to do what Ruby does...treat all strings as byte[], and implement all operations to the letter of the law. This allows us to make even the lowest-level APIs conform exactly, and to be honest that's been necessary to get applications like Rails up and running. Ruby blurs the distinction between a byte[] and a string so completely that Rails actually now ships with its own multibyte helper, a library that stands a good chance of becoming widely deployed. So to support Ruby and all its many wonderful apps, we fell in line.

At this point it's appropriate to answer a question many of you may have: If JRuby's strings are to be byte[]-based, how can you integrate with Java? A few answers have been proposed. The one I put forward some months ago was that we should treat Java strings like any other Java object, allowing developers to explicitly work with either Java strings or Ruby strings--never the twain meeting without full consent of both parties (or perhaps by using a few helper APIs). This allows Ruby's strings to work as you'd expect, and Java's strings to work as you'd expect, but it doesn't get the nicest flowthrough we really want in this case. So then there's a compromise option likely to be "it" for 1.0.

When crossing the boundary, Ruby will be assumed to use UTF-8 strings.

There are many benefits to this even beyond the fact that it took me 15 minutes to make it work. It allows Ruby strings to behave like Ruby apps expect them to. It allows Java strings to pass into and out of Ruby code without corruption. And it allows Ruby strings to pass into Java code with the only prerequisite (if you want to avoid garbled strings) being that they should be decodable UTF-8. That last item seems like a fairly reasonable requirement given that you're calling 100% unicode Java APIs, and in general the people that will care about this conversion are the ones already using unicode.

Of course, nobody calling these APIs would ever see this or have to do anything special...strings will pass back and forth and encode/decode as necessary, and things will generally just work. And there are performance tricks we can use to speed things up if that becomes a real issue. But it also does something no other proposed solution can: it solves the problem pretty darn well...in time for 1.0.

So for the record, this is the proposed solution we'll go with in the 1.0 timeframe:

Ruby strings are byte[] and conform to Ruby string semantics
Java strings passing into Ruby code will be encoded as UTF-8, with the implication that you should expect to be working with UTF-8 byte[] in the receiving code
Ruby strings passing out of Ruby into Java libraries will be assumed to be UTF-8, and the resulting string on the Java side of the call will reflect that assumption.

And what does the future hold? Well, there's a number of exciting areas happening post 1.0:

At some point, either before 1.0 or immediately after, there will be an installable gem that provides Java-native support for Rails' MultiByte library, which should provide a substantial performance boost there. As it is, the current pure Ruby version works almost 100%, so the extra performance should ultimately just be a bonus.
We will start implementing a Ruby 2.0-compatible string, but it's unclear when would be a good time to flip that switch. I would predict that JRuby users will have a choice of which to use fairly soon, and we'll swap out behavior depending on what you want.
It's also likely that we'll present clean ways to "just use Java strings" for developers that want to do that. This would not result in code that's compatible with stock Ruby, since we'd obviously have to "unicodify" many Ruby methods, but it fits into a larger goal of Ruby-enabling the Java platform as well as possible
And of course, some combination of these three and the current solution is possible, given some time and thought. Ultimately it will be driven by you, the users of JRuby, and what you want to see done with it. Because that's what we here at Sun are trying to be all about...what users and developers want to be doing.

So that about wraps up the unicode story for 1.0. I believe it's an excellent compromise between full Ruby compatibility and tight integration with the Java platform, and it gets us where we need to be right now. So what do you think?

Saturday, March 31, 2007

ActiveRecord 100%, Performance Doubling, Java Support Improving

I haven't blogged a solid JRuby update recently...and this one isn't going to be as solid as I'd like, but it should let you all know that the wheels are still moving and the train is picking up speed. It's astounding how much progress is being made.

ActiveRecord Now Fully Supported

It's good to have such incredible community members and great core team members.

Ola Bini, who recently accepted an offer from ThoughtWorks, has been hitting Rails unit tests hard. And the module with the most failures has always been ActiveRecord. It's the most complicated and the weakest point in the JRuby on Rails story.

Until now.

1031 tests, 3839 assertions, 0 failures, 0 errors

That headline should read "ActiveRecord Fully Supported on JRuby", for you press affiliates out there.

Of course there's a few caveats, but they're pretty minor in this case. That's zero failures and zero errors when running against MySQL, but it's certainly a de-facto standard in the Rails world. The next step is to get as many other databases to 0/0 as possible. And those are much smaller steps than this first one.

So what does this mean for JRuby on Rails users? Well in general, it means that the heart of your DB-driven JRoR apps--ActiveRecord--is now the best-supported piece of the puzzle. It also means that overall Rails support is well into the 99% range and still climbing.

Performance Continuing to Improve

In addition to the focus on getting Rails working well (which is really a focus on improving overall Ruby compatibility) we've been hitting performance much harder the past couple weeks. I especially have been splitting my time between interpreter and compiler work. As a result of all our efforts, the current trunk code is almost another 2X the performance of 0.9.8, released less than a month ago. The compiler also supports almost 2X the number of syntactic constructs it did in 0.9.8, and it's getting faster and more stable every day.

There are actually a number of benchmarks running faster than MRI (the C version of Ruby) even when interpreted, showing that we're finally making good on our promise to have interpreted JRuby run as fast as MRI. So it truly does seem that the compiler will be "bonus" performance once we work out the remaining bottlenecks in the system.

Java Support Looking and Feeling Better

There have been two areas of focus for Java support the past few months: rounding out features and syntax and improving performance. Both had some great jumps in JRuby 0.9.8, with a couple small fixes improving performance many times and with concrete/abstract class extension finally making it into a release.

Since 0.9.8 we've continued to make progress on both fronts. I've made a few additional small changes to improve overall performance, and Ola's been moving some especially critical pieces of code into Java, since they represented a lot of back-and-forth across that barrier. Where calling into Java code was previously as bad as twenty times slower than calling Ruby code, it's now a modest two times slower. We expect to improve that further, getting as close to Ruby invocation speed as possible.

We've also just added what should hopefully be the final revision to our various "import" syntaxes. Oddly enough, we've come full circle to "import":

# create a new constant "System" pointing at it
import java.lang.System

# use Java::syntax for less-common packages
import Java::my.weird.package.MyClass

And we're considering some improvements to how multiple classes are imported and how to specify an alternate constant name (for classes like String, which would be a bad idea to overwrite).

You can expect to see this all solidifying as we run these last miles toward JRuby 1.0.

--

I will blog more on all this as we get into April, but it's truly shaping up to be Springtime for JRuby in JavaLand.