Headius: March 2006

Thursday, March 30, 2006

A Ruby in the Rough

Feeling More and More Ruby with Gems

Things are speedily moving along. We have been continuing our work on Rails, but because of an unusually active JRuby community, we recently changed direction for a bit. We decided to tackle RubyGems.

RubyGems, if you are unfamiliar with it, is a packaging system for bundling up and installing Ruby applications into your local Ruby distribution. It works with both file-based and name-based installs, so that you can either "gem install rails-x.x.x.gem" or "gem install rails" and it will fetch the appropriate archive for you. It manages dependencies and updates seamlessly. It basically does for Ruby what apt-get and rpm do for Linux.

There's not a terribly high level of complexity to the RubyGems code, like there is with IRB or Rails, but it exercises many APIs the former two do not. In our case, the biggest issue was Zlib.

Ruby of course has full Zlib support, with gzip, inflate, checksums, and a nice stream-like API. However, as with many other bit, byte, or math-intensive libraries, Ruby's Zlib is implemented in C code. This certainly speeds up the processing of compressed archives, but it presents a continuing problem for JRuby: we can't inherit *all* of Ruby's libraries just by copying over the lib/ruby/1.8 dir. In this case, luckily, we can do what Ruby does.

Java has included java.util.zip, support for compressed archives, since the 1.1 release, with streams for zip, gzip, and deflate compression and decompression. It also includes the same checksums Ruby includes, CRC32 and Adler32. It provides probably 90% of what we need in Ruby's Zlib library.
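For reference, here is roughly what the Ruby-side Zlib API looks like, which is the surface a java.util.zip-backed version would need to cover. This is only a minimal sketch against the standard zlib and stringio libraries; the helper names deflate_roundtrip and gzip_roundtrip are made up for illustration and are not part of any library.

```ruby
require "zlib"
require "stringio"

# Hypothetical helper: compress with Zlib::Deflate, then restore with Zlib::Inflate.
def deflate_roundtrip(data)
  Zlib::Inflate.inflate(Zlib::Deflate.deflate(data))
end

# Hypothetical helper: the same round trip through the stream-like gzip API.
def gzip_roundtrip(data)
  buffer = StringIO.new
  writer = Zlib::GzipWriter.new(buffer)
  writer.write(data)
  writer.finish   # write the gzip trailer but leave the StringIO open
  Zlib::GzipReader.new(StringIO.new(buffer.string)).read
end

sample = "JRuby on Rails " * 10
puts deflate_roundtrip(sample) == sample
puts gzip_roundtrip(sample) == sample
puts Zlib.crc32(sample)     # CRC32 checksum of the sample
puts Zlib.adler32(sample)   # Adler32 checksum of the sample
```

The checksums and both stream styles have direct counterparts in java.util.zip, which is why most of the work is glue rather than new compression code.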

The availability of a pure-Java implementation of a Ruby library certainly cuts down our workload. Since we already have significant capabilities for calling Java code from Ruby, there's usually only a bit of Ruby code required. However, everything takes time, and since Rails has been our primary focus, we had not spent much time getting Zlib implemented.

Enter our new friend Ola. Ola recently appeared on the jruby-devel mailing list, interested in helping out with JRuby and contributing fixes wherever possible. He also expressed an interest in making RubyGems work. We rattled off a couple items, with Zlib in the list explicitly because it was the major roadblock for RubyGems, and within about a day Ola returned with a preliminary Zlib implementation. I was personally very impressed.

Now it goes without saying that his implementation wasn't perfect the first time, but this really got us going. We switched gears for a few days and worked back and forth with Ola to get Zlib and RubyGems working more smoothly. I think I spent 12 hours on Saturday hacking through Zlib, YAML, and RubyGems code digging for failures. The end result was RubyGems's setup.rb install script running to completion, but it wasn't quite there. Ola came to the rescue to resolve a few more minor issues, Tom committed some additional fixes today, and although we didn't get RubyGems working in time for the 0.8.3 release, we did get a lot done.

At this point, we are able to successfully install RubyGems into the JRuby distribution directory. Huzzah!

Now of course you knew there would be caveats.

Java as a Platform

When running Ruby, there are a number of constants pre-set for you. Among these is RUBY_PLATFORM, which identifies the underlying platform on which Ruby is running. Ruby under Windows reports "mswin". Under Linux it reports "linux", unsurprisingly. For JRuby, it made sense to report "java", since we do not match any other platform exactly. Java is its own platform.

The availability of the RUBY_PLATFORM constant allows scripts to change their behavior depending on the platform. Since Windows doesn't directly support symbolic links, discovering a RUBY_PLATFORM of "mswin" means you shouldn't try to create them. Discovering a platform of "linux" means you probably shouldn't generate externally-consumed file paths using backslash characters. Given that even a cross-platform language and implementation must interact with its host environment, these cases make sense.
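As a rough illustration of that kind of platform check, here is a small sketch. The helper name link_strategy and the sample platform strings are my own assumptions for the example, not code from any Ruby library.

```ruby
# Hypothetical helper: choose how to "link" a file based on the platform string.
# The platform defaults to the running interpreter's RUBY_PLATFORM.
def link_strategy(platform = RUBY_PLATFORM)
  if platform =~ /mswin/
    :copy      # no symbolic links available, so fall back to copying
  else
    :symlink   # every other platform is assumed to support them
  end
end

puts link_strategy("i386-mswin32")
puts link_strategy("i686-linux")
puts link_strategy("java")
```

Note that a platform string the script has never heard of, such as "java", silently lands in the default branch.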

Unfortunately, Ruby's libraries only know about the platforms that the C implementation supports. Hence, there's a problem when running under JRuby. Since we report our platform as "java", any code that uses RUBY_PLATFORM to turn off unsupported features does not have any effect. The specific case I ran into was within the fileutils library.

fileutils provides a number of convenience methods for manipulating or querying files. There are methods like cp for copying files, uptodate? for checking whether a file's modification time is later than those of other specified files, and other similar functions. It is with cp that JRuby ran into problems.

fileutils' cp implementation copies one or more source files to a target file or directory. During this process, it also checks whether the target file is equivalent to the source file. This it does in one of two ways:

- On a UNIX variant, fileutils compares the device and inode for the file, using File::Stat#dev and #ino
- On a Windows variant, where device and inode have no real meaning, fileutils compares absolute paths
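The two strategies above can be sketched as follows. This is not the fileutils source; same_file_by_inode? and same_file_by_path? are hypothetical names I'm using just to show the idea, and the inode check assumes a UNIX-like file system.

```ruby
require "tmpdir"

# Hypothetical helper: UNIX-style check using device and inode numbers.
def same_file_by_inode?(a, b)
  sa = File.stat(a)
  sb = File.stat(b)
  sa.dev == sb.dev && sa.ino == sb.ino
end

# Hypothetical helper: Windows-style check using absolute paths.
def same_file_by_path?(a, b)
  File.expand_path(a) == File.expand_path(b)
end

Dir.mktmpdir do |dir|
  one = File.join(dir, "one.txt")
  two = File.join(dir, "two.txt")
  File.write(one, "1")
  File.write(two, "2")

  puts same_file_by_inode?(one, one)   # same file
  puts same_file_by_inode?(one, two)   # different files
  puts same_file_by_path?(one, File.join(dir, ".", "one.txt"))   # same path once normalized
end
```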

In order to know which method to use, RUBY_PLATFORM comes into play:


    def fu_windows?
      /mswin|mingw|bccwin|wince|emx/ =~ RUBY_PLATFORM
    end

The fu_windows? method obviously returns true on a Windows platform or on any platform where dev and ino will not be available. Notice that Java is not in that list?

Device and inode have even less meaning in Java, which abstracts away the concept of a "file" into the most generic definition possible. In order to keep Java code running the same on all platforms--without platform checks or code changes--Java's File must cater to the lowest common denominator. That's great for Java code, but for us poor suckers trying to get Ruby to run in Java, it's an issue. We need Ruby's libraries to understand that Java doesn't support the same things a UNIX variant does, and doesn't support other things that a Windows variant does. In short, we need Ruby to understand the Java platform's capabilities.

Now we aren't going to be able to solve that issue today. It would be a little unusual for Ruby to include a "java" platform wherever platform is being queried, and I don't doubt we'd get some pushback on that. We're kicking around possible solutions, ranging from implementing our own versions of those libraries that use RUBY_PLATFORM extensively to including modified Ruby libraries in the base JRuby distribution. In the short term, we chose to locally modify fileutils as follows:


    def fu_windows?
      /mswin|mingw|bccwin|wince|emx|java/ =~ RUBY_PLATFORM
    end

With this change, RubyGems is able to successfully install and copy all required files to the JRuby distribution dir. It's not perfect, but it's a start.
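To see what the modified pattern does with different platform strings, here is a quick sketch. The method no_inode_platform? is a hypothetical, parameterized stand-in for fu_windows? so that it can be tried without actually running on each platform; the sample strings are assumptions.

```ruby
# Hypothetical stand-in for the patched fu_windows?, taking the platform as an argument.
def no_inode_platform?(platform)
  # =~ returns a match index or nil; !! turns that into true/false
  !!(/mswin|mingw|bccwin|wince|emx|java/ =~ platform)
end

puts no_inode_platform?("java")
puts no_inode_platform?("i386-mswin32")
puts no_inode_platform?("i686-linux")
```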

Installation != Working

Aside from the very simple case of installing the sources gem, RubyGems is not quite working correctly in JRuby. With RubyGems installed, previously-functional Rails scripts start breaking in new and exciting ways. Network-based gems do not install, owing to our bare-minimum Socket implementation and our lack of marshalling code for appropriate Ruby classes. File-based gems attempt to install, but get "stuck" somewhere during processing. All are frustrating issues, but we're very close.

Small Moves

Along with RubyGems nearly working, it was a very productive week. A lot of the contributions for RubyGems will be helpful in many other areas, and we have continued work on Rails as well.

- We now have a working Zlib library.
- We have made more progress toward getting Rails to work, by running dispatch.rb/cgi and fixing as we go.
- We finally got out a new release after a long delay.
- We are fairly confident we'll be able to demo everything we want at JavaOne.

Things are looking great, and I'm very excited.

Monday, March 27, 2006

Ultra

It seems that JRuby is more than just an entertaining and challenging project; it's also getting attention from folks that matter at Sun Microsystems.

Tim Bray, Sun Web Technologies director and XML specification co-editor, has recently taken an interest in JRuby. He's corresponded by email with both Tom and myself, and has offered useful insight here and there. It's good to know people are watching JRuby and wishing us well.

Of course, a couple free machines never hurts either.

Tim managed to "jar loose" a couple beautiful Sun Ultra 20 machines, one each for Tom and me. I can't speak for Tom, but considering the dismal condition of my home data center, the new machine is a welcome addition.

I generally have nothing but good things to say about the Ultra 20. We received the base "Large" model as listed on Sun's store; I'll save you a click and paste specs here:

Sun Ultra 20 Workstation
1 AMD Opteron Model 152 Processor
1-MB On-Chip L2 Cache
2-GB Memory
1 250-GB 7200 RPM SATA Disk Drive
1 Quadro FX 1400 Graphics Card
1 DVD-Dual Drive
1 10/100/1000 BaseT Ethernet Port
6 USB Ports
2 IEEE 1394a Ports
1 PCI-Express x16 Slot
2 PCI-Express x1 Slots
4 Conventional PCI Slots (32-Bit/33-MHz)
Sun Studio Software Pre-Installed
Sun Java Studio Creator Software Pre-Installed
Sun Java Studio Enterprise Software Pre-Installed
Solaris 10 Operating System Pre-Installed
3 Year Warranty, 3 Year Parts Exchange

Any review I might give would be impossibly biased by my complete lack of experience with recent hardware. Before the Ultra 20, my fastest machine was my 1.6GHz Pentium M Dell laptop, which overheated when running unit tests too vigorously (producing no end of colorful metaphors from yours truly). I had no desktop, having converted my poor old 1GHz Celeron homebuilt into the headius.com server. Suffice it to say that the Ultra 20 is by far the fastest, quietest, and most impressive workstation I've had the pleasure of using.

Before this machine, the stories of AMD chips trouncing the top-end Intel equivalents were merely anecdotal. Now I get it. The Intel machines I use most days almost seem broken in comparison. Given that AMD topping Intel only means Intel will work harder to create more competitive chips, this bodes extremely well for the next crop.

My only disappointment was that my old 17" Mitsubishi monitor, the only remaining working monitor in my house (don't ask about the 21" Viewsonic that met its untimely demise at the hands of a leaky basement window) was unable to work out-of-the-box with the pre-installed Solaris 10. I was keen to give it a shot, but it seems it will have to wait until I upgrade to a better screen. Hopefully, that will happen in the next week or two.

I would whole-heartedly recommend this machine (or even one of the lower models) to anyone that needs a new box. Even the base "Small" model is still a far faster machine than any I've ever used, and it seems to be priced well. Again, take my star-struck (Sun-struck?) opinion with a grain of salt, but having a powerful, inexpensive computer with a Sun logo on the front seems like a pretty sweet deal to me. YMMV.

In the end, however, a blazing fast machine means faster development, faster test runs, and faster progress on JRuby. Who knows, perhaps I'll even have time for the other 9000 projects I'd like to tackle.

Thank you Tim Bray, and thank you Sun Microsystems.

Monday, March 20, 2006

Enterprise Sour Grapes

Recently a post by James McGovern on Why Ruby Isn't Enterprise Ready floated my way. I felt it necessary to offer up a response. My disclaimers: I am not familiar with McGovern's past work, I am not (yet) using Ruby in the "enterprise", and I am far from a Ruby or Rails expert. I am, however, trying to bring Ruby the capabilities of Enterprise Java using JRuby, and I've done a bit of Enterprise work during my ten years as a Java developer.

Point-by-Point Responses

McGovern's points are not all bad, but most of them are either poorly realized or amount to chicken-and-egg arguments: Ruby isn't in the enterprise now, so it's not ready for the enterprise. I'll try to blaze through the nonsense and tackle the substantive issues more directly.

Point 1: Books for Ruby Suck and are too introductory
Nonsense. Ruby as we know it today is young and has only garnered attention over the past few years. Books are written to be sold, and since almost everyone doing Ruby work has started very recently, the market for introductory books is largest. There are, however, more and more people interested in enterprise Ruby, and so the books are starting to follow. It will take time, just as it did for Java, but it doesn't mean Ruby isn't ready to make the jump to "enterprise" software.

Point 2: Huh? (i.e. poorly-written nonsense)
I couldn't really glean out any specific point being made here. I'm not familiar with "insulting" firms, though perhaps that's a crude attempt at humor. Assuming he means "consulting" firms, he's only partially right; obviously if an organization already has an "enterprise" plan, they won't be told how to rewrite it. However, any consulting firm worth its mettle will try to follow the same best practices when building out a new architecture or conforming to an existing one, and those practices aren't specific to any language or platform. This has nothing to do with Ruby.

Point 3: Ruby isn't enterprise ready because consulting firms aren't doing it
Nonsense. Consulting firms have a vested interest in keeping their technology portfolio tailored to the demands of clients. They also have an interest in telling clients what they actually want, so those demands continue to fit their portfolio--just like Enterprise Thought Leaders have a vested interest in convincing others their platform of choice is the best solution. Regardless, Java has a huge presence in the enterprise space, and has become the perfect vehicle for consulting...a well-understood, boring, accepted platform. .NET is well on its way to "boring" as well. Ruby will have work to do to "break into" this world, but its absence there says nothing about its viability as an enterprise platform. It's a small fish in a big pond...but fish grow.

Point 4: Magazines read by enterprise architects don't cover Ruby
Nonsense. This doesn't even make sense, since I doubt McGovern can speak for all "enterprise architects". If he means "enterprise architects that only use Java or .NET or don't read about languages", then he may be right. Otherwise he's making a blanket statement that's obviously impossible to prove and is at best totally wrong.

Point 5: Fortune 500 company-employed architects don't blog about Ruby
If this is true, then they are at risk of being passed up by others that do read and blog about Ruby. However this is another statement that says nothing about Ruby's enterprise capability. It only continues to say that "Ruby isn't enterprise now, so it's not ready for the enterprise."

Point 6: Large enterprises like big vendors (i.e. the same old Open Source FUD)
Ruby is garnering more and more attention from the "big vendors", but they're naturally cautious about leaping into a new language that may or may not be the right way to go. However, the same argument could have been made about languages like Python, which now sees extensive use in enterprise apps and for which Microsoft has been funding IronPython for their CLR. Ruby is simply a tool; the eventual platform we will use to build enterprise applications is being built and will be built upon that tool, and it will in many cases be "big vendor"-driven.

Point 7: Big vendors can't make money off Ruby
Sun does not make a lot of money off Java, if they make any at all. Microsoft does not make a lot of money off the .NET languages. In either case, what little income they gain is offset by the massive research and development staff making those platforms possible. They make their money by selling products related to those platforms, such as toolsets and server software and hardware. Nobody is making money "selling Java" or "selling C#", just like nobody makes money "selling C" or "selling Ruby". As in the real world, the market for tools pales in comparison to the products of using and deploying those tools effectively. And again, this says nothing about Ruby's technical merits.

Point 8: Legal transparency is more important than software development productivity
This has nothing to do with Ruby or anything else, except to say that "so what if there's productivity gains to be made...I'm too busy dealing with the lawyers". Certainly, the cost of answering a subpoena may offset productivity gains elsewhere in an organization. However, McGovern does nothing to demonstrate any causal relationship between the two. I'm not entirely sure that he's demonstrated anything at all.

Point 9: People, then process, then tools
If tools matter least of all, then whether the tool is Ruby or some other language is entirely irrelevant. This point basically states that all his other concerns about Ruby in the enterprise are moot. Huzzah!

Point 10: Do not talk about Fight Club.
There is no point 10 in McGovern's post. I'm not sure if this is intentional or due to carelessness but judging by the rest of the post I'd expect the latter.

Point 11: More ranting about productivity
The absurd productivity gains claimed by some in the Ruby community are a concern, but these same sorts of numbers have been used for every Next Big Thing. Undoubtedly, Ruby will have an effect on productivity, and I believe it will have a strong positive effect, but this point has nothing to do with Ruby specifically.

Point 12: More ranting about books
Complaints about a particular magazine's method of giving out "awards" or a publisher's use of those "awards" to market books have nothing to do with Ruby. There are plenty of SD Magazine "award-winning" books in the Java and .NET worlds too.

Point 13: Productivity gains are outweighed by increased contract negotiation time
Clients are interested in getting the best value for their dollar, but are also interested in tried-and-true technologies. As mentioned earlier, Ruby is still young, but its youth says nothing about its capability in the enterprise. I remember plenty of companies that shunned Java early on for the same reasons...if I'd followed their advice I'd be looking for COM+ and Microsoft DNA work. It's just another chicken-and-egg argument that has affected every other language and platform since there have been such things.

Point 14: Agile methodologies should emphasize code generation and the agile community doesn't get it
Other than a brief mention of Ruby, this doesn't seem to have anything to do with the core thrust of his post. It's worth mentioning that the poster child for Agile development, Rails, generates a large amount of its code before and during deployment.

Why Respond to such Drivel?

I've heard many folks ask this question about responses to such misguided "thought leaders" as McGovern. Why respond to nonsense? Why give air to FUD?

There's probably a few good answers.

First off, not responding implies to many that there are no counterpoints to be argued. Responding to a well-written and well-thought-out post is probably less important than responding to FUD, since the latter is generally filled with lies and vitriol. This is the case with McGovern's post; the points are not valid and generally say nothing about Ruby, but the title and thrust of the posting imply such. Other "enterprise architects" that skim through such a posting may use it to solidify their own prejudice and bias toward specific tools and platforms, which only hurts the evolution of software and software development. It is for this reason that I choose to respond and choose to publicly call out such bogus claims as well as I am able.

Second, such postings make us folks actively working on Enterprise Ruby more than a little pissed off, because the content is so weak and the facts are so twisted. Consider this post my outlet for such frustration. Edmund Burke said it best: "The only thing necessary for the triumph [of evil] is for good men to do nothing." Perhaps McGovern is not evil, and perhaps I am not good, but I will not stand by and do nothing. If roles are in actuality reversed, I would expect the same from McGovern.

Third, we like to talk. This is certainly true, but does not necessarily imply any malice. I like to write...so sue me.

Rails' Generators Are Working

I considered making a post last week, to keep up my at-least-one-posting-per-week schedule. However, I had dug myself deep into Rails' generator script and the built-in generators and resolved to finally get them running. There were a number of more complicated issues to solve, but it felt like I was very close to having it working. Any of you all-night hacker types know the feeling; success is just around the corner...maybe this is the last bug...maybe this run will complete without errors.

I can now announce that as of this evening, all the built-in Rails generators appear to be running and generating correctly using JRuby.

This has certainly been a hard-fought battle, and the last week had some big fixes:

- As mentioned in Pickaxe, Object does not actually define any instance methods; instead, it mixes them all in from Kernel. However, the original design of JRuby had followed Pickaxe in form rather than substance, defining those methods on Object. While this did not typically affect the functioning of normal Ruby code, it did break one library in particular: delegate. DelegateClass, in delegate, uses Kernel's list of public instance methods to select which methods on the target class are to be delegated. Rails uses it internally for, among other things, delegating some behavior for generator commands to a generator base class. Fixing the issue meant redefining the Object instance methods as Kernel module methods...a fairly major change, but one that does not appear to have caused any other regressions.
- My recent addition of binding support had a small flaw. The current chain of method calls in JRuby to do an eval is long and winding (much longer than I would like), and one link in that chain I did not inspect caused two issues: evaluating with a binding did not correctly set "self", and completion of that eval did not correctly reset it. In lieu of cleaning up the eval chain (which I commit myself to eventually do), I made a few modifications so "self" would work correctly.
- Enumerable#collect should work without a block; this is not documented in Pickaxe and finding this issue from deep within the bowels of 'generate' was a painful chore. This is a perfect example of a minuscule bug that causes massive trouble; the fix was less than a line of code, but the bug prevented 'generate' from correctly mapping and executing any actions. And why would you want to collect without a block? Answer: if you only have "each" defined and wish to turn your Enumerable into a simple array.
- JRuby's Module code inexplicably defined a singleton "new" method. This prevented Module subclasses from defining their own initializers that call "super". I'm still not sure why this was there, but it has been removed.
- Module#ancestors failed to include singleton classes.
- Java does not support the concept of a process-wide "current directory" as Ruby does. In order to fake this behavior, JRuby originally modified the system property "user.dir" to point at JRuby's new current directory. This was not only a dangerous thing to do, but was also not sufficient to make current directories work correctly. It also would drastically affect all other code running in the same JVM that depended on "user.dir" being correct. My modification was to introduce into JRuby a runtime-wide "current directory". In addition, I fixed a problem in Dir.chdir where failures in the provided block prevented chdir from resetting the dir back to its original location.
- Dir.mkdir did not handle multiple levels of dir creation. The simple fix was to use Java's File#mkdirs instead of File#mkdir. How delicious...an easy bug.
- IO#read returned nil at EOF, instead of the correct "".
- File.join did not clean up multiple dir separators in a row, resulting in invalid paths like "app//models".
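A few of the file and directory behaviors in that list are easy to check from a script. The sketch below shows what I understand the expected behavior to be for Dir.chdir, IO#read, and File.join; the helper method names are hypothetical and exist only to keep the checks tidy.

```ruby
require "tmpdir"

# Dir.chdir with a block should restore the previous directory even when the block raises.
def chdir_restored_after_error?
  before = Dir.pwd
  begin
    Dir.chdir(Dir.tmpdir) { raise "boom" }
  rescue RuntimeError
    # expected; we only care about where we ended up
  end
  Dir.pwd == before
end

# IO#read with no length argument should return "" (not nil) once at EOF.
def read_at_eof
  Dir.mktmpdir do |dir|
    path = File.join(dir, "sample.txt")
    File.write(path, "abc")
    File.open(path) do |f|
      f.read   # consume the whole file
      f.read   # this second read happens at EOF
    end
  end
end

puts chdir_restored_after_error?
puts read_at_eof.inspect
puts File.join("app/", "/models")   # separators at the seam collapse to one
```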

Busy, busy, busy. On top of these fixes I also helped resolve a few regressions discovered by a hardcore JRuby user (who also happens to be a team member). If only I had another eight hours a day to work on this stuff. But, I digress.

At any rate, you're here to read about the generators.

Rails Generators

Part of what makes Rails so agile and powerful is its beautifully simplistic code generation capability. By using the "generate" script, you can generate perhaps 90% of a working web application. With the 1.0 release, there are generators built in to create models, controllers, mailers, plugins, web services, and database and session migration code. There are also third-party generators for quickly generating other bits and pieces of a typical web app.

The generator code is fairly extensive, but unsurprisingly it does not exercise all of Rails' code. It does, however, represent the typical "first step" into the Rails world, and so I set out to get it running in JRuby. After many tests and fixes, documented in my other blog entries and immortalized in CVS, generators now work.

A few test runs to demonstrate:


    C:\rails>jruby script\generate scaffold "myapp/Account" open close balance
      create  app/controllers/
      create  app/helpers/
      create  app/views/open
      create  test/functional/
    dependency  model
      create    app/models/
      create    test/unit/
      create    test/fixtures/
      create    app/models/account.rb
      create    test/unit/account_test.rb
      create    test/fixtures/accounts.yml

    C:\rails>jruby script\generate web_service User add edit list remove
      create  app/apis/
      exists  app/controllers/
      exists  test/functional/
      create  app/apis/user_api.rb
      create  app/controllers/user_controller.rb
      create  test/functional/user_api_test.rb

    C:\rails>jruby script\generate plugin SiteMinderAuthentication
      create  vendor/plugins/site_minder_authentication/lib
      create  vendor/plugins/site_minder_authentication/tasks
      create  vendor/plugins/site_minder_authentication/test
      create  vendor/plugins/site_minder_authentication/README
      create  vendor/plugins/site_minder_authentication/Rakefile
      create  vendor/plugins/site_minder_authentication/init.rb
      create  vendor/plugins/site_minder_authentication/lib/site_minder_authentication.rb
      create  vendor/plugins/site_minder_authentication/tasks/site_minder_authentication_tasks.rake
      create  vendor/plugins/site_minder_authentication/test/site_minder_authentication_test.rb

And so on. It's a pretty big milestone to finally have these generators working, and it means that one more step in the Rails development process now works under JRuby. I'm very pleased.

There are a few caveats (of course), but all told they're fairly minor. Rest assured they'll be remedied forthwith:

- Among other block arg tricks, specifying an index into an array or hash as a block arg is still nonfunctional. This will require interpreter work and possible parser changes.
- There are a couple warnings that display while running 'generate', but they are safely ignored. I believe they are overzealous warnings within the parser, left over from Ruby 1.6.
- I can neither confirm nor deny that the generated code and content is correct; however, it looks correct to my untrained eye.
- Not all the above fixes are committed; not all fixes committed are guaranteed not to cause regressions.
- I'm no Rails expert, despite swimming in the deepest parts of its ocean. I will be putting my Pragmatic 'Rails' book to heavier and heavier use now that we're finally putting JRuby on Rails.

The next big step will be continuing on to get Rails proper running with JRuby. The fixes I've contributed should help speed that process along, and Tom is already well into it. After wrapping up those last minor issues, I will endeavor to help him.

So there you have it. Great progress has been and is being made, and I'm having fun making it happen. Hopefully Rails actually running is coming very soon...stay tuned.

Thursday, March 09, 2006

Progress Update March 9

It's been a busy few days. We've made more good progress, so I figured an update was due.

Last Class

Tom discovered over the weekend that we had a problem setting the "last class" in the interpreter. This interfered with "super" calls working correctly.

The "last class" is used to point at the appropriate class hierarchy for the currently-executing code. We were setting "last class" to the actual class or module the current object was an instance of. This is correct for classes, but it was completely wrong for modules. In Ruby (both C and J), modules are internally inserted into the inheritance chain using module "wrappers". These wrappers become an implicit superclass of the class doing the include, and the wrapper's superclass points at the class's original superclass. So in the following example...

    module X; end
    class Y; end
    class Z < Y; include X; end

...the explicit class hierarchy shows Z extending Y, while internally Z's superclass is a wrapper for X, whose superclass is then Y.
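You can see both views of that hierarchy from Ruby itself. A small sketch using the same three definitions: superclass hides the wrapper, while ancestors shows the order in which methods are actually looked up.

```ruby
module X; end
class Y; end
class Z < Y; include X; end

puts Z.superclass.inspect          # the explicit hierarchy: Y
puts Z.ancestors.take(3).inspect   # the lookup order: Z, then X, then Y
```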

Having "last class" set appropriately in the interpreter is important because super calls must always be able to traverse the appropriate hierarchy. If, for example, you created a module-based "initialize" method (which some apps do to mix-in initialization behavior), and that initialize method called super, it must go to the appropriate superclass's initialize. Since modules do not have an initialize (and are uninstantiable), our original "last class" pointer at the module itself failed miserably. The initialize method was called correctly on the module, but super pointed at nothing.
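Here is a minimal sketch of that situation: a module mixing in initialization behavior and calling super. The names Tracked, Base, and Account are hypothetical, chosen only for this example.

```ruby
# Hypothetical mix-in that adds initialization behavior and then defers to super.
module Tracked
  def initialize(*args)
    @tracked = true
    super   # must continue on to the includer's superclass
  end

  def tracked?
    @tracked
  end
end

class Base
  attr_reader :name

  def initialize(name)
    @name = name
  end
end

class Account < Base
  include Tracked
end

account = Account.new("savings")
puts account.tracked?   # set by Tracked#initialize
puts account.name       # set by Base#initialize, reached through super
```

If super from the module's initialize pointed at nothing, name would never be set, which is the failure described above.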

Once Tom had figured out how we ought to fix this, I whipped up a patch. We modified the method-calling pipeline to send the appropriate "last class" through to execution. This allows module methods to now run within the context of the includer's hierarchy, rather than in their own hierarchies. With that fixed, super calls are now working correctly.

Array#unshift

Array#unshift now accepts an empty arg list. This is perhaps a Ruby 1.8.4 change. We had been requiring at least one argument, as is documented in Pickaxe. I noticed this problem while re-testing IRB with the "last class" fixes.
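A quick sketch of the behavior in question, as I understand it: with no arguments, unshift simply hands back the array untouched.

```ruby
list = [2, 3]
list.unshift(1)   # the usual case: prepend an element
list.unshift      # no arguments: nothing is added, the array is returned as-is
puts list.inspect
```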

eval + Binding fixes

Binding is a fairly recent addition to JRuby. We had never supported Binding until a few weeks ago, which killed apps like IRB and Rails in the cradle. Adding it opened up a world of opportunity.

However, yesterday I discovered that code eval'ed with a Binding was not getting "self" correctly. Throughout the interpreter, there's one universal constant: "self" points to the current object. In a pure OO language like Ruby, all code is executed in the context of an object. Code at the top level executes within the context of the top-level Object instance, instance methods execute within the object they are called against, and so on.

When calling eval, you can choose to specify a binding or not. eval("code") evaluates the code in the current object (i.e. the current "self"), as well as within the current call frame, scope, class, etc. eval("code", some_binding) evaluates the code within another context entirely, with a different "self", frame, and friends.

Our eval, though properly setting up the bound context's frame and friends, was not correctly setting self. It was always making self the current object. I fixed it to set self to the correct object when called with a binding, and it appears to be working correctly.
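To illustrate what "the correct object" means here, a small sketch follows. The Greeter class and its context method are hypothetical, written only to capture a Binding from inside an instance.

```ruby
class Greeter
  def initialize(name)
    @name = name
  end

  # Hypothetical accessor: expose a Binding captured inside the instance.
  def context
    binding
  end
end

bound = Greeter.new("Ola").context

puts eval("self.class", bound)   # self is the Greeter instance, not the caller
puts eval("@name", bound)        # instance variables resolve against that self
puts eval("self.class")          # without a binding, self is whatever is current
```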

IRB

With my update to Ruby 1.8.4 code and the "last class" fixes, IRB suddenly stopped working. Bummer. To make matters worse it started failing late last night, and I couldn't find all the issues.

However, on the bus ride to work today, I fixed both the eval and Array problems that were preventing IRB from working. Starting up IRB again I received a lovely surprise:

C:\JRubyWork\jruby>jirb
irb(main):001:0>

It is the good old IRB prompt, rather than the hideous IRB::Workspace::Nonsense!

The "self" issue prevented IRB from running at all with the "last class" fixes in place, and was the cause of two other issues with IRB: the need for --single-irb mode, and the ugly prompt.

With the eval issue fixed, IRB now runs without the need for --single-irb mode, and I have updated our "jirb" scripts to reflect this. It also runs with the appropriate prompt. This will make me much happier and our future IRB demos much prettier.

Rails

My goodness Rails is a gigantic thing.

We are continuing to make good progress on Rails. With all the above fixes, the generate script now runs through to initialization of the Routing subsystem. There it fails with an argument error that could be an interpreter or API issue.

I have also ventured a bit beyond this--by commenting routing out--and ran into a Ruby syntax we do not yet support:

x = {}
[1].each {|x[:foo]|}

Ruby at some point (probably in 1.8) added the ability to specify an array or hash index as the target for a block argument. While I can see the usefulness of this shortcut (rather than {|v| x[:foo] = v}) I must say this syntactic sugar is a bit sweet for my tastes. At any rate, it's there and we need to support it.

Rails so far only uses this syntax when printing out usage for the generate script. I modified that script to use the longhand version, and it was able to continue.
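The longhand form looks roughly like this. The hash name is made up for the example, and this is a sketch of the idea rather than the actual change to the Rails script:

```ruby
options = {}

# Longhand: take the value as an ordinary block parameter,
# then assign it to the hash slot explicitly.
[1].each { |value| options[:foo] = value }

p options[:foo]
```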

With routing commented out and the block arg syntax modified, the generate script ran to completion. We're getting closer!

Executing the generate script with a specific generator caused some other issues, whereby JRuby sees a yield without seeing a block to yield to. It's not functional yet, but it's moving right along.
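For anyone who has not run into this class of error, here is a sketch of what "a yield without a block" means in plain Ruby. The method name is hypothetical and the code is not taken from the generator scripts. In our case the block was presumably there and JRuby lost track of it, but the symptom is the same:

```ruby
# Hypothetical method for illustration only.
def each_generator
  yield "controller"
  yield "model"
end

names = []
each_generator { |name| names << name }
p names

begin
  each_generator   # no block supplied, so yield has nothing to call
rescue LocalJumpError => e
  puts "yield failed: #{e.message}"
end
```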

Tom is also working on the other end of things, running Rails in "CGI mode" to test execution of an actual Rails request. At this point he's stuck on the same argument issue that's preventing routing from starting up, so that's what we'll work on next.

JavaOne 2006

As most of you will know, we will be presenting JRuby at JavaOne this year. We have almost finished our slides and will be sending them to Sun tomorrow. We obviously can't show you the slides themselves (since we want you to come see them in person) but we do hope to do most of the following five demonstrations:

Interactive JRuby with IRB
JDBC in Ruby
Swing in Ruby
Spring in Ruby
Rails

Obviously the Rails demo will only happen if there's something to show...but we're banking on having something working by then. We may have to drop one of the others for time, but I hope to touch on them all. We hope to see you all at JavaOne!

Thursday, March 02, 2006

IRB is GO!

I should post about things not working more often.

Within a few hours after my previous post, where I showed the world how IRB now starts up successfully in JRuby but does not work, I was back at it trying to fix the next few bugs preventing it from working. The first issue was a NullPointerException deep in the interpreter, when executing an "until" block. Our parser, for right or for wrong, was producing an AST "UntilNode" with no body. While this could be correct or incorrect behavior--since the "until" in question actually did have an empty body--we still were not handling it correctly. The interpreter assumed that all "until"s would have bodies, and when a body turned up null...kaboom. A fix to check for null and not attempt to evaluate the body was an easy, if not entirely kosher, way to fix it. Done.
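The construct in question is legal Ruby: an "until" whose condition does all the work and whose body is empty. A sketch with a made-up method name (this is not the IRB code that triggered the bug):

```ruby
# Hypothetical helper for illustration only.
def count_until(limit)
  attempts = 0
  # All the work happens in the condition; the loop body is empty.
  until (attempts += 1) >= limit
  end
  attempts
end

puts count_until(3)
```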

However, nothing could prepare me for what followed.

C:\JRubyWork\jruby3>jruby C:\ruby\bin\irb
irb(#<IRB::WorkSpace:0x5a9c6e>):001:0> x = 1
=> 1

I expected that the "until" bug would go away...that much was easy. However, I did not expect the variable assignment to work. "Ok," I thought, "that's better progress than I expected, but let's try something more complicated."

irb(#<IRB::WorkSpace:0x5a9c6e>):002:0> puts x
NameError: undefined local variable or method 'x' for #<IRB::WorkSpace:0x5a9c6e>
from (irb):1:in `method_missing'
...

Ahh, there's the comforting disappointment I was used to. The 'x' variable had been declared and assigned, but for whatever reason, it was not visible in the current scope.

Normally, I would have continued on to fix this scoping issue, which certainly would have involved a complicated dive into JRuby internals, hunting for mishandled scopes, bindings, frames, and wrapper objects. In this case, however, I decided to give IRB's "single IRB" mode a try, which simplifies the logical scoping of the IRB workspace. What follows is a series of annotated IRB sessions running--yes, running successfully--under JRuby.

This first demo shows something basic: a multiline do/end array iteration.

C:\JRubyWork\jruby3>jruby C:\ruby\bin\irb --single-irb
irb(#<IRB::WorkSpace:0x175ace6>):001:0> [1, 2, 3].each do |i|
irb(#<IRB::WorkSpace:0x175ace6>):002:1* puts i
irb(#<IRB::WorkSpace:0x175ace6>):003:1> end
1
2
3
=> [1, 2, 3]

This confirmed several things:

method calls were working just fine
array instantiation and integer literals were ok
multi-line constructs were being handled correctly

It is this last one that surprised me a bit. I had not expected multi-line constructs to work so well and without any problems, but there it was. Playing around a bit more, I discovered some other surprises:

line editing was working successfully, and I could arrow-key left and right to correct mistakes
command history was also working, so that up and down arrow would retrieve the previous and next lines, respectively
tab completion does not work

Excluding the tab completion issue (hitting tab just inserts a "tab" character into the current line), the perfectly working line editing and command history totally blew me away. I have NEVER seen a console-mode Java application do such things so seamlessly, much less one running an interactive shell. It appears that IRB's fallback "StdioInputHandler" is far less "dumb" than I expected. It was making Java do things I didn't know Java could do. Excited, I pressed on.

This next demonstration tests the declaration and instantiation of a multiline class, another area I thought would never work correctly.

irb(#<IRB::WorkSpace:0x175ace6>):001:0> class MyClass
irb(#<IRB::WorkSpace:0x175ace6>):002:1> def hello
irb(#<IRB::WorkSpace:0x175ace6>):003:2> "Hello from IRB!"
irb(#<IRB::WorkSpace:0x175ace6>):004:2> end
irb(#<IRB::WorkSpace:0x175ace6>):005:1> end
=> nil
irb(#<IRB::WorkSpace:0x175ace6>):006:0> x = MyClass.new
=> #<MyClass:0x1621fe6>
irb(#<IRB::WorkSpace:0x175ace6>):007:0> puts x.hello
Hello from IRB!
=> nil

Once again, JRuby (and IRB) thoroughly surprised me. Defining a class over multiple lines worked perfectly, just as you'd expect from IRB running under C Ruby. At this point, IRB was running so well I began to have some doubts. Could it be that IRB had called out to an external C Ruby process for running the interactive portion of the shell? Such a thing would not be unheard of; Rake launches external Ruby processes to run test cases, though you might never notice such a thing. There was, however, a simple way to confirm that I was actually seeing JRuby at work and not C Ruby: call Java code.

JRuby's greatest strength lies, unsurprisingly, in its ability to neatly tie Ruby and Java code together. For what other purpose would we want Ruby running in the JVM than to take advantage of the wealth of libraries the Java world has to offer? The integration is improving more and more with each release, and has become extremely powerful, usable, and above all very Ruby-like.

This next demonstration shows IRB calling Java code.

irb(#<IRB::WorkSpace:0x175ace6>):001:0> require 'java'
=> true
irb(#<IRB::WorkSpace:0x175ace6>):002:0> include_class "java.lang.System"
=> ["java.lang.System"]
irb(#<IRB::WorkSpace:0x175ace6>):003:0> System.out.println("Hello from Java")
Hello from Java
=> nil

Now some of you may not realize what this means. The ability to interactively script and exercise Java code from within an IRB session has huge potential for testing Java code, debugging JRuby (perhaps that's more exciting to me...oh well), and providing all the interactive goodies that Rubyists have taken for granted with the power and variety of Java's capabilities.

So, a final demonstration is in order.

irb(#<IRB::WorkSpace:0x175ace6>):001:0> require 'java'
=> true
irb(#<IRB::WorkSpace:0x175ace6>):002:0> include_class "javax.swing.JFrame"
=> ["javax.swing.JFrame"]
irb(#<IRB::WorkSpace:0x175ace6>):003:0> include_class "javax.swing.JButton"
=> ["javax.swing.JButton"]
irb(#<IRB::WorkSpace:0x175ace6>):004:0> frame = JFrame.new("my frame")
=> javax.swing.JFrame[long desc omitted...]
irb(#<IRB::WorkSpace:0x175ace6>):005:0> button = JButton.new("my button")
=> javax.swing.JButton[long desc omitted...]
irb(#<IRB::WorkSpace:0x175ace6>):006:0> frame.contentPane.add(button)
=> javax.swing.JButton[long desc omitted...]
irb(#<IRB::WorkSpace:0x175ace6>):007:0> frame.setSize(200, 100)
=> nil
irb(#<IRB::WorkSpace:0x175ace6>):008:0> frame.show
=> nil

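The result: [screenshot not preserved: a Swing frame titled "my frame" containing a single button labeled "my button"]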

So there you have it. With a few small caveats (like --single-irb), IRB is actually up and working in JRuby, far sooner than I expected. This is turning out to be a really good week.

JRuby Progress Updates: JRuby on Rails, IRB, and the Future

We've been very productive on JRuby over the past week. Progress is being made on many fronts, and I'm more excited than ever about JRuby's potential. Here are a few updates on Rails, IRB, and JRuby's future for those of you following along. It's a long post, but each subsection stands on its own.

JRuby on Rails

I've continued my work getting the Rails "generate" script to run with JRuby. The major hurdle we were coping with last time was parsing the database.yml file. First, a bit of background.

YAML, as most of you will know, is a markup language (or rather, "YAML Ain't Markup Language") used prominently in Ruby applications for configuration files and for some types of object persistence. The author of Ruby's YAML support, _why, originally wrote YAML parsers and libraries in the language of the target platform; in other words, the original YAML parser was pure Ruby code, written to use Ruby's compiler-compiler library RACC. Now Ruby's troubles with performance are fairly well-documented, but these issues were considerably more pronounced when doing such an intensive process as parsing a large YAML file. _why's solution was to write a new C library for parsing YAML called "syck". Syck did two things: first, it sped up YAML parsing considerably and allowed many languages to use the same parser via language plugin mechanisms; and second, it eliminated both the need for and the availability of a pure Ruby YAML parser.

Enter JRuby. With YAML now being parsed almost exclusively using the C-based syck library, we have been forced to use an older version of the RACC-based pure Ruby parser. When starting work on Rails (and really, when playing with making RubyGems work) the complexity of the YAML parser brought out some problems in JRuby. That was a couple weeks ago.

Almost all of those problems have now been solved.

Our StringIO library (again, Ruby uses C code for StringIO to improve performance...see a pattern forming?) had been tested using all available test cases, but unfortunately those test cases did not cover the simple cases. When yaml.rb (the pure Ruby YAML parser we're using) started to make heavy use of StringIO, failures showed up. Tom Enebo is currently working on fixing the last of those failures, writing more extensive test cases at the same time. However, yaml.rb also contains a "stripped-down" version of StringIO for its own use. During my continued testing, I have been forcing it to use that version while Tom completes his fixes.

Other problems ranged from interpreter bugs (variable scoping, throw/catch not working, etc.) to parser bugs (JRuby's parser did not take into account an "eval" called from within a block, which causes variables to be handled a bit differently). Those issues are now sufficiently resolved that YAML no longer shows failures.

So back to Rails. Where the "generate" script originally got to the point of parsing database.yml and blew up, it now successfully parses that file and continues on. The next task in the "initialize_database" step of the initializer is to actually instantiate an ActiveRecord adapter based on database.yml. This is where the current failure lies, and where my attentions will be focused.
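As a rough illustration of what that step consumes, here is a sketch that parses a database.yml-style document with Ruby's standard YAML library and pulls out the adapter name. The file contents and the adapter_for helper are made up for the example; this is not the Rails initializer code:

```ruby
require "yaml"

# Hypothetical configuration; real database.yml files vary by application.
DATABASE_YML = <<-CONFIG
development:
  adapter: mysql
  database: myapp_development
  host: localhost
CONFIG

# Hypothetical helper: look up the adapter name for one environment.
def adapter_for(yaml_text, environment)
  config = YAML.load(yaml_text)
  config[environment]["adapter"]
end

puts adapter_for(DATABASE_YML, "development")
```

The adapter name is what decides which ActiveRecord adapter class has to be instantiated next.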

So to recap Rails progress, the "generate" script's call to the railties initializer successfully runs up to ActiveRecord instantiation, as well as successfully running a number of other initialize tasks. It's getting closer every day.

IRB

Oh, IRB, how we love thee. For those unfamiliar, IRB is the "interactive ruby" shell where you can enter in line-by-line Ruby code and immediately see results. Multi-line constructs like classes, methods, and modules are handled very elegantly, and therefore you can test some fairly complex bits of Ruby quickly and easily. It's a wonderful interactive environment for testing, learning, and experimenting with Ruby.

Unfortunately, it doesn't run under JRuby.

IRB is a very complicated beast. Running IRB results in almost every aspect of the underlying interpreter getting a good pounding; the parser is brutalized for parsing small snippets of code, the evaluator must translate that code into appropriate state changes, and any aspect of the Ruby language must be instantiable and callable interactively. Beyond even the Ruby aspects, IRB provides line-editing capabilities, tab completion, and command history features. Naturally, this presents many challenges for JRuby, and the ability to run IRB would be a huge demonstration of JRuby's maturity.

Running IRB under JRuby originally just blew up immediately; there were core bugs in the libraries and interpreter that prevented early stages of IRB's startup from completing successfully. Many of those issues were the same ones fixed for Rails' "generate" script, such as the parser/block issues and many interpreter bugs. Today I fixed another issue affecting both Rails and IRB, where throw/catch was not correctly passing back the symbol thrown. I was working on "generate", but remembered that I had stopped previous IRB work because of an apparent throw/catch problem.
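For readers less familiar with Ruby's throw/catch (which is flow control, not exception handling), a minimal sketch with a hypothetical helper. The thrown symbol selects the matching catch block, and the optional second argument becomes that block's return value:

```ruby
# Hypothetical helper for illustration only.
def first_even(numbers)
  catch(:found) do
    # throw unwinds straight out to the catch with the matching symbol.
    numbers.each { |n| throw :found, n if n.even? }
    nil   # reached only when nothing was thrown
  end
end

p first_even([1, 3, 4, 5])
p first_even([1, 3])
```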

So I took a break from Rails and attempted to start up IRB.

C:\rails>jruby C:\ruby\bin\irb
irb(#<IRB::WorkSpace:0x5a9c6e>):001:0>

To my amazement, IRB successfully started up. Although hopeful, I had always worried that there were core requirements of IRB that could never be satisfied by JRuby, and that even starting it up would never be possible. Seeing the IRB prompt come up successfully was a huge relief to me and an unexpected nugget of joy. I'm so glad it happened in the morning; I'll be glowing all day.

Now don't get me wrong. IRB still doesn't work right. I naturally proceeded to type in the beginning of a class definition, and IRB blew up immediately after hitting enter. I never expected the prompt would just start working, and the blow up doesn't temper my joy in any way. There's still more work to do, but this is a very exciting milestone in my book. I now believe without a doubt that we will get IRB to run. The implications of successfully running such a complicated script in JRuby are tremendous, and finally reaching this milestone has made my day.

The Future

Ahh the future. Such a magical time. If not for the promise of the future, what point would there be in writing software? Truly, my greatest motivation for rolling out of bed each day is the possible future I will be walking into.

My hopes for JRuby's future are starting to take shape.

Recently, I encountered more issues with JRuby's performance being a bit lacking. Actually, let's just say it: JRuby is really slow right now. A microbenchmark recently posted to the ruby-talk mailing list implemented a brute-force Sudoku-solving algorithm. The original poster compared Ruby's performance to native C code; where the C code took seconds to run, the Ruby version of the algorithm took over half a minute.

Again, Ruby's struggles with performance are widely known. It's also obvious that Ruby's creators and developers are aware of these issues, since many core libraries are implemented in C and since Ruby 2.0 will boast a new interpreter and Virtual Machine as well as many VM features comparable to those in Java and .NET's runtimes.

Naturally, curious how JRuby would perform on this benchmark--and with full awareness that JRuby's performance is far from spectacular--I ran it and waited for a result.

And waited. And waited.

After a few minutes, I killed the VM, assuming that there was something broken in JRuby that prevented the algorithm from terminating successfully. I did a bit of debugging, traced into the very depths of JRuby's evaluator, and found nothing. As far as I could tell, progress was being made and the algorithm was moving forward. My findings warranted another run.

JRuby took over 800 seconds to complete the benchmark, around 13.6 minutes.

I will admit the realization that JRuby is an order of magnitude slower than C Ruby came as a bit of a shock to me. There are many definitions of slow; Ruby's "slow" is for most purposes "fast enough". Java's "slow" is in most cases much faster than is required, and in some cases faster than native C code. JRuby's "slow", it would seem, is a different beast altogether.

However, I am reactionary. Such disappointment immediately sours my stomach and gives me a headache. Could I have been wrong about JRuby's potential? Will this never work?

Performance is a unique problem in JRuby. Since we do not have the option of running native C code for any libraries, and since reimplementing core features in pure Java is both time-consuming and not in the spirit of what we're trying to accomplish, performance concerns have taken a back seat to functionality, compatibility, and correctness. Performance problems are not easily isolated, and never easily solved. However...I love a challenge.

The redesign of JRuby's interpreter over the past several months has been focused on two things: enabling missing features like continuations and green threading; and providing a Java-friendly design that could more easily transition to optimized interpreters and eventually bytecode compilation. What I've essentially been doing amounts to painstaking refactoring of all JRuby's functional guts, from the AST-walking evaluator to the class and object implementations to the threading, framing, scoping, and call mechanisms. All these areas were originally written and designed based on Ruby 1.6 code; there were flashes of OO genius, but the mostly procedural approach of Ruby's C code shined bright throughout JRuby. As you might guess, this is certainly the easiest way to port a language interpreter to any platform: reimplement the same code in your target language of choice. As you might also guess, this does not generally take advantage of that target language's best features.

In JRuby's case, a major missing piece was the inability to longjmp, C's function for leaping from one call stack to another. longjmp is heavily used (understatement!) in Ruby for everything from threading to continuations to exception handling. Missing longjmp in Java presents a very large hole when porting Ruby's C code. Many creative attempts to mimic longjmp were therefore devised: exception-based flow control allowed loop keywords like 'next' to throw control back to a higher-level loop construct; a recursive evaluator repeatedly called itself for new AST nodes, ever-deepening the stack but always keeping lower nodes within the context of higher ones; exception-based "return sleds" allowed returns to bubble their results back up to the appropriate recipient; and on and on. Many of these approaches were extremely novel, worthy of their own papers and accolades. Indeed, several of them have shown up in academic papers and PhD theses in some form or another.

Unfortunately, these features still tried to mimic the way C code worked, which was never 100% achievable. longjmp is an extremely powerful tool that requires the capability to store, retrieve, and manipulate your own call stack. Java provides no such capability, and while exceptions do allow us to escape the stack--mimicking one aspect of longjmp--there is no ability to restore that stack. A new approach was needed.

Enter the JRuby redesign. In October of 2005, I began the process of unraveling JRuby's code with a number of design goals in mind:

The new interpreter must be iterative, rather than recursive, so escaping and restoring the stack are possible. This would enable continuations and green threading.
The JRuby code must be drastically cleaned up and simplified, and there must be a clear separation of concerns to allow future implementations of key subsystems.
JRuby must continue to work with no functional regression throughout this redesign.

The first two points are fairly straightforward. The new interpreter design enables us to provide all the required Ruby language features in a much more Java-friendly way. It also helps qualify JRuby as a real "VM", or at least a micro-VM layer on top of the JVM. I'm planning to start documenting this new design (since it has evolved over time and out of necessity), but it's fairly well-understood within the JRuby team.

The third goal, however, continues to be a serious pain-in-the-ass.

Ruby as a language and as a platform is poorly-specified. There is no conclusive specification; indeed the best spec is the incomplete (but still astounding) documentation provided by Dave Thomas's "Pickaxe" book, Programming Ruby. Given this lack, the only way a Ruby interpreter can be determined to conform to the Ruby Way is by actually running it. Primarily, this means unit tests.

The Rubicon project was spawned out of a set of unit tests Dave and the PragProg folks created while writing the first edition of "Pickaxe". It tested out many of the features and scriptlets demonstrated in the book, and provided a wide but fairly shallow set of test cases to exercise Ruby features. Rubicon today exists as the "rubytests" project on RubyForge, where it has languished in recent years. Nobody likes writing tests after-the-fact, and the value of such tests is dubious.

JRuby makes heavy use of Rubicon, as well as some of Ruby's and our own internal unit tests, to ensure compatibility and prevent regression. Anything not covered by those tests or by applications that run on JRuby remains an unknown, untested area until discovered by a new script or application. However, they're the best we've got right now. By implementing a Ruby that can run all or most of those tests as well as a few key applications, we can cobble together over time a pretty good Ruby. Current efforts to run more advanced applications like Rails or IRB are driven by the fact that those test cases do not exercise enough of JRuby to be conclusive, and the more we run the better we get.

When the redesign began, it was immediately apparent that without continually running those test cases and applications we would be diving down the rabbit hole with no insurance; refactoring an entire interpreter is obviously extremely dangerous without a language spec or appropriate unit tests. Goal #3 above became an absolute necessity.

As a result, after every major VM change these past months we have continued to run test cases and scripts to ensure that regressions are prevented or kept to a bare minimum. JRuby's codebase is not terribly large; a wholesale refactoring would not normally take months to complete. However with the added restriction that it must continue to work, the time-to-implementation increased tremendously. In addition, and of primary importance to performance, tradeoffs had to be made between "doing things right" and "doing things fast". Things had to get worse before they could get better.

JRuby's performance was no great shakes before the refactoring, but the 0.8.2 release appears to be as much as 30% faster than the current HEAD version in some scenarios. While such a decrease in speed is worrisome, it comes with the fact that the new VM will enable performance-enhancing optimizations in ways the original never could.

My interest in those enhancements was revitalized by the poor benchmark results. Perhaps one of the most important is JRuby's eventual ability to compile Ruby code into Java bytecode. After a long discussion with my good friend Kelly, I believe we have devised a way to make compilation happen without sacrificing goal #1 above. More on that in a future post.

I also started looking to isolate the performance problems. Immediately, I started looking at the redesigned interpreter engine. To make a long story short, the current interpreter has more overhead than the original because rather than recursing for additional nodes in the AST, it "trampolines" from one to the next. Each node encountered is associated with a number of instructions; those instructions are executed in sequence, allowing the Java call stack to remain at the same level and enabling the potential for continuations and green threading (since we can now step away from one instruction sequence and into another, effectively doing what longjmp does for C). This flexibility initially comes with decreased performance since the instruction fetch cycle, decoding that instruction into sub-instructions, maintaining a cursor within the AST and instruction sequence, and double-dispatching for each instruction all add overhead.
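To make the recursive-versus-iterative distinction concrete, here is a toy sketch in Ruby. It is not JRuby's interpreter (which is written in Java and is far more involved); the Node struct, the helper methods, and the tiny instruction set are all invented for illustration. The recursive version uses one host-language stack frame per AST level, while the iterative version keeps its pending work in an explicit array and never deepens the host stack:

```ruby
# Hypothetical node type for illustration; JRuby's real AST classes differ.
Node = Struct.new(:type, :value, :left, :right)

def num(v)
  Node.new(:num, v)
end

def add(l, r)
  Node.new(:add, nil, l, r)
end

def mul(l, r)
  Node.new(:mul, nil, l, r)
end

# Recursive evaluation: one host stack frame per level of the tree.
def eval_recursive(node)
  case node.type
  when :num then node.value
  when :add then eval_recursive(node.left) + eval_recursive(node.right)
  when :mul then eval_recursive(node.left) * eval_recursive(node.right)
  end
end

# Iterative evaluation: pending instructions live in an explicit array,
# so the host call stack stays at the same depth throughout.
def eval_iterative(root)
  pending = [[:visit, root]]
  values = []
  until pending.empty?
    op, arg = pending.pop
    case op
    when :visit
      if arg.type == :num
        values.push(arg.value)
      else
        # Schedule: evaluate left, then right, then combine the results.
        pending.push([:apply, arg.type])
        pending.push([:visit, arg.right])
        pending.push([:visit, arg.left])
      end
    when :apply
      right = values.pop
      left = values.pop
      values.push(arg == :add ? left + right : left * right)
    end
  end
  values.pop
end

tree = add(num(1), mul(num(2), num(3)))
puts eval_recursive(tree)
puts eval_iterative(tree)
```

Even in this toy the tradeoff shows up: the iterative loop allocates and dispatches an instruction pair for every step, which is extra work compared with a plain recursive call, but because all state sits in the pending and values arrays the evaluation could in principle be paused and resumed later.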

Small changes in the interpreter can have a drastic effect on performance, and so to put my mind at ease I went ahead with a couple optimizations I had put off. Specifically, I reworked the way flow-control, return values, and exception handling worked, reducing the number of calls and objects created. The results were very promising: a subset of the sudoku benchmark improved by roughly 9%. Since this small change only represented one tiny aspect of the interpreter, my fears have been temporarily put to rest.

Based on my reexamination of the interpreter and on the results of this small optimization, I do not believe that JRuby's performance issues will be a problem much longer. I'm also confident that we can begin improving performance rather than degrading it, since the current interpreter is only a few steps off from its eventual structure. Combining future interpreter optimizations with potentially compiling many or all pure Ruby methods to Java bytecode means we should see drastic improvements in the coming months. Will we ever run as fast as Ruby 1.8 or 2.0? Will we run faster? Time will tell.

Headius