Thursday, March 30, 2006

A Ruby in the Rough

Feeling More and More Ruby with Gems

Things are speedily moving along. We have been continuing our work on Rails, but because of an unusually active JRuby community, we recently changed direction for a bit. We decided to tackle RubyGems.

RubyGems, if you are unfamiliar with it, is a packaging system for bundling up and installing Ruby applications into your local Ruby distribution. It works with both file-based and name-based installs, so that you can either "gem install rails-x.x.x.gem" or "gem install rails" and it will fetch the appropriate archive for you. It manages dependencies and updates seamlessly. It basically does for Ruby what apt-get and rpm do for Linux.

The RubyGems code is not as complex as IRB or Rails, but it exercises many APIs those two do not. In our case, the biggest issue was Zlib.

Ruby of course has full Zlib support, with gzip, inflate, checksums, and a nice stream-like API. However, as with many other bit, byte, or math-intensive libraries, Ruby's Zlib is implemented in C code. This certainly speeds up the processing of compressed archives, but it presents a continuing problem for JRuby: we can't inherit *all* of Ruby's libraries just by copying over the lib/ruby/1.8 dir. In this case, luckily, we can do what Ruby does.

Java has included support for compressed archives since the 1.1 release, with streams for zip, gzip, and deflate compression and decompression. It also includes the same checksums Ruby includes, CRC32 and Adler32. It provides probably 90% of what we need in Ruby's Zlib library.
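For reference, here is a minimal sketch of the kind of Zlib calls a port has to support, written against the standard Ruby Zlib library. The sample text and variable names are made up for illustration, and this is not a list of what RubyGems actually calls.

```ruby
require 'zlib'
require 'stringio'

# Made-up sample data; repeated so the compression is visible.
text = "JRuby needs Zlib before RubyGems can unpack a gem. " * 20

# One-shot deflate and inflate round trip.
compressed = Zlib::Deflate.deflate(text)
restored   = Zlib::Inflate.inflate(compressed)

# The stream-like gzip API, writing into an in-memory buffer.
buffer = StringIO.new
gz = Zlib::GzipWriter.new(buffer)
gz.write(text)
gz.finish  # flushes the gzip trailer but leaves the StringIO open
gunzipped = Zlib::GzipReader.new(StringIO.new(buffer.string)).read

# The two checksums mentioned above.
crc   = Zlib.crc32(text)
adler = Zlib.adler32(text)

puts "original: #{text.bytesize} bytes, deflated: #{compressed.bytesize} bytes"
puts "crc32=#{crc} adler32=#{adler}"
```

Both round trips should hand back the original text, which makes a script like this a handy smoke test for any alternative Zlib implementation.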

The availability of a pure-Java implementation of a Ruby library certainly cuts down our workload. Since we already have significant capabilities for calling Java code from Ruby, there's usually only a bit of Ruby code required. However, everything takes time, and since Rails has been our primary focus, we had not spent much time getting Zlib implemented.

Enter our new friend Ola. Ola recently appeared on the jruby-devel mailing list, interested in helping out with JRuby and contributing fixes wherever possible. He also expressed an interest in making RubyGems work. We rattled off a couple items, with Zlib in the list explicitly because it was the major roadblock for RubyGems, and within about a day Ola returned with a preliminary Zlib implementation. I was personally very impressed.

Now it goes without saying that his implementation wasn't perfect the first time, but this really got us going. We switched gears for a few days and worked back and forth with Ola to get Zlib and RubyGems working more smoothly. I think I spent 12 hours on Saturday hacking through Zlib, YAML, and RubyGems code digging for failures. The end result was RubyGems's setup.rb install script running to completion, but it wasn't quite there. Ola came to the rescue to resolve a few more minor issues, Tom committed some additional fixes today, and although we didn't get RubyGems working in time for the 0.8.3 release, we did get a lot done.

At this point, we are able to successfully install RubyGems into the JRuby distribution directory. Huzzah!

Now of course you knew there would be caveats.

Java as a Platform

When running Ruby, there are a number of constants pre-set for you. Among these is RUBY_PLATFORM, which identifies the underlying platform on which Ruby is running. Ruby under Windows reports "mswin". Under Linux it reports "linux", unsurprisingly. For JRuby, it made sense to report "java", since we do not match any other platform exactly. Java is its own platform.

The availability of the RUBY_PLATFORM constant allows scripts to change their behavior depending on the platform. Since Windows doesn't directly support symbolic links, discovering a RUBY_PLATFORM of "mswin" means you shouldn't try to create them. Discovering a platform of "linux" means you probably shouldn't generate externally-consumed file paths using backslash characters. Given that even a cross-platform language and implementation must interact with its host environment, these cases make sense.
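A minimal sketch of that kind of platform check follows. The helper names symlinks_supported? and external_path are hypothetical and exist only for this example; they are not part of any Ruby library. The platform is passed in as an argument (defaulting to RUBY_PLATFORM) so the behavior can be tried with sample platform strings.

```ruby
# Hypothetical helper: decide whether to attempt symbolic links.
def symlinks_supported?(platform = RUBY_PLATFORM)
  platform !~ /mswin/
end

# Hypothetical helper: build a path for consumption outside of Ruby,
# using backslashes only when the platform string looks like Windows.
def external_path(parts, platform = RUBY_PLATFORM)
  separator = platform =~ /mswin/ ? "\\" : "/"
  parts.join(separator)
end

# Sample platform strings, chosen for illustration.
["i386-mswin32", "i686-linux", "java"].each do |platform|
  puts "#{platform}: symlinks=#{symlinks_supported?(platform)} " \
       "path=#{external_path(%w[usr lib ruby], platform)}"
end
```

Note that a platform string the helpers do not recognize simply falls through to the non-Windows branch.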

Unfortunately, Ruby's libraries only know about the platforms that the C implementation supports. Hence, there's a problem when running under JRuby. Since we report our platform as "java", any code that uses RUBY_PLATFORM to turn off unsupported features does not have any effect. The specific case I ran into was within the fileutils library.

fileutils provides a number of convenience methods for manipulating or querying files. There are methods like cp for copying files, uptodate? for checking whether a file is newer than a set of other files, and other similar functions. It is with cp that JRuby ran into problems.

fileutils' cp implementation copies one or more source files to a target file or directory. During this process, it also checks whether the target file is equivalent to the source file. This it does in one of two ways:

- On a UNIX variant, fileutils compares the device and inode for the file, using File::Stat#dev and #ino
- On a Windows variant, where device and inode have no real meaning, fileutils compares absolute paths

In order to know which method to use, RUBY_PLATFORM comes into play:

def fu_windows?
  /mswin|mingw|bccwin|wince|emx/ =~ RUBY_PLATFORM
end

The fu_windows? method returns a true value on a Windows platform or on any other listed platform where dev and ino will not be available. Notice that Java is not in that list.
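To make the two strategies concrete, here is a rough sketch of a same-file check that switches between them. This is not the actual fileutils source: same_file? and WINDOWS_LIKE are names invented for this example, and the platform is a parameter so both branches can be exercised on one machine.

```ruby
# Same pattern as fu_windows?, kept in a constant for this sketch.
WINDOWS_LIKE = /mswin|mingw|bccwin|wince|emx/

# Hypothetical helper: true when a and b refer to the same file.
def same_file?(a, b, platform = RUBY_PLATFORM)
  if WINDOWS_LIKE =~ platform
    # Device and inode are not meaningful here, so compare absolute paths.
    File.expand_path(a) == File.expand_path(b)
  else
    # On a UNIX variant, two paths name the same file when both the
    # device and the inode numbers match.
    stat_a = File.stat(a)
    stat_b = File.stat(b)
    stat_a.dev == stat_b.dev && stat_a.ino == stat_b.ino
  end
end
```

Calling same_file?(path, path, "java") takes the device and inode branch, since "java" does not match the pattern. That is exactly the branch a Java-backed File cannot serve well.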

Device and inode have even less meaning in Java, which abstracts away the concept of a "file" into the most generic definition possible. In order to keep Java code running the same on all platforms--without platform checks or code changes--Java's File must cater to the lowest common denominator. That's great for Java code, but for us poor suckers trying to get Ruby to run in Java, it's an issue. We need Ruby's libraries to understand that Java doesn't support the same things a UNIX variant does, and doesn't support other things that a Windows variant does. In short, we need Ruby to understand the Java platform's capabilities.

Now we aren't going to be able to solve that issue today. It would be a little unusual for Ruby to include a "java" platform wherever platform is being queried, and I don't doubt we'd get some pushback on that. We're kicking around possible solutions, ranging from implementing our own versions of those libraries that use RUBY_PLATFORM extensively to including modified Ruby libraries in the base JRuby distribution. In the short term, we chose to locally modify fileutils as follows:

def fu_windows?
  /mswin|mingw|bccwin|wince|emx|java/ =~ RUBY_PLATFORM
end

With this change, RubyGems is able to successfully install and copy all required files to the JRuby distribution dir. It's not perfect, but it's a start.
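A quick way to see what the extra alternative changes is to run both patterns against a few platform strings. In this sketch, path_comparison? is a hypothetical helper and the platform strings are samples picked for illustration.

```ruby
ORIGINAL = /mswin|mingw|bccwin|wince|emx/
PATCHED  = /mswin|mingw|bccwin|wince|emx|java/

# Hypothetical helper: =~ returns a match index or nil, so convert
# the result to a plain boolean for printing.
def path_comparison?(pattern, platform)
  !(pattern =~ platform).nil?
end

%w[i386-mswin32 i686-linux java].each do |platform|
  puts "#{platform}: original=#{path_comparison?(ORIGINAL, platform)} " \
       "patched=#{path_comparison?(PATCHED, platform)}"
end
```

Only the "java" line differs between the two patterns, so the patch leaves the behavior on the existing platforms alone.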

Installation != Working

Aside from the very simple case of installing the sources gem, RubyGems is not quite working correctly in JRuby. With RubyGems installed, previously-functional Rails scripts start breaking in new and exciting ways. Network-based gems do not install, owing to our bare-minimum Socket implementation and our lack of marshalling code for appropriate Ruby classes. File-based gems attempt to install, but get "stuck" somewhere during processing. All are frustrating issues, but we're very close.

Small Moves

Beyond getting RubyGems nearly working, it was a very productive week. A lot of the contributions for RubyGems will be helpful in many other areas, and we have continued work on Rails as well.

- We now have a working Zlib library.
- We have made more progress toward getting Rails to work, by running dispatch.rb/cgi and fixing as we go.
- We finally got out a new release after a long delay.
- We are fairly confident we'll be able to demo everything we want at JavaOne.

Things are looking great, and I'm very excited.