Saturday, June 10, 2006

Bringing RubyGems to JRuby OR The Zen of Slow-running Code

JRuby now supports RubyGems, and gems install correctly from both local and remote sources. That's a huge achievement, especially considering the extra work that was required around YAML to get to this point. However, I'll start off with the caveats this time.

JRuby is very slow to install gems. You'll see what I mean in a moment, but it's so slow that something's obviously broken. That's perhaps the good news. Even on a high-end box, it's so intolerably slow that there's got to be a key fault keeping the speed down. We believe there are a couple of reasons for it.

headius@opteron:~/rubygems-0.8.11$ time jruby gem install rails --include-dependencies
Attempting local installation of 'rails'
Local gem file not found: rails*.gem
Attempting remote installation of 'rails'
Updating Gem source index for:
Successfully installed rails-1.1.2
Successfully installed rake-0.7.1
Successfully installed activesupport-1.3.1
Successfully installed activerecord-1.14.2
Successfully installed actionpack-1.12.1
Successfully installed actionmailer-1.2.1
Successfully installed actionwebservice-1.1.2
Installing RDoc documentation for rake-0.7.1...
Installing RDoc documentation for activesupport-1.3.1...
Installing RDoc documentation for activerecord-1.14.2...
Installing RDoc documentation for actionpack-1.12.1...
Installing RDoc documentation for actionmailer-1.2.1...
Installing RDoc documentation for actionwebservice-1.1.2...

real 63m16.575s
user 55m5.939s
sys 0m25.547s

Ruby in Ruby

I'll tackle the simpler reason first: we still have a number of libraries implemented in Ruby code.

At various times, the following libraries have all been implemented in Ruby code within JRuby: zlib, yaml, stringio, strscan, socket...and others irrelevant to this discussion. This provided us a much faster way to implement those libraries, but owing to the sluggishness of JRuby's interpreter, this also meant these libraries were slower than we would like. C Ruby faces the same problem; many of these intensive libraries are implemented there in C code, with no Ruby code to be seen.

Some, such as zlib, yaml, and stringio, are on their way to becoming 100% Java implementations, but they're not all the way there yet. This is generally because Ruby code is so much simpler and shorter than Java code; completing the conversion to Java is painful in many ways.

Ola's work on the zlib and yaml libraries has been a tremendous help. He provided the first Ruby implementation of zlib, and has provided incremental improvements to it, generally by sliding it closer and closer to 100% Java. Ola also ported a fast YAML parser from Python, first to Ruby and now increasingly to Java, resulting in his RbYAML and JvYAML projects. Our old, original Ruby 1.6 yaml.rb parser was extremely slow. The new parsers have made YAML parsing many orders of magnitude faster. Tom and Ola have both worked to improve stringio. stringio provides an IO-like interface into a string, much like Java's StringBuffer/StringBuilder classes. In Ruby, understandably, this is implemented entirely in C. Our version, while slowly becoming 100% Java, is still quite a bit slower than it ought to be.
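
As a quick illustration of what "IO-like" means here, the following uses the standard stringio library as it ships with C Ruby. The sample text and the helper name first_line_and_rest are made up for the example.

```ruby
require 'stringio'

# Hypothetical helper: treats a plain string as a stream and reads from it
# with the same calls you would use on a File or a socket.
def first_line_and_rest(text)
  io = StringIO.new(text)
  first = io.gets   # reads up to and including the first newline
  rest  = io.read   # reads whatever remains in the string
  [first, rest]
end

first, rest = first_line_and_rest("zlib\nyaml\nstringio\n")
puts first
puts rest
```

Because code such as net/protocol only sees the IO interface, every one of these calls is on the hot path during a gem install.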

The continued porting of these hot-spot libraries from Ruby to Java will have perhaps the largest effect on gem install performance. However, there's another cause for alarm.

Fun with Threads

Threading in Ruby has a somewhat different feel from threading on most other languages and platforms. Ruby's threads are so easy to use and so lightweight that calling them "threads" is a bit misleading. They can be stopped, killed, and terminated in a variety of ways from outside themselves. They are trivial to launch: Thread.new { do something }. C Ruby also implements them as green threads, so no matter how many threads you spawn in Ruby, you're looking at a single processor thread to execute them. That means considerably less overhead, but practically no multi-core or multi-threading scalability at all. In short, Ruby's threads allow you to use and manipulate them in ways no other platform or language's threads allow while simultaneously giving you only a subset of typical threading benefits. For the sake of brevity, I will refer to them as rthreads for the rest of this article.
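
A minimal sketch of that "launch it, then stop it from outside" style, using only Thread.new, Thread#kill, Thread#join and Thread#alive?. The method name launch_and_kill and the sleep durations are assumptions chosen for the example.

```ruby
# Launches a thread with a block, then terminates it from the calling thread.
def launch_and_kill
  worker = Thread.new do
    sleep 60                     # pretend to be a long-running job
  end
  sleep 0.1                      # give the worker time to start and go to sleep
  alive_before = worker.alive?
  worker.kill                    # terminate it from another thread
  worker.join                    # wait until it has actually stopped
  [alive_before, worker.alive?]
end

before, after = launch_and_kill
puts "alive before kill: #{before}, alive after kill: #{after}"
```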

The increased flexibility of rthreads means that kicking off an asynchronous job is trivial. You can spin off many more threads than could be expected from native threading, using them for all manner of tasks where a parallel or asynchronous job is useful. The rthread is perhaps more friendly to users of the language than native threads: most of the typical benefits of threading are there without many of the gotchas. Because of this, I have always expected that Ruby code would use rthreading in ways that would horrify those of us with pure native threading. Therefore, I decided during my early redesign that supporting green threading--and even better, m:n threading--should be a priority. Our research into why gems are slow seems to have confirmed this is the right path.

RubyGems makes heavy use of the net/http package in Ruby. It provides a reasonably simple interface to connect, download, and manipulate http requests and responses. However, it shows its age in a few ways; other implementations of client-side http are around, and there are occasional calls to replace net/http as the standard.

net/http makes heavy use of net/protocol, a protocol-agnostic library for managing sockets and socket IO. It has various utilities for buffered IO and the like. It also makes use of Ruby's "timeout" library.

The timeout library allows you to specify that a given block of code should only execute for a given time. As you might guess, this requires the use of threading. However, you might be surprised how it works:

from lib/ruby/1.8/timeout.rb
  def timeout(sec, exception=Error)
    return yield if sec == nil or sec.zero?
    raise ThreadError, "timeout within critical session" if Thread.critical
    begin
      x = Thread.current
      y = Thread.start {
        sleep sec
        x.raise exception, "execution expired" if x.alive?
      }
      yield sec
      # return true
    ensure
      y.kill if y and y.alive?
    end
  end

It's fairly straightforward code. You provide a block and an optional timeout period. If you specify no timeout, just execute the block. If we're in a critical section (which prevents more than one thread from running), throw an error. Otherwise, start up a thread that sleeps for the timeout duration and execute the block. If the timeout thread wakes up before the block is complete, interrupt the working thread. Otherwise, kill the timeout thread and return.
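
From the caller's side, the behavior described above looks roughly like this. The sketch uses the standard Timeout module; the wrapper run_with_deadline and the :timed_out marker are hypothetical names introduced only for illustration.

```ruby
require 'timeout'

# Hypothetical wrapper: runs the block under a deadline and reports a marker
# value instead of letting Timeout::Error propagate.
def run_with_deadline(seconds)
  Timeout.timeout(seconds) { yield }
rescue Timeout::Error
  :timed_out
end

fast = run_with_deadline(1)   { :done }           # finishes inside the deadline
slow = run_with_deadline(0.1) { sleep 2; :done }  # interrupted while sleeping

puts "fast block: #{fast.inspect}"
puts "slow block: #{slow.inspect}"
```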

With rthreads, this is a fairly trivial operation. It gives Ruby's thread scheduler one extra task...starting up a lightweight thread and immediately putting it to sleep. Now it can be argued that this is a waste of resources, creating a thread every time you want to timeout a task. I would agree, since a single thread-local "timeout worker" would suffice, and would not require launching many threads. However, this sort of pattern is not unexpected with such a simple and consumable threading API. Unfortunately, it's a larger problem under JRuby.
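
To make the "timeout worker" idea concrete, here is one possible shape for it. This is a sketch under assumptions, not anything that exists in Ruby or JRuby: the class TimeoutWorker, its Expired error, and the 10ms polling interval are all invented for the example, and it uses one worker shared by all callers rather than a per-thread one.

```ruby
require 'thread'

# Hypothetical sketch: one long-lived watchdog thread serves every timeout
# call, instead of one new thread per call.
class TimeoutWorker
  class Expired < StandardError; end

  Entry = Struct.new(:deadline, :thread, :exception)

  def initialize(poll_interval = 0.01)
    @entries = []
    @lock = Mutex.new
    @watchdog = Thread.new do
      loop do
        sleep poll_interval
        expire_due_entries
      end
    end
  end

  # Runs the block; raises +exception+ in the caller if it runs past +sec+.
  def timeout(sec, exception = Expired)
    entry = Entry.new(Time.now + sec, Thread.current, exception)
    @lock.synchronize { @entries << entry }
    begin
      yield
    ensure
      # Unregister so the watchdog never interrupts a finished block.
      @lock.synchronize { @entries.delete_if { |e| e.equal?(entry) } }
    end
  end

  private

  # Interrupts every registered thread whose deadline has passed.
  def expire_due_entries
    now = Time.now
    @lock.synchronize do
      due, @entries = @entries.partition { |e| e.deadline <= now }
      due.each { |e| e.thread.raise(e.exception, "execution expired") }
    end
  end
end

worker = TimeoutWorker.new
puts worker.timeout(1) { :ok }.inspect
begin
  worker.timeout(0.05) { sleep 1 }
rescue TimeoutWorker::Expired => e
  puts "interrupted: #{e.message}"
end
```

Polling keeps the sketch short. A more careful version would probably sleep until the nearest deadline rather than wake up on a fixed interval.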

JRuby is still 1:1 rthread:native thread, which means the timeout code above launches a native thread for every timeout call. Obviously this is less than ideal. It becomes even less ideal when you examine more closely how timeout is used in net/protocol:

from lib/ruby/1.8/net/protocol.rb
    def read(len, dest = '', ignore_eof = false)
      LOG "reading #{len} bytes..."
      read_bytes = 0
      begin
        while read_bytes + @rbuf.size < len
          dest << (s = rbuf_consume(@rbuf.size))
          read_bytes += s.size
          rbuf_fill
        end
        dest << (s = rbuf_consume(len - read_bytes))
        read_bytes += s.size
      rescue EOFError
        raise unless ignore_eof
      end
      LOG "read #{read_bytes} bytes"
      dest
    end

    def rbuf_fill
      timeout(@read_timeout) {
        @rbuf << @io.sysread(1024)
      }
    end

For those of you not as familiar with Ruby code, let me translate. The read operation performs a buffered IO read, reading bytes into a buffer until the requested quantity can be returned. To do this, it calls rbuf_fill repeatedly to fill the buffer. rbuf_fill, in order to enforce a protocol timeout, uses the timeout method for each read of 1024 bytes from the stream.
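
The loop is easier to see in a stripped-down stand-in. The class below is not net/protocol: CountingReader is a hypothetical name, the timeout is left out, and a StringIO plays the part of the socket so the number of fills can be counted.

```ruby
require 'stringio'

# Hypothetical stand-in for the buffered reader: same loop shape as the
# read/rbuf_fill pair above, minus the timeout, plus a fill counter.
class CountingReader
  CHUNK = 1024

  attr_reader :fills

  def initialize(io)
    @io = io
    @rbuf = String.new
    @fills = 0
  end

  # Returns +len+ bytes, refilling the buffer as often as needed.
  def read(len)
    dest = String.new
    while dest.size + @rbuf.size < len
      dest << @rbuf.slice!(0, @rbuf.size)   # drain what is buffered so far
      rbuf_fill
    end
    dest << @rbuf.slice!(0, len - dest.size)
    dest
  end

  private

  def rbuf_fill
    @fills += 1
    @rbuf << @io.sysread(CHUNK)
  end
end

size = 500 * 1024
reader = CountingReader.new(StringIO.new("x" * size))
data = reader.read(size)
puts "read #{data.size} bytes using #{reader.fills} fills"
```

With a source that always hands back a full 1024 bytes, reading 500k takes 500 fills, and in the real code each of those fills goes through timeout.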

Here's where my defense of Ruby ends. Let's dissect this a bit.

First off, 1024 is nowhere near large enough. If I want to do a buffered read of a larger file (like oh, say, a gem) I will end up reading it in 1024-byte chunks. For a large file, that's hundreds or potentially thousands of read calls. What exactly is the purpose of buffering at this point?

Second, because of the timeout, I am now spawning a thread--however green--for every 1024 bytes coming off the stream. Because of the inefficiency of net/protocol and timeout, we have a substantial waste of time and resources.

Now translate that to JRuby. Much of JRuby is still implemented in Ruby, which means that some calls which are native in Ruby are much slower in JRuby today. Socket IO is in that category, so doing a read every 1024 bytes greatly increases the overhead of installing a gem. Perhaps worse, JRuby implements rthreads with native threads, resulting in a native thread spinning up for every 1024 bytes read. For a 500k file, that means we're reading 500 times and launching 500 timeout threads in the process. Not exactly efficient.
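
The thread count is easy to reproduce with an instrumented copy of the pattern. NaiveTimeout below is a hypothetical re-creation of the one-thread-per-call approach, written for this example, with the Thread.critical check dropped and a counter added. It is not the library code itself.

```ruby
# Hypothetical re-creation of the one-thread-per-call timeout pattern,
# instrumented to count how many watchdog threads it starts.
module NaiveTimeout
  class Error < StandardError; end

  @spawned = 0

  class << self
    attr_reader :spawned

    def timeout(sec, exception = Error)
      return yield if sec.nil? || sec.zero?
      caller_thread = Thread.current
      @spawned += 1
      watchdog = Thread.start do
        sleep sec
        caller_thread.raise exception, "execution expired" if caller_thread.alive?
      end
      begin
        yield
      ensure
        watchdog.kill if watchdog.alive?
      end
    end
  end
end

# One timeout call per 1024-byte chunk of a 500k download.
500.times { NaiveTimeout.timeout(5) { :chunk } }
puts "watchdog threads started: #{NaiveTimeout.spawned}"
```

Each of those 500 watchdogs is started and almost immediately killed. With a 1:1 mapping, every one of them is a native thread.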

We will likely try to submit a better timeout implementation, or a protocol implementation that reads in larger chunks (say 8k or 16k), but we have learned a valuable lesson here: rthreads allow for and sometimes make far easier threading scenarios we never would have attempted with native threads. For that reason, and because we'll certainly see this in other libraries and applications, we will continue down the m:n path.
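
A back-of-the-envelope comparison of those chunk sizes, assuming every sysread returns a full chunk (real sockets often return less). The helper reads_needed is a name made up for this sketch.

```ruby
# Number of sysread calls (and, with one timeout per read, watchdog threads)
# needed to pull +bytes+ in chunks of +chunk_size+.
def reads_needed(bytes, chunk_size)
  (bytes + chunk_size - 1) / chunk_size   # integer ceiling division
end

file_size = 500 * 1024
[1024, 8 * 1024, 16 * 1024].each do |chunk|
  puts "#{chunk}-byte chunks: #{reads_needed(file_size, chunk)} reads"
end
```

For the 500k file that works out to 500 reads at 1024 bytes, 63 at 8k, and 32 at 16k.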

Coolness Still Abounds

As always, despite these obstacles and landmines, we have arrived at a huge milestone in JRuby's development. RubyGems and Ruby go hand-in-hand like Java and jarfiles. The ability to install gems is perhaps the first step toward a really usable general-purpose Ruby implementation. Look for a release of JRuby--with a full complement of Ruby libraries and RubyGems preinstalled--sometime in the next week or two.