Thursday, December 23, 2010

Improved JRuby Startup by Deferring Gem Plugins

Another present for you JRubyists out there!

JRuby has had notoriously bad startup times. Not as bad as, say, IronRuby (sorry guys!), but definitely a big fat hit every time you need to run some Ruby code from the command line. Some of this overhead was related to JRuby, and we've steadily worked to improve that over the years. Some of it is due to the JVM, most commonly due to running on the "server" Hotspot VM or another JVM that does not have an interpreter (both of which start up considerably slower than Hotspot/OpenJDK's "client" mode). I've blogged tips and tricks for JRuby startup before, and these mostly apply to vanilla JRuby startup performance.

However, a large part of the overhead was not specifically due to JRuby or the JVM, but to RubyGems. RubyGems in version 1.3 added support for "plugins", whereby gems could include a specially-named file to extend the functionality of RubyGems itself. Most of these plugins added command-line tools like "gem push" for pushing a new gem to gemcutter.org (now built-in for pushing to rubygems.org). Unfortunately, the feature was originally added by having RubyGems do a full scan of all installed gems on every startup. If you only had a few gems, this was a minor problem. If you had more than a few, it became a big fat O(N) problem, where each of those N gems could be arbitrarily complex in itself.
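To make the O(N) cost concrete, here is a rough sketch of what an eager plugin scan looks like. This is not the RubyGems code. The helper names (build_fake_gem_home, scan_for_plugins) and the fake directory layout are made up for illustration, and I'm assuming the plugin file is named rubygems_plugin.rb and sits in each gem's lib directory.

```ruby
require 'tmpdir'
require 'fileutils'

# Hypothetical helper: build a fake gem home containing `count` gems,
# where every tenth gem ships a plugin file.
def build_fake_gem_home(root, count)
  count.times do |i|
    lib = File.join(root, "gem_#{i}-1.0.0", "lib")
    FileUtils.mkdir_p(lib)
    FileUtils.touch(File.join(lib, "rubygems_plugin.rb")) if (i % 10).zero?
  end
end

# Hypothetical eager scan: one filesystem probe per installed gem,
# whether or not that gem actually has a plugin.
def scan_for_plugins(root)
  probes = 0
  plugins = []
  Dir.glob(File.join(root, "*")).sort.each do |gem_dir|
    probes += 1
    candidate = File.join(gem_dir, "lib", "rubygems_plugin.rb")
    plugins << candidate if File.file?(candidate)
  end
  [plugins, probes]
end

Dir.mktmpdir do |root|
  build_fake_gem_home(root, 50)
  plugins, probes = scan_for_plugins(root)
  puts "probed #{probes} gems, found #{plugins.size} plugins"
end
```

The point is that the number of probes tracks the number of installed gems, not the number of plugins, so every Ruby process pays for every gem you have ever installed.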

The good news is that my proposed change (making plugin scanning happen *only* when using the "gem" command) appears likely to be approved for RubyGems 1.4, due out reasonably soon.
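The idea behind the change can be sketched in a few lines. This is a toy model, not the actual patch: MiniGems and CommandRunner are hypothetical stand-ins for RubyGems and the "gem" command's entry point, and the scan itself is reduced to a counter.

```ruby
# Hypothetical stand-in for RubyGems; not the real Gem module.
module MiniGems
  @plugins_loaded = false
  @scan_count = 0

  class << self
    attr_reader :scan_count

    def plugins_loaded?
      @plugins_loaded
    end

    # Stand-in for the expensive scan of all installed gems.
    # Runs at most once, and only when something asks for it.
    def load_plugins
      return if @plugins_loaded
      @scan_count += 1
      @plugins_loaded = true
    end
  end

  # Stand-in for the "gem" command entry point: the only place
  # that triggers plugin loading.
  class CommandRunner
    def run(args)
      MiniGems.load_plugins
      "ran: gem #{args.join(' ')}"
    end
  end
end

# Simply loading the library does no scanning...
puts MiniGems.plugins_loaded?

# ...but running a command does.
puts MiniGems::CommandRunner.new.run(%w[list])
puts MiniGems.plugins_loaded?
```

A plain require of the library never touches the plugin scan; only code that goes through the command runner pays for it.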

Here's the patch; its impact on RubyGems startup times is shown below. The first two timings are without the patch, the first of them against a "cold" filesystem. The final timing is with the patch in place. In all cases, the timings are against my local JRuby working copy, which has around 500 gems installed.

~/projects/jruby ➔ jruby -e "t = Time.now; require 'rubygems'; puts Time.now - t"
17.09

~/projects/jruby ➔ jruby -e "t = Time.now; require 'rubygems'; puts Time.now - t"
6.959

~/projects/jruby ➔ git stash pop
# On branch master
# Changed but not updated:
# (use "git add <file>..." to update what will be committed)
# (use "git checkout -- <file>..." to discard changes in working directory)
#
# modified: lib/ruby/site_ruby/1.8/rubygems.rb
# modified: lib/ruby/site_ruby/1.8/rubygems/gem_runner.rb
...

~/projects/jruby ➔ jruby -e "t = Time.now; require 'rubygems'; puts Time.now - t"
0.481


It's truly a shocking difference, and it's easy to see why JRuby (plus RubyGems) has had such a bad startup-time reputation.

I've already made this change locally to JRuby's copy of RubyGems, which should help any users working against JRuby master. The change will almost certainly ship in JRuby 1.6, with RCs showing up in the next couple weeks. So with this change and my JRuby startup tips, we're on the road to a much more pleasant JRuby experience.

Happy Hacking!

5 comments:

Colby Gutierrez-Kraybill said...

Thanks for all of your hard work!

sahglie said...

Sah weet! The JRuby team is awesome!

Mark Menard said...

That's really awesome Charles. Thanks!

Clayton O'Neill said...

Is this patch against 1.4? I can't get it to apply to the rubygems that jruby 1.5.5 ships with. It's missing the Gem.load_plugins call.

John Woodell said...

This is why we use bundler08 on AppEngine. The standalone flag should be part of the next bundler release. It will stub out rubygems and generate an environment file that puts all the gems your Gemfile yields on the LOAD_PATH.