Wednesday, October 17, 2007

Another Performance Discovery: REXML

I've discovered a really awful bottleneck in REXML processing.

Look at these results for parsing our build.xml:

read content from stream, no DOM
2.592000 0.000000 2.592000 ( 2.592000)
1.326000 0.000000 1.326000 ( 1.326000)
0.853000 0.000000 0.853000 ( 0.853000)
0.620000 0.000000 0.620000 ( 0.620000)
0.471000 0.000000 0.471000 ( 0.471000)
read content once, no DOM
5.323000 0.000000 5.323000 ( 5.323000)
5.328000 0.000000 5.328000 ( 5.328000)
5.209000 0.000000 5.209000 ( 5.209000)
5.173000 0.000000 5.173000 ( 5.173000)
5.138000 0.000000 5.138000 ( 5.138000)

When reading from a stream, the content is read in in chunks, with each chunk being matched in turn. Because our current regexp engine uses char[] instead of byte[], each chunk must be decoded into UTF-16 characters, matched, and encoded back into UTF-8 bytes.

For small chunks, like those read off a stream, this decode/encode cycle is fairly quick. Here, the streamed numbers are pretty close to MRI. However, when an XML string is parsed from memory, the process goes like this:
  1. set buffer to entire string
  2. match against the buffer
  3. set buffer to post match (remainder of the string)
Now this is obviously a little inefficient, since it creates a lot of extra strings, but a copy-on-write String implementation helps a lot. However in our case it also means that we decode/encode the entire remaining XML string for every element match. For any nontrivial file, this is *terrible* overhead.

So what's the fix? Here's the same second benchmark using a StringIO object passed to the parser instead, with a simple change to rexml/source.rb:

Index: lib/ruby/1.8/rexml/source.rb
--- lib/ruby/1.8/rexml/source.rb (revision 4596)
+++ lib/ruby/1.8/rexml/source.rb (working copy)
@@ -1,4 +1,5 @@
require 'rexml/encoding'
+require 'stringio'

module REXML
# Generates Source-s. USE THIS CLASS.
@@ -8,7 +9,7 @@
# @return a Source, or nil if a bad argument was given
def SourceFactory::create_from arg#, slurp=true
if arg.kind_of? String
elsif arg.respond_to? :read and
arg.respond_to? :readline and
arg.respond_to? :nil? and

New numbers:
read content once, no DOM
0.640000 0.000000 0.640000 ( 0.640000)
0.693000 0.000000 0.693000 ( 0.693000)
0.542000 0.000000 0.542000 ( 0.542000)
0.349000 0.000000 0.349000 ( 0.349000)
0.336000 0.000000 0.336000 ( 0.336000)

This is a perfect indication why JRuby's Rails performance is currently nowhere near what it will be. We continue to find these little gems...and there's no telling how many more are out there. With recent execution performance numbers looking extremely solid and recent Rails performance getting closer and closer, the upcoming 1.1 release ought to be amazing.


Anonymous said...

Nice work with JRuby. You and your fellow committers are doing a great job.

What's the best guess on the arrival timeframe for the Oniguruma port?

Charles Oliver Nutter said...

anonymous: Marcin has said it will be done for the final 1.1 release, but perhaps not before RubyConf in early November.

gnupate said...

Awesome work by you and the rest of the JRuby crew Charles. I think you're really showing the benefit of a cleaner code base, and a willingness to dig into performance issues. Keep it up!

Jim Deville said...

Awesome job. I'm looking forward to playing with JRuby more after my current project is released.

Anonymous said...

Was this change applied to the Jruby 1.1.1 code base. It does not seem to be in that code base.