I've discovered a really awful bottleneck in REXML processing.
Look at these results for parsing our build.xml:
read content from stream, no DOM
2.592000 0.000000 2.592000 ( 2.592000)
1.326000 0.000000 1.326000 ( 1.326000)
0.853000 0.000000 0.853000 ( 0.853000)
0.620000 0.000000 0.620000 ( 0.620000)
0.471000 0.000000 0.471000 ( 0.471000)
read content once, no DOM
5.323000 0.000000 5.323000 ( 5.323000)
5.328000 0.000000 5.328000 ( 5.328000)
5.209000 0.000000 5.209000 ( 5.209000)
5.173000 0.000000 5.173000 ( 5.173000)
5.138000 0.000000 5.138000 ( 5.138000)
When reading from a stream, the content is read in in chunks, with each chunk being matched in turn. Because our current regexp engine uses char[] instead of byte[], each chunk must be decoded into UTF-16 characters, matched, and encoded back into UTF-8 bytes.
For small chunks, like those read off a stream, this decode/encode cycle is fairly quick. Here, the streamed numbers are pretty close to MRI. However, when an XML string is parsed from memory, the process goes like this:
- set buffer to entire string
- match against the buffer
- set buffer to post match (remainder of the string)
So what's the fix? Here's the same second benchmark using a StringIO object passed to the parser instead, with a simple change to rexml/source.rb:
Diff:
Index: lib/ruby/1.8/rexml/source.rb
===================================================================
--- lib/ruby/1.8/rexml/source.rb (revision 4596)
+++ lib/ruby/1.8/rexml/source.rb (working copy)
@@ -1,4 +1,5 @@
require 'rexml/encoding'
+require 'stringio'
module REXML
# Generates Source-s. USE THIS CLASS.
@@ -8,7 +9,7 @@
# @return a Source, or nil if a bad argument was given
def SourceFactory::create_from arg#, slurp=true
if arg.kind_of? String
- Source.new(arg)
+ IOSource.new(StringIO.new(arg))
elsif arg.respond_to? :read and
arg.respond_to? :readline and
arg.respond_to? :nil? and
New numbers:
read content once, no DOM
0.640000 0.000000 0.640000 ( 0.640000)
0.693000 0.000000 0.693000 ( 0.693000)
0.542000 0.000000 0.542000 ( 0.542000)
0.349000 0.000000 0.349000 ( 0.349000)
0.336000 0.000000 0.336000 ( 0.336000)
This is a perfect indication why JRuby's Rails performance is currently nowhere near what it will be. We continue to find these little gems...and there's no telling how many more are out there. With recent execution performance numbers looking extremely solid and recent Rails performance getting closer and closer, the upcoming 1.1 release ought to be amazing.