I've discovered a really awful bottleneck in REXML processing.
Look at these results for parsing our build.xml:
read content from stream, no DOM
2.592000 0.000000 2.592000 ( 2.592000)
1.326000 0.000000 1.326000 ( 1.326000)
0.853000 0.000000 0.853000 ( 0.853000)
0.620000 0.000000 0.620000 ( 0.620000)
0.471000 0.000000 0.471000 ( 0.471000)
read content once, no DOM
5.323000 0.000000 5.323000 ( 5.323000)
5.328000 0.000000 5.328000 ( 5.328000)
5.209000 0.000000 5.209000 ( 5.209000)
5.173000 0.000000 5.173000 ( 5.173000)
5.138000 0.000000 5.138000 ( 5.138000)
When reading from a stream, the content is read in in chunks, with each chunk being matched in turn. Because our current regexp engine uses char[] instead of byte[], each chunk must be decoded into UTF-16 characters, matched, and encoded back into UTF-8 bytes.
For small chunks, like those read off a stream, this decode/encode cycle is fairly quick. Here, the streamed numbers are pretty close to MRI. However, when an XML string is parsed from memory, the process goes like this:
- set buffer to entire string
- match against the buffer
- set buffer to post match (remainder of the string)
So what's the fix? Here's the same second benchmark using a StringIO object passed to the parser instead, with a simple change to rexml/source.rb:
Diff:
Index: lib/ruby/1.8/rexml/source.rb
===================================================================
--- lib/ruby/1.8/rexml/source.rb (revision 4596)
+++ lib/ruby/1.8/rexml/source.rb (working copy)
@@ -1,4 +1,5 @@
require 'rexml/encoding'
+require 'stringio'
module REXML
# Generates Source-s. USE THIS CLASS.
@@ -8,7 +9,7 @@
# @return a Source, or nil if a bad argument was given
def SourceFactory::create_from arg#, slurp=true
if arg.kind_of? String
- Source.new(arg)
+ IOSource.new(StringIO.new(arg))
elsif arg.respond_to? :read and
arg.respond_to? :readline and
arg.respond_to? :nil? and
New numbers:
read content once, no DOM
0.640000 0.000000 0.640000 ( 0.640000)
0.693000 0.000000 0.693000 ( 0.693000)
0.542000 0.000000 0.542000 ( 0.542000)
0.349000 0.000000 0.349000 ( 0.349000)
0.336000 0.000000 0.336000 ( 0.336000)
This is a perfect indication why JRuby's Rails performance is currently nowhere near what it will be. We continue to find these little gems...and there's no telling how many more are out there. With recent execution performance numbers looking extremely solid and recent Rails performance getting closer and closer, the upcoming 1.1 release ought to be amazing.
5 comments:
Nice work with JRuby. You and your fellow committers are doing a great job.
What's the best guess on the arrival timeframe for the Oniguruma port?
anonymous: Marcin has said it will be done for the final 1.1 release, but perhaps not before RubyConf in early November.
Awesome work by you and the rest of the JRuby crew Charles. I think you're really showing the benefit of a cleaner code base, and a willingness to dig into performance issues. Keep it up!
Awesome job. I'm looking forward to playing with JRuby more after my current project is released.
Was this change applied to the Jruby 1.1.1 code base. It does not seem to be in that code base.
Post a Comment