Tuesday, November 27, 2007

Java 6 Port for OS X (Tiger and Leopard)

I just stumbled across this little gem today:

Landon Fuller's JDK 6 Port for OS X

Who Landon Fuller is I don't know. But I find it incredibly impressive that he's managed to get the base JDK 6 ported to OS X and working. Talk about showing the value of an open-source JDK...Landon Fuller FTW. Apple, are you hiring? Perhaps this guy can kick the Apple JDK process in the ass.

So naturally when I'm confronted with a final JDK 6 release for OS X one thing immediately springs to mind: performance.

We'd always suspected that the early preview version of JDK 6 on OS X was not showing us the true awesome performance we could expect from a final version.

We were right.

So I'll give you two sets of numbers, one that's specious and unreliable and the other that's a more real-world test.

fib numbers

Yes, good old fib. A constant in benchmarking. It shows practically nothing, and yet people use it to demonstrate perf. And in the case of Ruby 1.9, they've specifically optimized for integer-math-heavy benchmarks like this.
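For anyone who hasn't seen it, the benchmark is roughly the following. This is a sketch rather than the exact script behind the numbers here: the argument to fib and the run count are assumptions, picked so it finishes quickly. Each printed line is user, system, total and (real) time in seconds, which is the format of the results below.

```ruby
require 'benchmark'

# Naive doubly recursive Fibonacci: almost nothing but method dispatch
# and small-integer math.
def fib(n)
  n < 2 ? n : fib(n - 1) + fib(n - 2)
end

# Assumptions for this sketch, not the values used for the numbers in the post.
FIB_ARG = 25
RUNS = 10

RUNS.times do
  # Benchmark.measure returns a Benchmark::Tms; printing it shows
  # user, system, total and (real) times.
  puts Benchmark.measure { fib(FIB_ARG) }
end
```

Repeating the run several times matters on the JVM, since the first iteration includes warm-up before the JIT kicks in.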

Ruby 1.9:

0.400000 0.000000 0.400000 ( 0.413737)
0.420000 0.010000 0.430000 ( 0.421622)
0.400000 0.000000 0.400000 ( 0.411591)
0.410000 0.000000 0.410000 ( 0.411593)
0.400000 0.000000 0.400000 ( 0.410080)
0.410000 0.000000 0.410000 ( 0.408836)
0.400000 0.000000 0.400000 ( 0.408572)
0.410000 0.000000 0.410000 ( 0.408114)
0.400000 0.000000 0.400000 ( 0.410374)
0.400000 0.000000 0.400000 ( 0.413096)

Very nice numbers, especially considering Ruby 1.8 benchmarks at about 1.7s on my system. Ruby 1.9 contains several optimizations for integer math, including the use of "tagged integers" for Fixnum values (saving object costs) and fast math opcodes in the 1.9 bytecode specification (avoiding method dispatch). JRuby does neither of these, representing Fixnums as a normal Java object containing a wrapped Long and dispatching as normal for all numeric operations.
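If "tagged integers" is new to you, here's a rough sketch of the idea in plain Ruby arithmetic. It assumes a single low tag bit, which is how MRI marks Fixnums; the helper names are made up for illustration and are not part of any Ruby API.

```ruby
# Hypothetical helpers illustrating a one-bit tag scheme.

# Encode integer n as a tagged machine word: shift left, set the low bit.
def tag_fixnum(n)
  (n << 1) | 1
end

# Decode: an arithmetic right shift drops the tag and keeps the sign.
def untag_fixnum(word)
  word >> 1
end

# A word with the low bit set is an immediate integer, not a pointer
# to a heap object, so no allocation is needed.
def tagged_fixnum?(word)
  (word & 1) == 1
end

# Largest value that fits when one bit is the tag and one is the sign.
def fixnum_max(word_bits)
  (1 << (word_bits - 2)) - 1
end

puts tag_fixnum(5)                 # 11
puts untag_fixnum(tag_fixnum(-7))  # -7
puts fixnum_max(32)                # 1073741823
```

The point is that the value lives inside the reference itself, so there is no object to allocate or dereference. A boxed representation, like the one JRuby uses, pays for a real object on each new value.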

JRuby trunk:
0.783000 0.000000 0.783000 ( 0.783000)
0.510000 0.000000 0.510000 ( 0.510000)
0.510000 0.000000 0.510000 ( 0.510000)
0.506000 0.000000 0.506000 ( 0.506000)
0.505000 0.000000 0.505000 ( 0.504000)
0.507000 0.000000 0.507000 ( 0.507000)
0.510000 0.000000 0.510000 ( 0.510000)
0.507000 0.000000 0.507000 ( 0.507000)
0.508000 0.000000 0.508000 ( 0.508000)
0.510000 0.000000 0.510000 ( 0.510000)

This is improved from numbers in the 0.68s range under the Apple JDK 6 preview. Pretty damn hot, if you ask me. I love being able to sit back and do nothing while performance numbers improve. It's a nice change from 16-hour days.

Anyway, back to performance. JRuby also supports an experimental frameless execution mode that omits allocating and initializing per-call frame information. In Ruby, frames are used for such things as holding the current method visibility, the current "self", the arguments and block passed to a method, and so on. But in many cases, it's safe to omit the frame entirely. I haven't got it running 100% safely in JRuby yet, and probably won't before 1.1 final comes out...but it's on the horizon. So then...numbers.
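To give a feel for what gets skipped, here's a toy model of framed versus frameless calls. It is purely illustrative: the Frame struct, ToyInterpreter and its methods are hypothetical names for this sketch and are not JRuby's internal classes.

```ruby
# Toy model only. The fields mirror the kinds of things a frame holds.
Frame = Struct.new(:self_obj, :visibility, :args, :block)

class ToyInterpreter
  attr_reader :frames_allocated

  def initialize
    @stack = []
    @frames_allocated = 0
  end

  # Framed call: allocate and push a Frame, run the body, then pop it.
  def call_with_frame(receiver, args, &body)
    @stack.push(Frame.new(receiver, :public, args, nil))
    @frames_allocated += 1
    body.call(*args)
  ensure
    @stack.pop
  end

  # Frameless call: run the body directly, with no per-call allocation.
  # Only safe when the body never needs anything a frame would hold.
  def call_frameless(args, &body)
    body.call(*args)
  end
end

interp = ToyInterpreter.new
puts interp.call_with_frame(nil, [2, 3]) { |a, b| a + b }
puts interp.call_frameless([2, 3]) { |a, b| a + b }
puts interp.frames_allocated
```

A method like fib never looks at visibility, blocks or anything else a frame would hold, so it's a natural candidate for the frameless path.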

JRuby trunk, frameless execution:
0.627000 0.000000 0.627000 ( 0.627000)
0.409000 0.000000 0.409000 ( 0.409000)
0.401000 0.000000 0.401000 ( 0.401000)
0.402000 0.000000 0.402000 ( 0.402000)
0.403000 0.000000 0.403000 ( 0.403000)
0.403000 0.000000 0.403000 ( 0.403000)
0.404000 0.000000 0.404000 ( 0.405000)
0.401000 0.000000 0.401000 ( 0.401000)
0.403000 0.000000 0.403000 ( 0.403000)
0.405000 0.000000 0.405000 ( 0.405000)

Hello hello? What do we have here? JRuby actually executing fib faster than an optimized Ruby 1.9? Can it truly be?

Pardon my snarkiness, but we never thought we'd be able to match Ruby 1.9's integer math performance without seriously stripping down Fixnum and introducing fast math operations into the compiler. I guess we were wrong.

M. Ed Borasky's MatrixBenchmark

I like Borasky's matrix benchmark because it's a non-trivial piece of code, and pulls in a Ruby standard library (matrix.rb) as well. It basically inverts a matrix of a particular size and multiplies the original by the inverse. I show here numbers for a 64x64 matrix, since it's long enough to show the true benefit of JRuby but short enough I don't get bored waiting.
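The shape of the test is roughly this. It's a sketch in the spirit of the benchmark, not Borasky's actual script: the method names are mine, the Rational entries are an assumption (exact arithmetic is the easy way to get a true identity check on a Hilbert matrix), and the dimension is kept small so it runs quickly.

```ruby
require 'matrix'
require 'benchmark'

# Build an n x n Hilbert matrix: H[i][j] = 1 / (i + j + 1), 0-based.
# Rational entries (an assumption) keep the arithmetic exact.
def hilbert(n)
  Matrix.build(n) { |i, j| Rational(1, i + j + 1) }
end

# Invert the matrix, multiply the original by the inverse, and compare
# the product with the identity matrix.
def hilbert_times_inverse_identity?(n)
  h = hilbert(n)
  (h * h.inverse) == Matrix.identity(n)
end

DIMENSION = 8 # assumption: small for a quick run; the post uses 64

result = nil
timing = Benchmark.measure do
  result = hilbert_times_inverse_identity?(DIMENSION)
end
puts "Hilbert matrix of dimension #{DIMENSION} times its inverse = identity? #{result}"
puts timing
```

Nearly all of the time goes into matrix.rb itself, so this exercises ordinary Ruby library code rather than a single hot method.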

Ruby 1.9:
Hilbert matrix of dimension 64 times its inverse = identity? true
21.630000 0.110000 21.740000 ( 21.879126)

JRuby trunk:
Hilbert matrix of dimension 64 times its inverse = identity? true
14.780000 0.000000 14.780000 ( 14.780000)

This is down from 16-17s under the Apple JDK 6 preview, and about 32% less wall-clock time than Ruby 1.9 (14.78s versus 21.88s).

So what have we learned today?
  • Sun's JDK 6 provides frigging awesome performance
  • Apple users are crippled without a JDK 6 port. Apple, I hope you're paying attention.
  • Landon Fuller is my hero of the week. I know Landon will just point at the excellent work to port JDK 6 to FreeBSD and OpenBSD...but give yourself some credit, you did what none of the other Leopard whiners did.
  • JRuby rocks

Note: You have to be a Java Research License licensee to legally download the binary or source versions of Landon's port. That or complain to Apple about some dude making a working port before they did. Landon mentions on his blog that he plans to contribute this work to OpenJDK soon...which would quickly result in a buildable GPLed JDK for OS X. Awesome.

18 comments:

Antonio Cangiano said...

Very impressive results, Charles. I have a gut feeling that JRuby will become the implementation of choice for quite a few developers.

Jim said...

Good news for all Leopard users :)

Niels Bech Nielsen said...

If you're at JavaPolis on December 12, you might be interested in this:

http://www.javapolis.com/confluence/display/JP07/Meetups

I don't know if it's for real, but..

Attila Szegedi said...

Googling for the guy's name reveals this entry from 2002:

http://conferences.oreillynet.com/cs/os2002/view/e_spkr/1386

saying "Landon Fuller is an engineer in Apple's BSD Technology Group and one of the primary architects of the Darwin ports system."

So, maybe Apple doesn't need to hire him; they did it already :-)

Attila Szegedi said...

Actually, it also looks like he left Apple since.

murphee said...

I'm confused about your remark about tagged integers in 1.9: I thought Ruby 1.8.x already used tagged integers to represent Fixnums... isn't that the reason why Fixnums max out at 29 bit?

Karl von Laudermann said...

I think you're jumping the gun a bit in your excitement. I visited the page you linked to, and noticed two major omissions:

1. It's for Intel Macs only. And based on an earlier post a little farther down the page, it seems that PPC support hasn't been started and will be a long time coming. This means that it doesn't exist for my machine, which is only 2.5 years old.*

2. If you look at the TODO list, you'll see that a few things are missing, the largest of which is that Swing support is provided via X11, rather than natively. And again, this is identified as a major undertaking. You might not care about Swing, with your fib benchmarks and Rails and such, but access to Swing from Ruby is literally the only reason I have any interest in JRuby!

*Why is it that all of these open source Mac projects are targeting Intel first and PPC later or not at all? PPC binaries run on Intel Macs due to Rosetta. Therefore logically if you're going to target one architecture, you should choose PPC so that it will run on Macs of both architectures.

Joe said...

Karl said:
*Why is it that all of these open source Mac projects are targeting Intel first and PPC later or not at all? PPC binaries run on Intel Macs due to Rosetta. Therefore logically if you're going to target one architecture, you should choose PPC so that it will run on Macs of both architectures.

Hmmm. I wonder why anyone would start with the present and future architecture instead of the one that once was the wave of the future? Especially if there's a major performance hit on Intel when it's only available under Rosetta...

Maybe we need someone to produce Attesor, so the native Intel code can run on the PPC.

PS. I have two PPC machines and only one Intel. Guess which one I use most...

Anonymous said...

He's the good fellow who provided patches to some of the MOAB (Month of Apple Bugs)...

Karl von Laudermann said...

Joe:

It's a choice between taking a performance hit on the newer architecture vs. not running *at all* on the older architecture. The older architecture being not that old; as I said, I bought my computer only 2.5 years ago, which was a month or two before Jobs even *announced* the switch to Intel, let alone finished the conversion across all Mac models. There are still plenty of PowerPC Macs in use today.

And this is assuming that the developer has already decided to only support one architecture in the first place, instead of going Universal early on. Admittedly, I don't know what's involved in creating a Universal binary if you're not using Obj-C/Cocoa/XCode. (If you are, it only requires checking a "Make Universal" checkbox in XCode, at least according to Jobs). So it may be a sensible decision in the short term, for volunteer-driven open-source projects.

Koz said...

Apple users are crippled without a JDK 6 port. Apple, I hope you're paying attention.

Not to sound like too much of a troll, but Sun could also make this happen. I realise that Apple wanted their own JVM for 'better OS integration', but they've abandoned that route now. Sun clearly has the talent and knowledge to do it....

Charles Oliver Nutter said...

koz: I don't speak for Sun on this, but my understanding is that it's largely a resource issue. Maintaining JDK releases for Solaris, Windows, and Linux takes up a *lot* of effort.

Albert Strasheim said...

I suspect that somewhere down the line someone could pull together the work from Landon Fuller and Gary Benson to deliver JDK 6 on Mac OS X PPC.

http://gbenson.livejournal.com/

AndersM said...

Karl said:
*Why is it that all of these open source Mac projects are targeting Intel first and PPC later or not at all? PPC binaries run on Intel Macs due to Rosetta. Therefore logically if you're going to target one architecture, you should choose PPC so that it will run on Macs of both architectures.

I dunno about "all these open source projects", but the reason for this port being limited to Intel Macs is obvious: Sun's JVM contains some very close-to-metal SPARC and x86 code. A nontrivial amount is assembly or assembly-generating C++, and some of the low-level code is very subtle and heavily optimized. I've read some of this code as part of a research project on Java performance, and I wouldn't want to be the one who had to port it to a different CPU architecture. =)

David Koontz said...

> murphee said...

> I'm confused about your remark about tagged integers in 1.9: I thought Ruby 1.8.x already used tagged integers to represent Fixnums... isn't that the reason why Fixnums max out at 29 bit?

Fixnums max out at the 29th bit, which is actually 30 bits of data (1073741823). Or is that what you were saying? In which case I'll just shut up now.

An interesting observation is that JRuby is a bit different in the cutoff from Fixnum to Bignum as it uses a long so its cutoff is (9223372036854775807) or 63 bits.

Scott said...

Hi Charles,
I met you briefly at RailsConf back in May and you quickly got me interested in JRuby. I'm consistently impressed with how much it has advanced just since then. You and the rest of the team are doing really great work. Keep it up!

murphee said...

@david koontz:
Hm... I guess what I meant was just this:
(2**29).class => Fixnum
(2**30).class => Bignum

30 bits of data makes sense, since Ruby's tags take up 2 bits.