Headius

Saturday, January 06, 2007

Five Things About Me

Tor, you sneaky devil. You tagged me before anyone else had a chance. You grabbed the brass ring. Kudos.

So to continue the "5 Things" meme (for the record, I really hate the word "meme"), I present for you five things you probably don't know about me. Actually, some of you will know some of these facts, but I doubt any of you will know them all. I've tried to pick the most quirky or interesting bits out of my otherwise humdrum life.

Some time in 1998, I became the lead developer on the LiteStep project. LiteStep was a very popular replacement for the Explorer desktop shell on Windows during the late 90s. It provided a new taskbar, desktop window, NeXT-like dock, and pluggable UI and theming system. For hardcore users tired of the boring Explorer UI, it was the state of the art.

Originally created by a fellow named Francis Gastellu, it had by 1998 grown rather quiet. At the time, the codebase was silently fading away, with none of the original developers still working on the project and few active developers interested in or able to make a large time commitment to get LiteStep going again. I discovered LiteStep and was attracted by its ability to replace the entire desktop Look & Feel of my Windows machines. I had also been an avid Win32 developer, releasing the shareware program "Hack-It" to some minimal financial success. However the LiteStep code was in really rough shape.

Almost all the logic was packed into a single large C file that controlled the main desktop window. All the other modules were heavily dependent on this one piece of code, which ultimately crippled LiteStep's ability to incorporate certain types of UI plugins into a user's desktop. I tackled the problem in two ways:
1. I started converting the core plugins to C++ pure virtual classes and implementations, to allow for a more componentized system
2. And I reworked all the critical functionality from the desktop module into a central runtime, allowing all other modules to finally remove their desktop dependencies
Over the next year, LiteStep started to grab the attention of the desktop theming community once again. "Skinning" in general really took off during this time, with the launch of new shells GeoShell, DarkStep, and others. An article published in Wired (for which I was interviewed but not quoted) detailed this new movement.

Sadly, with the release of theming capabilities in Windows XP, the rise of Linux desktops, and the rebirth of Macintosh with OS X, LiteStep has long since fallen from grace. But to this day I still have the odd person walk up to me and thank me for my efforts during that time. LiteStep, we barely knew ye.

I suppose an addendum to this item is that for many years I wrote at least as much Win32 C++ code as I did Java, and I still have the programming guides to prove it. How's that for diversity?

I do not remember a time in my life I was not in front of a computer. The first computing experience I can remember was programming and playing with BASIC on my Atari 400, writing little games and buying programming books containing short apps I could type in...carefully...one finger at a time. I remember saving my programs to the Atari cassette tape drive and praying, praying, praying it would actually take. I remember dialing up to text-based information services at 300bps over an acoustic coupler. In third grade, a mentor came to my elementary to teach me to program in Apple BASIC, though I never owned an Apple computer until my current MacBook Pro.

Throughout gradeschool and highschool, my primary interests lie with computers. I ran a BBS called "Terminal Nightmare" (clever, eh?) for which I toiled many hours creating ANSI graphics and advertising on more popular boards. I brought C programming manuals to school in 8th grade to read during slow periods. I wrote C and assembler code on embedded processors for my dad's electronics design ventures in 9th grade. And so on and so forth. I've been a computer geek as long as I can remember, and I've never had a problem with that.

Toward the end of highschool I started thinking about degree programs. I initially started my post-secondary education in Organic Chemistry, and completed the first two years of requirements. But I hated labs. Some time during the second year, I discovered that there was something called a "Computer Science" degree. Oh, hell yes. From then on I never performed another titration or chromatograph, and I couln't be happier.

When I am not programming (which is extremely rare) I am an enthusiast of complete-information strategy games. I have spent some amount of time reading about and studying Go, which is my favorite game. I enjoy playing various Shogi variants (including Shogi, Chu Shogi, Tenjiku Shogi, and Tori Shogi), though I don't claim to be good at any of them. I will play Xiang Qi, but it's not one of my favorites, and I have not learned any particularly good strategies. I also play Chess, having been taught by my father at an early age.

Occasionally me and a few local friends will get together and play these games until the wee hours of the morning. Some people have LAN parties; we have strategy gaming parties. We most frequently play Bughouse when we can find four people and two clocks, but we often just get together to play the above games one-on-one.

And by "complete-information" games, I mean those in which there is no element of chance. I do not enjoy dice games, and I will play card games only if present company prefers such games. My opinion is that if I lose a game, I would much rather it be due to my own ineptitude than due to random chance.

I was one of the best fight-game players in local arcades in the late 1990s. Oddly enough, I was never drawn to Street Fighter, but I spent literally thousands of dollars over the years getting good at the Mortal Kombat and Killer Instinct series of games from Midway. My friend and I would generally spend most weekend nights at arcades, usually playing for minimum cost against players short on skill but long on quarters. We got quite good.

I was also pretty heavily addicted to those games. During my first two years at the University of Minnesota, I generally skipped class to play. There was such a rush from getting a higher combo, or beating a new player who tried to represent. I also made many friends in those arcades whose names I never knew and whom I have never seen since...but there was a bond among us gamers.

When I had the means, I began to collect arcade machines. Unfortunately, the means ran dry after only a few purchases, but I've been happy to have them. I own the following arcade machines, stowed in my basement and occasionally played:
- Ultimate Mortal Kombat 3 (in an old Atari Rampart cabinet)
- Killer Instinct 2 (in a KI1 cabinet; the sound ROMs are corrupt, so they need a refresh)
- Killer Instinct 1 (original cabinet; not functional at the moment)
- Mortal Kombat 2 (board only)
- Mortal Kombat 1 (board only)
- Teenage Mutant Ninja Turtles (in an old Taito cabinet with no volume control so it's freaking loud)
- Asteroids (yes, the original, in great working condition; however it's in a Lunar Lander cabinet, of which only a few thousand were ever made. Definitely the gem of the collection).
I also own the hollowed-out remnants of an old Gun Fight cabinet. I intended to restore it, but the side art and wood were in very poor shape. It's rotting in the garage.

I'd love to have a Ms PacMan, Q*Bert, or Tron machine. Unfortunately, so would the rest of the world.

I write and eat left-handed, though I prefer my right hand for almost everything else. Unfortunately, like most lefties, this means I can't use writing utensils that may smear or smudge. You lefties know what I'm talking about: the dreaded "pencil hand" you get from dragging your hand through what you've just written. In junior high I finally got tired of having to wash pencil lead off my hand every day, and for several years I utilized a novel solution:

I wrote backwards.

Friday, January 05, 2007

Ruby Breaks TIOBE Top Ten; Declared Language of the Year

The headline says it all, really!

The TIOBE Programming Community Index measures language popularity based on "the world-wide availability of skilled engineers, courses and third party vendors" using the major search engines. It's not not a terribly scientific way to measure popularity, but I'm not sure anyone has a better index.

Ruby has been moving up every month during 2006 and for the first time has broken the top ten in January 2007. TIOBE also declared it the "Programming Language of 2006", which comes as no surprise to us Rubyists who love the language so much.

Congratulations, Ruby!

Thursday, January 04, 2007

New JRuby Compiler: Progress Updates

I've been cranking away on the new compiler. I'm a bit tired and planning to get some sleep, but I've gotten the following working:

all three kinds of calls
local variables
string, fixnum, array literals
'def' for simple methods and arg lists
closures

Now that last item comes with a big caveat: I have no way to pass closures I create. The code is compiling, basically as a Closure class that you initialize with local variables and invoke. But since block management in JRuby is still heavily dependent on ThreadContext nonsense, there's no easy way to pass it to a given method. So the next step to getting closures to work in the compiler is to start passing them on the call path as a parameter, much like we do for ThreadContext.

I've managed to keep the compiler fairly well isolated between node walking and bytecode generation, though the bytecode generator impl I have currently is getting a little large and cumbersome. It's commented to death, but it's pushing 900 LOC. It needs some heavy refactoring. However, it's behind a fairly straightforward interface, so the node-walking code doesn't ever see the ugliness. I believe it will be much easier to maintain, and it's certainly easier to follow.

In general, things are moving along well. I'm skipping edge cases for some nodes at the moment to get bulk code compiling. There's a potential that as this fills out more and handles compiling more code, it could start to be wired in as a JIT. Since it can fail gracefully if it can't compile an AST, we'd just drop back to interpreted mode in those cases.

So that's it.

...

Ok, ok, here's performance numbers. Twist my arm why don't you.

(best times only)

The new method dispatch benchmark tests 100M calls to a simple no-arg method that returns 'self', in this case Fixnum#to_i. The first part of the test is a control run that just does 100M local variable lookups.

method dispatch, control (var access only):
 interpreted, client VM: 1.433
 interpreted, server VM: 1.429
 ruby 1.8.5: 0.552
 compiled, client VM: 0.093
 compiled, server VM: 0.056

Much better. The compiler handles local var lookups using an array, rather than going through ThreadContext to get a DynamicScope object. Much faster, and HotSpot hits it pretty hard. At worst it takes about 0.223s, so it's faster than Ruby even before HotSpot gets ahold of it. The second part of the test just adds in the method calls.

method dispatch, test (with method calls):
 interpreted, client VM: 5.109
 interpreted, server VM: 3.876
 ruby 1.8.5: 1.294
 compiled, client VM: 3.167
 compiled, server VM: 1.932

Better than interpreted, but slow method lookup and dispatch is still getting in the way. Once we find a single fast way to dynamic dispatch I think this number will improve a lot.

So then, on to the good old fib tests.

recursive fib:
 interpreted, client VM: 6.902
 interpreted, server VM: 5.426
 ruby 1.8.5: 1.696
 compiled, client VM: 3.721
 compiled, server VM: 2.463

Looking a lot better, and showing more improvement over interpreted than the previous version of the compiler. It's not as fast as Ruby, but with the client VM it's under 2x and with the server VM it's in the 1.5x range. Our heavyweight Fixnum and method dispatch issues are to blame for the remaining performance trouble.

iterative fib:
 interpreted, client VM: 17.865
 interpreted, server VM: 13.284
 ruby 1.8.5: 17.317
 compiled, client VM: 17.549
 compiled, server VM: 12.215

Finally the compiler shows some improvement over the interpreted version for this benchmark! Of course this one's been faster than Ruby in server mode for quite a while, and it's more a test of Java's BigInteger support than anything else, but it's a fun one to try.

All the benchmarks are available in test/bench/compiler, and you can just run them directly. If you like, you can open them up and see how to use the compiler yourself; it's pretty easy. I will be continuing to work on this after I get some sleep, but any feedback is welcome.

Wednesday, January 03, 2007

InvokeDynamic: Actually Useful?

Over time I've become less convinced that hotswappable classes would be an absolute requirement for the proposed invokedynamic bytecode to be useful, and more convinced that there's a number of ways a dynamic language like Ruby or Groovy could utilize the new bytecode. This post gives a little background on invokedynamic and attempts to summarize a few ideas off the top of my head.

Many folks, myself included, have long held that the proposed invokedynamic bytecode would only be useful if coupled with hotswappable classes. Hotswapping is the mechanism by which we could alter class structure after definition and have existing instances of the class pick up those changes. It's true this would be required if we were to compile Ruby all the way to bytecode; since Ruby classes are always open, we need the ability to add and remove methods without destroying already-created instances. The argument goes that if invokedynamic requires a dynamically-invoked method to exist on a target receiver's type, then we would only ever be able to invokedynamic against compiled Ruby code if we could continue to alter those types when classes get re-opened.

I do believe that hotswapping would be useful, but it's fraught with many really difficult problems. To begin with, there's Java's security model, whereby a class that's been loaded into the system *can not* be modified in most typical security contexts. The JVM does have the ability to replace existing method definitions at runtime, but that's generally reserved for debugging purposes, and it doesn't allow adding or removing methods. It also does not currently have the ability to wholesale remove and replace a class that has live instances, and it's an open research question to even consider the ramifications of allowing such a thing.

So what are the alternatives? Gilad Bracha proposed having the ability to attach methods dynamically to a given static class at runtime. This would perhaps be similar to the CLR's "dynamic methods". This idea perhaps has more merit...one issue not addressed by hotswappable classes is that even once we compile Ruby to bytecode, it's still dynamic and duck-typed. Would all methods accept Object and return Object? Is that useful? By specifically stating that some methods are dynamic and mutable (in the case of a Ruby class, likely all methods we've compiled), you effectively create the equivalent of hotswapping without breaking existing static types and their security semantics.

But this is all research that could and perhaps should occur outside invokedynamic, and it all may or may not be related. So then, can invokedynamic be useful with these class-structure questions unanswered? What does invokedynamic mean?

To me, invokedynamic means the ability to invoke a method without statically binding to a specific type, and perhaps additionally without specifying static types for the parameter list. For those that don't know, when generating method-call bytecodes for the JVM, you must always provide two things in addition to the method name: the class within which the method you're invoking lives and the precise parameter list of the method you want to call. And there's not much wiggle room there; if you're off on the target type or if the receiver you're calling against has not yet been cast to (or been determined to match) that type, kaboom. If your parameter list doesn't match one on the target type, kaboom. If your parameters haven't been confirmed as being compatible with that signature, kaboom. Perhaps you can see, then, why writing a compiler for the JVM is such a complicated affair.

So there's potential for invokedynamic to make even static compilation easier. Without the need to specify all those types, we can defer that compile-time magic to the VM, if we so choose. We don't have to dig around for the exact signature we want or the exact target type. Given a receiver object, a method name, and a bundle of parameter objects, invokedynamic should "do the right thing."

Now we start to see where this could be useful. Any dynamic language on the JVM is going to be most interesting in the context of the platform's available libraries. Ruby is great on its own, and there's certainly an entire (potentially large) market segment that's interested in JRuby purely as an alternative Ruby runtime. But the larger market, and the more intriguing application of JRuby, is as a language to tie the thousands of available Java libraries together. And that requires calling Java code from Ruby and Ruby code from Java with as little complexity and overhead as possible.

Enter invokedynamic.

Now I've only recently started to see how invokedynamic could really be useful even without dynamic methods or hotswappable classes, so this list is bound to grow. I'd love to have all three features, of course, but here's a few areas that invokedynamic alone would be useful:

Our native implementations of Ruby methods can't really be tied to a specific concrete class, since we have to be able to rewire them at runtime if they're redefined. If invokedynamic came along with a mechanism for doing a Java-based "method_missing", whereby we could intercept dynamic calls to a given object and dispatch in our own way, we could make use of the bytecode without having hot-swappable classes.
It would also aid compilation and code generation. In my work on the prototype compiler, one of the biggest stumbling blocks is making sure I'm binding method calls to the appropriate target type. I must make sure the receiver of a method has been casted to the type I intend to bind to or Java complains about it. If there were a way to just say invokedynamic, omitting the target type, it would make compilation far simpler; and I don't believe HotSpot would have to do any additional work to make it fast, since it already has optimizations under the covers that are fairly type-agnostic.
To a lesser extent, invokedynamic could push the smarts of determining appropriate method signatures onto the VM. I would supply a series of parameters and a method name, and tell the VM to invokedynamic. The VM, in turn, would look at the params and name and select an appropriate method from the receiving object. This is in essence all that's needed for real duck typing to work.

This last item calls out a perhaps surprising area that invokedynamic would be very useful: invoking Java code from a dynamic language.

When calling Java code from Ruby, for example, all we really have to work with are two details: a method name and potentially an arity. We can do some inference based on the actual types of parameters, but there's a lot of magic and a number of heuristics involved. If there were a JVM-native mechanism for calling arbitrary methods on a given object, without having to statically bind to those methods, it would eliminate much of our Java integration layer.

All told, I think invokedynamic would definitely be much more than a PR stunt, as some have claimed. It would eliminate one of the most difficult barriers to generating JVM bytecodes by allowing arbitrary method calls that aren't necessarily bound to specific types. I for one would vote yes, and I plan to throw my weight behind making invokedynamic do everything I need it to do...with or without hotswapping.

Tuesday, January 02, 2007

Groovy 1.0 is Released!

Congratulations to the Groovy team on their release of Groovy 1.0! Groovy is another dynamic language for the JVM inspired by features in Smalltalk, Python, and of course Ruby. It's been a long time coming, and a lot of hard work involved, but Groovy 1.0 is finally here.

See the announcement from Guillaume Laforge, one of the Groovy team members.

Here's hoping there's a bright future of cooperation between the Groovy team and the other dynamic languages for the JVM.

Monday, January 01, 2007

Welcome Nick Sieger to the JRuby Team

The team has grown again! After I asked the JRuby community to nominate a new team member, based on past code, mailing list, documentation, or other contributions, a number of folks thought Nick Sieger would be a good addition. And we agreed.

Nick is the original author of the ActiveRecord-JDBC connector, and has done a lot of work wiring JRuby up with NanoContainer. He's been an active member of the mailing lists and you've probably all read his blog at some point...if only for his excellent summary posts from RubyConf 2006. Even better, Nick hails from the Minneapolis area like Tom and I, and we attend the same Ruby user group meetings with the Ruby Users of Minnesota.

We also expect Nick will bring his familiarity with Maven 2 and his professional experience leading both Java and Ruby-based projects. He's a good developer and a good leader to add to the team.

Hopefully this will also serve as a reminder that JRuby is a true Open Source project, and anyone with Ruby and/or Java experience can easily start helping out. The team and the community continue to grow, as does Ruby's potential on the JVM.

Welcome to the team, Nick!

Wednesday, December 27, 2006

Making Dynamic Invocation Fast: An Idea?

Evan Phoenix (of Rubinius fame) were discussing dynamic dispatch today on #rubinius, sharing our caching strategies and our dispatch woes. We talked at length about various strategies for speeding dispatch, cache invalidation mechanisms, compilation techniques, and so on. All gloriously fun stuff.

So at one point I related to him the extended version of my plans for speeding up dynamic dispatch. I relate it now to you to hopefully spur some discussion. There are no original ideas anymore, right? Or are there?

The initial experiment with Fixnum basically was the static version of my fuzzy vision. During parse time, a very trivial static table mapped method names to numbers. I only handled three method names, +, -, and <. Then those numbers were passed to Fixnum during method dispatch, where a very trivial static switch statement used them to "fast dispatch" to the appropriate methods.

The ultimate vision, however, is much grander.

The general idea is that during dynamic dispatch, we want to use the method name and the class as keys to locate a specific method instance. The typical way this is done is by keeping a hashtable on every class, and looking up methods by name. This works reasonably well, but each class must have its own table and we must pay some hashtable cost for every search. This is compounded when we consider class hierarchies, where the method we're looking for might be in a super class or a super class's super class, ad infinatum.

Retrieving the class is the easy part; every method we want to dispatch will have a receiver, and receivers have a reference to their class.

So then, how do we perform this magic mapping from M method names and N classes to a given method? Well hell, that's a matrix!

So then the great idea: build a giant matrix mapping method names (method IDs) and classes (class IDs) to the appropriate switchable value. Then when we dispatch to the class, we use that switchable value in a neatly dense switch statement.

Let me repeat that in another way to make it clear...the values in the matrix are simple "int" values, ideally as closely-numbered (dense) as possible for a given class's set of methods. In order to dispatch to a named method on a given object, we have the following steps:

Get the method ID from the method (likely calculated at parse or compile time)
Get the class ID from the class (calculated sequentially at runtime)
Index into the matrix using method ID and class ID to get the class-specific method switch value
Use the switch value to dispatch (via a switch statement, of course) to the appropriate method in the target class

Ignoring the size of this matrix for a moment, there are a number of benefits to this data structure:

We have no need for per-class or inline method caches
Repopulating the switch table for a given class's methods is just a matter of refilling its column in the matrix with appropriate values
Since the matrix represents all classes, we can avoid searching hierarchies once parents' methods are known; we just represent those methods in the matrix under the child's column and fast-dispatch to them as normal
We can save off and reconstitute the mapping to avoid having to regenerate it

...and there's also the obvious benefit: using the switch values stored in the matrix, we can do a fast dispatch on the target object with only the minimal cost of looking up a value in the matrix. An additional benefit for C code would be simply storing the address of the appropriate code in the matrix, avoiding any switch values completely.

Now naturally we're talking about a big matrix here, I'll admit that. A fellow by the name of "Defiler" piped up and said an incomplete app of his showed 1147 unique method names across 1258 classes. That works out to a matrix with 1.4M elements. In the Java world, where everything is 32 bits, that's around 6MB of memory consumed just for method mapping. Acceptable? Perhaps. But what about using a sparse matrix?

A sparse matrix attempts to be as efficient as possible when there are large numbers of "zeros" in the matrix elements. That is, interestingly enough, exactly what we have here. The vast majority of entries in our method-map would contain a zero, since most methods would not exist on most classes. Zero, as it turns out, would neatly map to our good friend "method_missing", and so dispatching to a method on a target class that doesn't implement it would continue to work as it does today. Even better, method_missing dispatch would be just as fast as a regular method dispatch, excluding the actual method_missing implementation code, of course.

So then, there it is...a potential plan for fast dynamic method invocation in JRuby (and perhaps in Rubinius, if Evan agrees the idea has merit). Any of you out there heard of such a thing? Is this an old idea? Are there problems with it? Share!

Holiday Fun: Interpretation, Loading, Dispatching Work

This week (what's left of it) I'm spending on performance again. It's officially a holiday for Sun, so I'm not technically on the clock. I could even sleep the entire week and pick things back up on Tuesday.

Yeah, right.

One of the oddities of working on JRuby while working at Sun stems from the fact that I was working on JRuby for fun for the past two years. Now that this is the full-time job, it's even harder to pull myself away from it. I thought perhaps making JRuby my job might take away some of the attraction. In reality, it's just made a good thing better, since I can spend long hours on the hard problems I would never have tackled before.

So instead of stopping work completely, I'm shifting gears a bit. Instead of the heavy Rails focus we've had over the past month, I'm hitting other "fun" stuff instead. Compilation, interpretation performance, and so on. We know there's still lots of fruit to be plucked from the performance tree, and of course any additional work/research on compilation will help that eventual goal.

Compiler Refactoring Underway

As far as compilation goes, I've started committing a refactored compiler; it separates AST-node-walking from code generation, so the backend could be swapped out with a YARV generator or future compiler revisions. I think that's needed to allow the compiler and the AST to evolve independently, since I believe there will be many possible compile targets and potentially many AST changes in the future. Nothing too fancy there, but it's a bit more readable so hopefully others can also contribute.

Interpreter Enhancements and Fixes

On the interpretation front, there are a few items I've been working on.

1. Speeding up method invocation and block management (committed)

Ruby's AST represents method calls that take a block by putting the block first in the AST. You encounter an "iter node" that points at the "call" with which it is associated. So the typical way you evaluate those nodes is to create the block object and then visit the call. Unfortunately, the call itself may lead to other calls with their own blocks, for evaluating the arguments or receiver.

The end result of this node ordering is that every time a method is called, the evaluation of its args and receiver has to juggle the current "block available" status around, so the block doesn't get consumed before it's needed. Because the block gets created early, we have to push it and the "block available" status onto a stack. This is the primary reason the block logic in the interpreter and ThreadContext is so complicated.

An example would help illustrate what's happening. For the following code:

foo("hello") { 1 }

The parser generates the following hierarchy of AST nodes:

IterNode[]        <= this is the block
 NewlineNode[]
  FixnumNode[] 1  <= this is the Fixnum 1 inside the block
 FCallNode[] foo  <= this is the call to foo
  ArrayNode[]: {StrNode[]}
   StrNode[]"hello"

The node associated with the block is encountered first, so we construct the block then. We move on to the "foo" call, and the block is consumed. No problem, right? However, here's a more complicated example that illustrates the trouble with this AST ordering:

foo { 1 }.bar { 2 }.baz(hello { 3 }) { 4 }

Here things are a bit more interesting. The parser produces the following output:

IterNode[]            <= the "4" block
 NewlineNode[]
  FixnumNode[] 4
 CallNode[] baz
  IterNode[]          <= the "2" block
   NewlineNode[]
    FixnumNode[] 2
  CallNode[] bar
   IterNode[]         <= the "1" block
    NewlineNode[]
     FixnumNode[] 1
   FCallNode[] foo
 ArrayNode[]: {IterNode[]}
  IterNode[]          <= the "3" block
   NewlineNode[]
    FixnumNode[] 3
  FCallNode[] hello

So what actually happens here? First the "4" block is encountered, instantiated, and pushed onto the block stack. Then we proceed to the "baz" call. Unfortunately, the "baz" call has both a receiver and arguments, so we have to hide the "4" block to prevent it being consumed. We proceed to evaluate the receiver for "baz", encountering the "2" block. The "2" block is instantiated and pushed down, and we move on to the "bar" call. The "bar" call has another receiver; we evaluate that, encountering the "1" block and the "foo" call it's associated with. The "foo" call consumes the "1" block and returns a receiver for "bar". The "bar" call consumes the "2" block and returns a receiver for "baz". Now the "baz" call has to evaluate its arguments, so the "3" block is created and consumed by the "hello" call. Finally, with a receiver and args, we can call "baz" and consume the "4" block.

Confused? Me too. Why would you order it this way? Perhaps it's to ease parsing, or perhaps there's some reason I don't know. However, I believe the following ordering is much simpler (and I know it makes interpretation easier):

(extraneous nodes omitted)

CallNode[] baz
 CallNode[] bar       <= the receiver for "baz", a call to "bar"
  FCallNode[] foo     <= the receiver for "bar", a call to "foo"
   IterNode[]         <= the "1" block associated with "foo"
  IterNode[]          <= the "2" block associated with "bar"
 ArrayNode[]          <= args to "baz"
  FCallNode[] hello   <= the call to "hello"
   IterNode[]         <= ...and its "3" block
 IterNode[]           <= finally the "4" block

The advantages here should be obvious. We encounter the blocks in order, so there's no stack juggling involved. Because there's no stack juggling, we don't have to "hide" blocks as we evaluate receivers and arguments. Finally, because we know we'll only encounter blocks for methods that require them, there's no additional overhead for methods that don't need blocks. It's a good change, and I would love to understand why Ruby uses the more complicated AST structure instead of this.

I have already committed a change to reorder the way these AST nodes are handled. The AST itself is unchanged, but the visit to a given block (IterNode) just sets that node into the associated call and proceeds. The calls themselves are now responsible for creating an associated block (if necessary)...*after* receiver and args have been dealt with. This means two things: calls that don't accept blocks don't pay any block-manipulation penalty; and calls with peripheral blocks (for finding args or receivers) don't pay any block-manipulation penalty either. Only the calls that need blocks have to deal with them.

Eventually this will become an AST change, but this short-term fix resolves 90% of the interpreter goofiness right now. This change will also eventually mean the iter stack (and potentially the block stack) disappear too. Huzzah!

2. LoadService fixes, enhancements, and optimizations (committed)

LoadService cleanup and improvements are well under way. LoadService is responsible for "load" and "require" calls and does all the searching for files and management of loaded extensions and libraries. Unfortunately the existing heuristic was both broken and terribly inefficient.

Ruby's 'load' behavior is easy enough...just look for the exact file and execute it in the current runtime. Ruby's 'require' however has a bit more magic to it.

If you specify a full filename to 'require' it will use that filename to load either a source file or an extension, depending on whether you specify ".rb" or ".[so|o|dll|etc..]". If you do not specify an extension, it will search for .rb, .so, etc in turn until it finds something, If it finds nothing, that's a load error. Here's the "ri" doc for MRI's "require":

--------------------------------------------------------- Kernel#require
  require(string)    => true or false
------------------------------------------------------------------------
  Ruby tries to load the library named _string_, returning +true+ if
  successful. If the filename does not resolve to an absolute path,
  it will be searched for in the directories listed in +$:+. If the
  file has the extension ``.rb'', it is loaded as a source file; if
  the extension is ``.so'', ``.o'', or ``.dll'', or whatever the
  default shared library extension is on the current platform, Ruby
  loads the shared library as a Ruby extension. Otherwise, Ruby tries
  adding ``.rb'', ``.so'', and so on to the name. The name of the
  loaded feature is added to the array in +$"+. A feature will not be
  loaded if it's name already appears in +$"+. However, the file name
  is not converted to an absolute path, so that ``+require
  'a';require './a'+'' will load +a.rb+ twice.

There's also the issue of what paths to search. Ruby searches the current directory first, followed by custom load paths, site_ruby dirs, and ruby/1.8 dirs. This allows a number of mechanisms for overriding more general locations with more specific ones for particular uses.

JRuby adds a new wrinkle here: classloader resources. JRuby supports JARing up source files and loading them through Java's classloader mechanisms. This is how the JRuby applet and the "complete" JRuby JAR work: they simply include all Ruby source into the archive. So then for us the classloader/classpath represents an additional path to be searched.

The primary problem with JRuby's load heuristic is that it ended up searching the classloader far too frequently; and in many cases it searched it multiple times for files that could not exist, such as for complete absolute paths (starting with '/', which *never* works for classloader resources). A second problem with the heuristic is that it would try all locations for all extensions, so for example it would search for xxx.rb everywhere possible, then xxx.rb.ast.ser (our serialized AST format) everywhere possible, then xxx.so (used internally for extensions...though that may change), and so on. The result of this is that any extensions or serialized scripts went through the full monty of searches before being found on a second or third pass.

These two issues were compounded when running JRuby with a very large classpath. Because classloader resources can be expensive to search, our loading became linearly slower in relation to the number of JARs (or perhaps the size of jars) included into the JVM. More JARs, slower classloader resource searching, slower startup.

The fix for these issues is twofold: search a given load location for all filename extensions before moving on, and only search the classloader as a last resort for filenames likely to be found there. It's fairly simple to explain, but the LoadService code had been endlessly hacked and rejiggered over the years. The cleanup was 90% of the battle; the fixes were considerably easier.

A final issue with LoadService, which is now fixed, was that it allowed require to include files with no extensions at all. This is not correct Ruby behavior, and so now those files can't be loaded. This actually caused a bug a long time ago where the extension-free "rake" startup script was being loaded before the "rake.rb" library file it tried to locate. The result was an eventual stack overflow as the file tried to continually load itself. Goofy behavior we should never see again.

3. Speeding up dynamic dispatch

I'm also doing some experimental work to speed up dynamic dispatch. Currently, methods are located by name, looking up a callable object out of a big hash on a per-class basis. This works reasonably well, and there are caches to speed the process, but with all the recent performance work this search has started to become the new bottleneck.

The eventual callable objects looked up have another flaw: they make Hotspot optimization more difficult. Because they're behind an ICallable interface, because they have multiple levels of logic as well as pre/post-call setup and teardown, and because there are many different implementations of ICallable, they end up slowing execution down significantly.

Hotspot is really an amazing piece of work. For the vast majority of Java code, it's able to unroll loops, inline invocations, dynamically optimize conditionals and switches, and generally improve the speed of code by a drastic amount. When running in "server" mode, where Hotspot lets code run longer in interpreted mode before optimizing it, even greater improvements can be seen. For example, A fib benchmark--recursive and iterative--under Java 6 client and server VMs:

(best times shown)

client recursive:
8.551000

client iterative:
17.995000

server recursive:
5.008000

server iterative:
13.191000

MRI recursive:
1.670000

MRI iterative:
16.964403

These numbers aren't bad, really. We even beat MRI for this trivial benchmark when we start hitting Bignums heavily (Java's BigInteger implementation is quite a bit faster than Ruby's). However, we can certainly do better.

There are two experimental changes I'be been working on. The first eliminates the pre/post method setup for core libraries when that setup is not necessary. I call it "fast invocation", and it's applicable to a large majority of core class methods. For example, with just fast invocation, the recursive fib numbers above drop to the sub-5s range. This change is perfectly safe, since it's just eliminating interpreter overhead that would otherwise be wasted cycles. You can expect to see it included in JRuby soon.

The second change is the holy grail of dynamic invocation: eliminate, to the greatest extent possible, the overhead of looking up and dispatching to a given method. In short, make it as close to a simple static dispatch as possible. This is where the real speed gains in JRuby will start to show up.

I have some experimental code right now, focused on the fib benchmark, that is both safe and drastically improves performance. It's sub-par code at the moment, but it does produce results like this:

recursive before:
5.008000

recursive after:
3.864000

Now of course, this is still interpreted. The same change when applied to my experimental Ruby compiler produces a much more drastic effect:

compiled recursive after:
1.550100

Now we start to see the value of eliminating dynamic-dispatch overhead. This is actually *faster* than Ruby's recursive fib, a feat that hasn't been accomplished by JRuby at any time in the past.

The trick to this is fairly simple. For common core methods which are known to be simple Java code, such as for Fixnum's +, -, and < implementations, I provide integer IDs. Within the Fixnum implementation there's a new implementation of "callMethod", our dynamic-dispatcher, which switches on these IDs. For methods it knows, such as the aforementioned +, -, and <, it dispatches directly to op_plus, op_minus, or op_lt, the Java implementations. This skips the lookup phase, the ICallable implementation, the ThreadContext manipulation, and the pre/post-method setup code completely. It's also perfectly safe, again, because all those pieces only waste cycles for simple methods like this.

Now one problem with a simple approach like this is that if you redefine Fixnum#+, that change won't be picked up. The simple Fixnum callMethod won't ever try a method search for methods it knows can be fast-dispatched. I resolved this in my experimental code by adding a "clean" flag to the Fixnum class. If any any point after its initial definition the Fixnum class becomes "dirty", e.g. if you add or redefine a method, the old, slow dispatch will come back into play. My simple version is too coarse-grained, killing fast dispatching for all methods if any of them are changed, but the principal is sound. I'm going to be exploring this the rest of the week, trying to find a more complete solution...but some good things are around the corner.

--

All told, it's been a productive few days since JavaPolis. I'm hoping to write up a JavaPolis recap soon, but I'm keen to use this holiday time to get some cool stuff done. You'll hear more after the first of the year...hopefully with committed code and additional benchmarks against JRuby trunk :)

Sunday, December 10, 2006

Two JRuby Talks at JP: What's the Difference

There will be two JRuby talks at JavaPolis.

The first is part of the "University" sessions tomorrow. We're the middle hour of a three-hour bit on scripting languages for the JVM. During that session we're going to be focusing on practicalities like building a simple app, using IRB, and getting a basic Rails app scaffolded and working. We'll talk a bit about JRuby futures, but we won't demo any of the newer, crazier things like NetBeans or GlassFish.

The second talk is on Wednesday, and will cover some of the same material as the first but hopefully have different demos. We'll show the Agile Web Development with Rails v2's "Depot" application running in JRuby under WEBrick (hopefully GlassFish if some issues are worked out, but likely not), we'll demo some NetBeans Ruby support (try to fit in as much as Tor can finish by Tuesday afternoon), and if there's time to implement some of it, we'll do Rails + JavaEE stuff as well (maybe, maybe not, given that we only have two days left).

Both talks should be useful, but there's no way we can avoid some duplication, especially since many folks will only attend one of the talks.

I hope to see plenty of people at both, however :)

JRuby JavaPolis Meetup

Another conference, another meetup!

JRuby JavaPolis Meetup

Join us! We haven't decided on a location, and we're taking suggestions. These little meetups have been great fun in the past, so don't be shy and feel free to grab us after one of our talks to help figure out a good place to go.

Monday, December 04, 2006

JRuby IRB Applet Revisited

Damian Steer, regular JRuby contributor, has taken the IRB applet and run with it. He's gotten readline working (history, line editing, tab-completion), added some fonts and colors to differentiate things, and even put in an intellisense-like menu for tab completion of method names.

Very cool stuff. Keep in mind also that this could be embedded in any app (like an IDE) to provide a really nice looking interactive console. Thanks Damian!

Saturday, December 02, 2006

Another Step Toward Rails WAR Files

Ashish Sahni has posted instructions for packaging a Rails app as a WAR on his blog, based on the work of a number of JRuby community members. I ran through his instructions, and have only two modifications to make:

When installing Rails, you still probably want to pass --no-ri --no-rdoc since rdoc generation is still far too slow.
Because JRuby has a classloading bug (not loading from context classloader) you'll need to put the rails-integration JAR in glassfish/lib.

I was only able to get the "properties" page to display, at rails/info/properties. A separate page that used ActiveRecord didn't successfully execute...but it was close enough to taste.

So then, there's two important points you should get out of this:

Rails in a WAR file will happen...soon

This has always been our goal, with every improvement and fix we made to JRuby. The holy grail of Rails deployment would be to "zip it up and deploy it" like Java app developers can do with WAR files. And that goal is just around the corner, thanks to the JRuby community. That leads me to the second point:

The JRuby community is growing fast and doing great things

This is especially exciting to me. Tom and I are only two folks, even with full-time license to work on JRuby and related projects. The only way Ruby can be successful on the JVM is with community cooperation, and the work toward WAR deployment exemplifies what can happen.

I'd like to thank the folks on the JRuby-Extras project who have been working on this: Robert Egglestone, Chris Nelson, Fausto Lelli, and also Ashish Sahni at Sun who put together a great walkthrough.

Excellent work.

Wednesday, November 29, 2006

NetBeans + Ruby = Awesome

Tor Norbye is a programming machine on par with the legendary Ola of Bini. He's the one-man force working on NetBeans Ruby support, and his progress has been epic. Here's his latest screenshot and a short blurb about it:

NetBeans + Ruby = True

In just over two months' time, Tor seems to be (in my opinion) on the verge of eclipsing every other Ruby editor/IDE out there. I've been using development builds of his stuff and it's really superb. A few features I've missed before that are now rapidly maturing in NB+RB:

Highlight usages: not genius, but smarter than dumb-as-a-post right now; great for local and block vars, getting smarter for methods
Clickable methods and variables: limited to method-scope for variables or file-scope for methods, but coming along very quickly...and even those limited scopes are way better than nothing at all. I don't know how many times I've wanted to Ctrl-Click a method in some giant Ruby file and go straight to the method def. Awe-some.
Inline refactoring: I've demoed this a couple times, but for local and block vars you can do inline renaming...essentially renaming all instances of the variable at the same time. It's really nice, looks cool, and represents the tip of the iceberg for refactoring capabilities.

There's plenty of other stuff that's cool, like the Navigator view of a file (shows classes, modules, methods, of currently-selected file), AST view (great for us Ruby language hackers that like to see the actual structure of things), and more.

Things are definitely shaping up nice for next-generation Ruby IDEs.

Sunday, November 19, 2006

Using JRuby's "complete" JAR for OS X App Bundles

Now this is really cool. Tony Hursh, commenter on the previous "Advanced Rails Deployment" post, put together an OS X Application Bundle template that allows you to use the JRuby "complete" JAR file as the base of a typical OS X app. What does that mean? That means you just toss the complete JAR into this template, code up some Ruby code, and have a nice dock icon and menu bar like any other app. You can ship the entire app as a bundle, with JRuby as the built-in Ruby interpreter. Awesome.

Some pics showing the menu working like you'd expect and the JRuby logo as a dock icon:

Check out Tony's JRuby OS X App Bundle walkthrough to see for your self. It's a really outstanding application of JRuby's "Ruby-in-a-JAR" support.

Eclipse to NetBeans: Quick Outline Module

I'm trying to be 100% NetBeans these days, and I'll be documenting tips and tricks as I learn them. Hopefully others going through the same exercise will find these tips and they'll help make the transition smoother.

Why am I making this move, you ask? Well, of course there's the whole fact that I work for Sun, but that's not the primary motivation; if a tool doesn't accomplish what I want, I'm not going to use it. But NetBeans has made amazing strides this past year. It's now not only better-looking and faster than Eclipse, it also includes includes a much, much larger set of functionality in the base download. And that download? 30-60MB smaller than Eclipse 3.2. That's pretty amazing.

Anyway, one feature I sorely missed was the "Quick Outline" in Eclipse, where Command-O (or Ctrl-O) brings up a fast search for members of the current class. It's pretty darn useful, and I really, really missed it.

However, Sandip Chitale to the rescue. He's created a "Java File Structure" module, which exactly duplicates the Quick Outline functionality. Thank goodness!

Read Sandip's post to install the NetBeans Quick Outline module (Java File Structure) and try it for yourself. He's also got a quick Java class hierarchy module that looks great, but it's not something I use very often. Perhaps I will use it more now.

Oh, and you can find information about the modules in Help after they're installed, but to save you some confusion: The shortcut for File Structure is initially Cmd-Shift-S (Ctrl-Shift-S) and for Hierarchy is Cmd-Shift-H (Ctrl-Shift-H).

One more NetBeans annoyance down the drain!

Friday, November 17, 2006

Advanced Rails Deployment with JRuby

There's a lot of work going on right now focusing on various mechanisms for deploying JRuby-based apps. This article will summarize some of the work happening and why it's really, really important for the Ruby world.

Ruby-in-a-JAR

First, a little sideline into general Ruby embeddability work.

Over the past few days I've made modifications to enable running a Ruby app completely out of a single JRuby JAR file (Java ARchive). The major changes required were:

Add jarjar to the project to combine all dependency jars into a single archive, including jline (readline support), asm (compiler, other stuff in the future), bsf (scripting API), jvyaml and plaincharset (Ola's JvYAML library and supporting charset lib).
Add an Ant task to build the "complete" jar and include all Ruby standard libraries in the same archive.
Somewhat unrelated, a small patch for irb/init.rb to allow it to fail gracefully if it can't load locale-specific files from the filesystem.

So by running "ant jar-complete" you get out a single JAR file that contains a complete, working Ruby interpreter plus stdlib. For example:

~ $ java -jar jruby-complete.jar -e "puts 'Hello, Ruby-in-a-JAR!'"
Hello, Ruby-in-a-JAR!
~ $ java -jar jruby-complete.jar -rirb -e "IRB.start"
irb(main):001:0>

Of course, you don't have to take my word for it. I've uploaded a copy of this jar for you to try yourself. The full archive is about 2900kb. That's suitable for embedding in just about any application, and in fact I used a stripped-down 1600kb version for the JRuby Applet. Note: this is JRuby trunk code and mostly experimental...but that's what makes it so fun :)

Ruby-in-a-JAR - The ultimate in Ruby interpreter portability

A Better Deployment for Rails

Now the main course: several folks have been working on several exciting deployment scenarios for JRuby on Rails apps.

The current "best option" for deploying Rails apps into production generally involves the HTTP front-end Mongrel. Although Mongrel is largely written in Ruby, it is very fast, largely because of its native C component for HTTP request parsing. It's also considerably more secure than CGI-based options, largely because of creator Zed Shaw's attention to detail. The typical Rails app will be deployed as a "pack of Mongrels", where the number of desired concurrent requests is multiplied by the number of independent Rails apps to determine a total number of processes. These processes must be managed, monitored, and respawned as appropriate, but the result is a fairly stable and scalable deployment model.

However with JRuby, there will soon be a better option. I previously reported about TAKAI Naoto's efforts to deploy Rails behind an AsyncWeb front-end, showing tremendous performance improvements over a WEBrick-based deployment. Naoto-san has now taken things to the next level: Rails deployment under GlassFish.

The potential here should be obvious. GlassFish, like other Java EE application servers, is extremely good at scaling up many concurrent requests across many independent applications; so good that many organizations deploy only a single appserver-per-machine and stuff it full of applications to serve. That means a single server, a single process to manage. GlassFish also supports clustering, which means you'll be able to hit the deploy button once and have your n-server cluster instantly start serving up Rails. But there's one last area that trumps all the rest:

That single app server can handle as many concurrent requests across as many independent Rails apps as you desire, scaling them across all the CPU cores you can throw at it.

That's right...no more N * M process management, no more zombie processes, no more immature tooling to manage all those servers and all those deployments. One tool, one server process, no headaches.

That's an extremely bright future, and we're almost there.

What Next?

Naoto-san is not the only one working on JRuby on Rails deployment options. There are a number of folks in the JRuby community approaching the same goal from different directions, using innovative techniques like servlets implemented in Ruby and Spring-based service wiring. The JRuby community sees the potential here and things are moving very quickly.

My work on Ruby-in-a-JAR will also play directly into this. Currently most deployment scenarios require Rails app files to remain "loose" on the filesystem, as with the current standard deployment model. However it won't be long before you can zip up your Rails app into a WAR file (Web ARchive) and deploy it lock, stock, and barrel to as many servers as you want.

These efforts combined will create, in my opinion, the most manageable, scalable, powerful Rails deployment model yet available...and it's just around the corner.

We've also launched into Rails compatibility work in earnest. I've created a wiki page on JRuby support for Rails that details the results of running Rails' own test cases. Long story short: we're looking pretty damn good.

JRuby on Rails is in the home stretch. And we're covering ground very quickly.

Wednesday, November 15, 2006

Jython: Alive and Well (and looking for love)

I thought I'd make a diversion from my usual JRuby activities this weekend and ping the Jython dev list. I'd been lurking for a month or two, seeing almost no activity other than the occasional email, bug report, or request for help. There were perhaps 5 emails in the last month. Not good.

So I reached out to see if anyone was actually listening and to offer whatever support I can provide. As it turns out, there's still a dev team and a user community on the lists, though development has slowed almost to a standstill. There hasn't been a release in years, and Jython is currently about 2x slower than current CPython, while only supporting Python 2.1 semantics in the widely-available release.

But there's a light. The existing team and users are very much interested in getting Jython going again, and from my examination of the code it shouldn't be too difficult for new devs to get involved. Jython also has a pretty good story for compilation to Java bytecode--better than JRuby currently--so it has a strong base to start from.

So here's my request to folks reading this post: If you're a Python fan and a Java developer, now's your time to show devotion to both communities. The Jython guys could use some help, and the project could certainly use some new blood. It doesn't matter to me if you're not a Ruby fan, or heck, if Jython ends up reaching CPython 2.5 compatibility before we get a JRuby 1.0 release out. I'd just really like the JVM Python story to have a happy ending...and I think Jython's long slumber needs to come to an end.

Jython is available on SourceForge, and the existing dev team are friendly, enthusiastic folks. Post this entry to your blogs, send it to your friends, print up flyers and hand them out at your local Python or Java User Group meetings. Jython needs some love, and now is the age of dynlangs for the JVM. Stop by and lend them a hand...I have.

Ruby for the Web? Check!

Well friends, it's time for another episode of "Impossible Or Not!" Today's contender is Ruby in the browser, long desired but never achieved. There are front-ends to Ruby services, delicious Ruby-JavaScript libraries, and of course the ever-popular Ruby on Rails web framework. However, developers have been clamoring for something more.

IRB in an Applet

A long, long time ago in a web far away, there was born a bright-eyed new child named Java. Java found its first public uses in flashing, twirling, annoying buttons called Applets. It was also a bit slow, having just been released into the world. So the big bad public said "Java is slow and only useful for annoying buttons!" And so Java was branded the "slow annoying button" language.

But Java has grown up. It hid away from the public eye, dwelling in dark, dank servers and enterprises. It learned the value of five nines uptime and horizontal scalability. All the while, it improved its public face, preparing for a return home to its birthplace on the web.

Now Java has grown up. It has learned how to appease the enterprise gods while presenting itself to the web in beautiful, performant glory. And it has a new friend: Ruby.

Yes, JRuby can run in an applet. No, it's not that hard. No, the archive doesn't have to be this big (the applet above is about a 1.6MB JAR file, but it includes stuff it doesn't need). Yes, this means you could start writing stuff for web pages in Ruby. No, I'm not kidding.

Yes, Virginia, there is a Santa Claus.

Updated: I fixed the issues under Windows, so it should work for those of you that reported errors. Thanks for the heads up!

Tuesday, November 14, 2006

Sun Turning Heads

Thijs van der Vossen posts an interesting perspective on recommending Sun and how his opinion of the company has been changed by recent events. It seems that the truth is really getting out: Sun "gets it" and is rising again as a great innovator.

It's a great time to be here.

Monday, November 13, 2006

Java Open-Sourced Under GPL; Sun Shines Brighter

Yes, it's official. News is already starting to pop up around the net about Java's open-sourcing and especially about the choice of license: GPLv2. I must admit my jaw dropped when I first heard about this a couple weeks ago, and it was a hard secret to sit on. Not only is it open source...it's open source using the most vigorously open license out there.

I think the GPL is a great choice for Java. Not only will it be fully compatible with the vast range of GPLed software, but any folks hoping to release their own versions will be compelled to make their changes available as source. Say what you like about the GPL and its "virulence" or its "tainting", but for an open development platform about to explode in the open-source world, it's hard to say what license would be a better choice.

I'm proud to be at Sun surrounded by thought-leaders smart enough to see this is the right thing to do. Sun is BACK, baby!

Tune in for Jonathan Schwartz's and Rich Green's webcast at 9:30PT to get all the details about what's being opened up when.