Showing posts sorted by relevance for query duby. Sort by date Show all posts
Showing posts sorted by relevance for query duby. Sort by date Show all posts

Saturday, April 03, 2010

Getting Started with Duby

Hello again!

As you may know, I've been working part-time on a new language called Duby. Duby looks like Ruby, since it co-opts the JRuby parser, and includes some of the features of the Ruby language like optional arguments and closures. But Duby is not Ruby; it's statically typed, compiles to "native" code (JVM bytecode, for example) before running, and does not have any built-in library of its own (preferring to just use what's available on a given runtime). Here's a quick sample of Duby code:

class Foo
def initialize(hello:String)
puts 'constructor'
@hello = hello
end

def hello(name:String)
puts "#{@hello}, #{name}"
end
end

Foo.new('Hiya').hello('Duby')

This post is not going to be an overview of the Duby language; I'll get that together soon, once I take stock of where the language stands as far as features go. Instead, this "getting started" post will show how you can grab the Duby repository and start playing with it right now.

First you need to pull down three resources: Duby itself, BiteScript (the Ruby DSL I use to generate JVM bytecode), and a JRuby 1.5 snapshot:
~/projects/tmp ➔ git clone git://github.com/headius/duby.git
Initialized empty Git repository in /Users/headius/projects/tmp/duby/.git/
remote: Counting objects: 2810, done.
remote: Compressing objects: 100% (1291/1291), done.
remote: Total 2810 (delta 1690), reused 2509 (delta 1447)
Receiving objects: 100% (2810/2810), 10.64 MiB | 722 KiB/s, done.
Resolving deltas: 100% (1690/1690), done.

~/projects/tmp ➔ git clone git://github.com/headius/bitescript.git
Initialized empty Git repository in /Users/headius/projects/tmp/bitescript/.git/
remote: Counting objects: 470, done.
remote: Compressing objects: 100% (404/404), done.
remote: Total 470 (delta 166), reused 313 (delta 57)
Receiving objects: 100% (470/470), 93.56 KiB, done.
Resolving deltas: 100% (166/166), done.

~/projects/tmp ➔ curl http://ci.jruby.org/snapshots/jruby-bin-1.5.0.dev.tar.gz | tar xz
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 11.3M 100 11.3M 0 0 353k 0 0:00:32 0:00:32 --:--:-- 262k

~/projects/tmp ➔ ls
bitescript duby jruby-1.5.0.dev

~/projects/tmp ➔ mv jruby-1.5.0.dev/ jruby

Once you have these three pieces in place, Duby can now be run. It's easiest to put the JRuby snapshot in PATH, but you can just run it directly too:
~/projects/tmp ➔ cd duby

~/projects/tmp/duby ➔ ../jruby/bin/jruby bin/duby -e "puts 'hello'"
hello

~/projects/tmp/duby ➔ ../jruby/bin/jruby bin/dubyc -e "puts 'hello'"

~/projects/tmp/duby ➔ java DashE
hello

Finally, you may want to create a "complete" Duby jar that includes Duby, BiteScript, JRuby, and Java classes for command-line or Ant task usage. Using JRuby 1.5's Ant integration, the Duby Rakefile can produce that for you:
~/projects/tmp/duby ➔ ../jruby/bin/jruby -S rake jar:complete
(in /Users/headius/projects/tmp/duby)
mkdir -p build
Compiling Ruby sources
Generating Java class DubyCommand to DubyCommand.java
javac -d build -cp ../jruby/lib/jruby.jar:. DubyCommand.java
Compiling Duby sources
mkdir -p dist
Building jar: /Users/headius/projects/tmp/duby/dist/duby.jar
mkdir -p dist
Building jar: /Users/headius/projects/tmp/duby/dist/duby-complete.jar

~/projects/tmp/duby ➔ java -jar dist/duby-complete.jar run -e 'puts "Duby is Awesome!"'
Duby is Awesome!


Hopefully we'll soon have duby.jar, duby-complete.jar, and a new Duby gem released, but this is a quick way to get involved.

I'll get back to you with a post on the Duby language itself Real Soon Now!

Update: I have also uploaded a snapshot duby-complete.jar (which includes both the Main-Class for jar execution and the simple Ant task org.jruby.duby.ant.Compiler) on the Duby Github downloads page. Have fun!

Sunday, August 31, 2008

A Duby Update

I haven't forgotten about my promise to post on FFI and MVM APIs, but I've been taking occasional breaks from JRuby (heaven forbid!) to get some time on on Duby.

What Is Duby?

Duby, for those who have not heard of it, is my little toy language. It's basically a Ruby-like static-typed language with local type inference and not a whole lot of bells and whistles. The goal for Duby (which is most definitely a working name...it will probably change), is to provide the all the best parts of Ruby syntax people are familiar with, but add to it:

  • Written all in Ruby (and obviously the eventual plan would be to port it to Duby)
  • Backend-agnostic (JVM is obviously my focus, but nothing stops someone from building an LLVM or CLR typer+compiler)
  • Minimally-intrusive static typing (Duby infers types from arguments and calls, like Scala)
  • Features missing from Java (Duby treats module inclusion and class reopening like defining extension methods in C#)
  • A very pluggable type inference engine (Duby's "Java" typer is currently all of about 20 lines of code that plugs into the engine)
  • A pluggable compiler (Duby will allow adding compiler plugins to turn str1 + str2 into concatenation or StringBuffer calls, for example)
  • Absolutely no runtime dependencies (I want compiled output from Duby to be *done*, so there's no runtime library to lug along so it works; once compiled, there are no dependencies on Duby)
The primary motivation for Duby was originally to have a Ruby-like language we could use to implement parts of JRuby. The JVM, it's type system, and its bytecode are all actually really, really nice. There's a huge collection of libraries, fast primitives (that get optimized into faster native code), and a bytecode specification that's pretty easy for almost anyone to grok. But there's a problem: Java.

Now don't get me wrong, Java is a great language, but it's become a victim of its own success. While other languages have been adding niceities like local type inference, structural typing, closures and extension methods, Java's stayed pretty much the same. There have been no major language changes to Java since Java 5's additions of generics, enums, annotations, varargs, and a few other miscellaneous odds and ends. Meanwhile, I live in a torturous world between Ruby and Java, where I'd love to write everything in Ruby (too slow, too inexact for "stable layer" code), but must instead write everything in Java (with associated syntactic baggage and 20th-century language design). And so necessity dictates taking a new approach.
def fib(a => :fixnum)
if a < 2
a
else
fib(a - 1) + fib(a - 2)
end
end

puts fib(45)
So here we see an example I've shown in previous posts, but with a twist. First off, it's almost exactly the same as the equivalent Ruby code except for the argument type declaration => :fixnum. The rest of the script is all vanilla Ruby, even down to the puts call at the bottom.

But all is not as it seems. This is not Ruby code.

The type declaration in the method def looks natural, but it's not actually parseable Ruby. I had Tom Enebo hack a change in to JRuby's parser (off by default) to allow that syntax. Duby originally had a syntax something like this, so it could be parsed by any Ruby impl:

def fib(a)
{a => :fixnum}
...
end
But it's obviously a lot uglier.

New Type Inference Engine

Ignoring Java for a moment we can focus on the type inference happening here. Originally Duby only worked with explicit Java types, which obviously meant it would only ever be useful as a JVM language. The use of those types was also rather ugly, especially in cases where you just want something "Fixnum-like". So even though I had a working Duby compiler several months ago, I took a step back to rewrite it. The rewrite involved two major changes:
  1. Rather than build Duby directly on top of JRuby's AST I introduced a transformation phase, where the Ruby AST goes in and a Duby AST comes out. This allowed me to build up a structure that more accurately represented Duby, and also has the added bonus that transformers could be built from any Ruby parse output (like that of ruby_parser).
  2. Instead of being inextricably tied to the JVM's types and type system, I rewrote the inference engine to be type-system independent. Basically it uses all symbolic and string-based type identifiers, and allows wiring in any number of typing plugins, passing unresolved nodes to them in turn. Two great example plugins exist now: a Math plugin that knows how to handle mathematical and boolean operators against numeric types like :fixnum (it knows :fixnum < :fixnum produces a :boolean, for example), and a Java plugin that knows how to reach out into Java's classes and methods to infer return types for calls out of Duby-space.
The result of this is that up to the point of compilation, there's no explicit dependency on any named set of types, any type system, or any backend. Here's the output from the type inference engine running against that fib script above:
* [Simple] Learned local type under MethodDefinition(fib) : a = Type(fixnum)
* [Simple] Retrieved local type in MethodDefinition(fib) : a = Type(fixnum)
* [AST] [Fixnum] resolved!
* [Simple] Method type for "<" Type(fixnum) on Type(fixnum) not found.
* [Simple] Invoking plugin: #<Duby::Typer::MathTyper:0xcc5002>
* [Math] Method type for "<" Type(fixnum) on Type(fixnum) = Type(boolean)
* [AST] [Call] resolved!
* [AST] [Condition] resolved!
* [Simple] Retrieved local type in MethodDefinition(fib) : a = Type(fixnum)
* [Simple] Retrieved local type in MethodDefinition(fib) : a = Type(fixnum)
* [AST] [Fixnum] resolved!
* [Simple] Method type for "-" Type(fixnum) on Type(fixnum) not found.
* [Simple] Invoking plugin: #<Duby::Typer::MathTyper:0xcc5002>
* [Math] Method type for "-" Type(fixnum) on Type(fixnum) = Type(fixnum)
* [AST] [Call] resolved!
* [Simple] Method type for "fib" Type(fixnum) on Type(script) not found.
* [Simple] Invoking plugin: #<Duby::Typer::MathTyper:0xcc5002>
* [Math] Method type for "fib" Type(fixnum) on Type(script) not found
* [Simple] Invoking plugin: #<Duby::Typer::JavaTyper:0x1635aad>
* [Java] Failed to infer Java types for method "fib" Type(fixnum) on Type(script)
* [Simple] Deferring inference for FunctionalCall(fib)
* [Simple] Retrieved local type in MethodDefinition(fib) : a = Type(fixnum)
* [AST] [Fixnum] resolved!
* [Simple] Method type for "-" Type(fixnum) on Type(fixnum) not found.
* [Simple] Invoking plugin: #<Duby::Typer::MathTyper:0xcc5002>
* [Math] Method type for "-" Type(fixnum) on Type(fixnum) = Type(fixnum)
* [AST] [Call] resolved!
* [Simple] Method type for "fib" Type(fixnum) on Type(script) not found.
* [Simple] Invoking plugin: #<Duby::Typer::MathTyper:0xcc5002>
* [Math] Method type for "fib" Type(fixnum) on Type(script) not found
* [Simple] Invoking plugin: #<Duby::Typer::JavaTyper:0x1635aad>
* [Java] Failed to infer Java types for method "fib" Type(fixnum) on Type(script)
* [Simple] Deferring inference for FunctionalCall(fib)
* [Simple] Method type for "+" on not found.
* [Simple] Invoking plugin: #<Duby::Typer::MathTyper:0xcc5002>
* [Math] Method type for "+" on not found
* [Simple] Invoking plugin: #<Duby::Typer::JavaTyper:0x1635aad>
* [Java] Failed to infer Java types for method "+" on
* [Simple] Deferring inference for Call(+)
* [Simple] Deferring inference for If
* [Simple] Learned method fib (Type(fixnum)) on Type(script) = Type(fixnum)
* [AST] [Fixnum] resolved!
* [Simple] Method type for "fib" Type(fixnum) on Type(script) = Type(fixnum)
* [AST] [FunctionalCall] resolved!
* [AST] [PrintLine] resolved!
* [Simple] Entering type inference cycle
* [Simple] Method type for "fib" Type(fixnum) on Type(script) = Type(fixnum)
* [AST] [FunctionalCall] resolved!
* [Simple] [Cycle 0]: Inferred type for FunctionalCall(fib): Type(fixnum)
* [Simple] Method type for "fib" Type(fixnum) on Type(script) = Type(fixnum)
* [AST] [FunctionalCall] resolved!
* [Simple] [Cycle 0]: Inferred type for FunctionalCall(fib): Type(fixnum)
* [Simple] Method type for "+" Type(fixnum) on Type(fixnum) not found.
* [Simple] Invoking plugin: #<Duby::Typer::MathTyper:0xcc5002>
* [Math] Method type for "+" Type(fixnum) on Type(fixnum) = Type(fixnum)
* [AST] [Call] resolved!
* [Simple] [Cycle 0]: Inferred type for Call(+): Type(fixnum)
* [AST] [If] resolved!
* [Simple] [Cycle 0]: Inferred type for If: Type(fixnum)
* [Simple] Inference cycle 0 resolved all types, exiting
There's a lot going on here. You can see the MathTyper and JavaTyper both getting involved here. Since there's no explicit Java calls it's mostly the MathTyper doing all the heavy lifting. The inference stage progresses as follows:
  1. Make a first pass over all AST nodes, performing trivial inferences (declared arguments, literals, etc).
  2. Add each unresolvable node encountered to an unresolved list.
  3. Cycle over that list repeatedly until either all nodes have resolved or the list's contents do not change from one cycle to the next.
It's a fairly brute-force inference mechanism, certainly not on the scale of a full Hindley-Milner inference. Honestly I find the type declaration in the argument list to be far more helpful than harmful, though, and I'm not smart enough to write my own H/M engine at the moment.

Include Java

Anyway, back to Duby. Here's a more complicated example that makes calls out to Java classes:
import "System", "java.lang.System"

def foo
home = System.getProperty "java.home"
System.setProperty "hello.world", "something"
hello = System.getProperty "hello.world"

puts home
puts hello
end

puts "Hello world!"
foo
Here we see a few new concepts introduced.

First off, there's an import. Unlike in Java however, import knows nothing about Java types; it's simply associating a short name with a long name. The syntax (and even the name "import") is up for debate...I just wired this in quickly so I could call Java code.

Second, we're actually making calls that leave the known Duby universe. System.getProperty and setProperty are calls to the Java type java.lang.System. Now the Java typer gets involved. Here's a snippit of the inference output for this code:
* [Simple] Method type for "getProperty" Type(string) on Type(java.lang.System meta) not found.
* [Simple] Invoking plugin: #<Duby::Typer::MathTyper:0xaf17c7>
* [Math] Method type for "getProperty" Type(string) on Type(java.lang.System meta) not found
* [Simple] Invoking plugin: #<Duby::Typer::JavaTyper:0x1eb717e>
* [Java] Method type for "getProperty" Type(string) on Type(java.lang.System meta) = Type(java.lang.String)
* [AST] [Call] resolved!
The Java typer is fairly simple at the moment. When asked to infer the return type for a call, it takes the following path:
  1. Attempt to instantiate known Java types for the target and arguments. It makes use of the list of "known types" in the typing engine, augmented by import statements. If those types successfully resolve to Java types...
  2. It uses Java reflection APIs (through JRuby) to look up a method of that name with those arguments on the target type. From this method, then, we have a return type. The return type is reduced to a symbolic name (since again, the rest of the type inference engine knows nothing of Java types) and we consider it a successful inference. If the method does not exist, we temporarily fail to resolve; it may be that additional methods are defined layer that will support this name and argument list.
So in this case, the "System" type has been associated with the "java.lang.System" class (the "meta" in the type reference means it's a class reference rather than an instance reference), and the argument type "string" resolves to "java.lang.String". So java.lang.System.getProperty(java.lang.String) resolves as returning java.lang.String, and we have successfully resolved the call.

Next Steps

I see getting the JVM backend and typer working as two major milestones. Duby already can learn about Java types anywhere in the system and can compile calls to them. But mostly what works right now is what you see above. There's no support for array types, instantiating objects, or hierarchy-aware type inference. There's no logic in place to define new types, static methods, or to define or access fields. All this will come in time, and probably will move very quickly now that the basic plumbing is installed.

I'm hoping to get a lot done on Duby this month while I take a "pseudo-vacation" from constant JRuby slavery. I also have another exciting project on my plate: wiring JRuby into the now-functional "invokedynamic" support in John Rose's MLVM. So I'll probably split my time between those. But I'm very interested in feedback on Duby. This is real, and I'm going to continue moving it forward. I hope to be able to use this as my primary language some day soon.

Update: A few folks asked me to post performance numbers for that fib script above. So here's the comparison between Java and Duby for fib(45).

Java source:
public class FibJava {
public static int fib(int a) {
if (a < 2) {
return a;
} else {
return fib(a - 1) + fib(a - 2);
}
}

public static void main(String[] args) {
System.out.println(fib(45));
}
}
Java time:
➔ time java -cp . FibJava
1134903170

real 0m13.368s
user 0m12.684s
sys 0m0.154s
Duby source:
def fib(a => :fixnum)
if a < 2
a
else
fib(a - 1) + fib(a - 2)
end
end

puts fib(45)
Duby time:
➔ time java -cp . fib
1134903170

real 0m12.971s
user 0m12.687s
sys 0m0.112s
So the performance is basically identical. But I prefer the Duby version. How about you?

Friday, March 21, 2008

More Fun With Duby

It's been...oh my, almost two weeks since I broke the news about Duby. Since then I attended PyCon and we managed to get JRuby 1.1 RC3 out the door, which is looking like it will become JRuby 1.1 final. But I've still been working on Duby in my spare time. It's kinda nice to have a different "spare time" project than JRuby for a while.

After my previous post, I continued to carry the compiler forward. I managed to get it compiling the following syntactic structures:

  • until and while loops
  • static method definitions and calls (using "def self.foo" syntax)
  • array types (which combined with static calls allowed creating a "main" method that worked!)
  • instance variables (using @foo to mean the "foo" field on a Java type)
  • type resolution from any Java type
  • imports (with "import foo.bar.Baz" syntax)
  • packages (using "module Foo" syntax, though that will probably change)
  • type inference across methods on the same type, including smarts for instance vs static
All very exciting stuff, and starting to get very close to a language that would be usable for simple algorithms. Except there was just one problem: every new feature I added made the codebase more difficult to understand.

Reloaded

I had just started to explore the next steps when I realized the codebase was becoming too hard to follow. After chasing around a few inference bugs I decided it was time to take a step back. I also hoped all along to eliminate the hard requirement to declare a return type, since that at least should be apparent from inspecting the method body...but it wouldn't be possible with the original design.

The initial Duby compiler was entirely made up of decorations on the existing Ruby AST. A CallNode, for example, was decorated with methods called "compile", "declared_type", "signature" and so on that were called in various sequences to first infer types and then produce bytecode. There were a few problems with this approach that became apparent as work continued:
  1. The hypothetical Duby AST structure did not map 1:1 to the Ruby AST structure; it was richer, with type information, method signatures, and overloaded/hooked syntax. To support this in the existing AST, many layers of complexity had to be added to the compilation process: "is this call being used as a call or as a type declaration?"
  2. The decorator methods were inextricably entwined with Java/JVM-specific details, be they type names, bytecodes, or just obviously Java-centric syntax. This not only made it more difficult to evolve the language, but made it impossible for the language to live anywhere but on JVM and be compiled by anything but JRuby.
  3. My grand plans for the language were quickly exceeding what would be possible by simply decorating the AST.
The third bullet may spark some interest, so I'll explain very briefly (since it's still just forming in my head). As I thought more about how to map Ruby to the JVM, I realized that very few algorithms, very little syntax couldn't be made to map to *something* fast, static-typed, and conceptually the same. Mix-ins, for example, could easily be implemented like C# 3.0's extension methods. Blocks could be implemented behind the scenes as anonymous types of particular signatures. Even operator overloading could easily be mapped to appropriate methods on Numeric types, Collection types, and many others. The key to all this was having type inferencing and compiler layers that were not just flexible, but infinitely pluggable.

The Dream of Duby

The dream, as I see it, would be to use a Ruby-like syntax to wire up any arbitrary types using what looks generally like idiomatic Ruby code. For example, our friend fib(). The same fib() method I showed before, executing against primitive integer values, could execute against boxed Integer objects or BigInteger objects just by specifying the right plugin. So if within the context of a file, I declare that "fixnum" types are to be represented as BigInteger objects, and math operations against them are to call appropriate methods on BigInteger, so be it...that's what comes out the other side.

The term for this is, I believe, "specialization". In the case above, it's explicit specialization on a case-by-case basis. For many cases, this is all you need...you know the types you're dealing with ahead of time and can comfortably declare them or specify type mappings during compilation. But the more interesting side of this comes from general-purpose specialization in the form of parametric polymorphism...or in terms you might be more familiar with: generics.

I am certainly stepping beyond my current education and understanding here, but the pieces are starting to fit together for me. I and others like me have long been envious of languages like Haskell which infer types so incredibly well. And living between the Ruby and Java worlds, I've often felt like there had to be some middle ground that would satisfy both ends of the spectrum: a static, verified type system flexible enough to consume and specialize against any incoming types just by loading a plugin or recompiling, combined with the cleanest, most expressive (while still being one of the most reable) dynamic language syntaxes around. And so that's where I'd like Duby to fit.

The New Version

So where does it stand now?

Things have been moving fast. With JRuby 1.1 RC3 out of the way, I've taken some time to go back and rework the Duby pipeline.

Now, the only decoration on the Ruby AST is in the form of transformation logic to produce a richer Duby syntax tree. Literals are the same. The three call types have been unified into two: Call and FunctionalCall (against "self"). The various types of code bodies have been unified into a single Body construct. Method definition is represented through MethodDefinition and StaticMethodDefinition, both of which aggregate a "signature" element to represent declared or inferred signature information. The several looping constructs (excluding for loops, which are block-based iterations) have been unified into Loop. And so on. Not all the syntax supported by the Duby prototype has been represented in transformation, but it's not far off. And I'm taking a much more measured approach now.

The new AST structure affords a much more realistic typing and compilation pipeline. Each of the Duby AST nodes defines a single method "infer" which, when passed a Typer, will walk its children and infer as many types as it is able. Each child walks its own children, and so on, unifying and generalizing types as it goes (though the unification and generalization is stubbed out now to only allow exact matches...this will eventually be the responsibility of the back-end to handle). Simple code that calls fully-resolved target methods and has no unknown branches may completely resolve during this first pass.

In more complicated cases, each node that can't make an accurate determination about inferred type registers itself with the Typer in its "deferred" list. Once the initial inference cycle has run, all nodes in the AST will have either successfully inferred their result type or registered with the deferred list. At this point, you can either continue to pass ASTs through the Typer, or you can begin the resolution phase.

To start the resolution phase, the "resolve" method on the Typer is called, which attempts to iteratively resolve all deferred nodes in the AST. This resolution process in theory will loop until either the deferred list has been drained of all nodes (which will presumably then be 100% resolved), or until two successive resolution cycles fail to alter the list (perhaps because more information is needed or there are circular references for which the user must add hints). In general, this means that the deepest unresolved child nodes will fall out first. For example, if you have a method "foo" that calls a method "bar" before "bar" has been defined in the source.
def foo
bar
end
def bar
1
end
During the first inference cycle, bar will completely resolve but foo will be deferred. During the resolution phase, foo will now see that bar has been defined and resolved, and will itself resolve. Both return :fixnum (though the decision of what type ":fixnum" resolves to will be left to the compiler backend for a given system).

Back to fib()

Here's our friend fib(), which serves as a nice simple example:
def fib(n)
{n => :fixnum}

if n < 2
n
else
fib(n - 1) + fib(n - 2)
end
end
The fib() method is actually fairly interesting here because it recurses. If this were a simple recursion, it would be impossible to determine what actual type fib returns without an explicit declaration, since no other information is available, and this would produce an error of the following variety:
~/NetBeansProjects/jruby ➔ cat recurse.rb
def recurse
recurse
end
~/NetBeansProjects/jruby ➔ jruby lib/ruby/site_ruby/1.8/compiler/duby/typer2.rb recurse.rb
Could not infer typing for the following nodes:
FunctionalCall(recurse) (child of MethodDefinition(recurse))
MethodDefinition(recurse) (child of Script)
Script (child of )

AST:
Script
MethodDefinition(recurse)
{:return=>Type(notype)}
Arguments
FunctionalCall(recurse)
Here we see a simple "recurse" method that just calls itself, and as you'd expect type inference fails. Because the return value of "recurse" depends on knowing the return value of "recurse", resolution fails.

However in the case of fib(), we don't have a simple recursion, we have a conditional recursion. The default behavior for the Duby "Simple" typer is to assume that if one branch of an If can successfully resolve, that's the type to use (temporarily) as the value of the If (while still marking the if as unresolved, and unifying the two bodies later). And since the root of the fib() method only contains an If, it can successfully resolve. Let's try it:
~/NetBeansProjects/jruby ➔ jruby lib/ruby/site_ruby/1.8/compiler/duby/typer2.rb fib.rb
Could not infer typing for the following nodes:
Call(<) (child of Condition)
Condition (child of If)
Call(-) (child of FunctionalCall(fib))
FunctionalCall(fib) (child of Call(+))
Call(-) (child of FunctionalCall(fib))
FunctionalCall(fib) (child of Call(+))
Call(+) (child of If)
...
Ouch, what happened here? Actually it's pretty easy to understand...the calls to "<", "-", and "+" were unknown to the Typer, and so they could not be resolved. As a result, the If Condition could not resolve, nor could the body of the Else statement. This is not necessarily a fatal state, merely an incomplete one. The "resolve" method on Typer can be called with "true" to force an error to be raised, or with no arguments to just "do the best it can do". In this case, using the simple call to the typer, it raises and displays the error, but there's no reason that more information couldn't be added to the system to allow a subsequent resolution to proceed.

Pluggable Inference

This is where having a pluggable engine starts to come in handy. Though the mechanism is currently pretty crude, there's already a basic ability to specify a plugin for method resolution. In order to enlist in method resolution, a plugin must define a "method_type" method that accepts a parent Typer, a target type, a method name, and a list of parameter types. If at any time method resolution fails in the Simple Typer, the plugin Typers will be called in turn. So in this case, I created a simple "Math" typer that's aware of a few simple operations with LHS and RHS of :fixnum. Let's try it:
~/NetBeansProjects/jruby ➔ jruby -rcompiler/duby/plugin/math lib/ruby/site_ruby/1.8/compiler/duby/typer2.rb fib.rb

AST:
Script
MethodDefinition(fib)
{:return=>Type(fixnum), :n=>Type(fixnum)}
Arguments
RequiredArgument(n)
Body
Noop
If
Condition
Call(<)
Local(name = n, scope = MethodDefinition(fib))
Fixnum(2)
Local(name = n, scope = MethodDefinition(fib))
Call(+)
FunctionalCall(fib)
Call(-)
Local(name = n, scope = MethodDefinition(fib))
Fixnum(1)
FunctionalCall(fib)
Call(-)
Local(name = n, scope = MethodDefinition(fib))
Fixnum(2)
Notice now that the return type of fib() has been correctly inferred to be :fixnum. Huzzah! We are successful!

Debugging

Along with work to make the system more pluggable and the code easier to follow, I've also been trying to provide useful debugging output. Man, I think making debugging output useful and readable is harder than writing a type inference engine in the first place. I must have spent a good hour just tweaking output so it didn't look totally heinous. And it's still not great, but it's definitely usable. Here, for your edification, is the debugging output from type inference on fib.rb:
* [Simple] Learned local type under MethodDefinition(fib) : n = Type(fixnum)
* [Simple] Retrieved local type in MethodDefinition(fib) : n = Type(fixnum)
* [AST] [Fixnum] resolved!
* [Simple] Method type for "<" Type(fixnum) on Type(fixnum) not found.
* [Simple] Invoking plugin: #<Compiler::Duby::Typer::MathTyper:0xbc8159>
* [Math] Method type for "<" Type(fixnum) on Type(fixnum) = Type(boolean)
* [AST] [Call] resolved!
* [AST] [Condition] resolved!
* [Simple] Retrieved local type in MethodDefinition(fib) : n = Type(fixnum)
* [Simple] Retrieved local type in MethodDefinition(fib) : n = Type(fixnum)
* [AST] [Fixnum] resolved!
* [Simple] Method type for "-" Type(fixnum) on Type(fixnum) not found.
* [Simple] Invoking plugin: #<Compiler::Duby::Typer::MathTyper:0xbc8159>
* [Math] Method type for "-" Type(fixnum) on Type(fixnum) = Type(fixnum)
* [AST] [Call] resolved!
* [Simple] Method type for "fib" Type(fixnum) on Type(script) not found.
* [Simple] Invoking plugin: #<Compiler::Duby::Typer::MathTyper:0xbc8159>
* [Math] Method type for "fib" Type(fixnum) on Type(script) not found
* [Simple] Deferring inference for FunctionalCall(fib)
* [Simple] Retrieved local type in MethodDefinition(fib) : n = Type(fixnum)
* [AST] [Fixnum] resolved!
* [Simple] Method type for "-" Type(fixnum) on Type(fixnum) not found.
* [Simple] Invoking plugin: #<Compiler::Duby::Typer::MathTyper:0xbc8159>
* [Math] Method type for "-" Type(fixnum) on Type(fixnum) = Type(fixnum)
* [AST] [Call] resolved!
* [Simple] Method type for "fib" Type(fixnum) on Type(script) not found.
* [Simple] Invoking plugin: #<Compiler::Duby::Typer::MathTyper:0xbc8159>
* [Math] Method type for "fib" Type(fixnum) on Type(script) not found
* [Simple] Deferring inference for FunctionalCall(fib)
* [Simple] Method type for "+" on not found.
* [Simple] Invoking plugin: #<Compiler::Duby::Typer::MathTyper:0xbc8159>
* [Math] Method type for "+" on not found
* [Simple] Deferring inference for Call(+)
* [Simple] Deferring inference for If
* [Simple] Learned method fib (Type(fixnum)) on Type(script) = Type(fixnum)
* [Simple] Method type for "fib" Type(fixnum) on Type(script) = Type(fixnum)
* [AST] [FunctionalCall] resolved!
* [Simple] [Cycle 0]: Inferred type for FunctionalCall(fib): Type(fixnum)
* [Simple] Method type for "fib" Type(fixnum) on Type(script) = Type(fixnum)
* [AST] [FunctionalCall] resolved!
* [Simple] [Cycle 0]: Inferred type for FunctionalCall(fib): Type(fixnum)
* [Simple] Method type for "+" Type(fixnum) on Type(fixnum) not found.
* [Simple] Invoking plugin: #<Compiler::Duby::Typer::MathTyper:0xbc8159>
* [Math] Method type for "+" Type(fixnum) on Type(fixnum) = Type(fixnum)
* [AST] [Call] resolved!
* [Simple] [Cycle 0]: Inferred type for Call(+): Type(fixnum)
* [AST] [If] resolved!
* [Simple] [Cycle 0]: Inferred type for If: Type(fixnum)
* [Simple] Inference cycle 0 resolved all types, exiting
What's Next

I should clarify a few things before getting back to work:

This codebase is mostly separate from, but heavily advised by the original Duby prototype. I learned a lot from that code, and it's still more functional from a compilation standpoint, but it's not really something I can evolve. The new codebase will probably be at the same level of functionality with a week or so.

Largely the more measured pace of this new codebase is because of two key goals.

I'd like to move the system ever toward full, general type inference when possible. That includes inferring method parameters as well as return types. That also means there will have to be a much more comprehensive type-inference engine than the current "simple" engine, but nothing about the current system will break forward compatibility with such work.

I'd also like to see the entire type inferencing pipeline and ideally most of the compilation pipeline entirely platform-independent. There has been a surprising amount of interest in Duby from folks desiring to target LLVM, C, x86 ASM, Parrot, and others. Sadly, without a realistic codebase to work from--one which isolates typing logic from the underlying platform--none of that work has moved forward (though it's almost frightening how quickly people pounced on the idea). So the new system is almost entirely independent of Java, the JVM, and JRuby. Ideally the only pieces you'd need to reimplement (or plugin) for a given platform would be the parser (producing a Duby AST through whatever mechanism you choose) and the type mapper/compiler for a given system. The parser would probably be the more difficult, but since the language syntax is "basically Ruby" and designed to support full AOT compilation you can use any Ruby parser to produce a Ruby AST and then transform it in the same way I transform JRuby's AST. The compiler will mostly be a matter of mapping Duby syntactic structures and inferred generic types to code sequences and native types on your platform of choice.

Over the weekend I'll probably try to absorb the available literature on type inference, to learn what I'm doing wrong and what I could be doing better...but I think my "common sense" approach seems to be working out pretty well so far. We shall see. Suggestions for papers, ideally papers designed for mortals, are welcome.

So there you have it...the Duby update several of you have been asking for. Satisfied for now? :)

(BTW, the code is all available in the JRuby repository, under lib/ruby/site_ruby/1.8/compiler/duby. The new stuff is largely under the ast/ subdir and in the "transform.rb" and "typer2.rb" files. The tests of interest are in test/compiler/duby/test_ast.rb and test_typer.rb)

Sunday, March 30, 2008

Duby's C Backend POC

Late last week I wired up a quick proof-of-concept C back-end for Duby, to show that it is possible. Duby's design represents all types as symbols, which allows the type representation to vary across platforms. So in fib(), the fixnum type is represented as ":fixnum" throughout, and it can be mapped to whatever the backend decides fixnums should be. This means that, in general, code you write in Duby will type infer (for some definition of infer) and produce an AST suitable for compilation to any backend or type system. And with the appropriate plugins, like the "math" typer from my previous Duby post, you can do cool things like represent integer math as either primitive math operations or method calls against object types. But the source code remains the same.

I also wanted to blunt some criticism I received during the first Duby prototype. It represented types as symbols, but symbols nearly identical to their eventual Java types. In order for Duby to be generally useful (in, for example, Rubinius) platform-agnostic symbolic typing is a must.

So here's the code for my little C backend in its current state. It's pretty crude at the moment, but you can get the general idea. There's obviously work to do: it doesn't correctly insert "return" keywords, doesn't insert end-of-line semicolons everywhere appropriate (and often inserts them in totally inappropriate place), and in general doesn't produce entirely valid C code, but those are mostly minor details; the important takeaway here is that the typing and general Duby AST representation can easily be compiled this way.

Ok then! Given a simple script:

def fib(n)
{n => :fixnum}
if n < 2
n
else
fib(n - 1) + fib(n - 2)
end
end

def go
fib(35)
end
We can run the C "compiler" thus:
jruby -rduby/plugin/math lib/ruby/site_ruby/1.8/duby/c_compiler.rb fib.rb
And we get a fib.rb.c file as a result:
int fib(int n) {;
if (n < 2) {n} else {fib(n - 1) + fib(n - 2)};
}

;
int go() {fib(35)}

;
As I said, it's not the prettiest thing in the world, but I think it demonstrates that Duby should easily be able to target pretty much any backend platform you like. Fun stuff!

Wednesday, August 26, 2009

Introducing Surinx

In June of this year, I spent a few hours formulating a dynamic cousin to Duby. Duby, if you don't remember, is a static-typed language with Ruby's syntax and Java's type system. Duby supports all Ruby's literals, uses local type inference (only argument types *must* be declared), and runs as fast as Java (because it produces nearly identical bytecode). But with the advent of invokedynamic, Duby needed a playmate.


Enter "Juby". Juby is intended to be basically like Duby, in that it uses Java's types and Ruby's syntax. But it takes advantage of the new invokedynamic opcode to be 100% dynamic. Juby is a dynamic Duby, or perhaps a dynamic Java with Ruby syntax. It's not hard to comprehend.

But the name was a problem. Juby sounds like "Jewbie", a pejorative term for a Jewish person. So I sent out a call for names to the Twitterverse, ultimately ending up with far more names than I could choose from.

The name I have chosen, Surinx, has a simple story attached. You see, when James Gosling created Java, he originally named it "Oak", after the tree outside his window. So I followed his lead; the tree (a bush, really) outside my window is a lilac, Syringa vulgaris. The genus "Syringa" is derived from New Latin, based on a Greek word "surinx" meaning "shepherd's pipe" from the use of the Syringa plant's hollow stems to make pipes. Perhaps Surinx is building on the "hollow stem" of the JVM to produce something you can smoke (other dynamic languages) with. Combined with its cousin "Duby", we have quite a pair.

And in other news, the simple Surinx implementation of "fib" below (identical to Ruby) manages to run fib(40) in only 7 seconds on current MLVM (OpenJDK7 + invokedynamic and other tidbits), a five-fold improvement over JRuby's fastest mode (jruby --fast).

def fib(a)
if a < 2
a
else
fib(a - 1) + fib(a - 2)
end
end

Given that JRuby has started to support invokedynamic, the solid performance of Surinx bodes very well for JRuby's future.

Please welcome Surinx to your language repertoire!

Saturday, April 03, 2010

Using Ivy with JRuby 1.5's Ant Integration

JRuby 1.5 will be released soon, and one of the coolest new features is the integration of Ant support into Rake, the Ruby build tool. Tom Enebo wrote an article on the Rake/Ant integration for the Engine Yard blog, which has lots of examples of how to start migrating to Rake without leaving Ant behind. I'm not going to cover all that here.

I've been using the Rake/Ant stuff for a few weeks now, first for my "weakling" RubyGem which adds a queue-supporting WeakRef to JRuby, and now for cleaning up Duby's build process. Along the way, I've realized I really never want to write Ant scripts again; they're so much nicer in Rake, and I have all of Ruby and Ant available to me.

One thing Ant still needs help with is dependency resolution. Many people make the leap to Maven, and let it handle all the nuts and bolts. But that only works if you really buy into the Maven way of life...a problem if you're like me and you live in a lot of hybrid worlds where the Maven way doesn't necessarily fit. So many folks are turning to Apache Ivy to get dependency management in their builds without using Maven.

Today I thought I'd translate the simple "no-install" Ivy example build (warning, XML) to Rake, to see how easy it would be. The results are pretty slick.

First we need to construct the equivalent to the "download-ivy" and "install-ivy" ant tasks. I chose to put that in a Rake namespace, like this:

namespace :ivy do
ivy_install_version = '2.0.0-beta1'
ivy_jar_dir = './ivy'
ivy_jar_file = "#{ivy_jar_dir}/ivy.jar"

task :download do
mkdir_p ivy_jar_dir
ant.get :src => "http://repo1.maven.org/maven2/org/apache/ivy/ivy/#{ivy_install_version}/ivy-#{ivy_install_version}.jar",
:dest => ivy_jar_file,
:usetimestamp => true
end

task :install => :download do
ant.path :id => 'ivy.lib.path' do
fileset :dir => ivy_jar_dir, :includes => '*.jar'
end

ant.taskdef :resource => "org/apache/ivy/ant/antlib.xml",
#:uri => "antlib:org.apache.ivy.ant",
:classpathref => "ivy.lib.path"
end
end

Notice that instead of using Ant properties, I've just used Ruby variables for the Ivy install version, dir, and file. I've also removed the "uri" element to ant.taskdef because I'm not sure if we have an equivalent for that in Rake yet (note to self: figure out if we have an equivalent for that).

With these two tasks, we can now fetch ivy and install it for the remainder of the build. Here's running the download task from the command line:
~/projects/duby ➔ rake ivy:download
(in /Users/headius/projects/duby)
mkdir -p ./ivy
Getting: http://repo1.maven.org/maven2/org/apache/ivy/ivy/2.0.0-beta1/ivy-2.0.0-beta1.jar
To: /Users/headius/projects/duby/ivy/ivy.jar

Now we want a simple task that uses ivy:install to fetch resources and make them available for the build. Here's the example from Apache, using the cachepath task, written in Rake:
task :go => "ivy:install" do
ant.cachepath :organisation => "commons-lang",
:module => "commons-lang",
:revision => "2.1",
:pathid => "lib.path.id",
:inline => "true"
end

Pretty clean and simple, and it fits nicely into the flow of the Rakefile. I can also switch this to using the "retrieve" task, which just pulls the jars down and puts them where I want them:
task :go => "ivy:install" do
ant.retrieve :organisation => 'commons-lang',
:module => 'commons-lang',
:revision => '2.1',
:pattern => 'javalib/[conf]/[artifact].[ext]',
:inline => true
end

This fetches the Apache Commons Lang package along with all dependencies into javalib, separated by what build configuration they are associated with (runtime, test, etc). Here it is in action:
~/projects/duby ➔ rake go
(in /Users/headius/projects/duby)
mkdir -p ./ivy
Getting: http://repo1.maven.org/maven2/org/apache/ivy/ivy/2.0.0-beta1/ivy-2.0.0-beta1.jar
To: /Users/headius/projects/duby/ivy/ivy.jar
Not modified - so not downloaded
Trying to override old definition of task buildnumber
:: Ivy 2.1.0 - 20090925235825 :: http://ant.apache.org/ivy/ ::
:: loading settings :: url = jar:file:/Users/headius/.ant/lib/ivy.jar!/org/apache/ivy/core/settings/ivysettings.xml
:: resolving dependencies :: commons-lang#commons-lang-caller;working
confs: [default, master, compile, provided, runtime, system, sources, javadoc, optional]
found commons-lang#commons-lang;2.1 in public
:: resolution report :: resolve 64ms :: artifacts dl 3ms
---------------------------------------------------------------------
| | modules || artifacts |
| conf | number| search|dwnlded|evicted|| number|dwnlded|
---------------------------------------------------------------------
| default | 1 | 0 | 0 | 0 || 1 | 0 |
| master | 1 | 0 | 0 | 0 || 1 | 0 |
| compile | 1 | 0 | 0 | 0 || 0 | 0 |
| provided | 1 | 0 | 0 | 0 || 0 | 0 |
| runtime | 1 | 0 | 0 | 0 || 0 | 0 |
| system | 1 | 0 | 0 | 0 || 0 | 0 |
| sources | 1 | 0 | 0 | 0 || 1 | 0 |
| javadoc | 1 | 0 | 0 | 0 || 1 | 0 |
| optional | 1 | 0 | 0 | 0 || 0 | 0 |
---------------------------------------------------------------------
:: retrieving :: commons-lang#commons-lang-caller
confs: [default, master, compile, provided, runtime, system, sources, javadoc, optional]
4 artifacts copied, 0 already retrieved (1180kB/30ms)

But if I have multiple artifacts, this could be pretty cumbersome. Since this is Ruby, I can just put this in a method and call it repeatedly:
def ivy_retrieve(org, mod, rev)
ant.retrieve :organisation => org,
:module => mod,
:revision => rev,
:pattern => 'javalib/[conf]/[artifact].[ext]',
:inline => true
end

artifacts = %w[
commons-lang commons-lang 2.1
org.jruby jruby 1.4.0
]

task :go => "ivy:install" do
artifacts.each_slice(3) do |*artifact|
ivy_retrieve(*artifact)
end
end

Look for JRuby 1.5 release candidates soon, and let us know what you think of the new Ant integration!

Saturday, March 28, 2009

BiteScript 0.0.1 - A Ruby DSL for JVM Bytecode

I have finally released the first version of BiteScript, my little DSL for generating JVM bytecode. Install it as a gem with "gem install bitescript".

require 'bitescript'

include BiteScript

fb = FileBuilder.build(__FILE__) do
public_class "SimpleLoop" do
public_static_method "main", void, string[] do
aload 0
push_int 0
aaload
label :top
dup
aprintln
goto :top
returnvoid
end
end
end

fb.generate do |filename, class_builder|
File.open(filename, 'w') do |file|
file.write(class_builder.generate)
end
end

BiteScript grew out of my work on Duby. I did not want to call directly into a Java bytecode API like ASM, so I wrapped it with a nice Ruby-like layer. I also wanted the option of having blocks of bytecode look like raw assembly, but also callable as an API.

Currently only two projects I know of make use of BiteScript: Duby and the upcoming Ruby-to-Java "compiler2" in JRuby, which will also be released as a gem.

For a longer example, you can look at tool/compiler2.rb in JRuby, lib/duby/jvm/jvm_compiler.rb in Duby, or an example implementation of Fibonacci in BiteScript.

I'm open to suggestions for how to improve the API, and I'd also like to add the missing Java 5 features. The better BiteScript works, the better Duby and "compiler2" will work.

For folks interested in using BiteScript, the JVM Specification is an easy-to-read complete reference for targeting the JVM, and here is my favorite JVM opcode quickref.

Monday, March 10, 2008

Duby: A Type-Inferred Ruby-Like JVM Language

It's been one of those nights, my friends! An outstanding night!

All day Sunday I had the stupids. It was probably related to a few beers on Saturday night, or the couple glasses of red wine before that. Whatever it was, I didn't trust myself to work on any JRuby bugs for fear of introducing problems if my brain clicked off mid-stream. So I started playing around with my old bytecode generating DSL from a few months back. Then things got interesting.

We've long wanted to have a "Ruby-like" language, probably a subset of Ruby syntax, that we could compile to solid, fast, idiomatic JVM bytecode. Not a compiler for Ruby, with all the bells and whistles that make Ruby both difficult to support and inefficient to use for implementing itself. A real subset language that produces clean, tight JVM bytecode on par with what you'd get from compiled Java. But better, because it still looks and feels mostly like Ruby.

So I wrote one! And I used my bytecode library too!

Let's say we want to implement our good friend fib.

class Foo
def fib(n)
if (n < 2)
n
else
fib(n - 2) + fib(n - 1)
end
end
end
This is normal ruby code. Given a Fixnum input, it calculates the appropriate fibbonaci number and returns it. It's slow in Ruby for a few reasons:
  1. In JRuby, it uses a boxed integer value. Matz's Ruby and Rubinius use tagged integers to improve performance, and we rely on the JVM to optimize as much as it can (which turns out to be a *lot*). But it's still way slower than using primitives directly.
  2. The comparison operations, integer math operations, and fib operations are all dynamic invocations. So there's at least a bit of method lookup cost, and then a bunch of abstraction cost. You can reduce it, but you can't eliminate it.
  3. There are many Ruby features that influence performance negatively even when you're not using them. It's very difficult, for example, to optimally store local variables when the local scope can be captured at any time. So either you rely on tricks, or you store local variables on the heap and deal with them being slow.
When working with a statically-typed language, you can eliminate all of this. In Java, for example, you have both object types and primitive types; primitive operations are extremely fast and eventually JIT down to machine-code equivalents; and the feature set is suitably narrow to allow current JVMs to do amazing levels of optimization.

But of course Java has its problems too. For one, it does very little guessing or "inferring" of types for you, which means you generally have to declare them all over the place. On local variables, on parameters, on return types. C# 3.0 aims to correct this by adding type inference all over, but then there's still other syntactic hassle using any C-like language: curly braces, semicolons, and other gratuitous syntax that make up a lot of visual noise.

Wouldn't be nice if we could take the code above, add some type inference logic, and turn it into a fast, static-typed language?
class Foo
def fib(n)
{n => java.lang.Integer::TYPE, :return => java.lang.Integer::TYPE}
if (n < 2)
n
else
fib(n - 2) + fib(n - 1)
end
end
end
And there it is! This is the same code as before, but now it's been decorated with a little type declaration block (in the form of a Ruby hash/map) immediately preceding the body of the method. The type decl describes that the 'n' argument is to be mapped to a primitive int, and the method itself will return a primitive int (and yes, I know those could be inferred too...it shall be soon). The rest of the method just works like you'd expect, except that it's all primitive operations, chosen based on the inferred types. For the bold, here's the javap disassembly output from the compiler:
Compiled from "superfib.rb"
public class Foo extends java.lang.Object{
public int fib(int);
Code:
0: iload_1
1: ldc #10; //int 2
3: if_icmpge 10
6: iload_1
7: goto 27
10: aload_0
11: iload_1
12: ldc #10; //int 2
14: isub
15: invokevirtual #12; //Method fib:(I)I
18: aload_0
19: iload_1
20: ldc #13; //int 1
22: isub
23: invokevirtual #12; //Method fib:(I)I
26: iadd
27: ireturn

public Foo();
Code:
0: aload_0
1: invokespecial #15; //Method java/lang/Object."<init>":()V
4: return

}
A few items to point out:
  • A default constructor is generated, as you'd expect in Java. This will be expected to also recognize "def initialize" constructors. I haven't decided if I'll allow overloading or not.
  • Notice the type signature for fib and all the type signatures for calls it makes are correctly inferred to the correct types.
  • Notice all the comparison and arithmetic operations are compiled to the correct bytecodes (iadd, isub, if_icmpge and so on).
And the performance is what you'd hope for:
$ jruby -rjava -e "t = Time.now; 5.times {Java::Foo.new.fib(35)}; p Time.now - t"
0.681
$ jruby -rnormalfib -e "t = Time.now; 5.times {Foo.new.fib(35)}; p Time.now - t"
27.851
Here's another example, with some string operations thrown in:
class Foo
def bar
{:return => java.lang.String}

'here'
end
end

class Foo
# reopening classes works in the same file only (for now)
def baz(a)
{a => java.lang.String}

b = "foo"
a = a + bar + b
puts a
end
end
It works, of course:
$ jruby -rjava -e "Java::Foo.new.baz('Type inference is fun')"
Type inference is funherefoo
And once again, the disassembled output:
Compiled from "stringthing.rb"
public class Foo extends java.lang.Object{
public java.lang.String bar();
Code:
0: ldc #13; //String here
2: areturn

public void baz(java.lang.String);
Code:
0: ldc #15; //String foo
2: astore_2
3: aload_1
4: checkcast #17; //class java/lang/String
7: aload_0
8: invokevirtual #19; //Method bar:()Ljava/lang/String;
11: invokevirtual #23; //Method java/lang/String.concat:(Ljava/lang/String;)Ljava/lang/String;
14: checkcast #17; //class java/lang/String
17: aload_2
18: invokevirtual #23; //Method java/lang/String.concat:(Ljava/lang/String;)Ljava/lang/String;
21: astore_1
22: getstatic #29; //Field java/lang/System.out:Ljava/io/PrintStream;
25: aload_1
26: invokevirtual #34; //Method java/io/PrintStream.println:(Ljava/lang/String;)V
29: return

public Foo();
Code:
0: aload_0
1: invokespecial #36; //Method java/lang/Object."<init>":()V
4: return

}
Notice here that the + operation was detected as acting on two strings, so it was compiled to call String#concat rather than try to do a numeric operation. These sorts of mappings are simple to add, and since there's type information everywhere it's also easy to come up with cool ways to map Ruby syntax to Java types.

My working name for this is going to be Duby, pronounced "Doobie", after Duke plus Ruby. Duby! It has a nice ring to it. It may be subject to change, but that's what we'll go with for the moment.

Currently, Duby supports only the features you see here. It's a very limited subset of Ruby right now, and the subset doesn't support all Java primitive types, for example, so there's a lot of blanks to be filled. It also doesn't support static (class) methods, doesn't wire up "initialize" methods, doesn't support packages (for namespacing) or imports (to shrink type names) or superclasses or interfaces or enums or generics or what have you. But it's already functional for simple code, as you see here, so I think it's a great start for 10 hours of work. The rest will come, as needed and as time is available.

What are my plans for it? Well many of you may know Rubinius, an effort to reimplement Ruby in Ruby, modeled after the design of many Smalltalk VMs. Well, in order to make JRuby more approachable to Ruby developers, Duby seems like the best way to go. They'll be able to write mostly idiomatic Ruby code and know it will both perform at Java speeds and provide compile-type checking that all is wired up correctly. So we can avoid the overhead of "turtles all the way down" by just teaching one turtle how to speak JVM bytecode and building on that.

I also hope this will lead to lots of new ideas about how to implement languages for the JVM. It's certainly attractive to language users to be able to contribute to language implementations using the language in question, and if we can come up with compilers and sub-languages like Duby it could make the JVM more approachable to a wide range of developers.

Oh, and for those of you curious: the Duby compiler is written mostly in Ruby too.

Thursday, April 02, 2009

How JRuby Makes Ruby Fast

At least once a year there's a maelstrom of posts about a new Ruby implementation with stellar numbers. These numbers are usually based on very early experimental code, and they are rarely accompanied by information on compatibility. And of course we love to see crazy performance numbers, so many of us eat this stuff up.

Posting numbers too early is a real disservice to any project, since they almost certainly don't represent the eventual real-world performance people will see. It encourages folks to look to the future, but it also marginalizes implementations that already provide both compatibility and performance, and ignores how much work it has taken to get there. Given how much we like to see numbers, and how thirsty the Ruby community is for "a fastest Ruby", I don't know whether this will ever change.

I thought perhaps a discussion about the process of optimizing JRuby might help folks understand what's involved in building a fast, compatible Ruby implementation, so that these periodic shootouts don't get blown out of proportion. Ruby can be fast, certainly even faster than JRuby is today. But getting there while maintaining compatibility is very difficult.

Performance Optimization, JRuby-style

The truth is it's actually very easy to make small snippits of Ruby code run really fast, especially if you optimize for the benchmark. But is it useful to do so? And can we extrapolate eventual production performance from these early numbers?

We begin our exploration by running JRuby in interpreted mode, which is the slowest way you can run JRuby. We'll be using the "tak" benchmark, since it's simple and easy to demonstrate relative performance at each optimization level.
# Takeuchi function performance, tak(24, 16, 8)
def tak x, y, z
if y >= x
return z
else
return tak( tak(x-1, y, z),
tak(y-1, z, x),
tak(z-1, x, y))
end
end

require "benchmark"

N = (ARGV.shift || 1).to_i

Benchmark.bm do |make|
N.times do
make.report do
i = 0
while i<10
tak(24, 16, 8)
i+=1
end
end
end
end

And here's our first set of results. I have provided Ruby 1.8.6 and Ruby 1.9.1 numbers for comparison.
Ruby 1.8.6p114:
➔ ruby bench/bench_tak.rb 5
user system total real
17.150000 0.120000 17.270000 ( 17.585128)
17.170000 0.140000 17.310000 ( 17.946869)
17.180000 0.160000 17.340000 ( 18.234570)
17.180000 0.150000 17.330000 ( 17.779536)
18.790000 0.190000 18.980000 ( 19.560232)

Ruby 1.9.1p0:
➔ ruby191 bench/bench_tak.rb 5
user system total real
3.570000 0.030000 3.600000 ( 3.614855)
3.570000 0.030000 3.600000 ( 3.615341)
3.560000 0.020000 3.580000 ( 3.608843)
3.570000 0.020000 3.590000 ( 3.591833)
3.570000 0.020000 3.590000 ( 3.640205)

JRuby 1.3.0-dev, interpreted, client VM
➔ jruby -X-C bench/bench_tak.rb 5
user system total real
24.981000 0.000000 24.981000 ( 24.903000)
24.632000 0.000000 24.632000 ( 24.633000)
25.459000 0.000000 25.459000 ( 25.459000)
29.122000 0.000000 29.122000 ( 29.122000)
29.935000 0.000000 29.935000 ( 29.935000)

Ruby 1.9 posts some nice numbers here, and JRuby shows how slow it can be when doing no optimizations at all. The first change we look at, and which we recommend to any users seeking best-possible performance out of JRuby, is to use the JVM's "server" mode, which optimizes considerably better.
JRuby 1.3.0-dev, interpreted, server VM
➔ jruby --server -X-C bench/bench_tak.rb 5
user system total real
8.262000 0.000000 8.262000 ( 8.192000)
7.789000 0.000000 7.789000 ( 7.789000)
8.012000 0.000000 8.012000 ( 8.012000)
7.998000 0.000000 7.998000 ( 7.998000)
8.000000 0.000000 8.000000 ( 8.000000)

The "server" VM differs from the default "client" VM in that it will optimistically inline code across calls and optimize the resulting code as a single unit. This obviously allows it to eliminate costly x86 CALL operations, but even more than that it allows optimizing algorithms which span multiple calls. By default, OpenJDK will attempt to inline up to 9 levels of calls, so long as they're monomorphic (only one valid target), not too big, and no early assumptions are changed by later code (like if a monomorphic call goes polymorphic later on). In this case, where we're not yet compiling Ruby code to JVM bytecode, this inlining is mostly helping JRuby's interpreter, core classes, and method-call logic. But already we're 3x faster than interpreted JRuby on the client VM.

The next optmization will be to turn on the compiler. I've modified JRuby for the next couple runs to *only* compile and not do any additional optimizations. We'll discuss those optimizations as I add them back.
JRuby 1.3.0-dev, compiled (unoptimized), server VM:
➔ jruby --server -J-Djruby.astInspector.enabled=false bench/bench_tak.rb 5
user system total real
5.436000 0.000000 5.436000 ( 5.376000)
3.655000 0.000000 3.655000 ( 3.655000)
3.662000 0.000000 3.662000 ( 3.662000)
3.683000 0.000000 3.683000 ( 3.683000)
3.668000 0.000000 3.668000 ( 3.668000)

By compiling, without doing any additional optimizations, we're able to improve performance 2x again. Because we're now JITing Ruby code as JVM bytecode, and the JVM eventually JITs JVM bytecode to native code, our Ruby code actually starts to benefit from the JVM's built-in optimizations. We're making better use of the system CPU and not making nearly as many calls as we would from the interpreter (since the interpreter is basically a long chain of calls for each low-level Ruby operation.

Next, we'll turn on the simplest and oldest JRuby compiler optimization, "heap scope elimination".
JRuby 1.3.0-dev, compiled (heap scope optz), server VM:
➔ jruby --server bench/bench_tak.rb 5
user system total real
4.014000 0.000000 4.014000 ( 3.942000)
2.776000 0.000000 2.776000 ( 2.776000)
2.760000 0.000000 2.760000 ( 2.760000)
2.769000 0.000000 2.769000 ( 2.769000)
2.768000 0.000000 2.768000 ( 2.769000)

The "heap scope elimination" optimization eliminates the use of an in-memory store for local variables. Instead, when there's no need for local variables to be accessible outside the context of a given method, they are compiled as Java local variables. This allows the JVM to put them into CPU registers, making them considerably faster than reading or writing them from/to main memory (via a cache, but still slower than registers). This also makes JRuby ease up on the JVM's memory heap, since it no longer has to allocate memory for those scopes on every single call. This now puts us comfortably faster than Ruby 1.9, and it represents the set of optimizations you see in JRuby 1.2.0.

Is this the best we can do? No, we can certainly do more, and some such experimental optimizations are actually already underway. Let's continue our exploration by turning on another optimization similar to the previous one: "backtrace-only frames".
JRuby 1.3.0-dev, compiled (heap scope + bracktrace frame optz), server VM:
➔ jruby --server -J-Djruby.compile.frameless=true bench/bench_tak.rb 5
user system total real
3.609000 0.000000 3.609000 ( 3.526000)
2.600000 0.000000 2.600000 ( 2.600000)
2.602000 0.000000 2.602000 ( 2.602000)
2.598000 0.000000 2.598000 ( 2.598000)
2.602000 0.000000 2.602000 ( 2.602000)

Every Ruby call needs to store information above and beyond local variables. There's the current "self", the current method visibility (used for defining new methods), which class is currently the "current" one, backref and lastline values ($~ and $_), backtrace information (caller's file and line), and some other miscellany for handling long jumps (like return or break in a block). In most cases, this information is not used, and so storing it and pushing/popping it for every call wastes precious time. In fact, other than backtrace information (which needs to be present to provide Ruby-like backtrace output), we can turn most of the frame data off. This is where we start to break Ruby a bit, though there are ways around it. But you can see we get another small boost.

What if we eliminate frames entirely and just use the JVM's built-in backtrace logic? It turns out that having any pushing/popping of frames, even with only backtrace data, still costs us quite a bit of performance. So let's try "heap frame elimination":
JRuby 1.3.0-dev, compiled (heap scope + heap frame optz), server VM:
➔ jruby --server -J-Djruby.compile.frameless=true bench/bench_tak.rb 5
user system total real
2.955000 0.000000 2.955000 ( 2.890000)
1.904000 0.000000 1.904000 ( 1.904000)
1.843000 0.000000 1.843000 ( 1.843000)
1.823000 0.000000 1.823000 ( 1.823000)
1.813000 0.000000 1.813000 ( 1.813000)

By eliminating frames entirely, we're a good 33% faster than the fastest "fully framed" run you'd get with stock JRuby 1.2.0. You'll notice the command line here is the same; that's because we're venturing into more and more experimental code, and in this case I've actually forced "frameless" to be "no heap frame" instead of "backtrace-only heap frame". And what do we lose with this change? We no longer would be able to produce a backtrace containing only Ruby calls, so you'd see some JRuby internals in the trace, similar to how Rubinius shows Rubinius internals. But we're getting respectably fast now.

Next up we'll turn on some optimizations for math operators.
JRuby 1.3.0-dev, compiled (heap scope, heap frame, fastops optz), server VM:
➔ jruby --server -J-Djruby.compile.frameless=true -J-Djruby.compile.fastops=true bench/bench_tak.rb 5
user system total real
2.291000 0.000000 2.291000 ( 2.225000)
1.335000 0.000000 1.335000 ( 1.335000)
1.337000 0.000000 1.337000 ( 1.337000)
1.344000 0.000000 1.344000 ( 1.344000)
1.346000 0.000000 1.346000 ( 1.346000)

Most of the time, when calling + or - on an object, we do the full Ruby dynamic dispatch cycle. Dispatch involves retrieving the target object's metaclass, querying for a method (like "+" or "-"), and invoking that method with the appropriate arguments. This works fine for getting us respectable performance, but we want to take things even further. So JRuby has experimental "fast math" operations to turn most Fixnum math operators into static calls rather than dynamic ones, allowing most math operations to inline directly into the caller. And what do we lose? This version of "fast ops" makes it impossible to override Fixnum#+ and friends, since whenever we call + on a Fixnum it's going straight to the code. But it gets us another nearly 30% improvement.

Up to now we've still also been updating a lot of per-thread information. For every line, we're tweaking a per-thread field to say what line number we're on. We're also pinging a set of per-thread fields to handle the unsafe "kill" and "raise" operations on each thread...basically we're checking to see if another thread has asked the current one to die or raise an exception. Let's turn all that off:

JRuby 1.3.0-dev, compiled (heap scope, heap frame, fastops, threadless, positionless optz), server VM:
➔ jruby --server -J-Djruby.compile.frameless=true -J-Djruby.compile.fastops=true -J-Djruby.compile.positionless=true -J-Djruby.compile.threadless=true bench/bench_tak.rb 5
user system total real
2.256000 0.000000 2.256000 ( 2.186000)
1.304000 0.000000 1.304000 ( 1.304000)
1.310000 0.000000 1.310000 ( 1.310000)
1.307000 0.000000 1.307000 ( 1.307000)
1.301000 0.000000 1.301000 ( 1.301000)

We get a small but measurable performance boost from this change as well.

The experimental optimizations up to this point (other than threadless) comprise the set of options for JRuby's --fast option, shipped in 1.2.0. The --fast option additionally tries to statically inspect code to determine whether these optimizations are safe. For example, if you're running with --fast but still access backrefs, we're going to create a frame for you anyway.

We're not done yet. I mentioned earlier the JVM gets some of its best optimizations from its ability to profile and inline code at runtime. Unfortunately in current JRuby, there's no way to inline dynamic calls. There's too much plumbing involved. The upcoming "invokedynamic" work in Java 7 will give us an easier path forward, making dynamic calls as natural to the JVM as static calls, but of course we want to support Java 5 and Java 6 for a long time. So naturally, I have been maintaining an experimental patch that eliminates most of that plumbing and makes dynamic calls inline on Java 5 and Java 6.
JRuby 1.3.0-dev, compiled ("--fast", dyncall optz), server VM:
➔ jruby --server --fast bench/bench_tak.rb 5
user system total real
2.206000 0.000000 2.206000 ( 2.066000)
1.259000 0.000000 1.259000 ( 1.259000)
1.258000 0.000000 1.258000 ( 1.258000)
1.269000 0.000000 1.269000 ( 1.269000)
1.270000 0.000000 1.270000 ( 1.270000)

We improve again by a small amount, always edging the performance bar higher and higher. In this case, we don't lose compatibility, we lose stability. The inlining modification breaks method_missing and friends, since I have not yet modified the call pipeline to support both inlining and method_missing. And there's still a lot of extra overhead here that can be eliminated. But in general we're still mostly Ruby, and even with this change you can run a lot of code.

This represents the current state of JRuby. I've taken you all the way from slow, compatible execution, through fast, compatible execution, and all the way to faster, less-compatible execution. There's certainly a lot more we can do, and we're not yet as fast as some of the incomplete experimental Ruby VMs. But we run Ruby applications, and that's no small feat. We will continue making measured steps, always ensuring compatibility first so each release of JRuby is more stable and more complete than the last. If we don't immediately leap to the top of the performance heap, there's always good reasons for it.

Performance Optimization, Duby-style

As a final illustration, I want to show the tak performance for a language that looks like Ruby, and tastes like Ruby, but boasts substantially better performance: Duby.
def tak(x => :fixnum, y => :fixnum, z => :fixnum)
unless y < x
z
else
tak( tak(x-1, y, z),
tak(y-1, z, x),
tak(z-1, x, y))
end
end

puts "Running tak(24,16,8) 1000 times"

i = 0
while i<1000
tak(24, 16, 8)
i+=1
end

This is the Takeuchi function written in Duby. It looks basically like Ruby, except for the :fixnum type hints in the signature. Here's a timing of the above script (which calls tak the same as before but 1000 times instead of 5 times), running on the server JVM:
➔ time jruby -J-server bin/duby examples/tak.duby
Running tak(24,16,8) 1000 times

real 0m13.657s
user 0m14.529s
sys 0m0.450s

So what you're seeing here is that Duby can run "tak(24,16,8)", the same function we tested in JRuby above, in an average of 0.013 seconds--nearly two orders of magnitude faster than the fastest JRuby optimizations above and at least an order of magnitude faster than the fastest incomplete, experimental implementations of Ruby. What does this mean? Absolutely nothing, because Duby is not Ruby. But it shows how fast a Ruby-like language can get, and it shows there's a lot of runway left for JRuby to optimize.

Be a (Supportive) Critic!

So the next time someone posts an article with crazy-awesome performance numbers for a Ruby implementation, by all means applaud the developers and encourage their efforts, since they certainly deserve credit for finding new ways to optimize Ruby. But then ask yourself and the article's author how much of Ruby the implementation actually supports, because it makes a big difference.

Update, April 4: Several people told me I didn't go quite far enough in showing that by breaking Ruby you could get performance. And after enough cajoling, I was convinced to post one last modification: recursion optimization.
JRuby 1.3.0-dev, compiled ("--fast", dyncall optz, recursion optz), server VM:
➔ jruby --server --fast bench/bench_tak.rb 5
user system total real
0.524000 0.000000 0.524000 ( 0.524000)
0.338000 0.000000 0.338000 ( 0.338000)
0.325000 0.000000 0.325000 ( 0.325000)
0.299000 0.000000 0.299000 ( 0.299000)
0.310000 0.000000 0.310000 ( 0.310000)

Woah! What the heck is going on here? In this case, JRuby's compiler has been hacked to turn recursive "functional calls", i.e. calls to an implicit "self" receiver, into direct calls. The logic behind this is that if you're calling the current method from the current method, you're going to always dispatch back to the same piece of code...so why do all the dynamic call gymnastics? This fits a last piece into the JVM inlining-optimization puzzle, allowing mostly-recursive benchmarks like Takeuchi to inline more of those recursive calls. What do we lose? Well, I'm not sure yet. I haven't done enough testing of this optimization to know whether it breaks Ruby in some subtle way. It may work for 90% of cases, but fail for an undetectable 10%. Or it may be something we can determine statically, or something for which we can add an inexpensive guard. Until I know, it won't go into a release of JRuby, at least not as a default optimization. But it's out there, and I believe we'll find a way.

It is also, incidentally, only a few times slower than a pure Java version of the same benchmark, provided Java is using all boxed numerics too.

Friday, June 27, 2008

JRuby Japanese Tour 2008 Wrap-Up!

Whew! I survived the insanity of the JRuby Japanese Tour 2008, and now it's time to report on it. This post will mostly be a blow-by-blow account of the trip, and I'll try to post more in-depth thoughts later. I am still in Tokyo and need to repack my luggage, so this will be brief.

Day 1

  • Left my warm, sunny vacation in Michigan to board flight #1 from Chicago to Minneapolis
  • First-class upgrade for Chicago-Minneapolis flight. Whoopie...it's like an hour flight.
  • Just enough time in Minneapolis airport to change some money. Hopefully 20k ¥ will be enough. Compatible cash machines are hard to come by in Japan.
  • Twelve-hour flight #2 from Minneapolis to Narita. Glad I moved to a window seat facing a bulkhead, since I was able to stretch out and sleep most of the flight.
  • Arrived in Narita without event; purchased bus ticket to Tsukuba and rented a cell phone.
  • Bus ride to Tsukuba went through some very nice countryside.
  • Arrived in Tsukuba, walked about five minutes to the hotel and ran into Chad Fowler, Evan Phoenix, Rich Kilmer and others gathering in the lobby.
  • Checked into my room, dropped off my stuff, went back down to lobby to find other American rubyists had taken off for dinner. Teh suck. Ate unimpressive dinner alone in hotel restaurant.
Day 2
  • Ruby Kaigi day one.
  • Sun folks provided a JRuby t-shirt, which was great because I forgot to pack one of mine!
  • Delivered JRuby presentation, and it went very well. Demos almost all worked perfectly, lots of questions showed people were impressed.
  • Met up with Sun guys at Kaigi booth for a bit. They were giving away little Duke+Ruby candies! Awesome!
  • At evening event, lots of discussion about JRuby, and I got to show off Duby a little bit too.
Day 3
  • Ruby Kaigi day two.
  • Some good talks, but definitely a "day two" slate.
  • Especially liked Naoto Takai and Koichiro Ohba's enterprise Ruby talk. Very pragmatic, hopefully very helpful for .jp Rubyists.
  • Met up with ko1 and Prof Kakehi from Tokyo University to discuss progress of MVM collaboration.
  • Had to leave before Reject Kaigi to transfer to a hotel near Haneda Airport.
  • Dinner at Haneda Airport with Takashi Shitamichi of Sun KK.
  • Stayed at nearby comfortable JAL Hotel, basically JAL's airport hotel.
Day 4
  • Quick breakfast at Haneda Airport; "morning set" included a half-boiled egg, toast, salad, coffee.
  • Flight #3 from Haneda to Izumo. Upgraded for 1000¥ to first class.
  • Takashi and I met Matsue folks at Izumo airport for a short ride to Matsue.
  • Met up with NaCl folks including Matz, Shugo Maeda, and others.
  • Listened to a presentation in English about a large Ruby app migrated from an old COBOL mainframe app. JRuby used to interface Ruby with reporting solutions.
  • Lunch box at NaCl after presentation.
  • Played Shogi with Shugo after lunch. Shugo beat me pretty handily. He said I was strong (but I think he was just being polite).
  • Played Igo (Go) with Hideyuki Yasuda. He played with a 9-stone handicap and was winning when I had to leave. He invited me to join NaCl Igo club on KGS and said he'd like to continue the game online.
  • Delivered a lecture on JRuby with Takashi at Shimane University. Received some good questions, but it was a tough crowd (kids just starting out in CS).
  • Took a quick tour of Matsue Castle with Takashi and a personal guide. Basically ran to the top, looked around, and ran back down. Back to work!
  • Evening event at office of the "Ruby City Matsue" project. While everything was being set up, demonstrated JRuby/JVM/HotSpot optimizations for NaCl folks.
  • Delivered the opening toast for the evening event. Barely had time to eat between questions from folks. Showed off Duby and more HotSpot JIT-logging coolness.
  • Post-event trip to local Irish pub. Many Guinness were drunk. Many Rubyists were drunk.
  • Walked back to hotel earlier than others to get some sleep.
Day 5
  • Took a bus from Matsue back to Izumo Airport.
  • Flights #4 and #5 took me from Izumo to Haneda and Haneda to Fukuoka. Saw Mount Fuji from the air, poking just above the clouds.
  • Takashi and I missed flight #5, so we were delayed about an hour.
  • Arrived in Fukuoka, immediately raced over to Ruby Business Commons event. Delivered presentation to an extremely receptive crowd.
  • Post RBC event included beer, various dried fishy things, and lots of photo-taking and card-exchanging. Showed off Duby again, and met authors of an upcoming Japanese book on JRuby!
  • Invited out for more drinking (famous Fukuoka Shouchu), but Takashi wisely told them we were very tired.
Day 6
  • Breakfast at Izumo airport. "Honey toast" and coffee. Basically a thick slice of toast with butter and honey.
  • Flight back to Tokyo (Haneda) from Izumo.
  • Checked into Cerulean Tower Hotel in Shibuya, my final home for the trip.
  • Off to Shinagawa to present JRuby at Rakuten's offices. Across the street from Namco/Bandai! Lots of great questions...I think they were impressed.
  • Back to Shibuya with Takashi for Shabu-Shabu dinner. Mid-range Japanese beef...truly excellent. Ate way too much.
Day 7
  • Woke up a couple times in the night with indigestion. Why oh why did I eat so much beef?
  • Off to Sun KK offices in Yoga for a public techtalk event.
  • JRuby presentation wowed attendees...lots of questions after and great discussions.
  • Traditional Japanese-style dinner with Takai-san and Ohba-san, plus the excellent Sun KK JRuby enthusiasts.
  • Witnessed registration of new jruby-users.jp site and discussed new mascot ideas for JRuby. NekoRuby, perhaps?
  • Many new consumption firsts: Hoppy (cheap beer plus shouchu on ice), raw beef, and raw horse. I can check horse off my list.
  • Ramen in Shibuya to close out the night.
Day 8
  • Internal presentation on JRuby at Sun KK offices. Slim attendance...there were apparently HR training sessions at the same time. But still fun.
  • Big sigh of relief at being done presenting. Lunch at Chinese restaurant with Takashi and said goodbye. どもありがとございます, Shitamichi-san!
  • Stopped into hotel to arrange remaining plans for the day.
  • Visted Edo-Tokyo Museum in Ryogoku, to see images and artifacts of the old capital. Choked up a bit watching videos of the incendiary bombing raids by the US on Tokyo.
  • Headed to Akihabara to meet up with ko1. Wandered around for about an hour before heading up to his "Sasada-lab".
  • Mini "JRuby Kaigi" at Sasada-lab. Showed off Duby, JRuby optimizations, and Ruby-Processing demos (plus applets!).
  • Off to "Meido Kissa" ("Maid Cafe") with ko1 and other members of ruby-core. Very unusual experience, but entertaining.
  • Back to hotel. Considered a trip to my favorite Belgian beer bar in Shibuya ("Belgo"), but I'm plumb tuckered out. Catching up on email, IRC, and writing this post.
Day 9
  • All that remains is getting to Narita and flying home. Hopefully all will go smoothly.
Well, that's about it. Any names I've excluded I'll hopefully include in more detailed posts later. To all the Japanese Rubyists (and JRubyists): thank you so much for making me feel so welcome, and I hope we can work together to make JRuby the best Ruby implementation it can possibly be. Please feel free to email me (Japanese is ok!) or find me on IRC (Japanese is ok!) and please try to help the jruby-users.jp site be a success when it is ready. I think there's a tremendous future for JRuby in Japan!

Monday, October 27, 2008

Duby Presentation at Ruby Users of MN

Here's the presentation I gave tonight on Duby. I'm putting together an in-depth post for release this week.

Monday, July 27, 2009

JRuby's Importance to Ruby, and eRubyCon 2009

I'm going to be speaking about JRuby again this year at eRubyCon, in Columbus OH. I just got back from Rails Underground, which reminded me how much I love the smaller regional Ruby conferences. So I'm totally pumped to go to eRubyCon this year.


The idea of "Enterprise Ruby" has become less repellant since Dave Thomas's infamous keynote at RalsConf 2006. There are a lot of large, lumbering organizations out there that have yet to adopt any of the newer agile language/framework combinations, and Rails has most definitely led the way. I personally believe that in order for Ruby to become more than just a nice language with a great community, it needs to gain adoption in those organizations, and it needs to do it damn quickly. JRuby is by far the best way for that to happen.

There's another aspect to adoption I think has escaped a lot of Rubyists. In 2006 and 2007, Ruby gained a lot of Java developers who were running away from bloated, over-complicated frameworks and the verbosity and inelegance of Java. When I asked at Ruby conferences in 2005, 2006, and 2007 how many people had done Java development in a former life, almost everyone in the room raised their hands. When I've asked the same question in 2008 and 2009, it's down to less than half the room. Where did they go?

The truth is that the Java platform now has reasonably good answers to Ruby in Groovy, Scala, and Clojure, and reasonably good answers to Rails in Grails and Lift. And yet many Rubyists don't realize how important it is for JRuby to continue doing well, many still seeing it as simply "nice to have" while dismissing the entirety of the Java platform as unimportant to Ruby's future. It's an absurd position, but I blame myself for not making this case sooner.

I believe that JRuby is the most crucial technology for Ruby's future right now. Regardless of how fast or how solid the C or C++ based Ruby implementations get, the vast majority of large organizations are *never* going to run them. That's the truth. If we can leverage JRuby to grab 1-2% of the Java market, we'll *double* the size of the Ruby community. If we completely lose the Java platform to alternatives, Rubyists may not have the luxury of remaining Rubyists in the future. It's that big a deal.

So I hope you'll come by eRubyCon and hear what we've been working on in JRuby and what we have planned for the future, especially our work on making JRuby a stronger JVM citizen. I'm certain to expand on the Hibernate-based prototype code I showed at Rails Underground, and hope to have some additional, never-before-seen demonstrations that will shock and amaze you. And if there's time, I'll demonstrate my two research pets, the "Ruby Mutant" twins Duby and Juby.

See you there!

Sunday, March 28, 2010

Ruby Summer of Code 2010

This year, no major Ruby organization got accepted to Google's Summer of Code (even though a half dozen Python projects got accepted, but I won't rant here). What do we as Rubyists do? Take it sitting down? NO! We make our own Summer of Code!

Thanks to Engine Yard, Ruby Central, and the Rails team, Ruby Summer of Code has raised $100k in just three days, allowing us to run 20 student projects! Hooray!

Ruby Summer of Code

Now of course we really would love to have some JRuby projects involved. There's so much exciting stuff going on with JRuby, and I believe it's the most promising platform for really growing the Ruby community. So we've set up a page for JRuby Ruby Summer of Code 2010 ideas. Here's a few to get you started:

  • JRuby on Android work, including command-line tooling, performance work, and all the little bits and pieces needed to make Ruby a first-class Android language.
  • Porting key C extensions to JRuby, so there's an alternative for people migrating.
  • A super-fast lightweight server similar to the GlassFish gem.
  • A full Hibernate and/or JPA backend for DataMapper or DataObjects, so that all databases Hibernate supports "just work" with JRuby.
  • Work on JRuby's nascent suport for Ruby C extensions by building the API out
  • Help get JRuby's early optimizing compiler wired up, to take JRuby's perf to the next level
  • Duby-related projects, like IDE support, better tooling, codebase cleanup, features, documentation.
And there's dozens of other projects out there just waiting for you! Add yourself as a student on the RubySOC page, add some ideas to the JRuby ideas page, and let's get hacking!

Tuesday, March 10, 2009

Compiling Ruby to Java Types

"Compiler #2" as it has been affectionately called is a compiler to turn normal Ruby classes into Java classes, so they can be constructed and called by normal Java code. When I asked for 1.3 priorities, this came out way at the top. Tom thought perhaps I asked for trouble putting it on the list, and he's probably right (like asking "prioritize these: sandwich, pencil, shiny gold ring with 5kt diamond, banana"), but I know this has been a pain point for people.

I have just landed an early prototype of the compiler on trunk. I made a few decisions about it today:

  • It will use my bytecode DSL "BiteScript", just like Duby does
  • It will use the *runtime* definition of a class to generate the Java version
The second point is an important one. Instead of having an offline compiler that inspects a file and generates code from it, the compiler will actually used the runtime class to create a Java version. This means you'll be able to use all the usual metaprogramming facilities, and at whatever point the compiler picks up your class it will see all those methods.

Here's an example:

# myruby.rb
require 'rbconfig'

class MyRubyClass
def helloWorld
puts "Hello from Ruby"
end
if Config::CONFIG['host_os'] =~ /mswin32/
def goodbyeWorld(a)
puts a
end
else
def nevermore(*a)
puts a
end
end
end

Here we have a class that defines two methods. The first, always defined, is helloWorld. The second is conditionally either goodbyeWorld or nevermore, based on whether we're on Windows. Yes, it's a contrived example...bear with me.

The compiler2 prototype can be invoked as follows (assuming bitescript is checked out into ../bitescript):

jruby -I ../bitescript/lib/ tool/compiler2.rb MyObject MyRubyClass myruby

A breakdown of these arguments is as follows:
  • -I ../bitescript/lib includes bitescript
  • tool/compiler2.rb is the compiler itself
  • MyObject is the name we'd like the Java class to have
  • MyRubyClass is the name of the Ruby class we want it to front
  • myruby is the library we want it to require to load that class
Running this on OS X and dumping the resulting Java class gives us:

Compiled from "MyObject.java.rb"
public class MyObject extends org.jruby.RubyObject{
static {};
public MyObject();
public org.jruby.runtime.builtin.IRubyObject helloWorld();
public org.jruby.runtime.builtin.IRubyObject nevermore(org.jruby.runtime.builtin.IRubyObject[]);
}

The first thing to notice is that the compiler has generated a method for nevermore, since I'm not on Windows. I believe this will be unique among dynamic languages on the JVM: we will make the *runtime* set of methods available through the Java type, not just the static set present at compile time.

Because there are no type signatures specified for MyRubyClass, all types have defaulted to IRubyObject. Type signature logic will come along shortly. And notice also this extends RubyObject; a limitation of the current setup is that you won't be able to use compiler2 to create subclasses. That will come later.

Once you've run this, you've got a MyObject that can be instantiated and used directly. Behind the scenes, it uses a global JRuby instance, so JRuby's still there and you still need it in classpath, but you won't have to instantiate a runtime, pass it around, and so on. It should make integrating JRuby into Java frameworks that want a real class much easier.

So, thoughts? Questions? Have a look at the code under tool/compiler2.rb in JRuby's repository. The entire compiler is so far only 78 lines of Ruby code.