Friday, October 31, 2008

FFI for Ruby Now Available

One of the largest problems plaguing Ruby implementations (and plaguing some other language implementations, so I hear from my Pythonista friends) is the ever-painful story of "extensions". In general, these take the form of a dynamic library, usually written in C, that plugs into and calls Ruby's native API as exposed through ruby.h and libruby. Ignoring for the moment the fact that this API exposes way more of Ruby's internals than it should, extensions present a very difficult problem for other implementations:

Do we support them or not?

In many cases, this question is answered for us; most extensions require access to object internals we can't expose, or can't expose without extremely expensive copying back and forth. But there's also a silver lining: the vast majority of C-based extensions exist solely to wrap another library.

Isn't it obvious what's needed here?

This problem has been tackled by a number of libraries on a number of platforms. On the JVM, there's Java Native Access (JNA). On Python, there's ctypes. And even on Ruby, there's the "dl" stdlib, wrapping libdl for programmatic access to dynamic libraries. But dl is not widely used, because of real or perceived bugs and a rather arcane API. Something better is needed.

Enter FFI.

FFI stands for Foreign Function Interface. FFI has been implemented in various libraries; one of them, libffi, actually serves as the core of JNA, allowing Java code to load and call arbitrary C libraries. libffi allows code to load a library by name, retrieve a pointer to a function within that library, and invoke it, all without static bindings, header files, or any compile phase.

In order to address a need early in Rubinius's dev cycle, Evan Phoenix came up with an FFI library for Rubinius, wrapping the functionality of libffi in a friendly Ruby DSL-like API.

A simple FFI script calling the C "getpid" function:

require 'ffi'

module GetPid
extend FFI::Library

attach_function :getpid, [], :uint
end

puts GetPid.getpid
Because JRuby already ships with JNA, and because FFI could fulfill the C-extension needs of almost all Ruby users, we endeavored to create a compatible implementation. And by we I mean Wayne Meissner.

Wayne is one of the primary maintainers of JNA, and has recently spent time on a new higher-performance version of it called JFFI. Wayne also became a JRuby committer this spring, and perhaps his most impressive contribution to date is a full FFI library for JRuby, based on JNA (eventually JFFI, once we migrate fully) and implementing the full set of what we and Evan agreed would be "FFI API 1.0". We shipped the completed FFI support in JRuby 1.1.4.

The "Passwd" and "Group" structures for functions like 'getpwuid':
module Etc
class Passwd < FFI::Struct
layout :pw_name, :string, 0,
:pw_passwd, :string, 4,
:pw_uid, :uint, 8,
:pw_gid, :uint, 12,
:pw_dir, :string, 20,
:pw_shell, :string, 24
end
class Group < FFI::Struct
layout :gr_name, :string, 0,
:gr_gid, :uint, 8
end
end
In JRuby 1.1.5, we've taken another step forward with the API, adding support for callbacks. How would you represent a callback you pass into a C function from Ruby? How else! As a block!

Binding and calling "qsort" with an array of integers:
require 'ffi'

module LibC
extend FFI::Library
callback :qsort_cmp, [ :pointer, :pointer ], :int
attach_function :qsort, [ :pointer, :int, :int, :qsort_cmp ], :int
end

p = MemoryPointer.new(:int, 2)
p.put_array_of_int32(0, [ 2, 1 ])
puts "Before qsort #{p.get_array_of_int32(0, 2).join(', ')}"
LibC.qsort(p, 2, 4) do |p1, p2|
i1 = p1.get_int32(0)
i2 = p2.get_int32(0)
i1 < i2 ? -1 : i1 > i2 ? 1 : 0
end
puts "After qsort #{p.get_array_of_int32(0, 2).join(', ')}"
But what good is having such a library if it doesn't run everywhere? Up until recently, only Rubinius and JRuby supported FFI, which made our case for cross-implementation use pretty weak. Even though we were getting good use out of FFI, there was no motivation for anyone to use it in general, since the standard Ruby implementation had no support.

That is, until Wayne pulled another rabbit out of his hat and implemented FFI for C Ruby as well. The JRuby team is proud to announce a wholly non-JRuby library: FFI is now available on Ruby 1.9 and Ruby 1.8.6/7, in addition to JRuby 1.1.4+ and Rubinius (though Rubinius does not yet support callbacks).

Session showing installation and use of FFI in C Ruby:
$ sudo gem install ffi
Password:
Building native extensions. This could take a while...
Successfully installed ffi-0.1.1
1 gem installed
Installing ri documentation for ffi-0.1.1...
Installing RDoc documentation for ffi-0.1.1...
[headius @ cnutter:~]
$ irb
>> require 'ffi'
=> true
>> module RubyFFI
>> extend FFI::Library
>> attach_function :getuid, [], :uint
>> end
=> #<FFI::Invoker:0x1fe8c>
>> puts RubyFFI.getuid
501
=> nil
>>
Our hope with JRuby's support of FFI and our release of FFI for C Ruby is that we may finally escape the hell of C extensions. Next time you need to call out to a C library, don't write a wrapper shim in C! Write it using FFI, and it will work across implementations without recompile.

Here's some links to docs on FFI. As with most open-source projects, documentation is a little light right now, but hopefully that will change.

Calling C from JRuby
Rubinius's Foreign Function Interface
On the Rubinius FFI

A key feature that's not well documented is the use of FFI's templating system to generate bindings based on the current platform's header files. Here's a sample from the "Etc" module above.

Etc module template, showing how to pull in header files and inspect a struct definition:
module Etc
class Passwd < FFI::Struct
@@@
struct do |s|
s.include "sys/types.h"
s.include "pwd.h"

s.name "struct passwd"
s.field :pw_name, :string
s.field :pw_passwd, :string
s.field :pw_uid, :uint
s.field :pw_gid, :uint
s.field :pw_dir, :string
s.field :pw_shell, :string
end
@@@
end
class Group < FFI::Struct
@@@
struct do |s|
s.include "sys/types.h"
s.include "grp.h"

s.name "struct group"
s.field :gr_name, :string
s.field :gr_gid, :uint
end
@@@
end
end
As more docs come to my attention, I'll update this post and add links to the JRuby wiki page. For those of you interested in the Ruby FFI project itself, check out the Ruby FFI project on Kenai. And feel free to hunt down any of the JRuby team, including Wayne Meissner, on the JRuby mailing lists or in #jruby on FreeNode.

Update: Wayne has posted a follow-up with more details here: More on Ruby FFI

Wednesday, October 29, 2008

Using Rubinius's Kernel in JRuby

After a long day fixing bugs, ya gotta hack on something frivolous once in a while.

I've started to play with a proof-of-concept branch that uses Rubinius's kernel (pure Ruby core class implementations) in replacement for our own (pure Java). Initially I've been playing with Hash, which is one of the few Rubinius core classes that does not have any "primitives" (native methods implemented in C++).

The challenge to make it work was mostly in adding a few utilities it needed (using Array for "Tuple", setting the top "self" to "MAIN", loading a Type module for coercions, etc) and getting the intialization paths to use the pure-Ruby Hash impl instead of our concrete Java impl. I was able to overcome all these and get Rubinius's hash to function in both compiled and interpreted modes, and run a few simple tests and benchmarks.

It is, as you would expect, dozens of times slower than our Java impl, but the performance is better than I expected once the various JIT layers kick in. I doubt it will be possible to get its performance to the same level as hand-written Java code. However, I do believe it's possible to improve performance substantially by enabling some normally unsafe optimizations only for these pure-Ruby kernel classes.

I don't expect too many additional challenges loading the other core classes. Obviously we'll need to wire up primitives, but Rubinius's new C++ VM defines them the same way we do in JRuby. Methods in C++ are annotated with information that allows the VM to bind them to the right class and name, just like in JRuby. The only major difference is that Rubinius has a fairly rigid boot sequence I'll mostly be able to fake, since JRuby is already booted by the time the Rubinius kernel loads. My bootloading code to get Hash running was about 10 lines of code, plus some modifications wherever we instantiated our native RubyHash type directly. Basically, JRuby starts up with its own core class impls, then loads the Rubinius kernel to replace them.

At any rate, this will probably become a fun side project, since the goal of running the same pure Ruby kernel across many implementations is certainly attractive...even if our kernel is pretty much done already. I'd love to see how fast we can reasonably make Rubinius's kernel run atop the JVM, and the exercise will certainly put JRuby through its paces.

I'll push it to an SVN branch later today. Comments and suggestions are welcome :)

Update: The branch has been pushed: http://svn.codehaus.org/jruby/branches/rbx/

Monday, October 27, 2008

Duby Presentation at Ruby Users of MN

Here's the presentation I gave tonight on Duby. I'm putting together an in-depth post for release this week.