Saturday, July 17, 2010

Browsing Memory with Ruby and Java Debug Interface

This is the third post in a series. The first two were on Browsing Memory the JRuby Way and Finding Leaks in Ruby Apps with Eclipse Memory Analyzer

Hello again, friends! I'm back with more exciting memory analysis tips and tricks! Ready? Here we go!

After my previous two posts, several folks asked if it's possible to do all this stuff from Ruby, rather than using Java or C-based apps shipped with the JVM. The answer is yes! Because of the maturity of the Java platform, there are standard Java APIs you can use to access all the same information the previous tools consumed. And since we're talking about JRuby, that means you have Ruby APIs you can use to access that information.

That's what I'm going to show you today.

Introducing JDI

The APIs we'll be using are part of the Java Debug Interface (JDI), a set of Java APIs for remotely inspecting a running application. It's part of the Java Platform Debugger Architecture, which also includes a C/++ API, a wire protocol, and a raw wire protocol API. Exploring those is left as an exercise for the reader...but they're also pretty cool.

We'll use the Rails app from before, inspecting it immediately after boot. JDI provides a number of ways to connect up to a running VM, using VirtualMachineManager; you can either have the debugger make the connection or the target VM make the connection, and optionally have the target VM launch the debugger or the debugger launch the target VM. For our example, we'll have the debugger attach to a target VM listening for connections.

Preparing the Target VM

The first step is to start up the application with the appropriate debugger endpoint installed. This new flag is a bit of a mouthful (and we should make a standard flag for JRuby users), but we're simply setting up a socket-based listener on port 12345, running as a server, and we don't want to suspend the JVM when the debugger connects.

jruby -J-agentlib:jdwp=transport=dt_socket,server=y,address=12345,suspend=n -J-Djruby.reify.classes=true script/server -e production

The -J-Djruby.reify.classes bit I talked about in my first post. It makes Ruby classes show up as Java classes for purposes of heap inspection.

The rest is just running the server in production mode.

As you can see, remote debugging is already baked into the JVM, which means we didn't have to write it or debug it. And that's pretty awesome.

Let's connect to our Rails process and see what we can do.

Connecting to the target VM

In order to connect to the target VM, you need to do the Java factory dance. We start with the com.sun.jdi.Bootstrap class, get a com.sun.jdi.VirtualMachineManager, and then connect to a target VM to get a com.sun.jdi.VirtualMachine object.
vmm = Bootstrap.virtual_machine_manager
sock_conn = vmm.attaching_connectors[0] # not guaranteed to be Socket
args = sock_conn.default_arguments
args['hostname].value = "localhost"
args['port'].value = "12345"
vm = sock_conn.attach(args)

Notice that I didn't dig out the socket connector explicitly here, because on my system, the first connector always appears to be the socket connector. Here's the full list for me on OS X:
➔ jruby -rjava -e "puts com.sun.jdi.Bootstrap.virtual_machine_manager.attaching_connectors
> "
[com.sun.jdi.SocketAttach (defaults: timeout=, hostname=charles-nutters-macbook-pro.local, port=),
com.sun.jdi.ProcessAttach (defaults: pid=, timeout=)]

The ProcessAttach connector there isn't as magical as it looks; all it does is query the target process to find out what transport it's using (dt_socket in our case) and then calls the right connector (e.g. SocketAttach in the case of dt_socket or SharedMemoryAttach if you use dt_shmem on Windows). In our case, we know it's listening on a socket, so we're using the SocketAttach connector directly.

The rest is pretty simple: we get the default arguments from the connector, twiddle them to have the right hostname and port number, and attach to the VM. Now we have a VirtualMachine object we can query and twiddle; we're inside the matrix.

With Great Power...

So, what can we do with this VirtualMachine object? We can:
  • walk all classes and objects on the heap
  • install breakpoints and step-debug any running code
  • inspect and modify the current state of any running thread, even manipulating in-flight arguments and variables
  • replace already-loaded classes with new definitions (such as to install custom instrumentation)
Here's the output from JRuby's ri command when we ask about VirtualMachine:
➔ ri --java com.sun.jdi.VirtualMachine
-------------------------------------- Class: com.sun.jdi.VirtualMachine
(no description...)
------------------------------------------------------------------------


Instance methods:
-----------------
allClasses, allThreads, canAddMethod, canBeModified,
canForceEarlyReturn, canGetBytecodes, canGetClassFileVersion,
canGetConstantPool, canGetCurrentContendedMonitor,
canGetInstanceInfo, canGetMethodReturnValues,
canGetMonitorFrameInfo, canGetMonitorInfo, canGetOwnedMonitorInfo,
canGetSourceDebugExtension, canGetSyntheticAttribute, canPopFrames,
canRedefineClasses, canRequestMonitorEvents,
canRequestVMDeathEvent, canUnrestrictedlyRedefineClasses,
canUseInstanceFilters, canUseSourceNameFilters,
canWatchFieldAccess, canWatchFieldModification, classesByName,
description, dispose, eventQueue, eventRequestManager, exit,
getDefaultStratum, instanceCounts, mirrorOf, mirrorOfVoid, name,
process, redefineClasses, resume, setDebugTraceMode,
setDefaultStratum, suspend, toString, topLevelThreadGroups,
version, virtualMachine

We can basically make the target VM dance any way we want, even going so far as to write our own debugger entirely in Ruby code. But that's a topic for another day. Right now, we're going to do some memory inspection.

Creating a Histogram of the Heap
The simplest heap inspection we might do is to produce a histogram of all objects on the heap. And as you might expect, this is one of the easiest things to do, because it's the first thing everyone looks for when debugging a memory issue.
classes = VM.all_classes
counts = VM.instance_counts(classes)
classes.zip(counts)

VirtualMachine.all_classes gives you a list (a java.util.List, but we make those behave mostly like a Ruby Array) of every class the JVM has loaded, including Ruby classes, JRuby core and runtime classes, and other Java classes that JRuby and the JVM use. VirtualMachine.instance_counts takes that list of classes and returns another list of instance counts. Zip the two together, and we have an array of classes and instance counts. So easy!

Let's take these two pieces and put them together in an easy-to-use class
require 'java'

module JRuby
class Debugger
VMM = com.sun.jdi.Bootstrap.virtual_machine_manager

attr_accessor :vm

def initialize(options = {})
connectors = VMM.attaching_connectors
if options[:port]
connector = connectors.find {|ac| ac.name =~ /Socket/}
elsif options[:pid]
connector = connectors.find {|ac| ac.name =~ /Process/}
end

args = connector.default_arguments
for k, v in options
args[k.to_s].value = v.to_s
end

@vm = connector.attach(args)
end

# Generate a histogram of all classes in the system
def histogram
classes = @vm.all_classes
counts = @vm.instance_counts(classes)
classes.zip(counts)
end
end
end

I've taken the liberty of expanding the connection process to handle pids and other arguments passed in. So to get a histogram from a VM listening on localhost port 12345, we can simply do:
JRuby::Debugger.new(:hostname => 'localhost', :port => 12345).histogram

Now of course this list is going to have a lot of JRuby and Java objects that we might not be interested in, so we'll want to filter it to just the Ruby classes. On JRuby master, all the generated Ruby classes start with a package name "ruby". Unfortunately, jitted Ruby methods start with a package of "ruby.jit" right now, so we'll want to filter those out too (unless you're interested in them, of course...JRuby is an open book!)
require 'jruby_debugger'

# connect to the VM
debugr = JRuby::Debugger.new(:hostname => 'localhost', :port => 12345)
histo = debugr.histogram
# sort by count
histo.sort! {|a,b| b[1] <=> a[1]}
# filter to only user-created Ruby classes with >0 instances
histo.each do |cls,num|
next if num == 0 || cls.name[0..4] != 'ruby.' || cls.name[5..7] == 'jit'
puts "#{num} instances of #{cls.name[5..-1].gsub('.', '::')}"
end

If we run this short script against our Rails application, we see similar results to the previous posts (but it's cooler, because we're doing it all from Ruby!)
➔ jruby ruby_histogram.rb | head -10
11685 instances of TZInfo::TimezoneTransitionInfo
1071 instances of Gem::Version
1012 instances of Gem::Requirement
592 instances of TZInfo::TimezoneOffsetInfo
432 instances of Gem::Dependency
289 instances of Gem::Specification
142 instances of ActiveSupport::TimeZone
118 instances of TZInfo::DataTimezoneInfo
118 instances of TZInfo::DataTimezone
45 instances of Gem::Platform

Just so we're all on the same page, it's important to know what we're actually dealing with here. VirtualMachine.all_classes returns a list of com.sun.jdi.ReferenceType objects. Let's ri that.
➔ ri --java com.sun.jdi.ReferenceType
--------------------------------------- Class: com.sun.jdi.ReferenceType
(no description...)
------------------------------------------------------------------------


Instance methods:
-----------------
allFields, allLineLocations, allMethods, availableStrata,
classLoader, classObject, compareTo, constantPool,
constantPoolCount, defaultStratum, equals, failedToInitialize,
fieldByName, fields, genericSignature, getValue, getValues,
hashCode, instances, isAbstract, isFinal, isInitialized,
isPackagePrivate, isPrepared, isPrivate, isProtected, isPublic,
isStatic, isVerified, locationsOfLine, majorVersion, methods,
methodsByName, minorVersion, modifiers, name, nestedTypes,
signature, sourceDebugExtension, sourceName, sourceNames,
sourcePaths, toString, virtualMachine, visibleFields,
visibleMethods

You can see there's quite a bit more you can do with a ReferenceType. Let's try something.

Digging Deeper Into TimezoneTransitionInfo

Let's actually take some time to explore our old friend TimezoneTransitionInfo (hereafter referred to as TTI). Instead of walking all classes in the system, we'll want to just grab TTI directly. For that we use VirtualMachine.classes_by_name, which returns a list of classes on the target VM of that name. There should be only one, since we only have a single JRuby instance in our server, so we'll grab that class and request exactly one instance of it...any old instance.
tti_class = debugr.vm.classes_by_name('ruby.TZInfo.TimezoneTransitionInfo')[0]
tti_obj = tti_class.instances(1)[0]
puts tti_obj

Running this we can see we've got the reference we're looking for.
➔ jruby tti_digger.rb
instance of ruby.TZInfo.TimezoneTransitionInfo(id=2)

ReferenceType.instances returns a list (no larger than the specified size, or all instances if you specify 0) of com.sun.jdi.ObjectReference objects.
➔ ri --java com.sun.jdi.ObjectReference
------------------------------------- Class: com.sun.jdi.ObjectReference
(no description...)
------------------------------------------------------------------------


Instance methods:
-----------------
disableCollection, enableCollection, entryCount, equals, getValue,
getValues, hashCode, invokeMethod, isCollected, owningThread,
referenceType, referringObjects, setValue, toString, type,
uniqueID, virtualMachine, waitingThreads

Among the weirder things like disabling garbage collection for this object or listing all threads waiting on this object's monitor (a la 'synchronize' in Java), we can access the object's fields through getValue and setValue.

Let's examine the instance variables TTI contains. You may recall from previous posts that all Ruby objects in JRuby store their instance variables in an array, to avoid the large memory and cpu cost of storing them in a map. We can grab a reference to that array and display its contents.
var_table_field = tti_class.field_by_name('varTable')
tti_vars = tti_obj.get_value(var_table_field)
puts "varTable: #{tti_vars}"
puts tti_vars.values.map(&:to_s)

And the new output:
➔ jruby tti_digger.rb
varTable: instance of java.lang.Object[7] (id=13)
instance of ruby.TZInfo.TimezoneOffsetInfo(id=15)
instance of ruby.TZInfo.TimezoneOffsetInfo(id=16)
instance of org.jruby.RubyFixnum(id=17)
instance of org.jruby.RubyFixnum(id=18)
instance of org.jruby.RubyNil(id=19)
instance of org.jruby.RubyNil(id=19)
instance of org.jruby.RubyNil(id=19)

Since the varTable field is a simple Object[] in Java, the reference we get to it is of type com.sun.jdi.ArrayReference.
➔ ri --java com.sun.jdi.ArrayReference
-------------------------------------- Class: com.sun.jdi.ArrayReference
(no description...)
------------------------------------------------------------------------


Instance methods:
-----------------
disableCollection, enableCollection, entryCount, equals, getValue,
getValues, hashCode, invokeMethod, isCollected, length,
owningThread, referenceType, referringObjects, setValue, setValues,
toString, type, uniqueID, virtualMachine, waitingThreads

Of course each of these references can be further explored, but already we can see that this TTI instance has seven instance variables: two TimezoneOffsetInfo objects, two Fixnums, and three nils. But we don't have instance variable names!

Instance variable names are only stored on the object's class. There, a table of names to offsets is kept up-to-date as new instance variable names are discovered. We can access this from the TTI class reference and combine it with the variable table to get the output we want to see.
# get the metaclass object and class reference
metaclass_field = tti_class.field_by_name('metaClass')
tti_class_obj = tti_obj.get_value(metaclass_field)
tti_class_class = tti_class_obj.reference_type

# get the variable names from the metaclass object
var_names_field = tti_class_class.field_by_name('variableNames')
var_names = tti_class_obj.get_value(var_names_field)

# splice the names and values together
table = var_names.values.zip(tti_vars.values)
puts table

This looks a bit complicated, but there's actually a lot of boilerplate here we could put into a utility class. For example, the metaClass and variableNames fields are standard on all (J)Ruby objects and classes, respectively. But considering that we're actually walking a remote VM's *live* heap...this is pretty simple code.

Here's what our script outputs now:
➔ jruby tti_digger.rb
"@offset"
instance of ruby.TZInfo.TimezoneOffsetInfo(id=25)
"@previous_offset"
instance of ruby.TZInfo.TimezoneOffsetInfo(id=26)
"@numerator_or_time"
instance of org.jruby.RubyFixnum(id=27)
"@denominator"
instance of org.jruby.RubyFixnum(id=28)
"@at"
instance of org.jruby.RubyNil(id=29)
"@local_end"
instance of org.jruby.RubyNil(id=29)
"@local_start"
instance of org.jruby.RubyNil(id=29)

We could go even deeper, but I think you get the idea.

Your Turn

Here's a gist of the three scripts we've created, so you can refer to and build off of them. And of course the javadocs and ri docs will help you as well, plus everything we've done here you can do in a jirb session.

There's a lot to the JDI API, but once you've got the VirtualMachine object in hand it's pretty easy to follow. As you'd expect from any debugger API, you need to know a bit about how things work on the inside, but through the magic of JRuby it's actually possible to write most of those fancy memory and debugging tools entirely in Ruby. Perhaps this article has peaked your interest in exploring JRuby internals using JDI and you might start to write debugging tools. Perhaps we can ship a few utilities to make some of the boilerplate go away. In any case, I hope this series of articles shows that JRuby users have an amazing library of tools available to them, and you don't even have to leave your comfort zone if you don't want to.

Note: The variableNames field is a recent addition to JRuby master, so if you'd like to play with that you'll probably want to build JRuby yourself or wait for a nightly build that picks it up. But you can certainly do a lot of exploring even without that patch.