Wednesday, April 25, 2007

What Would I (Will I?) Change About Ruby

The latest Ruby Blogging Contest hits close to home: What changes would make Ruby into a better language without making it into something that isn't Ruby?

As you might guess, I've got some pretty strong thoughts here. I'm not a heavy Rails user, and I'm not as heavy a Ruby user as I'd like to be. But implementing a Ruby interpreter and now compiler has taught me a few things about what's right and what's wrong with Ruby. I'm not going to complain about performance, whine that the C code is too hard to follow, or even attack C-based extensions. Those may be important issues, but they're all fixable in the long term without breaking anything that works today (or by providing reasonable substitutes). I'm also not going to go into language design ideas...I have mine, you have yours, Matz has his. But my money's on Matz to do the "right thing" with regards to actual language design.

What I'm talking about are a few really important changes to the Ruby runtime, libraries, and ecosystem. Take these as my educated opinions...and don't think too hard about whether I'll be working to change these things in JRuby and in the wider Ruby world.

1. Threading

This, more than any other area, probably means the most visible changes to Ruby. Ruby currently is green-threaded, as most of you know. JRuby implements native threads mainly because Java uses native threads...we just piggyback off the excellent work of the JVM engineers. And the developing Ruby 1.9, the future successor to the current version 1.8 C implementation, provides something in the middle: native threads with a giant lock, so threads won't run concurrently.

So in general, Ruby is trending toward support for native threads. But there's a problem...some of Ruby's current APIs are impossible to do safely with native threads (and in general, impossible to really do safely with green threads...Ruby just does them anyway). Threading needs to be improved, with support for concurrent execution and removal of operations that prevent that.

Specifically, the following operations and features are inherently unsafe, and are not supported by any mature threaded system:

Thread#kill

Killing one thread from another may leave its locks and resources in an unpredictable state. JRuby currently implements this by setting a kill flag on the target thread and waiting for it to die--basically asking the thread to "please die yourself"--but it's not deterministic and the thread could fail to die.

Thread#raise

Forcing another thread to raise an exception can have the same effect as kill, since the thread may not expect to handle the given exception and may not be able to release locks or tidy up resources. JRuby handles this similarly to kill, by setting a field to contain the exception a target thread should "please raise", but again it's not deterministic and there's no way to guarantee the target thread will raise.
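To make the "please die yourself" and "please raise" ideas concrete, here is a minimal sketch in plain Ruby. It is not JRuby's actual implementation; the CooperativeWorker class and its method names are hypothetical, invented for illustration. The worker only honors a request at a safe point it picks itself, when it holds no locks.

```ruby
# Hypothetical sketch: a worker that only honors stop/raise requests at a
# safe point it chooses itself. Not JRuby's actual implementation.
class CooperativeWorker
  attr_reader :thread

  def initialize(&step)
    @lock = Mutex.new
    @stop_requested = false
    @pending_error = nil
    @thread = Thread.new do
      # Keep the example quiet; the error still surfaces through join/value.
      Thread.current.report_on_exception = false
      until stop_requested?
        raise_pending_error   # safe point: the worker holds no locks here
        step.call             # one unit of real work
      end
      :stopped
    end
  end

  # "Please die yourself": takes effect at the worker's next safe point.
  def request_stop
    @lock.synchronize { @stop_requested = true }
  end

  # "Please raise": the worker raises the error itself at its next safe point.
  def request_raise(error)
    @lock.synchronize { @pending_error = error }
  end

  private

  def stop_requested?
    @lock.synchronize { @stop_requested }
  end

  def raise_pending_error
    error = @lock.synchronize { @pending_error }
    raise error if error
  end
end

worker = CooperativeWorker.new { sleep 0.01 }
worker.request_stop
worker.thread.join   # returns once the worker notices the flag
```

The non-determinism is easy to see here: if the block passed to the worker never returns, the flag is never checked and the thread never dies.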

Thread#critical=

There is no way to deterministically force true concurrent threads to stop and wait for the current thread, not to mention the horrendous race conditions that can result when locks are involved. As a result of the many critical problems with critical=, it is already slated to be removed in Ruby 1.9/2.0.

In order for Ruby to survive in a parallel-processing era, unsafe threading operations need to go, and any libraries or apps that depend on them need to find new ways to solve these problems. Sorry folks, these aren't my rules. I understand why people like these features...I like them too. But you can't have your concurrency and eat it too.
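As one example of a "new way", code that used critical= to guard a shared value can usually take an explicit lock around just that value. A minimal sketch, assuming the only shared state is a counter; the count_with_mutex helper is a made-up name for illustration:

```ruby
# Sketch: protect one shared counter with a Mutex instead of stopping
# every other thread in the process.
def count_with_mutex(thread_count, increments)
  counter = 0
  lock = Mutex.new
  threads = Array.new(thread_count) do
    Thread.new do
      increments.times do
        # Only threads that touch the counter ever wait on this lock.
        lock.synchronize { counter += 1 }
      end
    end
  end
  threads.each(&:join)
  counter
end

puts count_with_mutex(4, 1_000)   # prints 4000
```

The difference from critical= is scope: the lock covers one piece of data, so unrelated threads keep running.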

2. ObjectSpace

ObjectSpace is Ruby's view into the garbage-collected heap. You can use it to iterate over all objects of a particular type, attach finalizers to any object, look an object up by its object ID, and so on. In Ruby, it's a pretty low-cost heap-walker, able to dig up objects matching particular criteria for you on a whim. It sounds like it might be pretty useful, but it's used by very few libraries...and most of those uses can be implemented in other (potentially more efficient) ways.

JRuby implements ObjectSpace by keeping a separate linked list in memory of weak references to created objects. This means that for every ObjectSpace-aware object that's created, a weakref is added to this list. When the object is collected, the weakref removes itself from the list. Walking all objects of a particular type just involves walking that list. Reconstituting an object ID into the object it references is supported by a separate weak list (again, more memory overhead).
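The shape of that bookkeeping can be sketched in Ruby itself using the standard library's WeakRef. JRuby does this in Java inside the runtime, so take this only as an illustration; WeakRegistry and its methods are hypothetical names, and a plain Array and Hash stand in for the linked list and the ID table.

```ruby
require 'weakref'

# Hypothetical sketch of an ObjectSpace-like registry built on weak references.
class WeakRegistry
  def initialize
    @refs = []    # every tracked object, held weakly
    @by_id = {}   # object ID => weak reference (the second, separate list)
  end

  # Called once per created object; this is the per-object overhead.
  def register(obj)
    ref = WeakRef.new(obj)
    @refs << ref
    @by_id[obj.object_id] = ref
    obj
  end

  # Walk the list and yield live objects of the requested type.
  def each_object(klass)
    return enum_for(:each_object, klass) unless block_given?
    @refs.select!(&:weakref_alive?)   # drop entries whose objects were collected
    @refs.each do |ref|
      obj = strong(ref)
      yield obj if obj && klass === obj
    end
  end

  # Turn an object ID back into its object, or nil if unknown or collected.
  def id2ref(id)
    ref = @by_id[id]
    ref && strong(ref)
  end

  private

  # WeakRef raises RefError once its target is gone; treat that as nil.
  def strong(ref)
    ref.__getobj__
  rescue WeakRef::RefError
    nil
  end
end

registry = WeakRegistry.new
name = registry.register("charlie".dup)
registry.register([1, 2, 3])
registry.each_object(String) { |s| puts s }   # prints charlie
```

Every register call costs an allocation and two insertions, whether or not anyone ever walks the list.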

There are no plans currently for ObjectSpace to be removed from Ruby in a future version. But there's a problem...in addition to being pure overhead in JRuby (which you can turn off completely by using the -O flag), ObjectSpace limits evolving development of the Ruby garbage collector, breaks heap and memory transparency, and poses yet more problems for threading.

There are many issues here. First off, the JRuby thing. By having to add ObjectSpace governors for all objects in the system, JRuby pays a very large penalty. We're forced to do this because the JVM (and most other advanced garbage-collecting VMs) does not allow you to traverse in-memory objects nor retrieve the object that is associated with a given ID. In general this is because the JVM does all sorts of wonderful and magical things with objects and memory behind the scenes, and the ability to ask for all objects of a given type or pull an object based on some ID number at any time cripples many of these tricks.

The threading issues are perhaps more important. Imagine if you will a true concurrent VM, with many threads creating objects, maybe one or more threads collecting garbage, and synchronizing all this to guarantee the integrity and efficiency of the heap and garbage collector. There is absolutely no room in this scenario for those multiple threads to request lists of specifically-typed objects at any time, nor to provide an ID and expect its object to be presented to you. These features break encapsulation across threads, they violate security restrictions from thread to thread, and they require whole new levels of locking to ensure that while reading from the heap no other thread produces new objects and no garbage collection occurs. As a result, ObjectSpace harms Ruby by limiting the flexibility of its garbage collecting and threading subsystems, and should be eliminated.

3. $SAFE and tainting

Safe levels are a fairly simple, straightforward way to set a "security level" that governs what operations are possible for a given thread. By setting the safe level to various values, you can limit modification of Object, prevent IO, disallow creation of new methods or classes, and so on. Added to this is the ability to "taint" or "untaint" objects. Tainted objects are considered "unsafe", and so certain security levels will cause errors to be thrown when those objects are passed to safe-only operations.

JRuby has safe level and tainting checks in place, but it's almost assured they're not working correctly. We have never tested them, largely because practically no tests (or perhaps literally no tests) use safe levels or tainting, and we've had *exactly one* bug report relating to safe levels, just a couple weeks ago. And to further kill the possibility of JRuby ever supporting safe levels and tainting correctly, my work tonight to fix some safe level issues revealed that doing so would add a tremendous amount of overhead to almost all critical operations like method creation, module/class mutation, and worst of all, object creation.

At this point, safe levels will probably remain in their current half-implemented state for 1.0, but I think it's almost decided for us that safe levels and tainting will simply not be supported in JRuby. In their place, we'll do two things (which I'd recommend the C implementation consider as well):
  • Recommend that people who really want "safe" environments use an approach like whytheluckystiff's Sandbox, which takes a more JVM-like approach to safety: it runs code in a true sandboxed sub-runtime with only "safe" operations even defined. In other words, not only is it disallowed to load in files or hit the network, it's physically *impossible* to do so. What makes this even better is that Sandbox is already supported in JRuby (gem "javasand") and JRuby out of the box allows a fine granularity of operations to be disabled in new runtimes.
  • Implement safe levels like Java handles security restrictions, which we get to leverage since they're already being checked and enforced at the JVM level. We will not be able to map everything...for obvious reasons, checking tainted strings all the time or limiting class and method creation are unlikely to ever happen, but we can limit those operations that the JVM allows us to limit, like loading remote code, opening sockets, accessing local files, and so on. So it's highly likely JRuby's implementation of safe levels will map to clearly-defined sets of Java security restrictions in the near future.

4. Direction

Ruby is a very free-form community. Matz is the most benevolent dictator I've had the pleasure to work with, and most of the community are true free-thinking artists. It's like the hippie commune of the language world. Peace out, man.

But there's a problem here. Ruby needs guidance beyond VM and language design or the loose meanderings of its more vocal community members. It boils down to a few simple points:
  • Ruby needs a spec. Anyone who believes this isn't true isn't paying attention. Now I'm not talking about a gold-standard legal document signed in blood by Matz and the chief stakeholders of the Ruby community. An officially sponsored, widely supported, and massively publicized community spec would work fine--and probably fit the community and the language better. But something needs to be done quickly, since Ruby's "bus number" is dangerously low. A spec is not something to be feared...it's a guarantee that Ruby will live on into the future, that alternative implementations (like JRuby) can't intentionally introduce nasty incompatibilities (or at least, that they'd be easy to discover and easy to document), and perhaps most importantly...that the full glory and beauty of Ruby is published forever for all to see and explore, rather than dangerously trapped in very few minds.
  • Ruby needs a non-profit governing body. I'm not necessarily talking about a council of elders here, I'm just talking about some legal entity to which OSS copyrights can be assigned, donations can be made, and from which projects and initiatives can be funded. Maybe this would be RubyCentral, maybe this would be some other (new) organization...I don't know that. But it would be a great help to the community and Ruby's future if there were some official organization that could act as caretaker for Ruby's future. I'm all set to sign over any JRuby copyrights I have to such an organization, to protect the future of Ruby on the JVM just like the future of the C implementation. How about you?
  • Ruby needs you. Granted, this isn't really a change as such. You probably wouldn't be reading this if Ruby didn't already have you. But the Ruby community is at a big point in its lifetime...at risk of losing its identity, being eclipsed by newer projects, or even slipping deep, deep into the trough of disillusionment. What will prevent that happening is the community showing its strong ties, coming together to support official organizations and official documents, and above all, continuing to pour all our hearts into creating newer and better applications and libraries in Ruby, pushing the boundaries of what people think is possible.

19 comments:

Dekaritae said...

Haii. I'm just buzzing around, tracking down the various people who were involved in the LiteStep community, to see where they're at now. If you're interested in catching up, lots of old-timers still hang out on #FPN, irc.freenode.net.

Cormac said...

"And the developing Ruby 1.9, the future successor to the current version 1.8 C implementation, provides something in the middle: native threads with a giant lock, so threads won't run concurrently."

Wow. I did not know that. That is hilariously useless.

Robert said...

Is there anything that can be done with the growing corpus of tests Ruby/JRuby are collecting to help define an initial "spec"? At least that would help people fumble into the intended design.

More importantly, I suspect the intent of the underlying design decisions in (J)Ruby are largely undocumented right now. That does scare me. It's not that you or Matz are hoarding secrets, of course. It's a lot of work to break out in verbiage what is important. But it's critical for long-term viability of the language.

Robert Thau said...

Two notes on ObjectSpace:

First off, your comments seem mainly aimed at #each_object; the Ruby interface to finalizers is also in the ObjectSpace module, and it's not clear whether you have any problems with it.

But #each_object is clearly a problem, so on to that.

Looking at the gems I have lying around to support an incipient production rails app, I see 16 uses of ObjectSpace.each_object, almost all to enumerate classes and modules, a la

ObjectSpace.each_object(Class) { |klass| ... }
ObjectSpace.each_object(Module) { |mod| ... }

(The only other case is the test for the sqlserver db adapter, which uses it to enumerate instances of DBI::StatementHandle).

So, how much would it ease the pain if ObjectSpace.each_object(x) only worked if x was Class, Module, or a class that had been decorated in some specific way in advance, e.g.

class SomeClass
  include ObjectSpace::Walkable
end

or the like?

(As a matter of implementation, it *might* work for ObjectSpace::Walkable.included to redefine 'new' in the including class to put the newly created object on the weak lists. That implementation has a somewhat subtle pitfall: you could use metaprogramming to include Walkable in a class after instances have already been created --- and those instances would remain untracked. This is the sort of situation where I wouldn't be *too* uncomfortable saying that the user got what they deserved --- but I imagine that reasonable people could disagree.

And unfortunately, supporting modules that include Walkable this way is more of a headache --- you have to worry about classes that include a module that includes a module that includes Walkable through multiple levels of indirection --- and one of *those* modules including Walkable after instances already exist. So, under this proposal, ObjectSpace.each_object(Numeric) could no longer work --- but hey, even in MRI, it doesn't catch a lot of the numbers, because they're immediates).
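A rough sketch of the opt-in idea, for classes only. No such module exists in Ruby; this uses a top-level Walkable rather than reopening ObjectSpace, and all names are hypothetical. The included hook extends the class with a new that records each instance through a weak reference.

```ruby
require 'weakref'

# Hypothetical opt-in tracking module; not part of Ruby or JRuby.
module Walkable
  REFS = []          # weak references to every tracked instance
  LOCK = Mutex.new

  module ClassMethods
    # Wrap Class#new so each fresh instance is recorded.
    # Keyword arguments are not forwarded specially in this sketch.
    def new(*args, &block)
      obj = super
      LOCK.synchronize { REFS << WeakRef.new(obj) }
      obj
    end
  end

  def self.included(base)
    base.extend(ClassMethods)
  end

  # Yield live tracked instances of klass (or of its subclasses).
  def self.each_object(klass)
    return enum_for(:each_object, klass) unless block_given?
    refs = LOCK.synchronize do
      REFS.select!(&:weakref_alive?)
      REFS.dup
    end
    refs.each do |ref|
      obj = begin
        ref.__getobj__
      rescue WeakRef::RefError
        nil   # collected between the liveness check and now
      end
      yield obj if obj && klass === obj
    end
  end
end

class Tracked
  include Walkable
end

class Late; end
early = Late.new          # created before Walkable is included
Late.include(Walkable)
later = Late.new          # created after

tracked = Tracked.new
puts Walkable.each_object(Tracked).count   # prints 1
puts Walkable.each_object(Late).count      # prints 1: "early" is untracked
```

The Late class shows the pitfall described above: the instance created before the include never makes it onto the list.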

Patrick Mueller said...

Threading; with ya.

ObjectSpace; not needed at runtime, but extremely useful for debug/diagnostic purposes; making it 'off' by default seems like the correct approach. I've used some experimental J9 extensions that do this sort of thing, and they have literally saved my ass (finding leaks). A previous comment indicates one use is to search for Class and Module instances; yes, there should be a way to do that as well, orthogonal to iterating all the live objects in the system.

$SAFE and tainting; again, I think this has a place in debug/diagnostic mode, and less of a place in production code. Static and dynamic analysis for taint. And I'm not happy with Java's security model; I don't put myself in a position of running untrusted code, so I get to avoid all the SecurityManager business. But I'd be willing to play with more security if it was more straight-forward to use.

Direction: +1. We see some of the same issues with PHP: no spec, test cases not quite complete enough; and we're adding tests to the corpus now. PHP has some kind of a quasi-official governance group (The PHP Group), but it's not as official as Apache or anything. On the other hand, governing bodies sound like committees, and I'm not sure you want to go there either. Benevolent dictators, when they work, work great.

Andrew Law said...

Hi Charles,

This isn't the ideal way to ask a jruby question but I'm having no joy posting to users@jruby.codehaus.org (I get the mailer daemon barfing even though I am a subscriber and I've had no response since I've forwarded it on to user-owner). Is there another way I can ask what is definitely a newbie question and probably a simple mistake on my part?

Looking forward to hearing from you. Keep up the great work too!

Regs, Andrew

sanxiyn said...

PHP has some kind of a quasi-official governance group (The PHP Group), but it's not as official as Apache or anything. On the other hand, governing bodies sound like committees, and I'm not sure you want to go there either. Benevolent dictators, when they work, work great.

Python's benevolent dictator is Guido van Rossum, but Python does have the non-profit governing body called Python Software Foundation, which holds all copyrights relating to CPython implementation, funds Python Conferences, and give grants. They are not mutually exclusive.

Anonymous said...

Good Morning Charles,

In regards to your comment posted on Slava's Blog re his entry re your entry.

Both of you missed one truly painful problem with your idea of using external tools or setting of runtime switches to allow introspection of the heap -

It has the effect of altering the runtime characteristics of the erroneous program so that the particular error being investigated may be altered and hence may in fact disappear.

Reasoning: Historical attempts at doing this over many years in both mainframe and minicomputer environments have shown that (for myself and colleagues of mine) there are some errors that will be altered by changing the runtime characteristics of the program. This arises due to compiler and/or external runtime monitoring tools altering the program in question.

If the introspection tools are always incorporated into the runtime of the program, the base system will (in most situations) not change and the error should be repeatable and findable.

Charles Oliver Nutter said...

robert: Yes! Actually I've been trying to push the RubyTests project on RubyForge as a home for wayward tests, and we've previously contributed all of JRuby's tests there as well. However I haven't managed to get buy-in from other testing projects or Ruby implementations yet, so it isn't worth us putting our tests permanently there. Perhaps if the community lobbied for more cooperation in this area?

robert thau: That's an interesting idea. Actually one obvious place ObjectSpace is used is to locate all TestCase implementations for running a unit test run. But that case and most of your example cases could be done by using the "inherited" hook on module to record when a given class or module is extended. And you also mention finalizers; finalizers are useful at times, and I think they could remain useful...but I would rather see them registered at the object itself rather than at the heap, with the garbage collector executing finalizers on objects as needed. But this requires a more advanced garbage collection system than Ruby has at present, so it may not be suitable yet.
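For the TestCase case, the "inherited" approach might look something like the sketch below. Class#inherited is a real Ruby hook; TestCaseBase, KNOWN and known_test_cases are hypothetical names for illustration, not Test::Unit's actual code.

```ruby
# Hypothetical base class that records its subclasses as they are defined,
# so nothing has to walk the heap to find them later.
class TestCaseBase
  KNOWN = []

  # Ruby calls this hook on the parent whenever a subclass is created.
  # Subclasses inherit the hook, so deeper descendants are recorded too.
  def self.inherited(subclass)
    super
    KNOWN << subclass
  end

  def self.known_test_cases
    KNOWN.dup
  end
end

class FooTest < TestCaseBase; end
class BarTest < FooTest; end

puts TestCaseBase.known_test_cases.inspect   # prints [FooTest, BarTest]
```

The list holds strong references, so recorded classes are never collected; that seems an acceptable trade for test cases but would need a weak list for throwaway anonymous classes.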

patrick mueller: agreed on all points except $SAFE. I doubt any serious security expert would recommend relying on safe, and I know I wouldn't trust it. The edges are not well-defined, there's absolutely no testing coverage to guarantee it works, and barely anyone knows how to use it right. As a result, most people who are in a position to need $SAFE shouldn't be using $SAFE anyway.

andrew law: you can join #jruby on freenode IRC, or you could contact the codehaus folks to see why you can't get on the lists. Otherwise, email me at charles.nutter@sun.com

anonymous: that's certainly true, but you've actually just made my point for me...having that information available at runtime all the time, i.e. being able to inspect the running heap, impacts execution *all the time*. Is that worth it? If you can make something run twice as fast by requiring that heap inspection and memory profiling be explicitly enabled, isn't that a better way to do things? Of course there's no perfect answer, and people seem to think that having runtime access to the heap is so powerfully useful that it should always be present. And then nobody uses it, except for debugging purposes. I say leave it off and only turn it on when you need to do debugging. It's not worth taking the performance and complexity hit all the time for functionality used 1% of the time.

robert thau said...

Ummmm... as I read the docs on ObjectSpace#define_finalizer, Ruby finalizers already are per-object; it's just that the method that attaches them to an object happens to be part of the ObjectSpace module.

Charles Oliver Nutter said...

robert thau: Yes, they are already per-object, but you don't attach them to the object directly (which would be something like defining a finalize method on the type, for example), you tell the GC (via ObjectSpace): "Hey, run this code when this object gets collected". It's a subtle difference, but it's an important one.

Anonymous said...

Charles,

Allowing the introspection all the time will only seriously affect the runtime performance if the implementation is not done correctly.

It is worth it (in my experience).

Let's take a reality check. Our machines today are 500 to 2000 times faster and bigger than they were 20 years ago. But what do we see today. Most of the performance effects are taken up by all the additional guff (gui, os etc) that doesn't really give us any more speed than 10 - 15 years ago (or more).

Having had of necessity to trace the execution path of various bits of code in the past in a Microsoft environment, it is obvious that much of the performance of many applications is caught up in system facilities and not our programs themselves.

When I first started out in the industry in the late 70's and early 80's, it was a common saying (and experience) that a mainframe would double in size but you as the user would get maybe a 20% performance increase. That happened then and it happens now.

Leaving in the ability to do introspection should not kill your application if the implementation has been effectively done.

Bruce Rennie
(God's Own Country Downunder)

Charles Oliver Nutter said...

Bruce Rennie: You can certainly make that claim, but making it too often about too many features eventually means your system spends more of its time supporting features you *might* use *someday* than actually getting work done. Sure, machines are 500 to 2000 times faster...and we're doing at least that many times as much with them. Moore's law has ended, and we're not able to do as much with the same individual cores as we'd like. That means scaling horizontally. That means concurrency. And concurrency and live heap inspection do not mix.

There's probably nothing I can say to convince you that these features are not worth the impact they have on performance and evolution of languages and systems. But if you want these features to survive, you need to do something to help make sure they're implemented "effectively". Try it yourself, see how easy it is to make many threads all create garbage, a few more clean it up, and allow heap inspection across the whole lot without crippling the system. Maybe I'm totally off base. Maybe I'm not.

Yes, live heap inspection is useful. So would be full runtime profiling, tracking of all object creations and collections, logging every packet sent and the time it took, and tracing all user operations. But we don't do all those things all the time because we actually want our software to accomplish something. Feel free to run on systems that leave those sorts of features on at runtime if they perform well enough for you. Feel free to run your systems in debug mode all the time, just in case you might want to query that runtime information on a whim. Me, I'd rather my machine's cycles are spent getting work done, and I'm willing to trade a little convenience to do it.

Justin W Smith said...

Great review of what's "wrong" with Ruby.
I truly love Ruby, but there are many aspects of the language which need to mature.

When I first learned Ruby, I quickly realized how limited its implementation of Threads is. For small scripts and other "one-off" work that we all do Threading doesn't matter, but Ruby's usage has quickly expanded beyond this. It's becoming a standard language for implementing large enterprise-level systems.

It is critical that the issues that you mention here are corrected in a thoughtful manner.

Anonymous said...

Charles,

Your statement "Sure, machines are 500 to 2000 times faster...and we're doing at least that many times as much with them." is an awful big claim - please demonstrate this. I worked on minicomputers in the early days of my career in computing - 2 Mbytes of memory (true, there was no GUI involved), we ran (with multiple users) word processors, spreadsheets, databases, network services of various kinds, and various other kinds of apps. We still do the same today, some things are now much easier (such as processing and editing sound and images) to do. But for many people, the machines of today are "pretty" and not doing much more for them than was available 15 years ago.

Certainly, we can run systems on micros today that we once required supercomputers to do.

But as experience has shown (tracing the execution of code through system calls in MS software) much of what is done today ends up copying the same piece of information many many times before it is actually used, this is hideous in the case of strings.

Algorithms used are simple because they are easy to code and efficiency is not looked at because the coding is more complicated (sorting is one area this happens). I have also seen coding that has ended up a complicated mess instead of thinking outside the box and finding the appropriate algorithm (example is using "don't cares" to simplify test cases where necessary).

There are all sorts of things that have happened that eat into this massive speed-up we have, which results in little actual gain.

Having introspection always on does not mean that it is always running, it only means that the needed information and links are available for inspection (which for garbage collected systems should already be there).

Your statement "And concurrency and live heap inspection do not mix" is also a biggie. Can you prove that this is always the case or is it only the case in the systems you work and play in. I am not saying that there might not be great difficulties in doing this but are you categorically sure that it can't be done or is being done now (say in some Erlang or its ilk).

Your statement "So would be full runtime profiling, tracking of all object creations and collections, logging every packet sent and the time it took, and tracing all user operations. But we don't do all those things all the time because we actually want our software to accomplish something" - I have used systems that provided runtime profiling and other than the slowdown due to I/O concerns with some implementations didn't add much time to the overall process (particularly those systems that provided large buffers and writing to disk). Tracking of object creations and deletions depends on the kind of language and garbage collector you use - again if this is a logging function that is implemented correctly (for the system at hand) it may add little time to each creation or deletion compared to the time taken up in other parts of the system which are not under your control. Tracing of user operations adds very little to the operation of any system; compared to the reaction time and think time of a user, it is minuscule (I take it you know what I mean here). Tracing and logging of communications packets is something that has been done for decades with little or no impact on the systems involved, processors have tended to keep ahead of the communications frequencies in use on networks. It has required smart design to do this to ensure that there is no impact but it is done.

All of the above takes thinking outside the box in many cases - cross-fertilisation of different fields makes a big difference to solving these kinds of problems.



I hope you are not trying to say that all the machines you work on are running above 80% all the time. I personally have come across few machines (very few) that are doing something useful most of the time - this is why there is a default process that does nothing called the Idle Process (though some smart souls do use this to do long calculations like Pi).

Lastly, the fact that you seem to use interpreted or virtual machine languages in at least part of your work and play says (at least to me) that efficiency in the runtime is not one of your priorities all the time. If that is the case and I haven't misinterpreted you, then some of your complaints may well be based on inefficient implementations not on the actual facilities being supplied.

Your experience is different to mine and you have a right to your opinion as do I and others who feel differently from either of us.

Tom Palmer said...

For ObjectSpace, I've wondered how much of the same feature set is available in Java 6 with its cool heap information. Any way to tap into this (even with JNI?) to provide cheaper ObjectSpace on Java 6 plus? Even if just meant for debugging purposes?

For thread killing, I know less safe ways of killing threads like for instance power outages. To the extent I understand it, I don't see the huge trouble with kill/raise. (The whole "critical" thing, however, seems scary.)

Austin said...

I haven't used ObjectSpace much and maybe I am mistaken, but I really like it because it provides some of the same power that Smalltalk's browse instances of, etc.

Sure this stuff could be duplicated in an IDE or another library, but it would be a shame to move away from a rich introspective dynamic language, just for the sake of removing overhead and performance gains. Ruby doesn't have a spec, but ObjectSpace belongs in it.

I think the more you change Ruby, the more "outside" the Ruby community you will be.

Thanks for all the excellent work you are doing with jruby.

Bob Aman said...

ObjectSpace.each_object(Module) { |mod| ... }

I find myself using the above to locate all subclasses of a particular class on a fairly regular basis. I would love to have a more efficient alternative, but I'm not willing to rely on mechanisms that force the programmer to write extra code to register a reference to the subclass with the parent class. That will inevitably lead to bugs when the programmer forgets to add the registration code. I can't even begin to say how much I want a Module#descendants method built into Ruby.

SP said...

I really hope you are not suggesting implementing something like the JCP in the legal entity proposal?

Its current implementation for Java is totally flawed, only allowing major corporations with lots of moola invested in enterprise Java to have a say in anything significant. The JCP only listens to non-real-world-developers. They are either "computer scientist" types who do not work in the enterprise (nor do they deliver systems to production and support them either) or they sell enterprise Java products to large corporations (where only usage of buzz words in the appropriate space is the technical requirement). Neither of which represent the real world developers out there that need to use Java in production business systems and it has shown a lot in the last 5 years!

Therefore I hope your "legal entity" suggestion does not encompass the "committee" nightmare created by Sun's JCP, because if we enter into such an idea with the notion that JCP actually works then we will fail miserably like Java.