Monday, April 25, 2011

Ruby for the Java world


Ever since the invention of the computer, software development has been trending towards higher-level languages. From assembly language, to C, to C++, to Java, each step up has met with the same criticisms from the old guard: it's slow; it's buggy; developers don't want to lose control. But gradually, as hardware has sped up, and new research and development has improved compilers, interpreters, and virtual machines, developers have inevitably migrated to the higher level, enhancing their productivity by freeing themselves from lower-level concerns.
Java now holds the lead in many areas of software development, but dynamic languages threaten to bypass it in this inexorable climb. Languages such as Python, Perl, Rexx, Groovy, TCL, and Ruby have been doing yeoman service in specialized domains such as file processing, test automation, software builds, glue code, and Web GUIs for years—hence, their historic name "scripting languages." But in the last few years, they have been making headway in the heavy-duty jobs once reserved mostly for C++, Java, and other compiled languages.
In the last year, the Ruby on Rails (RoR) Web framework has given Ruby a big boost. RoR builds definitions for all tiers of a typical Web application—GUI, business logic, and persistence—from simple Ruby code, thereby minimizing redundancy, boilerplate code, source-code generation, and configuration. RoR's ease of use showcases the Ruby language; and Ruby, a full-fledged software language, has much more to offer than RoR.
As a long-time Java developer, I am likely to stick with Java for some time to come. But I keep my eye on other languages that can play a role in my Java-based systems, and Ruby has recently emerged as a particularly good candidate. With the help of the JRuby interpreter, Ruby works well with Java, configuring, integrating, and reusing Java software (more on that below). And simply learning Ruby has improved my Java code. Ruby lets me easily accomplish techniques such as functional programming and metaprogramming, which in Java I can only do with difficulty. Learning these techniques in Ruby has helped me better appreciate when and how to use them in Java development.
With this article, I hope to share with you some of my excitement about what Ruby can do for my Java-based systems. I compare the strengths and weaknesses of both Java and Ruby, and present the pros and cons of the JRuby interpreter. I also show where to best draw the dividing line between Java and Ruby to benefit from each. I illustrate these points with code samples and present a messaging example that shows how to integrate Java systems with Ruby, putting to good use the flexibility, expressiveness, and power of a dynamic metaprogrammable language.

Ruby vs. Java

This article explains Ruby from the Java developer's viewpoint, focusing on a comparison between the two languages. Like Java, Ruby is a full-featured object-oriented language. But there are many significant differences. Ruby is dynamically typed and runs in a source-code interpreter, and it conveniently supports metaprogramming as well as the procedural and functional paradigms. I won't go into details of Ruby syntax, since that subject has been covered extensively elsewhere.

Dynamic typing

Java has static typing. You declare the type of each variable, and then, during compilation, you get an error message if you use a variable of the wrong type. Ruby, on the other hand, has dynamic typing: You don't declare types for variables or functions, and no type-check occurs until runtime, when you get an error if you call a method that doesn't exist. Even then, Ruby doesn't care about an object's class, just whether it has a method of the name used in the method call. For this reason, the dynamic approach has earned the name duck typing: "If it walks like a duck and quacks like a duck, it's a duck."
Listing 1. Duck typing
class ADuck
  def quack()
    puts "quack A"
  end
end

class BDuck
  def quack()
    puts "quack B"
  end
end

# quack_it doesn't care about the type of the argument duck, as long
# as it has a method called quack. Classes A and B have no
# inheritance relationship.
def quack_it(duck)
  duck.quack
end

a = ADuck.new
b = BDuck.new
quack_it(a)
quack_it(b)
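
To see the runtime check itself, here is a minimal sketch of what happens when an object without a quack method is passed in. The classes Robot and Rock and the helper try_quack are names invented for this illustration; only NoMethodError and respond_to? come from Ruby itself.

```ruby
# Robot is unrelated to any duck class but happens to have a quack method.
class Robot
  def quack
    "beep-quack"
  end
end

# Rock has no quack method at all.
class Rock
end

def quack_it(duck)
  duck.quack
end

# Calls quack_it and reports the failure instead of crashing.
def try_quack(obj)
  quack_it(obj)
rescue NoMethodError => e
  "cannot quack: #{e.class}"
end

puts try_quack(Robot.new)            # beep-quack
puts try_quack(Rock.new)             # cannot quack: NoMethodError

# respond_to? lets a caller check for the method before calling it.
puts Rock.new.respond_to?(:quack)    # false
```

The error appears only when the missing method is actually called, which is why a unit test that exercises the call is the practical safety net.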


Java also lets you achieve dynamic typing, using reflection, but this clumsy and verbose workaround produces confusing exceptions like NoSuchMethodException and InvocationTargetException; in practice, these exceptions tend to pop up in reflective Java code far more often than the equivalents in Ruby.
Even in nonreflective Java code, you often lose static type information. For example, execute() methods in the Command design pattern must return Object rather than a specific type in pre-Java 5 code, resulting in ClassCastExceptions. Likewise, when signatures change between compile-time and runtime, runtime Errors ensue. In practice, whether in Java or Ruby, such errors rarely cause severe field bugs. A strong unit test suite—which you need anyway!—generally catches them in time.
Ruby's dynamic typing means you don't repeat yourself: How often in Java have you had to suffer through verbose code along the lines of XMLPersistence xmlPersistence = (XMLPersistence)persistenceManager.getPersistence();? Ruby eliminates the need for the type declaration and casting (as well as parentheses and semicolon): a typical Ruby equivalent would be xmlPersistence = persistence_manager.persistence.
Ruby's dynamic typing does not mean weak typing: Ruby always requires you to pass objects of the correct type. Java, in fact, enforces types more weakly than Ruby. For example, Java evaluates "4" + 2 as "42", coercing the integer to a string, while Ruby throws a TypeError, telling you it "can't convert Fixnum into String." Likewise, Java, sacrificing correctness for speed, can silently overflow an integer operation, producing weirdness such as Integer.MAX_VALUE + 1, which equals Integer.MIN_VALUE, while Ruby simply expands integers as needed.
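
Both behaviors are easy to check from Ruby. The sketch below uses method names made up for this illustration; the exact wording of the TypeError message varies between Ruby versions, so the code checks only the exception class.

```ruby
# Returns :type_error because Ruby refuses to coerce the Integer to a String.
def add_string_and_integer
  "4" + 2
rescue TypeError
  :type_error
end

# Explicit conversions state which result is wanted.
def concatenated
  "4" + 2.to_s      # "42"
end

def summed
  "4".to_i + 2      # 6
end

# The same value as Java's Integer.MAX_VALUE.
JAVA_INT_MAX = 2**31 - 1

# Integers grow as needed, so adding 1 does not wrap around.
def past_java_int_max
  JAVA_INT_MAX + 1  # 2147483648
end

p add_string_and_integer
p concatenated
p summed
p past_java_int_max
```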
Despite Ruby's advantages, Java's static typing does give it one ability that leaves it as the preferred choice for large-scale projects: Java tools understand code at development-time. IDEs can trace dependencies between classes, find usages of methods and classes, auto-complete identifiers, and help you refactor code. Though parallel Ruby tools exist with limited functionality, they lack type information and so cannot perform all these tasks.

Interpreted language

Ruby runs in an interpreter, so you can test code without a distracting wait from the compiler; you can even run Ruby interactively, executing each line as you type it. Besides the fast feedback, interpreted languages have a rarely noted advantage in dealing with field bugs. With compiled languages, to analyze the code causing a field bug, the field engineer must determine the deployed application's exact build version and then look up the code in the source-code configuration management system. In real life, often this is simply impossible. With interpreted languages, on the other hand, the source code is immediately available for analysis and even, when necessary, for emergency on-the-spot fixes.

Interpreted languages have an unfortunate reputation for slowness. Compare the history of Java, a "semi-compiled" language: In the early years, the JVM always interpreted the bytecode at runtime, which contributed to Java's reputation for slowness. Yet Java developers quickly learned that most applications spend most of their time waiting on the user or network I/O and the remaining bottlenecks are best optimized in higher-level algorithms, rather than in raw low-level tweaks. Over the years, Java rapidly gained speed; for example, just-in-time compilers with optimization based on dynamic analysis sometimes allow Java to outdo C++, which is limited to optimization based on static compile-time analysis.
Already, for most applications, which spend their time waiting on I/O rather than on the CPU, Ruby is in practice no slower than other languages. In the near future, Ruby will get a further boost as Ruby's native interpreter moves to a bytecode-based system and the JVM JRuby interpreter gains the ability to compile Ruby to Java bytecode. Over time, any remaining performance lag should become negligible for most functionality.

Functional programming

Ruby supports multiple programming paradigms with equal ease. In addition to its pure object-oriented style (even integers and the like are objects), it also supports a procedural style suited to one-off scripts: you can write code outside any class or function, as well as functions outside any class. (Ruby silently adds these functions to the Object class, maintaining the object-orientation behind the scenes.)
Most interestingly, for Java programmers looking for new and useful perspectives, Ruby supports the functional paradigm. You can use most functional programming constructs in Java, with some support from libraries such as Jakarta Commons Collections, but the syntax is clumsier. (See "Functional Programming in the Java Language" by Abhijit Belapurkar and Jakarta Commons Collections in Resources.) Ruby, though not a purely functional language, treats functions and anonymous blocks of code as full citizens of the language, which can be passed around and manipulated like any ordinary objects.
In Java, you often iterate over a collection's Iterator, running the logic on each element. But the iteration is just an implementation detail: in general, you are just trying to apply the same logic to each element of a collection. In Ruby, you pass a code block to a method that does just that, iterating behind the scenes. For example, [1, 2, 3, 4, 5].each{|n| print n*n, " "} prints the string 1 4 9 16 25; an iterator takes each element of the list and passes it into the code block as the variable n.
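
The same idea extends beyond each. Here is a small sketch, with method names chosen only for this illustration, that uses the standard map, select, and inject methods and passes a lambda around as an ordinary object:

```ruby
# Transform every element.
def squares(numbers)
  numbers.map { |n| n * n }
end

# Keep only the elements for which the block returns true.
def evens(numbers)
  numbers.select { |n| n.even? }
end

# Fold the list into a single value, starting from 0.
def total(numbers)
  numbers.inject(0) { |sum, n| sum + n }
end

# A lambda is an object, so it can be passed in and called like any other argument.
def apply_twice(fn, value)
  fn.call(fn.call(value))
end

double = lambda { |n| n * 2 }

p squares([1, 2, 3, 4, 5])   # [1, 4, 9, 16, 25]
p evens([1, 2, 3, 4, 5])     # [2, 4]
p total([1, 2, 3, 4, 5])     # 15
p apply_twice(double, 3)     # 12
```

In each case the caller states only the per-element logic; the iteration stays inside the collection method.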
Functional programming is useful in wrapping code blocks with instructions to be executed before and after the block. For example, in Java, you can use the Command design pattern to ensure the opening and closing of a file, database transaction, or other resource. This burdens the code with the useless overhead of an anonymous inner class and callback method; in addition, there is the distracting rule that variables passed into the anonymous Command class must be declared final. And to ensure that logic always executes at the end of a code block, the whole thing must be wrapped with try{...}finally{...}.
In Ruby, you can wrap any function or code block, with no need for anonymous classes or method definitions. In Listing 2, the file opens, then closes automatically once the block containing the write() call finishes. No code is needed beyond the File.open call and the code block.
Listing 2. Wrap a code block
File.open("out.txt", "a") {|f|
  f.write("Hello")
}
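
File.open is built in, but the same wrapping pattern is easy to write yourself with yield and ensure. The following is only a sketch: the transaction method and its log array are hypothetical stand-ins for a real resource such as a database connection.

```ruby
# Runs the caller's block between "begin" and "commit" steps.
# On an error it records "rollback" and re-raises; "close" is always recorded.
def transaction(log)
  log << "begin"
  result = yield
  log << "commit"
  result
rescue StandardError
  log << "rollback"
  raise
ensure
  log << "close"
end

log = []
value = transaction(log) { 6 * 7 }
p value        # 42
p log          # ["begin", "commit", "close"]

failed_log = []
begin
  transaction(failed_log) { raise "boom" }
rescue RuntimeError
  # The error still reaches the caller, but only after the cleanup has run.
end
p failed_log   # ["begin", "rollback", "close"]
```

The caller writes only the block; the setup, error handling, and cleanup live in one place, with no callback class and no final variables.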


Metaprogrammable language

You generally define your Java classes as source code, but you can also manipulate class definitions at runtime. This requires advanced techniques such as bytecode enhancement during class-loading. Hibernate, for example, inserts data-access logic directly into the business objects' bytecode, saving the application programmer from coding the extra data-access layer. But class manipulation, called metaprogramming, is practical only for infrastructure programmers: application programmers cannot usefully introduce these tricky and fragile techniques.
Even at development time, Java limits the developers' ability to change classes. To add an isBlank() method that tells you if a String is all white space, you'd have to add a StringUtils class with the static method (as Apache Commons does); logically, the new method belongs in String.
In Ruby, on the other hand, you can simply extend the built-in String class with a blank? method. In fact, because in Ruby everything is an object, you could even augment the Fixnum class, the equivalent of Java's primitive int, as shown in Listing 3.
Listing 3. Add method to built-in classes String and Fixnum
class String
  # Returns true if string is all white space.
  # The question mark indicates a Boolean return value.
  def blank?() !(self =~ /\S/) end
end

# On Ruby 2.4 and later, reopen Integer instead; Fixnum was merged into it.
class Fixnum
  # Returns true if the number is odd. The comparison matters: in Ruby only
  # nil and false count as false, so a bare 0 would be treated as true.
  def odd?() self % 2 == 1 end
end

puts " ".blank? # true
# The next line evaluates if-then similarly to Java's ternary operator ?:
puts (if 23.odd? then "23 odd" else "23 even" end)
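
Reopening a class is the simplest form of metaprogramming; Ruby can also generate methods at runtime with the standard define_method call. Here is a minimal sketch, in which the GameEntity and Vehicle classes and the trait helper are names invented for this illustration:

```ruby
class GameEntity
  # For each name, defines a reader, a writer, and a predicate method
  # on whichever class calls trait.
  def self.trait(*names)
    names.each do |name|
      define_method(name) { instance_variable_get("@#{name}") }
      define_method("#{name}=") { |value| instance_variable_set("@#{name}", value) }
      define_method("#{name}?") { !instance_variable_get("@#{name}").nil? }
    end
  end
end

class Vehicle < GameEntity
  # One line of application code produces six methods.
  trait :speed, :driver
end

car = Vehicle.new
car.speed = 88
p car.speed      # 88
p car.speed?     # true
p car.driver?    # false
```

This is the kind of class manipulation that takes bytecode enhancement in Java but is ordinary application code in Ruby.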


JRuby: Ruby in a Java world

As a Java programmer, you won't want to use Ruby in production until you can get it to interact with existing Java applications and libraries, which hold within them a tremendous variety of essential functionality. JRuby, an open source Ruby interpreter for the JVM, simplifies Ruby-Java integration. You can call Java libraries from Ruby, script Java applications with an embedded interpreter, or even use Ruby libraries from Java. Exactly the same Ruby code runs in JRuby and the standard Ruby interpreter, except where Ruby code calls into native (C-coded) or Java libraries.
The JVM is often contrasted with .Net's multilanguage Common Language Runtime as supporting only a single language. But in fact, the JVM executes not only Java, but also Python, JavaScript, Groovy, Scheme, and many other languages, which means that where necessary, Ruby code can interact with these languages as well.
As of mid-July 2006, JRuby remains in prerelease mode (version 0.9). But it's catching up rapidly: a team of volunteers has released five versions since January 2005. JRuby's maturity is continually evaluated with a test suite that calibrates it against the standard interpreter; it now passes more than 90 percent of the tests and provides basic support for Ruby on Rails.
To try JRuby, be sure that Java SE 5 is installed and that JAVA_HOME is set. (Java Runtime Environment 1.4 was supported in JRuby 0.8.3 and below; it will be supported from the next patch release after 0.9.) Download the compressed file from the project's page and uncompress it. Set the JRUBY_HOME environment variable to the JRuby base directory. You can experiment interactively with jirb in the bin directory. For most purposes, you'll use the jruby interpreter—create a file and pass its name as a parameter to the jruby batch/shell script in directory bin.
In addition to running ordinary Ruby code, you can also use JRuby to construct Java objects, call Java methods, and inherit from Java classes. A Ruby class can even implement Java interfaces—necessary for statically calling Ruby methods from Java.
To initialize the libraries used for accessing Java from Ruby, start with require "java". Then specify the Java classes to use with the include_class method, for example, include_class "javax.jms.Session". You can include an entire Java package into a Ruby module with include_package. As with Java's wildcard package-import statement, it's advisable to avoid include_package because of the namespace pollution it causes; in JRuby, there is the additional penalty of a performance hit as the interpreter searches all packages for the desired class. Stick to include_class where possible.
The names of many Java standard classes overlap with the names of Ruby classes. To resolve collisions, pass a code block into the include_class function, returning a new name for the Java class, and JRuby will use this as an alias (see Listing 4).
Listing 4. Include a Java class with clashing name
require "java"
# The next line exposes Java's String as JString
include_class("java.lang.String") { |pkg, name| "J" + name }
s = JString.new("f")


Alternatively, you can create a Ruby module that includes the Java class definitions, but in a separate namespace. For example:
Listing 5. Java module importing multiple Java classes
require "java"
module JavaLang
  include_package "java.lang"
end
s = JavaLang::String.new("a")


What's JRuby good for?

Dynamic languages like Ruby are most commonly used for specialized areas such as gluing other systems together; JRuby takes on this role in the Java world. For example, JRuby can pull data from one system, transform it and insert it into another. When the requirements change, modifying a JRuby script is as easy as changing a configuration file, thereby avoiding the complex compile-and-deploy cycle of Java integration code.
In addition to calling Java from Ruby, you can call Ruby from Java, making your application scriptable. With JRuby's minimal syntactic overhead, you can create an easy-to-use domain-specific language for users to work with. For example, a gaming engine's scripting system can present Ruby classes describing characters, vehicles, and other game entities.
Moreover, with Ruby's dynamism, users can change definitions of scriptable classes. The Ruby objects allow direct method access to state and behavior. With Java, on the other hand, you typically pass around a Map with user-configurable keys, lacking the full functionality of objects.
A Ruby script is like a souped-up configuration file. Configuration for Java applications typically comes in an XML or properties file, but these are limited only to parameters decided on at development time. With a Ruby script fed to your scripting system either from a text file or from an embedded editor, the user can freely customize behavior wherever you choose to place a scripting hook. In this way, Ruby combines configuration with behavior, providing the functionality of Java plug-in APIs, but without a Java IDE or a compiler, and sparing the additional steps of building and deploying jar files.
For example, a user-provided script can hook into an application's management events to filter for certain suspicious conditions, then send a notification to a system administrator and log into a special security-issues database, or a start-up script can purge old files and reinitialize custom datastores. Likewise, many rich clients allow users to change the position of menus and toolbars—with a JRuby hook, a user's new menu items trigger any behavior the user wants.
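
The following sketch shows, in plain Ruby, how a script can mix configuration with behavior. ScriptHost, its set and on methods, and the event name are all hypothetical; a real JRuby integration would receive the script from the Java application rather than from an inline string, and should treat user scripts as trusted code, since instance_eval runs whatever the script contains.

```ruby
class ScriptHost
  attr_reader :settings

  def initialize
    @settings = {}
    @hooks = Hash.new { |hash, event| hash[event] = [] }
  end

  # Called from user scripts to set a configuration value.
  def set(key, value)
    @settings[key] = value
  end

  # Called from user scripts to attach behavior to a named event.
  def on(event, &handler)
    @hooks[event] << handler
  end

  # Evaluates the script text with this host as self, so the bare calls
  # to set and on inside the script resolve to the methods above.
  def load_script(source)
    instance_eval(source)
    self
  end

  # Runs every handler registered for the event and collects the results.
  def fire(event, *args)
    @hooks[event].map { |handler| handler.call(*args) }
  end
end

# What a user might type into a text file or an embedded editor.
USER_SCRIPT = <<~'SCRIPT'
  set :max_failed_logins, 3
  on(:login_failed) { |user| "notify admin: repeated failures for #{user}" }
SCRIPT

host = ScriptHost.new.load_script(USER_SCRIPT)
p host.settings[:max_failed_logins]   # 3
p host.fire(:login_failed, "pat")     # ["notify admin: repeated failures for pat"]
```

The first line of the user script is plain configuration; the second is behavior that a properties file could not express.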
For convenience in scripting Java applications, the Bean Scripting Framework (BSF) provides a standard interface between the JVM and multiple dynamic languages, including Ruby, Python, BeanShell, Groovy, and JavaScript; Java Specification Request (JSR) 223, the successor of BSF, will be a standard part of Java 6. The Java code can send variables into the JRuby namespace; JRuby can manipulate these Java objects directly or return a value back to Java. With the BSF and JSR 223, the syntax for interoperation between Java and any scripting language is the same. Listing 6 shows a basic example of BSF use with Ruby, and the full code is in the online samples under the directory bsf_example. Note that BSF does not include JRuby support out of the box, but simple instructions for adding it are available in the JRuby documentation.

Listing 6. Embed a Ruby interpreter
...
// JRuby must be registered in BSF.
// jruby.jar and bsf.jar must be on classpath.
BSFManager.registerScriptingEngine("ruby",
    "org.jruby.javasupport.bsf.JRubyEngine", new String[]{"rb"});
BSFManager manager = new BSFManager();
// Make the variable myUrl available from Ruby.
manager.declareBean("myUrl", new URL("http://www.jruby.org"), URL.class);
// Note that the Method getDefaultPort is available from Ruby
// as getDefaultPort and also as defaultPort.
// The following line illustrates the combination of Ruby syntax
// and a Java method call.
String result = (String) manager.eval(
    "ruby", "(java)", 1, 1,
    "if $myUrl.defaultPort < 1024 then " +
    "'System port' else 'User port' end");
...


It is also possible to instantiate a JRuby interpreter directly from Java, but as this ties your Java code to Ruby-specific Java wrapper classes, it's best to stick with BSF/JSR 223. (An upcoming release of JRuby will also allow direct embedding of the interpreter with cleaner encapsulation of Ruby objects, without the need for BSF or JSR 223.)

Limitations

It's important to remember that JRuby is still in prerelease mode and still has some limitations, which will be fixed before the 1.0 release.
In particular, a Ruby class can't extend abstract Java classes. Unfortunately, to get around this limitation, you can't simply create a concrete Java subclass, implementing abstract methods with dummy methods, which the Ruby class then extends. This is because of a second limitation in Ruby/Java inheritance in current prerelease versions of JRuby: Java code cannot polymorphically call a Ruby-coded method that overrides a (concrete) Java method.
The limitations make the use of Swing especially difficult. For example, it's impossible to extend AbstractTableModel to take advantage of the functionality that it adds to the TableModel interface. You can work around this by converting inheritance to delegation: Extend the abstract class with a concrete Java "shim" class that delegates to an object of interface type, where the interface includes all the abstract Java methods. A Ruby class implements the delegate interface. Though the wordiness of this approach obviates the usual advantages of JRuby, it does offer a workaround for this limitation and provides flexibility where it is needed: in the subclass functionality (see table_example in attached code listings).
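
The shape of this workaround can be sketched in plain Ruby, so that it runs without JRuby. All three classes below are stand-ins invented for this illustration: in a real application AbstractModel and ShimModel would be Java classes (for example, AbstractTableModel and a Java shim), and only the delegate would be written in Ruby.

```ruby
# Stand-in for the abstract Java class: it offers useful inherited
# behavior (describe) built on methods that subclasses must supply.
class AbstractModel
  def describe
    "#{row_count} rows x #{column_count} columns"
  end

  def row_count
    raise NotImplementedError
  end

  def column_count
    raise NotImplementedError
  end
end

# Stand-in for the concrete shim: it implements each abstract method
# by forwarding to a delegate object.
class ShimModel < AbstractModel
  def initialize(delegate)
    @delegate = delegate
  end

  def row_count
    @delegate.row_count
  end

  def column_count
    @delegate.column_count
  end
end

# The delegate: the part that would be a Ruby class implementing
# the Java delegate interface.
class ScoreTable
  def initialize(rows)
    @rows = rows
  end

  def row_count
    @rows.size
  end

  def column_count
    @rows.first.size
  end
end

model = ShimModel.new(ScoreTable.new([[1, 2], [3, 4], [5, 6]]))
puts model.describe   # 3 rows x 2 columns
```

The inherited describe method keeps working, while the variable part of the logic lives in the delegate, where it can be changed freely.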
Ruby comes with standard libraries that provide a wide range of functionality. Those that do not use native code are included in the JRuby release. The JRuby team is gradually porting most libraries, although some that are a thin layer over native code, like the GUI library Tk, will probably not be ported. Java's standard libraries, which are included, provide all necessary functionality.
JRuby now provides basic support for Ruby on Rails in the WEBrick Web server; the next major milestone is full support in any servlet container. RoR support will let programmers combine the ease of the Web framework with the vast array of existing Java libraries, while also validating JRuby as an alternative Ruby interpreter.
Java SE 5 is required for JRuby 0.9 because of a bug, but JRE 1.4 will be supported in the next release.
The JRuby team has so far prioritized correctness over performance; optimization is a top priority for the 0.9.x releases.

Example

The example jms_example.rb illustrates the use of JRuby at greater length. It shows how to transform messages from one class structure to another, simulating software that integrates two differently defined distributed game engines. The code uses advanced functional and metaprogramming techniques to transform XML messages of any type as Ruby objects. Comments in the code walk you through the logic.
Achieving this level of configurability in Java is nearly impossible; at best, users can plug in compiled code. But in Ruby, the dynamic code is as natural and easy to use as any statically defined code.

Ruby and the future of Java

Ruby has a lot to teach Java programmers. The Ruby on Rails framework shows how simple it can be to develop a Web application; with JRuby, RoR will soon be able to reuse existing Java functionality. JRuby is set to join Jython, JavaScript, and other dynamic languages in the scripting of Java applications. Developers who want to keep their skills up to date should learn dynamic languages, which are likely to take over more and more domains in application development. Even when not using dynamic languages, Java developers can benefit from the insights Ruby gives into concepts like functional programming and metaprogramming.
