Strength & Speed: Leveraging Java into RAD with JRuby
This rant was drafted in Aug 2013 but never published. I completely forgot
about it until multiple days of frigid weather in NYC drove me to do a
comprehensive review of moschetti.org and I discovered this in a
TBD.tar file. I believe Ruby has become a bit less shiny in the past
4.5 years but this is nonetheless a good example of software factoring and
reuse of core critical components.
Java: A Good Foundation
We're not here to debate if Java is "better" than C++ or Python or FORTRAN
or COBOL or Smalltalk. Simply put, Java is a good language for building both
reusable components and applications. Setting aside
exotic reflection and bytecode manipulation for the moment, Java offers
these features:
- More than acceptable performance
- Great compile-time checking of code and other benefits of a static
type system
- A very (arguably over) complete platform
- Strong open source community support and vendor support
- A broad variety of third-party products that offer integration via Java APIs
- Large and economically fluid global talent pool
Where Java Gets ... Tedious
However, doing quick work with Java such as small apps and utilities
can be tedious, especially when dealing with data and file integration
activities. The capabilities and benefits outlined above become much
less critical at the "edge" of the software stack and in some cases
actually become hindrances. In general:
- There is far less of a need for a strong, well-engineered interface
(GUI, command line, etc.) that can service many needs. In fact, arguably
the interfaces to these kinds of programs can be made as specific and narrow
as desired to simplify and target use of the program. If you need to do
something else, build a different app. Of course, this approach is
successful only if a well-factored software stack is in use; otherwise,
it is likely that you will be copying and modifying large chunks of underlying
code instead of picking and choosing different component ingredients.
- Performance and security do not have to be "overengineered" to satisfy
the most demanding consumer. The program, as a runtime, has a defined
performance and security profile and in many cases does not need to run as fast
as theoretically possible. Tradeoffs between performance, memory use,
storage, and compactness and/or ease of computing can be made at this level
of the stack. The underlying components, however, clearly need
to be engineered to be as fast and secure as possible because they become
the limiting factors for any consuming program.
- Apps and utilities tend to deal with externalized data as an important
part of their function. Files, data streams, command line arguments, even
things typed into screens. The predominant types we find in this space
are strings and collections of strings and although Java is certainly
capable of dealing with them, other languages and environments often make it
far easier to work with these two types.
In short, sometimes you just want to write 10-50 lines of code quickly to
get something done.
The solution is clear: Develop a multi-language
software base with Java at the core and a scripting language that can
access the Java code functionality. This will permit you to
enjoy the best of both worlds.
Enter Ruby
The Ruby language is currently enjoying a burst of popularity largely generated
by the Ruby on Rails framework, but it is nonetheless a capable language at
a basic level. Like Perl and Python, Ruby has relaxed type declaration,
outstanding string manipulation functions, and somewhat more powerful
collections operations than Java, and offers
functional programming for those programmers (and programs) that benefit
from this programming style. It also has a rich ecosystem of open
source modules called "gems" that satisfy many common programming needs.
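As a small illustration of those string and collection strengths, here is a minimal sketch in plain Ruby (no Java involved). The sample data and the names raw, rows, totals, and report are made up for illustration.

```ruby
# Hypothetical sample data: "desk,amount" records, one per line.
raw = "fx,100\nrates,250\nfx,50\nequity,75\nrates,25"

# Split into lines, then fields, converting the amount to an integer.
rows = raw.split("\n").map { |line|
  desk, amount = line.split(",")
  [desk, amount.to_i]
}

# Group by desk, then sum the amounts within each group (functional style).
totals = rows.group_by { |desk, _| desk }.map { |desk, recs|
  [desk, recs.map { |_, amt| amt }.inject(0) { |a, b| a + b }]
}

# Keep only desks totalling at least 100 and format each one as a string.
report = totals.select { |_, total| total >= 100 }.sort.map { |desk, total|
  "#{desk.upcase}=#{total}"
}

puts report.join(" ")
```

The equivalent Java would need a map of lists, explicit loops, and number parsing. None of it is hard, but it is more to type and more to read.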
As of this writing there are several Ruby implementations including JRuby,
a 100% pure Java implementation of Ruby. It has been well-engineered to
cooperate with the JRE both in terms of its ability to be embedded in a
Java program (i.e. an existing Java program constructs some Ruby source code
and calls an eval method) and import existing Java libraries into a Ruby
program. It is the latter case that is the focus of this article.
Why Ruby?
No flames, please; this is not about why one language is better than
another in an absolute sense.
The impulsive response is "why not?" The abstract academic response is "it
actually doesn't matter; it's the multi-language leverage concept that is
important." But from a practical standpoint, scripting is going to be
done in Perl, Python, Groovy, Ruby, or more recently, Scala.
- As much as
I am a long-time fan of Perl and a Perl user, the more modern languages have
a more refined and symmetric approach to objects. Plus, Perl integration with
Java is always untidy; I much prefer integrating it with C or C++.
- Groovy is more like dynamic
Java and although that is good and it has many features of Python and Ruby,
the syntax and collections and i/o handling are not quite as "easy/compact"
as Python or Ruby.
- Scala is very promising especially because it compiles to Java byte code
but it is a little new to the party. Watch for an article on Scala leverage
in the future...
This leaves Python and Ruby. Python was out of the gate first with JPython
but the community quickly and wisely refocused efforts on a 100% pure Java
implementation of Python, yielding (get it?) Jython. The truth is, for most
of the RAD use cases encountered, both Jython and JRuby are perfectly
acceptable. I chose to use Ruby and JRuby for these examples for these
reasons:
- There has been a lot of activity in the Ruby space of late. Yes, Rails
is a big part of that.
- As a Perl fan, there are many syntax and function similarities to Perl
that make me feel more at home with Ruby.
The Meat
To begin, assume we have these Java classes:
- Persistor, an interface to a persistence framework.
- DBImpl, a persistence engine binding that implements Persistor
- DAL, a data access layer that provides functional access to
data. It consumes Persistor and basically hides SQL or noSQL or
any other oddments from the applications.
Last but by no means least:
- FancyMath, a nontrivial object that depends on several
other classes, has real state, an externalized form different from the
internal representation, some beefy methods, etc. The methods have real
algorithms and complex implementations and are our own work product, not
open source. It has its own set of
test drivers (functional and performance). In short, not a
glorified HashMap with bespoke get/set of Strings. This is a core
component and something you would not want to reimplement in another language.
Each of these classes is built into a different .jar file of
course because they have different physical and logical dependencies. There
is no reason FancyMath should depend on a specific persistor and
certainly we don't want the persistence layer dependent on FancyMath.
To simplify the example, we will name the archives persistor.jar,
dbimpl.jar, DAL.jar, and fancymath.jar.
We'll see why a real-life multi-jar scenario is important to consider later on.
Any number of Java programs can be written with these jars and these programs
will benefit from compile-time checking and static typing; nothing particularly
special here. But let's look at the following use case:
- CSV content will be fetched via http from a web service
- A local file will be used as a category code mapping file
- Certain functions in FancyMath will be called
- The result will be written to the database
The "800 lb gorilla" in this setup is FancyMath. Everything else
is easy and in the case of some languages, very easy. But we need to leverage
the work expended on creating and maintaining fancymath.jar.
In this case, more time is spent in getting the right import statements,
properly allocating arrays, making HashMaps, and finding 3rd party/open source
libs than actually doing the work.
This Java program might look like the following. Notes:
- In the spirit of
apples to apples, I am using as few non-platform libs as possible (i.e. not
using the Apache Commons IOUtils module)
- The program is lacking in exception blocks, checks for null, closing
i/o resources, etc. but those would be roughly equivalent in both Java and Ruby.
We do not show them here to make the comparison a little clearer.
- Restraint has been applied to trying to compactify the source. The goal
here is to create a program quickly but with an eye toward downstream
maintenance (or at least comprehension).
- The example is conceptual and might not actually compile in a
cut-and-paste scenario.
import com.me.Persistor;
import com.me.PersistorFactory;
import com.me.SomePersistorFactoryImpl;
import com.me.DAL;
import com.me.FancyMath;
import java.util.Scanner;
import java.util.Map;
import java.util.HashMap;
import java.io.BufferedReader;
import java.io.File;
import java.io.InputStreamReader;
import java.net.URL;
import java.net.URLConnection;

public class Loader1 {
    // static so that main() can call it; "throws Exception" stands in for
    // the exception handling omitted per the notes above
    private static String getURL(String url) throws Exception {
        URL website = new URL(url);
        URLConnection connection = website.openConnection();
        BufferedReader in = new BufferedReader(
            new InputStreamReader(
                connection.getInputStream()));
        StringBuilder response = new StringBuilder();
        String inputLine;
        while ((inputLine = in.readLine()) != null) {
            // readLine() strips the line terminator; put it back so the
            // content can be split into lines later
            response.append(inputLine).append("\n");
        }
        in.close();
        return response.toString();
    }

    private static Map<String,Object> bulkContentToMap(String content, String fldDelim) {
        Map<String,Object> tbl = new HashMap<String,Object>();
        String[] lines = content.split("\n");
        for(String l : lines) {
            String[] flds = l.split(fldDelim);
            tbl.put(flds[0], flds);  // tbl[key] points to the entire record
        }
        return tbl;
    }

    public static void main(String[] args) throws Exception {
        String s1;
        s1 = getURL("http://machine/path");
        Map<String,Object> tbl1 = bulkContentToMap(s1, ",");
        // This is arguably slightly too "loose" but let's permit it for now...
        s1 = new Scanner(new File("path/to/codemap.csv")).useDelimiter("\\Z").next();
        Map<String,Object> tbl2 = bulkContentToMap(s1, ",");
        Persistor p = some PersistorFactory arrangement with dbimpl;
        Map<String,Object> m = new HashMap<String,Object>();
        for( Map.Entry<String,Object> me : tbl1.entrySet()) {
            m.clear();
            m.put("key", ((String[])tbl2.get(me.getKey()))[1]); // yikes
            String[] data = (String[])me.getValue();
            m.put("val1", data[1]);
            { // Turn "John A. Smith" into "JAS":
                StringBuilder sb2 = new StringBuilder();
                for(String w : data[2].split(" ")) {
                    sb2.append(Character.toUpperCase(w.charAt(0)));
                }
                m.put("user", sb2.toString());
            }
            m.put("smoothed", FancyMath.smooth(data));
            DAL.insertCurve(p, m);
        }
    }
}
And here is how we might run it:
$ java -classpath persistor.jar:dbimpl.jar:DAL.jar:fancymath.jar:. Loader1
In contrast, this is what the Ruby version looks like:
require 'java'   # tell JRuby to activate Java class loader machinery
java_import 'com.me.SomePersistorFactoryImpl'
java_import 'com.me.DAL'
java_import 'com.me.FancyMath'   # The whole reason we're doing this...
require 'net/http'

def bulkContentToMap(content, fldDelim)
  tbl = {}
  content.split("\n").each { |line|
    flds = line.split(fldDelim)
    tbl[flds[0]] = flds   # tbl[key] points to the entire record
  }
  tbl
end

uri = URI('http://machine/path')
c = Net::HTTP.get(uri)
tbl1 = bulkContentToMap(c, ',')
c = IO.read('path/to/codemap.csv')
tbl2 = bulkContentToMap(c, ',')
p = some PersistorFactory arrangement with dbimpl;
tbl1.each_pair { |key,data|
  m = {}
  m["key"] = tbl2[key][1]
  m["val1"] = data[1]
  # Turn "John A. Smith" into "JAS":
  m["user"] = data[2].split(" ").map {|w| w[0].chr }.join.upcase
  m["smoothed"] = FancyMath.smooth(data)
  DAL.insertCurve(p, m)
}
And here is how we might run it:
$ env CLASSPATH="persistor.jar:dbimpl.jar:DAL.jar:fancymath.jar" jruby loader1.rb
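The data-handling half of the Ruby script can be tried on its own under any Ruby, without the web service, the jars, or a database. The following is a minimal sketch under that assumption: the inline CSV strings and their column layout (id, value, user name for the feed; id, category code for the code map) are invented for illustration, and the calls to FancyMath and DAL are left out.

```ruby
# Same helper as in the script above: key each record by its first field.
def bulkContentToMap(content, fldDelim)
  tbl = {}
  content.split("\n").each { |line|
    flds = line.split(fldDelim)
    tbl[flds[0]] = flds
  }
  tbl
end

# Hypothetical stand-ins for the web service response and codemap.csv.
feed    = "A1,3.5,John A. Smith\nB2,4.0,Mary Jones"
codemap = "A1,RATES\nB2,FX"

tbl1 = bulkContentToMap(feed, ',')
tbl2 = bulkContentToMap(codemap, ',')

# Build one string-keyed map per feed record, as the loader does.
records = tbl1.map { |key, data|
  {
    "key"  => tbl2[key][1],
    "val1" => data[1],
    # split -> ["John", "A.", "Smith"]; map -> ["J", "A", "S"]; join -> "JAS"
    "user" => data[2].split(" ").map { |w| w[0].chr }.join.upcase
  }
}

records.each { |m| puts m.inspect }
```

Once the maps look right, swapping the inline strings for Net::HTTP.get and IO.read and adding the Java calls gets back to the full loader.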
What are some interesting things we see here?
- Ruby lvals need no explicit type declaration. They take the type of whatever
the rval expression returns. This means that intermediate values in a series
of function calls do not need a bevy of imports or other mechanisms for type
declarations. This makes program construction faster and, for relatively
small utils, clearer, because attention is not drawn away from the really important
and useful parts of the program.
- Map and list handling is just easier. When dealing with string-keyed maps
and lists of data, Ruby (and Python and Perl and ...) is just simpler than Java.
- There are a host of functional and collection processing idioms in Ruby whose
purpose is not immediately obvious to the novice, but they
appear so often that one becomes acclimated to their use and output, and they
are powerful and compact. See the expression for assigning m["user"]
above and compare to Java.
- Ruby does the work of cracking and assembling data for
passing to FancyMath.smooth() and DAL.insertCurve(). The
complex and potentially high-performance aspects of that software are "safely"
contained in Java and none of it has to be re-engineered in Ruby, including
persisting to the database. Constructing a strong data access layer (DAL)
over persistence is a vital factoring and insulation exercise even in a single
language world. The effort to do so is repaid many times over in a multi-language
leverage scenario.
- The setup of the runtime CLASSPATH is the same.
- There is a subtle issue of ensuring that the appropriate types (or toString()
equivalents) can be created in Ruby to be properly passed to the Java layer.
A Map containing a bespoke Ruby native object (i.e. some class we might create
locally in the Ruby source) cannot be interpreted by the Java layer. This also
means that common Java types like java.util.Date which appear in Java
method signatures cannot consume the Ruby "natural equivalents"; a util or
a "string representation bridge" must be used to create the Java type from the
Ruby type.
- Least important but still relevant: the Ruby program is about half the length
of the Java program.
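The type-bridging point can be sketched without any Java at all: before crossing into the Java layer, flatten Ruby-native objects into strings and plain string-keyed maps. This is a minimal sketch and not a utility from the original code base. The Trade class, the to_java_friendly helper, and the choice of ISO-8601 as the string representation are all assumptions made for illustration, and the Java side would still need to parse the date string into a java.util.Date.

```ruby
require 'time'   # adds Time#iso8601

# Hypothetical bespoke Ruby class that a Java method could not interpret.
Trade = Struct.new(:id, :qty, :booked_at)

# Hypothetical helper: reduce a Trade to a map containing only strings.
def to_java_friendly(trade)
  {
    "id"        => trade.id.to_s,
    "qty"       => trade.qty.to_s,
    # String representation bridge for the date: ISO-8601 in UTC.
    "booked_at" => trade.booked_at.utc.iso8601
  }
end

t = Trade.new("T-1", 500, Time.utc(2013, 8, 15, 14, 30, 0))
m = to_java_friendly(t)
puts m["booked_at"]
```

A map like m holds nothing but strings, so it is the kind of value that can be handed to a Java method expecting a Map without the Java layer needing to know anything about Trade.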
Site copyright © 2013-2024 Buzz Moschetti. All rights reserved