Jakarta Commons Lang project offers centralized utility functions

Most developers have their own sets of frequently used utility functions, but they aren't properly organized or documented. The Commons Lang project's goal is to provide a uniform, tested, and documented library.


Most developers have worked with a small set of utility functions, such as capitalize, stringToInt, or split. Often this is as far as a common library goes; the management effort spent on unit testing, documentation, and packaging is absent. The code is usually the product of one developer and hasn't met the challenge of peer review.

Apache Jakarta Commons Lang is an API that aims to centralize these common utility functions with good documentation, high stability, and a goodly amount of peer review (or arguments, as the case sometimes may be).

Getting involved
A couple of years ago, the lack of common code planning was the state of things where I worked. We had two String handling classes, a handful of util subpackages spread around the system, and documentation so hazy we often weren't aware of util packages that others had put together. Furthermore, there was no plan for factoring out common components into a common package.

It was frustrating, so I began to write my own set of common classes in a library called GenJavaCore. This library reached a certain level of maturity with average documentation levels and some stability in parts of the code I used frequently, but there was no peer review. I had a few users around the world, but nothing big.

When I joined Apache (Jakarta Taglib's String Taglib being my entry piece), part of my focus was extracting the String library from String Taglib to the Jakarta Commons project. It incorporated a smaller, preexisting StringUtils library and became one of the initial pieces in the new Lang project.

The central idea of Jakarta Commons Lang is the enhancement of the standard Java libraries, most notably java.lang.*, but it does reserve the right to incorporate other Java libraries, such as java.util.Date or Java array functions. The core of the library is easily applicable by any Java developer. It avoids declaring itself to be a repository of utility classes.

For the last year, the Jakarta Commons Lang project has been evolving and maturing, pieced together from code around the Jakarta projects, from committers' own personal libraries (e.g., GenJavaCore), and from the minds of the committers. Many times a simple idea has grown into a more powerful one when it hit the mailing list.

A small slice of Commons
My core interest is in the String utility classes: StringUtils, RandomStringUtils, and CharSetUtils. They form the backbone of the String Taglib and are the classes in which I've invested most of my time. Naturally, I find them to be very useful; I use them daily. Notable features of the StringUtils class are:
  • capitalise(String): A String capitalization function (the British spelling has remained) that properly uses toTitleCase rather than the toUpperCase that most string libraries use.
  • join(Object[], String): This joins the toString of each object in the object array into a single String with the specified delimiter. So join(new String[] {"A", "B", "C"}, ";") results in "A;B;C". An Iterator may also be joined.
  • split(String, String): It splits a piece of delimited text. It's not quite as powerful as StringTokenizer (it doesn't handle groups of delimiters together), but it is a quick and easy tool to apply on many occasions. It is the reverse of our previous join method: split("A;B;C", ";") => {"A", "B", "C"}. JDK 1.4 adds a regular-expression-based split method to java.lang.String.
  • reverseDelimitedString(String, String): This is an interesting method. It reverses a piece of text based on a delimiter. So reverseDelimitedString("org.apache.commons", ".") happily becomes "commons.apache.org".
  • replace(String, String, String): This is the often-yearned-for String replacement method. While JDK 1.4 can throw regular expressions at the problem, the StringUtils class provides the much simpler String-based version that is used more frequently.
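
To make the behavior in the list above concrete, here is a minimal plain-JDK sketch of the same operations. It is not the Commons Lang source: the class name DelimitedText and its method names are hypothetical, and it assumes the delimiter is one literal string, which may differ from how StringUtils treats its separator argument.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration only; not the Commons Lang implementation. */
final class DelimitedText {

    /** Joins the toString() of each element, placing the delimiter between elements. */
    static String join(Object[] items, String delimiter) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < items.length; i++) {
            if (i > 0) {
                out.append(delimiter);
            }
            out.append(items[i]);
        }
        return out.toString();
    }

    /** Splits on a literal (non-regex) delimiter, which must not be empty. */
    static String[] split(String text, String delimiter) {
        if (delimiter.isEmpty()) {
            throw new IllegalArgumentException("delimiter must not be empty");
        }
        List<String> parts = new ArrayList<>();
        int start = 0;
        int hit;
        while ((hit = text.indexOf(delimiter, start)) >= 0) {
            parts.add(text.substring(start, hit));
            start = hit + delimiter.length();
        }
        parts.add(text.substring(start));
        return parts.toArray(new String[0]);
    }

    /** Reverses the order of the delimited pieces, not the characters inside them. */
    static String reverseDelimited(String text, String delimiter) {
        String[] parts = split(text, delimiter);
        for (int i = 0, j = parts.length - 1; i < j; i++, j--) {
            String swap = parts[i];
            parts[i] = parts[j];
            parts[j] = swap;
        }
        return join(parts, delimiter);
    }

    /** Title-cases the first character; this differs from upper-casing for a few Unicode letters. */
    static String capitalise(String text) {
        if (text.isEmpty()) {
            return text;
        }
        return Character.toTitleCase(text.charAt(0)) + text.substring(1);
    }
}
```

The title-case distinction only shows up for characters such as the digraph U+01C6, whose title-case form (U+01C5) is not the same as its upper-case form (U+01C4); for plain ASCII text both approaches give the same result.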

RandomStringUtils is a class designed to create the random bits of text that are quite often seen in generated passwords. It covers the full Unicode range and has special overloads for English-alphabetic, ASCII, alphanumeric, and numeric text. In the future, I hope to plug it into a larger randomized framework and make it more locale aware.
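
The idea behind such a class can be sketched in a few lines of plain JDK code: pick characters at random from a chosen alphabet. The class RandomText and its methods below are hypothetical names for illustration, not the RandomStringUtils API.

```java
import java.util.Random;

/** Hypothetical helper for illustration only; not the Commons Lang implementation. */
final class RandomText {

    static final String LETTERS = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";
    static final String DIGITS = "0123456789";

    /** Builds a string of the given length from characters drawn uniformly from the alphabet. */
    static String random(int count, String alphabet, Random source) {
        if (count < 0 || alphabet.isEmpty()) {
            throw new IllegalArgumentException("count must be >= 0 and alphabet non-empty");
        }
        StringBuilder out = new StringBuilder(count);
        for (int i = 0; i < count; i++) {
            out.append(alphabet.charAt(source.nextInt(alphabet.length())));
        }
        return out.toString();
    }

    /** Convenience overload: letters only. */
    static String randomAlphabetic(int count, Random source) {
        return random(count, LETTERS, source);
    }

    /** Convenience overload: digits only. */
    static String randomNumeric(int count, Random source) {
        return random(count, DIGITS, source);
    }
}
```

For anything security-sensitive such as real passwords, pass a java.security.SecureRandom as the source; it extends java.util.Random, so the sketch accepts it unchanged.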

Exceptions
Over the last few years, one of the most common reusable classes I've seen is the humble NestedException, or CascadedException. Exception nesting is now part of java.lang.Throwable in JDK 1.4. The exception subpackage is a set of classes that attempts to handle most of the implementations out there, which is a change from the beta release of Commons Lang, where the subpackage was merely an implementation.

While it might seem that such a thing isn't very useful, people in JDK 1.3 can use an implementation, and people in JDK 1.4 can use the SDK version. It is important to remember that Commons Lang's biggest customer is the Jakarta project itself, which contains many projects that users expect to work in many versions of Java at the same time.
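
As a small illustration of what nesting buys you on JDK 1.4 and later, the sketch below walks the standard getCause() chain to find the innermost exception. The class CauseChain is a hypothetical name, and the sketch uses only java.lang.Throwable, not the Commons Lang exception classes.

```java
/** Hypothetical helper for illustration only; relies solely on Throwable.getCause(). */
final class CauseChain {

    /** Follows getCause() links until it reaches an exception that has no cause. */
    static Throwable rootCause(Throwable error) {
        Throwable current = error;
        while (current.getCause() != null) {
            current = current.getCause();
        }
        return current;
    }

    public static void main(String[] args) {
        Exception low = new java.io.IOException("disk full");
        Exception high = new IllegalStateException("save failed", low);
        // Prints the innermost exception, here the IOException.
        System.out.println(rootCause(high));
    }
}
```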

Hidden in ExceptionUtils is the useful method:
  • String getStackTrace(Throwable), which converts a Throwable into a String.

This is one of those String utilities that always crops up.
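
A common way to write such a utility with the plain JDK is to print the stack trace into an in-memory writer. The sketch below shows that approach; the class name StackTraceText is hypothetical, and this is not necessarily how ExceptionUtils does it.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

/** Hypothetical helper for illustration only. */
final class StackTraceText {

    /** Captures what printStackTrace() would print and returns it as a String. */
    static String of(Throwable error) {
        StringWriter buffer = new StringWriter();
        PrintWriter writer = new PrintWriter(buffer);
        error.printStackTrace(writer);
        writer.flush();
        return buffer.toString();
    }
}
```

The result is handy for logging frameworks or error pages that accept only text.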

Enum
When Java was released, it was seen as being a new version of C. While wiser heads (not me, I owned a dying Amiga at the time) might have pointed to it merely looking like C but having core features from a much different set of languages, there were many who puzzled over what parts of C were missing and wanted to know why. At or near the top of that list was the C enum declaration. It was classically used to implement bits of code such as:
enum day {monday, tuesday, wednesday, thursday, friday, saturday, sunday};

This allowed monday to be automatically assigned the value 0, tuesday to be 1, etc. Then, the compiler would keep an eye on the day variable type and help protect the programmer from unforeseen errors.

While there are some issues with having such a thing in Java (namely that Java has no true constant that survives over RMI or multiple ClassLoading), the Lang Enum class handles itself admirably over RMI as long as you avoid the use of the equality (==) operator. An example of the day enumerator using the Lang Enum is available in Listing A.

Yes, it's a lot longer than the C version, but that's the breaks of not having the syntax in the language.
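
The == caveat is easy to reproduce with the plain JDK. The following sketch is not Listing A and does not use the Lang Enum class; it is a hand-rolled typesafe enumeration (the class Day and its helper copyViaSerialization are hypothetical) that shows why a serialized copy, such as one arriving over RMI, should be compared with equals.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

/** Hypothetical hand-rolled enumeration for illustration; not the Lang Enum class. */
final class Day implements Serializable {
    private static final long serialVersionUID = 1L;
    private static int nextOrdinal = 0; // must be declared before the constants below

    static final Day MONDAY = new Day("monday");
    static final Day TUESDAY = new Day("tuesday");
    static final Day WEDNESDAY = new Day("wednesday");
    static final Day THURSDAY = new Day("thursday");
    static final Day FRIDAY = new Day("friday");
    static final Day SATURDAY = new Day("saturday");
    static final Day SUNDAY = new Day("sunday");

    private final String name;
    private final int ordinal; // 0 for monday, 1 for tuesday, as in the C version

    private Day(String name) {
        this.name = name;
        this.ordinal = nextOrdinal++;
    }

    int ordinal() { return ordinal; }

    /** Compares by name, so a deserialized copy still equals its original constant. */
    @Override public boolean equals(Object other) {
        return other instanceof Day && name.equals(((Day) other).name);
    }

    @Override public int hashCode() { return name.hashCode(); }

    @Override public String toString() { return name; }

    /** Simulates a trip over the wire by serializing and deserializing in memory. */
    static Day copyViaSerialization(Day day) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(day);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Day) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("serialization round trip failed", e);
        }
    }
}
```

Here Day.copyViaSerialization(Day.MONDAY) yields a distinct object, so comparing it to Day.MONDAY with == is false while equals is true. The sketch deliberately leaves out a readResolve method; adding one that returns the canonical constant would restore == within a single class loader.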

Builder
The last subpackage of the org.apache.commons.lang package I will discuss is the builder package. The classes contained in this package make it easier and safer to create standard methods such as equals, toString, hashCode, and compareTo. The hashCode and equals builders both follow the rules that Joshua Bloch lays out in his classic book, Effective Java Programming Language Guide. For example, to write a good hashCode method, you can either think hard about it, or do this:
// Requires: import org.apache.commons.lang.builder.HashCodeBuilder;
public int hashCode() {
    return new HashCodeBuilder(17, 37)
        .append(width)
        .append(height)
        .append(z)
        .append(name)
        .toHashCode();
}
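
For comparison, this is roughly what "thinking hard about it" looks like when written by hand in the Effective Java style. The sketch assumes that the two constructor arguments above act as a non-zero starting value and a multiplier; that is an assumption about the builder rather than a quote from its source, and the class Box is a hypothetical example.

```java
/** Hypothetical example class; shows the hand-written recipe, not HashCodeBuilder itself. */
final class Box {
    private final int width;
    private final int height;
    private final String name;

    Box(int width, int height, String name) {
        this.width = width;
        this.height = height;
        this.name = name;
    }

    @Override
    public int hashCode() {
        int total = 17;                    // non-zero starting value
        total = total * 37 + width;        // fold in each field, always in the same order
        total = total * 37 + height;
        total = total * 37 + (name == null ? 0 : name.hashCode());
        return total;
    }

    @Override
    public boolean equals(Object obj) {
        if (!(obj instanceof Box)) {
            return false;
        }
        Box other = (Box) obj;
        return width == other.width
            && height == other.height
            && (name == null ? other.name == null : name.equals(other.name));
    }
}
```

Every field that takes part in equals also takes part in hashCode, which is what keeps equal objects in the same hash bucket.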


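To write an easy equals method, you can do this:
// Requires: import org.apache.commons.lang.builder.EqualsBuilder;
public boolean equals(Object obj) {
    // instanceof is false for null, so no separate null check is needed
    if (!(obj instanceof ThisClass)) { return false; }
    ThisClass tc = (ThisClass) obj;
    // append(lhs, rhs) is the method name in the released EqualsBuilder API
    return new EqualsBuilder()
        .append(name, tc.name)
        .append(age, tc.age)
        .append(postcode, tc.postcode)
        .isEquals();
}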
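
The builder idea itself is simple enough to sketch in plain Java: keep one boolean, and stop comparing once any pair of fields differs. The class FieldEquality below is a hypothetical, stripped-down illustration of that pattern and not the EqualsBuilder source.

```java
/** Hypothetical minimal builder for illustration; not the Commons Lang EqualsBuilder. */
final class FieldEquality {
    private boolean equal = true;

    /** Null-safe comparison of two object fields; skipped once a difference has been found. */
    FieldEquality append(Object left, Object right) {
        if (equal) {
            equal = (left == null) ? right == null : left.equals(right);
        }
        return this;
    }

    /** Comparison of two int fields without boxing. */
    FieldEquality append(int left, int right) {
        if (equal) {
            equal = left == right;
        }
        return this;
    }

    /** True only if every appended pair matched. */
    boolean isEquals() {
        return equal;
    }
}
```

A call such as new FieldEquality().append("Ann", "Ann").append(30, 30).isEquals() evaluates to true, and changing either pair makes it false. Returning this from each append is what allows the chained style used in the article's example.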


While the learning curve for these builder classes is slightly higher than a simple public static function, the payback in development ease is quite obvious.

The rest of the pack
There are other classes in org.apache.commons.lang, such as a NumberRange class for representing a range of numbers and the NumberUtils class, which contains various useful functions such as String-to-number conversions and maximum/minimum methods. And don't forget ObjectUtils, with which you can create the default toString output for an object even if toString is overridden.
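
Two of these ideas are small enough to sketch with the plain JDK. The class LangSketches and both method signatures below are hypothetical, chosen for illustration; they are not the NumberUtils or ObjectUtils API.

```java
/** Hypothetical helpers for illustration only. */
final class LangSketches {

    /** Parses a decimal int, returning the fallback instead of throwing on bad or null input. */
    static int stringToInt(String text, int fallback) {
        try {
            return Integer.parseInt(text);
        } catch (NumberFormatException e) {
            return fallback;
        }
    }

    /** Rebuilds the text Object.toString() would produce, even if toString is overridden. */
    static String identityToString(Object obj) {
        return obj.getClass().getName() + "@"
            + Integer.toHexString(System.identityHashCode(obj));
    }
}
```

For a String argument, identityToString returns something of the form java.lang.String@ followed by a hexadecimal identity hash, rather than the string's own contents.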

Also, SystemUtils allows you to easily access the standard Java system properties, and SerializationUtils makes serializing objects much easier.
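
To show the kind of boilerplate such helpers hide, here is a plain-JDK sketch that reads a standard system property and deep-copies an object through serialization. The class EnvAndCopy and its methods are hypothetical names, not the SystemUtils or SerializationUtils API.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

/** Hypothetical helpers for illustration only. */
final class EnvAndCopy {

    /** Reads the running JVM's version from the standard "java.version" system property. */
    static String javaVersion() {
        return System.getProperty("java.version");
    }

    /** Deep-copies a Serializable object by writing it to memory and reading it back. */
    @SuppressWarnings("unchecked")
    static <T extends Serializable> T deepCopy(T original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (T) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("could not copy object", e);
        }
    }
}
```

Wrapping the checked exceptions in an unchecked one is a design choice of this sketch; it keeps call sites short at the cost of hiding the failure type.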

Just an overview
I've taken only a quick look at the Commons Lang API. I skipped many methods but hope that I managed to give a good impression of the API in general. It's designed to make all your Java projects go a little bit easier.

Working on Jakarta Commons Lang with the half dozen or so other developers has been a great experience. Trying to survive the Commons-Dev mailing list, home to long, heated discussions on such weighty subjects as "Should StringUtils have a public constructor?" and "Just how do we merge in all this code from Jakarta Avalon?", was yet another experience. It is a perfect example of an open-source project: a place where a small community builds, where you argue with each other, learn from each other, and build up contacts you never would have had a chance to build a decade ago.
