Creating data structures with the Java Collections Framework

All but the most trivial computer programs employ some type of data structure. Java provides the most common data structures via the Java Collections Framework. The good news is that it can easily be used and extended.

By Alexandre Calsavara

The Java Platform provides the most commonly used data structures in the form of the Collections Framework, and it provides a rich API to operate on them. In this first article in a two-part series, I'll explain the concepts behind the Collections Framework and show how easy it is to extend or customize the Framework for specialized applications.

The interface hierarchy
The Collections Framework relies on a set of abstract data structures represented by interfaces that describe them based on the operations they support, from the most generic to the most specialized. At the base of the Framework are the interfaces java.util.Collection and java.util.Map. The Collection interface represents just a group of objects, with no particular order or restriction. It also defines a number of useful operations you can perform on any data structure that implements it.

The interfaces java.util.List and java.util.Set extend Collection, representing more specialized data structures and adding more operations and/or constraints. List represents a sequence of elements—that is, a group of elements where you can refer to an element by its relative position within the sequence. List redefines Collection operations according to its constraints and also adds more operations.

The Set interface represents a collection with no duplicate elements. It just redefines the Collection operations to disallow duplicate elements. The java.util.SortedSet is a specialization of the Set interface. It represents a set where the elements are sorted. The Set interface does not impose any particular order on its elements.

The Map interface represents an abstract data structure mapping keys to values. That is, given a key, the structure returns the value that is mapped to (associated with) it. The interface defines several methods that allow you to retrieve the values, keys, and entries (key/value pairs) stored, but the interface does not specify any particular order.

Finally, the interface java.util.SortedMap extends the Map interface, adding the requirement that the keys must be sorted; any method that returns the keys must return them in sorted order. Figure A shows the complete interface hierarchy, as well as the relationships between them.

Figure A
Collections Framework abstract data structures

The Collections Framework defines two other interfaces that are used to sort objects: java.util.Comparable and java.util.Comparator. Any class implementing the Comparable interface defines a method, compareTo, to be used to compare objects of that class to establish a relative order between them. This is called the "natural order" of the class.

Classes that implement the SortedSet interface require that any object stored implement the Comparable interface. The same is true for objects used as keys in a SortedMap. Since all primitive type wrapper classes already implement the Comparable interface, no additional effort is required to use them with the Collections Framework.

However, if a given class does not implement the Comparable interface, or if you want to sort it in some way other than its natural order, you can use a comparator. A comparator is an object of a class that implements the Comparator interface, which defines methods that allow you to compare two objects to sort them. Providing a comparator when the data structure is constructed overrides the objects' natural order.

Creating new data structures
The Collections Framework already provides generic implementations of the most popular data structures, which is enough for most applications. But you'll still find yourself creating a new data structure when you need customized features, high performance, or even operations that are not supported by the standard implementations.

Whatever the reason, it is easy to create a new data structure from scratch that conforms to the Collections Framework conventions. The Collections Framework provides skeletal implementations of each interface except SortedSet and SortedMap (see Table A) to minimize the effort required to implement them.
Table A
Each skeletal implementation is an abstract base class that implements a given interface.
Class Implements
AbstractCollection Collection
AbstractSet Set
AbstractList List
AbstractSequentialList List
AbstractMap Map
Collections Framework skeletal implementations

All you need to do is to extend the class that implements the interface that corresponds to your data structure and then implement a couple of methods (see Table B). All other methods required by the Collections Framework's interfaces are already implemented by the skeletal classes. The documentation for each class describes the implementation of these methods in detail, so you can override them and provide alternate implementations if the default does not suit your needs.
Table B
You just need to implement the abstract methods defined by each skeletal implementation to produce a data structure that's fully compliant with the Collections Framework.
Base class Methods
AbstractCollection iterator()
AbstractSet iterator()
AbstractList get(int)
AbstractSequentialList iterator()
AbstractMap entrySet()
Skeletal implementations of abstract methods

Although constructors can't be abstract and therefore won't be enforced by the base classes, you should also provide a no-args constructor and a constructor that accepts a Collection (AbstractCollection, AbstractSet, AbstractList, and AbstractSequentialList) or a Map (AbstractMap) to comply with the corresponding interface specification.

There are two skeletal implementations for the List interface. Use AbstractList when the implementation is based on a data structure that supports random access (like an array) and use AbstractSequentialList when it is based on a data structure that supports only sequential access (like a linked list).

If you follow these guidelines, you create an unmodifiable data structure—that is, one that may not be changed once created. If you want to create a modifiable data structure, you need to fulfill additional requirements (see Table C). Even with these extra steps, the effort needed to adapt an existing or new data structure to the Collections Framework's conventions is minimal. After all, several of the methods you need to implement would have to be provided anyway, in one fashion or another.
Table C
To create modified data structures, it is necessary to override additional methods and comply with additional requirements.
Base class Additional requirements
AbstractCollection Override the class's add(Object) method and implement the method remove() of the iterator returned by the iterator() method
AbstractSet Same as the AbstractCollection
AbstractList Override the class' set(int,Object) method; if the data structure has variable size, override the methods add(int,Object) and remove(int)
AbstractSequentialList Implement the method set(Object) of the iterator returned by the listIterator() method; if the data structure has variable size, also implement the iterator's methods add(Object) and remove()
Creating modifiable data structures

If you take into account that a compliant data structure offers numerous benefits like implementation independence (your data structure can be referred by the interface, rather than by the class), utility methods defined by the interface itself or by the Framework, and standard algorithms, you can see that the value added largely offsets the effort required for implementation.

Unfortunately, there is no equivalent of a skeletal implementation for the interfaces SortedSet and SortedMap. So implementing them takes much more work, especially implementing the methods that return subsets of the underlying data structure.

Listing A shows a simple example that illustrates some of the concepts we've been discussing. It wraps a standard, fixed-size array as a List. Since the underlying data structure (an array) supports random access, AbstractList was used as the base class.

We've looked at the fundamentals of the Java Collections Framework and seen how to create new data structures based on abstract base classes it provides. In the next article, I'll show you how to create specialized versions of existing implementations using wrappers.

Editor's Picks