The new I/O APIs in JDK 1.4 improve performance and add buffer management, scalable I/O operations for networks and files, encoders and decoders for character-set conversion, and regular expressions for pattern matching. No wonder people consider JDK 1.4 a significant advance in Java development. Let’s take a quick spin through the new I/O operations in JDK 1.4 to better appreciate what they offer.
Why is it new?
The I/O in JDK 1.4 is considered new because it adds capabilities that earlier versions lacked. For example, nonblocking I/O has long been available in UNIX and Windows, but earlier versions of Java supported only blocking I/O because the language was designed to run on PCs as well as other electronic devices, from smart cards to supercomputers.
Several big Java supporters, such as BEA Systems, IBM, and Sun, defined the specification of the new I/O. These companies are major backers of high-throughput Java applications such as WebLogic and WebSphere.
Let’s look at some of the additional functionality of I/O in JDK 1.4, including buffers, character-set conversion, regular expressions, channels, advanced file handling, and nonblocking I/O.
Grab the code
Download the source code for the article here.
A buffer is a container that stores a fixed amount of data of one primitive type. Think of it as a fixed-size array combined with the read and write operations of DataInputStream and DataOutputStream. You can use it to read or write elements of a specific primitive type.
The new I/O provides seven types of buffers:
- ByteBuffer
- CharBuffer
- DoubleBuffer
- FloatBuffer
- IntBuffer
- LongBuffer
- ShortBuffer
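Each can contain only elements of its own primitive type, except ByteBuffer, which can also read and write char values and values of the other numeric types as sequences of bytes.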
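As a rough illustration of how ByteBuffer differs from the typed buffers, the sketch below stores an int and a double in one ByteBuffer and reads them back. The class and method names (MixedBufferSketch, writeThenRead) are made up for this example and do not come from the article's source code.

```java
import java.nio.ByteBuffer;

class MixedBufferSketch {
    // Writes an int and a double into one ByteBuffer, then reads them back in the same order.
    static String writeThenRead(int i, double d) {
        ByteBuffer buf = ByteBuffer.allocate(12); // 4 bytes for the int + 8 for the double
        buf.putInt(i).putDouble(d);               // each put advances the position
        buf.flip();                               // limit = position, position = 0: ready to read
        return buf.getInt() + " " + buf.getDouble();
    }

    public static void main(String[] args) {
        System.out.println(writeThenRead(42, 1.5));
    }
}
```

The values must be read back in the order and with the types they were written, because the buffer itself stores only bytes.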
Listing A shows a basic mechanism of a buffer. A CharBuffer object called buff reads a string from args and then writes to the standard output.
Notice that the buffer’s length() method is different from an array’s length field. The method returns the number of characters remaining between the buffer’s current position and its limit, not the buffer’s total capacity. Because it tracks a position in this way, a buffer works like DataInputStream or DataOutputStream but can both read and write.
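The following is a minimal sketch of the kind of buffer usage described above, not the article's actual listing. It copies a string into a CharBuffer and drains it again, using length() as the loop condition. The names CharBufferSketch and roundTrip are assumptions chosen for this example.

```java
import java.nio.CharBuffer;

class CharBufferSketch {
    // Copies text into a fixed-size CharBuffer, then drains the buffer into a String.
    static String roundTrip(String text) {
        CharBuffer buff = CharBuffer.allocate(text.length()); // nondirect buffer
        buff.put(text);   // write: the position advances past the last char written
        buff.flip();      // switch to reading: limit = position, position = 0
        StringBuffer out = new StringBuffer();
        while (buff.length() > 0) {   // length() is the number of chars still unread
            out.append(buff.get());   // read one char and advance the position
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String text = args.length > 0 ? args[0] : "hello";
        System.out.println(roundTrip(text));
    }
}
```

The call to flip() is what turns the buffer from writing to reading; without it, length() would report zero remaining characters after the put.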
In addition, a buffer can be categorized as a nondirect buffer or a direct buffer. A nondirect buffer is allocated like any other Java object, so you don’t need to worry about how its memory is managed. The example in Listing A employed a nondirect buffer. On the flip side, a direct buffer is allocated as a contiguous block of memory that the system’s native I/O operations can use directly. You will find a direct buffer in the source code for this article. Since a direct buffer relies on the system’s native I/O operations, it typically costs more to create than a nondirect buffer, so weigh its creation cost against its I/O performance when choosing between the two.
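To make the distinction concrete, here is a small sketch that allocates one buffer of each kind and asks each whether it is direct. DirectBufferSketch and describe are hypothetical names used only for illustration.

```java
import java.nio.ByteBuffer;

class DirectBufferSketch {
    // Reports whether a buffer is direct, along with its capacity.
    static String describe(ByteBuffer buf) {
        return (buf.isDirect() ? "direct" : "nondirect") + ", capacity " + buf.capacity();
    }

    public static void main(String[] args) {
        ByteBuffer heap = ByteBuffer.allocate(1024);         // nondirect buffer
        ByteBuffer direct = ByteBuffer.allocateDirect(1024); // direct buffer
        System.out.println(describe(heap));
        System.out.println(describe(direct));
    }
}
```

Both buffers offer the same get and put operations; only the allocation call differs.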
The introduction of buffers raises the problems of byte ordering and character conversion. Although ByteBuffer handles byte ordering through the ByteOrder class, it has no method for character conversion. Character conversion is a complicated topic because it involves many international standards.
Fortunately, JDK 1.4 provides the Charset, CharsetEncoder, and CharsetDecoder classes to deal with this issue. The Charset class provides a static factory method, Charset.forName(), to obtain a character-set instance. This instance can produce a CharsetEncoder object to convert character sequences into bytes as well as a CharsetDecoder object to convert bytes back into character sequences. The source code for this article shows an example of handling character-set issues in networking.
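The encode and decode steps can be sketched as below. This is an illustrative example rather than the article's networking code; the class name CharsetSketch and its two helper methods are assumptions.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CharsetEncoder;

class CharsetSketch {
    // Encodes text to bytes with the named charset, then decodes the bytes back to text.
    static String roundTrip(String text, String charsetName) throws CharacterCodingException {
        Charset charset = Charset.forName(charsetName);
        CharsetEncoder encoder = charset.newEncoder();
        CharsetDecoder decoder = charset.newDecoder();
        ByteBuffer bytes = encoder.encode(CharBuffer.wrap(text)); // chars -> bytes
        CharBuffer chars = decoder.decode(bytes);                 // bytes -> chars
        return chars.toString();
    }

    // Returns how many bytes the text occupies in the named charset.
    static int encodedLength(String text, String charsetName) throws CharacterCodingException {
        ByteBuffer bytes = Charset.forName(charsetName).newEncoder().encode(CharBuffer.wrap(text));
        return bytes.remaining();
    }

    public static void main(String[] args) throws CharacterCodingException {
        String text = "h\u00e9llo"; // contains one non-ASCII character
        System.out.println(roundTrip(text, "UTF-8").equals(text));
        System.out.println(encodedLength(text, "UTF-8"));
        System.out.println(encodedLength(text, "ISO-8859-1"));
    }
}
```

The same five characters take a different number of bytes in UTF-8 than in ISO-8859-1, which is why the charset has to be agreed on by both ends of a connection.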
Regular expression support makes Java more attractive because it strengthens Java’s string handling. Simply put, a regular expression is a pattern of characters that describes a set of strings.
JDK 1.4 keeps most of Perl’s regular expression constructs and character classes. This allows Perl programmers to catch up on Java quickly and easily. (That may be Sun’s strategy.) Listing B shows how to represent regular expressions.
Unlike Perl, which uses strange-looking operators such as s/// and tr///, Java takes an object-oriented approach to regular expressions. The Pattern class represents a compiled regular expression, and the Matcher class matches character sequences against a Pattern. Look at the following snippet:
Pattern p = Pattern.compile("[,\\s]+");
String[] result = p.split("one,two, three four , five");
These statements split the string “one,two, three four , five” into an array of String containing “one”, “two”, “three”, “four”, and “five”, using any run of commas and whitespace that matches [,\\s]+ as the delimiter.
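For readers who want to run the split themselves, a self-contained version might look like this. The wrapper class SplitSketch and the method splitWords are hypothetical names added for the example.

```java
import java.util.regex.Pattern;

class SplitSketch {
    // Splits the input on runs of commas and/or whitespace.
    static String[] splitWords(String input) {
        Pattern p = Pattern.compile("[,\\s]+"); // one or more commas or whitespace characters
        return p.split(input);
    }

    public static void main(String[] args) {
        String[] parts = splitWords("one,two, three four , five");
        for (int i = 0; i < parts.length; i++) {
            System.out.println(parts[i]);
        }
    }
}
```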
Let’s try another example. The statements in Listing C replace all occurrences of “girl” with “boy”. As before, p, a Pattern object, targets “girl” and creates a Matcher object m for the sentence "one girl, two girls in the room". Then, m tries to find the pattern p in the sentence. Each time m finds a string that matches the pattern p, the method appendReplacement() appends the text preceding the match to the StringBuffer object sb, followed by “boy” in place of the matched word. sb becomes “one boy” after the first call to appendReplacement(). Before the call to appendTail(), sb has become “one boy, two boy”, and it becomes “one boy, two boys in the room” after the call to appendTail().
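The find, appendReplacement(), and appendTail() loop described above can be sketched as follows. This is a reconstruction from the description, not the article's listing; ReplaceSketch and replaceEach are names invented here.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ReplaceSketch {
    // Replaces every match of regex in input with replacement, one match at a time.
    static String replaceEach(String regex, String replacement, String input) {
        Pattern p = Pattern.compile(regex);
        Matcher m = p.matcher(input);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            // Appends the text between the previous match and this one, then the replacement.
            m.appendReplacement(sb, replacement);
        }
        m.appendTail(sb); // appends whatever follows the last match
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(replaceEach("girl", "boy", "one girl, two girls in the room"));
    }
}
```

When no per-match logic is needed, Matcher also offers replaceAll(), which performs the same loop in a single call.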
Although Java’s usage is not as convenient as Perl’s, it will be a good start for regular expression support in the land of object-oriented programming.
A channel is an open connection to an entity that can perform one or more distinct I/O operations. The entity can be a hardware device, a file, a socket, or a program component. You can use channel-related classes to extend the capabilities and improve the performance of the original I/O operations.
First of all, you employ a channel to read data into or write data from your ByteBuffer objects, where you would previously have used a stream such as DataInputStream or DataOutputStream. For instance, in the article source code, a SocketChannel object is created to output an encoded CharBuffer.
Second, channels introduce two I/O operations to Java that have been used for years in high-performance I/O management in UNIX and Microsoft Windows NT: Scattering Read and Gathering Write. Scattering Read, which is specified by the ScatteringByteChannel interface, reads data from a channel into a sequence of buffers. Gathering Write, which is specified by the GatheringByteChannel interface, writes data from a sequence of buffers to a channel.
Third, channels are the foundation for the FileLock, MappedByteBuffer, and Selector classes, which perform advanced file handling and nonblocking I/O operations. A channel acts as a bridge between the new feature and the original I/O entity. The following sections will look at the details.
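Returning to the second point, a gathering write followed by a scattering read can be sketched with a FileChannel, which supports both operations. The example below is an assumption-laden illustration: the class ScatterGatherSketch, the method roundTrip, and the temporary file are all made up for the demonstration.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

class ScatterGatherSketch {
    // Writes header and body from two buffers in one operation, then reads them back into two buffers.
    static String roundTrip(String header, String body) throws IOException {
        File file = File.createTempFile("scatter", ".dat");
        file.deleteOnExit();
        byte[] h = header.getBytes("US-ASCII");
        byte[] b = body.getBytes("US-ASCII");

        FileChannel out = new FileOutputStream(file).getChannel();
        try {
            ByteBuffer[] parts = { ByteBuffer.wrap(h), ByteBuffer.wrap(b) };
            long total = h.length + b.length;
            long written = 0;
            while (written < total) {
                written += out.write(parts); // gathering write: drains the buffers in order
            }
        } finally {
            out.close();
        }

        ByteBuffer headerBuf = ByteBuffer.allocate(h.length);
        ByteBuffer bodyBuf = ByteBuffer.allocate(b.length);
        FileChannel in = new FileInputStream(file).getChannel();
        try {
            ByteBuffer[] parts = { headerBuf, bodyBuf };
            // Scattering read: fills headerBuf first, then bodyBuf.
            while (bodyBuf.hasRemaining() && in.read(parts) >= 0) {
                // keep reading until the last buffer is full or end of file is reached
            }
        } finally {
            in.close();
        }
        return new String(headerBuf.array(), "US-ASCII") + "|" + new String(bodyBuf.array(), "US-ASCII");
    }

    public static void main(String[] args) throws IOException {
        System.out.println(roundTrip("HEAD", "body text"));
    }
}
```

Splitting a fixed-size header from a variable body this way avoids copying the two parts into one intermediate array.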
Advanced file handling
Channels play an important role in handling advanced file I/O operations; you can lock a region of a file with FileLock and map a file into a buffer of bytes with MappedByteBuffer. Both classes require FileChannel, a channel specifically for handling file I/O, as an interface to the native I/O operation.
Listing D outlines how to map a buffer of bytes to a file. After getting a channel from a FileInputStream object with the method getChannel(), you can create a MappedByteBuffer object with the FileChannel method called map(). We set the buffer as read-only by specifying its map mode to be FileChannel.MapMode.READ_ONLY. The last three statements show how to treat file I/O operations as buffer operations. You can, for instance, apply a character set to the file’s contents through the MappedByteBuffer object.
Listing E shows an example of locking a file. You make a FileChannel object for the file dummy.dat and then lock the file with the method lock(). If the file has already been locked by another process, the statement FileLock flock = fc.lock(); blocks until that process releases the lock. After you acquire the lock, you enter the try block and run its three statements. Finally, you release the lock by calling release().
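A hedged sketch of the mapping steps follows. It is not the article's listing: the class MappedFileSketch, the helper writeTemp, the temporary file, and the choice of ISO-8859-1 are all assumptions made so the example can run on its own.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.CharBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.Charset;

class MappedFileSketch {
    // Creates a small temporary file to map (a stand-in for a real data file).
    static File writeTemp(String text) throws IOException {
        File file = File.createTempFile("mapped", ".txt");
        file.deleteOnExit();
        FileOutputStream out = new FileOutputStream(file);
        try {
            out.write(text.getBytes("ISO-8859-1"));
        } finally {
            out.close();
        }
        return file;
    }

    // Maps the whole file read-only and decodes its bytes as ISO-8859-1 text.
    static String readMapped(File file) throws IOException {
        FileInputStream fis = new FileInputStream(file);
        try {
            FileChannel fc = fis.getChannel();
            MappedByteBuffer mapped = fc.map(FileChannel.MapMode.READ_ONLY, 0, fc.size());
            CharBuffer chars = Charset.forName("ISO-8859-1").decode(mapped);
            return chars.toString();
        } finally {
            fis.close();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(readMapped(writeTemp("mapped text")));
    }
}
```

Once mapped, the file is handled purely through buffer operations; no explicit read call appears in readMapped.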
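The lock, work, and release sequence might be sketched like this. The example is an illustration under assumptions: FileLockSketch and appendUnderLock are invented names, and it opens the file with RandomAccessFile because an exclusive lock needs a channel that is open for writing.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;

class FileLockSketch {
    // Appends one byte to the file while holding an exclusive lock, and returns the new file size.
    static long appendUnderLock(File file, byte value) throws IOException {
        RandomAccessFile raf = new RandomAccessFile(file, "rw");
        try {
            FileChannel fc = raf.getChannel();
            FileLock flock = fc.lock(); // blocks until the exclusive lock is granted
            try {
                fc.position(fc.size()); // move to the end of the file
                ByteBuffer buf = ByteBuffer.wrap(new byte[] { value });
                while (buf.hasRemaining()) {
                    fc.write(buf);
                }
                return fc.size();
            } finally {
                flock.release(); // always release, even if the write fails
            }
        } finally {
            raf.close();
        }
    }

    public static void main(String[] args) throws IOException {
        File file = File.createTempFile("dummy", ".dat");
        file.deleteOnExit();
        System.out.println(appendUnderLock(file, (byte) 1));
    }
}
```

Releasing the lock in a finally block keeps a failed write from leaving the file locked for as long as the channel stays open.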
Another important role the channel plays is to enable nonblocking I/O sockets. You may be familiar with the nonblocking I/O socket if you have tried socket programming in C or C++. Enabling multiplexing for nonblocking I/O sockets in Java is similar to that in C.
For basic socket programming, we usually follow the steps of opening a socket, listening for a connection, accepting a connection, exchanging data, and finally, closing the connection. The process has to wait, or be blocked, at the point of listening for a connection until a client requests one. Because of this blocking, a single thread cannot serve more than one connection at a time. Although multithreading is one way to solve this problem, multiplexing can use resources much more efficiently because it avoids creating a thread for every connection.
Multiplexing works by registering every socket with a selector, which tracks them in a list. The sockets that are ready for I/O form the Ready List. You can perform I/O on a socket once you retrieve it from the Ready List by selecting its key.
In the source code for the article, HelloServerNB.java offers a simple example of the nonblocking I/O feature. The constructor of HelloServerNB just calls the method acceptConnections(). The method first creates a Selector object to perform multiplexing using the factory call SelectorProvider.provider().openSelector(), which returns a default Selector object. Then, it opens a socket using a ServerSocketChannel object instead of a ServerSocket object. It configures the object as a nonblocking I/O socket by calling ssc.configureBlocking(false). After performing address translation and binding, it runs SelectionKey acceptKey = ssc.register(acceptSelector, SelectionKey.OP_ACCEPT); to register its interest in accept operations on the server socket with the Selector object instead of calling the accept() method directly. That means the server socket will appear on the Ready List of the Selector object whenever a connection is waiting to be accepted. After calling acceptSelector.select(), which blocks until at least one registered channel is ready, it collects the set of keys on the Ready List with the method acceptSelector.selectedKeys() and converts the set to an iterator for easy manipulation. Finally, it retrieves the ServerSocketChannel from the selected key and performs I/O operations with this Channel object.
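The steps above can be sketched roughly as follows. This is not HelloServerNB.java: the class HelloSelectorSketch, its methods, the greeting text, and the loopback client used to exercise the server are all assumptions made to keep the example self-contained.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.nio.channels.spi.SelectorProvider;
import java.util.Iterator;

class HelloSelectorSketch {
    // Opens a nonblocking server socket on a free loopback port.
    static ServerSocketChannel open() throws IOException {
        ServerSocketChannel ssc = ServerSocketChannel.open();
        ssc.configureBlocking(false);
        ssc.socket().bind(new InetSocketAddress(InetAddress.getByName("127.0.0.1"), 0));
        return ssc;
    }

    // Accepts the given number of connections through a Selector and greets each client.
    static void serve(ServerSocketChannel ssc, int connections) throws IOException {
        Selector acceptSelector = SelectorProvider.provider().openSelector();
        try {
            ssc.register(acceptSelector, SelectionKey.OP_ACCEPT);
            int served = 0;
            while (served < connections) {
                acceptSelector.select(); // blocks until a registered channel is ready
                Iterator it = acceptSelector.selectedKeys().iterator();
                while (it.hasNext()) {
                    SelectionKey key = (SelectionKey) it.next();
                    it.remove(); // the selector does not clear handled keys itself
                    if (!key.isAcceptable()) continue;
                    SocketChannel client = ((ServerSocketChannel) key.channel()).accept();
                    if (client == null) continue; // nothing was actually pending
                    try {
                        ByteBuffer greeting = ByteBuffer.wrap("Hello\n".getBytes("US-ASCII"));
                        while (greeting.hasRemaining()) client.write(greeting);
                    } finally {
                        client.close();
                    }
                    served++;
                }
            }
        } finally {
            acceptSelector.close();
            ssc.close();
        }
    }

    // Runs the server on a background thread and fetches one greeting with a plain Socket.
    static String demo() throws Exception {
        final ServerSocketChannel ssc = open();
        int port = ssc.socket().getLocalPort();
        Thread server = new Thread(new Runnable() {
            public void run() {
                try { serve(ssc, 1); } catch (IOException e) { e.printStackTrace(); }
            }
        });
        server.start();
        Socket socket = new Socket("127.0.0.1", port);
        try {
            BufferedReader in = new BufferedReader(
                    new InputStreamReader(socket.getInputStream(), "US-ASCII"));
            return in.readLine();
        } finally {
            socket.close();
            server.join();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(demo());
    }
}
```

The sketch registers only the server socket. A fuller server would presumably also register each accepted SocketChannel with the selector so that reads and writes are multiplexed too.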
JDK 1.4 introduces many features, including the new I/O. We've just touched on the power of the I/O. In future articles, I will explore additional features and functionality.