By Alex Kuznetsov, Alex Plant, and Alexander Tormasov

Last time, we explained how the TCP_CORK option can decrease the number of packets transferred over a network. While critically important, reducing traffic is just one part of the puzzle of high-performance network data transmission. Other TCP options can significantly improve performance and reduce server response latency under certain conditions. Let’s look at some of those options.

The first option we’ll consider is TCP_DEFER_ACCEPT. (This is what it’s called in Linux; other OSs offer the same option but use different names.) To understand the idea of the TCP_DEFER_ACCEPT option, it is necessary to picture a typical process of the HTTP client-server interaction. Consider how the TCP establishes a connection with the goal of transferring data. On a network, information travels in discrete units called IP packets (or IP datagrams). A packet always has a header that carries service information, used for internal protocol handling, and it may also carry payload data. A typical example of service information is a set of so-called flags, which mark the packets as having special meaning to a TCP/IP stack, such as acknowledgement of successful packet receiving. Often, it’s possible to carry payload in the “marked” packet, but sometimes, internal logic forces a TCP/IP stack to send out packets with just a header. These packets often introduce unwanted delays and increased overhead and result in overall performance degradation.

The server has now created a socket and is waiting for a connection. The connection procedure in TCP/IP is a so-called “three-way handshake.” First, a client sends a TCP packet with a SYN flag set and no payload (a SYN packet). The server replies by sending a packet with SYN/ACK flags set (a SYN/ACK packet) to acknowledge receipt of the initial packet. The client then sends an ACK packet to acknowledge receipt of the second packet and to finalize the connection procedure. After receiving the SYN/ACK, the packet server wakes up a receiver process while waiting for data. When the three-way handshake is completed, the client starts to send “useful” data to be transferred to the server. Usually, an HTTP request is quite small and fits into a single packet. But in this case, at least four packets will be sent in both directions, adding considerable delay times. Note also that the receiver has already been waiting for the information—since before the data was ever sent.

To alleviate these problems, Linux (along with some other OSs) includes a TCP_DEFER_ACCEPT option in its TCP implementation. Set on a server-side listening socket, it instructs the kernel not to wait for the final ACK packet and not to initiate the process until the first packet of real data has arrived. After sending the SYN/ACK, the server will then wait for a data packet from a client. Now, only three packets will be sent over the network, and the connection establishment delay will be significantly reduced, which is typical for HTTP.

Equivalents of this option are available on other operation systems, as well. For example, in FreeBSD, the same behavior is achieved with the following code:
/* some code skipped for clarity */
struct accept_filter_arg af = { “dataready”, “” };
setsockopt(s, SOL_SOCKET, SO_ACCEPTFILTER, &af, sizeof(af));

This feature, called an “accept filter” in FreeBSD, is used in different ways, although in all cases, the effect is the same as TCP_DEFER_ACCEPT—the server will not wait for the final ACK packet, waiting only for a packet carrying a payload. More information about this option and its significance for a high-performance Web server is available in the Apache documentation.

With HTTP client-server interaction, it may be necessary to change client behavior. Why would the client send this “useless” ACK packet anyway? A TCP stack has no way of knowing the status of an ACK packet. If FTP were used instead of HTTP, the client would not send any data until it received a packet with the FTP server prompt. In this case, delayed ACK will cause a delay in a client-server interaction. To decide whether this ACK is necessary, a client should know the application protocol and its current state. Thus, it is necessary to modify client behavior.

For Linux-based clients, we can use another option, which is also called TCP_DEFER_ACCEPT. There are two types of sockets, listening and connected sockets, and two corresponding sets of options. Hence, it is possible to use the same name for these two options that are often used together. After setting this option on a connected socket, the client will not send an ACK packet after receiving a SYN/ACK packet and will instead be waiting for a next request from a user program to send data; therefore, the server will be sent one fewer packet.

Another way to prevent delays caused by sending useless packets is to use the TCP_QUICKACK option. This option is different from TCP_DEFER_ACCEPT, as it can be used not only to manage the process of connection establishment, but it can be used also during the normal data transfer process. In addition, it can be set on either side of the client-server connection. Delaying sending of the ACK packet could be useful if it is known that the user data will be sent soon, and it is better to set the ACK flag on that data packet to minimize overhead. When the sender is sure that data will be immediately be sent (multiple packets), the TCP_QUICKACK option can be set to 0. The default value of this option is 1 for sockets in the “connected” state, which will be reset by the kernel to 1 immediately after the first use. (This is a one-time option.)

In another scenario, it could be beneficial to send out the ACK packets. The ACK packet will confirm receipt of a data block, and when the next block is processed, there will be no delay. This mode of data transfer is typical for interactive processes, where the moment of user input is unpredictable. In Linux, this is known as default socket behavior.

In the aforementioned situation, where a client is sending HTTP requests to a server, it is previously known that the request packet is short and should be sent immediately after a connection is established, which is indicative of how HTTP works. There is no need to send a pure ACK packet, so it is possible to set TCP_QUICKACK to 0 to improve performance. On the server side, both options can be set only once on the listening socket. All sockets, created indirectly by an accept call, will inherit all of the options from the original socket.

By combining the TCP_CORK, TCP_DEFER_ACCEPT, and TCP_QUICKACK options, the number of packets participating in each HTTP transaction will be reduced to a minimal acceptable level (as required by TCP protocol requirements and security considerations). The result is not only fast data transfer and request processing but also minimized client-server two-way latency.

Optimization of network program performance is obviously a complex task. Optimization techniques include using the zero-copy approach everywhere possible, proper packet assembling by TCP_CORK and equivalents, minimization of a number of packets transferred, and a latency optimization. For any significant increase in performance and scalability, it is necessary to use all methods available jointly and consistently across the code. Of course, a clear understanding of inner workings of TCP/IP stack and OS as a whole is required, as well.

Subscribe to the Developer Insider Newsletter

From the hottest programming languages to commentary on the Linux OS, get the developer and open source news and tips you need to know. Delivered Tuesdays and Thursdays

Subscribe to the Developer Insider Newsletter

From the hottest programming languages to commentary on the Linux OS, get the developer and open source news and tips you need to know. Delivered Tuesdays and Thursdays