When you’re making decisions about how to meet your organization’s networking needs, it’s essential that you have an accurate measure of throughput. This article will explain how you can estimate the maximum throughput in a TCP/IP network. Throughout the article, we will be using examples based on a fictitious video content company with offices in Seattle, San Francisco, and New York. Our goal will be to evaluate data transfer times and cost options for the organization so that we can select the appropriate telecommunications technology.
In order to estimate throughput, we will need three pieces of data:
- A characterization of the traffic flows that will be moving across the network
- An estimate of network path delay and jitter
- The maximum size of the TCP window
Additional resources on this topic
- W. R. Stevens, TCP/IP Illustrated, Volume 1: The Protocols, Addison-Wesley, 1994
- G. R. Wright, W. R. Stevens, TCP/IP Illustrated, Volume 2: The Implementation, Addison-Wesley, 1995
Understanding traffic flows
The key to predicting network performance, and TCP performance, is to have a healthy understanding of the type of traffic flows that occur on your network. A flow commonly consists of a source and destination address, a source and destination port, and the protocol type. However, we’re going to modify that slightly to include:
- Source and destination.
- Application protocol.
- Symmetry of flow.
- Sensitivity to time.
- Either the total amount of data to be moved or a sustained data rate that needs to be achieved.
You can use a variety of methods to obtain this type of information: network management software, network traffic traces, user interviews, other network engineers’ assessments, and so on. In a large network, you will probably need to use all these methods to get an accurate picture of traffic flows.
Let’s assume our video-editing company has a hub-and-spoke network that connects the three offices through San Francisco (see Figure A). The content producers in Seattle need to move a 10-GB package of video streams via FTP to the editors in New York. They want to do this several times a day. Currently, they can move this package only to San Francisco, so the editing department has been temporarily moved into the corporate offices, and no one is happy about the arrangement. A second group of content producers in San Francisco has the same requirements as the group in Seattle.
We know the source and destination of the flow, the application protocol (FTP), the symmetry (very asymmetrical), the sensitivity to time (non-real time), and the overall amount of data to move between the source and destination (10 GB). Armed with this data, we can now start looking at the network path delay.
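The flow attributes listed above can be captured in a small record so that every flow on the network is described the same way. The following is a minimal sketch; the class name `TrafficFlow`, its field names, and the choice of 10 GB = 10 * 10**9 bytes are assumptions made for illustration, not part of any tool mentioned in the article.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TrafficFlow:
    """One traffic flow, described by the attributes listed in the article."""
    source: str
    destination: str
    application_protocol: str
    symmetry: str                                # e.g. "symmetrical" or "very asymmetrical"
    time_sensitivity: str                        # e.g. "real-time" or "non-real-time"
    total_bytes: Optional[int] = None            # total data to move, if known
    sustained_rate_bps: Optional[float] = None   # or a sustained rate to achieve

    def __post_init__(self):
        # The article asks for either a total amount or a sustained rate.
        if self.total_bytes is None and self.sustained_rate_bps is None:
            raise ValueError("give total_bytes or sustained_rate_bps")


# The Seattle-to-New York video transfer from the example
# (assumption: 10 GB is taken as 10 * 10**9 bytes).
video_flow = TrafficFlow(
    source="Seattle",
    destination="New York",
    application_protocol="FTP",
    symmetry="very asymmetrical",
    time_sensitivity="non-real-time",
    total_bytes=10 * 10**9,
)
print(video_flow)
```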
Network path delay
Network path delay is the total time it takes a packet to be delivered between two points on a network. Many factors contribute to it, but for the purposes of estimation, it can be simplified down to these few components:
- Serialization delay of the individual circuits or networks
- Transmission delay of the individual circuits or networks
- Processing delay of the network devices
All circuits have a common characteristic known as serialization delay, which is the time it takes some unit of data to be serialized onto the circuit. It is inversely proportional to the bandwidth of the circuit and also depends on the technology employed. For instance, if we have a DS3, a DS1, and a DS0, and we want to send a 1,500-byte packet (the standard Ethernet MTU, and so a common maximum packet size), the approximate serialization delay at the nominal line rates will be:
DS3: (1500 bytes * 8 bits/byte) / 44736000 bits/sec = 0.27 ms (approx)
DS1: (1500 bytes * 8 bits/byte) / 1544000 bits/sec = 8 ms (approx)
DS0: (1500 bytes * 8 bits/byte) / 64000 bits/sec = 188 ms (approx)
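The same arithmetic can be wrapped in a small helper so it can be repeated for other packet sizes and circuits. This is a sketch only: the function name `serialization_delay_ms` is made up for illustration, and the rates in the table are the nominal DS3, DS1, and DS0 line rates (44.736 Mb/s, 1.544 Mb/s, and 64 kb/s), ignoring framing overhead.

```python
def serialization_delay_ms(packet_bytes: int, line_rate_bps: float) -> float:
    """Time to clock one packet onto a circuit, in milliseconds."""
    return packet_bytes * 8 / line_rate_bps * 1000


# Nominal line rates in bits per second (framing overhead is ignored).
LINE_RATES_BPS = {
    "DS3": 44_736_000,
    "DS1": 1_544_000,
    "DS0": 64_000,
}

for name, rate in LINE_RATES_BPS.items():
    print(f"{name}: {serialization_delay_ms(1500, rate):.2f} ms")
```

Calling the helper with 64 bytes instead of 1,500 gives the much smaller delays that apply to the acknowledgment packets on the return path.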
With some technologies, satellite in particular, the transmission delay of the circuit can be a factor as well. Actually, all circuits have some sort of transmission delay that varies with both distance and technology. A T1 leased-line across town will likely not have any measurable transmission delay, while a T1 frame relay circuit across the continent will have delay on the order of 50 to 80 ms. In those circumstances where you have measurable transmission delay, you will want to add it to the serialization delay to get a more accurate picture.
We also need to factor in the delay for any network devices that our traffic must go through to reach its final destination. This varies depending on the type of device and how busy the device is. There are really no hard-and-fast rules about how much delay a class of device will add. The safe assumption is that anywhere that there is any serious buffering occurring, there will also be noticeable delay. Examples of devices that do a significant amount of buffering are busy routers, traffic-shaping devices, firewalls, and so on.
To help illustrate how network path delay is estimated, we’ll go back to our imaginary video content company network. We will assume that the average packet size on our flow from Seattle to New York is 1,500 bytes and that the average packet size on the return path is 64 bytes (as you would see in an FTP session, where the return traffic is mostly acknowledgments). We will also add a millisecond of delay for each router, as they are not particularly busy devices. Our path delay looks like this:
Seattle to New York
Router + SD + Router + SD + TD + Router
1 ms + .27 ms + 1 ms + 8 ms + 60 ms + 1 ms = 71 ms
New York to Seattle
Router + SD + TD + Router + SD + Router
1 ms + 1 ms + 60 ms + 1 ms + 0 ms + 1 ms = 64 ms
Total path delay = 64 ms + 71 ms = 135 ms
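The sum above is easy to keep track of as a list of labeled hops. The sketch below simply replays the article's per-hop figures; the helper name `path_delay_ms` and the list layout are illustrative assumptions.

```python
def path_delay_ms(hops):
    """Sum the delay contributions along one direction of the path."""
    return sum(delay for _label, delay in hops)


# Per-hop estimates in milliseconds, copied from the worked example.
# SD = serialization delay, TD = transmission delay.
SEATTLE_TO_NEW_YORK = [
    ("Router", 1.0), ("SD", 0.27), ("Router", 1.0),
    ("SD", 8.0), ("TD", 60.0), ("Router", 1.0),
]
NEW_YORK_TO_SEATTLE = [
    ("Router", 1.0), ("SD", 1.0), ("TD", 60.0),
    ("Router", 1.0), ("SD", 0.0), ("Router", 1.0),
]

forward = path_delay_ms(SEATTLE_TO_NEW_YORK)
reverse = path_delay_ms(NEW_YORK_TO_SEATTLE)
print(f"Seattle to New York: {forward:.0f} ms")
print(f"New York to Seattle: {reverse:.0f} ms")
print(f"Total path delay:    {forward + reverse:.0f} ms")
```

Keeping the hops labeled makes it simple to swap in a different circuit or add a busier device later and see how the total changes.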
If the traffic flow were real-time in nature, we would need to look at jitter as well. Jitter is the variation in network delay and it has a substantial impact on performance of real-time flows. Although it is difficult to estimate, it is relatively easy to predict areas of the network where jitter can occur and attempt to engineer around those problem spots by utilizing technologies that have very stable delay characteristics.
Now that we have a reasonable estimate of path delay, we can look at TCP behavior in more depth.
TCP has a number of features to compensate for network congestion and errors. The feature we are going to concentrate on is called windowing, which TCP uses to perform flow control. The TCP window is the amount of data, in bytes, that the sender can have outstanding without an acknowledgment. With the window scale option, the window can be as large as 1 GB, but in practice it is usually much smaller, and the actual value is machine-specific.
Many operating systems come with a default setting, which is usually a good value for estimating throughput. Each side advertises its receive window when the connection is set up, and the sender can never have more data outstanding than the smaller of its own send window and the window the receiver advertises. While actual throughput in a TCP connection is affected by a number of factors, the maximum possible throughput is determined by the window size and network path delay.
To calculate the maximum throughput of the TCP connection, we simply use the following equation:
((window size) * 8 bits/byte)/(2 * delay) = maximum throughput
At our example company, we use Windows 2000 servers to transfer the data, and these servers have a default configuration of 64-KB windows. If we have a 64-KB window and a path delay of 135 ms, maximum throughput looks like this:
(64 KB * 8b/B)/(2 * .135s) = 1.90 Mb/s
So we can keep the frame-relay circuit busy, but adding much more bandwidth without either decreasing the path delay or increasing the window size will not help the situation. Changing the maximum window size to 1 MB results in this throughput value:
(1MB * 8b/B)/(2 * .135s) = 30 Mb/s
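The two calculations above can be reproduced with a short function. This is a sketch that keeps the article's equation exactly as written, including the factor of 2 applied to the path delay; the function name `max_throughput_bps` is illustrative, and the decimal units (1 KB = 1,000 bytes, 1 MB = 1,000,000 bytes) are an assumption chosen because they match the article's results.

```python
KB = 1_000        # assumption: decimal kilobytes, to match the article's figures
MB = 1_000_000    # assumption: decimal megabytes


def max_throughput_bps(window_bytes: float, path_delay_s: float) -> float:
    """Article's estimate: (window size * 8 bits/byte) / (2 * delay)."""
    return window_bytes * 8 / (2 * path_delay_s)


PATH_DELAY_S = 0.135  # total path delay from the worked example

for label, window in [("64 KB", 64 * KB), ("1 MB", 1 * MB)]:
    mbps = max_throughput_bps(window, PATH_DELAY_S) / 1e6
    print(f"{label} window: {mbps:.2f} Mb/s")
```

A commonly used alternative bound is the window size divided by the round-trip time, which gives a higher figure for the same inputs, so the numbers here are best read as conservative estimates.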
Let’s assume that our CIO wants us to look into fractional DS3 service and a particular 10-Mb/s managed VPN service to connect San Francisco and New York. Preliminary testing shows that the managed VPN service increases the transmission delay to 210 ms, and during midday peak periods, the delay climbs to 410 ms. Our path delay has now increased by about 300 ms for off-peak and by 700 ms during peak periods. We recalculate the throughput like this:
(1 MB * 8b/B)/(2 * .435s) = 9.195 Mb/s
(1 MB * 8b/B)/(2 * .835s) = 4.790 Mb/s
We’ve made a substantial increase in performance over the 1.90-Mb/s throughput we had before, but we are still not able to optimally use the VPN service. We could further increase the TCP window size settings on the two machines doing the transfer, but this might adversely affect other applications running on the two machines.
Replacing the T1 frame relay circuit with a fractional DS3 running at 10.5 Mb/s (seven DS1s) would decrease both the serialization and transmission delay. With a 1-MB window, this would raise the window-limited throughput above 30 Mb/s, so the 10.5-Mb/s circuit itself, rather than TCP, would become the limit. It would also eliminate the peak-hour issue, albeit at an increase in cost.
Back to our example. The organization now has an accurate estimation of the transfer times of the 10-GB file between Seattle and New York. Based on our estimates, it is approximately 2.5 hours during off-peak times and five hours during peak times for the VPN solution, and just over two hours for the fractional DS3 solution. Had we not taken a look at delay, it’s very possible the organization would have assumed that the two solutions were equal in terms of performance and made their selection solely on the basis of cost.
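These transfer times follow from dividing the file size by the effective rate, where the effective rate is the lower of the window-limited throughput and the circuit speed. The sketch below assumes 10 GB = 10 * 10**9 bytes, ignores protocol overhead, and uses illustrative function names; the 30-Mb/s figure for the fractional DS3 is only a stand-in for "in excess of 30 Mb/s".

```python
FILE_BYTES = 10 * 10**9  # assumption: 10 GB taken as 10 * 10**9 bytes


def effective_rate_bps(window_limit_bps: float, circuit_bps: float) -> float:
    """A transfer can go no faster than the slower of the two limits."""
    return min(window_limit_bps, circuit_bps)


def transfer_time_hours(size_bytes: float, rate_bps: float) -> float:
    """Hours needed to move size_bytes at rate_bps, ignoring overhead."""
    return size_bytes * 8 / rate_bps / 3600


# (window-limited throughput, circuit speed) in bits per second.
SCENARIOS = {
    "VPN, off-peak": (9.195e6, 10e6),
    "VPN, peak": (4.790e6, 10e6),
    "Fractional DS3": (30e6, 10.5e6),
}

for name, (window_limit, circuit) in SCENARIOS.items():
    rate = effective_rate_bps(window_limit, circuit)
    print(f"{name}: {transfer_time_hours(FILE_BYTES, rate):.2f} hours")
```

Taking the minimum of the two limits is what makes the fractional DS3 figure depend on the 10.5-Mb/s circuit speed rather than on the TCP window.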
If our goal had been to construct a network supporting an intranet consisting of mostly SMTP, HTTP, and occasional database transactions, the delay in our example would have been manageable and most of the end users of the network would not be affected by the increased delay incurred in the VPN-based solution.