Clear up network congestion

Network congestion is pretty much a fact of life—even if you're not experiencing it now, you likely will at some point. How do you fight network congestion? David Davis offers some tips for troubleshooting this problem, and he discusses some possible solutions.

Do you have network congestion? If you don't now, you probably have before, or you likely will in the future. How do you fight network congestion? While there isn't one quick-hit solution, you have several available options. Let's look at how you can begin troubleshooting network congestion and discuss some possible solutions.

Ask these questions

Before we begin troubleshooting, you need to answer some questions about your network. Even if you think you already know the answers, you still need to use tools to validate them.

Start off with these questions:

  1. What does your network look like? Do you have a diagram?
  2. What size are the network links?
  3. What types of applications are running on the network?
  4. What are the characteristics of those applications? Are they latency-sensitive or latency-insensitive? How much traffic do they generate? What are their traffic patterns?
  5. When did the congestion start? Did it appear all of a sudden, or has it slowly developed over time?
  6. Is the congestion constant, or does it come and go? Does it happen at a certain time of the day, week, or month?
  7. Has anything recently changed that could have caused the congestion (e.g., new applications, hardware changes, applied patches, etc.)?

Validate your answers

Using your answers to these questions, you may think that you know what's causing the congestion. However, you need to use tools to verify these deductions.

So how do you confirm that the congested link is really the one you think it is? On a Cisco router, this may be as simple as using the show interface command. Here's an example:

Router# show interface s3/0
Serial3/0 is up, line protocol is up 
 Hardware is QUICC with integrated T1 CSU/DSU
 Internet address is 10.0.100.2/30
 MTU 1500 bytes, BW 512 Kbit, DLY 20000 usec, 
   reliability 255/255, txload 36/255, rxload 255/255
 Encapsulation HDLC, loopback not set
 Keepalive set (10 sec)
 Last input 00:00:00, output 00:00:00, output hang never
 Last clearing of "show interface" counters never
 Input queue: 1/75/0/0 (size/max/drops/flushes); Total output drops: 4281
 Queueing strategy: fifo
 Output queue: 0/40 (size/max)
 5 minute input rate 498000 bits/sec, 400 packets/sec
 5 minute output rate 73000 bits/sec, 110 packets/sec
   148239286 packets input, 3250920677 bytes, 0 no buffer
   Received 536509 broadcasts, 0 runts, 5 giants, 0 throttles
   31566 input errors, 2219 CRC, 14502 frame, 0 overrun, 0 ignored, 14840 abort
   148886376 packets output, 1823664299 bytes, 0 underruns
   0 output errors, 0 collisions, 200 interface resets
   0 output buffer failures, 0 output buffers swapped out
   17 carrier transitions
   DCD=up  DSR=up  DTR=up  RTS=up  CTS=up

Router#

As you can see, the receive load on this 512-Kbit circuit is pegged at 255/255, and the 5-minute input rate of 498,000 bits/sec is nearly the full bandwidth of the link. These results show that this circuit is indeed congested in the inbound direction.
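If you want to turn those raw figures into percentages, a short script can do the arithmetic. The following is a minimal sketch, not a Cisco tool: the function name link_utilization is hypothetical, the regular expressions only match the field layout shown in the sample output above, and it assumes that "BW 512 Kbit" means 512,000 bits/sec.

```python
import re

def link_utilization(show_interface_text):
    """Return utilization percentages parsed from 'show interface' text."""
    bw = re.search(r"BW (\d+) Kbit", show_interface_text)
    loads = re.search(r"txload (\d+)/255, rxload (\d+)/255", show_interface_text)
    rate_in = re.search(r"input rate (\d+) bits/sec", show_interface_text)
    rate_out = re.search(r"output rate (\d+) bits/sec", show_interface_text)
    if not (bw and loads and rate_in and rate_out):
        raise ValueError("unrecognized show interface output")
    bw_bps = int(bw.group(1)) * 1000  # assumption: 1 Kbit = 1,000 bits
    return {
        "tx_load_pct": 100 * int(loads.group(1)) / 255,
        "rx_load_pct": 100 * int(loads.group(2)) / 255,
        "input_pct": 100 * int(rate_in.group(1)) / bw_bps,
        "output_pct": 100 * int(rate_out.group(1)) / bw_bps,
    }

# The relevant lines copied from the example output.
SAMPLE = """\
 MTU 1500 bytes, BW 512 Kbit, DLY 20000 usec,
   reliability 255/255, txload 36/255, rxload 255/255
 5 minute input rate 498000 bits/sec, 400 packets/sec
 5 minute output rate 73000 bits/sec, 110 packets/sec
"""

stats = link_utilization(SAMPLE)
for name, value in stats.items():
    print(f"{name}: {value:.1f}%")
```

For the sample above, the inbound figures work out to roughly 97 percent by rate and 100 percent by load, while the outbound side sits near 14 percent.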

You can also use Paessler's PRTG—an easy, graphical tool for monitoring utilization—to validate your answers. However, while these tools can help you make sure you're on the right track, neither PRTG nor the show interface command can tell you where the traffic is coming from or what traffic it is.
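Monitoring tools generally arrive at a utilization figure by sampling an interface's byte counter twice and dividing the difference by what the link could have carried in that time. Here is a rough sketch of that calculation, assuming a 32-bit octet counter that may wrap once between readings. The function name and the sample readings are made up for illustration, and this is not a description of how PRTG works internally.

```python
COUNTER32_MAX = 2**32  # a 32-bit counter rolls over to zero at this value

def utilization_pct(octets_start, octets_end, seconds, link_bps):
    """Percent utilization between two readings of a 32-bit octet counter."""
    # The modulo keeps the delta correct if the counter wrapped once.
    delta = (octets_end - octets_start) % COUNTER32_MAX
    return 100 * delta * 8 / (seconds * link_bps)

# Hypothetical readings taken 300 seconds apart on a 512-Kbit link.
# The counter wrapped between the two samples.
pct = utilization_pct(4_294_000_000, 17_707_704, 300, 512_000)
print(f"inbound utilization: {pct:.1f}%")
```

The polling interval matters: a 5-minute average can hide short bursts that users still feel, so a shorter interval may be worth trying on a link you suspect.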

Determine what the traffic is

To get a better idea of the traffic, you'll need to take a packet capture or use a tool such as Packeteer, Network General Sniffer, or Network Instruments Observer. These tools use remote hardware probes that capture packets and bring them back to a decoding station (such as your desktop). They then decode the traffic so you can see which hosts and applications are generating it. (Packeteer can also block traffic.)

Or, if you're local to the site with the congestion, determining the problematic traffic could be as simple as mirroring the switch port that connects to that router and using a PC with Ethereal (since renamed Wireshark) to view the traffic. There are a lot of ways to find out what that traffic is, so choose a method you're comfortable and familiar with.
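Whichever capture method you choose, the question you're trying to answer is usually "who is sending the most bytes, and to what?" The sketch below shows that tally in its simplest form. It assumes you have already exported a summary of the capture as (source, destination, service, bytes) records. The function name, addresses, and service labels are all hypothetical.

```python
from collections import Counter

def top_talkers(packets, n=3):
    """Sum bytes per (source, destination, service) and return the n largest."""
    totals = Counter()
    for src, dst, service, length in packets:
        totals[(src, dst, service)] += length
    return totals.most_common(n)

# Hypothetical capture summary: (source, destination, service, bytes)
capture = [
    ("10.0.1.15", "10.0.2.9", "printing", 1460),
    ("10.0.1.15", "10.0.2.9", "printing", 1460),
    ("10.0.1.22", "10.0.2.5", "citrix", 120),
    ("10.0.1.15", "10.0.2.9", "printing", 1460),
    ("10.0.1.30", "10.0.2.7", "voip", 200),
    ("10.0.1.22", "10.0.2.5", "citrix", 140),
]

ranked = top_talkers(capture)
for (src, dst, service), total in ranked:
    print(f"{src} -> {dst} ({service}): {total} bytes")
```

In this made-up data, a single print conversation accounts for most of the bytes, which is the kind of result that tells you where to focus.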

Decide how to deal with the traffic

Once you've determined what the traffic is, you basically have two options. You can stop the traffic, or you can choose to allow the traffic.

If you're lucky, opting to stop the traffic should resolve the congestion. You can stop it with an access control list, or you can terminate it at the source.

On the other hand, if you choose to allow the traffic, you then have a few choices for how to deal with the congestion. Of course, there are pros and cons to each option.

  • Add more bandwidth.
  • Perform quality of service (QoS) on the traffic.
  • Compress the traffic.

Weigh your options

Adding more bandwidth (at least on a WAN link) means you can expect to pay a higher price per month. In some cases, however, this is your only option.

For example, if you have 25 users who are all trying to use Citrix over a 56-K dedicated frame-relay circuit, no amount of QoS or compression will resolve the extreme slowness. You just need more bandwidth.
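A quick calculation shows why. This is only a back-of-the-envelope sketch: the 20,000 bits/sec per session is a hypothetical planning number chosen for illustration, not a Citrix specification, and 56 K is taken to mean 56,000 bits/sec.

```python
def per_user_bps(link_bps, users):
    """Even share of the link for each user, in bits/sec."""
    return link_bps / users

LINK_BPS = 56_000             # the 56-K circuit, assumed to be 56,000 bits/sec
USERS = 25
ASSUMED_SESSION_BPS = 20_000  # hypothetical per-session need, not a vendor figure

share = per_user_bps(LINK_BPS, USERS)
needed = USERS * ASSUMED_SESSION_BPS
shortfall_factor = needed / LINK_BPS

print(f"each user gets about {share:.0f} bits/sec")
print(f"demand is about {shortfall_factor:.1f} times the link capacity")
```

Under these assumptions each user gets a little over 2,000 bits/sec, and demand outstrips the circuit several times over. Prioritizing or compressing the traffic cannot close a gap that large.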

On the other hand, let's say you already have a reasonable amount of bandwidth for your Citrix and VoIP traffic, but users complain of periodic slowness. This slowness happens when users print 10-MB PDF files over the 256-K WAN link. In this case, you need to perform QoS.

This solution goes back to the question about the requirements of the applications running on your network (in this case, latency-sensitive vs. non-latency-sensitive). The non-latency-sensitive print jobs are slowing down the latency-sensitive traffic, and the latency-sensitive traffic needs higher priority. Most users won't notice if their print job takes a little longer to print out, but they will notice if their phone call sounds bad or their Citrix session is slow.
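To see why prioritization helps here, consider a toy model of the 256-K link with a backlog of print packets queued ahead of a single voice packet. This sketch uses simple strict-priority ordering and ignores everything else a real router does, so treat it as an illustration of the idea rather than a model of any particular QoS feature. The packet sizes and the backlog are hypothetical, and the 10-MB print job is assumed to be 10,000,000 bytes.

```python
LINK_BPS = 256_000  # the 256-K WAN link, assumed to mean 256,000 bits/sec

def finish_times(queue, prioritize):
    """Return {packet name: seconds until it finishes transmitting}."""
    order = list(queue)
    if prioritize:
        # Stable sort: latency-sensitive packets first, arrival order otherwise.
        order.sort(key=lambda pkt: 0 if pkt["sensitive"] else 1)
    clock = 0.0
    done = {}
    for pkt in order:
        clock += pkt["bytes"] * 8 / LINK_BPS  # time to put this packet on the wire
        done[pkt["name"]] = clock
    return done

# Hypothetical backlog: 20 full-size print packets, then one voice packet.
backlog = [{"name": f"print-{i}", "bytes": 1500, "sensitive": False}
           for i in range(20)]
backlog.append({"name": "voice", "bytes": 200, "sensitive": True})

fifo_voice = finish_times(backlog, prioritize=False)["voice"]
qos_voice = finish_times(backlog, prioritize=True)["voice"]
print_job_seconds = 10_000_000 * 8 / LINK_BPS  # one 10-MB PDF with the link to itself

print(f"voice under FIFO: {fifo_voice * 1000:.0f} ms")
print(f"voice with priority: {qos_voice * 1000:.2f} ms")
print(f"10-MB print job alone: {print_job_seconds:.1f} s")
```

In this model the voice packet waits nearly a second behind the print traffic under FIFO but leaves in a few milliseconds when it goes first. The print job needs more than five minutes of link time either way, so delaying it slightly costs the user very little.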

As for the third option, you can use compression in place of additional bandwidth. However, keep in mind that there are several caveats that go along with compression.

One big stipulation is that this solution doesn't always work. Compression only works for certain types of traffic, and it can cause delay on other types of traffic. In addition, compression can be expensive if you have several locations because you'll need a compression unit at each one.
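You can get a feel for which traffic compresses well with a quick experiment. The sketch below uses Python's zlib module purely as a stand-in. Routers and compression appliances use their own algorithms, so the exact ratios will differ, but the pattern tends to hold: repetitive text shrinks dramatically, while data that is already compressed or encrypted (simulated here with random bytes) barely shrinks at all. The sample payloads are invented for illustration.

```python
import os
import zlib

def compression_ratio(payload):
    """Compressed size divided by original size (lower is better)."""
    return len(zlib.compress(payload)) / len(payload)

# Repetitive, text-like traffic (hypothetical HTTP requests).
text_like = b"GET /reports/monthly.html HTTP/1.1\r\nHost: intranet\r\n\r\n" * 200

# Random bytes stand in for already-compressed or encrypted traffic.
random_like = os.urandom(len(text_like))

print(f"text-like payload:   {compression_ratio(text_like):.3f}")
print(f"random-like payload: {compression_ratio(random_like):.3f}")
```

If most of the traffic on the congested link looks like the second case, compression is unlikely to buy you much.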

While Cisco routers can carry out compression, doing so adds a bit of delay and, more significantly, increases CPU utilization. Cisco routers can also perform QoS, but it isn't very friendly to configure, nor is it easy to see what's going on.

Although a dedicated QoS device like Packeteer will cost you, in my opinion, it's far superior to trying to perform QoS inside a Cisco router. As much as I love Cisco routers and try to use them as much as possible, sometimes you need to take the "best-of-breed" approach.

David Davis has worked in the IT industry for 12 years and holds several certifications, including CCIE, MCSE+I, CISSP, CCNA, CCDA, and CCNP. He currently manages a group of systems/network administrators for a privately owned retail company and performs networking/systems consulting on a part-time basis.
