
Cisco NetFlow on a low budget

Network administrators need to be able to monitor what is happening on their networks. Many systems that deliver such services are high-end and quite costly. Dave Mays has a solution using the OSU flow tools, and best of all, it's a solution that's free!


When you think of troubleshooting your network, you probably picture spending a small fortune on high-end packages to meet your needs. In this Daily Feature, I will cover some simple and free tools that can provide usage statistics for your network.

My goal in this Daily Feature is to put together a network monitoring solution that is nearly as capable as Cisco’s high-end (and costly) NetFlow analysis tools. Of course, to see the data that is zooming through your Cisco network, you must first have a NetFlow-capable Cisco router. You can find a list of NetFlow-capable systems on Cisco’s Web site.

What is NetFlow?
NetFlow is a proprietary protocol that Cisco introduced to provide the system administrator with information about what the network was doing. For an in-depth look at NetFlow and how it is designed, check out this Cisco page, which gives you more information than most people will ever want to know about the protocol.

In the past, the network administrator was able to pull Simple Network Management Protocol (SNMP) data from a connection and see the number of packets flying through the network, but he or she was then left guessing as to where the data was coming from (and where it was going). With NetFlow, you get all the nitty-gritty details, as well as log files large enough to choke a horse, in packages called flows. A flow is a shrink-wrapped packet of data about a data transfer that the router has seen. A flow provides the source and destination IPs, as well as the port numbers and whether the traffic is UDP or TCP. From the flows, you can gather in-depth information about your network and find out which segments and connections are most important to your users and which ones don’t mean as much.

You could purchase Cisco’s Network Data Analyzer (formerly NetFlow FlowAnalyzer) to evaluate your network data, but after you have purchased all that expensive equipment, who has the cash left to buy software? The most popular free package on the Internet is cflowd, which allows you to perform statistical queries against the flows and get down and dirty with your traffic.

After taking a brief look, I decided that cflowd was a little much for what I was doing, so I took a shorter path and chose the OSU flow tools, written by Mark Fullmer and Suresh Ramachandran at Ohio State, with Steve Romig giving suggestions and fixes. This little package is still in beta (maybe alpha) stages, but it does the trick for finding out where your data goes. The OSU package has been designed to get you the info that you want and ditch the rest. The databases used with the OSU flow tools are compressed with zlib so the data won’t kill you. You can download the latest OSU flow tools from this Ohio State mirror.
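The build follows the usual routine for a source tarball. The sketch below is only a rough outline: it assumes a GNU-style configure script and a tarball named flow-tools-*.tar.gz, neither of which is guaranteed for the OSU beta, so follow the package's own README or INSTALL file if it differs:
gzip -dc flow-tools-*.tar.gz | tar xf -
cd flow-tools-*
./configure
make
make install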

Once you have the package downloaded, compiled, and installed, the usage is pretty simple. Take heed, though: Some tools work, some don’t, and some don’t do what they say they do (but they are getting better every day). The ones I use on a daily basis are flow-cat and flow-stat, which seem to get the job done.

Setting up flow output
The first step is to set up the flow output from your router. On a router that supports NetFlow, enable flow export so the data goes to your collection host on the port that you have defined. On a Cisco 7500 series, these are the configuration lines:
configure terminal
interface serial 3/0/0
ip route-cache flow
exit
ip flow-export 192.168.1.1 4444 version 5 peer-as
exit
clear ip flow stats


This example shows how to modify the configuration of the serial interface 3/0/0 to activate NetFlow switching and to send the flows to UDP port 4444 on a workstation with the IP address 192.168.1.1. Then existing stats are cleared to ensure accurate readings when the command show ip cache flow is run to view a summary of the NetFlow switching statistics. If you don’t clear them, the stats you get on the flow-collector won’t match.
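To confirm that the router is now caching flows, you can run that summary command from privileged EXEC mode:
show ip cache flow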

Once that is complete, log in to the 192.168.1.1 host and start the flow-capture collector.
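A rough sketch of that command line, taking the data directory and the router address 192.168.1.254 from the surrounding text, would look something like this (the option letters follow the later flow-tools releases; the older OSU package uses slightly different switches, such as -I for the exporting router, so check the flow-capture man page for your version):
../bin/flow-capture -w /usr/local/Netflow/data -z 9 -n 23 -E 512M 0/192.168.1.254/4444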

In this example, the command will listen on port 4444 for flows from host 192.168.1.254, compress with the highest compression level (9), swap out the file 23 times a day, and not let a single day go over 512 MB. You can tweak this for your own configuration, but it works well for me to see an hour at a time. Another common setting is -n 92, which swaps out the log file roughly every 15 minutes to keep the files smaller. I save the flow files in /usr/local/Netflow/data; the collector writes to a temporary file called tmp-<date and time>, and once the period has passed, that file is renamed to a backup file named ft-<date and time>. I also run a cron job that moves a complete day’s files into their own directory (for sanity’s sake).
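A minimal sketch of that cron entry, assuming a hypothetical move-flows.sh script (and an equally hypothetical path for it) that creates a directory named for the previous day and moves that day's ft-* files into it:
# run shortly after midnight and sweep yesterday's flow files into a dated directory
5 0 * * * /usr/local/Netflow/bin/move-flows.sh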

Multiple router information
If you have multiple routers reporting flows, you will need to run multiple flow-capture processes and change the -I option to point at each of the other routers. Even with this change, the collectors write their data to the same file, so you still get all the information in one place to run the reports on.

Now that you have your flows, what are you going to do with them? Simple: run the tools that tell you what is going on in your network. Since the tools are written to take input from STDIN, all you have to do is pipe the data to the tool.
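For example, to run the destination AS report against a single flow file (using the ft-<date and time> placeholder from above, and assuming flow-stat will read one file straight from STDIN), you could do:
../bin/flow-stat -f20 -S2 < ft-<date and time>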

This gives you:
#
# output from: ../bin/flow-stat -f20 -S2
#
# dst AS   flows    octets      packets   duration
#
701        23580    203392279   365287    113959400
11101      135      117379341   78764     548560
6128       1950     96515212    74791     4348004
11351      1970     96247011    72227     4651180
7015       2366     47204868    46858     6003856
6327       2893     46751525    43615     6074960
4355       15626    43973032    107213    19762064


The flow-stat command will give you a report on the data contained in the flow packets. The example above is the one I use most often to get an idea of which AS numbers my network traffic is destined for. The report is created by passing the -f20 flag to the command. With -f20, you get a destination AS report: for each AS the packets are headed to, it lists the flows, the octets (how many octets traveled during the report period), the packets (how many packets carried that data), and the duration. Since each flow covers its own span of time rather than a single common clock, you need the duration to get an accurate read on what is occurring. You can also use the -f8 flag, which reports the reverse (the data coming into your network) in a similar fashion. Another handy flag is -S2, which sorts on column 2 (total octets) and gives you a ranking from largest to smallest. One issue I have discovered is that -S2 works correctly with the -f20 flag but not with the -f8 flag.

flow-cat
The next command is flow-cat, which lets you take multiple flow files and cat them together. Normal cat won’t do this because of the headers and the compressed nature of the flow files. For example:
../bin/flow-cat ft-* | ../bin/flow-stat -f20 -S2 | less

will cat together all the ft-* flow files and create the report with flow-stat.
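Swapping in the -f8 flag described earlier gives you the incoming view of the same data (leaving off -S2, since the sort does not behave with that format):
../bin/flow-cat ft-* | ../bin/flow-stat -f8 | less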

This new information will help you determine who your data is coming from and where your data is going, and best of all, it doesn’t cost much to implement. The only thing that might cause issues is the amount of data collected. At my site, I collect about 700 MB a day and flush the logs from the server once a week.
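If you want to automate that weekly flush, a crontab entry along these lines would do it (the path and the seven-day cutoff are assumptions, so adjust them for your site):
# every Sunday at 01:00, remove flow files more than seven days old
0 1 * * 0 find /usr/local/Netflow/data -name 'ft-*' -mtime +7 -exec rm -f {} \;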

The man pages for the system are well done and have examples to lead you through the process, plus you can’t beat getting something for free. Happy packet hunting.
