By Sam Shah, CCNA, CCNP
Early in the morning of Saturday, January 25, I received a phone call at home from upper management requesting that I check a customer’s network for problems. This was a direct phone call and I did not receive a page—a deviation from normal protocol for such communication—which immediately indicated to me that this was urgent.
From home, I made several attempts to dial in to the network, but none succeeded. Internet response was very slow in general, so I turned to CNN and heard that networks throughout the world were experiencing problems, possibly due to a new virus or worm.
That was enough for me to decide it was time to go into the office. I told my wife that it could be a while before I returned. I was right.
This customer has a highly sophisticated, robust network consisting of dual high-speed backbones using ATM and Gigabit Ethernet. In addition, redundant routers provide high availability.
Upon arriving at my desk, I tried to Telnet into a core switch but without any success. Ping responses from the switch were intermittent. Before I went into a lengthy troubleshooting process, I wanted to see if there was any more information about a new virus or worm that might help me pinpoint my efforts.
A quick scan of some of the popular Web advisories indicated that a worm called SQL Slammer was spreading fast and that it exploited Microsoft SQL Server 2000 and MSDE (the Microsoft SQL Server Desktop Engine, which is based on SQL Server). A phone call to our Chief Information Security Officer confirmed the information. This gave me a good start. Now our job was to find and block SQL Slammer packets. Thus, a game of hunt and kill began.
Help from Sniffer
A visual inspection of a related Cisco Catalyst switch revealed high backplane utilization, according to the LED bar activity on the Supervisor module. It indicated that a very high volume of data was streaming to and from the network. The next step was to discover the systems that were moving this high volume of data.
Since no one was able to Telnet into any of the core switches (Cisco Catalyst 6500s and 5500s), we consoled directly into one of the server farm switches. We spanned the server farm Virtual LAN (VLAN) to a port and attached Sniffer to this port. (Spanning is a mirroring technique that copies traffic from a VLAN or port to another port, where it can be captured for diagnostic purposes.) Within just a few seconds, we collected multiple megabytes of data. Viewing the IP host table revealed our “top talkers” on the network. That gave us a list of suspect systems and their IP addresses.
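A top-talkers view is essentially a per-host byte tally sorted in descending order. The Python sketch below illustrates that idea under the assumption that the capture has already been reduced to (source IP, destination IP, byte count) records. The sample addresses and the top_talkers name are invented for illustration and do not represent Sniffer's actual output or interface.

```python
from collections import Counter

def top_talkers(records, n=3):
    """Return the n source IPs that sent the most bytes, busiest first."""
    sent = Counter()
    for src, _dst, size in records:
        sent[src] += size
    return sent.most_common(n)

# Hypothetical capture summary: (source IP, destination IP, bytes)
capture = [
    ("10.1.1.20", "192.0.2.7", 404),
    ("10.1.1.20", "198.51.100.9", 404),
    ("10.1.1.35", "203.0.113.4", 404),
    ("10.1.1.8", "10.1.1.1", 64),
    ("10.1.1.20", "203.0.113.50", 404),
]

for ip, total in top_talkers(capture):
    print(f"{ip:<12} {total} bytes")
```

In a real capture the list would hold many thousands of records, but the ranking step stays the same.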
For those without Sniffer…
If you do not own a $30,000 enterprise version of Sniffer, here are three commands you can run on a Cisco Catalyst switch that will provide similar results:
clear counters
show port mac
show cam dynamic OR show cam [slot #]/[port #]
The first command clears statistics counters. The second command shows the number of transmitted and received unicast and multicast/broadcast packets. Look for ports with extremely high numbers of packets transmitted. The third command (whichever one you choose) displays MAC addresses associated with those ports. Using these commands should help you find chattering ports that need to be shut down.
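Once the counters have been cleared and left to accumulate for a short interval, spotting a chattering port comes down to comparing each port's transmit count with what is typical for the switch. The Python sketch below assumes the per-port counts have already been copied off the switch into a dictionary. The port names, the numbers, and the factor of 10 are illustrative assumptions, not real switch output or a recommended threshold.

```python
from statistics import median

def chattering_ports(tx_packets, factor=10):
    """Return ports whose transmit count exceeds factor x the median, busiest first."""
    typical = median(tx_packets.values())
    noisy = [(port, count) for port, count in tx_packets.items()
             if count > factor * typical]
    return sorted(noisy, key=lambda item: item[1], reverse=True)

# Hypothetical packets transmitted per port since the counters were cleared
counters = {"3/1": 1200, "3/2": 950, "3/3": 4800000, "3/4": 1100, "3/5": 2650000}

for port, count in chattering_ports(counters):
    print(port, count)
```

Using the median rather than the mean keeps a handful of flooding ports from dragging the baseline upward.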
Help from the server team
By now, most members of both the network and the server teams were on-site. We ran the IP addresses of the top talkers by the server team, and they verified that those indeed belonged to Microsoft SQL Servers, with a few exceptions. (The exceptions were later discovered to be machines with MSDE installed.)
Now that the suspects were verified, we had to shut down the switch ports where they were attached. To find the port for each suspect device, we pinged its IP address from the switch’s console. After a successful ping response, we ran the show arp command to obtain the device's MAC address, which could then be matched against the switch's CAM table to identify the port where the device was attached. The next step was simple: Disable the port to inhibit communication on it. Eventually, all of the Microsoft SQL-related top talkers were silenced.
Stopping the outsider attacks
We effectively shut down SQL Slammer chatter generated from inside the network, but we still had to deal with the traffic coming from the outside. To do this, we used the simple technique of Cisco IP access control lists (ACLs). Within an ACL, you can specify source and destination IP addresses as well as port number to either permit or deny (block) traffic. You can apply the access list on any interface, either inbound or outbound.
First we had to find the port number the attack was using. (This was before major companies such as Symantec had discovered and revealed this information publicly.) Further analysis of the Sniffer packet data revealed that all the top talkers were transmitting data with destination port number 1434.
I created an ACL to block any traffic destined for port 1434 over both UDP and TCP. (The TCP entry was later removed after we confirmed that the worm used only UDP.) To create this ACL on a Cisco router, I ran these commands:
access-list 100 deny udp any any eq 1434 log
access-list 100 permit ip any any
The first line specifies that traffic from any source to any destination with a destination port UDP 1434 will be blocked. Adding the keyword “log” at the end allowed us to record source and destination addresses whenever any traffic was blocked so we could discover where the attack was coming from and where it was targeted.
All ACLs have an implicit deny statement that essentially blocks all traffic that's not explicitly permitted in the list. To override this and to allow the rest of the IP traffic (not using port 1434), the permit statement was essential to complete the ACL. If I had forgotten to put it in, I would have blocked all IP traffic.
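The first-match behavior and the implicit deny can be modeled in a few lines. The Python sketch below is a simplified illustration, not how a router implements ACLs: it matches only on protocol and destination port, and the Rule class and evaluate function are hypothetical names introduced for this example.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Rule:
    action: str                      # "permit" or "deny"
    protocol: str                    # "udp", "tcp", or "ip" (any IP protocol)
    dst_port: Optional[int] = None   # None matches any destination port

def evaluate(acl, protocol, dst_port):
    """Return the action of the first matching rule; deny if nothing matches."""
    for rule in acl:
        proto_ok = rule.protocol in ("ip", protocol)
        port_ok = rule.dst_port is None or rule.dst_port == dst_port
        if proto_ok and port_ok:
            return rule.action
    return "deny"  # the implicit deny at the end of every ACL

# Models the two lines of access list 100 shown above
acl_100 = [Rule("deny", "udp", 1434), Rule("permit", "ip")]
acl_without_permit = acl_100[:1]

print(evaluate(acl_100, "udp", 1434))            # deny
print(evaluate(acl_100, "tcp", 80))              # permit
print(evaluate(acl_without_permit, "tcp", 80))   # deny
```

The last call shows the failure mode described above: with the permit line missing, ordinary traffic falls through to the implicit deny.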
Next, I applied the ACL to all my router interfaces for inbound traffic. An ACL by itself is meaningless until it is applied to an interface. The commands I used were:
ip access-group 100 in
Interface VLAN 11
ip access-group 100 in
It is important to remember to apply the ACL to both the physical and virtual interfaces, as I did in the example above. The security team applied the ACL to our firewalls and Cisco VPN concentrators to block Slammer traffic coming in from the Internet, as well as from business partners. This effectively inhibited all Slammer traffic into the network.
Switch and router backplane and CPU utilization soon dropped to a reasonable level, and we were able to easily Telnet into network devices again.
The server team downloaded a patch from Microsoft to fix the SQL Server vulnerability issue. Then the network team reenabled each switch port one at a time to apply the patch to the SQL servers over the network. The utilization went up momentarily but came back down as soon as the patch was applied.
Reflections on Slammer
This was probably one of the fastest-spreading worms ever, as it spanned the globe within about 15 minutes from inception. It was not malicious from a data-alteration standpoint, but it crippled major bank ATM (automated teller machine) networks, call centers, and a few 911 emergency response centers because of the huge amounts of traffic that it produced.
It essentially ran from computer memory and shelled out UDP traffic as fast as the system's NIC could handle it. Since UDP is a connectionless protocol and does not need acknowledgement (as opposed to TCP, which is connection oriented), it can just blindly shoot out buckets of data. The Slammer packet itself was only 376 bytes of data plus header, but the sheer repetition caused lots of havoc as it overloaded switches, hubs, routers, and (unpatched) Microsoft SQL servers, rendering these systems too busy to respond.
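A rough calculation shows how such a small packet can saturate a network. The Python sketch below assumes each 376-byte payload travels with an 8-byte UDP header and a 20-byte IPv4 header without options. It ignores link-layer framing, so the result is an upper bound, not a measurement. The link speeds are illustrative assumptions.

```python
PAYLOAD_BYTES = 376      # Slammer payload size given in the article
UDP_HEADER_BYTES = 8     # fixed-size UDP header
IPV4_HEADER_BYTES = 20   # IPv4 header without options

def max_packets_per_second(link_bits_per_second):
    """Upper bound on packets/s for one host, ignoring link-layer framing."""
    packet_bits = (PAYLOAD_BYTES + UDP_HEADER_BYTES + IPV4_HEADER_BYTES) * 8
    return link_bits_per_second / packet_bits

for name, speed in [("100 Mbps", 100_000_000), ("1 Gbps", 1_000_000_000)]:
    rate = max_packets_per_second(speed)
    print(f"{name}: about {rate:,.0f} packets per second")
```

On a 100 Mbps link this works out to roughly 31,000 packets per second from a single infected host, each one aimed at a different target.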
Our technique to deal with Slammer was reactive. I did come up with an ACL-based solution; however, I will not take all the credit for the fix. Collaboration among various teams helped us conquer the problem within a few hours. We all felt a lot better when we later learned that many large corporations fought the problems for over three days.
Losing a Saturday at work got me thinking about whether there could be a proactive solution to a problem like this. The first idea I came up with was to run the Cisco IOS Firewall feature set on a perimeter router, or to place an IDS outside the network router. If a vendor provided an intrusion signature, the router could be configured to generate an alarm, drop the packet, or send a TCP reset (in the case of a TCP-based intrusion) upon a signature match. However, chances are low that a signature would already exist for a new type of intrusion, and one probably would not be released soon enough to make a measurable difference.
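The signature-and-action idea can be sketched as a simple lookup. The Python example below is a toy model, not a representation of how the Cisco IOS Firewall or any IDS is configured. Real signatures usually inspect packet contents as well, while this hypothetical one matches only on protocol and destination port. The dictionary layout and the actions_on_match name are assumptions made for illustration.

```python
def actions_on_match(packet, signature):
    """Return the configured actions if the packet matches, else an empty list."""
    if packet["protocol"] != signature["protocol"]:
        return []
    if packet["dst_port"] != signature["dst_port"]:
        return []
    # A TCP reset only applies to TCP traffic, so drop it from the list otherwise.
    return [action for action in signature["actions"]
            if action != "reset" or packet["protocol"] == "tcp"]

# Hypothetical signature for the traffic described in this article
slammer_signature = {
    "protocol": "udp",
    "dst_port": 1434,
    "actions": ["alarm", "drop", "reset"],
}

print(actions_on_match({"protocol": "udp", "dst_port": 1434}, slammer_signature))
print(actions_on_match({"protocol": "tcp", "dst_port": 80}, slammer_signature))
```

The first call yields the alarm and drop actions without the reset, since the matching packet is UDP. The second yields nothing because the packet does not match.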
The second option is simply to stay completely up to date on server and application patches, which can demand a great deal of time and effort. So there you have it: the current security dilemma in a nutshell. There is no perfect solution in the security world. The fast deployment of an intrusion signature to perimeter firewalls and/or IDSs may hold the best potential for dealing with future episodes like Slammer.