
Support diary: Heather H. (Friday)

It's a holiday in the U.K., but Heather puts in a long day nonetheless.


Read Monday’s entry.

Read Tuesday’s entry.

Read Wednesday’s entry.

Read Thursday’s entry.

Friday, 10:45 A.M.
I think my husband has just about given me up for dead when I stumble down the stairs. He says something witty like “Look, it moves.” Still, he did let me sleep in, which I definitely needed.

1:00 P.M.
I decided that I should get my health check out of the way, now that I was up and had had some breakfast and coffee. Decaf coffee in my case—I’m resolutely trying to stay off the caffeine. Some days it seems like a better idea than others…

Right, that health check. Health checks, or more accurately, mini health checks, are done by the systems engineers as a sales tool. We go in, take a trace, discuss it on site with the customer, and then dissect it in the privacy of our own offices/homes to produce a report. Customers generally find this very helpful as it documents conditions on their networks. I’m aware that some people have used them as powerful bargaining tools for new equipment and software, thankfully including Sniffer itself.

So, what sort of conditions do we show the customer? Anything the Sniffer picks up, basically. We take care to ensure that everything in the report is what the Sniffer itself produces, not what we can produce. This is simply because people are looking at buying the tool, not our services. Anything else I try to point out when we’re there in person so that they can be aware, but it doesn’t compromise the integrity of the report.

Today’s report is a doozy. The network that I visited was Ethernet and had been cobbled together from two separate networks earlier this year. The gentleman thinks that he might be having some problems, as access to the network is now significantly slowed, and logins take a long time to be processed.

As soon as we got to his server room, I thought I could see some problems. First off, his hubs were going mad. They were absolutely mental, and they were flashing orange—collisions. Immediately I suspected that his network was overutilized and resolved to look at that first thing. From the pattern of the flashing on the hubs (i.e. many of them going orange at the same moment), I suspected he was having broadcast storms. It seemed too far-reaching to be something more minor.

I set up and took a look at his utilization. Considering the figures I’m seeing, I am quite surprised that his users aren’t trying to lynch him in the halls for what they must be going through. So far, utilization has peaked at 76 percent, and I’ve been looking for about 30 seconds. This is Ethernet, mind you. On a shared segment it shouldn’t really hold above 30 percent utilization if you want comfortable performance. Ah, 85 percent utilization now. When I point this out, he says “Is that a bit high?” Oh dear.
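For readers who want to see where a utilization figure like that comes from, here is a minimal sketch of the arithmetic. It assumes a 10 Mbps shared segment (the entry does not state the link speed) and ignores preamble and inter-frame gap, so a real analyzer may report slightly different numbers. The function names and the byte count are hypothetical, chosen for illustration only.

```python
# Sketch only: link speed and the sample byte count are assumptions,
# not values taken from the customer's network.

COMFORT_THRESHOLD_PCT = 30.0  # rule of thumb quoted in the entry


def utilization_percent(bytes_seen: int, interval_s: float,
                        link_mbps: float = 10.0) -> float:
    """Return the share of link capacity used over the interval, in percent."""
    bits_seen = bytes_seen * 8
    capacity_bits = link_mbps * 1_000_000 * interval_s
    return 100.0 * bits_seen / capacity_bits


def assess(pct: float) -> str:
    """Label a utilization figure against the 30 percent rule of thumb."""
    return "comfortable" if pct <= COMFORT_THRESHOLD_PCT else "overutilized"


if __name__ == "__main__":
    # Hypothetical one-second sample on an assumed 10 Mbps segment.
    pct = utilization_percent(bytes_seen=1_062_500, interval_s=1.0)
    print(f"{pct:.0f} percent: {assess(pct)}")
```

The point of the sketch is simply that sustained traffic above roughly a third of the wire's capacity on a hub-based segment leaves little room before collisions start to dominate.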

Another thing I noticed is that he has a ton of CRC and alignment errors. Not entirely surprising. And many, many broadcast storms, as predicted. Unfortunately, the card used in this case was not one that can run in promiscuous mode. Therefore I was not picking up physical layer errors, which is a definite bummer in this place.

While I had been looking at these initial errors, I had also been asking the customer about his network structure and what he hoped to do with it. I strongly suggested getting some switches in, and he asked if this would extend his cable lengths further. It seemed an odd question, so I asked him to elaborate. This is where he dropped the bomb: he’s overextended his cabling lengths. By how much, I asked. It turns out that when the additional network was added, they ran Cat-5 cabling to match where they had the old 10Base2 cabling. Meaning he’s got extra cabling he never had before, and he’s laid it along the same paths and distances as a previous cable type that could go nearly twice the distance. Oh dear. Breaking simple laws of physics isn’t good for a network. I tried to explain why he shouldn’t do that, and he told me that he understood, but didn’t think it was really that important. Oh yes, it’s important. All those collisions…
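To put numbers on the cabling problem: a 10Base2 coax segment may run up to 185 meters, while a twisted-pair run over Cat-5 (10BaseT or 100BaseTX) is limited to 100 meters. The sketch below checks a run against those limits. The run length in the example is hypothetical, since the entry does not say how long the customer's runs actually were.

```python
# Maximum segment lengths in meters for the cable types mentioned above.
MAX_SEGMENT_M = {
    "10Base2": 185,    # thin coax
    "10BaseT": 100,    # twisted pair
    "100BaseTX": 100,  # twisted pair, Cat-5
}


def overrun_m(standard: str, run_m: float) -> float:
    """Return how many meters a run exceeds the limit by (0 if within spec)."""
    return max(0, run_m - MAX_SEGMENT_M[standard])


if __name__ == "__main__":
    run = 180  # hypothetical run length that was fine on the old coax
    for standard in ("10Base2", "10BaseT"):
        print(standard, "over by", overrun_m(standard, run), "m")
```

A run that sat comfortably inside the coax limit can be far out of spec once it is re-pulled as twisted pair, which fits the late collisions and CRC errors seen on site.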

After enjoying my initial view of a network in crisis, I decided to finally start my capture. I figured I had better make it a big capture file, because I could fill an 8 MB one in a second on this network. I started with 32 MB as my base capture size, and figured I could take two or three trace files in no time.
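As a rough guide to sizing a capture buffer, the time it takes to fill depends on the buffer size, the link speed, and how busy the wire is. The sketch below is back-of-the-envelope arithmetic only: the link speeds are assumptions (the entry does not state them), and it ignores any per-frame overhead the analyzer adds when storing packets.

```python
def fill_seconds(buffer_mb: float, link_mbps: float, utilization: float) -> float:
    """Estimate seconds to fill a capture buffer at a given utilization (0-1)."""
    buffer_bits = buffer_mb * 1024 * 1024 * 8
    bits_per_second = link_mbps * 1_000_000 * utilization
    return buffer_bits / bits_per_second


if __name__ == "__main__":
    # Assumed link speeds; 0.85 matches the peak utilization seen on site.
    for mbps in (10, 100):
        for size_mb in (8, 32):
            secs = fill_seconds(size_mb, mbps, 0.85)
            print(f"{size_mb} MB at {mbps} Mbps: about {secs:.1f} s")
```

Whatever the exact link speed, the estimate supports the same conclusion: on a network this busy, a small buffer is gone almost immediately, so starting with the larger size is the sensible choice.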

As expected, even my 32-MB trace file filled quickly, and there were tons of errors. One that keeps cropping up is a Broadcast/Multicast Storm diagnostic. The information you get from the Sniffer on this is as follows: “Global Diagnosis: Broadcast/Multicast Storm. Description: The number of broadcast or multicast frames per second has Broadcast/Multicast Storm entry in the Network Associates Expert. If a broadcast storm diagnosis occurs, then a broadcast storm symptom will also have occurred. Refer to the detailed information for that symptom. Possible cause: This could be a temporary condition resulting from a valid broadcast or multicast operation and/or a workstation that does not have an adequate host table is sending repeated RWHO packets. Change the workstation to assure that it maintains a host table and stops sending RWHO packets. The broadcast storm indicates some other underlying configuration problem. If broadcast storms occur frequently, their causes should be determined. A network can slow down considerably during such storms.”

Well, that pretty much sums that up. This network also had other errors including ACK Too Longs, Retransmissions, and Idle Too Longs. These are all connection issues that could theoretically be blamed on the network’s congestion.

There is another error that is quite interesting and leads me to believe that many of the above errors are due to routing issues. The text of the error is: “Misdirected Frame. Description: This station has sent a large number of frames in which the IP destination address is not on the local subnet and the DLC destination address is not that of a router advertising a route to the IP destination. Possible cause: This station does not know what subnet it is attached to and/or this station is using expired or cancelled routes and/or a protocol other than RIP is being used to circulate routing information.”
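The condition in that description boils down to a two-part test, which can be sketched as follows. This is an illustration of the logic as described in the diagnostic text, not the Sniffer's actual implementation. It simplifies "a router advertising a route to the IP destination" to "any known router MAC address", and the subnet, addresses, and function name are all hypothetical.

```python
import ipaddress


def is_misdirected(dst_ip: str, dst_mac: str,
                   local_subnet: str, router_macs: set) -> bool:
    """Flag a frame whose IP destination is off the local subnet
    but whose DLC (MAC) destination is not a known router."""
    network = ipaddress.ip_network(local_subnet)
    off_subnet = ipaddress.ip_address(dst_ip) not in network
    sent_to_router = dst_mac.lower() in router_macs
    return off_subnet and not sent_to_router


if __name__ == "__main__":
    # Hypothetical values using documentation address ranges.
    routers = {"00:00:5e:00:53:01"}
    subnet = "192.0.2.0/24"
    print(is_misdirected("198.51.100.7", "00:00:5e:00:53:99", subnet, routers))
    print(is_misdirected("198.51.100.7", "00:00:5e:00:53:01", subnet, routers))
    print(is_misdirected("192.0.2.44", "00:00:5e:00:53:99", subnet, routers))
```

Seen this way, a frame only trips the diagnostic when it is bound for another subnet yet handed to something that is not a router, which is why a stale or wrong route is a natural suspect.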

Now, what is interesting about this is that the station that is the culprit in this error is the router. Huh? I’m forced to turn to my favorite network resource on this question.

Thankfully, I’ve gotten a possible answer that it might be a misconfiguration of the router, potentially a static route. Thanks, hubby. It’s helpful having my husband, John, in the business. I’d say that he’s taught me everything I know, but as I forget three-fourths of what he teaches me, people might get the wrong impression of what he knows. ;-)

4:00 P.M.
The health check is pretty much done. I’ll put some finishing touches on it when I’m next in the office or some other time, but for now I just want to get my laptop off of my lap. These things do get mighty warm, especially around the heat sink.

I decide instead to grab my CCNA book and take it upstairs to the bedroom. I’m working on my Cisco certifications now, having gotten my MCSE and gone as far as I want with Novell. John (the husband) has to go out for a while, and I can think of nothing nicer than studying for a little while and then taking a nap.

6:00 P.M.
Oooh, I think I read all of about two pages in that book before zonking out entirely. A lovely nap, but now the husband is back, so I think I’d be better off upright and studying rather than sleeping. Whoa, let me read that sentence again—choosing not to sleep? I must be coming down with something. Regardless, I don’t really want John asking me how far I’ve gotten with the book, as I haven’t. He’s way ahead of me in the Cisco stakes, one exam away from CCNP. I’m jealous of how far he’s gotten, but if I applied myself more I could get it done faster. Sometimes it’s just getting the hours in the day, though.

Speaking of hours in the day, I think I’ve put in a fair day’s work for a holiday, so I decide to pack it in and just enjoy having a few daylight hours with my husband. Next week will also be busy, with appointments around the country and then Thursday and Friday in Dublin.

2:16 A.M.
I’ve finally about finished writing this, my diary account for the last day of the week. I’m a bit sorry that today was a holiday because it means I didn’t have as much to report, but I was delighted to have the time at home. And now, seeing as it’s 2:20 in the morning, please allow me to be totally predictable for the last time and say, “It’s time I get some sleep.”

Heather Herbert is employed by Network Associates as a systems engineer working with their Sniffer Technologies and Magic Solutions product lines. She is MCSE, MCP+I, and CNA certified and working on her Cisco certifications. Born in New York and transplanted to the UK after meeting her husband online, she lives in a true geek household where Nirvana will be achieved as soon as the coffee pot is fully networked.
