One bad switch causes entire network to lose connection to server


hbissell
Has anyone seen a single bad switch cause an entire network to lose its connection to a server and the internet?

Details of the network (please note that it was set up by a previous IT company):
Main building - the cable modem is connected to the firewall, and the firewall is connected to the main switch. The server has 2 NICs, both plugged into this switch. There are 3 other switches located throughout the building that connect to each other and then into the main switch.
Building 2 - has 3 switches and a CAT5 connection into the Main Building switch.

Problem: One switch went bad in building 2, which caused not only the PCs plugged into it to lose connection, but also the PCs plugged into the other 2 switches and into the main building switch. We know that the switch in building 2 was causing the issue because as soon as we disconnected building 2's link to the main building, everything started working in the main building. The bad switch was replaced, building 2 was reconnected to the main switch, and everything has continued to work correctly.

Thoughts: I can see the bad switch in building 2 affecting building 2's connection to the server and the internet, but I do not understand how it relates to the main building. My staff and I are completely baffled by this one.
  • +
    0 Votes
    pixelprt

    Did you find the root cause? I have a similar situation: one switch failure causes everything on the network to lose connectivity, even devices plugged into a completely different switch.

    We added a new 48-port Cisco switch to the end of our daisy chain, but have had no problems over the last few months. It's a mystery to us, and I would love to hear if you found the error.

    +
    0 Votes
    Mohammad Oweis

    Maybe the switch kept sending broadcast traffic.
    A high volume of broadcast traffic can kill the network, especially if you use a very large subnet.
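
    To put a number on "very large subnet": every host in a subnet sits in the same broadcast domain (unless VLANs split it further), so each broadcast frame is delivered to all of them. A rough sketch using Python's standard ipaddress module; the example subnets are made up and are not taken from the original poster's network:

```python
import ipaddress


def broadcast_domain_size(cidr: str) -> int:
    """Return the number of usable host addresses in an IPv4 subnet."""
    network = ipaddress.ip_network(cidr, strict=False)
    # Subtract the network and broadcast addresses of an ordinary subnet.
    return max(network.num_addresses - 2, 0)


# Hypothetical subnets, for comparison only.
for cidr in ("192.168.1.0/24", "10.0.0.0/16", "10.0.0.0/8"):
    hosts = broadcast_domain_size(cidr)
    print(f"{cidr}: up to {hosts} hosts receive every broadcast")
```

    The wider the mask, the more devices have to process each broadcast, which is why a chattering switch or NIC hurts more on a flat, oversized subnet.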

    +
    0 Votes
    oldbaritone

    "Building 2 - has 3 switches and a CAT5 connection into the Main Building switch."

    A "discussion" (translation: argument) that I sometimes have with customers is about the merits of running a fiber-optic connection between buildings, instead of copper. Yes, fiber costs a little more, but there are two HUGE benefits:

    #1 - lightning and surge protection. Anytime copper goes between buildings, there's an increased chance for transients, even if the cable is underground. Fiber-optic is not subject to it. Benefit: increased safety, minimal cost. In fact, the difference between the cost of fiber and the cost of TWO Cat5 surge protectors (one for each end) is often minimal.

    #2 - ground loops. Different buildings have different electrical grounds. Sometimes there is a measurable potential between ground in one building and ground in another building. When you use copper (like cat5) between buildings, you connect the electrical grounds together. Sometimes, there's enough potential to wreak havoc. Try measuring it with a voltmeter - run a regular extension cord from one building to the other, and then measure from "ground" on one (the 3rd prong on the plug) to "ground" in the other building, like a conduit. (BE CAREFUL! I've seen cases where the ground loop was 100 volts or more! DON'T TOUCH! POSSIBLE SHOCK HAZARD!) If it's more than five volts or so, it could be a problem for computers. Again, Fiber-optic is not electrical, so it's not subject to ground loops. Un-problem.

    Ground loops cause a "hum" (longitudinal imbalance) on the line. The receivers in the switch can compensate to some degree, but there's a limit. I'm guessing the bad switch may have shorted the chassis ground to the balanced line in the Cat5, and pushed the signal distortion into the other switch.

    Yes, switches with one or more fiber ports are more expensive than Cat5-only. If the customer is really that cheap, there are "media converters" readily available that have Cat5 on one side and fiber on the other. They're less than $100 a pair on eBay. Then just plug the Cat5 side into the switch on either end, just like before. Each building's media converter is plugged into that building's power, so there's no ground loop.

    And BTW, the same thought applies for buildings with more than one electrical service: Ground loops can be a problem, and Fiber-optic is an easy way to avoid it.

    As long as it's just a straight-run point-to-point fiber, the connectors are very easy to install. Then plug in the fiber, plug in the Cat5 and it's ready to go.

    And of course, fiber has the potential for FAR more bandwidth than Cat5 - which may become another important payback as the customer grows. Change out the switches and go Gigabit on the fiber, and Gigabit directly into the server.

    BTW, my curiosity: Why does the server have 2 NICs plugged into the same switch? Would it make sense to plug one NIC into the Building One switch, and plug the other NIC into the Building Two switch? That would make subnetting easy!

    +
    0 Votes
    Gy-Antony

    Were they smart switches? I have had similar connection problems.

    Our network is probably about 6 years old, with approximately 25 users connecting through one Netgear GS748Tv3 smart switch. Not long ago we had a new VoIP phone system installed, so when we have a power outage everything goes off, including our phones. I installed battery backups on the server, the switch, the IP500... and everything had been running smoothly up until recently. Now everyone on the network keeps losing connection to the server, and when I try the network discovery tool for the switch, it says the switch is not there! I have been trying to troubleshoot this for days and have come to the conclusion that there must be an inherent problem with the DHCP IP configuration. DHCP was enabled and the server logs appear as they should. We have experienced these network problems at least twice a day, every day, for about a week. I have disabled DHCP on the switch and set it to a static IP. As yet we have experienced no problems... fingers crossed.
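
    For intermittent dropouts like this, a timestamped reachability log can help line the outages up with other events (lease renewals, power blips). A minimal sketch, assuming the switch's management interface accepts TCP connections on some port; the address and port in the comment are placeholders, not values from this post:

```python
import socket
from datetime import datetime


def is_reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


def log_line(host: str, port: int) -> str:
    """Format one timestamped reachability result."""
    state = "up" if is_reachable(host, port) else "DOWN"
    stamp = datetime.now().isoformat(timespec="seconds")
    return f"{stamp} {host}:{port} {state}"


# Example call (hypothetical management address and port):
# print(log_line("192.168.0.239", 80))
```

    Run from a scheduled task every minute or so, the DOWN lines show whether the switch itself disappears or only the server does.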

    Regards

    +
    0 Votes
    CorbyBennett

    The answer is a broadcast storm, essentially a feedback loop of broadcast frames such as ARP requests. It is usually caused by a network loop, for example where a switch is plugged into itself, but it can also be caused by a bad switch or NIC. Using fiber between buildings is a good idea, but in this case it would not have helped because the problem was at Layer 2.
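
    A toy model of why a loop never settles: each switch floods a broadcast out of every port except the one it arrived on, and nothing ever removes the frame. The switch names and wiring below are hypothetical (not the original poster's actual layout), and no spanning tree or storm control is modelled:

```python
from collections import defaultdict


def simulate_flood(links, source, ticks):
    """Count broadcast frames in flight on each tick.

    Each switch floods a received broadcast out of every port except the
    one it arrived on. No spanning tree or storm control is modelled.
    """
    neighbors = defaultdict(list)
    for a, b in links:
        neighbors[a].append(b)
        neighbors[b].append(a)

    # A frame in flight is (sender, receiver); the source floods all ports.
    in_flight = [(source, n) for n in neighbors[source]]
    history = []
    for _ in range(ticks):
        history.append(len(in_flight))
        next_flight = []
        for sender, receiver in in_flight:
            for n in neighbors[receiver]:
                if n != sender:
                    next_flight.append((receiver, n))
        in_flight = next_flight
    return history


# Hypothetical wiring: three building 2 switches chained to the main switch.
chain = [("main", "b2_a"), ("b2_a", "b2_b"), ("b2_b", "b2_c")]
# Same wiring plus one extra cable that closes a loop in building 2.
looped = chain + [("b2_c", "b2_a")]

print("no loop:", simulate_flood(chain, "b2_c", 8))
print("loop:   ", simulate_flood(looped, "b2_c", 8))
```

    In the chained layout the single broadcast dies out once it reaches the main switch. With the extra cable the count never drops to zero, and the main switch keeps receiving copies of the same frame, which is how a fault confined to building 2 can still take down the main building.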

    +
    0 Votes

    This Q&A is over 2 years old...the participants are gone, the room is
    empty...let the zombies sleep.
