The loss of power and physical infrastructure during a large-scale man-made or natural disaster will, for obvious reasons, hamper recovery communications. Another factor to consider is the spike in voice and data traffic before the predicted start of a cataclysmic event.
Hurricane Harvey is a good example. In his LinkedIn post Digital Impact of Hurricane Harvey, David Jones, Sales Engineering Director, APM Evangelist at Dynatrace, tracked internet performance in the Houston area before, during, and after the storm (Figure A).
“We can see a sharp increase in traffic prior to Harvey making landfall (August 25th) followed by a decrease in traffic as the flooding worsened over the weekend,” writes Jones.
Besides traffic bottlenecks occurring beforehand, there are other complications created by competition for limited bandwidth during and immediately after the hurricane, including:
- Rescue teams trying to do their job;
- Citizens in trouble calling for help; and
- People checking on the safety of family members and friends.
Time is more than money
Erik Golen, Jennifer Schneider, and Nirmala Shenoy, a team of professors from Rochester Institute of Technology (RIT), are concerned that time lost to degraded communications costs more than money in disaster scenarios: it can mean the difference between life and death. Schneider, the Eugene H. Fram Chair in Applied Critical Thinking at RIT, told Scott Bureau, as reported in this RIT press release, that Hurricane Irene and Hurricane Sandy are good examples. “Sharing data on the internet during an emergency is like trying to drive a jet down the street at rush hour,” explains Schneider. “A lot of the critical information is too big and data heavy for the existing internet pipeline.”
As to competition between emergency responders and civilians, the researchers agree with Jones. “Emergency responders may need to share mapping images, 911 requests and deployments, cell-phone location data, video chats, voice recordings, and social media communications,” writes Bureau. “When that information has to compete with civilians tweeting about the disaster and messaging loved ones, the network is taking on more than it can handle.”
Multi Node Label Routing (MNLR)
To combat bandwidth bottlenecks, Shenoy, along with co-principal investigator Erik Golen and five graduate students, created the Multi Node Label Routing protocol (video). “It is designed with an immediate failover mechanism–meaning if a link or node fails, it uses an alternate path right away, as soon as the failure is detected,” explains Bureau. “The new protocol runs below the existing internet protocols, allowing normal internet traffic to run without disruption.”
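The "immediate failover" idea can be sketched in a few lines of Python. This is only an illustration of local failover in general, not the RIT team's implementation: the `Router` class, its `uplinks` preference list, and the labels used are all hypothetical. The point it shows is that a router holding an alternate neighbor can switch to it the moment a link is marked down, without waiting for the rest of the network.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Router:
    """Hypothetical router that keeps alternate next hops ready (not MNLR's real data model)."""
    label: str
    uplinks: List[str]                      # neighbor labels in preference order (assumed)
    down: Set[str] = field(default_factory=set)

    def link_failed(self, neighbor: str) -> None:
        # Record the failure locally; no network-wide announcement is modeled here.
        self.down.add(neighbor)

    def link_restored(self, neighbor: str) -> None:
        self.down.discard(neighbor)

    def next_hop(self) -> Optional[str]:
        # Return the first preferred neighbor whose link is still up.
        for neighbor in self.uplinks:
            if neighbor not in self.down:
                return neighbor
        return None  # every known path has failed


router = Router(label="1.2.3", uplinks=["1.2", "1.4"])
before = router.next_hop()      # preferred neighbor
router.link_failed("1.2")
after = router.next_hop()       # alternate neighbor, chosen immediately
```

In this sketch the switch costs one set lookup on the affected router, which is the property the quote describes; how MNLR actually stores and selects alternate paths is not covered in the press release.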
The researchers mention that MNLR continues to function normally when currently popular routing protocols such as Border Gateway Protocol (BGP) or Open Shortest Path First (OSPF) are overwhelmed. The main difference is that MNLR discovers routes based on labels, which carry structural and relational connectivity information, assigned to each router.
“The new protocol is actually of very low complexity compared to the current routing protocols, including BGP and OSPF,” explains Shenoy. “This is because the labels and protocols leverage the connectivity relationship that exists among routers, which are already sitting on a nice structure.”
Put simply, if a router–using existing routing protocols–detects a failure in the network, that router then informs every other router of the breakdown. This process, according to the researchers, is like having a lone police officer direct traffic during rush hour at a busy New York City intersection.
“Rather than requiring every router to keep track of the best directions to every other one, we divide possible routes for internet traffic into hierarchies,” write Golen, Schneider, and Shenoy in an article for The Conversation. “These mirror existing emergency response plans: An individual responder sends information to a local commander, who combines several responders’ data and passes the data on to regional managers who assemble a wider picture that they then pass on to state or federal response coordinators.”
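To make the hierarchy idea concrete, here is a minimal Python sketch of routing by label alone. It assumes a hypothetical dotted label format in which each label extends its parent's (for example, "1" for a top-tier coordinator, "1.2" for a regional manager, "1.2.3" for a local commander); MNLR's actual label format may differ. It also assumes top-tier routers are directly connected to one another. Under those assumptions, a route can be computed from the two labels with no routing table: climb to the nearest shared ancestor, then descend.

```python
from typing import List, Tuple


def parse(label: str) -> Tuple[int, ...]:
    """Turn a dotted label such as '1.2.3' into (1, 2, 3)."""
    return tuple(int(part) for part in label.split("."))


def label_path(src: str, dst: str) -> List[str]:
    """Hops from src to dst through the hierarchy (illustrative, not the MNLR algorithm)."""
    s, d = parse(src), parse(dst)

    # Length of the shared label prefix, i.e. the depth of the common ancestor.
    common = 0
    for a, b in zip(s, d):
        if a != b:
            break
        common += 1

    # Never climb above the top tier; top-tier routers are assumed to be peers.
    pivot = max(common, 1)
    climb = [s[:n] for n in range(len(s), pivot - 1, -1)]
    descend = [d[:n] for n in range(pivot, len(d) + 1)]
    if climb[-1] == descend[0]:
        descend = descend[1:]  # do not list the shared ancestor twice

    return [".".join(str(x) for x in hop) for hop in climb + descend]


route = label_path("1.2.3", "1.4.1")
# route == ['1.2.3', '1.2', '1', '1.4', '1.4.1']
```

Each router in this sketch needs to know only its own label and its immediate neighbors, which is the contrast with protocols that require every router to track the best path to every other one.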
Shenoy believes the main issue with current routing protocols is their age. They were invented several decades ago and thus were not designed for the high-speed networks that make up today's internet. “If you receive an email five minutes late, that is still acceptable,” says Shenoy. “But in an emergency, the implicit impact of these serious network problems truly come to light.”
The team has tested the new routing protocol using a 27-node network mimicking an incident control center, a 911 call center, and an office of emergency management. “While BGP took about 150 seconds to recover from a link failure, MNLR recovered in less than 30 seconds,” writes Bureau. He adds that MNLR also transferred information faster and with greater reliability.
“While BGP has a recommended default keep-alive message interval of 60 seconds, MNLR is not so constrained,” writes Bureau, quoting Shenoy. “In fact, MNLR can detect failure with one missing keep alive message as the failure or topology change information will be flooded internet wide, which can be expected in certain cases with BGP.”
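A rough back-of-the-envelope calculation shows why the keep-alive interval and the number of tolerated misses matter. The Python sketch below is a simplification: `detection_delay` is a hypothetical helper, and it ignores where within an interval the failure happens. The 60-second interval comes from the quote above; the factor of three reflects the common BGP convention of setting the hold time to three times the keep-alive interval. No MNLR interval is given in the press release, so none is assumed here.

```python
def detection_delay(keepalive_interval_s: float, missed_before_down: int) -> float:
    """Approximate seconds until a silent neighbor is declared down (simplified model)."""
    if keepalive_interval_s <= 0 or missed_before_down < 1:
        raise ValueError("interval must be positive and at least one miss is required")
    return keepalive_interval_s * missed_before_down


# Conventional setup: 60 s keep-alives, neighbor declared down after three misses.
three_misses = detection_delay(60, 3)   # 180

# Same interval, but acting on the first missing keep-alive.
one_miss = detection_delay(60, 1)       # 60
```

These figures are illustrative and should not be read as the measurements reported from the 27-node test; they only show how reacting to a single missed message shortens detection time for any given interval.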
This is serious
This year’s hurricane season underscores the seriousness of the challenge. “The team is continuing to develop and enhance the MNLR protocol,” writes Bureau. “In the future, the team plans to test and implement the protocol in emergency situations.”