Understanding TCP Incast Throughput Collapse in Datacenter Networks

Date Added: Aug 2009
Format: PDF

Most businesses rely on Internet datacenters to support a multitude of applications and services, and many of them use the Transmission Control Protocol (TCP) for communication between nodes. TCP is a mature technology that has met the communication needs of most applications. Over time, however, it became clear that TCP is not well matched to the unique workloads, scale, and environment of the Internet datacenter.

This paper discusses TCP Throughput Collapse, also known as Incast: a pathological response that a particular communication pattern brings out in popular implementations of TCP. The pattern arises when a receiver sends data requests to multiple senders. As the number of concurrent senders increases, the perceived application-level throughput at the receiver collapses. The goal of studying the problem is to ensure that link capacity is fully utilized.

The paper focuses on understanding the dynamics of Incast. It uses empirical data to reason about the dynamic system of simultaneously communicating TCP entities. The experiments were conducted on a configurable network testbed that allowed fine-grained control over the end hosts and the network. The paper also proposes a logical model that can be used to account for the observed Incast symptoms, identify contributory factors, and explore the efficacy of solutions proposed across the industry.
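To make the collapse described above concrete, here is a minimal sketch of a toy model. It is not the model proposed in the paper. It assumes that all senders burst their initial window into one switch port at the same instant, and that any overflow of the port buffer costs the whole request one TCP retransmission timeout. The function name and every parameter value (block size, link speed, buffer size, window, timeout) are hypothetical choices made for illustration only.

```python
def incast_goodput_mbps(num_senders, block_bytes=256 * 1024, link_mbps=1000.0,
                        buffer_pkts=64, window_pkts=4, rto_s=0.2):
    """Toy estimate of receiver goodput (Mbit/s) for one synchronized block fetch."""
    # Time to move the whole block over the bottleneck link with no loss.
    ideal_s = block_bytes * 8 / (link_mbps * 1e6)
    # Assumption: every sender bursts its initial window into the same port at once.
    burst_pkts = num_senders * window_pkts
    # Assumption: if the burst overflows the buffer, some flow must wait for a
    # retransmission timeout, and the receiver stalls until that flow recovers.
    stall_s = rto_s if burst_pkts > buffer_pkts else 0.0
    return block_bytes * 8 / (ideal_s + stall_s) / 1e6


if __name__ == "__main__":
    for n in (2, 8, 16, 17, 32):
        print(f"{n:>3} senders: {incast_goodput_mbps(n):8.1f} Mbit/s")
```

In this sketch the goodput stays at the link rate while the combined burst fits in the buffer and then drops sharply once it does not, because the timeout is far longer than the ideal transfer time. Real TCP behavior is more gradual and depends on factors this sketch ignores.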