On the Statistical Characterization of Flows in Internet Traffic With Application to Sampling

Executive Summary

This paper develops a new method for estimating statistical characteristics of TCP flows in the Internet. For this purpose, a new set of random variables, referred to as observables, is defined. When dealing with sampled traffic, these observables can easily be computed from the sampled data. By adopting a convenient mouse/elephant dichotomy that itself depends on the traffic, it is shown how these variables give a reliable statistical representation of the number of packets transmitted by large flows during successive time intervals of appropriate duration. A mathematical framework is developed to estimate the accuracy of the method. As an application, it is shown how the number of large TCP flows can be estimated when only sampled traffic is available.
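
To make the setting concrete, the following is a minimal sketch of the kind of computation the summary refers to: packets are sampled independently with a fixed probability, sampled packets are counted per flow in successive time intervals, and the counts of flows that look large are scaled by the inverse sampling rate. It is an illustration under stated assumptions, not the method of the paper: the function names, the synthetic trace, and the fixed threshold `min_sampled` (a crude stand-in for the traffic-dependent mouse/elephant dichotomy) are all hypothetical.

```python
import random
from collections import defaultdict


def bernoulli_sample(packets, rate, rng):
    """Keep each packet independently with probability `rate`."""
    return [pkt for pkt in packets if rng.random() < rate]


def sampled_counts(sampled, window_ms):
    """Count sampled packets per (window index, flow id)."""
    counts = defaultdict(int)
    for timestamp_ms, flow_id in sampled:
        counts[(timestamp_ms // window_ms, flow_id)] += 1
    return counts


def estimate_large_flow_packets(counts, rate, min_sampled):
    """Per window, scale up sampled packets of flows seen >= min_sampled times.

    ASSUMPTION: a fixed threshold on sampled packets is used here as a
    hypothetical mouse/elephant rule; the paper's dichotomy is traffic
    dependent and is not reproduced.
    """
    totals = defaultdict(float)
    for (window, _flow_id), n in counts.items():
        if n >= min_sampled:
            totals[window] += n / rate  # inverse-rate scaling of the sampled count
    return dict(totals)


def synthetic_trace(rng):
    """Hypothetical trace: one elephant (1 packet per ms for 10 s) and 200 mice."""
    packets = [(t, "elephant") for t in range(10_000)]
    for m in range(200):
        packets += [(rng.randrange(10_000), f"mouse-{m}") for _ in range(5)]
    return packets


rng = random.Random(0)
trace = synthetic_trace(rng)
sampled = bernoulli_sample(trace, rate=0.1, rng=rng)
estimates = estimate_large_flow_packets(
    sampled_counts(sampled, window_ms=5_000), rate=0.1, min_sampled=5
)
for window in sorted(estimates):
    # The elephant sends exactly 5000 packets in each 5-second window.
    print(window, round(estimates[window]))
```

In this sketch the elephant sends 5000 packets in each window, so the printed estimates should lie close to that value. Dividing a sampled count by the sampling rate is unbiased for a single flow taken in isolation, but selecting flows by a threshold on the sampled count introduces a bias for flows near the threshold, which is one reason a more careful statistical treatment is needed.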

