A Data Throughput Prediction and Optimization Service for Widely Distributed Many-Task Computing
The authors present a data throughput prediction and optimization service for many-task computing in widely distributed environments. The service uses multiple parallel TCP streams to improve the end-to-end throughput of data transfers. A novel mathematical model is developed to determine the number of parallel streams required to achieve the best network performance. The model can predict the optimal number of parallel streams from as few as three prediction points. The authors implement the service in the Stork Data Scheduler, where the prediction points can be obtained using Iperf and GridFTP samplings.
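To make the idea of predicting from three points concrete, the following is a minimal sketch, not the authors' implementation. The abstract does not state the form of the model, so the sketch assumes, purely for illustration, a throughput curve Th(n) = n / sqrt(a*n^2 + b*n + c), where n is the number of parallel streams. Under that assumption, three sampled (n, throughput) pairs determine a, b, and c exactly, and setting the derivative to zero gives a peak at n = -2c/b. The function names and the sample values are hypothetical.

```python
import math


def fit_stream_model(samples):
    """Fit (a, b, c) of the ASSUMED curve Th(n) = n / sqrt(a*n^2 + b*n + c).

    `samples` is a list of exactly three (stream_count, throughput) pairs,
    e.g. measurements taken with a sampling tool at three stream counts.
    """
    if len(samples) != 3:
        raise ValueError("exactly three samples are required")

    # Linearize the assumed model: n^2 / Th^2 = a*n^2 + b*n + c.
    rows = [(float(n * n), float(n), 1.0) for n, _ in samples]
    rhs = [(n * n) / (th * th) for n, th in samples]

    def det3(m):
        # Determinant of a 3x3 matrix by cofactor expansion.
        return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
                - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
                + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

    d = det3(rows)
    if d == 0:
        raise ValueError("the three stream counts must be distinct")

    # Cramer's rule: replace one column at a time with the right-hand side.
    coeffs = []
    for col in range(3):
        m = [list(r) for r in rows]
        for i in range(3):
            m[i][col] = rhs[i]
        coeffs.append(det3(m) / d)
    return tuple(coeffs)


def predict_throughput(n, a, b, c):
    """Throughput predicted by the assumed model for n parallel streams."""
    return n / math.sqrt(a * n * n + b * n + c)


def optimal_streams(a, b, c):
    """Stream count that maximizes the assumed model.

    dTh/dn = 0 reduces to b*n/2 + c = 0, so n = -2c/b. This is a maximum
    at a positive n only when b < 0 and c > 0.
    """
    if b >= 0 or c <= 0:
        raise ValueError("assumed model has no interior maximum for these coefficients")
    return -2.0 * c / b


if __name__ == "__main__":
    # Hypothetical measurements at 1, 4, and 16 streams (arbitrary units).
    samples = [(1, 2.4), (4, 6.0), (16, 5.5)]
    a, b, c = fit_stream_model(samples)
    n_opt = optimal_streams(a, b, c)
    print("estimated optimal stream count:", n_opt)
    print("predicted throughput there:", predict_throughput(n_opt, a, b, c))
```

In practice the real-valued optimum would be rounded to a neighboring integer, and the predicted throughput at the two nearest stream counts compared before choosing one.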