Integrating Scheduling and Replication in Data Grids With Performance Guarantee
A Data Grid consists of geographically distributed computing and storage resources that are used in large-scale scientific applications. Job scheduling and data replication are two well-known techniques for improving Data Grid performance. There has been extensive research on integrating the two techniques to improve performance further. However, most current work is heuristic-based and offers no performance guarantees. In this paper, the authors propose to integrate data replication and job scheduling into one framework to minimize the total job execution time in a Data Grid. They refer to this problem as the Integrated Scheduling and Replication Problem.