A Framework for Reliable and Efficient Data Placement in Distributed Computing Systems
Data placement is an essential part of today's distributed applications, since moving data close to the application has many benefits. The growing data requirements of both scientific and commercial applications, and collaborative access to these data, make it even more important. In the current approach, data placement is regarded as a side effect of computation. The goal is to make data placement a first-class citizen in distributed computing systems, just like computational jobs: data placement jobs will be queued, scheduled, monitored, managed, and even checkpointed.