All-Pairs: An Abstraction for Data-Intensive Cloud Computing
Although modern parallel and distributed computing systems provide easy access to large amounts of computing power, non-expert users often struggle to harness these systems effectively. A large workload that a naive user composes in the seemingly obvious way may inadvertently abuse shared resources and perform very poorly. To address this problem, the paper proposes that production systems should provide end users with high-level abstractions that allow data-intensive workloads to be expressed easily and executed efficiently. The paper presents one example of such an abstraction, All-Pairs, that fits the needs of several data-intensive scientific applications.
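To make the abstraction concrete, the following is a minimal sequential sketch. It assumes the usual reading of an all-pairs computation: every element of one set is compared with every element of another by a user-supplied function, and the results form a matrix. The names all_pairs, compare, and shared_chars are hypothetical and chosen only for illustration. The sketch shows the interface a user might see, not how a production system would schedule the work.

```python
from typing import Callable, List, Sequence, TypeVar

A = TypeVar("A")
B = TypeVar("B")
R = TypeVar("R")


def all_pairs(
    set_a: Sequence[A],
    set_b: Sequence[B],
    compare: Callable[[A, B], R],
) -> List[List[R]]:
    """Return a matrix m where m[i][j] == compare(set_a[i], set_b[j]).

    Hypothetical helper: a plain nested loop that only pins down the
    meaning of the result, with no distribution or data staging.
    """
    return [[compare(a, b) for b in set_b] for a in set_a]


def shared_chars(x: str, y: str) -> int:
    """Toy comparison function: count distinct characters common to both strings."""
    return len(set(x) & set(y))


# Compare two small, made-up sets of strings.
matrix = all_pairs(["abc", "bcd"], ["cde", "abc", "xyz"], shared_chars)
print(matrix)  # [[1, 3, 0], [2, 2, 0]]
```

Because the user states only the two sets and the comparison function, decisions such as where to place the data and how to partition the comparisons are left to the system, which is the kind of separation the abstract argues for.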