Operating System Support for Space Allocation in Grid Storage Systems
Shared temporary storage space is often the constraining resource for clusters that serve as execution nodes in wide-area distributed systems. At least one large national-scale computing grid has reported a failure rate as high as thirty percent of submitted jobs, often because shared storage space was accidentally filled. Previous systems have attacked this problem by adding space allocation to the distributed system interface. However, these allocations are not enforced at the filesystem level, so unexpected or unaccounted-for uses of storage can still cause the system to fail. This paper describes an abstract model of space allocation in the filesystem and explores three implementations of the model: a user-level library, a recursive loopback filesystem, and a modified kernel filesystem.