Improving Data Availability for Better Access Performance: A Study on Caching Scientific Data on Distributed Desktop Workstations

Executive Summary

Client-side data caching is an effective mechanism for storing and analyzing rapidly growing scientific data, which motivates distributed, client-side caches built from unreliable desktop storage contributions. Such caches offer several desirable properties, including performance impedance matching, improved space utilization, and high parallel I/O bandwidth. In this context, the authors face two key challenges: the finite amount of contributed cache space is stretched by ever-increasing scientific dataset sizes, and the transient nature of volunteered storage nodes reduces data availability.

  • Format: PDF
  • Size: 442.9 KB