Date Added: Nov 2012
Archival storage systems for scientific data have been growing in both size and relevance over the past two decades, yet researchers and system designers alike must rely on limited and obsolete knowledge to guide archival management and design. To address this issue, the authors analyzed three years of file-level activity from the NCAR mass storage system, providing valuable insight into a large-scale scientific archive with over 1600 users, tens of millions of files, and petabytes of data. Their examination of system usage showed that, while a subset of users was responsible for most of the activity, this activity was widely distributed at the file level. They also showed that physically grouping files and directories on media can improve archival storage system performance.