Uncovering Errors: The Cost of Detecting Silent Data Corruption

Date Added: Nov 2009
Format: PDF

Data integrity is pivotal to the usefulness of any storage system: it ensures that stored data remains free of unintended modification for as long as it resides on the storage medium. Hash functions such as cyclic redundancy checks (CRCs) or checksums are frequently used to detect corruption that occurs while data is in transit to permanent storage or at rest there. Without these checks, such errors usually go undetected and unreported to the system, and hence are never communicated to the application. They are referred to as "silent data corruption." When an application reads corrupted or malformed data, the result can be incorrect output or a failed system. Storage arrays in leadership computing facilities comprise several thousand drives, which increases the likelihood of such failures.
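
As a rough illustration of the detection idea described above, the following minimal sketch stores a CRC-32 alongside a block of data and recomputes it on read. It is not the method evaluated in the paper: the function names (`store`, `verify`, `flip_bit`) are hypothetical, and the use of CRC-32 from Python's standard `zlib` module is an assumption chosen only for convenience.

```python
import zlib
from typing import Tuple


def store(payload: bytes) -> Tuple[bytes, int]:
    """Return the payload together with its CRC-32, as if written to disk."""
    return payload, zlib.crc32(payload)


def verify(payload: bytes, expected_crc: int) -> bool:
    """Recompute the CRC-32 on read and compare it with the stored value."""
    return zlib.crc32(payload) == expected_crc


def flip_bit(payload: bytes, bit_index: int) -> bytes:
    """Simulate silent corruption by inverting one bit of the payload."""
    corrupted = bytearray(payload)
    corrupted[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(corrupted)


if __name__ == "__main__":
    data, crc = store(b"simulation output block 0001")
    print("clean read passes check:", verify(data, crc))
    damaged = flip_bit(data, 13)
    print("corrupted read passes check:", verify(damaged, crc))
```

Without the `verify` step, the damaged block would be handed to the application unchanged, which is the silent case the abstract describes. The price of detection is one extra pass over the data on every write and every read.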