Helical Scan Reliability: Lessons Learned from the Exabyte 8500
Abstract: The workloads of large database systems such as the EOSDIS global information system require not only high performance but also reliability over repeated reads from tape. To evaluate tape reliability, we performed extensive measurements of tape error behavior and discovered several facts that defy conventional wisdom. First, we found that Exatape 8mm helical scan tapes can sustain many more passes than is usually thought; most tapes had few errors for up to 10,000 passes, even though the tapes are rated for only 1,500 passes. Second, we found that hard errors (errors that cannot be corrected by repeated retries at the same location) are surprisingly transient; virtually all disappeared on subsequent passes over the tape. Third, we observed that, while cleaning the tape heads is essential to prevent catastrophic drive failure, cleaning itself may in some cases cause hard errors. This paper gives a full description of our experiments and results.