UC BERKELEY
EECS technical reports


CSD-05-1404.pdf
CSD-05-1404.ps

Long-Term Data Maintenance in Wide-Area Storage Systems: A Quantitative Approach

Authors:
Weatherspoon, Hakim
Chun, Byung-Gon
So, Chiu Wah
Kubiatowicz, John
Technical Report Identifier: CSD-05-1404
July 2005

Abstract: Maintaining data replication levels is a fundamental process in wide-area storage systems: to avoid data loss, new replicas must be created as storage nodes fail permanently. Many failures in the wide area are transient, however; the node returns with its data intact. Given a goal of minimizing the number of replicas created to maintain a desired replication level, creating replicas in response to transient failures is wasted effort. In this paper, we present a principled way of minimizing costs while maintaining a desired level of data availability. The design choices include the type of data redundancy, the number of replicas, the amount of extra redundancy, and data placement. We demonstrate via trace-driven simulation that existing storage systems can realize significant gains in maintenance efficiency with the correct choice of strategies and parameters. For example, we show that DHash can reduce its costs by a factor of 31 while maintaining the same desired data availability.
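
To make the role of extra redundancy concrete, the following is a minimal sketch of a threshold-triggered repair loop. It is not the report's algorithm or simulator: the function name `simulate`, the event kinds (`"transient"`, `"return"`, `"permanent"`), the node naming, and the rule "repair only when reachable replicas drop below `threshold`, then refill to `threshold + extra`" are all assumptions chosen for illustration.

```python
from itertools import count


def simulate(events, threshold, extra=0):
    """Count replicas created while replaying a failure trace.

    Hypothetical policy: repair only when fewer than `threshold` replicas
    are reachable, then refill to `threshold + extra`.
    """
    target = threshold + extra
    fresh = count()  # assumed unlimited supply of new node ids
    up = {f"n{next(fresh)}" for _ in range(target)}  # reachable replica holders
    down = set()  # holders that are unreachable but still have their data
    created = 0

    for kind, node in events:
        if kind == "transient" and node in up:
            up.remove(node)
            down.add(node)
        elif kind == "return" and node in down:
            down.remove(node)
            up.add(node)
        elif kind == "permanent":
            # The replica is lost whether or not the node was reachable.
            up.discard(node)
            down.discard(node)

        if len(up) < threshold:
            need = target - len(up)
            for _ in range(need):
                up.add(f"n{next(fresh)}")
            created += need

    return created


trace = [
    ("transient", "n0"),
    ("return", "n0"),
    ("transient", "n1"),
    ("return", "n1"),
    ("permanent", "n2"),
]
print(simulate(trace, threshold=3, extra=0))  # prints 1
print(simulate(trace, threshold=3, extra=1))  # prints 0
```

On this toy trace, the policy with no extra redundancy creates one replica in response to a transient failure, while one extra replica up front absorbs every failure in the trace without triggering repair. The numbers are properties of this made-up trace only and say nothing about the gains reported for real systems.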