{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2024,9,21]],"date-time":"2024-09-21T11:54:20Z","timestamp":1726919660340},"reference-count":32,"publisher":"Association for Computing Machinery (ACM)","issue":"5","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Proc. VLDB Endow."],"published-print":{"date-parts":[[2013,3]]},"abstract":"Distributed storage systems for large clusters typically use replication to provide reliability. Recently, erasure codes have been used to reduce the large storage overhead of three-replicated systems. Reed-Solomon codes are the standard design choice and their high repair cost is often considered an unavoidable price to pay for high storage efficiency and high reliability.<\/jats:p>\n This paper shows how to overcome this limitation. We present a novel family of erasure codes that are efficiently repairable and offer higher reliability compared to Reed-Solomon codes. We show analytically that our codes are optimal on a recently identified tradeoff between locality and minimum distance.<\/jats:p>\n We implement our new codes in Hadoop HDFS and compare to a currently deployed HDFS module that uses Reed-Solomon codes. Our modified HDFS implementation shows a reduction of approximately 2\u00d7 on the repair disk I\/O and repair network traffic. The disadvantage of the new coding scheme is that it requires 14% more storage compared to Reed-Solomon codes, an overhead shown to be information theoretically optimal to obtain locality. Because the new codes repair failures faster, this provides higher reliability, which is orders of magnitude higher compared to replication.<\/jats:p>","DOI":"10.14778\/2535573.2488339","type":"journal-article","created":{"date-parts":[[2014,6,24]],"date-time":"2014-06-24T12:17:57Z","timestamp":1403612277000},"page":"325-336","source":"Crossref","is-referenced-by-count":514,"title":["XORing elephants"],"prefix":"10.14778","volume":"6","author":[{"given":"Maheswaran","family":"Sathiamoorthy","sequence":"first","affiliation":[{"name":"University of Southern, California"}]},{"given":"Megasthenis","family":"Asteris","sequence":"additional","affiliation":[{"name":"University of Southern, California"}]},{"given":"Dimitris","family":"Papailiopoulos","sequence":"additional","affiliation":[{"name":"University of Texas at Austin"}]},{"given":"Alexandros G.","family":"Dimakis","sequence":"additional","affiliation":[{"name":"University of Texas at Austin"}]},{"given":"Ramkumar","family":"Vadali","sequence":"additional","affiliation":[{"name":"Facebook"}]},{"given":"Scott","family":"Chen","sequence":"additional","affiliation":[{"name":"Facebook"}]},{"given":"Dhruba","family":"Borthakur","sequence":"additional","affiliation":[{"name":"Facebook"}]}],"member":"320","published-online":{"date-parts":[[2013,3]]},"reference":[{"key":"e_1_2_1_1_1","unstructured":"Amazon EC2. http:\/\/aws.amazon.com\/ec2\/. Amazon EC2. http:\/\/aws.amazon.com\/ec2\/."},{"key":"e_1_2_1_2_1","unstructured":"HDFS-RAID wiki. http:\/\/wiki.apache.org\/hadoop\/HDFS-RAID. HDFS-RAID wiki. http:\/\/wiki.apache.org\/hadoop\/HDFS-RAID."},{"key":"e_1_2_1_3_1","article-title":"Asymptotic interference alignment for optimal repair of mds codes in distributed storage. Submitted to","author":"Cadambe V.","year":"2011","journal-title":"IEEE Transactions on Information Theory"},{"key":"e_1_2_1_4_1","doi-asserted-by":"crossref","first-page":"143","DOI":"10.1145\/2043556.2043571","volume-title":"Proceedings of the Twenty-Third ACM Symposium on Operating Systems Principles","author":"Calder B.","year":"2011"},{"key":"e_1_2_1_5_1","first-page":"98","volume-title":"SIGCOMM-Computer Communication Review","author":"Chowdhury M.","year":"2011"},{"key":"e_1_2_1_6_1","doi-asserted-by":"crossref","first-page":"4539","DOI":"10.1109\/TIT.2010.2054295","article-title":"Network coding for distributed storage systems","author":"Dimakis A.","year":"2010","journal-title":"IEEE Transactions on Information Theory, pages"},{"issue":"3","key":"e_1_2_1_7_1","doi-asserted-by":"crossref","first-page":"476","DOI":"10.1109\/JPROC.2010.2096170","article-title":"A survey on network codes for distributed storage","volume":"99","author":"Dimakis A.","year":"2011","journal-title":"Proceedings of the IEEE"},{"key":"e_1_2_1_8_1","doi-asserted-by":"crossref","first-page":"6","DOI":"10.1145\/1713072.1713075","volume-title":"Proceedings of the 4th Annual Workshop on Petascale Data Storage","author":"Fan B.","year":"2009"},{"key":"e_1_2_1_9_1","first-page":"1","volume-title":"Proceedings of the 9th USENIX conference on Operating systems design and implementation","author":"Ford D.","year":"2010"},{"key":"e_1_2_1_10_1","volume-title":"On the locality of codeword symbols. CoRR, abs\/1106.3625","author":"Gopalan P.","year":"2011"},{"key":"e_1_2_1_11_1","volume-title":"University of California","author":"Greenan K.","year":"2009"},{"key":"e_1_2_1_12_1","volume-title":"HotStorage","author":"Greenan K.","year":"2010"},{"key":"e_1_2_1_13_1","first-page":"68","volume-title":"Computer Communications Review (CCR)","author":"Greenberg A.","year":"2009"},{"key":"e_1_2_1_14_1","doi-asserted-by":"crossref","first-page":"51","DOI":"10.1145\/1594977.1592576","article-title":"VL2: A scalable and flexible data center network","volume":"39","author":"Greenberg A.","year":"2009","journal-title":"SIGCOMM Comput. Commun. Rev."},{"key":"e_1_2_1_15_1","doi-asserted-by":"crossref","first-page":"75","DOI":"10.1145\/1402946.1402968","article-title":"DCell: a scalable and fault-tolerant network structure for data centers","volume":"38","author":"Guo C.","year":"2008","journal-title":"SIGCOMM Comput. Commun. Rev."},{"key":"e_1_2_1_16_1","article-title":"A random linear network coding approach to multicast","volume":"4413","author":"Ho T.","year":"2006","journal-title":"IEEE Transactions on Information Theory, pages"},{"key":"e_1_2_1_17_1","volume-title":"NCA","author":"Huang C.","year":"2007"},{"issue":"6","key":"e_1_2_1_18_1","first-page":"1973","article-title":"Polynomial time algorithms for multicast network code construction. Information Theory","volume":"51","author":"Jaggi S.","year":"2005","journal-title":"IEEE Transactions on"},{"key":"e_1_2_1_19_1","volume-title":"FAST","author":"Khan O.","year":"2012"},{"key":"e_1_2_1_20_1","volume-title":"HotStorage'11: 3rd Workshop on Hot Topics in Storage and File Systems","author":"Khan O.","year":"2011"},{"key":"e_1_2_1_21_1","volume-title":"Write off-loading: Practical power management for enterprise storage. ACM Transactions on Storage (TOS), 4(3):10","author":"Narayanan D.","year":"2008"},{"key":"e_1_2_1_22_1","doi-asserted-by":"crossref","first-page":"1215","DOI":"10.1109\/INFCOM.2011.5934901","volume-title":"INFOCOM, 2011 Proceedings IEEE","author":"Oggier F.","year":"2011"},{"key":"e_1_2_1_23_1","volume-title":"ISIT","author":"Papailiopoulos D.","year":"2012"},{"key":"e_1_2_1_24_1","volume-title":"Simple regenerating codes: Network coding for cloud storage. Arxiv preprint arXiv:1109.0264","author":"Papailiopoulos D.","year":"2011"},{"issue":"8","key":"e_1_2_1_25_1","first-page":"5227","article-title":"Optimal exact-regenerating codes for distributed storage at the msr and mbr points via a product-matrix construction. Information Theory","volume":"57","author":"Rashmi K.","year":"2011","journal-title":"IEEE Transactions on"},{"issue":"8","key":"e_1_2_1_26_1","first-page":"5227","article-title":"Optimal exact-regenerating codes for distributed storage at the msr and mbr points via a product-matrix construction. Information Theory","volume":"57","author":"Rashmi K.","year":"2011","journal-title":"IEEE Transactions on"},{"key":"e_1_2_1_27_1","volume-title":"Journal of the SIAM","author":"Reed I.","year":"1960"},{"key":"e_1_2_1_28_1","doi-asserted-by":"crossref","first-page":"226","DOI":"10.1007\/11558989_21","volume-title":"Peer-to-Peer Systems IV","author":"Rodrigues R.","year":"2005"},{"issue":"4","key":"e_1_2_1_30_1","first-page":"2134","article-title":"Interference alignment in regenerating codes for distributed storage: Necessity and code constructions. Information Theory","volume":"58","author":"Shah N.","year":"2012","journal-title":"IEEE Transactions on"},{"key":"e_1_2_1_31_1","volume-title":"MDS array codes with optimal rebuilding. CoRR, abs\/1103.3737","author":"Tamo I.","year":"2011"},{"key":"e_1_2_1_32_1","volume-title":"Reed-solomon codes and their applications","author":"Wicker S. B.","year":"1994"},{"key":"e_1_2_1_33_1","first-page":"146","volume-title":"MSST","author":"Xin Q.","year":"2003"}],"container-title":["Proceedings of the VLDB Endowment"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.14778\/2535573.2488339","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,12,28]],"date-time":"2022-12-28T11:16:30Z","timestamp":1672226190000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.14778\/2535573.2488339"}},"subtitle":["novel erasure codes for big data"],"short-title":[],"issued":{"date-parts":[[2013,3]]},"references-count":32,"journal-issue":{"issue":"5","published-print":{"date-parts":[[2013,3]]}},"alternative-id":["10.14778\/2535573.2488339"],"URL":"https:\/\/doi.org\/10.14778\/2535573.2488339","relation":{},"ISSN":["2150-8097"],"issn-type":[{"value":"2150-8097","type":"print"}],"subject":[],"published":{"date-parts":[[2013,3]]}}}