Cluster computing

Wednesday, March 16, 2016

We continue reading the paper "Erasure Coding in Windows Azure Storage" by Cheng Huang, Huseyin Simitci, Yikang Xu, Aaron Ogus, Brad Calder, Parikshit Gopalan, Jin Li, and Sergey Yekhanin.
Erasure coding is a data protection method in which data is broken into fragments, expanded and encoded with redundant pieces, and stored across a set of different locations or media. This paper describes erasure coding in Windows Azure Storage (WAS). We were discussing how to construct the coding equations. The paper says there are four coding equations. The coding coefficients are determined by considering three cases: none of the four parities fails, only one of px and py fails, and both px and py fail. The coefficients of the first set are chosen among the field elements whose lower-order 2 bits are zero, so the lower-order 2 bits of these coefficients and of their sums are always zero. The coefficients of the second set are chosen among the elements whose higher-order 2 bits are zero, so the higher-order 2 bits of these coefficients and of their sums are always zero. Hence a coefficient or sum from the first set will never equal one from the second. The advantage of constructing the coding equations this way is that it requires only a very small finite field, which makes the implementation practical. With these coefficients, LRC can decode arbitrary three failures. To check whether a parity is information theoretically decode-able
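The bit-pattern argument above can be checked with a small sketch. It assumes the field elements are 4-bit values and relies on the fact that addition in a binary field is a bitwise XOR. The concrete values in ALPHAS and BETAS and the helper name sums_up_to are hypothetical and chosen only for illustration; they are not the coefficients used in the paper.

```python
from functools import reduce
from itertools import combinations
from operator import xor

# Hypothetical 4-bit coefficients, for illustration only.
ALPHAS = [0b0100, 0b1000, 0b1100]  # lower-order 2 bits are zero
BETAS = [0b0001, 0b0010, 0b0011]   # higher-order 2 bits are zero


def sums_up_to(elements, max_size=2):
    """Return the XOR-sums of all subsets with 1..max_size distinct elements."""
    result = set()
    for size in range(1, max_size + 1):
        for combo in combinations(elements, size):
            result.add(reduce(xor, combo))
    return result


alpha_sums = sums_up_to(ALPHAS)
beta_sums = sums_up_to(BETAS)

# XOR never sets a bit that is clear in every operand, so the masks are preserved.
print(all(s & 0b0011 == 0 for s in alpha_sums))
print(all(s & 0b1100 == 0 for s in beta_sums))
# The two sides therefore share no value.
print(alpha_sums.isdisjoint(beta_sums))
```

The sketch only looks at single coefficients and sums of two distinct coefficients, since those are never zero for distinct elements. Zero is the only 4-bit value with both halves clear, so it is the only value the two sides could ever have in common.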
