Tuesday, March 29, 2016

We continue reading the paper "Erasure coding in Windows Azure storage"  by Cheng Huang, Huseyin Simitci, Yikang Xu, Aaron Ogus, Brad Calder, Parikshit Gopalan, Jin Li, and Sergey Yekhanin.
Erasure coding is a data protection method where data is broken into fragments, expanded and encoded with redundant data pieces and stored across a different set of locations or media. This paper talks about Erasure coding in windows Azure storage WAS. It introduces new codes called the Local Reconstruction Codes.A (k,l,r) LRC divides k data fragments into l local groups. It encodes l local parities, one for each local group and r global parities. We were reviewing the performance of LRC in WAS. We now review related work. Erasure coding has been applied in many large scale distributed storage systems. Its better than simple replication because it provides higher reliability and it requires much lower storage for the same reliability. Even when the distributed systems apply erasure coding, they typically apply only Reed-Solomon. LRC has proved better than that both in terms of efficiency and performance tradeoff. Moreover, LRC reduces costs. For example, node failures trigger reconstruction which is a significant cost. No data is lost on such transient errors and these account for nearly 90% of data center failures. During these periods as well as upgrades, even though there is no background data rebuilding, reads accessing these unavailable nodes are served through reconstruction. Some system techniques like load balancing and prioritization help but LRC explores whether the erasure coding scheme itself can be optimized to reduce repair traffic and improve user I/Os. In addition to these benefits, LRC exploits non-uniform parity degrees, where some parities connect to fewer data nodes than others. This facilitates efficient reconstruction and follows the new trend in the storage industry.  Previous work applied enumeration and heuristic methods to search for parity check erasure codes of small length, Due to exponential search space, the exploration was limited to 3,4 and 5 parities. These could not tolerate arbitrary three failures  where as LRC can.
#codingexercise
List<int> GetReedSolomon(Int[,] coefficients, List<String> message, int n)
{
var Cx = new List<int>();
for (int i = 0; i < n; i++)
{
Cx.Add(getPolynomial(coefficients, ref message));
}
return Cx;
}
String getPolynomial( Int[,] coefficients ref List<string> message)
{
int sum = 0;
for (int i = 0; i <message.count; i++)
sum += get_coefficient(coefficients, i-1)*message[i];
return sum;

}

No comments:

Post a Comment