Cluster computing

Friday, March 18, 2016

We continue reading the paper "Erasure coding in Windows Azure storage" by Cheng Huang, Huseyin Simitci, Yikang Xu, Aaron Ogus, Brad Calder, Parikshit Gopalan, Jin Li, and Sergey Yekhanin.
Erasure coding is a data protection method where data is broken into fragments, expanded and encoded with redundant data pieces and stored across a different set of locations or media. This paper talks about Erasure coding in windows Azure storage WAS. It introduces new codes called the Local Reconstruction Codes. A (k,l,r) LRC divides k data fragments into l local groups. It encodes l local parities, one for each local group and r global parities.Any single data fragment can decoded from k/l fragments within its local group. In addition, LRC achieves maximally recoverable property It tolerates upto r+1 arbitrary fragment failures. It also tolerates failures more than r+1 and upto l+r as long as they are information-theoretically decodeable. Finally LRC provides low storage overhead. It requires the minimum number of parities as compared to other such codes.
The choice of the k,l and r parameters for LRC code is dependent on the reliability outcomes. At a minimum, three-replication is required. This is an industry standard. Higher reliability can be achieved by choosing the parameters with a Markov reliability model. This model considers both independent and correlated failures. The model is extended by LRC to capture unique state transitions. Those transitions are introduced because the failure mode depends on not only the size of the failures, but also which subset of nodes fails. Each state in the Markov process represents the number of available fragments (data and parity). Let lambda denote the failure rate of a single fragment. The transition rates from all fragments healthy state N to state N-1 is N times lambda. Let pd denote the percentage of four decodeable failures. Then the transition state to decodeable failures is N lambda pd and that for non-decodeable failures is N lambda (1-pd) The reverse transition going from a lower state to a higher state is equal to the average repair rate of a single fragment. Repair rates beyond a single failure are dominated by the time taken to detect failure and trigger repair.
#codingexercise
Normalize two strings with a common pattern into substrings that don't have any overlap.
List<char> normalize(List<char> a, List<char> b)
{
var ret = new List<char>();
// assume one pattern to match
var match = a.Intersect(b);
var fragmenta = a.Except(match);
ret.add(fragmenta);
ret.add(match);
var fragmentb = b.Except(match);
ret.add(fragmentb);
// repeat the above for more matches based on regex groups or KMP
return ret;
}
In a graph, the branches of a minimum spanning tree are normalized paths

Cluster computing

Friday, March 18, 2016

No comments:

Post a Comment