Cluster computing

Tuesday, March 15, 2016

We continue reading the paper "Erasure coding in Windows Azure storage" by Cheng Huang, Huseyin Simitci, Yikang Xu, Aaron Ogus, Brad Calder, Parikshit Gopalan, Jin Li, and Sergey Yekhanin.
Erasure coding is a data protection method where data is broken into fragments, expanded and encoded with redundant data pieces and stored across a different set of locations or media. This paper talks about Erasure coding in windows Azure storage WAS. We were discussing the design decisions such as why Reed-Solomon code was not sufficient and why the paper introduced a new one. We also saw how the Local Reconstruction code introduces group fragment parity and improves reconstruction cost. This is with some storage overhead but provides more options with a new trade-off point. We discussed the coding equations and LRC's maximally recoverable property which is a way of saying LRC can decode any failure pattern which is possible to reconstruct.
The challenge was to find a set of coding equations that achieves the maximally recoverable property
These coding equations were found this way:
LRC can tolerate arbitrary 3 failures by choosing the following two sets of coding coefficients a and b for groups x and y. For each group, each coefficient is linearly applicable for each of the three failures, or applicable as the squares or none at all. The same goes for the other coefficient for the other group. This gives us six equations.
Then the LRC's coding equation are as follows: the sum of the linear coefficients of each group as p0, the sum of the square coefficients of each group as p1, the no coefficients case for one group as px, and the no coefficients case for the other group as py.
These coefficients can then be found by applying the following scenarios:
None of the four parities fails
Only one of the px and py fails
Both px and py fail
To ensure all the cases are decodable, all the matrices should be non-singular which leads to the following conditions.
None of the coefficients are zero.
None of the coefficients of group 1 are same as that of group 2
the sum of the coefficients of group 1 should not be the same as the sum of the coefficients of group 2.
#codejam problem
Determine sums of list of integers
public static List<int> Sums( this List<List<int>> numbers)
{
var ret = new List<int>();
numbers.ForEach (x => ret.Add(x.Sum()));
return ret;
}
Could we consider implementing extension methods in C++ ?
In C++ we have the ability to wrap classes and structs if we cannot inherit from them. Therefore we can add methods easily as function pointers or methods.

Cluster computing

Tuesday, March 15, 2016

No comments:

Post a Comment