Tuesday, March 29, 2016

We continue reading the paper "Erasure coding in Windows Azure storage"  by Cheng Huang, Huseyin Simitci, Yikang Xu, Aaron Ogus, Brad Calder, Parikshit Gopalan, Jin Li, and Sergey Yekhanin.
Erasure coding is a data protection method where data is broken into fragments, expanded and encoded with redundant pieces, and stored across a different set of locations or media. This paper talks about erasure coding in Windows Azure Storage (WAS). It introduces new codes called Local Reconstruction Codes. A (k, l, r) LRC divides k data fragments into l local groups. It encodes l local parities, one for each local group, and r global parities. We were reviewing the performance of LRC in WAS. We now review related work.
Erasure coding has been applied in many large-scale distributed storage systems. It is better than simple replication because it provides higher reliability while requiring much lower storage for the same reliability. Even when distributed systems apply erasure coding, they typically apply only Reed-Solomon codes. LRC has proved better than Reed-Solomon in both its storage efficiency and its performance tradeoff. Moreover, LRC reduces costs. For example, node failures trigger reconstruction, which is a significant cost. Transient errors, where no data is lost, account for nearly 90% of data center failures. During these periods, as well as during upgrades, there is no background data rebuilding, but reads accessing the unavailable nodes are still served through reconstruction. System techniques like load balancing and prioritization help, but LRC explores whether the erasure coding scheme itself can be optimized to reduce repair traffic and improve user I/Os. In addition to these benefits, LRC exploits non-uniform parity degrees, where some parities connect to fewer data nodes than others. This facilitates efficient reconstruction and follows a new trend in the storage industry. Previous work applied enumeration and heuristic methods to search for parity-check erasure codes of small length. Due to the exponential search space, that exploration was limited to 3, 4 and 5 parities, and the resulting codes could not tolerate arbitrary three-fragment failures, whereas LRC can.
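To make the local group structure concrete, here is a minimal sketch of local-parity encoding and single-fragment reconstruction, assuming XOR-based parities for illustration (WAS computes parities with Galois field coefficients, but the group structure is the point here): a lost fragment is rebuilt from only the k/l fragments of its local group instead of all k.

// XOR-based local parity for one local group: parity = XOR of all fragments.
byte[] LocalParity(byte[][] group)
{
    var parity = new byte[group[0].Length];
    foreach (var fragment in group)
        for (int i = 0; i < parity.Length; i++)
            parity[i] ^= fragment[i];
    return parity;
}

// Rebuild a lost fragment by XOR-ing the local parity with the surviving
// fragments of the same local group: k/l reads instead of k.
byte[] Reconstruct(byte[] localParity, byte[][] survivors)
{
    var rebuilt = (byte[])localParity.Clone();
    foreach (var fragment in survivors)
        for (int i = 0; i < rebuilt.Length; i++)
            rebuilt[i] ^= fragment[i];
    return rebuilt;
}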
#codingexercise
List<int> GetReedSolomon(List<int> message, int n)
{
    // Simplified Reed-Solomon encoding: treat the message symbols as the
    // coefficients of a polynomial and evaluate it at n distinct points.
    // (Production codes do this arithmetic over a Galois field.)
    var cx = new List<int>();
    for (int x = 0; x < n; x++)
    {
        cx.Add(GetPolynomial(message, x));
    }
    return cx;
}

int GetPolynomial(List<int> message, int x)
{
    // Evaluates message[0] + message[1]*x + ... + message[k-1]*x^(k-1)
    // using Horner's rule.
    int sum = 0;
    for (int i = message.Count - 1; i >= 0; i--)
        sum = sum * x + message[i];
    return sum;
}

Monday, March 28, 2016

We continue reading the paper "Erasure coding in Windows Azure storage"  by Cheng Huang, Huseyin Simitci, Yikang Xu, Aaron Ogus, Brad Calder, Parikshit Gopalan, Jin Li, and Sergey Yekhanin.
Erasure coding is a data protection method where data is broken into fragments, expanded and encoded with redundant pieces, and stored across a different set of locations or media. This paper talks about erasure coding in Windows Azure Storage (WAS). It introduces new codes called Local Reconstruction Codes. A (k, l, r) LRC divides k data fragments into l local groups. It encodes l local parities, one for each local group, and r global parities. We were reviewing the performance of LRC in WAS with small I/Os and large I/Os. Finally, we read up on the latency comparison for decoding fragments between LRC and Reed-Solomon in this performance study. The decoding latency is expected to improve with LRC due to the reduced number of fragments. However, decoding latency is orders of magnitude smaller than the latency to transfer the fragments needed for reconstruction, so the overall latency is comparable between LRC and Reed-Solomon.
#codingexercise
Implement Galois field polynomial subtraction.
A polynomial is represented as coefficient and power pairs in ascending order of power.
Terms of equal power are grouped and operated on together.
List<Term> SubtractPolynomial(List<Term> first, List<Term> second)
{
    // Merge the two sorted term lists; terms of equal power are combined.
    var ret = new List<Term>();
    first.Sort((a, b) => a.power.CompareTo(b.power));
    second.Sort((a, b) => a.power.CompareTo(b.power));
    int i = 0;
    int j = 0;
    while (i < first.Count && j < second.Count)
    {
        if (first[i].power == second[j].power)
        {
            ret.Add(new Term(first[i].coefficient - second[j].coefficient, first[i].power));
            i++;
            j++;
        }
        else if (first[i].power < second[j].power)
        {
            ret.Add(new Term(first[i].coefficient, first[i].power));
            i++;
        }
        else
        {
            // terms present only in the subtrahend are negated
            ret.Add(new Term(-second[j].coefficient, second[j].power));
            j++;
        }
    }
    while (i < first.Count)
    {
        ret.Add(new Term(first[i].coefficient, first[i].power));
        i++;
    }
    while (j < second.Count)
    {
        ret.Add(new Term(-second[j].coefficient, second[j].power));
        j++;
    }
    return ret;
}
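Worth noting: in the binary Galois fields GF(2^m) that erasure codes actually use, every coefficient is its own additive inverse, so coefficient subtraction and addition are both the XOR operation and SubtractPolynomial coincides with polynomial addition there; the negation above matters only for ordinary integer coefficients.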

Sunday, March 27, 2016

We continue reading the paper "Erasure coding in Windows Azure storage"  by Cheng Huang, Huseyin Simitci, Yikang Xu, Aaron Ogus, Brad Calder, Parikshit Gopalan, Jin Li, and Sergey Yekhanin.
Erasure coding is a data protection method where data is broken into fragments, expanded and encoded with redundant pieces, and stored across a different set of locations or media. This paper talks about erasure coding in Windows Azure Storage (WAS). It introduces new codes called Local Reconstruction Codes. A (k, l, r) LRC divides k data fragments into l local groups. It encodes l local parities, one for each local group, and r global parities. We were discussing data consistency checks and the use of CRC. We now look at performance considerations in WAS. Storage is in the form of blobs, tables, queues and drives. Applications have different workload styles, with smaller I/Os for tables and queues and larger I/Os for blobs and drives, typically greater than 4 MB. A production cluster with significant load variations over time was used, and LRC was compared with Reed-Solomon codes. The key metrics for small I/Os were latency and the number of I/Os made by the requests. The key metrics for large I/Os were latency and bandwidth. In the small I/O case, the experiments were run with a mixture of direct reads of a single fragment and reconstruction reads. When the cluster load was light, the reconstruction cost was very low. Under heavy load, reconstruction reads performed poorly due to the effect of the slowest fragment, and LRC performed better than Reed-Solomon due to the lower number of fragments it reads.
The results from the large I/O case were very different from the small I/O case. Even when the cluster load was light, reconstruction with erasure coding was much slower than a direct read due to the bandwidth consumed. When the cluster load was heavy, the latency came from the network card. Even here, LRC performed better than Reed-Solomon due to the reduced number of fragments.
#codingexercise
Implement Galois field polynomial addition.
A polynomial is represented as coefficient and power pairs in ascending order of power.
Terms of equal power are grouped and operated on together.
List<Term> AddPolynomial(List<Term> first, List<Term> second)
{
    // Terms are sorted in ascending order of power, so the last term of
    // each list carries the highest power. Requires System.Linq.
    var ret = new List<Term>();
    int max = Math.Max(first.Last().power, second.Last().power);
    for (int i = 0; i <= max; i++)
    {
        var firstTerm = first.FirstOrDefault(t => t.power == i);    // null if absent
        var secondTerm = second.FirstOrDefault(t => t.power == i);
        int coefficient = (firstTerm?.coefficient ?? 0) + (secondTerm?.coefficient ?? 0);
        if (coefficient != 0)
            ret.Add(new Term(coefficient, i));
    }
    return ret;
}

Saturday, March 26, 2016

We continue reading the paper "Erasure coding in Windows Azure storage"  by Cheng Huang, Huseyin Simitci, Yikang Xu, Aaron Ogus, Brad Calder, Parikshit Gopalan, Jin Li, and Sergey Yekhanin.
Erasure coding is a data protection method where data is broken into fragments, expanded and encoded with redundant pieces, and stored across a different set of locations or media. This paper talks about erasure coding in Windows Azure Storage (WAS). It introduces new codes called Local Reconstruction Codes. A (k, l, r) LRC divides k data fragments into l local groups. It encodes l local parities, one for each local group, and r global parities. We saw how to distribute the fragments in LRC for WAS. We were also reviewing the design for erasure coding, where we saw how the various I/O types need to be scheduled, along with reconstruction read-ahead and caching. Now we read about the consistency of coded data. Consistency checking is largely done to mitigate data corruption, which can occur when data is at rest, being read or written, or in transit. Therefore it is essential to check the consistency of data in every step of the storage system and to periodically scrub the data at rest. This is done with checksums and parities, the two primary mechanisms to protect against data corruption. In Windows Azure Storage, each append block contains a CRC of the data block, which can help detect both data and metadata corruptions. If the CRC check fails, the data is reconstructed using other combinations of erasure coded fragments on the next available EN.
To ensure that the erasure coding operation does not itself introduce data inconsistency, each operation is validated by a few decoding operations. For LRC, these validations include: randomly choosing one data fragment in each local group and reconstructing it using its local group; doing the same with one global parity and the remaining data fragments; doing it with the other global parity and the remaining data fragments; randomly choosing two data fragments and reconstructing them; randomly choosing three data fragments and reconstructing them; and randomly choosing four data fragments and reconstructing them. Finally, the coordinator EN computes a CRC of all the final data fragments and checks this CRC against that of the original extent. Together, all these checks ensure data consistency.
Erasure coding involves Galois field arithmetic, but using it directly is quite expensive. Instead, addition and multiplication tables are used. Reed-Solomon codes are better implemented using XOR operations, and more so by ordering the XOR operations based on the patterns in the coding and decoding matrices. This way some redundancies and repetitions are avoided.
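As a concrete illustration of the table approach, here is a minimal sketch of table-driven multiplication in GF(2^8), assuming the commonly used field polynomial 0x11D; the polynomial and table layout are illustrative choices, not taken from the paper. Two log lookups and one exp lookup replace the expensive carry-less multiply and modular reduction.

static readonly byte[] Exp = new byte[512];
static readonly byte[] Log = new byte[256];

static void BuildTables()
{
    int x = 1;
    for (int i = 0; i < 255; i++)
    {
        Exp[i] = (byte)x;
        Log[x] = (byte)i;
        x <<= 1;                              // multiply by the generator element 2
        if ((x & 0x100) != 0) x ^= 0x11D;     // reduce modulo the field polynomial
    }
    for (int i = 255; i < 512; i++)           // wrap so Multiply can skip a mod 255
        Exp[i] = Exp[i - 255];
}

static byte Multiply(byte a, byte b)
{
    if (a == 0 || b == 0) return 0;           // zero has no logarithm
    return Exp[Log[a] + Log[b]];              // a * b = exp(log a + log b)
}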
#codingexercise
Implement a 3-bit CRC on a 14-bit message.
List<int> GetCRC(List<int> message, List<int> polynomial, int n)
{
    // Long division over GF(2). Bits are most significant first and the
    // generator polynomial has n + 1 bits (e.g. n = 3, 1011 for x^3 + x + 1).
    var buffer = new List<int>(message);
    for (int i = 0; i < n; i++) buffer.Add(0);       // pad right with n zero bits
    int last = buffer.Count - polynomial.Count;
    for (int start = 0; start <= last; start++)
        if (buffer[start] != 0)                      // divisor aligns on a leading 1
            for (int i = 0; i < polynomial.Count; i++)
                buffer[start + i] ^= polynomial[i];  // subtract (XOR) the divisor
    return buffer.GetRange(buffer.Count - n, n);     // the remainder is the CRC
}
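Illustrative usage for the stated 3-bit CRC on a 14-bit message, with the generator x^3 + x + 1 chosen purely as an example:

var message = new List<int> { 1,0,1,1,0,0,1,0,1,1,0,1,0,1 };   // 14 bits
var polynomial = new List<int> { 1,0,1,1 };                    // x^3 + x + 1
var crc = GetCRC(message, polynomial, 3);                      // 3 remainder bits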

Friday, March 25, 2016

We continue reading the paper "Erasure coding in Windows Azure storage"  by Cheng Huang, Huseyin Simitci, Yikang Xu, Aaron Ogus, Brad Calder, Parikshit Gopalan, Jin Li, and Sergey Yekhanin.
Erasure coding is a data protection method where data is broken into fragments, expanded and encoded with redundant pieces, and stored across a different set of locations or media. This paper talks about erasure coding in Windows Azure Storage (WAS). It introduces new codes called Local Reconstruction Codes. A (k, l, r) LRC divides k data fragments into l local groups. It encodes l local parities, one for each local group, and r global parities. We saw how to distribute the fragments in LRC for WAS. We now review design aspects such as the scheduling of various I/O types. There is a variety of I/O types from clients, including open/close, read/append, create or delete, replicate, scrub and move operations. If all these I/Os were to happen at their own pace, the system would tank. To keep the system fair and responsive, operations are subject to throttling and scheduling at all levels of the storage system. I/O requests can be accepted, rejected or delayed on every extent node (EN) and stream manager (SM). Similarly, the SM keeps track of the data replication load on individual ENs and on the system as a whole to decide when to initiate replication, erasure coding, deletion and various other system maintenance operations. Since coding and decoding both involve multiple ENs, scheduling and throttling of these operations become important. It is also important to make sure erasure coding keeps up with the incoming data rate from customers as well as with internal system functions such as garbage collection. Given that the data involved can easily reach petabytes every few days, scheduling such operations becomes all the more critical.
Reconstruction read-ahead and caching is another design point for LRC in WAS. Reconstruction of unavailable fragments is done in unit sizes that are larger than the individual append blocks, in order to reduce the number of disk and network I/Os. Append blocks are 5 MB in size. Read-ahead data of up to 250 MB is cached in the memory (up to 256 MB) of the extent node that performed the reconstruction. When reads are sequential, there is no need to go to disk because the cache serves them directly.
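A minimal sketch of how such a read-ahead cache might behave, assuming a hypothetical reconstructUnit callback and the unit-sized reconstruction described above; the actual EN cache policy is certainly more involved.

// Hypothetical sketch: serve sequential reads of an unavailable fragment
// from an in-memory read-ahead cache, reconstructing a whole unit on a miss.
class ReadAheadCache
{
    const int UnitSize = 5 * 1024 * 1024;   // one append block, per the text
    readonly Dictionary<long, byte[]> cache = new Dictionary<long, byte[]>();

    public byte[] Read(long offset, int count, Func<long, byte[]> reconstructUnit)
    {
        long unitStart = offset - (offset % UnitSize);   // align to unit boundary
        if (!cache.TryGetValue(unitStart, out var unit))
        {
            unit = reconstructUnit(unitStart);   // expensive: contacts other ENs
            cache[unitStart] = unit;             // eviction policy omitted
        }
        var result = new byte[count];
        Array.Copy(unit, (int)(offset - unitStart), result, 0, count);
        return result;
    }
}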
#codingexercise
print all permutations of a string in ascending order
void Permute(string a, StringBuilder b, bool[] used, ref List<string> permutations)
{
    if (b.Length == a.Length) { permutations.Add(b.ToString()); return; }
    for (int i = 0; i < a.Length; i++)
    {
        if (used[i]) continue;
        used[i] = true;
        b.Append(a[i]);
        Permute(a, b, used, ref permutations);
        b.Remove(b.Length - 1, 1);   // backtrack: drop the last character
        used[i] = false;
    }
}
After the recursion completes, sort the results: permutations.Sort();
Another way could be to sort the input string beforehand, which eliminates the final sort above.
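Illustrative usage:

var permutations = new List<string>();
Permute("abc", new StringBuilder(), new bool[3], ref permutations);
permutations.Sort();   // abc, acb, bac, bca, cab, cba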

Thursday, March 24, 2016

We continue reading the paper "Erasure coding in Windows Azure storage"  by Cheng Huang, Huseyin Simitci, Yikang Xu, Aaron Ogus, Brad Calder, Parikshit Gopalan, Jin Li, and Sergey Yekhanin.
Erasure coding is a data protection method where data is broken into fragments, expanded and encoded with redundant pieces, and stored across a different set of locations or media. This paper talks about erasure coding in Windows Azure Storage (WAS). It introduces new codes called Local Reconstruction Codes. A (k, l, r) LRC divides k data fragments into l local groups. It encodes l local parities, one for each local group, and r global parities. We were discussing Local Reconstruction Codes in Windows Azure Storage. The placement of the fragments is based on two factors: load, which favors the less occupied extent nodes, and reliability, which avoids placing fragments into the same correlated domain. Correlated domains arise from faults, such as a rack failure that brings down a few nodes together, and from upgrades, where a group of nodes is taken offline together. Let us now see how fragments are placed in WAS. A WAS stamp consists of twenty racks. There are sixteen fragments for an extent, and each is placed on a different rack for maximum reliability. If each fragment were similarly placed in a different upgrade domain, there would be too many upgrade domains and upgrades would slow down; up to 10 upgrade domains are used instead. To place the fragments across the upgrade domains, the local group property of LRC is exploited: fragments belonging to different local groups are placed into the same upgrade domains. Corresponding data fragments from each local group share an upgrade domain, the local parities are placed in the same upgrade domain, and the global parities are placed in separate upgrade domains.
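To illustrate one consistent reading of this layout, here is a minimal sketch of an upgrade domain assignment for the sixteen fragments of a (12, 2, 2) LRC; the indices and domain numbering are illustrative assumptions, not taken from the paper.

// Hypothetical assignment: data fragments x_i and y_i share domain i (0..5),
// the two local parities share domain 6, and the two global parities take
// domains 7 and 8, for nine upgrade domains within the limit of ten.
enum FragmentKind { Data, LocalParity, GlobalParity }

int UpgradeDomain(FragmentKind kind, int indexWithinKind)
{
    switch (kind)
    {
        case FragmentKind.Data: return indexWithinKind % 6;   // x_i, y_i -> domain i
        case FragmentKind.LocalParity: return 6;              // p_x, p_y -> domain 6
        default: return 7 + indexWithinKind;                  // p_0 -> 7, p_1 -> 8
    }
}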
#codingexercise
After detecting the overlaps from two strings, determine whether a new given string is a concatenation of two or more of the overlaps.
void Combine(List<string> overlaps, List<string> b, string target, int start)
{
    // Tries combinations of the detected overlaps (in their given order),
    // printing True when concatenating two or more of them yields the target.
    for (int i = start; i < overlaps.Count; i++)
    {
        b.Add(overlaps[i]);
        if (b.Count >= 2 && string.Concat(b) == target)
            Console.WriteLine("True");
        Combine(overlaps, b, target, i + 1);
        b.RemoveAt(b.Count - 1);   // backtrack
    }
}

Wednesday, March 23, 2016

We continue reading the paper "Erasure coding in Windows Azure storage"  by Cheng Huang, Huseyin Simitci, Yikang Xu, Aaron Ogus, Brad Calder, Parikshit Gopalan, Jin Li, and Sergey Yekhanin.
Erasure coding is a data protection method where data is broken into fragments, expanded and encoded with redundant pieces, and stored across a different set of locations or media. This paper talks about erasure coding in Windows Azure Storage (WAS). It introduces new codes called Local Reconstruction Codes. A (k, l, r) LRC divides k data fragments into l local groups. It encodes l local parities, one for each local group, and r global parities. We were discussing the components of the WAS architecture and saw that erasure coding lives in the stream layer. We now review LRC in Windows Azure Storage. The placement of data and parity fragments is based on two factors: load, which favors the less occupied nodes and a sparse distribution, and reliability, which favors separating fragments across correlated failure domains. There are two such correlated domains: the rack, which can fail altogether, and the upgrade domain, in which groups of nodes are taken offline together.
#codingexercise
Determine the overlaps between two sequences in linear time.
We need offsets and lengths.
If we find a match, we remove it from the original string.
We repeat until there are no matches left.
Matches have the same length and sequence.
We iterate from the end of the last match in one string. A sketch of the core linear-time step follows.
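As one standard linear-time building block for this, here is a minimal sketch that finds the longest suffix of one string that is a prefix of the other via the KMP failure function; the '#' separator is assumed absent from both strings, and the remove-and-repeat loop above would call this repeatedly.

// Length of the longest suffix of first that is a prefix of second, in
// O(|first| + |second|) time using the failure function of second + "#" + first.
int Overlap(string first, string second)
{
    string s = second + "#" + first;
    var fail = new int[s.Length];
    for (int i = 1; i < s.Length; i++)
    {
        int k = fail[i - 1];
        while (k > 0 && s[i] != s[k]) k = fail[k - 1];
        if (s[i] == s[k]) k++;
        fail[i] = k;
    }
    // the overlap starts at offset first.Length - fail[s.Length - 1] in first
    return fail[s.Length - 1];
}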