Saturday, March 26, 2016

We continue reading the paper "Erasure Coding in Windows Azure Storage" by Cheng Huang, Huseyin Simitci, Yikang Xu, Aaron Ogus, Brad Calder, Parikshit Gopalan, Jin Li, and Sergey Yekhanin.
Erasure coding is a data protection method in which data is broken into fragments, expanded and encoded with redundant pieces, and stored across a different set of locations or media. This paper discusses erasure coding in Windows Azure Storage (WAS) and introduces new codes called Local Reconstruction Codes (LRC). A (k, l, r) LRC divides k data fragments into l local groups and encodes l local parities, one for each local group, plus r global parities. We saw how the fragments are distributed in LRC for WAS. We were also reviewing the design considerations for erasure coding, where we saw how the various I/O types need to be scheduled, along with reconstruction read-ahead and caching.
Now we read about the consistency of coded data. This is largely done to mitigate data corruption, which can occur while data is at rest, being read or written, or in transit. It is therefore essential to check the consistency of data at every step of the storage system and to periodically scrub the data at rest. This is done with checksums and parity, the two primary mechanisms used to protect against data corruption. In Windows Azure Storage, each append block contains a CRC of the data block, which can help detect data and metadata corruptions. If the CRC check fails, the data is reconstructed using other combinations of erasure-coded fragments on the next available EN.
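To make the (k, l, r) layout recapped above concrete before turning to the consistency checks, here is a minimal sketch assuming the paper's (6, 2, 2) example, where two local groups x0..x2 and y0..y2 each get a local parity; the two global parities use Galois field coefficients and are not computed here.

// Illustrative (6, 2, 2) LRC layout: k = 6 data fragments split into l = 2 local groups,
// one local parity per group, plus r = 2 global parities (omitted in this sketch).
static byte[] LocalParity(byte[][] group)
{
    var parity = new byte[group[0].Length];
    foreach (var fragment in group)
        for (int i = 0; i < parity.Length; i++)
            parity[i] ^= fragment[i];          // addition in GF(2^w) is a byte-wise XOR
    return parity;
}
// px = LocalParity(new[] { x0, x1, x2 });   py = LocalParity(new[] { y0, y1, y2 });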
To ensure that the erasure coding operation does not itself introduce data inconsistency, each operation is validated by a number of decoding operations. For LRC, these validations include: randomly choosing one data fragment in each local group and reconstructing it from the rest of its local group; reconstructing one data fragment using one global parity and the remaining data fragments; doing the same with the other global parity; randomly choosing two data fragments and reconstructing them; doing the same with three data fragments; and with four data fragments. Finally, the coordinator EN computes a CRC of all of the final data fragments and checks it against the CRC of the original extent. Together, these checks ensure data consistency.
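As a minimal sketch of the first of these validations, assuming the local parity is simply the XOR of its group's fragments as above (names are illustrative, not the WAS code), a randomly chosen fragment is rebuilt from the rest of its local group and compared against the original:

// Pick a random data fragment, rebuild it from the surviving fragments of its local group
// plus the local parity, and confirm the result matches the original byte for byte.
static bool ValidateOneLocalReconstruction(byte[][] group, byte[] localParity, Random rng)
{
    int victim = rng.Next(group.Length);               // the fragment to "lose" and rebuild
    var rebuilt = (byte[])localParity.Clone();
    for (int f = 0; f < group.Length; f++)
        if (f != victim)
            for (int i = 0; i < rebuilt.Length; i++)
                rebuilt[i] ^= group[f][i];             // XOR out the surviving fragments
    for (int i = 0; i < rebuilt.Length; i++)
        if (rebuilt[i] != group[victim][i]) return false;
    return true;                                        // reconstruction agrees with the original
}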
Erasure coding involves Galois field arithmetic, but using it directly is quite expensive. Instead, addition and multiplication tables are used. Reed-Solomon codes are implemented more efficiently with XOR operations, and even more so by ordering the XOR operations based on the patterns in the coding and decoding matrices. This way some redundancies and repetitions are avoided.
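A minimal sketch of that table-driven approach, assuming GF(2^8) with the primitive polynomial 0x11D (the paper does not specify the field or polynomial): addition is a plain XOR, and multiplication becomes two log lookups and one antilog lookup.

static readonly byte[] Exp = new byte[512];
static readonly byte[] Log = new byte[256];

static void BuildTables()
{
    int x = 1;
    for (int i = 0; i < 255; i++)
    {
        Exp[i] = (byte)x;
        Log[x] = (byte)i;
        x <<= 1;                                  // multiply by the generator element 2
        if (x >= 256) x ^= 0x11D;                 // reduce modulo the primitive polynomial
    }
    for (int i = 255; i < 512; i++)
        Exp[i] = Exp[i - 255];                    // duplicate so Multiply needs no modulo
}

static byte Multiply(byte a, byte b)
{
    // Addition in GF(2^8) is just XOR; multiplication uses the log/antilog tables.
    if (a == 0 || b == 0) return 0;
    return Exp[Log[a] + Log[b]];
}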
#codingexercise
Implement a 3-bit CRC on a 14-bit message
static List<int> GetCRC(List<int> message, List<int> polynomial, int n)
{
    var bits = new List<int>(message);
    for (int i = 0; i < n; i++) bits.Add(0);              // pad right with n zero bits
    int last = bits.Count - polynomial.Count;             // last position where the divisor fits
    for (int start = 0; start <= last; start++)
    {
        if (bits[start] == 0) continue;                   // divide only where the leading bit is 1
        for (int i = 0; i < polynomial.Count; i++)
            bits[start + i] ^= polynomial[i];             // XOR step of long division over GF(2)
    }
    return bits.GetRange(bits.Count - n, n);              // remainder (last n bits) is the CRC
}
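An illustrative call, assuming the generator x^3 + x + 1 written as the bit pattern 1011 (one common 3-bit CRC polynomial; the exercise does not fix a particular one):

var message = new List<int> { 1,0,1,1,0,1,0,0,1,1,1,0,1,0 };   // 14 message bits
var poly    = new List<int> { 1, 0, 1, 1 };                     // n + 1 = 4 polynomial bits
var crc     = GetCRC(message, poly, 3);                         // the 3-bit remainder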
