We continue reading the paper "Erasure Coding in Windows Azure Storage" by Cheng Huang, Huseyin Simitci, Yikang Xu, Aaron Ogus, Brad Calder, Parikshit Gopalan, Jin Li, and Sergey Yekhanin.
Erasure coding is a data protection method in which data is broken into fragments, expanded and encoded with redundant pieces, and stored across a set of different locations or media. This paper describes erasure coding in Windows Azure Storage (WAS). It introduces a new family of codes called Local Reconstruction Codes (LRC). A (k, l, r) LRC divides k data fragments into l local groups. It encodes l local parities, one for each local group, and r global parities.
The WAS architecture has three components: the front end, the object layer, and the stream replication layer. All three are present within every storage cluster. Erasure coding is implemented in the stream replication layer, which is responsible for the robustness of data. Object partitioning lives in a separate layer so that it can scale out. The main components of the stream layer are the stream managers (SMs) and the extent nodes (ENs). The ENs store extents, which consist of blocks. The block is the unit for CRC, which implies that data is read or appended in blocks. When an extent reaches its maximum size, a new extent is created. Once an extent is sealed, it becomes immutable. During erasure coding, the SM creates fragments on a set of ENs, which are responsible for storing them. One of the ENs is then chosen as the coordinator, which is responsible for making sure the erasure coding completes. The coordinator EN decides where the boundaries of all the fragments will be in the extent and communicates them to the target ENs. Because erasure coding runs offline, it can be paused and resumed at any time.
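To make the (k, l, r) layout concrete, here is a minimal Python sketch of the local half of an LRC. It is not the WAS implementation. It assumes that k is divisible by l, that all fragments have equal length, and that each local parity is the plain XOR of its group. The helper names (xor_bytes, local_groups, local_parities, rebuild_from_local_group, storage_overhead) are made up for illustration. The r global parities need finite-field arithmetic and are left out. The overhead function simply counts stored fragments against data fragments, (k + l + r) / k.

```python
from functools import reduce


def xor_bytes(fragments):
    """XOR a list of equal-length byte strings together, byte by byte."""
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*fragments))


def local_groups(data, l):
    """Split the k data fragments into l equal-sized local groups."""
    k = len(data)
    if l <= 0 or k % l != 0:
        raise ValueError("this sketch assumes k is divisible by l")
    size = k // l
    return [data[g * size:(g + 1) * size] for g in range(l)]


def local_parities(data, l):
    """One local parity per group (assumed here to be the XOR of the group)."""
    return [xor_bytes(group) for group in local_groups(data, l)]


def rebuild_from_local_group(group, parity, lost_index):
    """Rebuild a single lost fragment from the rest of its group and the local parity."""
    survivors = [f for i, f in enumerate(group) if i != lost_index]
    return xor_bytes(survivors + [parity])


def storage_overhead(k, l, r):
    """Stored fragments divided by data fragments."""
    return (k + l + r) / k


if __name__ == "__main__":
    # Hypothetical (6, 2, 2) layout: six 4-byte data fragments in two groups of three.
    data = [bytes([i] * 4) for i in range(1, 7)]
    groups = local_groups(data, 2)
    parities = local_parities(data, 2)
    # Lose the second fragment of the second group (data[4]) and rebuild it
    # by reading only its local group, not all six data fragments.
    rebuilt = rebuild_from_local_group(groups[1], parities[1], 1)
    print(rebuilt == data[4])
    print(storage_overhead(6, 2, 2))
```

The point of the local groups shows up in rebuild_from_local_group: a single lost fragment is recovered by reading only the other members of its group plus one parity, rather than k fragments.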
#coding exercise
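Find the substrings that two strings have in common, ideally in linear time. Matched substrings must not overlap within the same string, and their characters must appear in the same order. The straightforward nested scan below is quadratic in the worst case, not linear.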
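// requires: using System.Collections.Generic;
// Returns the runs of characters common to a and b, found by a greedy scan.
// Matched runs never overlap within a or within b.
List<List<char>> getoverlaps(List<char> a, List<char> b)
{
    var ret = new List<List<char>>();
    var used = new bool[b.Count];   // positions of b already claimed by a match
    int i = 0;
    while (i < a.Count)
    {
        int k = 0;                  // length of the match that starts at a[i]
        for (int j = 0; j < b.Count && k == 0; j++)
        {
            if (used[j] || a[i] != b[j])
                continue;
            int m = i;
            int n = j;
            var word = new List<char>();
            // Check bounds before indexing, then extend the match.
            while (m < a.Count && n < b.Count && !used[n] && a[m] == b[n])
            {
                word.Add(a[m]);
                used[n] = true;
                m++;
                n++;
                k++;
            }
            ret.Add(word);
        }
        // Skip past the match in a, or advance by one if nothing matched.
        i += (k > 0) ? k : 1;
    }
    return ret;
}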