Saturday, March 4, 2017

We continue with a detailed study of the Microsoft Azure stack as inferred from Microsoft's introduction to Azure. We reviewed some more features of Azure storage. We saw how erasure coding works in Windows Azure Storage (WAS) and how it differs from Pelican. We saw what steps WAS takes to improve durability, and we also saw the benefits of journaling. We reviewed the partition layer, which translates objects such as blobs, tables and queues to storage. We saw how the partition manager maintains a massive Object Table that is sharded across different partition servers, which maintain consistency across concurrent transactions using a lock service. We read about the partition layer data model, the supported data types and operations, the partition layer architecture and the RangePartition data structures. We saw that the RangePartitions are made up of the metadata stream, commit log stream, row data stream and blob data stream. The in-memory data structures include the Memory Table, Index Cache, Row Data Cache and Bloom Filters. When the PS receives a write request to a RangePartition, it appends the operation to the commit log and then puts the newly changed row into the memory table.

The Partition Manager (PM) is responsible for breaking the massive Object Table into RangePartitions, assigning them to partition servers, and coordinating the associated maintenance operations such as load balancing, split and merge. The load for each RangePartition, as well as the overall load for each partition server, is tracked on the basis of transactions per second, average pending transaction count, throttling rate, CPU usage, network usage, request latency and the data size of the RangePartition. This information is sent by the PS in response to the heartbeat request from the PM. The PM decides if a PS has too much load, and the PS then performs the actual split of the RangePartitions. The PM could also take away RangePartitions and assign them to another PS. For each stamp, there are usually about 75 splits and merges and 200 RangePartition load balances per day. In order to keep the loading time of a freshly arrived set of RangePartitions low, the commit log is kept small, with a checkpoint taken prior to the handoff.
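As a minimal sketch of the write path and checkpointing behavior described above, an operation is appended to the commit log first and only then applied to the memory table, and checkpointing is what keeps the commit log small before a handoff. The class and member names here are made-up stand-ins for illustration, not the actual WAS implementation.

using System.Collections.Generic;

public class RangePartitionWritePath
{
    private readonly List<string> commitLog = new List<string>();                                 // stands in for the commit log stream
    private readonly Dictionary<string, string> memoryTable = new Dictionary<string, string>();   // recently changed rows

    public void Write(string rowKey, string rowData)
    {
        commitLog.Add(rowKey + "=" + rowData);   // 1. append the operation to the commit log
        memoryTable[rowKey] = rowData;           // 2. put the newly changed row into the memory table
    }

    public void Checkpoint()
    {
        // in the real system the memory table contents would be persisted to a
        // checkpoint in the row data stream before the commit log is truncated;
        // keeping the log small keeps the reload time low after a handoff
        memoryTable.Clear();
        commitLog.Clear();
    }
}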
The Split and Merge operations are also done by the partition servers on request from the PM. The splits are made at the location specified by the key comprising the (AccountName, PartitionName). The RangePartitions maintain the sizes of the objects and the key at which the total size can be halved. The PS chooses a key based on Adaptive Range Profiling, which is a technique to find the key ranges that have the most load and to use these to determine the split. Prior to each split the RangePartition is checkpointed. During the split, the PS uses a special operation called MultiModify that takes the source stream and creates a new set of streams for the splits with the same extents in the same order as in the source. This is very fast because it is merely a pointer manipulation, as sketched below.
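As a rough illustration of why the split is cheap, the MultiModify step can be thought of as copying extent pointers rather than extent data: the new stream references the same extents, in the same order, as the source. The Extent, StorageStream and StreamOps names below are illustrative stand-ins, not the actual WAS stream layer API.

using System.Collections.Generic;

public class Extent { public long Id; }          // immutable unit managed by the stream layer

public class StorageStream
{
    // a stream is an ordered list of pointers to extents
    public List<Extent> Extents = new List<Extent>();
}

public static class StreamOps
{
    // MultiModify-style split: the child stream gets the same extents in the
    // same order as the source, so the operation is only pointer manipulation
    public static StorageStream SplitChild(StorageStream source)
    {
        var child = new StorageStream();
        child.Extents.AddRange(source.Extents);
        return child;
    }
}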
The Merge operation requires that the source RangePartitions all come to the same partition server, so the PM first moves them to one PS. The PS checkpoints both and runs the MultiModify operation, which in this case is a concatenation of all of the extents from their respective streams: it takes all of the extents of one stream and then follows them with all of the extents of the other. The PS then creates the metadata stream with the names of the new commit log and data streams, as well as the root of the data index in the merged data stream, so that the merged RangePartition can be correctly loaded.
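A similar sketch for the merge-side MultiModify, reusing the illustrative StorageStream and Extent types from the split sketch above: the merged stream is simply the extents of one source followed by the extents of the other, again a pointer-only operation.

public static class MergeOps
{
    // concatenate all extents of one source stream, then all extents of the
    // other, preserving their original order
    public static StorageStream MergeChild(StorageStream first, StorageStream second)
    {
        var merged = new StorageStream();
        merged.Extents.AddRange(first.Extents);
        merged.Extents.AddRange(second.Extents);
        return merged;
    }
}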
#codingexercise
Print top view of a binary tree
// Node is assumed to have int data, Node left, Node right, and an int dist
// field that tracks the horizontal distance from the root.
void PrintTopView(Node root)
{
    // level-order traversal; the first node encountered at each horizontal
    // distance belongs to the top view
    if (root == null) return;

    var d = new SortedDictionary<int, int>(); // horizontal distance -> node data
    var q = new Queue<Node>();
    root.dist = 0;
    q.Enqueue(root);
    while (q.Count > 0)
    {
        var item = q.Dequeue();
        // record only the first node seen at this horizontal distance
        if (d.ContainsKey(item.dist) == false)
            d[item.dist] = item.data;
        if (item.left != null)
        {
            item.left.dist = item.dist - 1;
            q.Enqueue(item.left);
        }
        if (item.right != null)
        {
            item.right.dist = item.dist + 1;
            q.Enqueue(item.right);
        }
    }
    // SortedDictionary iterates keys in ascending order, i.e. left to right
    foreach (var kvp in d)
        Console.Write("{0} ", kvp.Value);
}

