Monday, February 27, 2017

We continue with a detailed study of Microsoft Azure stack as inferred from an introduction of Azure by Microsoft. We reviewed some more features of Azure storage. We were discussing the replication flow in Windows Azure Storage service. We now look at sealing operation among extent nodes.
 Sealing the extent means that the commit length does not change again. To do this, all the three ENs are asked for their current commit length. If one or more of the replicas have a different length, the minimum commit length is chosen. This does not cause data loss because the primary EN does not return success unless all the replicas have been written to disk. This means the minimum will have the complete data. Additionally, the minimum may even have more blocks that were not acknowledged to the client. This is fine because the client deals with timeouts and duplicate records. All of the commit records have a sequence number and the duplicate records have a sequence number.
If the SM could not talk to an EN but the client is able to, then this may cause a discrepancy to the client.  The partition layer has two different read patterns to handle this.  If the read records are at known locations, the partition layer reads at a specific offset for a specific length.  This is true for data streams which are only of two types row and blob. The location information is available from a previous successful append at the stream layer.  The append was successfully committed only if it made it to all the three replicas.
The other type of read pattern is when the stream is read sequentially from the start to the finish.  This operation occurs for metadata and commit logs. These are the only streams that the partition layer will read sequentially from a starting point to the very last record of a stream. During the partition load time, the partition layer prevents any appends to these two streams  At the start of the load, a check for commit length is requested from the primary EN of the last extent of these two streams. This checks whether all the replicas are available and that they all have the same length. If not, the extent is sealed and the reads are only performed during partition load against a replica sealed by the SM. This works for repeated loads.
It may be interesting to note that the two types of access are determined by the nature of the stream being read and not merely by their frequencies. The access pattern for metadata and logs differ from data. the latter is more frequent and arbitrary. 


#codingexercise
Print all nodes in a binary tree with k leaves.

int GetNodeWithKLeaves(Node root, int k, ref List<Node> result)
{
if (root == null) return 0;
if (root.left == null && root.right == null) return 1;
int left = GetNodeWithKLeaves(root.left, k, ref result);
int right = GetNodeWithKLeaves(root.right, k, ref result);
if (left + right == k)
{
 result.Add(root);
}
return left + right;
}

No comments:

Post a Comment