Thursday, March 2, 2017

We continue with a detailed study of the Microsoft Azure stack as inferred from Microsoft's introduction of Azure. We reviewed some more features of Azure storage. We were discussing the replication flow in the Windows Azure Storage service, and we looked at the sealing operation among extent nodes. We also saw how erasure coding works in Windows Azure Storage, aka WAS, and how it differs from Pelican. We saw what steps WAS takes to improve durability, and we also saw the benefits of journaling.

We now look at the partition layer, which translates objects such as blobs, tables or queues to storage. It maintains a massively scalable namespace for the objects and performs load balancing across the available partition servers. It enforces transaction ordering and strong consistency for access to objects. The key data structure maintained by this layer is the Object Table (OT), which can grow to several petabytes. This table is dynamically and horizontally sharded into different RangePartitions. There are several types of Object Tables. For example, the Account table stores metadata and configuration for each storage account assigned to the stamp, the Blob table stores all blob objects for all accounts in the stamp, the Entity table stores all entity rows for all accounts in the stamp, and the Message table stores all messages for all accounts' queues in the stamp. Each of these OTs has a fixed schema stored in the Schema table. The primary key for the Blob, Entity and Message tables consists of AccountName, PartitionName, and ObjectName, which provide the indexing and sort order for those Object Tables.
Among the supported data types and operations, BlobType is a special property introduced to store large amounts of data; it is backed by the blob data stream through a list of extent, offset and length pointers.
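To make this concrete, here is a minimal sketch of the composite row key and of a BlobType pointer list; the type names (ObjectKey, ExtentPointer, BlobDataPointer) are assumptions for illustration, not the actual WAS schema.
using System;
using System.Collections.Generic;

// Hypothetical composite key: rows in the Blob, Entity and Message tables are
// indexed and sorted by (AccountName, PartitionName, ObjectName).
public class ObjectKey : IComparable<ObjectKey>
{
    public string AccountName;
    public string PartitionName;
    public string ObjectName;

    public int CompareTo(ObjectKey other)
    {
        int c = string.CompareOrdinal(AccountName, other.AccountName);
        if (c != 0) return c;
        c = string.CompareOrdinal(PartitionName, other.PartitionName);
        if (c != 0) return c;
        return string.CompareOrdinal(ObjectName, other.ObjectName);
    }
}

// Hypothetical BlobType value: the row stores pointers into the blob data
// stream rather than the bytes themselves.
public struct ExtentPointer
{
    public Guid ExtentId;  // which extent holds the data
    public long Offset;    // byte offset within that extent
    public long Length;    // length of this region in bytes
}

public class BlobDataPointer
{
    // The blob's contents are the concatenation of these regions, in order.
    public List<ExtentPointer> Regions = new List<ExtentPointer>();
}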
Because the partition layer deals with objects and their namespace, it uses a Partition Manager, partition servers, and a lock service.
The Partition Manager splits the Object Table into RangePartitions and maintains a map of each RangePartition to its designated partition server. The Partition Manager has many instances; the instances compete for a lock from the lock service, and the one that holds the lock is the leader, so only one instance actively controls the partition map at a time. Each partition server maintains its set of RangePartitions, persisting their state in streams and keeping track of the partitions it serves. Although a set of RangePartitions is assigned to only one active partition server, that server uses the lock service to provide strong consistency and ordering of concurrent transactions to objects for each RangePartition it is serving. The Partition Manager also uses the lock service to ensure that there is only one active partition server for a set of RangePartitions in its map table. The same partition server can, however, handle more than one set of RangePartitions, usually from different Object Tables.
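As a rough illustration of how these pieces fit together, the sketch below models lock-service-based leader election and the RangePartition-to-server map; ILockService, RangePartition and the method names are assumptions for illustration, not the actual WAS interfaces.
using System;
using System.Collections.Generic;

// Hypothetical lock service: a caller either acquires the named lease or does
// not; holding the lease makes the caller the leader for that name.
public interface ILockService
{
    bool TryAcquire(string lockName, string ownerId, TimeSpan leaseDuration);
}

// Hypothetical key range describing one RangePartition of an Object Table.
public class RangePartition
{
    public string ObjectTable;  // e.g. "Blob", "Entity", "Message"
    public string LowKey;       // inclusive start of the key range
    public string HighKey;      // exclusive end of the key range
}

public class PartitionManager
{
    private readonly ILockService lockService;
    private readonly string instanceId;
    // Map of each RangePartition to the partition server currently serving it.
    private readonly Dictionary<RangePartition, string> partitionMap = new Dictionary<RangePartition, string>();

    public PartitionManager(ILockService lockService, string instanceId)
    {
        this.lockService = lockService;
        this.instanceId = instanceId;
    }

    // Only the instance that wins the leader lock assigns RangePartitions.
    public bool TryBecomeLeader()
    {
        return lockService.TryAcquire("PartitionManagerLeader", instanceId, TimeSpan.FromSeconds(30));
    }

    public void Assign(RangePartition partition, string partitionServer)
    {
        // One active partition server per RangePartition; the server itself
        // also holds a lease from the lock service while serving the range.
        partitionMap[partition] = partitionServer;
    }
}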

#codingexercise
Given a binary tree and a value, check whether the sum of node values along a root-to-leaf path equals the given value. The version below collects all such paths:
void getPathSum(Node root, int sum, ref List<Node> path, ref List<List<Node>> paths)
{
if (root == null) return;
path.Add(root);
getPathSum(root.left, sum, ref path, ref paths);
getPathSum(root.right, sum, ref path, ref paths);
// at a leaf, check whether the values on the current root-to-leaf path add up to sum (Sum requires System.Linq)
if (root.left == null && root.right == null && path.Sum(n => n.data) == sum)
    paths.Add(new List<Node>(path)); // add a copy, since path is mutated while backtracking
path.RemoveAt(path.Count - 1); // backtrack
}
We can compare the above with a simpler version that only checks whether such a path exists:
bool hasPathSum(Node root, int sum)
{
if (root == null) return false;
int newsum = sum - root.data;
// a valid path must end at a leaf
if (newsum == 0 && root.left == null && root.right == null) return true;
return hasPathSum(root.left, newsum) || hasPathSum(root.right, newsum);
}
Two nodes of a BST have been swapped; find them.
void InOrderTraverse(Node root, ref Node prev, ref List<Node> pairs)
{
if (root == null) return;
InOrderTraverse(root.left, ref prev, ref pairs);
// an inorder traversal of a BST is sorted, so a decrease marks a swapped pair
if (prev != null && root.data < prev.data)
{
    pairs.Add(prev);
    pairs.Add(root);
}
prev = root;
InOrderTraverse(root.right, ref prev, ref pairs);
}
The same result can also be obtained by serializing the tree with an inorder traversal first and then finding the two elements that are out of place in the otherwise sorted sequence.
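A minimal sketch of that alternative, reusing the Node type from above; Serialize and FindSwapped are hypothetical helper names:
// Serialize the tree inorder, then scan the (otherwise sorted) list: the first
// element of the first violation and the second element of the last violation
// are the two swapped nodes.
List<Node> Serialize(Node root)
{
    var nodes = new List<Node>();
    if (root == null) return nodes;
    nodes.AddRange(Serialize(root.left));
    nodes.Add(root);
    nodes.AddRange(Serialize(root.right));
    return nodes;
}

(Node, Node) FindSwapped(Node root)
{
    var nodes = Serialize(root);
    Node first = null, second = null;
    for (int i = 1; i < nodes.Count; i++)
    {
        if (nodes[i].data < nodes[i - 1].data)
        {
            if (first == null) first = nodes[i - 1];
            second = nodes[i];
        }
    }
    return (first, second);
}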
