Thursday, February 23, 2017

We continue with a detailed study of the Microsoft Azure stack as inferred from Microsoft's introduction to Azure. We reviewed some more features of Azure storage.
The IaaS offerings from Azure storage services include disks and files, whereas the PaaS offerings include objects, tables and queues. The storage offerings are built on a unified distributed storage system with guarantees for durability, encryption at rest, strongly consistent replication, fault tolerance and automatic load balancing.
The redundancy features are handled by storage stamps and a location service. A storage stamp is a cluster of N racks of storage nodes, where each rack is built out as a separate fault domain with redundant networking and power. Each storage stamp is located by a VIP and served by three layers: the Front-Ends, the partition layer, and the stream layer, which provides intra-stamp replication. Intra-stamp replication is synchronous, happens in the stream layer, and ensures that data is made durable within the storage stamp. Inter-stamp replication is asynchronous and is focused on replicating data across stamps. We were looking at the differences between inter-stamp and intra-stamp replication.
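To make the difference between the two replication modes concrete, here is a minimal sketch. It is not Azure code: the `Stamp` and `GeoReplicator` classes, the replica count of three, and the in-memory dictionaries are all assumptions chosen for illustration. It only shows that a write is acknowledged after every replica inside the stamp has it, while the copy to the other stamp happens later, off the critical path.

```python
from collections import deque


class Stamp:
    """Hypothetical stand-in for a storage stamp (replica count is an assumption)."""

    def __init__(self, name, replica_count=3):
        self.name = name
        self.replicas = [dict() for _ in range(replica_count)]

    def write_sync(self, key, value):
        # Intra-stamp: every replica is updated before the write is acknowledged.
        for replica in self.replicas:
            replica[key] = value
        return True  # acknowledgement to the caller

    def read(self, key):
        return self.replicas[0].get(key)


class GeoReplicator:
    """Hypothetical inter-stamp replicator that ships writes in the background."""

    def __init__(self, primary, secondary):
        self.primary = primary
        self.secondary = secondary
        self.pending = deque()

    def write(self, key, value):
        acked = self.primary.write_sync(key, value)
        # Inter-stamp: the write is only queued here; the client is not kept waiting.
        self.pending.append((key, value))
        return acked

    def drain(self):
        # Stands in for the asynchronous background transfer to the other stamp.
        while self.pending:
            key, value = self.pending.popleft()
            self.secondary.write_sync(key, value)
```

After `write` returns, the data is on all replicas of the primary stamp but may not yet be visible on the secondary stamp until `drain` has run.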
The Stream layer operates in terms of Blocks, Extents and Streams. These are the units of organization for reading, replication and naming respectively. The Stream Manager (SM) and the Extent Nodes (ENs) play the dominant roles in the stream layer. The Stream Manager keeps track of the stream namespace, what extents are in each stream, and the extent allocation across the Extent Nodes. It is responsible for maintaining the stream namespace and the state of all active streams, monitoring the health of Extent Nodes, creating extents and assigning them to Extent Nodes, performing the lazy re-replication of lost extent replicas, garbage collecting extents that are no longer pointed to by any stream, and scheduling the erasure coding of extent data according to stream policy. In other words, the Stream Manager does all those activities that do not affect the critical path of client requests. The SM queries the Extent Nodes to see what extents they store. Any extent that has fewer replicas than expected is lazily re-replicated, and the ENs that receive the new replicas are selected at random. The SM is agnostic about blocks. This helps it keep all the state about the extents in memory. The Stream layer is used only by the partition layer, and the two layers are co-designed so that they will not use more than fifty million extents and no more than a hundred thousand streams for a single storage stamp, given the current stamp size.
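The bookkeeping described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the `StreamManager` class, its method names, the target of three replicas, and the use of plain dictionaries and sets are all hypothetical, and the actual copying of extent data is reduced to adding an id to a set. The point is that the SM works purely at the extent level, never at the block level.

```python
import random


class StreamManager:
    """Hypothetical sketch of the SM's extent-level bookkeeping."""

    def __init__(self, extent_nodes, target_replicas=3, rng=None):
        self.extent_nodes = extent_nodes  # EN name -> set of extent ids it reports
        self.streams = {}                 # stream name -> ordered list of extent ids
        self.target_replicas = target_replicas  # assumed replication target
        self.rng = rng or random.Random()

    def locations(self):
        # Ask every EN what it stores and rebuild the extent -> ENs map.
        where = {}
        for en, extents in self.extent_nodes.items():
            for extent_id in extents:
                where.setdefault(extent_id, set()).add(en)
        return where

    def referenced(self):
        return {e for extents in self.streams.values() for e in extents}

    def under_replicated(self):
        where = self.locations()
        return sorted(e for e in self.referenced()
                      if len(where.get(e, ())) < self.target_replicas)

    def repair(self):
        # Lazy re-replication: pick random ENs that do not yet hold the extent.
        where = self.locations()
        for extent_id in self.under_replicated():
            holders = where.get(extent_id, set())
            if not holders:
                continue  # no surviving replica to copy from
            candidates = [en for en in self.extent_nodes if en not in holders]
            missing = self.target_replicas - len(holders)
            for en in self.rng.sample(candidates, min(missing, len(candidates))):
                self.extent_nodes[en].add(extent_id)

    def garbage(self):
        # Extents stored on some EN but no longer pointed to by any stream.
        return sorted(set(self.locations()) - self.referenced())
```

Because the state is one entry per extent rather than one per block, it stays small enough to hold in memory, which is the design choice the paragraph above calls out.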
Each Extent Node maintains the storage for a set of extent replicas assigned to it by the SM. It does not look at the layer above and knows nothing about streams. Every extent on the disk is a file, which holds data blocks and their checksums, and an index that maps extent offsets to blocks. Each EN keeps a cached view of its own extents and of their peer replicas. It is the SM that garbage collects the extents and notifies the ENs to reclaim the space.
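As a rough illustration of an extent holding blocks, checksums and an offset index, consider the sketch below. The `Extent` class is hypothetical, the extent lives in memory instead of in a file, and CRC32 is an assumption made only so the example runs; the text above does not say which checksum is used.

```python
import bisect
import zlib


class Extent:
    """Hypothetical in-memory model of an extent: blocks, checksums and an index."""

    def __init__(self):
        self.blocks = []    # list of (data, checksum) pairs, in append order
        self.offsets = []   # index: starting extent offset of each block
        self.length = 0

    def append(self, data):
        # Blocks are only ever appended; the index records where each one starts.
        offset = self.length
        self.blocks.append((data, zlib.crc32(data)))  # CRC32 is an assumed checksum
        self.offsets.append(offset)
        self.length += len(data)
        return offset

    def read(self, offset):
        # Map the extent offset to its block, then verify the checksum on read.
        if not 0 <= offset < self.length:
            raise IndexError("offset outside extent")
        i = bisect.bisect_right(self.offsets, offset) - 1
        data, checksum = self.blocks[i]
        if zlib.crc32(data) != checksum:
            raise IOError("checksum mismatch in block %d" % i)
        return data
```

A read at any offset returns the whole block that contains it, which matches the idea that the block is the unit of reading and of checksum validation.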
#codingexercise
Get the count of nodes (not sub-trees) in a BST that lie within a given range, with optimizations.
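int getCount(Node root, int min, int max)
{
    if (root == null) return 0;
    // Everything to the left of root is smaller still, so with no right child nothing can reach min.
    if (root.data < min && root.right == null) return 0;
    // Symmetrically, everything to the right of root is larger still.
    if (root.data > max && root.left == null) return 0;
    int left = getCount(root.left, min, max);
    int right = getCount(root.right, min, max);
    // Count the current node only when it lies within [min, max].
    if (root.data >= min && root.data <= max)
        return left + right + 1;
    return left + right;
}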
An alternative would be to list the nodes of the BST in inorder and keep only those that fall within the range; the size of the resulting list is the count.
ToInorder(root).Where(x => min <= x && x <= max).ToList();
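The pruning can be pushed a little further than in the exercise above. The following is a variant, not the code from the exercise: a small Python sketch with a hypothetical `Node` class and `count_in_range` function that skips an entire subtree whenever the BST ordering guarantees it lies outside the range.

```python
class Node:
    """Hypothetical BST node used only for this sketch."""

    def __init__(self, data, left=None, right=None):
        self.data = data
        self.left = left
        self.right = right


def count_in_range(root, lo, hi):
    if root is None:
        return 0
    if root.data < lo:
        # The node and its whole left subtree are below the range.
        return count_in_range(root.right, lo, hi)
    if root.data > hi:
        # The node and its whole right subtree are above the range.
        return count_in_range(root.left, lo, hi)
    # The node is inside [lo, hi]; both sides may still contribute.
    return 1 + count_in_range(root.left, lo, hi) + count_in_range(root.right, lo, hi)
```

For example, on a tree with root 10, left child 5 (children 3 and 7) and right child 15 (children 12 and 18), `count_in_range(root, 6, 15)` visits 5 but never descends to 3.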
