Wednesday, February 22, 2017

We continue with a detailed study of the Microsoft Azure stack, as inferred from Microsoft's introduction to Azure. We reviewed some more features of Azure storage.
The IaaS offerings from Azure storage services include disks and files, whereas the PaaS offerings include objects, tables and queues. The storage offerings are built on a unified distributed storage system with guarantees for durability, encryption at rest, strongly consistent replication, fault tolerance and automatic load balancing.
The redundancy features are handled by storage stamps and a location service. A storage stamp is a cluster of N racks of storage nodes, where each rack is built out as a separate fault domain with redundant networking and power. Each storage stamp is located by a VIP and served by three layers: the front-ends, the partition layer, and the stream layer, which provides intra-stamp replication.
Intra-stamp replication is synchronous, happens in the stream layer, and ensures that the data is made durable within the storage stamp. Inter-stamp replication is asynchronous and focuses on replicating data across stamps. It is done in the background so that it stays off the critical path of the customer's request. Intra-stamp replication provides durability against hardware failure, which is more frequent in large-scale systems. Inter-stamp replication provides geo-redundancy against geo-disasters, which are relatively rare. Intra-stamp replication must occur with low latency because it is on the critical path. Inter-stamp replication, on the other hand, is concerned with the optimal use of network bandwidth between stamps within an acceptable level of replication delay.
The stream layer stores the bits on disk and handles distribution and replication of the data across many servers to keep the data durable within a storage stamp. It is called a stream layer because it handles streams, which are ordered lists of large storage chunks called extents; an extent is a sequence of append blocks. Only the last extent in a stream can be appended to. A block is the minimum unit of data for reading and writing. A block can be up to N bytes, say 4 MB. Data is written to an extent as one or more concatenated blocks, and the blocks do not have to be the same size. A client read specifies an offset into a stream or extent, and the stream layer reads as many blocks as needed at that offset to fulfill the length of the read.
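To make the block and extent vocabulary concrete, here is a minimal in-memory sketch of an extent as an append-only sequence of variable-size blocks, with a read that walks as many blocks as the requested range overlaps. The class name ExtentSketch, its methods, and the 4 MB cap are my own illustrative assumptions; this is not how Azure implements the stream layer, and replication is left out entirely.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of one extent: an append-only list of variable-size blocks.
class ExtentSketch {
    // Assumed upper bound on a block, following the "say 4 MB" figure in the text.
    static final int MAX_BLOCK_BYTES = 4 * 1024 * 1024;

    private final List<byte[]> blocks = new ArrayList<>();
    private long size = 0; // total bytes appended so far

    // Appends one block and returns the extent offset at which it starts.
    long append(byte[] block) {
        if (block.length == 0 || block.length > MAX_BLOCK_BYTES) {
            throw new IllegalArgumentException("block must be 1.." + MAX_BLOCK_BYTES + " bytes");
        }
        long start = size;
        blocks.add(block.clone());
        size += block.length;
        return start;
    }

    // Reads `length` bytes starting at `offset`, copying from every block
    // that overlaps the requested range.
    byte[] read(long offset, int length) {
        if (offset < 0 || length < 0 || offset + length > size) {
            throw new IndexOutOfBoundsException("read outside extent");
        }
        byte[] out = new byte[length];
        long blockStart = 0;
        int copied = 0;
        for (byte[] block : blocks) {
            if (copied == length) break;
            long blockEnd = blockStart + block.length;
            if (blockEnd > offset) {
                // Position inside this block where the next wanted byte lives.
                int from = (int) Math.max(0, offset + copied - blockStart);
                int n = Math.min(block.length - from, length - copied);
                System.arraycopy(block, from, out, copied, n);
                copied += n;
            }
            blockStart = blockEnd;
        }
        return out;
    }
}
```

Appending a 5-byte block and then a 3-byte block and reading 4 bytes at offset 3 touches both blocks, which is the "as many blocks as needed" behavior described above.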
Checksums are stored at the block level, one checksum per block, and all the checksums are validated once every few days. Extents are the unit of replication. Each extent is stored in an NTFS file. The Stream Manager keeps track of streams and extents, not blocks or block appends, and therefore stays off the critical path. Since streams and extents are tracked within a single stamp, the Stream Manager can keep its state in memory. It handles no more than fifty million extents and a hundred thousand streams per storage stamp, and that state fits into 32 GB of memory.
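The per-block checksum and the periodic validation pass can be sketched as follows. The source does not say which checksum algorithm is used, so CRC32 from the Java standard library is purely an assumption here, and the names ChecksummedBlocks, scrub and flipBit are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.zip.CRC32;

// Hypothetical sketch: one checksum kept per block, plus a scrub pass
// that re-validates every block against its stored checksum.
class ChecksummedBlocks {
    private final List<byte[]> blocks = new ArrayList<>();
    private final List<Long> checksums = new ArrayList<>();

    // CRC32 is an assumed stand-in for whatever checksum the real system uses.
    static long crc(byte[] data) {
        CRC32 c = new CRC32();
        c.update(data, 0, data.length);
        return c.getValue();
    }

    // Stores the block together with its checksum.
    void append(byte[] block) {
        blocks.add(block.clone());
        checksums.add(crc(block));
    }

    // Returns the indexes of blocks whose contents no longer match their checksum.
    List<Integer> scrub() {
        List<Integer> bad = new ArrayList<>();
        for (int i = 0; i < blocks.size(); i++) {
            if (crc(blocks.get(i)) != checksums.get(i)) {
                bad.add(i);
            }
        }
        return bad;
    }

    // Test hook that simulates on-disk corruption by flipping one bit.
    void flipBit(int blockIndex, int position) {
        blocks.get(blockIndex)[position] ^= 0x01;
    }
}
```

In a real system the scrub would run on a schedule (the text says once every few days) and a failed block would presumably be repaired from another replica; both are outside this sketch.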
#codingexercise
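Count the nodes (not subtrees) in a BST whose values lie within a given range [min, max].
// Assumes a Node class with an int data field and Node left, right fields.
int getCount(Node root, int min, int max)
{
    if (root == null) return 0;
    // Count qualifying nodes in both subtrees, then add the root if it is in range.
    int left = getCount(root.left, min, max);
    int right = getCount(root.right, min, max);
    if (root.data >= min && root.data <= max)
        return left + right + 1;
    return left + right;
}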
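We could further optimize this by skipping subtrees that lie entirely outside the range. Because the tree is a BST, when root.data is smaller than min the whole left subtree is also too small, so only the right subtree needs a visit; when root.data is larger than max, only the left subtree does. The same applies to a node whose value equals one of the range bounds: assuming distinct keys, a node equal to min has nothing in range on its left, and a node equal to max has nothing in range on its right.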
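A minimal sketch of the pruned version, written as a self-contained class so it can be compiled on its own. The wrapper class RangeCountSketch, the Node definition and the insert helper are assumptions added for illustration. The pruning at the range bounds is left out so that the code stays correct even when the tree holds duplicate keys.

```java
// Hypothetical self-contained version of the range count with subtree pruning.
class RangeCountSketch {
    static class Node {
        int data;
        Node left, right;
        Node(int data) { this.data = data; }
    }

    // Counts nodes with min <= data <= max, skipping subtrees that cannot qualify.
    static int getCount(Node root, int min, int max) {
        if (root == null) return 0;
        // Root is too small: everything on its left is too small as well.
        if (root.data < min) return getCount(root.right, min, max);
        // Root is too large: everything on its right is too large as well.
        if (root.data > max) return getCount(root.left, min, max);
        // Root is in range: count it and search both sides.
        return 1 + getCount(root.left, min, max) + getCount(root.right, min, max);
    }

    // Plain BST insert used only to build example trees; duplicates go right.
    static Node insert(Node root, int value) {
        if (root == null) return new Node(value);
        if (value < root.data) root.left = insert(root.left, value);
        else root.right = insert(root.right, value);
        return root;
    }
}
```

For a tree built by inserting 10, 5, 15, 3, 7, 12, 18 in that order, calling getCount(root, 6, 13) visits 10, 5, 7, 15 and 12 but never descends to 3 or 18. On a balanced tree this brings the cost down from O(n) to roughly O(log n + k), where k is the number of nodes in the range.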
