Cluster computing

We continue our detailed study of the Microsoft Azure stack, as inferred from Microsoft's introduction to Azure. We reviewed some more features of Azure networking and started discussing Azure storage. The IaaS offerings from Azure storage services include disks and files, whereas the PaaS offerings include objects, tables and queues. The storage offerings are built on a unified distributed storage system with guarantees for durability, encryption at rest, strongly consistent replication, fault tolerance and automatic load balancing.
Disks are of two types in Azure, Premium disks (SSD) and Standard disks (HDD), and are backed by page blobs in Azure storage. Disks are offered out of about 26 Azure regions, with server-side encryption at rest and Azure Disk Encryption with BitLocker/DM-Crypt. In addition, disks come with blob cache technology, enterprise-grade durability with three replicas, snapshots for backup, the ability to expand disks, and a REST interface for developers.
Azure Files is a fully managed cloud file storage service for use with IaaS and on-premises instances. The scenarios covered include lift and shift, hosting high-availability workload data, and backup and disaster recovery.
Azure blobs are of three types: block blobs, append blobs and page blobs. Block blobs are used for documents, images, video and so on. Append blobs are used for multi-writer, append-only scenarios such as logging and big data analytics output. Page blobs are used for page-aligned random reads and writes, as in IaaS disks, Event Hub and block-level backup.
Blob storage is tiered. There are two tiers, hot and cool. Hot is for frequently used data and cool for rarely used data. Hot costs roughly 2.4 cents per GB per month and cool roughly one cent per GB per month. There is no charge for switching from hot to cool. As a rule of thumb, pick the hot tier when the data is in frequent use and the cool tier when the volume is high. Both tiers are managed through an identical API and offer similar throughput and latency.
The redundancy options are Locally Redundant Storage (LRS), Geo-Redundant Storage (GRS) and Read-Access Geo-Redundant Storage (RA-GRS).
In LRS, all data in the storage account is made durable by replicating transactions synchronously to three different storage nodes within the same region.
GRS is the default redundancy option when a storage account is created. In addition to what LRS does, GRS queues up asynchronous replication to a secondary region, where three more storage nodes hold the data.
RA-GRS adds read-only access to the storage account's data in the GRS secondary region. Because the secondary region is updated asynchronously, it is only eventually consistent with the primary. LRS costs less than GRS, has higher throughput, and is especially suited to applications that have their own geo-replication strategy.
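To make the hot and cool tier prices mentioned earlier concrete, here is a minimal sketch of the storage-only arithmetic. The class and method names are hypothetical, the prices are the approximate per-GB figures quoted in this post, and access or transaction charges are left out because the post does not give them.

```csharp
using System;

// Hypothetical helper, not an Azure API: storage-only monthly cost per tier.
public static class BlobTierCost
{
    // Approximate prices quoted in this post, in cents per GB per month.
    public const decimal HotCentsPerGbMonth = 2.4m;
    public const decimal CoolCentsPerGbMonth = 1.0m;

    // Converts a data volume and a per-GB price in cents into dollars per month.
    public static decimal MonthlyDollars(decimal gigabytes, decimal centsPerGbMonth)
    {
        return gigabytes * centsPerGbMonth / 100m;
    }
}
```

For 1000 GB this gives about 24 dollars a month in the hot tier and 10 dollars in the cool tier, before any access charges.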
The paper on Windows Azure Storage says that the system consists of storage stamps and a location service.
A storage stamp is a cluster of N racks of storage nodes, where each rack is built out as a separate fault domain with redundant networking and power. Clusters typically range from 10 to 20 racks, with 18 disk-heavy storage nodes per rack. Each storage stamp is located by a VIP and served by three layers: front-ends, a partition layer, and a stream layer that provides intra-stamp replication. Generally a storage stamp is kept around 70% utilized in terms of capacity, transactions and bandwidth. When a storage stamp reaches 70% utilization, the location service migrates accounts to different stamps using inter-stamp replication.
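The sizing and the 70% rule above can be sketched in a few lines. This is only an illustration: the type and member names are made up, and treating the threshold as "any one of the three dimensions reaches 70%" is an assumption, since the description does not say how the dimensions are combined.

```csharp
using System;

// Hypothetical model of a stamp's load; names are illustrative only.
public sealed class StampUtilization
{
    // Each value is a fraction of the stamp's limit, from 0.0 to 1.0.
    public double Capacity { get; set; }
    public double Transactions { get; set; }
    public double Bandwidth { get; set; }
}

public static class LocationServiceSketch
{
    public const double Threshold = 0.70;   // target utilization from the text
    public const int NodesPerRack = 18;     // storage nodes per rack from the text

    // Assumption: reaching the threshold on any single dimension triggers migration.
    public static bool ShouldMigrateAccounts(StampUtilization u)
    {
        double highest = Math.Max(u.Capacity, Math.Max(u.Transactions, u.Bandwidth));
        return highest >= Threshold;
    }

    // Number of storage nodes in a stamp with the given number of racks.
    public static int StorageNodes(int racks)
    {
        return racks * NodesPerRack;
    }
}
```

With these figures a stamp of 10 to 20 racks holds 180 to 360 storage nodes.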

#codingexercise

Convert a BST into a Min Heap

Node ToMinHeap(Node root)
{
    if (root == null) return null;
    var sorted = new List<Node>();
    ToInOrderList(root, sorted); // in-order traversal of a BST yields ascending order
    return ToMinHeap(sorted);
}

void ToInOrderList(Node root, List<Node> all)
{
    if (root == null) return;
    ToInOrderList(root.left, all);
    all.Add(root);
    ToInOrderList(root.right, all);
}

// Relinks the sorted nodes as a complete binary tree in level order and returns the new root.
Node ToMinHeap(List<Node> sorted)
{
    sorted.ForEach(x => { x.left = null; x.right = null; });
    for (int i = 1; i <= sorted.Count / 2; i++) // every parent, including the last one
        MinHeapify(sorted, i);
    return sorted[0];
}

// Links the node at 1-based position i to its children at positions 2i and 2i+1.
void MinHeapify(List<Node> sorted, int i)
{
    sorted[i - 1].left = (2 * i <= sorted.Count) ? sorted[2 * i - 1] : null;
    sorted[i - 1].right = (2 * i + 1 <= sorted.Count) ? sorted[2 * i] : null;
}
Example of a sorted (in-order) sequence: 1 2 3 4 7 8 9 10 14 16
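The linking approach works because a sequence in ascending order, read in level order, already satisfies the min-heap property: the parent at 1-based position i always precedes its children at positions 2i and 2i+1. A small sketch on plain integers checks this for the example sequence; the class and method names are illustrative only.

```csharp
using System;
using System.Collections.Generic;

public static class SortedHeapCheck
{
    // Returns true when every parent (1-based position i) is no larger than
    // its children at 1-based positions 2i and 2i+1.
    public static bool IsMinHeap(IReadOnlyList<int> a)
    {
        for (int i = 1; i <= a.Count / 2; i++)
        {
            int left = 2 * i, right = 2 * i + 1;
            if (a[left - 1] < a[i - 1]) return false;
            if (right <= a.Count && a[right - 1] < a[i - 1]) return false;
        }
        return true;
    }
}
```

For the sequence above, 1 gets children 2 and 3, 2 gets 4 and 7, 3 gets 8 and 9, 4 gets 10 and 14, and 7 gets the single child 16.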
To heapify an unsorted list:
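void Min_Heapify(List<Node> A, int i) // i is 1-based; A is 0-based, hence the "- 1"
{
    // Assumes Node exposes a comparable key, called data here.
    int l = 2 * i;
    int r = 2 * i + 1;
    int smallest = i;
    if (l <= A.Count && A[l - 1].data < A[smallest - 1].data)
        smallest = l;
    if (r <= A.Count && A[r - 1].data < A[smallest - 1].data)
        smallest = r;
    if (smallest != i)
    {
        // Swap in place, then keep sifting the displaced node down.
        var tmp = A[i - 1];
        A[i - 1] = A[smallest - 1];
        A[smallest - 1] = tmp;
        Min_Heapify(A, smallest);
    }
}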
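Min_Heapify repairs a single position, so an unsorted list still needs a driver that applies it to every parent, starting from the last one and moving up to the root. The following is a minimal self-contained sketch of that standard build step, using plain integers in place of Node so that it runs on its own; the class name is illustrative.

```csharp
using System;
using System.Collections.Generic;

public static class MinHeapBuilder
{
    // Sifts the value at 1-based position i down until it is no larger than its children.
    static void MinHeapify(List<int> a, int i)
    {
        int l = 2 * i, r = 2 * i + 1, smallest = i;
        if (l <= a.Count && a[l - 1] < a[smallest - 1]) smallest = l;
        if (r <= a.Count && a[r - 1] < a[smallest - 1]) smallest = r;
        if (smallest != i)
        {
            int tmp = a[i - 1];
            a[i - 1] = a[smallest - 1];
            a[smallest - 1] = tmp;
            MinHeapify(a, smallest);
        }
    }

    // Heapifies every parent, from the last one (Count / 2) up to the root.
    public static void BuildMinHeap(List<int> a)
    {
        for (int i = a.Count / 2; i >= 1; i--)
            MinHeapify(a, i);
    }
}
```

Calling BuildMinHeap on a list such as 16 14 10 8 7 9 3 2 4 1 rearranges it in place so that the smallest value, 1, ends up at the front and every parent is no larger than its children.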

Saturday, February 18, 2017
