Wednesday, March 8, 2017


We continue with a detailed study of Microsoft Azure stack as inferred from an introduction of Azure by Microsoft. We reviewed some more features of Azure storage. We saw how erasure coding works in Windows Azure Storage aka WAS and how its different from Pelican. We saw what steps WAS takes to improve durability. We also saw the benefits of journaling. We reviewed the partition layer which translates objects such as blob, table or queue to storage. We saw how the partition manager maintains a massive Object Table that is sharded on different partition servers which maintain their consistency across concurrent transactions using a lock service. We read the partition layer data model and the supported data types and operations, the partition layer architecture and the range partition data structures. we reviewed the load balancing, split and merge operations. Then we reviewed the intrastamp and interstamp replication. We also took stock of TCO. We also reviewed the performance of WAS and a comparision study of the usages of Blobs, Tables and Queues by different cloud applications and services.
The applications and services were Facebook and Twitter search for Bing, XBox Gamesaves, XBox Telemetry and Microsoft Zune Backend. The percentage  of requests for all services showed that 17.9%of the requests used Blobs, 46.88% used Tables and 35.22% used queues. But in terms of capacity 70.31% of capacity was in Blobs, 29.68% of capacity was in Tables and 0.01% used by Queues. The tiny fraction for queues was expected because Queues use very little space as compared to Blobs and Tables. Applications profiles also seemed to have different weights of their usages of Blobs, Tables and Queues.  Zune and XBox Gamesaves had mostly unstructured data and put them into Blobs whereas customers like Bing and XBox Telemetry have to index a lot of data
Earlier I brought up that one of their design decisions was to separate VM based computations from storage  for Windows Azure so that both could scale and load balance independently due to their nature of access patterns. Therefore nodes running a customer's service code are separate from nodes providing their storage and WAS focused on high bandwidth without the data being on the same node or on the same rack. To improve this further, the datacenter networking topology will be flattened out to provide full bisectional bandwidth between compute and storage. 
#codingexercise
Friends Pairing problem:
Given n friends, each one can remain single or can be paired up with some other friend. Each friend can be paired only once so ordering is irrelevant
The total number of ways in which the friends can be paired is given by;
Int GetPairs(int n)
{
If (n <=2) return n;
Return GetPairs(n-1) + GetPairs(n-2)*(n-1);
}

Shortest superstring problem:
Given a set of distinct strings, find the shortest superstring  that contains each string in a given set as substring. The distinct strings cannot be substrings of each other:
The method to solve this is as follows:
1. Make a copy of the array to begin with
2. Find a pair that has maximum overlap as a and b
3. Replace a and b with the concatenation ab
4. The size of the array reduces by one and we repeat 1 to 4 until we are left with only one super string.
5. Since we find the maximum overlap at each point, we make a greedy choice for the solution. This is an approximate solution and not necessarily the most optimal.
String FindShortestSuperString(List<string> input)
{
if (input == null || input.count == 0) return string.empty;
Var items = new string[input.Length];
Input.CopyTo(items);
Var tmp = items.ToList();
While (tmp.Count > 1)
{
Int max = 0;
Int l = 0;
Int r = 0;
For (int I = 0; I < tmp.Count; I++)
   For (int j = 0; j < tmp.Count; j++)
     {
         If ( I != j ){
            var cur = GetOverlapLength(tmp[I], tmp[j]);
            If (cur > max) { max = cur; l = I; r = j; }
         }
     }
If (l == 0 && r==0) r = l + 1;
Var new = tmp[I] + tmp[r];
Tmp.RemoveAt(l);
Tmp.RemoveAt(r);
Tmp.InsertAt(l, new);
}
Return tmp.first();
}


No comments:

Post a Comment