We continue with a detailed study of Microsoft Azure stack as inferred from an introduction of Azure by Microsoft. We reviewed some more features of Azure storage. We saw how erasure coding works in Windows Azure Storage aka WAS and how its different from Pelican. We saw what steps WAS takes to improve durability. We also saw the benefits of journaling. We reviewed the partition layer which translates objects such as blob, table or queue to storage. We saw how the partition manager maintains a massive Object Table that is sharded on different partition servers which maintain their consistency across concurrent transactions using a lock service. We read the partition layer data model and the supported data types and operations, the partition layer architecture and the range partition data structures. we reviewed the load balancing, split and merge operations. Then we reviewed the intrastamp and interstamp replication. We also took stock of TCO. We also reviewed the performance of WAS and a comparision study of the usages of Blobs, Tables and Queues by different cloud applications and services.
The applications and services were Facebook and Twitter
search for Bing, XBox Gamesaves, XBox Telemetry and Microsoft Zune Backend. The
percentage of requests for all services showed that 17.9%of the
requests used Blobs, 46.88% used Tables and 35.22% used queues. But in terms of
capacity 70.31% of capacity was in Blobs, 29.68% of capacity was in Tables and
0.01% used by Queues. The tiny fraction for queues was expected because Queues
use very little space as compared to Blobs and Tables. Applications profiles
also seemed to have different weights of their usages of Blobs, Tables and
Queues. Zune and XBox Gamesaves had mostly unstructured data and put them
into Blobs whereas customers like Bing and XBox Telemetry have to index a lot
of data
Earlier I brought up that one of their design decisions was
to separate VM based computations from storage for Windows Azure so that
both could scale and load balance independently due to their nature of access
patterns. Therefore nodes running a customer's service code are separate from
nodes providing their storage and WAS focused on high bandwidth without the
data being on the same node or on the same rack. To improve this further, the
datacenter networking topology will be flattened out to provide full
bisectional bandwidth between compute and storage.
#codingexercise
#codingexercise
Friends
Pairing problem:
Given
n friends, each one can remain single or can be paired up with some other
friend. Each friend can be paired only once so ordering is irrelevant
The
total number of ways in which the friends can be paired is given by;
Int
GetPairs(int n)
{
If
(n <=2) return n;
Return
GetPairs(n-1) + GetPairs(n-2)*(n-1);
}
Shortest
superstring problem:
Given
a set of distinct strings, find the shortest superstring that contains
each string in a given set as substring. The distinct strings cannot be
substrings of each other:
The
method to solve this is as follows:
1.
Make a copy of the array to begin with
2.
Find a pair that has maximum overlap as a and b
3.
Replace a and b with the concatenation ab
4.
The size of the array reduces by one and we repeat 1 to 4 until we are left
with only one super string.
5.
Since we find the maximum overlap at each point, we make a greedy choice for
the solution. This is an approximate solution and not necessarily the most
optimal.
String
FindShortestSuperString(List<string> input)
{
if
(input == null || input.count == 0) return string.empty;
Var
items = new string[input.Length];
Input.CopyTo(items);
Var
tmp = items.ToList();
While
(tmp.Count > 1)
{
Int
max = 0;
Int
l = 0;
Int
r = 0;
For
(int I = 0; I < tmp.Count; I++)
For (int j = 0; j < tmp.Count; j++)
{
If ( I != j ){
var cur = GetOverlapLength(tmp[I], tmp[j]);
If (cur > max) { max = cur; l = I; r = j;
}
}
}
If
(l == 0 && r==0) r = l + 1;
Var
new = tmp[I] + tmp[r];
Tmp.RemoveAt(l);
Tmp.RemoveAt(r);
Tmp.InsertAt(l,
new);
}
Return
tmp.first();
}
No comments:
Post a Comment