Tuesday, March 14, 2017

#codingexercise
Find the shortest path with exactly k edges in a directed and weighted graph
int GetShortestPath(int[,] graph, int u, int v, int k)
{
    if (k == 0 && u == v) return 0;
    if (k == 1 && graph[u, v] != int.MaxValue) return graph[u, v];
    if (k <= 0) return int.MaxValue;
    int res = int.MaxValue;
    int vertices = graph.GetLength(0);
    for (int i = 0; i < vertices; i++)
    {
        // try i as the next hop on a path from u to v using the remaining k-1 edges
        if (graph[u, i] != int.MaxValue && u != i && v != i)
        {
            int rec = GetShortestPath(graph, i, v, k - 1);
            if (rec != int.MaxValue)
                res = Math.Min(res, graph[u, i] + rec);
        }
    }
    return res;
}
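The recursion above re-solves the same subproblems, so its cost grows exponentially with k. As a sketch (not part of the original exercise), the standard dynamic-programming formulation fills a table indexed by edge count and brings this down to O(V^3 * k), assuming the same int.MaxValue sentinel for missing edges:

// sp[e, i, j] = weight of the shortest path from i to j using exactly e edges
int GetShortestPathDP(int[,] graph, int u, int v, int k)
{
    int V = graph.GetLength(0);
    var sp = new int[k + 1, V, V];
    for (int e = 0; e <= k; e++)
        for (int i = 0; i < V; i++)
            for (int j = 0; j < V; j++)
            {
                sp[e, i, j] = int.MaxValue;
                if (e == 0 && i == j) sp[e, i, j] = 0;
                if (e == 1) sp[e, i, j] = graph[i, j];
                if (e > 1)
                    for (int a = 0; a < V; a++)
                        if (graph[i, a] != int.MaxValue && i != a && j != a
                            && sp[e - 1, a, j] != int.MaxValue)
                            sp[e, i, j] = Math.Min(sp[e, i, j], graph[i, a] + sp[e - 1, a, j]);
            }
    return sp[k, u, v];
}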
Given a positive integer N, count the number of binary strings of length N without consecutive 1s
int GetCount(int n)
{
    var a = new int[n]; // count of valid strings of length i+1 ending in 0
    var b = new int[n]; // count of valid strings of length i+1 ending in 1
    a[0] = 1;
    b[0] = 1;
    for (int i = 1; i < n; i++)
    {
        a[i] = a[i-1] + b[i-1]; // a 0 can follow either ending
        b[i] = a[i-1];          // a 1 can only follow a 0
    }
    return a[n-1] + b[n-1];
}
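For example, GetCount(3) returns 5, counting 000, 001, 010, 100 and 101.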

Monday, March 13, 2017

Today we surveyed the literature on TCO models and drew the following examples: 
  1. Strabel and Stage developed an economic decision model that compares costs for the internal IT infrastructure (server and storage expenses) and the external provisioning by means of Cloud Computing Services (fees for CPU hour, time contingent, storage, internet service provider costs and inbound and outbound data transfer costs). Theirs is a regression model focused on the hybrid usage of internal and external infrastructure sources. 
  1. Kondo et al applied a cost benefit analysis that focused on Infrastructure as a Service aka IaaS. They compared cloud computing services to volunteer computing services and found the latter to be more economical in the long run but preferred the former for performance. 
  1. Li et al were probably groundbreaking in applying the company perspective to the TCO of Cloud Computing because they focus on the provider. They discuss the setup and maintenance costs for a cloud, such as the costs of hardware, software, power, cooling, staff and real estate. Instead of focusing on physical hardware, they try to maximize the number of virtual machines that can be deployed within a datacenter so that the organization stays flexible to compute demands. 
  1. Martens et al detailed the costing of cloud computing services and elaborated on the listing and calculation of cost factors.  
This is summarized as follows: 
  1. The TCO of a cloud computing service equals the sum total of all cost types 
  1. The total amount of a cost type equals the sum total of all involved cost factors 
  1. The involved cost for a factor is determined by accumulating usage over a set of time periods 
  1. The cost within a unit period is the product of the required quantity and the unit cost or price. 
Without going into their comprehensive elaboration of cost types and cost factors, we just take away their notion of aggregating costs and observe that it seems to work well for Infrastructure as a Service, Platform as a Service and Software as a Service.
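A minimal sketch of that aggregation in code may help; the class and property names below are hypothetical, not taken from Martens et al:

using System.Collections.Generic;
using System.Linq;

// Hypothetical illustration of the cost roll-up:
// TCO = sum over cost types of (sum over cost factors of
//       (sum over periods of quantity * unit price)).
class CostFactor
{
    public double[] Quantities; // Quantities[t] = required quantity in period t
    public double UnitPrice;    // cost per unit within a period
    public double Total() => Quantities.Sum(q => q * UnitPrice);
}

class CostType
{
    public List<CostFactor> Factors = new List<CostFactor>();
    public double Total() => Factors.Sum(f => f.Total());
}

static double TotalCostOfOwnership(IEnumerable<CostType> costTypes)
    => costTypes.Sum(t => t.Total());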

#codingexercise
Greedy algorithm for Egyptian Fraction:
Egyptian Fraction is a representation of a fraction as a sum of unit fractions. For example, to represent 6/14, we find the ceiling of 14/6, which is 3, and write down one component of the Egyptian fraction as 1/3. Then we recur on (6/14 - 1/3).
void PrintEgyptianFraction(int nr, int dr)
{
    if (dr == 0 || nr == 0) return;
    if (dr % nr == 0)
    {
        // the fraction is already a unit fraction
        Console.WriteLine("1/{0}", dr / nr);
        return;
    }
    if (nr % dr == 0)
    {
        // the fraction is a whole number
        Console.WriteLine("{0}", nr / dr);
        return;
    }
    if (nr > dr)
    {
        // split off the whole part and recur on the remainder
        Console.WriteLine("{0}", nr / dr);
        PrintEgyptianFraction(nr % dr, dr);
        return;
    }
    // nr < dr here, so the next unit fraction has denominator ceil(dr/nr)
    int n = dr / nr + 1; // dr % nr != 0 at this point, so this is the ceiling
    Console.WriteLine("1/{0}", n);
    PrintEgyptianFraction(nr * n - dr, dr * n);
}
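For example, PrintEgyptianFraction(6, 14) prints 1/3, 1/11 and 1/231, since 6/14 = 1/3 + 1/11 + 1/231.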


Count binary strings of length n in which adjacent set bits appear exactly k times:
// Count binary strings of length n with exactly k adjacent pairs of set bits.
// lastBitSet records whether the previously placed bit was a 1.
long GetCountPairBits(int n, int k, bool lastBitSet)
{
    if (k < 0) return 0;
    if (n == 0) return k == 0 ? 1 : 0;
    long count = 0;
    // place a 0: no new adjacent pair is formed
    count += GetCountPairBits(n - 1, k, false);
    // place a 1: a new pair is formed only if the previous bit was also a 1
    count += GetCountPairBits(n - 1, lastBitSet ? k - 1 : k, true);
    return count;
}
// invoked as GetCountPairBits(n, k, false)

Sunday, March 12, 2017

We continue with a detailed study of the Microsoft Azure stack as inferred from an introduction to Azure by Microsoft. We reviewed some more features of Azure storage. We saw how erasure coding works in Windows Azure Storage aka WAS and how it differs from Pelican. We saw what steps WAS takes to improve durability. We also saw the benefits of journaling. We reviewed the partition layer which translates objects such as blob, table or queue to storage. We saw how the partition manager maintains a massive Object Table that is sharded on different partition servers which maintain their consistency across concurrent transactions using a lock service. We read the partition layer data model and the supported data types and operations, the partition layer architecture and the range partition data structures. We reviewed the load balancing, split and merge operations. Then we reviewed the intrastamp and interstamp replication. We also took stock of TCO, performance, application profiles and their comparison study.
Today we conclude reviewing the list of design decisions made for the Windows Azure Storage service. This is primarily an append-only system, which simplifies snapshot and replication because snapshots of previous states can now be kept at little or no cost. To keep the space overhead low, a garbage collection system is crucial.  
To maintain the integrity of user data, WAS needed to use end-to-end checksums. A checksum is attached to the data from the moment it is received in the system.  
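As a minimal sketch of this idea (illustrative only; not WAS's actual checksum format or API), a block can carry the checksum computed at ingestion so every later stage can re-verify it:

using System;
using System.Security.Cryptography;

class CheckedBlock
{
    public byte[] Data;
    public byte[] Checksum; // computed once on receipt, carried end to end

    public static CheckedBlock Receive(byte[] data)
    {
        using (var sha = SHA256.Create())
            return new CheckedBlock { Data = data, Checksum = sha.ComputeHash(data) };
    }

    // every subsequent stage re-verifies the checksum before trusting the data
    public bool Verify()
    {
        using (var sha = SHA256.Create())
            return Convert.ToBase64String(sha.ComputeHash(Data))
                == Convert.ToBase64String(Checksum);
    }
}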
Another design decision was to spread servers evenly across fault and upgrade domains. This way, if a fault domain goes down, only a fraction of the servers is lost. Upgrade domains may be a different fraction.
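For illustration (this modulus placement is an assumption, not WAS's documented policy), striping servers by index spreads them evenly, so losing any one of F fault domains takes out roughly 1/F of the fleet:

// Illustrative only: stripe server i across F fault domains and U upgrade domains.
static (int faultDomain, int upgradeDomain) Place(int serverIndex, int F, int U)
    => (serverIndex % F, serverIndex % U);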
Yet another decision was to support multiple data abstractions from a single stack. Blobs, Tables and Queues are all supported on the same stack.
We will recall that blobs, tables and queues were all tracked with the same master Object Table. This system-defined abstraction reduces the management burden on the system.
Storage is handled in buckets of 100 TB. WAS limits the amount of storage for an account to no more than 100 TB, which allows all storage data pertaining to an account to fit within a given storage stamp. To obtain more storage capacity within a single data center, customers use more than one account within that location. This was considered acceptable for large customers because they already had multiple accounts.
WAS mentions that it seems to defy the CAP theorem: it provides high availability with strong consistency guarantees, while the theorem states that high availability, consistency and partition tolerance may not be achieved at the same time.
Lastly, WAS uses extensive debug logging infrastructure and the logs are maintained on local disks.
The WAS System has also been stress tested using pressure points.
#codingexercise
To count the total number of N-digit numbers whose sums of digits in even and odd positions are both even, we modify it as follows:
long GetCountEqual(int digits, int even, int odd, int n, bool isOdd)
{
    if (digits == n) return (even % 2 == 0 && odd % 2 == 0) ? 1 : 0;
    long count = 0;
    if (isOdd){
        for (int i = 0; i <= 9; i++)
            count += GetCountEqual(digits+1, even, odd+i, n, false);
    }else{
        for (int i = 0; i <= 9; i++)
            count += GetCountEqual(digits+1, even+i, odd, n, true);
    }
    return count;
}

Saturday, March 11, 2017

We continue with a detailed study of the Microsoft Azure stack as inferred from an introduction to Azure by Microsoft. We reviewed some more features of Azure storage. We saw how erasure coding works in Windows Azure Storage aka WAS and how it differs from Pelican. We saw what steps WAS takes to improve durability. We also saw the benefits of journaling. We reviewed the partition layer which translates objects such as blob, table or queue to storage. We saw how the partition manager maintains a massive Object Table that is sharded on different partition servers which maintain their consistency across concurrent transactions using a lock service. We read the partition layer data model and the supported data types and operations, the partition layer architecture and the range partition data structures. We reviewed the load balancing, split and merge operations. Then we reviewed the intrastamp and interstamp replication. We also took stock of TCO, performance, application profiles and their comparison study.
We reviewed some of their design choices. They separated nodes dedicated to computation from nodes dedicated to storage. They decided to use range-based partitioning/indexing instead of hash-based indexing for the partition layer's Object Table. They used throttling and isolation, which proved very useful when accounts are not well behaved. They built automatic load balancing on the range-based partitioning approach and the account-based throttling. They used separate log streams for each RangePartition to isolate the load time of each RangePartition. They used journaling to overcome the read/write contention on the same drive. This also helps optimize small writes, which are the norm when the log files are separated per RangePartition. 
Arguably the single most beneficial design choice was the append-only system. The data is never overwritten once committed to the replica. If there are failures, the extent is immediately sealed. Since the data doesn't change, its commit length can be used to enforce consistency across all the replicas. Another application of the append-only system is its ability to provide snapshot or versioning features at little or no cost. With versions, erasure coding can also be done. Moreover, the append-only system has been a tremendous benefit for diagnosing issues as well as repairing/recovering the system on failures. With versioning, snapshots and a history of the changes, we can now have tools to repair and recover corrupted state to a prior consistent state. The way this works is that the original is untouched, and a copy is made for every write and appended after the previous such record. Since all changes are consistent, we get enough performance for scaling while providing diagnosis and troubleshooting.
An append-only system comes with certain costs. Since each copy takes more space, efficient garbage collection is required to keep the space overhead low. Similarly, end-to-end checksums are a hallmark of this kind of service so that the integrity of critical information is maintained.
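A minimal sketch of the commit-length idea (the types below are illustrative, not WAS's actual data structures): replicas only ever append, a failed extent is sealed, and agreement on the commit length is enough to establish consistency:

using System;
using System.Collections.Generic;
using System.Linq;

// Illustrative append-only extent: records are appended, never overwritten,
// and a sealed extent is frozen at its final commit length.
class Extent
{
    private readonly List<byte[]> records = new List<byte[]>();
    public bool Sealed { get; private set; }
    public int CommitLength => records.Count;

    public void Append(byte[] record)
    {
        if (Sealed) throw new InvalidOperationException("extent is sealed");
        records.Add(record);
    }

    // on failure, seal the extent so every replica agrees on its final length
    public void Seal() { Sealed = true; }
}

// since data is never overwritten, replicas agree iff their commit lengths match
static bool Consistent(IEnumerable<Extent> replicas)
    => replicas.Select(r => r.CommitLength).Distinct().Count() == 1;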
#codingexercise
To count the total number of N-digit numbers whose sums of digits in even and odd positions are equal, we modify it as follows:
long GetCountEqual(int digits, int even, int odd, int n, bool isOdd)
{
    if (digits == n) return even == odd ? 1 : 0;
    long count = 0;
    if (isOdd){
        for (int i = 0; i <= 9; i++)
            count += GetCountEqual(digits+1, even, odd+i, n, false);
    }else{
        for (int i = 0; i <= 9; i++)
            count += GetCountEqual(digits+1, even+i, odd, n, true);
    }
    return count;
}

Friday, March 10, 2017

We continue with a detailed study of the Microsoft Azure stack as inferred from an introduction to Azure by Microsoft. We reviewed some more features of Azure storage. We saw how erasure coding works in Windows Azure Storage aka WAS and how it differs from Pelican. We saw what steps WAS takes to improve durability. We also saw the benefits of journaling. We reviewed the partition layer which translates objects such as blob, table or queue to storage. We saw how the partition manager maintains a massive Object Table that is sharded on different partition servers which maintain their consistency across concurrent transactions using a lock service. We read the partition layer data model and the supported data types and operations, the partition layer architecture and the range partition data structures. We reviewed the load balancing, split and merge operations. Then we reviewed the intrastamp and interstamp replication. We also took stock of TCO, performance, application profiles and their comparison study.
We reviewed some of their design choices. They separated nodes dedicated to computation from nodes dedicated to storage. They decided to use range-based partitioning/indexing instead of hash-based indexing for the partition layer's Object Table. They used throttling and isolation, which proved very useful when accounts are not well behaved. They built automatic load balancing on the range-based partitioning approach and the account-based throttling. They used separate log streams for each RangePartition to isolate the load time of each RangePartition. They used journaling to overcome the read/write contention on the same drive.
#codingexercise
To count the total number of N-digit numbers whose difference between the sums of digits in even and odd positions is 1, we modify it as follows:
long GetCountDiff(int digits, int even, int odd, int n, bool isOdd)
{
    if (digits == n) return Math.Abs(even - odd) == 1 ? 1 : 0;
    long count = 0;
    if (isOdd){
        for (int i = 0; i <= 9; i++)
            count += GetCountDiff(digits+1, even, odd+i, n, false);
    }else{
        for (int i = 0; i <= 9; i++)
            count += GetCountDiff(digits+1, even+i, odd, n, true);
    }
    return count;
}

To skip leading zeros, we use a wrapper that counts from 1 to 9:

long GetCountWrapper(int N)
{
    long count = 0;
    int digits = 0;
    int even = 0;
    int odd = 0;
    for (int i = 1; i <= 9; i++)
        count += GetCountDiff(digits+1, even+i, odd, N, true);
    return count;
}
Notice that the recursion chains until the specified number of digits N is reached.

Thursday, March 9, 2017

We continue with a detailed study of the Microsoft Azure stack as inferred from an introduction to Azure by Microsoft. We reviewed some more features of Azure storage. We saw how erasure coding works in Windows Azure Storage aka WAS and how it differs from Pelican. We saw what steps WAS takes to improve durability. We also saw the benefits of journaling. We reviewed the partition layer which translates objects such as blob, table or queue to storage. We saw how the partition manager maintains a massive Object Table that is sharded on different partition servers which maintain their consistency across concurrent transactions using a lock service. We read the partition layer data model and the supported data types and operations, the partition layer architecture and the range partition data structures. We reviewed the load balancing, split and merge operations. Then we reviewed the intrastamp and interstamp replication. We also took stock of TCO, performance, application profiles and their comparison study.
We now review some of their design choices. First, they separated nodes dedicated to computation from nodes dedicated to storage. This lets them both scale and load balance independently. Second, they decided to use range-based partitioning/indexing instead of hash-based indexing for the partition layer's Object Table. An account's objects are stored within a set of RangePartitions. This makes object enumeration easier because of the locality of the objects. Additionally, performance isolation is easier because we can go object by object. Hash-based schemes have the simplicity of distributing the load across servers but do not keep the objects local. Grouping objects by account also enables account-level throttling. Third, throttling or isolation is very useful when accounts are not well behaved. The statistics are collected by the partition server, which keeps track of request rates for AccountNames and PartitionNames. The same determination is used for load balancing. Fourth, automatic load balancing can now be built on the range-based partitioning approach and the account-based throttling. This improves multi-tenancy in the environment as well as the handling of peaks in traffic patterns. Moreover, the algorithm can be adaptive, even choosing appropriate metrics to determine traffic patterns that are better known and have been handled before. Their approach first started with a single number to quantify the load on each RangePartition and each server, and then used the product of request latency and request rate to represent loads. Still, this did not capture high CPU utilization, so CPU and network load were then also used to quantify the load. This strategy is different from the one used for splitting partitions; that decision is based on whether partitions are becoming full. Both splitting and load balancing allow hot partitions to be reallocated to different servers.
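A small sketch of such a load metric (the field names and the exact combination below are assumptions for illustration, not the paper's formula): score each RangePartition by request rate times latency plus CPU and network terms, then treat the top scorers as candidates to move:

using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical load summary for a RangePartition, in the spirit of the paper:
// start from requestRate * latency, then fold in CPU and network utilization.
class PartitionLoad
{
    public string PartitionName;
    public double RequestRate;   // requests per second
    public double LatencySec;    // average request latency in seconds
    public double CpuFraction;   // 0..1
    public double NetFraction;   // 0..1

    public double Score =>
        RequestRate * LatencySec + CpuFraction + NetFraction;
}

// pick the hottest partitions as candidates for reassignment to cooler servers
static IEnumerable<PartitionLoad> Hottest(IEnumerable<PartitionLoad> loads, int count)
    => loads.OrderByDescending(p => p.Score).Take(count);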
It's interesting to observe that the adaptive algorithm and the movement of partitions are equally applicable to compute resources in an IaaS. Today many private compute resource providers ask customers to choose regions. This can be automated for load balancing and for handling traffic patterns when it makes little or no difference to customers which location hosts their resources. The strategy applies to storage nodes just as much as compute nodes. Furthermore, resources can be grouped and tiered to meet different service level classifications.
#codingexercise
Count of N-digit numbers whose sum of digits equals a given sum
long GetCount(int N, int sum)
{
    if (N == 0) return sum == 0 ? 1 : 0;
    long count = 0;
    for (int i = 0; i <= 9; i++)
        if (sum - i >= 0)
            count += GetCount(N-1, sum-i);
    return count;
}
To skip leading zeros, we use a wrapper that counts from 1 to 9
long GetCountWrapper(int N, int sum)
{
   long count = 0;
   for (int i = 1; i<=9; i++)
         if (sum-i >= 0)
              count += GetCount(N-1, sum-i);
   return count;
}

To count the total number of N-digit numbers whose difference between the sums of digits in even and odd positions is 1, we modify the above to keep track of both the even and the odd sum. The check inside the for loop then sees whether the current position is odd or even and toggles accordingly.

Wednesday, March 8, 2017


We continue with a detailed study of the Microsoft Azure stack as inferred from an introduction to Azure by Microsoft. We reviewed some more features of Azure storage. We saw how erasure coding works in Windows Azure Storage aka WAS and how it differs from Pelican. We saw what steps WAS takes to improve durability. We also saw the benefits of journaling. We reviewed the partition layer which translates objects such as blob, table or queue to storage. We saw how the partition manager maintains a massive Object Table that is sharded on different partition servers which maintain their consistency across concurrent transactions using a lock service. We read the partition layer data model and the supported data types and operations, the partition layer architecture and the range partition data structures. We reviewed the load balancing, split and merge operations. Then we reviewed the intrastamp and interstamp replication. We also took stock of TCO. We also reviewed the performance of WAS and a comparison study of the usage of Blobs, Tables and Queues by different cloud applications and services.
The applications and services were Facebook and Twitter search for Bing, XBox Gamesaves, XBox Telemetry and the Microsoft Zune backend. The percentage of requests across all services showed that 17.9% of the requests used Blobs, 46.88% used Tables and 35.22% used Queues. But in terms of capacity, 70.31% was in Blobs, 29.68% in Tables and 0.01% in Queues. The tiny fraction for Queues was expected because queues hold very little data compared to Blobs and Tables. Application profiles also differed in how heavily they used Blobs, Tables and Queues. Zune and XBox Gamesaves had mostly unstructured data and put it into Blobs, whereas customers like Bing and XBox Telemetry have to index a lot of data.
Earlier I brought up that one of their design decisions was to separate VM-based computation from storage for Windows Azure so that both could scale and load balance independently, given the nature of their access patterns. Therefore nodes running a customer's service code are separate from nodes providing their storage, and WAS focused on high bandwidth without the data being on the same node or on the same rack. To improve this further, the datacenter networking topology will be flattened out to provide full bisection bandwidth between compute and storage. 
#codingexercise
Friends Pairing problem:
Given n friends, each one can remain single or can be paired up with some other friend. Each friend can be paired only once so ordering is irrelevant
The total number of ways in which the friends can be paired is given by:
int GetPairs(int n)
{
    if (n <= 2) return n;
    // the nth friend either stays single (the n-1 subproblem)
    // or pairs with any of the other n-1 friends (the n-2 subproblem)
    return GetPairs(n-1) + (n-1) * GetPairs(n-2);
}
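Because the recursion above recomputes subproblems, an iterative version of the same recurrence runs in linear time (a sketch, assuming the count fits in a long):

long GetPairsDP(int n)
{
    if (n <= 2) return n;
    long prev2 = 1, prev1 = 2; // f(1), f(2)
    for (int i = 3; i <= n; i++)
    {
        long cur = prev1 + (i - 1) * prev2; // f(i) = f(i-1) + (i-1)*f(i-2)
        prev2 = prev1;
        prev1 = cur;
    }
    return prev1;
}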

Shortest superstring problem:
Given a set of distinct strings, find the shortest superstring  that contains each string in a given set as substring. The distinct strings cannot be substrings of each other:
The method to solve this is as follows:
1. Make a copy of the array to begin with
2. Find a pair that has maximum overlap as a and b
3. Replace a and b with their overlapped merge (a followed by the non-overlapping suffix of b)
4. The size of the array reduces by one; repeat steps 2 to 4 until we are left with only one superstring.
5. Since we find the maximum overlap at each point, we make a greedy choice for the solution. This is an approximate solution and not necessarily the most optimal.
string FindShortestSuperString(List<string> input)
{
    if (input == null || input.Count == 0) return string.Empty;
    var tmp = new List<string>(input);
    while (tmp.Count > 1)
    {
        int max = 0;
        int l = 0;
        int r = 1;
        string merged = tmp[0] + tmp[1];
        for (int i = 0; i < tmp.Count; i++)
            for (int j = 0; j < tmp.Count; j++)
            {
                if (i != j)
                {
                    // longest suffix of tmp[i] that is a prefix of tmp[j]
                    int cur = GetOverlapLength(tmp[i], tmp[j]);
                    if (cur > max)
                    {
                        max = cur;
                        l = i;
                        r = j;
                        merged = tmp[i] + tmp[j].Substring(cur);
                    }
                }
            }
        // remove the higher index first so the lower index stays valid
        tmp.RemoveAt(Math.Max(l, r));
        tmp.RemoveAt(Math.Min(l, r));
        tmp.Insert(Math.Min(l, r), merged);
    }
    return tmp[0];
}
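The code above assumes a GetOverlapLength helper; a minimal sketch is the quadratic scan below, which returns the length of the longest suffix of a that is also a prefix of b (adequate for small inputs):

int GetOverlapLength(string a, string b)
{
    int best = 0;
    int limit = Math.Min(a.Length, b.Length);
    for (int len = 1; len <= limit; len++)
        if (a.Substring(a.Length - len) == b.Substring(0, len))
            best = len;
    return best;
}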