Cluster computing

Wednesday, July 27, 2016

Today we continue our discussion of the paper titled "Pelican: a building block for exascale cold data storage". Pelican treats a group of disks as a single schedulable unit. Resource restrictions such as power consumption, vibrations and failure domains are expressed as constraints over these units.

With the help of resource constraints and scheduling units, Pelican aims to be better than its over provisioned counterpart racks using computations in software stacks

Pelican uses resources from a set of resource domains which is a subset of disks. Pelican proposes a data layout and IO scheduling algorithms by expressing these resource domains as constraints over the disks.

We were reading about the IO scheduler. It works on groups and classes rather than the hardware constraints. It reorders requests between the classes to reduce spin latency. The IO scheduler manages costs from the requests as well as costs from the rebuild traffic. It does this using two mechanisms - request ordering and rate limiting.

The scheduler does reordering to batch sets of operation for the same group to amortize the group spin up latency over the set of operations. The larger the batch size, the smaller the cost from the request but more the queuing delay for some operations. Let us see how they quantify this delay.

Queued operations have a timestamp. An operation r has a timestamp tr and is assigned a reordering counter or. The difference between or and tr is the absolute change in ordering compared to first come first serve. There is an upper bound u on the tolerated re-ordering and each difference for every request must be within this limit. The scheduler first examines the queue to find l, the last operation in the same group as r. If there are no such operation, r is appended to the tail of the queue and the process completes. On the other hand, if there was an l then the impact of inserting r after l is determined. The scheduler performs a check to quantify the impact if r were inserted after l by considering all operations i following l. if for any of these operations i, the absolute change does not respect the upper bound, then r is appended to the tail. Otherwise all counters are incremented by 1 and r is inserted after l. Otherwise all oi counters are incremented by one, and r is inserted after l with its reordering counter reduced by the number of request i that it has jumped.

This remains true for all uniform blob sizes so the reordering is expressed in terms of the number of operations. If it is non-uniform, it is expressed in terms of wall clock time.

#codingexercise

Given a binary array and an integer m, find the position of zeroes flipping which creates maximum number of consecutive 1s in array.

The solution involves the following steps

while the number of zeros is no more than m, expand the window to the right and update the count of zeros

while the count of zeros exceeds m, shrink the window from the left and update the count of zeros

update the widest window along the way.

The positions of output zeros are inside the window.

void findZeroes(List<int> arr, int n, int m)
{
int left = 0, right = 0;
int start = 0, window = 0;
int count = 0;
while (right < n)
{
if (count <= m)
{
if (arr[right] == 0)
count++;
right++;
}
if (count > m)
{
if (arr[left] == 0)
count--;
left++;
}
if (right-left > window)
{
window = right-left;
start = left;
}
}
for (int i=0; i<window; i++)
{
if (arr[start+i] == 0)
Console.rightite("{0}",start+i);
}

}

Cluster computing

Wednesday, July 27, 2016

No comments:

Post a Comment