Thursday, July 28, 2016

Today we continue our discussion of the  paper titled "Pelican: a  building block for exascale cold data storage". Pelican treats a group of disks as a single schedulable unit. Resource restrictions such as power consumption, vibrations and failure domains are expressed as constraints over these units.
With the help of resource constraints and scheduling units, Pelican aims to be better than its over provisioned counterpart racks using computations in software stacks
Pelican uses resources from a set of resource domains which is a subset of disks. Pelican proposes a data layout and IO scheduling algorithms by expressing these resource domains as constraints over the disks. 
We were reading about the IO scheduler. It works on groups and classes rather than the hardware constraints. It reorders requests between the classes to reduce spin latency. The IO scheduler manages costs from the requests as well as costs from the rebuild traffic. It does this using two mechanisms - request ordering and rate limiting.
If the blob sizes are uniform, the reordering is expressed in terms of the number of operations. If it is non-uniform, it is expressed in terms of wall clock time.



We now look at rate limiting methods. The rebuild operation interferes with the regular operations that are queued, hence the rate at which each queue is serviced is then controlled. The goal here is to make the rebuild operations complete within an upper time bound. This helps ensure data durability when the annual failure rates of the hardware are known.
If x is the amount of data on the failed disk and t is the average throughput of a single disk, then x / t is the time taken to transfer and if w is the fraction of the resources the scheduler allocates, then the total time to repair after a single disk failure is  (x / t times 1/w) .
This is used to compute the weights for two queues - one for rebuild and another for regular operations and the scheduler merely does weighted fair queuing.


Building bridges problem


Consider a 2D map with a river flowing from left to right through the center. There are n cities to the south of the river and same number on the northern bank. We have to bridge city i on the northern bank to city i on the southern bank however the order of cities can be random on both the northern and southern bank. We have to draw as many bridges without having them cross each other.

int GetBridges(List<int> cities)
{
int n = cities.Length;
int best = new List<int>(n);
for (int i =1; i < n; i++)
for (int j = 0; j < i j++)
{
    if (cities[i] < cities[j] && best[j] + 1 > best[i])
          best[i] = best[j] + 1;
}
return best.max();
}

return min (GetBridges(NorthernCities), GetBridges(SouthernCities));

Find the largest subarray with equal number of zeros and ones
Tuple<int, int> GetLargestSubArray(List<int> nums)
{
var ret = new Tuple<int, int>();
int sum = 0;
int size = INT_MIN;
for (int i = 0; i < nums.Count; i++)
{
   sum = (nums[i] == 0) ? -1: 1;
    int maxsize = -1, startindex;
   for (int j = i+1; j < nums.Count; j++)
   {
       if (nums[j] == 0)
            sum += -1;
       else
            sum += 1;
       if (sum == 0 && size < j-i+1)
      {
             size = j - i + 1;
             ret.first = i;
             ret.second = j;
      }
   }
}
return ret;  
}

No comments:

Post a Comment