Cluster computing

Wednesday, July 20, 2016

Today we continue our discussion of the paper titled "Pelican: a building block for exascale cold data storage". Pelican treats a group of disks as a single schedulable unit. Resource restrictions such as power consumption, vibrations and failure domains are expressed as constraints over these units.

With the help of resource constraints and scheduling units, Pelican aims to do better than its overprovisioned counterpart racks by handling these restrictions in the software stack.

Pelican is not merely a power consumption optimization system. Previous work such as Massive Arrays of Idle Disks (MAID) achieves power proportionality but assumes there is sufficient power and cooling to have all disks spinning and active when required, so it only saves power. Pelican's peak power is 3.7 kW and its average is 2.6 kW. Pelican provides both a lower capital cost per disk and a lower running cost, whereas previous work improved only the latter.

Pelican stores unstructured, immutable chunks of data called blobs and has a key-value store interface with write, read and delete operations. Supported blob sizes range from 200 MB to 1 TB. Each blob is identified by a 20-byte key. Since Pelican serves to archive data, the initial period is spent mostly writing blobs, and all subsequent time is spent doing infrequent reads. Reading and repairing constitute much of the normal mode of operation.
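As a rough illustration of the interface described above, the sketch below models a key-value blob store with write, read and delete. The names IBlobStore and InMemoryBlobStore and the method signatures are assumptions made for this example, not Pelican's actual API; only the three operations, the 20-byte key and the immutability of blobs come from the description above. The 200 MB to 1 TB size range is not enforced here.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical interface: the description only names write, read and delete.
public interface IBlobStore
{
    void Write(byte[] key, byte[] blob);
    byte[] Read(byte[] key);
    bool Delete(byte[] key);
}

// In-memory stand-in, useful only to show the contract.
public class InMemoryBlobStore : IBlobStore
{
    public const int KeyLength = 20; // each blob is identified by a 20-byte key

    private readonly Dictionary<string, byte[]> blobs = new Dictionary<string, byte[]>();

    // Validates the key and turns it into a dictionary index.
    private static string Index(byte[] key)
    {
        if (key == null || key.Length != KeyLength)
            throw new ArgumentException("key must be exactly 20 bytes");
        return Convert.ToBase64String(key);
    }

    public void Write(byte[] key, byte[] blob)
    {
        if (blob == null) throw new ArgumentNullException("blob");
        string k = Index(key);
        // Blobs are immutable, so a second write to the same key is rejected.
        if (blobs.ContainsKey(k))
            throw new InvalidOperationException("blob already exists");
        blobs[k] = (byte[])blob.Clone();
    }

    public byte[] Read(byte[] key)
    {
        // Throws KeyNotFoundException when the blob does not exist.
        return (byte[])blobs[Index(key)].Clone();
    }

    public bool Delete(byte[] key)
    {
        return blobs.Remove(Index(key));
    }
}
```

Rejecting overwrites is one way to model immutability; a real archive tier could equally treat a repeated write as a no-op.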

Pelican serves as the lower tier in a cloud-based storage system. Data is already staged in the higher layers, and Pelican controls when it is transferred, so writes can be scheduled for low-load periods. Pelican therefore focuses on read-dominated workloads.

The software storage stack plays the innovative role of reducing the performance impact of the restrictions on hardware resources. Pelican uses resources from a set of resource domains. Each resource domain supplies its resource to only a subset of its disks simultaneously. Disks that share a resource domain are said to be domain conflicting. Two disks that share no common resource domain are domain disjoint. Pelican proposes a data layout and IO scheduling algorithms over these resource domains.
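The two definitions above can be made concrete with a small sketch. It assumes each disk is described by the set of resource domains it belongs to; the Disk class and domain names such as "power-0" are hypothetical and not taken from the paper. Two disks conflict when those sets intersect and are disjoint otherwise.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical model: a disk has an id and the set of resource domains
// (power, cooling, vibration, ...) that it draws from.
public class Disk
{
    public int Id { get; }
    public HashSet<string> Domains { get; }

    public Disk(int id, params string[] domains)
    {
        Id = id;
        Domains = new HashSet<string>(domains);
    }
}

public static class ResourceDomains
{
    // Domain conflicting: the disks share at least one resource domain.
    public static bool AreConflicting(Disk a, Disk b)
    {
        return a.Domains.Overlaps(b.Domains);
    }

    // Domain disjoint: the disks share no resource domain.
    public static bool AreDisjoint(Disk a, Disk b)
    {
        return !AreConflicting(a, b);
    }
}
```

For example, a disk in {"power-0", "cooling-0"} and a disk in {"power-0", "cooling-1"} are domain conflicting because both draw on power-0, while a disk in {"power-1", "cooling-1"} is domain disjoint from the first.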

Given a tree, implement a function which replaces a node's value with the sum of its children's values, considering only those children whose value is more than the main node's value.

// Post-order: convert both subtrees first, then update this node from its children.
void ToChildrenSumTree(ref node root)
{
    int left = 0;
    int right = 0;
    if (root == null || (root.left == null && root.right == null)) return;
    ToChildrenSumTree(ref root.left);
    ToChildrenSumTree(ref root.right);
    if (root.left != null) left = root.left.data;
    if (root.right != null) right = root.right.data;
    // add up only the children whose value exceeds this node's value
    int sum = 0;
    if (left > root.data) sum += left;
    if (right > root.data) sum += right;
    int delta = sum - root.data;
    if (delta >= 0)
        root.data = root.data + delta; // equivalent to root.data = sum
    else
        increment(ref root, 0 - delta); // increment is a helper that is not shown in this post
}
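Since the routine above relies on a node type and an increment helper that are not shown, here is a self-contained sketch of the same post-order idea. The Node class, the TreeOps wrapper and the choice to leave a node unchanged when neither child exceeds it are assumptions made for this example, not part of the solution above.

```csharp
public class Node
{
    public int Data;
    public Node Left;
    public Node Right;

    public Node(int data, Node left = null, Node right = null)
    {
        Data = data;
        Left = left;
        Right = right;
    }
}

public static class TreeOps
{
    // Post-order: children are rewritten first, so a parent is compared
    // against its children's already-updated values.
    public static void ToChildrenSumTree(Node root)
    {
        if (root == null || (root.Left == null && root.Right == null)) return;

        ToChildrenSumTree(root.Left);
        ToChildrenSumTree(root.Right);

        int sum = 0;
        bool anyLarger = false;
        if (root.Left != null && root.Left.Data > root.Data)
        {
            sum += root.Left.Data;
            anyLarger = true;
        }
        if (root.Right != null && root.Right.Data > root.Data)
        {
            sum += root.Right.Data;
            anyLarger = true;
        }

        // Assumption: when no child is larger, the node keeps its value.
        if (anyLarger) root.Data = sum;
    }
}
```

With the tree 1 -> (2 -> (4, 1), 3), the node 2 first becomes 4 because only its child 4 is larger, and the root then becomes 4 + 3 = 7.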

Given a sorted matrix (row-wise and column-wise), find the kth smallest element.

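A walk that only steps right or down can skip smaller elements, so instead we binary-search on the value and walk the board to count how many elements are at most the candidate, until that count reaches k.

// Returns the kth smallest element; k is assumed to be 1-based.
// row and col are the dimensions of m, which must be non-empty.
int GetKth(int[,] m, int row, int col, int k)
{
    int lo = m[0, 0];             // smallest element
    int hi = m[row - 1, col - 1]; // largest element
    while (lo < hi)
    {
        int mid = lo + (hi - lo) / 2;
        // count elements <= mid with a staircase walk from the bottom-left corner
        int count = 0;
        int r = row - 1;
        int c = 0;
        while (r >= 0 && c < col)
        {
            if (m[r, c] <= mid) { count += r + 1; c++; } // rows 0..r of this column qualify
            else r--;
        }
        if (count < k) lo = mid + 1;
        else hi = mid;
    }
    return lo;
}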
