Cluster computing

Saturday, July 30, 2016

Debunking "data is sticky" in cloud computing

Traditional enterprise software has focused on accumulating incredible amounts of data for its systems. As the data grows, it becomes more and more sticky, requiring applications and services to change accordingly. Entire ecosystems then evolve around the data, bringing many routines and a heavy maintenance burden.

Compute and storage have often gone hand in hand, with the requirements of each shaping the other. Yet they have fundamentally different needs and require different services to form their own layers. While the separation between the layers may not matter for desktop applications, it does for cloud computing. On premises, data reached the application as a connection string, a file system or a message bus. In the cloud, these were replaced by blobs, key-values, queue messages, file shares and object storage. A database or a file no longer needs to be static: it can be moved around, backed up, replicated, compressed, encrypted, aged and archived.

Object storage gave some of these benefits by exposing a raw storage platform that distributed the failure domains for the data. By adding metadata and a globally unique identifier, the data could now live on multiple virtual machines or storage nodes, which increased availability, scalability and self-healing abilities. Applications took advantage of this by storing photos and songs.

However, object storage is not a managed platform. It is an alternative to file and block storage. If it were a managed platform, it would provide the services that storage appliances are used for, such as deduplication, backup, aging and archival, and self-healing enhancements. One platform came close to providing these managed services. Isilon OneFS provides a distributed file system on a storage cluster. It combines the three layers of a traditional storage architecture, namely a file system, a volume manager and data protection, and it scales out because it relies on intelligent software, commodity hardware and a distributed architecture.
However, OneFS builds its storage cluster from on-premises hardware. If Isilon could instead build the cluster from cloud resources such as Mesos clusters, it would harness more cloud computing power with very little physical on-premises footprint. Mesos acts as a distributed systems kernel that runs on every machine, so Isilon could use its APIs for resource management and scheduling across entire data center and cloud environments. This provides several benefits:

It provides linear scalability, in that clusters can grow to tens of thousands of nodes, whereas a OneFS cluster tops out at around 143 nodes with a hundred terabytes each.
It provides high availability, in that the master and agents are replicated in a fault-tolerant manner and even upgrades are non-disruptive.
It lets the same cluster support both cloud-native and legacy applications through pluggable scheduling policies, so that OneFS services can be ported or rewritten.
It provides APIs for operating and monitoring the cluster, leaving much of the onus on Mesos while still allowing customizations. These APIs are already exercised by the web user interface that this layer makes available.
It provides pluggable isolation for CPU, memory, disks, ports and GPU, along with modules for custom resource isolation. This can help OneFS distribute and differentiate its workloads.
It is cross-platform, running on Linux, OSX and Windows, and it is cloud provider agnostic. This makes it broadly deployable, which eases adoption and onboarding.
With an existing customer base validating the viability of OneFS and Mesos individually, it seems reasonable to expect that stacking one over the other would be viable and useful as well. Lastly, this provides a managed platform for storage rather than raw services, which lets applications be more compute oriented and less storage conscious.
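
As a rough illustration of the monitoring APIs mentioned above, the following Python sketch sums the unallocated resources reported by a Mesos master. It assumes the master serves a JSON state document at a `/state` path and that each entry under `slaves` carries `resources` and `used_resources` maps. The function names `fetch_state` and `free_resources` are hypothetical, and the field names should be checked against the Mesos version in use.

```python
import json
from urllib.request import urlopen


def fetch_state(master_url: str) -> dict:
    """Fetch the master's state document (the /state path is an assumption)."""
    with urlopen(f"{master_url}/state") as response:
        return json.load(response)


def free_resources(state: dict) -> dict:
    """Sum unallocated cpus, mem and disk across all agents in the state."""
    totals = {"cpus": 0.0, "mem": 0.0, "disk": 0.0}
    for agent in state.get("slaves", []):  # assumed field name
        total = agent.get("resources", {})
        used = agent.get("used_resources", {})
        for name in totals:
            totals[name] += total.get(name, 0) - used.get(name, 0)
    return totals
```

A storage layer built on top could call `free_resources(fetch_state("http://<master-host>:5050"))` before deciding where to place a new workload. The host and port here are placeholders.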

#codingexercise

Given a number n and k (the number of swaps allowed), make the biggest number possible from the digits of n using at most k swaps. If n is already the biggest, return the number itself.

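To solve the problem, we work from left to right and, at each position, swap in the largest digit found to its right. This is the digit that would appear at that position if the digits were sorted in descending order.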

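// Greedy pass: put the largest remaining digit at each position, left to right.
// Assumes num is non-negative and k >= 0. Returns long because the rearranged
// digits can exceed int.MaxValue.
static long getnum(int num, int k)
{
    char[] digits = num.ToString().ToCharArray();
    int swaps = 0;
    // Stop at the second-to-last digit, or as soon as the k swaps are used up.
    for (int i = 0; i < digits.Length - 1 && swaps < k; i++)
    {
        int pos = getmax(digits, i + 1);
        if (digits[pos] > digits[i])
        {
            Swap(digits, i, pos);
            swaps++;
        }
    }
    return long.Parse(new string(digits));
}

// Index of the largest digit at or after start. Ties go to the rightmost copy,
// which is a heuristic and not always the best choice.
static int getmax(char[] digits, int start)
{
    int pos = start;
    for (int j = start + 1; j < digits.Length; j++)
        if (digits[j] >= digits[pos]) pos = j;
    return pos;
}

static void Swap(char[] digits, int i, int j)
{
    char t = digits[i]; digits[i] = digits[j]; digits[j] = t;
}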
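
A left-to-right greedy pass like the one above can miss the optimum when the largest digit repeats, because which copy gets swapped in affects what the later swaps can achieve. For 2199 with k = 2, swapping in the last 9 first leads to 9912, while swapping in the other 9 leads to 9921. For 1993 with k = 1 the preference is reversed. The sketch below, written in Python for brevity, tries every copy of the largest digit and backtracks. The name `max_after_swaps` is hypothetical, and the search is exponential in the worst case, so treat it as an illustration and not a tuned solution.

```python
def max_after_swaps(num: int, k: int) -> int:
    """Largest number reachable from num (non-negative) with at most k swaps."""
    digits = list(str(num))
    best = [num]

    def search(i: int, swaps_left: int) -> None:
        # Record the current arrangement if it beats the best seen so far.
        best[0] = max(best[0], int("".join(digits)))
        if swaps_left == 0 or i >= len(digits) - 1:
            return
        largest = max(digits[i:])
        if largest == digits[i]:
            # Position i already holds the best available digit: spend no swap.
            search(i + 1, swaps_left)
            return
        # Try every copy of the largest digit, since the choice among ties matters.
        for j in range(i + 1, len(digits)):
            if digits[j] == largest:
                digits[i], digits[j] = digits[j], digits[i]
                search(i + 1, swaps_left - 1)
                digits[i], digits[j] = digits[j], digits[i]  # undo the swap

    search(0, k)
    return best[0]
```

For example, `max_after_swaps(2199, 2)` explores both copies of 9 for the first position and keeps the better outcome.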
