Cluster computing

Tuesday, July 10, 2018

We return to discussing storage as a network. In particular, I want to bring up a level of separation between storage and networking and show that pushing this separation further into one domain opens up technologies that are vastly different from those we get when it is pushed into the other. For example, Peer-to-Peer (P2P) networking provides a good base for large-scale data sharing and application-level multicasting. Some of the desirable features of P2P networks include selection of peers, redundant storage, efficient location, hierarchical namespaces, authentication as well as anonymity of users. In terms of performance, P2P networks offer efficient routing, self-organization, massive scalability, robustness in deployment, fault tolerance, load balancing and explicit notions of locality. Perhaps the biggest takeaway is that a P2P network is an overlay network with no restriction on size, and that there are two classes: structured and unstructured. Structured P2P means that the network topology is tightly controlled and the content is placed not on random peers but at specified locations, which makes subsequent queries more efficient. DHTs fall in this category: the location of the data objects is deterministic and the keys are unique. Napster was probably the first example to realize the benefit of distributed file sharing, with the assertion that requests for popular content do not need to be sent to a central server. P2P file sharing systems are self-scaling.
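To make the deterministic placement in a structured network concrete, here is a minimal sketch of a hash ring, one common way a DHT can map a key to a peer. It is only an illustration under stated assumptions: the class name HashRing, the peer names, the choice of SHA-1 and the 32-bit ring size are all hypothetical and not taken from any particular P2P system.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Security.Cryptography;
using System.Text;

// Hypothetical sketch: deterministic key placement on a hash ring.
public sealed class HashRing
{
    // Ring position -> peer name, enumerated in ascending position order.
    private readonly SortedDictionary<uint, string> ring =
        new SortedDictionary<uint, string>();

    public HashRing(IEnumerable<string> peers)
    {
        foreach (string peer in peers)
        {
            // Peers whose positions collide overwrite each other; acceptable for a sketch.
            ring[Position(peer)] = peer;
        }
    }

    // Map a string to a 32-bit ring position using the first four bytes of its SHA-1 digest.
    public static uint Position(string value)
    {
        using (SHA1 sha = SHA1.Create())
        {
            byte[] d = sha.ComputeHash(Encoding.UTF8.GetBytes(value));
            return ((uint)d[0] << 24) | ((uint)d[1] << 16) | ((uint)d[2] << 8) | d[3];
        }
    }

    // The owner of a key is the first peer at or after the key's position, wrapping around.
    public string Lookup(string key)
    {
        if (ring.Count == 0)
        {
            throw new InvalidOperationException("The ring has no peers.");
        }
        uint position = Position(key);
        foreach (KeyValuePair<uint, string> entry in ring)
        {
            if (entry.Key >= position)
            {
                return entry.Value;
            }
        }
        return ring.First().Value; // past the last peer, so wrap to the first one
    }
}
```

Because the owner depends only on the hashes of the key and of the peer names, any peer that knows the membership computes the same answer for the same key. A real DHT would route the lookup across peers rather than scan a local table as this sketch does.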
We have referred to the P2P network as a top-heavy architecture as opposed to the storage-first architectures. Let us elaborate on this a bit more. Top-heavy means we have an inverted pyramid of layers where the bottom layer is the network layer. This is the substrate that connects different peers. The overlay nodes management layer handles the management of these peers in terms of routing, location lookup and resource discovery. The layer on top of this is the features management layer, which involves security management, resource management, reliability and fault resiliency. Over this we have the service-specific layer, which includes services management, metadata, services messaging and services scheduling. Finally, we have the application layer on top, which involves applications, tools and services. Fundamentally, P2P networks do not arise from established and connected groups of systems. They don't have a reliable set of resources. Yet they have fault-tolerance, self-organization and massive scalability properties.
Courtesy: IEEE publications on these comparisons.
#codingexercise
Find the maximum number of unbalanced brackets that remain after performing all the swaps needed to balance the brackets. For example:
[]][][ => 1
[[][]] => 0
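int GetCountSwaps(String p)
{
    validate(p); // assumed to be defined elsewhere, e.g. to reject null input
    // Each stack holds the brackets that are still unmatched so far.
    Stack<char> open = new Stack<char>();
    Stack<char> close = new Stack<char>();
    for (int i = 0; i < p.Length; i++)
    {
        if (p[i] == '[')
        {
            open.Push('[');
        }
        if (p[i] == ']')
        {
            if (open.Count > 0)
            {
                open.Pop(); // found a match
            }
            else
            {
                close.Push(']');
            }
        }
    }
    // Each swap pairs one leftover ']' with one leftover '['.
    int swaps = Math.Min(close.Count, open.Count);
    // Whatever is left on either stack after those swaps stays unbalanced.
    return Math.Max(Math.Abs(close.Count - swaps), Math.Abs(open.Count - swaps));
}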
The GeeksforGeeks solution for this appears incorrect.

A solution using a single linear scan is also possible; however, only the stack keeps track of the sequence of unbalanced brackets.
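As a rough illustration of the linear scan, the sketch below replaces the two stacks with two counters. The class and method names are hypothetical, and the sketch only reports how many '[' and ']' remain unmatched; it does not remember where they occur in the input.

```csharp
using System;

// Hypothetical sketch: count unmatched brackets in one pass without stacks.
public static class BracketScan
{
    // Returns the number of '[' and ']' left unmatched after a left-to-right scan.
    public static (int Open, int Close) CountUnmatched(string p)
    {
        if (p == null)
        {
            throw new ArgumentNullException(nameof(p));
        }
        int open = 0;  // '[' seen so far that are still waiting for a ']'
        int close = 0; // ']' seen with no earlier '[' available to match
        foreach (char c in p)
        {
            if (c == '[')
            {
                open++;
            }
            else if (c == ']')
            {
                if (open > 0)
                {
                    open--; // found a match
                }
                else
                {
                    close++;
                }
            }
        }
        return (open, close);
    }
}
```

For the input []][][ this leaves one unmatched '[' and one unmatched ']', while [[][]] leaves none. The counters use constant extra space, at the cost of not recording the positions of the unmatched brackets.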

One optimization in the preceding code is that we treat the open-close and close-open pairs as requiring 0 and 1 swaps respectively.
