Cluster computing

Friday, June 29, 2018

We resume our discussion on Storj network and it's messaging
The messaging system implemented by Storj network is Quasar. The publisher-subscriber model is topic based and implemented utilizing Bloom Filters. The topics are predetermined and made available at the protocol level. This helps with implementing Storj over Kademlia. Three new message types are added that correspond to subscribe, update and publish methods. The topics can therefore be extended and include such things as contract parameters and bandwidth reservation. The Kademlia messages facilitate the creation and propagation of filters. Each node maintains information about topics to which it subscribes as well as those to which its neighbors subscribe. Three neighbors are chosen to form the initial subscribe list. The response to the subscribe includes this current filter list. The requesting nodes then merges these lists. This is a pull operation. The update message does the reverse. It pushes the filter changes to the three nearest neighbors. It usually follows the subscribe message so that the loop helps all the nodes learn what is relevant to their neighbors and to those reachable from them. Publish messages broadcast the message to the network. These messages are sent to a node with three nearest neighbors and includes a topic parameter. If the topic is found, the node processes the message. If it is found in the other filters in the filter list, it is forwarded to the three neighbors. If nothing matches, it is forwarded to a randomly selected peer. To prevent the message from coming back, the nodes add their id to the message. Publish messages also include time to live to prevent spam attacks.
We were comparing Kademlia with other P2P networks. In general they provide a good base for large scale data sharing and application level multicasting. Some of the desirable features of P2P networks include selection of peers, redundant storage, efficient location, hierarchical namespaces, authentication as well as anonymity of users. In terms of performance, the P2P has desirable properties such as efficient routing, self-organizing, massively scalable and robust in deployments, fault tolerance, load balancing and explicit notions of locality. Perhaps the biggest takeaway is that ht P2P is an overlay network with no restriction on size and there are two classes structured and unstructured. Structured P2P means that the network topology is tightly controlled and the content are placed at random peers and at specified location which will make subsequent queries more efficient. DHTs fall in this category where the location of the data objects is deterministic and the keys are unique. In distributed computing we saw the benefit of arranging the nodes in a ring with hashes at different nodes. Napster was probably the first example to realized the distributed file sharing benefit with the assertion that requests for popular content does not need to be sent to a central server. P2P file sharing systems are self-scaling.
#codingexercise
Given a paper size of A x B, cut the papers into squares of any size and determine the minimum number of squares.
uint GetMinCount(uint a, uint b)
{
uint count= 0;
uint remainder = 0;
uint min = GetMin(a,b);
uint max= GetMax(a,b);
while (min > 0)
{
count += max/min;
remainder = max%min;
max = min;
min = remainder;
}
return count;
}

Cluster computing

Friday, June 29, 2018

No comments:

Post a Comment