Friday, May 12, 2023

 

An earlier article introduced Epidemic Algorithms. This continues the discussion with some examples of those algorithms. Cyclon is used for membership management while T-man is used for Topology management. Both demonstrate peer sampling which we might recall from the earlier article as one of the criteria for the design space for such algorithms. Every node maintains a relatively small local membership table that provides a partial view of the complete set of nodes and it periodically refreshes the table using a gossiping procedure. 

The generic peer sampling will execute a handler for a timed event that runs every T time units. It will involve selecting peers into a view, permuting the view, moving the oldest pre-defined set of items to the end of the view, adding the others to a buffer initialized with the current node address, sending the buffer to the selected peers, receiving the set of buffers from those selected peers, refreshing the view with these set of buffers and increasing the age of the view.  This generic framework also includes a receiver handler that executes what the timed event handler does but without selecting peers since the senders are known from what is received. The refreshing of the view in these handlers will start by appending the set of buffers to the view, removing duplicates, removing old items, removing the initial few as necessary, and removing some items at random. 

Cyclon as a peer sampling example will involve a highest age Tail strategy for peer selection, a push-pull mode for view propagation, and a Swapper mode for view selection. A Cyclon peer will pick the oldest peer from its view and remove it from the view. It will exchange some of the peers in neighbors via (swap policy) and the active peer sends its fresh address.  If there are no network disruptions, no peer becomes disconnected in the undirected graph. Pointers are updated so peers change from being neighbor of one peer to being the neighbor of another peer.  The algorithm starts from a state where peers are connected in a chain and converges to a graph that has the same average path length as a random graph. 

T-man is a protocol that can construct and maintain any topology with the help of a ranking function. The ranking function orders any set of nodes according to the neighbors of a given node.  As with generic peer sampling algorithms, T-man also has a timed event that runs every T time units. It will involve selecting peers into a view,  permuting the view, merging a descriptor formed from the current node address and its profile into a buffer, merging the random order of the selected peers into a buffer, sending the buffer to the selected peers, receiving the set of buffers from those selected peers, refreshing the view with these set of buffers by the same merge operation and selecting the view as a subset of the buffer.  It also includes a receiver handler that executes what the timed event handler does but without selecting peers since the senders are known from what is received. The selecting of the view from the buffer will sort all nodes in the buffer and pick out the highest ranked nodes. The ranking function could be based on linear interpolation or ring formation. 

When a private node behind a Network Address Traversal aka NAT, requests the information from a public node it will specify a shuffle request. In response the public node will send a shuffled response. Both the private node and the public node will update their states. If another node on the internet tries to issue a shuffle request to the private node behind the NAT, it will not pass through the NAT. This is the expected behavior for the Epidemic algorithms when they leverage a NAT.

The solutions for communicating with the private nodes in this manner can be achieved in one of two ways. The first technique involves a relay where the communications are relayed to the private node using a public relay node. The second method involves a NAT hole-punching algorithm to establish a direct connection to the private node using a public rendezvous node. The choice between the two is determined by latency and load.  Relaying has lower latency message exchange. It enables lower gossip cycle periods. It is also necessary in dynamic networks. The other method decreases load on the public nodes but this is not always the case. The load does not decrease if the shuffle messages are small.

Gozar is NAT aware peer sampling algorithm. Each private nodes connects to one or more public nodes called partners that act as a relay or rendezvous server on behalf of the private node. A node’s descriptor consists of both its own address, it’s NAT type, and its partners’ addresses at the time of the descriptor creation. When a node wants to gossip with a private node, it uses the partner addresses in its descriptor to communicate with the private node.

Epidemic algorithms are important techniques to solve problems in a dynamic large scale systems. They are scalable, simple and robust to node failures, message loss and transient network disruptions. The applications are aggregation, membership management, and topology management.


#codingexercise 

https://1drv.ms/w/s!Ashlm-Nw-wnWhMpiXxGeVryG7eiFXA

 

 

 

 

 

No comments:

Post a Comment