Cluster computing

Tuesday, October 8, 2013

In distributed computing, there is a very interesting technique called gossip. We will describe it in this post. By the way my next several posts are going to be in distributed computing just like the one this Sunday. This is because I'm reading that book I cited earlier. Gossip refers to the mechanism of diffusing computations from one process to all the other processes. The computations are forwarded as messages and although the mechanism may not be obvious, the goal is that all the processes will have received the computations.
First let us consider the topology of processes as an undirected graph such that processes are able to communicate to their neighbor. By undirected I mean neighbors can send messages to each other in any order. We will shortly see why the ordering is not important but if we try to focus on the exact order in which the messages will be sent, there are so many that we can easily lose ourselves. Instead let us focus on the tenets of this algorithm.
The safety of the algorithm is that there is a final invariant. Each process will have computations.
The progress of the algorithm is satisfied by some forwarding of messages from processes to their neighbors such that the computations spread to everyone. When one process forwards a message, it informs the others and it wants to be informed that the others received theirs. So each process forwards the messages onto its immediate neighbor but sends an acknowledgement only to its parent. So there seems to be two different messages, the computations and the acknowledgements. For the sake of efficiency, we can eliminate the acknowledgements and say that a process received the computations the first time and everything else is an acknowledgement now that it has received it's own. When the sender gets back the computation from the receiver, the same message acts as an acknowledgement. However there is a possibility to confuse the computations and acknowledgements due to race conditions. Assume that X and Y send computations to each other. X may not know if Y was sending the computation to it before X sent or whether Y was sending an acknowledgement after it received.
Therefore convincing ourselves that the algorithm works requires some rigor. And there are great applications to this technique such as barrier synchronization.
And this is how we present the proof of correctness. A process can only be in one of three different states - idle, active and, complete. A process starts out in the idle state. Once it hears the gossip and passes it to its children, it becomes active. Once it receives the acknowledgements and sends the acknowledgement to its parent, it becomes complete.
We divide the graph into two directed graphs- the first consists of the active or completed nodes and the second consists of the nodes that are active. Both include the initiator and the edges that connect. The second starts out as a proper subset of first. We notice that the directed graphs are trees where the first tree grows and the second tree grows and shrinks. We determine that the second tree shrinks only because a node moves into the first tree as completed.

Cluster computing

Tuesday, October 8, 2013

No comments:

Post a Comment