Cluster computing

Service Fabric (continued)

Part 2 compared Paxos and Raft. This article continues the discussion with a comparison to Service Fabric.

Kubernetes requires distributed consensus, but Service Fabric relies on its ring. Distributed consensus algorithm like Paxos and Raft must perform leader election, Service Fabric doesn’t. Service Fabric avoids this with a decentralized ring and failure detection. It is motivated by the adage that distributed consensus is at the heart of numerous co-ordination problems, but it has a trivially simple solution if there is a reliable failure detection service. Service Fabric does not use a distributed consensus protocol like Raft or Paxos or a centralized store for cluster state. Instead, it proposes a Federation subsystem which answers the most important question on membership which is whether a specific node is part of the system. The nodes are organized in rings, and the heartbeats are only sent to a small subset of nodes called the neighborhood. The arbitration procedure involves more nodes besides the monitor but it is only executed on missed heartbeats. A quorum of privileged nodes in an arbitration group helps resolve the possible failure detection conflicts, isolate appropriate nodes and maintain a consistent view of the ring.

Nodes in a Service Fabric are organized in a virtual ring with 2^m points where m = 128 bits. Nodes and keys are mapped onto a point in the ring. A key is owned by the node closest to it, with ties won by the predecessor. Each node keeps track of multiple (a given number of) its immediate successor nodes and predecessor nodes in the ring which is called the neighborhood set. Neighbors are used to run SF’s membership and failure detection protocol.

Membership and failure detection in Service Fabric relies on two key design principles: 1. A notion of strongly consistent membership and 2. Decoupling failure detection from failure decision. All nodes responsible for monitoring a node X must agree on whether X is up or down. When used in the SF-ring, all predecessors and successors in the neighborhood of a node X agree on its X’s status. This forms a consistent neighborhood. Failure detection protocols can lead to conflicting decisions. For this reason, the decision of which nodes are failed is decoupled from the failure detection itself.

Monitoring and leasing make this periodic. Heartbeating is fully decentralized where each node is monitored by a subset of other nodes which are called its monitors. Node X periodically sends a lease renewal request which is a heartbeat message with a unique sequence number, to each of its monitors. When a monitor acknowledges, node X is said to carry a timer value called T0 so that X can wait for T0 time and then take actions with respect to Y. If a node X suspects a neighbor Y, it sends a fail(Y) message to the arbitrator but waits for T0 time after receiving the accept(.) message before reclaiming the portion of the Y’s ring. Any routing requests received for Y will be queued but processed only after the range has been inherited by Y’s neighbors.

The SF-Ring is a distributed hash table. It provides a seamless way of scaling from small groups to large groups. SF-Ring was developed at around the same time as P2P DHTs like Pastry, Chord, and others. SF-Ring is unique because 1) Routing table entries are bidirectional and symmetrical, 2) the routing is bidirectional, 3) Routing tables are eventually convergent, 4) there is decoupled mapping of Nodes and keys and 5) There are consistent routing tokens.

1. SF-Ring maintains routing partners in routing tables at exponentially increasing distances in the ring. Routing partners are maintained both clockwise and anticlockwise. Most routing partners are symmetric due to bidirectionality

2. A bidirectional routing table enables a node looking to forward a message for a key to find another node whose ID is closest to the key so that it may forward the message. It is a distributed form of binary search and is greedy in nature.

3. SF nodes use a chatter protocol to continuously exchange routing table information. The symmetric nature of the routing enables failure information to propagate quickly and this leads to eventual converging of routing table entries that are affected.

4. Nodes and services are mapped onto the ring in a way that is decoupled from the ring. Nearby nodes on the rings are selected from different fault domains and services are mapped in a near-optimal way and load balanced way.

5. Each SF node owns a portion of the ring as encoded in its token. There is no overlap among tokens owned by nodes and every token range is eventually owned by at least one node. When a node leaves, its successor and predecessor split the range between them halfway. With these criteria, SF routing will eventually succeed.

Cluster computing

Tuesday, March 22, 2022

No comments:

Post a Comment