Service Fabric (continued)
Part 2 compared Paxos and
Raft. This article continues the discussion with a comparison to Service
Fabric.
Kubernetes requires distributed consensus, but
Service Fabric relies on its ring. Distributed consensus algorithm like Paxos
and Raft must perform leader election, Service Fabric doesn’t. Service Fabric
avoids this with a decentralized ring and failure detection. It is motivated by
the adage that distributed consensus is at the heart of numerous co-ordination
problems, but it has a trivially simple solution if there is a reliable failure
detection service. Service Fabric does not use a distributed consensus protocol
like Raft or Paxos or a centralized
store for cluster state. Instead, it proposes a Federation subsystem which
answers the most important question on membership which is whether a specific
node is part of the system. The nodes are organized in rings, and the
heartbeats are only sent to a small subset of nodes called the neighborhood. The
arbitration procedure involves more nodes besides the monitor but it is only
executed on missed heartbeats. A quorum of privileged nodes in an arbitration
group helps resolve the possible failure detection conflicts, isolate
appropriate nodes and maintain a consistent view of the ring.
Nodes in a Service Fabric are organized in a
virtual ring with 2^m points where m = 128 bits. Nodes and keys are mapped onto
a point in the ring. A key is owned by the node closest to it, with ties won by
the predecessor. Each node keeps track of multiple (a given number of) its
immediate successor nodes and predecessor nodes in the ring which is called the
neighborhood set. Neighbors are used to run SF’s membership and failure
detection protocol.
Membership and failure detection in Service
Fabric relies on two key design principles: 1. A notion of strongly consistent
membership and 2. Decoupling failure detection from failure decision. All nodes
responsible for monitoring a node X must agree on whether X is up or down. When
used in the SF-ring, all predecessors and successors in the neighborhood of a
node X agree on its X’s status. This forms a consistent neighborhood. Failure
detection protocols can lead to conflicting decisions. For this reason, the
decision of which nodes are failed is decoupled from the failure detection
itself.
Monitoring and leasing make this periodic.
Heartbeating is fully decentralized where each node is monitored by a subset of
other nodes which are called its monitors. Node X periodically sends a lease
renewal request which is a heartbeat message with a unique sequence number, to
each of its monitors. When a monitor acknowledges, node X is said to carry a
timer value called T0 so that X can wait for T0 time and then take actions with
respect to Y. If a node X suspects a neighbor Y, it sends a fail(Y) message to
the arbitrator but waits for T0 time after receiving the accept(.) message
before reclaiming the portion of the Y’s ring. Any routing requests received
for Y will be queued but processed only after the range has been inherited by
Y’s neighbors.
The SF-Ring is a distributed hash table. It
provides a seamless way of scaling from small groups to large groups. SF-Ring
was developed at around the same time as P2P DHTs like Pastry, Chord, and
others. SF-Ring is unique because 1) Routing table entries are bidirectional
and symmetrical, 2) the routing is bidirectional, 3) Routing tables are
eventually convergent, 4) there is decoupled mapping of Nodes and keys and 5)
There are consistent routing tokens.
1.
SF-Ring maintains routing partners in routing tables at exponentially
increasing distances in the ring. Routing partners are maintained both
clockwise and anticlockwise. Most routing partners are symmetric due to
bidirectionality
2.
A bidirectional routing table enables a node looking to forward a
message for a key to find another node whose ID is closest to the key so that
it may forward the message. It is a distributed form of binary search and is
greedy in nature.
3.
SF nodes use a chatter protocol to continuously exchange routing table
information. The symmetric nature of the routing enables failure information to
propagate quickly and this leads to eventual converging of routing table
entries that are affected.
4.
Nodes and services are mapped onto the ring in a way that is decoupled
from the ring. Nearby nodes on the rings are selected from different fault
domains and services are mapped in a near-optimal way and load balanced way.
5.
Each SF node owns a portion of the ring as encoded in its token. There
is no overlap among tokens owned by nodes and every token range is eventually
owned by at least one node. When a node leaves, its successor and predecessor
split the range between them halfway. With these criteria, SF routing will
eventually succeed.
No comments:
Post a Comment