Cluster computing

Sunday, June 6, 2021

A cluster cache implementation:

A cluster mode implementation comes with the primary benefit of scaling the Redis cluster horizontally, out or in, with little or no impact on performance. A single Redis server is usually constrained by the size of the memory on its host. Because Redis is an in-memory key-value store, the memory available on the host determines how large the store can be. If the memory is not adequate, the server is under-provisioned, and if it is too large, the server is over-provisioned. An application can always write to more than one server itself, but cluster mode allows it to write once and scale horizontally without iterating through distinct servers each time.

Cluster mode supports 0 to 5 replicas per primary server in each shard. Unlike a configuration with a single shard (one primary server and its replicas), it requires both data partitioning and replication. Horizontal scaling adds or removes shards and rebalances the nodes. A cluster can have up to 90 shards, so it can scale to enormous amounts of storage (potentially hundreds of terabytes). Another advantage is that the cluster can be deployed across multiple availability zones.

Cluster mode also allows for more flexibility when designing new workloads with unknown storage requirements or heavy write activity. In a read-heavy workload, we can scale a single shard by adding read replicas, up to five, but a write-heavy workload benefits from the additional write endpoints (one primary per shard) that become available when cluster mode is enabled.

Redis leverages a form of sharding in which every cache key is mapped to a “hash slot.” Within the cluster, there are 16,384 hash slots available. Those slots are divided among the shards in the cluster. By default, the slots are distributed equally across the shards, although the distribution can be customized if required.
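
As a rough illustration of the default equal distribution, the sketch below splits the 16,384 slots into contiguous ranges, one per shard. The function name split_slots and the choice to hand any leftover slots to the first shards are assumptions made for this example; an actual cluster may draw the range boundaries slightly differently.

```python
TOTAL_SLOTS = 16384  # number of hash slots in a Redis cluster


def split_slots(num_shards):
    """Return one inclusive (start, end) slot range per shard.

    Hypothetical helper: leftover slots (when 16384 is not evenly
    divisible) are given to the first shards, one each.
    """
    if num_shards < 1:
        raise ValueError("num_shards must be at least 1")
    base, extra = divmod(TOTAL_SLOTS, num_shards)
    ranges = []
    start = 0
    for shard in range(num_shards):
        size = base + (1 if shard < extra else 0)
        ranges.append((start, start + size - 1))
        start += size
    return ranges


if __name__ == "__main__":
    for shard, (start, end) in enumerate(split_slots(3)):
        print(f"shard {shard}: slots {start}-{end}")
```

With three shards this yields the ranges 0-5461, 5462-10922 and 10923-16383, so each shard owns roughly a third of the keyspace.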

When reading from or writing to the cluster, the client calculates which hash slot to use via a simple algorithm: CRC16(key) mod 16384. Because the slot depends only on the key, the client can itself determine which shard to use. The cluster avoids becoming a single point of failure, and the client is allowed to reach any shard in the cluster. When it makes its initial connection to the Redis cluster, the client can resolve and cache the slot-to-node mapping and use it to identify the node on which a particular key can be found.
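
A minimal sketch of that client-side lookup follows. It assumes the CRC16 variant is the XMODEM one (polynomial 0x1021, initial value 0), which Python's binascii.crc_hqx computes when seeded with 0. The SlotMap class, the shard names and the slot ranges are hypothetical, and the sketch ignores hash tags, which real cluster clients also handle.

```python
import binascii
import bisect

TOTAL_SLOTS = 16384


def key_slot(key):
    """CRC16(key) mod 16384, using the XMODEM CRC16 (assumed variant)."""
    return binascii.crc_hqx(key.encode("utf-8"), 0) % TOTAL_SLOTS


class SlotMap:
    """Hypothetical client-side cache of slot range -> node name."""

    def __init__(self, ranges):
        # ranges: iterable of (start_slot, end_slot, node_name), inclusive
        self._ranges = sorted(ranges)
        self._starts = [r[0] for r in self._ranges]

    def node_for(self, key):
        slot = key_slot(key)
        # Find the last range whose start is <= slot.
        idx = bisect.bisect_right(self._starts, slot) - 1
        if idx >= 0:
            start, end, node = self._ranges[idx]
            if start <= slot <= end:
                return node
        raise LookupError(f"no node owns slot {slot}")


if __name__ == "__main__":
    # Example layout for three shards; names and ranges are made up.
    slot_map = SlotMap([
        (0, 5460, "shard-a"),
        (5461, 10922, "shard-b"),
        (10923, 16383, "shard-c"),
    ])
    for key in ("foo", "user:1000", "session:abc"):
        print(key, key_slot(key), slot_map.node_for(key))
```

Because the mapping is computed locally, the client can send each command straight to the owning node instead of asking an intermediary where the key lives.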

The clusters can be spread across multiple availability zones (multi-AZ), which improves their tolerance of zone-level faults and updates. When the cluster is set up, it can be configured for automatic failover with multi-AZ redundancy. When the cluster knows which zones it may deploy to, it allocates its nodes across them. The placement can be round-robin or user-specified.
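
The round-robin option can be pictured with the sketch below. It is only an illustration of the idea, not the placement algorithm of any particular service: the function place_round_robin, the role labels and the zone names are all assumptions. Each shard starts at a different zone offset so that primaries are spread out, and the nodes of one shard land in different zones whenever there are at least as many zones as nodes per shard.

```python
def place_round_robin(num_shards, replicas_per_shard, zones):
    """Assign every node of every shard to a zone in round-robin order.

    Hypothetical helper. Returns {(shard_index, role): zone}.
    """
    if not zones:
        raise ValueError("at least one zone is required")
    placement = {}
    for shard in range(num_shards):
        for i in range(replicas_per_shard + 1):
            role = "primary" if i == 0 else f"replica-{i}"
            # Offset by the shard index so primaries rotate across zones.
            placement[(shard, role)] = zones[(shard + i) % len(zones)]
    return placement


if __name__ == "__main__":
    zones = ["zone-a", "zone-b", "zone-c"]  # made-up zone names
    for (shard, role), zone in place_round_robin(2, 2, zones).items():
        print(f"shard {shard} {role}: {zone}")
```

With three zones and two replicas per shard, losing a single zone removes at most one node from each shard, which is what lets automatic failover promote a replica in a surviving zone.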
