Tuesday, February 5, 2019

Today we continue discussing best practices from storage engineering:

416) Key-value pairs are organized according to the key. Keys in turn are assigned to a partition. Once a key is assigned to a partition, it cannot be moved to a different partition. The number of partitions in a store cannot be reconfigured later, so it is decided upfront.
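As a rough illustration of a fixed key-to-partition assignment, here is a minimal sketch. It assumes the partition is derived by hashing the key and taking the result modulo the partition count. The note above does not say which function a store uses, so the MD5-based hash, the name partition_for, and the value 30 are all hypothetical.

```python
import hashlib

# Hypothetical value: fixed when the store is created, never changed afterwards.
NUM_PARTITIONS = 30


def partition_for(key, num_partitions=NUM_PARTITIONS):
    """Map a key to a partition id in the range [0, num_partitions).

    Uses a stable hash (MD5 here, purely as an assumption) so that the same
    key maps to the same partition on every call and on every machine.
    """
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


if __name__ == "__main__":
    for key in ("user/42", "user/43", "order/2019-02-05"):
        print(key, "->", partition_for(key))
```

Because the result depends on the partition count, changing that count would remap existing keys, which is one way to see why the count has to be fixed upfront.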

417) The number of storage nodes in use by the store can, however, be changed. When this happens, the store undergoes reconfiguration: partitions are redistributed from one shard to another so that they are balanced across the old and new shards.
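The redistribution step can be sketched as follows. This is only an illustration under stated assumptions, not how any particular product does it: the function name rebalance and the dict-based mapping of partition id to shard id are hypothetical, and the goal assumed here is to even out the partition counts while moving as few partitions as possible.

```python
def rebalance(assignment, shards):
    """Return a new partition -> shard mapping balanced across `shards`.

    assignment: dict mapping partition id -> current shard id.
    shards:     list of shard ids that exist after reconfiguration.
    Partitions move only when their shard is over its target size or
    no longer exists; keys never change partition.
    """
    owned = {s: [] for s in shards}
    pool = []  # partitions that must find a new shard
    for partition, shard in sorted(assignment.items()):
        if shard in owned:
            owned[shard].append(partition)
        else:
            pool.append(partition)  # the old shard was removed

    # Target sizes differ by at most one; the currently fullest shards
    # get the larger targets so that fewer partitions have to move.
    base, extra = divmod(len(assignment), len(shards))
    by_load = sorted(shards, key=lambda s: len(owned[s]), reverse=True)
    target = {s: base + (1 if i < extra else 0) for i, s in enumerate(by_load)}

    for s in shards:  # trim shards that hold too many partitions
        while len(owned[s]) > target[s]:
            pool.append(owned[s].pop())
    for s in shards:  # fill shards that hold too few
        while len(owned[s]) < target[s]:
            owned[s].append(pool.pop())

    return {p: s for s, parts in owned.items() for p in parts}


if __name__ == "__main__":
    old = {p: p % 2 for p in range(20)}   # 20 partitions on shards 0 and 1
    new = rebalance(old, [0, 1, 2])       # shard 2 is added
    moved = [p for p in old if old[p] != new[p]]
    print("moved partitions:", sorted(moved))
```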

418) The greater the number of partitions, the finer the granularity of reconfiguration. It is typical to have ten to twenty partitions per shard. Since the number of partitions cannot be changed afterwards, it is decided at design time.
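A little arithmetic shows why more partitions give finer granularity. The sketch below is an illustration only: the helper names partitions_per_shard and imbalance are hypothetical, and it assumes partitions are spread as evenly as whole numbers allow.

```python
def partitions_per_shard(num_partitions, num_shards):
    """Spread partitions as evenly as possible; counts differ by at most one."""
    base, extra = divmod(num_partitions, num_shards)
    return [base + 1 if i < extra else base for i in range(num_shards)]


def imbalance(num_partitions, num_shards):
    """Ratio of the fullest shard to the ideal (fractional) share; 1.0 is perfect."""
    counts = partitions_per_shard(num_partitions, num_shards)
    return max(counts) / (num_partitions / num_shards)


if __name__ == "__main__":
    # A store designed with very few partitions versus one with about
    # fifteen partitions per shard, both growing from 3 shards to 4.
    for total in (4, 45):
        for shards in (3, 4):
            print(total, "partitions,", shards, "shards:",
                  partitions_per_shard(total, shards),
                  "imbalance %.2f" % imbalance(total, shards))
```

With only a handful of partitions, one shard can end up carrying far more than its share after a reconfiguration. With ten to twenty per shard the counts stay close to even.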

419) The number of nodes belonging to a shard is called its replication factor, and it affects throughput. The higher the replication factor, the faster the reads, because more replicas are available to serve them. The same is not true for writes, since more copying is involved. Once the replication factor is set, the storage product takes care of creating the appropriate number of replication nodes for each shard.

420) A topology is a collection of storage nodes, replication nodes, and the associated services. At any point in time, a deployed store has exactly one topology. The initial topology is chosen to minimize the possibility of a single point of failure for any given shard. If a storage node hosts more than one replication node, those replication nodes will not be from the same shard. That way, if the host machine goes down, the shard can continue to serve reads and writes.
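The placement rule above can be sketched with a simple greedy layout. This is a minimal illustration under assumptions, not the algorithm of any specific product: the function build_topology, the capacity-per-storage-node input, and the "most free capacity first" heuristic are all hypothetical.

```python
def build_topology(storage_nodes, num_shards, replication_factor):
    """Place replication nodes so no storage node hosts two from one shard.

    storage_nodes: dict mapping storage node name -> capacity
                   (how many replication nodes it can host).
    Returns a dict mapping storage node name -> list of (shard, replica) pairs.
    Raises ValueError when this greedy layout cannot place every replica.
    """
    placement = {name: [] for name in storage_nodes}
    for shard in range(num_shards):
        # Storage nodes with free capacity, those with the most free slots first.
        candidates = sorted(
            (n for n in storage_nodes if len(placement[n]) < storage_nodes[n]),
            key=lambda n: len(placement[n]) - storage_nodes[n],
        )
        if len(candidates) < replication_factor:
            raise ValueError("not enough storage nodes for shard %d" % shard)
        # Each replica of this shard goes to a different storage node.
        for replica, node in enumerate(candidates[:replication_factor]):
            placement[node].append((shard, replica))
    return placement


if __name__ == "__main__":
    topology = build_topology({"sn1": 2, "sn2": 2, "sn3": 2},
                              num_shards=2, replication_factor=3)
    for node, hosted in topology.items():
        print(node, hosted)
```

Here each storage node hosts two replication nodes, one from each shard, so losing any single machine leaves every shard with two of its three replication nodes.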
