Sunday, October 18, 2020

Network engineering continued ...

 This is a continuation of the earlier posts starting with this one: http://ravinote.blogspot.com/2020/09/best-practice-from-networking.html  

      1. Clusters treat nodes and disks as commodities, making no distinction between capacity that is improved and nodes that are added. They tolerate nodes going down and view the disk array as Network Attached Storage. If they managed resources with storage classes, where groups of disks are treated differently based on power management and I/O scheduling, they could offer differentiated quality-of-service levels to workloads.

      1. While the controller nodes and data nodes in a cluster can coordinate, an individual disk or group of disks in a node does not have a dedicated disk worker to schedule its I/O, because storage has always progressed toward ever higher disk capacity. When disks become cheap to expand through numerous additions and earmarking, the dispatcher and execution worker model may be worth re-evaluating.

      1. The process per disk worker model was used by early DBMS implementations and is still in use today. I/O scheduling manages the time-sharing of the disk workers, and the operating system offers protection. This model has also been well suited to debuggers and memory checkers.

      1. The process pool per disk worker model removes the need to fork and tear down processes, and every process in the pool can execute any read or write from any client. The pool size is generally bounded, if not fixed. This model has all the advantages of the process per disk worker model above, with the added possibility of differentiated processes in the pool and quotas for them.

      1. When compute and storage are consolidated, they have to be treated as commodities, and scalability is achieved only through scale-out. The two are inherently different, however, so nodes dedicated to computation may be separated from nodes dedicated to storage. This lets both scale and load balance independently.


      1. #codingexercise: https://ideone.com/DBjnkH 
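To make the process pool per disk worker model above a little more concrete, here is a minimal sketch of a fixed-size pool in which any worker can execute any read or write from any client. It is an illustration under stated assumptions, not an implementation from any particular storage product: the names `IORequest` and `DiskPool` are hypothetical, the "disks" are in-memory dictionaries, and threads stand in for the pooled processes so that the example stays self-contained.

```python
from concurrent.futures import Future, ThreadPoolExecutor
from dataclasses import dataclass
import threading


@dataclass(frozen=True)
class IORequest:
    """A hypothetical read or write request from a client."""
    client: str
    disk: str          # assumed disk identifier, e.g. "disk0"
    op: str            # "read" or "write"
    block: int
    data: bytes = b""


class DiskPool:
    """Fixed-size worker pool; any worker may serve any client's request.

    Assumption: threads model the pooled workers and dictionaries model
    the disks. A real system would use OS processes and block devices.
    """

    def __init__(self, disks, pool_size):
        self._blocks = {d: {} for d in disks}
        # One lock per disk serializes access to that disk only.
        self._locks = {d: threading.Lock() for d in disks}
        # The pool is created once, so no worker is forked per request.
        self._executor = ThreadPoolExecutor(max_workers=pool_size)

    def _execute(self, req):
        with self._locks[req.disk]:
            store = self._blocks[req.disk]
            if req.op == "write":
                store[req.block] = req.data
                return len(req.data)      # bytes written
            if req.op == "read":
                return store.get(req.block, b"")
            raise ValueError(f"unknown operation: {req.op}")

    def submit(self, req) -> Future:
        """Dispatch a request to whichever pooled worker is free."""
        return self._executor.submit(self._execute, req)

    def shutdown(self):
        self._executor.shutdown(wait=True)


if __name__ == "__main__":
    pool = DiskPool(["disk0", "disk1"], pool_size=2)
    pool.submit(IORequest("client-a", "disk0", "write", 7, b"hello")).result()
    print(pool.submit(IORequest("client-b", "disk0", "read", 7)).result())
    pool.shutdown()
```

The pool size is fixed at construction, which mirrors the "finite if not fixed" pool in the text. Differentiated workers or quotas could be layered on by keeping separate pools per storage class, though that is a design choice this sketch leaves open.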
