Saturday, November 17, 2018

Today we continue discussing best practices from storage engineering:

61) Diagnostic queries: As each layer and component of the storage server creates and maintains its own data structures during execution, it helps to query these data structures at runtime to diagnose and troubleshoot erroneous behavior. Some of these queries are straightforward when the data structures already support some form of aggregation; others may be quite involved and include a number of steps. In all cases, the queries run against a live system and should be restricted, as far as possible, to read-only operations.
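To make the read-only idea concrete, here is a minimal sketch in Python. The ChunkTable component, its fields, and the diagnose_large_chunks query are hypothetical names chosen for illustration and are not taken from any particular storage server. The query works on a read-only snapshot, so it cannot modify the live structure.

```python
import threading
from types import MappingProxyType


class ChunkTable:
    """Hypothetical component that tracks chunk sizes by chunk id."""

    def __init__(self):
        self._lock = threading.Lock()
        self._chunks = {}

    def put(self, chunk_id, size_bytes):
        with self._lock:
            self._chunks[chunk_id] = size_bytes

    def snapshot(self):
        # Copy under the lock, then hand out a read-only view of the copy.
        # The diagnostic caller can neither block writers for long nor
        # mutate the live table.
        with self._lock:
            return MappingProxyType(dict(self._chunks))


def diagnose_large_chunks(table, threshold_bytes):
    """Diagnostic query: ids of chunks larger than the threshold, sorted."""
    snap = table.snapshot()
    return sorted(cid for cid, size in snap.items() if size > threshold_bytes)
```

Copying the whole table is only reasonable for small structures; a larger one would likely need an iterator or a versioned view instead.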

62) Performance counter: Subsystems and components frequently take a long time to execute. Diagnostic queries alone cannot reveal which scope takes the most time. On the other hand, the code is perfectly clear about call sequences, so the candidate code blocks are easy to identify in the source. Performance counters measure the elapsed time for the execution of these code blocks.
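A performance counter of this kind can be sketched as a context manager wrapped around the code block of interest. The PerfCounters class and the counter name used below are assumptions made for illustration; only time.perf_counter and contextlib come from the Python standard library.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class PerfCounters:
    """Accumulates elapsed time and call counts per named code block."""

    def __init__(self):
        self.total_seconds = defaultdict(float)
        self.calls = defaultdict(int)

    @contextmanager
    def measure(self, name):
        start = time.perf_counter()
        try:
            yield
        finally:
            # Record the elapsed time even if the block raised an exception.
            self.total_seconds[name] += time.perf_counter() - start
            self.calls[name] += 1


counters = PerfCounters()

with counters.measure("flush_index"):  # hypothetical code block name
    time.sleep(0.01)                   # stands in for the real work
```

After the block runs, counters.total_seconds["flush_index"] holds the accumulated elapsed time and counters.calls["flush_index"] the number of executions.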

63) Statistics counter: In addition to the above-mentioned diagnostic tools, we need to perform aggregation over executions of certain code blocks. While performance counters measure elapsed time, statistics counters help with aggregations such as count, max, sum, and so on.
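As a rough sketch, a statistics counter only needs to keep running aggregates rather than every observed value. The StatCounter class below is a hypothetical illustration; it uses a lock so that several threads can record values safely.

```python
import threading


class StatCounter:
    """Keeps running count, sum, min, and max of recorded values."""

    def __init__(self):
        self._lock = threading.Lock()
        self.count = 0
        self.total = 0
        self.minimum = None
        self.maximum = None

    def record(self, value):
        with self._lock:
            self.count += 1
            self.total += value
            if self.minimum is None or value < self.minimum:
                self.minimum = value
            if self.maximum is None or value > self.maximum:
                self.maximum = value

    @property
    def mean(self):
        # None until at least one value has been recorded.
        with self._lock:
            return self.total / self.count if self.count else None


bytes_written = StatCounter()  # hypothetical counter for one code block
for size in (3, 7, 5):
    bytes_written.record(size)
```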

64) Locks: These primitives are often used for thread synchronization. If their use cannot be avoided, it is best to take as few of them as possible throughout the code. Partitioning and coordination remove the need for locks in many cases. The storage server relies on the latter approach together with versioning.
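The partitioning idea can be illustrated with a small sketch: keys are routed to partitions by a stable hash, and each partition is owned by exactly one thread, so the per-partition state needs no lock. The function names and the choice of CRC32 as the hash are assumptions for illustration, not a description of how any particular storage server does it.

```python
import threading
import zlib
from collections import Counter


def partition_of(key, num_partitions):
    # Stable hash, so the same key always maps to the same partition.
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def count_keys_partitioned(keys, num_partitions=4):
    # Route every key to its partition before any worker starts.
    buckets = [[] for _ in range(num_partitions)]
    for key in keys:
        buckets[partition_of(key, num_partitions)].append(key)

    results = [Counter() for _ in range(num_partitions)]

    def worker(index):
        # Only this thread touches results[index], so no lock is taken.
        for key in buckets[index]:
            results[index][key] += 1

    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(num_partitions)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    # Merge the partial results after all owners have finished.
    merged = Counter()
    for partial in results:
        merged.update(partial)
    return merged
```

The only coordination point is the final join and merge; the workers never contend for shared state while they run.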

65) Parallelization: Generally, no limit is enforced on the number of parallel workers in the storage server or on the number of partitions that each worker operates on. However, the scheduler that interleaves workers works best when there is one active task to perform in any timeslice. Therefore, the ideal number of tasks is one more than the number of processors. A queue holds the tasks until they are executed. This judicious use of task distribution improves performance in every layer.
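A minimal sketch of this arrangement follows: a queue holds the pending tasks, and the number of workers defaults to one more than the processor count. The run_tasks function is a hypothetical helper written for illustration. Note that in CPython, threads like these mainly help with I/O-bound work.

```python
import os
import queue
import threading


def run_tasks(tasks, num_workers=None):
    """Run zero-argument callables on a fixed set of workers fed by a queue."""
    if num_workers is None:
        # One more than the number of processors, as suggested above.
        num_workers = (os.cpu_count() or 1) + 1

    # The queue holds the tasks until a worker is free to execute them.
    pending = queue.Queue()
    for index, task in enumerate(tasks):
        pending.put((index, task))

    results = [None] * len(tasks)

    def worker():
        while True:
            try:
                index, task = pending.get_nowait()
            except queue.Empty:
                return  # No tasks left; this worker is done.
            # Each index is written by exactly one worker.
            results[index] = task()

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

For example, run_tasks([lambda i=i: i * i for i in range(10)]) returns the squares in their original order, regardless of which worker ran each task.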
