Tuesday, December 25, 2018

Today we continue discussing best practices from storage engineering:


225) A shared-nothing system must mitigate partial failures, that is, the condition in which one or more of the participating nodes go down. The mitigation may be one of the following: 1) bring down all of the nodes when any one fails, which makes the system behave like a shared-memory system, 2) use “data skipping”, where queries may run on any node that is up and the data on the failed node is skipped, and 3) use as much redundancy as necessary to give queries access to all the data regardless of any unavailability.
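
To make the difference between options 2 and 3 concrete, here is a minimal sketch. The node names, the partition layout, and the `query_with_skipping` function are all hypothetical and only illustrate the idea; they are not taken from any particular product.

```python
from typing import Dict, List, Set, Tuple

def query_with_skipping(
    placement: Dict[str, List[int]],  # node name -> partitions it holds (assumed layout)
    up: Set[str],                     # nodes that are currently reachable
    data: Dict[int, List[int]],       # partition id -> values
) -> Tuple[int, List[int]]:
    """Sum every partition reachable on a live node and report the skipped ones."""
    reachable = set()
    for node, partitions in placement.items():
        if node in up:
            reachable.update(partitions)
    skipped = sorted(set(data) - reachable)
    total = sum(sum(data[p]) for p in reachable)
    return total, skipped

data = {0: [1, 2], 1: [3, 4], 2: [5, 6]}
up = {"a", "c"}  # node "b" has failed

# Option 2: no replicas, so the partition on the failed node is skipped.
no_replicas = {"a": [0], "b": [1], "c": [2]}
partial = query_with_skipping(no_replicas, up, data)

# Option 3: every partition lives on two nodes, so one failure hides nothing.
with_replicas = {"a": [0, 1], "b": [1, 2], "c": [2, 0]}
full = query_with_skipping(with_replicas, up, data)
```

With data skipping the caller gets an answer plus the list of partitions that were left out, so it can decide whether a partial result is acceptable. With redundancy the same query sees all the data at the price of storing each partition twice.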

226) Search algorithms over data in storage tend to be top-down. Top-down search implies lower costs because it can prune the query plan to just what is relevant. However, top-down search can exhaust memory. It often helps if the user can supply additional hints and the storage system is capable of using them.

227) A single query is usually run synchronously and serially. However, if the change is invisible to the user and parallelizing the work across workers improves performance, then it is better to do so. The caveat is that parallel execution takes two stages: first the work estimation and then the workload distribution.
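
The two stages can be sketched as follows. This is only an illustration under stated assumptions: the cost estimate is simply the chunk length, the distribution is a greedy "largest chunk to the least loaded worker" heuristic, and the names `estimate`, `distribute` and `run_parallel` are made up for the example.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor
from typing import List

def estimate(chunk: List[int]) -> int:
    """Stage 1: work estimation. Assumed cost model: one unit per row."""
    return len(chunk)

def distribute(chunks: List[List[int]], workers: int) -> List[List[List[int]]]:
    """Stage 2: workload distribution. Give the largest remaining chunk
    to whichever worker currently has the smallest estimated load."""
    heap = [(0, w) for w in range(workers)]  # (estimated load, worker id)
    assignment: List[List[List[int]]] = [[] for _ in range(workers)]
    for chunk in sorted(chunks, key=estimate, reverse=True):
        load, w = heapq.heappop(heap)
        assignment[w].append(chunk)
        heapq.heappush(heap, (load + estimate(chunk), w))
    return assignment

def run_parallel(chunks: List[List[int]], workers: int = 2) -> int:
    """Run a sum query over all chunks, one batch of chunks per worker."""
    batches = distribute(chunks, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = pool.map(lambda batch: sum(sum(c) for c in batch), batches)
        return sum(partials)

chunks = [[1] * 8, [2] * 3, [3] * 3, [4] * 2]
loads = [sum(estimate(c) for c in batch) for batch in distribute(chunks, 2)]
total = run_parallel(chunks, workers=2)
```

The user only sees `total`, which is the same value a serial scan would return. The estimation step matters because a poor estimate leaves one worker with most of the work and the parallel run is then no faster than the serial one.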

228) Any storage system can be made to perform better with the help of “auto-tuning”. In this method, the same workload is studied under different “what-if” plans, and the plan with the most beneficial outcome is chosen.
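
A toy version of the what-if loop might look like the sketch below. The cost constants, the workload shape, and the `what_if_cost` and `tune` functions are assumptions made for illustration; a real tuner would ask the storage system's own cost model instead.

```python
from itertools import combinations
from typing import FrozenSet, List, Optional, Tuple

# Assumed costs: an unindexed read scans, an indexed read seeks,
# and every index adds maintenance work to each write.
SCAN_COST, SEEK_COST, WRITE_COST, INDEX_WRITE_COST = 100, 1, 1, 5

Operation = Tuple[str, Optional[str]]  # ("read", column) or ("write", None)

def what_if_cost(workload: List[Operation], indexes: FrozenSet[str]) -> int:
    """Estimate the cost of the workload if exactly these indexes existed."""
    cost = 0
    for op, column in workload:
        if op == "read":
            cost += SEEK_COST if column in indexes else SCAN_COST
        else:
            cost += WRITE_COST + INDEX_WRITE_COST * len(indexes)
    return cost

def tune(workload: List[Operation], columns: List[str], max_indexes: int = 2) -> FrozenSet[str]:
    """Evaluate every candidate index set and keep the cheapest one."""
    candidates = [
        frozenset(combo)
        for k in range(max_indexes + 1)
        for combo in combinations(columns, k)
    ]
    return min(candidates, key=lambda idx: what_if_cost(workload, idx))

workload = (
    [("read", "user_id")] * 10
    + [("read", "created_at")] * 1
    + [("write", None)] * 50
)
best = tune(workload, ["user_id", "created_at"])
best_cost = what_if_cost(workload, best)
```

Here the tuner picks an index on `user_id` only: the single read on `created_at` does not repay the extra cost that a second index would add to each of the 50 writes. Nothing is actually built while the candidates are compared, which is the point of the “what-if” evaluation.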

229) The plans of storage queries that are repeated often are worth caching, because chances are the data has not changed enough to alter the plan that best suits the execution of the query. While the technique has been very popular with relational databases, it holds true for many forms of queries and storage products.
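
One way to picture this kind of plan caching is the sketch below. The `PlanCache` class, the literal-stripping rule, and the `stats_version` argument are hypothetical; the version number stands in for whatever signal a product uses to decide that the data has changed enough to plan again.

```python
import re
from typing import Callable, Dict, Tuple

class PlanCache:
    """Cache plans by query shape, assuming a planner callable is supplied."""

    def __init__(self, planner: Callable[[str], str]) -> None:
        self._planner = planner
        self._plans: Dict[Tuple[str, int], str] = {}
        self.hits = 0
        self.misses = 0

    @staticmethod
    def normalize(query: str) -> str:
        """Replace literals with '?' so repeated queries share one cache key."""
        query = re.sub(r"'[^']*'", "?", query)   # quoted string literals
        query = re.sub(r"\b\d+\b", "?", query)   # bare integer literals
        return " ".join(query.lower().split())

    def get_plan(self, query: str, stats_version: int) -> str:
        # A new stats_version yields a new key, which forces a fresh plan.
        key = (self.normalize(query), stats_version)
        if key in self._plans:
            self.hits += 1
        else:
            self.misses += 1
            self._plans[key] = self._planner(key[0])
        return self._plans[key]

cache = PlanCache(planner=lambda shape: f"plan for: {shape}")
first = cache.get_plan("SELECT * FROM t WHERE id = 1", stats_version=1)
second = cache.get_plan("select * from t where id = 42", stats_version=1)
third = cache.get_plan("SELECT * FROM t WHERE id = 1", stats_version=2)
```

The second call reuses the plan built by the first because the two queries differ only in a literal. The third call plans again because the statistics version moved. This sketch never evicts old entries, which a real cache would have to do.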

230) The results of query execution and materialized views are equally worth caching and persisting separately from the actual data. This reduces the load on the product and makes the results available sooner to the queries.
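
As a rough illustration, the sketch below keeps a computed result apart from the rows it was derived from and recomputes it only when the rows change. The `Table` and `MaterializedView` classes and the version counter are assumptions for the example, and the result is held in memory here rather than persisted.

```python
from typing import Callable, List, Optional

class Table:
    """Stand-in for the actual data; each insert bumps a version counter."""

    def __init__(self) -> None:
        self.rows: List[int] = []
        self.version = 0

    def insert(self, row: int) -> None:
        self.rows.append(row)
        self.version += 1

class MaterializedView:
    """Stores a query result separately and refreshes it only when stale."""

    def __init__(self, table: Table, compute: Callable[[List[int]], int]) -> None:
        self._table = table
        self._compute = compute
        self._result: Optional[int] = None
        self._built_at = -1  # table version the stored result was built from
        self.refreshes = 0

    def read(self) -> int:
        if self._built_at != self._table.version:
            self._result = self._compute(self._table.rows)
            self._built_at = self._table.version
            self.refreshes += 1
        return self._result

table = Table()
table.insert(3)
table.insert(4)
view = MaterializedView(table, compute=sum)

a = view.read()   # computed from the rows
b = view.read()   # served from the stored result
table.insert(5)
c = view.read()   # recomputed because the table changed
```

Repeated reads between writes never touch the underlying rows, which is where the reduction in load comes from.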

