Cluster computing

Tuesday, November 20, 2018

Today we continue discussing best practices from storage engineering:

75) Cachepoints – Cachepoints are used with consistent hashing. They are arranged along the circle that depicts the key range, and each one caches the objects that correspond to its part of the range. Virtual nodes can join and leave the network without affecting the operation of the ring.
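To make the ring concrete, here is a minimal sketch of consistent hashing with virtual nodes. The class name HashRing, the helper _position, the SHA-256 based placement and the default of 64 virtual points per node are all assumptions chosen for illustration, not details of any particular product.

```python
import bisect
import hashlib


def _position(key: str) -> int:
    """Map a string to a point on the ring (a 64-bit integer)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class HashRing:
    """Hypothetical ring: each node owns several virtual points (cachepoints)."""

    def __init__(self, vnodes_per_node: int = 64):
        self.vnodes_per_node = vnodes_per_node
        self._points = []  # sorted ring positions
        self._owner = {}   # ring position -> node name

    def add_node(self, node: str) -> None:
        for i in range(self.vnodes_per_node):
            point = _position(f"{node}#{i}")
            if point in self._owner:
                continue  # position already taken; skip the rare collision
            bisect.insort(self._points, point)
            self._owner[point] = node

    def remove_node(self, node: str) -> None:
        # Drop only this node's points; all other points stay where they are.
        self._points = [p for p in self._points if self._owner[p] != node]
        self._owner = {p: n for p, n in self._owner.items() if n != node}

    def lookup(self, key: str) -> str:
        if not self._points:
            raise LookupError("ring is empty")
        # The first point clockwise from the key's position owns the key.
        idx = bisect.bisect_right(self._points, _position(key)) % len(self._points)
        return self._owner[self._points[idx]]
```

When a node leaves, only the keys that it owned move to a neighbor on the ring; every other key keeps its owner, which is why membership changes do not disturb the rest of the ring.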

76) Stream, batch and sequential processing – Storage products often distinguish themselves as serving stream processing, batch processing or sequential processing. Yet the factors that determine this choice apply equally to the components within the product, as long as they are not restricted by the overall design. There are ways to convert one form of processing into another, which can drive down the cost. For example, event processing has largely been stream-based.
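One possible conversion, offered here as an assumption rather than something the text above prescribes, is micro-batching: a stream of events is grouped into small fixed-size batches so that a batch-oriented component can consume it. The function name micro_batches and the batch_size parameter are hypothetical.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def micro_batches(events: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    """Group a (possibly unbounded) stream of events into lists of batch_size."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(events)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return  # the stream is exhausted
        yield batch  # the last batch may be shorter than batch_size


if __name__ == "__main__":
    for batch in micro_batches(range(7), 3):
        print(batch)
```

The trade-off in this sketch is latency against per-item overhead: a larger batch amortizes fixed costs over more events but delays the first event of each batch.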

77) Joins – Relational databases have made remarkable use of joins over tuples of data, with storage and query improvements to handle these cases. Components within products that are used for unstructured data often encounter some form of matching between collections as well. The straightforward way to implement this has been to iterate over one or more collections and filter them with conditions evaluated against those collections. However, it helps to look up associations directly whenever possible, since that can improve performance. A judicious choice of such techniques is welcome wherever it applies.
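The difference between filtering with iterators and looking up associations can be sketched as a nested-loop join next to a hash join. The function names and the dictionary-based record shape are assumptions made for this illustration.

```python
from collections import defaultdict
from typing import Dict, Hashable, List, Tuple

Record = Dict[str, Hashable]


def nested_loop_join(left: List[Record], right: List[Record], key: str) -> List[Tuple[Record, Record]]:
    """Compare every pair of records: O(len(left) * len(right))."""
    return [(l, r) for l in left for r in right if l[key] == r[key]]


def hash_join(left: List[Record], right: List[Record], key: str) -> List[Tuple[Record, Record]]:
    """Index one side once, then look up matches: O(len(left) + len(right)) plus output."""
    index = defaultdict(list)
    for r in right:
        index[r[key]].append(r)
    return [(l, r) for l in left for r in index.get(l[key], [])]


if __name__ == "__main__":
    volumes = [{"pool": "p1", "vol": "v1"}, {"pool": "p2", "vol": "v2"}]
    pools = [{"pool": "p1", "tier": "ssd"}, {"pool": "p3", "tier": "hdd"}]
    print(hash_join(volumes, pools, "pool"))
```

Both functions return the same pairs in the same order; the hash join pays for the extra memory of the index in exchange for avoiding the repeated scan of the second collection.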

78) Strategies – The implementation of a particular piece of data processing logic within a storage product is often customized and maintained with its component as the component improves from version to version. Very little effort is usually spent on externalizing the strategy across components to see what belongs in a shared category and could benefit other components. Even if only one strategy is ever used with a component, externalizing it allows other techniques to be tried out independently of the product usage.
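As a small illustration of externalizing a strategy, the sketch below passes an eviction policy into a cache instead of hard-coding it. The names SmallCache, evict_oldest and evict_largest are hypothetical, and eviction is only one example of logic that could be factored out this way.

```python
from typing import Callable, Dict

# A strategy receives the cache's entry metadata and returns the key to evict.
EvictionStrategy = Callable[[Dict[str, dict]], str]


def evict_oldest(entries: Dict[str, dict]) -> str:
    """Pick the entry that was written least recently."""
    return min(entries, key=lambda k: entries[k]["last_access"])


def evict_largest(entries: Dict[str, dict]) -> str:
    """Pick the entry that occupies the most space."""
    return max(entries, key=lambda k: entries[k]["size"])


class SmallCache:
    """Hypothetical component; assumes capacity >= 1."""

    def __init__(self, capacity: int, choose_victim: EvictionStrategy = evict_oldest):
        self.capacity = capacity
        self.choose_victim = choose_victim
        self._entries: Dict[str, dict] = {}
        self._clock = 0  # logical time, incremented on every put

    def put(self, key: str, value: bytes) -> None:
        self._clock += 1
        if key not in self._entries and len(self._entries) >= self.capacity:
            del self._entries[self.choose_victim(self._entries)]
        self._entries[key] = {"value": value, "size": len(value), "last_access": self._clock}

    def keys(self) -> set:
        return set(self._entries)
```

Because the cache only depends on the strategy's signature, a new policy can be tried by passing a different function, without touching the component itself.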

79) Plug and Play architecture – The notion of plugins that work irrespective of the components and layers in a storage stack is well understood and part of software design. Yet standardizing the interface so that it applies across implementations is often deferred. Standardizing interfaces up front instead promotes an ecosystem and adds convenience for the user.
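A standardized plugin interface might look like the following sketch. The WritePlugin base class, the PLUGINS registry, the register decorator and the two example plugins are all assumptions invented for illustration; only zlib is a real standard-library module.

```python
import zlib
from abc import ABC, abstractmethod
from typing import Dict, List, Type


class WritePlugin(ABC):
    """Hypothetical contract that every plugin on the write path implements."""

    name: str = ""

    @abstractmethod
    def transform(self, data: bytes) -> bytes:
        ...


PLUGINS: Dict[str, Type[WritePlugin]] = {}


def register(cls: Type[WritePlugin]) -> Type[WritePlugin]:
    """Class decorator that makes a plugin discoverable by name."""
    PLUGINS[cls.name] = cls
    return cls


@register
class CompressPlugin(WritePlugin):
    name = "compress"

    def transform(self, data: bytes) -> bytes:
        return zlib.compress(data)


@register
class FramePlugin(WritePlugin):
    name = "frame"

    def transform(self, data: bytes) -> bytes:
        # Prefix the payload with its length as a 4-byte big-endian integer.
        return len(data).to_bytes(4, "big") + data


def apply_plugins(names: List[str], data: bytes) -> bytes:
    """Run the named plugins in order; the caller never sees concrete classes."""
    for plugin_name in names:
        data = PLUGINS[plugin_name]().transform(data)
    return data
```

A call such as apply_plugins(["compress", "frame"], payload) works the same way for any plugin that honors the contract, which is the benefit of fixing the interface before the implementations multiply.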

80) Interoperability – Most storage products work well when the clients run on a supported flavor of an operating system. Considering interoperability with other platforms, however, allows the product to expand its usage. Interoperability is not just a convenience for the end user; it also reduces management cost.
