Monday, September 24, 2018

We were discussing the design of the object cache layer versus the design of the object storage. The design of the cache may involve a very different deployment from that of the object storage. The cache layer need not be handled by one server. There can be multiple servers within the object cache, each with its own implementation that determines the schedule for persisting objects to the store. This lets the cache layer scale out across nodes so that each handles only a small subset of the objects. Whether deployed as a cluster or as a set of proxy servers, the cache has a wide variety of choices for its implementation.

The object storage is entirely cluster based and virtualizes the data center with its implementation of a storage pool. Its design is somewhat more restricted than that of the cache because it is storage oriented. The cache, on the other hand, is focused on workloads and may partition objects across different cache servers based on the object distribution. This can be as simple as a distributed hash table, with a distribution strategy similar to that of the gateway service for assigning objects to their designated caches. A peer-to-peer network may be overlaid on the object storage to determine this object-to-cache distribution. The cache layer does not need a cluster-only approach.

There are several benefits to this approach. First, each cache is concerned with only a small subset of the objects, so it can serve its workload better than if everything were part of the same cache. Second, the implementation of the cache is now independent of the compute at that peer, which allows the use of far more commodity servers than if they were part of a cluster. Third, every cache may store objects to the same object storage on the backend while offering the option to be bound to a specific workload, allowing scale-up where necessary.
Finally, the ability of such a cache to flush to disk is entirely encapsulated within the peer in a deeply isolated stack, which enables far better performance than a distributed file system. All these considerations make the cache layer stand out from the storage layer.
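
To make the distribution idea concrete, here is a minimal sketch of how objects might be assigned to designated caches with a hash ring, and how each cache might persist to the shared store on its own schedule. This is only an illustration under assumptions: the names `CacheNode`, `CacheRing`, and `node_for` are hypothetical, a plain dictionary stands in for the object storage, and consistent hashing is just one possible way to realize the distributed hash table mentioned above.

```python
import bisect
import hashlib


def _hash(value: str) -> int:
    """Map a string to a stable integer position on the ring."""
    return int(hashlib.sha256(value.encode("utf-8")).hexdigest(), 16)


class CacheNode:
    """Hypothetical cache server holding a small subset of the objects."""

    def __init__(self, name: str, store: dict):
        self.name = name
        self.store = store      # shared backend; a dict stands in for object storage
        self.objects = {}       # objects cached at this peer
        self.dirty = set()      # keys written here but not yet persisted

    def put(self, key: str, data: bytes) -> None:
        self.objects[key] = data
        self.dirty.add(key)

    def get(self, key: str):
        # On a miss, fall back to the backend store and keep a local copy.
        if key not in self.objects and key in self.store:
            self.objects[key] = self.store[key]
        return self.objects.get(key)

    def flush(self) -> int:
        """Persist dirty objects to the store; each node decides when to call this."""
        written = len(self.dirty)
        for key in self.dirty:
            self.store[key] = self.objects[key]
        self.dirty.clear()
        return written


class CacheRing:
    """Assigns each object key to one cache node via consistent hashing."""

    def __init__(self, nodes, replicas: int = 64):
        self._points = []
        self._owners = {}
        for node in nodes:
            # Several virtual points per node even out the key distribution.
            for i in range(replicas):
                point = _hash(f"{node.name}#{i}")
                self._points.append(point)
                self._owners[point] = node
        self._points.sort()

    def node_for(self, key: str) -> CacheNode:
        # First ring point clockwise from the key's hash, wrapping at the end.
        idx = bisect.bisect_right(self._points, _hash(key)) % len(self._points)
        return self._owners[self._points[idx]]


# Usage: three independent cache servers in front of one backend store.
store = {}
ring = CacheRing([CacheNode(f"cache-{i}", store) for i in range(3)])
node = ring.node_for("bucket/photo.jpg")
node.put("bucket/photo.jpg", b"image bytes")
node.flush()  # this peer persists on its own schedule
```

Because the key alone determines the owning cache, any gateway or peer can route a request without consulting a central directory, and adding a cache server moves only the keys that fall on its ring points.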
