Sunday, October 28, 2018

Object storage is usually perceived as backup and tertiary storage. In the next few posts, however, we argue that object storage can serve as persistent, thread-local storage for all workers in a system, so that no state is lost when a worker disappears. Files have traditionally been the storage of choice for workers, but we argue that this is just a matter of access: file protocols and HTTP access serve equally well against a file-system-enabled object storage. Moreover, not all data needs to be written deep into the object storage at once. With the help of the cache layer discussed earlier, workers can even get higher performance than they would by working directly with remote storage. The requirements for object storage need not even change while the reads and writes from the workers can be handled 
Many applications maintain concurrent activities. Large-scale query processing on big data requires intermediate storage for its computations. Similarly, cluster computing requires storage for the processing done by its nodes. Traditionally, these nodes have shared volumes or maintained local file storage. However, most disk operations are already written off as expensive as computing moves more into memory. A large number of workers are commodity workers: they do not need performance beyond what object storage can provide. Moreover, each worker or node gets its own set of objects in the object storage, which can be treated as shared-nothing, just like its own disks. All they need is for their storage to be in the cloud and managed, so that they are never limited by their disks.
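The shared-nothing idea above can be sketched in a few lines. This is only an illustration under stated assumptions: `InMemoryObjectStore` stands in for a real object storage, and `WorkerStorage` and the `workers/<id>/` key layout are hypothetical names, not part of any product.

```python
class InMemoryObjectStore:
    """Hypothetical stand-in for a remote object storage (flat key -> bytes)."""

    def __init__(self):
        self._objects = {}

    def put(self, key, data):
        self._objects[key] = bytes(data)

    def get(self, key):
        return self._objects[key]

    def list(self, prefix):
        # Return all keys under a prefix, in sorted order.
        return sorted(k for k in self._objects if k.startswith(prefix))


class WorkerStorage:
    """Gives one worker a private key namespace in the shared store.

    The state lives in the object store, not in the worker, so a new
    worker created with the same id sees everything the old one wrote.
    """

    def __init__(self, store, worker_id):
        self._store = store
        # Assumed layout: the trailing slash keeps "1" and "10" apart.
        self._prefix = f"workers/{worker_id}/"

    def write(self, name, data):
        self._store.put(self._prefix + name, data)

    def read(self, name):
        return self._store.get(self._prefix + name)

    def names(self):
        keys = self._store.list(self._prefix)
        return [k[len(self._prefix):] for k in keys]
```

Two workers can write an object with the same name without colliding, and a replacement worker that is handed the same id picks up where the previous one left off.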
That is how this Ticket Layer positions itself. It offers tracking and leasing of universal storage that does away with disks for nodes in a cluster and workers in an application. Several forms of data access can be supported in addition to the file-system protocols and S3 access of an object storage. The ticketing system is a planning and tracking system that generates tickets and leases for remote storage on a workload-by-workload basis, so that the cloud's elasticity can be brought to individual workers, even within the same node or application.
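To make the ticket-and-lease idea concrete, here is a minimal sketch. It assumes a lease is simply a storage prefix plus an expiry time; the names `Ticket`, `TicketLayer`, `issue`, `renew`, and `release`, and the `leases/<workload>/<id>/` layout, are all hypothetical and chosen only for illustration.

```python
import itertools
import time
from dataclasses import dataclass


@dataclass
class Ticket:
    ticket_id: int
    workload: str
    prefix: str        # storage location leased to the workload
    expires_at: float  # clock value after which the lease is invalid


class TicketLayer:
    """Hypothetical tracker that hands out time-limited storage leases."""

    def __init__(self, lease_seconds=60.0, clock=time.monotonic):
        self._lease_seconds = lease_seconds
        self._clock = clock  # injectable so expiry can be tested
        self._ids = itertools.count(1)
        self._active = {}

    def issue(self, workload):
        # One ticket per workload request; the prefix is unique per ticket.
        ticket_id = next(self._ids)
        ticket = Ticket(
            ticket_id=ticket_id,
            workload=workload,
            prefix=f"leases/{workload}/{ticket_id}/",
            expires_at=self._clock() + self._lease_seconds,
        )
        self._active[ticket_id] = ticket
        return ticket

    def is_valid(self, ticket_id):
        ticket = self._active.get(ticket_id)
        return ticket is not None and self._clock() < ticket.expires_at

    def renew(self, ticket_id):
        # Only a still-valid lease can be extended.
        if not self.is_valid(ticket_id):
            raise KeyError(f"ticket {ticket_id} is expired or unknown")
        self._active[ticket_id].expires_at = self._clock() + self._lease_seconds

    def release(self, ticket_id):
        # Releasing an unknown ticket is a no-op.
        self._active.pop(ticket_id, None)
```

A worker would call `issue` when a workload starts, write under the returned prefix, call `renew` while it is still running, and call `release` when it finishes. What happens to the objects under an expired prefix is a policy decision that this sketch leaves open.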
