Cluster computing

Monday, March 18, 2019

The Simple Storage Service (S3) API
Virtually all storage providers in cloud and on-premise storage solutions support the S3 Application Programming Interface. With the help of this APIs, applications can send and receive data without having to worry about the storage best practice. They are also able to switch from one storage appliance to another, one on-premise cluster to another, one cloud provider to another and so on. The API was introduced by Amazon but has become the industry standard and accepted by many storage providers. Even competitors like Azure provide an S3 Proxy to allow applications to access their storage with this API.
This API has wide spread popularity for storing binary large objects also called blobs for short but that has not stopped providers from making an S3 façade over other forms of storage. They get the audience with S3 since the application changes are minimal and retrofit the storage solutions to support the most used aspect of S3.
S3 however is neither a restriction on the storage type nor true to its roots in object storage. It provides the ability to navigate the buckets and objects and adds improvements such as multi-part upload to the way data is sent into the object storage. It supports various headers to make the most of the object storage which includes sending options along with the payload as well as retrieving information on the stored products.
S3 like any other cloud-based service has been developed with the Representation State Transfer (REST) best practice. However, it does not involve all the provisions of HTTP 2.0 (released) or 3.0 (in-progress). Neither does it provide a protocol like abstraction where layers can be added above or below in a storage stack. A networking stack on the other hand has dedicated protocols for each of its layers. A storage stack may comprise of say, at the very least, an active workload versus a backup workload layer where the active remains as the higher layer and can support say HTTP and the backup remains as the lower layer and supports S3. Perhaps this distinction has been somewhat obfuscated where object storage can expand its capabilities to both layer.
Object storage can also support the notions of storage classes, versioning, retention, data aging, deduplication and compression and none of these are really feature very well by S3 without a header here or a prefix there or a tag or a path and this leads to incredible inconsistency not just between the features but also in their usages by the customers.

Cluster computing

Monday, March 18, 2019

No comments:

Post a Comment