Cluster computing

Thursday, August 16, 2018

We were discussing the suitability of Object Storage to various workloads. Specifically, we discussed its role in Artifactory which is used to store binary objects from CI/CD. A large number of binary objects or files gets generated with each run of the build. These files are mere build artifacts and the most common usage of these files is download. Since the hosted solution is cloud based, Artifactory users demands elasticity, durability and http access. Object Storage is best suited to meet these demands. The emphasis here is the distinction over a file-system exposed over the web for their primary use case scenario. In fact, the entire DevOps process with its CI/CD servers can use a common Object Storage instance so that there is little if any copying of files from one staging directory to another. The Object Storage not only becomes the final single instance destination but also avoids significant inefficiencies in the DevOps processes. Moreover, builds are repeated through development, testing and production so the same solution works very well for repetitions. This is not just a single use case but an indication that there are many cycles within the DevOps process that could benefit from a storage tier as Object Storage. Static content like binary images of executable are generally copy over write. Versioning for same named files is a feature of Object Storage. Object Storage can not only use file exports but also provide automatic versioning of content. It becomes a content library for binary artifacts of build in all the features demanded over a file system such as versioning. Previous versions may be retained for as long as the life-cycle rules allow. These rules can be specified for the objects. It can also provide time limited access to content. The URI exposed for the object can be shared with anyone. The object may be downloaded on any device anywhere. It enables multi-part upload (mpu) of large objects. This is considered a significant improvement for large binary objects since it enables binary transfer in parts. There are three steps - an mpu upload is requested, different parts are uploaded, and finally an mpu complete is requested. The object storage constructs the object from the parts and then it can be accessed just the same as any other object. Each part is identified and they can number in hundreds. The part upload request includes a part number. The object storage returns a tag header for each part. The header and part number must be included in subsequent requests. The parts can be sent to object storage in any order. A complete request or an abort request must be sent to finalize the parts and permit Object storage to start reconstruction of the object and removing the parts. Parts uploaded so far can be listed. If the listed parts are more than 1000, a series of such list requests need to be sent. Multi-part uploads can be concurrent.If a part is sent again, it will update the already uploaded part. All the parts are used for reconstruction of the original object only after the complete request is received.

Cluster computing

Thursday, August 16, 2018

No comments:

Post a Comment