Tuesday, September 11, 2018

Object versioning versus naming conventions
Introduction: When an object is created, it has a lifetime and a purpose. If it undergoes modifications, the object is no longer the same. Even though the modified object may serve the same purpose, it is a different entity. The original and the modified object can co-exist either as separate versions under one name or as objects with separate names.
In object storage we have a similar concept. When we enable versioning, we have the option to track changes to an object. Any modification of the object via upload increments its version. This follows the "copy-on-write" principle.
Generally, in-place editing of an object is not recommended. There are several reasons for this. First, an object may be considered a sequence of bytes. If we overwrite a byte range that is yet to be read, the reader may never know what the original object was. Second, the size of the byte range may shrink or expand on editing. The caller cannot rely on the size as an attribute of the object if it keeps changing. Third, just as memcpy operations on byte ranges come with caveats, such as overlapping ranges, readers and writers that touch the same range call for similar caution. If we had made a copy, the readers could continue to read the old copy without any concern about the writers, and vice versa. An interrupted or partial change could also leave the object in an inconsistent or unusable state. Therefore, editing an object in place is not preferred. Unless the edit is for debugging, forensics or reverse engineering, a modified object is best stored as a new version.
Versioning is automatic, and it is possible to go forward or backward between versions. The versioning may even come with descriptions of each version. All versions of an object have the same name. When an object is referred to by its name, the latest version is retrieved.
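The versioning behaviour described above can be sketched with a toy in-memory store. This is only an illustration under stated assumptions: the class VersionedStore and its methods put, get and versions are hypothetical names made up for this example and do not correspond to any particular object storage product. Every upload appends a new immutable copy, and a lookup by name alone returns the latest version.

```python
from typing import Dict, List, Optional


class VersionedStore:
    """Toy in-memory object store with versioning (hypothetical, not a real API)."""

    def __init__(self) -> None:
        # name -> list of immutable byte strings; index i holds version i + 1
        self._versions: Dict[str, List[bytes]] = {}

    def put(self, name: str, data: bytes) -> int:
        """Store data as a new version and return its version number."""
        history = self._versions.setdefault(name, [])
        # Copy on write: a new copy is appended, earlier versions are never touched.
        history.append(bytes(data))
        return len(history)

    def get(self, name: str, version: Optional[int] = None) -> bytes:
        """Return the requested version, or the latest when none is given."""
        history = self._versions[name]
        if version is None:
            return history[-1]
        if not 1 <= version <= len(history):
            raise KeyError(f"{name} has no version {version}")
        return history[version - 1]

    def versions(self, name: str) -> List[int]:
        """List the version numbers that exist for a name."""
        return list(range(1, len(self._versions[name]) + 1))


store = VersionedStore()
v1 = store.put("report.txt", b"draft")
v2 = store.put("report.txt", b"final")
latest = store.get("report.txt")            # no version given: latest is returned
first = store.get("report.txt", version=1)  # older versions stay readable
```

Because old versions are never modified, a reader holding version 1 is unaffected by a writer producing version 2, which is the reader and writer independence argued for above.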
When objects have different names, the names may follow patterns that organize them. When there are a large number of objects, having a prefix and a naming convention is standard practice in many Information Technology departments. Wildcard patterns over these names can then be used in search commands that span some or all of the objects. This is not as easy to do with past versions unless the search command iterates over all versions of each object. Moreover, copies of the same object under different names can each undergo different modifications and maintain their own version history. Names need not serve only as identifiers; they may also carry tags that let objects be grouped, ranked and sorted.
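As a rough sketch of searching by naming convention, the snippet below matches object names against a wildcard pattern with Python's standard fnmatch module. The object names and the date-based prefix layout are made up for illustration, and find_objects is a hypothetical helper, not a command of any storage system.

```python
from fnmatch import fnmatchcase
from typing import Iterable, List


def find_objects(names: Iterable[str], pattern: str) -> List[str]:
    """Return the names that match a shell-style wildcard pattern, sorted."""
    return sorted(name for name in names if fnmatchcase(name, pattern))


# Hypothetical object names that follow a prefix/date/file convention.
names = [
    "logs/2018-09-10/app.log",
    "logs/2018-09-11/app.log",
    "logs/2018-09-11/db.log",
    "images/logo.png",
]

by_day = find_objects(names, "logs/2018-09-11/*")   # everything under one prefix
app_logs = find_objects(names, "logs/*/app.log")    # one file name across all days
```

The search only sees current names. Covering past versions as well would require the caller to enumerate the versions of each matching object, as noted above.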
There are other forms of modification that are not covered by the techniques above. For example, there is an edit-original and copy-later technique where two copies are maintained side by side: edits are allowed on one copy, and it can be restored from the other. An undo-like technique is also possible, where incremental changes are captured and later undone by putting the saved original data back. In fact, all in-place editing can be done automatically with the help of some form of rollback behaviour that either discards the writes or overwrites them with the original. Locking and logging are two popular techniques that help such edits with atomicity, consistency, isolation and durability.
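The undo-like technique can be sketched as an in-place edit backed by an undo log. This is a minimal single-threaded illustration: the class UndoableBuffer and its methods are hypothetical names, writes are restricted to the existing byte range so the object size never changes, and locking for concurrent readers and writers is left out entirely.

```python
from typing import List, Tuple


class UndoableBuffer:
    """In-place editable bytes with an undo log (hypothetical sketch)."""

    def __init__(self, data: bytes) -> None:
        self.data = bytearray(data)
        # Each entry records (offset, original bytes) for one write.
        self._undo_log: List[Tuple[int, bytes]] = []

    def write(self, offset: int, new_bytes: bytes) -> None:
        """Overwrite a byte range in place after logging the original bytes."""
        end = offset + len(new_bytes)
        if offset < 0 or end > len(self.data):
            raise ValueError("write must stay within the existing object")
        self._undo_log.append((offset, bytes(self.data[offset:end])))
        self.data[offset:end] = new_bytes

    def rollback(self) -> None:
        """Undo all uncommitted writes, newest first, using the logged originals."""
        while self._undo_log:
            offset, original = self._undo_log.pop()
            self.data[offset:offset + len(original)] = original

    def commit(self) -> None:
        """Accept the writes by discarding the undo log."""
        self._undo_log.clear()


buf = UndoableBuffer(b"hello world")
buf.write(0, b"HELLO")
buf.write(6, b"WORLD")
edited = bytes(buf.data)
buf.rollback()
restored = bytes(buf.data)
```

Undoing in reverse order matters when writes overlap, since each log entry holds the bytes as they were just before that particular write.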
