Tuesday, November 13, 2018

Today we continue discussing best practices from storage engineering:

41) Allocations: Although a storage organization unit such as a file, blob or table seems like a single indivisible logical unit to the user, it translates to multiple allocations at the physical layer. Files are organized hierarchically, and low-level drivers translate them to a location and byte offset on disk. This traditional architecture is driven primarily by hierarchy and naming. But storage units have more than names. They also have tags and metadata, and designing a file system around alternate forms of organization that leverage tags allows different nomenclatures to be used simultaneously. This is an example where master data management can bring significant advantages, such as the use of attributes to look up files.
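As a rough illustration of attribute-based lookup living alongside the path hierarchy, here is a minimal sketch in Python. The `TagIndex` class, the paths and the tags are all hypothetical, and a real file system would persist such an index rather than keep it in memory.

```python
from collections import defaultdict


class TagIndex:
    """Hypothetical in-memory index from tags to file paths."""

    def __init__(self):
        self._paths_by_tag = defaultdict(set)

    def add(self, path, tags):
        # Register the same path under every tag it carries.
        for tag in tags:
            self._paths_by_tag[tag].add(path)

    def lookup(self, *tags):
        """Return the sorted paths that carry every one of the given tags."""
        if not tags:
            return []
        candidates = [self._paths_by_tag.get(tag, set()) for tag in tags]
        return sorted(set.intersection(*candidates))


index = TagIndex()
index.add("/home/alice/q3.xlsx", {"finance", "2018"})
index.add("/archive/reports/q3.pdf", {"finance", "report", "2018"})
index.add("/home/bob/notes.txt", {"report"})

# The lookup ignores where the files sit in the hierarchy.
finance_reports = index.lookup("finance", "report")
```

The hierarchy still names each file, but the lookup by tags is independent of it, so both nomenclatures can be used at the same time.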

42) Catalogs: Physical organization does not always have to correlate directly with the way users save their content. A catalog is a great example of utilizing the existing organization to serve the various ways in which content is looked up or correlated. Moreover, custom tags increase the ways in which files can be managed and maintained. While lookups have translated to queries, content indexers have provided an alternate way to look up data. Here we refer to the organization of metadata so that the storage architecture can be separated from the logical organization and lookups.
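One way to picture this separation is a small catalog that maps logical names and tags to physical locations. The sketch below uses Python's built-in `sqlite3` module; the table names, column names and location strings are assumptions made up for illustration only.

```python
import sqlite3

# In-memory catalog; the schema is hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE catalog (
        object_id TEXT PRIMARY KEY,
        logical_name TEXT,
        physical_location TEXT
    );
    CREATE TABLE tags (object_id TEXT, tag TEXT);
    """
)
conn.execute(
    "INSERT INTO catalog VALUES (?, ?, ?)",
    ("obj-1", "reports/q3.pdf", "volume-a:block-1024"),
)
conn.executemany(
    "INSERT INTO tags VALUES (?, ?)",
    [("obj-1", "finance"), ("obj-1", "2018")],
)


def locate_by_tag(tag):
    """Return (logical_name, physical_location) pairs for objects with a tag."""
    return conn.execute(
        "SELECT c.logical_name, c.physical_location "
        "FROM catalog c JOIN tags t ON t.object_id = c.object_id "
        "WHERE t.tag = ? ORDER BY c.logical_name",
        (tag,),
    ).fetchall()


before = locate_by_tag("finance")

# Moving the object changes only the catalog row, not how users look it up.
conn.execute(
    "UPDATE catalog SET physical_location = ? WHERE object_id = ?",
    ("volume-b:block-77", "obj-1"),
)
after = locate_by_tag("finance")
```

The same query works before and after the move, which is the point of keeping the logical organization in the catalog rather than in the physical layout.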

43) System metadata – Metadata is not specific only to the storage artifacts from the user. Every layer maintains entities and bookkeeping in the layer immediately below it, and these are often just as useful to query as the overall system. This metadata is internal and for system purposes only. Consequently, it is the source of truth for the artifacts in the system.

44) User metadata – We referred to metadata for user objects. However, such metadata is usually in the form of predetermined fields that the system exposes. In some cases, users can add more labels and tags, and this customization is referred to as user metadata. User metadata helps in cases outside the system, where users want to group their content so that it can then be used in classification and data mining.
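The distinction between predetermined fields and user-added labels could be modeled roughly as follows. This is only a sketch: `StoredObject`, its fields and the `group_by_label` helper are hypothetical names, not part of any particular storage product.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class StoredObject:
    # Predetermined fields that the system exposes for every object.
    name: str
    size_bytes: int
    # Free-form labels and tags added by the user.
    user_metadata: dict = field(default_factory=dict)


def group_by_label(objects, key):
    """Group object names by the value of one user-defined label."""
    groups = defaultdict(list)
    for obj in objects:
        value = obj.user_metadata.get(key)
        if value is not None:
            groups[value].append(obj.name)
    return dict(groups)


objects = [
    StoredObject("a.jpg", 2048, {"project": "vacation"}),
    StoredObject("b.jpg", 4096, {"project": "vacation"}),
    StoredObject("c.doc", 1024, {"project": "taxes"}),
    StoredObject("d.tmp", 10),  # no user metadata at all
]
groups = group_by_label(objects, "project")
```

Groupings like this one can then feed classification or data mining outside the storage system.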

45) User defined functions, callbacks and webhooks – Labels and tags are only as useful to the user as the queries in which they can be used. If the system does not support intensive or involved logic, users are left to implement their own. Such expressions may involve custom user-defined operators and callbacks. These can be executed on a subset of the user's data or on all of the data, including that of the user. They can also be executed so that the results are streamed.
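A minimal sketch of user-defined logic applied to a subset of records, with results streamed back, might look like this in Python. The `stream_query` function and the record layout are assumptions for illustration; an actual system would define its own registration and execution model for such callbacks.

```python
def stream_query(records, predicate, transform):
    """Lazily apply user-supplied callbacks and yield results one at a time."""
    for record in records:
        if predicate(record):        # user-defined filter selects the subset
            yield transform(record)  # user-defined operator on each match


records = [
    {"name": "a.log", "tags": {"audit"}, "size": 120},
    {"name": "b.log", "tags": {"debug"}, "size": 300},
    {"name": "c.log", "tags": {"audit", "debug"}, "size": 50},
]


# User-defined callbacks that the system itself does not provide.
def is_audit(record):
    return "audit" in record["tags"]


def name_and_size(record):
    return (record["name"], record["size"])


# Because stream_query is a generator, results are produced on demand.
results = list(stream_query(records, is_audit, name_and_size))
```

Using a generator means the caller can start consuming results before the whole data set has been scanned.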

