Wednesday, October 24, 2018

We continue discussing data virtualization over object storage. Data virtualization does not need to be a thin layer that merely mashes up different object stores. Even if it were just a gateway, it could allow customizations by users or administrators for their workloads. However, it does not stop there, as described in this write-up: https://1drv.ms/w/s!Ashlm-Nw-wnWt0XqQrQQTQBtuKYh. This layer can be intelligent enough to map data types to storage. Typically, queries specify the data location either as part of the query, with fully resolved data types, or as part of the connection over which the queries are made. In a stateless request-response mode, this can be part of the request. The layer resolving the request can then determine the storage location. Therefore, the data virtualization layer can be an intelligent router.
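As a minimal sketch of the first routing step, the function below resolves the data type named in a request. It assumes, purely for illustration, that a fully resolved data type is written as "namespace.Type" and that the connection may carry a default namespace; the function name and this naming scheme are hypothetical and not part of any particular product.

```python
from typing import Optional


def resolve_data_type(query_target: str, connection_default: Optional[str] = None) -> str:
    """Return the fully qualified data type for a request.

    Assumption: a qualified name looks like "namespace.Type". If the query
    already carries the namespace, it wins; otherwise the namespace supplied
    with the connection (or with the stateless request) is used.
    """
    if "." in query_target:
        # The query itself fully resolves the data type.
        return query_target
    if connection_default is None:
        raise ValueError(f"cannot resolve data type {query_target!r} without a namespace")
    # Fall back to the namespace that came with the connection or request.
    return f"{connection_default}.{query_target}"
```

Once the data type is fully resolved, the layer can look up which store holds it and route the request there.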
Notice that both the user and the system can determine the policies for data virtualization. If the data type is available in only one location, then the location is known. If the same data type is available in different locations, then the user can make the determination with the help of rules and configurations. If the user has not specified anything, then the system can choose the location for the data type. In the latter case, the system does not necessarily need to maintain rules. It can simply choose the first available store that has the data type.
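The policy described above can be sketched in a few lines. The catalog (data type to list of stores) and the user rules (data type to preferred store) are hypothetical structures chosen for illustration; a real layer would obtain them from its own configuration.

```python
from typing import Dict, List, Optional


def choose_store(
    data_type: str,
    catalog: Dict[str, List[str]],
    user_rules: Optional[Dict[str, str]] = None,
) -> str:
    """Pick the store for a data type: single location, then user rule, then first available."""
    stores = catalog.get(data_type, [])
    if not stores:
        raise LookupError(f"no store holds data type {data_type!r}")
    if len(stores) == 1:
        # Only one location has the data type, so the location is known.
        return stores[0]
    preferred = (user_rules or {}).get(data_type)
    if preferred in stores:
        # The user specified a location through rules or configuration.
        return preferred
    # No user preference: the system takes the first available store.
    return stores[0]
```

For example, with a catalog of {"Order": ["store-a", "store-b"]}, a user rule {"Order": "store-b"} selects store-b, and the absence of a rule falls back to store-a.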
Many systems use reflection to find out whether a module has a particular data type. Object storage does not have any such mechanism. However, nothing prevents the data virtualization layer from maintaining, in the object storage, a registry or the data-type-defining modules themselves, so that those modules can be loaded and introspected to determine whether such a data type exists. This technique is common in many runtimes. Therefore, the intelligence to add to the data virtualization layer is well understood.
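The registry-plus-introspection idea might look like the following sketch. The registry here is a hypothetical mapping from store name to the modules that define its data types, and standard-library modules stand in for those modules so that the example runs as written.

```python
import importlib
import inspect
from typing import Dict, List


def stores_defining(type_name: str, registry: Dict[str, List[str]]) -> List[str]:
    """Return the stores whose registered modules define a class named type_name."""
    matches = []
    for store, module_names in registry.items():
        for module_name in module_names:
            # Load the module, then introspect it for the requested type.
            module = importlib.import_module(module_name)
            if inspect.isclass(getattr(module, type_name, None)):
                matches.append(store)
                break  # one hit per store is enough
    return matches


# Stand-in registry: stdlib modules play the role of type-defining modules.
example_registry = {"store-a": ["json"], "store-b": ["fractions"]}
```

With this stand-in registry, looking up "Fraction" finds store-b only, while an unknown type name yields an empty list, which the layer could treat as "no store has this data type".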
Most implementations of such a layer take the form of a microservice. This helps modular design as well as testability. Moreover, the microservice provides HTTP access for other services, which has proven helpful from the point of view of production support. Services that depend on the data virtualization layer being up can also go directly to the object storage via HTTP. Therefore, the data virtualization layer merely acts like a proxy. The proxy can be transparent, thin or fat. It can even play the man in the middle, and therefore it requires very little change from the caller.
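To illustrate why a proxy requires little change from the caller, here is a small in-memory sketch rather than a real HTTP service. The class names and the key-prefix routing rule are assumptions made for the example; the point is only that the proxy exposes the same put/get interface as the store behind it.

```python
from typing import Dict


class InMemoryStore:
    """Stand-in for an object store with a minimal put/get interface."""

    def __init__(self) -> None:
        self._objects: Dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> None:
        self._objects[key] = data

    def get(self, key: str) -> bytes:
        return self._objects[key]


class VirtualizationProxy:
    """Exposes the same interface as a store and forwards each call to a backend."""

    def __init__(self, routes: Dict[str, InMemoryStore], default: InMemoryStore) -> None:
        self._routes = routes    # hypothetical rule: key prefix -> backend store
        self._default = default  # used when no prefix matches

    def _backend(self, key: str) -> InMemoryStore:
        for prefix, store in self._routes.items():
            if key.startswith(prefix):
                return store
        return self._default

    def put(self, key: str, data: bytes) -> None:
        self._backend(key).put(key, data)

    def get(self, key: str) -> bytes:
        return self._backend(key).get(key)
```

A caller written against InMemoryStore can be handed a VirtualizationProxy instead without modification, and it can still bypass the proxy and talk to a store directly when needed.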
