Sunday, May 17, 2020

Data import and export tool continued

Earlier, we described data transfer to the stream store using syntax similar to that of connectors for an object store. Because the stream store is layered over files and blobs as tier-2 storage, which brings storage engineering best practices for durability, replication, and geographical and web-accessible distribution, it is often not recognized as a veritable store that can participate in workflows outside analytics.

The data that flows out of the stream store is often pulled by various receivers. That usage remains mainstream for the stream store. However, we add usages where appenders push the data to different data sinks. Previously, in-house scripts handled such transfers. Instead, we suggest making this capability part of the standard storage and giving users just the ability to configure it for their purposes.
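As a sketch of what such user-facing configuration might look like, the snippet below reads a declarative description of a transfer instead of encoding it in a script. The property names and the surrounding class are hypothetical, not an actual stream store API:

import java.io.FileReader;
import java.io.IOException;
import java.util.Properties;

// Hypothetical: the user declares a transfer in a properties file rather than scripting it.
// Example keys (all illustrative):
//   source.stream = metrics-raw
//   sink.type     = objectstore
//   sink.endpoint = https://example.com/bucket
public class ConfiguredTransfer {
    public static void main(String[] args) throws IOException {
        Properties config = new Properties();
        try (FileReader reader = new FileReader("transfer.properties")) {
            config.load(reader);
        }
        System.out.printf("Exporting %s to %s sink at %s%n",
                config.getProperty("source.stream"),
                config.getProperty("sink.type"),
                config.getProperty("sink.endpoint"));
    }
}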
Taking the onerous routines of using the stream store as a storage layer out of the layer above, across different data sinks, enables a thinner upper layer and more convenience for the end user. Customizations in the upper layer are reduced, and the value additions bubble up the stack.
The on-premise stream store is no longer standalone. The public cloud has moved toward embracing on-premise compute resources and their management via System Center integration, and the same applies to the on-premise stream store. Consequently, the technologies behind the appenders are not only transparent but also set up to be monitored and reported on via dashboards and charts. This improves visibility into the data transfers while enabling cost accounting.
The importer and exporter can append to or read sections of the stream, improving performance and transfer speed.
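As a minimal sketch of sectioned reads, assume a hypothetical StreamClient interface with offset-based access; the stream store's actual client API is not shown in this post, so everything below is illustrative:

import java.io.IOException;
import java.nio.ByteBuffer;

// Hypothetical offset-based access to a stream; illustrative, not a real client API.
interface StreamClient {
    ByteBuffer readRange(String stream, long offset, int length) throws IOException;
    void append(String stream, ByteBuffer data) throws IOException;
}

// The exporter copies a stream in sections rather than as one monolithic read,
// so each section can be fetched, appended, and retried independently.
class SectionedExporter {
    private final StreamClient client;
    private final int sectionSize;

    SectionedExporter(StreamClient client, int sectionSize) {
        this.client = client;
        this.sectionSize = sectionSize;
    }

    void export(String source, String destination, long streamLength) throws IOException {
        for (long offset = 0; offset < streamLength; offset += sectionSize) {
            int length = (int) Math.min(sectionSize, streamLength - offset);
            ByteBuffer section = client.readRange(source, offset, length);
            client.append(destination, section);
        }
    }
}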
The importer and exporter also make it easy to bridge push and pull models by acting as a relay in the middle; their role then becomes that of an adapter between heterogeneous systems. The stream store's read API, for example, is a pull model, but most metrics and time-series databases follow a push model, relying on agents such as Telegraf to transfer data.
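The relay pattern can be sketched as follows, assuming a hypothetical pull-style event source on one side and a push-style HTTP write endpoint (as a time-series database or an agent such as Telegraf would expose) on the other; the endpoint and payload format are placeholders:

import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Relay: pull events from the stream store (pull model) and push them to a
// metrics endpoint (push model), adapting between the two systems.
public class PushPullRelay {
    // Hypothetical pull-style source; a real reader would come from the stream store client.
    interface EventSource {
        String nextEvent() throws IOException; // returns null when drained
    }

    private final HttpClient http = HttpClient.newHttpClient();
    private final URI sinkUri; // placeholder for the sink's HTTP write endpoint

    public PushPullRelay(URI sinkUri) {
        this.sinkUri = sinkUri;
    }

    public void relay(EventSource source) throws IOException, InterruptedException {
        String event;
        while ((event = source.nextEvent()) != null) {
            HttpRequest request = HttpRequest.newBuilder()
                    .uri(sinkUri)
                    .POST(HttpRequest.BodyPublishers.ofString(event))
                    .build();
            HttpResponse<Void> response =
                    http.send(request, HttpResponse.BodyHandlers.discarding());
            if (response.statusCode() >= 300) {
                throw new IOException("push failed with status " + response.statusCode());
            }
        }
    }
}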
The importer and the exporter enable the stream store to participate in a data pipeline. This is critical business value because the stream store then adds value as a co-inhabitant of a data ecosystem rather than competing with time-series databases.
The importer and exporter can also be customized for business needs. For example, the artifact size and the number of artifacts to export are configurable. Similarly, the event type used to transform the remote ByteBuffer can be determined before the importer runs.
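A sketch of how those knobs might be surfaced to the user follows; the ExportOptions class and its field names are hypothetical:

import java.nio.ByteBuffer;
import java.util.function.Function;

// Hypothetical options object for the exporter; all names are illustrative.
public class ExportOptions {
    final long artifactSizeBytes;   // size of each exported artifact
    final int artifactCount;        // number of artifacts to export
    // Transformation applied to each remote ByteBuffer, chosen before the importer runs.
    final Function<ByteBuffer, ByteBuffer> eventTransform;

    ExportOptions(long artifactSizeBytes, int artifactCount,
                  Function<ByteBuffer, ByteBuffer> eventTransform) {
        this.artifactSizeBytes = artifactSizeBytes;
        this.artifactCount = artifactCount;
        this.eventTransform = eventTransform;
    }
}

// Usage: 64 MB artifacts, ten of them, events passed through read-only.
// new ExportOptions(64L * 1024 * 1024, 10, ByteBuffer::asReadOnlyBuffer);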
