Thursday, February 25, 2021

Feedback Mechanisms continued...

 Telemetry and its storage: 

Problem statement: We previously discussed a mechanism to collect feedback from the users of a software application. This article follows up with how that feedback is stored.

Description: When feedback is collected consistently across all components, it tends to follow a schema. Such data can be kept in a table, and that table can grow arbitrarily large. Creating one table per component, feature or product for which feedback is provided keeps the number of feedback tables finite and lets each one grow in proportion to its usage.
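As a minimal sketch of the per-scope tables described above, the following uses Python's built-in sqlite3 module. The column set, the `feedback_<scope>` naming convention and the helper names are assumptions made for illustration; they are not part of the original design.

```python
import sqlite3

# Hypothetical shared schema: every scope's feedback table has the same columns.
FEEDBACK_COLUMNS = """
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    user_id TEXT NOT NULL,
    rating INTEGER,
    comment TEXT,
    created_at TEXT DEFAULT CURRENT_TIMESTAMP
"""


def table_for(scope: str) -> str:
    """Map a component/feature/product name to its table name.

    Table names cannot be bound as SQL parameters, so only plain ASCII
    letters, digits and underscores are accepted.
    """
    if not (scope.isascii() and scope.replace("_", "").isalnum()):
        raise ValueError(f"invalid scope name: {scope!r}")
    return f"feedback_{scope.lower()}"


def record_feedback(conn, scope, user_id, rating=None, comment=None):
    """Create the scope's table on first use, then insert one feedback row."""
    table = table_for(scope)
    conn.execute(f"CREATE TABLE IF NOT EXISTS {table} ({FEEDBACK_COLUMNS})")
    conn.execute(
        f"INSERT INTO {table} (user_id, rating, comment) VALUES (?, ?, ?)",
        (user_id, rating, comment),
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
record_feedback(conn, "search", "u1", rating=4, comment="fast")
record_feedback(conn, "search", "u2", rating=2)
record_feedback(conn, "checkout", "u1", comment="confusing")
```

Each scope gets its own table with an identical layout, so the number of tables is bounded by the number of scopes while each table grows independently.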

Some feedback is quantitative, some is qualitative, and some is both. Feedback can be as elaborate as a full-fledged, customizable survey or as short as a specially crafted Uniform Resource Locator that carries all the relevant information in predefined query parameters. A binary value might be sufficient for such feedback. If only metrics are collected, they can be gathered with Telegraf, stored in InfluxDB and visualized in Grafana.
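The URL form of feedback can be illustrated with the standard library's urllib.parse. This is only a sketch: the host `feedback.example.com` and the parameter names `component` and `helpful` are hypothetical, chosen to show a binary value travelling in the query string.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_feedback_url(base: str, component: str, helpful: bool) -> str:
    """Encode the feedback as query parameters on a link the user can click."""
    query = urlencode({"component": component, "helpful": int(helpful)})
    return f"{base}?{query}"


def parse_feedback_url(url: str) -> dict:
    """Recover the feedback record on the collection side."""
    params = parse_qs(urlsplit(url).query)
    return {
        "component": params["component"][0],
        "helpful": params["helpful"][0] == "1",
    }


url = build_feedback_url("https://feedback.example.com/submit", "search", True)
print(url)
print(parse_feedback_url(url))
```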

If the content varies a lot, it can be saved in key-value or other non-relational stores with as much convenience as in a database. When feedback is collected from multi-tenant software, even the databases can be separate for each customer. A collection service behind a load balancer, backed by storage running in a high-availability cluster mode, scales out well as traffic increases. Running the collector as a service is also favored by platforms like Kubernetes, which bring best practices for hosting infrastructure.

The data can be rolled over from the transactional database to a data warehouse for accumulation. In the absence of a warehouse, the services would have to roll up the feedback themselves, inserting each summary as a new record and deleting the records it replaces. The historical information on the feedback is also useful for analytics, which can be implemented independently with read-only reporting stacks. This calls for a warehouse that can support a variety of reporting stacks. Using a warehouse also makes it possible to replace the database with a data pipeline involving event processors. Event processors and the standard stream querying libraries are well established for pipeline operations. These libraries support both querying and extract-transform-load operations because they deal with one event at a time.
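The roll-up that a service performs when no warehouse is available could look roughly like the sketch below, again using sqlite3. The table names `feedback` and `feedback_rollup`, their columns and the sample rows are illustrative assumptions. Dates are stored as ISO 8601 strings so that plain string comparison orders them correctly.

```python
import sqlite3


def roll_up(conn, cutoff):
    """Replace rows older than `cutoff` with one summary row per component."""
    with conn:  # one transaction: the insert and delete commit together
        conn.execute(
            """
            INSERT INTO feedback_rollup
                (component, period_end, responses, avg_rating)
            SELECT component, ?, COUNT(*), AVG(rating)
            FROM feedback
            WHERE created_at < ?
            GROUP BY component
            """,
            (cutoff, cutoff),
        )
        conn.execute("DELETE FROM feedback WHERE created_at < ?", (cutoff,))


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE feedback (component TEXT, rating INTEGER, created_at TEXT);
    CREATE TABLE feedback_rollup (
        component TEXT, period_end TEXT, responses INTEGER, avg_rating REAL
    );
    """
)
conn.executemany(
    "INSERT INTO feedback VALUES (?, ?, ?)",
    [
        ("search", 4, "2021-01-10"),
        ("search", 2, "2021-01-20"),
        ("checkout", 5, "2021-01-15"),
        ("search", 3, "2021-02-20"),  # newer than the cutoff, stays as-is
    ],
)
roll_up(conn, "2021-02-01")
```

Running the summary insert and the delete in a single transaction avoids a window in which the archived rows are gone but their summary has not yet been written.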

A data warehouse can be hosted in the cloud with the help of virtual data centers. One such popular warehouse supports data ingestion in the form of JSON from data pipelines. Queries over this warehouse follow the conventional Online Analytical Processing model, which serves the feedback mechanism very well. Data virtualization tools that make querying simpler, like Presto, are not needed because the data warehouse supports both querying and storage with a traditional query language. The data is also expected to be consistent between the tenants, so there is no need for alternative solutions for any custom processing.
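A pipeline stage that handles one event at a time and emits JSON for ingestion might be sketched as follows. The event fields (`component`, `rating`, `tenant`) and the normalization rules are assumptions for illustration, and no particular warehouse's loading API is implied; the output is simply one JSON document per line.

```python
import json
from typing import Iterable, Iterator


def transform(events: Iterable[dict]) -> Iterator[dict]:
    """Handle one event at a time: skip malformed events, normalize the rest."""
    for event in events:
        if "component" not in event:
            continue  # cannot attribute this feedback to anything
        yield {
            "component": event["component"].strip().lower(),
            "rating": event.get("rating"),
            "tenant": event.get("tenant", "unknown"),
        }


def to_json_lines(events: Iterable[dict]) -> Iterator[str]:
    """Serialize each event as one JSON document per line."""
    for event in events:
        yield json.dumps(event, sort_keys=True)


raw_events = [
    {"component": " Search ", "rating": 4, "tenant": "acme"},
    {"rating": 1},  # malformed: no component
    {"component": "checkout"},
]
lines = list(to_json_lines(transform(raw_events)))
for line in lines:
    print(line)
```

Because both stages are generators, events flow through without being buffered in full, which mirrors the one-event-at-a-time model of stream processing.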

Conclusion: The information from feedback can make its way seamlessly through collection services, storage tiers and reporting stacks, completing the feedback cycle end to end and helping to improve the software offering.

 
