Friday, April 9, 2021

  Applications of Data Mining to Reward points collection service

Continuation of use cases:   

Features and labels are helpful for the evaluation of the model as well. When the data comes from IoT sensors, it is typically streaming in nature. This makes it suitable for streaming stacks such as Apache Kafka and Apache Flink. The production models are loaded into a scoring pipeline to obtain the predicted product quality.

Drift is monitored by joining the actual product quality labels with the predicted quality labels and summarizing the result over a time window to trend model quality. Multiple such KPIs can be used to cover the scoring criteria. For example, a lagging indicator could determine whether the actual labels arrive delayed compared to the predicted labels. Thresholds can be set on the delay to reflect the acceptable business requirements and to trigger a notification or alert.
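
As a rough illustration of this kind of monitoring, here is a minimal sketch in plain Python. It assumes that predicted and actual labels are available as dictionaries keyed by an item id, each holding a timestamp and a label. The function names, the one-hour window, and the threshold values are all hypothetical and chosen only for the example; a production pipeline would more likely express the same join inside the streaming stack.

```python
from collections import defaultdict
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)

def windowed_accuracy(predicted, actual, window=timedelta(hours=1)):
    """Join predicted and actual labels on item id and summarize per time window.

    predicted, actual: {item_id: (timestamp, label)} (assumed shape).
    Returns {window_start: fraction of predictions matching the actual label}.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for item_id, (pred_time, pred_label) in predicted.items():
        if item_id not in actual:
            continue  # the actual label has not arrived yet
        _, actual_label = actual[item_id]
        # Align the prediction time to the start of its window.
        bucket = EPOCH + ((pred_time - EPOCH) // window) * window
        totals[bucket] += 1
        hits[bucket] += int(pred_label == actual_label)
    return {b: hits[b] / totals[b] for b in sorted(totals)}

def label_delays(predicted, actual):
    """Lagging indicator: how long each actual label trailed its prediction."""
    return {i: actual[i][0] - predicted[i][0] for i in predicted if i in actual}

def alerts(accuracy_by_window, delays,
           min_accuracy=0.9, max_delay=timedelta(hours=4)):
    """Compare the KPIs against (hypothetical) business thresholds."""
    messages = []
    for bucket, accuracy in accuracy_by_window.items():
        if accuracy < min_accuracy:
            messages.append(f"accuracy {accuracy:.2f} below threshold in window {bucket}")
    for item_id, delay in delays.items():
        if delay > max_delay:
            messages.append(f"actual label for {item_id} arrived {delay} after prediction")
    return messages
```

Each message returned by `alerts` would then be handed to whatever notification channel the business uses; that part is left out here because it depends entirely on the deployment.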

Ranking and scoring are central to this pipeline, and the choice of their algorithms can make a large difference in how the model makes predictions for new data. If the scoring is sensitive to drift in concept, data, and upstream systems, it will be more accurate and consistent and will avoid deterioration.

Real-world data arrives continuously, so both the performance and the quality of the model need to be evaluated continuously. The two provide independent considerations for tuning the model, so both are needed. Performance can be reported by the platform on which the model runs, but quality is more domain- and model-specific, so the quality metric must be decided before the model is deployed and built into the scoring pipeline.

The pipeline itself becomes a reusable asset that can be used in automation and remote invocations. This makes it easy to compare and deploy model variations, so that we are not limited to tuning a single model but can also evaluate several variations side by side. All models deteriorate over time, but a refreshed model tends to pull the quality back up over a longer duration.
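
A side-by-side evaluation can be sketched in a few lines of Python. This is only an illustration under stated assumptions: each model variation is represented as a plain callable that maps features to a label, the samples are a list of (features, label) pairs, and the `temperature` feature and the two threshold models are made up for the example.

```python
def compare_models(models, labeled_samples):
    """Score every model variation on the same labeled samples.

    models: {name: callable(features) -> label} (assumed interface).
    labeled_samples: list of (features, actual_label) pairs.
    Returns {name: accuracy}.
    """
    if not labeled_samples:
        raise ValueError("at least one labeled sample is required")
    scores = {}
    for name, predict in models.items():
        correct = sum(1 for features, label in labeled_samples
                      if predict(features) == label)
        scores[name] = correct / len(labeled_samples)
    return scores

# Hypothetical variations: the current model and a refreshed one.
models = {
    "current": lambda f: "good" if f["temperature"] < 70 else "bad",
    "refreshed": lambda f: "good" if f["temperature"] < 80 else "bad",
}
samples = [
    ({"temperature": 60}, "good"),
    ({"temperature": 75}, "good"),
    ({"temperature": 85}, "bad"),
    ({"temperature": 95}, "bad"),
]
scores = compare_models(models, samples)
best = max(scores, key=scores.get)  # candidate to deploy
```

Because every variation is scored on the same samples, the numbers are directly comparable, and the same function can be invoked from automation whenever a refreshed model becomes available.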

The following chart makes a comparison of all the data mining algorithms including the neural networks: https://1drv.ms/w/s!Ashlm-Nw-wnWxBFlhCtfFkoVDRDa?e=aVT37e


Thank you.
