Saturday, April 10, 2021

 

Applications of Data Mining to Reward points collection service 

Continuation of use cases:    

Drift is monitored by joining the product quality labels with the predicted quality labels and summarizing the result over a time window to trend model quality. Multiple such KPIs can be used to cover the scoring criteria. For example, a lagging indicator could determine whether the actual labels arrive late compared to the predicted labels. Thresholds can be set on that delay to capture the acceptable business requirements and to trigger a notification or alert.
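The monitoring described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout (a dictionary keyed by record id holding a label and a timestamp), the function names, the one-day window, and the threshold values are all hypothetical and are not taken from any specific platform.

```python
from collections import defaultdict
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)

def windowed_accuracy(predicted, actual, window=timedelta(days=1)):
    """Join predicted and actual labels by record id and summarize
    agreement per time window (bucketed by prediction time).

    predicted: {record_id: (label, predicted_at)}   # assumed layout
    actual:    {record_id: (label, arrived_at)}     # assumed layout
    """
    hits, totals = defaultdict(int), defaultdict(int)
    for rid, (pred_label, predicted_at) in predicted.items():
        if rid not in actual:
            continue  # the actual label has not arrived yet
        actual_label, _ = actual[rid]
        bucket = (predicted_at - EPOCH) // window  # integer window index
        totals[bucket] += 1
        hits[bucket] += int(pred_label == actual_label)
    return {EPOCH + b * window: hits[b] / totals[b] for b in sorted(totals)}

def label_delays(predicted, actual):
    """Lagging indicator: how long each actual label trailed its prediction."""
    return [actual[rid][1] - predicted[rid][1]
            for rid in predicted if rid in actual]

def build_alerts(accuracy_by_window, delays,
                 min_accuracy=0.8, max_delay=timedelta(hours=48)):
    """Compare both KPIs against thresholds (hypothetical values)."""
    alerts = []
    for start, accuracy in accuracy_by_window.items():
        if accuracy < min_accuracy:
            alerts.append(f"accuracy {accuracy:.2f} below threshold in window {start:%Y-%m-%d}")
    late = sum(1 for d in delays if d > max_delay)
    if late:
        alerts.append(f"{late} actual label(s) arrived later than {max_delay}")
    return alerts
```

The two KPIs are kept as separate functions so that each can have its own threshold, which matches the idea of covering the scoring criteria with several indicators.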

  

Ranking and scoring are central to this pipeline, and the choice of their algorithms can make a large difference in how the model makes predictions for new data. If the scoring is sensitive to drift in concept, data, and upstream systems, it will be more accurate and consistent, and it will avoid deterioration.  

  

Real-world data arrives continuously, so the performance and the quality of the model need to be evaluated continuously. Performance and quality provide independent considerations for tuning the model, so both are needed. The former can be reported by the platform on which the model runs, but the latter is more domain and model specific, so it must be decided before the model is deployed and as part of the scoring pipeline.  

  

The pipeline itself becomes a reusable asset that can be used in automation and remote invocations. This makes it easy to compare and deploy model variations, so that we are not just tuning the same model but can also evaluate variations side by side. All models deteriorate over time, but a refreshed model tends to pull the quality back up over a longer duration. 
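As a rough sketch of the side-by-side evaluation, a reusable scoring step can take any number of model variations and score them against the same labelled data. The names below (`score_models`, the `Model` type, and the threshold-style models in the usage note) are hypothetical and exist only for illustration; a real pipeline would plug in its own models and quality metric.

```python
from typing import Callable, Dict, Iterable, Tuple

# Assumption: a model is any callable that maps a feature value to a label.
Model = Callable[[float], str]

def score_models(models: Dict[str, Model],
                 labelled: Iterable[Tuple[float, str]]) -> Dict[str, float]:
    """Score every model variation on the same labelled rows (plain accuracy)."""
    rows = list(labelled)
    if not rows:
        raise ValueError("at least one labelled row is required")
    return {
        name: sum(model(x) == y for x, y in rows) / len(rows)
        for name, model in models.items()
    }

def best_model(scores: Dict[str, float]) -> str:
    """Return the name of the highest-scoring variation."""
    return max(scores, key=scores.get)
```

For example, calling `score_models` with a deployed model and a refreshed candidate on the same recent rows yields one score per variation, and `best_model` picks the one to promote. Because the function only depends on its arguments, it can be invoked from automation or remotely without changes.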

 

When the machine learning pipeline is executed well, the following goals are achieved. First, the data remains immutable. Second, the business logic is written once. Finally, the data is available in real time.  

The model deterioration and drift are avoided without requiring updates to the data and logic.  


The following chart compares all the data mining algorithms, including neural networks: https://1drv.ms/w/s!Ashlm-Nw-wnWxBFlhCtfFkoVDRDa?e=aVT37e 

Thank you. 


A coding exercise: https://tinyurl.com/nnj5vd8v 

 
