Sunday, April 11, 2021

 

Applications of Data Mining to Reward points collection service

Continuation of discussion in terms of Machine Learning deployments

Machine learning algorithms are a tiny fraction of the overall code used to realize prediction systems in production. As noted in the paper “Hidden Technical Debt in Machine Learning Systems” by Sculley, Holt and others, the machine learning code consists mainly of the model, while the other components, such as configuration, data collection, feature extraction, data verification, process management tools, machine resource management, serving infrastructure, and monitoring, make up the rest of the stack. These components usually run on hybrid stacks, especially when the model is hosted on-premises. Public clouds offer pipelines and automation with better management and monitoring programmability than on-premises deployments, but startups usually find it easier to embrace public clouds than established large companies that have significant investments in their inventory, DevOps, and datacenters.
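To make one of the non-model components concrete, here is a minimal sketch of a data verification step for a reward points service. The record layout (customer_id, points, channel), the allowed channel names, and the function name verify_events are assumptions made for illustration only; they are not taken from the paper or from any particular product.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple

# Hypothetical record layout for a reward points event (assumption).
@dataclass
class RewardEvent:
    customer_id: str
    points: int
    channel: str

# Hypothetical set of channels through which points can be collected (assumption).
ALLOWED_CHANNELS = {"web", "mobile", "store"}

def verify_events(events: Iterable[RewardEvent]) -> Tuple[List[RewardEvent], List[str]]:
    """Split events into valid records and human-readable error messages."""
    valid: List[RewardEvent] = []
    errors: List[str] = []
    for index, event in enumerate(events):
        if not event.customer_id:
            errors.append(f"row {index}: missing customer_id")
        elif event.points < 0:
            errors.append(f"row {index}: negative points {event.points}")
        elif event.channel not in ALLOWED_CHANNELS:
            errors.append(f"row {index}: unknown channel {event.channel!r}")
        else:
            valid.append(event)
    return valid, errors

if __name__ == "__main__":
    sample = [
        RewardEvent("c1", 100, "web"),
        RewardEvent("", 50, "mobile"),
        RewardEvent("c3", -5, "store"),
    ]
    good, bad = verify_events(sample)
    print(len(good), "valid;", len(bad), "rejected")
```

A check like this would typically run before feature extraction, so that malformed records are reported rather than silently fed to training.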

Hybrid stacks are not the only concern. Architectural patterns are harder to enforce with machine learning deployments. Traditional web application deployments benefit from a significant and growing ecosystem of infrastructure, tools, and processes. But machine learning systems are not always equivalent to a predictive web service: many models are trained and tested with little or no requirement for outside-world connectivity or programmability. Here again, the public clouds lead the way in standardizing deployment, monitoring, and operations for machine learning.

Lastly, the machine learning field is still emerging, and development teams continuously experiment with algorithms, data, and technology stacks before establishing a process that lets them move between use cases and production deployments. The Continuous Integration / Continuous Deployment (CI/CD) pipeline, ML tests, and model tuning become the responsibility of the development team, even when that team is folded into the business service team for faster turnaround in deploying artificial intelligence models to production. Public clouds make it easy to monitor, troubleshoot, and update models in production, but the development team remains responsible for the number and scale of such deployments.
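As an illustration of the kind of ML test a development team might run in a CI/CD pipeline, the sketch below gates promotion of a candidate model on its holdout accuracy. The function names (accuracy, should_promote) and the threshold values are hypothetical choices for this example, not a standard or a specific cloud feature.

```python
from typing import Any, Callable, Sequence

def accuracy(predict: Callable[[Any], Any],
             features: Sequence[Any],
             labels: Sequence[Any]) -> float:
    """Fraction of holdout examples the model labels correctly."""
    if len(features) != len(labels) or not labels:
        raise ValueError("features and labels must be non-empty and of equal length")
    correct = sum(1 for x, y in zip(features, labels) if predict(x) == y)
    return correct / len(labels)

def should_promote(candidate_acc: float,
                   production_acc: float,
                   min_accuracy: float = 0.8,      # assumed absolute floor
                   max_regression: float = 0.01    # assumed tolerated drop
                   ) -> bool:
    """Promote only if the candidate clears the floor and does not regress
    against the production model by more than the tolerated margin."""
    meets_floor = candidate_acc >= min_accuracy
    no_regression = candidate_acc >= production_acc - max_regression
    return meets_floor and no_regression

if __name__ == "__main__":
    # Toy model: predicts True when the input is positive.
    toy_model = lambda x: x > 0
    acc = accuracy(toy_model, [1, -1, 2, -2], [True, False, True, True])
    print("holdout accuracy:", acc)
    print("promote:", should_promote(acc, production_acc=0.70, min_accuracy=0.7))
```

In a pipeline, a failed gate would stop the deployment stage, which keeps the decision with the team that owns the model.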

 

 

The following chart compares the data mining algorithms, including neural networks: https://1drv.ms/w/s!Ashlm-Nw-wnWxBFlhCtfFkoVDRDa?e=aVT37e

Thank you.

 

 
