Tuesday, April 13, 2021

 Applications of Data Mining to Reward points collection service 

Continuation of discussion in terms of Machine Learning deployments 

Machine learning algorithms are a tiny fraction of the overall code used to realize prediction systems in production. As noted in the paper “Hidden Technical Debt in Machine Learning Systems” by Sculley, Holt and others, the machine learning code consists mainly of the model, while all the other components, such as configuration, data collection, feature extraction, data verification, process management tools, machine resource management, serving infrastructure, and monitoring, make up the rest of the stack. These components usually form hybrid stacks, especially when the model is hosted on-premises. Public clouds offer pipelines and related automation with better management and monitoring programmability than on-premises environments, but it is usually easier for startups to embrace public clouds than for established large companies that have significant investments in their inventory, DevOps and datacenters.

Some of the other advantages of deploying machine learning models to the public cloud include the following:

1) Readymade automation for machine learning pipelines that can be monitored 24x7. 

2) Ability to span on-premises and public cloud environments with a virtual hybrid cloud.

3) Elasticity of computing resources for machine learning workloads, including support for GPUs.

4) Building consistency into machine learning deployments.

5) Machine learning deployments can have variable workloads during the lifetime of the model. Cloud resources are better able to scale up and down as needed.

6) ML solutions can take advantage of all the data at once in the cloud without waiting for the Extract-Transform-Load step that had become a necessity with data warehouses. Even virtual data warehouses are available in the cloud if they must be used.

7) Cloud security is robust and secures the data at rest as well as in transit, reducing the burden of maintaining data in the cloud.

8) Cost is transparent in the pay-as-you-go mode of billing, and various tools are available to monitor usage and costs.

9) Numerous rate-limiting technologies are available in addition to the native techniques in the cloud, and these can prevent cost overruns during experimentation.

10) A free tier is available for quick and dirty prototyping in the public cloud, which helps to find hidden costs for production systems.
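As an illustration of point 9, one common rate-limiting technique is the token bucket. The sketch below is a minimal client-side version in Python and is not tied to any particular cloud provider. The class name TokenBucket, its parameters, and the injectable clock are assumptions made for this example only.

```python
import time


class TokenBucket:
    """Hypothetical client-side rate limiter (illustration only).

    Allows `rate` calls per second on average, with bursts of up to
    `capacity` calls.
    """

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate                # tokens added per second
        self.capacity = capacity        # maximum tokens the bucket can hold
        self.clock = clock              # injectable so the limiter can be tested
        self.tokens = float(capacity)   # start with a full bucket
        self.last = clock()

    def allow(self, cost=1.0):
        """Spend `cost` tokens and return True if enough are available."""
        now = self.clock()
        elapsed = now - self.last
        self.last = now
        # Refill in proportion to elapsed time, never beyond capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

In an experiment loop, a call to allow() would sit in front of each billable request, and the request would be skipped or delayed when it returns False. This only caps the request rate on the client. It does not replace the budget alerts and quotas mentioned in point 8.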

The following chart compares the data mining algorithms, including neural networks: https://1drv.ms/w/s!Ashlm-Nw-wnWxBFlhCtfFkoVDRDa?e=aVT37e

Thank you. 

 

 

 
