Cluster computing

Tuesday, May 28, 2024

This is a summary of the book titled “The AI Playbook: Mastering the Rare Art of Machine Learning Deployment,” written by Eric Siegel and published by MIT Press in 2024. Prof. Siegel urges business and tech leaders to come out of their silos and collaborate to harness the full potential of machine learning models that can transform their organizations and optimize their operations. He provides a step-by-step framework for doing so: establishing a value-driven deployment goal through “backward planning,” collaborating on a specific prediction goal, finding the right evaluation metrics, preparing the data to achieve the desired outcomes, training the model to detect patterns, deploying the model with full-stack buy-in from stakeholder departments, and committing to a strong ethical compass when maintaining the model.

Machine learning (ML) opportunities require collaboration between business and data professionals. Business professionals need a holistic understanding of the ML process, including models, metrics, and data collection. Data professionals must broaden their perspective on ML to understand its potential to transform the entire business. BizML, a six-step business approach, bridges the gap between the business and data sides of an organization. It focuses on organizational execution and complements the Cross Industry Standard Process for Data Mining (CRISP-DM). Successful ML and AI projects require "backward planning": start from a value-driven deployment goal and work back to the model. ML's applications extend beyond predicting business outcomes to addressing social issues such as abuse or neglect. Once the team has chosen how to apply ML, stakeholders with decision-making power should approve the plan, focusing on the gains ML can deliver rather than fixating on the technology.

Business and tech leaders should collaborate to specify a prediction goal for each ML project. This involves defining the goal in detail, identifying viable prediction goals, and adhering to the "Law of ML Planning": keep deployment, and the way the predictions will shape business operations, at the forefront of the project. Consider ethical issues early, such as the risk that a predictive policing model inflates the estimated likelihood of Black parolees being rearrested.

For new ML projects, consider creating a binary model or binary classifier that makes predictions by answering yes/no questions. Other predictive models, such as numerical or continuous models, can also be used.

Evaluating the model's performance is crucial to determining its success, but accuracy is not the best measure. Most useful models are not highly accurate; they simply predict better than random guessing. Metrics such as "lift" and "cost" therefore give a more meaningful picture of the model's performance.
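To make the lift and cost metrics concrete, here is a minimal Python sketch. The function names, the toy churn data, and the error prices are all hypothetical and chosen only for illustration; they do not come from the book. Lift is computed here as the positive rate among the top-scored cases divided by the overall positive rate, and cost as a simple sum that prices false positives and false negatives separately.

```python
def lift_at_top(scores, labels, fraction):
    """Lift for the top `fraction` of cases when ranked by model score.

    Lift = positive rate among the targeted cases / overall positive rate.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    n_top = max(1, round(len(ranked) * fraction))
    top_rate = sum(label for _, label in ranked[:n_top]) / n_top
    base_rate = sum(labels) / len(labels)
    return top_rate / base_rate


def total_cost(predictions, labels, fp_cost, fn_cost):
    """Total cost of errors, pricing false positives and false negatives separately."""
    false_pos = sum(1 for p, y in zip(predictions, labels) if p == 1 and y == 0)
    false_neg = sum(1 for p, y in zip(predictions, labels) if p == 0 and y == 1)
    return false_pos * fp_cost + false_neg * fn_cost


# Toy data (made up): 10 customers, 2 of whom actually churned (label 1).
scores = [0.9, 0.8, 0.4, 0.3, 0.3, 0.2, 0.2, 0.1, 0.1, 0.05]
labels = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]

# Predicting "no churn" for everyone looks accurate but catches nobody.
always_no = [0] * len(labels)
accuracy = sum(p == y for p, y in zip(always_no, labels)) / len(labels)

# Hypothetical prices: a wasted retention offer costs 10, a lost customer 100.
cost_always_no = total_cost(always_no, labels, fp_cost=10, fn_cost=100)

# How much richer in churners is the top 20% by score than the whole base?
lift_top_20 = lift_at_top(scores, labels, fraction=0.2)
```

In this toy setup the "always no" rule scores well on accuracy yet pays the full price for every missed churner, while lift shows how much better than random targeting the scored ranking is. That contrast is the reason to look past accuracy.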

To train an ML model, ensure that the data is long, wide, and labeled: many example cases (rows), many input variables per case (columns), and a known outcome for each case. This helps the model identify patterns and predict outcomes accurately. The data may be structured or unstructured; either way, be wary of "noise" or corrupt data that may cause problems.

Teach the ML model to detect patterns in a sensible way, as ML algorithms learn from your data and use patterns to make predictions. Understanding your model is not always straightforward, but if the patterns your model detects and uses to make predictions are reliable, you don't necessarily need to establish causation.

Familiarize yourself with different modeling methods, such as decision trees, linear regression, and logistic regression. Investigate your models to ensure they don't contain bugs, as some models may combine input variables in problematic ways. For example, a model designed to distinguish huskies from wolves in images may turn out to be relying on the background rather than the animal, labeling images with snow as "wolves" and images without snow as "huskies."

To deploy an AI model, it's crucial to gain full-stack cooperation and buy-in from all team members within your organization. Building trust in the model is essential because it can automate decision-making. Humans still play a role in some processes, and a "human-in-the-loop" approach lets people make the operational decisions, informed by the model's predictions. Deployment risk can be mitigated by using a control group or incremental deployment. Maintain the model after launch to guard against model drift, which can occur as the data the model was trained on becomes less representative of current conditions. To avoid discrimination, aim to represent different groups equally and avoid inferring sensitive attributes. Aspire to use data ethically and responsibly, based on empathy.
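The control-group and incremental-deployment ideas can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the book's method: `assign_group`, `compare_groups`, the `rollout_fraction` parameter, and the case IDs are all hypothetical names introduced here. Each case is assigned deterministically to either the model-driven process or the existing process, and the rollout fraction can be raised gradually as confidence grows.

```python
import random


def assign_group(case_id, rollout_fraction, seed=0):
    """Deterministically assign a case to 'model' or 'control'.

    `rollout_fraction` is the share of cases handled with the model's
    predictions; everything else stays on the existing process.
    """
    # Seeding a private generator per case keeps the assignment stable
    # across runs without touching the global random state.
    rng = random.Random(f"{seed}:{case_id}")
    return "model" if rng.random() < rollout_fraction else "control"


def compare_groups(outcomes):
    """Success rate per group from a list of (group, success) pairs."""
    totals = {}
    for group, success in outcomes:
        count, wins = totals.get(group, (0, 0))
        totals[group] = (count + 1, wins + int(success))
    return {group: wins / count for group, (count, wins) in totals.items()}


# Start small: route roughly 10% of (made-up) cases through the model.
case_ids = [f"case-{i}" for i in range(1000)]
groups = [assign_group(case_id, rollout_fraction=0.1) for case_id in case_ids]
model_share = groups.count("model") / len(groups)

# Made-up outcomes, only to show the comparison step.
rates = compare_groups([("model", True), ("model", False), ("control", False)])
```

Comparing the success rates of the two groups shows whether the model is actually improving operations before the rollout fraction is increased. Re-running the same comparison periodically after full deployment is one simple way to notice drift.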
