An earlier discussion of deploying an Azure Machine Learning workspace via IaC, its shortcomings, and their resolutions hinted that the workspace and all its infrastructure concerns can be settled at deployment time, leaving data scientists free to focus on business use cases. Part of this setup is kernel creation, which can be scripted to run when compute is created and assigned to the data scientists. Two scripts are required: one that runs at creation time and another that runs at the start of the compute. Some commands require the terminal to be restarted before they take effect, so splitting the work into two scripts lets each command run at the appropriate stage. For example, to provision a custom kernel based on Python 3.11 and Spark 3.5, the following scripts are useful:
#!/bin/bash
# Creation script: installs Anaconda, then creates a custom conda environment based on a sample yml file.
set -e
# Download the Anaconda installer and run it in batch (non-interactive) mode.
curl https://repo.anaconda.com/archive/Anaconda3-2024.02-1-Linux-x86_64.sh --output Anaconda3-2024.02-1-Linux-x86_64.sh
chmod 755 Anaconda3-2024.02-1-Linux-x86_64.sh
./Anaconda3-2024.02-1-Linux-x86_64.sh -b
echo "installation complete"
cat <<EOF > env.yaml
name: python3.11_spark3.5
channels:
  - conda-forge
  - defaults
dependencies:
  - python=3.11
  - numpy
  - pyspark
  - pip
  - pip:
    - azureml-core
    - ipython
    - ipykernel
    - pyspark==3.5
EOF
echo "env.yaml written"
/anaconda/condabin/conda env create -f env.yaml
echo "Initializing new conda environment"
/anaconda/condabin/conda init bash
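Because `conda env create` stops with an error when an environment of the same name already exists, a guard can make the creation script safe to re-run. The following is a minimal sketch and not part of the original scripts: the helper name `env_listed` is hypothetical, and it assumes the environment name appears in the first column of the `conda env list` output.

```shell
#!/bin/bash
# Hypothetical helper (not from the original scripts): succeeds when the
# environment name given as $1 appears in the first column of a
# `conda env list` style listing read from standard input.
env_listed() {
    awk -v name="$1" '$1 == name { found = 1 } END { exit !found }'
}

ENV_NAME="python3.11_spark3.5"
CONDA_BIN="/anaconda/condabin/conda"   # same path the creation script uses

# Only query conda when the binary is present, so the sketch is harmless elsewhere.
if [ -x "$CONDA_BIN" ]; then
    if "$CONDA_BIN" env list | env_listed "$ENV_NAME"; then
        echo "Environment $ENV_NAME already exists, skipping creation"
    else
        "$CONDA_BIN" env create -f env.yaml
    fi
fi
```

The helper reads the listing from standard input rather than calling conda itself, which keeps the name-matching logic easy to check against a saved listing.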
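#!/bin/bash
# Startup script: registers the custom kernel and installs the required packages when the compute starts.
set -e
# Install ipykernel and register the kernel under the display name shown in the notebook UI.
python3 -m pip install ipykernel==6.29.5
python3 -m ipykernel install --user --name python3.11_spark3.5 --display-name "Python 3.11 - Spark 3.5 (DSS)"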
echo "Activating new conda environment"
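/anaconda/envs/azureml_py38/bin/conda init bash
# "conda activate" relies on conda's shell functions, which a non-interactive script does not load by default.
eval "$(/anaconda/envs/azureml_py38/bin/conda shell.bash hook)"
conda activate python3.11_spark3.5
conda install -y ipykernel anaconda::pyspark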
echo "Installing kernel"
sudo -u azureuser -i <<'EOF'
python3 -m pip install pip --upgrade
pip3 install pyopenssl --upgrade
pip3 install pyspark==3.5
pip3 install snowflake-snowpark-python==1.20.0
pip3 install snowflake-connector-python==3.11.0
pip3 install azure-keyvault
pip3 install azure-identity
python3 -m pip install ipykernel==6.29.5
echo "Conda environment setup successfully."
EOF
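After the startup script finishes, it can be useful to confirm that the kernel was actually registered. The following is a minimal sketch, assuming the kernel was installed with `--user` so that its spec lands under `~/.local/share/jupyter/kernels` on Linux; the helper name `kernel_registered` is hypothetical and not part of the original scripts.

```shell
#!/bin/bash
# Hypothetical helper: succeeds when a kernel spec named $1 has a kernel.json
# under the kernels directory $2 (defaults to the per-user Jupyter location on Linux).
kernel_registered() {
    local name="$1"
    local kernels_dir="${2:-$HOME/.local/share/jupyter/kernels}"
    [ -f "$kernels_dir/$name/kernel.json" ]
}

if kernel_registered "python3.11_spark3.5"; then
    echo "Kernel python3.11_spark3.5 is registered"
else
    echo "Kernel python3.11_spark3.5 not found; check the startup script log"
fi
```

Run the check as the same user that executed `ipykernel install --user`, since the per-user kernels directory differs for each account.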
Previous articles: https://1drv.ms/w/s!Ashlm-Nw-wnWhPIt_-X-iYdnygX-fA?e=ZCKWsR