AKS - Airflow setup and use with SSO
Here’s a step-by-step guide to deploying and using Apache Airflow on Azure Kubernetes Service (AKS) [1]:
Step 1: Set Up Your AKS Cluster
If you don’t already have an AKS cluster, create one using infrastructure as code (IaC); in most cases this will already have been done for you. Log in to the AKS cluster with:
az aks get-credentials --resource-group <resourcegroupname> --name <clustername>
kubelogin convert-kubeconfig -l azurecli
Step 2: Install Helm
Ensure Helm is installed on your local machine:
curl https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash
Step 3: Create a Namespace for Airflow
kubectl create namespace airflow
Step 4: Configure Workload Identity (Optional but Recommended)
This step allows Airflow to securely access Azure resources like Key Vault:
1. Create a service account:
kubectl apply -f - <<EOF
apiVersion: v1
kind: ServiceAccount
metadata:
  name: airflow
  namespace: airflow
EOF
2. Annotate the service account with your Azure identity:
kubectl annotate serviceaccount airflow \
azure.workload.identity/client-id=<CLIENT_ID> \
azure.workload.identity/tenant-id=<TENANT_ID> \
-n airflow
Step 5: Install External Secrets Operator (Optional for Key Vault Integration)
helm repo add external-secrets https://charts.external-secrets.io
helm repo update
helm install external-secrets external-secrets/external-secrets \
--namespace airflow \
--create-namespace \
--set installCRDs=true \
--wait
Step 6: Add the Apache Airflow Helm Chart
helm repo add apache-airflow https://airflow.apache.org
helm repo update
Step 7: Install Airflow
A Kustomization is the preferred way to install Airflow. Alternatively, install it directly with Helm:
helm install airflow apache-airflow/airflow \
--namespace airflow \
--set executor=CeleryExecutor \
--set defaultAirflowTag=2.8.1 \
--set airflowVersion=2.8.1 \
--set webserver.defaultUser.enabled=true \
--set webserver.defaultUser.username=admin \
--set webserver.defaultUser.password=admin
You can generate a Fernet key with:
python -c "from cryptography.fernet import Fernet; print(Fernet.generate_key().decode())"
To read back the key that the release stores as a secret, run:
kubectl get secret --namespace airflow airflow-fernet-key -o jsonpath="{.data.fernet-key}" | base64 --decode
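If the cryptography package is not installed locally, a key can also be produced with the standard library alone. The following is a minimal sketch, relying on the fact that a Fernet key is 32 random bytes in URL-safe base64; the function names are illustrative and not part of Airflow or the chart.

```python
import base64
import os


def generate_fernet_key() -> str:
    """Return a new Fernet key: 32 random bytes, URL-safe base64-encoded."""
    return base64.urlsafe_b64encode(os.urandom(32)).decode("ascii")


def is_valid_fernet_key(key: str) -> bool:
    """Check that a key decodes to exactly 32 bytes, as Fernet requires."""
    try:
        return len(base64.urlsafe_b64decode(key.encode("ascii"))) == 32
    except (ValueError, UnicodeEncodeError):
        # binascii.Error (bad padding or characters) is a ValueError subclass.
        return False


key = generate_fernet_key()
print(key)
```

The validity check is handy for confirming that a key read back from the cluster was decoded correctly before it is reused elsewhere.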
I prefer to create a HelmRelease for the official Airflow chart, using the release file in the references. The values file (also in the references) and the webserver_config.py discussed a few steps below go into ConfigMaps, which lets you specify SSO during setup.
Step 8: Access the Airflow Web UI
Port-forward the web server service:
kubectl port-forward svc/airflow-webserver 8080:8080 -n airflow
Then open your browser and go to: http://localhost:8080
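While the port-forward is running, the webserver's /health endpoint gives a quick readiness check without opening the browser. The sketch below assumes the JSON layout Airflow 2 returns from /health (a "status" field per component); the function names and the choice of required components are my own.

```python
from __future__ import annotations

import json
from urllib.request import urlopen

# Components this sketch treats as mandatory; optional ones such as the
# triggerer may legitimately report no status.
REQUIRED_COMPONENTS = ("metadatabase", "scheduler")


def fetch_health(base_url: str = "http://localhost:8080") -> dict:
    """Fetch and parse the /health document from the Airflow webserver."""
    with urlopen(f"{base_url}/health", timeout=10) as resp:
        return json.load(resp)


def unhealthy_components(health: dict) -> list[str]:
    """Return the required components whose status is not 'healthy'."""
    return [
        name
        for name in REQUIRED_COMPONENTS
        if (health.get(name) or {}).get("status") != "healthy"
    ]
```

Calling unhealthy_components(fetch_health()) with the port-forward active should return an empty list once the scheduler and metadata database are up.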
For a complete setup, it could look like this: PIBI NonProd Airflow
Step 9: Setup the app registration for SSO
Add the following:
1. Set the redirect URI to https://<your-airflow-domain>/oauth-authorized/azure or https://<your-airflow-domain>/oauth2/callback.
2. Assign the API permissions openid, email and profile for authentication.
3. Navigate to the token configuration page of the Azure AD application. For both the ID token and the access token, add an optional claim for each of the following:
   1. email
   2. preferred_username
   3. given_name
   4. family_name
   5. upn
4. Edit the groups claim to include sAMAccountName for both ID and access tokens, but leave out SAML.
5. Specify federated identity entries for use with your GitHub repository.
6. Optionally, create app roles on your Azure AD application, such as airflow_nonprod_admin, airflow_nonprod_dev and airflow_nonprod_viewer.
7. Make sure you have the right client ID, client secret and tenant ID for the next steps.
Step 10: Create a secret from the app registration in the previous step
kubectl create secret generic airflow-ad-secret \
--from-literal=client-id=<your-azure-client-id> \
--from-literal=client-secret=<your-azure-client-secret> \
--from-literal=tenant-id=<your-azure-tenant-id> \
--namespace airflow
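If you keep manifests in Git rather than running kubectl create by hand, the same secret can be rendered as a manifest. This is a rough sketch: it builds an Opaque Secret with base64-encoded values, which is what kubectl create secret generic produces; build_secret_manifest is a hypothetical helper and the placeholder values must be replaced.

```python
from __future__ import annotations

import base64
import json


def build_secret_manifest(name: str, namespace: str, values: dict[str, str]) -> dict:
    """Build an Opaque Secret manifest with base64-encoded data fields."""
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": name, "namespace": namespace},
        "type": "Opaque",
        "data": {
            key: base64.b64encode(value.encode("utf-8")).decode("ascii")
            for key, value in values.items()
        },
    }


manifest = build_secret_manifest(
    "airflow-ad-secret",
    "airflow",
    {
        "client-id": "<your-azure-client-id>",  # placeholder
        "client-secret": "<your-azure-client-secret>",  # placeholder
        "tenant-id": "<your-azure-tenant-id>",  # placeholder
    },
)
# kubectl accepts JSON as well as YAML, e.g. piped into `kubectl apply -f -`.
print(json.dumps(manifest, indent=2))
```

Base64 is an encoding, not encryption, so a manifest like this should not be committed with real values unless it is sealed or encrypted first.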
Step 11: Configure the airflow web server for SSO
Use the values file attached in the references, or add the following to your own values file:
# SSO configuration
webserver:
  defaultUser:
    enabled: false
  authBackend: "airflow.providers.microsoft.azure.auth.backend.azure_auth"
  extraEnv:
    - name: AIRFLOW__WEBSERVER__RBAC
      value: "True"
    - name: AIRFLOW__API__AUTH_BACKENDS
      value: "airflow.api.auth.backend.deny_all"
    - name: AIRFLOW__WEBSERVER__AUTH_BACKEND
      value: "airflow.providers.microsoft.azure.auth.backend.azure_auth"
    - name: AIRFLOW__MICROSOFT__CLIENT_ID
      value: "<your-client-id>"
    - name: AIRFLOW__MICROSOFT__CLIENT_SECRET
      value: "<your-client-secret>"
    - name: AIRFLOW__MICROSOFT__TENANT_ID
      value: "<your-tenant-id>"
    - name: AIRFLOW__MICROSOFT__REDIRECT_URI
      value: "https://<your-airflow-domain>/oauth2/callback"
Then create a ConfigMap named airflow-values from the values file.
Step 12: Upgrade the airflow deployment
Apply the values file by running the following command:
helm upgrade airflow apache-airflow/airflow \
--namespace <your-namespace> \
-f values.yaml
Step 13: Using webserver_config.py in Airflow to enable OAuth authentication
Apply the updated /opt/airflow/webserver_config.py shown below to the Airflow webserver container.
webserver_config.py
# AUTH_OAUTH is defined by Flask-AppBuilder, which Airflow's web UI is built on.
from flask_appbuilder.security.manager import AUTH_OAUTH

AUTH_TYPE = AUTH_OAUTH

# Replace {TENANT_ID}, {CLIENT_ID} and {CLIENT_SECRET} with your own values.
OAUTH_PROVIDERS = [{
    # The provider name must match the last segment of the redirect URI
    # registered in Step 9 (/oauth-authorized/azure).
    'name': 'azure',
    'token_key': 'access_token',
    'remote_app': {
        'api_base_url': "https://login.microsoftonline.com/{TENANT_ID}",
        'access_token_url': "https://login.microsoftonline.com/{TENANT_ID}/oauth2/v2.0/token",
        'authorize_url': "https://login.microsoftonline.com/{TENANT_ID}/oauth2/v2.0/authorize",
        'client_id': "{CLIENT_ID}",
        'client_secret': "{CLIENT_SECRET}",
        'jwks_uri': "https://login.microsoftonline.com/common/discovery/v2.0/keys"
    }
}]
Alternatively, create a ConfigMap named airflow-webserver-config from the webserver_config.py file attached in the references and mount it into your instance.
Restart the airflow webserver to apply changes.
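The tenant-specific URLs in the provider configuration all follow one pattern, so they can be derived from the tenant ID rather than typed out three times. The sketch below is one way to do that; build_remote_app is a hypothetical helper, and the AAD_* environment variable names are taken from the full webserver_config.py at the end of this post.

```python
import os

LOGIN_HOST = "https://login.microsoftonline.com"


def build_remote_app(tenant_id: str, client_id: str, client_secret: str) -> dict:
    """Build the 'remote_app' dict for an Azure AD OAuth provider entry."""
    base = f"{LOGIN_HOST}/{tenant_id}"
    return {
        "api_base_url": base,
        "access_token_url": f"{base}/oauth2/v2.0/token",
        "authorize_url": f"{base}/oauth2/v2.0/authorize",
        "client_id": client_id,
        "client_secret": client_secret,
        "jwks_uri": f"{LOGIN_HOST}/common/discovery/v2.0/keys",
    }


# Fall back to visible placeholders when the variables are not set.
remote_app = build_remote_app(
    os.getenv("AAD_TENANT_ID", "<TENANT_ID>"),
    os.getenv("AAD_CLIENT_ID", "<CLIENT_ID>"),
    os.getenv("AAD_CLIENT_SECRET", "<CLIENT_SECRET>"),
)
```

Reading the values from the environment keeps the client secret out of the ConfigMap; the variables would need to be injected into the webserver pod, for example from the secret created in Step 10.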
Step 14: Configure the ingress for https and redirect URI
Create an Ingress whose rules route the host to the webserver service:
# Specify callback
rules:
  - host: <your-airflow-domain>
    http:
      paths:
        - path: /
          pathType: Prefix
          backend:
            service:
              name: airflow-webserver
              port:
                number: 8080
Step 15: Test the SSO
• Navigate to https://<your-airflow-domain>.
• You should be redirected to Azure AD login.
• Upon successful login, you’ll be redirected back to Airflow.
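If login succeeds but the user ends up with the wrong role or a missing name, it helps to look at the claims Azure AD actually put in the ID token. The following sketch decodes the payload of a JWT with the standard library only. It does not verify the signature, so it is for debugging, never for authentication. make_demo_token is a hypothetical helper that fabricates an unsigned token so the example runs without a real login.

```python
import base64
import json


def _b64url(data: bytes) -> str:
    """URL-safe base64 without padding, as used in JWT segments."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def make_demo_token(claims: dict) -> str:
    """Build an unsigned demo token (header.payload.) for illustration."""
    header = _b64url(json.dumps({"alg": "none", "typ": "JWT"}).encode())
    payload = _b64url(json.dumps(claims).encode())
    return f"{header}.{payload}."


def decode_jwt_claims(token: str) -> dict:
    """Decode the payload segment of a JWT WITHOUT verifying the signature."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected three dot-separated JWT segments")
    payload = parts[1] + "=" * (-len(parts[1]) % 4)  # restore padding
    return json.loads(base64.urlsafe_b64decode(payload))


demo = make_demo_token(
    {"preferred_username": "jane@example.com", "roles": ["airflow_nonprod_admin"]}
)
print(decode_jwt_claims(demo))
```

Claims such as email, given_name, family_name and roles should appear in the decoded output if the optional claims and app roles from Step 9 are configured.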
webserver_config.py:
from __future__ import annotations
import os
from flask_appbuilder.security.manager import AUTH_OAUTH
from airflow.www.security import AirflowSecurityManager
from airflow.utils.log.logging_mixin import LoggingMixin
basedir = os.path.abspath(os.path.dirname(__file__))
# Flask-WTF flag for CSRF
WTF_CSRF_ENABLED = True
WTF_CSRF_TIME_LIMIT = None
AUTH_TYPE = AUTH_OAUTH
OAUTH_PROVIDERS = [{
    # The provider name must match the last segment of the redirect URI
    # registered in Step 9 (/oauth-authorized/azure).
    'name': 'azure',
    'token_key': 'access_token',
    'icon': 'fa-windows',
    'remote_app': {
        'api_base_url': "https://login.microsoftonline.com/{}".format(os.getenv("AAD_TENANT_ID")),
        'request_token_url': None,
        'request_token_params': {
            'scope': 'openid email profile'
        },
        'access_token_url': "https://login.microsoftonline.com/{}/oauth2/v2.0/token".format(os.getenv("AAD_TENANT_ID")),
        'access_token_params': {
            'scope': 'openid email profile'
        },
        'authorize_url': "https://login.microsoftonline.com/{}/oauth2/v2.0/authorize".format(os.getenv("AAD_TENANT_ID")),
        'authorize_params': {
            'scope': 'openid email profile'
        },
        'client_id': os.getenv("AAD_CLIENT_ID"),
        'client_secret': os.getenv("AAD_CLIENT_SECRET"),
        'jwks_uri': 'https://login.microsoftonline.com/common/discovery/v2.0/keys'
    }
}]

# New users are self-registered with the Public role; roles are re-synced
# from the token's app roles on every login.
AUTH_USER_REGISTRATION_ROLE = "Public"
AUTH_USER_REGISTRATION = True
AUTH_ROLES_SYNC_AT_LOGIN = True
AUTH_ROLES_MAPPING = {
    "airflow_prod_admin": ["Admin"],
    "airflow_prod_user": ["Op"],
    "airflow_prod_viewer": ["Viewer"]
}


class AzureCustomSecurity(AirflowSecurityManager, LoggingMixin):
    def get_oauth_user_info(self, provider, response=None):
        # Parse the claims out of the ID token returned by Azure AD.
        me = self._azure_jwt_token_parse(response["id_token"])
        return {
            "name": me["name"],
            "email": me["email"],
            "first_name": me["given_name"],
            "last_name": me["family_name"],
            "id": me["oid"],
            "username": me["preferred_username"],
            # Users without an assigned app role have no "roles" claim.
            "role_keys": me.get("roles", [])
        }


# The first of these two appears to work with older Airflow versions, the latter with newer ones.
FAB_SECURITY_MANAGER_CLASS = 'webserver_config.AzureCustomSecurity'
SECURITY_MANAGER_CLASS = AzureCustomSecurity
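To reason about which Airflow roles a user should receive, the mapping logic can be approximated outside Airflow. The sketch below is a simplified model of how I understand Flask-AppBuilder to combine AUTH_ROLES_MAPPING with the registration role: mapped roles plus the registration role when self-registration is on. It is not FAB's actual code, and resolve_roles is a hypothetical name.

```python
from __future__ import annotations

AUTH_ROLES_MAPPING = {
    "airflow_prod_admin": ["Admin"],
    "airflow_prod_user": ["Op"],
    "airflow_prod_viewer": ["Viewer"],
}


def resolve_roles(
    role_keys: list[str],
    mapping: dict[str, list[str]],
    registration_role: str = "Public",
    registration_enabled: bool = True,
) -> list[str]:
    """Approximate the Airflow roles a user gets from their token's role keys."""
    roles: set[str] = set()
    for key in role_keys:
        # Keys that are absent from the mapping contribute no roles.
        roles.update(mapping.get(key, []))
    if registration_enabled:
        roles.add(registration_role)
    return sorted(roles)


print(resolve_roles(["airflow_prod_admin"], AUTH_ROLES_MAPPING))
print(resolve_roles(["some_unmapped_role"], AUTH_ROLES_MAPPING))
```

A model like this makes it easy to spot a mismatch between the app role names created in Azure AD and the keys in AUTH_ROLES_MAPPING, which would otherwise only show up as a user stuck with the Public role.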
Airflow-Repo:
Airflow-release:
Airflow-values.yaml: