Sunday, May 18, 2025

Steps to set up Airflow with SSO on Azure Kubernetes Service (AKS) with Active Directory identity and Kubernetes RBAC.

1. Log in to the AKS cluster:

az login

az account set --subscription <subscription-id>

az aks get-credentials --resource-group <resource-group> --name <cluster-name>

kubelogin convert-kubeconfig -l azurecli

2. Install the nginx-ingress-controller:

helm repo add bitnami https://charts.bitnami.com/bitnami

helm repo update

helm install my-ingress oci://registry-1.docker.io/bitnamicharts/nginx-ingress-controller --namespace ingress-nginx --create-namespace

3. Verify the installation: check that a public IP is assigned to the service of type LoadBalancer (my-ingress-nginx-ingress-controller in the output below):

kubectl get services --all-namespaces

NAMESPACE      NAME                                                 TYPE          CLUSTER-IP    EXTERNAL-IP     PORT(S)                     AGE
default        kubernetes                                           ClusterIP     10.0.0.1      <none>          443/TCP                     3d19h
ingress-nginx  my-ingress-nginx-ingress-controller                  LoadBalancer  10.0.45.167   52.230.235.159  80:32303/TCP,443:30457/TCP  2d19h
ingress-nginx  my-ingress-nginx-ingress-controller-default-backend  ClusterIP     10.0.17.7     <none>          80/TCP                      2d19h
kube-system    ama-metrics-ksm                                      ClusterIP     10.0.212.159  <none>          8080/TCP                    3d19h
kube-system    ama-metrics-operator-targets                         ClusterIP     10.0.216.158  <none>          80/TCP                      3d19h
kube-system    azure-wi-webhook-webhook-service                     ClusterIP     10.0.23.89    <none>          443/TCP                     3d19h
kube-system    kube-dns                                             ClusterIP     10.0.0.10     <none>          53/UDP,53/TCP               3d19h
kube-system    metrics-server                                       ClusterIP     10.0.167.74   <none>          443/TCP                     3d19h
kube-system    network-observability                                ClusterIP     10.0.63.23    <none>          10093/TCP                   3d19h
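
The same check can be scripted instead of read by eye. The following is a minimal sketch, assuming the service list was exported with kubectl get services --all-namespaces -o json; the function name load_balancer_ips is made up for this illustration. It returns the external IP of every LoadBalancer service, or None while the IP is still pending.

```python
import json


def load_balancer_ips(services_json: str) -> dict:
    """Map 'namespace/name' to the external IP of each LoadBalancer service.

    `services_json` is the text printed by
    `kubectl get services --all-namespaces -o json`.
    A value of None means no public IP has been assigned yet.
    """
    result = {}
    for item in json.loads(services_json).get("items", []):
        if item.get("spec", {}).get("type") != "LoadBalancer":
            continue  # ClusterIP and other service types have no public IP
        ingress = item.get("status", {}).get("loadBalancer", {}).get("ingress") or []
        ip = ingress[0].get("ip") if ingress else None
        key = f"{item['metadata']['namespace']}/{item['metadata']['name']}"
        result[key] = ip
    return result
```

For the cluster shown above, the result would be expected to contain one entry, for ingress-nginx/my-ingress-nginx-ingress-controller.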

4. Install Airflow:

helm repo add apache-airflow https://airflow.apache.org

helm upgrade --install airflow apache-airflow/airflow --namespace airflow --create-namespace

5. Verify Airflow: check that the pods are running and that the airflow-webserver service has a cluster IP address:

kubectl get pods -n airflow

kubectl get services --all-namespaces

NAMESPACE  NAME                   TYPE       CLUSTER-IP    EXTERNAL-IP  PORT(S)            AGE
airflow    airflow-postgresql     ClusterIP  10.0.190.39   <none>       5432/TCP           2d19h
airflow    airflow-postgresql-hl  ClusterIP  None          <none>       5432/TCP           2d19h
airflow    airflow-redis          ClusterIP  10.0.198.77   <none>       6379/TCP           2d19h
airflow    airflow-statsd         ClusterIP  10.0.235.127  <none>       9125/UDP,9102/TCP  2d19h
airflow    airflow-triggerer      ClusterIP  None          <none>       8794/TCP           2d19h
airflow    airflow-webserver      ClusterIP  10.0.174.37   <none>       8080/TCP           2d19h
airflow    airflow-worker         ClusterIP  None          <none>       8793/TCP           2d19h

6. Create the nginx ingress for Airflow:

kubectl create -f ingress.yaml

ingress.yaml:
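
The manifest itself is not shown here. As a minimal sketch of what such an Ingress could contain, the Python below builds one as JSON, which kubectl accepts in place of YAML. It assumes the ingress class is named nginx, that all paths should route to the airflow-webserver service on port 8080 (the service and port listed in step 5), and a resource name airflow-ingress chosen only for illustration. TLS and host rules are left out.

```python
import json


def airflow_ingress(namespace="airflow", service="airflow-webserver", port=8080):
    """Return a networking.k8s.io/v1 Ingress that sends '/' to the webserver.

    Assumptions: ingress class 'nginx', no host rule, no TLS section, and the
    hypothetical resource name 'airflow-ingress'.
    """
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "Ingress",
        "metadata": {"name": "airflow-ingress", "namespace": namespace},
        "spec": {
            "ingressClassName": "nginx",
            "rules": [
                {
                    "http": {
                        "paths": [
                            {
                                "path": "/",
                                "pathType": "Prefix",
                                "backend": {
                                    "service": {
                                        "name": service,
                                        "port": {"number": port},
                                    }
                                },
                            }
                        ]
                    }
                }
            ],
        },
    }


# JSON text that could be saved to a file and passed to `kubectl create -f`.
manifest_text = json.dumps(airflow_ingress(), indent=2)
```

Saving manifest_text to a file such as ingress.json and passing it to kubectl create -f should create the same kind of resource as a YAML manifest would.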

7. Verify nginx:

8. Set up SSO:

9. Create a service account:

kubectl create serviceaccount airflow -n airflow

10. Annotate it:

kubectl annotate serviceaccount airflow \
  azure.workload.identity/client-id=<client-id> \
  azure.workload.identity/tenant-id=<tenant-id> \
  -n airflow

11. Modify the enterprise application behind the client_id with the following:

a. Set up redirect_uri with the following:

i. https://<public-ip>/oauth_provider/azure

ii. http://localhost:8080/oauth_provider/azure

iii. https://<public-ip>/auth-response

iv. https://localhost:8080/auth-response

v. https://<public-ip>/oauth-authorized/azure

vi. http://localhost:8080/oauth-authorized/azure

vii. https://<public-ip>/login/callback

viii. http://localhost:8080/login/callback

ix. For API authentication: https://<public-ip>/api/auth/callback

b. Set up the token configuration:

i. ID token claims to include: email, family_name, given_name, preferred_username, upn

ii. Groups claim: emit sAMAccountName for ID and access tokens

c. API permissions for Microsoft Graph: email and User.Read

d. Set up app roles for airflow_nonprod_admin, airflow_nonprod_dev/op, airflow_nonprod_viewer
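
Typing the redirect URIs by hand is error-prone, so the list in (a) can be generated. This is a small sketch with a made-up helper name, redirect_uris; the paths are the ones listed above. Of these, /oauth-authorized/azure matches Flask-AppBuilder's callback route for a provider named azure and is the one used as redirect_uri in step 12. The sketch uses a single scheme per host, which is a simplification of the list above.

```python
# Callback paths taken from the redirect URI list in (a).
CALLBACK_PATHS = [
    "/oauth_provider/azure",
    "/auth-response",
    "/oauth-authorized/azure",
    "/login/callback",
    "/api/auth/callback",
]


def redirect_uris(host: str, scheme: str = "https") -> list:
    """Build one redirect URI per callback path for the given host."""
    return [f"{scheme}://{host}{path}" for path in CALLBACK_PATHS]


# Public entries plus local-testing entries (http assumed for localhost).
all_uris = redirect_uris("<public-ip>") + redirect_uris("localhost:8080", scheme="http")
```

Replace <public-ip> with the external IP from step 3 before registering the URIs on the application.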

12. Apply the environment variables to the deployment with:

kubectl set env deployment/airflow-webserver AAD_TENANT_ID=<your-tenant-id> -n airflow

kubectl set env deployment/airflow-webserver AAD_CLIENT_ID=<your-client-id> -n airflow

kubectl set env deployment/airflow-webserver AAD_CLIENT_SECRET=<your-client-secret> -n airflow

kubectl set env deployment/airflow-webserver OAUTH_PROVIDERS="[{\n'name':'azure',\n'token_key':'access_token',\n'icon':'fa-windows',\n'remote_app': {\n'api_base_url': 'https://login.microsoftonline.com/{}'.format(os.getenv('AAD_TENANT_ID')),\n'request_token_url': None,\n'request_token_params': {\n'scope': 'openid email profile'\n},\n'access_token_url': 'https://login.microsoftonline.com/{}/oauth2/v2.0/token'.format(os.getenv('AAD_TENANT_ID')),\n'access_token_params': {\n'scope': 'openid email profile'\n},\n'authorize_url': 'https://login.microsoftonline.com/{}/oauth2/v2.0/authorize'.format(os.getenv('AAD_TENANT_ID')),\n'authorize_params': {\n'scope': 'openid email profile'\n},\n'client_id': os.getenv('AAD_CLIENT_ID'),\n'client_secret': os.getenv('AAD_CLIENT_SECRET'),\n'jwks_uri': 'https://login.microsoftonline.com/common/discovery/v2.0/keys',\n'redirect_uri': 'https://52.230.235.159/oauth-authorized/azure'\n}\n}]" -n airflow

13. Modify webserver_config.py by uncommenting the sections for AUTH_TYPE and OAUTH_PROVIDERS, or upload the attached webserver_config.py
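
For orientation, here is a sketch of how the provider section of webserver_config.py might look when written as Python rather than as an escaped string. It is not the attached file. It mirrors the keys and URLs from step 12 and reads the AAD_* variables set there. The helper name azure_oauth_providers and the AIRFLOW_PUBLIC_HOST variable are hypothetical. The AUTH_TYPE setting, which has to be set to Flask-AppBuilder's AUTH_OAUTH constant, is omitted so the sketch runs without Airflow installed.

```python
import os


def azure_oauth_providers(tenant_id, client_id, client_secret, public_host):
    """Build the provider list with the same keys as the value in step 12."""
    authority = f"https://login.microsoftonline.com/{tenant_id}"
    scope = {"scope": "openid email profile"}
    return [
        {
            "name": "azure",
            "token_key": "access_token",
            "icon": "fa-windows",
            "remote_app": {
                "api_base_url": authority,
                "request_token_url": None,
                "request_token_params": dict(scope),
                "access_token_url": f"{authority}/oauth2/v2.0/token",
                "access_token_params": dict(scope),
                "authorize_url": f"{authority}/oauth2/v2.0/authorize",
                "authorize_params": dict(scope),
                "client_id": client_id,
                "client_secret": client_secret,
                "jwks_uri": "https://login.microsoftonline.com/common/discovery/v2.0/keys",
                "redirect_uri": f"https://{public_host}/oauth-authorized/azure",
            },
        }
    ]


# Values come from the environment variables applied to the deployment.
# AIRFLOW_PUBLIC_HOST is a hypothetical variable; the default is the
# external IP shown in step 3.
OAUTH_PROVIDERS = azure_oauth_providers(
    os.getenv("AAD_TENANT_ID", ""),
    os.getenv("AAD_CLIENT_ID", ""),
    os.getenv("AAD_CLIENT_SECRET", ""),
    os.getenv("AIRFLOW_PUBLIC_HOST", "52.230.235.159"),
)
```

Building the list in Python means the tenant ID is substituted when the webserver loads the file, rather than depending on a string stored in an environment variable being evaluated.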

14. Restart the webserver and test the Airflow user interface:

kubectl rollout restart deployment airflow-webserver -n airflow

