Wednesday, May 14, 2025

  

The following is a list of errors and resolutions frequently encountered when setting up Kubernetes and Airflow with Active Directory integration and Single Sign-On on an Azure Kubernetes Service (AKS) instance. This information is hard to find online.

  1. az aks get-credentials fails with a Python import error for the azure.graph module, even though the cluster and the resource group are correct:

pip3 install azure-graphrbac 
pip3 install msgraph-core 

  1. An az CLI command that runs a kubectl command on the cluster fails:

az extension add --name aks-preview 
az extension add --name azure-cli-legacy-auth 
az extension add --name resource-graph 
az extension add --name k8s-extension 

  1. The extensions are installed, but the az CLI commands still fail:

az extension update --name aks-preview && az extension update --name k8s-extension 

  1. Unable to list namespaces on the cluster even after a successful login and extension install:

Run both: 

az aks get-credentials --resource-group <resource-group> --name <aks-cluster> 
kubelogin convert-kubeconfig -l azurecli 

  1. Installation of Airflow fails:

Install Helm, which is probably the fastest way to do the install. Then either add the Airflow chart repository (helm repo add apache-airflow https://airflow.apache.org) or create a HelmRelease.

Create a namespace: 

kubectl create namespace airflow 

  1. The repo exists and the chart is found, but the Airflow install times out:

Increase the timeout:

helm install dev-release apache-airflow/airflow --namespace airflow --timeout 60m0s --wait 
 

  1. Diagnose failures: 

Use the following to inspect the deployment events or HelmRelease failures:
kubectl describe helmrelease.helm.toolkit.fluxcd.io/airflow -n airflow

For failed releases, uninstall and install again. Note that helm uninstall takes the release name (dev-release above), not the chart name:
helm list --all-namespaces --failed
helm uninstall dev-release --namespace airflow
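When several releases have failed, the uninstall commands can be generated from Helm's JSON output instead of being typed by hand. The following is a minimal sketch, assuming the input is the text printed by helm list --all-namespaces --failed -o json; the function name uninstall_commands and the sample data are hypothetical and only illustrate the shape of that output.

```python
from __future__ import annotations

import json
import shlex


def uninstall_commands(helm_list_json: str) -> list[str]:
    """Build one `helm uninstall` command per release in the given JSON.

    `helm_list_json` is assumed to be the output of
    `helm list --all-namespaces --failed -o json`, which is a JSON array
    of objects that include "name" and "namespace" keys.
    """
    releases = json.loads(helm_list_json)
    return [
        # helm uninstall expects the release name, not the chart name.
        shlex.join(["helm", "uninstall", r["name"], "--namespace", r["namespace"]])
        for r in releases
    ]


# Illustrative sample only, not real cluster output.
sample = '[{"name": "dev-release", "namespace": "airflow", "status": "failed"}]'
for cmd in uninstall_commands(sample):
    print(cmd)
```

The sketch only prints the commands so they can be reviewed before being run.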
 

  1. Webserver is inaccessible: 

kubectl port-forward svc/dev-release-webserver 8080:8080 --namespace airflow
# Reset the Airflow metadata database after AD integration. Warning: this deletes all existing metadata.
airflow db reset
 

  1. Integration with Active Directory or LDAP does not work: 

Modify webserver_config.py as in the following sample for LDAP:

import os
from flask_appbuilder.security.manager import AUTH_LDAP

basedir = os.path.abspath(os.path.dirname(__file__))
WTF_CSRF_ENABLED = True

# Authenticate against LDAP / Active Directory
AUTH_TYPE = AUTH_LDAP
AUTH_LDAP_SERVER = 'ldap://your-ldap-server:389'

# Service account used to bind and search the directory
AUTH_LDAP_BIND_USER = 'cn=svc_airflow,cn=Managed Service Accounts,dc=testdomain,dc=local'
AUTH_LDAP_BIND_PASSWORD = 'supersecretpw!'  # placeholder; do not commit a real password
AUTH_LDAP_UID_FIELD = 'sAMAccountName'
AUTH_LDAP_SEARCH = 'ou=TestUsers,dc=testdomain,dc=local'

# Map directory DNs to Airflow roles; the DNs must match the directory exactly
AUTH_ROLES_MAPPING = {
    'cn=Access_Airflow,ou=Groups,dc=testdomain,dc=local': ["Admin"],
    'ou=TestUsers,dc=testdomain,dc=local': ["User"],
}
AUTH_ROLE_ADMIN = 'Admin'

# Create the Airflow user on first login and re-sync roles at every login
AUTH_USER_REGISTRATION = True
AUTH_USER_REGISTRATION_ROLE = 'Admin'  # role given to newly registered users
AUTH_ROLES_SYNC_AT_LOGIN = True
AUTH_LDAP_GROUP_FIELD = "memberOf"
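A common reason a user logs in but ends up without the expected role is a key in AUTH_ROLES_MAPPING that does not match any value in the user's memberOf attribute. The following is a simplified model of that lookup, not Flask-AppBuilder's actual code: it assumes the mapping is consulted by exact string match on each memberOf value. The function name roles_for and the sample DNs are hypothetical.

```python
from typing import Dict, Iterable, List, Set

# Assumed mapping, in the same shape as AUTH_ROLES_MAPPING in webserver_config.py.
ROLES_MAPPING: Dict[str, List[str]] = {
    "cn=Access_Airflow,ou=Groups,dc=testdomain,dc=local": ["Admin"],
}


def roles_for(member_of: Iterable[str], mapping: Dict[str, List[str]]) -> Set[str]:
    """Collect the roles mapped to each group DN, using exact string match."""
    roles: Set[str] = set()
    for group_dn in member_of:
        roles.update(mapping.get(group_dn, []))
    return roles


# A DN that matches a key exactly resolves to the mapped role.
matched = roles_for(["cn=Access_Airflow,ou=Groups,dc=testdomain,dc=local"], ROLES_MAPPING)
# A DN with different domain components (dc=test, not dc=testdomain) resolves to nothing.
unmatched = roles_for(["cn=Access_Airflow,ou=Groups,dc=test,dc=local"], ROLES_MAPPING)
print(matched, unmatched)
```

Under this assumption, comparing the mapping keys character by character against the memberOf values returned by the directory is a quick first check when role assignment fails.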
 

  1. Webserver is accessible but API authentication fails:

    Modify the Airflow ConfigMap to allow API authentication with AD integration:

    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: airflow-config
    data:
      airflow.cfg: |
        [api]
        auth_backends = airflow.api.auth.backend.basic_auth
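To check that the API accepts credentials once the basic auth backend is enabled, a request can be built with an HTTP Basic Authorization header. The following is a minimal sketch, assuming Airflow 2.x with the stable REST API at /api/v1 and the webserver reachable on localhost:8080 through the port-forward shown earlier. The function name build_dags_request, the user jdoe, and the password are hypothetical.

```python
import base64
import urllib.request


def build_dags_request(base_url: str, username: str, password: str) -> urllib.request.Request:
    """Build a GET request for the DAG list with a Basic auth header."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/api/v1/dags",
        headers={"Authorization": f"Basic {token}"},
    )


# Hypothetical credentials; replace with a real directory user.
req = build_dags_request("http://localhost:8080", "jdoe", "example-password")
print(req.full_url)

# To actually send it (requires the port-forward to be running):
# with urllib.request.urlopen(req) as resp:
#     print(resp.status)
```

A 401 response from this request, while the web UI login works for the same user, suggests the API auth backend setting has not been picked up by the webserver.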
