Wednesday, May 14, 2025

  

The following is a list of errors and resolutions frequently encountered during Kubernetes and Airflow setup with Active Directory integration and Single Sign-On on an Azure Kubernetes Service (AKS) instance. This information is hard to find online. 

  1. Unable to get aks-credentials, with a Python import error for the azure.graph module even though the cluster and the resource group are correct. Install the missing Python packages: 

pip3 install azure-graphrbac 
pip3 install msgraph-core 

  2. The az CLI command to run a kubectl command on the cluster fails. Add the required extensions: 

az extension add --name aks-preview 
az extension add --name azure-cli-legacy-auth 
az extension add --name resource-graph 
az extension add --name k8s-extension 

  3. The extensions are present for the az CLI, but commands still fail. Update the extensions: 

az extension update --name aks-preview && az extension update --name k8s-extension 

  4. Unable to get namespaces on the cluster even after a successful login and extension install: 

Run both: 

az aks get-credentials --resource-group <resource-group> --name <aks-cluster> 
kubelogin convert-kubeconfig -l azurecli 

  5. Installation of Airflow fails: 

Get Helm; this is probably the fastest way to do the install. Add the URL to download the Helm chart from Airflow (shown below), or create a HelmRelease. 
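A minimal example, assuming the official Apache Airflow chart repository URL: 

helm repo add apache-airflow https://airflow.apache.org 
helm repo update 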

Create a namespace: 

kubectl create namespace airflow 

  6. The repo exists and the chart is found, but the Airflow install times out: 

Increase the timeout: 

helm install dev-release apache-airflow/airflow --namespace airflow --timeout 60m0s --wait 
 

  7. Diagnose failures: 

Use the following to view the deployment logs or HelmRelease failures: 
kubectl describe helmrelease.helm.toolkit.fluxcd.io/airflow -n airflow 
 
 
For failed releases, uninstall and install again (note that helm uninstall takes the release name, not the chart reference): 
      helm list --all-namespaces --failed 
      helm uninstall dev-release --namespace airflow 
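To pull pod logs directly, assuming the dev-release release name used above (the chart names the component deployments accordingly): 

      kubectl logs deployment/dev-release-webserver -n airflow 
      kubectl logs deployment/dev-release-scheduler -n airflow 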
 

  8. The webserver is inaccessible: 

kubectl port-forward svc/dev-release-webserver 8080:8080 --namespace airflow 
# Command to reset the Airflow metadata database after AD integration 
airflow db reset 
 

  9. Integration with Active Directory or LDAP does not work: 

Modify webserver_config.py. A sample webserver_config.py for LDAP: 
import os 
from flask_appbuilder.security.manager import AUTH_LDAP 
 
basedir = os.path.abspath(os.path.dirname(__file__)) 
WTF_CSRF_ENABLED = True 
AUTH_TYPE = AUTH_LDAP 
AUTH_LDAP_SERVER = 'ldap://your-ldap-server:389' 
AUTH_LDAP_BIND_USER = 'cn=svc_airflow,cn=Managed Service Accounts,dc=testdomain,dc=local' 
AUTH_LDAP_BIND_PASSWORD = 'supersecretpw!' 
AUTH_LDAP_UID_FIELD = 'sAMAccountName' 
AUTH_LDAP_SEARCH = 'ou=TestUsers,dc=testdomain,dc=local' 
AUTH_ROLES_MAPPING = { 
         'cn=Access_Airflow,ou=Groups,dc=testdomain,dc=local':["Admin"], 
         'ou=TestUsers,dc=testdomain,dc=local':["User"] 
} 
AUTH_ROLE_ADMIN = 'Admin' 
AUTH_USER_REGISTRATION = True 
AUTH_USER_REGISTRATION_ROLE = 'Admin' 
AUTH_ROLES_SYNC_AT_LOGIN = True 
AUTH_LDAP_GROUP_FIELD = "memberOf" 
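
One way to ship this file with the official Helm chart is the chart's webserverConfig value (verify the value name against your chart version): 

helm upgrade dev-release apache-airflow/airflow --namespace airflow \
  --set-file webserverConfig=webserver_config.py 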
 

  10. The webserver is accessible but API auth fails: 

    Modify the Airflow ConfigMap to allow API auth with AD integration: 
     
    apiVersion: v1 
    kind: ConfigMap 
    metadata: 
      name: airflow-config 
    data: 
      airflow.cfg: | 
        [api] 
        auth_backends = airflow.api.auth.backend.basic_auth 
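
A quick check, assuming basic auth and the port-forward from step 8 (the credentials are placeholders): 

curl -u admin:admin http://localhost:8080/api/v1/dags 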

Tuesday, May 13, 2025

 These are helpful utilities for image processing using Azure resources:

1. Vectorize images. Sample code and output follow:
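A minimal sketch using the Azure AI Vision multimodal embeddings REST API; the environment variable names, api-version, and image URL are assumptions:

import os
import requests

endpoint = os.getenv("AZURE_AI_VISION_ENDPOINT")
key = os.getenv("AZURE_AI_VISION_API_KEY")

# The retrieval:vectorizeImage operation returns an embedding for the image
# (1024 dimensions for model version 2023-04-15)
response = requests.post(
    f"{endpoint}/computervision/retrieval:vectorizeImage",
    params={"api-version": "2024-02-01", "model-version": "2023-04-15"},
    headers={"Ocp-Apim-Subscription-Key": key, "Content-Type": "application/json"},
    json={"url": "https://example.com/aerial.jpg"},  # hypothetical image URL
)
response.raise_for_status()
print("Vector embedding:", response.json()["vector"])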

Output: Vector embedding: [-1.0224609, -1.3076172,...

2. Analyze images. Sample code and output follow:
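A minimal sketch with the azure-ai-vision-imageanalysis SDK; the environment variable names and image URL are assumptions:

import os
from azure.ai.vision.imageanalysis import ImageAnalysisClient
from azure.ai.vision.imageanalysis.models import VisualFeatures
from azure.core.credentials import AzureKeyCredential

client = ImageAnalysisClient(
    endpoint=os.getenv("AZURE_AI_VISION_ENDPOINT"),
    credential=AzureKeyCredential(os.getenv("AZURE_AI_VISION_API_KEY")),
)
result = client.analyze_from_url(
    image_url="https://example.com/aerial.jpg",  # hypothetical image URL
    visual_features=[VisualFeatures.CAPTION, VisualFeatures.DENSE_CAPTIONS, VisualFeatures.TAGS],
    gender_neutral_caption=True,
)
if result.caption is not None:
    print(f"Caption: '{result.caption.text}', Confidence: {result.caption.confidence:.4f}")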

Output:

Image analysis results:

Caption: 'a building with a road and trees', Confidence: 0.5844

{
  "modelVersion": "2023-10-01",
  "captionResult": { "text": "a building with a road and trees", "confidence": 0.5844066143035889 },
  "denseCaptionsResult": {
    "values": [
      { "text": "a building with a road and trees", "confidence": 0.5844066143035889, "boundingBox": { "x": 0, "y": 0, "w": 1920, "h": 1080 } },
      { "text": "a building with a roof and trees", "confidence": 0.5829769968986511, "boundingBox": { "x": 929, "y": 171, "w": 938, "h": 884 } },
      { "text": "a tree shadow on the road", "confidence": 0.6864767074584961, "boundingBox": { "x": 332, "y": 0, "w": 255, "h": 1062 } },
      { "text": "a top view of a building", "confidence": 0.7406209707260132, "boundingBox": { "x": 962, "y": 189, "w": 887, "h": 332 } },
      { "text": "a blurry image of a person's arm", "confidence": 0.7104462385177612, "boundingBox": { "x": 1634, "y": 328, "w": 54, "h": 63 } },
      { "text": "a building with a roof and a road and trees", "confidence": 0.5697128176689148, "boundingBox": { "x": 0, "y": 0, "w": 1890, "h": 1056 } },
      { "text": "a tree in a park", "confidence": 0.6157793402671814, "boundingBox": { "x": 848, "y": 444, "w": 503, "h": 619 } },
      { "text": "a close up of a plant", "confidence": 0.6476104855537415, "boundingBox": { "x": 943, "y": 930, "w": 206, "h": 146 } },
      { "text": "a tree and grass field", "confidence": 0.5954487919807434, "boundingBox": { "x": 4, "y": 0, "w": 319, "h": 1070 } },
      { "text": "a close up of a window", "confidence": 0.7861047387123108, "boundingBox": { "x": 1633, "y": 419, "w": 83, "h": 76 } }
    ]
  },
  "metadata": { "width": 1920, "height": 1080 },
  "tagsResult": {
    "values": [
      { "name": "outdoor", "confidence": 0.9880061149597168 },
      { "name": "building", "confidence": 0.93121337890625 },
      { "name": "urban design", "confidence": 0.9306544065475464 },
      { "name": "map", "confidence": 0.9177150726318359 },
      { "name": "aerial photography", "confidence": 0.8905916213989258 },
      { "name": "intersection", "confidence": 0.8808201551437378 },
      { "name": "junction", "confidence": 0.8713006973266602 },
      { "name": "aerial", "confidence": 0.8662087917327881 },
      { "name": "tree", "confidence": 0.8520137667655945 },
      { "name": "infrastructure", "confidence": 0.8460453748703003 },
      { "name": "house", "confidence": 0.8455849885940552 },
      { "name": "suburb", "confidence": 0.8436774015426636 },
      { "name": "transport corridor", "confidence": 0.841437578201294 },
      { "name": "street", "confidence": 0.7220888137817383 }
    ]
  },
  "objectsResult": {
    "values": [
      { "boundingBox": { "x": 961, "y": 18, "w": 941, "h": 1055 }, "tags": [ { "name": "building", "confidence": 0.551 } ] }
    ]
  },
  "readResult": { "blocks": [] },
  "smartCropsResult": {
    "values": [
      { "aspectRatio": 1.96, "boundingBox": { "x": 80, "y": 135, "w": 1760, "h": 900 } }
    ]
  },
  "peopleResult": {
    "values": [
      { "boundingBox": { "x": 1033, "y": 0, "w": 54, "h": 78 }, "confidence": 0.11555740237236023 },
      { "boundingBox": { "x": 1706, "y": 0, "w": 38, "h": 28 }, "confidence": 0.044786710292100906 },
      { "boundingBox": { "x": 1764, "y": 702, "w": 72, "h": 107 }, "confidence": 0.018947092816233635 },
      { "boundingBox": { "x": 1617, "y": 4, "w": 26, "h": 32 }, "confidence": 0.01635269820690155 },
      { "boundingBox": { "x": 1897, "y": 997, "w": 20, "h": 80 }, "confidence": 0.014565806835889816 },
      { "boundingBox": { "x": 1174, "y": 264, "w": 65, "h": 138 }, "confidence": 0.009904739446938038 },
      { "boundingBox": { "x": 1570, "y": 0, "w": 19, "h": 26 }, "confidence": 0.00963284820318222 },
      { "boundingBox": { "x": 975, "y": 812, "w": 23, "h": 56 }, "confidence": 0.007403235416859388 },
      { "boundingBox": { "x": 1892, "y": 256, "w": 25, "h": 89 }, "confidence": 0.0058165849186480045 },
      { "boundingBox": { "x": 1730, "y": 1006, "w": 92, "h": 71 }, "confidence": 0.005636707879602909 },
      { "boundingBox": { "x": 1003, "y": 0, "w": 49, "h": 28 }, "confidence": 0.005567244254052639 },
      { "boundingBox": { "x": 1006, "y": 0, "w": 64, "h": 60 }, "confidence": 0.00508015975356102 },
      { "boundingBox": { "x": 1788, "y": 672, "w": 72, "h": 102 }, "confidence": 0.004823194816708565 },
      { "boundingBox": { "x": 1878, "y": 943, "w": 39, "h": 134 }, "confidence": 0.00384620507247746 },
      { "boundingBox": { "x": 1063, "y": 249, "w": 49, "h": 126 }, "confidence": 0.003768299473449588 },
      { "boundingBox": { "x": 1791, "y": 991, "w": 115, "h": 86 }, "confidence": 0.003688311204314232 },
      { "boundingBox": { "x": 1743, "y": 438, "w": 45, "h": 77 }, "confidence": 0.0035305204801261425 },
      { "boundingBox": { "x": 1702, "y": 0, "w": 42, "h": 69 }, "confidence": 0.0028348765335977077 },
      { "boundingBox": { "x": 902, "y": 805, "w": 31, "h": 63 }, "confidence": 0.0027336280327290297 },
      { "boundingBox": { "x": 1135, "y": 223, "w": 36, "h": 65 }, "confidence": 0.002365714870393276 },
      { "boundingBox": { "x": 1068, "y": 203, "w": 76, "h": 164 }, "confidence": 0.00231865793466568 },
      { "boundingBox": { "x": 1721, "y": 316, "w": 33, "h": 72 }, "confidence": 0.001977135194465518 },
      { "boundingBox": { "x": 1430, "y": 274, "w": 34, "h": 63 }, "confidence": 0.0019341635052114725 },
      { "boundingBox": { "x": 917, "y": 799, "w": 21, "h": 32 }, "confidence": 0.0017207009950652719 },
      { "boundingBox": { "x": 1722, "y": 976, "w": 58, "h": 101 }, "confidence": 0.0017095959046855569 },
      { "boundingBox": { "x": 1824, "y": 989, "w": 50, "h": 76 }, "confidence": 0.0014758453471586108 },
      { "boundingBox": { "x": 1745, "y": 130, "w": 87, "h": 202 }, "confidence": 0.001272828783839941 },
      { "boundingBox": { "x": 1559, "y": 635, "w": 115, "h": 232 }, "confidence": 0.001130886492319405 },
      { "boundingBox": { "x": 1220, "y": 255, "w": 21, "h": 55 }, "confidence": 0.0010053350124508142 }
    ]
  }
}


Monday, May 12, 2025

 This is a sample to illustrate geolocation verification in aerial images:

import cv2
import numpy as np
import requests

# Function to detect and extract features from the aerial image
def extract_features(image_path):
    image = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    orb = cv2.ORB_create()
    keypoints, descriptors = orb.detectAndCompute(image, None)
    return keypoints, descriptors, image

# Function to match features between images
def match_features(descriptors1, descriptors2):
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = matcher.match(descriptors1, descriptors2)
    matches = sorted(matches, key=lambda x: x.distance)  # Sort by match quality
    return matches

# Function to get GPS coordinates using the Google Maps Geocoding API
def get_geolocation(image_name, api_key):
    url = f"https://maps.googleapis.com/maps/api/geocode/json?address={image_name}&key={api_key}"
    response = requests.get(url)
    data = response.json()
    if data["status"] == "OK":
        location = data["results"][0]["geometry"]["location"]
        return location["lat"], location["lng"]
    return None

# Paths to images
aerial_image_path = "aerial_landmark.jpg"
reference_image_path = "reference_satellite.jpg"

# Extract features from both images
keypoints1, descriptors1, image1 = extract_features(aerial_image_path)
keypoints2, descriptors2, image2 = extract_features(reference_image_path)

# Match features
matches = match_features(descriptors1, descriptors2)

# Draw the 50 best matches
output_image = cv2.drawMatches(image1, keypoints1, image2, keypoints2, matches[:50], None, flags=cv2.DrawMatchesFlags_NOT_DRAW_SINGLE_POINTS)

# Display results
cv2.imshow("Feature Matching", output_image)
cv2.waitKey(0)
cv2.destroyAllWindows()

# Perform geolocation verification
api_key = "YOUR_GOOGLE_MAPS_API_KEY"  # Replace with your API key
location = get_geolocation("Hoover Tower, Stanford University", api_key)
if location:
    print(f"Verified Landmark Coordinates: Latitude {location[0]}, Longitude {location[1]}")
else:
    print("Geolocation verification failed!")


Sunday, May 11, 2025

 The following is a sample of how to index images in Azure AI Search for lexical and vector search.

#! /usr/bin/python

#from azure.ai.vision import VisionClient
from azure.core.credentials import AzureKeyCredential
from azure.core.rest import HttpRequest, HttpResponse
from azure.core.exceptions import HttpResponseError
from azure.identity import DefaultAzureCredential
from azure.search.documents import SearchClient
from azure.ai.vision.imageanalysis import ImageAnalysisClient
from azure.ai.vision.imageanalysis.models import VisualFeatures
from tenacity import retry, stop_after_attempt, wait_fixed
from dotenv import load_dotenv
import json
import requests
import http.client, urllib.parse
import os

load_dotenv()
search_endpoint = os.getenv("AZURE_SEARCH_SERVICE_ENDPOINT")
index_name = os.getenv("AZURE_SEARCH_INDEX_NAME")
search_api_version = os.getenv("AZURE_SEARCH_API_VERSION")
search_api_key = os.getenv("AZURE_SEARCH_ADMIN_KEY")
vision_api_key = os.getenv("AZURE_AI_VISION_API_KEY")
vision_api_version = os.getenv("AZURE_AI_VISION_API_VERSION")
vision_region = os.getenv("AZURE_AI_VISION_REGION")
vision_endpoint = os.getenv("AZURE_AI_VISION_ENDPOINT")
credential = DefaultAzureCredential()
#search_credential = AzureKeyCredential(search_api_key)
vision_credential = AzureKeyCredential(vision_api_key)

# Initialize Azure clients
#vision_client = VisionClient(endpoint=vision_endpoint, credential=AzureKeyCredential(vision_api_key))
search_client = SearchClient(endpoint=search_endpoint, index_name=index_name, credential=credential)
analysis_client = ImageAnalysisClient(vision_endpoint, vision_credential)

# Define SAS URL template
sas_template = "https://saravinoteblogs.blob.core.windows.net/playground/vision/main/main/{file}.jpg?sp=rle&st=2025-05-11T00:36:41Z&se=2025-05-11T08:36:41Z&spr=https&sv=2024-11-04&sr=d&sig=vjCrqWLo3LbmkXwCyIKWtAtFnYO2uBSxEWNgGKbeS00%3D&sdd=3"

# Process images in batches of 100
batch_size = 100
total_images = 2 # 17853  # Adjust this as needed

def get_description(id, image_url):
    result = analyze_image_from_sdk(analysis_client, image_url)
    description = {}
    description["id"] = id
    # Access the results (caption, tags, objects)
    if result.caption:
        print(f"Caption: {result.caption.text}")
        print(f"Caption Confidence: {result.caption.confidence}")
        description["caption"] = f"{result.caption.text}"
        description["caption_confidence"] = result.caption.confidence
    if result.tags:
        print("Tags:")
        tags = []
        for tag in result.tags.list:
            # Build a separate dict so the loop variable is not shadowed
            tagItem = {}
            print(f"  {tag.name}: {tag.confidence}")
            tagItem["name"] = f"{tag.name}"
            tagItem["confidence"] = f"{tag.confidence}"
            tags += [tagItem]
        description["tags"] = tags
    if result.objects:
        print("Objects:")
        objectItems = []
        for obj in result.objects.list:
            objectItem = {}
            # A detected object reports its name and confidence via its first tag
            print(f"  {obj.tags[0].name}: {obj.tags[0].confidence}")
            objectItem["name"] = f"{obj.tags[0].name}"
            objectItem["confidence"] = obj.tags[0].confidence
            if obj.bounding_box:
                print(f"    Bounding Box: {obj.bounding_box}")
                objectItem["bounding_box"] = f"{obj.bounding_box}"
            objectItems += [objectItem]
        description["objects"] = objectItems
    return description

#@retry(stop=stop_after_attempt(5), wait=wait_fixed(1))
def vectorize_image(client, blob_url):
    headers = {
        'Ocp-Apim-Subscription-Key': vision_api_key,
        'Content-Type': 'application/json',
    }
    params = {
        'model-version': '2023-04-15',
        'language': 'en'
    }
    request = HttpRequest(
        method="POST",
        url=f"/retrieval:vectorizeImage?api-version={vision_api_version}",
        json={"url": blob_url},
        params=params,
        headers=headers
    )
    response = client.send_request(request)
    try:
        print(repr(response))
        response.raise_for_status()
        data = response.json()
        print(f"vectorize returned {data}")
        return data.get("vector")  # return just the embedding list for indexing
    except HttpResponseError as e:
        print(str(e))
        return None

#@retry(stop=stop_after_attempt(5), wait=wait_fixed(1))
def get_image_vector(image_path, key, region):
    headers = {
        'Ocp-Apim-Subscription-Key': key,
    }
    params = urllib.parse.urlencode({
        'model-version': 'latest',
    })
    try:
        if image_path.startswith(('http://', 'https://')):
            headers['Content-Type'] = 'application/json'
            body = json.dumps({"url": image_path})
        else:
            headers['Content-Type'] = 'application/octet-stream'
            with open(image_path, "rb") as filehandler:
                body = filehandler.read()

        conn = http.client.HTTPSConnection("img01.cognitiveservices.azure.com", timeout=3)
        conn.request("POST", "/retrieval:vectorizeImage?api-version=2023-04-01-preview&%s" % params, body, headers)
        response = conn.getresponse()
        print(repr(response))
        data = json.load(response)
        print(repr(data))
        conn.close()

        if response.status != 200:
            raise Exception(f"Error processing image {image_path}: {data.get('message', '')}")
        return data.get("vector")
    except (requests.exceptions.Timeout, http.client.HTTPException) as e:
        print(f"Timeout/Error for {image_path}. Retrying...")
        raise

#@retry(stop=stop_after_attempt(5), wait=wait_fixed(1))
def analyze_image(client, blob_url):
    headers = {
        'Ocp-Apim-Subscription-Key': vision_api_key,  # vision key, not the search admin key
        'Content-Type': 'application/json',
    }
    params = {
        'model-version': '2023-04-15',
        'language': 'en'
    }
    request = HttpRequest(
        method="POST",
        url=f"/computervision/imageanalysis:analyze?api-version={vision_api_version}",
        json={"url": blob_url},
        params=params,
        headers=headers
    )
    response = client.send_request(request)
    try:
        response.raise_for_status()
        print(f"analyze returned {response.json()}")
        return response.json()
    except HttpResponseError as e:
        print(str(e))
        return None

def analyze_image_from_sdk(client, blob_url):
    result = client.analyze_from_url(
        image_url=blob_url,
        visual_features=[
            VisualFeatures.TAGS,
            VisualFeatures.OBJECTS,
            VisualFeatures.CAPTION,
            VisualFeatures.DENSE_CAPTIONS,
            VisualFeatures.READ,
            VisualFeatures.SMART_CROPS,
            VisualFeatures.PEOPLE,
        ],  # Mandatory. Select one or more visual features to analyze.
        smart_crops_aspect_ratios=[0.9, 1.33],  # Optional. Relevant only if SMART_CROPS was specified above.
        gender_neutral_caption=True,  # Optional. Relevant only if CAPTION or DENSE_CAPTIONS were specified above.
        language="en",  # Optional. Relevant only if TAGS is specified above. See https://aka.ms/cv-languages for supported languages.
        model_version="latest",  # Optional. Analysis model version to use. Defaults to "latest".
    )
    return result

def vectorize_image_from_sdk(client, blob_url):
    # The image analysis SDK does not expose a vectorize helper,
    # so delegate to the REST call above.
    return vectorize_image(client, blob_url)

for batch_start in range(1, total_images + 1, batch_size):
    vectorized_images = {}
    documents = []

    # Vectorize up to batch_size images at a time
    batch_end = min(batch_start + batch_size, total_images + 1)
    for i in range(batch_start, batch_end):
        file_name = f"{i:06}"
        blob_url = sas_template.format(file=file_name)
        try:
            #response = get_image_vector(blob_url, vision_api_key, "eastus")
            response = vectorize_image(analysis_client, blob_url)
            print(repr(response))
            if response:
                vectorized_images[file_name] = response
                documents += [
                    {"id": file_name, "description": repr(get_description(file_name, sas_template.format(file=file_name))), "vector": response}
                ]
        except Exception as e:
            print(f"Error processing {file_name}.jpg: {e}")

    print(f"Vectorization complete for images {batch_start} to {min(batch_start + batch_size - 1, total_images)}")
    # Upload batch to Azure AI Search
    if len(documents) > 0:
        # search_client.upload_documents(documents)
        print(f"Uploaded {len(documents)} images {batch_start} to {batch_end} to {index_name}.")

print(f"Vectorized images successfully added to {index_name}!")