The following code implements the workflow of a drone video-sensing application built on Azure Video Indexer and Azure AI Vision:
# Standard library
import http.client
import io
import os
import random
import string
import time
import urllib.parse
import uuid
from pprint import pprint, pformat
from urllib.parse import urlparse

# Third-party
import cv2
import numpy as np
import requests
from django.conf import settings
from dotenv import load_dotenv

# Load .env before the Azure SDK clients read credentials from the environment.
load_dotenv(override=True)

from azure.core.credentials import AzureKeyCredential
from azure.core.exceptions import HttpResponseError
from azure.identity import DefaultAzureCredential
from azure.ai.vision.imageanalysis import ImageAnalysisClient
from azure.ai.vision.imageanalysis.models import VisualFeatures, ImageAnalysisResult
from tenacity import retry, stop_after_attempt, wait_fixed
from dotenv import load_dotenv  # NOTE(review): duplicate of the import above, kept from the original
# Azure AI Vision configuration, sourced from Django settings.
# NOTE(review): lower-case settings attributes are unconventional for Django —
# confirm these names exist on the settings object.
vision_api_key = settings.vision_api_key
vision_api_version = settings.vision_api_version
vision_region = settings.vision_region
vision_endpoint =  settings.vision_endpoint
api_version = settings.api_version
model_version = settings.model_version
# Step 1: Get an access token
def get_access_token():
    """Request an access token for the configured Video Indexer account.

    Returns:
        str: the raw token with the surrounding JSON quotes stripped.

    NOTE(review): the source was garbled here (missing `def` line and the
    `headers` dict opening); reconstructed minimally from the surviving lines.
    """
    url = f"{settings.video_indexer_endpoint}/auth/{settings.video_indexer_region}/Accounts/{settings.video_indexer_account_id}/AccessToken"
    headers = {
        "Ocp-Apim-Subscription-Key": settings.video_indexer_api_key
    }
    response = requests.get(url, headers=headers)
    # The API returns the token as a JSON string literal ("eyJ...") — strip the quotes.
    return response.text.strip('"')
def trim_filename(filename: str, max_length: int = 255) -> str:
    """Shorten *filename* so the whole name fits within *max_length* characters.

    The extension is always preserved intact; only the base name is truncated.
    """
    base, ext = os.path.splitext(filename)
    # Budget left for the base once the extension is accounted for.
    keep = max_length - len(ext)
    return f"{base[:keep]}{ext}"
# Step 2: Upload video and start indexing
def upload_and_index_video(access_token, accountId, video_file_path, video_url = None):
    """Upload a video to Video Indexer (by URL or local file) and start indexing.

    Args:
        access_token: Video Indexer access token.
        accountId: account id, used to prefix the stored video name.
        video_file_path: local path, used when video_url is not given.
        video_url: optional public/SAS URL of the video to ingest.

    Returns:
        dict: the JSON response from the upload call.

    NOTE(review): the original was garbled (missing if/else lines and return);
    reconstructed from the surviving fragments.
    """
    if video_url:
        parsed_url = urllib.parse.urlparse(video_url)
        video_path = parsed_url.path
        video_name = accountId + "-" + video_path.split('/', 2)[-1]
    else:
        video_name = accountId + "-" + trim_filename(os.path.basename(video_file_path))
    url = f"{settings.video_indexer_endpoint}/{settings.video_indexer_region}/Accounts/{settings.video_indexer_account_id}/Videos?name={video_name}&accessToken={access_token}&privacy=Private"
    if video_url:
        encoded_url = urllib.parse.quote(video_url, safe='')
        # Bug fix: the original appended "videoUrl=..." without the '&' separator.
        url += f"&videoUrl={encoded_url}"
        response = requests.post(url)
    else:
        with open(video_file_path, 'rb') as video_file:
            files = {'file': video_file}
            response = requests.post(url, files=files)
    return response.json()
# Step 3: Wait for indexing to complete and get insights
def get_video_insights(access_token, video_id):
    """Poll the Index endpoint until the video state is 'Processed'.

    Returns:
        dict: the full insights JSON once processing completes.

    NOTE(review): reconstructed — the polling loop, JSON parse, and return
    were missing from the garbled source.
    """
    url = f"{settings.video_indexer_endpoint}/{settings.video_indexer_region}/Accounts/{settings.video_indexer_account_id}/Videos/{video_id}/Index?accessToken={access_token}"
    while True:
        response = requests.get(url)
        data = response.json()
        if data['state'] == 'Processed':
            return data
        time.sleep(10)  # Wait 10 seconds before checking again
def get_uploaded_video_id(access_token, accountId, video_file_path, video_url = None):
    """Upload a video (file or URL) and return the new video's id.

    Returns:
        str | None: the 'id' field of the upload response, or None if absent.

    NOTE(review): reconstructed — the guard and return were missing.
    """
    video_data = upload_and_index_video(access_token, accountId, video_file_path, video_url)
    if video_data and 'id' in video_data:
        video_id = video_data['id']
        return video_id
    return None
def get_insights_formatted(access_token, video_id):
    """Fetch insights for a video and format the highlights as readable text.

    Returns:
        str: themes, keyframes and descriptions when summarizedInsights exist,
        otherwise a fallback message with the raw insights.

    NOTE(review): reconstructed — the else branch and return were missing;
    also fixed the theme line, which lacked a trailing newline.
    """
    insights = get_video_insights(access_token, video_id)
    value = "Video highlights and key insights:\n"
    value += ("=" * 50) + "\n"
    # Extract highlights: keyframes, topics, and summarization
    if 'summarizedInsights' in insights:
        for theme in insights['summarizedInsights']['themes']:
            value += f"Theme: {theme['name']}\n"
            for highlight in theme['keyframes']:
                value += f"  Keyframe at {highlight['adjustedStart']} to {highlight['adjustedEnd']}\n"
                value += f"  Thumbnail: {highlight['thumbnailId']}\n"
                value += f"  Description: {highlight.get('description', 'No description')}\n"
    else:
        value += f"No summarization available. See full insights: {insights}"
    return value
# NOTE(review): sample JSON response from the Video Indexer upload call, pasted
# inline as a bare expression statement (it evaluates and is discarded at
# import time). Presumably kept for reference — consider removing or moving
# into documentation.
{'accountId': '26ff36de-cac7-4bea-ad7a-abdf0d63c19c', 'id': 'lwxjba8wy3', 'partition': None, 'externalId': None, 'metadata': None, 'name': 'mainindexedvideo.mp4', 'description': None, 'created': '2025-06-25T03:54:44.3133333+00:00', 'lastModified': '2025-06-25T03:54:44.3133333+00:00', 'lastIndexed': '2025-06-25T03:54:44.3133333+00:00', 'privacyMode': 'Private', 'userName': 'Ravi Rajamani', 'isOwned': True, 'isBase': True, 'hasSourceVideoFile': True, 'state': 'Uploaded', 'moderationState': 'OK', 'reviewState': 'None', 'isSearchable': True, 'processingProgress': '1%', 'durationInSeconds': 0, 'thumbnailVideoId': 'lwxjba8wy3', 'thumbnailId': '00000000-0000-0000-0000-000000000000', 'searchMatches': [], 'indexingPreset': 'Default', 'streamingPreset': 'Default', 'sourceLanguage': 'en-US', 'sourceLanguages': ['en-US'], 'personModelId': '00000000-0000-0000-0000-000000000000'}
def repeat_video_index(access_token, video_id):
    """Trigger a ReIndex of an existing video, then poll for its insights.

    NOTE(review): uses module-level video_indexer_* globals, unlike the
    settings.* style used earlier in this file — confirm they are defined.
    The original `if` body was missing (the return sat at the same indent),
    so the 200 check only logs; insights are fetched unconditionally.
    """
    url = f"{video_indexer_endpoint}/{video_indexer_region}/Accounts/{video_indexer_account_id}/Videos/{video_id}/ReIndex?accessToken={access_token}"
    response = requests.put(url)
    if response.status_code == 200:
        print("Re-indexing started successfully.")
    return get_video_insights(access_token, video_id)
def get_video_insights(access_token, video_id):
    """Poll the Index endpoint until the video state is 'Processed'.

    NOTE(review): this redefines the earlier get_video_insights in this file;
    being later, this definition wins at import time. Reconstructed — the
    polling loop, JSON parse, and return were missing from the garbled source.
    """
    url = f"{video_indexer_endpoint}/{video_indexer_region}/Accounts/{video_indexer_account_id}/Videos/{video_id}/Index?accessToken={access_token}"
    while True:
        response = requests.get(url)
        data = response.json()
        if "state" in data and data['state'] == 'Processed':
            return data
        print("Sleeping for ten seconds...")
        time.sleep(10)  # Wait 10 seconds before checking again
def get_selected_segments(insights, threshold):
    """Collect (start, end) time ranges for every key frame of every shot.

    Args:
        insights: Video Indexer insights JSON (needs summarizedInsights.duration
            and videos[].insights.shots[].keyFrames[].instances[]).
        threshold: target percentage of the indexed duration to keep.

    Returns:
        list[tuple]: (start, end) pairs in insight order.

    NOTE(review): reconstructed — the accumulator init, shot_id binding and
    return were missing. reduced_duration is computed but not yet used to cap
    the selection — presumably the intended behavior; confirm.
    """
    selected_segments = []
    indexed_duration = insights["summarizedInsights"]["duration"]["seconds"]
    reduced_duration = (threshold * indexed_duration) / 100
    for video in insights["videos"]:
        for shot in video["insights"]["shots"]:
            shot_id = shot.get("id")
            for key_frame in shot["keyFrames"]:
                key_frame_id = key_frame["id"]
                start = key_frame["instances"][0]["start"]
                end = key_frame["instances"][0]["end"]
                print(f"Clipping shot: {shot_id}, key_frame: {key_frame_id}, start: {start}, end: {end}")
                selected_segments += [(start, end)]
    return selected_segments
def create_project(access_token, video_id, selected_segments):
    """Create a Video Indexer project covering the selected segments.

    Returns:
        str | None: the new project's id on HTTP 200, else None.

    NOTE(review): reconstructed — the list/dict initializations, payload,
    headers, and return were missing from the garbled source.
    """
    video_ranges = []
    for start, end in selected_segments:
        # Build fresh dicts per segment so the ranges stay independent.
        intervalRange = {"start": start, "end": end}
        intervals = {"videoId": video_id, "range": intervalRange}
        video_ranges += [intervals]
    # Random 8-char hex name for the project.
    project_name = ''.join(random.choices(string.hexdigits, k=8))
    data = {
        "name": project_name,
        "videosRanges": video_ranges,
    }
    headers = {
        "Content-Type": "application/json"
    }
    url = f"{video_indexer_endpoint}/{video_indexer_region}/Accounts/{video_indexer_account_id}/Projects?accessToken={access_token}"
    response = requests.post(url, json=data, headers=headers)
    if response.status_code == 200:
        return response.json().get("id")
    print("Project creation failed:", response.status_code, response.text)
    return None
def render_video(access_token, project_id):
    """Start asynchronous rendering of a project's video.

    Returns:
        dict | None: response JSON when the request is accepted, else None.

    NOTE(review): reconstructed — the headers literal and return were missing.
    """
    url = f"{video_indexer_endpoint}/{video_indexer_region}/Accounts/{video_indexer_account_id}/Projects/{project_id}/render?sendCompletionEmail=false&accessToken={access_token}"
    headers = {
        "Content-Type": "application/json"
    }
    response = requests.post(url, headers=headers)
    if response.status_code == 202:  # 202 Accepted: rendering has started
        return response.json()
    print("Render request failed:", response.status_code, response.text)
    return None
def get_render_operation(access_token, project_id):
    """Poll the render operation until its state is 'Succeeded'.

    Returns:
        dict: the final render-operation status JSON.

    NOTE(review): reconstructed — the polling loop, JSON parse, and return
    were missing from the garbled source.
    """
    url = f"{video_indexer_endpoint}/{video_indexer_region}/Accounts/{video_indexer_account_id}/Projects/{project_id}/renderoperation?accessToken={access_token}"
    while True:
        response = requests.get(url)
        data = response.json()
        if "state" in data and data['state'] == 'Succeeded':
            return data
        print("Sleeping for ten seconds before checking on rendering...")
        time.sleep(10)  # Wait 10 seconds before checking again
def download_rendered_file(access_token, project_id):
    """Fetch the download URL of a project's rendered video.

    Returns:
        str | None: the downloadUrl on success, else None.

    NOTE(review): reconstructed — the JSON parse and fallthrough return were
    missing from the garbled source.
    """
    url = f"{video_indexer_endpoint}/{video_indexer_region}/Accounts/{video_indexer_account_id}/Projects/{project_id}/renderedfile/downloadurl?accessToken={access_token}"
    response = requests.get(url)
    if response.status_code == 200:
        data = response.json()
        if "downloadUrl" in data:
            return data["downloadUrl"]
    return None
def index_and_download_video(account_id = None, project_id = None, video_id = None, video_file_path = None, video_url = None):
    """End-to-end orchestration: upload/index, select highlights, render, download.

    Returns:
        str | None: the download URL for the rendered highlight video.

    NOTE(review): heavily reconstructed — most conditionals were missing, and
    the original referenced undefined names (uploaded_video_id, accountId).
    Verify the token-refresh and fallback logic against the intended workflow.
    """
    if not account_id:
        account_id = settings.video_indexer_default_account_id
    access_token = settings.video_indexer_access_token
    if not access_token:
        access_token = get_access_token()
    if not video_id and not video_file_path and not video_url:
        raise ValueError("Provide a video_id, a local video_file_path, or a video_url.")
    if not video_id:
        if video_url:
            video_id = get_uploaded_video_id(access_token, account_id, video_file_path, video_url=video_url)
        else:
            video_id = get_uploaded_video_id(access_token, account_id, video_file_path)
    insights = get_video_insights(access_token, video_id)
    # Keep roughly 10% of the indexed duration as highlights.
    selected_segments = get_selected_segments(insights, 10)
    if not project_id:
        project_id = create_project(access_token, video_id, selected_segments)
    render_response = render_video(access_token, project_id)
    status = get_render_operation(access_token, project_id)
    download_url = download_rendered_file(access_token, project_id)
    return download_url
def get_image_blob_url(video_url, frame_number):
    """Derive the blob URL for an extracted frame image from the video's URL.

    Frames are stored under an "images/" folder next to the video blob, named
    frame{frame_number}.jpg. Any SAS token (query string) on the video URL is
    carried over to the image URL.

    Returns:
        str: the frame image URL (with SAS token when present).

    NOTE(review): reconstructed — the empty-directory branch body, SAS-token
    extraction, and return were missing from the garbled source.
    """
    # Parse the original video URL to get account, container, and path
    parsed = urlparse(video_url)
    path_parts = parsed.path.split('/')
    container = path_parts[1]
    blob_path = '/'.join(path_parts[2:])
    # Remove the file name from the blob path
    blob_dir = '/'.join(blob_path.split('/')[:-1])
    if blob_dir:
        image_path = f"{blob_dir}/images/frame{frame_number}.jpg"
    else:
        # Video sits at the container root — no directory prefix.
        image_path = f"images/frame{frame_number}.jpg"
    # Rebuild the base URL (without SAS token)
    base_url = f"{parsed.scheme}://{parsed.netloc}/{container}/{image_path}"
    # Re-attach the SAS token if the video URL carried one
    sas_token = parsed.query
    if sas_token:
        image_url = f"{base_url}?{sas_token}"
    else:
        image_url = base_url
    return image_url
def download_blob_to_stream(blob_client):
    """Download a blob fully into memory and return it as a seekable BytesIO."""
    payload = blob_client.download_blob().readall()
    return io.BytesIO(payload)
def extract_and_upload_frames(video_sas_url):
    """Download a video blob, split it into JPEG frames, upload each frame.

    Frames are uploaded next to the video under images/frame{n}.jpg (see
    get_image_blob_url) and also written locally for inspection.

    NOTE(review): reconstructed — the read loop, frame counter, local write,
    and temp-file cleanup were missing from the garbled source.
    """
    # Set up blob client for video
    video_blob_client = BlobClient.from_blob_url(video_sas_url)
    # Download video to memory stream
    video_stream = download_blob_to_stream(video_blob_client)
    video_bytes = video_stream.getvalue()
    # OpenCV needs a file path, so spill the bytes to a unique temp file.
    video_temp = os.path.join(os.getcwd(), f"temp_{uuid.uuid4()}.mp4")
    with open(video_temp, 'wb') as f:
        f.write(video_bytes)
    try:
        vidcap = cv2.VideoCapture(video_temp)
        frame_number = 0
        while True:
            success, frame = vidcap.read()
            if not success:
                break  # end of video
            _, buffer = cv2.imencode('.jpg', frame)
            image_bytes = buffer.tobytes()
            # The original also saved each frame locally — presumably for
            # debugging; preserved here.
            image_path = f"frame{frame_number}.jpg"
            with open(image_path, 'wb') as frame_file:
                frame_file.write(image_bytes)
            # Generate image blob URL and upload the frame
            image_url = get_image_blob_url(video_sas_url, frame_number)
            image_blob_client = BlobClient.from_blob_url(image_url)
            image_blob_client.upload_blob(image_bytes, overwrite=True)
            print(f"Uploaded frame {frame_number} to {image_url}")
            frame_number += 1
        vidcap.release()
    finally:
        # Always remove the temp video file, even on failure.
        os.remove(video_temp)
def vectorize_extracted_frames(video_sas_url):
    """Extract/upload frames, then vectorize and caption each frame image.

    Returns:
        list[tuple]: (vector, description) per frame, in frame order.

    NOTE(review): heavily reconstructed — the frame loop, accumulator, and
    termination condition were missing. The loop stops at the first frame
    whose blob cannot be fetched (matching the original "Giving up" message);
    confirm that is the intended stop condition.
    """
    extract_and_upload_frames(video_sas_url)
    vision_credential = AzureKeyCredential(vision_api_key)
    analysis_client = ImageAnalysisClient(vision_endpoint, vision_credential)
    tuples = []
    frame_number = 0
    while True:
        # Generate image blob URL for the next frame
        image_url = get_image_blob_url(video_sas_url, frame_number)
        try:
            image_blob_client = BlobClient.from_blob_url(image_url)
            image_stream = download_blob_to_stream(image_blob_client)
        except Exception:
            # Slicing [74:80] presumably isolates the frame name within the URL.
            print(f"No such image: {image_url[74:80]}. Giving up...")
            break
        vector = vectorize_image(image_url, vision_api_key, vision_region)
        if vector is not None:
            # Pad to the fixed 1536-dim embedding size expected downstream.
            if len(vector) < 1536:
                vector = np.pad(vector, (0, 1536 - len(vector)), mode='constant')
            print(f"Vectorized frame: {frame_number}")
        description = analyze_image(analysis_client, image_url)
        if description:
            print(f"Analyzed frame: {frame_number}")
        tuples += [(vector, description)]
        frame_number += 1
    return tuples
# access_token = os.getenv("AZURE_VIDEO_INDEXER_ACCESS_TOKEN", get_access_token()) 
# video_sas_url=video_sas_url.strip('"') 
# extract_and_upload_frames(video_sas_url) 
# vision_credential = AzureKeyCredential(vision_api_key) 
# analysis_client = ImageAnalysisClient(vision_endpoint, vision_credential) 
@retry(stop=stop_after_attempt(5), wait=wait_fixed(60))
def vectorize_image(image_path, key, region):
    """Call the Azure AI Vision image-retrieval API to embed an image URL.

    Args:
        image_path: SAS URL of the image blob.
        key: Vision API subscription key.
        region: kept for interface compatibility; not used in the URL.

    Returns:
        list | None: the embedding vector from the response.

    Raises:
        Exception: on non-200 responses, so tenacity retries.

    NOTE(review): reconstructed — the try:, headers/payload literals, JSON
    parse, and returns were missing from the garbled source.
    """
    try:
        # API version and model version
        api_version = "2024-02-01"
        model_version = "2023-04-15"
        # Construct the request URL
        url = f"{vision_endpoint}/computervision/retrieval:vectorizeImage?api-version={api_version}&model-version={model_version}"
        headers = {
            "Content-Type": "application/json",
            "Ocp-Apim-Subscription-Key": key
        }
        # Set the payload with the SAS URL
        payload = {"url": image_path}
        response = requests.post(url, headers=headers, json=payload)
        if response.status_code == 200:
            result = response.json()
            # The vector is in the 'vector' field of the response
            vector = result.get("vector")
            return vector
        print("Error:", response.status_code, response.text)
        raise Exception(f"Error vectorizing image {image_path[74:80]}")
    except (requests.exceptions.Timeout, http.client.HTTPException) as e:
        print(f"Timeout/Error for {image_path[74:80]}. Retrying...")
        # Re-raise so the @retry decorator actually retries.
        raise
@retry(stop=stop_after_attempt(5), wait=wait_fixed(60)) 
def analyze_image(client, image_url): 
        # Define all available visual features for analysis 
            VisualFeatures.SMART_CROPS, 
            VisualFeatures.DENSE_CAPTIONS, 
        # Analyze the image from the SAS URL 
        result = client.analyze_from_url( 
            visual_features=features, 
            gender_neutral_caption=True        ) 
        # Explicitly cast to ImageAnalysisResult (for clarity) 
        result: ImageAnalysisResult = result 
            captions += [ f"{result.caption.text}" if result.caption is not None else "No Caption"] 
            captions += [ f"{caption.text}" for caption in result.dense_captions.list if result.dense_captions is not None] 
            result.description = ",".join(captions) 
            description =  pformat(result.__dict__, depth=4, compact=False) 
    except HttpResponseError as e: