Cluster computing

A previous article discussed ways to speed up the detection and cataloging of objects from drone images into the drone world database. A drone video clip consists of many overlapping frames, so repeating the vectorization and analysis of the same images, and of the objects within them, is not only time-consuming and expensive but also unnecessary, because duplicates and previously visited objects can be recognized. That approach still advances frame by frame but skips the processing wherever possible.
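The idea of skipping frames that repeat already-seen content can be sketched as follows. This is only an illustration and not the method of the previous article: it assumes each frame has already been reduced to a small grayscale thumbnail (a list of rows of 0-255 values), it swaps in a simple average hash as the fingerprint, and the names average_hash, select_novel_frames and the max_distance threshold are hypothetical.

```python
from typing import List, Sequence

Grid = Sequence[Sequence[int]]  # small grayscale thumbnail, values 0-255


def average_hash(grid: Grid) -> int:
    """Fingerprint a thumbnail: one bit per pixel, set when the pixel is brighter than the mean."""
    pixels = [p for row in grid for p in row]
    mean = sum(pixels) / len(pixels)
    bits = 0
    for p in pixels:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two fingerprints."""
    return bin(a ^ b).count("1")


def select_novel_frames(grids: List[Grid], max_distance: int = 2) -> List[int]:
    """Return indices of frames whose fingerprint is not close to any frame already kept."""
    kept_hashes: List[int] = []
    kept_indices: List[int] = []
    for index, grid in enumerate(grids):
        h = average_hash(grid)
        # Keep the frame only if it differs enough from every frame kept so far.
        if all(hamming(h, seen) > max_distance for seen in kept_hashes):
            kept_hashes.append(h)
            kept_indices.append(index)
    return kept_indices
```

Only the frames returned by select_novel_frames would then go on to the more expensive vectorization and object analysis. A real pipeline would more likely compare embedding vectors or object signatures, but the control flow stays the same.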

Another approach is to statistically sample temporally distributed images from the drone world based on the drone's speed, flight pattern, GPS coordinates and timestamps. This additional information may need to be sourced externally. While some drones provide correlation keys, and additional information can be found online, our approach was premised on not requiring this optional information.
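When speed information is available, the sampling interval can be derived from it. The sketch below is a minimal example under stated assumptions: the speed is in meters per second, footprint_m is the along-track ground distance covered by one frame, and overlap is the fraction of ground that consecutive samples should share. The function names and parameters are hypothetical and do not come from any drone SDK.

```python
import math
from typing import List


def frame_stride(speed_m_s: float, fps: float, footprint_m: float, overlap: float = 0.5) -> int:
    """Frames to advance between samples so that consecutive samples
    still share roughly `overlap` of the ground footprint."""
    if not 0.0 <= overlap < 1.0:
        raise ValueError("overlap must be in [0, 1)")
    if speed_m_s <= 0:
        # A hovering drone keeps seeing the same scene; sampling about one
        # frame per second is an arbitrary fallback chosen for this sketch.
        return max(1, round(fps))
    advance_m = footprint_m * (1.0 - overlap)  # ground distance between samples
    seconds_between = advance_m / speed_m_s
    return max(1, math.floor(seconds_between * fps))


def sample_indices(total_frames: int, stride: int) -> List[int]:
    """Frame indices to analyze when advancing `stride` frames at a time."""
    return list(range(0, total_frames, stride))
```

A faster drone or a smaller footprint gives a smaller stride, so more frames are sampled; a slow pass over a wide scene lets most frames be skipped.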

A different approach to improving performance is to run AI models that pick out highlights from drone videos, reducing the amount of video that must be split into frames and analyzed while still maintaining high precision and recall of drone world objects. Several commercial tools generate highlights from videos. VEED.IO is an online tool that uses AI to extract the best moments from videos and turn them into highlights; it also allows trimming clips, rearranging footage, and adding text or music. OpusClip is an AI-driven highlight video maker that automatically selects the most engaging moments from footage and is designed to save time and enhance video quality. Pictory specializes in creating highlight reels from long videos and also offers automatic captioning and background music integration. Kapwing is a versatile online editor that integrates AI tools to streamline video editing and is useful for creating short, attention-grabbing clips. Powder.AI is a real-time gameplay automontage tool that can run locally on a Windows PC without cloud services.

While custom models, fine-tuning, reasoning and agentic frameworks can help with the selection of frames for a new condensed video clip, significant performance gains can also come from context-aware or thresholding parameters supplied to the algorithms. These parameters remain optional, but including them wherever possible, either to reduce the work or to enable deeper analysis, can significantly improve overall processing speed and accuracy.
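As an illustration of such optional thresholding parameters, the sketch below filters detected segments by label, confidence and duration before any further processing. The Detection record, its confidence field and the default values are assumptions made for this example; they are not taken from a specific model or service.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Set


@dataclass(frozen=True)
class Detection:
    """Hypothetical record for one labeled time segment in a video."""
    label: str
    start_s: float
    end_s: float
    confidence: float


def select_segments(
    detections: Iterable[Detection],
    labels_of_interest: Optional[Set[str]] = None,
    min_confidence: float = 0.6,
    min_duration_s: float = 0.0,
) -> List[Detection]:
    """Keep detections that pass every threshold; None for labels means 'any label'."""
    selected: List[Detection] = []
    for d in detections:
        if labels_of_interest is not None and d.label not in labels_of_interest:
            continue  # context filter: not a label this run cares about
        if d.confidence < min_confidence:
            continue  # too uncertain to be worth deeper analysis
        if d.end_s - d.start_s < min_duration_s:
            continue  # too brief to contribute to the condensed clip
        selected.append(d)
    return selected
```

Leaving every parameter at its default keeps the behavior close to unfiltered processing, which matches the premise that these parameters are optional.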

Reference: previous article: https://1drv.ms/w/c/d609fb70e39b65c8/EXBBHwTJngVMiCkRUA7rv0MBXhgxzpE4PyWz_8pbHH04cA?e=16wWYh

Sample:

import os
import time

import requests

# Replace these with your actual values
AZURE_VIDEO_INDEXER_API_URL = "https://api.videoindexer.ai"
AZURE_LOCATION = "westus2"  # e.g., "westus2"
AZURE_ACCOUNT_ID = "your-account-id"
AZURE_API_KEY = "your-api-key"
VIDEO_FILE_PATH = "path/to/your/video.mp4"


# Step 1: Get an access token
def get_access_token():
    url = f"{AZURE_VIDEO_INDEXER_API_URL}/auth/{AZURE_LOCATION}/Accounts/{AZURE_ACCOUNT_ID}/AccessToken"
    headers = {"Ocp-Apim-Subscription-Key": AZURE_API_KEY}
    # allowEdit=true requests a token with write permission, which the upload needs
    response = requests.get(url, headers=headers, params={"allowEdit": "true"})
    response.raise_for_status()
    # The token is returned as a JSON string, so strip the surrounding quotes
    return response.text.strip('"')


# Step 2: Upload video and start indexing
def upload_and_index_video(video_file_path, access_token):
    url = f"{AZURE_VIDEO_INDEXER_API_URL}/{AZURE_LOCATION}/Accounts/{AZURE_ACCOUNT_ID}/Videos"
    # Passing the query string through params lets requests URL-encode the file name
    params = {
        "name": os.path.basename(video_file_path),
        "accessToken": access_token,
        "privacy": "Private",
    }
    with open(video_file_path, "rb") as video_file:
        response = requests.post(url, params=params, files={"file": video_file})
    response.raise_for_status()
    return response.json()


# Step 3: Wait for indexing to complete and get insights
def get_video_insights(access_token, video_id, poll_seconds=10):
    url = f"{AZURE_VIDEO_INDEXER_API_URL}/{AZURE_LOCATION}/Accounts/{AZURE_ACCOUNT_ID}/Videos/{video_id}/Index"
    while True:
        response = requests.get(url, params={"accessToken": access_token})
        response.raise_for_status()
        data = response.json()
        state = data.get("state")
        if state == "Processed":
            return data
        if state == "Failed":
            # Stop polling instead of looping forever on a failed job
            raise RuntimeError(f"Indexing failed for video {video_id}")
        time.sleep(poll_seconds)  # Wait before checking again


# Step 4: Main workflow
if __name__ == "__main__":
    access_token = get_access_token()
    video_data = upload_and_index_video(VIDEO_FILE_PATH, access_token)
    video_id = video_data["id"]
    insights = get_video_insights(access_token, video_id)

    print("Video highlights and key insights:")
    print("=" * 50)

    # Extract highlights: keyframes, topics, and summarization.
    # The "themes" and "keyframes" keys are kept from the original sample and
    # are read defensively; verify them against the actual Index response,
    # which may expose labels and keyframes under different keys.
    themes = insights.get("summarizedInsights", {}).get("themes", [])
    if themes:
        for theme in themes:
            print(f"Theme: {theme.get('name')}")
            for highlight in theme.get("keyframes", []):
                print(f"  Keyframe at {highlight.get('adjustedStart')} to {highlight.get('adjustedEnd')}")
                print(f"  Thumbnail: {highlight.get('thumbnailId')}")
                print(f"  Description: {highlight.get('description', 'No description')}")
    else:
        print("No summarization available. See full insights:", insights)

Using the Azure AI Video Indexer interface:

https://videoindexer.ai/media/library

Upload and index (100%, 1 file uploaded)

File: main3-trim-fast-local

Video source language: English

Indexing preset: Standard video + audio

Included models: Audio effects, Closed captions, Keyframes, Audio transcription, Object detection, Text-based emotions, Named entities, Keywords, Visual labels, Character recognition (OCR), Rolling credits, Speakers, Topics

Excluded models: Face detection, Celebrities, Custom faces, Editorial shot type

Privacy: Private

Streaming quality: Single bitrate

Output:


main3-trim-fast-local

Duration: 00:09:10

97 segments selected

00:07:31 - 00:07:33: building
00:07:35 - 00:07:36: building
00:07:38 - 00:07:38: building
00:07:44 - 00:07:44: outdoor
00:07:48 - 00:07:54: building
00:07:49 - 00:08:04: outdoor
00:07:50 - 00:07:50: car
00:07:57 - 00:08:01: building
00:08:02 - 00:08:02: car, vehicle
00:08:04 - 00:08:05: building
00:08:11 - 00:08:11: building
00:08:12 - 00:08:12: aerial photography
00:08:13 - 00:08:13: text
00:08:16 - 00:08:16: text
00:08:19 - 00:08:19: building
00:08:21 - 00:08:24: building
00:08:22 - 00:08:22: text
00:08:24 - 00:08:25: outdoor
00:08:25 - 00:08:25: text
00:08:26 - 00:08:29: building
00:08:27 - 00:08:31: outdoor, text
00:08:28 - 00:08:28: car
00:08:31 - 00:08:38: building
00:08:34 - 00:08:36: outdoor
00:08:38 - 00:08:42: outdoor
00:08:39 - 00:08:39: window
00:08:40 - 00:08:45: building, text
00:08:46 - 00:08:46: text
00:08:51 - 00:08:54: building
00:08:52 - 00:09:09: outdoor
00:08:54 - 00:08:55: car
00:08:55 - 00:08:56: vehicle
00:08:57 - 00:09:09: building
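Many of the segments listed above overlap in time, so before cutting a condensed clip it helps to merge them into non-overlapping ranges. The following is a minimal sketch of that step, assuming the time ranges are available as "HH:MM:SS - HH:MM:SS" strings like those shown; the function names and the gap_s parameter are hypothetical and not part of Azure AI Video Indexer.

```python
from typing import Iterable, List, Tuple

Range = Tuple[int, int]  # (start, end) in whole seconds


def to_seconds(timestamp: str) -> int:
    """Convert 'HH:MM:SS' to a number of seconds."""
    hours, minutes, seconds = (int(part) for part in timestamp.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def parse_range(text: str) -> Range:
    """Parse 'HH:MM:SS - HH:MM:SS' into a (start, end) pair of seconds."""
    start, end = (piece.strip() for piece in text.split(" - "))
    return to_seconds(start), to_seconds(end)


def merge_ranges(ranges: Iterable[Range], gap_s: int = 0) -> List[Range]:
    """Merge ranges that overlap or lie within gap_s seconds of each other."""
    merged: List[Range] = []
    for start, end in sorted(ranges):
        if merged and start <= merged[-1][1] + gap_s:
            # Extend the previous range instead of starting a new one.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged
```

For example, the four listed segments that start between 00:07:48 and 00:07:57 collapse into the single range 00:07:48 to 00:08:04, so that stretch of video is extracted and analyzed once rather than four times.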


Friday, June 13, 2025
