A previous document discussed ways to improve the performance of detecting objects in drone images and cataloging them in the drone world database. A drone video clip consists of many overlapping frames, so repeating the vectorization and analysis of every image, and of every object within it, is not only time-consuming and expensive but also unnecessary, because duplicates and previously visited objects can be recognized. That approach still advances frame by frame but skips the processing wherever possible.
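The skip-wherever-possible idea above can be illustrated with a minimal sketch. It assumes frames are already decoded into grayscale pixel grids of at least 8 x 8 pixels, and it swaps in a simple average hash as the duplicate test; the names average_hash, hamming, frames_to_process, and the max_distance threshold are hypothetical and not taken from the previous article.

```python
from typing import Iterable, Iterator, Sequence, Tuple

Frame = Sequence[Sequence[int]]  # grayscale pixels, rows of values in 0..255


def average_hash(frame: Frame, size: int = 8) -> int:
    """Split the frame into size x size blocks and set one bit per block
    whose mean brightness is above the mean of all blocks.
    Assumes the frame is at least size pixels wide and tall."""
    rows, cols = len(frame), len(frame[0])
    means = []
    for by in range(size):
        for bx in range(size):
            y0, y1 = by * rows // size, (by + 1) * rows // size
            x0, x1 = bx * cols // size, (bx + 1) * cols // size
            block = [frame[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            means.append(sum(block) / len(block))
    overall = sum(means) / len(means)
    bits = 0
    for mean in means:
        bits = (bits << 1) | (1 if mean > overall else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


def frames_to_process(
    frames: Iterable[Frame], max_distance: int = 5
) -> Iterator[Tuple[int, Frame]]:
    """Advance frame by frame, but yield only frames whose hash differs from
    the last yielded frame by more than max_distance bits."""
    last_hash = None
    for index, frame in enumerate(frames):
        current = average_hash(frame)
        if last_hash is None or hamming(current, last_hash) > max_distance:
            last_hash = current
            yield index, frame
```

Only the yielded frames would go on to vectorization and object analysis. Comparing against the last processed frame, rather than against every earlier frame, is a simplification that suits overlapping consecutive frames but would not catch a scene revisited later in the flight.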
Another approach is to statistically sample temporally distributed images from the drone world based on the speed of the drone, its flight pattern, GPS coordinates, and timestamps, but this additional information may need to be sourced externally. While some drones provide correlation keys and additional information can be found online, our approach does not require this optional information.
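As a rough illustration of speed-dependent sampling, the sketch below picks frames so that consecutive picks are spaced by the time the drone needs to cover a fraction of its ground footprint. The parameters footprint_m (along-track ground coverage of one frame), overlap, and max_interval_s are assumptions introduced for this example, and the per-frame speeds are presumed to come from whatever telemetry is available.

```python
from typing import List, Sequence


def sample_interval_seconds(
    speed_mps: float,
    footprint_m: float,
    overlap: float = 0.2,
    max_interval_s: float = 5.0,
) -> float:
    """Seconds to wait before the next sample so that consecutive samples
    share roughly `overlap` of the ground footprint. A hovering drone
    (speed 0) falls back to max_interval_s."""
    if not 0 <= overlap < 1:
        raise ValueError("overlap must be in [0, 1)")
    if speed_mps <= 0:
        return max_interval_s
    return min(max_interval_s, footprint_m * (1 - overlap) / speed_mps)


def select_frames(
    timestamps: Sequence[float],
    speeds: Sequence[float],
    footprint_m: float,
    overlap: float = 0.2,
    max_interval_s: float = 5.0,
) -> List[int]:
    """Return indices of frames to analyze, given per-frame timestamps
    (seconds) and ground speeds (meters per second)."""
    selected: List[int] = []
    next_time = None
    for index, (t, v) in enumerate(zip(timestamps, speeds)):
        if next_time is None or t >= next_time:
            selected.append(index)
            next_time = t + sample_interval_seconds(
                v, footprint_m, overlap, max_interval_s
            )
    return selected
```

A faster drone yields a shorter interval and therefore denser sampling, while a hovering drone is sampled only at the capped interval.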
A different way to improve performance is to run AI models that determine the highlights of a drone video, which reduces the amount of video that must be split into frames and analyzed while still maintaining high precision and recall for drone world objects. Several commercial tools generate highlights from videos. VEED.IO is an online tool that uses AI to extract the best moments from videos and turn them into highlights; it also lets you trim clips, rearrange footage, and add text or music. OpusClip is an AI-driven highlight video maker that automatically selects the most engaging moments from the footage; it is designed to save time and enhance video quality. Pictory specializes in creating highlight reels from long videos and also offers automatic captioning and background music integration. Kapwing is a versatile online editor that integrates AI tools to streamline video editing and is useful for creating short, attention-grabbing clips. Powder.AI is a real-time gameplay automontage tool that can run locally on a Windows PC without cloud services.
While custom models, fine-tuning, reasoning, and agentic frameworks can help select frames for a new condensed video clip, significant performance gains can also come from context-aware or thresholding parameters supplied to the algorithms. These parameters remain optional, but including them wherever possible, either to reduce the work or to enable deeper analysis, can significantly improve overall processing speed and accuracy.
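One way to picture such optional thresholding parameters is a filter applied to detected segments before any deeper analysis. This is only a sketch: the Segment shape, the confidence field, and the default thresholds are hypothetical and do not correspond to a specific service's output.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Set


@dataclass
class Segment:
    label: str
    start_s: float
    end_s: float
    confidence: float  # assumed to be in 0..1


def filter_segments(
    segments: Iterable[Segment],
    min_confidence: float = 0.6,
    min_duration_s: float = 1.0,
    labels_of_interest: Optional[Set[str]] = None,
) -> List[Segment]:
    """Keep segments that pass every threshold. Passing None for
    labels_of_interest keeps all labels, so each parameter stays optional."""
    kept = []
    for seg in segments:
        if seg.confidence < min_confidence:
            continue
        if seg.end_s - seg.start_s < min_duration_s:
            continue
        if labels_of_interest is not None and seg.label not in labels_of_interest:
            continue
        kept.append(seg)
    return kept
```

Tightening the thresholds reduces the work downstream, while loosening them or passing labels_of_interest=None allows a deeper pass over more of the video.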
Reference: previous article: https://1drv.ms/w/c/d609fb70e39b65c8/EXBBHwTJngVMiCkRUA7rv0MBXhgxzpE4PyWz_8pbHH04cA?e=16wWYh
Sample:
import os
import time

import requests

# Replace these with your actual values
AZURE_VIDEO_INDEXER_API_URL = "https://api.videoindexer.ai"
AZURE_LOCATION = "westus2"  # e.g., "westus2"
AZURE_ACCOUNT_ID = "your-account-id"
AZURE_API_KEY = "your-api-key"
VIDEO_FILE_PATH = "path/to/your/video.mp4"


# Step 1: Get an access token
def get_access_token():
    url = f"{AZURE_VIDEO_INDEXER_API_URL}/auth/{AZURE_LOCATION}/Accounts/{AZURE_ACCOUNT_ID}/AccessToken"
    headers = {"Ocp-Apim-Subscription-Key": AZURE_API_KEY}
    # allowEdit=true requests a token that is permitted to upload videos
    response = requests.get(url, headers=headers, params={"allowEdit": "true"})
    response.raise_for_status()
    # The token comes back as a quoted JSON string
    return response.text.strip('"')


# Step 2: Upload video and start indexing
def upload_and_index_video(video_file_path, access_token):
    url = f"{AZURE_VIDEO_INDEXER_API_URL}/{AZURE_LOCATION}/Accounts/{AZURE_ACCOUNT_ID}/Videos"
    # Passing the query as params lets requests URL-encode the file name
    params = {
        "name": os.path.basename(video_file_path),
        "accessToken": access_token,
        "privacy": "Private",
    }
    with open(video_file_path, "rb") as video_file:
        response = requests.post(url, params=params, files={"file": video_file})
    response.raise_for_status()
    return response.json()


# Step 3: Wait for indexing to complete and get insights
def get_video_insights(access_token, video_id, poll_seconds=10):
    url = f"{AZURE_VIDEO_INDEXER_API_URL}/{AZURE_LOCATION}/Accounts/{AZURE_ACCOUNT_ID}/Videos/{video_id}/Index"
    while True:
        response = requests.get(url, params={"accessToken": access_token})
        response.raise_for_status()
        data = response.json()
        state = data.get("state")
        if state == "Processed":
            return data
        if state == "Failed":
            # Stop polling instead of looping forever on a failed job
            raise RuntimeError(f"Indexing failed for video {video_id}")
        time.sleep(poll_seconds)  # Wait before checking again


# Step 4: Main workflow
if __name__ == "__main__":
    access_token = get_access_token()
    video_data = upload_and_index_video(VIDEO_FILE_PATH, access_token)
    video_id = video_data["id"]
    insights = get_video_insights(access_token, video_id)

    print("Video highlights and key insights:")
    print("=" * 50)

    # Extract highlights: keyframes, topics, and summarization.
    # The 'themes' and 'keyframes' field names are assumptions; check them
    # against the JSON your account returns. .get() avoids a KeyError when
    # a field is absent.
    themes = insights.get("summarizedInsights", {}).get("themes")
    if themes:
        for theme in themes:
            print(f"Theme: {theme.get('name', 'unnamed')}")
            for highlight in theme.get("keyframes", []):
                print(f"  Keyframe at {highlight.get('adjustedStart')} to {highlight.get('adjustedEnd')}")
                print(f"  Thumbnail: {highlight.get('thumbnailId')}")
                print(f"  Description: {highlight.get('description', 'No description')}")
    else:
        print("No summarization available. See full insights:", insights)
Using the Azure AI Video Indexer interface:
https://videoindexer.ai/media/library
Upload and index
Upload status: 100% (1 file uploaded)
File: main3-trim-fast-local
Video source language: English
Indexing preset: Standard video + audio
Included models: Audio effects, Closed captions, Keyframes, Audio transcription, Object detection, Text-based emotions, Named entities, Keywords, Visual labels, Character recognition (OCR), Rolling credits, Speakers, Topics
Excluded models: Face detection, Celebrities, Custom faces, Editorial shot type
Privacy: Private
Streaming quality: Single bitrate
Output:
main3-trim-fast-local
Duration: 00:09:10
97 segments selected
Time range           Labels
00:07:31 - 00:07:33  building
00:07:35 - 00:07:36  building
00:07:38 - 00:07:38  building
00:07:44 - 00:07:44  outdoor
00:07:48 - 00:07:54  building
00:07:49 - 00:08:04  outdoor
00:07:50 - 00:07:50  car
00:07:57 - 00:08:01  building
00:08:02 - 00:08:02  car, vehicle
00:08:04 - 00:08:05  building
00:08:11 - 00:08:11  building
00:08:12 - 00:08:12  aerial photography
00:08:13 - 00:08:13  text
00:08:16 - 00:08:16  text
00:08:19 - 00:08:19  building
00:08:21 - 00:08:24  building
00:08:22 - 00:08:22  text
00:08:24 - 00:08:25  outdoor
00:08:25 - 00:08:25  text
00:08:26 - 00:08:29  building
00:08:27 - 00:08:31  outdoor, text
00:08:28 - 00:08:28  car
00:08:31 - 00:08:38  building
00:08:34 - 00:08:36  outdoor
00:08:38 - 00:08:42  outdoor
00:08:39 - 00:08:39  window
00:08:40 - 00:08:45  building, text
00:08:46 - 00:08:46  text
00:08:51 - 00:08:54  building
00:08:52 - 00:09:09  outdoor
00:08:54 - 00:08:55  car
00:08:55 - 00:08:56  vehicle
00:08:57 - 00:09:09  building
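The labeled segments above overlap heavily, so a condensed clip would need them merged into a smaller set of time windows before frames are extracted. The sketch below is one possible way to do that, assuming the ranges are copied as "HH:MM:SS - HH:MM:SS" strings; to_seconds, parse_range, merge_ranges, and the max_gap_s parameter are hypothetical helpers, not part of Azure AI Video Indexer.

```python
from typing import Iterable, List, Tuple


def to_seconds(hms: str) -> int:
    """Convert 'HH:MM:SS' to a number of seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def parse_range(text: str) -> Tuple[int, int]:
    """Parse 'HH:MM:SS - HH:MM:SS' into (start_seconds, end_seconds)."""
    start, end = text.split(" - ")
    return to_seconds(start), to_seconds(end)


def merge_ranges(
    ranges: Iterable[Tuple[int, int]], max_gap_s: int = 1
) -> List[Tuple[int, int]]:
    """Merge ranges that overlap or are separated by at most max_gap_s seconds."""
    merged: List[List[int]] = []
    for start, end in sorted(ranges):
        if merged and start - merged[-1][1] <= max_gap_s:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [(start, end) for start, end in merged]


if __name__ == "__main__":
    # The first six ranges from the listing above
    listing = [
        "00:07:31 - 00:07:33",
        "00:07:35 - 00:07:36",
        "00:07:38 - 00:07:38",
        "00:07:44 - 00:07:44",
        "00:07:48 - 00:07:54",
        "00:07:49 - 00:08:04",
    ]
    print(merge_ranges((parse_range(r) for r in listing), max_gap_s=2))
    # → [(451, 458), (464, 464), (468, 484)]
```

A larger max_gap_s produces fewer, longer windows; only frames inside the merged windows would then be split out and analyzed.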