Cost calculation for drone video analysis:
The previous few articles described agentic retrieval for insights from aerial drone images. Each agent uses an LLM and consumes tokens for every query response. The KnowledgeAgent associated with the AI Search index with vector fields uses the "gpt-4o-mini" model, which performs automatic query decomposition and planning and yields a higher number of subqueries than the regular "gpt-4o" model. The response to each subquery or agent execution consumes tokens. At this point, it is helpful to calculate the cost of an end-to-end drone video analysis.
The following table breaks down the typical cost of the end-to-end workflow, from the user's upload of the drone video to the response on the chat interface associated with that video. It is assumed that the video, frames and associated artifacts are cleaned up at the end of the user session, so storage is not a significant factor in the cost calculations. The rest of the breakdown pertains to the stages of the processing->analytics->feedback control loop, which are already optimized to handle the minimum workload needed to move on to the next stage.
Activity | Cost projection (USD) |
Video Indexing (first pass) | $0.09 per minute for a typical 8-minute video = $0.72. Audio, at $0.024 per minute, is excluded. Trial accounts get up to 40 hours of free indexing. |
Video Indexing (second pass), with the video reduced to about a minute in duration | Twice the cost above |
Extracting frames from the indexed video for a minimal set of images to populate Drone World | The base cost for an Azure Function App is typically $135.27 per month on a P1V3 tier rather than an Elastic tier. Even assuming $0.40 per million executions and a free grant of 250,000 executions, the cost per end-to-end run is ~$0.92 |
Vectorizing and analyzing each image | Assuming dense captions, 3 transactions per image for analysis, a base rate of $1.50 per 1000 transactions and at least 30 images per video to generate embeddings for, this comes to about $0.10 |
Uploading and saving vectors in the Azure AI Search index | A 1536-dimension vector is ~6 KB per image, resulting in 6 MB of vector data for 1000 images and an additional 2 MB for JSON insights. A single resource can host up to 200 indexes and, assuming one index per user, the cost is about $75 per month. The cost of running the semantic ranker is about $1 per 1000 queries. So the net cost for, say, 30 images without any vectorization of individual objects within the image is about $0.37 |
Agentic retrieval with the KnowledgeAgent and connected agents for Azure AI Search and function calling, for a search spanning 30 image vectors and associated analysis | Since the Azure AI Search and Function App costs are already counted above, the cost here comes entirely from the models and deployments. The gpt-4o-mini and text-embedding-ada-002 deployments, and their usage by each agent at about 6 runs per query, correspond to 128K tokens; at a rate of $0.10 per 1M tokens, this comes to a net of about $0.07 per user query. |
Preserving chat history and its usage in a tracking agent for the user based on the user session | This is not optional, as tracking conversations is considered helpful to any interactive analysis. It can be taken to be at most double the cost above. |
In short, the end-to-end consumption results in a usage-based cost of about $0.98 per video.
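Some of the per-row arithmetic in the table can be reproduced with a short script. The following is a minimal sketch, not a pricing tool: the function names are illustrative, the rates are the ones quoted in the table, and two inputs are assumptions of this sketch: 4 bytes per vector component (float32), and reading the 128K tokens as a per-run figure.

```python
# Minimal sketch of the per-row arithmetic from the cost table.
# Rates are those quoted in the table; function names are illustrative.

def first_pass_indexing_cost(minutes: float = 8, rate_per_minute: float = 0.09) -> float:
    """Video indexing cost for the first pass, audio excluded."""
    return minutes * rate_per_minute


def image_analysis_cost(images: int = 30,
                        transactions_per_image: int = 3,
                        rate_per_1000: float = 1.50) -> float:
    """Cost of analysis transactions across all extracted frames."""
    return images * transactions_per_image * rate_per_1000 / 1000


def vector_size_bytes(dimensions: int = 1536, bytes_per_component: int = 4) -> int:
    """Size of one embedding; 4 bytes per component (float32) is an assumption."""
    return dimensions * bytes_per_component


def token_cost(tokens_per_run: int = 128_000,
               runs_per_query: int = 6,
               rate_per_million: float = 0.10) -> float:
    """Model cost per user query; treating 128K as per-run is an assumption."""
    return tokens_per_run * runs_per_query * rate_per_million / 1_000_000


if __name__ == "__main__":
    print(f"First-pass indexing: ${first_pass_indexing_cost():.2f}")
    print(f"Image analysis:      ${image_analysis_cost():.3f}")
    print(f"Vector size:         {vector_size_bytes()} bytes per image")
    print(f"Tokens per query:    ${token_cost():.4f}")
```

With these inputs the image-analysis row works out to $0.135, a little above the table's rounded $0.10, and the token row to about $0.077 per query, in line with the table's $0.07. The Function App and Azure AI Search rows depend on how the monthly base costs are amortized per run, which the table does not spell out, so they are left out of the sketch.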
#codingexercise: CodingExercise-07-15-2025.docx