Tuesday, December 9, 2025

TPC-H for aerial drone image analytics

This is a proposal for a domain-adapted benchmark: take the TPC-H v3 decision-support queries (which stress-test OLAP systems with business-oriented data-warehouse workloads) and reframe them for aerial drone image analytics. The result would be a standardized way to evaluate drone video/image pipelines with SQL-like queries, but grounded in geospatial and vision tasks.

Step 1. Schema adaptation:

The TPC-H schema has tables like CUSTOMER, ORDERS, and LINEITEM. For drone imagery, we'd define analogous tables:

IMAGE: metadata for aerial images (id, timestamp, location, altitude, sensor type).

OBJECT_DETECTION: detected objects (image_id, object_type, bounding_box, orientation, confidence).

TRACKING: temporal sequences (track_id, object_id, trajectory, speed, direction).

EVENTS: higher-level events (traffic jam, unauthorized entry, wildfire hotspot).

REGIONS: geospatial polygons (urban, rural, restricted zones).
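A minimal sketch of this schema in SQLite (the exact column names and types beyond the fields listed above are assumptions for illustration, not part of the proposal):

```python
import sqlite3

# Hypothetical DDL for the five benchmark tables; types and extra
# columns (e.g. lat/lon split, WKT polygons) are assumptions.
DDL = """
CREATE TABLE IMAGE (
    image_id    INTEGER PRIMARY KEY,
    timestamp   TEXT,     -- ISO-8601 capture time
    lat         REAL,
    lon         REAL,
    altitude_m  REAL,
    sensor_type TEXT      -- e.g. 'RGB', 'thermal'
);
CREATE TABLE OBJECT_DETECTION (
    detection_id INTEGER PRIMARY KEY,
    image_id     INTEGER REFERENCES IMAGE(image_id),
    object_type  TEXT,    -- e.g. 'vehicle', 'pedestrian'
    bbox         TEXT,    -- 'x,y,w,h' in pixels
    orientation  REAL,    -- heading in degrees
    confidence   REAL     -- detector score in [0, 1]
);
CREATE TABLE TRACKING (
    track_id   INTEGER,
    object_id  INTEGER,
    trajectory TEXT,      -- encoded polyline
    speed_mps  REAL,
    direction  REAL
);
CREATE TABLE EVENTS (
    event_id   INTEGER PRIMARY KEY,
    event_type TEXT,      -- 'traffic_jam', 'unauthorized_entry', ...
    severity   INTEGER,
    region_id  INTEGER
);
CREATE TABLE REGIONS (
    region_id INTEGER PRIMARY KEY,
    name      TEXT,
    zone_type TEXT,       -- 'urban', 'rural', 'restricted'
    polygon   TEXT        -- WKT polygon
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(DDL)
tables = [r[0] for r in conn.execute(
    "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name")]
print(tables)  # ['EVENTS', 'IMAGE', 'OBJECT_DETECTION', 'REGIONS', 'TRACKING']
```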

Step 2. Query adaptation:

The following table lists the adaptations (TPC-H query → original purpose → drone analytics adaptation):

Q1 (Pricing Summary Report: aggregate line items by date) → Detection Summary Report: count objects per type per region per day (e.g., vehicles, aircraft).

Q3 (Shipping Priority: orders with high priority) → Event Priority: identify urgent drone-detected events (e.g., accidents, intrusions) sorted by severity.

Q5 (Local Supplier Volume: join across regions) → Regional Object Volume: join detections with regions to compute density of vehicles/people per zone.

Q7 (Volume Shipping: compare nations) → Cross-Region Traffic Flow: compare object counts across multiple geospatial regions over time.

Q8 (Market Share: share of a supplier) → Model Share: compare detection-accuracy share between different drone models or sensors.

Q9 (Product Profit: profit by supplier) → Event Cost Impact: estimate resource usage (battery, bandwidth) per event type.

Q10 (Top Customers: identify top customers) → Top Hotspots: identify the regions with the highest frequency of detected anomalies.

Q12 (Shipping Modes: distribution by mode) → Flight Modes: distribution of detections by drone altitude or flight mode.

Q13 (Customer Distribution: count customers by orders) → Object Distribution: count detections by object type (cars, pedestrians, aircraft).

Q15 (Top Supplier: best supplier) → Top Detector: identify the best-performing detection algorithm (highest precision/recall).

Q18 (Large Volume Customer: customers with large orders) → Large Volume Region: regions with unusually high detection counts (e.g., traffic congestion).
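As a sketch of how one adaptation could look in executable form, here is the Q3-style "Event Priority" query over a toy EVENTS table (the column names, severity scale, and sample rows are all assumptions; SQLite is used only for portability):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical EVENTS layout with an integer severity column (assumption).
conn.execute("CREATE TABLE EVENTS (event_id INTEGER, event_type TEXT, "
             "severity INTEGER, detected_at TEXT)")
conn.executemany(
    "INSERT INTO EVENTS VALUES (?, ?, ?, ?)",
    [
        (1, "accident",  5, "2025-12-02T08:14:00"),
        (2, "intrusion", 4, "2025-12-02T08:20:00"),
        (3, "loitering", 1, "2025-12-02T09:01:00"),
    ],
)
# Event Priority (adapted Q3): urgent events first, mirroring
# TPC-H's shipping-priority ordering.
urgent = conn.execute(
    "SELECT event_type, severity FROM EVENTS "
    "WHERE severity >= 3 ORDER BY severity DESC"
).fetchall()
print(urgent)  # [('accident', 5), ('intrusion', 4)]
```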

Step 3. Metrics and Evaluations:

Just like TPC-H measures query response time, throughput, and power, the drone benchmark would measure:

Query Latency: Time to answer detection/tracking queries.

Throughput: Number of queries processed per minute across drone streams.

Accuracy Metrics: Precision, recall, mAP for detection queries.

Spatial-Temporal Efficiency: Ability to handle joins across time and geospatial regions.

Resource Utilization: CPU/GPU load, bandwidth usage, battery impact.

Step 4. Sample query:

This query evaluates object detection density per region per week, analogous to TPC-H’s line item aggregation:

SELECT
    r.region_id,
    od.object_type,
    COUNT(*) AS object_count,
    AVG(od.confidence) AS avg_confidence
FROM OBJECT_DETECTION od
JOIN REGIONS r ON ST_Within(od.location, r.polygon)
WHERE od.timestamp BETWEEN '2025-12-01' AND '2025-12-07'
GROUP BY r.region_id, od.object_type
ORDER BY object_count DESC;

(ST_Within is the PostGIS-style containment predicate; engines without spatial SQL would need an equivalent point-in-polygon join.)
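Plain SQLite has no spatial containment predicate, so a self-contained sketch of the same aggregation can approximate regions as bounding rectangles (the rectangle columns and toy rows below are assumptions made purely for the demo):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE REGIONS (region_id INTEGER, min_lat REAL, max_lat REAL,
                      min_lon REAL, max_lon REAL);
CREATE TABLE OBJECT_DETECTION (object_type TEXT, confidence REAL,
                               lat REAL, lon REAL, timestamp TEXT);
""")
conn.execute("INSERT INTO REGIONS VALUES (1, 0, 1, 0, 1)")
conn.executemany(
    "INSERT INTO OBJECT_DETECTION VALUES (?, ?, ?, ?, ?)",
    [
        ("vehicle",    0.9, 0.5, 0.5, "2025-12-03"),
        ("vehicle",    0.7, 0.2, 0.8, "2025-12-05"),
        ("pedestrian", 0.8, 0.1, 0.1, "2025-12-06"),
        ("vehicle",    0.6, 5.0, 5.0, "2025-12-04"),  # outside region 1
    ],
)
# Rectangle containment stands in for the polygon predicate.
rows = conn.execute("""
    SELECT r.region_id, od.object_type,
           COUNT(*) AS object_count,
           AVG(od.confidence) AS avg_confidence
    FROM OBJECT_DETECTION od
    JOIN REGIONS r
      ON od.lat BETWEEN r.min_lat AND r.max_lat
     AND od.lon BETWEEN r.min_lon AND r.max_lon
    WHERE od.timestamp BETWEEN '2025-12-01' AND '2025-12-07'
    GROUP BY r.region_id, od.object_type
    ORDER BY object_count DESC
""").fetchall()
print(rows)  # region 1: 2 vehicles, 1 pedestrian; outlier excluded
```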

Future:

This benchmark would make drone analytics pipelines reproducible and comparable, giving vendors a standardized way to evaluate drone video systems and pipelines against one another. It stress-tests geospatial joins, temporal queries, and detection accuracy at scale. We could call it the Drone-Analytics Benchmark.

References:

• Full Specification: https://1drv.ms/w/c/d609fb70e39b65c8/EXuckQNUpo9MowxSWSkeaA8Bm1f-ADuTaPf_GrOPLKBMPg?e=uoA10o


