Tuesday, September 23, 2025


  1. Dataset augmentation: While the drone world dataset comprises images and objects detected by the drones and extracted from their video, we could supplement it with thousands of geo-tagged images drawn primarily from (a) crowdsourcing, via user-contributed photographs of popular sites, and (b) overhead imagery from online mapping services. Each of these images can also be vectorized and tagged to support similarity search. Manual and automated download of overhead images from online mapping services is outside the scope of this study, but it could be leveraged to (a) remove the reliance on the GPS JSON download from the drones and (b) enable geo-unique cataloging of objects in the drone world database, eliminating duplicates and providing a mapping to real-world objects via location coordinates. The benefit of this augmented dataset is that it avoids the classification and confidence scoring required by alternative geolocation heuristics, such as consulting online datasets and services or correlating the GPS JSON index with image offsets or time offsets from the start of the drone's tour. Even an unambiguous mapping of two unique objects in a scene image to their real-world subjects is sufficient to populate the location information of all other objects in the scene by scaling.
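The scaling step in the last sentence can be sketched as follows. This is a minimal illustration, not part of the platform: it assumes a nadir image whose pixel axes align with longitude and latitude, so two reference objects suffice to fix a per-axis scale and offset; the function name and tuple layout are hypothetical.

```python
def geolocate_scene(ref1, ref2, others):
    """Populate world coordinates for remaining detections from two
    references. Each ref is ((x, y) pixel, (lon, lat) world); `others`
    is a list of pixel (x, y) points. Assumes an axis-aligned nadir
    view (an illustrative simplification)."""
    (x1, y1), (lon1, lat1) = ref1
    (x2, y2), (lon2, lat2) = ref2
    sx = (lon2 - lon1) / (x2 - x1)  # degrees of longitude per pixel
    sy = (lat2 - lat1) / (y2 - y1)  # degrees of latitude per pixel
    return [(lon1 + sx * (x - x1), lat1 + sy * (y - y1))
            for (x, y) in others]
```

A tilted or rotated view would instead need a full similarity or homography transform estimated from the same reference pairs.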

  2. World Drone Images dataset: While most image analysis models are based on CNN detection, the process of cataloging and classifying visible objects in public spaces, such as street signs, building facades, fire hydrants, solar panels, mailboxes, crosswalks, parking spaces, and vehicles, requires custom models or offline analytics that could be added later. The ability to do so is neither limited by the platform nor restricted by policy. In fact, given sufficient time and resources, such postprocessing could create a geo-tagged dataset as comprehensive as the text corpora on which LLMs are trained.
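One hedged sketch of such offline cataloging, assuming detections already carry location coordinates: detections of the same label that fall within the same small lat/lon cell are folded into one geo-unique entry. The cell size and record layout below are illustrative assumptions, not a prescribed schema.

```python
def catalog_objects(detections, cell_deg=1e-4):
    """Geo-unique catalog: detections of the same label within the same
    ~10 m lat/lon cell are treated as one real-world object.
    `detections` is a list of (label, lon, lat) tuples; `cell_deg`
    (the cell size in degrees) is an illustrative choice."""
    catalog = {}
    for label, lon, lat in detections:
        # Quantize coordinates to a grid cell to deduplicate sightings.
        key = (label, round(lon / cell_deg), round(lat / cell_deg))
        catalog.setdefault(key, (label, lon, lat))
    return list(catalog.values())
```

In practice the cell size would be tuned per object class, since a building facade and a fire hydrant have very different footprints.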

  3. Custom models: Together, 1 and 2 can help build reliable drone models of populated cities that can serve as the standard for recognizing most public objects and custom labels, independent of climate, country, or class. Dataset and model fuel each other, and in the world of drones they can only improve analytics, especially when there is a need to determine all variations of a given subject or to determine its age. Additionally, since our platform fosters model training and deployment in the cloud, it can leverage GPUs and TPUs that are otherwise unavailable in edge computing. For example, the cloud services may deploy generative adversarial networks (GANs) or semantic segmentation algorithms. Root mean square error thresholds and F-scores can standardize benchmarks for these images and objects.
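The two benchmark metrics named above can be computed as below; this is a generic sketch of the standard definitions, with RMSE applied to, say, geolocation error in meters and the F-score to detection counts.

```python
import math

def f_score(tp, fp, fn, beta=1.0):
    """F-beta score from detection counts: true positives, false
    positives, false negatives. beta=1.0 gives the usual F1."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)

def rmse(predicted, actual):
    """Root mean square error between two equal-length sequences,
    e.g. predicted vs. surveyed object positions along one axis."""
    n = len(predicted)
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / n)
```

A benchmark would then publish, for each object class, the F-score at a fixed IoU threshold and the RMSE of the recovered coordinates.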

  4. Architecture: The cloud computing infrastructure must meet the non-functional service-level agreements for:

     1. Real-time feedback, where the delay between the capture of an image and the arrival of its feedback does not exceed 500 ms, and inter-component cloud calls complete within 10 ms.

     2. Scalability, where dozens of objects detected in each of tens of thousands of images from each of thousands of drones do not degrade system performance.

     3. Energy efficiency, where the energy spent communicating with the cloud reduces the drone's flight time by no more than ten percent.

     4. Security, where the UAVs, the cloud, and the end-users can detect and prevent malicious attacks on the system.

     5. Safety, where the operation of the drones is unhindered and strategies are in place to limit damage to objects and scenes from failing communications and controls.

     6. Reliability, where failsafe strategies ensure continuity when disaster strikes.

Additionally, the cloud must offer all drone and user management capabilities and ensure connectivity, communication, authentication, and the availability of services. 
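The real-time SLA above can be enforced in code rather than merely monitored. A minimal sketch, assuming an asyncio-based cloud service: each inter-component call is wrapped in a 10 ms timeout, and the end-to-end path is checked against the 500 ms budget. The stage names and workloads are hypothetical placeholders.

```python
import asyncio
import time

FEEDBACK_BUDGET_MS = 500   # end-to-end image-to-feedback SLA
COMPONENT_BUDGET_MS = 10   # per inter-component call

async def call_component(step, work_ms):
    # Simulated cloud-component call; asyncio.wait_for raises
    # TimeoutError if the call exceeds its 10 ms budget.
    await asyncio.wait_for(asyncio.sleep(work_ms / 1000),
                           timeout=COMPONENT_BUDGET_MS / 1000)

async def feedback_loop():
    # Hypothetical stages on the image-to-feedback path.
    start = time.monotonic()
    for step in ("detect", "vectorize", "similarity_search"):
        await call_component(step, work_ms=1)
    elapsed_ms = (time.monotonic() - start) * 1000
    if elapsed_ms > FEEDBACK_BUDGET_MS:
        raise TimeoutError(f"feedback took {elapsed_ms:.0f} ms")
    return elapsed_ms
```

A production system would additionally export these timings as metrics so that SLA violations trigger scaling or shedding rather than only exceptions.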


#codingexercise: CodingExercise-09-23-2025.docx
