One recent development in object tracking involves tracking other drones across a scene, targets that are usually difficult to detect and track owing to their small size.
In the paper “Real-Time and Accurate Drone Detection in a Video with a Static Camera” (2020), researchers tackled the challenge of identifying drones that are visually hard to distinguish from the background, an "invisible" object problem under aerial imaging conditions. The method divides drone detection into two tasks: foreground segmentation and object recognition.
Foreground Segmentation: First, temporal median background subtraction highlights moving objects from static backgrounds in the video stream. This separates sources of movement—crucial when objects (like drones) are extremely small or camouflaged.
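As a rough illustration, here is a minimal sketch of temporal median background subtraction, assuming a short stack of grayscale frames and an arbitrary difference threshold (both are assumptions for the example, not the paper's parameters):

import cv2
import numpy as np

def median_foreground_masks(frames, threshold=30):
    # Background estimate: per-pixel temporal median over the frame stack
    background = np.median(np.stack(frames), axis=0).astype(np.uint8)
    masks = []
    for frame in frames:
        # Foreground: pixels that deviate from the background beyond a threshold
        diff = cv2.absdiff(frame, background)
        _, mask = cv2.threshold(diff, threshold, 255, cv2.THRESH_BINARY)
        masks.append(mask)
    return background, masks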
Fourier Descriptor Extraction: For each moving region (candidate object), the shape is analyzed using global Fourier descriptors (FD). The Fourier Transform is applied to the shape boundary, transforming spatial features into frequency space. This helps represent the contour and capture subtle variations, even if the drone is nearly indistinguishable to the naked eye or standard image processing.
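In code form, this boundary-to-frequency step is a single FFT over the contour points treated as complex numbers. A minimal sketch, assuming an OpenCV-style contour array (the full red-car example later in this post builds on the same idea):

import numpy as np

def shape_spectrum(contour):
    # Interpret each (x, y) boundary point as the complex number x + iy
    pts = contour.squeeze().astype(np.float64)
    z = pts[:, 0] + 1j * pts[:, 1]
    # The FFT turns the traced boundary into a frequency-domain signature;
    # low frequencies capture gross shape, high frequencies capture fine detail
    return np.fft.fft(z)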
Combined Features for Recognition: The Fourier descriptor (FD) is fused with local Histogram of Oriented Gradients (HOG) features to encode both shape and texture. This combination enables the algorithm to spot unique signatures of drones, distinct from birds, leaves, or noise.
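A sketch of one way such a fusion could look, assuming a fixed 64x64 grayscale patch; the HOG window, block, and cell sizes here are illustrative choices, not the paper's:

import cv2
import numpy as np

def fuse_fd_and_hog(image_patch, fd, num_fd=16):
    # HOG over a fixed-size grayscale patch encodes local gradient texture
    hog = cv2.HOGDescriptor((64, 64), (16, 16), (8, 8), (8, 8), 9)
    gray = cv2.cvtColor(image_patch, cv2.COLOR_BGR2GRAY)
    hog_vec = hog.compute(cv2.resize(gray, (64, 64))).ravel()
    # FD magnitudes encode global shape; concatenation gives shape + texture
    fd_vec = np.abs(fd[:num_fd])
    return np.concatenate([fd_vec, hog_vec])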
Classification with SVM: These feature vectors are then classified by a trained Support Vector Machine (SVM) to reliably distinguish drones—even when they’re visually “invisible” due to size or background similarity.
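A minimal training sketch using scikit-learn (the RBF kernel and parameter values are common defaults, not necessarily the paper's configuration):

import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def train_drone_classifier(feature_vectors, labels):
    # feature_vectors: fused FD+HOG rows; labels: 1 = drone, 0 = clutter (bird, leaf, noise)
    clf = make_pipeline(StandardScaler(), SVC(kernel='rbf', C=1.0, gamma='scale'))
    clf.fit(np.asarray(feature_vectors), np.asarray(labels))
    return clf  # clf.predict(new_vectors) yields drone / not-drone decisions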
Experimental Results: Testing across challenging scenes (see the DJI Phantom 4 dataset), the Fourier+HOG hybrid delivered 98% recognition accuracy, outpacing traditional shape-based and pixel-level classifiers.
Fourier-based tracking excels over spatial methods for low-visibility targets because it transforms object features from the spatial domain (raw pixels or shapes) into the frequency domain, making subtle structural patterns and periodicities more detectable—even when the object blends in visually with its background.
Fourier-based tracking is superior for the following reasons:
Extracts Global Shape Information:
The Fast Fourier Transform (FFT) analyzes the object's boundary or shape as a periodic signal, encoding the whole contour—even when parts of it are missing or indistinct. This enables detection of irregular or blurred target outlines that spatial methods, which rely on clear edges or textures, often miss.
Robustness to Noise and Occlusion:
Frequency-domain representations smooth over pixel-level noise and occlusions. Even if the target is partly hidden, its overall frequency "signature" may remain intact and distinguishable. Spatial algorithms (e.g., template matching, pixel edge detection) are much more sensitive to partial occlusion and background clutter.
Highlights Subtle Repetitive Features:
Many objects have underlying structural regularities—such as periodic contours, propeller movements, or patterned features—that are hard to spot in spatial scans but are amplified and isolated in the Fourier spectrum. This proves critical for drones or objects that are visually "invisible" or too small to distinguish by conventional means.
Invariant to Scale and Rotation:
Once normalized (dropping the DC term for translation, dividing by the first harmonic's magnitude for scale, and keeping only magnitudes for rotation), Fourier descriptors capture the essence of a shape regardless of its position, orientation, or size in the frame, making them more stable than pixel-wise methods that break down when the target moves, rotates, or changes scale (see the sketch after this list).
Complementary to Local Features:
While spatial descriptors focus on local texture or edge, Fourier-based tracking provides global structural validation. This combination delivers robust recognition, as seen in hybrid systems that yield higher accuracy rates in low-visibility cases.
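The invariance claim is easy to verify numerically. The sketch below builds an ellipse contour, then translates, scales, and rotates it, and checks that the normalized descriptor magnitudes are unchanged (the shape and transform values are arbitrary test data):

import numpy as np

def normalized_fd(points, k=16):
    z = points[:, 0] + 1j * points[:, 1]
    F = np.fft.fft(z)
    # Drop F[0] (translation), divide by |F[1]| (scale), keep magnitudes (rotation)
    return np.abs(F[1:k + 1]) / np.abs(F[1])

theta = np.linspace(0, 2 * np.pi, 128, endpoint=False)
ellipse = np.stack([40 * np.cos(theta), 20 * np.sin(theta)], axis=1)
angle = 0.7
R = np.array([[np.cos(angle), -np.sin(angle)], [np.sin(angle), np.cos(angle)]])
moved = 2.5 * ellipse @ R + np.array([100.0, 50.0])
print(np.allclose(normalized_fd(ellipse), normalized_fd(moved)))  # True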
FFT-based frequency analysis of object shapes improves the detection and tracking of drones—even when the drone blends into the background or is hard to distinguish in standard image space.
For example, the following code sketch tracks a red car across three consecutive images using the same contour-plus-Fourier-descriptor pipeline (file names are placeholders):
import cv2
import numpy as np

def extract_red_car_contour(image):
    # Convert to HSV and threshold to isolate red, which wraps around the hue axis
    hsv = cv2.cvtColor(image, cv2.COLOR_BGR2HSV)
    lower_red1 = np.array([0, 120, 70])
    upper_red1 = np.array([10, 255, 255])
    lower_red2 = np.array([160, 120, 70])
    upper_red2 = np.array([180, 255, 255])
    mask1 = cv2.inRange(hsv, lower_red1, upper_red1)
    mask2 = cv2.inRange(hsv, lower_red2, upper_red2)
    mask = cv2.bitwise_or(mask1, mask2)
    # Morphological opening removes speckle noise; closing fills small holes
    kernel = np.ones((5, 5), np.uint8)
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)
    mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, kernel)
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
    if contours:
        # Assume the largest red region is the car
        contour = max(contours, key=cv2.contourArea)
        return contour, mask
    return None, mask

def compute_fourier_descriptors(contour, num_descriptors=16):
    # Treat each boundary point (x, y) as the complex number x + iy
    contour = contour.squeeze()
    z = contour[:, 0] + 1j * contour[:, 1]
    fd = np.fft.fft(z)
    # Drop fd[0] (the translation-dependent DC term) and divide by the first
    # harmonic's magnitude for scale invariance; using magnitudes also makes
    # the descriptor invariant to rotation and starting point
    return np.abs(fd[1:num_descriptors + 1]) / np.abs(fd[1])

def match_descriptors(fd1, fd2):
    # Smaller distance means more similar shapes
    return np.linalg.norm(fd1 - fd2)

image_paths = ['frame9.jpg', 'frame10.jpg', 'frame11.jpg']  # Replace with actual filepaths
red_car_descriptors = []

for image_path in image_paths:
    image = cv2.imread(image_path)
    if image is None:
        continue  # Skip frames that fail to load
    contour, mask = extract_red_car_contour(image)
    # Require enough boundary points to support 16 harmonics
    if contour is not None and len(contour) > 32:
        fd = compute_fourier_descriptors(contour)
        red_car_descriptors.append(fd)
        # Draw the bounding box of the detected car and save the annotated frame
        x, y, w, h = cv2.boundingRect(contour)
        cv2.rectangle(image, (x, y), (x + w, y + h), (0, 255, 0), 2)
        cv2.imwrite(image_path.replace('.jpg', '-tracked.jpg'), image)
        cv2.imshow('Red Car Tracking', image)
        cv2.waitKey(3000)  # Show each frame for 3 seconds

cv2.destroyAllWindows()

for i in range(1, len(red_car_descriptors)):
    dist = match_descriptors(red_car_descriptors[i - 1], red_car_descriptors[i])
    print(f'Frame {i-1} to Frame {i} descriptor distance: {dist:.3f}')
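Because the descriptors are normalized, a small frame-to-frame distance indicates the same object even as the car changes apparent size or heading; in practice, a distance threshold (which would need tuning per scene) decides whether a detection is accepted as the tracked target.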