Sunday, May 17, 2026

AI safety and security are primary concerns for emerging GenAI applications. Organizations treat defense in depth as the preferred path to stronger security for AI solutions. They also solicit feedback from security researchers through programs such as AI red teaming and bug bounties to better protect their customers. The following section outlines some other best practices; they are advisory, not a mandate.

As these GenAI applications become popular as productivity tools, the speed of AI releases and the acceleration of adoption must be matched by improvements to existing SecOps techniques. The security-first processes needed to detect and respond to AI risks and threats effectively include visibility, zero critical risks, democratization, and prevention techniques. The risks in question are: data poisoning, which alters training data so that predictions become erroneous; model theft, in which proprietary AI models are copied in violation of intellectual property rights; adversarial attacks, which craft inputs that make a model hallucinate; model inversion attacks, which send queries that cause data exfiltration; and supply chain vulnerabilities, which exploit weaknesses in the components an AI system depends on.

The best practices leverage the new SecOps techniques and mitigate the risks with:

Achieving full visibility by removing shadow AI, which refers to AI that is unauthorized or unaccounted for. An AI bill of materials helps here, as does configuring the network so that only allow-listed GenAI providers and software are reachable. Employees must also be trained in a security-first mindset.

Protecting both the training and inference data by discovering and classifying the data according to its security criticality, encrypting data at rest and in transit, sanitizing or masking sensitive information, configuring data loss prevention policies, and maintaining a full view of the data, including its origin and lineage.

Securing access to GenAI models by setting up authentication and rate limiting for API usage, restricting access to model weights, and allowing only required users to kickstart model training and deployment pipelines.

Using guardrails built into the LLM, such as content filtering to automatically remove or flag inappropriate or harmful content, abuse detection mechanisms to uncover and mitigate general model misuse, and temperature settings to tune the randomness of AI output to the desired predictability.

Detecting and removing AI risks and attack paths by continuously scanning for vulnerabilities in AI models, verifying that all systems and components have the most recent patches to close known vulnerabilities, scanning for malicious models, assessing AI misconfigurations, effective permissions, network resources, exposed secrets, and sensitive data to detect attack paths, regularly auditing access controls to enforce authorization and least-privilege principles, and providing context around AI risks, along with remediation guidance, so that attack paths to models can be removed proactively.

Monitoring for anomalies by using detection and analytics at both input and output, detecting suspicious behavior in pipelines, keeping track of unexpected spikes in latency and other system metrics, and supporting regular security audits and assessments.

Setting up incident response by defining processes for isolation, backup, traffic control, and rollback, integrating with SecOps tools, and maintaining an AI-focused incident response plan.
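
As an illustration of the rate-limiting practice listed above, the following is a minimal token-bucket sketch in Python. The class name TokenBucket, its parameters, and the injectable clock are assumptions made for this example; a production gateway would normally rely on the rate limiting built into its API management layer.

```python
import time


class TokenBucket:
    """Allow short bursts up to `capacity`, refilled at `rate_per_sec` tokens/second."""

    def __init__(self, rate_per_sec, capacity, clock=time.monotonic):
        self.rate = float(rate_per_sec)
        self.capacity = float(capacity)
        self.tokens = float(capacity)   # start with a full bucket
        self.clock = clock              # injectable so the limiter can be tested
        self.last = clock()

    def allow(self, cost=1.0):
        """Return True and consume `cost` tokens if the request may proceed."""
        now = self.clock()
        elapsed = now - self.last
        self.last = now
        # Refill in proportion to elapsed time, never above capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


# One bucket per API key (hypothetical key names).
buckets = {}


def check_request(api_key, rate_per_sec=5, capacity=10):
    bucket = buckets.setdefault(api_key, TokenBucket(rate_per_sec, capacity))
    return bucket.allow()
```

A caller would reject or queue the model request whenever check_request returns False for the presented key.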

In this way, existing SecOps practices that leverage well-known STRIDE threat modeling and the Assets, Activity Matrix, and Actions chart can be extended with enhancements and techniques specific to GenAI.
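
The masking step mentioned under data protection can be sketched as a simple regex-based redaction pass applied before text reaches a training set or a prompt. The patterns below are illustrative assumptions only: the email pattern is deliberately loose, and the "sk-" secret format is hypothetical. Real data loss prevention tooling uses far broader detectors.

```python
import re

# Loose email pattern; an assumption for illustration, not a full RFC 5322 matcher.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

# Hypothetical secret format: "sk-" followed by at least 16 alphanumerics.
SECRET_RE = re.compile(r"\bsk-[A-Za-z0-9]{16,}\b")


def mask_sensitive(text):
    """Replace matched emails and secrets with fixed placeholders."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    text = SECRET_RE.sub("[SECRET]", text)
    return text


print(mask_sensitive("Contact alice@example.com with key sk-abcdefghijklmnop1234"))
```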


Saturday, May 16, 2026

The current phase of the AI agent economy is defined by a tension between undeniable productivity gains and uneven monetization, a pattern made clear in recent industry reviews. Across tens of thousands of surveyed users, the strongest signal is that AI is already expanding the amount and type of work individuals can complete. Users report “substantially more productive” outcomes, with 48 percent citing expanded scope of work and 40 percent citing faster execution. These gains are real, measurable, and broadly distributed, yet they do not automatically translate into durable revenue for the companies building these systems. The market is now shifting from hype-driven visibility to a more sober evaluation of where AI actually changes operating leverage.

Commercial traction is emerging most clearly in enterprise environments where workflows are frequent, outcomes are quantifiable, and cost structures are well understood. Customer support illustrates this dynamic: organizations with high ticket volumes and predictable service metrics can immediately measure the impact of automation on cost per interaction. Even modest deflection rates of 20 to 50 percent materially improve margins at scale, making support automation one of the earliest reliable revenue categories. Similar logic applies to sales and revenue operations, where AI agents that automate CRM updates, summarize calls, or draft follow‑ups increase productive selling hours without increasing headcount. In engineering and internal operations, the value proposition is even more direct because skilled labor is expensive and capacity constrained. Tools that reduce debugging time or accelerate documentation by even 20 to 40 percent can outperform many back‑office use cases despite smaller user counts.

The reviews emphasize that Southeast Asia’s SME landscape may represent an underappreciated opportunity. Small and medium enterprises in the region often operate with lean teams and fragmented systems, making AI agents for invoicing, scheduling, multilingual messaging, and collections immediately valuable. These are environments where owner‑level productivity gains translate directly into willingness to pay. The broader pattern is consistent: enterprises pay for AI when it improves labor efficiency, shortens cycles, or generates measurable operating returns.

At the same time, the labor implications are complex. Productivity gains do not necessarily reduce anxiety about job security. The survey shows that roughly one‑fifth of respondents fear displacement, with early‑career workers expressing the highest concern. One article notes that “users who reported the largest speed gains… were also among the most concerned about job loss.” This creates a two‑speed labor market in which junior and repetitive tasks are automated first, potentially compressing the traditional pipeline through which future managers and specialists develop. The next phase of value creation may therefore come not from replacing workers but from enabling one skilled employee to manage the output of multiple AI systems.

Where hype outpaces revenue, the pattern is equally clear. Consumer‑facing general agents attract attention and experimentation, but retention is inconsistent and pricing power is weak. As foundation models improve, standalone wrappers with limited differentiation face increasing pressure. Products with high inference costs but low willingness to pay may show strong usage while generating weak margins. The market increasingly rewards repeat usage, clear ROI, and defensible workflow integration rather than viral adoption.

From an investor perspective, the next winners may appear less glamorous but more economically durable. Metrics such as fast payback periods, high usage frequency, low churn, expansion revenue, proprietary data loops, and strong margins are the most reliable signals of long‑term value. Products embedded deeply into CRM, ERP, ticketing, finance, or operational systems create switching costs that general assistants cannot match. Vertical AI in healthcare administration, legal review, finance operations, logistics, and industrial workflows may therefore outperform broader consumer‑oriented tools.

The survey data also indicate that the majority of AI’s current surplus accrues to individuals rather than institutions. Around 70 percent of respondents say the primary beneficiary of AI productivity is “me,” while only about 10 percent point to employers or clients. This suggests that adoption is still user‑led rather than enterprise‑captured. Historically, technologies such as search, social platforms, and cloud software followed similar trajectories: utility emerged first, monetization matured later. The next stage of the AI agent economy will depend on converting personal productivity gains into enterprise budgets through workflow integration, measurable outcomes, and recurring value.


Friday, May 15, 2026

 Drone Survey Area reconstitution:

Problem statement:

Aerial drone images extracted from a drone video are sufficient to reconstitute the survey area with image selection to create a mosaic that fully covers the survey area. This method does away with the knowledge of flight path of the drone. Write a python implementation that places selections from the input on the tiles in a grid to increase the likelihood of match with the overall survey area.

Solution:

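The following is a visual survey approximation, not a georeferenced orthomosaic. Without GPS/EXIF data or camera poses from the previous example, the script cannot know the true ground positions, so the grid is an informed montage rather than a mathematically correct map. The script extracts road-like edges from each frame, skeletonizes them, drops near-duplicate frames, and greedily places frames whose skeleton borders line up next to each other.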

Usage:

pip install opencv-python numpy

Run the script from the folder that contains the extracted frames; it writes street_grid_mosaic.jpg to that folder.

Code:

#! /usr/bin/python

import cv2

import numpy as np

from pathlib import Path

import math

def detect_road_like_mask(img):

    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

    gray = cv2.GaussianBlur(gray, (5, 5), 0)

    edges = cv2.Canny(gray, 40, 120)

    kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (7, 7))

    closed = cv2.morphologyEx(edges, cv2.MORPH_CLOSE, kernel, iterations=2)

    dilated = cv2.dilate(closed, kernel, iterations=1)

    return (dilated > 0).astype(np.uint8) * 255

def skeletonize(mask):

    mask = (mask > 0).astype(np.uint8)

    skel = np.zeros_like(mask)

    element = cv2.getStructuringElement(cv2.MORPH_CROSS, (3, 3))

    temp = mask.copy()

    while True:

        eroded = cv2.erode(temp, element)

        opened = cv2.dilate(eroded, element)

        temp2 = cv2.subtract(temp, opened)

        skel = cv2.bitwise_or(skel, temp2)

        temp = eroded.copy()

        if cv2.countNonZero(temp) == 0:

            break

    return skel

def border_signature(skel):

    h, w = skel.shape

    return (

        skel[0, :], # top

        skel[-1, :], # bottom

        skel[:, 0], # left

        skel[:, -1], # right

    )

def border_similarity(a, b):

    if a.shape != b.shape:

        return 0

    return np.sum((a > 0) & (b > 0))

def compute_pairwise_border_scores(skeletons):

    N = len(skeletons)

    borders = [border_signature(s) for s in skeletons]

    scores = {}

    for i in range(N):

        for j in range(N):

            if i == j:

                continue

            scores[(i, j)] = {

                "up": border_similarity(borders[i][0], borders[j][1]),

                "down": border_similarity(borders[i][1], borders[j][0]),

                "left": border_similarity(borders[i][2], borders[j][3]),

                "right": border_similarity(borders[i][3], borders[j][2]),

            }

    return scores

def filter_redundant_frames(skeletons, overlap_threshold=0.75):

    N = len(skeletons)

    keep = [True] * N

    for i in range(N):

        if not keep[i]:

            continue

        si = skeletons[i]

        if si is None or si.size == 0:

            keep[i] = False

            continue

        si = si > 0

        for j in range(i + 1, N):

            if not keep[j]:

                continue

            sj = skeletons[j]

            if sj is None or sj.size == 0:

                keep[j] = False

                continue

            sj = sj > 0

            inter = np.sum(si & sj)

            union = np.sum(si | sj)

            if union == 0:

                continue

            iou = inter / union

            if iou > overlap_threshold:

                keep[j] = False

    return keep

def solve_directional_grid(N, scores, min_adj_score=20, direction_bias=1.5):

    G = int(math.ceil(math.sqrt(N)))

    grid = [[None for _ in range(G)] for _ in range(G)]

    used = set()

    grid[0][0] = 0

    used.add(0)

    for r in range(G):

        for c in range(G):

            if r == 0 and c == 0:

                continue

            best_tile = None

            best_score = -1

            for t in range(N):

                if t in used:

                    continue

                score = 0

                if r > 0 and grid[r - 1][c] is not None:

                    above = grid[r - 1][c]

                    vertical_score = scores.get((above, t), {}).get("down", 0)

                    score += vertical_score * direction_bias

                if c > 0 and grid[r][c - 1] is not None:

                    left = grid[r][c - 1]

                    horizontal_score = scores.get((left, t), {}).get("right", 0)

                    score += horizontal_score * direction_bias

                if score > best_score:

                    best_score = score

                    best_tile = t

            if best_score < min_adj_score:

                grid[r][c] = None

            else:

                grid[r][c] = best_tile

                used.add(best_tile)

            if len(used) == N:

                return grid

    return grid

def build_grid_mosaic(images, grid):

    H, W = images[0][1].shape[:2]

    G = len(grid)

    canvas = np.zeros((G * H, G * W, 3), dtype=np.uint8)

    for r in range(G):

        for c in range(G):

            idx = grid[r][c]

            if idx is None:

                continue

            name, img = images[idx]

            y0, y1 = r * H, (r + 1) * H

            x0, x1 = c * W, (c + 1) * W

            canvas[y0:y1, x0:x1] = img

    return canvas

def mosaic_street_grid(folder, out_path="grid_mosaic.jpg"):

    folder = Path(folder)

    images = []

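    for p in sorted(folder.iterdir()):

        # Skip debug images and the mosaic written by earlier runs.

        if p.name.startswith("temp-") or p.name == Path(out_path).name:

            continue

        if p.suffix.lower() in [".jpg", ".jpeg", ".png"]:

            img = cv2.imread(str(p))

            if img is None:  # unreadable or corrupt file

                continue

            images.append((p.name, img))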

    if not images:

        raise RuntimeError("No images found")

    # normalize all images to the size of the first one

    base_h, base_w = images[0][1].shape[:2]

    norm_images = []

    for name, img in images:

        h, w = img.shape[:2]

        if (h, w) != (base_h, base_w):

            img = cv2.resize(img, (base_w, base_h), interpolation=cv2.INTER_AREA)

        norm_images.append((name, img))

    images = norm_images

    skeletons = []

    for name, img in images:

        road_mask = detect_road_like_mask(img)

        skel = skeletonize(road_mask)

        skeletons.append(skel)

        cv2.imwrite(str(folder / f"temp-road-{name}"), road_mask)

        # The skeleton holds 0/1 values; scale to 0/255 so the debug image is visible.

        cv2.imwrite(str(folder / f"temp-skel-{name}"), skel * 255)

    valid_images = []

    valid_skeletons = []

    for (name, img), skel in zip(images, skeletons):

        if skel is None:

            print(f"[WARN] Skeleton for {name} is None — skipping")

            continue

        if skel.size == 0:

            print(f"[WARN] Skeleton for {name} is empty — skipping")

            continue

        if len(skel.shape) != 2:

            print(f"[WARN] Skeleton for {name} has invalid shape {skel.shape} — skipping")

            continue

        valid_images.append((name, img))

        valid_skeletons.append(skel)

    images = valid_images

    skeletons = valid_skeletons

    if len(skeletons) == 0:

        raise RuntimeError("All skeletons were invalid — nothing to process.")

    keep_mask = filter_redundant_frames(skeletons)

    images = [img for img, k in zip(images, keep_mask) if k]

    skeletons = [sk for sk, k in zip(skeletons, keep_mask) if k]

    scores = compute_pairwise_border_scores(skeletons)

    grid = solve_directional_grid(len(images), scores)

    mosaic = build_grid_mosaic(images, grid)

    cv2.imwrite(out_path, mosaic)

    return mosaic

if __name__ == "__main__":

    mosaic_street_grid(".", "street_grid_mosaic.jpg")
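
The exact-coincidence test in border_similarity is strict: two one-pixel-wide skeleton lines that cross a shared edge a few pixels apart score zero. A minimal sketch of a more forgiving variant, assuming a hypothetical tolerance parameter tol that is not part of the script above, could look like this:

```python
import numpy as np


def dilate_1d(border, tol):
    """Mark every position within `tol` pixels of a set pixel along a 1D border."""
    b = (np.asarray(border) > 0).astype(np.int32)
    kernel = np.ones(2 * tol + 1, dtype=np.int32)
    # mode="same" keeps the border length when len(b) >= len(kernel).
    return np.convolve(b, kernel, mode="same") > 0


def tolerant_border_similarity(a, b, tol=3):
    """Count set pixels in `a` that have a set pixel in `b` within `tol` positions."""
    a = np.asarray(a)
    b = np.asarray(b)
    if a.shape != b.shape or a.size < 2 * tol + 1:
        return 0
    return int(np.sum((a > 0) & dilate_1d(b, tol)))


# Two borders whose road crossings are two pixels apart.
edge_a = np.zeros(50, dtype=np.uint8)
edge_b = np.zeros(50, dtype=np.uint8)
edge_a[10] = 1
edge_b[12] = 1
print(tolerant_border_similarity(edge_a, edge_b, tol=3))
```

The tolerance trades precision for recall, so min_adj_score in solve_directional_grid would likely need retuning if this variant were swapped in.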


Thursday, May 14, 2026

 Drone Survey Area reconstitution:

Problem statement:

Aerial drone images extracted from a drone video are sufficient to reconstitute the survey area with image selection to create a mosaic that fully covers the survey area. This method does away with the knowledge of flight path of the drone. Write a python implementation that places selections from the input on the tiles in a grid to increase the likelihood of match with the overall survey area.

Solution:

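The following is a visual survey approximation, not a georeferenced orthomosaic. Without GPS/EXIF data or camera poses from the previous example, the script cannot know the true ground positions, so the grid is an informed montage rather than a mathematically correct map. The script scores pairwise overlap with feature matching (SIFT, falling back to ORB), chains frames greedily by best match, and lays the chain out row by row in a near-square grid.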

Usage:

pip install opencv-python numpy

python grid_montage.py /path/to/folder

Code:

#! /usr/bin/python

from pathlib import Path

import cv2

import numpy as np

import math

import shutil

import sys

def list_images(folder):

    exts = {".jpg", ".jpeg"}

    # Compare lowercased suffixes so mixed-case names such as .Jpg are included.

    files = [p for p in Path(folder).iterdir() if p.suffix.lower() in exts]

    return sorted(files, key=lambda p: p.name)

def make_detector():

    try:

        return cv2.SIFT_create()

    except Exception:

        return cv2.ORB_create(4000)

def detect(detector, img):

    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

    return detector.detectAndCompute(gray, None)

def match_score(des1, des2, use_sift=True):

    # knnMatch with k=2 needs at least two descriptors on each side.

    if des1 is None or des2 is None or len(des1) < 2 or len(des2) < 2:

        return 0

    if use_sift:

        matcher = cv2.FlannBasedMatcher(dict(algorithm=1, trees=5), dict(checks=40))

        matches = matcher.knnMatch(des1, des2, k=2)

    else:

        matcher = cv2.BFMatcher(cv2.NORM_HAMMING)

        matches = matcher.knnMatch(des1, des2, k=2)

    good = 0

    for pair in matches:

        if len(pair) < 2:

            continue

        m, n = pair

        if m.distance < 0.75 * n.distance:

            good += 1

    return good

def overlap_score(img1, img2, detector):

    kp1, des1 = detect(detector, img1)

    kp2, des2 = detect(detector, img2)

    use_sift = hasattr(cv2, "SIFT_create") and detector.__class__.__name__.lower().find("sift") >= 0

    return match_score(des1, des2, use_sift=use_sift)

def choose_grid(n, aspect=1.0):

    best = None

    for rows in range(1, n + 1):

        cols = math.ceil(n / rows)

        score = abs((cols / rows) - aspect)

        waste = rows * cols - n

        cand = (score, waste, abs(rows - cols), rows, cols)

        if best is None or cand < best:

            best = cand

    return best[3], best[4]

def fit_tile(img, tile_w, tile_h, pad=8, bg=(255, 255, 255)):

    h, w = img.shape[:2]

    scale = min((tile_w - 2 * pad) / w, (tile_h - 2 * pad) / h)

    nw, nh = max(1, int(round(w * scale))), max(1, int(round(h * scale)))

    resized = cv2.resize(img, (nw, nh), interpolation=cv2.INTER_AREA)

    canvas = np.full((tile_h, tile_w, 3), bg, dtype=np.uint8)

    x = (tile_w - nw) // 2

    y = (tile_h - nh) // 2

    canvas[y:y+nh, x:x+nw] = resized

    return canvas

def build_montage(folder, max_tiles=30, tile_w=360, tile_h=240, pad=8):

    folder = Path(folder).resolve()

    files = list_images(folder)

    if not files:

        raise ValueError("No JPG images found.")

    imgs = []

    for p in files:

        im = cv2.imread(str(p))

        if im is not None:

            imgs.append((p, im))

    if not imgs:

        raise ValueError("Could not read any images.")

    detector = make_detector()

    n = min(len(imgs), max_tiles)

    used = imgs[:n]

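    # Detect features once per image instead of once per pair.

    use_sift = "sift" in detector.__class__.__name__.lower()

    descriptors = [detect(detector, im)[1] for _, im in used]

    scores = np.zeros((n, n), dtype=int)

    for i in range(n):

        for j in range(i + 1, n):

            s = match_score(descriptors[i], descriptors[j], use_sift=use_sift)

            scores[i, j] = scores[j, i] = s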

    remaining = set(range(1, n))

    order = [0]

    while remaining:

        last = order[-1]

        nxt = max(remaining, key=lambda j: (scores[last, j], -j))

        order.append(nxt)

        remaining.remove(nxt)

    rows, cols = choose_grid(n, aspect=1.0)

    while len(order) < rows * cols:

        order.append(None)

    montage = np.full((rows * tile_h, cols * tile_w, 3), 255, dtype=np.uint8)

    for idx in range(rows * cols):

        r = idx // cols

        c = idx % cols

        x0, y0 = c * tile_w, r * tile_h

        cv2.rectangle(montage, (x0, y0), (x0 + tile_w - 1, y0 + tile_h - 1), (230, 230, 230), 1)

        item_idx = order[idx]

        if item_idx is None:

            continue

        p, img = used[item_idx]

        tile = fit_tile(img, tile_w, tile_h, pad=pad)

        montage[y0:y0 + tile_h, x0:x0 + tile_w] = tile

        label = p.stem[:34]

        cv2.putText(

            montage,

            label,

            (x0 + 10, y0 + tile_h - 12),

            cv2.FONT_HERSHEY_SIMPLEX,

            0.5,

            (20, 20, 20),

            1,

            cv2.LINE_AA,

        )

    out_dir = folder / "montage_output"

    out_dir.mkdir(exist_ok=True)

    out_path = out_dir / f"{folder.name}_grid_montage.png"

    cv2.imwrite(str(out_path), montage)

    same_folder_copy = folder / out_path.name

    shutil.copy2(out_path, same_folder_copy)

    return str(same_folder_copy)

if __name__ == "__main__":

    if len(sys.argv) < 2:

        print("Usage: python grid_montage.py /path/to/folder")

        sys.exit(1)

    print(build_montage(sys.argv[1]))
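
The montage above fills the grid row by row, so the last tile of one row and the first tile of the next are consecutive in the greedy chain but sit at opposite ends of the grid. One possible refinement, offered as a sketch and not as part of the script above, is a serpentine (boustrophedon) layout that keeps consecutive chain items adjacent. The function names greedy_chain and serpentine_cells and the toy score matrix are assumptions for illustration.

```python
def greedy_chain(scores, start=0):
    """Order items by repeatedly hopping to the best-matching unused neighbor."""
    n = len(scores)
    order = [start]
    remaining = set(range(n)) - {start}
    while remaining:
        last = order[-1]
        # Highest score wins; ties go to the lower index.
        nxt = max(remaining, key=lambda j: (scores[last][j], -j))
        order.append(nxt)
        remaining.remove(nxt)
    return order


def serpentine_cells(rows, cols):
    """Grid cells in an order that reverses direction on every other row."""
    cells = []
    for r in range(rows):
        col_range = range(cols) if r % 2 == 0 else range(cols - 1, -1, -1)
        cells.extend((r, c) for c in col_range)
    return cells


# Toy symmetric match-count matrix for four frames (made-up numbers).
scores = [
    [0, 1, 9, 2],
    [1, 0, 3, 8],
    [9, 3, 0, 4],
    [2, 8, 4, 0],
]
order = greedy_chain(scores)
placement = dict(zip(serpentine_cells(2, 2), order))
print(order, placement)
```

In build_montage, the same idea would replace the idx // cols and idx % cols arithmetic with a lookup into the serpentine cell list.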


Wednesday, May 13, 2026

 Drone Survey Area reconstitution:

Problem statement:

Aerial drone images extracted from a drone video are sufficient to reconstitute the survey area with image selection to create a mosaic that fully covers the survey area. This method does away with the knowledge of flight path of the drone. Write a python implementation that places selections from the input on the tiles in a grid to increase the likelihood of match with the overall survey area.

Solution:

The following implementation assumes that the images have GPS/EXIF metadata and leverages OpenDroneMap to create a mosaic.

Usage:

pip install pyodm

docker run -p 3000:3000 opendronemap/nodeodm

Code:

#! /usr/bin/python

from pathlib import Path

import shutil

import sys

from pyodm import Node, exceptions

def find_images(input_folder: Path):

    exts = {".jpg", ".jpeg"}

    # Compare lowercased suffixes so mixed-case names such as .Jpg are included.

    images = sorted([str(p) for p in input_folder.iterdir() if p.suffix.lower() in exts])

    return images

def pick_orthomosaic_file(results_dir: Path):

    candidates = []

    for ext in ("*.tif", "*.tiff", "*.png", "*.jpg", "*.jpeg"):

        candidates.extend(results_dir.rglob(ext))

    preferred = []

    for p in candidates:

        s = str(p).lower()

        if "orthophoto" in s or "orthomosaic" in s or "odm_orthophoto" in s:

            preferred.append(p)

    if preferred:

        preferred.sort(key=lambda p: (0 if p.suffix.lower() in [".tif", ".tiff"] else 1, len(str(p))))

        return preferred[0]

    if candidates:

        candidates.sort(key=lambda p: (0 if p.suffix.lower() in [".tif", ".tiff"] else 1, len(str(p))))

        return candidates[0]

    return None

def reconstruct_mosaic(input_folder: str, node_url="localhost", node_port=3000):

    input_path = Path(input_folder).resolve()

    if not input_path.exists() or not input_path.is_dir():

        raise FileNotFoundError(f"Folder not found: {input_path}")

    images = find_images(input_path)

    if len(images) < 3:

        raise ValueError("Need at least 3 overlapping drone images for a meaningful mosaic.")

    output_dir = input_path / "odm_results"

    output_dir.mkdir(parents=True, exist_ok=True)

    node = Node(node_url, port=node_port)

    print(node.info())

    options = {

        "auto-boundary": True,

        "crop": 0,

        "fast-orthophoto": True,

        "skip-post-processing": False,

        "orthophoto-resolution": 5,

        "use-exif": True,

        "optimize-disk-space": True,

    }

    try:

        task = node.create_task(images, options)

        print("Task created:", task.uuid)

        task.wait_for_completion()

        task.download_assets(str(output_dir))

        orthomosaic = pick_orthomosaic_file(output_dir)

        if orthomosaic is None:

            raise FileNotFoundError("No orthomosaic file was produced by ODM.")

        final_name = input_path / f"{input_path.name}_orthomosaic{orthomosaic.suffix.lower()}"

        shutil.copy2(orthomosaic, final_name)

        print(f"Orthomosaic saved to: {final_name}")

        return str(final_name)

    except exceptions.NodeConnectionError as e:

        raise RuntimeError(f"Cannot connect to NodeODM at {node_url}:{node_port}. Error: {e}") from e

    except exceptions.TaskFailedError as e:

        raise RuntimeError(f"ODM task failed: {e}") from e

if __name__ == "__main__":

    if len(sys.argv) < 2:

        print("Usage: python odm_mosaic.py /path/to/drone_images")

        sys.exit(1)

    reconstruct_mosaic(sys.argv[1])


Tuesday, May 12, 2026

 Drone Survey Area reconstitution:

Problem statement:

Aerial drone images extracted from a drone video are sufficient to reconstitute the survey area with image selection to create a mosaic that fully covers the survey area. This method does away with the knowledge of flight path of the drone. Write a python implementation that places selections from the input on the tiles in a grid to increase the likelihood of match with the overall survey area.

Solution:

The following implementation uses the overlap between consecutive frames to estimate a 2D motion vector (how the drone moved between frame i and frame i+1). It integrates those motions along the timeline to get approximate 2D positions for each frame, rotates and normalizes those positions so that the path becomes a clean, roughly rectangular footprint, and snaps them to a 2D grid; collisions are possible, since several frames can land in the same cell. Finally, it builds a mosaic image whose layout reflects the actual flight path much more closely than “visual similarity clustering” alone would.

Code:

#! /usr/bin/python

import os

import math

import cv2

import numpy as np

from typing import List, Tuple

# ---------------------------------------------------------

# 1. Load and preprocess images (sorted by filename)

# ---------------------------------------------------------

def load_images_sorted(folder: str,

                       max_images: int = None,

                       target_size: Tuple[int, int] = (512, 512)) -> List[np.ndarray]:

    files = sorted(os.listdir(folder))

    imgs = []

    for fname in files:

        path = os.path.join(folder, fname)

        if not os.path.isfile(path):

            continue

        img = cv2.imread(path, cv2.IMREAD_COLOR)

        if img is None:

            continue

        img = cv2.resize(img, target_size, interpolation=cv2.INTER_AREA)

        imgs.append(img)

        if max_images is not None and len(imgs) >= max_images:

            break

    if not imgs:

        raise ValueError("No valid images found in folder")

    return imgs

# ---------------------------------------------------------

# 2. Estimate translation between consecutive frames

# using phase correlation (overlap-based)

# ---------------------------------------------------------

def estimate_translation(img1: np.ndarray, img2: np.ndarray) -> np.ndarray:

    """

    Estimate 2D translation from img1 to img2 using phase correlation.

    Returns a 2D vector (dx, dy) in pixels.

    """

    # Convert to grayscale float32

    g1 = cv2.cvtColor(img1, cv2.COLOR_BGR2GRAY).astype(np.float32)

    g2 = cv2.cvtColor(img2, cv2.COLOR_BGR2GRAY).astype(np.float32)

    # Optional: apply Hanning window to reduce edge effects

    h, w = g1.shape

    win = cv2.createHanningWindow((w, h), cv2.CV_32F)

    g1w = g1 * win

    g2w = g2 * win

    shift, response = cv2.phaseCorrelate(g1w, g2w)

    dx, dy = shift # note: phaseCorrelate returns (dx, dy)

    return np.array([dx, dy], dtype=np.float32)

def accumulate_positions(images: List[np.ndarray]) -> np.ndarray:

    """

    For a sequence of images, estimate relative translations and

    integrate them to get approximate 2D positions.

    """

    N = len(images)

    positions = np.zeros((N, 2), dtype=np.float32)

    for i in range(N - 1):

        delta = estimate_translation(images[i], images[i + 1])

        # We accumulate the *negative* of the shift because phaseCorrelate

        # tells us how to move img2 to align with img1.

        positions[i + 1] = positions[i] - delta

    return positions # shape (N, 2)

# ---------------------------------------------------------

# 3. Normalize and straighten the path (PCA)

# ---------------------------------------------------------

def normalize_positions(positions: np.ndarray) -> np.ndarray:

    """

    Center, rotate (PCA), and scale positions into [0,1]x[0,1].

    """

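    # A single frame has no path to straighten, and np.cov would return NaN.

    if positions.shape[0] < 2:

        return np.zeros_like(positions)

    # Center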

    mean = positions.mean(axis=0)

    X = positions - mean

    # PCA for rotation

    cov = np.cov(X.T)

    eigvals, eigvecs = np.linalg.eigh(cov)

    # Sort eigenvectors by descending eigenvalue

    order = np.argsort(eigvals)[::-1]

    R = eigvecs[:, order]

    X_rot = X @ R # rotate

    # Normalize to [0,1]

    min_xy = X_rot.min(axis=0)

    max_xy = X_rot.max(axis=0)

    span = np.maximum(max_xy - min_xy, 1e-6)

    X_norm = (X_rot - min_xy) / span

    return X_norm # shape (N, 2), in [0,1]

# ---------------------------------------------------------

# 4. Snap positions to a grid

# ---------------------------------------------------------

def choose_grid_shape(N: int) -> Tuple[int, int]:

    """

    Choose a roughly rectangular grid for N images.

    """

    rows = int(math.floor(math.sqrt(N)))

    cols = int(math.ceil(N / rows))

    if rows * cols < N:

        cols += 1

    return rows, cols

def snap_to_grid(pos_norm: np.ndarray,

                 grid_rows: int,

                 grid_cols: int) -> List[Tuple[int, int]]:

    """

    Map normalized positions in [0,1]^2 to integer grid cells.

    Multiple images can land in the same cell; that's allowed.

    """

    N = pos_norm.shape[0]

    assignments = []

    for i in range(N):

        x, y = pos_norm[i]

        # x -> col, y -> row

        c = int(np.clip(x * grid_cols, 0, grid_cols - 1))

        r = int(np.clip(y * grid_rows, 0, grid_rows - 1))

        assignments.append((r, c))

    return assignments

# ---------------------------------------------------------

# 5. Build a mosaic for visualization

# ---------------------------------------------------------

def build_mosaic(images: List[np.ndarray],

                 assignments: List[Tuple[int, int]],

                 grid_rows: int,

                 grid_cols: int,

                 tile_size: Tuple[int, int] = (256, 256)) -> np.ndarray:

    """

    Visual mosaic: each grid cell shows the *last* image assigned to it.

    (You can change this to average or small multiples if you want.)

    """

    tile_w, tile_h = tile_size

    mosaic_h = grid_rows * tile_h

    mosaic_w = grid_cols * tile_w

    mosaic = np.zeros((mosaic_h, mosaic_w, 3), dtype=np.uint8)

    for img, (r, c) in zip(images, assignments):

        tile = cv2.resize(img, (tile_w, tile_h), interpolation=cv2.INTER_AREA)

        y0 = r * tile_h

        x0 = c * tile_w

        mosaic[y0:y0+tile_h, x0:x0+tile_w, :] = tile

    return mosaic

# ---------------------------------------------------------

# 6. High-level function

# ---------------------------------------------------------

def layout_drone_tour_by_overlap(folder: str,

                                 max_images: int = None,

                                 base_size: Tuple[int, int] = (512, 512)) -> np.ndarray:

    """

    1) Load sequential frames from folder.

    2) Estimate frame-to-frame translations via phase correlation.

    3) Integrate to get 2D positions along the flight path.

    4) Straighten and normalize the path with PCA.

    5) Snap to a grid and build a mosaic.

    """

    images = load_images_sorted(folder, max_images=max_images, target_size=base_size)

    positions = accumulate_positions(images)

    pos_norm = normalize_positions(positions)

    grid_rows, grid_cols = choose_grid_shape(len(images))

    print(f"Grid shape: {grid_rows} x {grid_cols}")

    assignments = snap_to_grid(pos_norm, grid_rows, grid_cols)

    mosaic = build_mosaic(images, assignments, grid_rows, grid_cols,

                          tile_size=(256, 256))

    return mosaic

if __name__ == "__main__":

    # Requirements:

    # pip install opencv-python numpy

    folder = "."

    mosaic = layout_drone_tour_by_overlap(folder, max_images=None)

    cv2.imwrite("drone_path_layout.png", mosaic)

    print("Saved drone_path_layout.png")


Monday, May 11, 2026

Continued from previous post...

 Note to software engineers:

An AI system’s lifetime begins long before the first line of code is written, and Article 50’s transparency obligations shape that lifetime from the earliest prototype to the final shutdown. Engineers must think of transparency not as a late‑stage compliance patch but as a design constraint that grows in importance as the system matures. The guidelines make this clear when they say that providers must “develop and design the AI system in such a way that the natural persons concerned are informed they are interacting with an AI system,” a line that signals that transparency is a design‑time responsibility, not a deployment‑time afterthought.

In the prototype phase, engineers are still exploring feasibility, but this is the moment when the system’s eventual interaction patterns, content‑generation capabilities, and biometric or emotional inference pathways are first conceived. Even though research‑only prototypes are exempt, the guidelines warn that the exemption disappears the moment the system or its outputs leave the research context. Engineers must therefore architect prototypes with the assumption that transparency features will eventually be required. This means choosing model architectures that can support watermarking or provenance metadata, designing interaction flows that can accommodate disclosure messages, and avoiding early design choices that make later transparency impossible or brittle. For agentic systems, the guidelines explicitly note that if the provider cannot reliably determine when the agent will interact with natural persons, the agent must disclose itself in all likely interactions. Engineers must therefore design agent frameworks with built‑in disclosure hooks from day one.
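A built-in disclosure hook can be sketched as a thin wrapper around the agent's reply function. This is a minimal illustration rather than a prescribed design: the class name `DisclosingAgent`, the `reply_fn` callable, and the wording of `AI_DISCLOSURE` are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, Set

# Hypothetical wording; the actual text would come from product and legal review.
AI_DISCLOSURE = "You are interacting with an AI system."


@dataclass
class DisclosingAgent:
    """Wraps a reply function so the first message to each recipient is disclosed."""
    reply_fn: Callable[[str], str]
    disclosed_to: Set[str] = field(default_factory=set)

    def respond(self, recipient_id: str, message: str) -> str:
        reply = self.reply_fn(message)
        if recipient_id not in self.disclosed_to:
            # First contact with this recipient: prepend the disclosure.
            self.disclosed_to.add(recipient_id)
            return f"{AI_DISCLOSURE}\n{reply}"
        return reply


agent = DisclosingAgent(reply_fn=lambda m: f"Echo: {m}")
first = agent.respond("user-1", "hello")
second = agent.respond("user-1", "again")
```

Because the hook sits in the wrapper and not in the model, it applies to every recipient the agent reaches, which suits the case where the provider cannot predict who the agent will talk to.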

The guidelines require that AI‑generated or manipulated content be “marked in a machine‑readable format and detectable as artificially generated or manipulated,” and that the technical solutions be “effective, interoperable, robust and reliable.” Those phrases are deceptively simple; in practice they mean that every layer of your system — storage, services, and user interfaces — must participate in preserving, propagating, and exposing these signals. If any layer drops the signal, the entire chain fails.

In the backend storage layer, marking begins as metadata, provenance, or embedded signatures. Engineers must treat marking as a first‑class property of the content object, not an afterthought. If the system stores images, videos, audio, or text, the marking must be embedded in a way that survives format conversions, compression, and distribution. For images and video, this may mean cryptographic watermarks, metadata fields, or fingerprint hashes stored alongside the asset. For text, it may mean structured provenance metadata or embedded markers that do not alter meaning. The storage system must support immutable provenance fields, versioning, and auditability, because the guidelines expect markings to be robust against tampering. A backend that strips metadata, rewrites files, or normalizes formats without preserving markings becomes a compliance liability. Engineers must therefore design storage schemas that treat marking as part of the content’s identity, ensuring that every read, write, transform, or replication operation preserves it. This includes object stores, relational databases, distributed file systems, and content delivery caches. Even internal transformations — transcoding, resizing, chunking — must be marking‑aware.
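One way to make marking part of the content's identity is to store it in an immutable record and carry it forward on every transformation. The sketch below assumes hypothetical `Provenance` and `ContentRecord` types and a `transform` helper; a real schema would live in the object store or database and would likely hold more fields.

```python
import hashlib
from dataclasses import FrozenInstanceError, dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Provenance:
    ai_generated: bool
    generator: str                       # hypothetical identifier of the generating system
    source_sha256: Optional[str] = None  # hash of the asset this version was derived from


@dataclass(frozen=True)
class ContentRecord:
    payload: bytes
    provenance: Provenance

    @property
    def sha256(self) -> str:
        return hashlib.sha256(self.payload).hexdigest()


def transform(record: ContentRecord, new_payload: bytes) -> ContentRecord:
    """Return a new version that keeps the marking and links back to its parent."""
    return ContentRecord(
        payload=new_payload,
        provenance=replace(record.provenance, source_sha256=record.sha256),
    )


original = ContentRecord(b"raw image bytes", Provenance(True, "demo-model"))
resized = transform(original, b"resized image bytes")
```

Since both dataclasses are frozen, a transcoding or resizing step cannot silently drop the marking; it has to build a new record, and `transform` is the only path shown for doing so.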

In the middle‑tier business services, marking becomes a routing and policy problem. These services orchestrate content flows, apply business logic, and integrate with external systems, and they must propagate marking metadata faithfully. A service that generates content must attach markings at creation time; a service that manipulates content must determine whether the manipulation is substantial enough to require marking, because the guidelines distinguish between minor edits and semantic changes. A service that aggregates or composes content must merge markings without losing fidelity. Business logic must enforce that any content leaving the system — through APIs, feeds, notifications, or exports — carries its marking intact. Detection services must be exposed as callable APIs so that downstream systems, partners, or users can verify authenticity. Middle‑tier engineers must also design for adversarial conditions: markings may be intentionally removed, corrupted, or spoofed, so services must validate markings, detect inconsistencies, and log anomalies. Because the guidelines require interoperability, services must support open standards for provenance and watermarking rather than proprietary formats that cannot be consumed by others. Middle‑tier systems must also enforce policy boundaries: if content is destined for a context where disclosure is required at first exposure, the service must ensure that the frontend receives the necessary metadata to surface that disclosure.
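Validation of markings in a service can be illustrated with a keyed signature over the content hash and its marking metadata. This is only a sketch: the key, the `sign_marking` and `verify_marking` helpers, and the marking fields are assumptions, and a shared-secret HMAC is a stand-in for illustration, not a replacement for an open, interoperable provenance standard.

```python
import hashlib
import hmac
import json

SECRET_KEY = b"demo-key-not-for-production"  # hypothetical key for the example


def sign_marking(content: bytes, marking: dict, key: bytes = SECRET_KEY) -> str:
    """Bind the marking to the exact content bytes with an HMAC-SHA256 tag."""
    digest = hashlib.sha256(content).hexdigest().encode()
    body = digest + json.dumps(marking, sort_keys=True).encode()
    return hmac.new(key, body, hashlib.sha256).hexdigest()


def verify_marking(content: bytes, marking: dict, signature: str,
                   key: bytes = SECRET_KEY) -> bool:
    """Return True only if neither the content nor the marking was altered."""
    expected = sign_marking(content, marking, key)
    return hmac.compare_digest(expected, signature)


marking = {"ai_generated": True, "generator": "demo-model"}
tag = sign_marking(b"generated text", marking)
ok = verify_marking(b"generated text", marking, tag)
spoofed = verify_marking(b"generated text", {"ai_generated": False}, tag)
```

A service that receives content with a failing check would log the anomaly and treat the marking as untrusted, not pass it on unchanged.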

On the frontend, marking becomes human‑visible disclosure. The guidelines require that natural persons be informed “at the latest at the time of the first interaction or exposure,” which means the frontend must surface clear, perceivable, accessible signals that the content is AI‑generated or manipulated. This is where metadata becomes UI. Engineers must design controls that display labels, badges, overlays, or contextual notices without degrading usability. For interactive systems, the frontend must announce that the user is interacting with an AI system, whether through text, voice, or visual cues. For deep fakes, the frontend must display a disclosure that is visible at the moment the content appears, not buried in menus or footnotes. For AI‑generated text informing the public, the frontend must show a disclosure unless the content has undergone human editorial review. Accessibility requirements apply, so disclosures must work for screen readers, high‑contrast modes, and users with cognitive or perceptual differences. Frontend engineers must also ensure that disclosures persist across navigation, embedding, sharing, and re‑rendering, because the guidelines expect disclosures to survive distribution. If the frontend allows users to download or share content, the marking must travel with it.
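The step where metadata becomes UI can be sketched as a small function that maps content metadata to the label the frontend should render. The `ContentMeta` fields and label strings are hypothetical, and the rules are a simplified reading of the cases named above, not a complete statement of the obligations.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ContentMeta:
    ai_generated: bool
    is_deep_fake: bool = False
    public_interest_text: bool = False
    human_editorial_review: bool = False


def disclosure_label(meta: ContentMeta) -> Optional[str]:
    """Return the visible label to show with the content, or None if none is needed."""
    if not meta.ai_generated:
        return None
    if meta.is_deep_fake:
        # Must be visible when the content first appears.
        return "AI-generated or manipulated media"
    if meta.public_interest_text and not meta.human_editorial_review:
        return "AI-generated text"
    return None


label = disclosure_label(ContentMeta(ai_generated=True, is_deep_fake=True))
```

Returning `None` here only means no visible label is selected; the machine-readable marking still travels with the content.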

The entire stack must work together to ensure that marking and detection survive the full lifecycle of content. Backend systems must store markings immutably; middle‑tier services must propagate and validate them; frontends must expose them to users. If any layer fails, the system becomes non‑compliant. The guidelines’ insistence on robustness and interoperability means that engineers must design for hostile environments, cross‑platform distribution, and long‑term persistence. Markings must survive not just your own system’s transformations but also the unpredictable behavior of downstream systems, social platforms, and user devices. Detection must remain possible even when content is recompressed, clipped, or partially transformed.

In practice, this means that marking and detection are not features of a single component but properties of the entire architecture. They must be designed into storage schemas, service contracts, API payloads, UI components, and operational workflows. They must be tested end‑to‑end, monitored in production, and preserved during migrations and refactors. They must be resilient to adversarial attempts to remove them and flexible enough to evolve as standards mature. And because the guidelines apply to both providers and deployers, engineers must ensure that transparency signals remain intact even when content leaves their control.

As the system moves into the initial version or MVP stage, the engineering focus shifts from exploration to implementation. This is where the transparency obligations begin to crystallize into concrete engineering tasks. Interactive systems must be instrumented so that every user‑facing entry point can surface an AI disclosure at first interaction. Generative systems must begin to embed machine‑readable markings into outputs, and detection APIs must be designed so that downstream actors can verify authenticity. Engineers working on data pipelines must ensure that the system can distinguish between minor edits and semantic manipulations, because the guidelines draw a sharp line between the two. A grammar‑corrected sentence is exempt; a sentence whose meaning has been altered is not. This distinction must be encoded into the system’s logic, not left to human judgment at deployment time.
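Encoding the minor-edit distinction in logic could look like the sketch below, which classifies by the kind of operation a pipeline step declares and does not try to infer meaning from the text. The `EditKind` values and the `MINOR_EDITS` allow-list are assumptions for illustration; where the line actually falls is a compliance decision, not something this code settles.

```python
from enum import Enum
from typing import Iterable


class EditKind(Enum):
    SPELLING = "spelling"
    GRAMMAR = "grammar"
    FORMATTING = "formatting"
    REWRITE = "rewrite"        # changes wording in ways that may alter meaning
    GENERATION = "generation"  # produces new content


# Hypothetical allow-list: anything not listed here is treated as substantial.
MINOR_EDITS = frozenset({EditKind.SPELLING, EditKind.GRAMMAR, EditKind.FORMATTING})


def requires_marking(edits: Iterable[EditKind]) -> bool:
    """True if any applied edit falls outside the minor-edit allow-list."""
    return any(edit not in MINOR_EDITS for edit in edits)


grammar_only = requires_marking([EditKind.GRAMMAR, EditKind.SPELLING])
with_rewrite = requires_marking([EditKind.GRAMMAR, EditKind.REWRITE])
```

Defaulting unknown operations to "substantial" keeps the failure mode on the side of over-marking.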

During the growth phase, the system expands in scale, features, and user base. This is the phase where transparency obligations become operational rather than theoretical. Engineers must ensure that disclosure mechanisms scale across modalities — text, audio, video, avatars, VR environments — because the guidelines treat all of these as potential interaction channels. As the system integrates with other services, engineers must ensure that transparency metadata survives transformations, API hops, and distribution through third‑party platforms. The guidelines emphasize that marking must be “effective, interoperable, robust and reliable,” which means engineers must design for adversarial environments where markings may be stripped, corrupted, or intentionally removed. This requires redundancy, cryptographic signatures, and provenance chains that can survive format conversions.
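A provenance chain can be sketched as a list of entries in which each entry hashes its predecessor, so that removing or altering a step is detectable. The entry fields and the `append_step` and `chain_is_intact` helpers are hypothetical; a production system would also sign entries and would more likely follow an open provenance format.

```python
import hashlib
import json
from typing import Dict, List


def _sha256(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def append_step(chain: List[Dict[str, str]], action: str, content: bytes) -> List[Dict[str, str]]:
    """Return a new chain with one more entry linked to the previous entry's hash."""
    entry = {
        "action": action,
        "content_sha256": _sha256(content),
        "prev_hash": chain[-1]["entry_hash"] if chain else "",
    }
    entry["entry_hash"] = _sha256(json.dumps(entry, sort_keys=True).encode())
    return chain + [entry]


def chain_is_intact(chain: List[Dict[str, str]]) -> bool:
    """Recompute every link; any edited, dropped, or reordered entry breaks it."""
    prev = ""
    for entry in chain:
        body = {k: v for k, v in entry.items() if k != "entry_hash"}
        recomputed = _sha256(json.dumps(body, sort_keys=True).encode())
        if entry["prev_hash"] != prev or recomputed != entry["entry_hash"]:
            return False
        prev = entry["entry_hash"]
    return True


chain = append_step([], "generated", b"frame-v1")
chain = append_step(chain, "transcoded", b"frame-v2")
```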

For deployers, the growth phase is where operational workflows must incorporate transparency. Engineers responsible for integration must ensure that emotion‑recognition or biometric‑categorisation systems surface disclosures at the moment of exposure, whether in a mobile app, a kiosk, a classroom, or a workplace tool. Engineers working on content‑publishing pipelines must ensure that deep fakes or AI‑generated text published on matters of public interest are labelled clearly unless they undergo genuine editorial review. The guidelines state that text is exempt only if it has undergone “human review or editorial control and is subject to editorial responsibility,” which means engineers must build audit trails that prove such review occurred.
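An audit trail for editorial review can be sketched as a record tied to the hash of the exact text that was reviewed, so that later edits invalidate the evidence. The `ReviewRecord` fields and helper names are assumptions for illustration; what counts as sufficient evidence is for the deployer's compliance process to define.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List


@dataclass(frozen=True)
class ReviewRecord:
    content_sha256: str
    reviewer: str
    responsible_editor: str  # who holds editorial responsibility
    reviewed_at: datetime


def record_review(text: str, reviewer: str, responsible_editor: str) -> ReviewRecord:
    """Create an immutable record bound to the exact text that was reviewed."""
    return ReviewRecord(
        content_sha256=hashlib.sha256(text.encode("utf-8")).hexdigest(),
        reviewer=reviewer,
        responsible_editor=responsible_editor,
        reviewed_at=datetime.now(timezone.utc),
    )


def has_review_evidence(text: str, log: List[ReviewRecord]) -> bool:
    """True only if some record matches this exact version of the text."""
    digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
    return any(r.content_sha256 == digest for r in log)


audit_log = [record_review("Draft article text.", "j.doe", "desk-editor")]
reviewed = has_review_evidence("Draft article text.", audit_log)
changed = has_review_evidence("Draft article text, regenerated.", audit_log)
```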

As the system reaches maturity, the engineering challenge shifts to maintaining transparency across evolving features, new markets, and new regulatory expectations. Mature systems often accumulate technical debt, and transparency features must be refactored to remain reliable. Engineers must ensure that marking and detection systems remain state‑of‑the‑art, because the guidelines require providers to implement technically feasible solutions, not outdated ones. As models are retrained or replaced, engineers must ensure that transparency features are preserved across versions. When new interaction modes are added — such as voice, AR, or agent‑to‑human messaging — engineers must extend disclosure mechanisms accordingly. Mature systems also face increased scrutiny from regulators, meaning engineers must maintain logs, provenance records, and compliance evidence that can withstand audits.

In the maintenance phase, transparency becomes a matter of operational discipline. Engineers must monitor whether disclosures are being surfaced correctly, whether markings remain intact across distribution channels, and whether detection tools continue to function as intended. When content is syndicated, embedded, or transformed by downstream systems, engineers must ensure that transparency metadata is not lost. The guidelines emphasize that transparency must be provided “at the latest at the time of the first interaction or exposure,” which means engineers must design monitoring systems that detect when disclosures fail to appear. Maintenance also includes updating transparency mechanisms as adversarial techniques evolve, because robustness is an ongoing requirement, not a one‑time achievement.
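A monitoring check for missing disclosures might scan an ordered event log and flag sessions where AI output appeared before any disclosure was shown. The event names `"disclosure"` and `"ai_output"` and the `(session_id, kind)` tuple shape are hypothetical; the sketch also assumes a disclosure shown together with the first output is logged ahead of it.

```python
from typing import Iterable, List, Set, Tuple


def sessions_missing_disclosure(events: Iterable[Tuple[str, str]]) -> List[str]:
    """Return session ids, in first-seen order, whose AI output preceded a disclosure."""
    disclosed: Set[str] = set()
    failed: List[str] = []
    for session_id, kind in events:
        if kind == "disclosure":
            disclosed.add(session_id)
        elif kind == "ai_output":
            if session_id not in disclosed and session_id not in failed:
                failed.append(session_id)
    return failed


log = [
    ("s1", "disclosure"),
    ("s1", "ai_output"),
    ("s2", "ai_output"),   # output shown with no prior disclosure
    ("s2", "disclosure"),  # too late for the first exposure
    ("s3", "page_view"),
]
failures = sessions_missing_disclosure(log)
```

Running a check like this on production logs turns a missed disclosure into an alert instead of something found later in an audit.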

Finally, in the decommissioning phase, engineers must ensure that transparency obligations are respected even as the system winds down. If the system continues to generate or serve content during a sunset period, markings and disclosures must remain active. If the system is replaced, engineers must ensure that legacy content remains labelled, especially deep fakes and AI‑generated text that continue to circulate. If detection tools are retired, engineers must provide alternative means for users to verify authenticity. Decommissioning also requires preserving audit logs and provenance data for regulatory review, because the guidelines make clear that compliance is evaluated across the system’s operational lifetime, not just at a single point in time.

Across all phases, engineers at every level have distinct responsibilities. Model engineers must ensure that model architectures support watermarking, provenance, and disclosure triggers. Backend engineers must propagate transparency metadata through APIs and services. Frontend engineers must surface disclosures in ways that are perceivable, accessible, and adapted to vulnerable users. DevOps and SRE teams must ensure that transparency features remain reliable under load and across deployments. Security engineers must defend markings and detection systems against tampering. QA engineers must test transparency features as rigorously as functional features, because a missing disclosure is a compliance failure. Product engineers must ensure that editorial workflows, content pipelines, and agent behaviors align with the guidelines’ expectations. And engineering managers must ensure that transparency is treated as a first‑class requirement throughout the system’s lifetime.

The guidelines’ structure may appear legalistic, but their message to engineers is simple: transparency is not a feature; it is a lifecycle obligation. It must be designed early, implemented consistently, maintained continuously, and preserved even as the system is retired. Every phase of the AI system’s life introduces new transparency risks, and engineers must anticipate and mitigate those risks long before regulators come asking.