Monday, May 5, 2025

Database calls are fast, and the curation of detected objects lends itself to query operators, but a database does not take advantage of the progressive, rolling sequence of objects detected frame-by-frame, the bookkeeping that accompanies it, or the tracking of sequences across multiple UAVs. These are problems that the messaging paradigm and event processors have solved successfully in several data and telemetry pipelines, building on the traditional grounding that databases are queues. With the shift in paradigm from rows to events, and with similar SQL operators available across both (as in Flink), the DFCS drone video sensing platform does not demand adherence to either a database or a messaging paradigm if the interface supports the following requirements: 1. standard query operators on detected objects with world-coordinate attributes; 2. participation in retrieval augmentation alongside a vector store and search; and 3. support for analytics stacks with programmability, so that custom drone sensing applications can be built independently of the UAV swarm sensing-analysis-routing architecture dedicated to the swarms' flights. In addition, some criteria are suggested for messaging pipelines:

1. Noise filtering: This involves sifting through data to spotlight the essentials.

2. Long-term data retention: This involves safeguarding valuable data for future use.

3. Event trimming: This customizes data for optimal analytics so that eccentricities in the raw data do not dictate the charts and graphs.

4. Data condensation: This translates voluminous MELT (metrics, events, logs, traces) data into focused metrics and avoids the need for archiving or clean-up, since messages are removed from the queue once consumed.

5. Operational efficiency boosting: This amplifies operating speed and reliability.

This article implements a sample for this alternative, with emphasis on multiple stream processors.
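As a minimal, framework-agnostic sketch of criteria 1, 3, and 4 above, the following Python generators filter, trim, and condense a stream of per-frame detection events. The event fields, confidence threshold, and UAV identifiers are hypothetical; a production pipeline would express the same operators in a stream processor such as Flink rather than in-process generators.

# Sketch of messaging-pipeline criteria over per-frame detection events.
# The event shape (frame, uav_id, label, confidence, world_xy) is hypothetical.
from collections import defaultdict

def noise_filter(events, min_confidence=0.5):
    # 1. Noise filtering: keep only detections above a confidence floor
    return (e for e in events if e["confidence"] >= min_confidence)

def trim(events, keep=("frame", "uav_id", "label", "world_xy")):
    # 3. Event trimming: project each event down to the fields analytics needs
    return ({k: e[k] for k in keep} for e in events)

def condense(events):
    # 4. Data condensation: roll voluminous events up into per-UAV, per-label counts
    counts = defaultdict(int)
    for e in events:
        counts[(e["uav_id"], e["label"])] += 1
    return dict(counts)

# Usage with a few hypothetical detection events from two UAVs
raw = [
    {"frame": 1, "uav_id": "uav-1", "label": "vehicle", "confidence": 0.91, "world_xy": (47.61, -122.33)},
    {"frame": 1, "uav_id": "uav-1", "label": "vehicle", "confidence": 0.32, "world_xy": (47.61, -122.33)},
    {"frame": 2, "uav_id": "uav-2", "label": "person", "confidence": 0.77, "world_xy": (47.70, -122.20)},
]
print(condense(trim(noise_filter(raw))))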


Sunday, May 4, 2025

 Agentic object detection 

import requests

def get_bounding_boxes(api_key, image_path, search_text):

    headers = {"Authorization": f"Bearer {api_key}"}

    with open(image_path, "rb") as image_file:

        files = {"file": image_file}

        data = {"prompt": search_text}

        response = requests.post(

            "https://api.landing.ai/v1/agentic-object-detection",

            headers=headers,

            files=files,

            data=data

        )

    if response.status_code == 200:

        detections = response.json().get("detections", [])

        return [detection["box"] for detection in detections] # Returns [x_min, y_min, x_max, y_max] boxes

    else:

        raise Exception(f"API Error: {response.text}")

# Usage example

api_key = "your_api_key_here"

boxes = get_bounding_boxes(api_key, "sample.jpg", "red shoes")

print(f"Found {len(boxes)} matching objects:")

for box in boxes:

    print(f"- Bounding box: {box}")


Saturday, May 3, 2025

This is a summary of the book titled “Robot Ethics” written by Mark Coeckelbergh and published by MIT Press in 2022. The author is an academic and philosopher who surveys the field of robotics and discusses its moral challenges, answering questions such as how much privacy to surrender, what the ratio of humans to robots should be, and whether robots should be allowed to perform surgery or fight wars on our behalf. All these questions are driven toward what kind of future we want for our children. Changes brought about by robots are desirable, and robots are expected to enter all walks of life. As home companions, they pose a dilemma about how much personal information to share. In medical practice, they raise a question about quality of care. As self-driving vehicles, they pose a challenge to ethical decision making. As they get closer to human capabilities, how should they be treated when they appear like humans? When military robots reduce the cost and risks of warfare, are wars acceptable? Are their ethics human ethics?

Robots are changing the world in mundane ways, altering work, travel, and interaction. The ethical implications of their use are significant, as they can deepen economic disparities, harm vulnerable groups, and lead to the loss of human life and dignity. The ethical dilemmas surrounding robotics include who holds responsibility for the problematic effects, such as the user, manufacturer, programmer, marketer, or regulatory agency. As robots become more like humans, society will face challenges in understanding what makes us human. The modern industrial site involves more human-robot interaction, bringing new challenges and concerns about worker welfare. Safety, security, and privacy are important concerns, as robots can carry heavy payloads and move in unpredictable ways. The new industrial revolution involves automation of repetitive mechanical tasks and complex mental work, with jobs at risk of automation in customer service and administrative assistance.

Robots may replace jobs in certain professions, but they may also lead to high-pressure and low-meaning occupations. Humans can mitigate these effects through planning and forward-looking policies. Some jobs, such as care work, teaching, and artistic endeavors, should remain in human hands, even if automation technology exists. Education is essential to prepare the workforce of the future, and it may be time to consider restructuring the socioeconomic framework through measures like universal basic income.

Robotic home companions and personal assistants present new issues regarding privacy and deception. Without legal protections, a surveillance state is likely. Robots designed to resemble people or speak with human voice patterns may perpetuate problematic stereotypes. Using robots for companionship and care raises concerns about deception and dignity, as the person being cared for may not understand the companion's non-human capabilities.

Sex robots illustrate the flip side of deception, as their imitation of human actions can lead to harm, such as rendering people incapable of handling romantic relationships or increasing comfort with sexual partners.

Robots in healthcare are revolutionizing the industry, enabling telehealth, medication delivery, and complex surgeries. However, ethical concerns arise regarding privacy, surveillance, data collection, human worker displacement, and the impact on care providers. A coherent ethic for using robots in medicine should be based on quality in human life, considering patients' physical, emotional, and relational needs, as well as providers' engagement and loved ones' involvement.

As robots become more autonomous, the responsibility for failures becomes more complex. Factors such as driver reaction speed, safety features, and city officials' permission should be considered. To build greater safety, efforts should involve input from all affected parties, including taxi drivers, pedestrians, and cyclists. Regulations can help maintain transparency, and people must be prepared to incorporate robots into community planning and policymaking. In conclusion, a coherent ethic for using robots in healthcare should prioritize human dignity and consider the potential risks and benefits.

As robots become more lifelike, there are ethical considerations regarding their treatment. People can feel empathy for robots, which makes their mistreatment troubling, as seen in reactions to a 2015 video of employees kicking a robotic dog. Some argue that creating robots that can masquerade as humans is unethical, as it could lead to human degradation or a rise in similar behavior towards fellow humans. One possible answer is to consider robots as entities with which people have established a relationship, similar to how we feel a different duty to an animal we keep as a pet than to one we raise for meat. The rise of fully or partially automated weapons systems raises new ethical dilemmas, as opponents argue that reducing war's human costs could make it easier for politicians to justify military action. The ultimate ethical question raised by killer robots is whether the use of fully automated weapons is justified under any circumstances.

#codingexercise: CodingExercise-05-03-2025.docx

Friday, May 2, 2025

These are the steps in a typical CNN-based vision processor for drone images. Let’s enumerate them:

1. Initialization: Drone images are 512x512 resolution images. They are not labeled in the Pascal VOC format. Before each image in the drone video is processed, the model is initialized as a 7-layer CNN with activation functions and a sigmoid output. Activation functions introduce non-linearity to neural networks, allowing them to learn complex patterns such as edges, textures, and shapes by adjusting neuron outputs before passing them to the next layer. Sigmoid is a mathematical function that squashes input values between 0 and 1, which makes it useful for probability-based tasks, including drawing the heat-maps discussed earlier. The specific loss used with this model combines sigmoid and binary cross-entropy into a single operation for numerical stability in binary classification tasks. Hyperparameters for the model, such as the learning rate, as well as the targets and masks, are set to default values. Optimizers are essential to a neural network for updating its weights during training and help find the optimal set of weights that minimize the loss function. A loss function measures the difference between the predicted and actual values of the target variable. The optimizer used with this model implements the Adam algorithm. (A sketch of this initialization appears after this list.)

2. Each convolutional layer transforms its input channels into output channels. It uses the Rectified Linear Unit (ReLU) activation, which passes a value only if it is positive and outputs 0 otherwise. During training, each layer has default settings of no dropout, "same" padding, and batch normalization and transposed convolution turned off. Dropout prevents overfitting by randomly setting a fraction of neurons to zero. Padding adds extra pixels around the borders of an image before a convolution operation. Batch normalization normalizes activations over a mini-batch of data. Transposed convolution, often called deconvolution or upsampling, is used to increase spatial dimensions, reversing the standard convolution process.

Kernels and biases are also set for each layer. The kernel used is 3x3, with an initializer that generates a truncated normal distribution for the transformation from input channels to output channels. Biases only affect the output channels and use a constant initializer.

3. Location: Pixel coordinates are transformed to world coordinates. The alignment data is stored in the bounds, which helps transform the detections in the raw frame into world coordinates. This involves a perspective transformation using OpenCV’s method to find the homography matrix, which describes the transformation between two sets of corresponding points in two different images. (A sketch of this step also appears after this list.)
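To make steps 1 and 2 concrete, here is a minimal sketch, assuming PyTorch (the post does not name a framework), of a seven-layer CNN wired to a loss that combines sigmoid and binary cross-entropy (BCEWithLogitsLoss) and to the Adam optimizer. The layer widths, learning rate, and initializer parameters are illustrative rather than the platform's actual values.

import torch
import torch.nn as nn

def conv_block(in_ch, out_ch):
    # 3x3 kernel, "same" padding, ReLU activation; dropout and batchnorm off by default
    conv = nn.Conv2d(in_ch, out_ch, kernel_size=3, padding="same")
    nn.init.trunc_normal_(conv.weight, std=0.02)   # truncated-normal kernel initializer
    nn.init.constant_(conv.bias, 0.0)              # constant bias initializer
    return nn.Sequential(conv, nn.ReLU())

model = nn.Sequential(
    conv_block(3, 16), conv_block(16, 32), conv_block(32, 64),
    conv_block(64, 64), conv_block(64, 32), conv_block(32, 16),
    nn.Conv2d(16, 1, kernel_size=1),               # seventh layer: per-pixel logits
)
criterion = nn.BCEWithLogitsLoss()                 # sigmoid + binary cross-entropy in one op
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

logits = model(torch.randn(1, 3, 512, 512))        # one 512x512 drone frame
loss = criterion(logits, torch.zeros_like(logits)) # dummy target mask

For step 3, the following sketch uses OpenCV's findHomography and perspectiveTransform to map a detection's pixel coordinates to world coordinates; the point correspondences stand in for the stored alignment data and are hypothetical.

import cv2
import numpy as np

# Corresponding points in the raw 512x512 frame and in world coordinates (hypothetical alignment data)
pixel_pts = np.array([[0, 0], [511, 0], [511, 511], [0, 511]], dtype=np.float32)
world_pts = np.array([[100.0, 200.0], [180.0, 200.0], [180.0, 280.0], [100.0, 280.0]], dtype=np.float32)

H, _ = cv2.findHomography(pixel_pts, world_pts)    # homography between the two planes

detection_px = np.array([[[256.0, 256.0]]], dtype=np.float32)   # detection centre in the raw frame
detection_world = cv2.perspectiveTransform(detection_px, H)
print(detection_world)                             # detection in world coordinates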


Thursday, May 1, 2025

This is a summary of the book titled “Give to Grow: Invest in relationships to build your business and grow your career” written by Mo Bunnell and published by Bard Press in 2024. The author is a performance and growth consultant who offers a framework for developing relationships that boost productivity and growth. His book draws on his decades in business development and consulting. This is an easy-to-read and high-impact book. The tenets of his framework include investing in client relationships to unlock growth and drive long-term success, increasing your value, rooting out self-limiting false beliefs, showing clients that you have a genuine desire to understand their problems and help them, demonstrating your expertise, ensuring your success by always taking action and “thinking in bets,” and thus growing your clients, growing your team, and growing your scale.

To achieve full performance and growth potential, prioritize relationships as the foundation for long-term success. The Give to Grow framework guides individuals in two components of high performance: "Doing the Work" by delivering outcomes to clients and "Winning the Work" by developing relationship skills. Top performers distinguish themselves by their focus on long-term relationships, which generate growth and open up further opportunities. Top performers in complex roles deliver between eight and thirty times the value of average employees. The key difference between top performers and others is their focus on growth strategies and actions. Top performers prioritize client conversations, engage in extensive research, and embrace an ethos of continuous improvement. They translate annual goals into weekly priorities and take time to reflect on what worked and what didn't after each client meeting.

Adam Grant's book Give and Take identifies three types of people: "Takers" who seek the best outcome for themselves, "Matchers" who negotiate fair deals, and "Givers" who are perpetually generous. Successful people are Givers, who focus on their most important relationships and give without demanding anything in return. They maintain healthy boundaries to prevent burnout. To become a Strategic Giver, reach out to clients frequently, help them even when they aren't in a position to buy from you, and consistently become the client's first call when a need manifests. Expand your idea of growth by enlarging your network and investing in relationships. To reach your highest growth potential, identify false beliefs about yourself that can limit your growth and replace them with a growth mindset; these self-limiting beliefs include "I can't," "I don't know how," "I might do it wrong," "I'm too busy," and "I might look bad." Overcoming these fears helps you grow and become a more effective professional.

To effectively engage clients, it is essential to show genuine interest and engagement. This can be achieved by setting up a two-sided, enjoyable, and energizing conversation, keeping meetings productive, and offering different forms of support. Connect with clients by finding commonalities, reducing stress through humor, and celebrating incremental progress. Focus on their engagement and aim to "fall in love" with their problem, ensuring they feel seen and heard. Before each meeting, reflect on questions that will help you better understand the client's situation, and listen attentively. Demonstrate your expertise by giving potential clients a taste of what working with you would be like, such as providing a technical analysis free of charge. This groundwork will position you as the best candidate for the job. When meeting with clients, always give them a recommendation regarding their next steps, allowing them to make better decisions while placing you as a guide and expert. It is best to appear "passionately agnostic" and give them space to choose their next steps.

As you grow in your career, remember that you can always improve your situation, even in difficult circumstances. Strengthen relationships and respond to setbacks with compassion and generosity. Identify three high-impact tasks every week and schedule time for them, aligning with your vision of long-term growth. "Think in bets" by investing time and energy in the opportunities with the biggest payoff.

High performers experience three levels of business growth: growth in client list, growth in team, and growth at scale. In the first stage, make yourself indispensable by bringing in more business than it costs your organization to employ you. As you build success, build a team to support you, delegate more to free up time, and scale your success throughout your organization. As your business scales, view your impact holistically, focusing on helping others succeed.


Wednesday, April 30, 2025

An image processing pipeline can have any number of extensions or operators. It is not limited to proprietary models or techniques. In fact, if there are locations for which you have already captured images and labeled the objects of interest, you can plug in your own model to process the next round of images, say from a UAV swarm flight, which will prioritize your predictions in the test flight and route autonomously. This widens the strategy and purpose of developing applications that can leverage this pipeline for their specific use cases. Objects detected using a Bring-Your-Own-Model processor can still be registered to a world catalog.


As an example, preprocessing of drone images from a dataset of 512x512 resolution images of highways, annotated in the Pascal VOC format, could leverage the following transforms:


1. Filters using kernels: A kernel is any matrix A that, when multiplied by another matrix B, transforms B in a way that highlights a certain feature. Finding features in images can be helpful for classification.


2. CNN: A Convolutional Neural Network takes an image and produces a vector based on embeddings it derived from its training. Most Landing.AI experiments with images leverage this technique. It applies different kernels across the image and constantly improves these kernels using gradient descent. MobileNet is an example model suitable for drone imagery. Another example is YOLOv3, from which we sourced most of the runtime.


3. LSTM: A Long Short-Term Memory neural network uses previous predictions and occurrences as a basis for predicting the current input. This helps with temporal information such as movement.


4. Augmentation: Certain shifts, flips, and rotations applied to images as preprocessing before the CNN are covered by this operator, and this can be a great way to normalize all the input images to a common standard.


5. Gaussian blurring: A kernel that can be applied across the image to balance each pixel with its neighbors and thereby make transitions smoother. A 5x5 kernel with a standard deviation of 2 could be an example blurring kernel.


6. Edge detection: Very helpful for detecting road boundaries, which in turn can help analyze a variety of drone imagery and yield useful information. Canny is one such edge-detection algorithm, but you can bring your own.


7. Heat-map: A variety of probability functions can be used to create a probability map of the image in color coding or gray scale, so that lighter areas are regions of importance and darker regions are less important. A sketch of operators 5 through 7 appears after this list.
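A minimal OpenCV sketch of operators 5 through 7 follows, under assumed parameters: a 5x5 Gaussian kernel with standard deviation 2, Canny thresholds of 50 and 150, and an edge-density heat-map. The image path is a placeholder.

import cv2
import numpy as np

image = cv2.imread("drone.jpg", cv2.IMREAD_GRAYSCALE)   # placeholder 512x512 drone image

blurred = cv2.GaussianBlur(image, (5, 5), 2)             # 5. Gaussian blurring: 5x5 kernel, sigma 2
edges = cv2.Canny(blurred, 50, 150)                      # 6. Edge detection with Canny

# 7. Heat-map: smooth the edge map into an edge-density image and normalize it,
# so lighter regions indicate areas of importance
density = cv2.GaussianBlur(edges.astype(np.float32), (21, 21), 0)
heat = cv2.normalize(density, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)

cv2.imwrite("edges.png", edges)
cv2.imwrite("heatmap.png", heat)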





Tuesday, April 29, 2025

 Multimodal image search

The following code snippet describes how multimodal search can be useful for searching images. The images are indexed and searched based on vector embeddings, but the query is text-based.

import http.client

import json

import os

import urllib.parse

import requests

from tenacity import retry, stop_after_attempt, wait_fixed

from dotenv import load_dotenv

from azure.core.credentials import AzureKeyCredential

from azure.identity import DefaultAzureCredential

from azure.search.documents import SearchClient

from azure.search.documents.indexes import SearchIndexClient

from azure.search.documents.models import (

    RawVectorQuery,

)

from azure.search.documents.indexes.models import (

    ExhaustiveKnnParameters,

    ExhaustiveKnnVectorSearchAlgorithmConfiguration,

    HnswParameters,

    HnswVectorSearchAlgorithmConfiguration,

    SimpleField,

    SearchField,

    SearchFieldDataType,

    SearchIndex,

    VectorSearch,

    VectorSearchAlgorithmKind,

    VectorSearchProfile,

)

from IPython.display import Image, display

load_dotenv()

service_endpoint = os.getenv("AZURE_SEARCH_SERVICE_ENDPOINT")

index_name = os.getenv("AZURE_SEARCH_INDEX_NAME")

api_version = os.getenv("AZURE_SEARCH_API_VERSION")

key = os.getenv("AZURE_SEARCH_ADMIN_KEY")

aiVisionApiKey = os.getenv("AZURE_AI_VISION_API_KEY")

aiVisionRegion = os.getenv("AZURE_AI_VISION_REGION")

aiVisionEndpoint = os.getenv("AZURE_AI_VISION_ENDPOINT")

credential = AzureKeyCredential(key)

search_client = SearchClient(endpoint=service_endpoint, index_name=index_name, credential=credential)

query_image_path = "images/PIC01.jpeg"

DIR_PATH = os.getcwd()  # assumed base directory containing the images folder used when displaying results

@retry(stop=stop_after_attempt(5), wait=wait_fixed(1))

def get_image_vector(image_path, key, region):

    headers = {

        'Ocp-Apim-Subscription-Key': key,

    }

    params = urllib.parse.urlencode({

        'model-version': '2023-04-15',

    })

    try:

        if image_path.startswith(('http://', 'https://')):

            headers['Content-Type'] = 'application/json'

            body = json.dumps({"url": image_path})

        else:

            headers['Content-Type'] = 'application/octet-stream'

            with open(image_path, "rb") as filehandler:

                image_data = filehandler.read()

                body = image_data

        conn = http.client.HTTPSConnection(f'{region}.api.cognitive.microsoft.com', timeout=3)

        conn.request("POST", "/computervision/retrieval:vectorizeImage?api-version=2023-04-01-preview&%s" % params, body, headers)

        response = conn.getresponse()

        data = json.load(response)

        conn.close()

        if response.status != 200:

            raise Exception(f"Error processing image {image_path}: {data.get('message', '')}")

        return data.get("vector")

    except (requests.exceptions.Timeout, http.client.HTTPException) as e:

        print(f"Timeout/Error for {image_path}. Retrying...")

        raise
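# Build a vector query from the sample image (illustrative; this image-based query is overwritten below by the text-based query)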

vector_query = RawVectorQuery(vector=get_image_vector(query_image_path,

                                                      aiVisionApiKey,

                                                      aiVisionRegion),

                              k=3,

                              fields="image_vector")

def generate_embeddings(text, aiVisionEndpoint, aiVisionApiKey):

    url = f"{aiVisionEndpoint}/computervision/retrieval:vectorizeText"

    params = {

        "api-version": "2023-02-01-preview"

    }

    headers = {

        "Content-Type": "application/json",

        "Ocp-Apim-Subscription-Key": aiVisionApiKey

    }

    data = {

        "text": text

    }

    response = requests.post(url, params=params, headers=headers, json=data)

    if response.status_code == 200:

        embeddings = response.json()["vector"]

        return embeddings

    else:

        print(f"Error: {response.status_code} - {response.text}")

        return None

query = "farm"

vector_text = generate_embeddings(query, aiVisionEndpoint, aiVisionApiKey)

vector_query = RawVectorQuery(vector=vector_text,

                              k=3,

                              fields="image_vector")

# Perform vector search

results = search_client.search(

    search_text=query,

    vector_queries= [vector_query],

    select=["description"]

)

for result in results:

    print(f"{result['description']}")

    display(Image(DIR_PATH + "/images/" + result["description"]))

    print("\n")