Thursday, May 1, 2025

This is a summary of the book "Give to Grow: Invest in Relationships to Build Your Business and Grow Your Career" by Mo Bunnell, published by Bard Press in 2024. The author, a performance and growth consultant, offers a framework for developing relationships that boost productivity and growth, drawing on his decades in business development and consulting. It is an easy-to-read, high-impact book. Among the tenets of his framework: invest in client relationships to unlock growth and drive long-term success, increase your value, root out self-limiting false beliefs, show clients a genuine desire to understand their problems and help them, demonstrate your expertise, ensure your success by always taking action and "thinking in bets," and thereby grow your clients, grow your team, and grow your scale.

To achieve full performance and growth potential, prioritize relationships as the foundation for long-term success. The Give to Grow framework guides individuals in two components of high performance: "Doing the Work" by delivering outcomes to clients and "Winning the Work" by developing relationship skills. Top performers distinguish themselves by their focus on long-term relationships, which generate growth and open up new opportunities. Top performers in complex roles deliver between eight and thirty times the value of average employees. The key difference between top performers and others is their focus on growth strategies and actions. Top performers prioritize client conversations, engage in extensive research, and embrace an ethos of continuous improvement. They translate annual goals into weekly priorities and take time to reflect on what worked and what didn't after each client meeting.

Adam Grant's book Give and Take identifies three types of people: "Takers" who seek the best outcome for themselves, "Matchers" who negotiate fair deals, and "Givers" who are perpetually generous. Successful people are Givers, who focus on their most important relationships and give without demanding anything in return. They maintain healthy boundaries to prevent burnout. To become a Strategic Giver, reach out to clients frequently, helping them even when they aren't in a position to buy from you, so that you become the client's first call when a need arises. Expand your idea of growth by enlarging your network and investing in relationships. To reach your highest growth potential, identify false beliefs about yourself that can limit your growth, such as "I can't," "I don't know how," "I might do it wrong," "I'm too busy," and "I might look bad," and replace them with a growth mindset. Overcoming these fears helps you grow and become a more effective professional.

To effectively engage clients, it is essential to show genuine interest and engagement. This can be achieved by creating a two-sided, enjoyable, and energizing conversation, keeping meetings productive, and offering different forms of support. Connect with clients by finding commonalities, reducing stress through humor, and celebrating incremental progress. Focus on their engagement and aim to "fall in love" with their problem, ensuring they feel seen and heard. Before each meeting, reflect on questions that will help you better understand the client's situation, and listen attentively. Demonstrate your expertise by giving potential clients a taste of what working with you would be like, such as providing a technical analysis free of charge. This groundwork will position you as the best candidate for the job. When meeting with clients, always give them a recommendation regarding their next steps, allowing them to make better decisions while positioning you as a guide and expert. It is best to appear "passionately agnostic" and give them space to choose their next steps.

As you grow in your career, remember that you can always improve your situation, even in difficult circumstances. Strengthen relationships and respond to setbacks with compassion and generosity. Identify three high-impact tasks every week and schedule time for them, aligning with your vision of long-term growth. "Think in bets" by investing time and energy in the opportunities with the biggest payoff.

High performers experience three levels of business growth: growth in client list, growth in team, and growth at scale. In the first stage, make yourself indispensable by bringing in more business than it costs your organization to employ you. As you build success, build a team to support you, delegate more to free up time, and scale your success throughout your organization. As your business scales, view your impact holistically, focusing on helping others succeed.


Wednesday, April 30, 2025

An image processing pipeline can have any number of extensions or operators; it is not limited to proprietary models or techniques. In fact, if you have already captured images for certain locations and labeled the objects of interest, you can plug in your own model to process the next round of images, say from the UAV swarm flight, which will then prioritize your predictions during the test flight and route autonomously. This widens the strategy and purpose of developing applications that can leverage this pipeline for their specific use cases. Objects detected using a bring-your-own-model processor can still be registered to a world catalog.
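As a minimal sketch of what plug-n-play means here (all names below are hypothetical, not part of the DFCS pipeline API), a bring-your-own-model operator only needs to be a callable that maps a frame to detections, which the pipeline can then register against the world catalog:

from typing import Callable, List, Tuple

# A detection: (label, confidence, bounding box as x, y, width, height in pixels)
Detection = Tuple[str, float, Tuple[int, int, int, int]]

def my_model_operator(image) -> List[Detection]:
    # Stand-in for a user-supplied model trained on previously labeled imagery.
    return [("car", 0.93, (100, 120, 40, 24))]

def run_pipeline(image, operators: List[Callable]) -> List[Detection]:
    # Apply each registered operator to the frame and collect its detections
    # so they can later be registered to the world catalog.
    detections = []
    for op in operators:
        detections.extend(op(image))
    return detections

# Plug-n-play: the user's model sits alongside any built-in operators.
sample_frame = object()  # stand-in for a decoded image frame
print(run_pipeline(sample_frame, [my_model_operator]))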


As an example, preprocessing of drone images, say from a dataset of 512x512 resolution highway images annotated in the Pascal VOC format, could leverage the following transforms (a combined sketch of a few of them appears after the list):


1. Filters using kernels. A kernel is any matrix A that, when convolved with an image B, transforms B in a way that highlights a certain feature. Finding features in images can be helpful to classification.


2. CNN: A Convolutional Neural Network takes an image and produces a vector based on embeddings derived from its training. Most Landing.AI experiments with images leverage this technique. It applies different kernels across the image and constantly improves these kernels using gradient descent. MobileNet is an example of a model suitable for drone imagery. Another example is YOLOv3, from which we sourced most of the runtime.


3. LSTM: a Long Short-Term Memory neural network uses previous predictions and occurrences as a basis for predicting from the current input. This helps with temporal information such as movement.


4. Augmentation: certain shifts, dilations, and rotations applied to images as part of preprocessing before the CNN would be covered by this operator. This can be a great way to normalize all the input images to a common standard.


5. Gaussian blurring: a kernel applied across the image to balance each pixel against its neighbors, making transitions smoother. A 5x5 kernel with a standard deviation of 2 could be an example blurring kernel.


6. Edge detection: comes in very handy for detecting road boundaries, which in turn helps analyze a variety of drone imagery and yields useful information. Canny is one such edge detection algorithm, but you can bring your own.


7. Heat map: a variety of probability functions can be used to create a probability map of the image, color coded or gray scale, so that lighter areas are areas of importance and darker regions are less important.
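As a combined sketch of a few of these operators (the file names and thresholds here are placeholders, not part of the pipeline itself), items 1, 5, 6, and 7 map directly onto OpenCV calls:

import cv2
import numpy as np

# Load a drone frame; the path is a placeholder.
image = cv2.imread("highway_512.png", cv2.IMREAD_GRAYSCALE)

# 5. Gaussian blurring: 5x5 kernel with standard deviation 2 smooths pixel transitions.
blurred = cv2.GaussianBlur(image, (5, 5), 2)

# 6. Edge detection: Canny highlights road boundaries; thresholds are tunable.
edges = cv2.Canny(blurred, 50, 150)

# 1. A generic kernel filter: this 3x3 sharpening kernel highlights fine detail.
sharpen_kernel = np.array([[0, -1, 0], [-1, 5, -1], [0, -1, 0]], dtype=np.float32)
sharpened = cv2.filter2D(image, -1, sharpen_kernel)

# 7. Heat map: normalize a per-pixel score (here, edge strength) and color code it
# so lighter/warmer areas mark regions of importance.
scores = cv2.normalize(edges.astype(np.float32), None, 0, 255, cv2.NORM_MINMAX)
heatmap = cv2.applyColorMap(scores.astype(np.uint8), cv2.COLORMAP_JET)

cv2.imwrite("edges.png", edges)
cv2.imwrite("heatmap.png", heatmap)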





Tuesday, April 29, 2025

 Multimodal image search

The following code snippet shows how multimodal search can be used to search images. The images are indexed and searched based on vector embeddings, but the query is text based.

from dotenv import load_dotenv

import http.client

import json

import os

import urllib.parse

import requests

from tenacity import retry, stop_after_attempt, wait_fixed

from azure.core.credentials import AzureKeyCredential

from azure.identity import DefaultAzureCredential

from azure.search.documents import SearchClient

from azure.search.documents.indexes import SearchIndexClient

from azure.search.documents.models import (

    RawVectorQuery,

)

from azure.search.documents.indexes.models import (

    ExhaustiveKnnParameters,

    ExhaustiveKnnVectorSearchAlgorithmConfiguration,

    HnswParameters,

    HnswVectorSearchAlgorithmConfiguration,

    SimpleField,

    SearchField,

    SearchFieldDataType,

    SearchIndex,

    VectorSearch,

    VectorSearchAlgorithmKind,

    VectorSearchProfile,

)

from IPython.display import Image, display

load_dotenv()

service_endpoint = os.getenv("AZURE_SEARCH_SERVICE_ENDPOINT")

index_name = os.getenv("AZURE_SEARCH_INDEX_NAME")

api_version = os.getenv("AZURE_SEARCH_API_VERSION")

key = os.getenv("AZURE_SEARCH_ADMIN_KEY")

aiVisionApiKey = os.getenv("AZURE_AI_VISION_API_KEY")

aiVisionRegion = os.getenv("AZURE_AI_VISION_REGION")

aiVisionEndpoint = os.getenv("AZURE_AI_VISION_ENDPOINT")

credential = AzureKeyCredential(key)

search_client = SearchClient(endpoint=service_endpoint, index_name=index_name, credential=credential)

query_image_path = "images/PIC01.jpeg"

@retry(stop=stop_after_attempt(5), wait=wait_fixed(1))

def get_image_vector(image_path, key, region):

    headers = {

        'Ocp-Apim-Subscription-Key': key,

    }

    params = urllib.parse.urlencode({

        'model-version': '2023-04-15',

    })

    try:

        if image_path.startswith(('http://', 'https://')):

            headers['Content-Type'] = 'application/json'

            body = json.dumps({"url": image_path})

        else:

            headers['Content-Type'] = 'application/octet-stream'

            with open(image_path, "rb") as filehandler:

                image_data = filehandler.read()

                body = image_data

        conn = http.client.HTTPSConnection(f'{region}.api.cognitive.microsoft.com', timeout=3)

        conn.request("POST", "/computervision/retrieval:vectorizeImage?api-version=2023-04-01-preview&%s" % params, body, headers)

        response = conn.getresponse()

        data = json.load(response)

        conn.close()

        if response.status != 200:

            raise Exception(f"Error processing image {image_path}: {data.get('message', '')}")

        return data.get("vector")

    except (requests.exceptions.Timeout, http.client.HTTPException) as e:

        print(f"Timeout/Error for {image_path}. Retrying...")

        raise

vector_query = RawVectorQuery(vector=get_image_vector(query_image_path,

                                                      aiVisionApiKey,

                                                      aiVisionRegion),

                              k=3,

                              fields="image_vector")

def generate_embeddings(text, aiVisionEndpoint, aiVisionApiKey):

    url = f"{aiVisionEndpoint}/computervision/retrieval:vectorizeText"

    params = {

        "api-version": "2023-02-01-preview"

    }

    headers = {

        "Content-Type": "application/json",

        "Ocp-Apim-Subscription-Key": aiVisionApiKey

    }

    data = {

        "text": text

    }

    response = requests.post(url, params=params, headers=headers, json=data)

    if response.status_code == 200:

        embeddings = response.json()["vector"]

        return embeddings

    else:

        print(f"Error: {response.status_code} - {response.text}")

        return None

query = "farm"

vector_text = generate_embeddings(query, aiVisionEndpoint, aiVisionApiKey)

vector_query = RawVectorQuery(vector=vector_text,

                              k=3,

                              fields="image_vector")

# Perform vector search

results = search_client.search(

    search_text=query,

    vector_queries= [vector_query],

    select=["description"]

)

for result in results:

    print(f"{result['description']}")

    display(Image("images/" + result["description"]))

    print("\n")


Monday, April 28, 2025

 Image processing is made easy with platforms like landing.ai

As an example, the following is an application that counts cars in drone images. The dataset is based on 512x512 resolution images of highways and is annotated in the Pascal VOC format. The model is hosted and can be called with a sample web request as follows:

from PIL import Image

from landingai.predict import Predictor

# Enter your API Key

endpoint_id = "11cb6c44-3b6a-4b47-bac9-031826bc80ea"

api_key = "YOUR_API_KEY"

# Load your image

image = Image.open("image.png")

# Run inference

predictor = Predictor(endpoint_id, api_key=api_key)

predictions = predictor.predict(image)


It can also be invoked via the agentic AI framework as follows:

import requests

url = "https://api.va.landing.ai/v1/tools/agentic-object-detection"

files = {

  "image": open("{{path_to_image}}", "rb")

}

data = {

  "prompts": "{{prompt}}",

  "model": "agentic"

}

headers = {

  "Authorization": "Basic {{your_api_key}}"

}

response = requests.post(url, files=files, data=data, headers=headers)

print(response.json())


For context on the DFCS drone video sensing platform, please check the references.


Sunday, April 27, 2025

 Some more illustrations for drone imagery processing:

def stable_groups(keypoints, groups, threshold):
    # Assign each keypoint to an existing group when (1) its descriptor is close to the
    # group's mean descriptor and (2) its pixel position is close to the group's most
    # recent pixel once that pixel is advanced by Lucas-Kanade optical flow.
    for kp in keypoints:
        matched = False
        for group in groups:
            mean_feature = get_mean_feature(group)
            recent_pixel = get_recent_pixel(group)
            if (abs(kp.feature - mean_feature) < threshold and
                    abs(lucas_kanade_optical_flow(recent_pixel) - kp.pixel) < threshold):
                group.add(kp)
                matched = True
                break
        if not matched:
            # No existing group is close enough; the keypoint seeds a new singleton group.
            groups.add(create_group(kp))


def global_groups(stable_groups, global_groups, threshold):
    # Merge stable groups that describe the same world location. The optical flow
    # constraint is replaced by a position-estimate (least-squares) similarity constraint.
    for stable_group in stable_groups:
        matched = False
        for global_group in global_groups:
            mean_feature = get_mean_feature(global_group)
            if (abs(get_mean_feature(stable_group) - mean_feature) < threshold and
                    delta_least_squares(stable_group, global_group) < threshold):
                global_group.add(stable_group)
                matched = True
                break
        if not matched:
            global_groups.add(create_global_group(stable_group))


def spherical_gps_to_position_n_orientation(gps, frame):
    # Combine the GPS fix and compass reading for this frame into the drone's
    # position and height (d.x, d.y, d.h) in world coordinates.
    return (d.x, d.y, d.h)


def camera_angle(keypoint, resolutionW, resolutionH, field_of_view):
    # Angle between the camera axis and the ray through the keypoint's pixel,
    # where x1 is the horizontal pixel offset from the image center.
    x1 = keypoint.pixel.x - resolutionW / 2
    return arctan((x1 * tan(field_of_view / 2)) / (resolutionW / 2))


def world_coordinates(keypoint, drone_frame):
    # Solve the following for the world location s = (s.x, s.y, s.h),
    # where d is the drone position for this frame and theta_x, theta_y are
    # the camera angles of the keypoint along each image axis:
    #   1. (d.h - s.h) * tan(theta_x) = s.x - d.x
    #   2. (d.h - s.h) * tan(theta_y) = s.y - d.y
    # return (s.x, s.y, s.h)


Saturday, April 26, 2025

This is an illustration of SIFT feature extraction:

import cv2

# SIFT lives in the contrib package for this OpenCV version; newer builds expose cv2.SIFT_create().
sift = cv2.xfeatures2d.SIFT_create()

def compute_one(im):
    return sift.detectAndCompute(im, None)

def compute_sift(frames):
    # Compute SIFT keypoints and descriptors for every third frame; other slots stay (None, None).
    print('get sift features')
    sift_features = [(None, None) for _ in frames]
    for frame_idx, im in enumerate(frames):
        if im is None or frame_idx % 3 != 0:
            continue
        print('... sift {}/{}'.format(frame_idx, len(frames)))
        keypoints, descs = compute_one(im)
        sift_features[frame_idx] = (keypoints, descs)
    return sift_features


Friday, April 25, 2025

 Drone Imagery Processing

We mentioned that the drone video sensing platform DFCS comprises an image processor, an analytical engine, and a drone router, where the vision processor creates vectors of KeyPoints, each a tuple of pixel position and the feature descriptor of the patch around that pixel, which translate to world coordinates and time-lapse information for that location. This article explains some of the tenets of the image processor.

One of the main requirements of the image processor is fast frame alignment. Given that the images could come from any unit of the UAV swarm and from any position, the alignment of video frames is essential for subsequent tasks such as object detection and change-tracking. These three tasks are completed with the help of operators in an image pipeline fed with images from the drones' sensors. The first flight around the user-specified region provides most of the survey of the landscape and brings in images from various vantage points; most of the images from this first video are top-down imagery.

Frame alignment computes a mapping from each pixel to world coordinates (longitude, latitude, height), while object detection and change-tracking encode the structured information obtained from the images; machine learning models extract this information from the video. Frame alignment efficiently combines GPS and compass readings with image features, so there is no need to compute or stash intermediary or output images from this processing. SIFT feature extraction derives KeyPoints in each video frame. KeyPoints that describe the same world location, such as a road divider or a chimney, are then grouped in two phases: first, stable groups are created from KeyPoints in multiple top-down images within a segment of the video from an aerial flight over the location, and then global groups are created by merging stable groups that describe the same world location. This consolidates all KeyPoints pertaining to a world location. A video frame is then aligned by matching the SIFT KeyPoints computed in that single frame against the global groups, and this matching is used to estimate the drone's position and orientation when it captured the frame. In short, SIFT yields KeyPoints, grouping yields the KeyPoints corresponding to the same world location, and frame alignment yields position and orientation.

Grouping is iterative and starts with an empty set. For each frame, a KeyPoint is matched against existing groups based on two conditions: 1. the distance between the KeyPoint descriptor and the mean of the descriptors in the group must lie below a threshold, and 2. the pixel position of the group's most recent KeyPoint, when transformed via optical flow, must fall within a small threshold of the KeyPoint's pixel position. Closeness is measured by Euclidean distance and the transformation is done with the Lucas-Kanade method. If there is no match, the KeyPoint becomes a new singleton group. Both existing and new groups are added to the global set of groups.
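The optical flow test in the grouping step can be sketched with OpenCV's pyramidal Lucas-Kanade implementation; the function name, frame arguments, and pixel threshold below are placeholders, not the platform's actual interfaces:

import cv2
import numpy as np

def flow_matches(prev_gray, curr_gray, recent_pixel, candidate_pixel, pixel_threshold=3.0):
    # Advance the group's most recent pixel into the current frame with Lucas-Kanade
    # optical flow, then check Euclidean closeness to the candidate keypoint's pixel.
    prev_pts = np.array([[recent_pixel]], dtype=np.float32)  # shape (1, 1, 2)
    next_pts, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, curr_gray, prev_pts, None)
    if status[0][0] == 0:
        return False  # the flow could not be tracked into the current frame
    predicted = next_pts[0][0]
    distance = np.linalg.norm(predicted - np.array(candidate_pixel, dtype=np.float32))
    return distance < pixel_threshold

A group would call this with its most recent KeyPoint's pixel and the candidate KeyPoint's pixel, alongside the descriptor similarity check.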

After this aggregation into groups, GPS and compass readings are used to determine the world co-ordinates of the stable groups. To merge stable groups into global groups, the co-ordinates of a global group are computed as the average across those of its stable groups, and the optical flow constraint is replaced with a position-estimate similarity constraint that requires the least-squares error to fall below a threshold.
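A minimal numpy sketch of that merging constraint, with hypothetical names for the group structures, might look like the following:

import numpy as np

def position_similarity_ok(stable_group_coords, global_group_coords, threshold=1.0):
    # stable_group_coords: world-coordinate estimates (x, y, h) contributed by one stable group.
    # global_group_coords: estimates already accumulated in the candidate global group.
    stable = np.asarray(stable_group_coords, dtype=np.float64)
    existing = np.asarray(global_group_coords, dtype=np.float64)
    # The global group's position is the average of its members' estimates.
    global_mean = existing.mean(axis=0)
    # Least-squares error of the stable group's estimates against that position.
    error = np.mean(np.sum((stable - global_mean) ** 2, axis=1))
    return error < threshold

def merged_position(stable_group_coords, global_group_coords):
    # After a merge, the global group's co-ordinates are the average across all estimates.
    all_coords = np.vstack([stable_group_coords, global_group_coords])
    return all_coords.mean(axis=0)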