Friday, April 4, 2025

 This article is about AI agents: the types that exist, how they are developed, and how they are implemented in the real world. Automation for complex workflows with curated artifacts has long existed, but such systems were boutique and never intended for Large Language Models. While information can be tapped from multiple data sources or a knowledge base, enhanced decision-making requires leveraging AI agents. This article describes the operational framework of AI agents and the ways they augment LLMs.

AI agents are software entities that complete tasks autonomously on behalf of a user, including making requests to other services, which extends the reach of standalone LLMs. They can retrieve real-time data from external databases and APIs, manage interactive sessions with users, and automate routine tasks that can be invoked dynamically or on a schedule with different parameters. An agent framework provides the tools and structures a developer needs to build robust, scalable, and efficient agent-based systems. It is an evolution of the Reason-Action (ReAct) framework, in which an LLM is prompted to follow Thought/Action/Observation sequences. The agent framework extends this by including external tools in the action step. The tools can range from simple calculators and database calls to Python code generation and execution, and even interactions with other agents. The calling program typically parses the output of the LLM at each step to determine what to do next. As an example, a prompt to find the weather in Seattle, WA involves a thought that a weather API must be accessed, an action to call the weather API with the location, and an observation of the response, followed by a thought that the relevant information is 65 degrees Fahrenheit and sunny, an action to report it to the user, and an observation that the user has been informed. By increasing iterations, granularity of articulation, dynamic adaptability, and interoperability, the decision-making process can be enhanced arbitrarily. Compared with traditional software agents, LLMs and vector databases allow far less distraction from syntax and format and far more emphasis on semantics and latent meaning. This helps agents provide more dynamic, context-aware responses.
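The parse-and-dispatch loop described above can be sketched in a few lines. This is a minimal illustration, not a production agent: `weather_api` and the `Action: tool[argument]` output format are hypothetical stand-ins for a real tool and a real LLM's formatting.

```python
import re

def weather_api(location):
    # Hypothetical stand-in for a call to an external weather service.
    return f"65 degrees Fahrenheit and sunny in {location}"

TOOLS = {"weather": weather_api}

def react_step(llm_output):
    """Parse one LLM step; if it names an Action, run the matching tool
    and return the Observation to feed into the next prompt."""
    match = re.search(r"Action:\s*(\w+)\[(.*?)\]", llm_output)
    if not match:
        return None  # no tool call: the LLM produced a final answer
    tool, arg = match.group(1), match.group(2)
    return f"Observation: {TOOLS[tool](arg)}"

# The calling program parses each step to determine the next one.
step = "Thought: I need the weather. Action: weather[Seattle, WA]"
print(react_step(step))
```

A real agent would loop: append each observation to the conversation, call the LLM again, and stop when no further action is requested.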

LLM agents are diverse, with each tailored to address specific challenges in information processing, decision support, and task automation. Task-specific agents are designed to perform narrow, well-defined tasks. Conversational agents interact with users in natural language rather than a query language. Decision-support agents analyze complex data and provide insights. Workflow-automation agents coordinate and execute multi-step processes across different systems. Information-retrieval agents search and extract relevant information from large datasets or document repositories. Collaborative agents work creatively with humans to accomplish complex tasks. Predictive agents use historical data and current trends to forecast future outcomes. Adaptive learning agents improve their performance over time by learning from interactions and feedback. By categorizing agents in this way, an organization can streamline its operations, improve customer experiences, and gain valuable insights.


Thursday, April 3, 2025

 This is a summary of the book titled “Win the Inside Game,” written by Steve Magness and published by HarperOne in 2025. The author is a performance coach who argues for cognitive and psychological strategies to start living up to your full potential, especially at a time of increasing burnout and, for many, a crisis of meaning. As we immerse ourselves in the workplace and social media, Steve suggests developing a healthy sense of self-worth and intrinsic motivation. The pressures of modern life keep us in survival mode, where we are merciless on ourselves and on what growth and purpose mean for us. Hard work is not always virtuous. Intrinsic motivation and playful exploration foster a sense of belonging and growth. By accepting ourselves and showing self-compassion, we can embrace the messiness of life. We learn to recast failures and losses as opportunities for learning and growth. We must proactively surround ourselves with people, objects, and environments that support our growth. We will find more freedom and authenticity by disrupting the state of fear.

Many people live in "survival mode," feeling trapped in a fight for survival by the pressures of modern life. This mode, which involves responding to threats by avoiding or shutting down, fighting and defending, or narrowing and clinging rather than accepting and exploring, can hinder growth and undermine a sense of life's meaning. Existential psychologist Tatjana Schnell suggests four qualities essential for a meaningful life: coherence, significance, purpose, and belonging. These qualities can be elusive in the modern world, however, with social media platforms encouraging inauthentic self-presentation, a productivity-obsessed work culture, and superficial online interactions failing to foster a deep sense of belonging. Recognizing and addressing these needs can help individuals thrive in a world that often feels too big for their minds to handle.

The belief that hard work is virtuous can hinder happiness in the modern world. The Protestant notion of hard work as a virtue has led to an unhealthy fixation on external indicators of success, producing poorer performance, stress, anxiety, and burnout. To thrive, prioritize intrinsic motivations and do what matters to you rather than competing with others.

High performance comes from intense work and commitment, which can grow only from an internal drive that manifests at the intersection of interest, motivation, and talent. Children naturally explore various interests, often becoming obsessed for periods of time. As people grow older, they often feel a need to choose a particular path, but rigid attachment to a narrow identity can lead to feelings of missing out and a crisis of meaning.

To achieve sustainable excellence, bring childlike exploration back into your life, seek a balance between exploration and commitment, alternate between narrowing and broadening focus, and be wary of success that might lead to cementing a commitment for the wrong reasons.

To embrace the messiness of life, cultivate self-compassion, "be someone," and "integrate the messiness." Accept your inner critic and focus on wisdom and courage to alleviate suffering. Hold onto a core sense of yourself that endures even in failures and setbacks. Seek meaning from diverse sources, such as hobbies or volunteering. Craft an empowering narrative of your journey to increase resilience and stress management. Recast failures and losses as opportunities for learning and growth.

In today's world, losing well is essential. Losing well means accepting a loss and learning from it, rather than throwing a tantrum or shutting down. Failure brings clarity and helps you see yourself and your pursuits as they are. Learning to lose well also helps you win better, because emotional outbursts or avoidance after a loss lead to retreat and self-protection. Reframe your performance and view success and failure as parts of your learning and growth journey.

To exit survival mode, create an environment that supports growth and downregulates the nervous system. Research shows that a person's physical environment can significantly affect performance; one study found that making an office feel more like home can improve performance by up to 160%. Such an environment creates psychological ownership, which supports emotional needs for identity, belonging, and safety. Surround yourself with people who inspire you and serve as role models, and cultivate relationships that feel expansive.

To find more freedom and authenticity, disrupt the state of fear by using physiological techniques to reset the nervous system. If your fears aren't life-threatening, confront them deliberately, such as dressing in a ridiculous outfit and going out in public.

Reducing attachment to specific outcomes and approaching life with more openness can help you stop living in a fearful, protective, and defensive state and start thriving. By embracing change, you can grow, adapt, form genuine relationships, and achieve goals that align with your authentic self.


Wednesday, April 2, 2025

 This is a summary of the book titled “Energy: A Human History,” written by Richard Rhodes and published by Simon and Schuster in 2018. The author is a prolific and acclaimed writer who covers the major innovations in energy over the last 400 years. As a journal of scientific history, this book covers breakthroughs such as turning coal into steam; building railroads, the electric grid, and automobiles; and eventually harnessing the power of the atom. His lessons on the benefits and risks of each source of energy hold value as we evaluate the challenges of climate change. Cheap, abundant energy has driven prosperity for society. Wood was scarce, which prompted the turn to coal, but mining and transporting it was difficult. Canals were built, and steam drove the railways, beginning in 1830. The search for oil began because kerosene could be distilled from bitumen. Electricity was a breakthrough for the industrial age, and the invention of the internal combustion engine propelled transportation. Oil became a global hunt, with the Middle East becoming a huge producer. World War II spurred the development of nuclear power, and its aftermath highlighted the pollution from power generation. Understanding the benefits and risks of each source of energy is crucial to managing environmental impact.

Over the last 400 years, western societies have demonstrated remarkable innovation in finding and exploiting new energy sources. Wood gave way to coal, and coal made room for oil, as coal and oil now make way for natural gas, nuclear power, and renewables. Obscure inventors and scientists made great advances motivated by the scarcity, cost, or other shortcomings of existing energy sources, delivering more efficient sources of heat, light, and transportation.

Elizabethan England's scarcity of wood led to the search for alternatives, such as coal, which provided energy but was difficult to mine. As coal mining expanded, miners faced the problem of flooding, leading to the development of steam engines. James Watt patented a better steam engine in 1769, which was sold to coal miners and other industrialists under an exclusive patent until 1800.

Mine owners built canals to reduce the cost of transporting coal, such as the Bridgewater Canal, which reduced the price of coal in Manchester by 50%. Improved smelting allowed for the use of iron rails for efficient coal hauling.

The Liverpool and Manchester Railway, the first commercial passenger and freight railway, opened in 1830, powered by steam. Cornish inventor Richard Trevithick had developed a high-pressure steam engine, which could be much smaller than earlier models, and George Stephenson won a competition to demonstrate the safety and speed of steam-powered rail, clearing the way for the railway's opening. The first gas lights were installed in London in 1807. The search for oil began with kerosene, a fuel for lighting invented by Canadian physician Abraham Gesner. The US Civil War boosted the market for oil, with production reaching 4.8 million barrels by 1870. However, the environmental costs of drilling, transporting, and distilling oil soon became apparent, as the process was messy.

Electricity was a significant energy source that fueled economic growth. Though electricity had been known for some time, scientists were at first unsure how to put it to use. Hans Christian Oersted discovered electromagnetism, which enabled the generation of electricity in quantities sufficient for practical use. Early developers recognized Niagara Falls as a potential power source and worked to harness it effectively. William Stanley Jr. developed alternating current (AC), allowing transmission over long distances. Westinghouse built generators and transmission lines to harness Niagara Falls' power, making Buffalo, New York, the first electrified city. The introduction of electric streetcars in the 1880s reduced transportation costs and accelerated city growth. Henry Ford developed his first automobile in 1896, using a gasoline-powered internal combustion engine.

The hunt for oil became international, leading to exploration in the Middle East. In 1933, Standard Oil of California signed a 60-year deal with Saudi Arabia, leading to a significant discovery in 1938. The development of oil fields required the construction of oil and gas pipelines, which were later used to deliver natural gas, a by-product of oil production. The outbreak of World War II boosted the demand for oil, leading to the construction of the world's largest and longest oil pipeline, the Big Inch. The Atomic Energy Act granted the US government a monopoly on nuclear power, but the Atomic Energy Commission created a joint venture to build a nuclear reactor near Pittsburgh in 1953. By the 1950s, the problem of pollution from power generation had become evident; the connection between air pollution and health had been poorly understood until the mid-20th century.

In the 1950s, a chemist at the California Institute of Technology discovered that smog in Los Angeles was caused by automobile and factory emissions interacting with sunlight and ozone. This discovery led to the 1970 US Clean Air Act. Wealth has been linked to environmental regulation, with wealthier societies becoming cleaner and healthier. Understanding the benefits and risks of energy sources is crucial for managing environmental impact and addressing climate change. Public awareness of climate change has grown, spurring research on renewable energy sources like wind and solar power.


Tuesday, April 1, 2025

 The vectors generated by embedding models are often stored in a specialized vector database, optimized for storing and retrieving vector data efficiently. Like traditional databases, vector databases can manage permissions, metadata, and data integrity, ensuring secure and organized access to information. They also tend to include update mechanisms so that newly added texts are indexed and ready to use quickly.
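As a rough sketch of what such a database does at its core, the following stores vectors alongside their source text and retrieves the nearest entries by cosine similarity. Real vector databases add indexing structures, metadata filtering, and access control on top of this; `TinyVectorStore` is an illustrative name, not a real product.

```python
import math

class TinyVectorStore:
    """Minimal sketch of a vector database: store vectors with their
    source text and retrieve the nearest entries by cosine similarity."""
    def __init__(self):
        self.items = []  # list of (vector, text) pairs

    def add(self, vector, text):
        self.items.append((vector, text))

    def search(self, query, k=1):
        def cosine(a, b):
            dot = sum(x * y for x, y in zip(a, b))
            na = math.sqrt(sum(x * x for x in a))
            nb = math.sqrt(sum(x * x for x in b))
            return dot / (na * nb)
        # Rank stored items by similarity to the query vector.
        ranked = sorted(self.items, key=lambda it: cosine(it[0], query), reverse=True)
        return [text for _, text in ranked[:k]]

store = TinyVectorStore()
store.add([1.0, 0.0], "doc about cats")
store.add([0.0, 1.0], "doc about finance")
print(store.search([0.9, 0.1]))  # nearest neighbour: the cats document
```

Production systems replace the exhaustive sort with approximate nearest-neighbour indexes so retrieval stays fast at millions of vectors.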

The difference that a vector database and Retrieval Augmented Generation (RAG) make is easiest to explain with an example. When a chatbot powered by the Llama 2 LLM is asked about an acronym that was not part of its training text, it tends to guess, responding with an incorrect expansion and elaborating on what that might be. It does not even hint that it might be making things up. This is often referred to as hallucination. But if a RAG system is set up with access to documentation that explains what the acronym stands for, the relevant information is indexed and becomes part of the vector database, and the same prompt now yields more pertinent information. With RAG, the LLM provides correct answers.

If the prompt is supplied with the relevant documents that contain an answer, which is referred to as augmenting the prompt, the LLM can leverage them, together with the vector database, to provide more compelling and coherent answers that are knowledgeable rather than hallucinated. By automating this process, chat responses can be made satisfactory every time. This requires the additional step of building a retrieval system backed by a vector database, and it may also involve extra data processing and management of the generated vectors. RAG has the added benefit of letting the LLM consolidate multiple sources of data into a readable output tailored to the user's prompt. RAG applications can also incorporate proprietary data, which sets them apart from the public data that most LLMs are trained on. The data can be kept up to date, so the LLM is not restricted to the point in time at which it was trained. RAG reduces hallucinations and allows the LLM to provide citations and query statistics, making the processing more transparent to users. As with all retrieval systems, fine-grained data access control brings its own advantages.
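Augmenting the prompt can be as simple as prepending the retrieved documents to the user's question before sending the combined text to the LLM. The function below is a minimal sketch of that step; the exact instruction wording is an assumption.

```python
def augment_prompt(question, retrieved_docs):
    """Build an augmented prompt: prepend retrieved context so the
    LLM answers from the documents instead of guessing."""
    context = "\n".join(f"- {doc}" for doc in retrieved_docs)
    return (
        "Answer the question using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {question}"
    )

docs = ["RAG stands for Retrieval Augmented Generation."]
prompt = augment_prompt("What does RAG stand for?", docs)
print(prompt)
```

In a full pipeline, `retrieved_docs` would come from a vector-database similarity search over the user's question, and the returned string would be sent to the LLM in place of the raw question.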

There are four steps for building Retrieval-Augmented Generation (RAG):

1. Data Augmentation

a. Objective: Prepare data for a real-time knowledge base and contextualization in LLM queries by populating a vector database.

b. Process: Integrate disparate data using connectors, transform and refine raw data streams, and create vector embeddings from unstructured data. This step ensures that the latest version of proprietary data is instantly accessible for GenAI applications.

2. Inference

a. Objective: Connect relevant information with each prompt, contextualizing user queries and ensuring GenAI applications handle responses accurately.

b. Process: Continuously update the vector store with fresh sensor data. When a user prompt comes in, enrich and contextualize it in real-time with private data and data retrieved from the vector store. Stream this information to an LLM service and pass the generated response back to the web application.

3. Workflows

a. Objective: Parse natural language, synthesize necessary information, and use reasoning agents to determine the next steps to optimize performance.

b. Process: Break down complex queries into simpler steps using reasoning agents, which interact with external tools and resources. This involves multiple calls to different systems and APIs, processed by the LLM to give a coherent response. Stream Governance ensures data quality and compliance throughout the workflow.

4. Post-Processing

a. Objective: Validate LLM outputs and enforce business logic and compliance requirements to detect hallucinations and ensure trustworthy answers.

b. Process: Use frameworks like BPML or Morphir to perform sanity checks and other safeguards on data and queries associated with domain data. Decouple post-processing from the main application to allow different teams to develop independently. Apply complex business rules in real-time to ensure accuracy and compliance and use querying for deeper logic checks.

These steps collectively ensure that RAG systems provide accurate, relevant, and trustworthy responses by leveraging real-time data and domain-specific context.
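As an illustration of the post-processing step, a sketch like the following could gate an LLM answer behind a few business rules before returning it. The specific checks here (non-empty answer, citation of a known source document, a length limit) are hypothetical examples, not the frameworks named above.

```python
def post_process(answer, source_docs, require_citation=True):
    """Sketch of a post-processing gate: run simple sanity checks on an
    LLM answer and report which rules passed or failed."""
    checks = {
        # The answer must not be blank.
        "non_empty": bool(answer.strip()),
        # The answer should reference at least one retrieved source id,
        # a crude proxy for being grounded rather than hallucinated.
        "grounded": (not require_citation)
                    or any(doc_id in answer for doc_id in source_docs),
        # Keep responses within a size budget.
        "length_ok": len(answer) < 2000,
    }
    return all(checks.values()), checks

ok, report = post_process("See [doc-42]: revenue grew 10%.", ["doc-42"])
print(ok, report)
```

Running this as a separate service, as the text suggests, lets the rule set evolve independently of the main retrieval and generation application.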

#codingexercise: CodingExercise-03-31-2025.docx

Monday, March 31, 2025

 This is a summary of the book titled “Prompt Engineering for Generative AI: Future-Proof Inputs for Reliable AI Outputs,” written by James Phoenix and Mike Taylor and published by O’Reilly in 2024. The authors are data scientists who explain that the varying results from queries to Large Language Models such as ChatGPT can be made more accurate, relevant, and consistent with prompt engineering, which focuses on how the inputs are worded so that users can truly harness the power of AI. With several examples, they teach the ins and outs of crafting text- and image-based prompts that yield desirable outputs. LLMs, particularly those used in chatbots, are trained on large datasets to output human-like text, and there are principles for optimizing their responses: set clear expectations, structure your request, give specific examples, and assess the quality of responses. Specify context and experiment with different output formats to maximize the results. LangChain and autonomous agents are two ways of building on LLMs that can be tapped for high-quality responses. Diffusion models are effective for generating images from text. Image outputs for creative prompts can be further enhanced by training the model on specific tasks. Using the prompting principles in this book, one can build an exhaustive content-writing AI.

Prompt engineering is a technique used to create prompts that guide AI models like ChatGPT to generate desired outputs. These prompts provide instructions in text, either to large language models (LLMs) or to image-related diffusion AIs like Midjourney. Proper prompt engineering ensures valuable outputs, whereas generic inputs create varying outputs. Prompt engineering follows basic principles, such as providing clarity about the type of response, defining the general format, giving specific examples, and assessing the quality of responses. LLMs learn from vast amounts of data, enabling the generation of coherent, context-sensitive, and human-sounding text. These models use advanced algorithms to understand the meaning in text and produce outputs that are often indistinguishable from human work. Byte-pair encoding (BPE) compresses linguistic units into tokens, which can then be assigned numbers or vectors. LLMs are initially trained on massive amounts of data to instill a broad, flexible understanding of language, then fine-tuned to adapt to more specialized areas and tasks.
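To make byte-pair encoding concrete, the toy function below performs a single BPE merge: it counts adjacent token pairs and fuses the most frequent pair everywhere it occurs. Real tokenizers repeat this over a large corpus to build a fixed vocabulary; this sketch shows only one round.

```python
from collections import Counter

def bpe_merge_once(tokens):
    """One round of byte-pair encoding: find the most frequent adjacent
    pair of tokens and merge every occurrence into a single token."""
    pairs = Counter(zip(tokens, tokens[1:]))
    if not pairs:
        return tokens
    (a, b), _ = pairs.most_common(1)[0]
    merged, i = [], 0
    while i < len(tokens):
        if i + 1 < len(tokens) and tokens[i] == a and tokens[i + 1] == b:
            merged.append(a + b)  # fuse the pair into one token
            i += 2
        else:
            merged.append(tokens[i])
            i += 1
    return merged

print(bpe_merge_once(list("banana")))  # a frequent pair such as 'an' gets merged
```

Repeating this merge while recording each fused pair yields the token vocabulary; the recorded merges are then replayed to tokenize new text.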

ChatGPT is a machine learning model that can generate text in various formats, such as lists, hierarchical structures, and more. To optimize its results, specify context and experiment with different output formats. To avoid issues with LLM outputs, try using alternative formats like JSON or YAML. Advanced LLMs like ChatGPT-4 can also make recommendations if the model's response is inadequate. Users can provide more context to the model to generate more accurate outputs.

LangChain, an open-source framework, can help address complex generative AI issues such as incorrect responses or hallucinations. It integrates LLMs into other applications and enables fluid interactions between models and data sources with retrieval, augmentation and generation enhancements. It allows developers to build applications like conversational agents, knowledge retrieval systems, and automated pipelines. As LLM applications grow, it's beneficial to use LangChain's prompt templates, which allow for validation, combination, and customization of prompts.

Large language models (LLMs) play a crucial role in AI evolution by addressing complex problems autonomously. They can use chain-of-thought (CoT) reasoning to break down complex problems into smaller parts, allowing for more efficient problem-solving. Agent-based architectures, in which agents perceive their environment and act in pursuit of specific goals, are essential for creating useful applications. Diffusion models, such as DALL-E 3, Stable Diffusion, and Midjourney, are particularly effective at generating high-quality images from text inputs. These models are trained on massive internet datasets, allowing them to imitate most artistic styles. Concerns about copyright infringement have been raised, but the images are not literal imitations of existing images or styles; they are derived from patterns detected across a vast array of images. As AI image generation matures, the focus will likely shift toward text-to-video and image-to-video generation.

AI image generation can be a creative process, with each model having its own unique idiosyncrasies. The first step is to specify the desired image format, which can range from stock press photos to traditional oil paintings. AI models can replicate any known art style, but copyright issues should be considered. Midjourney allows users to reverse engineer a prompt from an image, letting them craft another image in the sample's style.

Stable Diffusion, an open-source image generation model, can be run for free and customized to suit specific needs. However, customization can be complicated and best done by advanced users. The web user interface AUTOMATIC1111 is particularly appealing for serious users, as it allows for higher resolution images with significant controls. DreamBooth can be used to fine-tune the model to understand unfamiliar ideas in training data.

To create an exhaustive content-writing AI, users should specify the appropriate writing tone and provide keywords. Blind prompting can make it difficult for the model to evaluate its own quality but providing at least one example can significantly improve the response quality.


Sunday, March 30, 2025

 A previous article1 described how a UAV swarm formation flows through space and time using waypoints and a trajectory. While the shape of the formation in these cases is known, the size depends on the number of units in the formation, the minimum distance between units, the presence of external infringements and constraints, and the margin required from such constraints. An earlier prototype2 also described the ability to distribute drones so they spread as close to the constraints as possible using self-organizing maps, which essentially draw each unit toward the nearest real-world element that imposes a constraint, as when drones fly through tunnels by following the walls. This establishes the maximum boundaries of the space that the UAV swarm occupies, with the core provided by the waypoints and trajectory that each unit of the swarm can follow one after the other in sequence if the constraints are too rigid or unpredictable. Progress along the trajectory spanning the waypoints continues to be tracked with the help of the center of the formation. Given the minimum-maximum combination and the various thresholds for the factors cited, the size of the shape of the UAV swarm at a point in time can be determined.

This article argues that vectorization, clustering, and modeling apply not just to UAV swarm formation in space but also to maintaining a balance between constraints and sizing and to determining the quality of the formation, using vector-search-as-a-judge. The idea is borrowed from LLM-as-a-judge3, which helps constantly evaluate and monitor AI applications across the various LLMs used for specific domains, including Retrieval Augmented Generation (RAG) based chatbots. With automated evaluation achieving over 80% agreement with human judgments on a simple 1-to-5 grading scale, the balance between constraints and sizing can be consistently evaluated and even enforced. It may not be on par with human grading and might require several auto-evaluation samples, but these can be conducted virtually, without any actual flights of UAV swarms. A good choice of hyperparameters is sufficient to ensure reproducibility, single-answer grading, and reasoning about the grading process. Emitting metrics for correctness, comprehensiveness, and readability suffices in this regard. The overall workflow for this judge also resembles that of the self-organizing map in its data preparation, indexing of relevant data, and information retrieval.
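A minimal sketch of how such judge gradings might be aggregated and enforced follows. The metric names come from the paragraph above; the 3.5 passing threshold and the input format are assumptions for illustration.

```python
def evaluate_formation(gradings, threshold=3.5):
    """Aggregate 1-to-5 judge scores per metric across auto-evaluation
    samples and flag whether the formation stays within acceptable bounds."""
    metrics = ("correctness", "comprehensiveness", "readability")
    averages = {
        m: sum(g[m] for g in gradings) / len(gradings) for m in metrics
    }
    # Every metric must clear the threshold for the formation to pass.
    passed = all(avg >= threshold for avg in averages.values())
    return passed, averages

# Two hypothetical judge gradings from virtual (simulated) runs.
samples = [
    {"correctness": 5, "comprehensiveness": 4, "readability": 4},
    {"correctness": 4, "comprehensiveness": 3, "readability": 5},
]
ok, avgs = evaluate_formation(samples)
print(ok, avgs)
```

Emitting `avgs` as telemetry, as the next paragraph suggests, would close the feedback loop back into the shape-and-size decision.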

As with all AI models, it is important to ensure AI safety and security4, to include a diverse set of data, and to maintain proper separation between the read-write and read-only accesses needed by the model and the judge. Using a feedback loop to emit the gradings as telemetry, and feeding them back to the model when it decides on the formation's shape and size, albeit optional, can ensure that the constraints imposed are always met.

The shape and size of the UAV formation are deterministic at a point in time, but how they change over time depends on the selection of waypoints between source and destination, as well as on the duration permitted for the swarm to move collectively or to stream through and regroup at each waypoint. A smooth trajectory is formed between the waypoints, and each unit adheres to it by tolerating variations in the formation.
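As a simple illustration of forming a trajectory between waypoints, the sketch below linearly interpolates between consecutive 2-D waypoints; a real planner would likely use spline smoothing and tolerance bands rather than straight segments.

```python
def interpolate_trajectory(waypoints, steps_per_segment=4):
    """Linearly interpolate between consecutive 2-D waypoints to produce
    a denser trajectory that each unit of the swarm can follow."""
    path = []
    for (x0, y0), (x1, y1) in zip(waypoints, waypoints[1:]):
        for s in range(steps_per_segment):
            t = s / steps_per_segment
            # Point a fraction t of the way along the current segment.
            path.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
    path.append(waypoints[-1])  # include the final waypoint exactly
    return path

traj = interpolate_trajectory([(0, 0), (4, 0), (4, 4)])
print(traj)
```

Each unit could then track `traj` with its own offset from the formation center, tolerating small deviations as the text describes.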

Perhaps the biggest contribution of vectorizing all the constraints in a landscape is that a selection of waypoints offering the least resistance, allowing the UAV swarm to keep its shape and size as it passes through, can be determined by an inverse of the metric used for the self-organizing maps.

#Codingexercise

https://1drv.ms/w/c/d609fb70e39b65c8/Echlm-Nw-wkggNaVNQEAAAAB63QJqDjFIKM2Vwrg34NWVQ?e=grnBgD


Saturday, March 29, 2025

 Measuring RAG performance:

Since a RAG application has many aspects that affect its retrieval or generation quality, there must be ways to measure its performance, but this remains one of the most challenging parts of setting up a RAG application. It is sometimes helpful to evaluate each step of the RAG creation process independently. Both the model and the knowledge base must be effective.

The evaluation of the retrieval step, for instance, involves identifying the relevant records that should be retrieved to address each prompt. A precision-and-recall metric such as the F-score is helpful for benchmarking and improvement. The generation of answers to those prompts can also be evaluated, to confirm it is free of hallucinations and incorrect responses. Leveraging another LLM to provide prompts and check responses can also help; this technique is known as LLM-as-a-judge. The scores resulting from this technique should be simple and in a small range, say 1 to 5, with a higher rating indicating a truer response to the context.
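The precision, recall, and F-score for a retrieval step can be computed directly from the sets of retrieved and known-relevant record ids, as in this small sketch:

```python
def f1_score(retrieved, relevant):
    """Precision, recall, and F1 for a retrieval step: compare the set of
    retrieved record ids against the known-relevant ones for a prompt."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    if hits == 0:
        return 0.0, 0.0, 0.0
    precision = hits / len(retrieved)  # fraction of retrieved that matter
    recall = hits / len(relevant)      # fraction of relevant that were found
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Example: 2 of 3 retrieved records are relevant; 2 of 4 relevant found.
p, r, f = f1_score(["d1", "d2", "d3"], ["d1", "d2", "d4", "d5"])
print(p, r, f)
```

Averaging these scores over a benchmark set of prompts gives a single number to track as the retriever is tuned.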

RAG isn’t the only approach to equipping models with new information, but any approach involves trade-offs among cost, complexity, and expressive power. Cost comes from the inventory and bill of materials. Complexity means technical difficulty, usually reflected in the time, effort, and expertise required. Expressiveness refers to the model’s ability to generate diverse, inclusive, meaningful, and useful responses to prompts.

Besides RAG, prompt engineering offers an alternative way to guide a model’s outputs toward a desired result. Large, highly capable models are often required to understand and follow complex prompts, and they entail serving costs or per-token costs. This approach is especially useful when public data is sufficient and there is no need for proprietary or recent knowledge.

Improving overall performance may also require the model to be fine-tuned. Fine-tuning has a special meaning in the context of large language models, where it refers to taking a pretrained model and adapting it to a new task or domain by adjusting some or all of its weights on new data. This is a necessary step for building a chatbot on, say, medical texts.

While RAG infuses data into the overall process, it does not change the model. Fine-tuning can change a model’s behavior, so it need not behave as it did originally. Fine-tuning is also not a straightforward process and may not be as reliable as RAG in generating relevant responses.


Friday, March 28, 2025

 Measuring RAG performance:

Since a RAG Application has many aspects that affect its retrieval or generation quality, there must be ways to measure its performance, but this is still one of the most challenging parts of setting up a RAG Application. It sometimes helpful to evaluate each step of the RAG Application creation process independently. Both the model and the knowledge base must be effective.

The evaluations in retrieval step, for instance, involves identifying the relevant records that should be retrieved to address each prompt. A precision and recall metric such as F-score can come helpful in benchmarking and improvements. Generating good answers to those prompts can also be evaluated so that it is free of hallucinations and incorrect responses. Leveraging another LLM to provide prompts and to check responses can also be helpful and this technique is known as LLM-as-a-judge. The scores resulting from this technique must be simple and, in the range, say 1-5 with a higher rating indicating a true response to the context.

RAG isn’t the only approach to customizing to equipping models with new information, but any approach will involve trade-offs between cost, complexity and expressive power. Cost comes from inventory and bill of materials. Complexity means technical difficulty that is usually reflected in time, effort, and expertise required. Expressiveness refers to the model’s ability to generate diverse, inclusive, meaningful and useful responses to prompts.

Besides RAG, prompt engineering offers an alternative way to guide a model’s outputs toward a desired result. Large and highly capable models are often required to understand and follow complex prompts, and they entail serving costs or per-token costs. Prompt engineering is especially useful when public data is sufficient and there is no need for proprietary or recent knowledge.

Improving overall performance also requires the model to be fine-tuned. This has a special meaning in the context of large language models, where it refers to taking a pretrained model and adapting it to a new task or domain by adjusting some or all of its weights on new data. This is a necessary step for building a chatbot on, say, medical texts.

While RAG infuses data into the overall process, it does not change the model. Fine-tuning can change a model’s behavior, so that it no longer behaves as it did originally. It is also not a straightforward process and may not be as reliable as RAG in generating relevant responses.

#codingexercise: https://1drv.ms/w/c/d609fb70e39b65c8/EYKwhcLpZ3tAs0h6tU_RYxwBxeAeg1Vg2DH7deOt-niRhw?e=qbXLag


Thursday, March 27, 2025

 RAG with vector search involves retrieving information using a vector database, augmenting the user’s prompt with that information, and generating a response based on the user’s prompt and the retrieved information using an LLM. Each of these steps can be implemented by a variety of approaches, but we will go over the mainstream ones.

1. Data Preparation: This is all about data ingestion into a vector database, usually with the help of connectors. It isn’t a one-time task, because a vector database should be updated regularly to provide high-quality information for an embeddings model; otherwise, responses might sound outdated. The great part of the RAG process is that the LLM weights do not need to be adjusted as data is ingested. Common stages in this step include parsing the input documents; splitting the documents into chunks, which incidentally can affect output quality; using an embedding model to convert each chunk into a high-dimensional numerical vector; storing and indexing the embeddings, which results in a vector index that boosts search efficiency; and recording metadata that can participate in filtering. The value of embeddings for RAG lies in the ability to compute similarity scores between the meanings of the original texts.
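
As a sketch of the chunking stage, the following splits text into overlapping character windows; the chunk size and overlap values are illustrative assumptions, and tuning them affects output quality:

```python
# Minimal sketch of the chunking stage, assuming plain-text input.
# chunk_size and overlap are hypothetical defaults, not recommendations.

def split_into_chunks(text, chunk_size=500, overlap=50):
    """Split text into overlapping character windows for embedding."""
    chunks = []
    step = chunk_size - overlap  # advance less than a full chunk to overlap
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk:
            chunks.append(chunk)
    return chunks

# A 1200-character document yields two full chunks and one shorter tail.
chunks = split_into_chunks("a" * 1200, chunk_size=500, overlap=50)
```

Real pipelines often chunk on sentence or section boundaries instead of raw characters, but the overlap idea, preserving context across chunk edges, is the same.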

2. Retrieval is all about getting the relevant context. After preprocessing the original documents, we have a vector database storing chunks, embeddings, and metadata. In this step, the user provides a prompt, which the application uses to query the vector database; the relevant results then augment the original prompt in the next step. Querying the vector database is done by computing similarity scores between the vector representing the query and those in the database. There are many ways to improve search results, including hybrid search, reranking, summarized text comparison, contextual chunk retrieval, prompt refinement, and domain-specific tuning.
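
A minimal sketch of similarity-based querying might look like the following, where the store and its three-dimensional vectors are toy stand-ins for a real vector database and embedding model:

```python
import math

# Sketch of querying a vector store by cosine similarity.
# The chunks and vectors are toy values, not real embedding output.

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def top_k(query_vec, store, k=2):
    """store maps chunk text -> embedding; return the k most similar chunks."""
    scored = sorted(store.items(), key=lambda kv: cosine(query_vec, kv[1]), reverse=True)
    return [text for text, _ in scored[:k]]

store = {
    "seattle weather": [0.9, 0.1, 0.1],
    "python tutorial": [0.0, 0.9, 0.1],
    "rain forecast":   [0.8, 0.0, 0.2],
}
results = top_k([1.0, 0.0, 0.1], store, k=2)
```

A production vector index replaces the linear scan with approximate nearest-neighbor search, but the ranking principle is the same.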

3. Augmenting the prompt with the retrieved context equips the model with both the prompt and the context needed to address it. The structure of the new prompt that combines the retrieved texts and the user’s prompt can impact the quality of the result.
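
One plausible way to structure the augmented prompt is sketched below; the template wording is an assumption, not a prescribed format:

```python
# Sketch of augmenting the user's prompt with retrieved chunks.
# The instruction text is a hypothetical template.

def augment_prompt(question, chunks):
    """Combine retrieved chunks and the user's question into one prompt."""
    context = "\n\n".join(f"[{i + 1}] {c}" for i, c in enumerate(chunks))
    return (
        "Answer the question using only the context below. "
        "If the context is insufficient, say so.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )

prompt = augment_prompt("What does the acronym RAG stand for?",
                        ["RAG stands for Retrieval Augmented Generation."])
```

Numbering the chunks, as done here, also makes it easier for the model to cite which retrieved passage supports each claim.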

4. Generation of the response is done with the help of an LLM and follows the retrieval and augmentation steps. Some LLMs are quite good at following instructions, but many require post-processing. Another important consideration is whether the RAG system should have memory of previous prompts and responses. One way to enhance generation is to add multi-turn conversation ability, which allows the user to ask follow-up questions.
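
Multi-turn memory can be sketched by replaying prior turns on each call; `call_llm` here is a hypothetical stand-in for a real LLM client:

```python
# Sketch of multi-turn memory: prior turns are replayed with each call.
# `call_llm` is a hypothetical stand-in for a real LLM client function.

class Conversation:
    def __init__(self, system_prompt):
        self.messages = [{"role": "system", "content": system_prompt}]

    def ask(self, user_text, call_llm):
        self.messages.append({"role": "user", "content": user_text})
        reply = call_llm(self.messages)  # the full history goes to the model
        self.messages.append({"role": "assistant", "content": reply})
        return reply

# A trivial fake LLM that just reports how many turns it was shown.
fake_llm = lambda msgs: f"seen {len(msgs)} messages"
conv = Conversation("You answer using retrieved context.")
first = conv.ask("What is RAG?", fake_llm)
second = conv.ask("And how is it measured?", fake_llm)
```

Because the entire history is resent on every turn, long conversations eventually need summarization or truncation to stay within the model's context window.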

Last but not least, RAG performance must be measured. Sometimes other LLMs are used to judge response quality with simple scores, such as a range of 1-5. Prompt engineering plays a significant role in such cases to guide a model’s outputs toward a desired result. Fine-tuning can enhance the model’s expressiveness and accuracy.



Tuesday, March 25, 2025

 The vectors generated by embedding models are often stored in a specialized vector database. Vector databases are optimized for storing and retrieving vector data efficiently. Like traditional databases, vector databases can be used to manage permissions, metadata and data integrity, ensuring secure and organized access to information. They also tend to include update mechanisms so newly added texts are indexed and ready to use quickly.

The difference that a vector database and Retrieval Augmented Generation make might be easier to explain with an example. When a chatbot powered by the Llama 2 LLM is asked about an acronym that was not part of its training text, it tends to guess, responding with an incorrect expansion and elaborating on what that might be. It does not even hint that it might be making things up. This is often referred to as hallucination. But if a RAG system is set up with access to documentation that explains what the acronym stands for, the relevant information is indexed and becomes part of the vector database, and the same prompt now yields more pertinent information. With RAG, the LLM provides correct answers.

If the prompt is provided with the relevant documents that contain an answer, which is referred to as augmenting the prompt, the LLM can leverage that context from the vector database and provide more compelling, coherent, and knowledgeable answers, as opposed to the hallucination described above. By automating this process, chat responses become more satisfactory every time. This might require the additional step of building a retrieval system backed by a vector database, and it might also involve extra steps of data processing and managing the generated vectors. RAG has the added benefit of letting the LLM consolidate multiple sources of data into a readable output tailored to the user's prompt. RAG applications can also incorporate proprietary data, which sets them apart from the public data most LLMs are trained on. The data can be kept up to date so that the LLM is not restricted to the point in time at which it was trained. RAG reduces hallucinations and allows the LLM to provide citations and query statistics to make the processing more transparent to users. As with all retrieval systems, fine-grained data access control also brings its own advantages.

There are four steps for building Retrieval-Augmented Generation (RAG):

1. Data Augmentation

a. Objective: Prepare data for a real-time knowledge base and contextualization in LLM queries by populating a vector database.

b. Process: Integrate disparate data using connectors, transform and refine raw data streams, and create vector embeddings from unstructured data. This step ensures that the latest version of proprietary data is instantly accessible for GenAI applications.

2. Inference

a. Objective: Connect relevant information with each prompt, contextualizing user queries and ensuring GenAI applications handle responses accurately.

b. Process: Continuously update the vector store with fresh sensor data. When a user prompt comes in, enrich and contextualize it in real-time with private data and data retrieved from the vector store. Stream this information to an LLM service and pass the generated response back to the web application.

3. Workflows

a. Objective: Parse natural language, synthesize necessary information, and use reasoning agents to determine the next steps to optimize performance.

b. Process: Break down complex queries into simpler steps using reasoning agents, which interact with external tools and resources. This involves multiple calls to different systems and APIs, processed by the LLM to give a coherent response. Stream Governance ensures data quality and compliance throughout the workflow.

4. Post-Processing

a. Objective: Validate LLM outputs and enforce business logic and compliance requirements to detect hallucinations and ensure trustworthy answers.

b. Process: Use frameworks like BPML or Morphir to perform sanity checks and other safeguards on data and queries associated with domain data. Decouple post-processing from the main application to allow different teams to develop independently. Apply complex business rules in real-time to ensure accuracy and compliance and use querying for deeper logic checks.

These steps collectively ensure that RAG systems provide accurate, relevant, and trustworthy responses by leveraging real-time data and domain-specific context.
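
The four steps above can be wired together in a rough sketch like the following, where every function is an illustrative stub standing in for real connectors, vector stores, LLM services, and rule engines:

```python
# Hedged sketch wiring the four RAG steps together. All functions are
# hypothetical stubs; names and logic are illustrative only.

def data_augmentation(docs):
    """Step 1: 'embed' documents into a toy vector store (here, a dict)."""
    return {d: [float(len(d))] for d in docs}  # fake one-dimensional embeddings

def inference(prompt, store):
    """Step 2: retrieve the closest document and contextualize the prompt."""
    nearest = min(store, key=lambda d: abs(len(d) - len(prompt)))
    return f"Context: {nearest}\nQuestion: {prompt}"

def workflow(augmented_prompt):
    """Step 3: a reasoning agent would break this into tool calls; stubbed."""
    return f"ANSWER based on -> {augmented_prompt.splitlines()[0]}"

def post_process(answer):
    """Step 4: enforce a simple business rule before returning the answer."""
    return answer if answer.startswith("ANSWER") else "REJECTED"

store = data_augmentation(["drone telemetry manual", "office seating chart"])
result = post_process(workflow(inference("battery limits?", store)))
```

The point of the sketch is the separation of stages: each stub can be swapped for a real connector, vector index, LLM call, or rule engine without changing the overall flow.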

Reference:

An earlier article: https://1drv.ms/w/c/d609fb70e39b65c8/EVqYhXoM2U5GpfzPsvndd1ABWXzGeXD1cixxJ9wRsWRh3g?e=aVoTd1


Monday, March 24, 2025

 RAG:

 The role of RAG is to create a process of combining a user’s prompt with relevant external information to form a new, expanded prompt for a large language model, aka LLM. The expanded prompt enables the LLM to provide more relevant, timely, and accurate responses than direct querying based only on embeddings of domain-specific data. Its importance lies in providing real-time, contextualized, and trustworthy data for UAV swarm applications. If LLMs can be considered as encapsulating business logic in AI applications, then RAG and knowledge bases can be considered data platform services.

LLMs are machine learning algorithms that can interpret, manipulate, and generate text-based content. They are trained on massive text datasets from diverse sources, including books, internet-scraped text, and code repositories. During the training process, the model learns statistical relationships between words and phrases, enabling it to generate new text using the context of text it has already seen or generated. LLMs are typically used via "prompting," which is text that a user provides to an LLM and that the LLM responds to. Prompts can take various forms, such as incomplete statements, questions, or instructions, and suitable prompts are even scarcer when dealing with multimodal data from UAV swarm sensors. RAG applications that enable users to ask questions about text generally use instruction-following and question-answering LLMs. In RAG, the user's question or instruction is combined with information retrieved from an external data source, forming the new, augmented prompt. This helps to overcome issues like hallucinations, maintain up-to-date information, and incorporate domain-specific knowledge.

The four steps for building Retrieval-Augmented Generation (RAG) are:

1. Data Augmentation

a. Objective: Prepare data for a real-time knowledge base and contextualization in LLM queries by populating a vector database.

b. Process: Integrate disparate data using connectors, transform and refine raw data streams, and create vector embeddings from unstructured data. This step ensures that the latest version of UAV swarm proprietary data is instantly accessible for GenAI applications.

2. Inference

a. Objective: Connect relevant information with each prompt, contextualizing user queries and ensuring GenAI applications handle responses accurately.

b. Process: Continuously update the vector store with fresh sensor data. When a user prompt comes in, enrich and contextualize it in real-time with private data and data retrieved from the vector store. Stream this information to an LLM service and pass the generated response back to the web application.

3. Workflows

a. Objective: Parse natural language, synthesize necessary information, and use reasoning agents to determine the next steps to optimize performance.

b. Process: Break down complex queries into simpler steps using reasoning agents, which interact with external tools and resources. This involves multiple calls to different systems and APIs, processed by the LLM to give a coherent response. Stream Governance ensures data quality and compliance throughout the workflow.

4. Post-Processing

a. Objective: Validate LLM outputs and enforce business logic and compliance requirements to detect hallucinations and ensure trustworthy answers.

b. Process: Use frameworks like BPML or Morphir to perform sanity checks and other safeguards on data and queries associated with UAV swarms. Decouple post-processing from the main application to allow different teams to develop independently. Apply complex business rules in real-time to ensure accuracy and compliance and use querying for deeper logic checks.

These steps collectively ensure that RAG systems provide accurate, relevant, and trustworthy responses by leveraging real-time data and domain-specific context.



Saturday, March 22, 2025

 This is a review of Retrieval Augmented Generation, aka RAG, in a more general sense than the specific application to UAV swarms discussed earlier. RAG is a process of combining a user’s prompt with relevant external information to form a new, expanded prompt for a large language model, aka LLM. The expanded prompt enables the LLM to provide more relevant, timely, and accurate responses.

LLMs are machine learning algorithms that can interpret, manipulate, and generate text-based content. They are trained on massive text datasets from diverse sources, including books, internet scraped text, and code repositories. During the training process, the model learns statistical relationships between words and phrases, enabling it to generate new text using the context of text it has already seen or generated. LLMs are typically used via "prompting," which is text that a user provides to an LLM and that the LLM responds to. Prompts can take various forms, such as incomplete statements or questions or instructions. RAG applications that enable users to ask questions about text generally use instruction-following and question-answering LLMs. In RAG, the user's question or instruction is combined with some information retrieved from an external data source, forming the new, augmented prompt. 

An effective RAG application uses Vector Search to identify relevant text for a user's prompt. An embedding model translates each text into a numeric vector, encapsulating its meaning. This process converts the user's query to a comparable vector, allowing for mathematical comparison and identifying the most similar and relevant texts. These vectors represent the meanings of the text from which they are generated, enabling retrieval of the text most relevant to the user's query. However, embedding models may not capture the exact meaning desired, so it's essential to test and evaluate every component of a RAG application. Vector databases are optimized for storing and retrieving vector data efficiently, managing permissions, metadata, and data integrity. LLMs, however, are not reliable as knowledge sources and may respond with made-up answers or hallucinations. To mitigate these issues, explicit information can be provided to the LLM, such as copying and pasting reference documents to ChatGPT or another LLM. Implementing RAG with Vector Search can address the limitations of LLM-only approaches by providing additional context for the LLM to use when formulating an answer. 

RAG applications enable the integration of proprietary data, up-to-date information, and enhanced accuracy of LLM responses. They provide access to internal documents and communications, reducing the occurrence of hallucinations and allowing for human verification. RAG also enables fine-grained data access control, allowing LLMs to securely reference confidential or personal data based on user access credentials. It equips LLMs with context-specific information, enabling applications that LLMs alone may not generate reliably. RAG is particularly useful in question-answering systems, customer service, content generation, and code assistance. For instance, a large e-commerce company uses Databricks for an internal RAG application, allowing HR to query hundreds of employee policy documents. RAG systems can also streamline the customer service process by providing personalized responses to customer queries, enhancing customer experience and reducing response times. Additionally, RAG can enhance code completion and Q&A systems by intelligently searching and retrieving information from code bases, documentation, and external libraries. 

RAG with Vector Search is a process that involves retrieving information from an external source, augmenting the user's prompt with that information, and generating a response based on the user's prompt and information retrieved using an LLM. Data preparation is a continuous process, involving parsing raw input documents into text format, splitting documents into chunks, and embedding the text chunks. The choice of chunk size depends on the source documents, LLM, and the RAG application's goals. 

Embedding models are a type of language model that generates numeric vectors, or series of numbers, from a text, encoding the nuanced and context-specific meaning of each text. These vectors can be mathematically compared to each other, allowing for a better understanding of how the meanings of the original texts relate.

Embeddings are stored in a specialized vector database, which efficiently stores and searches for vector data like embeddings. Vector databases often incorporate update mechanisms to allow for easy searching of newly added chunks. Overall, RAG with Vector Search is a valuable tool for generating effective and relevant responses. 

Friday, March 21, 2025

 Emerging trends:

Constructing an incremental “knowledge base” of a landscape from drone imagery merges ideas from simultaneous localization and mapping (SLAM), structure-from-motion (SfM), and semantic segmentation. Incremental SLAM and 3D reconstruction is demonstrated in the ORB-SLAM2 paper by Mur-Artal and Tardós (2017), where a 3D map is built by estimating camera poses and reconstructing scene geometry from monocular, stereo, or RGB-D inputs. Such a SLAM framework can also be extended by fusing in semantic cues to enrich the resulting map with object and scene labels. The idea of including semantic information in 3D reconstruction is demonstrated by SemanticFusion, by McCormac et al. (2017), which uses a convolutional neural network (CNN) for semantic segmentation while the system fuses semantic labels into a surfel-based 3D map, thereby transforming a purely geometric reconstruction into a semantically rich representation of a scene. SemanticFusion helps to label parts of the scene, turning a raw point cloud or mesh into a knowledge base where objects, surfaces, and even relationships can be recognized and queried. SfM, on the other hand, helps to stitch multi-view data into a consistent 3D model, and its techniques are particularly relevant for drone applications. Incremental SfM pipelines can populate information about a 3D space as data arrives in the pipeline, and the drones can “walk the grid” around an area of interest to ensure sufficient data is captured to build the 3D model from 0 to 100%, with progress tracked along the way. A semantic layer is not part of SfM processing itself, but semantic segmentation or object detection can be layered independently over the purely geometric data. Layering on additional modules for, say, object detection, region classification, or even reasoning over scene changes makes it possible to start with basic geometric layouts and optionally build up a comprehensive knowledge base.
Algorithms that crunch this sensor data, whether images or LiDAR data, must operate in real time rather than on periodic batch analysis. They can, however, be dedicated to specific domains such as urban monitoring, agricultural surveying, or environmental monitoring for additional context-specific knowledge.


Thursday, March 20, 2025

 An earlier article described the creation and usage of a knowledge base for LLMs. One of the ideas emphasized there is the end-to-end service expectation from the system, not just the provisioning of a vector database. In this regard, it is important to call out that semantic similarity and embeddings alone do not suffice to capture the nuances of a query. In vector databases, each data point (document, image, or any object) is often stored along with metadata: structured information that provides additional context. For example, metadata could include attributes like timestamp, author, location, or category. During a vector search, filters can be applied on this metadata to narrow down the results, ensuring only relevant items are retrieved. This is particularly helpful when the dataset is large and diverse. This technique is sometimes referred to as “metadata filtering.”

Some examples of where this makes a difference include:

1. Product recommendations: This case involves an e-commerce vector search where product embeddings are used to find similar items. If a customer searches for “lightweight hiking shoes,” the vector embeddings find semantically similar products. Adding a metadata filter like gender: female or brand: Columbia ensures the results align with specific requirements.

2. Content Moderation or compliance: Imagine a company using vector search to identify similar documents across various teams. By filtering metadata like department: legal or classification: confidential, only the relevant documents are retrieved. This prevents retrieving semantically similar but irrelevant documents from unrelated teams or departments.

3. Geospatial Search: A travel app uses vector embeddings to recommend destinations based on a user’s travel history and preferences. Using metadata filters for location: within 100 miles ensures the recommendations are regionally relevant.

4. Media Libraries: In a vector search for images, combining embeddings with metadata like resolution: >=1080p or author: John Doe helps surface high-quality or specific submissions.

And some examples where it doesn’t:

1. Homogeneous Datasets: If the dataset lacks meaningful metadata (e.g., all records have the same category or timestamp), filtering doesn’t add value because the metadata doesn’t differentiate between records.

2. Highly Unstructured Queries: For a generic query like “artificial intelligence” in a research database, metadata filtering might not help much if the user is looking for broad, cross-disciplinary results. Overly restrictive filters could exclude valuable documents.

3. When Metadata is Sparse or Inaccurate: If the metadata is inconsistently applied or missing in many records, relying on filters can lead to incomplete or skewed results.
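
Where metadata filtering does apply, it can be sketched as a structured filter over candidate records before ranking by similarity; the records, fields, and scores below are hypothetical:

```python
# Sketch of metadata filtering: apply structured filters, then rank the
# survivors by similarity score. Records and fields are made up.

def filtered_search(records, filters, query_score):
    """Keep records matching all metadata filters, then rank by similarity."""
    candidates = [
        r for r in records
        if all(r["metadata"].get(k) == v for k, v in filters.items())
    ]
    return sorted(candidates, key=query_score, reverse=True)

records = [
    {"text": "trail runner", "metadata": {"brand": "Columbia", "gender": "female"}, "score": 0.81},
    {"text": "hiking shoe",  "metadata": {"brand": "Columbia", "gender": "male"},   "score": 0.93},
    {"text": "light hiker",  "metadata": {"brand": "Columbia", "gender": "female"}, "score": 0.90},
]
hits = filtered_search(records, {"brand": "Columbia", "gender": "female"},
                       query_score=lambda r: r["score"])
```

Note that the highest-scoring record overall is excluded because its metadata does not match, which is exactly the behavior the product-recommendation example above calls for.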

Another technique that improves query responses is “contextual embeddings.” This improves retrieval accuracy, cutting failures with re-ranking. It combines the well-known Retrieval Augmented Generation technique of semantic search using embeddings with lexical search using sparse retrievers like BM25. The entire knowledge base is split into chunks, and both TF-IDF encodings and semantic embeddings are generated. Parallel lexical and semantic searches are run, and the results are combined and ranked. The most relevant chunks are located, and the response is generated with enhanced context. This enhancement over multimodal embeddings and GraphRAG is inspired by Anthropic and a Microsoft Community blog.
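
One common way to combine the lexical and semantic result lists is reciprocal rank fusion, sketched below; the two ranked lists are assumed inputs rather than the output of real BM25 or embedding searches:

```python
# Sketch of fusing lexical and semantic rankings with reciprocal rank
# fusion (RRF). Document IDs and orderings are illustrative.

def reciprocal_rank_fusion(rankings, k=60):
    """Each ranking is a list of doc IDs, best first; fuse into one ranking."""
    scores = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

lexical = ["doc3", "doc1", "doc4"]    # e.g. a BM25 ordering
semantic = ["doc1", "doc2", "doc3"]   # e.g. an embedding-similarity ordering
fused = reciprocal_rank_fusion([lexical, semantic])
```

RRF needs only ranks, not raw scores, which is convenient because BM25 scores and cosine similarities are on incompatible scales.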

#Codingexercise

https://1drv.ms/w/c/d609fb70e39b65c8/EdJ3VDeiX2hGgAjzKHaFVoYBTCOvDz2W8EjTCUg08hyWkQ?e=BDjivM


Wednesday, March 19, 2025

Use of RAG in creating KB

 Gen AI has created a new set of applications that require a different data architecture than traditional systems, one that includes both structured and unstructured data. Applications like chatbots can perform satisfactorily only with information from diverse data sources. A chatbot requires an LLM to respond with information from a knowledge base, typically a vector database. The underlying principle in such a chatbot is Retrieval Augmented Generation. The LLM could be a newer model such as GPT-3.5 or GPT-4 to reduce hallucinations, maintain up-to-date information, and leverage domain-specific knowledge.

As with all LLMs, it is important to ensure AI safety and security, to include a diverse set of data, and to maintain the proper separation of read-write and read-only access between the model and the judge. Using a feedback loop to emit the gradings as telemetry, and including them in the feedback loop for the model when deciding on formation shape and size, albeit optional, can help ensure that the imposed constraints are always met.

Evaluating the quality of chatbot responses must take into account both the knowledge base and the model involved. LLM-as-a-judge evaluates the quality of a chatbot as an external entity. Although it suffers from limitations (it may not be on par with human grading, it might require several auto-evaluation samples, it may respond differently to different chatbot prompts, and slight variations in the prompt or problem can drastically affect its performance), it can still agree with human grading on over 80% of judgments. This is achieved by using a 1-5 grading scale, using GPT-3.5 to save costs when there is one grading example per score, and using GPT-4 when there are no examples from which to learn the grading rules.
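
Checking judge output against human grades can be sketched as a simple agreement rate; the grade lists below are made up to illustrate the roughly 80% agreement mentioned above:

```python
# Sketch of comparing LLM-as-a-judge grades with human grades on a 1-5
# scale. The two grade lists are fabricated for illustration.

def agreement_rate(judge_scores, human_scores, tolerance=0):
    """Fraction of items where judge and human agree within a tolerance."""
    pairs = list(zip(judge_scores, human_scores))
    agree = sum(1 for j, h in pairs if abs(j - h) <= tolerance)
    return agree / len(pairs)

judge = [5, 4, 2, 5, 3, 1, 4, 5, 2, 4]
human = [5, 4, 3, 5, 3, 1, 4, 4, 2, 4]
rate = agreement_rate(judge, human)
```

Loosening `tolerance` to 1 treats off-by-one grades as agreement, which is a common relaxation when the scale is coarse.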


Tuesday, March 18, 2025

 The use of a Large Language Model (LLM) for building a knowledge base (KB) may seem to be a tribal art, but in fact it is broadly applicable, given the vast collections of domain-specific text across many industries. A knowledge graph captures relationships between entities, so both the nodes and the edges are important to discover, and there is no estimate of precision and recall to begin with. We take as a specific example one application of an LLM to build a KB: iText2KG. This is a zero-shot method for constructing incremental, topic-independent knowledge graphs from unstructured data using large language models, without the need for extensive post-processing, which is one of the main challenges in constructing knowledge graphs. Other challenges generally include the unstructured data type, which might result in lossy processing and require advanced NLP techniques for meaningful insights, few-shot learning, and cross-domain knowledge extraction. NLP techniques, in turn, face limitations, including reliance on pre-defined entities and extensive human annotation.

This approach consists of four modules: Document Distiller, Incremental Entities Extractor, Incremental Relations Extractor, and Neo4j graph integrator. The Document Distiller uses LLMs, specifically GPT-4, to rewrite documents into semantic blocks, guided by a flexible schema to enhance graph construction. The Incremental Entities Extractor iteratively builds a global entity set by matching local entities from documents with previously extracted global entities. The Incremental Relations Extractor utilizes global document entities to extract both stated and implied relations, with variations based on the context provided. The approach is adaptable to various use cases, as the schema can be customized based on user preferences. The final module integrates the extracted entities and relations into a Neo4j database to visualize the knowledge graph. This is a zero-shot technique because there are no predefined examples or ontologies.

The effectiveness of this broadly applicable technique is best described by a few metrics: schema consistency scores across documents, where a high score reflects high performance; an information consistency metric, where higher consistency is desirable; triplet extraction precision, which is higher for local, context-specific entities than for global entities and affects the richness of the graph; the false discovery rate, which should be as low as possible for a successful entity-resolution process; and cosine similarity estimates used to merge entities and relationships and to remove duplicates. The method outperforms alternatives on all of these metrics. Experiments with documents such as CVs, scientific articles, and websites have also emphasized effective data refinement and the impact of document chunk size on KG construction.
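As a sketch of one of these metrics, the false discovery rate of the entity-resolution step can be estimated over a labelled sample of merged pairs: the fraction of merges that joined entities that were in fact distinct. The `are_same` oracle here is a hypothetical ground-truth lookup (e.g. a hand-labelled sample), not part of the method itself.

```python
def false_discovery_rate(merged_pairs, are_same):
    """FDR = incorrect merges / total merges; lower is better.
    `are_same(a, b)` is a ground-truth oracle for whether the
    two surface forms refer to the same real-world entity."""
    if not merged_pairs:
        return 0.0
    wrong = sum(1 for a, b in merged_pairs if not are_same(a, b))
    return wrong / len(merged_pairs)

# Example: one correct merge and one spurious merge give an FDR of 0.5
truth = {("GPT-4", "GPT4"): True, ("Paris", "Paris Hilton"): False}
fdr = false_discovery_rate(list(truth), lambda a, b: truth[(a, b)])
```

In practice the oracle is only available for a sample, so the FDR is an estimate; driving it down usually means raising the similarity threshold, at the cost of leaving more true duplicates unmerged.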


Monday, March 17, 2025

Every use case in the scenarios targeted by the UAV swarm cloud automation maps to a set of direct and current commercial players who could benefit, and many companies across industries are catching up to emerging AI trends and deploying applications with LLMs and RAG. So the benefits of cloud-based pipelines that continuously analyze drone imagery to build a knowledge base of the landscape would not be lost on many of them. But the ideal partner for this venture would be one who engages deeply from the start, so that field tests are not only practical but routine. The value of the software is better articulated through the voice of the customer than that of the founder, and such a partnership is likely a win-win for both from the get-go. This article explains not only the selection of such a partner but also the method of engagement.

Drone imagery is popular today in many defense-industry applications by virtue of remotely operated drones. UAV swarms, however, are better applied to surveying, remote sensing, disaster preparedness and response (such as for wildfires), and applications that make use of LiDAR data. Power line and windmill monitoring companies are especially well suited to operating a fleet of drones. Besides, there are over ten publicly traded LiDAR companies in the US, and many more across Europe and Asia, that make use of drone fleets, photogrammetry, and LiDAR data. Those using simultaneous localization and mapping (SLAM), structure-from-motion (SfM), and semantic segmentation with CNNs are possibly building their own knowledge bases, so it would not hurt to show them one built in the cloud that is incremental, observable, and near real-time.

The right way to build this software is also iterative, with stakeholder input; we leverage an agile, sprint-based approach. Keeping a community or open-source edition opens engagement with the partner while drawing a developer audience from source code and community platforms, including public cloud marketplaces as well as journals, newsletters, podcasts, and social media platforms where we can find contacts and leads. A specific milestone could be presenting a PoC at an AI Infra summit.

Aside from the technical aspects, a winning business plan targets a market that is both well-defined and large, which is an advantage when fundraising or seeking the attention of an investor. Polishing the business plan and addressing its weaknesses keeps investors from having to micromanage, an unhealthy situation. VCs make it known through social media and other marketing avenues that they are funding startups, but in casting a net it is important to establish a shared reality of success. Consistently proving that the founding idea will work has a snowball effect. While an experienced founder may bank on VC contacts, a first-time founder must dodge an obstacle course of promising leads that go nowhere and may have to rely on angel investors. The pitch deck and follow-up calls must be rehearsed, never emailed or handled offline. Controlling the narrative, reframing the questions, and answering them on our terms are in our hands. A contract might be necessary, but it cannot be relied upon. Developing a sales funnel during external engagements is important, and from the start an open-source GitHub repository for curating ideas and implementations should be made available.