This article is about AI agents: their types, development processes and real-world implementations. Automation for complex workflows with curated artifacts has long existed, but it has been boutique and was never really intended for Large Language Models (LLMs). Although information can be tapped from multiple data sources or a knowledge base, enhancing the decision-making process calls for AI agents. The operational framework of AI agents and the ways they augment LLMs are described here.
AI agents are software entities that complete a task autonomously on behalf of a user, including making requests to other services, which extends the reach of standalone LLMs. They can retrieve real-time data from external databases and APIs, manage interactive sessions with users, and automate routine tasks that can be invoked dynamically or on a schedule, with different parameters. An agent framework provides the tools and structures a developer needs to build robust, scalable and efficient agent-based systems. This agent framework is an evolution of the Reasoning and Acting (ReAct) framework, in which an LLM is prompted to follow Thought/Action/Observation sequences. The agent framework extends this by bringing external tools into the action step. The tools can range from simple calculators and database calls to Python code generation and execution, and even interactions with other agents. The calling program typically parses the output of the LLM at each step to determine what to do next. As an example, a prompt to find the weather in Seattle, WA involves a thought that a weather API is needed to get the information, an action to call the weather API with the location, and an observation of the response. This is followed by a thought that the relevant information is 65 degrees Fahrenheit and sunny, an action to report it to the user, and an observation that the user is informed. The decision-making process can be enhanced further by increasing the number of iterations, the granularity of each step, dynamic adaptability and interoperability. Compared with traditional software agents, LLM agents are less preoccupied with syntax and format and place more emphasis on semantics and latent meaning, by virtue of LLMs and vector databases. This helps them provide more dynamic, context-aware responses.
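The Thought/Action/Observation loop and the parsing done by the calling program can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: `fake_llm` is a scripted stand-in for a real model, `get_weather` returns canned data instead of calling a weather API, and the `Action: tool[argument]` text format and the `finish` action are just one possible convention, not the format of any particular agent library.

```python
import re
from typing import Callable, Dict

# Hypothetical tool: a stand-in for a real weather API call.
def get_weather(location: str) -> str:
    canned = {"Seattle, WA": "65 degrees Fahrenheit and sunny"}
    return canned.get(location, "unknown")

# Registry of tools the agent may invoke in its action step.
TOOLS: Dict[str, Callable[[str], str]] = {"get_weather": get_weather}

# Scripted stand-in for an LLM (an assumption, not a real model call).
# It asks for the weather tool first, then reports the observed result.
def fake_llm(transcript: str) -> str:
    if "Observation:" not in transcript:
        return ("Thought: I need to access a weather API to get this information.\n"
                "Action: get_weather[Seattle, WA]")
    observed = transcript.rsplit("Observation:", 1)[1].strip()
    return (f"Thought: The relevant information is {observed}.\n"
            f"Action: finish[It is {observed} in Seattle, WA.]")

# Matches a line such as "Action: get_weather[Seattle, WA]".
ACTION_RE = re.compile(r"^Action:\s*(\w+)\[(.*)\]\s*$", re.MULTILINE)

def run_agent(question: str,
              llm: Callable[[str], str] = fake_llm,
              max_steps: int = 5) -> str:
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        output = llm(transcript)          # Thought + Action from the model
        transcript += output + "\n"
        match = ACTION_RE.search(output)  # the caller parses the action
        if match is None:
            raise ValueError(f"could not parse an action from: {output!r}")
        name, argument = match.group(1), match.group(2)
        if name == "finish":              # report the answer to the user
            return argument
        tool = TOOLS.get(name)
        observation = tool(argument) if tool else f"unknown tool {name!r}"
        transcript += f"Observation: {observation}\n"  # fed back to the model
    raise RuntimeError("agent did not finish within max_steps")

if __name__ == "__main__":
    print(run_agent("What is the weather in Seattle, WA?"))
```

In this sketch the `max_steps` cap plays the role of the iteration limit, and adding an entry to `TOOLS` is how another capability, such as a calculator or a database call, would be exposed to the loop.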
LLM agents are diverse, with each type tailored to address specific challenges in information processing, decision support and task automation. Task-specific agents are designed to perform specific, well-defined tasks. Conversational agents use natural language rather than a query language to interact with users. Decision support agents analyze complex data and provide insights. Workflow-automation agents coordinate and execute multi-step processes across different systems. Information retrieval agents search and extract relevant information from large datasets or document repositories. Collaborative agents are creative and work with humans to accomplish complex tasks. Predictive agents use historical data and current trends to forecast future outcomes. Adaptive learning agents improve their performance over time by learning from interactions and feedback. By categorizing the different types of agents, an organization can streamline its operations, improve customer experiences and gain valuable insights.