This blog is part V of a series. See parts I, II, III, and IV.
In our last set of articles, we talked about how GenAI techniques are evolving and some best practices for improving your results with these tools. In this post, we'll talk about the most intriguing advancement in the field: the concept of AI agents. You've probably heard the hype - these agents have the potential to revolutionize our lives by automating complex tasks, enhancing productivity, and providing sophisticated solutions. Let's talk about what they are, when to use them, and how multiple agents can work together effectively.
AI agents, specifically LLM-powered agents, are systems designed to reason through problems, create plans to solve them, and execute those plans using a set of tools. Usually, when they're brought up as a potential solution to an AI problem, you'll hear something like "These agents possess complex reasoning capabilities, memory, and execution functionalities, which distinguish them from simpler generative AI models that primarily focus on generating text or performing isolated tasks." But putting it like this can gloss over what agents really do, or make them seem sentient or somehow magical. So before we start pulling them apart to look at the guts, let's demystify that statement a little. A GenAI agent is really just a series of LLM prompts, with the following additions:

- Memory: earlier prompts, responses, and tool outputs are carried forward as context for later calls.
- Tools: the LLM's output can trigger function or API calls, with the results fed back into the conversation.
- Iteration: the system loops - planning, acting, and observing - until the task is done or a limit is reached.
It's worth breaking these things out to dispel the idea that an agent is somehow an entirely different usage of AI. In the end, the whole system is just an iterated series of LLM calls, with the elements above giving more contextual input to those calls and allowing the LLM output to "go other places," including calling other tools, functions, or even other agents. Agent systems undeniably have a lot of power and flexibility and are being used for many great applications. But they aren't so much a new form of AI as they are creative, flexible arrangements of the same base LLM systems.
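To make "an iterated series of LLM calls" concrete, here is a minimal sketch of that loop. The `fake_llm` function, the `FINAL:`/JSON reply convention, and the tool names are all illustrative stand-ins - a real agent would call a model API and parse its tool-calling format - but the control flow is the whole trick:

```python
import json

# Hypothetical stand-in for a real model call. A deployed agent would hit an
# LLM API here; this canned version just asks for the calculator once, then
# answers, so the sketch runs without any keys or network access.
def fake_llm(messages):
    last = messages[-1]["content"]
    if "TOOL_RESULT" in last:
        return "FINAL: 4"
    return json.dumps({"tool": "calculator", "args": {"expr": "2 + 2"}})

# The tools the agent is allowed to invoke.
TOOLS = {"calculator": lambda expr: str(eval(expr))}

def run_agent(task, llm=fake_llm, max_steps=5):
    """Each turn, the model either requests a tool or emits a final answer.
    The growing message list is the agent's 'memory'; feeding tool results
    back in is what lets the output 'go other places'."""
    messages = [{"role": "user", "content": task}]
    for _ in range(max_steps):
        reply = llm(messages)
        messages.append({"role": "assistant", "content": reply})
        if reply.startswith("FINAL:"):
            return reply.removeprefix("FINAL:").strip()
        call = json.loads(reply)  # the model asked for a tool
        result = TOOLS[call["tool"]](**call["args"])
        messages.append({"role": "user", "content": f"TOOL_RESULT: {result}"})
    return None  # step budget exhausted without a final answer

print(run_agent("What is 2 + 2?"))  # → 4
```

Everything an agent framework adds - planners, retries, structured tool schemas - is elaboration on this loop, not a different kind of system.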
AI agents are powerful tools but are not always the best choice for every scenario. These systems can be significantly complex, both intrinsically and in the emergent sense of having many stochastic elements operating cooperatively. As a result, they can require a lot of careful validation and testing. Understanding when to deploy these agents and when to consider simpler alternatives is crucial for developing efficient GenAI systems.
The concept of using multiple AI agents, sometimes referred to as a "swarm" or "ecosystem," involves a collection of agents working collaboratively to solve problems. This decentralized approach can be likened to microservices in software engineering, where each agent specializes in specific tasks but contributes to a common goal.
In this setup, multiple agents coexist in a single environment, collaborating on tasks. Together they can simulate environments like digital companies or virtual neighborhoods. For example, in a software development project, different agents might handle coding, design, and testing, while a coordinating agent engages the others in a structured way, handling project management and concept development. This collaborative effort can lead to rapid prototyping and cost-effective development, but it can also produce a wide distribution of results.
Multi-agent architectures inherently support classic, tried-and-true object-oriented design principles, including encapsulation and specialization, making these systems easier to develop, maintain, and extend.
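A sketch of that encapsulation, assuming the same kind of stubbed model call as before (the class names `SpecialistAgent` and `Coordinator` are illustrative, not from any particular framework): each specialist bundles its own role and model, and a coordinator routes work between them, just as a project manager would.

```python
class SpecialistAgent:
    """Encapsulates one role: its own prompt framing and its own model."""
    def __init__(self, role, llm):
        self.role = role
        self.llm = llm  # each specialist could wrap a different model/prompt

    def work(self, task):
        # A real system would send the role as a system prompt to an LLM API;
        # this stub just tags the output so the sketch is runnable.
        return self.llm(f"[{self.role}] {task}")

class Coordinator:
    """The 'project manager' agent: hands each phase of the goal to the
    right specialist and accumulates the shared artifact."""
    def __init__(self, specialists):
        self.specialists = specialists

    def run(self, goal):
        return [self.specialists[role].work(goal)
                for role in ("design", "coding", "testing")]

# Stub LLM so the example runs without an API key.
stub = lambda prompt: f"done: {prompt}"
team = Coordinator({r: SpecialistAgent(r, stub)
                    for r in ("design", "coding", "testing")})
for step in team.run("build a todo app"):
    print(step)
```

The microservices analogy holds at the code level too: because each agent hides its prompt and model behind a narrow interface, you can swap a specialist's model or prompt without touching the rest of the team.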
With all the possible GenAI applications, it's easy to get lost in different agent geometries, and equally easy to produce something complex that never gets quite stable enough for a real deployment. So what to do? In my opinion, you:

- Start with the simplest architecture that could plausibly solve the problem - often a single prompt or prompt chain.
- Add agents, and then more agents, only when the simpler approach demonstrably falls short.
- Validate and test each component rigorously before composing them, since stochastic elements compound.
AI agents are still in their infancy. As their reasoning and decision-making capabilities become more sophisticated and reliable, they'll deliver a significant leap in what artificial intelligence can do to enhance our daily lives. But for now they're limited by most of the same things that prompt chains are - complex edge cases, the need for validation, and immature reasoning capabilities. Combining reasoning, memory, and execution tools makes them suitable for a wide range of applications, but it's essential to understand their current limitations and to deploy them judiciously, particularly in scenarios that require emotional intelligence, involve high-stakes decisions, or could be better handled by a simpler approach.
To learn more about MVL, read our manifesto and let’s build a company together.
We are with our founders from day one, for the long run.