What is an agent?
We've witnessed a rapid evolution in AI applications, moving from simple copilots to more sophisticated agent systems. This shift hasn't happened in isolation; it's been driven by advancements in language models and a growing appetite for autonomous AI solutions in the business world.
The Current Landscape
Startups are building horizontal platforms to enable business users to create and configure their own agents:
- BREVIAN: No-code AI agent platform automating business processes
- ema: AI-powered digital employees for task automation
- Relevance AI: Customizable AI agents for various business applications
Frameworks are emerging for developers to build their own agents:
- LangChain: Python library for creating language model-powered applications
- LlamaIndex: Python library for building agents that interact with external data
- CrewAI: Framework for orchestrating "crews" of agents
- AutoGPT: Open-source framework for autonomous GPT-4 agents
Additionally, companies are developing vertical solutions for specific use cases:
- Hebbia: AI-powered research and analysis for financial services
- Harvey: AI legal assistant for law firms
This all sounds promising... if it works.
So what is an agent?
At their core, agents are systems that can take user input (structured or unstructured), process it through a series of steps, and produce an output. They typically consist of three main components: tasks (user instructions), memory (data storage), and tools (functions they can execute).
When you break it down, this current structure isn't too different from a traditional backend server.
- User requests come in with some input and some context (tasks)
- Data is stored and retrieved (memory)
- Various functions are called to process the request (tools)
- Communication between "agents" in multi-agent systems is done through API calls or a shared data store (e.g., message queue)
Instead of tightly typed API endpoints, we're dealing with natural language.
Whats next?
What excites me about where this is all going is the potential to learn from historical executions and make better decisions on the fly.
A system powered by a large language model could potentially improve its performance over time, learning from specific tasks or user preferences without explicit code.
Perhaps "Agents" aren't meant to replace our existing services entirely. Instead, they represent a new architectural approach that can work alongside and enhance our current systems.
More readings if you're interested: