AI agents are supposed to be the next big trend in AI, but there's no exact definition of what they are. At the moment, people can't agree on what exactly an AI agent consists of.
Simply put, an AI agent is AI-powered software that performs a set of tasks a human customer service agent, HR professional, or IT help desk employee would have handled in the past, although it could ultimately take on any task. You ask an agent to do something, and it does it for you. Agents span multiple systems and do more than just answer questions.
Seems simple enough. But the lack of clarity makes it complicated. Even among tech giants, there is no consensus. Google sees it as a task-based assistant for different jobs, like helping developers code, helping marketers create color schemes, or helping IT professionals track down issues by querying log data.
For Asana, agents work like additional employees, handling assigned tasks like talented colleagues. Sierra, a startup founded by former Salesforce co-CEO Bret Taylor and Google veteran Clay Bavor, sees agents as customer experience tools, empowering people to accomplish actions that go far beyond traditional chatbots to help solve a more complex set of problems.
The lack of a consistent definition leaves room for confusion about what these things specifically do, but regardless of how they're defined, agents are meant to help complete tasks automatically, with as little human intervention as possible.
Rudina Seseri, founder and managing partner at Glasswing Ventures, says it's still early days, which may be why there's no consensus: “There's no single definition of what an 'AI agent' is. But the most common view is that an agent is an intelligent software system designed to perceive its environment, reason about it, make decisions, and act to achieve a specific goal autonomously,” Seseri told TechCrunch.
She says they use a range of AI techniques to achieve this: “These systems incorporate a range of AI/ML techniques, including natural language processing, machine learning and computer vision, and operate autonomously or alongside other agents or human users in dynamic domains.”
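The perceive, reason, decide, and act cycle in Seseri's description can be illustrated with a minimal sketch. Everything here is hypothetical: the `TicketQueue` environment, the `HelpDeskAgent` class, and the keyword rule that stands in for a real model are assumptions made for illustration, not anything the people quoted in this article have built.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class TicketQueue:
    """Hypothetical environment: a queue of help desk tickets."""
    open_tickets: List[str]
    resolved: List[str] = field(default_factory=list)


class HelpDeskAgent:
    """Toy agent. The keyword rule stands in for a real model."""

    def perceive(self, env: TicketQueue) -> Optional[str]:
        # Observe the environment: look at the next open ticket, if any.
        return env.open_tickets[0] if env.open_tickets else None

    def decide(self, ticket: str) -> str:
        # Reason about the observation and choose an action.
        if "password" in ticket.lower():
            return "reset_password"
        return "escalate_to_human"

    def act(self, env: TicketQueue, action: str) -> None:
        # Change the environment: close the ticket with the chosen action.
        ticket = env.open_tickets.pop(0)
        env.resolved.append(f"{action}: {ticket}")


def run(agent: HelpDeskAgent, env: TicketQueue, max_steps: int = 10) -> List[str]:
    """Repeat perceive -> decide -> act until the queue is empty."""
    for _ in range(max_steps):
        ticket = agent.perceive(env)
        if ticket is None:
            break
        agent.act(env, agent.decide(ticket))
    return env.resolved
```

Calling `run(HelpDeskAgent(), TicketQueue(["Forgot my password"]))` works through the queue without a human stepping in, which is the "autonomously" part of the definition. A production agent would replace the keyword rule with model calls and real system integrations.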
Box co-founder and CEO Aaron Levie says that over time, AI capabilities will grow and AI agents will be able to do more for humans, and he says the dynamics driving that evolution are already at work.
“AI agents have multiple components in a self-reinforcing flywheel that are helping to dramatically improve what AI agents can accomplish in the short and long term: GPU price/performance, model efficiency, model quality and intelligence, AI frameworks, and infrastructure improvements,” Levie wrote recently on LinkedIn.
This is an optimistic view of technology that assumes growth will occur in all areas, but that's not necessarily the case: MIT robotics pioneer Rodney Brooks noted in a recent TechCrunch interview that AI has to deal with much harder problems than most technologies, and it won't necessarily grow as rapidly as chips based on Moore's Law.
“When humans see an AI system perform a task, they quickly generalize that to similar ones and make inferences about the AI system's capabilities — not just its performance on that task, but its capabilities related to that task,” Brooks said in an interview. “And they're usually very overoptimistic, because they're using a model of human performance on the task.”
The problem is that crossing systems is hard, and the situation is complicated by the fact that some legacy systems lack even basic API access. While there have been the steady improvements Levie alluded to, getting software to access multiple systems while solving problems it encounters along the way may be harder than many think.
If so, then we may all be overestimating what AI agents can do. David Cushman, research lead at HFS Research, sees the current crop of bots much as Asana does: as assistants that help humans complete specific tasks to achieve strategic goals defined by the user. The challenge is to enable machines to deal with unforeseen circumstances in a fully automated way, and we're clearly not close to that yet.
“I think this is the next step,” he says, “where AI can operate independently and effectively at scale, where humans set guidelines and guardrails and apply multiple technologies to take humans out of the loop. Up until now, everything has been focused on keeping humans in the loop with GenAI.” So the key, in his view, is to let AI agents take over and apply true automation.
Jon Turow, a partner at Madrona Ventures, says this will require the creation of an AI agent infrastructure: a technology stack designed specifically for building agents, however you define them. In a recent blog post, Turow outlined some examples of AI agents currently in production and how they're being built today.
In Turow's view, the proliferation of AI agents, like any other technology, requires a technology stack (though he acknowledges that the definition is still a bit hazy). “All this means there's a challenge ahead for the industry to build out the infrastructure to support AI agents and the applications that rely on them,” Turow wrote in the post.
“Over time, inference will incrementally improve, state-of-the-art models will guide more workflows, and developers will focus more on their product and their data — what differentiates them. Developers want the underlying platform to 'just work' with scale, performance, and reliability.”
It's also worth noting that you'll probably need multiple models to make your agents work, not just a single LLM. This makes sense if you think of these agents as a collection of different tasks. “Right now, I don't think a single large language model can handle agent tasks, at least not with the monolithic large language models that are publicly available. I don't think you're yet going to get to multi-step inference, which is what I really hope for in the future of agents. I think we're getting close, but we're not there yet,” said Fred Havemeyer, head of U.S. AI and software research at Macquarie US Equity Research.
“I think the most effective agents are multiple collections of multiple different models, with a routing layer that sends requests and prompts to the most effective agents and models. And I think that's going to get kind of interesting. It's a role like that of an [automated] supervisor and delegator.”
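A routing layer of the kind Havemeyer describes can be sketched in a few lines. This is only an illustration under stated assumptions: the `Router` class, the `make_stub` helper, the model names, and the keyword matching are all hypothetical, and the stub functions stand in for calls to real models. A real router would more likely use a classifier or a model to pick the destination.

```python
from typing import Callable, Dict, Tuple

# A "model" here is any function from prompt to response (an assumption).
Model = Callable[[str], str]


def make_stub(name: str) -> Model:
    """Build a placeholder model that tags its output with its own name."""
    def model(prompt: str) -> str:
        return f"[{name}] {prompt}"
    return model


class Router:
    """Hypothetical supervisor that delegates each prompt to one model."""

    def __init__(self, routes: Dict[str, Tuple[Tuple[str, ...], Model]],
                 default: Model) -> None:
        self.routes = routes
        self.default = default

    def pick(self, prompt: str) -> str:
        # Return the first route whose keywords appear in the prompt.
        text = prompt.lower()
        for name, (keywords, _) in self.routes.items():
            if any(keyword in text for keyword in keywords):
                return name
        return "default"

    def dispatch(self, prompt: str) -> str:
        # Send the prompt to the chosen model, or fall back to the default.
        name = self.pick(prompt)
        model = self.routes[name][1] if name in self.routes else self.default
        return model(prompt)


router = Router(
    routes={
        "code": (("bug", "function", "stack trace"), make_stub("code-model")),
        "logs": (("log", "outage"), make_stub("log-model")),
    },
    default=make_stub("general-model"),
)
```

With this setup, `router.dispatch("Fix this bug in my function")` goes to the code stub, while a prompt that matches no keywords falls through to the general stub. The supervisor never answers anything itself; it only decides who should.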
Ultimately, for Havemeyer, the industry is working toward the goal of agents operating independently. “When I think about the future of agents, I would like to see and hope to see agents that are truly autonomous, that have abstract goals and can reason completely independently about all of the individual steps in between,” he told TechCrunch.
But the reality is, we're still in a transitional phase with these agents, and we don't know when we'll reach the final stage that Havemeyer mentioned. What we've seen so far is clearly a promising step in the right direction, but we still need some progress and breakthroughs before AI agents can function as we currently envision them. And it's important to understand that we're not there yet.