AI agents are supposed to be the next big thing in AI, but there is no exact definition of what they are. At the moment, people disagree on what exactly constitutes an AI agent.
At its simplest, an AI agent is best described as software that uses AI to perform a series of tasks for you that might in the past have been handled by a human customer service agent, human resources representative, or IT help desk employee, although it could ultimately involve any task. When you ask it to do something, it does it for you, potentially across multiple systems, and it does more than just answer questions. For example, Perplexity last month released an AI agent to help people with their holiday shopping (though it's not the only one). And last week, Google announced its first AI agent, called Project Mariner, which can be used to search for flights and hotels, shop for groceries, find recipes, and handle other tasks.
It seems simple enough, right? But the lack of clarity complicates things. There's no consensus, even among the tech giants. Google sees agents as task-based assistants depending on the job: helping developers write code, helping marketers create color schemes, helping IT professionals track down issues by querying log data.
At Asana, agents act like additional employees, handling assigned tasks like any good colleague. Sierra, a startup founded by former Salesforce co-CEO Bret Taylor and Google veteran Clay Bavor, sees agents as customer experience tools, helping people go far beyond traditional chatbots to take actions that solve complex problems.
The lack of a consistent definition leaves room for confusion about what exactly these agents do, but however they are defined, agents are meant to complete tasks in an automated manner, with minimal human intervention.
Rudina Seseri, founder and managing partner of Glasswing Ventures, says it's still early days, and that could be the reason for the lack of agreement. “There is no single definition of what an ‘AI agent’ is. However, the most common view is that an agent is an intelligent software system designed to perceive its environment, reason about it, make decisions, and take actions to autonomously achieve specific objectives,” Seseri told TechCrunch.
She says they use a range of AI technology to make that happen. “These systems incorporate a variety of AI/ML techniques, including natural language processing, machine learning, and computer vision, and operate autonomously in dynamic domains, whether independently or alongside other agents and human users.”
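Seseri's definition maps onto the classic perceive-reason-act loop. Here is a minimal sketch of that loop in Python; the toy counter environment and keyword-free "policy" are invented for illustration and are not any particular agent framework's API:

```python
# A minimal perceive-reason-act loop, illustrating the definition above.
# The Environment class is a toy stand-in: a counter the agent can
# increment toward a target, not a real external system.

class Environment:
    def __init__(self, target):
        self.state, self.target = 0, target

    def observe(self):
        return self.state

    def execute(self, action):
        if action == "increment":
            self.state += 1
        return self.state

def run_agent(env, max_steps=100):
    """Observe, decide, and act until the goal is reached."""
    for step in range(max_steps):
        observation = env.observe()    # perceive the environment
        if observation >= env.target:  # reason: is the goal met?
            return step                # decide: stop
        env.execute("increment")       # act on the environment
    return max_steps
```

A real agent would replace the counter with calls into external systems and the hard-coded goal check with a model's reasoning, but the loop structure is the same.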
Aaron Levie, co-founder and CEO of Box, says that as AI capabilities improve over time, AI agents will be able to do more on behalf of humans, and that the dynamics driving that evolution are already at work.
“AI agents have multiple self-reinforcing flywheels that will dramatically improve what AI agents can accomplish in the short and long term: GPU price/performance, model efficiency, model quality and intelligence, and improvements to AI frameworks and infrastructure,” Levie recently wrote on LinkedIn.
This is an optimistic view of the technology that assumes growth in all of these areas, and that is not necessarily a given. Rodney Brooks, a pioneer in robotics at MIT, pointed out in a recent TechCrunch interview that AI has much harder problems to deal with than most technologies, and that it won't necessarily grow at the rapid, Moore's Law-like pace that chips have.
“When humans see an AI system perform a task, they immediately generalize it to things that are similar and make an estimate of the competence of the AI system; not just the performance on that, but the competence around that,” Brooks said during that interview. “And they're usually very over-optimistic, because they're using a model of a person's performance on a task.”
The problem is that crossing systems is hard, and it's further complicated by the fact that some legacy systems lack basic API access. While we're seeing the steady improvements Levie alluded to, getting software to access multiple systems while resolving any issues it encounters along the way could prove more difficult than many think.
If so, everyone may be overestimating what an AI agent should be able to do. David Cushman, a research leader at HFS Research, sees today's crop of bots much as Asana does: as assistants that help humans complete certain tasks in service of some strategic goal defined by the user. The challenge is enabling machines to respond to contingencies in a truly automated way, and we are clearly not close to that yet.
“I think that's the next step,” he said. “It's where AI is operating independently and effectively at scale, where humans set the guidelines and guardrails and apply multiple technologies to take the human out of the loop, when everything so far has been about keeping the human in the loop with GenAI.” The key here, he said, is to let AI agents take over and apply true automation.
Jon Turow, a partner at Madrona Ventures, says this will require building AI agent infrastructure: a technology stack designed specifically for creating agents (however you define them). In a recent blog post, Turow outlined examples of AI agents working in the wild today and how they are being built.
In Turow's view, the proliferation of AI agents requires a technology stack like any other technology, even though he admits the definition is still somewhat elusive. “All of this means our industry has work to do to build the infrastructure that supports AI agents and the applications that rely on them,” he wrote in the post.
“Over time, reasoning will gradually improve, frontier models will steer more and more of the workflow, and developers will want to focus on their products and data, the things that differentiate them. They want the underlying platform to ‘just work’ with scale, performance, and reliability.”
Another thing to keep in mind is that making an agent work will likely require multiple models rather than a single LLM, which makes sense if you think of these agents as a collection of different tasks. “Right now, I don't believe that any single large language model, at least any publicly available, monolithic large language model, can handle agentic tasks. I don't think they can do that multistep reasoning yet. I think they're getting close, but they're not quite there yet,” said Fred Havemeyer, head of U.S. AI and software research at Macquarie US Equity Research.
“I think the most effective agents will likely be collections of multiple different models with a routing layer that sends requests and prompts to the most effective agent and model. And it would be kind of an interesting [automated] supervisor, a delegating sort of role.”
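The routing layer Havemeyer describes can be sketched in a few lines. In this toy version, the keyword classifier and the stand-in "models" are invented for illustration; a real supervisor would likely use a model to classify the request and would call actual model APIs:

```python
# A toy sketch of a routing layer: a "supervisor" that classifies each
# incoming prompt and delegates it to the model best suited to the task.

def classify(prompt):
    """Crude keyword-based intent detection; a production router
    might use an LLM or a trained classifier here instead."""
    lowered = prompt.lower()
    if any(word in lowered for word in ("code", "function", "bug")):
        return "coding"
    if any(word in lowered for word in ("summarize", "summary")):
        return "summarization"
    return "general"

# Stand-ins for specialized models; each just tags its input.
MODELS = {
    "coding": lambda p: f"[code-model] {p}",
    "summarization": lambda p: f"[summary-model] {p}",
    "general": lambda p: f"[general-model] {p}",
}

def route(prompt):
    """Delegate the prompt to the model matching its classified task."""
    return MODELS[classify(prompt)](prompt)
```

The design choice here is the one Havemeyer points to: no single model handles everything, and the router's only job is delegation.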
Ultimately, for Havemeyer, the industry is working toward the goal of agents operating independently. “As I think about the future of agents, I'd want to see agents that are truly autonomous, able to take abstract goals and reason out all the individual steps in between completely independently,” he told TechCrunch.
But the reality is that we are still in a transition period with these agents, and we don't know when we'll reach the end state Havemeyer described. What we've seen so far is clearly a promising step in the right direction, but several advances and breakthroughs are still needed before AI agents behave as they're being imagined today. And it's important to understand that we're not there yet.
This story was originally published on July 13, 2024 and has been updated to include new agents from Perplexity and Google.