On Tuesday, OpenAI released new tools designed to help developers and businesses build AI agents (automated systems that can accomplish tasks independently) using the company's own AI models and frameworks.
The tools are part of OpenAI's new Responses API, which lets developers build custom AI agents that can perform web searches, scan company files, and navigate websites. The Responses API effectively replaces OpenAI's Assistants API, which the company plans to sunset in the first half of 2026.
The hype around AI agents has grown dramatically in recent years, even as the tech industry has struggled to show, or even define, what “AI agents” really are. In the latest example of agent hype running ahead of utility, Chinese startup Butterfly Effect went viral earlier this week with a new AI agent platform called Manus, which users quickly discovered did not live up to many of the company's promises.
In other words, OpenAI has a strong interest in getting agents right.
“It's very easy to demo an agent,” OpenAI's head of API product, Olivier Godement, told TechCrunch in an interview. “Scaling agents is pretty difficult, and it's very difficult to get people to use them.”
Earlier this year, OpenAI introduced two AI agents in ChatGPT: Operator, which navigates websites on your behalf, and deep research, which compiles research reports for you. Both tools offered a glimpse of what agent technology can achieve, but left a fair amount to be desired in the “autonomy” department.
Now, OpenAI wants to sell access to the components that power its AI agents through the Responses API, allowing developers to build their own Operator- and deep research-style agent applications. OpenAI's hope is that developers can create applications with agent technology that feel more autonomous than what is available today.
Using the Responses API, developers can tap the same AI models that power OpenAI's ChatGPT Search under the hood: GPT-4o search and GPT-4o mini search, both in preview. The models can browse the web to answer questions and cite their sources when generating replies.
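For developers, the web search capability shows up as a tool attached to a Responses API call. Here is a minimal sketch in Python, assuming the request shape shown in OpenAI's announcement; exact model and tool names may differ slightly in practice:

```python
# Minimal sketch: calling the Responses API with the web search tool.
# Assumes OPENAI_API_KEY is set in the environment.
from openai import OpenAI

client = OpenAI()

response = client.responses.create(
    model="gpt-4o",
    tools=[{"type": "web_search_preview"}],  # lets the model search the web
    input="What are the latest developments in AI agent frameworks?",
)

# The reply text includes inline citations to the sources the model used.
print(response.output_text)
```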
OpenAI claims that GPT-4o search and GPT-4o mini search are quite accurate. On SimpleQA, the company's benchmark measuring a model's ability to answer short, factual questions, GPT-4o search scores 90% and GPT-4o mini search scores 88% (higher is better). For comparison, GPT-4.5, OpenAI's much larger and more recently released model, scores just 63%.
That AI-powered search tools are more accurate than conventional AI models isn't necessarily surprising; in theory, GPT-4o search can simply look up the correct answer. But web search doesn't make hallucinations a solved problem: GPT-4o search still gets 10% of factual questions wrong. Beyond accuracy, AI search tools tend to struggle with short, navigational queries (such as “Lakers score today”), and recent reports suggest that ChatGPT's citations aren't always reliable.
The Responses API also includes a file search utility that can quickly scan files in a company's databases to retrieve information. (OpenAI says it does not train models on these files.)
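The file search utility plugs into the same tools array. A minimal sketch, assuming a vector store of company files has already been created (the ID below is hypothetical) and that parameter names match OpenAI's announcement:

```python
# Minimal sketch: file search over a previously created vector store.
from openai import OpenAI

client = OpenAI()

response = client.responses.create(
    model="gpt-4o-mini",
    tools=[{
        "type": "file_search",
        "vector_store_ids": ["vs_company_docs"],  # hypothetical vector store ID
    }],
    input="What is our refund policy for enterprise customers?",
)

# The reply is grounded in the documents retrieved from the vector store.
print(response.output_text)
```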
Additionally, developers using the Responses API can tap OpenAI's Computer-Using Agent (CUA) model, which generates mouse and keyboard actions, letting developers automate computer-use tasks such as data entry and app workflows. According to OpenAI, enterprises can run the CUA model, released in research preview, locally on their own systems. The consumer version of CUA, which powers Operator, can only take actions on the web.
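The CUA model is also exposed as a tool in the Responses API. A minimal sketch of a first request, assuming the tool and parameter names from OpenAI's announcement (the display dimensions and browser environment are illustrative); in practice, the developer executes the suggested actions and sends screenshots back to the model in a loop:

```python
# Minimal sketch: asking the computer use tool for its first suggested actions.
from openai import OpenAI

client = OpenAI()

response = client.responses.create(
    model="computer-use-preview",
    tools=[{
        "type": "computer_use_preview",
        "display_width": 1024,    # illustrative screen size
        "display_height": 768,
        "environment": "browser",
    }],
    input="Open the expense report form and fill in today's date.",
    truncation="auto",
)

# The model replies with items describing actions (clicks, keystrokes) to perform.
for item in response.output:
    print(item.type)
```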
To be clear, the Responses API doesn't solve all the technical issues that plague AI agents today.
In a blog post provided to TechCrunch, OpenAI said the CUA model is “still not very reliable” for automating operating system tasks, and is prone to making “careless” mistakes.
However, OpenAI said these are early iterations of its agent tools, and that it is continuously working to improve them.
In addition to the Responses API, OpenAI is releasing an open source toolkit called the Agents SDK. It gives developers free tools for integrating models with their internal systems, implementing safeguards, and monitoring AI agent activity for debugging and optimization. The Agents SDK is a kind of follow-up to Swarm, a multi-agent orchestration framework OpenAI released late last year.
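In practice, the Agents SDK lets developers define an agent in a few lines of Python. A minimal sketch, based on the examples in OpenAI's announcement (the agent name and instructions are made up for illustration); names and options may differ slightly in the released package:

```python
# Minimal sketch using the open source Agents SDK (pip install openai-agents).
from agents import Agent, Runner

# Define an agent with plain-language instructions.
support_agent = Agent(
    name="Support agent",  # hypothetical example agent
    instructions="Answer questions about the product and escalate billing issues.",
)

# Run the agent synchronously on a single user request.
result = Runner.run_sync(support_agent, "How do I reset my password?")
print(result.final_output)
```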
Godement hopes OpenAI can close the gap between AI agent demos and products this year; in his view, “agents are AI's most impactful application.” That echoes a declaration OpenAI CEO Sam Altman made in January: that 2025 would be the year AI agents enter the workforce.
Whether or not 2025 really turns out to be the “year of the AI agent,” OpenAI's latest release shows the company is eager to move beyond flashy agent demos to impactful tools.