What do AI voice agents and self-driving cars have in common? Brooke Hopkins, a former tech lead at Waymo, argues that their performance can be evaluated in similar ways. Coval, Hopkins' new startup, is trying to do just that.
“When I left Waymo, I realized that many of the problems we were having at Waymo were exactly what the rest of the AI industry was facing,” Hopkins (pictured above, center) told TechCrunch. “But everyone was saying this is a new paradigm and we have to come up with testing methods from first principles and basically everyone has to rewrite everything. And I said, ‘Wait, we've spent the last 10 years trying to figure out how to make autonomous driving a reality.’”
In 2024, she decided to launch Coval, a platform that builds simulations to test and evaluate how AI voice and chat agents perform tasks, in the same way that Hopkins tested self-driving cars at Waymo. Coval can run thousands of simulations simultaneously, such as having agents make restaurant reservations or answer customer service questions posed indirectly.
Coval's technology evaluates agents against a common set of metrics, but companies can also customize what they are looking for and keep using Coval to check for regressions. Users can also take this data and the insights it provides and share them with end customers as a demo or monitoring tool to show that their agents are working as intended.
“One of the biggest hurdles for companies adopting agents is having confidence that this is not just a smoke-and-mirrors demo,” Hopkins said. “Vendor selection is a very complex task for these executives, because it is very difficult to know what to ask for and even how to prove that these agents are doing what is expected. And this allows our companies to really show and prove that.”
Hopkins developed the idea behind Coval during Y Combinator's summer 2024 batch, before launching the product publicly in October 2024. She said demand has been strong and has exploded over the past two months, with customers asking how quickly they can get their agents evaluated.
The San Francisco-based startup announced a $3.3 million seed round led by MaC Venture Capital with participation from Y Combinator and General Catalyst. The startup will use the funding to strengthen its engineering team and strive to achieve product-market fit. Hopkins added that the company will also work on allowing users to evaluate other types of AI agents, such as web-based agents, in the future.
Coval arrives as the momentum and hype surrounding AI agents appear to be at an all-time high. Enterprise technology leaders like Marc Benioff are praising (and marketing) the technology, saying Salesforce will deploy more than 1 billion AI agents by next year. OpenAI is rumored to be releasing its own take on AI agents soon.
A number of startups have also launched in this field. In Y Combinator's three 2024 cohorts alone, there were more than 100 startups building AI agents. Some AI agent startups have secured large rounds of venture funding. One of them, /dev/agents, raised a $55 million seed round at a $500 million valuation in November 2024, less than a year after it was founded.
This momentum means more companies may seek help evaluating agents. Hopkins said Coval has a chance to stand out from the pack because, unlike the inevitable new entrants, it has a head start.
“I think what really sets us apart is that I’ve been working in this space for five years and have built these systems over and over again,” she said. “We've built multiple iterations, seen how they fail, seen how they scale, and are incorporating the same concepts and all of those learnings into Coval.”