François Chollet, a former Google engineer and influential AI researcher, has co-founded a nonprofit that develops benchmarks to probe “human-level” intelligence in AI.
The nonprofit, the ARC Prize Foundation, will be led by Greg Kamradt, a former Salesforce engineering director and founder of the AI product studio Leverage, who will serve as president and director.
Fundraising for the ARC Prize Foundation will begin in late January.
“[W]e are … growing into a proper nonprofit foundation that serves as a useful north star to artificial general intelligence [and] seeks to stimulate progress by highlighting [the gap] in basic human abilities,” Chollet wrote in a post on the nonprofit’s website. (Artificial general intelligence is a vague term, but it is generally understood to mean AI that can perform most of the tasks a human can.)
The ARC Prize Foundation builds on ARC-AGI, a test Chollet developed to assess whether AI systems can efficiently learn new skills outside of the data they were trained on. It consists of puzzle-like problems in which an AI must generate the correct “answer” grid from a collection of differently colored squares. The problems are designed to force an AI to adapt to tasks it has never seen before.
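To make the format concrete: in the publicly released ARC dataset, each task is a small JSON object of paired input/output grids, with colors encoded as integers 0–9. The toy task and naive hypothesis-testing solver below are an illustrative sketch of how such a task might be approached, not a real ARC problem or a competition entry.

    # A toy, illustrative ARC-AGI-style task. In the public ARC dataset
    # (github.com/fchollet/ARC), each task has "train" and "test" lists of
    # input/output grid pairs; grids are 2D lists of color indices 0-9.
    # This particular task and solver are hypothetical examples.

    task = {
        "train": [
            {"input": [[1, 0], [0, 0]], "output": [[0, 0], [0, 1]]},
            {"input": [[0, 2], [0, 0]], "output": [[0, 0], [2, 0]]},
        ],
        "test": [
            {"input": [[0, 0], [3, 0]]},  # hidden answer: [[0, 3], [0, 0]]
        ],
    }

    def rotate_180(grid):
        """One candidate transformation a solver might hypothesize."""
        return [row[::-1] for row in grid[::-1]]

    # Accept the hypothesis only if it explains every training pair, then
    # apply it to the unseen test input.
    if all(rotate_180(pair["input"]) == pair["output"] for pair in task["train"]):
        print(rotate_180(task["test"][0]["input"]))  # [[0, 3], [0, 0]]

Inferring a transformation from a handful of examples and applying it to an unseen input is exactly the kind of on-the-fly adaptation ARC-AGI is meant to measure.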
Chollet introduced ARC-AGI, short for “Abstraction and Reasoning Corpus for Artificial General Intelligence,” in 2019. Many AI systems can ace math olympiad exams and generate plausible solutions to PhD-level problems, but until this year the best-performing AIs could solve just under a third of ARC-AGI’s tasks.
“Unlike most state-of-the-art AI benchmarks, we are not trying to measure AI risk with herculean exam questions,” Chollet wrote in the post. “Future versions of the ARC-AGI benchmark will focus on shrinking [the human capability] gap toward zero.”
Last June, Chollet and Zapier co-founder Mike Knoop launched a competition to build an AI capable of beating ARC-AGI. OpenAI’s unreleased o3 model was the first to achieve a qualifying score, but it required an enormous amount of computing power.
Chollet has acknowledged that ARC-AGI is flawed and that many models have been able to achieve high scores through brute force, and he has made clear that he does not believe o3 possesses human-level intelligence.
“[E]arly data points suggest that the upcoming [ARC-AGI successor] benchmark will still pose a significant challenge to o3, potentially reducing its score to under 30% even at high compute (while a smart human would still be able to score over 95% with no training),” Chollet said in a statement last December. “You’ll know AGI is here when the exercise of creating tasks that are easy for regular humans but hard for AI becomes simply impossible.”
Knoop says the second-generation ARC-AGI benchmark will launch “in the first quarter” alongside a new competition, and the nonprofit also plans to begin designing a third edition of ARC-AGI.
It remains to be seen how the ARC Prize Foundation will address criticism that Chollet has oversold ARC-AGI as a benchmark for AGI. The very definition of AGI is hotly contested; one OpenAI staff member recently argued that if AGI is defined as AI that is “better than most humans at most tasks,” then it has “already” been achieved.
Interestingly, OpenAI CEO Sam Altman said in December that the company intends to partner with the ARC-AGI team to build future benchmarks. Chollet did not provide any updates on the potential partnership in today's announcement.
In a series of posts, however, Chollet stated that he would be establishing a new company.