Japanese AI startup Sakana said its AI produced one of the first peer-reviewed scientific publications. But while the claim isn't necessarily untrue, there are caveats worth noting.
The debate around AI and its role in the scientific process grows fiercer by the day. Many researchers don't believe AI is ready to serve as a “co-scientist,” while others think there's potential but acknowledge it's early days.
Sakana falls into the latter camp.
The company said it used an AI system called AI Scientist-V2 to generate a paper that Sakana then submitted to a workshop at ICLR, a long-running and reputable AI conference. Sakana claims that the workshop's organizers, as well as ICLR's leadership, agreed to work with the company on an experiment to double-blind review AI-generated manuscripts.
Sakana said it collaborated with researchers at the University of British Columbia and the University of Oxford to submit three AI-generated papers to the aforementioned workshop for peer review. AI Scientist-V2 generated the papers “end-to-end,” including the scientific hypotheses, experiments and experimental code, data analyses, visualizations, text, and titles.
“We generated research ideas by providing the AI with the workshop's abstract and description,” Robert Lange, a research scientist and founding member at Sakana, told TechCrunch in an email. “This ensured that the generated papers were on topic and suitable submissions.”
One of the three papers, which casts a critical lens on training techniques for AI models, was accepted to the ICLR workshop. Sakana said it immediately withdrew the paper before it could be published, in the interest of transparency and out of respect for ICLR conventions.
A snippet of the paper generated by Sakana's AI. Image credits: Sakana
“The accepted paper both introduces a new, promising method for training neural networks and shows that there are remaining empirical challenges,” Lange said. “It provides an interesting data point to spark further scientific investigation.”
However, the results are not as impressive as they seem at first glance.
In a blog post, Sakana admitted that its AI occasionally made “embarrassing” citation errors, for example, attributing a method to a 2016 paper rather than the original 1997 work.
Sakana's paper also didn't undergo as much scrutiny as some peer-reviewed publications. Because the company withdrew it after the initial peer review, the paper never received an additional “meta-review,” during which the workshop organizers could in theory have rejected it.
Then there's the fact that acceptance rates for conference workshops tend to be higher than acceptance rates for the main “conference track.” The company admitted that none of its AI-generated studies passed its internal bar for an ICLR conference track publication.
Matthew Guzdial, an AI researcher and assistant professor at the University of Alberta, called Sakana's results “a bit misleading.”
“The Sakana folks selected the papers from some number of generated papers, meaning they were using human judgment in terms of picking outputs they thought might get in,” he said in an email. “What I think this shows is that humans plus AI can be effective, not that AI alone can create scientific progress.”
Mike Cook, a researcher at King's College London specializing in AI, questioned the rigor of the peer reviewers and the workshop.
“Newer workshops like this one are often reviewed by more junior researchers,” he told TechCrunch. “It's also worth noting that this workshop is about negative results and difficulties, which is great (I've run a similar workshop before), but it's arguably easier to get an AI to write convincingly about failures.”
Cook added that he wasn't surprised an AI could pass peer review, considering that AI excels at writing human-sounding prose. Partially AI-generated papers passing journal review aren't even new, Cook noted, nor are the ethical dilemmas this poses for the sciences.
AI's technical shortcomings, such as its tendency to hallucinate, make many scientists wary of endorsing it for serious work. Moreover, experts fear that AI could simply end up generating noise in the scientific literature.
“We need to ask ourselves whether [Sakana's] result is about how good AI is at designing and performing experiments, or whether it's about how good it is at selling ideas to humans,” Cook said. “There's a difference between passing peer review and contributing knowledge to a field.”
To its credit, Sakana makes no claim that its AI can produce groundbreaking, or even especially novel, scientific research. Rather, the goal of the experiment was to “study the quality of AI-generated research.”
“[T]here are difficult questions about whether [AI-generated] science should be judged on its own merits first to avoid bias against it,” the company wrote. “Going forward, we will continue to exchange opinions with the research community on the state of this technology to ensure that it does not develop into a situation where its sole purpose is to pass peer review, which would substantially undermine the meaning of the scientific peer review process.”