AI wrote a scientific paper that passed peer review


Science has always relied on a curious human forming a hypothesis, designing an experiment, analyzing the results and presenting the case to peers. Over centuries, we’ve built better tools such as electron microscopes, particle accelerators and supercomputers, but the core loop of scientific discovery has remained stubbornly human. Now, for the first time, a new kind of mind has begun to run that loop.

So far, scientists have often had artificial intelligence help them solve a predefined, narrow task, such as folding proteins, says Jeff Clune, a professor of computer science at the University of British Columbia. “We’re saying the AI gets to be the scientist,” he says.

In a recent Nature study, Clune and his colleagues unveiled the AI Scientist, an AI system that, without human involvement, wrote a paper that passed peer review for a workshop at the 2025 International Conference on Learning Representations (ICLR), a top-tier venue in the field of machine learning. The paper was mediocre, according to Clune and other experts. But its existence marks a turning point that the scientific community is only beginning to grapple with: AI has quickly moved from assisting scientists to attempting to be one.


The AI Scientist comprises multiple modules. After researchers give it a general topic prompt, it surveys the available literature and generates hypotheses. “We’re just giving it a general direction like ‘Come up with something interesting to study on how the AI learns,’” Clune explains. The system then evaluates and refines those ideas, filtering out any that are not novel. From there, further modules plan and execute experiments, analyze and plot the data and, finally, write the paper. It even runs its own internal peer review to find flaws in its papers, Clune says. (The system relies on existing foundation models such as Anthropic’s Claude Sonnet or OpenAI’s GPT-4o; the team’s contribution is the pipeline orchestrating these models.)
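
The study describes this flow in prose rather than code, but it maps onto a straightforward orchestration loop. The Python sketch below is purely illustrative and is not the team’s implementation: the call_llm helper, the prompts and every function name are assumptions standing in for calls to a foundation model such as Claude Sonnet or GPT-4o.

```python
# Hypothetical orchestration sketch of a staged "AI scientist" pipeline.
# NOT the actual AI Scientist code: call_llm(), the prompts and the function
# names are illustrative placeholders for calls to a foundation model such
# as Anthropic's Claude Sonnet or OpenAI's GPT-4o.

def call_llm(prompt: str) -> str:
    """Placeholder for a foundation-model API call (assumed interface)."""
    raise NotImplementedError("Connect this to a model provider of your choice.")

def generate_ideas(topic: str, n: int = 5) -> list[str]:
    # Survey what the model knows of the literature and propose hypotheses.
    ideas = call_llm(f"Given the topic '{topic}', propose {n} research ideas, one per line.")
    return [line for line in ideas.splitlines() if line.strip()]

def is_novel(idea: str) -> bool:
    # Filter out ideas the model judges to be already published.
    verdict = call_llm(f"Is this idea novel relative to prior work? Answer yes or no.\n{idea}")
    return verdict.strip().lower().startswith("yes")

def run_experiments(idea: str) -> str:
    # Plan and execute experiments. A real system would generate and run
    # code and collect data; this sketch only asks the model for a report.
    return call_llm(f"Plan experiments to test '{idea}' and summarize the results.")

def write_paper(idea: str, results: str) -> str:
    # Analyze the findings and draft the manuscript.
    return call_llm(f"Write a short research paper.\nIdea: {idea}\nResults: {results}")

def internal_review(paper: str) -> str:
    # Self-critique stage: the model reviews its own draft for flaws.
    return call_llm(f"Act as a peer reviewer and list the flaws in this paper:\n{paper}")

def pipeline(topic: str) -> list[tuple[str, str]]:
    papers = []
    for idea in filter(is_novel, generate_ideas(topic)):
        results = run_experiments(idea)
        paper = write_paper(idea, results)
        papers.append((paper, internal_review(paper)))
    return papers
```

Each stage is just another prompt to the same underlying model, which is why the team describes its contribution as the pipeline that orchestrates those prompts rather than the models themselves.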

To see if the AI Scientist’s output could meet human standards, the team submitted three papers it generated to the I Can’t Believe It’s Not Better (ICBINB) workshop at the 2025 ICLR. One was accepted. (The conference organizers gave their permission for the AI-generated papers to be submitted, and all of the AI Scientist’s papers were withdrawn from the conference after the review process.)

The team behind the AI Scientist admits the bar for this workshop was lower than that for a main conference publication. “Would a mediocre graduate student get one paper in three accepted at a place that accepts 70 percent of papers? Sure!” says Jodi Schneider, an associate professor of information sciences at the University of Illinois Urbana-Champaign, who was not involved in Clune’s study.

The AI’s papers “are okay but not great,” Clune says. To him, some of the AI’s ideas seemed truly creative, yet the system struggled with execution. “The logic and the writing and the thinking throughout the whole paper didn’t all fit together beautifully,” he notes. Further issues included hallucinated references, duplicated figures and a lack of methodological rigor.

Overall, Clune and his colleagues’ new study has received a lukewarm reception. “The approach is agentic and without any real novelty,” says Maria Liakata, a professor of natural language processing at Queen Mary University of London, who was not involved in the work.

There was one metric, though, where the AI Scientist did outperform human researchers by a huge margin: it produced a formally passable paper on machine learning within 15 hours at a cost Clune estimated to be around $140. Compare that with a graduate student, who might take a full semester to write their first accepted workshop paper, according to Schneider.

As costs drop and output speeds increase, AI-authored papers present the scientific community with an immediate challenge. “The AI-written papers are probably going to make things much worse,” warns Yanan Sui, an associate professor at Tsinghua University in China and the senior workshop chair for ICLR 2026.

To safeguard against such a flood, top-tier venues have begun setting limits. “There are strict rules for the main conference that do not allow submission of purely AI-written papers,” Sui says. The compromise, for now, is forced transparency: authors who use AI must clearly state how it was used. Sui admits, though, that journals and conferences usually lack the tools to reliably detect AI-generated contributions.

The tools to autonomously write these contributions, meanwhile, have already started to proliferate. Intology claimed its AI Zochi passed peer review for the main proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (though human researchers were involved in areas such as verifying results before submission and communicating with peer reviewers). Another group, called the Autoscience Institute, stated that its AI system had papers accepted at ICLR workshops before the AI Scientist did.

“We’re not going to be able to remove the power to generate AI scientific papers,” says Aaron Schein, a data scientist at the University of Chicago and one of the ICBINB workshop organizers. “This technology is only going to get better. I don’t think there’s anything to do about that.”

But what if one day the AI-generated papers stop being mediocre?

Clune sees the transition unfolding in two phases. “In the very short term, you’re going to get a lot of slop and garbage, and the peer review systems are going to have to deal with that,” he says. But eventually, he argues, AI systems will be far better at science than human researchers. “I predict the AI Scientist actually marks the dawn of a new era of rapid scientific advances,” Clune claims, imagining humans reduced to curators witnessing AI achieve scientific wonders.

Liakata, though, thinks there’s still something for us humans to do. “I believe the future is not fully autonomous scientific discovery but advanced human-agent interaction where the human can scrutinize and contribute to the process,” she says.

