Today, we’re introducing Marvin, an autonomous research agent for ML science. Marvin takes the information overload and busywork out of research. It does deep literature review, generates and tests novel and scientifically valid hypotheses, and can perform the entire research loop fully autonomously, end to end. Learn more about Marvin.

Why we built Marvin

The bottleneck in ML research today isn’t compute or data. It’s the preparation. More research is being produced now than at any point in history, and the pace is only increasing. Researchers must ingest and synthesize growing volumes of information before they can actually start their research. And even once they start, a lot of the research cycle is still spent on logistics rather than the science itself.

We built Marvin because nothing out there worked well enough for our own research. The existing options were either too dumb (chasing red herrings down rabbit holes or proposing smart-sounding ideas that were anything but), too wasteful (channeling Ralph Wiggum on experiments that were never going to work), or too opaque (poor documentation and no reasoning traces, the “logic trail” that forms the bedrock of scientific reproducibility).

Full autonomy with a logic trail

The last of these issues, opacity, is especially problematic when closed-loop agents are doing science. Full autonomy is only useful if you can trust it. AI is incredibly good at generating plausible-looking outputs, which will only compound the reproducibility crisis in academia today.

[Figure: scientific figure showing that higher water intake reduces amyloid pathology and improves cognition in 5xFAD mice.]

Did you know drinking water can prevent Alzheimer’s? Neither did we. Better keep the receipts. [1]

In order for autonomous scientists to contribute real, meaningful discoveries, the system has to do more than generate the output. It has to carry forward rich context continuously, make sensible, data- and fact-driven decisions, and leave behind a clear record for others, both human and agentic, to inspect and validate.

Marvin is for everyone

We do not see autonomous systems as replacements for human work. They should augment us, increase our productivity, and let us spend more of our time on the parts of the work that actually matter. That is why we built Marvin to be a scientific collaborator: flexible and sophisticated enough to function as a coworker, not just a tool.

Whether you’re a highly technical ML researcher who just needs more clones of yourself or a bench scientist who has never written a line of code, Marvin can join your team and pick up the work you want to delegate, at the degree of autonomy you want to grant it. It can handle anything from a literature review alone to a full end-to-end research loop. At any time, you can review or discuss the results or redirect the next experiments before Marvin kicks off again. The level of autonomy is yours to set.

Marvin’s capabilities are also cross-domain. It can do research across fields as diverse as frontier AI research, computational biology and bioinformatics, and materials science. That is because scientific method and rigor are universal, and we designed Marvin’s research loop and memory system around the same principles we used running our own research teams and academic labs.

Work with Marvin

In head-to-head evaluations using both LLM judges and human PhD judges in the relevant fields, Marvin scored higher than competing autonomous science agents on research depth, rigor, and creativity. We will publish a broader meta-paper with those results closer to Marvin’s open launch.

Marvin is in closed testing now. Read more on the Marvin page and see examples of its work there. If you’re interested, we’d love to hear about your project’s needs and discuss how Marvin can help.

  1. Ruslan Salakhutdinov, “the future of science is less about producing results and more about verifying them,” X, July 18, 2025. Embedded figure above from the linked post.