The New Scientific Instrument
Every era of scientific acceleration has been catalyzed by a new instrument: the telescope revealed the cosmos, the microscope unveiled cellular life, the particle accelerator exposed subatomic structure. Each time, the instrument did not replace the scientist -- it extended human perception into territories previously unreachable.
We are witnessing the emergence of a new scientific instrument of comparable significance: the AI scientist. Not a metaphor, but an increasingly literal description of systems that can read thousands of papers, generate hypotheses, design experiments, analyze data, and produce manuscripts -- in some cases with minimal human intervention.
This article maps the landscape from research assistants to fully autonomous scientific agents, assessing both the extraordinary opportunities for advancing human knowledge and the cautionary tales that must temper our ambition. The spirit here is optimistic and pro-human: AI for science is, at its best, the most powerful amplifier of human curiosity ever built.
"The question is not whether AI will transform science. The question is whether we will guide that transformation with the rigor, transparency, and humanistic values that science itself demands."
-- The thesis of this article
Layer 1: The Research Assistant Revolution
The first layer of AI for science is already mainstream. A new generation of AI-powered research assistants has fundamentally changed how scientists interact with the literature -- a corpus that now exceeds 225 million papers and grows by millions per year [1].
Elicit: Structured Extraction at Scale
Elicit, used by over 5 million researchers (per its homepage as of early 2026), applies language models to search, summarize, and extract structured data from over 138 million papers. Its strength lies in systematic review workflows: Elicit can identify relevant papers, extract key findings into structured tables, and generate evidence summaries with self-reported accuracy of approximately 90% [2]. For researchers conducting literature reviews that previously took weeks, Elicit compresses the cycle to hours.
SciSpace: The Multilingual Platform
SciSpace offers access to 280 million papers across 75+ languages, linking 150+ research tools into an end-to-end platform for discovering, analyzing, and writing scientific literature [3]. For non-English-speaking research communities -- including the vast Spanish-language scientific ecosystem -- SciSpace represents a meaningful step toward democratizing access to global knowledge.
Consensus: Evidence-Based Answers
Consensus takes a different approach, focusing on answering research questions directly from peer-reviewed evidence using RAG (Retrieval-Augmented Generation) grounded in scholarly databases [4]. When a clinician asks "Does metformin reduce cancer risk?", Consensus synthesizes answers from the actual literature rather than generating plausible-sounding text.
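Consensus does not publish its internal pipeline, but the RAG pattern it describes is simple to illustrate: retrieve the most relevant passages first, then constrain the model to answer only from that retrieved evidence. The sketch below uses a toy bag-of-words retriever over an invented three-paper corpus; every paper ID, abstract, and function name here is hypothetical, chosen only to show the shape of the pattern.

```python
"""Minimal sketch of a retrieval-augmented generation (RAG) loop.

Not Consensus's actual system -- an illustration of the pattern:
retrieve relevant evidence first, then ground the answer in it.
The corpus and all IDs below are invented for the example.
"""
import math
import re
from collections import Counter

# A toy "scholarly database": paper id -> abstract text.
CORPUS = {
    "smith2021": "Metformin use was associated with reduced cancer incidence in diabetic cohorts.",
    "lee2019": "No significant association between metformin and cancer risk after adjustment.",
    "zhao2020": "Telescope observations of exoplanet atmospheres using transit spectroscopy.",
}

def bow(text: str) -> Counter:
    """Bag-of-words vector over lowercased alphabetic tokens."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, k: int = 2) -> list[str]:
    """Return ids of the k abstracts most similar to the query."""
    q = bow(query)
    ranked = sorted(CORPUS, key=lambda pid: cosine(q, bow(CORPUS[pid])), reverse=True)
    return ranked[:k]

def grounded_prompt(query: str) -> str:
    """Build a prompt that instructs the model to answer ONLY from the evidence."""
    evidence = "\n".join(f"[{pid}] {CORPUS[pid]}" for pid in retrieve(query))
    return f"Answer using only the cited evidence below.\n{evidence}\nQuestion: {query}"

prompt = grounded_prompt("Does metformin reduce cancer risk?")
```

In a production system the bag-of-words retriever would be replaced by dense embeddings over hundreds of millions of papers, but the grounding step is the same: the off-topic astronomy paper never enters the prompt, so the model cannot cite it.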
Layer 2: Agentic Scientific Platforms
The second layer moves beyond assisted search into agentic scientific reasoning -- systems that don't just retrieve information but actively reason over it, plan multi-step investigations, and synthesize novel insights.
Ai2 Asta: The Open Science Standard
In August 2025, the Allen Institute for AI (Ai2) launched Asta, an integrated open ecosystem designed to transform scientific research with trustworthy AI agents [5]. Built on the foundation of Semantic Scholar (225M+ papers, 1.5 billion queries per year), Asta represents a principled alternative to the flood of opaque AI tools.
Asta comprises three components: the Asta Assistant for finding papers, generating cited summaries, and running data analyses; AstaBench, the first rigorous benchmark suite for evaluating scientific AI agents on real research tasks; and Asta Resources, an open-source developer toolkit with APIs, post-trained language models for science, and the Scientific Corpus Tool [5].
What makes Asta significant is its commitment to scientific values. As Ai2 CEO Ali Farhadi stated: "AI can be transformative for science, but only if it's held to the same standards as science itself" [5]. In a landscape where many AI tools make opaque claims with no way to evaluate them, Asta's open-source approach with standardized benchmarks sets a new standard.
The landscape of AI for science exists on a spectrum from augmentation (Level 1: research assistants that help find and summarize papers) through agentic reasoning (Level 2: platforms that plan investigations and synthesize evidence) to autonomous discovery (Level 3: end-to-end AI scientists that generate hypotheses, design experiments, and produce new knowledge). Each level amplifies human capability differently -- and demands different governance approaches.
Layer 3: The End-to-End AI Scientist
The most ambitious -- and most consequential -- frontier is the autonomous AI scientist: a system that performs the full cycle of scientific discovery, from hypothesis generation through experimental design, execution, analysis, and publication.
FutureHouse: Robin and Kosmos
FutureHouse, co-founded in 2023 by Sam Rodriques and Andrew White with backing from former Google CEO Eric Schmidt, has set a 10-year mission to build AI scientists capable of end-to-end autonomous discovery [6].
Their system Robin demonstrated the first credible end-to-end AI scientific workflow: all hypotheses, experiment choices, data analyses, and main text figures were generated autonomously by the AI. Human researchers executed the physical experiments, but the intellectual framework was entirely AI-driven. The entire process -- from concept to paper submission -- was completed in 2.5 months by a small team [6].
Robin's successor, Kosmos, represents a major leap. Using "structured world models," Kosmos can process 1,500 papers and 42,000 lines of analysis code in a single run. Perhaps most remarkably, seven beta users polled estimated that a single Kosmos run accomplishes work equivalent to roughly six months of PhD or postdoctoral effort (average ~6.14 months), and its developers report that this output scales linearly with run depth, providing one of the first inference-time scaling laws for scientific research [7].
The commercial spinout Edison Scientific, launched in November 2025, announced a $70 million seed round at a reported $250 million valuation in December 2025, with investors including Google Chief Scientist Jeff Dean. OpenAI CEO Sam Altman has described this category of AI-driven scientific discovery as "one of the most important impacts of AI" [7].
Ginkgo Bioworks: The Autonomous Laboratory
Ginkgo Bioworks, in collaboration with OpenAI, has demonstrated something even more radical: an AI system that autonomously designs, executes, and learns from physical biological experiments [8].
In February 2026, Ginkgo reported results showing that GPT-5, connected to their cloud laboratory infrastructure, ran 36,000 experimental conditions across six iterative cycles to optimize cell-free protein synthesis, achieving what they described as a 40% cost reduction over the state of the art ($422/gram vs. $698/gram). GPT-5 operated as an experimental scientist -- designing experiments, analyzing results, and refining its approach across iterations [8].
This represents a significant milestone. For the first time, an AI system has reportedly closed the loop between digital reasoning and physical experimentation with minimal human intervention. As Ginkgo CEO Jason Kelly stated in the announcement: "This is AI doing real experimental science: designing experiments, running them, and learning from the results" [8].
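The "closed loop" Kelly describes has a simple generic structure: propose a batch of conditions, run them, keep what worked, and tighten the search. The sketch below is not Ginkgo's system; it replaces the physical lab with an invented yield function (`run_assay`, peak and units made up) purely to show the design-execute-analyze-refine cycle.

```python
"""Generic closed-loop experiment optimization sketch.

Not Ginkgo's pipeline -- a minimal illustration of the cycle the
announcement describes. `run_assay` simulates a lab assay with an
invented yield surface; all numbers are hypothetical.
"""
import random

random.seed(0)  # reproducible "experiments"

def run_assay(mg_conc: float, temp: float) -> float:
    """Simulated protein yield; invented peak near mg_conc=8, temp=30."""
    return max(0.0, 10 - (mg_conc - 8) ** 2 * 0.5 - (temp - 30) ** 2 * 0.05)

def closed_loop(cycles: int = 6, batch: int = 8) -> tuple[float, float, float]:
    """Iterate: design a batch near the incumbent best, execute, analyze, refine."""
    best = (4.0, 25.0)                 # starting condition (mg_conc, temp)
    best_yield = run_assay(*best)
    step = 4.0                         # search radius, shrunk each cycle
    for _ in range(cycles):
        # "Design": propose a batch of conditions around the current best.
        candidates = [
            (best[0] + random.uniform(-step, step),
             best[1] + random.uniform(-step, step))
            for _ in range(batch)
        ]
        # "Execute" and "analyze": run assays, keep any improvement.
        for cond in candidates:
            y = run_assay(*cond)
            if y > best_yield:
                best, best_yield = cond, y
        step *= 0.6                    # refine: narrow the search each iteration
    return best[0], best[1], best_yield

mg, temp, final_yield = closed_loop()
```

The real system reportedly ran 36,000 conditions over six cycles with far more sophisticated proposal logic, but the loop structure (propose, measure, update, narrow) is the part that distinguishes closed-loop science from one-shot prediction.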
| System | Level | Capability | Scale |
|---|---|---|---|
| Elicit | L1: Assistant | Search, summarize, extract data | 5M+ researchers, 138M papers |
| SciSpace | L1: Assistant | Multilingual discovery and analysis | 280M papers, 75+ languages |
| Consensus | L1: Assistant | Evidence-based Q&A from literature | Peer-reviewed RAG |
| Ai2 Asta | L2: Agentic | Multi-step scientific reasoning | 225M+ papers, open-source |
| FutureHouse Robin | L3: Autonomous | End-to-end discovery workflow | 2.5 months concept-to-paper |
| Edison / Kosmos | L3: Autonomous | 6-month PhD equivalent per run | $70M seed, $250M valuation |
| Ginkgo + OpenAI | L3+: Closed-loop | Autonomous physical experimentation | 36K conditions, 40% reported improvement |
The Genesis Mission: A National Commitment
On November 24, 2025, President Trump signed an Executive Order launching the Genesis Mission -- described as "this generation's Manhattan Project" for scientific discovery [9]. The goal is audacious: double US scientific productivity in ten years using AI.
The Department of Energy is charged with implementation, integrating the computing power of America's national laboratories into a unified AI platform. The mission identifies at least 20 science and technology challenges of national importance spanning advanced manufacturing, biotechnology, critical materials, nuclear fusion energy, quantum information science, and semiconductors [9]. By August 2026, the DOE must demonstrate initial operating capability for at least one challenge.
Twenty-four organizations have signed collaboration agreements to advance the mission [10]. This is not a research grant program -- it is a national mobilization at the scale of the Apollo Program, bringing together world-class supercomputers, national laboratory infrastructure, and frontier AI capabilities.
For Europe, the Genesis Mission is both an inspiration and a warning. The US is making a whole-of-government bet on AI-accelerated science. If Europe does not mount a comparable effort, the gap in scientific productivity -- and the economic and strategic advantages that flow from it -- will widen further. RAND's call for a European AGI Preparedness Plan [11] is directly relevant here: preparedness means not only governing AI but harnessing it for scientific leadership.
Pandora's Box: Cautionary Tales for the AI Scientist
The Greek myth of Pandora tells of a jar (often mistranslated as a box) that, once opened, released all evils into the world -- but left hope inside. It is a fitting metaphor for the AI scientist: the potential for transformative good is immense, but so are the risks if we proceed without wisdom.
The Reproducibility Crisis, Amplified
Science already faces a reproducibility crisis -- estimates suggest that over 70% of researchers have tried and failed to reproduce another scientist's experiments [12]. AI-generated hypotheses and experiments could either help solve this (through systematic documentation and standardized protocols) or catastrophically worsen it (through opaque reasoning, hallucinated citations, and automated p-hacking at scale). The path depends entirely on how we build these systems.
Hallucination in Scientific Context
When a chatbot hallucinates a restaurant recommendation, the consequence is a bad dinner. When an AI scientist hallucinates a drug interaction, a materials property, or a statistical result, the consequences can be measured in human lives. FutureHouse's own assessment is honest: current models operate at "B-level intelligence," far from matching the capabilities of graduate students in complex domains like biology [6]. The gap between impressive demos and reliable scientific instruments remains substantial.
The Automation of Fraud
The scientific community has already seen AI weaponized for fabrication. By mid-2024, Wiley had retracted over 11,300 papers from its Hindawi portfolio -- many from journals compromised by paper mills using AI-generated content -- and shut down 19 compromised journals entirely [13]. The same tools that enable legitimate autonomous research can produce high-volume scientific fraud: convincing-looking papers that are entirely fictional. Journals, peer reviewers, and institutions need new detection and verification mechanisms urgently.
The Concentration Risk
If autonomous scientific discovery becomes dominated by a small number of well-funded private labs -- Edison Scientific ($70M seed), Ginkgo+OpenAI, Google DeepMind -- the direction of scientific inquiry could be shaped primarily by commercial incentives rather than the public interest. The diseases studied, the materials engineered, the discoveries prioritized would reflect the portfolio strategies of venture capital rather than the needs of humanity. Ai2's open-source approach with Asta [5] offers a crucial counterweight, but the economic gravitational pull toward concentration is powerful.
In the original Greek myth, after all evils had escaped, one thing remained in the jar: elpis -- hope. The cautionary tales above are real and serious. But they are challenges to be met with rigor, not reasons for retreat. Every scientific instrument in history has been misused. The printing press spread propaganda alongside enlightenment. The internet enabled fraud alongside global collaboration. The AI scientist will be no different. What matters is whether we build the governance, transparency, and institutional structures to keep hope alive.
The European Opportunity: Human-Centered AI for Science
This is where Europe's comparative advantage becomes decisive. The cautionary tales above point to a clear need: AI for science requires exactly the institutional strengths that Europe possesses -- regulatory frameworks, strong universities, a culture of peer review, commitment to open science, and democratic governance of public research.
Europe is not starting from zero. Horizon Europe, the EUR 95.5 billion research framework, already funds AI-for-science projects across the continent. The EuroHPC Joint Undertaking is building world-class supercomputing infrastructure -- including Spain's MareNostrum 5 at the Barcelona Supercomputing Center. Plan S has established Europe as the global leader in open access mandates. And CERN demonstrated that Europe can build scientific infrastructure that the world uses. The question is whether Europe connects these assets into a coherent AI-for-science strategy.
Conclusion: Prometheus, Not Pandora
Greek mythology gives us two paradigms for transformative technology. Pandora's jar -- a warning about the unintended consequences of curiosity and power. And Prometheus -- the titan who stole fire from the gods and gave it to humanity, enabling civilization itself, accepting punishment for the audacity of believing that humans deserved the tools of the gods.
The AI scientist is fire. It will transform what is possible in drug discovery, materials science, climate modeling, energy research, and fundamental physics. A single Kosmos run already performs work its beta users estimate at six months of postdoctoral effort. Ginkgo's autonomous lab has reported results that surpass the published state of the art. The Genesis Mission aims to double American scientific productivity in a decade.
These are not speculative projections. They are happening now.
The choice before us is not whether to use this fire, but how. We can proceed with the recklessness of Pandora -- opening the jar without governance, transparency, or concern for consequences. Or we can proceed as Prometheus -- with audacity tempered by responsibility, bringing the tools of discovery to all of humanity, not just the few who can afford them.
I choose Prometheus. I believe Europe should too. Build the AI scientists. Make them open. Make them rigorous. Make them accountable. And make them available to every researcher, in every language, in every country -- so that the next breakthrough in medicine, energy, or materials science comes not from the wealthiest lab but from the most curious mind.
At Saturdays.AI, we taught 30,000 people across 12 countries that AI is not magic reserved for Silicon Valley -- it is a tool for anyone with curiosity and determination. That same principle applies to AI for science. The student in Tarragona, the researcher in Dakar, the clinician in Bogotá -- they all deserve access to AI-powered scientific instruments as powerful as those in Stanford or Beijing. This is the AI4All vision applied to the deepest human endeavor: the pursuit of knowledge.
Human first. AI frontier. Science for all.
References
1. Semantic Scholar. "About Semantic Scholar." Allen Institute for AI. semanticscholar.org
2. Elicit. "AI for Scientific Research." elicit.com
3. SciSpace. "AI Research Agent: 150+ Tools, 280M Papers." scispace.com
4. Consensus. "AI Search Engine for Research." consensus.app
5. Allen Institute for AI. "Asta: Accelerating Science Through Trustworthy Agentic AI." August 2025. allenai.org/blog/asta
6. FutureHouse. "Demonstrating End-to-End Scientific Discovery with Robin: A Multi-Agent System." futurehouse.org/research-announcements (Robin)
7. Edison Scientific. "Announcing Kosmos: An AI Scientist for Autonomous Discovery." edisonscientific.com/articles/announcing-kosmos
8. Ginkgo Bioworks. "Autonomous Laboratory Driven by OpenAI's GPT-5 Achieves 40% Improvement Over State-of-the-Art." February 2026. prnewswire.com (Ginkgo + OpenAI)
9. The White House. "Fact Sheet: President Trump Unveils the Genesis Mission to Accelerate AI for Scientific Discovery." November 24, 2025. whitehouse.gov/fact-sheets/2025/11 (Genesis Mission)
10. U.S. Department of Energy. "Collaboration Agreements with 24 Organizations to Advance the Genesis Mission." energy.gov/articles (Genesis Mission Collaborations)
11. Negele, M. et al. "Europe and the Geopolitics of AGI: The Need for a Preparedness Plan." RAND Corporation, 2025. rand.org/pubs/research_reports/RRA4636-1.html
12. Baker, M. "1,500 Scientists Lift the Lid on Reproducibility." Nature 533, 452-454 (2016). nature.com/articles/533452a
13. Van Noorden, R. "More than 10,000 research papers were retracted in 2023 -- a new record." Nature 624, 479-481 (2023). nature.com/articles/d41586-023-03974-8