In This Article
- Why Designing a Molecule Is So Hard
- What Synthegy Actually Does
- How It Handles the Hardest Part of Chemistry
- How Accurate Is It? What 36 Chemists Found
- What This Means for Drug Discovery and Beyond
Imagine trying to build a piece of furniture without instructions, working backwards from the finished product to figure out every cut, joint, and fitting. Now imagine the furniture has a billion possible configurations, and one wrong choice at step three invalidates everything that follows. That is roughly what chemists face every time they design a new molecule. Researchers at EPFL have built an AI system called Synthegy that handles the strategic planning — and all a chemist needs to do is describe what they want in plain language.
Why Designing a Molecule Is So Hard
Every useful molecule — a new antibiotic, a more efficient solar cell material, a cancer drug — has to be physically built through a sequence of chemical reactions. The sequence matters enormously. Choose the wrong order and reactive parts of the molecule interfere with each other. Miss a protection step and a sensitive section gets destroyed before it can be used.
The planning technique chemists use is called retrosynthesis. Rather than planning forward from raw ingredients, a chemist starts with the target molecule and works backwards, repeatedly asking: what simpler compound could I convert into this? Each backward step opens new branches, and a complete synthesis plan can involve dozens of decisions, each one requiring years of hard-won expertise to make well.
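The backward branching described above can be sketched in a few lines of Python. This is a minimal illustration only: the `DISCONNECTIONS` table, the `PURCHASABLE` set, and every molecule name are hypothetical placeholders, not real chemistry and not anything taken from Synthegy.

```python
from typing import Dict, List, Set

# Hypothetical toy "reaction library": each product maps to alternative
# sets of precursors. All names are placeholders, not real compounds.
DISCONNECTIONS: Dict[str, List[List[str]]] = {
    "target": [["intermediate_A", "reagent_X"], ["intermediate_B"]],
    "intermediate_A": [["building_block_1", "building_block_2"]],
    "intermediate_B": [["building_block_3"]],
}
# Assumed starting materials that need no further planning.
PURCHASABLE: Set[str] = {
    "reagent_X", "building_block_1", "building_block_2", "building_block_3",
}


def enumerate_routes(molecule: str, depth: int = 5) -> List[List[str]]:
    """Return every complete route as a list of backward steps."""
    if molecule in PURCHASABLE:
        return [[]]  # nothing left to make
    if depth == 0 or molecule not in DISCONNECTIONS:
        return []  # dead end: no known way to make this molecule
    routes: List[List[str]] = []
    for precursors in DISCONNECTIONS[molecule]:
        step = f"{molecule} <= {' + '.join(precursors)}"
        partial = [[step]]
        # Every precursor must itself be reachable; combine their sub-routes.
        for precursor in precursors:
            sub_routes = enumerate_routes(precursor, depth - 1)
            partial = [done + sub for done in partial for sub in sub_routes]
        routes.extend(partial)
    return routes


if __name__ == "__main__":
    for route in enumerate_routes("target"):
        print(" ; ".join(route))
```

Even in this toy table each alternative disconnection adds a whole new route, which is why the number of candidate routes grows so quickly for real targets.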
Computers can already scan enormous libraries of known reactions and generate thousands of possible pathways. The problem is that scanning is not the same as judging. Existing tools produce options but struggle to tell you which ones reflect sound chemical strategy — and which ones are technically possible but practically disastrous. That gap is exactly what Synthegy was built to close.
What Synthegy Actually Does
Picture a senior chemist sitting beside a computer. The computer generates 500 possible routes to a target molecule. The chemist glances at them, immediately discards 480 based on instinct, and focuses on the remaining 20. Synthegy is trying to replicate that first instinct — at speed, and without requiring the senior chemist to be in the room.
The system works in two layers. The first layer is a standard retrosynthesis algorithm that generates candidate pathways for a given target molecule. The second layer is a large language model — the same class of AI that powers modern conversational tools — repurposed as an evaluator rather than a generator.
Each candidate pathway gets converted into text and reviewed by the language model, which scores how well it matches the chemist's stated goals. The model explains its reasoning for every score, giving the chemist a ranked, annotated shortlist rather than an undifferentiated pile of options. The whole process starts with a single plain-language instruction — something as direct as "avoid unnecessary protecting groups" or "form the ring early."
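The generate-then-evaluate pattern can be sketched as follows. This is not Synthegy's code: `rank_routes`, `Assessment`, and `keyword_scorer` are hypothetical names, and the keyword heuristic is only a stand-in for the language model so that the example runs without any external service.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# A scorer takes (instruction, route_text) and returns (score, rationale).
Scorer = Callable[[str, str], Tuple[float, str]]


@dataclass
class Assessment:
    route: str
    score: float
    rationale: str


def keyword_scorer(instruction: str, route_text: str) -> Tuple[float, str]:
    """Stand-in for the language model (assumption): a crude keyword check."""
    hits = route_text.lower().count("protect")  # also matches "deprotect"
    if "protecting groups" in instruction.lower() and hits:
        score = max(0.0, 10.0 - 3.0 * hits)
        return score, f"{hits} protection-related step(s) conflict with the instruction"
    return 10.0, "no conflict with the instruction found"


def rank_routes(instruction: str, routes: List[str], scorer: Scorer) -> List[Assessment]:
    """Score every candidate route as text and return them best first."""
    assessed = [Assessment(r, *scorer(instruction, r)) for r in routes]
    return sorted(assessed, key=lambda a: a.score, reverse=True)


if __name__ == "__main__":
    candidates = [
        "protect amine -> couple -> deprotect",
        "couple directly -> cyclise",
    ]
    for a in rank_routes("avoid unnecessary protecting groups", candidates, keyword_scorer):
        print(f"{a.score:4.1f}  {a.route}  ({a.rationale})")
```

Passing the scorer in as a function keeps the ranking step independent of which model sits behind it, so a real language-model call could replace the stand-in without touching `rank_routes`.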
"With Synthegy, we're giving chemists the power to just talk, allowing them to iterate much faster and navigate more complex synthetic ideas."
Andres M. Bran, First Author · Matter, 2026 · EPFL
How It Handles the Hardest Part of Chemistry
A reaction mechanism is the step-by-step description of how a chemical reaction actually proceeds at the molecular level: specifically, how electrons move between atoms to break old bonds and form new ones. Understanding mechanisms is what separates a chemist who can predict new reactions from one who can only repeat known ones. It is, by most accounts, the deepest and most difficult layer of chemical reasoning.
Synthegy applies the same language-model evaluation approach to mechanisms that it uses for synthesis planning. It breaks a reaction down into individual electron movement steps, generates multiple possible sequences, and uses the AI to steer the search toward pathways that are chemically plausible — not just mathematically possible.
Crucially, the system accepts additional context as text input: reaction conditions, expert hypotheses, known constraints. A researcher can feed it a hunch and ask it to evaluate whether that hunch survives scrutiny. That flexibility is what makes it genuinely useful for research rather than just impressive in a demonstration setting.
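One common way to steer a step-by-step search with an evaluator is beam search, sketched below. Whether Synthegy uses beam search specifically is an assumption. The `MOVES` table, the state labels, and the `plausibility` function (a stand-in for the language model's judgement) are all hypothetical.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical elementary steps available from each intermediate (toy labels).
MOVES: Dict[str, List[str]] = {
    "reactants": ["ion_pair", "strained_ring"],
    "ion_pair": ["product"],
    "strained_ring": ["product"],
}


def plausibility(path: List[str], context: str) -> float:
    """Stand-in for the model's judgement; `context` carries conditions as text."""
    score = 1.0
    if "strained_ring" in path:
        score *= 0.2  # assumed unfavourable in this toy example
    if "ion_pair" in path and "polar solvent" in context:
        score *= 1.5  # assumed stabilised under the stated conditions
    return score


def beam_search(
    start: str,
    goal: str,
    context: str,
    score: Callable[[List[str], str], float],
    width: int = 2,
    max_steps: int = 5,
) -> List[Tuple[float, List[str]]]:
    """Extend partial mechanisms one step at a time, keeping the best `width`."""
    beam: List[List[str]] = [[start]]
    finished: List[List[str]] = []
    for _ in range(max_steps):
        candidates = [p + [nxt] for p in beam for nxt in MOVES.get(p[-1], [])]
        if not candidates:
            break
        candidates.sort(key=lambda p: score(p, context), reverse=True)
        kept = candidates[:width]  # prune everything outside the beam
        finished += [p for p in kept if p[-1] == goal]
        beam = [p for p in kept if p[-1] != goal]
        if not beam:
            break
    scored = [(score(p, context), p) for p in finished]
    return sorted(scored, key=lambda item: item[0], reverse=True)


if __name__ == "__main__":
    for s, path in beam_search("reactants", "product", "polar solvent", plausibility):
        print(f"{s:.2f}", " -> ".join(path))
```

The `context` string plays the role of the extra text input: changing it changes the scores, and therefore which partial mechanisms survive pruning.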
How Accurate Is It? What 36 Chemists Found
Accuracy claims for AI chemistry tools tend to rely on benchmark datasets — curated test sets where the correct answer is already known. The Synthegy team did something harder. They ran a double-blind study with 36 practising chemists who provided 368 independent evaluations of the system's pathway rankings, without knowing which choices the AI had made.
The chemists agreed with Synthegy's assessments 71.2% of the time on average. That is not perfection, and the researchers do not present it as such. But for a system making strategic chemical judgements based on plain-language instructions — a task that previously required years of specialist training to perform at all — the alignment is striking.
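An "average" agreement rate can be computed in more than one way, and the article does not say which was used. The sketch below shows two common conventions on toy records. The data and function names are hypothetical and are not taken from the study.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# (chemist_id, agreed_with_system): toy records, NOT data from the study.
Evaluation = Tuple[str, bool]


def pooled_agreement(evals: List[Evaluation]) -> float:
    """Fraction of all evaluations that agreed, ignoring who made them."""
    return sum(agreed for _, agreed in evals) / len(evals)


def mean_per_chemist_agreement(evals: List[Evaluation]) -> float:
    """Average of each chemist's own agreement rate (every chemist weighs equally)."""
    by_chemist: Dict[str, List[bool]] = defaultdict(list)
    for chemist, agreed in evals:
        by_chemist[chemist].append(agreed)
    rates = [sum(v) / len(v) for v in by_chemist.values()]
    return sum(rates) / len(rates)


if __name__ == "__main__":
    toy = [
        ("c1", True), ("c1", True), ("c1", True), ("c1", False),
        ("c2", True), ("c2", False),
    ]
    print(f"pooled:      {pooled_agreement(toy):.3f}")
    print(f"per chemist: {mean_per_chemist_agreement(toy):.3f}")
```

The two figures diverge whenever chemists contribute different numbers of evaluations, so a headline percentage is easiest to interpret when the averaging convention is stated alongside it.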
One finding the team did not soften: performance scales strongly with model size. Larger language models performed substantially better than smaller ones at chemical reasoning, which has direct implications for how labs choose to deploy the tool. The gap between a capable and a limited implementation is not cosmetic.
What This Means for Drug Discovery and Beyond
The most immediate application is drug discovery, where designing and making molecules is among the most important and most time-consuming steps in developing new medicines. A tool that lets a medicinal chemist describe a synthetic strategy in a sentence, and receive a ranked, explained shortlist of routes within seconds, could compress planning work that can take weeks into something closer to an afternoon.
But the implications extend past pharmaceuticals. Materials science, agrochemistry, and clean energy research all depend on the ability to design and synthesise novel compounds. Any field where molecular design is a bottleneck stands to benefit from a faster, more accessible planning layer.
What Synthegy does not do is replace the chemist. The system still needs a human to set the goal, evaluate the shortlist, run the reactions, and interpret the results. What it removes is the brutal combinatorial burden of sorting through thousands of plausible-but-wrong pathways before finding the one worth pursuing. The question it leaves open is what chemists will do with the time they get back — and how quickly the science moves when the planning no longer takes the longest.
- Language as interface — Synthegy replaces complex filter menus and rule sets with plain-language instructions, making advanced synthesis planning accessible without deep computational expertise.
- AI as evaluator, not generator — The large language model scores and explains existing pathways rather than creating new chemistry, a distinction that keeps human chemists in control of the science.
- Model size matters — Larger language models significantly outperform smaller ones at chemical reasoning, a constraint that will shape how the tool is deployed at scale.
"The connection between synthesis planning and mechanisms is very exciting: we usually use mechanisms to discover new reactions that enable us to synthesize new molecules. Our work is bridging that gap computationally through a unified natural language interface." — Andres M. Bran, EPFL, Matter, 2026.
📄 Source & Citation
Primary Source: Bran, A. M., Neukomm, T. A., Armstrong, D., Jončev, Z., & Schwaller, P. (2026). Chemical reasoning in LLMs unlocks strategy-aware synthesis planning and reaction mechanism elucidation. Matter, 102812. https://doi.org/10.1016/j.matt.2026.102812
Authors & Affiliations: Andres M. Bran, Théo A. Neukomm, Daniel Armstrong, Zlatko Jončev, and Philippe Schwaller. Laboratory of Artificial Chemical Intelligence (LIAC), EPFL, Lausanne, Switzerland. Supported by NCCR Catalysis and b12 Labs.
Institutional Source: École Polytechnique Fédérale de Lausanne (EPFL). Original story via ScienceDaily, May 5, 2026. https://www.sciencedaily.com/releases/2026/05/260504023844.htm
Key Themes: AI chemistry · Retrosynthesis · Large language models · Drug discovery · Reaction mechanism · Molecular design