This AI Designs Molecules From Plain English Descriptions

Imagine trying to build a piece of furniture without instructions, working backwards from the finished product to figure out every cut, joint, and fitting. Now imagine the furniture has a billion possible configurations, and one wrong choice at step three invalidates everything that follows. That is roughly what chemists face every time they design a new molecule. Researchers at EPFL have built an AI system called Synthegy that handles the strategic planning — and all a chemist needs to do is describe what they want in plain language.

Why Designing a Molecule Is So Hard

Every useful molecule — a new antibiotic, a more efficient solar cell material, a cancer drug — has to be physically built through a sequence of chemical reactions. The sequence matters enormously. Choose the wrong order and reactive parts of the molecule interfere with each other. Miss a protection step and a sensitive section gets destroyed before it can be used.

The planning technique chemists use is called retrosynthesis. Rather than planning forward from raw ingredients, a chemist starts with the target molecule and works backwards, repeatedly asking: what simpler compound could I convert into this? Each backward step opens new branches, and a complete synthesis plan can involve dozens of decisions, each one requiring years of hard-won expertise to make well.
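The backward branching described above can be illustrated with a toy sketch. Everything in it is hypothetical: the compound names, the `DISCONNECTIONS` table, and the `PURCHASABLE` set are placeholders invented for illustration, not real chemistry and not Synthegy's data. It only shows how working backwards from a target fans out into several complete routes.

```python
from itertools import product

# Hypothetical toy data: each compound maps to the ways it could be made,
# where each way is a list of simpler precursors.
DISCONNECTIONS = {
    "target": [["acid", "amine"], ["late_intermediate"]],
    "late_intermediate": [["acid", "protected_amine"]],
    "protected_amine": [["amine"]],
}
# Hypothetical set of commercially available starting materials.
PURCHASABLE = {"acid", "amine"}


def enumerate_routes(molecule, depth=0, max_depth=4):
    """Return every route from purchasable materials to `molecule`.

    A route is a list of (product, precursors) steps, listed from the
    target backwards. An empty route means the compound can be bought.
    """
    if molecule in PURCHASABLE:
        return [[]]
    if depth >= max_depth or molecule not in DISCONNECTIONS:
        return []  # dead end: no known way to make this compound
    routes = []
    for precursors in DISCONNECTIONS[molecule]:
        # Solve each precursor independently, then combine the sub-routes.
        sub_routes = [enumerate_routes(p, depth + 1, max_depth) for p in precursors]
        for combo in product(*sub_routes):
            route = [(molecule, tuple(precursors))]
            for part in combo:
                route.extend(part)
            routes.append(route)
    return routes


routes = enumerate_routes("target")
for route in routes:
    print(len(route), "step(s):", route)
```

Even this three-entry table yields two distinct routes of different lengths. With a realistic reaction library, the same branching produces the thousands of candidates discussed in the next paragraph.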

Computers can already scan enormous libraries of known reactions and generate thousands of possible pathways. The problem is that scanning is not the same as judging. Existing tools produce options but struggle to tell you which ones reflect sound chemical strategy — and which ones are technically possible but practically disastrous. That gap is exactly what Synthegy was built to close.

What Is Retrosynthesis?

Retrosynthesis is the art of planning a molecule by working backwards from the end goal. A chemist identifies the target structure, then asks what reaction could have produced it, then what reaction could have produced that precursor, and so on, until arriving at simple, commercially available starting materials. It is one of the most strategically demanding tasks in all of chemistry, and mastering it typically takes a decade of training.

What Synthegy Actually Does

Picture a senior chemist sitting beside a computer. The computer generates 500 possible routes to a target molecule. The chemist glances at them, immediately discards 480 based on instinct, and focuses on the remaining 20. Synthegy is trying to replicate that first instinct — at speed, and without requiring the senior chemist to be in the room.

The system works in two layers. The first layer is a standard retrosynthesis algorithm that generates candidate pathways for a given target molecule. The second layer is a large language model — the same class of AI that powers modern conversational tools — repurposed as an evaluator rather than a generator.

Each candidate pathway gets converted into text and reviewed by the language model, which scores how well it matches the chemist's stated goals. The model explains its reasoning for every score, giving the chemist a ranked, annotated shortlist rather than an undifferentiated pile of options. The whole process starts with a single plain-language instruction — something as direct as "avoid unnecessary protecting groups" or "form the ring early."
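The generate-then-evaluate flow described above might be sketched roughly as follows. This is a minimal illustration under stated assumptions, not Synthegy's actual code: the names `route_to_text`, `rank_routes`, and `ScoredRoute` are invented here, and `stub_evaluator` is a deliberately crude keyword counter standing in for the language model so the example runs without any external service.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# An evaluator takes (route as text, chemist's instruction)
# and returns (score, rationale).
Evaluator = Callable[[str, str], Tuple[float, str]]


@dataclass
class ScoredRoute:
    route_text: str
    score: float
    rationale: str


def route_to_text(steps: List[str]) -> str:
    """Serialise a candidate route as numbered plain-text steps."""
    return "\n".join(f"{i}. {step}" for i, step in enumerate(steps, start=1))


def stub_evaluator(route_text: str, instruction: str) -> Tuple[float, str]:
    """Hypothetical stand-in for the language model.

    It ignores the instruction and deducts 2 points (from 10) for each
    occurrence of 'protect', which also matches 'deprotect'.
    """
    hits = route_text.lower().count("protect")
    score = max(0.0, 10.0 - 2.0 * hits)
    return score, f"{hits} protecting-group step(s) found"


def rank_routes(routes: List[List[str]], instruction: str,
                evaluate: Evaluator) -> List[ScoredRoute]:
    """Score every candidate route and return them best first."""
    scored = []
    for steps in routes:
        text = route_to_text(steps)
        score, rationale = evaluate(text, instruction)
        scored.append(ScoredRoute(text, score, rationale))
    return sorted(scored, key=lambda r: r.score, reverse=True)


# Hypothetical candidates, as a first-layer generator might produce them.
candidate_routes = [
    ["Protect alcohol", "Couple fragments", "Deprotect alcohol"],
    ["Couple fragments directly"],
]
ranked = rank_routes(candidate_routes,
                     "avoid unnecessary protecting groups",
                     stub_evaluator)
for r in ranked:
    print(r.score, "|", r.rationale)
```

Passing the evaluator in as a function keeps the ranking logic independent of whichever model does the scoring. Swapping the stub for a real language-model call would leave `rank_routes` unchanged.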

"With Synthegy, we're giving chemists the power to just talk, allowing them to iterate much faster and navigate more complex synthetic ideas."

— Andres M. Bran, First Author · Matter, 2026 · EPFL

How It Handles the Hardest Part of Chemistry

A reaction mechanism is the step-by-step description of how a chemical reaction actually proceeds at the molecular level: specifically, how electrons move between atoms to break old bonds and form new ones. Understanding mechanisms is what separates a chemist who can predict new reactions from one who can only repeat known ones. It is, by most accounts, the deepest and most difficult layer of chemical reasoning.

Synthegy applies the same language-model evaluation approach to mechanisms that it uses for synthesis planning. It breaks a reaction down into individual electron movement steps, generates multiple possible sequences, and uses the AI to steer the search toward pathways that are chemically plausible — not just mathematically possible.
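One common way to steer a search with a scoring function is beam search, sketched below. The article does not say which search algorithm Synthegy uses, so beam search is a substitute chosen here for illustration. The state labels, the `NEXT_STEPS` graph, and the `PLAUSIBILITY` numbers are all hypothetical, with the lookup table standing in for a language model's judgement of each elementary step.

```python
from typing import Callable, Dict, List, Optional, Tuple

Path = Tuple[str, ...]

# Hypothetical graph: which states are reachable in one elementary step.
NEXT_STEPS: Dict[str, List[str]] = {
    "reactants": ["intermediate_A", "intermediate_B"],
    "intermediate_A": ["product"],
    "intermediate_B": ["dead_end"],
}
# Hypothetical plausibility values, standing in for the model's scores.
PLAUSIBILITY: Dict[str, float] = {
    "intermediate_A": 0.9,
    "intermediate_B": 0.2,
    "product": 1.0,
    "dead_end": 0.0,
}


def path_score(path: Path) -> float:
    """Sum the plausibility of every state reached after the start."""
    return sum(PLAUSIBILITY.get(state, 0.0) for state in path[1:])


def guided_search(start: str,
                  expand: Callable[[str], List[str]],
                  score: Callable[[Path], float],
                  is_goal: Callable[[str], bool],
                  beam_width: int = 2,
                  max_depth: int = 5) -> Optional[Path]:
    """Beam search: keep only the best-scoring partial paths at each depth."""
    beam: List[Path] = [(start,)]
    for _ in range(max_depth):
        candidates: List[Path] = []
        for path in beam:
            if is_goal(path[-1]):
                return path
            for nxt in expand(path[-1]):
                candidates.append(path + (nxt,))
        if not candidates:
            return None
        candidates.sort(key=score, reverse=True)
        beam = candidates[:beam_width]
    for path in beam:
        if is_goal(path[-1]):
            return path
    return None


found = guided_search(
    start="reactants",
    expand=lambda state: NEXT_STEPS.get(state, []),
    score=path_score,
    is_goal=lambda state: state == "product",
    beam_width=1,
)
print(found)
```

With a beam width of 1, the low-scoring branch through `intermediate_B` is pruned after the first step. That is the sense in which a scorer steers the search toward plausible sequences rather than enumerating every possible one.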

Crucially, the system accepts additional context as text input: reaction conditions, expert hypotheses, known constraints. A researcher can feed it a hunch and ask it to evaluate whether that hunch survives scrutiny. That flexibility is what makes it genuinely useful for research rather than just impressive in a demonstration setting.

71.2% · Agreement between Synthegy rankings and expert chemists

36 · Chemists in the double-blind validation study

368 · Valid expert evaluations collected

How Accurate Is It? What 36 Chemists Found

Accuracy claims for AI chemistry tools tend to rely on benchmark datasets — curated test sets where the correct answer is already known. The Synthegy team did something harder. They ran a double-blind study with 36 practising chemists who provided 368 independent evaluations of the system's pathway rankings, without knowing which choices the AI had made.

The chemists agreed with Synthegy's assessments 71.2% of the time on average. That is not perfection, and the researchers do not present it as such. But for a system making strategic chemical judgements based on plain-language instructions — a task that previously required years of specialist training to perform at all — the alignment is striking.
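The article reports the 71.2% figure as an average without saying how it was averaged. The sketch below, using made-up judgements rather than study data, shows two common conventions: pooling every evaluation together, or averaging each chemist's own agreement rate. The function name and input format are assumptions for illustration. If the reported figure were a pooled rate over all 368 evaluations, it would correspond to roughly 262 agreements, but that reading is an assumption.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def agreement_rates(judgements: List[Tuple[str, bool]]) -> Tuple[float, float]:
    """Return (pooled rate, mean of per-chemist rates).

    Each judgement is (chemist_id, agrees_with_model).
    """
    per_chemist: Dict[str, List[bool]] = defaultdict(list)
    for chemist_id, agrees in judgements:
        per_chemist[chemist_id].append(agrees)
    # Pooled: every evaluation counts equally.
    pooled = sum(agrees for _, agrees in judgements) / len(judgements)
    # Per-chemist mean: every chemist counts equally.
    rates = [sum(votes) / len(votes) for votes in per_chemist.values()]
    return pooled, sum(rates) / len(rates)


# Hypothetical toy judgements, NOT data from the study.
toy = [
    ("chemist_A", True), ("chemist_A", True),
    ("chemist_A", True), ("chemist_A", False),
    ("chemist_B", True), ("chemist_B", False),
]
pooled, per_chemist_mean = agreement_rates(toy)
print(f"pooled: {pooled:.3f}, per-chemist mean: {per_chemist_mean:.3f}")

# Back-of-envelope count under the pooled assumption.
implied_agreements = round(0.712 * 368)
print(implied_agreements)
```

The two conventions diverge whenever chemists contribute different numbers of evaluations, which is why the averaging method matters when comparing agreement figures across studies.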

One finding the team did not soften: performance scales strongly with model size. Larger language models performed substantially better than smaller ones at chemical reasoning, which has direct implications for how labs choose to deploy the tool. The gap between a capable and a limited implementation is not cosmetic.

Why "Protecting Groups" Matter

During synthesis, certain parts of a molecule are chemically reactive and will interfere with reactions unless temporarily blocked. A protecting group is a chemical attachment that shields a reactive site, then gets removed later. Adding and removing protecting groups takes time, costs money, and introduces failure points. One of Synthegy's key capabilities is recognising when protecting group steps are unnecessary and flagging routes that avoid them.

What This Means for Drug Discovery and Beyond

The most immediate application is drug discovery, where molecule design is simultaneously the most important and most time-consuming step in developing new medicines. A tool that lets a medicinal chemist describe a synthetic strategy in a sentence — and receive a ranked, explained shortlist of routes within seconds — compresses a process that currently takes weeks into something closer to an afternoon.

But the implications extend past pharmaceuticals. Materials science, agrochemistry, and clean energy research all depend on the ability to design and synthesise novel compounds. Any field where molecular design is a bottleneck stands to benefit from a faster, more accessible planning layer.

What Synthegy does not do is replace the chemist. The system still needs a human to set the goal, evaluate the shortlist, run the reactions, and interpret the results. What it removes is the brutal combinatorial burden of sorting through thousands of plausible-but-wrong pathways before finding the one worth pursuing. The question it leaves open is what chemists will do with the time they get back — and how quickly the science moves when the planning no longer takes the longest.

Language as interface — Synthegy replaces complex filter menus and rule sets with plain-language instructions, making advanced synthesis planning accessible without deep computational expertise.
AI as evaluator, not generator — The large language model scores and explains existing pathways rather than creating new chemistry, a distinction that keeps human chemists in control of the science.
Model size matters — Larger language models significantly outperform smaller ones at chemical reasoning, a constraint that will shape how the tool is deployed at scale.

"The connection between synthesis planning and mechanisms is very exciting: we usually use mechanisms to discover new reactions that enable us to synthesize new molecules. Our work is bridging that gap computationally through a unified natural language interface." — Andres M. Bran, EPFL, Matter, 2026.

📄 Source & Citation

Primary Source: Bran, A. M., Neukomm, T. A., Armstrong, D., Jončev, Z., & Schwaller, P. (2026). Chemical reasoning in LLMs unlocks strategy-aware synthesis planning and reaction mechanism elucidation. Matter, 102812. https://doi.org/10.1016/j.matt.2026.102812

Authors & Affiliations: Andres M. Bran, Théo A. Neukomm, Daniel Armstrong, Zlatko Jončev, and Philippe Schwaller. Laboratory of Artificial Chemical Intelligence (LIAC), EPFL, Lausanne, Switzerland. Supported by NCCR Catalysis and b12 Labs.

Institutional Source: École Polytechnique Fédérale de Lausanne (EPFL). Original story via ScienceDaily, May 5, 2026. https://www.sciencedaily.com/releases/2026/05/260504023844.htm

Key Themes: AI chemistry · Retrosynthesis · Large language models · Drug discovery · Reaction mechanism · Molecular design

