Science · Technology · The Future
NAVSORATIMES
🧬 Biology

What 300 Million Years of Plant DNA Is Finally Telling Us

A landmark algorithm called Conservatory has catalogued 2.3 million conserved non-coding sequences across 284 plant species — including regulatory DNA older than the first flowers, with direct implications for crop engineering and precision agriculture.

In This Article

  1. The long-standing paradox in plant regulatory DNA
  2. What Conservatory does — and why it works where other tools fail
  3. The scale of discovery: 2.3 million conserved sequences
  4. Ancient switches that still control plant development today
  5. How gene regulation evolves after duplication
  6. What this means for crop engineering and agriculture

A new preprint from UMass Amherst, Cold Spring Harbor Laboratory, and the Hebrew University of Jerusalem introduces Conservatory — an algorithm that traces conserved regulatory DNA across 300 million years of plant evolution. What it found opens a direct path to precision crop engineering.

A Long-Standing Paradox in Regulatory DNA

Developmental genes are often nearly identical across distantly related plants, yet the non-coding DNA that switches them on and off appeared to share nothing across species. Were these sequences evolving too fast, or were scientists missing the tools? According to Amundson, Hendelman, and colleagues, the conservation was there all along, and the tools were what was missing.

What Are Cis-Regulatory Elements? Non-coding DNA sequences that switch genes on or off. Transcription factors bind them to control when and how strongly a gene is expressed. Because they evolve faster than coding DNA, tracing them across deep time has, until now, been nearly impossible.

What Conservatory Does — And Why It Works

Previous tools relied on whole-genome alignment, which fails across deep time because repeated genome duplications in plants scramble the positions of genes relative to their regulatory sequences. Conservatory works at the level of gene orthogroups (sets of genes descended from a single ancestral gene) instead, reconstructing ancestral regulatory sequences and using them to search outward across species. Bridge genomes fill in gaps that direct alignment cannot cross. The result is a tool built for the messy, duplicated reality of plant genomes.
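The bridging idea can be illustrated with a toy sketch. This is not Conservatory's code: the k-mer Jaccard score, the 0.25 threshold, the function names, and the three short sequences are all assumptions chosen for illustration. It shows only how a sequence that fails a direct comparison with a distant species can still be linked through an intermediate genome.

```python
def kmers(seq, k=4):
    """Return the set of overlapping k-letter substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(a, b, k=4):
    """Shared k-mers divided by all k-mers seen in either sequence."""
    ka, kb = kmers(a, k), kmers(b, k)
    return len(ka & kb) / len(ka | kb)


def linked(a, b, threshold=0.25):
    """True if two sequences are similar enough to count as a match."""
    return jaccard(a, b) >= threshold


def linked_via_bridge(query, target, bridges, threshold=0.25):
    """Try a direct match first, then a two-step match through any bridge."""
    if linked(query, target, threshold):
        return True
    return any(
        linked(query, b, threshold) and linked(b, target, threshold)
        for b in bridges
    )


# Hypothetical sequences: the bridge shares its start with the reference
# and its end with the distant species.
reference = "ACGTACCTGATC"
bridge = "ACGTACCAGTTG"
distant = "TTGTACCAGTTG"

print(linked(reference, distant))                       # False
print(linked_via_bridge(reference, distant, [bridge]))  # True
```

The real algorithm works on reconstructed ancestral sequences within orthogroups rather than raw pairwise scores, so this sketch should be read as intuition for why intermediate genomes help, nothing more.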

  • 314 plant genomes analysed across 284 species
  • 2.3 million unique conserved non-coding sequences identified
  • 3,321 CNSs predating the origin of flowering plants

The Scale of Discovery: 2.3 Million Conserved Sequences

Applied to 314 genomes spanning 284 species, Conservatory identified 2.3 million unique conserved non-coding sequences (CNSs). Most are conserved only within plant families. But 3,321 predate angiosperms entirely, and 633 appear to predate seed plants, meaning this regulatory DNA has been under selection for at least 300 million years. Independent functional genomics data support a regulatory role: 99.3 percent of Arabidopsis CNSs overlap with biochemical evidence, including open chromatin, transcription factor binding sites, and activating histone marks. In maize, CNSs concentrate within super-enhancers. Both lines of evidence point to function.
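To put these counts in proportion, the short calculation below uses only the figures quoted above. The variable names are illustrative, and the 2.3 million total is the rounded figure from the article, so the shares are approximate.

```python
total_cns = 2_300_000          # unique CNSs (rounded figure from the article)
pre_angiosperm = 3_321         # CNSs predating flowering plants
pre_seed_plant = 633           # CNSs predating seed plants
arabidopsis_supported = 0.993  # share of Arabidopsis CNSs with biochemical support

share_pre_angiosperm = pre_angiosperm / total_cns
share_pre_seed_plant = pre_seed_plant / total_cns
arabidopsis_unsupported = 1 - arabidopsis_supported

print(f"{share_pre_angiosperm:.3%}")     # 0.144%
print(f"{share_pre_seed_plant:.4%}")     # 0.0275%
print(f"{arabidopsis_unsupported:.1%}")  # 0.7%
```

The deeply conserved sequences are a very small fraction of the atlas, which is consistent with the statement that most CNSs are conserved only within plant families.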

"Ancient CNSs may encode functionally-enriched core regulatory sequences — a bedrock of complex cis-regulatory control of expression programs, tuned by more shallowly conserved peripheral sequences."

— Amundson, Hendelman et al., bioRxiv, 2026

Ancient Switches That Still Control Development Today

The most ancient CNSs cluster near developmental and transcriptional regulators, exactly where one would expect core conserved programs to be encoded. The team tested this experimentally using CRISPR-Cas9 in tomato. Deleting the seed-plant-level CNS S217 near SlWOX9 produced extra cotyledons, premature shoot apical meristem termination, and leaf fusions. Removing angiosperm-level CNSs from the SlWOX2 promoter caused fully penetrant embryonic lethality: the plants could not complete embryogenesis at all. Multiple independent alleles confirmed both results.

  • 633 CNSs conserved since before seed plants, 300+ million years old
  • 96% of Arabidopsis CNSs overlap at least one TF binding site
  • ~0.67% of CNSs lack any supporting functional genomics evidence

In WUSCHEL (WUS), the master stem cell regulator, five ancient CNSs were identified in the meristem module — including conserved binding sites for the proteins that relay cytokinin hormone signals to the gene. The molecular link between a plant hormone and stem cell identity has been encoded in the same short DNA sequences for at least 115 million years.

How Gene Regulation Evolves After Duplication

When genes duplicate, their CNSs duplicate with them. The two copies then diverge asymmetrically — one paralog retaining more ancestral regulatory sequences, the other accumulating losses and novel sequences faster. Ancient CNSs are preferentially retained in both copies, suggesting their functions are too deeply embedded in regulatory networks to lose easily. CNS order is conserved across species even when distances vary: the WUS meristem module sits within 1.5 kilobases of its gene start in 224 of 240 species, while the TB1 enhancer in grasses can wander tens of kilobases away.
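A toy example makes the asymmetry concrete. The CNS labels and the two paralog sets below are invented for illustration and are not data from the study; only the 224-of-240 figure comes from the text above.

```python
# Hypothetical CNSs present before the duplication.
ancestral = {"cns1", "cns2", "cns3", "cns4", "cns5", "cns6"}
# Hypothetical deeply conserved subset.
ancient = {"cns1", "cns2"}

# After duplication, each copy keeps a different subset.
paralog_a = {"cns1", "cns2", "cns3", "cns4", "cns5"}
paralog_b = {"cns1", "cns2", "cns6", "new1", "new2"}


def retention(paralog, ancestral_set):
    """Fraction of ancestral CNSs still present next to a paralog."""
    return len(paralog & ancestral_set) / len(ancestral_set)


retention_a = retention(paralog_a, ancestral)  # 5 of 6 retained
retention_b = retention(paralog_b, ancestral)  # 3 of 6 retained
kept_in_both = paralog_a & paralog_b           # here, exactly the ancient subset
novel_in_b = paralog_b - ancestral             # sequences gained after duplication

# Share of species with the WUS meristem module within 1.5 kb of the gene start.
wus_within_1_5kb = 224 / 240

print(f"{retention_a:.2f} vs {retention_b:.2f}")
print(sorted(kept_in_both), sorted(novel_in_b))
print(f"{wus_within_1_5kb:.1%}")
```

In this constructed case one copy keeps most of the ancestral set, the other loses half and gains new sequences, and the ancient CNSs survive in both, which is the pattern the paragraph describes.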

The Maize Domestication Link: The TB1 gene, which determines whether maize branches like wild teosinte or grows a single crop stalk, has twelve CNSs located 65 kilobases upstream. A domestication-linked transposon insertion sits between two CNS clusters, disrupting their spacing. Domestication may have worked by breaking the geometry of an ancient regulatory arrangement, not by changing its sequences.

The study also found that new regulatory sequences frequently originate from older ones rather than arising from scratch: replacement sequences at syntenic positions showed significantly higher similarity to ancestral CNSs than random controls. Regulatory evolution, on this evidence, often works by recycling material that already functions.
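The logic of comparing a replacement sequence against random controls can be sketched as follows. This is an assumed stand-in, not the preprint's statistical procedure: the positional identity score, the shuffled-sequence controls, the empirical p-value, and both sequences are hypothetical choices made for illustration.

```python
import random


def identity(a, b):
    """Fraction of aligned positions where two sequences agree."""
    matches = sum(x == y for x, y in zip(a, b))
    return matches / min(len(a), len(b))


def empirical_p(observed, controls):
    """Share of controls scoring at least as high as observed (+1 correction)."""
    hits = sum(c >= observed for c in controls)
    return (1 + hits) / (1 + len(controls))


# Hypothetical sequences: the replacement differs from the ancestral CNS
# at two positions.
ancestral_cns = "TGACGTCATTGACC"
replacement = "TGACGTGATTCACC"

# Controls: shuffled versions of the replacement, which keep its base
# composition but destroy its order.
rng = random.Random(0)
controls = []
for _ in range(999):
    letters = list(replacement)
    rng.shuffle(letters)
    controls.append(identity(ancestral_cns, "".join(letters)))

observed = identity(ancestral_cns, replacement)
p_value = empirical_p(observed, controls)

print(f"observed identity: {observed:.3f}")
print(f"empirical p-value: {p_value:.3f}")
```

A small p-value here means that few shuffled controls resemble the ancestral sequence as closely as the replacement does, which is the kind of contrast the similarity claim rests on.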

What This Means for Crop Engineering

Crop engineering increasingly aims to fine-tune gene expression (adjusting how much of a gene product is made, in which tissues, and when) rather than knocking genes out entirely. That requires knowing where regulatory sequences are. Conservatory provides that: a genome-wide regulatory map for 314 plant genomes, freely accessible at conservatorycns.com.

  • Ancient CNSs near developmental genes are likely essential — understanding them enables precise developmental manipulation without lethal side effects
  • Family-level CNSs offer species-specific regulatory handles that may encode traits unique to individual crops or wild relatives
  • CNS position data can guide synthetic regulatory circuit design that respects the spatial constraints under which natural enhancers operate

The algorithm is open source and available on GitHub. As pan-genomic crop resources expand, its resolution will only improve. The era of reading plant regulatory DNA across deep evolutionary time has, effectively, begun.

"Our study provides a unified map of plant CNS evolution based on 314 genomes. As pan-genomic resources expand, improved phylogenomic resolution will further clarify orthologous relationships between CNSs and guide deeper computational analyses and functional assays to refine our understanding of regulatory sequence evolution and provide even higher resolution targets for trait engineering." — Amundson, Hendelman et al., bioRxiv, 2026.


📄 Source & Citation

Primary Source: Amundson KR, Hendelman A, Ciren D, Yang H, de Neve AE, Tal S, Sulema A, Jackson D, Bartlett ME, Lippman ZB, & Efroni I. (2026). A deep-time landscape of plant cis-regulatory sequence evolution. bioRxiv preprint. https://doi.org/10.1101/2025.09.17.676453

Authors & Affiliations: Kirk R. Amundson & Anat Hendelman (co-first authors); Department of Biology, University of Massachusetts Amherst; Cold Spring Harbor Laboratory; Howard Hughes Medical Institute; Institute of Plant Science, The Hebrew University, Rehovot, Israel; Sainsbury Laboratory Cambridge University (SLCU).

Data & Code: Full CNS atlas available at www.conservatorycns.com. Algorithm code: https://github.com/idanefroni/Conservatory. Raw data deposited at Zenodo and NCBI GEO (GSE307325).

Key Themes: Conserved non-coding sequences · Cis-regulatory element evolution · Plant genomics · WUSCHEL · TEOSINTE BRANCHED1 · CRISPR-Cas9 functional validation · Maize domestication · Crop engineering · Comparative genomics
