Engineer at scale → Model DNA output → Make new chromosomes


We are a new team at the Generative Biology Institute, EIT, aiming to make chromosome-scale DNA designs predictable and programmable.

We gather data that are missing from current models of DNA function, use them to develop new frameworks that predict synthetic DNA behaviour inside cells, and apply these engineering tools and computational models to create mammalian chromosomes with defined functional properties.

We aim to make sequences with applications in medicine, biotechnology, and basic science.

Our research

Engineer DNA at scale

Most of each human chromosome is non-coding sequence, which plays critical roles in gene regulation, chromatin architecture, and the safeguarding of genetic information. In natural genomes, these functions are spread sparsely across gigabases, so we do not yet know how much non-coding DNA, and which sequences, must be written to produce a chromosome that performs to a specification. Decoding how function emerges from non-coding sequence demands experimentation at the scale of billions of base pairs, a scale that has, until now, been out of reach. To close this gap, we have developed a versatile toolbox that harnesses CRISPR prime editing and recombinases to generate deletions, inversions, translocations, and duplications across the genome at scale. We use these tools to systematically create and phenotype defined and stochastic structural variants, a plentiful source of diverse sequence configurations not present in nature. This enables us to assign function to individual genes, non-coding sequences, and their combinations, to build predictive models, and to probe the limits, rules, and biases that govern chromosome design.

Model DNA output

How do we ensure that the chromosomes we design will actually function as intended inside a cell? When we apply predictive models to sequences that depart substantially from the natural human genome, their performance degrades severely, exposing a gap in generalization that must be closed before we can design long sequences that include non-coding DNA. State-of-the-art sequence-based models such as Enformer and AlphaGenome, trained on rich compendia of functional genomic data from initiatives like ENCODE and GTEx, can predict DNA methylation, gene expression, chromatin accessibility, transcription factor binding, and chromatin conformation with impressive accuracy across human and selected model organism genomes. Yet they remain anchored to a narrow slice of sequence and context space, because all current models are trained on the same canonical chromosomes. To realise the full potential of large-scale DNA writing, we collaborate with the AI and Robotics Institute to develop computational methods that generalise robustly to novel sequences and contexts.

Make new chromosomes

We now have the engineering tools to probe the functional boundaries of genomes in ways that were previously unimaginable. We have already demonstrated what is achievable: through iterative installation of recombinase recognition sites into repetitive sequences in human HEK293T and HAP1 cells, we recently created the most extensively engineered human genomes to date, accumulating over 1,600 targeted sequence insertions in a single cell line over the course of one year. Building on this foundation, we are accelerating chromosomal engineering through automation and the infrastructure of the EIT to radically remodel the genomes of human cell lines. For example, iterative deletion of non-essential regions can expose the minimal requirements for chromosome segregation and replication, and directly test whether the vast non-coding expanses of human DNA are truly dispensable or required in some configuration; saturating genomes with disease-associated risk alleles allows us to test hypotheses about mutational load in common disorders; and systematically eliminating xenoantigens from animal genomes can deliver safer transplant donors for humans.

Selected publications

Our approach

1) We work on important problems. We pick projects that bring change or advance our understanding. We know the context, examples, literature, and gaps. Our projects reflect what society, the field, GBI, the team, and each of us personally consider important.

2) We get things done. We start projects with a scope and a clear vision of success, and finish them. Every project has an accountable leader. We plan ahead, and execute with urgency along the critical path without frustration.

3) We succeed as a team. We have a diverse mix of backgrounds and skillsets, complementing each other with our strengths. Everyone has a chance to grow.

4) We are excited about science. We read broadly, discuss the latest developments, know the ancients, and keep up to date with both the depth of our field and the entire breadth of engineering and modelling biology.

Our Team

Our Sanger team website, active until August 2026, is here

Team leader

Zeinab Sheikhi

PhD student

We are recruiting scientists - browse openings or contact Leo