Even though the lab is focused on bacterial pathogens, we (Marco and Gabriel)
took a look at the “2020 main character” pathogen, SARS-CoV-2, and just
put out a preprint on the topic of real-time identification of epistatic
interactions.
A particularly interesting development during the pandemic has been the
unprecedented amount of genome sequencing, and its use in very quickly
developing diagnostics (e.g. PCR primers) and treatments (i.e. sequence-based
vaccines and their updates).
Other applications closer to our field of study are the identification of
virus variants with a fitness advantage
and predicting the impact of genetic variants on genes and viral fitness.
We asked ourselves: can we find another use for this genomic data?

The application we were looking for would ideally be computed quickly, so that
it could be part of the public health decision process, and should leverage
the metadata associated with each sequence so that context would be taken into account.
We decided to look at a generally overlooked aspect in the evolution of this virus:
epistatic interactions. Generally speaking, two positions in the genome can be said to
interact epistatically if the impact of a double mutation deviates from the “sum” of the
effects of the two single mutations.
A particularly interesting example of macroscopic epistatic interaction can be found in the history
of scurvy.
We already know that these interactions are important for the evolution of
the Omicron variant (BA.1), as has been elegantly shown through
experimental work.
These experimental studies are however relatively slow to perform and restricted to
certain regions of the genome (e.g. the Spike RBD). Could we come up with a
fast computational method that scales to millions of sequences?
We reworked a method based on mutual information (MI) and applied it to a large
set of SARS-CoV-2 sequences (~4M), finding 474 interactions across the genome,
the majority in the Spike gene (185).
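
To give a feel for the quantity the method builds on, here is a minimal sketch of plain mutual information between two positions of an alignment. This is the textbook plug-in estimator on toy data, not the reworked method from the preprint, and the function name `mutual_information` and the example columns are made up for illustration.

```python
from collections import Counter
from math import log2


def mutual_information(col_a, col_b):
    """Mutual information (in bits) between two alignment columns.

    col_a and col_b hold the character observed at two genome positions,
    one entry per sequence, in the same sequence order.
    """
    if len(col_a) != len(col_b):
        raise ValueError("columns must cover the same sequences")
    n = len(col_a)
    count_a = Counter(col_a)
    count_b = Counter(col_b)
    count_ab = Counter(zip(col_a, col_b))

    mi = 0.0
    for (a, b), n_ab in count_ab.items():
        # p(a, b) * log2( p(a, b) / (p(a) * p(b)) ), written with raw counts
        mi += (n_ab / n) * log2(n_ab * n / (count_a[a] * count_b[b]))
    return mi


# Toy columns (hypothetical data): pos_y always changes together with pos_x,
# while pos_z varies independently of pos_x.
pos_x = "AAAATTTT"
pos_y = "CCCCGGGG"
pos_z = "CGCGCGCG"

print(mutual_information(pos_x, pos_y))  # high: the two positions co-vary
print(mutual_information(pos_x, pos_z))  # zero: no association
```

Pairs of positions with high MI are candidates for an interaction, but co-variation can also come from shared ancestry, which is one reason a naive estimator like this one is only a starting point.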

Even though we made the method somewhat scalable, it took ~36 hours to complete
using several nodes on HZI’s cluster. But luckily we can obtain similar results
with as few as 10k sequences (a bit better if closer to 100k), which only takes
2 hours to complete.

How do these interactions change over time, and how quickly can we spot known ones,
such as those found in Omicron BA.1? We further modified the method to reduce the influence
of older sequences and thus highlight emerging interactions.
And indeed we were able to identify a cornerstone epistatic interaction in the Spike protein
between residues 498 and 501 when as few as 6 (!) Omicron sequences were present in the dataset,
which speaks to the sensitivity of the method.
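
One way to picture "reducing the influence of older sequences" is to give each sequence a weight that shrinks with its age before computing MI. The sketch below assumes an exponential decay with a hypothetical `half_life_days` parameter; the actual weighting used in the preprint may well differ, so treat this purely as an illustration of the idea.

```python
from collections import defaultdict
from math import log2


def weighted_mutual_information(col_a, col_b, ages_days, half_life_days=30.0):
    """Mutual information (bits) where older sequences count for less.

    ages_days[i] is how many days before the analysis date sequence i was
    sampled. The exponential down-weighting is an assumption for this sketch.
    """
    if not (len(col_a) == len(col_b) == len(ages_days)):
        raise ValueError("all inputs must cover the same sequences")

    # A sequence loses half of its weight every half_life_days
    weights = [0.5 ** (age / half_life_days) for age in ages_days]
    total = sum(weights)

    w_a = defaultdict(float)
    w_b = defaultdict(float)
    w_ab = defaultdict(float)
    for a, b, w in zip(col_a, col_b, weights):
        w_a[a] += w
        w_b[b] += w
        w_ab[(a, b)] += w

    mi = 0.0
    for (a, b), w in w_ab.items():
        # Same formula as plain MI, with summed weights in place of counts
        mi += (w / total) * log2(w * total / (w_a[a] * w_b[b]))
    return mi


# Toy data (hypothetical): 1000 old sequences without the mutations,
# plus 6 recent ones carrying both of them.
col_a = "N" * 1000 + "Y" * 6
col_b = "Q" * 1000 + "R" * 6
ages = [300] * 1000 + [0] * 6

plain = weighted_mutual_information(col_a, col_b, [0] * len(col_a))
recent = weighted_mutual_information(col_a, col_b, ages)
print(plain, recent)  # the time-weighted score is the larger of the two
```

With equal ages the function reduces to plain MI, so the handful of new double mutants is swamped by the older sequences; with the decay in place, the same six sequences dominate the signal. This also makes it easy to see why a wrong date in the metadata can shift when an interaction appears to emerge.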

The flip side of this sensitivity is that genomes with incorrect metadata (i.e. date)
will make epistatic interactions appear at odd times, which was the case before we
used the excellent community resources put together by Emma Hodcroft.

We hope this work demonstrates that pathogen genomic sequencing is here to stay,
as the data can be used to generate many useful predictions that can in turn guide
public health decisions.
This work was started during the first lockdown, but it was significantly advanced by
Gabriel Innocenti, who was supported first by a FEMS research and training grant
and then by RESIST. A real pleasure to work with him!
P.S. As with our other papers, we have also shared the analysis code so that the study can be reproduced.

As promised, it did not take two years to get another
lab group photo!
This time we took advantage of a quick trip to Würzburg to visit a collaborator.

Update 2023/05/24
Adam flexed his Photoshop muscles and created a nicer composite photo with everyone in it. Can you spot who was added after the fact?

Almost two years after the first group picture, here’s an
updated one with the current members of the lab.
It’s a bit tricky to get everyone in one room given our hybrid work
arrangements, but hopefully it won’t be 2 years until we remember to
take another picture!

After a long wait made of various bureaucratic and pandemic-related
hurdles, the lab is finally at its full capacity!
We also finally managed to get everyone in one room long enough to
take a photo.

We are hiring! We are looking for a postdoc to study
antimicrobial resistance (AMR) using high-throughput laboratory evolution
and genomics. Potential research topics include the influence of genetic
background on the evolution of AMR, dynamics of horizontal gene
transfer in microbial communities, and resistance to sequence-based therapeutics.
We are looking for a candidate with a PhD in molecular biology, microbiology or
evolutionary biology and with significant molecular biology experience.
Previous research in the area of antimicrobial resistance is optional, as is
experience with liquid-handling platforms and basic computational biology.
We offer a contract with an initial period of 2 years with the possibility of
further extensions, and the opportunity to be trained in microbial genomics and
computational biology.
The lab is located in Hannover, an affordable
and vibrant city.
More information about the project and the research environment is available
here. Candidates should apply following the instructions found on the
MHH website.
Informal inquiries are welcome; please do get in touch with us if you have any questions!