News from the lab


Time is relative (in genomic epidemiology)

Marco Galardini
04 May 2026

We have just posted a new preprint, describing our work on improving how to detect bacterial transmission in hospitals using genomics. This effort was led by Judit, who worked in collaboration with colleagues from Hannover Medical School and Copenhagen University Hospital-Rigshospitalet, to use a large (~30k genomes!) bacterial genomics dataset that our collaborator Susanne Häußler has put together over the years.

Bacterial infections are an all too common “perk” for hospitalized patients, and it’s the job of epidemiologists to identify the transmission routes for these pathogens. Often this problem boils down to a relatively simple question: is this particular pair of bacterial samples related to each other? Genomics, used to read all the millions of letters in the bug’s DNA, provides the highest possible resolution to answer this question. But how do we choose the number of genetic differences (“SNPs”) that separates related samples (i.e. the same bug moving between patients) from unrelated ones?

The standard practice in the field has relied on a fixed SNP threshold (e.g. 20), which generally works but has two problems: it’s rather arbitrary and, most importantly, it does not take into account the impact of time. Every time a bacterial cell duplicates there is a chance that errors (i.e. SNPs) are introduced, so samples that are farther apart in time can be expected to have more SNPs separating them. But how can we calibrate such a “SNP accumulation clock”? Judit had the brilliant intuition that in the dataset we had ~50 patients that had been sampled multiple times (~20!) over their hospital stay. She could then calibrate our empirical clock within the same dataset we would use the clock for. We hope that this approach will be taken up by genomic epidemiologists in their daily practice.

Schematic representation of how we calibrated a mutation accumulation clock

Once we had used our calibrated clocks to identify transmitting bugs, we wanted to know if we could identify genetic characteristics that could differentiate them from non-transmitting ones. We used two approaches to answer this question, one using lists of known genes, and one looking at the whole “haystack”. Even though we could identify many genes and genetic variants associated with the ability to transmit between patients, we failed to use them to predict which samples were part of a transmission chain in a held-out dataset. This suggests that patient and environmental factors might dominate the probability of bacterial transmission. Measuring and including these factors in future analyses may then lead to better predictions.

Differences in virulence and resistance scores between bacterial pathogens

There’s more to discover in the preprint and the accompanying code repository, so please dig in!


A thesis submission for 2026

Marco Galardini
29 April 2026

We are very happy to report that the fifth (!) PhD thesis from the lab has been submitted this month!

As is customary with many graduate students, Hien submitted her thesis one day before the deadline; but this time I am to blame for pushing her to finish one last experiment before completing her thesis.

We are all very excited about her hard work and are now looking forward to the public defenses in June.

A photo of Hien and her PhD thesis

Congratulations to her!


PhD position in bacterial genomics and machine learning available

Marco Galardini
06 November 2025

Update (2025-12-16): the position has been provisionally filled. Thanks to all who applied!

After a rather long hiatus we are hiring again!

We are looking for a candidate to fill a computational PhD student position, with the main task of better understanding which genetic elements (“genes”) make bacterial pathogens such as E. coli, K. pneumoniae, P. aeruginosa, and E. faecium virulent and resistant to antibiotics. We have in fact developed methods (here and here) to sift through large numbers of bacterial genomes for this exact purpose (see here, here, and here), and it will be the job of the candidate to use and improve these methods so that they can be applied to an even larger number of genomes. We are also particularly interested in implementing more machine learning methods and in integrating molecular phenotypes such as gene expression and proteomics.

Our lab is part of the RESIST excellence cluster at Hannover Medical School (MHH), and we are part of a collaborative project with Susanne Häussler and Meike Stiesch to identify genetic determinants of pathogenicity and virulence in life-threatening bacterial infections.

We are looking for a candidate with a strong computational background and relevant technical skills:

  • Strong background in computational biology, bioinformatics, computer science, or a related field.
  • Experience with programming languages such as Python and workflow management systems such as Snakemake.
  • Familiarity with machine learning techniques and libraries (e.g., scikit-learn, TensorFlow, PyTorch).
  • Knowledge of bacterial genomics and related bioinformatic tools.
  • Experience with high-performance computing (HPC).
  • Experience with version control systems (e.g., git) and software development best practices.
  • Experience with statistical analysis and data visualization.

We offer a fully-funded PhD position for a little longer than 3 years, to begin in February 2026. The student will be enrolled in the Biomedas graduate program, which offers a curriculum specifically designed for computational biology students. The lab is located at Twincore, embedded in a large scientific campus that includes Hannover Medical School and the Center for Individualized Infection Medicine (CiiM); we are also part of the Helmholtz Centre for Infection Research (HZI), which offers lots of opportunities for collaboration.

Hannover is the capital of Lower Saxony, in northern Germany, and is an affordable and vibrant city.

Please apply as soon as possible by sending the following documents to Marco Galardini at marco.galardini@twincore.de:

  1. Your CV
  2. A short motivation letter
  3. The link to a software portfolio (e.g., your GitHub account)
  4. Contact details of two references

Two photos from Hannover


Two PhD graduations in one day

Marco Galardini
01 July 2025

Yesterday we had the great pleasure of seeing not one, but two of our students defend their PhD dissertations! We started off with Judit, who focused on the use of large genome collections to study the transmission of bacterial pathogens in the clinic and on methods to improve current practices for genomic epidemiology. She sustained a rather long Q&A session from her examiners: Conor Meehan and Dirk Schlüter. Then it was time for Hannes to talk about his work on using k-mer based methods to improve the interpretability of bacterial GWAS and aid the design of antimicrobials based on antisense oligonucleotides (i.e. asobiotics), using very large genome collections. He had to contend with a similarly tough Q&A session from his examiners: Franziska Faber and Daniel Depledge, who is by now a regular examiner for our PhD students.

A photo of Hannes and Judit with their PhD hats

After a delicious BBQ organized at our department by them, we headed back to MHH for the graduation ceremony, where we had a pleasant surprise: Judit won the “Infection Biology Prize” for her thesis, which comes with a cool check for 1000 Euros!

A photo of Judit receiving the prize

Thanks to both for their very hard work and for a lovely day of celebrations!


When the culprit is under your nose

Marco Galardini
10 June 2025

People familiar with the field of bacterial genomics have long been aware that microbial genomes are densely packed with genes, and thus are depleted of so-called “junk DNA” (a term that has fallen out of fashion by the way!). As a result, the more abundant protein coding portions of these genomes get the most attention from researchers aiming to find which genetic variants explain phenotypic variation among isolates. Previous work has however already shown that bacterial non-coding regions are both highly diverse and show signals of being evolutionarily constrained. We also knew that these regions influence the expression of genes encoded directly downstream from them. We therefore hypothesized that we could uncover statistical associations between genetic variants in non-coding regions and gene expression variability across isolates.

The results of this work have just been published as a preprint, a work that was led by Bamu during her time as a PhD student in the lab. Bamu was the very first person brave enough to join the lab, and has managed to work in both the dry and the wet lab, a feat that not many people can achieve!

Bamu indeed found that it was possible to identify at least one genetic variant whose presence was associated with gene expression changes in up to 39% of tested genes in two important bacterial pathogens (E. coli and P. aeruginosa). Using the right way to represent the complex genetic variation (i.e. gene-centric k-mers) allowed Bamu to capture the highest proportion of associations.
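As a rough illustration of what a k-mer based association looks like, the Python sketch below compares the expression of one gene between isolates that carry a given k-mer in its upstream region and those that do not. This is not the pipeline used in the preprint: the toy sequences, the expression values, the choice of k, and the plain difference in means are all assumptions, and a real analysis would use a proper association model that accounts for population structure.

```python
from statistics import mean


def kmers(seq, k):
    """Return the set of all k-mers found in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


# Hypothetical toy data for a single gene:
# isolate name -> (upstream non-coding sequence, expression level).
isolates = {
    "iso1": ("ACGTACGTAA", 10.0),
    "iso2": ("ACGTACGTAA", 11.0),
    "iso3": ("ACGTTCGTAA", 4.0),
    "iso4": ("ACGTTCGTAA", 5.0),
}


def kmer_effects(isolates, k):
    """Difference in mean expression between isolates with and without each k-mer."""
    presence = {name: kmers(seq, k) for name, (seq, _) in isolates.items()}
    all_kmers = set().union(*presence.values())
    effects = {}
    for kmer in all_kmers:
        with_k = [e for name, (_, e) in isolates.items() if kmer in presence[name]]
        without_k = [e for name, (_, e) in isolates.items() if kmer not in presence[name]]
        # k-mers present in all isolates (or in none) carry no information.
        if with_k and without_k:
            effects[kmer] = mean(with_k) - mean(without_k)
    return effects


effects = kmer_effects(isolates, k=5)
```

In this toy example, k-mers spanning the single variable position separate the high-expression isolates from the low-expression ones, while k-mers shared by all isolates are skipped. Because k-mers need no reference alignment, the same representation can also pick up insertions, deletions, and rearrangements.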

Barplots indicating the main results from the association analysis

Once we found these associations, the next task was to validate some of them and to understand the actual mechanisms operating behind the scenes. Here Bamu used a combination of in-silico and in-vitro approaches, which very clearly indicated that no single mechanism would be sufficient to explain the observed associations.

The last part of the study was instead dedicated to understanding the role of non-coding genetic variation in antimicrobial resistance. Again, Bamu used her dry- and wet-lab skills to show that there are indeed non-coding variants in both species that are associated with antimicrobial resistance. This leads us to conclude that these often neglected regions need to be taken into account if we eventually want to make the most out of bacterial genomes.