News from the lab

Making microbial GWAS accessible through a bit of plumbing

Marco Galardini
10 July 2024

We have just published a research preprint describing microGWAS, a software pipeline to facilitate microbial GWAS studies. This effort was led by Judit, Bamu, and Jenny, who worked as a team to make my old chaotic code a “production-ready” tool. It was a real pleasure to see this communal effort take shape!

microGWAS logo

What we wrote last time we published a method related to microbial GWAS remains a good simple primer on the subject:

As everyone in the field of genomics has heard ad nauseam, we now have an abundance of genome sequences available; when that is combined with phenotypic measurements the obvious question is then “which gene is responsible for this phenotype?”. Statistical genetics (i.e. genome-wide association studies, GWAS) would be one way to answer that question, or rather the more correct one “which genetic variant is associated, and hopefully causal, for the variation in phenotype, across this collection of genomes?”.

The complex nature of microbial genetic variability means that in practice one has to use a number of software tools to preprocess genomes prior to the actual statistical association analysis. Each of these tools has its own quirks and informal best practices, and very often one needs to write a small script to connect the output of one tool to the input of the next one in the so-called “software pipeline”. On top of these difficulties, the “raw” output of the association analysis very rarely provides the information that the user needs. Most commonly a user would want to know: 1) which genes are associated with a given phenotype, and 2) is there a biological process that is overrepresented in the gene list?
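To make the second question concrete, here is a minimal sketch of an overrepresentation test based on the hypergeometric distribution. It is only an illustration of the general idea and not necessarily the method used by the pipeline; the function name and its arguments are hypothetical, and the numbers in the usage note are made up.

```python
from math import comb


def enrichment_pvalue(n_genome, n_annotated, n_hits, n_overlap):
    """One-sided hypergeometric p-value, P(X >= n_overlap).

    n_genome:    total number of genes considered (the background)
    n_annotated: background genes annotated with the biological process
    n_hits:      genes in the associated gene list
    n_overlap:   associated genes that carry the annotation
    """
    # The overlap cannot exceed either the annotated set or the hit list
    upper = min(n_annotated, n_hits)
    # Number of ways to draw a gene list of this size from the background
    total = comb(n_genome, n_hits)
    # Count the draws with at least n_overlap annotated genes
    tail = sum(
        comb(n_annotated, k) * comb(n_genome - n_annotated, n_hits - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total
```

For instance, `enrichment_pvalue(4000, 40, 20, 5)` asks how surprising it would be to find 5 genes from a 40-gene process in a list of 20 associated genes drawn from a 4000-gene background. In a real analysis one would also correct for testing many processes at once.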

These are the problems that our pipeline hopes to solve! We used the popular Snakemake workflow management system to connect each individual step of the typical end-to-end microbial GWAS analysis, including a number of downstream analyses to provide the users with an annotated gene list.
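As a rough illustration of what “connecting each step” means, the sketch below derives an execution order from the files each step consumes and produces, which is the core idea behind workflow managers. It uses Python's standard `graphlib` module rather than Snakemake itself, and the step and file names are hypothetical placeholders, not the actual rules of the pipeline.

```python
from graphlib import TopologicalSorter

# Hypothetical steps: name -> (input files, output files).
# These names are illustrative only and do not match the real pipeline.
STEPS = {
    "enrichment": ({"gene_list"}, {"enriched_terms"}),
    "annotate_hits": ({"raw_hits", "assemblies"}, {"gene_list"}),
    "association": ({"variants", "phenotypes"}, {"raw_hits"}),
    "call_variants": ({"assemblies"}, {"variants"}),
}


def execution_order(steps):
    """Order steps so that each one runs after the steps producing its inputs."""
    # Map every output file to the step that creates it
    producer = {out: name for name, (_, outs) in steps.items() for out in outs}
    # A step depends on the producers of its inputs; files that no step
    # produces (e.g. assemblies, phenotypes) are assumed to be user-provided
    graph = {
        name: {producer[f] for f in inputs if f in producer}
        for name, (inputs, _) in steps.items()
    }
    return list(TopologicalSorter(graph).static_order())


print(execution_order(STEPS))
# ['call_variants', 'association', 'annotate_hits', 'enrichment']
```

The steps are deliberately listed out of order to show that the order is recovered from the declared inputs and outputs alone, so no hand-written glue script has to encode it.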

Schematic representation of the pipeline steps

As you can see from the simplified scheme above, the pipeline carries out all of the work needed to go from annotated genome assemblies and a phenotype table to annotated results and diagnostic plots. We even leverage Snakemake’s support for conda to automate the cumbersome (and frankly irritating) process of installing and isolating individual tools.

As we want to make this pipeline sustainable in the medium term, we have also added a small test dataset to speed up the development process; we hope that young and eager researchers in the microbial bioinformatics community will be interested in helping to maintain the pipeline and implement new features.

More information about the four (4!) sets of genetic variants that are used in five (5!) distinct association tests can be found in the preprint, as well as in the online documentation.

The first PhD graduation from the lab

Marco Galardini
03 July 2024

Last month Dilfuza successfully defended her PhD dissertation against the questions of her two examiners: Ana Rita Brochado and Dan Depledge.

Congratulations to her for pulling this off and being the first PhD student to graduate from our lab!

A photo of Dilfuza with her PhD hat and the other members of the lab

One submission and one farewell

Marco Galardini
26 April 2024

We are very happy to report that the first PhD thesis from the lab has been submitted this month! With just one day to spare before the deadline (as it should be 😅), Dilfuza has submitted her thesis to the ZIB office. Now we wait for the public defense in June.

A photo of Dilfuza and her PhD thesis

In other news, last month Adam officially left the lab to take an exciting new job as a postdoc in the lab of Craig MacLean at the University of Oxford. Luckily Adam made a big push before leaving and finished some large-scale experiments thanks to his usual stamina, which will be missed!

A photo of Adam and Hien doing a last big push in the lab

Congratulations to Dilfuza and Adam for this exciting news!

We did run the Hannover half marathon

Marco Galardini
16 April 2024

As anticipated in the previous post, we intended to run the Hannover half marathon, and indeed we did!

We all managed to get to the finish line, with a special mention to Hannes, who completed the race in 01:59:52, just 8 seconds under the 2-hour psychological barrier!

A photo of Judit, Hannes and Marco posing with the half marathon medal

We didn’t quite manage to get a photo on the day of the race, but we are happy to report that we collected 366 Euros through our DKFZ donation campaign. Thanks to all who donated!

We are running the Hannover half marathon

Marco Galardini
25 March 2024

Our lab (Hannes, Judit and Marco) is running the Hannover half marathon on the 14th of April, and we have decided that in order to boost our determination we needed support from colleagues, friends, and family. And what better way to do it than to collect donations for a worthy cause? As researchers ourselves, we have decided to support cancer research through the DKFZ (the German Cancer Research Center). Cancer is the second leading cause of death in Germany, with an estimated 270,000 deaths in 2019. Improved diagnostics and therapeutics have reduced the death rates from cancer since the 90s, which shows how a donation to cancer research has the potential to affect the lives of many patients. So please help us get through the last two weeks of training and through race day by supporting cancer research! Follow the link below and scroll to the bottom of the page to find the donation button. Thanks!

The Microbial Pangenomes Lab is running a Half Marathon for Cancer Research - DKFZ

P.S. If you are having trouble getting through the payment system, do get in touch with Marco, and he can make the donation on your behalf and collect the money afterwards through a bank transfer.
