The Biomedical Genomics Group @ Health Data Science Unit at the BioQuant Center and Medical Faculty Heidelberg is working on understanding gene (de)regulation in disease. We develop novel methods to integrate large-scale genomics datasets, using statistical and machine-learning approaches. We focus in particular on epigenomics data and single-cell approaches.
We are contributing to the OpenLab Epigenomics initiative to provide support and expertise to groups generating large-scale epigenomics datasets such as ChIP-seq, ATAC-seq and chromatin interaction data.
We are contributing to the SARS-CoV-2 research effort and developed MapMyCorona, an interactive website that displays BLAST results of viral sequences against available SARS-CoV-2 sequences in a geographical and temporal fashion.
We are looking for a scientific associate for a one-year project to implement a benchmarking platform in a cloud environment to test cellular deconvolution methods for RNA-seq/DNA methylation data. Find the job description here
We are happy to welcome motivated students for lab rotations and for bachelor's and master's theses.
Find all of our publications on Google Scholar.
The Epigenomics OpenLab is a joint effort of the DKFZ and the Medical Faculty to support groups in the processing of their epigenomics datasets. We offer assistance and expertise, as well as access to our processing pipelines, and are happy to host external members to guide them through the analysis.
Please contact us if you are looking for assistance.
Members of the lab are involved in teaching in the Molecular Biotechnology Bachelor and Master programs at Heidelberg University.
Here are some possible topics/projects for students wanting to do their bachelor thesis in our group during the summer semester 2020:
More and more single-cell RNA-seq datasets are being generated to increase the resolution of transcriptomics to the single-cell level. These datasets make it possible to resolve the mixture of cell types within a tissue sample, and have been used to create atlases of cell types from mouse embryos. On the other hand, there are thousands of bulk RNA-seq datasets available, which lack this resolution. We are working on implementing methods to re-interpret bulk datasets using single-cell information, for example by mapping patient data onto trajectories defined from single-cell expression. The project would be to contribute to the development of this method, in particular the visualization of the data, and to apply it to a large set of pediatric tumor types. Comparison to datasets of normal tissue would be used to validate the method.
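To make the idea of mapping bulk data onto a single-cell trajectory more concrete, here is a minimal sketch in plain Python. It is not the method developed in the group: it simply assumes that each trajectory state is summarized by an average expression profile, and assigns a bulk sample to the state with the highest Pearson correlation. The state names, the gene values and the function `map_bulk_to_trajectory` are hypothetical and chosen for illustration only.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equally long expression vectors."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a constant profile carries no information
    return cov / (sd_x * sd_y)


def map_bulk_to_trajectory(bulk_profile, trajectory_states):
    """Return the best-matching state and the correlation to every state.

    trajectory_states maps a state name to its average single-cell
    expression profile (same gene order as bulk_profile).
    """
    scores = {
        name: pearson(bulk_profile, centroid)
        for name, centroid in trajectory_states.items()
    }
    best_state = max(scores, key=scores.get)
    return best_state, scores


# Toy data (invented): four genes, three states along a trajectory.
trajectory_states = {
    "progenitor": [9.0, 7.0, 1.0, 0.5],
    "intermediate": [5.0, 5.0, 5.0, 2.0],
    "differentiated": [1.0, 0.5, 8.0, 9.0],
}
bulk_sample = [2.0, 1.0, 7.0, 8.0]

best_state, scores = map_bulk_to_trajectory(bulk_sample, trajectory_states)
print(best_state)
```

A real analysis would work with thousands of genes, normalized counts and a proper trajectory inference, and would have to account for the fact that a bulk sample is a mixture of several states rather than a single one.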
In the last three years, a new wave of technologies has emerged that allows profiling multiple molecular levels in single cells at the same time, e.g. CITE-seq, scCAT-seq, scNMT-seq, and scDam&T. It is therefore crucial to develop new methods that take multiple layers of information into account simultaneously to find clusters of cells, identify interactions between these layers, and generate signatures or factors underlying the differences between cells.
Auto-encoders are a popular way to achieve non-linear dimensionality reduction and to extract relevant features from a dataset. They can be applied e.g. to a single-cell dataset and compared to linear approaches such as principal component analysis or non-negative matrix factorization. Such approaches can also be used to integrate multi-omics datasets. The goal of the project is to explore the possibilities of auto-encoders for integrating single-cell RNA-seq and single-cell ATAC-seq from different in-house and published datasets, and to compare the results of these integrations to other methods implemented e.g. in popular R packages or based on integrative non-negative matrix factorization.
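As a rough illustration of what an auto-encoder does, the following plain-Python sketch trains a very small network with a tanh encoder and a linear decoder to compress three input features into one latent value and reconstruct them again. Everything here is an assumption made for illustration: the toy data, the function names `train_autoencoder` and `encode`, and the hyperparameters. An actual project would rely on a deep-learning framework and real single-cell matrices.

```python
import math
import random


def encode(W_enc, x):
    """Non-linear encoder: latent = tanh(W_enc @ x)."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in W_enc]


def train_autoencoder(data, n_hidden=1, lr=0.05, epochs=200, seed=0):
    """Train by stochastic gradient descent on the squared reconstruction error.

    Returns the encoder weights, the decoder weights and the mean
    reconstruction error measured during each epoch.
    """
    rng = random.Random(seed)
    n_in = len(data[0])
    W_enc = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
    W_dec = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(n_in)]
    history = []
    for _ in range(epochs):
        total = 0.0
        for x in data:
            h = encode(W_enc, x)
            x_hat = [sum(w * hj for w, hj in zip(row, h)) for row in W_dec]
            err = [xh - xi for xh, xi in zip(x_hat, x)]
            total += sum(e * e for e in err)
            # Backpropagate; the constant factor 2 is absorbed into lr.
            grad_h = [
                sum(err[i] * W_dec[i][j] for i in range(n_in))
                for j in range(n_hidden)
            ]
            for i in range(n_in):
                for j in range(n_hidden):
                    W_dec[i][j] -= lr * err[i] * h[j]
            for j in range(n_hidden):
                g = grad_h[j] * (1.0 - h[j] ** 2)  # derivative of tanh
                for k in range(n_in):
                    W_enc[j][k] -= lr * g * x[k]
        history.append(total / len(data))
    return W_enc, W_dec, history


# Toy data (invented): three features driven by one hidden factor t.
data = [[t, 0.5 * t, -t] for t in (i / 10.0 for i in range(-10, 11))]

W_enc, W_dec, history = train_autoencoder(data)
latent = encode(W_enc, data[0])
print(history[0], history[-1])
```

The one-dimensional latent value plays the role that the first principal component would play in a linear analysis; comparing the two reconstructions on the same data is the kind of benchmark the project description refers to.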
Schizophrenia is a severe disease whose diagnosis is mostly based on clinical interviews. Within a large consortium, we are working on improving this by identifying molecular signatures based on multiple omics data types, for example DNA methylation and gene expression (RNA-seq). This integration will likely improve on the stratification of patients obtained from a single data type. The goal of the project would be to implement several strategies to perform this data integration (neural networks, integrative linear methods, …) to identify patient groups, and to benchmark these approaches against stratification based on a single data type.
The existence of large RNA-seq datasets of tumor tissue and matching normal tissue makes it possible to conduct comparative studies. In particular, recent approaches can infer the activity of pathways and transcription factors from transcriptomic data, which can be used to understand which pathways and master regulators are jointly activated and which show mutually exclusive patterns. In recent projects, we have for example described how mesenchymal phenotypes appear to be tightly related to pathway activation, for example of the RAS pathway. The goal of the project is to conduct a large-scale analysis of the activity patterns of pathways and master regulators, and to understand how these patterns are perturbed in tumor tissues compared with their normal counterparts. We will focus in particular on processes related to ferroptosis across various tumor types, to describe how this process is related to other pathways.