DeepBlue epigenomic data server

Analysing large-scale epigenomic data

Abstract

Large amounts of epigenomic data are publicly available, yet their retrieval for downstream analysis is a research bottleneck. Typically, users download huge files that span the entire genome, even if they are only interested in a small subset (e.g. promoter regions) or an aggregation thereof. Moreover, complex operations on genome-level data are not always feasible on a local computer due to resource limitations.

The DeepBlue Epigenomic Data Server (http://deepblue.mpi-inf.mpg.de/) mitigates these issues by providing a powerful interface and API for filtering, transforming, aggregating and downloading multi-scale data from several epigenomic consortia, making it an ideal resource for integrating epigenomic data into analysis workflows and tools.

While any programming language can be used to access the DeepBlue API, we developed an R/Bioconductor package (https://doi.org/doi:10.18129/B9.bioc.DeepBlueR) that lets users with little scripting or programming experience analyse epigenomic data in a user-friendly way. The package automatically converts the extracted data to suitable R data structures for downstream analysis and visualization within the Bioconductor framework.

In the second part of my talk, I will highlight our recent work on DNA methylation heterogeneity in bulk bisulfite sequencing data. Most studies of DNA methylation focus almost exclusively on mapping differences in average methylation and neglect the contribution of cell-to-cell variability. Various scores have been proposed to capture DNA methylation heterogeneity, and we performed the first systematic comparison of existing scores on both simulated and experimental data. In our benchmark, we consider different scenarios from which heterogeneity can arise, including cell-type heterogeneity and allele-specific methylation. We also propose two new scores that accurately capture DNA methylation heterogeneity and are the first to cover individual CpGs.
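To illustrate why average methylation alone can hide heterogeneity, the following minimal Python sketch compares two hypothetical loci with the same mean methylation. It uses the proportion of discordant reads (PDR), one of the previously published scores, and not either of the two new scores mentioned above. The read data, the function names and the `min_cpgs` threshold of 4 are assumptions chosen for illustration only.

```python
from typing import Sequence

# One read = methylation calls at consecutive CpGs (1 = methylated, 0 = unmethylated).
Read = Sequence[int]


def mean_methylation(reads: Sequence[Read]) -> float:
    """Average methylation over all CpG calls in all reads."""
    calls = [call for read in reads for call in read]
    if not calls:
        raise ValueError("no CpG calls available")
    return sum(calls) / len(calls)


def proportion_discordant_reads(reads: Sequence[Read], min_cpgs: int = 4) -> float:
    """Fraction of eligible reads with mixed methylation states.

    A read is eligible if it covers at least `min_cpgs` CpGs (assumed
    threshold) and discordant if it is neither fully methylated nor
    fully unmethylated.
    """
    eligible = [read for read in reads if len(read) >= min_cpgs]
    if not eligible:
        raise ValueError("no read covers enough CpGs")
    discordant = sum(1 for read in eligible if 0 < sum(read) < len(read))
    return discordant / len(eligible)


# Hypothetical locus A: two homogeneous populations of reads, as might arise
# from a mixture of two cell types or from allele-specific methylation.
locus_a = [(1, 1, 1, 1)] * 5 + [(0, 0, 0, 0)] * 5

# Hypothetical locus B: every read carries a mixed methylation pattern.
locus_b = [(1, 0, 1, 0)] * 5 + [(0, 1, 0, 1)] * 5

for name, reads in (("A", locus_a), ("B", locus_b)):
    print(name, mean_methylation(reads), proportion_discordant_reads(reads))
```

Both toy loci have an average methylation of 0.5, yet PDR is 0 for locus A and 1 for locus B. The example also shows that a score can respond very differently depending on the scenario from which the heterogeneity arises, which is the kind of behaviour a benchmark across scenarios is meant to expose.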

Date
Location
Auf der Morgenstelle 16, 72076 Tübingen, Germany
Markus List
Head of the Research Group Big Data in Biomedicine