Integrative modelling of single-cell data to elucidate the genetic architecture of complex disease

NIH RePORTER · NIH · R01 · $507,248

Abstract

Leveraging genome-wide association studies (GWAS) to understand disease has proven challenging, as the underlying biological mechanisms are often poorly captured by bulk tissues. Recent advances in single-cell sequencing have led to a torrent of data across multiple modalities, contexts, and individuals, providing an unprecedented opportunity to understand disease biology at high resolution. We hypothesize that the fine-scale cellular contexts captured by single-cell data will be effective at explaining disease heritability and fine-mapping disease mechanisms. However, current approaches to integrating single-cell data with GWAS largely rely on off-the-shelf methods developed for bulk sequencing, which obscure the rich phenotypic diversity present in individual cells within and across canonical cell types. The sparse and highly variable nature of single-cell data has additionally posed challenges for robustly identifying single-cell quantitative trait loci (QTLs). Single-cell data continue to increase in size and complexity, emphasizing the need for scalable integrative modeling. Here, we propose a 5-year research plan to develop novel approaches for integrating single-cell data with GWAS by modeling complex cellular phenotypes not captured by existing bulk approaches. Our proposal will identify novel disease-relevant cell states; leverage multiple single-cell modalities to fine-map disease variants and their target genes; and discover novel single-cell QTLs associated with disease. Our specific aims are: Aim 1: Leveraging single-cell epigenetic data to identify heritable components of disease; Aim 2: Leveraging single-cell data to fine-map disease variants and their mechanisms; Aim 3: Defining the regulatory effects of disease variants using population-scale scRNA-seq.
While our proposed approaches are broadly applicable to common diseases, we will benchmark them on immune-related and neuropsychiatric traits, which we have studied extensively with bulk datasets in published work and for which we have now aggregated a large collection of relevant single-cell datasets. Our collaboration has multiple strengths: our focus on functional data integration across multiple single-cell modalities; our broad statistical and computational expertise; and our extensive, data-driven publication record on common disease.

Key facts

NIH application ID
10879333
Project number
1R01HG013083-01A1
Recipient
DANA-FARBER CANCER INST
Principal Investigator
ALEXANDER GUSEV
Activity code
R01
Funding institute
NIH
Fiscal year
2024
Award amount
$507,248
Award type
1
Project period
2024-08-01 → 2028-04-30