Integration of GTEx and HuBMAP data to gain population-level cell-type-specific insights

NIH RePORTER · NIH · R03 · $314,739

Abstract

The NIH Common Fund Genotype-Tissue Expression project (GTEx) collected whole-genome sequencing and gene expression data from 47 tissue sites in hundreds of subjects. It has had a large impact, providing tissue-level gene expression and expression quantitative trait loci (eQTLs) for over 7,000 publications. However, tissues are mixtures of myriad cells, and tissue-level gene regulation is affected by cellular composition. To obtain cell-type-specific (CTS) effects, GTEx started to collect single-nucleus RNA-sequencing (snRNA-seq) data from eight tissue types. Single-cell data collection is extremely expensive and labor-intensive, so snRNA-seq data have been collected from only 25 tissue samples from 16 donors, which may not represent the population. More cost- and labor-efficient methods are urgently needed to make full use of existing datasets. Using single-cell data from another NIH Common Fund project, the Human BioMolecular Atlas Program (HuBMAP), as a reference, we can gain population-level insights by developing computationally efficient methods. Complementary to GTEx and other single-cell references, the HuBMAP single-cell reference allows us to deconvolve the 47 GTEx tissues into over 200 cell types. In addition to the cellular fractions, we will calculate CTS eQTLs for those cell types at a population scale. Specifically, we will: 1) estimate cellular fractions of over 200 cell types from 47 tissue sites across the human body; 2) calculate CTS eQTLs for those hundreds of cell types with statistical rigor and power. We will further consider the potential selection bias in the eQTL analysis that arises because GTEx collected only normal tissues. The successful completion of this project will maximize the use of the NIH Common Fund GTEx and HuBMAP projects to provide a new eQTL resource at cell-type resolution. This resource will be powerful in downstream analyses such as CTS colocalization, by connecting with genome-wide association studies (GWAS), and CTS transcriptome-wide association studies (TWAS), by predicting genetically regulated CTS gene expression. Altogether, this project will provide a global picture of the human body at high resolution to map cells to health and complex diseases.

Key facts

NIH application ID: 10575440
Project number: 1R03OD034501-01
Recipient: UNIVERSITY OF PITTSBURGH AT PITTSBURGH
Principal Investigator: Jiebiao Wang
Activity code: R03
Funding institute: NIH
Fiscal year: 2022
Award amount: $314,739
Award type: 1
Project period: 2022-09-20 → 2024-09-19