# Computational and Statistical Methods to determine variant effect across cell types and development stages

> **NIH U01** · YALE UNIVERSITY · 2024 · $1,913,337

## Abstract

Building on the success of the GTEx project, the recently launched dGTEx project will recruit 120 donors to identify
genetic variants affecting gene expression across tissues at four developmental stages (postnatal, early
childhood, pre-pubertal, and post-pubertal). Because the dGTEx project has considerably fewer samples
than the GTEx project, there is a critical need to develop powerful and robust statistical methods
to best use the dGTEx data for eQTL analysis. Moreover, single-cell sequencing is planned for the dGTEx
project, creating additional challenges and opportunities. The overall objective of our project is to develop and
apply novel statistical and computational methods to integrate different data sets to facilitate eQTL analysis of
the dGTEx data, and share the results with the research community. We will accomplish this objective through
three specific aims. For the first aim, we will infer tissue-specific eQTLs based on the total read count data by
borrowing information across tissues and developmental stages. We will then develop a hierarchical Bayesian
method to infer cell-type-specific eQTLs across developmental stages by jointly analyzing single-cell data and
bulk samples with computationally estimated cell-type proportions. We will also consider isoform eQTLs for this
aim. For the second aim, we will develop methods for identifying allele-specifically expressed genes in different
cell types. To gain more power, we will develop methods to jointly call allelic events across tissues and cell
types, correct for the specific biases in single-cell expression data, and develop methods for integrating
allele-specific chromatin accessibility and allele-specific expression using single-cell multiome data. Single-cell data
will then be combined with bulk RNA-seq data to further improve allele-specific expression inference across
subjects. Finally, we will jointly analyze total read counts and allele-specific data for eQTL inference for this aim.
In the third aim, we will develop methods to integrate data from other sources to complement the data collected
from the dGTEx project, such as data from the GTEx project. We will leverage chromatin data to "transfer"
known eQTLs from bulk tissues and larger cohorts to the specific (smaller) single-cell cohorts. We will also
incorporate predicted effects of genetic variants from deep learning approaches in our modeling and analysis.
To facilitate transcriptome-wide association studies for complex traits rooted in early development, we will
develop gene expression imputation models based on our eQTL results. We will work with the dGTEx team to
share our results with the broader scientific community via the dGTEx portal and AnVIL.
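
For orientation, the single-variant eQTL test that the proposed methods extend can be sketched as an ordinary least-squares regression of a gene's expression on genotype dosage. This is a generic textbook illustration, not the project's method: the function name `eqtl_fit`, the 0/1/2 dosage coding, and the toy values are assumptions, and it omits the covariates, cross-tissue sharing, and cell-type modeling described above.

```python
from statistics import mean


def eqtl_fit(dosages, expression):
    """Fit expression = intercept + slope * dosage by ordinary least squares.

    dosages: alternate-allele counts per donor (assumed coding: 0, 1, or 2).
    expression: normalized expression of one gene in the same donors.
    Returns (slope, intercept); the slope is the estimated per-allele effect.
    """
    if len(dosages) != len(expression):
        raise ValueError("dosages and expression must have the same length")
    g_bar = mean(dosages)
    e_bar = mean(expression)
    # Sum of squared deviations of genotype; zero means a monomorphic variant.
    sxx = sum((g - g_bar) ** 2 for g in dosages)
    if sxx == 0:
        raise ValueError("genotype has no variation across donors")
    # Cross-product of genotype and expression deviations.
    sxy = sum((g - g_bar) * (e - e_bar) for g, e in zip(dosages, expression))
    slope = sxy / sxx
    intercept = e_bar - slope * g_bar
    return slope, intercept


# Hypothetical toy data: six donors, expression rising with dosage.
toy_dosages = [0, 1, 2, 0, 1, 2]
toy_expression = [1.0, 2.0, 3.0, 1.0, 2.0, 3.0]
toy_slope, toy_intercept = eqtl_fit(toy_dosages, toy_expression)
```

With only 120 donors, a per-tissue estimate like this is noisy, which is the motivation the abstract gives for borrowing information across tissues and developmental stages.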

## Key facts

- **NIH application ID:** 10990741
- **Project number:** 1U01HG013840-01
- **Recipient organization:** YALE UNIVERSITY
- **Principal Investigator:** Mark Bender Gerstein
- **Activity code:** U01
- **Funding institute:** NIH
- **Fiscal year:** 2024
- **Award amount:** $1,913,337
- **Award type:** 1
- **Project period:** 2024-09-23 → 2027-08-31

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10990741

## Citation

> US National Institutes of Health, RePORTER application 10990741, Computational and Statistical Methods to determine variant effect across cell types and development stages (1U01HG013840-01). Retrieved via AI Analytics 2026-05-24 from https://api.ai-analytics.org/grant/nih/10990741. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
