Statistical Methods for Gene Regulatory Analysis From Single Cell Genomics Data

NIH RePORTER · NIH · P20 · $24,423

Abstract

Gene regulatory networks (GRNs) provide information on the cis-regulatory elements controlling context-specific expression of target genes, as well as the transcription factors acting on these elements. Understanding the dynamics of gene regulation is fundamental for understanding how cells specialize for different functions despite sharing the same genome; how cells respond to different environments by modulating gene expression; and how non-coding genetic variants cause disease. Inference of GRNs from genomics data is a systematic approach to studying gene regulation. However, the accuracy of such inference is limited if the cellular context of interest is a heterogeneous mixture. Single cell genomics technologies can fill this gap by providing high-resolution GRNs. Therefore, there is a compelling need for efficient statistical methods to infer GRNs from single cell genomics data. The long-term goal of this project is to obtain a mechanistic understanding of how non-coding genetic variants affect cellular context-dependent GRNs and influence phenotypes. Single cell transcriptomic (scRNA-seq) and chromatin accessibility (scATAC-seq) data provide information on different cellular features, namely gene expression and the location of active regulatory elements, respectively. Integrating these two types of data will provide more accurate information on gene regulation. In Specific Aim 1, we will extend our initial studies inferring subpopulation-dependent GRNs from unpaired scRNA-seq and scATAC-seq data (supported by a COBRE in Human Genetics Pilot Project since 02/01/2022) by benchmarking existing methods for integrative analysis of unpaired scRNA-seq and scATAC-seq data to build an optimized pipeline for unpaired data analysis. We will develop a statistical method to infer subpopulation-specific GRNs and analyze large-scale published datasets to build a database of GRNs for hundreds of cellular contexts.
In Specific Aim 2, we will develop statistical methods for comparative gene regulatory analysis based on single cell genomics data. Comparing GRNs between samples from diseased versus healthy individuals, or between two different treatments, is an important scientific problem. Thus, an efficient computational method for comparative gene regulatory analysis based on different types of single cell genomics data is needed. In Specific Aim 3, we will develop a method and software to infer cell type-specific GRNs from sc-multiome data. This method and software would have a significant and broad impact by providing a detailed view of how trans- and cis-regulatory elements work together to affect gene expression in a cell type-specific manner.

Key facts

NIH application ID
10808906
Project number
5P20GM139769-04
Recipient
CLEMSON UNIVERSITY
Principal Investigator
Robert R. H. Anholt
Activity code
P20
Funding institute
NIH
Fiscal year
2024
Award amount
$24,423
Award type
5
Project period
2021-02-10 → 2024-02-02