Methods for Genomic Analysis in Heterogeneous Tissues

NIH RePORTER · NIH · R01 · $636,087

Abstract

The vast majority of genomic data are generated from heterogeneous tissues, whereas many genomic measurements (e.g., gene expression and methylation) are tissue- and cell-type-specific. Notably, cell-type-specific analysis can lead to important insights into the underlying biological mechanisms. Furthermore, analysis that ignores cell-type-specific effects often results in substantial power loss and false-positive discoveries. Thus, there is a pressing need to develop methods that facilitate cell-type-specific analysis of existing and future bulk datasets. Existing efforts to address tissue heterogeneity focus on inferring cell counts from bulk RNA and methylation data; however, these approaches do not detect cell-type-specific associations but rather are used to avoid false discoveries. By contrast, this proposal will focus on a novel set of statistical tools for inferring the cell-type-specific expression and methylation signal in each gene and each individual. The approach studied in this project will include the development of methods for imputing methylation from single-nucleus RNA-seq. These methods will make it possible to generate reference data for methylation using publicly available single-nucleus RNA data. In addition, this project will generate single-nucleus RNA-seq and methylation data for sorted cells from Mexican and Finnish blood and adipose samples, resulting in the largest dataset that includes both types of data, particularly for Latinos and for adipose tissue. These reference data will be used as training data for the developed methods. Finally, the methods developed will be used to search for cell-type-specific associations with obesity, nonalcoholic fatty liver disease, type 2 diabetes, and dyslipidemias, as well as to perform cell-type-specific eQTL and mQTL analyses in a large Mexican and Finnish population.
To achieve this goal, bulk methylation data will be generated for Mexican and Finnish adipose samples for which genotypes, bulk RNA-seq, and refined phenotypes are already available. Importantly, the Latino data will be one of the largest non-European datasets with expression, methylation, and genotype information. These data will be made available to the research community. Thus, accomplishing this project will advance the understanding of population-specific genetic and epigenetic components of highly common cardiometabolic disorders with high morbidity and mortality worldwide. Mexicans have the highest susceptibility to these cardiometabolic disorders, and this study will provide much-needed new genomic data from this admixed minority population to combat cardiometabolic disease in diverse populations.

Key facts

NIH application ID
10187622
Project number
5R01HG010505-03
Recipient
UNIVERSITY OF CALIFORNIA LOS ANGELES
Principal Investigator
ERAN HALPERIN
Activity code
R01
Funding institute
NIH
Fiscal year
2021
Award amount
$636,087
Award type
5
Project period
2019-09-15 → 2023-06-30