An expanded framework for RNA quality correction in expression analyses in the human brain

NIH RePORTER · NIH · R21 · $233,250

Abstract

Serious brain disorders like schizophrenia, bipolar disorder, major depression, autism spectrum disorder, and Alzheimer’s disease are debilitating illnesses that place substantial burdens on both the families of affected individuals and public health. While they all have high degrees of heritability, the etiology underlying these disorders in the majority of patients has been difficult to characterize. The strongest clues to the etiological underpinnings of these disorders, particularly the neuropsychiatric and neurodevelopmental ones, come from recent genetic studies, which have identified hundreds of common loci that each contribute a small effect on risk, but the mechanisms guiding any individual risk locus remain largely unknown. These common variants are therefore hypothesized to manifest at the gene pathway and network level, but the pathways implicated by the genetic association results have varied substantially. Many groups, including our own, have therefore utilized postmortem human brain tissue to better understand the molecular correlates of both the genetic and non-genetic effects of these disorders, as gene expression levels may better illuminate mechanisms of risk. However, in this proposal we point out a damaging and often overlooked issue in comparing postmortem tissue between patients and controls: we have identified strong confounding effects of RNA quality in the majority of published datasets, as well as in our own. We first describe these widespread RNA quality effects, demonstrate that existing statistical approaches do not remove this confounding, and show that these RNA quality effects drive inference in co-expression and network analyses. Using both simulated and real data, we identify hundreds of false positive network edges while discovering only a few true edges.
In this application, to better understand the molecular etiology of these debilitating disorders, we propose a framework to accurately model RNA quality in gene expression datasets based on molecular degradation experiments across the human brain. This framework, called “quality surrogate variable analysis”, will be applied to better identify molecular signatures of debilitating brain disorders at the gene and network level, improving the replication and interpretability of findings from these large publicly available datasets. Gene networks resulting from our RNA quality-corrected framework will be interrogated for biological functionality and clinical relevance using pre-defined gene sets. These results can illuminate potentially novel biological associations underlying serious mental illness. We hypothesize that removing the biases induced by RNA quality will result in the strongest enrichment with these gene sets at the gene and network levels. Correctly modeling potential RNA quality effects in postmortem gene expression data will be an important tool in the statistical ...

Key facts

NIH application ID: 9964910
Project number: 5R21MH120497-02
Recipient: LIEBER INSTITUTE, INC.
Principal Investigator: Leonardo Collado Torres
Activity code: R21
Funding institute: NIH
Fiscal year: 2020
Award amount: $233,250
Award type: 5
Project period: 2019-07-01 → 2022-06-30