Discovery of structural RNAs involved in human health and disease

NIH RePORTER · NIH · R01 · $366,514

Abstract

Many fundamental cellular functions depend on a variety of RNA structures conserved through evolution, and other functional RNA structures are expected to be discovered. A signature of a conserved RNA structure is found in alignments where paired positions display correlated substitutions (covariation) that preserve the base pair. This evolutionary signal can be used both to predict RNA structure and to identify new conserved RNAs. Recent publications and preliminary results have made three important advances: (1) A statistical covariation test that identifies significant covariation over background covariation due to phylogeny. This test, implemented in a method called R-scape (RNA Structural Covariation Above Phylogenetic Expectation), provides information about, and control over, the rate of false positive predictions. (2) A recently published power-of-covariation calculation that identifies “negative” pairs: pairs with power (variation) but insignificant covariation, which are unlikely to form RNA base pairs. (3) A new cascading folding algorithm, CaCoFold (Cascade covariation/variation Constrained Folding), also recently published, that combines all positive and negative evolutionary information into complex structures including all types of pseudoknots and triplets. In human, the efficacy of these advances has been tested by accurately predicting the structures of the human non-coding RNAs MALAT1 and telomerase RNA, and by inferring that the non-coding RNAs HOTAIR and XIST do not have a conserved structure. These three advances give us a competitive advantage in performing unbiased genome-wide screens for conserved structural RNAs in vertebrates with accurate 3D structure prediction. Previous vertebrate screens for structural RNAs have been hindered by thousands of false positive predictions. In contrast, our new covariation statistical test allows for controlling the rate of false positives. R-scape has already been used to find structural RNAs in bacteria and viruses.
Our recent eukaryotic pilot screen in fungi has identified 17 novel structural RNAs. We hypothesize that many structural RNAs with implications for human health and disease are still to be discovered, and that we now have the tools to find and characterize these RNAs. This proposal has three specific aims that will advance the study of structural RNA biology and the discovery of novel biological mechanisms involving RNA structures. The first aim proposes systematic genome-wide searches to find novel conserved vertebrate RNA structures in human. The second aim proposes to combine revolutionary 3D structure prediction methods in machine learning with the signals used by CaCoFold into a state-of-the-art RNA folding method for the accurate prediction of 3D RNA structures. The third aim introduces a method to identify RNA structures in ultraconserved vertebrate UTRs, where there is no covariation signal and our current method lacks power. We expect our work will unveil primate-specific novel regulatory mechani...
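
To make the covariation signal described in the abstract concrete, here is a minimal sketch that scores two alignment columns with plain mutual information. This is an illustrative stand-in, not the R-scape statistic: it ignores phylogeny entirely, which is exactly the background effect the proposal's test is designed to account for. The function name `mutual_information` and the toy alignment are hypothetical.

```python
import math
from collections import Counter

def mutual_information(col_i, col_j, gap="-"):
    """Mutual information (in bits) between two alignment columns.

    Sequences with a gap in either column are skipped. This is a
    simplified covariation score with no phylogenetic correction.
    """
    pairs = [(a, b) for a, b in zip(col_i, col_j) if a != gap and b != gap]
    n = len(pairs)
    if n == 0:
        return 0.0
    joint = Counter(pairs)
    left = Counter(a for a, _ in pairs)
    right = Counter(b for _, b in pairs)
    mi = 0.0
    for (a, b), count in joint.items():
        # p(a,b) / (p(a) * p(b)) simplifies to count * n / (left[a] * right[b])
        mi += (count / n) * math.log2(count * n / (left[a] * right[b]))
    return mi

# Toy alignment (made up): columns 0 and 4 always form a Watson-Crick
# pair but the pair changes between sequences; column 1 never varies.
alignment = ["GAAAC", "CAAAG", "AAAAU", "UAAAA"]
columns = list(zip(*alignment))

covarying = mutual_information(columns[0], columns[4])
invariant = mutual_information(columns[0], columns[1])
print(f"columns 0/4: {covarying:.2f} bits, columns 0/1: {invariant:.2f} bits")
```

In the toy data the compensatory pair scores high while the invariant column scores zero, which also illustrates why fully conserved regions carry no covariation signal.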

Key facts

NIH application ID: 10521342
Project number: 1R01GM144423-01A1
Recipient: HARVARD UNIVERSITY
Principal Investigator: Elena Rivas
Activity code: R01
Funding institute: NIH
Fiscal year: 2022
Award amount: $366,514
Award type: 1
Project period: 2022-09-15 → 2026-07-31