Prediction of nearest neighbor parameters for folding RNAs with modified nucleotides

NIH RePORTER · NIH · R21 · $204,073

Abstract

Natural and synthetic RNAs play key roles in cellular function, biotechnology, and medicine. RNAs fold into intricate structures that often drive their functions, so determining RNA structure is fundamental to biology and biotechnology. Computational thermodynamics-based secondary structure modeling (TSSM) is a popular, low-cost, and rapid approach to structure prediction, which has enabled transcriptome-wide structure-function studies and massive structure-based screens of synthetic RNA libraries. However, recent evidence suggests that diverse post-transcriptional chemical modifications of nucleotides also exert a profound impact on local and/or global structure, ultimately modulating the RNA’s stability, expression, or regulatory function. Such modifications are widespread across all domains of life and represent a new and poorly understood layer of gene regulation, which has been implicated in disease. Moreover, they are routinely introduced into RNA medicines as a means of evading the innate immune response. Taken together, the wealth of natural modifications and the development of novel artificial ones, the growing interest in their mechanisms, and their centrality to RNA medicine underscore a pressing need to determine structures of RNAs with modified nucleotides rapidly and accurately. However, TSSM methods cannot account for the effects of modifications because parameters for estimating their folding stabilities are lacking. These methods rely on the feature-rich Turner nearest-neighbor (NN) thermodynamic model, which is parameterized by 294 free-energy change values derived for canonical bases from 802 costly and laborious UV melting experiments. Given the diverse and rapidly expanding pool of modifications, it is impractical to repeat such experiments for each modification type. The premise of this proposal is that NN parameters can be learned more efficiently from alternative experiments that are affordable, widely accessible, and high throughput.
Specifically, next-generation sequencing has transformed RNA Structure Probing (SP) into a routine massively parallel experiment, which reports structural information about local nucleotide dynamics. SP is widely used to gain insights into RNA structure and function from genome-wide studies and to constrain TSSM algorithms to improve their predictions. However, unlike melting assays, the relationship between RNA folding stability and SP measurements is highly nontrivial, and thus the problem of recovering the parameters from SP data is difficult. The goal of this proposal is to develop novel algorithms and software to estimate NN parameters from high-throughput SP data. We will design statistical inference methods that reconcile information from folding algorithms and SP experiments and apply them to data for unmodified and modified RNAs to estimate new parameters for modified nucleotides. As the link between SP data and folding thermodynamics is complex, and furthermore, the ability to fit the Turner parameters fro...

Key facts

NIH application ID: 10576175
Project number: 1R21GM148835-01
Recipient: UNIVERSITY OF CALIFORNIA AT DAVIS
Principal Investigator: Sharon Aviran
Activity code: R21
Funding institute: NIH
Fiscal year: 2023
Award amount: $204,073
Award type: 1
Project period: 2023-03-01 → 2025-02-28