Development of Next-Generation Mass Spectrometry-based de novo RNA Sequencing for all Modifications

NIH RePORTER · NIH · R01 · $299,486

Abstract

PROJECT SUMMARY: An RNA sequence with all its diverse modifications constitutes the ‘true’ information content of the RNA. Defects in RNA modifications account for >100 human diseases, such as breast cancer, type 2 diabetes, and obesity, affecting millions of Americans. Despite its significance, the true sequence of an RNA, i.e., the identity and location of every nucleotide building block (modified or not) within a full-length RNA, remains a mystery, mainly because there is no general method to directly sequence any nucleotide, especially modified nucleotides (including unknown ones), at single-nucleotide resolution. No existing technology can sequence all modifications simultaneously to reveal true RNA sequences at a large scale or at the transcriptomic level. RNA modification studies are further complicated by the fact that >170 modification types have been discovered and that not all modification sites are modified completely (i.e., at 100% stoichiometry). Modifications are also undetectable by NGS-based technologies, which require conversion of RNA to cDNA that carries no modification information. Tools to map RNA modifications are limited to a few popular modifications and can usually analyze only one modification type at a time. Mass spectrometry (MS) is currently the only technique that can characterize all RNA modifications; however, conventional MS methods lose information regarding the location and co-occurrence of modified nucleotides. To resolve these outstanding issues, we have recently developed a series of novel next-generation mass spectrometry-based sequencing (NextGen MassSpec-Seq) approaches that can directly sequence tRNAs de novo without cDNA and can sequence and quantify all nucleotide modifications simultaneously.
For the duration of this proposal, we will further develop NextGen MassSpec-Seq to sequence tRNAs efficiently under different cellular and even disease conditions, make it scalable toward high throughput, and expand its application to simultaneously sequence and quantitatively map all modifications on any RNA type and at the transcriptomic level. Specifically, we propose to develop MS for large-scale de novo sequencing of full-length tRNAs together with all their diverse nucleotide modifications (Aim 1); empower MS to simultaneously sequence and quantify multiple RNA modifications, allowing quantitative mapping with single-nucleotide and stoichiometric precision (Aim 2); and scale up NextGen MassSpec-Seq and combine it with high-throughput NGS for direct sequencing of diverse RNA modifications at the transcriptomic level (Aim 3). This work will address the long-standing issue of how to reveal ‘true’ RNA sequences and provide a transformative tool for studying RNA modifications, which will promote better understanding of the functions of post-transcriptional modifications and their correlations with RNA-related diseases and pandemics.

Key facts

NIH application ID
10794359
Project number
5R01HG012853-02
Recipient
NEW YORK INST OF TECHNOLOGY
Principal Investigator
Shenglong Zhang
Activity code
R01
Funding institute
NIH
Fiscal year
2024
Award amount
$299,486
Award type
5
Project period
2023-03-01 → 2024-07-31