Post-transcriptional Regulatory Networks

NIH RePORTER · NIH · R01 · $643,713

Abstract

RNA-binding proteins (RBPs) play key roles in RNA splicing, editing, nuclear export, translation, turnover, and subcellular localization. Reflecting their importance, RBPs and their cis-regulatory elements (CREs) have broad implications for human health: mutations in RBPs or CREs have well-established roles in cancer, in developmental defects (particularly in neural development), and in neurodegenerative diseases. Using a combination of RNAcompete, a high-throughput, in-vitro-selection-based RNA-binding assay, and machine learning (ML) models trained to map from an RBP’s protein sequence to its RNA-binding preferences, this project will endeavor to assign RNA sequence- and structural-context binding preferences to all human RBPs, all vertebrate RBPs, and the vast majority of metazoan RBPs. These specificities will then be used to detect and assign function to RBPs and CREs in human genomes, as well as in those of model organisms. The specificities, ML models, and predicted CREs will be distributed widely via publication, open-source software, and user-friendly web tools such as cisBP-RNA.

This project has the potential to transform cancer and human genetics research by supporting estimation of the functional impact of germline or somatic mutations on post-transcriptional regulation (PTR). By improving the reconstruction of PTR networks, it will speed research in this emerging field toward a complete understanding of this key process. It will also permit the study of the evolution of PTR by developing tools to reconstruct PTR networks in other organisms based solely on genomic and transcriptomic data.

RNAcompete will be used to assess the RNA sequence-binding preferences of the 511 still-uncharacterized RBPs in humans and D. rerio (zebrafish), thereby establishing a complete catalog of binding preferences for all likely sequence-specific RBPs in these two species. These data will be combined with binding data for >500 other RBPs from a variety of sources and used to train ML models that reconstruct RNA-binding preferences from RBP protein sequences. These models will also leverage recent advances in de novo prediction of protein structure from sequence. RBPs will be assigned roles in PTR based on (i) the location and conservation, in human transcripts, of their predicted target CREs, (ii) the correlation of their expression with the PTR fate of their putative target transcripts, and (iii) other, more powerful regression methods such as the Inferelator. CRE predictions will be continuously improved by using in vivo data to recalibrate in vitro motif models and to improve in silico predictions of transcript RNA secondary structure. Our predicted CREs and reconstructed PTR networks will be validated by comparison with in vivo data collected by our team and others.

Key facts

NIH application ID: 10899590
Project number: 5R01HG013328-02
Recipient: SLOAN-KETTERING INST CAN RESEARCH
Principal Investigator: Timothy Hughes
Activity code: R01
Funding institute: NIH
Fiscal year: 2024
Award amount: $643,713
Award type: 5
Project period: 2023-08-04 → 2027-05-31