Translation of overprinted non-canonical open reading frames from alternative transcript variants

NIH RePORTER · NIH · R01 · $433,865

Abstract

This project describes a class of recently discovered human genes: internal open reading frames (iORFs) that overlap annotated protein coding sequences in alternative reading frames. Because iORFs are translated in a different reading frame, their amino acid sequences differ from those of the annotated proteins that they overlap, meaning that overlapping genes encode two entirely different protein products. Overlapping genes are well characterized in viral genomes, but were thought to be essentially absent from the human genome. While evidence is growing that the microproteins encoded in iORFs can play important roles in human cells, it is currently unclear how many functional human iORFs exist and how they are expressed. We provide preliminary data demonstrating that, in potentially hundreds of cases, alternative transcript variants can recode a human gene from expressing the annotated, canonical protein to expressing the iORF-encoded microprotein. In Aim 1, we will apply long-read sequencing technologies to identify and validate the existence of iORF-encoding alternative transcripts in high throughput, thus establishing how broadly iORF recoding occurs in human cells. In Aim 2, we will provide molecular and cellular evidence that iORFs are functional, and determine whether their cellular roles are related to those of the canonical proteins that they overlap despite their differing amino acid sequences. In Aim 3, we will characterize the molecular mechanism and regulation of an anti-apoptotic iORF that overlaps a pro-apoptotic death effector domain-containing protein. Taken together, the successful completion of these aims will demonstrate that overlapping human genes are plentiful and functional, and provide mechanistic insight into a specific overlapping gene as a paradigmatic example of the molecular rationale for overlapping gene organization in humans.
More broadly, our study will provide an entirely new understanding of the roles of alternative transcript variants generated via alternative pre-mRNA splicing and use of alternative transcriptional start sites: instead of simply generating isoforms of known proteins, these processes can generate novel transcripts that, in losing the ability to encode the annotated protein coding sequence, can be reprogrammed to express frameshifted iORFs encoding currently unannotated microproteins with distinct sequences and functions. We thus expect to reveal new levels of complexity in the human transcriptome and proteome.
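
The premise of the abstract, that one nucleotide sequence read in two reading frames yields two unrelated amino acid sequences, can be illustrated with a small sketch. This is not the project's method. The sequence `TOY_CDS` is an invented toy example, and the helper names `translate` and `find_internal_orfs` are hypothetical. The sketch assumes the standard genetic code, an ATG start codon, and a canonical reading frame that begins at position 0.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Invented toy sequence (an assumption for illustration, not real data).
TOY_CDS = "ATGGCATGCCTGAAGTCTAAGTAA"


def translate(seq, start=0):
    """Translate seq from `start`; return (peptide, reached_stop_codon)."""
    peptide = []
    for i in range(start, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":
            return "".join(peptide), True
        peptide.append(aa)
    return "".join(peptide), False


def find_internal_orfs(cds):
    """List (start, frame, peptide) for ATG-initiated ORFs that lie in a
    frame other than the canonical one (frame 0) and end in a stop codon."""
    hits = []
    for start in range(1, len(cds) - 2):
        if start % 3 != 0 and cds[start:start + 3] == "ATG":
            peptide, reached_stop = translate(cds, start)
            if reached_stop:
                hits.append((start, start % 3, peptide))
    return hits


canonical_peptide, _ = translate(TOY_CDS)
print("canonical frame:", canonical_peptide)
for start, frame, peptide in find_internal_orfs(TOY_CDS):
    print(f"internal ORF at position {start} (frame +{frame}):", peptide)
```

In this toy case the canonical frame encodes MACLKSK, while the ATG at position 5, read in the +2 frame, encodes MPEV: the same nucleotides give two peptides with no shared sequence beyond the initiator methionine. Real iORF discovery would additionally need transcript-level and translation-level evidence, which this sketch does not model.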

Key facts

NIH application ID
10943268
Project number
1R01GM155404-01
Recipient
YALE UNIVERSITY
Principal Investigator
Sarah Ann Slavoff
Activity code
R01
Funding institute
NIH
Fiscal year
2024
Award amount
$433,865
Award type
1
Project period
2024-08-15 → 2028-06-30