# Functional annotation of new genes aided by deep learning

> **NIH R35** · UNIVERSITY OF TEXAS HLTH SCI CTR HOUSTON · 2021 · $323,658

## Abstract

New genes (NGs) are generated by multiple mechanisms, and their end-piece sequences are identified as
chimeric transcript sequences from multiple human sources, including healthy and diseased tissues. Therefore,
NGs have been recognized as important biomarkers and therapeutic targets for precision medicine. Many
efforts have been made to study individual NG function and to identify relevant drug targets. However, the
current in-depth research and achievements are mainly concentrated on several driver NGs, and classical
cancer drugs have been directly used to target the NG domains, such as the kinase domain of the BCR-ABL1
fusion protein in leukemia. Some fusion proteins that retain DNA-binding domains, such as those of
transcription factors, can directly bind their target genes; for example, the EWSR1-FLI fusion actively recruits the BAF
complex. Recently, the downstream effectors of driver fusion genes (FGs) have emerged as therapeutic targets. For example,
targeting the downstream CCND2 inhibited RUNX1/ETO-driven leukemic expansion in vitro and in vivo, and
inhibition of STAT5, the downstream factor of NUP214-ABL1, led to the induction of leukemia cell death.
However, the functions of most identified FGs have not been systematically investigated. This is mainly due to
the limitations of traditional tools and the high cost of experimental procedures. Therefore, there is an urgent
need to develop new tools for systematically analyzing NG breakpoint-specific features in the human genome
and predicting their originating and regulatory mechanisms, such as upstream and downstream effectors. In-depth
annotation based on NG structure is important for understanding the cellular mechanisms of NGs. Effective
use of systematic bioinformatics tools for functional annotation can provide a deeper insight into the role of
NGs in the development and progression of diseases such as cancers to find direct and indirect therapeutic
targets. In this study, we will develop five bioinformatics tools for the functional annotation and feature analysis
of NGs, a predictive pipeline for automatic analysis of downstream effects of NGs, and a predictive method for
tracing the origin of NGs.

## Key facts

- **NIH application ID:** 10241506
- **Project number:** 5R35GM138184-02
- **Recipient organization:** UNIVERSITY OF TEXAS HLTH SCI CTR HOUSTON
- **Principal Investigator:** Pora Kim
- **Activity code:** R35
- **Funding institute:** NIH
- **Fiscal year:** 2021
- **Award amount:** $323,658
- **Award type:** 5
- **Project period:** 2020-09-01 → 2025-06-30

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10241506

## Citation

> US National Institutes of Health, RePORTER application 10241506, Functional annotation of new genes aided by deep learning (5R35GM138184-02). Retrieved via AI Analytics 2026-05-22 from https://api.ai-analytics.org/grant/nih/10241506. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
