# Trinity: Transcriptome assembly for genetic and functional analysis of cancer

> **NIH U24** · BROAD INSTITUTE, INC. · 2021 · $78,000

## Abstract

Viruses play a role in 10 to 20% of human cancers. Tying specific viruses to human cancers remains a
fundamental biomedical and technological problem. One virus family, the human papillomaviruses (HPVs),
comprises over 225 known types and causes lesions ranging from benign warts to highly lethal, invasive
carcinomas. HPV-induced tumors occur at several anatomical sites and account for nearly 100% of cervical
cancers, 90% of anal cancers, 30 to 60% of head and neck tumors, and roughly one-fourth to one-half of vaginal,
vulvar, and penile cancers. In most HPV-induced tumors, at least part of the viral DNA genome has become
integrated into the human genome, presumably as a consequence of aberrant host cell DNA repair processes.
Our laboratory work has focused on developing accurate, sensitive, broad-specificity technology
based on DNA hybridization capture plus massively parallel sequencing to detect and structurally analyze
integrated HPV DNA in precancerous lesions and invasive tumors. Ongoing studies are also developing
fluorescence microscopy technologies for detection of both integrated HPV DNA and HPV RNA transcripts, as
these provide highly specific, potential diagnostic tools to detect the presence of HPV-induced tumor cells,
including post-therapy residual disease in clinical samples. Our work encompasses long-range, nanopore
DNA sequencing plus RNA sequencing to confirm any complex structural rearrangements of the HPV and
human genomes and to elucidate potential viral effects on viral and human gene transcription. Key to these
studies has been a very successful, collaborative effort by the Haas group (Broad Institute at MIT) with the
Lenz (Albert Einstein College of Medicine) and Montagna (Rutgers Cancer Institute of New Jersey) labs to
expand the computational tools of the Trinity Cancer Transcriptome Analysis Toolkit (CTAT). This has led to
the development of expanded CTAT components for detection of integrated HPV DNA, structural assembly
of integrated viral DNA segments, and overall transcriptional analysis of HPV-induced tumors and
precancerous lesions. The CTAT-virus insertion finder (CTAT-VIF) is available on GitHub. The IMAT-ITCR
collaboration proposed here will substantially expand the ongoing collaboration by pursuing three additional
aims. Aim 1 will expand CTAT-VIF to detect ~13,000 different viruses and, by screening all The Cancer
Genome Atlas (TCGA) and Genotype-Tissue Expression Project (GTEx) datasets, investigate whether any
viral DNAs are integrated and/or expressed. Aim 2 is to identify and analyze novel virus-human
fusion transcripts generated from virus DNA insertions. Aim 3 will broaden our in situ analysis of
HPV-induced cervical cancers and precancerous lesions to include spatial transcriptomic analysis, including
expanding CTAT for this technology. These studies should improve computational and laboratory
technologies for understanding the roles of viruses in human cancer.

## Key facts

- **NIH application ID:** 10468389
- **Project number:** 3U24CA180922-09S1
- **Recipient organization:** BROAD INSTITUTE, INC.
- **Principal Investigator:** Eric Banks
- **Activity code:** U24
- **Funding institute:** NIH
- **Fiscal year:** 2021
- **Award amount:** $78,000
- **Award type:** 3
- **Project period:** 2013-09-17 → 2023-08-31

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10468389

## Citation

> US National Institutes of Health, RePORTER application 10468389, Trinity: Transcriptome assembly for genetic and functional analysis of cancer (3U24CA180922-09S1). Retrieved via AI Analytics 2026-05-25 from https://api.ai-analytics.org/grant/nih/10468389. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
