# Computational methods for detecting patterns of complex genomic variation

> **NIH R01** · UNIVERSITY OF CALIFORNIA, SAN DIEGO · 2020 · $289,446

## Abstract

Project Summary
Structural variations (SVs), involving changes in copy number, inversions, translocations, and other
mechanisms, are an important source of genetic variation. They occur in the germline and also in
somatic cells, where they sometimes play an outsized role in diseases, cancer being a prominent example.
Much work has been done in identifying and cataloging 'simple' variants such as deletions, duplications,
translocations, and others. In contrast, our continuing proposal is about 'complex' structural variation,
characterized by extensive structural changes involving multiple breakpoints and simple SV events. In
previous research funded by the grant (17 publications), we developed and extended tools for identifying
complex SVs, including Breakage Fusion Bridge, characterized by specific copy number patterns;
chains of disparate genomic segments, as defined by Chromothripsis and Chromoplexy; and
viral-mediated rearrangements. Perhaps most relevant to the current proposal is the problem of determining
the architecture and origin of focal amplification of smaller (< 10 Mb) genomic segments. Working with
collaborators, we observed an abundance of large, circular extrachromosomal DNA (ecDNA) (Turner, Nature 2017),
detecting them in 40% of all cancer samples across a multitude of histological subtypes. EcDNA are
hotspots for complex, even multi-chromosomal, genomic rearrangements and offer a mechanistic explanation
of focal amplifications. These discoveries were supported by the development of many computational tools:
AmpliconArchitect (AA) for reconstructing the fine structure of ecDNA using Illumina short reads,
ViFi for identifying complex variation due to viral integration in humans, and ecDetect for detection
and quantification of ecDNA in cytogenetic images acquired in metaphase. For this grant, we will (i)
develop Amplicon Reconstructor (AR) as a tool for disambiguating AA-reconstructed amplicons using long
reads (Oxford Nanopore, Pacific Biosciences, and Optical Nanopore technology); (ii) use AR to understand
the evolution of complex structural variation through directed evolution of ecDNA in the lab; and (iii)
integrate data from thousands of whole genome sequences, transcript and other epigenetic data to elucidate
the functional aspects of ecDNA elements.

## Key facts

- **NIH application ID:** 9818448
- **Project number:** 2R01GM114362-05
- **Recipient organization:** UNIVERSITY OF CALIFORNIA, SAN DIEGO
- **Principal Investigator:** Vineet Bafna
- **Activity code:** R01
- **Funding institute:** NIH
- **Fiscal year:** 2020
- **Award amount:** $289,446
- **Award type:** 2 (competing renewal)
- **Project period:** 2016-01-01 → 2023-12-31

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/9818448

## Citation

> US National Institutes of Health, RePORTER application 9818448, Computational methods for detecting patterns of complex genomic variation (2R01GM114362-05). Retrieved via AI Analytics 2026-05-23 from https://api.ai-analytics.org/grant/nih/9818448. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
