# Structural variation analysis with and without a reference genome

> **NIH R35** · UNIVERSITY OF ALABAMA AT BIRMINGHAM · 2023 · $359,989

## Abstract

Analysis of structural variations (SVs) is important because SVs are a major source of genetic variation and
account for a wide range of phenotypes in many species. To better understand their contribution to diversity,
divergence, and a variety of phenotypic traits, we should address two critical issues in SV analysis:
accurate SV characterization and an understanding of SV formation mechanisms. Without accurate SV
results, we may miss the SV events that account for the phenotypes. Without understanding their formation
mechanisms, we may be unable to distinguish phenotype-associated SVs from other SVs. As the sequencing
technology evolves, new platforms with longer sequencing reads, such as PacBio, Oxford Nanopore, and 10X Genomics,
have appeared and demonstrated great potential. However, the computational
algorithms for SV analysis remain inadequate for organisms both with and without a reference genome, and SV
mechanism analysis has relied on short (mostly <10 bp) breakpoint junction sequences because of
technical limitations. As more such data are generated, there is an urgent need to fill this gap by
developing more accurate and efficient algorithms for SV discovery and establishing an innovative way to
investigate SV formation mechanisms. The long-term goal of the laboratory is to comprehensively characterize
all forms of SVs and understand their functional consequences and formation mechanisms. The goals of the
next three years are to develop efficient algorithms for SV analysis in organisms both with and without a
reference. We will focus on large insertions, inversions, and complex SVs, which are typically underrepresented.
For organisms with a reference, we will develop a de novo assembly evaluation method to optimize existing
tools and/or develop new assembly methods. Given these toolkits, the goals for the following two years are to
study SV formation mechanisms based on global genomic architecture. Our central hypothesis is that
hotspots or signatures around the SV locus, inherited from either the paternal or the maternal genome,
may confer susceptibility to rearrangement. We will test this hypothesis by building a global,
haplotype-resolved picture of SVs using the new sequencing platforms. It is expected that the research will
contribute a suite of robust methods for long-read sequencing data that identify all forms of SVs with high
sensitivity and precision. This work is also expected to provide novel insights into SV formation
mechanisms. The proposed work is innovative in that its computational approach will greatly
improve the sensitivity and precision of SV detection from long sequencing reads,
both with and without a reference genome. The outcomes of this work may also vertically advance SV
mechanism research. The proposed research is significant because it will facilitate the discovery of pathogenic
variations and the establishment o...

## Key facts

- **NIH application ID:** 10655596
- **Project number:** 5R35GM138212-04
- **Recipient organization:** UNIVERSITY OF ALABAMA AT BIRMINGHAM
- **Principal Investigator:** Zechen Chong
- **Activity code:** R35
- **Funding institute:** NIH
- **Fiscal year:** 2023
- **Award amount:** $359,989
- **Award type:** 5
- **Project period:** 2020-07-07 → 2025-06-30

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10655596

## Citation

> US National Institutes of Health, RePORTER application 10655596, Structural variation analysis with and without a reference genome (5R35GM138212-04). Retrieved via AI Analytics 2026-05-22 from https://api.ai-analytics.org/grant/nih/10655596. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
