# Detecting structural variants in a large population of samples through high-throughput sequencing data

> **NIH R35** · VANDERBILT UNIVERSITY · 2022 · $388,056

## Abstract

The mapping of the human genome and genome-wide association studies have provided great insights into the
genetic etiology of hereditary diseases; however, critical gaps remain. One type of genetic variation that has
been difficult to detect in genomic studies is structural variants (SVs), disruptions involving more than 50
base pairs. SVs have been implicated in many inherited diseases and cancers, yet their detection remains
challenging with conventional DNA sequencing methods. Developments in third-generation sequencing
(linked-read and long-read sequencing) and single-cell RNA sequencing (scRNA-seq) provide an opportunity
to greatly improve the detection of SVs and copy number variations (CNVs), one common type of SV.
However, existing computational tools do not take full advantage of the opportunities that these technologies
offer. In this project, drawing from our unique expertise in this rapidly evolving area, we propose the
development of a new generation of tools that will greatly improve the detection and phasing of SVs from a
large population of samples. We will develop computational tools to generate a high-quality diploid assembly
for each individual and to combine data from large populations of controls and patients to characterize SVs
that confer risk for any particular disease. We will further design a haplotype-based linkage disequilibrium
(LD) mapping approach at the whole-genome scale to identify unique haplotype-sharing patterns and provide
a new perspective for complex disease studies. Detecting SVs in combination with small variants will further
allow us to explain the etiology of complex diseases. We will also develop algorithms to detect CNVs from
scRNA-seq datasets, which have applications in cancer studies. Successful completion of this project will
constitute a major step forward in uncovering the genetic causes of complex diseases and cancers.

## Key facts

- **NIH application ID:** 10500679
- **Project number:** 1R35GM146960-01
- **Recipient organization:** VANDERBILT UNIVERSITY
- **Principal Investigator:** Xin Maizie Zhou
- **Activity code:** R35
- **Funding institute:** NIH
- **Fiscal year:** 2022
- **Award amount:** $388,056
- **Award type:** 1
- **Project period:** 2022-09-20 → 2027-07-31

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10500679

## Citation

> US National Institutes of Health, RePORTER application 10500679, Detecting structural variants in a large population of samples through high-throughput sequencing data (1R35GM146960-01). Retrieved via AI Analytics 2026-05-23 from https://api.ai-analytics.org/grant/nih/10500679. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
