Detection and annotation of structural variants from long-read sequencing

NIH RePORTER · NIH · R01 · $77,232

Abstract

PROJECT SUMMARY: The overarching goal of this project is to develop a suite of computational tools to detect structural variants (SVs) from long-read sequencing data, and to facilitate their annotation and clinical interpretation. Although short-read sequencing has been widely used in research and clinical settings, it has limited ability to identify SVs due to the presence of repeat elements. Pathogenic SVs can be missed by short-read sequencing, potentially contributing to the low diagnostic rates (~30-40%) in clinical genome/exome sequencing. The lack of reliable tools for clinical interpretation of SVs further limits our ability to identify mutations that contribute to human diseases. To address these challenges, we will develop LinkedSV to detect SVs from linked-read genome and exome sequencing data generated by the 10X Genomics platform, and develop LongSV to detect SVs from PacBio and Nanopore long-read sequencing data. We will also develop LabelSV to analyze optical mapping data from Bionano Genomics, and to characterize complex SVs by integrating kilobase-resolution SV calls from optical mapping with base-resolution SV calls from sequencing platforms. Finally, building on our prior development of the ANNOVAR and InterVar tools, we will develop a computational method to facilitate clinical interpretation of SVs. By integrating gene dosage sensitivity, mutation intolerance, and phenotype information, this method will help interpret the clinical relevance of candidate SVs to disease phenotypes. Taken together, our methods will streamline the workflow for SV detection and variant interpretation. We will distribute and maintain user-friendly software tools that implement the proposed SV detection methods and generate reproducible, traceable results that conform to the current and future versions of the ACMG (American College of Medical Genetics and Genomics) / AMP (Association for Molecular Pathology) guidelines.
We believe that our methods will substantially improve SV detection, enable consistent interpretation of SVs, and facilitate the implementation of genome-guided precision medicine.

Key facts

NIH application ID: 10113101
Project number: 3R01GM132713-02S1
Recipient: CHILDREN'S HOSP OF PHILADELPHIA
Principal Investigator: Kai Wang
Activity code: R01
Funding institute: NIH
Fiscal year: 2020
Award amount: $77,232
Award type: 3
Project period: 2019-06-01 → 2023-03-31