Massively Parallel Biochemical Annotation of Missense DNA Variants to Support Newborn Screening by Whole Genome Sequencing

NIH RePORTER · NIH · R01 · $578,935

Abstract

Project Summary

Newborn screening (NBS) provides early diagnosis of treatable disorders. Most NBS is based on measurement of biomarkers in dried blood spots (DBS). However, there are hundreds of actionable pediatric diseases for which a biomarker is difficult to imagine. This has led to consideration of whole genome sequencing as a new paradigm in NBS (newborn sequencing, NBSeq). NBSeq faces significant challenges, one of which is the false negative problem expected when DNA variants of unknown significance (VOUS) are ignored. Ignoring VOUS is a necessary element of NBSeq: these variants are so common that downstream biochemical analysis of every newborn carrying one is not feasible. The goal of this grant is to develop and explore new technology for massively parallel, rapid, and inexpensive biochemical annotation of VOUS.
We have developed a new technique, which we call VOUSDO, in which we prepare a library of DNAs that encode every possible amino acid substitution in a protein, with one substitution per encoded protein. Human cells are then transfected with this library, and the cells are engineered so that each cell expresses only a single protein variant. The cells are also engineered to express an RNA barcode whose sequence identifies the protein variant. The protein is engineered so that, once expressed, it captures its respective barcode. The degree of barcode capture depends on the abundance of the protein. Thus, any amino acid substitution that causes the protein to misfold and be degraded yields a protein variant that under-captures its barcodes. At the end of the analysis, the captured barcodes are converted to DNA and read by next-generation sequencing. The sequence read frequency of each protein variant is compared to that of the wild-type protein to give the fraction of the variant that folds in cells relative to wild type. This technique works for cytosolic proteins.
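
The read-frequency comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the project's actual analysis pipeline: the function name `relative_abundance`, the variant labels, and the step of normalizing each variant's captured-barcode frequency by its frequency in the input library are all hypothetical choices made for this example.

```python
def relative_abundance(captured, input_reads, wild_type="WT"):
    """Estimate folded-protein abundance of each variant relative to wild type.

    captured:    dict mapping variant -> barcode read count after capture
    input_reads: dict mapping variant -> barcode read count in the input library
                 (normalizing by the input library is an assumption of this sketch)
    """
    cap_total = sum(captured.values())
    in_total = sum(input_reads.values())
    if cap_total == 0 or in_total == 0:
        raise ValueError("read counts must not all be zero")

    def enrichment(variant):
        # Captured read frequency divided by input read frequency.
        cap_freq = captured.get(variant, 0) / cap_total
        in_freq = input_reads[variant] / in_total
        return cap_freq / in_freq

    if input_reads.get(wild_type, 0) == 0 or captured.get(wild_type, 0) == 0:
        raise ValueError("wild type needs non-zero reads in both libraries")
    wt = enrichment(wild_type)

    # Skip variants that were never observed in the input library.
    return {v: enrichment(v) / wt for v, n in input_reads.items() if n > 0}


# Hypothetical counts: L25P under-captures its barcode, A10V behaves like wild type.
captured = {"WT": 500, "A10V": 450, "L25P": 50}
input_reads = {"WT": 1000, "A10V": 1000, "L25P": 1000}
scores = relative_abundance(captured, input_reads)
```

With these made-up counts, a variant that captures one tenth as many barcodes as wild type receives a score close to 0.1, which is the kind of signal the abstract associates with misfolding and degradation.
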
For proteins that are targeted to cellular organelles or are secreted, we will employ a second technique called VAMP-seq. Here, the protein is fused to a fluorescent protein, and the degree of fluorescence depends on the level of folded protein expressed in cells. Again, cells are engineered so that each cell expresses a single protein variant, and RNA barcodes are used to identify that variant. Fluorescence-activated cell sorting is used to sort cells into bins according to their fluorescence. Barcodes in each bin are read as for VOUSDO to give the extent of folding of each variant relative to that of the wild-type protein. These techniques have the potential to provide biochemical annotation of every single-site amino acid substitution in the full list of proteins being included in NBSeq pilot studies worldwide. Knowing which VOUS greatly reduce the extent of protein folding is expected to massively reduce the false negative problem associated with ignoring VOUS in NBSeq programs.
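
One way to turn sorted-bin barcode counts into a per-variant score is a fluorescence-weighted average across bins, normalized to wild type. The sketch below is a simplified illustration only: the abstract does not specify the scoring formula, so the function name `bin_weighted_scores`, the four-bin layout, and the bin weights are assumptions made for this example.

```python
def bin_weighted_scores(bin_counts, weights, wild_type="WT"):
    """Score each variant by its weighted position across fluorescence bins.

    bin_counts: list of dicts (lowest-fluorescence bin first), variant -> read count
    weights:    one weight per bin, increasing with fluorescence (assumed values)
    """
    if len(bin_counts) != len(weights):
        raise ValueError("need exactly one weight per bin")

    totals = [sum(b.values()) for b in bin_counts]
    variants = set().union(*bin_counts)

    raw = {}
    for v in variants:
        # Frequency of the variant within each bin (0 if the bin has no reads).
        freqs = [b.get(v, 0) / t if t else 0.0 for b, t in zip(bin_counts, totals)]
        if sum(freqs) > 0:
            raw[v] = sum(w * f for w, f in zip(weights, freqs)) / sum(freqs)

    if wild_type not in raw:
        raise ValueError("wild type was not observed in any bin")
    wt = raw[wild_type]
    return {v: s / wt for v, s in raw.items()}


# Hypothetical read counts for four bins, lowest fluorescence first.
bins = [
    {"WT": 0, "A10V": 20, "L25P": 80},
    {"WT": 20, "A10V": 20, "L25P": 60},
    {"WT": 40, "A10V": 40, "L25P": 20},
    {"WT": 40, "A10V": 60, "L25P": 0},
]
vamp_scores = bin_weighted_scores(bins, weights=[0.25, 0.5, 0.75, 1.0])
```

In this made-up data set, the variant concentrated in the low-fluorescence bins (L25P) scores well below wild type, while A10V scores close to it. A real analysis would likely add replicate handling and read-depth filters, which are omitted here.
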

Key facts

NIH application ID
10901730
Project number
1R01HD115326-01
Recipient
UNIVERSITY OF WASHINGTON
Principal Investigator
Michael H Gelb
Activity code
R01
Funding institute
NIH
Fiscal year
2024
Award amount
$578,935
Award type
1
Project period
2024-08-23 → 2028-06-30