Development of large-scale sequence-function relationship using in situ optical sequencing

NIH RePORTER · NIH · R21 · $176,600

Abstract

The exploration and discovery of living systems have been greatly aided by modern protein tools. These tools pervade many fields of biology, from the sub-cellular scale to the cellular scale to the systems scale. Bioengineers have made substantial progress in expanding the functionality and enhancing the performance of these protein tools, but progress is slow because the community has a limited understanding of how a protein's sequence relates to its function. A long-term goal of the community is to develop a more detailed understanding of the sequence-function relationship. This understanding will allow the field to intelligently predict, design, and identify high-performing protein tools. The technical obstacle to accessing the detailed sequence-function relationship is the inability to densely sample the large landscape of potential protein sequences: the number of ways to place any of the twenty amino acids at each of the hundreds to thousands of residue positions in a protein is astronomically large. A typical lab may screen a small portion of this landscape with a limited number of mutations scattered throughout the protein or targeted to key regions of the protein. Even during these screens, the resources needed to functionally assess or sequence individual protein variants limit access to the full sequence-function relationship. The typical lab either functionally screens candidates in detail or sequences the candidates in detail, but not both. This incomplete matching between functional information and sequence information in turn prevents accurate predictions that improve protein function. The immediate goal of this proposal is to create an optical screening technology that explores the detailed protein fitness landscape with full sequence and function information on a scale 1-2 orders of magnitude larger than that of existing screens.
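To make the size of this landscape concrete, the arithmetic can be sketched as follows. This is an illustrative calculation only, not part of the proposal: the function names are hypothetical, and the 238-residue length is an assumed example (roughly the size of a typical fluorescent protein).

```python
import math

AMINO_ACIDS = 20  # the twenty standard amino acids

def log10_sequence_space(length: int) -> float:
    """Return log10 of the number of possible sequences of a given length (20**length)."""
    return length * math.log10(AMINO_ACIDS)

def single_mutant_count(length: int) -> int:
    """Number of variants that differ from a parent sequence at exactly one position."""
    return length * (AMINO_ACIDS - 1)

def double_mutant_count(length: int) -> int:
    """Number of variants that differ from a parent sequence at exactly two positions."""
    return math.comb(length, 2) * (AMINO_ACIDS - 1) ** 2

if __name__ == "__main__":
    n = 238  # assumed example length, not a value taken from the proposal
    print(f"full landscape: about 10^{log10_sequence_space(n):.0f} sequences")
    print(f"single mutants: {single_mutant_count(n)}")
    print(f"double mutants: {double_mutant_count(n)}")
```

Under these assumptions, even the set of all double mutants of one parent protein runs to about ten million variants, which is why dense sampling quickly outgrows a screen that characterizes variants one at a time.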
We will achieve this scale by using optical imaging to perform both the functional assessment and sequencing in situ. We will develop this technology across three aims: (1) We will optically quantify the function of a library of fluorescent protein mutants on large scales within a culture well. (2) In the same well, we will optically sequence individual mutants using recently developed commercial chemistries and a barcode lookup system. Because the sequencing and functional assessment occur in the same well, we will develop the relationship between sequence and function at the resolution of single protein variants. (3) We will develop a pipeline of image processing techniques that automatically and accurately segments individual cells and calls the bases within each cell footprint throughout the culture well. If successful, the combination of our three aims will enable a typical lab to screen protein tools on large scales with full sequence and function information. We expect our technology to take advantage of existing commodity goods and translate easily from lab to l...
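The idea of pairing a per-cell functional readout with a barcode read in the same well can be sketched as below. This is a minimal illustration under stated assumptions, not the project's pipeline: the channel-to-base order, the record fields (`intensities`, `brightness`), the function names, and the variant labels are all hypothetical, cell segmentation is assumed to have been done already, and base calling is reduced to picking the brightest channel in each cycle (a real pipeline would likely need corrections such as crosstalk and background handling).

```python
BASES = "ACGT"  # assumed mapping of imaging channel index to base

def call_barcode(cycle_intensities):
    """Call one base per sequencing cycle by taking the brightest of four channels."""
    return "".join(
        BASES[max(range(4), key=lambda channel: cycle[channel])]
        for cycle in cycle_intensities
    )

def pair_sequence_with_function(cells, barcode_to_variant):
    """Match each segmented cell's barcode to a variant and keep its functional readout.

    cells: list of dicts with hypothetical keys
        "intensities": list of 4-tuples, one tuple of channel intensities per cycle
        "brightness": the functional measurement for that cell
    barcode_to_variant: lookup table from barcode string to variant identifier
    """
    records = []
    for cell in cells:
        barcode = call_barcode(cell["intensities"])
        variant = barcode_to_variant.get(barcode)
        if variant is None:
            continue  # barcode not in the lookup table; skip this cell
        records.append((variant, cell["brightness"]))
    return records

if __name__ == "__main__":
    lookup = {"ACT": "variant_1"}  # made-up barcode and variant name
    cells = [
        {"intensities": [(9, 1, 1, 1), (1, 8, 2, 1), (0, 1, 2, 7)], "brightness": 0.8},
        {"intensities": [(1, 1, 9, 1), (1, 1, 9, 1), (1, 1, 9, 1)], "brightness": 0.3},
    ]
    print(pair_sequence_with_function(cells, lookup))
```

Because the barcode identifies the variant through a lookup table, only a short read per cell is needed to recover the full variant identity, and each output record ties one sequence to one functional measurement.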

Key facts

NIH application ID: 10920356
Project number: 5R21GM148901-03
Recipient: UNIVERSITY OF OKLAHOMA HLTH SCIENCES CTR
Principal Investigator: Yiyang Gong
Activity code: R21
Funding institute: NIH
Fiscal year: 2024
Award amount: $176,600
Award type: 5
Project period: 2023-09-15 → 2026-08-31