# Development of large-scale sequence-function relationship using in situ optical sequencing

> **NIH R21** · UNIVERSITY OF OKLAHOMA HLTH SCIENCES CTR · 2024 · $176,600

## Abstract

The exploration and discovery of living systems has been greatly aided by modern protein tools. These
tools pervade many fields of biology, from the sub-cellular scale to the cellular scale to the systems scale.
Bioengineers have made substantial progress in expanding the functionality and enhancing the performance of
these protein tools, but progress is slow due to the community's limited understanding of how a protein's
sequence relates to a protein's function.

A long-term goal of the community is to develop a more detailed understanding of the sequence-function
relationship. This understanding will allow the field to intelligently predict, design, and identify high-performing
protein tools. The technical challenge to accessing the detailed sequence-function relationship is the inability to
densely sample the large landscape of potential protein sequences: there are nearly infinite possibilities
for placing any of the twenty amino acids at the hundreds to thousands of residue positions of a protein. A typical
lab may screen a small portion of this landscape with a limited number of mutations scattered throughout the
protein or targeted to key regions of the protein. Even during these screens, limitations in the scale of resources
needed to functionally assess individual protein variants or sequence individual variants hinder full access to the
sequence-function relationship. The typical lab either functionally screens candidates in detail or sequences the
candidates in detail, but not both. This incomplete matching between functional information and sequence
information in turn prevents accurate predictions that improve protein function.

The immediate goal of this proposal is to create an optical screening technology that explores the detailed
protein fitness landscape with full sequence and function information on a scale 1-2 orders of magnitude larger than the scale
of existing screens. We will achieve this scale by using optical imaging to perform both the functional assessment
and sequencing in situ. We will develop such a technology across 3 aims: (1) We will optically quantify the
function of a library of fluorescent protein mutants on large scales within a culture well. (2) In the same well, we
will optically sequence individual mutants using recently developed commercial chemistries and a barcode
lookup system. Because the sequencing and functional assessment occur in the same well, we will develop the
relationship between sequence and function at the resolution of single protein variants. (3) We will develop a
pipeline of image processing techniques that automatically and accurately segment individual cells and call the
bases within each cell footprint throughout the culture well. If successful, the combination of our three aims will
enable a typical lab to screen protein tools on large scales with full sequence and function information. We expect
our technology to take advantage of existing commodity goods and translate easily from lab to l...
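
As an illustration of how aims (2) and (3) could fit together, the following is a minimal sketch, not the project's actual pipeline. It assumes cells have already been segmented and that each cell's signal has been summarized as one intensity per sequencing cycle and base channel. The function names `call_barcodes` and `match_variants`, the brightest-channel base-calling rule, the toy intensities, and the barcode-to-variant table are all hypothetical.

```python
import numpy as np

# Assumed channel order for the four imaging channels (hypothetical).
BASES = np.array(list("ACGT"))


def call_barcodes(intensities):
    """Call one base per cycle for each cell by taking the brightest channel.

    intensities: array of shape (n_cells, n_cycles, 4) holding the per-cell
    signal in each base channel at each sequencing cycle.
    Returns a list of barcode strings, one per cell.
    """
    calls = np.argmax(intensities, axis=2)  # shape (n_cells, n_cycles)
    return ["".join(BASES[row]) for row in calls]


def match_variants(barcodes, brightness, lookup):
    """Pair each cell's functional readout with the variant its barcode maps to.

    Cells whose barcode is absent from the lookup table are skipped.
    """
    records = []
    for barcode, value in zip(barcodes, brightness):
        variant = lookup.get(barcode)
        if variant is not None:
            records.append((variant, float(value)))
    return records


# Toy data: three cells, three sequencing cycles, four channels (A, C, G, T).
intensities = np.array([
    [[9, 1, 0, 1], [0, 8, 1, 1], [1, 0, 1, 7]],  # reads A, C, T
    [[0, 1, 9, 1], [1, 1, 8, 0], [7, 0, 1, 1]],  # reads G, G, A
    [[1, 1, 1, 9], [9, 0, 0, 1], [0, 9, 1, 1]],  # reads T, A, C
])
brightness = np.array([1200.0, 450.0, 800.0])  # per-cell functional readout
lookup = {"ACT": "variant_A", "GGA": "variant_B"}  # hypothetical barcode table

barcodes = call_barcodes(intensities)
records = match_variants(barcodes, brightness, lookup)
print(barcodes)
print(records)
```

Because both measurements come from the same cell footprint in the same well, each record pairs a functional readout with a sequence identity for a single variant. A real pipeline would also need image registration across cycles and quality filtering of base calls, which this sketch omits.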

## Key facts

- **NIH application ID:** 10920356
- **Project number:** 5R21GM148901-03
- **Recipient organization:** UNIVERSITY OF OKLAHOMA HLTH SCIENCES CTR
- **Principal Investigator:** Yiyang Gong
- **Activity code:** R21
- **Funding institute:** NIH
- **Fiscal year:** 2024
- **Award amount:** $176,600
- **Award type:** 5
- **Project period:** 2023-09-15 → 2026-08-31

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10920356

## Citation

> US National Institutes of Health, RePORTER application 10920356, Development of large-scale sequence-function relationship using in situ optical sequencing (5R21GM148901-03). Retrieved via AI Analytics 2026-05-27 from https://api.ai-analytics.org/grant/nih/10920356. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
