Leveraging machine learning and evolution to navigate sequence-function landscapes in multidomain proteins

NIH RePORTER · NIH · K99 · $113,368

Abstract

Allostery, the phenomenon describing how the state of one site in a protein is coupled to the state of a distal site, is a fundamental driver of functional evolution in protein families. It is especially impactful in multimeric and multidomain proteins – those that arise from the recombination of protein domains that are structurally and functionally distinct. The goal of this proposal is to develop methods that combine computational and experimental approaches to understand the role of allostery in establishing new functions by coupling enzymatic activity to biological processes at the membrane. Insights gained in this work will enable us to better understand how domain recombination has expanded the functional repertoires of protein families, and will enable more efficient engineering of synthetic proteins. In Aim 1 of this proposal, I will leverage recent advances in machine learning and computational geometry to develop more accurate generative models of protein families that implicitly account for the evolutionary processes that act upon them. In Aim 2, I will conduct a systematic investigation into the sequence-function landscape of a dimeric bacterial bicarbonate transporter that couples proton transport across the membrane to enzymatic production of bicarbonate. Using deep mutational scans in the context of a suppressor screen, I will identify sequence positions that decouple enzymatic activity from proton transport and will use this knowledge to test structure-function hypotheses related to allostery in this protein system. In Aim 3, I will use machine learning models fit to protein families to rationally design focused deep mutational scans to explore allostery in human atrial natriuretic peptide receptors. These receptors directly couple ligand binding to secondary messenger production in a single polypeptide chain containing multiple distinct domains. Using information from evolution will help me make more effective use of an experimentally limited mutational budget and will allow me to interrogate the higher-order interactions that are a hallmark of allosteric networks.
My background in structural biology and subsequent training in biological machine learning give me a unique perspective and skillset to tackle these challenging problems. The engaging scientific environment at UC Berkeley and the strong support of my mentors, Dr. Yun Song and Dr. David Savage, will enable me to operate more seamlessly at the interface of computation and experimentation in biology as I launch my independent research career.

Key facts

NIH application ID: 10785243
Project number: 1K99GM152766-01
Recipient: UNIVERSITY OF CALIFORNIA BERKELEY
Principal Investigator: Antoine Koehl
Activity code: K99
Funding institute: NIH
Fiscal year: 2024
Award amount: $113,368
Award type: 1
Project period: 2023-12-01 → 2025-11-30