Unlocking sequence-structure-function-disease relationships in large protein super-families

NIH RePORTER · NIH · R35 · $357,769

Abstract

Project Summary: Predicting disease phenotypes from genotypes is a grand challenge in biology and personalized medicine. Our long-term goal is to address this challenge using a combination of computational and experimental approaches. Working towards this goal, we have developed and deployed a powerful evolutionary systems approach to map the complex relationships connecting sequence, structure, function, regulation and disease in biomedically important protein super-families such as protein kinases. We have made important contributions describing the unique modes of allosteric regulation in various protein kinases, deciphering the structural basis of oncogenic activation in a subset of receptor tyrosine kinases, uncovering the regulation of pseudokinases, and developing new tools and resources for addressing data integration challenges in the signaling field. We propose to build on this work to answer key questions emanating from our ongoing studies, such as: What are the functions of pseudokinases, the catalytically inert members of the kinome, and how can we use pseudokinases to better predict and characterize non-catalytic functions of kinases? What are the functions of conserved cysteine residues in regulatory sites of protein and small-molecule kinases, and are they post-translationally modified in redox signaling and oxidative stress responses that are causally associated with age-related disorders? How can we enhance existing computational models for predicting genome-phenome relationships using structural information, and can machine learning on structurally enhanced knowledge graphs reveal new relationships between patient-derived mutations and disease phenotypes? We propose to answer these questions using a variety of approaches, including statistical mining of large sequence datasets, molecular dynamics simulations, machine learning, mass spectrometry, biochemical analysis and in vivo assays.
Completion of this work is expected to reveal new allosteric sites for targeting pseudokinase and kinase non-catalytic functions in diseases, and significantly advance our understanding of kinase regulatory mechanisms in disease and normal states. Our work will create new tools and resources for knowledge graph mining and provide explainable models for inferring causal relationships linking genomes and phenomes with potential applications in personalized medicine. Finally, the scope and impact of our work will be significantly broadened by participation in studies extending our specialized tools and technological approaches developed for the study of kinases to other biomedically important gene families such as glycosyltransferases and sulfotransferases.

Key facts

NIH application ID: 10086608
Project number: 1R35GM139656-01
Recipient: UNIVERSITY OF GEORGIA
Principal Investigator: Natarajan Kannan
Activity code: R35
Funding institute: NIH
Fiscal year: 2021
Award amount: $357,769
Award type: 1
Project period: 2021-02-01 → 2026-01-31