# Unlocking sequence-structure-function-disease relationships in large protein super-families

> **NIH R35** · UNIVERSITY OF GEORGIA · 2023 · $137,650

## Abstract

PROJECT SUMMARY (unchanged)
Predicting disease phenotypes from genotypes is a grand challenge in biology and
personalized medicine. Our long-term goal is to address this challenge using a
combination of computational and experimental approaches. Working towards this goal,
we have developed and deployed a powerful evolutionary systems approach to map the
complex relationships connecting sequence, structure, function, regulation and disease
in biomedically important protein super-families such as protein kinases. We have made
important contributions describing the unique modes of allosteric regulation in various
protein kinases, deciphering the structural basis of oncogenic activation in a subset of
receptor tyrosine kinases, uncovering the regulation of pseudokinases, and developing
new tools and resources for addressing data integration challenges in the signaling field.
We propose to build on these impactful studies to answer key questions emanating from
our ongoing studies, such as: What are the functions of pseudokinases, the catalytically
inert members of the kinome, and how can we use pseudokinases to better predict and
characterize non-catalytic functions of kinases? What are the functions of conserved
cysteine residues in regulatory sites of protein and small-molecule kinases, and are they
post-translationally modified in redox signaling and oxidative stress responses that are
causally associated with age-related disorders? How can we enhance existing
computational models for predicting genome-phenome relationships using structural
information, and can machine learning on structurally enhanced knowledge graphs reveal
new relationships between patient-derived mutations and disease phenotypes? We
propose to answer these questions using a variety of approaches including statistical
mining of large sequence datasets, molecular dynamics simulations, machine learning,
mass spectrometry, biochemical analysis and in vivo assays. Completion of this work is
expected to reveal new allosteric sites for targeting pseudokinase and kinase
non-catalytic functions in diseases, and significantly advance our understanding of kinase
regulatory mechanisms in disease and normal states. Our work will create new tools and
resources for knowledge graph mining and provide explainable models for inferring
causal relationships linking genomes and phenomes with potential applications in
personalized medicine. Finally, the scope and impact of our work will be significantly
broadened by participation in studies extending our specialized tools and technological
approaches developed for the study of kinases to other biomedically important gene
families such as glycosyltransferases and sulfotransferases.

## Key facts

- **NIH application ID:** 10793016
- **Project number:** 3R35GM139656-03S1
- **Recipient organization:** UNIVERSITY OF GEORGIA
- **Principal Investigator:** Natarajan Kannan
- **Activity code:** R35
- **Funding institute:** NIH
- **Fiscal year:** 2023
- **Award amount:** $137,650
- **Award type:** 3
- **Project period:** 2021-02-01 → 2026-01-31

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10793016

## Citation

> US National Institutes of Health, RePORTER application 10793016, Unlocking sequence-structure-function-disease relationships in large protein super-families (3R35GM139656-03S1). Retrieved via AI Analytics 2026-05-25 from https://api.ai-analytics.org/grant/nih/10793016. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*