CAREER: Machine Learning Algorithms for Tackling Annotation Inequality in Protein Function Characterization

NSF Award Search · 01002829DB NSF RESEARCH & RELATED ACTIVIT · $773,200

Abstract

Much of life science research is centered on understanding the functional roles of proteins. However, most scientific attention has historically focused on a limited set of increasingly well-known proteins, while the biological functions of the vast majority remain largely unknown, leading to an “annotation inequality” that biases our understanding of protein functions. This project aims to advance protein function annotation by developing artificial intelligence (AI) and machine learning (ML) methods to improve the accuracy and coverage of protein function predictions and to bridge the gap in function knowledge between understudied and well-characterized proteins. The research has potential impacts on advancing our understanding of various topics centered around protein biology and yields new analytic algorithms with broad basic biological and biomedical applications, such as drug discovery, vaccine development, and personalized diagnosis and treatment, ultimately contributing to improvements in human health as well as understanding of basic biology. The research activities are tightly integrated with education and outreach efforts.

This project introduces a systematic computational framework for protein function annotation. The research focuses on developing new ML models that are specifically designed to tackle core challenges in protein function annotation, such as annotation bias, data sparsity, and the need for heterogeneous data integration and error-controlled pred

Key facts

NSF award ID: 2442063
Awardee: Georgia Tech Research Corporation (GA)
SAM.gov UEI: EMW9FC8J3HN4
PI: Yunan Luo
Primary program: 01002829DB NSF RESEARCH & RELATED ACTIVIT
All programs: CAREER-Faculty Erly Career Dev, ADVANCES IN BIO INFORMATICS
Estimated total: $773,200
Funds obligated: $298,251
Transaction type: Continuing Grant
Period: 09/01/2025 → 08/31/2030