# CAREER: Machine Learning Algorithms for Tackling Annotation Inequality in Protein Function Characterization

> **NSF 01002829DB NSF RESEARCH & RELATED ACTIVIT** · Georgia Tech Research Corporation (GA) · $773,200

## Abstract

Much of life science research centers on understanding the functional roles of proteins. However, most scientific attention has historically focused on a limited set of increasingly well-known proteins, while the biological functions of the vast majority remain largely unknown, leading to an “annotation inequality” that biases our understanding of protein functions. This project aims to advance protein function annotation by developing artificial intelligence (AI) and machine learning (ML) methods to improve the accuracy and coverage of protein function predictions and to bridge the gap in function knowledge between understudied and well-characterized proteins. The research could advance our understanding of protein biology and yield new analytic algorithms with broad basic biological and biomedical applications, such as drug discovery, vaccine development, and personalized diagnosis and treatment, ultimately contributing to improvements in human health. The research activities are tightly integrated with education and outreach efforts.

This project introduces a systematic computational framework for protein function annotation. The research focuses on developing new ML models that are specifically designed to tackle core challenges in protein function annotation, such as annotation bias, data sparsity, and the need for heterogeneous data integration and error-controlled pred…

## Key facts

- **NSF award ID:** 2442063
- **Awardee organization:** Georgia Tech Research Corporation (GA)
- **SAM.gov UEI:** EMW9FC8J3HN4
- **PI:** Yunan Luo
- **Primary program:** 01002829DB NSF RESEARCH & RELATED ACTIVIT
- **All programs:** CAREER-Faculty Erly Career Dev, ADVANCES IN BIO INFORMATICS
- **Estimated total:** $773,200
- **Funds obligated:** $298,251
- **Transaction type:** Continuing Grant
- **Period:** 09/01/2025 → 08/31/2030

## Primary source

NSF Award Search: https://www.nsf.gov/awardsearch/showAward?AWD_ID=2442063

## Citation

> US National Science Foundation, Award 2442063, CAREER: Machine Learning Algorithms for Tackling Annotation Inequality in Protein Function Characterization. Retrieved via AI Analytics 2026-06-08 from https://api.ai-analytics.org/grant/nsf/2442063. Licensed CC0.

---

*[NSF Awards dataset](/datasets/nsf-awards) · CC0 1.0*
