# Administrative Supplement: Improving AI/ML-Readiness of data generated from HALBE or other NIH-funded research

> **NIH R01** · UNIVERSITY OF NORTH TEXAS HLTH SCI CTR · 2021 · $142,142

## Abstract

PROJECT SUMMARY
The data collected and shared in the biorepository of our parent project, HABLE, continue to expand rapidly. Artificial Intelligence and Machine Learning (AI/ML) methods cannot make these data valuable to biomedical research until the data are AI/ML-ready. Therefore, there is an urgent need to develop effective methods for achieving AI/ML readiness in HABLE and other NIH-funded data-sharing projects.

This proposal focuses on three critical and common areas for improving the AI/ML-readiness of data generated by our parent HABLE project: missing data imputation, feature selection and outlier removal, and data readiness reporting. We will address three specific aims: Aim 1) Develop a Machine Learning-Based Multiple Imputation Method for Handling Missing Data; Aim 2) Develop a Recursive Feature Elimination and Cross-Validation (RFE-CV) Algorithm for Feature Selection and Outlier Removal; and Aim 3) Develop an Integrated Tool to Report Data Readiness.

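
To make the idea of a data readiness report concrete, here is a minimal sketch in Python. It is not the tool proposed in this project: the function name `readiness_report`, the example columns, and the choice of a 1.5 × IQR rule for flagging outliers are all assumptions made for illustration, since the abstract does not specify how readiness is measured.

```python
from statistics import quantiles


def readiness_report(rows, columns):
    """Per-column missing fraction and count of IQR-flagged outliers.

    Hypothetical helper: `rows` is a list of dicts, missing values are None.
    """
    report = {}
    for col in columns:
        values = [row.get(col) for row in rows]
        present = [v for v in values if v is not None]
        missing_fraction = 1 - len(present) / len(values) if values else 0.0

        outliers = 0
        if len(present) >= 4:
            # Assumed rule: flag values beyond 1.5 * IQR from the quartiles.
            q1, _, q3 = quantiles(present, n=4)
            iqr = q3 - q1
            low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
            outliers = sum(1 for v in present if v < low or v > high)

        report[col] = {
            "n": len(values),
            "missing_fraction": missing_fraction,
            "iqr_outliers": outliers,
        }
    return report


# Made-up example records; these are not HABLE data.
scores = [1.0, 1.2, None, 1.1, 0.9, 1.0, 1.1, 1.2, 25.0]
rows = [{"age": 60 + i, "score": s} for i, s in enumerate(scores)]
report = readiness_report(rows, ["age", "score"])
print(report)
```

In this toy data the `score` column has one missing value and one extreme value, so a report like this would point an analyst toward imputation (Aim 1) and outlier handling (Aim 2) before modeling.
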
The algorithms and tools from this application will be the first of their kind to report data readiness for NIH data-sharing projects and to facilitate heterogeneous data and feature engineering for AI/ML. They will help data scientists improve their AI/ML modeling more effectively and with less effort. The administrative supplement will benefit not only the parent HABLE project but also all other NIH-funded data-sharing projects. We expect that, with these algorithms and tools, we will achieve a high level of data readiness in the HABLE project, which will eventually make HABLE more innovative in developing state-of-the-art methods for Alzheimer’s Disease (AD) clinical trials, leading to effective personalized treatments that slow the progression of, and prevent, AD.

## Key facts

- **NIH application ID:** 10415363
- **Project number:** 3R01AG058533-02S1
- **Recipient organization:** UNIVERSITY OF NORTH TEXAS HLTH SCI CTR
- **Principal Investigator:** LEIGH A JOHNSON
- **Activity code:** R01
- **Funding institute:** NIH
- **Fiscal year:** 2021
- **Award amount:** $142,142
- **Award type:** 3
- **Project period:** 2020-08-01 → 2025-04-30

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10415363

## Citation

> US National Institutes of Health, RePORTER application 10415363, Administrative Supplement: Improving AI/ML-Readiness of data generated from HALBE or other NIH-funded research (3R01AG058533-02S1). Retrieved via AI Analytics 2026-05-24 from https://api.ai-analytics.org/grant/nih/10415363. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
