# Towards Better Understanding of ALS using a Multi-Marker Discovery Approach from a Multi-Modal Database (ALS4M)

> **NIH ALLCDC R01** · UNIVERSITY OF MISSOURI-COLUMBIA · 2022 · $299,898

## Abstract

The overarching goal of this study is to use new, large multi-modal data resources and machine-learning-based
data mining algorithms to better understand risk factors and improve diagnosis for people with amyotrophic
lateral sclerosis (ALS). ALS is a rare, fatal neurodegenerative disorder; 90% of cases are sporadic, with no
genetic cause, and their contributing risk factors are largely unknown. Most
of what is known about ALS risk factors comes from epidemiological studies using registry data, which
historically forms the main standardized big data source to help describe the natural history, epidemiology, and
burden of disease; however, the strength of evidence resulting from these studies varies greatly. One potential
major limitation of registry data is that the fields collected are based upon known potential risk factors, which has
restricted its usability for exploring novel associations and causal relationships. Moreover, ALS is a rare disease with low
prevalence, making it infeasible to study its etiology using traditional observational study designs due to
statistical power constraints. The digitization of healthcare records and the capacity to link to other relevant
data sources now enable a more representative, enriched, and statistically powerful study population, one that is
ideal for leveraging machine-learning-driven, hypothesis-generating models to identify new risk factors and
patterns important for understanding, diagnosing, or treating people with ALS. Building
on an established, well-integrated real-world big data source and an established ensemble embedded feature
selection framework, a multi-marker (biomarker, clinical marker, geo-marker, socio-marker)
discovery algorithm will be developed to discover novel, generalizable risk factors (Aim 1), new symptomatic
patterns for early diagnosis (Aim 2), and effective clinical care pathways for ALS (Aim 3). To best translate
findings into clinical insights, a multi-disciplinary and multi-stakeholder team has been assembled, including not
only investigators with diverse expertise in statistics, machine learning, clinical research informatics, neurology,
computer science, and epidemiology, but also an engaged patient advisory board with diverse social backgrounds.
The proposed work will be one of the first pilot studies applying AI/ML-based, hypothesis-generating algorithms
to statistically powerful real-world data to bridge the knowledge gap on ALS risk factors. The work will not only
provide the CDC's Agency for Toxic Substances and Disease Registry (ATSDR) with empirical evidence to better
prioritize future decisions on expanding the ALS registry risk factor survey but also serve to inform better-designed
proposals for future etiological studies and targeted trials for ALS. This study will also provide an exemplar
framework that can be generalized to advance research in other rare and complex disease domains by
leveraging r...

## Key facts

- **NIH application ID:** 10610610
- **Project number:** 1R01TS000336-01
- **Recipient organization:** UNIVERSITY OF MISSOURI-COLUMBIA
- **Principal Investigator:** Xing Song
- **Activity code:** R01
- **Funding institute:** ALLCDC
- **Fiscal year:** 2022
- **Award amount:** $299,898
- **Award type:** 1
- **Project period:** 2022-09-30 → 2025-09-29

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10610610

## Citation

> US National Institutes of Health, RePORTER application 10610610, Towards Better Understanding of ALS using a Multi-Marker Discovery Approach from a Multi-Modal Database (ALS4M) (1R01TS000336-01). Retrieved via AI Analytics 2026-05-27 from https://api.ai-analytics.org/grant/nih/10610610. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
