# Advanced Computational Approaches for NMR Data-mining

> **NIH R01** · BAYLOR COLLEGE OF MEDICINE · 2020 · $356,625

## Abstract

Nuclear magnetic resonance spectroscopy (NMR)-based metabolomics is a powerful method for identifying
metabolic perturbations that report on different biological states and sample types. Compared to mass
spectrometry, NMR provides robust and highly reproducible quantitative data in a matter of minutes, which
makes it very suitable for first-line clinical diagnostics. Although the metabolome is known to provide an
instantaneous snapshot of the biological status of a cell, tissue, or organism, the use of NMR in clinical
practice is hindered by cumbersome data analysis. Major challenges include high dimensionality of the data,
overlapping signals, variability of resonance frequencies (chemical shift), non-ideal signal shapes, and low
signal-to-noise ratio (SNR) for low-concentration metabolites. Existing approaches fail to address these
challenges, and sample analysis remains time-consuming, manual, and dependent on considerable knowledge of
NMR spectroscopy. Recent developments in sparse methods for machine learning and accelerated
convex optimization for high-dimensional problems, as well as kernel-based spatial clustering, show promise for
enabling us to overcome these challenges and achieve fully automated, operator-independent analysis. We
are developing two novel, powerful, and automated algorithms that capitalize on these recent developments in
machine learning. In Aim 1, we describe ‘NMRQuant’ for automated identification and quantification of
annotated metabolites irrespective of chemical shift variability, low SNR, and signal shape variability. In Aim 2, we
describe ‘SPA-STOCSY’ for automated de novo identification of molecular fragments of unknown,
non-annotated metabolites. Based on substantial preliminary data, we propose to evaluate these algorithms'
sensitivity, specificity, stability, and resistance to noise on phantom, biological, and clinical samples, comparing
them to current methods. We will validate the accuracy of the analyses by experimental 2D NMR, spike-in experiments, and
mass spectrometry. The proposed efforts will produce new NMR analytical software for discovery of both
annotated and non-annotated metabolites, substantially improving accuracy and reproducibility of NMR
analysis. Such analytical ability would change the existing paradigm of NMR-based metabolomics and provide
an even stronger complement to current mass spectrometry-based methods. This approach, once thoroughly
validated, will enable NMR to reach a wide range of applications in biomedical, pharmaceutical, and nutritional
research and clinical medicine.

## Key facts

- **NIH application ID:** 9889134
- **Project number:** 5R01GM120033-04
- **Recipient organization:** BAYLOR COLLEGE OF MEDICINE
- **Principal Investigator:** Zhandong Liu
- **Activity code:** R01
- **Funding institute:** NIH
- **Fiscal year:** 2020
- **Award amount:** $356,625
- **Award type:** 5
- **Project period:** 2017-01-01 → 2021-12-31

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/9889134

## Citation

> US National Institutes of Health, RePORTER application 9889134, Advanced Computational Approaches for NMR Data-mining (5R01GM120033-04). Retrieved via AI Analytics 2026-05-22 from https://api.ai-analytics.org/grant/nih/9889134. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
