# Computational Core

> **NIH U2C** · UNIVERSITY OF GEORGIA · 2021 · $324,755

## Abstract

Overall: Our project combines the significant advantages of a genetic model organism, sophisticated pathway
mapping tools, high-throughput and accurate quantum chemistry (QM), and state-of-the-art experimental
measurements. The result will be an efficient and cost-effective approach for unknown compound identification
in metabolomics, which is one of the major limitations facing this growing field of medical science.
Caenorhabditis elegans has several advantages for this study, including over 10,000 available genetic
mutants, well-developed CRISPR/Cas9 technology, and a panel of over 500 wild C. elegans isolates with
complete genomes. Half of C. elegans genes have homologs to human disease genes, making this model
organism an outstanding choice to improve our understanding of metabolic pathways in human disease. We
will develop an automated pipeline for sample preparation to reproducibly measure tens of thousands of
unknown features by UHPLC-MS/MS. We will use the wild isolates to conduct metabolome-wide genetic
association studies (m-GWAS), and SEM-path to locate unknowns in pathways using partial correlations. The
relevance of the unknown metabolites to specific pathways will be tested by measuring UHPLC-MS/MS data
from genetic mutants of those pathways. Molecular formula and pathway information will be the inputs for
automated quantum mechanical calculations of all possible structures, which will be used to accurately
calculate NMR chemical shifts that will be matched to experimental data. The correct structures will be
validated by comparing them with 2D NMR data of the same compound. The validated computed structures
will then be used to improve QM-based MS/MS fragment prediction, using the experimental UHPLC-MS/MS
data.
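
The partial-correlation idea mentioned above can be illustrated with a minimal sketch. It is not the project's SEM-path method, which the abstract does not describe in detail; it swaps in the textbook first-order partial correlation on synthetic data. The names `pearson`, `partial_corr`, `known`, `intermediate`, and `unknown` are hypothetical. In a chain known → intermediate → unknown, the two ends correlate strongly, but that association nearly vanishes once the intermediate is conditioned on, which is what lets an unknown feature be placed next to its direct neighbor in a pathway.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(x, y, z):
    """First-order partial correlation of x and y, conditioning on z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Synthetic abundances (assumption, not project data): a linear chain
# known -> intermediate -> unknown with independent Gaussian noise.
rng = random.Random(0)
n_samples = 2000
known = [rng.gauss(0, 1) for _ in range(n_samples)]
intermediate = [k + 0.5 * rng.gauss(0, 1) for k in known]
unknown = [i + 0.5 * rng.gauss(0, 1) for i in intermediate]

marginal = pearson(known, unknown)                     # strong, but indirect
direct = partial_corr(known, unknown, intermediate)    # close to zero
neighbor = partial_corr(intermediate, unknown, known)  # remains clearly nonzero
```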

The Computational Core (CC) will have two primary components, metabolite pathway mapping and quantum
chemical calculations of NMR and MS/MS data. The pathway mapping interfaces with the Experimental Core
in the generation of m-GWAS results from wild isolates and LC-MS/MS analysis. These genetic associations
will relate known metabolites to known genes. These pathways will be expanded by locating unknown features
through partial correlations, which will significantly reduce the chemical space available to the unknowns. QM
calculations will use this pathway information to limit the number of possible structures for a given molecular
formula, which will be obtained by the Experimental Core. The output of the QM calculations will be accurate
NMR chemical shifts on data from the same chromatographic retention times as the LC-MS/MS of the
unknown, allowing us to find the best computed structure. We also will improve computational MS/MS
predictions. All of the experimental and computational data will be added to a relational database, which will
allow us to search any field (e.g., retention time windows or m/z values). The CC will provide robust
computing infrastructure at two sites, shared notebooks for analysis, and deposition...
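
The kind of field search described above (retention time windows, m/z values) might look like the following sketch. The abstract does not specify a schema or database engine, so everything here is an assumption: SQLite stands in for the relational database, the `features` table and its columns `feature_id`, `mz`, and `rt_min` are hypothetical, and the three rows are synthetic placeholders rather than measured values.

```python
import sqlite3

# In-memory database with a hypothetical feature table (assumed schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE features ("
    "feature_id TEXT PRIMARY KEY, mz REAL NOT NULL, rt_min REAL NOT NULL)"
)
# Synthetic placeholder rows: (feature id, m/z, retention time in minutes).
conn.executemany(
    "INSERT INTO features VALUES (?, ?, ?)",
    [
        ("F001", 180.0634, 2.15),
        ("F002", 180.0641, 7.80),
        ("F003", 255.2330, 2.20),
    ],
)
# A composite index keeps window queries efficient as the table grows.
conn.execute("CREATE INDEX idx_features_mz_rt ON features (mz, rt_min)")


def find_features(conn, mz, mz_tol, rt_lo, rt_hi):
    """Return feature ids within mz +/- mz_tol and rt_lo <= rt_min <= rt_hi."""
    rows = conn.execute(
        "SELECT feature_id FROM features "
        "WHERE mz BETWEEN ? AND ? AND rt_min BETWEEN ? AND ? "
        "ORDER BY feature_id",
        (mz - mz_tol, mz + mz_tol, rt_lo, rt_hi),
    ).fetchall()
    return [row[0] for row in rows]


# Features near m/z 180.0634 that elute between 2.0 and 2.5 minutes.
hits = find_features(conn, mz=180.0634, mz_tol=0.005, rt_lo=2.0, rt_hi=2.5)
```

Parameterized queries (`?` placeholders) are used so that search values are never interpolated into the SQL text.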

## Key facts

- **NIH application ID:** 10180968
- **Project number:** 5U2CES030167-04
- **Recipient organization:** UNIVERSITY OF GEORGIA
- **Principal Investigator:** Lauren M. McIntyre
- **Activity code:** U2C
- **Funding institute:** NIH
- **Fiscal year:** 2021
- **Award amount:** $324,755
- **Award type:** 5
- **Project period:** 2018-09-01 → 2023-06-30

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10180968

## Citation

> US National Institutes of Health, RePORTER application 10180968, Computational Core (5U2CES030167-04). Retrieved via AI Analytics 2026-05-22 from https://api.ai-analytics.org/grant/nih/10180968. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*