# Automated Molecular Identity Disambiguator (AutoMID)

> **NIH R01** · UNIVERSITY OF MIAMI SCHOOL OF MEDICINE · 2022 · $279,970

## Abstract

Small molecules are one of the most important classes of therapeutics, alleviating suffering and in many cases
preventing death for hundreds of millions of people worldwide. Small molecules also serve as invaluable tools to study
biology, often with the goal of validating novel targets for the development of future therapeutic drugs.
Reproducibility of experimental results and the interoperability and reusability of resulting datasets depend on
accurate descriptions of associated research objects, and most critically on correct representations of small
molecules that are tested in biological assays. For example, it is not possible to develop predictive models of
interactions between protein targets and small molecules if the chemical structure representations are incorrect. Many
factors contribute to errors in reported chemical structures in small molecule screening and omics reference
databases, scientific publications, and many other web-based resources and documents. Because of the
complexity of representing small-molecule chemical structure graphs and the lack of thorough curation, errors
are frequently introduced by non-experts, and error propagation across different digital research assets is a
pervasive problem. To address this problem with a scalable approach, we propose the Automated
Molecular Identity Disambiguator (AutoMID). AutoMID will be usable in batch mode at scale via an API, for
example to assist chemical structure standardization and registration by maintainers of digital research assets,
and also via an interactive (UI) mode for everyday researchers to quickly and easily validate or correct their small
molecule representations. AutoMID will leverage extensive, highly standardized linked databases of chemical
structures and associated information, including names, synonyms, biological activity, physical properties,
and their sources and provenance, and will apply expert rules and AI to enable reliable disambiguation of chemical
structure identities at scale.

## Key facts

- **NIH application ID:** 10357906
- **Project number:** 5R01LM013391-03
- **Recipient organization:** UNIVERSITY OF MIAMI SCHOOL OF MEDICINE
- **Principal Investigator:** BARRY A BUNIN
- **Activity code:** R01
- **Funding institute:** NIH
- **Fiscal year:** 2022
- **Award amount:** $279,970
- **Award type:** 5 (non-competing continuation)
- **Project period:** 2020-05-01 → 2024-02-29

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10357906

## Citation

> US National Institutes of Health, RePORTER application 10357906, Automated Molecular Identity Disambiguator (AutoMID) (5R01LM013391-03). Retrieved via AI Analytics 2026-05-23 from https://api.ai-analytics.org/grant/nih/10357906. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
