# Digital representation of chemical mixtures to aid drug discovery and formulation

> **NIH R44** · COLLABORATIVE DRUG DISCOVERY, INC. · 2021 · $750,747

## Abstract

Collaborative Drug Discovery, Inc. (CDD) proposes to develop a suite of software modules to enable scientists to unambiguously represent chemical mixtures in standard machine-readable formats, filling an urgent and widely recognized need. Chemicals are typically formulated as mixtures. Recording and communicating information about chemical mixtures is essential for scientists and support staff in the pharmaceutical industry, in academia, in non-profit research organizations, in government, at specialty chemical vendors, and at commercial manufacturers to:

- discover, develop, formulate, manufacture and regulate drugs;
- manage reagent inventories; comply with laboratory safety requirements; inform first responders;
- describe and reproduce biomedical experiments; and
- assess and disseminate information about toxicity risks of chemical reagents and consumer products.

A working committee of the International Union of Pure and Applied Chemistry (IUPAC) is close to formalizing “Mixtures InChI” (or MInChI), which will extend the International Chemical Identifier (InChI) to become the first standard to encompass mixtures. MInChI will effectively index mixtures in the same way that InChI indexes individual compounds.

In Phase 1, CDD developed the data structures and software necessary to enable adoption and utilization of MInChI and create the first general-purpose system for recording information about chemical mixtures that is computable and interoperable. In Phase 2, CDD will continue to develop a sophisticated automated translation tool that will accurately convert legacy catalogs of chemical mixtures from plaintext descriptions or ad hoc formats so that they are properly represented in a machine-readable format that can in turn be easily rendered into MInChI identifiers. The broad vision is to help industry overcome the barriers to adoption so that machine-readable mixture descriptions can quickly deliver benefits for drug discovery, chemical safety, and toxicology.

## Key facts

- **NIH application ID:** 10074602
- **Project number:** 5R44TR002528-03
- **Recipient organization:** COLLABORATIVE DRUG DISCOVERY, INC.
- **Principal Investigator:** BARRY A BUNIN
- **Activity code:** R44 (SBIR Phase II)
- **Funding institute:** NIH
- **Fiscal year:** 2021
- **Award amount:** $750,747
- **Award type:** 5 (non-competing continuation)
- **Project period:** 2019-12-19 → 2021-11-30

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10074602

## Citation

> US National Institutes of Health, RePORTER application 10074602, Digital representation of chemical mixtures to aid drug discovery and formulation (5R44TR002528-03). Retrieved via AI Analytics 2026-06-14 from https://api.ai-analytics.org/grant/nih/10074602. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
