# Accessing and Expanding Natural Products Chemical Diversity by Big-data Analysis and Biosynthetic Investigation

> **NIH R35** · UNIVERSITY OF SOUTH CAROLINA AT COLUMBIA · 2024 · $315,897

## Abstract

Natural products (NPs) have historically been a critical source of bioactive molecules, with NPs and their
derivatives making up over 50% of FDA-approved small-molecule drugs. In recent years, NP-based drug
discovery has faced a fundamental barrier in identifying new drugs due to repeated rediscovery of the same or
similar compounds, representing limited chemical diversity. Fortunately, since NPs have been evolving over
billions of years in trillions of vastly diverse environments, there is an abundance of new bioactive NPs encoded
in nature that may be useful as drugs. However, their accessibility is a problem: fewer than 10% of NP
biosynthetic gene clusters (BGCs) have been connected to existing NPs, leaving the vast majority of BGCs
untapped as to what NPs they may produce. The overall goal of this research program is to leverage big-data
informatic analysis and biosynthetic investigation to access and convert the tremendous genetic potential of
these “orphan BGCs”, BGCs with unknown products, into chemical reality, connecting them to their products and
in turn supplying structurally diverse pools of NPs for drug discovery screening. To this end, we propose two
research directions: (1) Utilizing our established big-data correlational networking analysis, we have identified
hidden proteases missing from the BGCs of almost all class III lanthipeptides. We previously used this method
to discover two new families of class III lanthipeptides from Firmicutes for the first time. We will leverage these
hidden proteases to further unlock the inherent chemical diversity of lanthipeptides and generate two libraries of
natural and non-natural peptides through in vitro enzymatic synthesis and targeted biosynthetic engineering for
drug discovery screening. (2) Mining the untapped microbial genetic potential, with an initial emphasis on
sulfur-containing NPs and unprecedented biosynthetic pathway hybridization, we have prioritized two promising orphan
BGCs with highly unique enzymology and connected them to their native products. The first features a novel
S-hydroxylating flavoprotein, potentially involved in the formation of a new sulfur-containing functionality. The
second has an unprecedented terpenoid-fatty acid-non-ribosomal peptide hybridization mediated by unusual
cross-pathway enzymatic combinations. We will further investigate the new biosynthesis harbored by these
BGCs to produce new NPs, inform future genome mining of similar pathways, and enable pathway engineering
to further increase NP chemical diversity. Our significant progress in both research directions supports the
feasibility of this proposal as well as our competence to establish a successful and sustainable independent
program in this field. We have fostered several key collaborations in bioactivity screening and protein structural
biology that further strengthen our research program. In addition, this program will provide opportunities to train
undergra...

## Key facts

- **NIH application ID:** 10907760
- **Project number:** 5R35GM150565-02
- **Recipient organization:** UNIVERSITY OF SOUTH CAROLINA AT COLUMBIA
- **Principal Investigator:** Jie Li
- **Activity code:** R35
- **Funding institute:** NIH
- **Fiscal year:** 2024
- **Award amount:** $315,897
- **Award type:** 5
- **Project period:** 2023-09-01 → 2028-07-31
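
Several of the fields above (award type, activity code, and the `-02` support year) are also encoded in the project number itself. The following is a minimal sketch of how `5R35GM150565-02` could be split into those parts, assuming the usual NIH layout of application type, activity code, two-letter institute code, six-digit serial number, and support year. The regular expression, the function name `parse_project_number`, and the dictionary keys are illustrative choices, not part of any official NIH tool, and the pattern is simplified and may not cover every project number variant.

```python
import re

# Simplified pattern for an NIH project number such as "5R35GM150565-02".
# Group names are illustrative labels, not official NIH field names.
PROJECT_NUMBER = re.compile(
    r"^(?P<application_type>\d)"        # e.g. 5 (listed above as "Award type")
    r"(?P<activity_code>[A-Z][A-Z0-9]{2})"  # e.g. R35
    r"(?P<institute_code>[A-Z]{2})"     # e.g. GM
    r"(?P<serial_number>\d{6})"         # e.g. 150565
    r"-(?P<support_year>\d{2})"         # e.g. 02
    r"(?P<suffix>[A-Z0-9]*)$"           # optional trailing suffix, often empty
)


def parse_project_number(project_number: str) -> dict:
    """Split a project number into its labeled parts, or raise ValueError."""
    match = PROJECT_NUMBER.match(project_number.strip())
    if match is None:
        raise ValueError(f"Unrecognized project number: {project_number!r}")
    return match.groupdict()


fields = parse_project_number("5R35GM150565-02")
for name, value in fields.items():
    print(f"{name}: {value}")
```

For this record, the parsed application type and activity code should agree with the "Award type" and "Activity code" entries listed above.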

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10907760

## Citation

> US National Institutes of Health, RePORTER application 10907760, Accessing and Expanding Natural Products Chemical Diversity by Big-data Analysis and Biosynthetic Investigation (5R35GM150565-02). Retrieved via AI Analytics 2026-05-24 from https://api.ai-analytics.org/grant/nih/10907760. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
