Data to Design: An Integrated Approach to Developing New Synthetic Methods

NIH RePORTER · NIH · K99 · $122,904

Abstract

Project Summary

Machine learning (ML), i.e., the use of computer algorithms that can automatically learn from sampled data and make predictions or decisions without explicit programming, is increasingly important in a wide array of applications, from image and speech recognition to product recommendation systems. Meanwhile, synthetic chemistry plays a central role in the development of medicines, agrochemicals, fine chemicals, and new materials, but the field has traditionally shown a strong aversion to adopting ML tools. A fundamental challenge in synthetic chemistry is to expedite access to high-value building blocks in a predictable and efficient manner to accelerate discovery programs. However, the development and optimization of new synthetic methodologies have traditionally relied on empirical methods. This trial-and-error approach wastes crucial time and resources, limits the likelihood of unexpected discoveries, and fails to identify reactivity cliffs or rationalize the role of additives. The goal of this proposed project is to integrate ML with synthetic chemistry to provide solutions to these longstanding challenges, particularly in the contexts of medicinal chemistry library preparation, process optimization, and rapid assembly of chiral bioactive structures.

The two aims of this career development application are:

(a) Mentored phase (K99): My short-term goal is to learn ML and data science tools while developing ML workflows that reduce the number of experiments needed to obtain the desired outcome of any chemical reaction (i.e., optimization). This will be realized by undertaking three distinct types of optimization campaign, in the form of three case studies (A1, A2, and A3) that reflect those typically encountered in chemistry settings.

(b) Independent phase (R00): Armed with a better understanding of ML and data science, my long-term goal is to facilitate the design and discovery of robust new asymmetric methods. This will be achieved by engaging in three different case studies (B1, B2, and B3) where stereoselectivity is currently poor or nonexistent. These projects will enable me to create my own niche in catalytic research.

Integration of my established expertise (asymmetric synthesis and computational chemistry) with that of the host lab (ML, data science, and photoredox catalysis), together with enabling technologies from Merck and Genentech (high-throughput experimentation, HTE), will collectively confer the capability to accomplish these overall goals. The excellent facilities of UCLA will be augmented by close industry collaboration and the active support of the C-CAS consortium. Overall, through this fellowship, I will gain critical mentored training in both academic and industry settings, build new professional skills, and achieve distinctive academic independence in biomedical research.

Key facts

NIH application ID: 10887206
Project number: 1K99GM151453-01A1
Recipient: UNIVERSITY OF CALIFORNIA LOS ANGELES
Principal Investigator: Rajat Maji
Activity code: K99
Funding institute: NIH
Fiscal year: 2024
Award amount: $122,904
Award type: 1
Project period: 2024-07-01 → 2026-06-30