ABSTRACT Computer-aided drug discovery (CADD) can dramatically accelerate the often long and expensive process of drug development and lower its cost. However, most CADD techniques were created for small organic molecules and tend to be less accurate for drugs that are designed around biological scaffolds or that have unique chemical properties. The overall goal of the Walker lab is to develop and apply multiscale models for the rational design of complex biomolecules. In this proposal, we detail our strategies to (1) automate the creation of large, high-quality datasets from accurate simulations and (2) develop and apply machine learning (ML) models to design new nucleic acid-based imaging agents and carbohydrate-based drugs. Carbohydrates, particularly polysaccharides, are highly flexible, and our previous work has demonstrated that even very high-quality docking scores are inaccurate when compared with experimental affinities. Similarly, the rational design of synthetic fluorescent nucleobases (SFNs) is challenging because understanding their photophysics requires computing excited-state properties. In both cases, we hypothesize that by creating ML models that learn the statistical correlation between highly accurate simulations and known experimental properties, we can both learn new rational design principles and predict novel drug targets.