# GPU-accelerated high-performance computing to supercharge foundational deep learning method development for scalable and accurate prediction of protein structures

> **NIH R35** · VIRGINIA POLYTECHNIC INST AND ST UNIV · 2024 · $240,509

## Abstract

This supplement aims to acquire a Dell high-performance computing (HPC) server with 8 NVIDIA H100
Graphics Processing Units (GPUs) to supercharge foundational deep learning method development for
scalable and accurate prediction of protein structures, paving the way to genomic-scale computational protein
modeling regardless of evolutionary relationships with previously annotated proteins. Artificial
intelligence-powered methods have led to a paradigm shift in computational modeling of protein structures,
yet even the most successful approaches for protein structure prediction fail to accurately predict structures
of large multi-domain proteins with complex topologies or proteins with short sequences, and they depend
heavily on the availability of evolutionary information, which is not always abundant, as with orphan proteins
or rapidly evolving proteins. Structure prediction methods that use a single or only a few homologous
sequences remain inaccurate and/or inefficient, which limits scaling to genomic protein databases. Recent
advances in artificial intelligence, such as foundational deep learning models, hold the key to addressing
these limitations. The parent R35
grant of this supplement aims to develop cutting-edge deep learning models to automate genomic-scale
protein structure modeling with the key tasks of: (1) accurate de novo modeling of protein structures beyond
evolutionary relatedness, even with single-sequence input; (2) high-fidelity identification of remotely
homologous proteins despite low sequence similarity to previously annotated proteins; and (3) atomistic
refinement of predicted protein structures to drive them towards experimental resolution in terms of
stereochemical qualities and side-chain positioning. Our substantial progress in the first three years of the
project has demonstrated the feasibility and promise of our approach. However, training and testing
foundational deep learning models that leverage transformer neural network architectures on
evolutionary-scale molecular data require a large amount of GPU computing power. Using the current GPU resources
available to us, it takes six months for a developer to complete the training and testing of one deep learning
method end to end. While such a speed can yield steady progress, it is not fast enough to unleash the power
of these advanced deep learning methods and realize the full potential and impact of the parent R35 project.
This supplement will enable us to acquire a high-performance computing server consisting of 8 NVIDIA H100
80GB GPUs to significantly speed up the research in the parent R35 project. The requested GPUs can
drastically reduce the time to complete the development of a deep learning method from about six months to less
than six weeks, thus dramatically improving the productivity of the developers and in turn accelerating
publication and dissemination of the methods and tools developed in this project. The large shared GPU
memory will enable us to ...

## Key facts

- **NIH application ID:** 11036862
- **Project number:** 3R35GM138146-05S1
- **Recipient organization:** VIRGINIA POLYTECHNIC INST AND ST UNIV
- **Principal Investigator:** Debswapna Bhattacharya
- **Activity code:** R35
- **Funding institute:** NIH
- **Fiscal year:** 2024
- **Award amount:** $240,509
- **Award type:** 3
- **Project period:** 2020-09-15 → 2025-07-31

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/11036862

## Citation

> US National Institutes of Health, RePORTER application 11036862, GPU-accelerated high-performance computing to supercharge foundational deep learning method development for scalable and accurate prediction of protein structures (3R35GM138146-05S1). Retrieved via AI Analytics 2026-05-24 from https://api.ai-analytics.org/grant/nih/11036862. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
