OAC Core: RINAS: Data I/O CyberInfrastructure for Extreme-scale Foundation Model and Generative AI Training on HPC


Abstract

The RINAS project researches techniques that enable modern AI to be trained using fewer computing resources. Large Language Models (LLMs) have significant computing resource demands, and all modern AI applications across different areas and with various data types will exhibit similar demands on the underlying cyberinfrastructure. AI training is unusually sensitive to the performance of the data center disks on which training data is stored. The datasets are too large to hold in memory, and training runs over many iterative epochs, each of which must read and process the full dataset. The project will use and extend an open-source data formatting and compression system developed by the Apache Foundation. Furthermore, it will utilize traditional optimizations, such as parallel computing (where multiple jobs run simultaneously), and modifications to the AI models that accelerate training without significantly compromising model accuracy. AI itself will be used to further enhance the performance of these jobs. The new software will be deployed on supercomputers run by the National Science Foundation (NSF) and the Department of Energy. It will allow the development of new AI models that will benefit the nation across both industry and academia. The project will collaborate with the AI development and use community through three alliances, each having over 100 organizational members from industry, academia,

Key facts

NSF award ID: 2504401
Awardee: University of Virginia Main Campus (VA)
SAM.gov UEI: JJG6HU8PA4S5
PI: Geoffrey C Fox
Primary program: 01002526DB NSF RESEARCH & RELATED ACTIVIT
All programs: SMALL PROJECT
Estimated total: $599,091
Funds obligated: $599,091
Transaction type: Standard Grant
Period: 09/01/2025 → 08/31/2028