Adaptive Inference by Stabilized Cross-Validation

NSF Award Search · 01002526DB NSF RESEARCH & RELATED ACTIVIT · $250,000

Abstract

Modern data analysis and statistical learning are characterized by two defining features: complex data structures and black-box algorithms. The complexity of data structures arises from advanced data collection technologies and data-sharing infrastructures, such as imaging, remote sensing, wearable devices, and genomic sequencing. In parallel, black-box algorithms, particularly those stemming from advances in deep neural networks, have demonstrated remarkable success on modern datasets. This confluence of complex data and opaque models introduces new challenges for uncertainty quantification and statistical inference, a problem we refer to as "black-box inference". This research project aims to develop flexible, valid inference procedures for modern complex data that harness the strengths of black-box machine learning algorithms. These contributions have potential applications in areas such as policy evaluation, model selection, treatment effect identification, and algorithmic fairness auditing. A central focus of the project is the development of novel variants of a classical statistical tool: cross-validation, repurposed to enable adaptive inference in conjunction with powerful black-box models. Although cross-validation is widely used for evaluating estimator performance, its theoretical foundations remain limited, particularly in the context of complex data and modern algorithms. This research will begin with a multi-population comparison problem, using a stabilized

Key facts

NSF award ID: 2515687
Awardee: Carnegie Mellon University (PA)
SAM.gov UEI: U3NKNFLNQ613
PI: Jing Lei
Primary program: 01002526DB NSF RESEARCH & RELATED ACTIVIT
All programs: Artificial Intelligence (AI), Machine Learning Theory
Estimated total: $250,000
Funds obligated: $250,000
Transaction type: Standard Grant
Period: 09/01/2025 → 08/31/2028