Distributional Reinforcement Learning for Risk-Sensitive Sequential Decision Making: New Theory and Methods

NSF Award Search · 01002627DB NSF RESEARCH & RELATED ACTIVIT · $299,601

Abstract

Many critical online decision systems, including clinical support, financial risk management, and autonomous technologies, must look beyond average performance to avoid rare but catastrophic "tail events." Traditional reinforcement learning often summarizes future outcomes as a single expected value, which masks significant risks and uncertainty. This research addresses this limitation by developing distributional reinforcement learning methods that learn the full range of possible outcomes to support safer, risk-aware, and privacy-preserving decision-making. By improving the trustworthiness of systems in health, finance, and operations, this work strengthens the intersection of machine learning, artificial intelligence, and statistics while promoting the responsible use of sensitive individual data. Additionally, the project supports education by training students at the intersection of statistics, machine learning, optimization, and responsible artificial intelligence.

The research focuses on quantile temporal difference learning, a scalable model-free method for estimating return quantiles from observed transitions. First, the project will establish finite-time guarantees for quantile temporal difference learning in both synchronous settings and asynchronous settings with Markovian data, including bounds for quantile estimation error and for the accuracy of the estimated return distribution. Second, the project will develop statistical inference methods for distribution
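For readers unfamiliar with quantile temporal difference learning, the following is a minimal tabular sketch of the update as it commonly appears in the distributional reinforcement learning literature. It is not taken from the award and may differ from the variants the project will analyze. The two-state chain, the Gaussian reward, the midpoint quantile levels, and the names `qtd_update` and `run_qtd_demo` are all illustrative assumptions.

```python
import numpy as np


def qtd_update(theta, s, r, s_next, gamma, alpha, done):
    """Apply one tabular QTD update to the quantile estimates of state s.

    theta has shape (num_states, m); row s holds m quantile estimates of the
    return from s at levels tau_i = (2i + 1) / (2m), i = 0..m-1 (an assumed,
    commonly used choice of levels).
    """
    m = theta.shape[1]
    taus = (2 * np.arange(m) + 1) / (2 * m)
    # Bootstrapped targets: one sample per quantile estimate of the next state.
    if done:
        targets = np.full(m, r, dtype=float)
    else:
        targets = r + gamma * theta[s_next]
    # below[i, j] = 1 if target j falls below the current estimate theta[s, i].
    below = (targets[None, :] < theta[s][:, None]).astype(float)
    # Quantile-regression step, averaged over the m targets.
    theta[s] += alpha * (taus - below.mean(axis=1))
    return theta


def run_qtd_demo(num_episodes=20000, m=5, gamma=0.9, alpha=0.01, seed=0):
    """Run QTD on a hypothetical chain: state 0 -> state 1 -> terminal.

    The step 0 -> 1 pays reward 0; the step out of state 1 pays a
    standard-normal reward, so the return from state 1 is N(0, 1) and the
    return from state 0 is gamma times that.
    """
    rng = np.random.default_rng(seed)
    theta = np.zeros((2, m))
    for _ in range(num_episodes):
        # Updates follow the trajectory, one visited state at a time.
        qtd_update(theta, 0, 0.0, 1, gamma, alpha, done=False)
        qtd_update(theta, 1, rng.normal(), None, gamma, alpha, done=True)
    return theta


if __name__ == "__main__":
    estimates = run_qtd_demo()
    print("state 0 quantile estimates:", np.round(estimates[0], 2))
    print("state 1 quantile estimates:", np.round(estimates[1], 2))
```

Each row of the returned array approximates the return distribution of one state by a small set of quantiles, so tail behavior can be read off directly instead of being averaged away. With a constant step size the estimates keep fluctuating around their targets; the step-size schedule and the number of quantiles here are arbitrary choices for illustration.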

Key facts

NSF award ID: 2610563
Awardee: University of Miami (FL)
SAM.gov UEI: RQMFJGDTQ5V3
PI: Lan Wang
Primary program: 01002627DB NSF RESEARCH & RELATED ACTIVIT
All programs: Artificial Intelligence (AI), Machine Learning Theory
Estimated total: $299,601
Funds obligated: $299,601
Transaction type: Standard Grant
Period: 07/01/2026 → 06/30/2029