CAREER: A Safety-Aware Learning Framework for Identifying and Mitigating Risks in Human-LLM Interactions in Healthcare

NSF Award Search · 01003031DB NSF RESEARCH & RELATED ACTIVIT · $499,821

Abstract

Large language models are increasingly used in healthcare applications such as virtual assistants and decision support tools, offering new opportunities to improve access to care and patient outcomes. However, these systems can introduce new kinds of risks that arise not from the model alone, but from how people interact with it. For example, patients may rely too heavily on automated advice, receive responses shaped by harmful preconceptions, or be unintentionally influenced toward unsafe decisions. These risks are especially concerning in sensitive settings such as mental health and addiction recovery, where errors can have serious consequences. This project addresses these challenges by developing new methods to make interactions between people and artificial intelligence systems safer and more trustworthy. The work aims to improve the reliability of healthcare technologies, support safer patient experiences, and contribute to the broader goal of responsible artificial intelligence. Educational activities include developing interdisciplinary coursework and engaging students from diverse backgrounds in research at the intersection of artificial intelligence and health.

This project develops a unified, safety-aware learning framework for identifying and mitigating risks in human-large language model interactions in healthcare. The research investigates three integrated thrusts. First, it develops predictive models to detect fundamental and emerging interaction risks, such as overreliance, stereotyping, manipulation, and privacy violations, using supervised and contrastive learning techniques with interpretable outputs. Second, it introduces robust learning methods to mitigate these risks by incorporating user intent, clinical context, and interaction dynamics, including adversarial training and personalized reinforcement learning algorithms. Third, it designs an adaptive, closed-loop method that jointly optimizes risk identification and mitigation through self-su

Key facts

NSF award ID: 2541748
Awardee: University of California-Santa Cruz (CA)
SAM.gov UEI: VXUFPE4MCZH5
PI: Chenguang Wang
Primary program: 01003031DB NSF RESEARCH & RELATED ACTIVIT
All programs: Artificial Intelligence (AI), CAREER-Faculty Early Career Dev
Estimated total: $499,821
Funds obligated: $349,875
Transaction type: Continuing Grant
Period: 10/01/2026 → 09/30/2031