# The role of distributional reinforcement learning in human neurons during impulsive choices

> **NIH R01** · UTAH STATE HIGHER EDUCATION SYSTEM--UNIVERSITY OF UTAH · 2022 · $509,811

## Abstract

Recent developments in artificial intelligence and neuroscience have revealed neural codes for reinforcement
that represent predictions of a range of possible future reward outcomes, rather than a singular expected value.
This distributional reinforcement learning has enabled improved performance of artificial agents and has
straightforward implications for numerous neuropsychiatric disorders, particularly impulse control and substance
use disorders. This proposal aims to leverage our experience recording neuronal activity from the brains of
human neurosurgical patients in order to translate these recordings in a novel research direction: to understand
the mechanisms of human choice behavior. We will determine where distributional codes exist in the human
prefrontal and mesial temporal cortices, and how those codes are expressed dynamically in time as humans
make impulsive choices during the Balloon Analog Risk Task (BART) and a probabilistic reversal learning task.
The results of these experiments will both have important basic scientific implications and begin to address
how distributional reinforcement learning in the human brain contributes to impulsive choices.

In order to begin translating this new area of knowledge to understand the underpinnings of human decisions,
we will first establish the presence of distributional reinforcement learning in four brain areas that comprise a
human decision-making circuit: Orbitofrontal Cortex, Anterior Cingulate Cortex, Amygdala, and Hippocampus.
Specific Aim 1 will test the three essential predictions of distributional RL: whether populations of neurons in
each of these brain areas exhibit 1) asymmetric scaling of reward prediction errors, 2) diverse reversal points,
and 3) a correlation between prediction error asymmetries and reversal points across neurons. Specific Aim 2 seeks to
decode BART reward prediction distributions from neurons in the aforementioned brain areas and determine
how changes in BART reward distributions correlate with the propensity to make impulsive choices. Specific
Aim 3 will test how diversity in optimism and pessimism in each neuron recorded from the aforementioned brain
areas correlates with valuation or devaluation across trials.

The completion of these aims will constitute important basic research findings in discovering distributional RL in
the human prefrontal and mesial temporal cortices. By uncovering neural population codes that underlie
potentially impulsive choices in human decision-making circuits, these experiments also address fundamental
neural mechanisms underlying impulsive choices. This issue is central to addressing important problems for
contemporary mental health including substance use disorder and many other neuropsychiatric disorders.
These findings will have readily translatable implications for improving targeted electrical therapies for psychiatric
disorders.
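
The three predictions named in Specific Aim 1 can be illustrated with a small simulation. The sketch below is not the project's analysis method; it assumes a simple expectile-style learner in which each simulated unit scales positive and negative reward prediction errors differently. The reward values, the asymmetry values in `taus`, the learning rate, and the function names are all hypothetical choices made for illustration.

```python
import random


def learn_reversal_points(rewards, taus, base_lr=0.01):
    """Train one value predictor per asymmetry tau with asymmetric TD updates."""
    values = [0.0] * len(taus)
    for reward in rewards:
        for i, tau in enumerate(taus):
            delta = reward - values[i]  # reward prediction error
            # Prediction 1: positive and negative errors are scaled differently.
            scale = tau if delta > 0 else 1.0 - tau
            values[i] += base_lr * scale * delta
    # Each learned value is the reward level at which that unit's error changes sign.
    return values


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


rng = random.Random(0)  # fixed seed so the sketch is repeatable
rewards = [rng.choice([1.0, 5.0, 10.0]) for _ in range(20000)]  # hypothetical rewards
taus = [0.1, 0.3, 0.5, 0.7, 0.9]  # hypothetical pessimistic-to-optimistic units

# Prediction 2: units with different asymmetries settle on different reversal points.
reversal_points = learn_reversal_points(rewards, taus)
# Prediction 3: asymmetry and reversal point correlate across units.
correlation = pearson(taus, reversal_points)

print("reversal points:", [round(v, 2) for v in reversal_points])
print("asymmetry vs. reversal point correlation:", round(correlation, 3))
```

In this toy setting, units that weight positive errors more heavily settle on higher reversal points, so the population as a whole spans the reward distribution rather than collapsing onto its mean.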

## Key facts

- **NIH application ID:** 10335061
- **Project number:** 1R01MH128187-01
- **Recipient organization:** UTAH STATE HIGHER EDUCATION SYSTEM--UNIVERSITY OF UTAH
- **Principal Investigator:** Elliot H Smith
- **Activity code:** R01
- **Funding institute:** NIH
- **Fiscal year:** 2022
- **Award amount:** $509,811
- **Award type:** 1
- **Project period:** 2022-02-03 → 2026-12-31

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10335061

## Citation

> US National Institutes of Health, RePORTER application 10335061, The role of distributional reinforcement learning in human neurons during impulsive choices (1R01MH128187-01). Retrieved via AI Analytics 2026-05-27 from https://api.ai-analytics.org/grant/nih/10335061. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*