The role of distributional reinforcement learning in human neurons during impulsive choices

NIH RePORTER · NIH · R01 · $509,811

Abstract

Recent developments in artificial intelligence and neuroscience have revealed neural codes for reinforcement that represent predictions of a range of possible future reward outcomes, rather than a single expected value. This distributional reinforcement learning (RL) has enabled improved performance of artificial agents and has straightforward implications for numerous neuropsychiatric disorders, particularly impulse control and substance use disorders. This proposal aims to leverage our experience recording neuronal activity from the brains of human neurosurgical patients to take these recordings in a novel research direction: understanding the mechanisms of human choice behavior. We will determine where distributional codes exist in the human prefrontal and mesial temporal cortices, and how those codes are expressed dynamically in time as humans make impulsive choices during the Balloon Analog Risk Task (BART) and a probabilistic reversal learning task. The results of these experiments will have important basic scientific implications and will begin to address how distributional reinforcement learning in the human brain contributes to impulsive choices. To begin translating this new area of knowledge into an understanding of the underpinnings of human decisions, we will first establish the presence of distributional reinforcement learning in four brain areas that comprise a human decision-making circuit: Orbitofrontal Cortex, Anterior Cingulate Cortex, Amygdala, and Hippocampus. Specific Aim 1 will test the three essential predictions of distributional RL: whether populations of neurons in each of these brain areas exhibit 1) asymmetric scaling of reward prediction errors, 2) diverse reversal points, and 3) a correlation between prediction error asymmetries and reversal points across neurons.
Specific Aim 2 seeks to decode BART reward prediction distributions from neurons in the aforementioned brain areas and determine how changes in BART reward distributions correlate with the propensity to make impulsive choices. Specific Aim 3 will test how diversity in optimism and pessimism in each neuron recorded from the aforementioned brain areas correlates with valuation or devaluation across trials. The completion of these aims will constitute important basic research findings in discovering distributional RL in the human prefrontal and mesial temporal cortices. By uncovering neural population codes that underlie potentially impulsive choices in human decision-making circuits, these experiments also address fundamental neural mechanisms underlying impulsive choices. This issue is central to addressing important problems in contemporary mental health, including substance use disorder and many other neuropsychiatric disorders. These findings will have readily translatable implications for improving targeted electrical therapies for psychiatric disorders.
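
The three predictions tested in Specific Aim 1 can be illustrated with a toy simulation. The sketch below is not the project's analysis pipeline; it is a minimal illustration that assumes each simulated unit updates a value estimate with separate learning rates for positive and negative reward prediction errors. The reward magnitudes, learning rates, and function names (`learn_reversal_point`, `simulate`, `pearson`) are hypothetical choices made for the example. Under these assumptions, units with a larger asymmetry settle on higher reversal points, so asymmetry and reversal point correlate across the population.

```python
import random


def learn_reversal_point(tau, rewards, base_rate=0.01):
    """Return the value estimate a unit settles on (its reversal point).

    tau is the unit's asymmetry, alpha_plus / (alpha_plus + alpha_minus).
    base_rate is a hypothetical overall learning rate.
    """
    alpha_plus = base_rate * tau            # rate for positive prediction errors
    alpha_minus = base_rate * (1.0 - tau)   # rate for negative prediction errors
    value = 0.0
    for reward in rewards:
        delta = reward - value              # reward prediction error
        rate = alpha_plus if delta > 0 else alpha_minus
        value += rate * delta
    return value


def pearson(xs, ys):
    """Plain Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate(n_units=9, n_trials=20000, seed=0):
    """Simulate a population of units that differ only in their asymmetry."""
    rng = random.Random(seed)
    # Hypothetical reward magnitudes, drawn uniformly at random on each trial.
    rewards = [rng.choice([1.0, 2.0, 5.0, 10.0]) for _ in range(n_trials)]
    # Asymmetries spread evenly between pessimistic (low) and optimistic (high).
    taus = [(i + 1) / (n_units + 1) for i in range(n_units)]
    values = [learn_reversal_point(tau, rewards) for tau in taus]
    return taus, values


if __name__ == "__main__":
    taus, values = simulate()
    for tau, value in zip(taus, values):
        print(f"asymmetry={tau:.1f}  reversal point={value:.2f}")
    print(f"correlation across units: {pearson(taus, values):.3f}")
```

In this sketch, pessimistic units (low asymmetry) settle near the low end of the reward range and optimistic units near the high end, so the population as a whole carries information about the spread of rewards rather than only their mean.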

Key facts

NIH application ID
10335061
Project number
1R01MH128187-01
Recipient
UTAH STATE HIGHER EDUCATION SYSTEM--UNIVERSITY OF UTAH
Principal Investigator
Elliot H Smith
Activity code
R01
Funding institute
NIH
Fiscal year
2022
Award amount
$509,811
Award type
1
Project period
2022-02-03 → 2026-12-31