# A theoretical framework for probabilistic reinforcement learning in the basal ganglia

> **NIH U19** · HARVARD MEDICAL SCHOOL · 2020 · $497,846

## Abstract

According to the standard reinforcement learning framework, the basal ganglia implements estimation of long-term
future reward and the control of actions to maximize future reward. Dopamine (DA) plays a central role by
providing the learning signal (reward prediction error, or RPE) that guides updating of reward predictions and
the action policy. Despite its success, the reinforcement learning framework has been challenged from a
number of directions. Some studies have suggested that DA encodes reward predictions themselves, rather
than reward prediction errors, and other studies have suggested that DA may play a role in invigorating action
selection independently from its contribution to learning. A major goal of this project is to develop a
reinforcement learning theory of basal ganglia function that addresses these challenges, and more broadly
presents a unifying view of how learning, probabilistic inference, and action selection work together to produce
adaptive behavior. Our theoretical innovation can be divided into three components. First, we argue that
cortical inputs to the striatum encode a probability distribution over hidden states, known as the belief state.
Second, we argue that striatal projection neurons transform this input through a set of basis functions, whose
purpose is to facilitate reward prediction. The synaptic weights that parametrize these predictions are updated
based on the DA RPE signal. Third, we argue that action selection circuits in the dorsal striatum use
probabilistic information about rewards to implement uncertainty-guided exploration.
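
The three components described above can be illustrated with a minimal Python sketch. It is not the project's model: the abstract does not specify the belief update, the basis functions, or the exploration rule. The sketch assumes a discrete Bayes filter for the belief state, identity basis functions (the belief vector is used directly as the feature vector), a TD(0) rule for the RPE-driven weight update, and an upper-confidence-style bonus for uncertainty-guided exploration. All function names and parameters (`belief_update`, `td_update`, `choose_action`, `alpha`, `gamma`, `bonus`) are hypothetical.

```python
def belief_update(belief, transition, likelihood):
    """One Bayes-filter step over hidden states (assumed form).

    belief[i]        : current probability of hidden state i
    transition[i][j] : P(next state j | state i)
    likelihood[j]    : P(current observation | state j)
    """
    n = len(belief)
    # Predict: push the belief through the transition model.
    predicted = [sum(belief[i] * transition[i][j] for i in range(n))
                 for j in range(n)]
    # Correct: reweight by the observation likelihood, then normalize.
    unnormalized = [predicted[j] * likelihood[j] for j in range(n)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


def value(weights, features):
    """Linear reward prediction: dot product of weights and features."""
    return sum(w * f for w, f in zip(weights, features))


def td_update(weights, features, reward, next_features, alpha=0.1, gamma=0.9):
    """TD(0) update driven by a reward prediction error (RPE).

    Here the features are the belief state itself (identity basis,
    an assumption made only for this illustration).
    """
    rpe = reward + gamma * value(weights, next_features) - value(weights, features)
    new_weights = [w + alpha * rpe * f for w, f in zip(weights, features)]
    return new_weights, rpe


def choose_action(means, stds, bonus=1.0):
    """Pick the action with the highest mean plus an uncertainty bonus."""
    scores = [m + bonus * s for m, s in zip(means, stds)]
    return max(range(len(scores)), key=lambda a: scores[a])


if __name__ == "__main__":
    b = belief_update([0.5, 0.5], [[1.0, 0.0], [0.0, 1.0]], [0.9, 0.1])
    w, rpe = td_update([0.0, 0.0], [1.0, 0.0], 1.0, [0.0, 1.0])
    print(b, w, rpe, choose_action([0.5, 0.4], [0.0, 0.3]))
```

With `bonus=0.0` the action rule reduces to greedy selection on the mean reward estimates, which makes the contribution of the uncertainty term easy to isolate.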

## Key facts

- **NIH application ID:** 9993580
- **Project number:** 5U19NS113201-02
- **Recipient organization:** HARVARD MEDICAL SCHOOL
- **Principal Investigator:** Samuel J Gershman
- **Activity code:** U19
- **Funding institute:** NIH
- **Fiscal year:** 2020
- **Award amount:** $497,846
- **Award type:** 5
- **Project period:** 2019-08-15 → 2024-07-31

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/9993580

## Citation

> US National Institutes of Health, RePORTER application 9993580, A theoretical framework for probabilistic reinforcement learning in the basal ganglia (5U19NS113201-02). Retrieved via AI Analytics 2026-05-23 from https://api.ai-analytics.org/grant/nih/9993580. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*
