CRCNS: Reward and motivation in neural networks

NIH RePORTER · NIH · R01 · $432,000

Abstract

The overall goal of this project is to develop a reinforcement learning (RL) theory of motivation, understood here as motivational salience, and to test the conclusions of this theory using experimental observations obtained in the ventral pallidum (VP). Animals' actions depend on the shifting values of internal demands determined by physiological or behavioral conditions, such as thirst, hunger, addiction, or specific nutrient deficiency. These need-based modulations of the perceived values of reinforcements (reward or punishment) are described by a mathematical variable called motivational salience or, simply, motivation. Including motivation adds a new level of complexity to RL theory and allows it to generate flexible ongoing behaviors. Here, we will investigate how motivation can be learned by neuronal networks to generate complex adaptive behaviors, and we will compare the conclusions of our theory with the VP circuits.

Previous studies indicate that the VP plays an important role in a variety of behaviors, potentially by influencing motivational salience. In vivo recordings suggest that VP neuron firing correlates with motivational states. Lesions and pharmacological and optogenetic manipulations in the VP cause profound changes in behaviors motivated by natural rewards or drugs of addiction. Dysfunction of this structure is linked to depression and drug addiction in humans. Our theoretical results suggest that distinct classes of neurons in the VP should play essential roles in representing either positive or negative motivational states. We further hypothesize that local functional interactions within the VP are critical for generating the signals that guide motivated behaviors. Consistent with predictions of RL theory, our preliminary studies found that individual VP neurons could be classified as either positive or negative 'motivation neurons', as the activities of these neurons represented both expected values of outcomes and motivational states. When population activity is considered, representations of outcome expectation can be distinguished from representations of motivation, which fluctuate according to the animals' physiological states.

Based on the preliminary data, we devised an integrated approach, combining computational analysis and theory (Koulakov lab) with advanced molecular genetic tools, optogenetics, chemogenetics, electrophysiology, and imaging in behaving mice (Li lab), to test our hypotheses through the following Aims:

Aim 1. To develop methods for identifying motivation in the population activity of VP neurons. Here we will use novel behavioral and computational methods to disambiguate representations of motivation and outcome expectation in neuronal responses.

Aim 2. To develop a reinforcement learning theory of motivation and to test its predictions using responses of VP neurons. Here we will develop the Q-learning theory of motivation and compare networks trained using this theory to responses of VP neur...
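To make the idea of motivation-dependent reward values concrete, the following is a minimal illustrative sketch of tabular Q-learning in which the agent's current need is part of its state. It is not the model proposed in this project. The need and action names, the multiplicative form of the salience term, the rule that a satisfied need switches to the other need, and all parameter values are assumptions chosen only for illustration.

```python
import random

NEEDS = ["thirsty", "hungry"]              # hypothetical motivational states
ACTIONS = ["water", "food"]                # hypothetical actions
BASE_REWARD = {"water": 1.0, "food": 1.0}  # hypothetical need-independent values
MATCHES = {("thirsty", "water"), ("hungry", "food")}


def motivation(need, action):
    """Assumed salience: 1 if the outcome meets the current need, else 0."""
    return 1.0 if (need, action) in MATCHES else 0.0


def step(need, action):
    """Return (perceived reward, next need) for one action."""
    reward = motivation(need, action) * BASE_REWARD[action]
    # Assumption: meeting a need switches the agent to the other need;
    # otherwise the current need persists.
    next_need = NEEDS[1 - NEEDS.index(need)] if reward > 0 else need
    return reward, next_need


def train(steps=5000, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Epsilon-greedy tabular Q-learning over (need, action) pairs."""
    rng = random.Random(seed)
    q = {(n, a): 0.0 for n in NEEDS for a in ACTIONS}
    need = "thirsty"
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.choice(ACTIONS)
        else:
            action = max(ACTIONS, key=lambda a: q[(need, a)])
        reward, next_need = step(need, action)
        # Standard Q-learning target: reward plus discounted best next value.
        target = reward + gamma * max(q[(next_need, a)] for a in ACTIONS)
        q[(need, action)] += alpha * (target - q[(need, action)])
        need = next_need
    return q


def greedy_policy(q):
    """Best action for each need under the learned Q-values."""
    return {n: max(ACTIONS, key=lambda a: q[(n, a)]) for n in NEEDS}


if __name__ == "__main__":
    print(greedy_policy(train()))
```

Because the need is part of the state, the same action acquires different learned values under different needs, which is one simple way to express a need-based modulation of perceived reward.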

Key facts

NIH application ID: 10017031
Project number: 5R01DA050374-02
Recipient: COLD SPRING HARBOR LABORATORY
Principal Investigator: ALEXEI KOULAKOV
Activity code: R01
Funding institute: NIH
Fiscal year: 2020
Award amount: $432,000
Award type: 5
Project period: 2019-09-30 → 2024-07-31