# Decision dynamics during a continuous-time foraging task: a reinforcement learning approach

> **NIH F31** · BRANDEIS UNIVERSITY · 2021 · $31,985

## Abstract

It is likely that evolution has strongly shaped the neural circuitry of the reward systems to optimize
performance in the many tasks involved in foraging for resources, a critical part of every animal's life. This
proposition was the inspiration for the development of “optimal” foraging theories, such as the marginal
value theorem (MVT), which derive analytically the foraging behavior (sequences of choices) that
maximizes the long-term rate of reward, usually considered to be energy intake. While these analytic
theories have had some success in describing animal behavior, the theories themselves rely on strict
assumptions about the environment that do not hold in many natural situations and are not flexible enough
to generalize to more complicated environments or other tasks. Therefore, the end goal of this project is to
understand which of a family of general-purpose decision (reinforcement-learning) algorithms is most likely
to be employed by the brain to solve value-based tasks and to use this knowledge to predict under what
circumstances these algorithms will lead to optimal or suboptimal behavior.

With this project, I will improve our understanding of animal decision processes in these more natural
environments by performing a foraging experiment that is continuous in time and violates many of the
assumptions that prior analytical theories of foraging rely on. Rats motivated by thirst will be allowed to
sample freely from two or three (palatable or aversive) tastant options (“patches”) in an open field and,
critically, will be allowed to direct their encounters with the options, something which past experiments have
lacked. Measurements of licking (consumption) behavior at each of the tastant options will allow me to
measure the decision dynamics of the rat over several 1-hour sessions. In particular, I will measure how the
sampling times at each option correlate with the values of the alternatives to gain insight into how rats
combine the values of available options to make decisions.

As a complement to this behavioral task, I will simulate a set of reinforcement learning agents that vary in
the rules used for learning action values, choosing actions, and planning actions. By quantitatively
comparing the decision behavior of these artificial agents to that obtained from rats, I will determine which of
the simulated agents best reproduces the rat behavior, giving insight into the decision algorithms used by
rats and providing a direction for future electrophysiological recordings during this task. Importantly, this
comparison of animal behavior with that produced by artificial agents will allow me to assess how close to
“optimal” rat behavior is and, in the cases where it is suboptimal, to provide quantitative explanations for
why it is so.
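
The kind of simulated agent described above can be illustrated with a minimal sketch. It is not the proposal's model: it assumes a discrete-trial simplification of the continuous-time task, a delta-rule value update, and softmax action selection. The function names (`softmax`, `simulate`), the reward values (+1 for a palatable option, -1 for an aversive one), the learning rate `alpha`, and the inverse temperature `beta` are all hypothetical choices made for illustration.

```python
import math
import random


def softmax(values, beta):
    """Convert action values into choice probabilities (beta = inverse temperature)."""
    m = max(values)  # subtract the max for numerical stability
    exps = [math.exp(beta * (v - m)) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def simulate(rewards, alpha=0.1, beta=3.0, steps=2000, seed=0):
    """Run one hypothetical agent on a set of tastant options ("patches").

    rewards: assumed mean reward of each option (illustrative values only).
    Returns the learned action values and how often each option was chosen.
    """
    rng = random.Random(seed)
    q = [0.0] * len(rewards)       # learned action values, one per option
    counts = [0] * len(rewards)    # number of visits to each option
    for _ in range(steps):
        probs = softmax(q, beta)
        choice = rng.choices(range(len(rewards)), weights=probs)[0]
        # Noisy outcome around the option's assumed mean reward.
        reward = rewards[choice] + rng.gauss(0.0, 0.1)
        # Delta-rule update: move the value toward the observed reward.
        q[choice] += alpha * (reward - q[choice])
        counts[choice] += 1
    return q, counts


if __name__ == "__main__":
    q, counts = simulate([1.0, -1.0])  # one palatable, one aversive option
    print("learned values:", q)
    print("visit counts:", counts)
```

Varying `alpha`, `beta`, or the update rule yields a family of agents whose visit counts could, in principle, be compared against measured sampling behavior. A continuous-time version would also need to model dwell times and travel between options, which this sketch leaves out.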

## Key facts

- **NIH application ID:** 10129762
- **Project number:** 5F31DA051155-02
- **Recipient organization:** BRANDEIS UNIVERSITY
- **Principal Investigator:** Benjamin Nicolaas Ballintyn
- **Activity code:** F31
- **Funding institute:** NIH
- **Fiscal year:** 2021
- **Award amount:** $31,985
- **Award type:** 5
- **Project period:** 2020-04-01 → 2022-08-31

## Primary source

NIH RePORTER: https://reporter.nih.gov/project-details/10129762

## Citation

> US National Institutes of Health, RePORTER application 10129762, Decision dynamics during a continuous-time foraging task: a reinforcement learning approach (5F31DA051155-02). Retrieved via AI Analytics 2026-05-22 from https://api.ai-analytics.org/grant/nih/10129762. Licensed CC0.

---

*[NIH grants dataset](/datasets/nih-grants) · CC0 1.0*