ABSTRACT Optimal decision-making requires a delicate balance of stability and flexibility. On the one hand, stability is required to exploit learned contingencies between environmental features, instrumental actions, and goals. On the other, flexibility is required for the acquisition of a new behavioral strategy when the contingencies change. Dopamine (DA) neurons of the ventral tegmental area (VTA) have been implicated as important regulators of this balance. Conditions associated with DA dysfunction, most notably schizophrenia, Parkinson’s disease, and Huntington’s disease, profoundly disrupt performance on tasks requiring behavioral flexibility, such as the Wisconsin Card Sorting Test, typically by increasing “perseverative” responses that track a previously learned feature that is no longer relevant. Numerous animal studies have also directly implicated DA in the capacity to shift strategies, using pharmacological interventions, DA depletion, and microdialysis in important projection targets of VTA DA neurons. Moreover, VTA DA stimulation can promote either the maintenance or the reorganization of behavioral strategy, depending on whether it is timed to mimic tonic or phasic modes of firing. However, little is known about the endogenous neural activity patterns that underlie the acquisition of a new strategy, in part because the VTA’s location deep within the brain has kept it inaccessible to large-scale recording during behavior. We have developed a decision-making paradigm for mice that requires a strategy shift and can be performed under a two-photon microscope, allowing the use of state-of-the-art deep-brain imaging techniques to monitor VTA DA activity at cellular resolution during behavior. In this task, subjects navigate a T-maze within a virtual reality environment. The reward location is determined by one of two rules: a sensory rule guided by visuospatial cues, or an alternation rule guided by the previous choice. 
After initial training on one rule, subjects are challenged with a rule shift that requires the acquisition of a new strategy. The recent discovery of heterogeneous task-feature representations within the VTA DA population, a finding that contradicts the standard view that VTA DA neurons broadcast a global reward prediction error (RPE) signal, has inspired us to use this paradigm to test an intriguing hypothesis: that the RPE is represented in multiple feature-specific components rather than as a single global signal, and that subsets of the VTA DA population track specific features according to their relevance to the current task strategy. Thus, the research plan outlined in this proposal will allow the first characterization of endogenous VTA DA activity during a shift in behavioral strategy, and may help to clarify the link between DA and theories of reinforcement learning. The results are expected to have broad relevance to decision-making, and may uncover specific mechanisms that link dopamine dysfunction to deficits in flexibility.