Learning desired actions from experience requires evaluating alternative actions by integrating the consequences assigned to each action over time. In the real world, actions and outcomes occur in complex sequences, and a continuous stream of events must be parsed into appropriate pairs of causative action and outcome before such pairs can be evaluated. However, how the brain solves this problem, known as temporal credit assignment (TCA), is unknown. The goal of the proposed project is to use an innovative paradigm to test novel hypotheses regarding the role of heterogeneous memory dynamics in the prefrontal cortex in TCA. In our contextual lagged bandit task, monkeys will choose among three options offered in one of two alternating contexts, but feedback for a choice in one context will be delayed and delivered only after another choice is made in the other context. Learning optimal choices in this task consists of two parts: causal inference, for learning the causal structure or model of the task, and model-based TCA, for learning the value of each choice. Learning from a delayed outcome requires a memory of the chosen action, or eligibility trace (ET). Although theories postulate that the ET decays exponentially over time (i.e., exp-ET), exp-ET cannot resolve TCA when the causative action is separated from its outcome in time and by irrelevant events. We hypothesize that flexible ET dynamics might be crucial for causal inference and model-based TCA. More specifically, we hypothesize that model-based TCA requires a dynamically modulated ET (dynamic-ET), which is selectively activated at the predicted time of its outcome to avoid receiving credit from intervening events. For causal inference, we hypothesize that memory of past actions might be reactivated in hindsight in search of a new causal link (i.e., hypothetical ET, hyp-ET). We also hypothesize that the ET might be strongly sustained until the lagged feedback (i.e., persistent ET, persist-ET) to test the accuracy of a new link, at the cost of confounding from intervening inputs. We will investigate how the flexible dynamics of ETs might be supported by heterogeneous dynamics of neural activity across different regions of the cortico-striatal network. First, we will assess whether the primate prefrontal cortex (PFC) provides dynamic-ET for model-based TCA, whereas the striatum provides exp-ET for contiguity-based TCA. Second, we will assess whether the dorsomedial and dorsolateral PFC provide hyp-ET and persist-ET, respectively. We will take a highly integrative approach, combining multi-scale neural recordings, perturbations, and computational modeling to examine whether and how complex patterns and dynamics of neural activity in the prefrontal cortex constitute necessary and sufficient conditions to support model-based TCA. The proposed project will transform the conventional view of memory as storage, recasting memory as an integral part of learning and reasoning, with temporal dynamics as its key structure....
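
The contrast between exp-ET and dynamic-ET described above can be illustrated with a toy calculation. The following is a minimal sketch, not the project's model: the function names, the decay value of 0.5, the one-step predicted lag, and the proportional credit-splitting rule are all assumptions made for illustration. It considers a single lagged feedback event, where the causative choice (in context A) is one step older than the intervening choice (in context B).

```python
def exp_trace(age, decay=0.5):
    """Exponentially decaying eligibility (assumed decay rate):
    the most recent action carries the largest trace."""
    return decay ** age


def dynamic_trace(age, predicted_lag=1):
    """Gated eligibility (assumed one-step lag): the trace is active
    only for the action whose outcome is predicted to arrive now."""
    return 1.0 if age == predicted_lag else 0.0


def assign_credit(reward, ages, trace):
    """Split one reward across past actions in proportion to their trace."""
    weights = {name: trace(age) for name, age in ages.items()}
    total = sum(weights.values())
    return {name: reward * w / total for name, w in weights.items()}


# At feedback time, the causative choice in context A is older (age 1)
# than the intervening choice in context B (age 0).
ages = {"choice_in_A": 1, "choice_in_B": 0}

exp_credit = assign_credit(1.0, ages, exp_trace)
dyn_credit = assign_credit(1.0, ages, dynamic_trace)

print("exp-ET credit:    ", exp_credit)
print("dynamic-ET credit:", dyn_credit)
```

Under these assumptions, the exponential trace gives the intervening context-B choice twice the credit of the causative context-A choice, whereas the gated trace assigns all credit to the context-A choice.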