Matches in SemOpenAlex for { <https://semopenalex.org/work/W2141996686> ?p ?o ?g. }
- W2141996686 endingPage "1537" @default.
- W2141996686 startingPage "1532" @default.
- W2141996686 abstract "In dynamic environments, adaptive behavior requires striking a balance between harvesting currently available rewards (exploitation) and gathering information about alternative options (exploration) [1Stephens D.W. Krebs J.R. Foraging Theory. Princeton University Press, Princeton, NJ1986Google Scholar, 2Gittins J.C. Multi-armed Bandit Allocation Indices. Wiley, Chichester, NY1989Google Scholar, 3Whittle P. Restless bandits: Activity allocation in a changing world.J. Appl. Probab. 1988; 25: 287-298Crossref Google Scholar, 4Berry D.A. Fristedt B. Bandit Problems: Sequential Allocation of Experiments. Chapman and Hall, London1985Crossref Google Scholar]. Such strategic decisions should incorporate not only recent reward history, but also opportunity costs and environmental statistics. Previous neuroimaging [5Daw N.D. O'Doherty J.P. Dayan P. Seymour B. Dolan R.J. Cortical substrates for exploratory decisions in humans.Nature. 2006; 441: 876-879Crossref PubMed Scopus (1242) Google Scholar, 6Wittmann B.C. Daw N.D. Seymour B. Dolan R.J. Striatal activity underlies novelty-based choice in humans.Neuron. 2008; 58: 967-973Abstract Full Text Full Text PDF PubMed Scopus (155) Google Scholar, 7Walton M.E. Devlin J.T. Rushworth M.F. Interactions between decision making and performance monitoring within prefrontal cortex.Nat. Neurosci. 2004; 7: 1259-1265Crossref PubMed Scopus (359) Google Scholar, 8Yoshida W. Ishii S. Resolution of uncertainty in prefrontal cortex.Neuron. 2006; 50: 781-789Abstract Full Text Full Text PDF PubMed Scopus (142) Google Scholar] and neurophysiological [9Shima K. Tanji J. Role for cingulate motor area cells in voluntary movement selection based on reward.Science. 1998; 282: 1335-1338Crossref PubMed Scopus (532) Google Scholar, 10Kennerley S.W. Walton M.E. Behrens T.E. Buckley M.J. Rushworth M.F. Optimal decision making and the anterior cingulate cortex.Nat. Neurosci. 
2006; 9: 940-947Crossref PubMed Scopus (630) Google Scholar, 11Hayden B.Y. Nair A.C. McCoy A.N. Platt M.L. Posterior cingulate cortex mediates outcome-contingent allocation of behavior.Neuron. 2008; 60: 19-25Abstract Full Text Full Text PDF PubMed Scopus (132) Google Scholar, 12Quilodran R. Rothe M. Procyk E. Behavioral shifts and action valuation in the anterior cingulate cortex.Neuron. 2008; 57: 314-325Abstract Full Text Full Text PDF PubMed Scopus (214) Google Scholar, 13Rudebeck P.H. Behrens T.E. Kennerley S.W. Baxter M.G. Buckley M.J. Walton M.E. Rushworth M.F. Frontal cortex subregions play distinct roles in choices between actions and stimuli.J. Neurosci. 2008; 28: 13775-13785Crossref PubMed Scopus (221) Google Scholar] studies have implicated orbitofrontal cortex, anterior cingulate cortex, and ventral striatum in distinguishing between bouts of exploration and exploitation. Nonetheless, the neuronal mechanisms that underlie strategy selection remain poorly understood. We hypothesized that posterior cingulate cortex (CGp), an area linking reward processing, attention [14Kobayashi Y. Amaral D.G. Macaque monkey retrosplenial cortex: II. Cortical afferents.J. Comp. Neurol. 2003; 466: 48-79Crossref PubMed Scopus (309) Google Scholar], memory [15Vogt B.A. Finch D.M. Olson C.R. Functional heterogeneity in cingulate cortex: The anterior executive and posterior evaluative regions.Cereb. Cortex. 1992; 2: 435-443PubMed Google Scholar, 16Vogt B.A. Gabriel M. Neurobiology of Cingulate Cortex and Limbic Thalamus: A Comprehensive Handbook. Birkhauser, Boston1993Crossref Google Scholar], and motor control systems [17Vogt B.A. Gabriel M. Vogt L.J. Poremba A. Jensen E.L. Kubota Y. Kang E. Muscarinic receptor binding increases in anterior thalamus and cingulate cortex during discriminative avoidance learning.J. Neurosci. 1991; 11: 1508-1514PubMed Google Scholar], mediates the integration of variables such as reward [18McCoy A.N. Crowley J.C. Haghighian G. Dean H.L. 
Platt M.L. Saccade reward signals in posterior cingulate cortex.Neuron. 2003; 40: 1031-1040Abstract Full Text Full Text PDF PubMed Scopus (144) Google Scholar], uncertainty [19McCoy A.N. Platt M.L. Risk-sensitive neurons in macaque posterior cingulate cortex.Nat. Neurosci. 2005; 8: 1220-1227Crossref PubMed Scopus (283) Google Scholar], and target location [20Dean H.L. Platt M.L. Allocentric spatial referencing of neuronal activity in macaque posterior cingulate cortex.J. Neurosci. 2006; 26: 1117-1127Crossref PubMed Scopus (60) Google Scholar] that underlie this dynamic balance. Here we show that CGp neurons distinguish between exploratory and exploitative decisions made by monkeys in a dynamic foraging task. Moreover, firing rates of these neurons predict in graded fashion the strategy most likely to be selected on upcoming trials. This encoding is distinct from switching between targets and is independent of the absolute magnitudes of rewards. These observations implicate CGp in the integration of individual outcomes across decision making and the modification of strategy in dynamic environments.
To probe the neuronal processes mediating the strategic balance of immediate reward and information acquisition, we recorded the activity of single cingulate cortex (CGp) neurons in two rhesus macaques performing a “restless” variant of the four-armed bandit for juice rewards [3Whittle P. Restless bandits: Activity allocation in a changing world.J. Appl. Probab. 1988; 25: 287-298Crossref Google Scholar, 5Daw N.D. O'Doherty J.P. Dayan P. Seymour B. Dolan R.J. Cortical substrates for exploratory decisions in humans.Nature. 2006; 441: 876-879Crossref PubMed Scopus (1242) Google Scholar] (Figure 1). This variant provides a high level of environmental variability with a behaviorally tractable number of options. On each trial, monkeys chose one of four targets whose payoffs were randomly selected from distributions centered about their values on the previous trial. Once a target was chosen, monkeys in principle had perfect knowledge of its present value (there was no added variance in payouts), though the values of all targets changed each trial. As a result, monkeys had to select an option to learn its current value and integrate this information with their statistical knowledge of the environment to predict its relative value on upcoming trials. Both monkeys were highly adept at optimizing reward. They earned 92% and 91%, respectively, of the total reward that would have been earned by an omniscient observer. Nevertheless, despite this high level of performance, a perfectly greedy decision maker, focused on the option with highest immediate value, would have harvested more, though not all, available reward (see Supplemental Data available online). 
More importantly, nothing intrinsic to the task design serves to distinguish exploratory from exploitative decisions. On each trial, both monkeys simply selected among the four available options and received a reward. As a result, individual decisions must be classified as exploratory or exploitative according to a model-based analysis of each monkey's behavior, with model parameters chosen to maximize the likelihood of observed choices. We report here only results based on our best-fitting Kalman filter model, though results were similar for other models as well (see Supplemental Data). We analyzed the firing rates of 83 single neurons in CGp in both monkeys performing the four-armed bandit task (59 from monkey N and 24 from monkey B). We focused on two trial epochs, a 2 s decision epoch (DE; 1 s before trial initiation extending to juice delivery) and a 2 s postreward evaluation epoch (EE; from the offset of juice delivery through the intertrial interval). Analyses based on mean firing rates in each epoch readily identified neurons that discriminated between the two strategies (14%, n = 12/83, DE; 16%, n = 13/83, EE; p < 0.05, Mann-Whitney U test), with 22% of neurons doing so in at least one epoch (n = 18/83; p < 0.025, Bonferroni-corrected Mann-Whitney U test). Figure 2A depicts the average firing rate of a single neuron on trials classified as either exploratory or exploitative. Responses on exploit trials were significantly higher in both decision and evaluation epochs (Mann-Whitney U test, p < 0.01). In contrast, the neuron whose activity is plotted in Figure 2B was more responsive on exploratory trials in both epochs (Mann-Whitney U test, p < 0.01). 
Although the population as a whole exhibited slightly higher firing on exploratory trials in both epochs (modulation index = 0.0084 [DE], 0.0026 [EE]), the population of cells with significant modulation was mixed (modulation index = 0.046 [DE], −0.011 [EE]), indicating heterogeneity in single-cell responses to the different strategies. Thus, firing rates in CGp distinguish between upcoming exploratory and exploitative decisions in the epoch leading up to selection and continue to reflect that choice during the postreward delay. We previously reported that responses of CGp neurons predict impending switches from one option to another on the next trial in a simple two-alternative task [11Hayden B.Y. Nair A.C. McCoy A.N. Platt M.L. Posterior cingulate cortex mediates outcome-contingent allocation of behavior.Neuron. 2008; 60: 19-25Abstract Full Text Full Text PDF PubMed Scopus (132) Google Scholar]. Based on that finding, we hypothesized that CGp neurons would also carry predictive information about more general impending choices of strategy. To test this hypothesis, we regressed the probability of exploration on the upcoming decision as a function of observed firing rate for each neuron. Of the 83 neurons in our sample, about 16% showed significant correlations between firing rate during the decision epoch and the probability of exploration on the ensuing choice [n = 13, p < 0.05, Mann-Whitney U test; p(n > 12) < 0.001, binomial test]. Even more importantly, 16% of neurons showed a correlation between firing during EE and the probability of exploration on the following trial [n = 13, p < 0.05, Mann-Whitney U test; p(n > 12) < 0.001, binomial test], suggesting that CGp differentially signals the probability of strategic decisions within a block of trials. Figures 2C and 2D depict the separate population averages for the subsets of cells whose activity correlated negatively (n = 6) and positively (n = 7) with exploration. 
Average response for each of these two groups of neurons strongly predicted probability of impending strategic choices in a graded fashion. One potential confound of these results arises from the link between exploitation and the likelihood of increased reward. Because we might expect that exploitative choices, on average, yielded higher rewards, a possible alternative interpretation of the present data is that effects of strategy on neuronal activity are reducible entirely to neuronal sensitivity to reward value. To investigate this possibility, we calculated and fit reward size tuning curves for each of our 83 neurons (Figures 3A and 3B). Consistent with previous studies [11Hayden B.Y. Nair A.C. McCoy A.N. Platt M.L. Posterior cingulate cortex mediates outcome-contingent allocation of behavior.Neuron. 2008; 60: 19-25Abstract Full Text Full Text PDF PubMed Scopus (132) Google Scholar, 18McCoy A.N. Crowley J.C. Haghighian G. Dean H.L. Platt M.L. Saccade reward signals in posterior cingulate cortex.Neuron. 2003; 40: 1031-1040Abstract Full Text Full Text PDF PubMed Scopus (144) Google Scholar], we found that the firing rates of CGp neurons during the decision epoch varied with the amount of reward received on the previous trial (DE, n = 39, p < 0.05, F test for quadratic regression) and that firing rates in the evaluation epoch varied with the amount just received (EE, n = 44, p < 0.05, F test). Over the range of experienced reward values (50–350 μl), we found a heterogeneity of tuning curves: some were linear (n = 14 positive, 6 negative, DE; n = 18 positive, n = 9 negative, EE; p < 0.05 nonzero regression coefficient), whereas others were U-shaped, both concave up (n = 8, DE; n = 8, EE; p < 0.05 nonzero regression coefficient) and concave down (n = 11, DE; n = 9, EE; p < 0.05 nonzero regression coefficient). We therefore restricted our next series of analyses to those trials where monkeys adopted different strategies but received the same amount of reward. 
We found that 12% [n = 10; p(n > 9) < 0.001, binomial test] of recorded neurons still showed different mean firing rates on explore versus exploit trials (p < 0.01, Bonferroni-corrected Mann-Whitney U test). Data for an example neuron showing this effect are shown in Figures 3D–3F. Here, the middle third of reward values have been subdivided into three categories (medium-low, medium-medium, and medium-high), and neuronal firing is plotted as a function of time for both explore and exploit trials, controlled for received reward. This neuron, like many others in our population, showed clear sensitivity to strategy even when we controlled for the value of the reward the monkeys received. Two other possible confounds arise from the known spatial tuning of CGp and the close relationship between exploration and simply switching between targets. As reported previously [21Dean H.L. Crowley J.C. Platt M.L. Visual and saccade-related activity in macaque posterior cingulate cortex.J. Neurophysiol. 2004; 92: 3056-3068Crossref PubMed Scopus (51) Google Scholar], we found that 63% of neurons were tuned for the location of the target chosen (n = 52/83, p < 0.05, one-way analysis of variance of mean firing rates for each target over all trials; see Supplemental Data). Across the population, 39% of neurons were significantly tuned for both reward size and target location [18McCoy A.N. Crowley J.C. Haghighian G. Dean H.L. Platt M.L. Saccade reward signals in posterior cingulate cortex.Neuron. 2003; 40: 1031-1040Abstract Full Text Full Text PDF PubMed Scopus (144) Google Scholar], whereas 23% were tuned for neither (EE; 34% and 24%, respectively, in DE). However, the population as a whole showed no consistent target tuning across trials (p > 0.2, one-sample t test for contralateral and upper-hemifield tuning indices). 
In the case of target switching, because repeatedly choosing a poor target is not necessarily exploitative (if higher reward has recently been sampled elsewhere), there is not a strict one-to-one correspondence between exploitation and perseveration or between exploration and switching. As a result, our task design allows for the possibility of disambiguating these phenomena. Indeed, a partial correlation analysis of neuronal firing rates and upcoming decisions (firing rate in DE for decision in current trial; firing rate in EE for decision in following trial) that controlled for the effects of reward tuning, spatial tuning, target switching, and previous explore/exploit decision revealed significant correlations in 12% of neurons [n = 11, DE; n = 10, EE; p < 0.05 Spearman partial correlation; p(n > 9) < 0.01, binomial test]. Thus, even when all known effects on firing rate were accounted for, a significant number of neurons still exhibited clear predictive correlations with upcoming strategy. Collectively, these results indicate not only that single neurons in CGp receive information about both previous rewards and previous choices [19McCoy A.N. Platt M.L. Risk-sensitive neurons in macaque posterior cingulate cortex.Nat. Neurosci. 2005; 8: 1220-1227Crossref PubMed Scopus (283) Google Scholar] and maintain that information across trials [11Hayden B.Y. Nair A.C. McCoy A.N. Platt M.L. Posterior cingulate cortex mediates outcome-contingent allocation of behavior.Neuron. 2008; 60: 19-25Abstract Full Text Full Text PDF PubMed Scopus (132) Google Scholar, 19McCoy A.N. Platt M.L. Risk-sensitive neurons in macaque posterior cingulate cortex.Nat. Neurosci. 2005; 8: 1220-1227Crossref PubMed Scopus (283) Google Scholar], as reported previously, but that these same neurons also carry signals related to dynamic changes in choice strategy in a multiplexed format. 
We found that, when choosing among multiple targets whose relative values changed dynamically, neurons in posterior cingulate cortex signaled the distinction between trials on which monkeys pursued an exploratory rather than an exploitative strategy. This signal was robust against classifications of trials based on differing models of behavior, including a perfectly greedy strategy and a simple heuristic based on comparison to a reward threshold (see Supplemental Data). More importantly, single neurons signaled in graded fashion the probability of pursuing each strategy on upcoming trials. Previous work has shown that CGp neurons are sensitive to reward [18McCoy A.N. Crowley J.C. Haghighian G. Dean H.L. Platt M.L. Saccade reward signals in posterior cingulate cortex.Neuron. 2003; 40: 1031-1040Abstract Full Text Full Text PDF PubMed Scopus (144) Google Scholar], risk [19McCoy A.N. Platt M.L. Risk-sensitive neurons in macaque posterior cingulate cortex.Nat. Neurosci. 2005; 8: 1220-1227Crossref PubMed Scopus (283) Google Scholar], and option switching [11Hayden B.Y. Nair A.C. McCoy A.N. Platt M.L. Posterior cingulate cortex mediates outcome-contingent allocation of behavior.Neuron. 2008; 60: 19-25Abstract Full Text Full Text PDF PubMed Scopus (132) Google Scholar] and integrate this information across multiple trials [11Hayden B.Y. Nair A.C. McCoy A.N. Platt M.L. Posterior cingulate cortex mediates outcome-contingent allocation of behavior.Neuron. 2008; 60: 19-25Abstract Full Text Full Text PDF PubMed Scopus (132) Google Scholar], but the present study generalizes the decision environment to one in which exploration and exploitation are distinguishable from a simple “win-stay/lose-shift” heuristic [11Hayden B.Y. Nair A.C. McCoy A.N. Platt M.L. Posterior cingulate cortex mediates outcome-contingent allocation of behavior.Neuron. 2008; 60: 19-25Abstract Full Text Full Text PDF PubMed Scopus (132) Google Scholar, 22Barraclough D.J. Conroy M.L. Lee D. 
Prefrontal cortex and decision making in a mixed-strategy game.Nat. Neurosci. 2004; 7: 404-410Crossref PubMed Scopus (479) Google Scholar] based only on the most recent reward received and in which outcomes must be evaluated in light of multiple options with dynamically changing rewards. As a result, relatively bad outcomes in rich environments might be acceptable under circumstances where all alternatives are poor and searching for better options necessitates choosing among several competing alternatives, each with a distinct reward history. Strategic decisions in such an environment thus require greater abstraction and integration of information than in comparatively static contexts because no single variable (or single trial) contains sufficient information on which to base a decision. Together, these results invite the hypothesis that CGp is part of a network that monitors the outcomes of individual decisions and integrates that information into higher-level strategies spanning multiple choices. Thus, although the prolonged time courses of firing-rate changes in CGp are unlikely to be responsible for decisions on individual trials, their tonic activity levels might be responsible for encoding the gradual accumulation of information that gives rise to changes in strategy. However, we do not expect that this integration of single-trial history with strategic information is limited to CGp, nor is this its sole function. For example, Daw et al. [5Daw N.D. O'Doherty J.P. Dayan P. Seymour B. Dolan R.J. Cortical substrates for exploratory decisions in humans.Nature. 2006; 441: 876-879Crossref PubMed Scopus (1242) Google Scholar] found that exploratory decisions are associated with activation in frontal polar cortex and intraparietal sulcus, whereas exploitative decisions are associated with activity in striatum. 
We speculate that information about individual rewards and reward predictions is initially computed in striatum, orbitofrontal cortex, and medial prefrontal cortex; subsequently combined with recent reward outcome and choice history and maintained online in CGp; and finally passed to the anterior cingulate cortex (ACC), where it is utilized in the selection of appropriate actions. Moreover, reciprocal connections between ACC and CGp might play a role in learning which combinations of single-trial variables are most relevant when deciding among strategies to maximize reward. In this framework, recent reward history, computational difficulty, stimulus novelty, memory load, and the statistics of the environment are distilled into a small number of task-related decision variables for the purposes of encoding and selecting among potential actions. Thus, individual neurons that compose this network would be expected to display sensitivity to the many single-trial variables like risk, reward, and spatial location that serve as its inputs. We found that these variables are represented multimodally by neurons in CGp. As a result, we suspect that CGp might play a key role in the process of learning the combinations of stimuli and accumulated statistics most relevant to making decisions, analogous to its role in simple conditioning [23Gabriel M. Foster K. Orona E. Interaction of laminae of the cingulate cortex with the anteroventral thalamus during behavioral learning.Science. 1980; 208: 1050-1052Crossref PubMed Scopus (52) Google Scholar, 24Gabriel M. Sparenborg S.P. Stolar N. Hippocampal control of cingulate cortical and anterior thalamic information processing during learning in rabbits.Exp. Brain Res. 1987; 67: 131-152Crossref PubMed Scopus (81) Google Scholar, 25Gabriel M. Kubota Y. Sparenborg S. Straube K. Vogt B.A. Effects of cingulate cortical lesions on avoidance learning and training-induced unit activity in rabbits.Exp. Brain Res. 
1991; 86: 585-600Crossref PubMed Scopus (104) Google Scholar]. This is in keeping with our observation of increased firing rates in response to block boundaries, near-threshold decisions, and aborted trials in the bandit task, as also observed in ACC [12Quilodran R. Rothe M. Procyk E. Behavioral shifts and action valuation in the anterior cingulate cortex.Neuron. 2008; 57: 314-325Abstract Full Text Full Text PDF PubMed Scopus (214) Google Scholar]. If so, CGp dysfunction might be related to deficiencies in memory-guided learning and action selection observed in disorders like Alzheimer's disease and obsessive-compulsive disorder, and its proper function might be crucial to the flexible adaptation of strategy in response to changing environments. All procedures were approved by the Duke University Institutional Animal Care and Use Committee and were conducted in compliance with the Public Health Service's Guide for the Care and Use of Animals. Two rhesus monkeys (Macaca mulatta) served as test subjects for recording. A small prosthesis and a stainless steel recording chamber were attached to the calvarium. The chamber was placed over CGp at the intersection of the interaural and midsagittal planes. Animals were habituated to laboratory conditions and trained to perform oculomotor tasks for liquid reward. Animals received analgesics and antibiotics after all surgeries. The chamber was kept sterile with antibiotic washes and sealed with sterile caps. Monkeys were familiar with the task. Eye position was sampled at 1000 Hz (camera, SR Research). Data were recorded by a computer running MATLAB (The Mathworks) with Psychtoolbox [26Brainard D.H. The Psychophysics Toolbox.Spat. Vis. 1997; 10: 433-436Crossref PubMed Scopus (11968) Google Scholar] and Eyelink [27Cornelissen F.W. Peters E.M. Palmer J. The Eyelink Toolbox: Eye tracking with MATLAB and the Psychophysics Toolbox.Behav. Res. Methods Instrum. Comput. 2002; 34: 613-617Crossref PubMed Scopus (630) Google Scholar]. 
Visual stimuli were squares (6° wide) on a computer monitor 50 cm away. A solenoid valve controlled juice delivery. Juice flavor was the same for each target. On every trial, a central cue appeared and stayed on until the monkey fixated it. Fixation was maintained within a 1°–2° window. After a brief delay, the central cue disappeared and the four targets were displayed in the corners of the screen. Targets appeared in the same location each trial. After selection of a target, its border was illuminated and reward was delivered, followed by a 1 s intertrial interval. Rewards varied from 40 ms to 280 ms of solenoid open time in 5 ms increments (50–350 μl, in 7.5 μl increments). Juice volumes were linear in solenoid open time, and we have previously shown that monkeys discriminate juice volumes as small as 20 μl [18McCoy A.N. Crowley J.C. Haghighian G. Dean H.L. Platt M.L. Saccade reward signals in posterior cingulate cortex.Neuron. 2003; 40: 1031-1040Abstract Full Text Full Text PDF PubMed Scopus (144) Google Scholar]. All target values began at 200 μl and reset each block. Blocks were 60 trials long and were cued by the appearance of a gray square in the center of the screen. Reward values for all targets changed each trial according to a biased random walk (see Supplemental Data). Single electrodes (Frederick Haer Co.) were lowered under microdrive guidance (Kopf) until the waveforms of one to three individual neurons were isolated. Individual action potentials were identified by standard criteria and isolated on a Plexon system. Neurons were selected on the basis of the quality of isolation, but not on selectivity for the task. Recordings were made in areas 23 and 31 in the cingulate gyrus and ventral bank of the cingulate sulcus, anterior to the intersection of the marginal and horizontal rami. We used an alpha of 0.05 as a criterion for significance. 
Peristimulus time histograms (PSTHs) were constructed by aligning spikes to trial events, averaging across trials, and smoothing by a Gaussian filter with 50 ms standard deviation. Shaded regions in PSTHs represent the standard error of the mean (± SEM), also Gaussian smoothed. Firing-rate modulation indices were calculated in each epoch as m = (f_explore − f_exploit)/(f_explore + f_exploit), where f is the firing rate averaged over the relevant subset of trials. Behavioral parameters were fitted by custom scripts written with the MATLAB Optimization Toolbox (The Mathworks). Details of modeling can be found in Supplemental Data. Analyses utilized a binary, model-based classification of choices on each trial as exploratory or exploitative (see Supplemental Data). We tested for significant differences in firing rates in both the decision and evaluation epochs as a function of the explore/exploit classification of the decision on the current trial (that is, effects on DE_n and EE_n as a function of x_n, where x_n is the binary explore/exploit variable). We also tested for predictive correlations between firing in one epoch and upcoming decision (DE_n with x_n and EE_n with x_{n+1}). To do this, we binned firing rates for each neuron into deciles of percent maximal firing and examined the percentage of exploratory decisions made subsequent to epochs with firing rates in each bin. This allowed us to construct a probability of exploration as a function of percent maximal firing, which we averaged across significant cells of each tuning. Our reward controls were performed by grouping the 45 distinct reward values into nine bins and comparing firing rates within each bin during the evaluation epoch on explore and exploit trials. Significance levels utilize a Bonferroni correction for the number of tests performed, which varied (not all bins contained an explore or exploit trial). 
Our reward-controlled plots grouped the 15 middle rewards into three groups of five, denoted medium-high, medium-medium, and medium-low. Our partial correlation analyses correlated (raw, unbinned) firing rate in a given epoch ($DE_n$ or $EE_n$) with the upcoming explore/exploit decision ($x_n$ and $x_{n+1}$, respectively). In each case, the correlation is controlled for spatial location (split into two variables, one for upper versus lower hemifield and one for left versus right hemifield, each taking values ±1), previous received reward ($r_{n-1}$ and $r_n$, respectively), chosen target switch (binary; $s_n$ and $s_{n+1}$, respectively), and previous explore/exploit choice ($x_{n-1}$ and $x_n$, respectively). Correlations were calculated as Spearman rank correlations and so allow for generic monotonic relations among variables. This work was supported by National Institute on Drug Abuse postdoctoral fellowship 023338-01 (B.Y.H.), National Institutes of Health grant R01EY013496 (M.L.P.), and the Duke Institute for Brain Studies (M.L.P.). We thank K. Watson for assistance in training the animals and A. Long for comments on the manuscript. Document S1. Supplemental Experimental Procedures, Six Tables, and Five Figures" @default.
- W2141996686 created "2016-06-24" @default.
- W2141996686 creator A5031978385 @default.
- W2141996686 creator A5059189714 @default.
- W2141996686 creator A5064602852 @default.
- W2141996686 creator A5072933482 @default.
- W2141996686 date "2009-09-01" @default.
- W2141996686 modified "2023-10-14" @default.
- W2141996686 title "Neurons in Posterior Cingulate Cortex Signal Exploratory Decisions in a Dynamic Multioption Choice Task" @default.
- W2141996686 cites W1606220205 @default.
- W2141996686 cites W1981678973 @default.
- W2141996686 cites W1981725931 @default.
- W2141996686 cites W1995156698 @default.
- W2141996686 cites W1997373688 @default.
- W2141996686 cites W2005815081 @default.
- W2141996686 cites W2009639618 @default.
- W2141996686 cites W2017719955 @default.
- W2141996686 cites W2021600835 @default.
- W2141996686 cites W2056921512 @default.
- W2141996686 cites W2070218409 @default.
- W2141996686 cites W2077611535 @default.
- W2141996686 cites W2081413820 @default.
- W2141996686 cites W2102068574 @default.
- W2141996686 cites W2115282236 @default.
- W2141996686 cites W2121161929 @default.
- W2141996686 cites W2126893123 @default.
- W2141996686 cites W2142080486 @default.
- W2141996686 cites W2142699776 @default.
- W2141996686 cites W2151811471 @default.
- W2141996686 cites W4238176959 @default.
- W2141996686 cites W4294214781 @default.
- W2141996686 doi "https://doi.org/10.1016/j.cub.2009.07.048" @default.
- W2141996686 hasPubMedCentralId "https://www.ncbi.nlm.nih.gov/pmc/articles/3515083" @default.
- W2141996686 hasPubMedId "https://pubmed.ncbi.nlm.nih.gov/19733074" @default.
- W2141996686 hasPublicationYear "2009" @default.
- W2141996686 type Work @default.
- W2141996686 sameAs 2141996686 @default.
- W2141996686 citedByCount "149" @default.
- W2141996686 countsByYear W21419966862012 @default.
- W2141996686 countsByYear W21419966862013 @default.
- W2141996686 countsByYear W21419966862014 @default.
- W2141996686 countsByYear W21419966862015 @default.
- W2141996686 countsByYear W21419966862016 @default.
- W2141996686 countsByYear W21419966862017 @default.
- W2141996686 countsByYear W21419966862018 @default.
- W2141996686 countsByYear W21419966862019 @default.
- W2141996686 countsByYear W21419966862020 @default.
- W2141996686 countsByYear W21419966862021 @default.
- W2141996686 countsByYear W21419966862022 @default.
- W2141996686 countsByYear W21419966862023 @default.
- W2141996686 crossrefType "journal-article" @default.
- W2141996686 hasAuthorship W2141996686A5031978385 @default.
- W2141996686 hasAuthorship W2141996686A5059189714 @default.
- W2141996686 hasAuthorship W2141996686A5064602852 @default.
- W2141996686 hasAuthorship W2141996686A5072933482 @default.
- W2141996686 hasBestOaLocation W21419966861 @default.
- W2141996686 hasConcept C15744967 @default.
- W2141996686 hasConcept C162324750 @default.
- W2141996686 hasConcept C169760540 @default.
- W2141996686 hasConcept C169900460 @default.
- W2141996686 hasConcept C180747234 @default.
- W2141996686 hasConcept C187736073 @default.
- W2141996686 hasConcept C199360897 @default.
- W2141996686 hasConcept C2777348757 @default.
- W2141996686 hasConcept C2778402161 @default.
- W2141996686 hasConcept C2778733324 @default.
- W2141996686 hasConcept C2779843651 @default.
- W2141996686 hasConcept C2780451532 @default.
- W2141996686 hasConcept C2781210436 @default.
- W2141996686 hasConcept C41008148 @default.
- W2141996686 hasConcept C529278444 @default.
- W2141996686 hasConcept C86803240 @default.
- W2141996686 hasConceptScore W2141996686C15744967 @default.
- W2141996686 hasConceptScore W2141996686C162324750 @default.
- W2141996686 hasConceptScore W2141996686C169760540 @default.
- W2141996686 hasConceptScore W2141996686C169900460 @default.
- W2141996686 hasConceptScore W2141996686C180747234 @default.
- W2141996686 hasConceptScore W2141996686C187736073 @default.
- W2141996686 hasConceptScore W2141996686C199360897 @default.
- W2141996686 hasConceptScore W2141996686C2777348757 @default.
- W2141996686 hasConceptScore W2141996686C2778402161 @default.
- W2141996686 hasConceptScore W2141996686C2778733324 @default.
- W2141996686 hasConceptScore W2141996686C2779843651 @default.
- W2141996686 hasConceptScore W2141996686C2780451532 @default.
- W2141996686 hasConceptScore W2141996686C2781210436 @default.
- W2141996686 hasConceptScore W2141996686C41008148 @default.
- W2141996686 hasConceptScore W2141996686C529278444 @default.
- W2141996686 hasConceptScore W2141996686C86803240 @default.
- W2141996686 hasIssue "18" @default.
- W2141996686 hasLocation W21419966861 @default.
- W2141996686 hasLocation W21419966862 @default.
- W2141996686 hasLocation W21419966863 @default.
- W2141996686 hasLocation W21419966864 @default.
- W2141996686 hasOpenAccess W2141996686 @default.
- W2141996686 hasPrimaryLocation W21419966861 @default.
- W2141996686 hasRelatedWork W1983510784 @default.
- W2141996686 hasRelatedWork W2035590484 @default.
- W2141996686 hasRelatedWork W2059312295 @default.