Exploration server on haptic devices

Warning: this site is under development!
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

Suboptimal Integration of Reward Magnitude and Prior Reward Likelihood in Categorical Decisions by Monkeys

Internal identifier: 001E60 (Pmc/Curation); previous: 001E59; next: 001E61

Authors: Tobias Teichert [United States]; Vincent P. Ferrera [United States]

Source:

RBID: PMC:2996133

Abstract

Sensory decisions may be influenced by non-sensory information regarding reward magnitude or reward likelihood. Given identical sensory information, it is more optimal to choose an option if it is a priori more likely to be correct and hence rewarded (prior reward likelihood bias), or if it yields a larger reward, given that it is the correct choice (reward magnitude bias). Here, we investigated the ability of macaque monkeys to integrate reward magnitude and prior reward likelihood information into a categorical decision about stimuli with high signal strength but variable decision uncertainty. In the asymmetric reward magnitude condition, monkeys over-adjusted their decision criterion such that they chose the highly rewarded alternative far more often than was optimal; in contrast, monkeys did not adjust their decision criterion in response to asymmetric reward likelihood. This finding shows that in this setting, monkeys did not adjust their decision criterion based on the product of reward likelihood and reward magnitude as has been reported to be the case in value-based decisions that do not involve decision uncertainty due to stimulus categorization.


URL:
DOI: 10.3389/fnins.2010.00186
PubMed: 21151367
PubMed Central: 2996133


The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">Suboptimal Integration of Reward Magnitude and Prior Reward Likelihood in Categorical Decisions by Monkeys</title>
<author>
<name sortKey="Teichert, Tobias" sort="Teichert, Tobias" uniqKey="Teichert T" first="Tobias" last="Teichert">Tobias Teichert</name>
<affiliation wicri:level="1">
<nlm:aff id="aff1">
<institution>Department of Neuroscience, Columbia University</institution>
<country>New York, NY, USA</country>
</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea></wicri:regionArea>
<wicri:regionArea># see nlm:aff region in country</wicri:regionArea>
</affiliation>
</author>
<author>
<name sortKey="Ferrera, Vincent P" sort="Ferrera, Vincent P" uniqKey="Ferrera V" first="Vincent P." last="Ferrera">Vincent P. Ferrera</name>
<affiliation wicri:level="1">
<nlm:aff id="aff1">
<institution>Department of Neuroscience, Columbia University</institution>
<country>New York, NY, USA</country>
</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea></wicri:regionArea>
<wicri:regionArea># see nlm:aff region in country</wicri:regionArea>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">PMC</idno>
<idno type="pmid">21151367</idno>
<idno type="pmc">2996133</idno>
<idno type="url">http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2996133</idno>
<idno type="RBID">PMC:2996133</idno>
<idno type="doi">10.3389/fnins.2010.00186</idno>
<date when="2010">2010</date>
<idno type="wicri:Area/Pmc/Corpus">001E60</idno>
<idno type="wicri:Area/Pmc/Curation">001E60</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en" level="a" type="main">Suboptimal Integration of Reward Magnitude and Prior Reward Likelihood in Categorical Decisions by Monkeys</title>
<author>
<name sortKey="Teichert, Tobias" sort="Teichert, Tobias" uniqKey="Teichert T" first="Tobias" last="Teichert">Tobias Teichert</name>
<affiliation wicri:level="1">
<nlm:aff id="aff1">
<institution>Department of Neuroscience, Columbia University</institution>
<country>New York, NY, USA</country>
</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea></wicri:regionArea>
<wicri:regionArea># see nlm:aff region in country</wicri:regionArea>
</affiliation>
</author>
<author>
<name sortKey="Ferrera, Vincent P" sort="Ferrera, Vincent P" uniqKey="Ferrera V" first="Vincent P." last="Ferrera">Vincent P. Ferrera</name>
<affiliation wicri:level="1">
<nlm:aff id="aff1">
<institution>Department of Neuroscience, Columbia University</institution>
<country>New York, NY, USA</country>
</nlm:aff>
<country xml:lang="fr">États-Unis</country>
<wicri:regionArea></wicri:regionArea>
<wicri:regionArea># see nlm:aff region in country</wicri:regionArea>
</affiliation>
</author>
</analytic>
<series>
<title level="j">Frontiers in Neuroscience</title>
<idno type="ISSN">1662-4548</idno>
<idno type="eISSN">1662-453X</idno>
<imprint>
<date when="2010">2010</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<textClass></textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">
<p>Sensory decisions may be influenced by non-sensory information regarding reward magnitude or reward likelihood. Given identical sensory information, it is more optimal to choose an option if it is
<italic>a priori</italic>
more likely to be correct and hence rewarded (prior reward likelihood bias), or if it yields a larger reward, given that it is the correct choice (reward magnitude bias). Here, we investigated the ability of macaque monkeys to integrate reward magnitude and prior reward likelihood information into a categorical decision about stimuli with high signal strength but variable decision uncertainty. In the asymmetric reward magnitude condition, monkeys over-adjusted their decision criterion such that they chose the highly rewarded alternative far more often than was optimal; in contrast, monkeys did not adjust their decision criterion in response to asymmetric reward likelihood. This finding shows that in this setting, monkeys did not adjust their decision criterion based on the product of reward likelihood and reward magnitude as has been reported to be the case in value-based decisions that do not involve decision uncertainty due to stimulus categorization.</p>
</div>
</front>
<back>
<div1 type="bibliography">
<listBibl>
<biblStruct>
<analytic>
<author>
<name sortKey="Bernoulli, D" uniqKey="Bernoulli D">D. Bernoulli</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Busemeyer, J R" uniqKey="Busemeyer J">J. R. Busemeyer</name>
</author>
<author>
<name sortKey="Myung, I J" uniqKey="Myung I">I. J. Myung</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Erev, I" uniqKey="Erev I">I. Erev</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Feng, S" uniqKey="Feng S">S. Feng</name>
</author>
<author>
<name sortKey="Holmes, P" uniqKey="Holmes P">P. Holmes</name>
</author>
<author>
<name sortKey="Rorie, A" uniqKey="Rorie A">A. Rorie</name>
</author>
<author>
<name sortKey="Newsome, W T" uniqKey="Newsome W">W. T. Newsome</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Fiorillo, C D" uniqKey="Fiorillo C">C. D. Fiorillo</name>
</author>
<author>
<name sortKey="Tobler, P N" uniqKey="Tobler P">P. N. Tobler</name>
</author>
<author>
<name sortKey="Schultz, W" uniqKey="Schultz W">W. Schultz</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Gold, J I" uniqKey="Gold J">J. I. Gold</name>
</author>
<author>
<name sortKey="Shadlen, M N" uniqKey="Shadlen M">M. N. Shadlen</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Green, D" uniqKey="Green D">D. Green</name>
</author>
<author>
<name sortKey="Swets, J" uniqKey="Swets J">J. Swets</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Hayden, B Y" uniqKey="Hayden B">B. Y. Hayden</name>
</author>
<author>
<name sortKey="Heilbronner, S R" uniqKey="Heilbronner S">S. R. Heilbronner</name>
</author>
<author>
<name sortKey="Nair, A C" uniqKey="Nair A">A. C. Nair</name>
</author>
<author>
<name sortKey="Platt, M L" uniqKey="Platt M">M. L. Platt</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Healy, A F" uniqKey="Healy A">A. F. Healy</name>
</author>
<author>
<name sortKey="Kubovy, M" uniqKey="Kubovy M">M. Kubovy</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Herrnstein, R J" uniqKey="Herrnstein R">R. J. Herrnstein</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Kahneman, D" uniqKey="Kahneman D">D. Kahneman</name>
</author>
<author>
<name sortKey="Tversky, A" uniqKey="Tversky A">A. Tversky</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Kiani, R" uniqKey="Kiani R">R. Kiani</name>
</author>
<author>
<name sortKey="Shadlen, M N" uniqKey="Shadlen M">M. N. Shadlen</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Maddox, W T" uniqKey="Maddox W">W. T. Maddox</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Mccoy, A N" uniqKey="Mccoy A">A. N. McCoy</name>
</author>
<author>
<name sortKey="Platt, M L" uniqKey="Platt M">M. L. Platt</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Navalpakkam, V" uniqKey="Navalpakkam V">V. Navalpakkam</name>
</author>
<author>
<name sortKey="Koch, C" uniqKey="Koch C">C. Koch</name>
</author>
<author>
<name sortKey="Perona, P" uniqKey="Perona P">P. Perona</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Rangel, A" uniqKey="Rangel A">A. Rangel</name>
</author>
<author>
<name sortKey="Camerer, C" uniqKey="Camerer C">C. Camerer</name>
</author>
<author>
<name sortKey="Montague, P R" uniqKey="Montague P">P. R. Montague</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Romo, R" uniqKey="Romo R">R. Romo</name>
</author>
<author>
<name sortKey="Salinas, E" uniqKey="Salinas E">E. Salinas</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Schall, J D" uniqKey="Schall J">J. D. Schall</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Sugrue, L P" uniqKey="Sugrue L">L. P. Sugrue</name>
</author>
<author>
<name sortKey="Corrado, G S" uniqKey="Corrado G">G. S. Corrado</name>
</author>
<author>
<name sortKey="Newsome, W T" uniqKey="Newsome W">W. T. Newsome</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Tobler, P N" uniqKey="Tobler P">P. N. Tobler</name>
</author>
<author>
<name sortKey="Fiorillo, C D" uniqKey="Fiorillo C">C. D. Fiorillo</name>
</author>
<author>
<name sortKey="Schultz, W" uniqKey="Schultz W">W. Schultz</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Voss, A" uniqKey="Voss A">A. Voss</name>
</author>
<author>
<name sortKey="Rothermund, K" uniqKey="Rothermund K">K. Rothermund</name>
</author>
<author>
<name sortKey="Brandtstadter, J" uniqKey="Brandtstadter J">J. Brandtstadter</name>
</author>
</analytic>
</biblStruct>
</listBibl>
</div1>
</back>
</TEI>
<pmc article-type="research-article">
<pmc-dir>properties open_access</pmc-dir>
<front>
<journal-meta>
<journal-id journal-id-type="nlm-ta">Front Neurosci</journal-id>
<journal-id journal-id-type="publisher-id">Front. Neurosci.</journal-id>
<journal-title-group>
<journal-title>Frontiers in Neuroscience</journal-title>
</journal-title-group>
<issn pub-type="ppub">1662-4548</issn>
<issn pub-type="epub">1662-453X</issn>
<publisher>
<publisher-name>Frontiers Research Foundation</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="pmid">21151367</article-id>
<article-id pub-id-type="pmc">2996133</article-id>
<article-id pub-id-type="doi">10.3389/fnins.2010.00186</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Neuroscience</subject>
<subj-group>
<subject>Original Research</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>Suboptimal Integration of Reward Magnitude and Prior Reward Likelihood in Categorical Decisions by Monkeys</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name>
<surname>Teichert</surname>
<given-names>Tobias</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
<xref ref-type="author-notes" rid="fn001">*</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Ferrera</surname>
<given-names>Vincent P.</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
</contrib>
</contrib-group>
<aff id="aff1">
<sup>1</sup>
<institution>Department of Neuroscience, Columbia University</institution>
<country>New York, NY, USA</country>
</aff>
<author-notes>
<fn fn-type="edited-by">
<p>Edited by: Daeyeol Lee, Yale University School of Medicine, USA</p>
</fn>
<fn fn-type="edited-by">
<p>Reviewed by: Benjamin Hayden, Duke University Medical Center, USA; Ming Hsu, University of California at Berkeley, USA</p>
</fn>
<corresp id="fn001">*Correspondence: Tobias Teichert, Department of Neuroscience, Columbia University, 1051 Riverside Drive, Unit 87, New York, NY 10032, USA. e-mail:
<email>tt2288@columbia.edu</email>
</corresp>
<fn fn-type="other" id="fn002">
<p>This article was submitted to Frontiers in Decision Neuroscience, a specialty of Frontiers in Neuroscience.</p>
</fn>
</author-notes>
<pub-date pub-type="epreprint">
<day>13</day>
<month>8</month>
<year>2010</year>
</pub-date>
<pub-date pub-type="epub">
<day>19</day>
<month>11</month>
<year>2010</year>
</pub-date>
<pub-date pub-type="collection">
<year>2010</year>
</pub-date>
<volume>4</volume>
<elocation-id>186</elocation-id>
<history>
<date date-type="received">
<day>02</day>
<month>8</month>
<year>2010</year>
</date>
<date date-type="accepted">
<day>16</day>
<month>10</month>
<year>2010</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright © 2010 Teichert and Ferrera.</copyright-statement>
<copyright-year>2010</copyright-year>
<license license-type="open-access" xlink:href="http://www.frontiersin.org/licenseagreement">
<license-p>This is an open-access article subject to an exclusive license agreement between the authors and the Frontiers Research Foundation, which permits unrestricted use, distribution, and reproduction in any medium, provided the original authors and source are credited.</license-p>
</license>
</permissions>
<abstract>
<p>Sensory decisions may be influenced by non-sensory information regarding reward magnitude or reward likelihood. Given identical sensory information, it is more optimal to choose an option if it is
<italic>a priori</italic>
more likely to be correct and hence rewarded (prior reward likelihood bias), or if it yields a larger reward, given that it is the correct choice (reward magnitude bias). Here, we investigated the ability of macaque monkeys to integrate reward magnitude and prior reward likelihood information into a categorical decision about stimuli with high signal strength but variable decision uncertainty. In the asymmetric reward magnitude condition, monkeys over-adjusted their decision criterion such that they chose the highly rewarded alternative far more often than was optimal; in contrast, monkeys did not adjust their decision criterion in response to asymmetric reward likelihood. This finding shows that in this setting, monkeys did not adjust their decision criterion based on the product of reward likelihood and reward magnitude as has been reported to be the case in value-based decisions that do not involve decision uncertainty due to stimulus categorization.</p>
</abstract>
<kwd-group>
<kwd>reward bias</kwd>
<kwd>categorization</kwd>
<kwd>signal detection theory</kwd>
<kwd>psychometric function</kwd>
</kwd-group>
<counts>
<fig-count count="7"></fig-count>
<table-count count="2"></table-count>
<equation-count count="19"></equation-count>
<ref-count count="21"></ref-count>
<page-count count="13"></page-count>
<word-count count="10750"></word-count>
</counts>
</article-meta>
</front>
<body>
<sec sec-type="introduction">
<title>Introduction</title>
<p>Typical decisions depend on sensory information that is evaluated with respect to certain rules as well as non-sensory information such as reward magnitude and reward likelihood. Sensory evidence and non-sensory information may favor different choices. Based on signal detection theory it is possible to show that there is an optimal way to integrate reward information by shifting the decision rule/criterion as a function of the unconditional expected value of the options (see below). The aim of this experiment was to test whether macaque monkeys, one of the primary animal models of human decision making, integrate sensory and reward information optimally from the point of view of signal detection theory. In particular, we tested whether manipulations of reward magnitude and reward likelihood have the same effect on the monkeys’ decision criteria as predicted by signal detection theory.</p>
<p>Decision making tasks can incorporate several elements: sensory evidence, rules, and outcome values. In value-based decision tasks subjects choose among options based on their payoffs which are learned either by instruction or experience (Sugrue et al.,
<xref ref-type="bibr" rid="B19">2005</xref>
; Rangel et al.,
<xref ref-type="bibr" rid="B16">2008</xref>
). In sensory decision tasks, subjects infer the correct choice from the sensory information which is evaluated with respect to a particular decision rule (Romo and Salinas,
<xref ref-type="bibr" rid="B17">2003</xref>
; Schall,
<xref ref-type="bibr" rid="B18">2003</xref>
; Gold and Shadlen,
<xref ref-type="bibr" rid="B6">2007</xref>
). Correct choices are rewarded equally regardless of which option was chosen. Real decision making, however, is often a mixture of sensory and value-based decision making: subjects choose between options that differ with respect to the rewards they offer, based on sensory information that may help guide the decision process toward the correct choice. The ensuing problem, a mixture of value-based and sensory decision making, will be referred to as
<italic>biased sensory decision making</italic>
. In biased sensory decision tasks, the optimal choice will depend on both sensory and non-sensory value information.</p>
<p>A typical example of biased sensory decision making is medical diagnostics: based on haptic or visual information a doctor needs to decide, for example, if a tumor is present or not. On the one hand, a random subject will be much more likely to be healthy, thus making the “no tumor” diagnosis almost a sure bet. On the other hand, failing to correctly diagnose a tumor will come at a much higher cost. In biased sensory decision tasks we can define the
<italic>unconditional expected value</italic>
as the product of reward (or cost) magnitude and the prior probability of the option being correct and hence rewarded. Based on this definition it can be shown that the optimal decision criterion which maximizes payoffs in the long run is a function of unconditional expected value (e.g., Green and Swets,
<xref ref-type="bibr" rid="B7">1966</xref>
; Feng et al.,
<xref ref-type="bibr" rid="B4">2009</xref>
or Materials and Methods). Hence, the optimal decision criterion is the same regardless of whether an option is
<italic>a priori</italic>
twice as likely to be correct, or associated with a reward that is twice as large.</p>
<p>The main finding in the field of value-based decision making is that the choice behavior of subjects may be approximated as a function of expected value. Human as well as non-human subjects will in general prefer the option with the largest expected value (matching law, e.g., Herrnstein,
<xref ref-type="bibr" rid="B10">1961</xref>
; expected utility theory, e.g., Bernoulli,
<xref ref-type="bibr" rid="B1">1738</xref>
; for systematic deviations from expected utility theory see, e.g., Kahneman and Tversky,
<xref ref-type="bibr" rid="B11">1979</xref>
). As subjects seem to be able to estimate the expected value of different response options and change their behavior accordingly, it is tempting to assume that the same mechanisms may also mediate the placement of decision criteria in biased sensory decision tasks. If this were the case, the shift of decision criteria could be approximated as a function of unconditional expected value.</p>
<p>It was the aim of the present study to investigate the relation between unconditional expected value and decision criteria in biased sensory decision tasks. In particular, we wanted to test whether changes of unconditional expected value have the same impact on decision criteria when caused by reward magnitude as opposed to prior reward likelihood manipulations. Our null hypothesis assumes that the neural network responsible for setting decision criteria in biased decision tasks will operate as a function of unconditional expected reward. In other words, we assumed that doubling the reward magnitude of an option has the same effect on decision criteria as doubling the likelihood of that option being correct, and hence rewarded. We speculate that neurons coding expected value/utility (Fiorillo et al.,
<xref ref-type="bibr" rid="B5">2005</xref>
; Tobler et al.,
<xref ref-type="bibr" rid="B20">2005</xref>
) could serve as the input to a network that could adjust decision criteria as a function of expected value/utility. Contrary to our expectations and predictions from signal detection theory, we found that decision criteria were only affected by manipulations of reward magnitude, and not by prior reward likelihood.</p>
</sec>
<sec sec-type="materials|methods" id="s1">
<title>Materials and Methods</title>
<sec>
<title>Subjects</title>
<p>Subjects were three male macaque monkeys (monkeys K, L, and C). At the time of the experiments the animals were between 5 and 9 years old and weighed between 8 and 12 kg.</p>
<p>Monkey C exhibited a strong response bias even in the absence of any reward manipulations. As this bias was probably due to damage caused during previous electrophysiological recordings, he was excluded from the second half of the experiments. Monkeys were prepared for the experiments by surgical implant of a post for restraining head movements and a scleral search coil to monitor eye-position. All methods were approved by the Institutional Animal Care and Use Committee at Columbia University and the New York State Psychiatric Institute.</p>
</sec>
<sec>
<title>Training and prior experience of the animals</title>
<p>Our goal was to approximate the natural behavior of macaque monkeys in a biased decision making task. To do so we selected the animals according to two criteria. First, we made sure that our subjects had never before been exposed to asymmetric reward magnitude or reward likelihood manipulations. Second, we used animals that had extensive experience (>12 months) with the sensory decision task in question. This reduced the potential effects of training history. All of the training procedures that were used (see below) are standard and did not encourage a particular pattern of behavior in the main experiment. In particular, we used standard shaping procedures to introduce the animals to the task. Prior to the implantation of the head-post, animals learned to touch a touch-screen for fluid rewards. Next, they learned to touch a fixation spot in order to initiate a trial and then to touch a single response target following the appearance of a particular visual stimulus. Once the monkeys were comfortable with the task, we started introducing a second distractor target on some fraction of the trials. Monkeys were only rewarded for choosing the correct target. Once the monkeys performed well on the task with the distractor target (>80% correct), a head-post was implanted and eye movements were tracked either with a scleral search coil or an infrared eye-tracker. The monkeys quickly learned to fixate a fixation spot for fluid rewards and switched naturally from signaling their choices in the decision task with manual responses to eye movements. At the time of the experiments reported here, the monkeys had extensive experience with the sensory decision task (>12 months) and potential effects of the training procedure on the outcome of the experiments were minimal.</p>
</sec>
<sec>
<title>Setup</title>
<p>The animals performed the task in an upright primate chair while head movements were restrained by a surgically implanted head-post. Visual stimuli were generated and controlled by a CRS VSG2/3F video frame buffer. Stimuli were displayed on a 60-Hz CRT-monitor (1280 × 1024 pixels) which was placed at a distance of 50 cm. For two of the animals (monkeys L and C), eye position was recorded with a scleral search-coil (CNC Engineering, Seattle, WA, USA) and digitized at a rate of 500 Hz. Prior to the implantation of an eye-coil halfway through the experiments, the eye position of monkey K was monitored with an infrared camera system and the free eye tracking software i_rec, with a temporal resolution of 60 Hz.</p>
</sec>
<sec>
<title>Biased sensory decision task</title>
<p>Monkeys categorized the speed of a moving random dot pattern as either fast or slow by making a saccade to a green or red target, respectively (see below or Figure
<xref ref-type="fig" rid="F1">1</xref>
A for more details). In the neutral condition, stimuli were selected such that all responses were equally likely to be correct and all correct responses were rewarded equally. On different days, we manipulated either the reward magnitudes associated with correct responses (
<italic>reward magnitude bias</italic>
) or biased the stimulus selection such that one of the responses had a higher likelihood of being correct and hence rewarded (
<italic>reward likelihood bias</italic>
). Both of these manipulations occurred in the context of a
<italic>category bias</italic>
condition during which higher reward magnitude/likelihood depended on the categorization, e.g., the “slow” category, i.e., the red target, was associated with higher reward magnitude/likelihood (see Figure
<xref ref-type="fig" rid="F1">1</xref>
B). Similarly, we implemented a
<italic>motor bias</italic>
condition for which higher reward magnitude/likelihood was associated with a particular motor response, e.g., higher reward magnitude/likelihood for a rightward saccade.</p>
<fig id="F1" position="float">
<label>Figure 1</label>
<caption>
<p>
<bold>Methods</bold>
.
<bold>(A)</bold>
Macaque monkeys were trained to categorize the speed of moving random dots as either slow or fast relative to a criterion speed which was learned by trial and error. Monkeys categorized speeds by making a saccade to the red and green target to signal slow and fast speeds, respectively.
<bold>(B)</bold>
In the reward magnitude condition, correct responses were either rewarded equally or according to one of four different asymmetric reward schedules. Reward magnitude was manipulated by changing the number of valve openings, as indicated by the number of water drops. A single valve opening corresponded to approximately 0.07 ml of water.</p>
</caption>
<graphic xlink:href="fnins-04-00186-g001"></graphic>
</fig>
<p>This gave rise to 2 (reward magnitude/likelihood) × 2 (category/motor) × 2 (fast/slow for category bias or right/left for motor bias) = 8 different bias conditions. On a given day, a subject experienced one block of trials from one of the eight bias conditions embedded in two blocks of neutral trials: neutral (200–400 trials) – biased (>600 trials) – neutral (until satiated). The rationale for embedding the biased condition in blocks of neutral trials was to examine the evolution of the behavioral bias from a neutral starting point. The reward magnitude and reward likelihood bias conditions were run in separate blocks of 4 days each, during which we presented the four possible bias conditions: “slow” and “fast” category bias as well as “right” and “left” motor bias.</p>
<p>On different blocks of 4 days we used different reward magnitude asymmetries: the number of valve openings (see below) for the favored versus the unfavored option could be either 4 to 3, 4 to 2, or 3 to 2. In the reward likelihood condition, the favored option was always twice as likely to be correct as the unfavored option.</p>
<p>An individual trial was initiated when the subject looked at a yellow fixation target in the center of the monitor. After a random delay, a coherently moving random dot pattern appeared in a 5° circular window around the fixation target. At the same time, the two gray potential saccade targets 7° to the left and right of the fixation spot turned red and green, respectively. The position of the red and green target to the left or right of the fixation point was chosen randomly. The monkeys were trained to associate slow speeds with the red target and fast speeds with the green target. Subjects signaled the outcome of the categorization process by making a saccade to the target with the corresponding color. The subjects were free to signal their choice at any time after stimulus onset. For each subject, the stimulus speed of a particular trial was drawn randomly from a set of 6–10 predetermined speeds. The speeds were spaced symmetrically around the cutoff speed of 5.5°/s. The range of speeds was adjusted for each subject individually to account for differences in performance.</p>
<p>Following a valid choice saccade (see below) and a random uniformly distributed delay (200–500 ms), auditory feedback was delivered. A high tone (880 Hz) indicated a correct response; a low tone (440 Hz) indicated a wrong response. If the response was correct, the subjects were required to keep fixating the target in order to receive the fluid reward associated with the correct response. The delay between the auditory feedback and the fluid reward was uniformly distributed between 350 and 650 ms. Reward magnitude was varied exclusively by changing the number of valve openings. The duration of an individual valve opening was constant over all sessions and animals and corresponded to approximately 0.07 ml of water. Each valve opening was accompanied by a very high tone (1200 Hz). Hence, the subjects could easily track reward magnitude by monitoring the number of tones. Multiple valve openings were separated by a pause of 100 ms. Reward delivery was aborted if fixation was broken, even if the intended number of valve openings had not been reached. The strict regulation of eye movements even after the decision can be understood in light of upcoming electrophysiological recordings in prefrontal cortex, which is known to contain neurons with strong oculomotor selectivity.</p>
<p>Eye movements during the task were restricted to a single saccade to one of the targets after stimulus onset (choice saccade). If the eye left the fixation window prior to stimulus onset, the trial was aborted. Similarly, if the eye left the fixation window after stimulus onset and failed to re-fixate on one of the two saccade targets, the trial was considered incomplete. Incomplete trials were indicated by a very low tone (220 Hz) and were never rewarded. If the eye failed to remain on the target of the initial saccade until the auditory feedback was given, the trial was considered revised. Revised trials were never rewarded and were indicated by a very distinctive auditory event: an upward followed by a downward sinusoidal sweep.</p>
<p>Incomplete and revised trials were not included in the analysis. In addition, we excluded complete trials with unrealistically short or uncharacteristically long reaction times. The lower reaction time cutoff was set at 110 ms. The upper cutoff was chosen as the 97.5th percentile of the reaction time distribution of the session in question.</p>
<p>In addition to the decision trials described above, subjects performed 10% instructed trials. In such trials the subjects did not have to categorize the stimulus since only a single saccade target appeared on the screen. In addition to the instructed trials, the monkeys also performed 10% fixation trials. During a fixation trial, only the saccade targets and not the random dots were presented on the screen. Monkeys were rewarded if they maintained fixation. This condition served as a control condition for electrophysiological recordings. In the current study it had no purpose other than to get the monkeys accustomed to it.</p>
<p>The timing parameters of the task differed slightly between the animals. For two of the monkeys the random delay prior to stimulus onset was uniformly distributed. For the initial sessions of monkey L the delay ranged between 300 and 500 ms; for monkey C the delay ranged between 200 and 500 ms. For monkey K, as well as for the later sessions of monkey L, the distribution of the delays followed a truncated exponential distribution with a rate parameter of 500 ms and a maximum value of 1500 ms. In addition, a fixed delay of 500 ms was added such that the total delay ranged between 500 and 2000 ms.</p>
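For illustration, this later-session delay scheme can be sampled with a short sketch (the helper name is hypothetical, and it assumes the stated “rate parameter of 500 ms” denotes the mean of the exponential):

```python
import numpy as np

rng = np.random.default_rng()

def stimulus_onset_delay_ms():
    """Sample one pre-stimulus delay in ms: truncated exponential plus a fixed 500 ms."""
    while True:  # rejection-sample the truncation at 1500 ms
        d = rng.exponential(scale=500.0)  # assumes "rate parameter" = exponential mean
        if d <= 1500.0:
            return 500.0 + d  # fixed offset -> total delay lies in [500, 2000] ms
```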
</sec>
<sec>
<title>Value-based decision task</title>
<p>In addition to the biased sensory decision task described above, subjects also performed a value-based decision task. The value-based decision task was designed to be as similar as possible to the biased sensory decision task, with the following modifications: (1) The sensory information (motion stimulus) was removed. Instead, subjects were free to choose either saccade target, but the targets were not always rewarded equally. Subjects were cued when to make a saccade by the disappearance of the fixation spot. (2) We used a round and blue fixation point as well as round saccade targets to facilitate switching between the two tasks for the monkeys. (3) In the biased sensory decision task only the correct target (as determined by the sensory stimulus) was rewarded. The difficulty of the sensory discrimination was adjusted such that the average percentage of correct responses was between 70 and 85%. Lower reward rates dramatically reduced the animals’ compliance. To maintain a similar reward rate on the value-based decision task, we set the reward probability in the neutral condition to 60 and 80% for monkeys K and L, respectively. After the first week, we were able to change the reward rate from 80 to 60% for monkey L without losing compliance. In different sessions, responses were rewarded either with 3 or 4 valve openings. Within a single session, reward magnitude never varied. A bias in favor of one of the response alternatives was introduced by setting the reward probability of the favored option to 80%, while setting that of the other option to 40%. (4) To encourage exploration of the alternatives we increased the percentage of instructed trials, i.e., trials with only a single response target. Initially we used 50% instructed trials. We changed this value to 40% after 1 week to increase the number of choice trials. In the biased sensory decision task, we were able to use a lower rate of instructed trials, because the easy trials served a similar purpose as the instructed trials. Except for these four differences, the biased sensory and the value-based decision tasks were identical.</p>
</sec>
<sec>
<title>Data analysis</title>
<p>In order to quantify the behavioral bias in different reward conditions of the biased sensory decision task, standard psychometric functions were fit to the data using a maximum likelihood method. To that aim, responses were binarized, with 0 corresponding to “slow” and 1 to “fast” judgments. The psychometric functions were modeled as cumulative Gaussians with three parameters which described the point of subjective equality (PSE, represented by
<italic>c</italic>
), the just noticeable difference (JND, represented by σ) and lapse parameter λ with 0 ≤ λ < 0.5:
<disp-formula id="E1">
<label>(1)</label>
<mml:math id="M1">
<mml:mrow>
<mml:mi>P</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>s</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo>=</mml:mo>
<mml:mo>λ</mml:mo>
<mml:mo>+</mml:mo>
<mml:mo stretchy="false">(</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo></mml:mo>
<mml:mn>2</mml:mn>
<mml:mo>λ</mml:mo>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo>Φ</mml:mo>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mfrac>
<mml:mrow>
<mml:mi>s</mml:mi>
<mml:mo></mml:mo>
<mml:mi>c</mml:mi>
</mml:mrow>
<mml:mo>σ</mml:mo>
</mml:mfrac>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
</disp-formula>
</p>
<p>Here Φ denotes the standard Gaussian cumulative distribution function. The lapse parameter is thought to model trials in which the animal, for whatever reason, does not perform the task and randomly chooses a response.</p>
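As a concrete illustration, a maximum likelihood fit of Eq. 1 might look as follows (a minimal sketch, not the authors' analysis code; the speeds and trial counts are hypothetical placeholder data):

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

def psychometric(s, c, sigma, lam):
    """Eq. 1: cumulative Gaussian with PSE c, JND sigma, and lapse rate lam."""
    return lam + (1.0 - 2.0 * lam) * norm.cdf((s - c) / sigma)

def neg_log_likelihood(params, speeds, n_fast, n_total):
    """Binomial negative log-likelihood of the choice data under Eq. 1."""
    c, sigma, lam = params
    p = np.clip(psychometric(speeds, c, sigma, lam), 1e-9, 1.0 - 1e-9)
    return -np.sum(n_fast * np.log(p) + (n_total - n_fast) * np.log(1.0 - p))

def fit_psychometric(speeds, n_fast, n_total):
    """Maximum likelihood fit of (PSE, JND, lapse), with 0 <= lapse < 0.5."""
    return minimize(neg_log_likelihood, x0=[5.5, 1.0, 0.02],
                    args=(speeds, n_fast, n_total),
                    bounds=[(0.0, 11.0), (1e-3, 10.0), (0.0, 0.499)],
                    method="L-BFGS-B")

# Hypothetical counts for six speeds around the 5.5 pix/frame boundary.
speeds = np.array([3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
n_total = np.full(6, 80)
n_fast = np.array([4, 12, 30, 52, 70, 77])
fit = fit_psychometric(speeds, n_fast, n_total)
print("PSE, JND, lapse:", fit.x)
```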
<p>Statistical significance was assessed with likelihood-ratio tests of appropriately constructed nested models. For example, to test for significant deviations of the PSE from 5.5 pix/frame (the actual category boundary between slow and fast speeds) we compared the likelihood of a model in which the PSE was allowed to vary freely with that of a model in which it was constrained to equal 5.5 pix/frame.</p>
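Reusing neg_log_likelihood and fit from the sketch above, such a nested likelihood-ratio test could be set up as follows (the chi-square reference with one degree of freedom corresponds to the single constrained parameter):

```python
from scipy.stats import chi2
from scipy.optimize import minimize

# Full model: PSE free (the fit above). Constrained model: PSE pinned at 5.5.
constrained = minimize(
    lambda q: neg_log_likelihood([5.5, q[0], q[1]], speeds, n_fast, n_total),
    x0=[1.0, 0.02], bounds=[(1e-3, 10.0), (0.0, 0.499)], method="L-BFGS-B")

# Twice the log-likelihood difference is asymptotically chi-square distributed.
lr_stat = 2.0 * (constrained.fun - fit.fun)
print("LR =", lr_stat, "p =", chi2.sf(lr_stat, df=1))
```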
<p>In addition, confidence intervals of the parameter estimates were assessed with a bootstrap procedure. To that aim we randomly generated binomially distributed responses
<italic>B</italic>
(
<italic>n</italic>
,
<italic>p</italic>
) for each condition, with
<italic>n</italic>
corresponding to the number of trials in a particular condition and
<italic>p</italic>
corresponding to the proportion of “fast” choices in that condition. A new psychometric function was fit to this bootstrapped data set. This procedure was repeated 1000 times, yielding a set of 1000 parameter estimates.</p>
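Continuing the same sketch, the bootstrap described above amounts to resampling binomial responses per condition and refitting Eq. 1:

```python
rng = np.random.default_rng(0)
p_obs = n_fast / n_total  # observed proportion of "fast" choices per condition
boot = np.array([fit_psychometric(speeds,
                                  rng.binomial(n_total, p_obs),  # B(n, p) draws
                                  n_total).x
                 for _ in range(1000)])  # 1000 refits of (PSE, JND, lapse)
print("PSE 95% CI:", np.percentile(boot[:, 0], [2.5, 97.5]))
```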
</sec>
<sec>
<title>Signal detection theory</title>
<p>Under the reasonable assumption that subjects will try to maximize reward, signal detection theory makes precise predictions about the shift of the decision criterion. In the following we will assume that the neuronal representation
<italic>X</italic>
of a stimulus speed
<italic>s</italic>
is variable from trial to trial, and can be represented as a Gaussian distribution with mean
<italic>s</italic>
and standard deviation σ:
<italic>X</italic>
∼ 
<italic>N</italic>
(
<italic>s</italic>
,σ). Further, it is assumed that whenever the neuronal representation
<italic>X</italic>
exceeds a certain criterion value
<italic>c</italic>
, the stimulus is categorized as “fast.” In this setting, a bias will be represented by a change in the criterion value
<italic>c</italic>
.</p>
<p>Let us assume that
<italic>r</italic>
<sub>1/2</sub>
represents the reward magnitude associated with the two choices, respectively. Let
<italic>s
<sub>c</sub>
</italic>
be the actual cutoff speed that separates slow from fast speeds. The expected reward for a particular choice, e.g., “slow,” is given by the product of reward magnitude for this choice and the probability that this choice is correct. The mathematical description is given by Eq.
<xref ref-type="disp-formula" rid="E2">2</xref>
:
<disp-formula id="E2">
<label>(2)</label>
<mml:math id="M2">
<mml:mrow>
<mml:msub>
<mml:mi>E</mml:mi>
<mml:mn>1</mml:mn>
</mml:msub>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>x</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo>=</mml:mo>
<mml:msub>
<mml:mi>r</mml:mi>
<mml:mn>1</mml:mn>
</mml:msub>
<mml:mi>P</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>s</mml:mi>
<mml:mo><</mml:mo>
<mml:msub>
<mml:mi>s</mml:mi>
<mml:mi>c</mml:mi>
</mml:msub>
<mml:mo>|</mml:mo>
<mml:mi>X</mml:mi>
<mml:mo>=</mml:mo>
<mml:mi>x</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:math>
</disp-formula>
</p>
<p>Assuming that there are only two possible stimulus speeds spaced symmetrically around the cutoff speed:
<italic>s</italic>
<sub>1/2</sub>
 = 
<italic>s
<sub>c</sub>
</italic>
 ± δ we can rewrite Eq.
<xref ref-type="disp-formula" rid="E2">2</xref>
:
<disp-formula id="E3">
<label>(3)</label>
<mml:math id="M3">
<mml:mrow>
<mml:msub>
<mml:mi>E</mml:mi>
<mml:mrow>
<mml:mn>1</mml:mn>
<mml:mo>/</mml:mo>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo>=</mml:mo>
<mml:msub>
<mml:mi>r</mml:mi>
<mml:mrow>
<mml:mn>1</mml:mn>
<mml:mo>/</mml:mo>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mi>P</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mi>s</mml:mi>
<mml:mo>=</mml:mo>
<mml:msub>
<mml:mi>s</mml:mi>
<mml:mrow>
<mml:mn>1</mml:mn>
<mml:mo>/</mml:mo>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mfrac>
<mml:mrow>
<mml:mo>φ</mml:mo>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mfrac>
<mml:mrow>
<mml:mi>x</mml:mi>
<mml:mo></mml:mo>
<mml:msub>
<mml:mi>s</mml:mi>
<mml:mrow>
<mml:mn>1</mml:mn>
<mml:mo>/</mml:mo>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msub>
</mml:mrow>
<mml:mo>σ</mml:mo>
</mml:mfrac>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:mstyle displaystyle="true">
<mml:munderover>
<mml:mo></mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mn>2</mml:mn>
</mml:munderover>
<mml:mo>φ</mml:mo>
</mml:mstyle>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mfrac>
<mml:mrow>
<mml:mi>x</mml:mi>
<mml:mo></mml:mo>
<mml:msub>
<mml:mi>s</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo>σ</mml:mo>
</mml:mfrac>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
</mml:math>
</disp-formula>
</p>
<p>Here φ denotes the density function of a normal distribution. The first two terms together represent the non-sensory information, i.e., the reward magnitude and the prior reward likelihood of an option. The third term represents the sensory information, i.e., how likely, given the sensory evidence, option one or two is to be the correct one. The cutoff
<italic>c</italic>
that optimizes expected reward can be found by equating the expected values of the two options:
<disp-formula id="E4">
<label>(4)</label>
<mml:math id="M4">
<mml:mrow>
<mml:msub>
<mml:mi>E</mml:mi>
<mml:mn>1</mml:mn>
</mml:msub>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>c</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo>=</mml:mo>
<mml:msub>
<mml:mi>E</mml:mi>
<mml:mn>2</mml:mn>
</mml:msub>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>c</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:math>
</disp-formula>
<disp-formula id="E5">
<label>(5)</label>
<mml:math id="M5">
<mml:mrow>
<mml:munder>
<mml:munder>
<mml:mrow>
<mml:msub>
<mml:mi>r</mml:mi>
<mml:mn>1</mml:mn>
</mml:msub>
<mml:mi>P</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>s</mml:mi>
<mml:mo>=</mml:mo>
<mml:msub>
<mml:mi>s</mml:mi>
<mml:mn>1</mml:mn>
</mml:msub>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mo stretchy="true"></mml:mo>
</mml:munder>
<mml:mtable columnalign="left">
<mml:mtr>
<mml:mtd>
<mml:mrow>
<mml:mtext>unconditional</mml:mtext>
</mml:mrow>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd>
<mml:mrow>
<mml:mtext>expected</mml:mtext>
<mml:mo></mml:mo>
<mml:mtext>value</mml:mtext>
</mml:mrow>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd>
<mml:mrow>
<mml:mtext>of</mml:mtext>
<mml:mo></mml:mo>
<mml:mtext>option</mml:mtext>
<mml:mo></mml:mo>
<mml:mtext>1</mml:mtext>
</mml:mrow>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:munder>
<mml:mo>φ</mml:mo>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mfrac>
<mml:mrow>
<mml:mi>c</mml:mi>
<mml:mo></mml:mo>
<mml:msub>
<mml:mi>s</mml:mi>
<mml:mn>1</mml:mn>
</mml:msub>
</mml:mrow>
<mml:mo>σ</mml:mo>
</mml:mfrac>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mo>=</mml:mo>
<mml:munder>
<mml:munder>
<mml:mrow>
<mml:msub>
<mml:mi>r</mml:mi>
<mml:mn>2</mml:mn>
</mml:msub>
<mml:mi>P</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>s</mml:mi>
<mml:mo>=</mml:mo>
<mml:msub>
<mml:mi>s</mml:mi>
<mml:mn>2</mml:mn>
</mml:msub>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mo stretchy="true"></mml:mo>
</mml:munder>
<mml:mtable columnalign="left">
<mml:mtr>
<mml:mtd>
<mml:mrow>
<mml:mtext>unconditional</mml:mtext>
</mml:mrow>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd>
<mml:mrow>
<mml:mtext>expected</mml:mtext>
<mml:mo></mml:mo>
<mml:mtext>value</mml:mtext>
</mml:mrow>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd>
<mml:mrow>
<mml:mtext>of</mml:mtext>
<mml:mo></mml:mo>
<mml:mtext>option</mml:mtext>
<mml:mo></mml:mo>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:munder>
<mml:mo>φ</mml:mo>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mfrac>
<mml:mrow>
<mml:mi>c</mml:mi>
<mml:mo></mml:mo>
<mml:msub>
<mml:mi>s</mml:mi>
<mml:mn>2</mml:mn>
</mml:msub>
</mml:mrow>
<mml:mo>σ</mml:mo>
</mml:mfrac>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
</disp-formula>
</p>
<p>In this form it is easy to appreciate that changing the reward magnitudes, i.e.,
<italic>r</italic>
<sub>1/2</sub>
has the same effect as changing the reward likelihood priors, i.e.,
<italic>P</italic>
(
<italic>s</italic>
 = 
<italic>s</italic>
<sub>1/2</sub>
). Further, we can define
<italic>r</italic>
<sub>1/2</sub>
<italic>P</italic>
(
<italic>s</italic>
 = 
<italic>s</italic>
<sub>1/2</sub>
) as the unconditional expected values of the two options, which are known before the visual stimuli appear. Based on this definition, we can conclude that the optimal decision criterion is a function of the unconditional expected value.</p>
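For the two-stimulus case of Eq. 5 this dependence can be made explicit: taking logarithms and solving for c gives a closed form (a derivation sketch under the equal-variance Gaussian assumptions above; the text does not state it in this form):

```latex
r_1\,P(s = s_1)\,\varphi\!\left(\frac{c - s_1}{\sigma}\right)
  = r_2\,P(s = s_2)\,\varphi\!\left(\frac{c - s_2}{\sigma}\right)
\quad\Longrightarrow\quad
c_{\text{opt}} = s_c + \frac{\sigma^{2}}{2\delta}\,
  \ln\frac{r_1\,P(s = s_1)}{r_2\,P(s = s_2)},
\qquad s_{1/2} = s_c \mp \delta .
```

The optimal criterion shifts with r and P only through their product, which is precisely the symmetry between reward magnitude and prior reward likelihood claimed above.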
<p>In the current experiment we used six different speeds symmetrically spaced around the cutoff speed:
<italic>s
<sub>i</sub>
</italic>
 = 
<italic>s
<sub>c</sub>
</italic>
 ± δ
<sub>1,2,3</sub>
. Equation
<xref ref-type="disp-formula" rid="E4">4</xref>
can easily be expanded to accommodate this situation:
<disp-formula id="E5a">
<label>(5)</label>
<mml:math id="M6">
<mml:mrow>
<mml:mstyle displaystyle="true">
<mml:munderover>
<mml:mo></mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mn>3</mml:mn>
</mml:munderover>
<mml:mrow>
<mml:msub>
<mml:mi>r</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mi>P</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mi>s</mml:mi>
<mml:mo>=</mml:mo>
<mml:msub>
<mml:mi>s</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mstyle>
<mml:mo></mml:mo>
<mml:mo>φ</mml:mo>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mfrac>
<mml:mrow>
<mml:mi>c</mml:mi>
<mml:mo></mml:mo>
<mml:msub>
<mml:mi>s</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo>σ</mml:mo>
</mml:mfrac>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mo>=</mml:mo>
<mml:mstyle displaystyle="true">
<mml:munderover>
<mml:mo></mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>4</mml:mn>
</mml:mrow>
<mml:mn>6</mml:mn>
</mml:munderover>
<mml:mrow>
<mml:msub>
<mml:mi>r</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mi>P</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mi>s</mml:mi>
<mml:mo>=</mml:mo>
<mml:msub>
<mml:mi>s</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mstyle>
<mml:mo></mml:mo>
<mml:mo>φ</mml:mo>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mfrac>
<mml:mrow>
<mml:mi>c</mml:mi>
<mml:mo></mml:mo>
<mml:msub>
<mml:mi>s</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo>σ</mml:mo>
</mml:mfrac>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
</disp-formula>
<disp-formula id="E6">
<label>(6)</label>
<mml:math id="M7">
<mml:mrow>
<mml:msub>
<mml:mi>r</mml:mi>
<mml:mrow>
<mml:mi>s</mml:mi>
<mml:mo><</mml:mo>
<mml:msub>
<mml:mi>s</mml:mi>
<mml:mi>c</mml:mi>
</mml:msub>
</mml:mrow>
</mml:msub>
<mml:mi>P</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mi>s</mml:mi>
<mml:mo><</mml:mo>
<mml:msub>
<mml:mi>s</mml:mi>
<mml:mi>c</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mstyle displaystyle="true">
<mml:munderover>
<mml:mo></mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mn>3</mml:mn>
</mml:munderover>
<mml:mo>φ</mml:mo>
</mml:mstyle>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mfrac>
<mml:mrow>
<mml:mi>c</mml:mi>
<mml:mo></mml:mo>
<mml:msub>
<mml:mi>s</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo>σ</mml:mo>
</mml:mfrac>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mo>=</mml:mo>
<mml:msub>
<mml:mi>r</mml:mi>
<mml:mrow>
<mml:mi>s</mml:mi>
<mml:mo>></mml:mo>
<mml:msub>
<mml:mi>s</mml:mi>
<mml:mi>c</mml:mi>
</mml:msub>
</mml:mrow>
</mml:msub>
<mml:mi>P</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mi>s</mml:mi>
<mml:mo>></mml:mo>
<mml:msub>
<mml:mi>s</mml:mi>
<mml:mi>c</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mstyle displaystyle="true">
<mml:munderover>
<mml:mo></mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mn>6</mml:mn>
</mml:munderover>
<mml:mo>φ</mml:mo>
</mml:mstyle>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mfrac>
<mml:mrow>
<mml:mi>c</mml:mi>
<mml:mo></mml:mo>
<mml:msub>
<mml:mi>s</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo>σ</mml:mo>
</mml:mfrac>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mo></mml:mo>
<mml:mo></mml:mo>
<mml:mo></mml:mo>
</mml:mrow>
</mml:math>
</disp-formula>
</p>
<p>Here
<inline-formula>
<tex-math id="M26">r_{s < s_c}</tex-math>
</inline-formula>
denotes the reward associated with a correct “slow” categorization, and vice versa. For a more detailed treatment see Feng et al. (
<xref ref-type="bibr" rid="B4">2009</xref>
). In the current context we determined the optimal shift
<italic>c</italic>
that solves Eq.
<xref ref-type="disp-formula" rid="E6">6</xref>
numerically.</p>
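A minimal numeric sketch of this step follows (illustrative σ, priors, and a 4:2 valve-opening asymmetry, not the fitted session parameters; the optimal criterion is the root of the expected-value difference):

```python
import numpy as np
from scipy.optimize import brentq
from scipy.stats import norm

s_c, sigma = 5.5, 1.0                       # category boundary and sensory noise
speeds = s_c + np.array([-2.5, -1.5, -0.5, 0.5, 1.5, 2.5])  # s_i = s_c ± δ
r_slow, r_fast = 4.0, 2.0                   # valve openings (illustrative 4:2 bias)
p_slow, p_fast = 0.5, 0.5                   # prior probability of each category

def expected_value_gap(c):
    """Left-hand side minus right-hand side of Eq. 6 at criterion c."""
    phi = norm.pdf((c - speeds) / sigma)
    return r_slow * p_slow * phi[:3].sum() - r_fast * p_fast * phi[3:].sum()

# The gap changes sign around s_c, so a bracketing root-finder suffices.
c_opt = brentq(expected_value_gap, s_c - 3.0, s_c + 3.0)
print(f"optimal criterion: {c_opt:.3f} (vs neutral boundary {s_c})")
```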
</sec>
</sec>
<sec>
<title>Results</title>
<p>The aim of our experiment was to investigate sensory decision criteria of macaque monkeys when they choose between response alternatives differing with respect to reward magnitude or prior reward likelihood. For each experimental session, we analyzed two groups of trials: the block of neutral trials completed prior to the introduction of the biased reward schedule and the block of biased trials. We excluded the trials immediately following the introduction of the new reward schedule to allow some time for adjusting the decision criteria to the new reward contingencies. The minimum learning cutoff we used was 100 trials. In conditions for which a large number of biased trials were available we also tested the effect of excluding more trials (up to 600 trials). No qualitative differences were found.</p>
<p>Using the nested likelihood-ratio tests described in the section
<xref ref-type="sec" rid="s1">“Materials and Methods”</xref>
we tested the following null-hypotheses for every recording session: (
<bold>H01</bold>
) The monkeys do
<bold>not</bold>
shift their decision criterion in response to the manipulations of reward contingencies, i.e., the decision criteria in the neutral and biased conditions are identical: PSE
<italic>
<sub>n</sub>
</italic>
 = PSE
<italic>
<sub>b</sub>
</italic>
. (
<bold>H02</bold>
) The stimulus discriminability does not change in the biased compared to the neutral condition: σ
<italic>
<sub>n</sub>
</italic>
 = σ
<italic>
<sub>b</sub>
</italic>
. (
<bold>H03</bold>
) The monkeys adopt the optimal decision criterion in the neutral condition: PSE
<italic>
<sub>n</sub>
</italic>
 = 5.5 pix/frame. (
<bold>H04</bold>
) In the biased block, the monkeys adopt a decision criterion at the optimal value derived from signal detection theory.</p>
<sec>
<title>Reward magnitude bias</title>
<p>Figure
<xref ref-type="fig" rid="F2">2</xref>
shows the results of the initial four reward magnitude sessions (fast, slow, rightward, and leftward bias). For all three animals the decision criteria in the biased condition differ from that in the neutral condition. The optimal decision criterion predicted by signal detection theory is indicated by the gray lines. In most cases, the observed decision criteria are shifted farther than optimal.</p>
<fig id="F2" position="float">
<label>Figure 2</label>
<caption>
<p>
<bold>Reward magnitude bias</bold>
. Psychometric functions for the first four recording sessions in the reward magnitude condition: the black color corresponds to the neutral condition averaged over all 4 days. The red and green colors correspond to trials in the slow and fast category bias conditions, respectively. In order to visualize the effect of the direction bias we separated the trials into two conditions: the so-called “slow direction bias” condition incorporates all trials where the red target could be reached by a saccade in the biased direction, i.e., red target on the left side during leftward bias and red target on the right side during rightward bias. Similarly, the “fast direction bias” condition corresponds to trials where the green target could be reached by a saccade in the biased direction. The orange and cyan colors correspond to the slow and fast direction bias, respectively. The box and whisker plots indicate confidence intervals of a bootstrap procedure (see
<xref ref-type="sec" rid="s1">Materials and Methods</xref>
for details). The gray lines indicate the predictions of the ideal observer analysis, which maximizes reward given the stimulus discriminability observed in the neutral condition. The results show a clear pattern: unbiased behavior in the neutral condition and large shifts of the PSE in the expected direction for the biased conditions.</p>
</caption>
<graphic xlink:href="fnins-04-00186-g002"></graphic>
</fig>
<p>To quantify the results we tested the four null-hypotheses H01–H04 for every recording session (see Table
<xref ref-type="table" rid="T1">1</xref>
). The results for H01 indicate that in almost all cases (45/48), the decision criteria shifted significantly in the direction predicted by the reward asymmetry. This finding provides strong evidence that reward magnitude manipulations have a pronounced effect on sensory decision criteria.</p>
<table-wrap id="T1" position="float">
<label>Table 1</label>
<caption>
<p>
<bold>Summary of the reward magnitude and reward likelihood experiments</bold>
.</p>
</caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th align="left" rowspan="1" colspan="1"></th>
<th align="center" colspan="2" rowspan="1">Magnitude bias</th>
<th align="center" colspan="2" rowspan="1">Likelihood bias</th>
</tr>
<tr>
<th align="left" rowspan="1" colspan="1"></th>
<th align="center" rowspan="1" colspan="1">Category</th>
<th align="center" rowspan="1" colspan="1">Direction</th>
<th align="center" rowspan="1" colspan="1">Category</th>
<th align="center" rowspan="1" colspan="1">Direction</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left" rowspan="1" colspan="1">
<bold>H01</bold>
<break></break>
PSE
<italic>
<sub>n</sub>
</italic>
 = PSE
<italic>
<sub>b</sub>
</italic>
</td>
<td align="left" rowspan="1" colspan="1">23 / 1 / 0 : 24</td>
<td align="left" rowspan="1" colspan="1">22 / 1 / 1 : 24</td>
<td align="left" rowspan="1" colspan="1">0 / 4 / 0 : 4</td>
<td align="left" rowspan="1" colspan="1">0 / 4 / 0 : 4</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1">
<bold>H02</bold>
<break></break>
σ
<italic>
<sub>n</sub>
</italic>
 = σ
<italic>
<sub>b</sub>
</italic>
</td>
<td align="left" rowspan="1" colspan="1">10 / 14 / 0 : 24
<break></break>
# “<” / “ns” / “>” : total</td>
<td align="left" rowspan="1" colspan="1">6 / 17 / 1 : 24</td>
<td align="left" rowspan="1" colspan="1">1 / 3 / 0 : 4</td>
<td align="left" rowspan="1" colspan="1">3 / 1 / 0 : 4</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1">
<bold>H03</bold>
<break></break>
PSE
<italic>
<sub>n</sub>
</italic>
 = optimal</td>
<td align="left" rowspan="1" colspan="1">5 / 14 / 5 : 24
<break></break>
# “>” / “ns” / “<” : total</td>
<td align="left" rowspan="1" colspan="1">6 / 11 / 7 : 24</td>
<td align="left" rowspan="1" colspan="1">0 / 3 / 1 : 4</td>
<td align="left" rowspan="1" colspan="1">0 / 3 / 1 : 4</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1">
<bold>H04</bold>
<break></break>
PSE
<italic>
<sub>b</sub>
</italic>
 = optimal</td>
<td align="left" rowspan="1" colspan="1">24 / 0 / 0 : 24
<break></break>
# “>” / “ns” / “<” : total</td>
<td align="left" rowspan="1" colspan="1">23 / 0 / 1 : 24</td>
<td align="left" rowspan="1" colspan="1">0 / 2 / 2 : 4</td>
<td align="left" rowspan="1" colspan="1">0 / 0 / 4 : 4</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p>
<italic>The four numbers in each individual cell correspond to the results of the nested hypothesis tests. The first and third numbers indicate the number of instances in which the test statistic was significantly larger or smaller than expected under the null hypothesis, respectively. The second number corresponds to the number of non-significant instances. Finally, the fourth number corresponds to the total number of tests. In H01 the test statistic was constructed such that the first number indicates the instances in which the direction of the difference in decision criteria was consistent with the ideal observer performance. In H02, the first number indicates the instances for which</italic>
σ
<italic>
<sub>b</sub>
was larger than</italic>
σ
<italic>
<sub>n</sub>
, i.e., when stimulus discriminability dropped in the block of biased trials. In H03 the first number corresponds to the instances when PSE
<sub>n</sub>
was significantly smaller than 5.5 pix/frame. Finally, in H04, the first number corresponds to the number of times the shift in PSE is larger than optimal</italic>
.</p>
</table-wrap-foot>
</table-wrap>
<p>H02 tested whether stimulus discriminability differed between the neutral and biased conditions. In 31 out of 48 cases it did not. For 16 out of the remaining 17 cases, we observed a decrease in stimulus discriminability in the biased condition (see also Figure
<xref ref-type="fig" rid="F2">2</xref>
). Additional analyses indicated that all animals exhibited this trend. However, it was most pronounced for monkey L which also tended to show the strongest effect of reward magnitude on the shift of the decision criteria (see for example Figure
<xref ref-type="fig" rid="F2">2</xref>
or Table
<xref ref-type="table" rid="T2">2</xref>
).</p>
<table-wrap id="T2" position="float">
<label>Table 2</label>
<caption>
<p>
<bold>Ratio of the actual reward volume to the reward volume that could have been earned with the optimal decision criterion</bold>
. Especially in the magnitude bias condition, monkeys lose a considerable amount of reward due to the suboptimal integration of sensory and non-sensory information.</p>
</caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th align="left" rowspan="1" colspan="1">Fraction of optimal reward</th>
<th align="left" rowspan="1" colspan="1">Magnitude bias</th>
<th align="left" rowspan="1" colspan="1">Likelihood bias</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left" rowspan="1" colspan="1">Monkey K</td>
<td align="left" rowspan="1" colspan="1">0.94</td>
<td align="left" rowspan="1" colspan="1">0.99</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1">Monkey L</td>
<td align="left" rowspan="1" colspan="1">0.93</td>
<td align="left" rowspan="1" colspan="1">0.98</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1">Monkey C</td>
<td align="left" rowspan="1" colspan="1">0.96</td>
<td align="left" rowspan="1" colspan="1">NA</td>
</tr>
</tbody>
</table>
</table-wrap>
<p>H03 tested whether the decision criteria were optimal, i.e., at 5.5 pix/frame, in the block of neutral trials. In 25 out of 48 sessions the placement of the decision criteria was not significantly different from optimal. Overall, there was no systematic trend for the decision criterion to deviate from 5.5 pix/frame in a particular direction. This is in part because monkey C, which did exhibit a systematic tendency to misplace the decision criterion in the neutral condition, only performed four sessions before being excluded from the experiment. The large percentage of significant deviations from optimality indicates non-systematic daily fluctuations in the decision criteria. It is possible that these fluctuations are due to residual carry-over effects from the biased reward schedule presented on the previous day.</p>
<p>H04 tested the optimality of the decision criterion in the biased trial blocks. In 47 out of 48 instances, the observed shift of the decision criterion was significantly larger than the optimal shift. Note that the stimulus discriminability σ tended to decrease in the blocks of biased trials (see H02). Hence, we used the lower stimulus discriminability in the biased block indicated by σ
<italic>
<sub>b</sub>
</italic>
to estimate the optimal shift. Thus, the observed overcompensation cannot be attributed to reduced stimulus discriminability in the biased condition. Figure
<xref ref-type="fig" rid="F5">5</xref>
illustrates the deviations from optimality in the neutral as well as the magnitude bias condition.</p>
<p>Further, we analyzed the impact of this overcompensation on the average reward magnitude. To that aim we estimated the expected reward magnitude that would be obtained using the optimal decision criterion and compared it with the expected reward magnitude predicted from the measured decision criterion. Table
<xref ref-type="table" rid="T2">2</xref>
reports the fraction of the expected reward magnitude using the actual and the optimal decision criterion. By over-adjusting the decision criterion, the three monkeys lost on average 7, 6, and 4% of the possible reward magnitude, respectively.</p>
</sec>
<sec>
<title>Prior reward likelihood bias</title>
<p>Figure
<xref ref-type="fig" rid="F3">3</xref>
shows the results in the biased reward likelihood condition for two monkeys. As monkey C had exhibited a natural bias even in the absence of any reward manipulation, he was excluded from the remainder of the experiment. In stark contrast to the biased reward magnitude condition, there seemed to be no effect of prior reward likelihood on the placement of the decision criteria. This observation is backed by the nested likelihood-ratio tests (Table
<xref ref-type="table" rid="T1">1</xref>
). The results from H01 show that, summed over both animals, significant differences between the biased and the neutral condition were found in none of the eight cases. Similarly, the results from H04 indicate that the decision bound was placed significantly suboptimally in all eight instances. However, in contrast to the reward magnitude condition, where the subjects shifted their criterion too far, here they did not shift it far enough. Results from H02 and H03 do not seem to differ significantly from the ones found in the reward magnitude condition.</p>
<fig id="F3" position="float">
<label>Figure 3</label>
<caption>
<p>
<bold>Reward likelihood bias</bold>
. Psychometric functions for the first four recording sessions in the reward likelihood condition. Conventions as in Figure
<xref ref-type="fig" rid="F2">2</xref>
. In contrast to the reward magnitude condition, monkeys did not shift their PSE in the biased conditions.</p>
</caption>
<graphic xlink:href="fnins-04-00186-g003"></graphic>
</fig>
<p>Due to the suboptimal decision criteria, monkeys lost some fraction of the possible reward volume. However, the losses were considerably smaller than in the reward magnitude condition. Monkeys K and L lost 1 and 2%, respectively (see Table
<xref ref-type="table" rid="T2">2</xref>
).</p>
<p>Figure
<xref ref-type="fig" rid="F4">4</xref>
summarizes the main result regarding the differences between the reward magnitude and reward likelihood condition (see also Figure
<xref ref-type="fig" rid="F5">5</xref>
). In all but one of the cases, the observed shift in the reward magnitude condition is significantly larger than the optimal shift. In contrast, for six out of eight instances in the reward likelihood condition, the observed shift is significantly smaller than the optimal shift. Figure
<xref ref-type="fig" rid="F4">4</xref>
B illustrates the temporal progression of the results. The lack of an adjustment of the decision criterion in the reward likelihood condition cannot be attributed to a linear trend over time: overcompensation in the reward magnitude condition was found both before and after the reward likelihood sessions.</p>
<fig id="F4" position="float">
<label>Figure 4</label>
<caption>
<p>
<bold>Population analysis</bold>
.
<bold>(A)</bold>
The observed shift of the decision criterion (PSE – 5.5 pix/frame) is plotted as a function of the optimal shift, which was calculated based on the stimulus discriminability observed on the same day in the biased condition. The green and blue colors correspond to the reward magnitude and reward likelihood conditions, respectively. In the reward magnitude condition, the animals consistently shift their decision criterion farther than optimal. In contrast, the decision criterion does not shift systematically in the reward likelihood condition.
<bold>(B)</bold>
The fraction of observed and optimal shift as a function of temporal progression of the recording sessions aligned to the first reward likelihood session. For both monkeys we observed overcompensation in the reward magnitude condition prior to and after the sessions which failed to find any compensation for the reward likelihood condition.</p>
</caption>
<graphic xlink:href="fnins-04-00186-g004"></graphic>
</fig>
<fig id="F5" position="float">
<label>Figure 5</label>
<caption>
<p>
<bold>Deviation from the optimal decision criterion in the neutral, reward magnitude and reward likelihood condition</bold>
. In the neutral condition there is no systematic deviation from optimality, despite considerable unsystematic variability. In the reward magnitude and the reward likelihood conditions we do observe systematic deviations from optimality. In the magnitude bias condition subjects shift their decision criterion too far, i.e., they over-compensate. In contrast, they do not shift their decision criterion far enough in the likelihood bias condition, i.e., they under-compensate.</p>
</caption>
<graphic xlink:href="fnins-04-00186-g005"></graphic>
</fig>
</sec>
<sec>
<title>Motor versus category bias</title>
<p>In each of the biased reward conditions (biased magnitude and biased likelihood), the bias was applied either to response direction (left or right) or response category (fast or slow). Visual inspection of Figure
<xref ref-type="fig" rid="F2">2</xref>
suggests that the effect of reward magnitude is larger in the category bias condition than in the direction bias condition. In fact, for the subset of sessions presented in Figure
<xref ref-type="fig" rid="F2">2</xref>
this difference is significant for two of the monkeys. We further tested this assumption on the population level. To that aim we used a linear model to predict observed shift as a function of optimal shift. We tested whether the fit of this model is improved by allowing the slope to vary as a function of bias type, i.e., category or motor bias. Our results indicate clearly that this is not the case (
<italic>p</italic>
 = 0.94). Overall, our results indicate that biased reward schedules have the same effects on the shift of decision criteria in category and motor bias conditions.</p>
</sec>
<sec>
<title>Value-based decision task</title>
<p>In the biased sensory decision task we found a striking dissociation between the subjects’ responses to the reward magnitude and reward likelihood manipulations. To ensure that these differences were not due to a general insensitivity of the subjects to reward likelihood, we tested their behavior in a standard value-based decision task. We found that both subjects readily adjusted their behavior to changes of reward likelihood in this task. Figure
<xref ref-type="fig" rid="F7">7</xref>
shows how the subjects’ preference shifts toward the option that is twice as likely (80%) to be rewarded as the other option (40%). Averaging the responses from trial number 201 to 1200 after the introduction of the asymmetric reward schedule, monkey K and monkey L chose the favored option 63 and 26% more often than the other, respectively. Monkey K adjusted his choices considerably faster than monkey L. Monkey L responded less to the reward likelihood manipulation, especially in the second half of the experiment. Instead, his behavior was strongly biased toward his naturally preferred response option, a rightward saccade. Nevertheless, the results show that both monkeys responded robustly to the changes in reward likelihood. For 10 out of 11 sessions we found that the animals chose the biased option significantly more often in the biased block than in the neutral block of trials (two-sample test for equality of proportions with continuity correction, α = 0.05).</p>
</sec>
</sec>
<sec sec-type="discussion">
<title>Discussion</title>
<p>In the current study we measured sensory decision criteria of macaque monkeys in response to manipulations of reward magnitude and prior reward likelihood favoring one of the two perceptual categories (“fast” or “slow”), or one of the two possible motor responses (leftward or rightward saccade). We report two main findings: first, decision criteria did shift in response to manipulations of reward magnitude. Ideal observer analysis showed that the observed shift was significantly larger than the value that would have optimized reward volume in the long run. Second, decision criteria did not shift in response to manipulations of prior reward likelihood. This is in clear contrast to human observers whose decision criteria are sensitive to changes in prior reward likelihood (e.g., Maddox,
<xref ref-type="bibr" rid="B13">2002</xref>
). In the following we will discuss the implications of our findings in greater detail.</p>
<sec>
<title>Reward magnitude versus likelihood bias</title>
<p>Our results show substantial and significant differences in the subjects’ decision criteria in response to manipulations of reward magnitude on the one hand and reward likelihood on the other. The decision criteria readily adjusted to the manipulation of reward magnitude (see Figure
<xref ref-type="fig" rid="F2">2</xref>
). In our animals, which had never before experienced the reward magnitude manipulation, decision criteria began shifting within the first 100 trials (Figure
<xref ref-type="fig" rid="F6">6</xref>
). Within less than 200 trials the decision criterion reached a relatively stable level which was shifted significantly farther than predicted by an ideal observer model. The suboptimal placement of the decision bound in the reward magnitude condition cost the monkeys on average 6% of the expected reward volume, even after accounting for decreased stimulus discriminability in the biased condition. This value is substantially larger than the 1–2% reported previously in a related task (Feng et al.,
<xref ref-type="bibr" rid="B4">2009</xref>
).</p>
<fig id="F6" position="float">
<label>Figure 6</label>
<caption>
<p>
<bold>Decision criteria as a function of time from the introduction of the biased reward schedule</bold>
. For each speed we calculated moving averages of percent “fast” choices as a function of time from the onset of the biased reward schedule. The moving average was calculated with a box-car kernel of ±25 trials. We fit a psychometric function to the data from each time-point in the moving average. Here we show the time-resolved PSE of the fitted functions. For both monkeys in both bias conditions, the decision criteria start moving in the predicted direction immediately after the introduction of the biased reward schedule. They reach a reasonably stable level after about 200 trials. Note that the graph averages over all trials of a given monkey and condition. Hence, a single line comprises data from, e.g., both the “fast” and “slow” category bias conditions, regardless of the reward asymmetry used on that particular day (4 versus 2, 4 versus 3, or 3 versus 2 valve openings). These different conditions are balanced only for the first 600 trials.
</caption>
<graphic xlink:href="fnins-04-00186-g006"></graphic>
</fig>
<fig id="F7" position="float">
<label>Figure 7</label>
<caption>
<p>
<bold>Results from the value-based decision task</bold>
. The proportion of choices in favor of the biased response is plotted separately for the neutral and the biased blocks. Data from a particular bias type are connected by a dotted line. The plotting symbol and line type indicate the subject. In every single session, subjects shifted their preferences in favor of the response that was more likely to be correct. On most days this effect was quite strong and caused the monkeys to choose the biased response 80% of the time or more. On some days, however, the effect was rather small. This seemed to be the case when an animal already had a very strong bias, e.g., toward a particular direction, and the likelihood manipulation favored a particular color. Further, while both monkeys responded to the reward likelihood manipulations, monkey K responded more strongly and with a shorter delay than monkey L. As monkey K already had a very strong natural rightward preference, we did not run the rightward bias condition. The response bias in the neutral block is caused mainly by carry-over effects from the bias introduced on the previous day.</p>
</caption>
<graphic xlink:href="fnins-04-00186-g007"></graphic>
</fig>
<p>In stark contrast, the decision criteria of the monkeys did
<bold>not</bold>
adjust to manipulations of prior reward likelihood (see Figure
<xref ref-type="fig" rid="F3">3</xref>
). This led to an average reduction in expected reward volume of 2%. Even after up to 1000 trials with unequal reward likelihoods we failed to find any sign that might hint at a shift of decision criteria. Clearly, we cannot rule out that such a shift would have developed if the same likelihood bias had been maintained for several thousands of trials distributed over multiple days. Similarly, our study does not exclude the possibility that the animals will eventually learn to adopt more optimal decision thresholds in both conditions as they gain more experience with the biased decision making task in general. However, this does not take away from our main finding, which is that naive animals exhibit dramatically different responses to manipulations of reward magnitude on the one hand and prior reward likelihood on the other.</p>
<p>One possible explanation for the striking lack of an effect in response to the prior reward likelihood manipulation is that our subjects were insensitive to changes of reward likelihood in general, or that the differences in reward likelihood were too small to notice. However, our results from the value-based decision task show that this is not the case. In this task the subjects readily adjusted their behavior to differences in reward likelihood of the same magnitude. Hence, it seems more likely that the subjects failed to establish a connection between prior reward likelihood, decision criteria and reward maximization. This link may have been difficult to establish in the biased decision task, as there were two sources of information indicating which of the two options was more likely to be correct: the sensory evidence evaluated on a trial by trial basis and the prior reward likelihood estimated over a large sample of trials. The sensory evidence may in some way have prevented the use of the prior reward likelihood information. It is possible that differences in reward magnitude are more salient than differences in reward likelihood: reward size is experienced immediately while reward probability emerges over multiple trials. However, it is important to note that the reward magnitude effect develops gradually over more than 100 trials, suggesting that reward magnitude is also evaluated over a longer time scale.</p>
</sec>
<sec id="s2">
<title>Comparison between human and macaque observers</title>
<p>A large body of literature has examined the phenomenology of biased sensory decision making in human observers: These studies investigated how subjects adjust their decision criteria in categorization tasks with unequal reward magnitudes and/or prior reward likelihood (Busemeyer and Myung,
<xref ref-type="bibr" rid="B2">1992</xref>
; Erev,
<xref ref-type="bibr" rid="B3">1998</xref>
; Maddox,
<xref ref-type="bibr" rid="B13">2002</xref>
; Voss et al.,
<xref ref-type="bibr" rid="B21">2008</xref>
; Feng et al.,
<xref ref-type="bibr" rid="B4">2009</xref>
; Navalpakkam et al.,
<xref ref-type="bibr" rid="B15">2009</xref>
). In the following, we want to relate our findings in macaque monkeys to the human literature in general and one very simple conceptual model in particular.</p>
<p>Several studies have shown that human subjects indeed adjust their decision criteria in response to both unequal reward magnitudes and prior reward likelihoods. However, subjects are conservative and do not adjust their criterion far enough in order to optimize expected reward (Green and Swets,
<xref ref-type="bibr" rid="B7">1966</xref>
; Healy and Kubovy,
<xref ref-type="bibr" rid="B9">1981</xref>
). Further, several studies have shown that humans place their decision criteria closer to the optimal criterion when faced with unequal prior reward likelihoods rather than unequal reward magnitudes (reviewed for example in Maddox,
<xref ref-type="bibr" rid="B13">2002</xref>
).</p>
<p>Macaque monkeys and humans behave similarly only in response to the reward magnitude manipulation: both species shift their decision criteria in response to such manipulations. However, while human subjects under-compensate, i.e., they do not shift their decision criteria far enough, macaques over-compensate (see Figures
<xref ref-type="fig" rid="F2">2</xref>
and
<xref ref-type="fig" rid="F4">4</xref>
as well as Feng et al.,
<xref ref-type="bibr" rid="B4">2009</xref>
). In contrast, the most pronounced interspecies differences can be observed in the prior reward likelihood condition: while human subjects show large and close to optimal shifts in this condition, macaques failed to show any significant shift in our biased decision task.</p>
<p>These clear differences suggest that different mechanisms underlie the adjustment of decision bounds in the two species. In the following we will relate our results to a very simple conceptual model which has been put forward to explain the behavior of human subjects in similar tasks: the so-called
<bold>competition-between-reward-and-accuracy</bold>
(COBRA) hypothesis holds that decision criteria arise as a compromise between accuracy and reward maximization.</p>
<p>The amount Δ
<italic>c
<sub>A</sub>
</italic>
by which decision criteria need to be shifted from the actual category bound
<italic>s
<sub>c</sub>
</italic>
in order to maximize accuracy is a function of the prior reward likelihoods, i.e., the base rates of the different categories:
<disp-formula id="E7">
<label>(7)</label>
<mml:math id="M8">
<mml:mrow>
<mml:mo>Δ</mml:mo>
<mml:msub>
<mml:mi>c</mml:mi>
<mml:mi>A</mml:mi>
</mml:msub>
<mml:mo>=</mml:mo>
<mml:msub>
<mml:mi>f</mml:mi>
<mml:mo>σ</mml:mo>
</mml:msub>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mfrac>
<mml:mrow>
<mml:mi>P</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>s</mml:mi>
<mml:mo><</mml:mo>
<mml:mi>c</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mi>P</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>s</mml:mi>
<mml:mo>></mml:mo>
<mml:mi>c</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
</disp-formula>
</p>
<p>The criterion value
<italic>c
<sub>A</sub>
</italic>
is the sum of actual category bound
<italic>s
<sub>c</sub>
</italic>
, and the optimal accuracy criterion shift:
<italic>c
<sub>A</sub>
</italic>
 = 
<italic>s
<sub>c</sub>
</italic>
+ Δ
<italic>c
<sub>A</sub>
</italic>
. The index σ indicates that the optimal shift also depends on how well an observer is able to discriminate the different stimuli. The precise form of the functional relationship
<italic>f</italic>
<sub>σ</sub>
is given by Eq.
<xref ref-type="disp-formula" rid="E6">6</xref>
. The criterion shift Δ
<italic>c
<sub>R</sub>
</italic>
which maximizes reward volume is a function of unconditional expected values:
<disp-formula id="E8">
<label>(8)</label>
<mml:math id="M9">
<mml:mrow>
<mml:mo>Δ</mml:mo>
<mml:msub>
<mml:mi>c</mml:mi>
<mml:mi>R</mml:mi>
</mml:msub>
<mml:mo>=</mml:mo>
<mml:msub>
<mml:mi>f</mml:mi>
<mml:mo>σ</mml:mo>
</mml:msub>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mfrac>
<mml:mrow>
<mml:msub>
<mml:mi>r</mml:mi>
<mml:mrow>
<mml:mi>s</mml:mi>
<mml:mo><</mml:mo>
<mml:msub>
<mml:mi>s</mml:mi>
<mml:mi>c</mml:mi>
</mml:msub>
</mml:mrow>
</mml:msub>
<mml:mi>P</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>s</mml:mi>
<mml:mo><</mml:mo>
<mml:mi>c</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:msub>
<mml:mi>r</mml:mi>
<mml:mrow>
<mml:mi>s</mml:mi>
<mml:mo>></mml:mo>
<mml:msub>
<mml:mi>s</mml:mi>
<mml:mi>c</mml:mi>
</mml:msub>
</mml:mrow>
</mml:msub>
<mml:mi>P</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>s</mml:mi>
<mml:mo>></mml:mo>
<mml:mi>c</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
</disp-formula>
</p>
<p>The COBRA hypothesis assumes that the actual criterion shift Δ
<italic>c</italic>
is the weighted average of Δ
<italic>c
<sub>A</sub>
</italic>
and Δ
<italic>c
<sub>R</sub>
</italic>
. Consequently, the criterion
<italic>c</italic>
is given by:
<disp-formula id="E9">
<label>(9)</label>
<mml:math id="M10">
<mml:mrow>
<mml:mi>c</mml:mi>
<mml:mo>=</mml:mo>
<mml:msub>
<mml:mi>s</mml:mi>
<mml:mi>c</mml:mi>
</mml:msub>
<mml:mo>+</mml:mo>
<mml:msub>
<mml:mi>w</mml:mi>
<mml:mo>α</mml:mo>
</mml:msub>
<mml:mo>Δ</mml:mo>
<mml:msub>
<mml:mi>c</mml:mi>
<mml:mi>A</mml:mi>
</mml:msub>
<mml:mo>+</mml:mo>
<mml:msub>
<mml:mi>w</mml:mi>
<mml:mo>ρ</mml:mo>
</mml:msub>
<mml:mo>Δ</mml:mo>
<mml:msub>
<mml:mi>c</mml:mi>
<mml:mi>R</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</disp-formula>
</p>
<p>In this context, the behavior of human subjects can be approximated by setting both coefficients to non-zero values with a sum of less than 1 (e.g.,
<italic>w</italic>
<sub>α</sub>
 = 0.2,
<italic>w</italic>
<sub>ρ</sub>
 = 0.7). In the reward likelihood condition, Δ
<italic>c
<sub>A</sub>
</italic>
is identical to Δ
<italic>c
<sub>R</sub>
</italic>
. Consequently,
<italic>w</italic>
<sub>α</sub>
and
<italic>w</italic>
<sub>ρ</sub>
add up to close to one, and lead to an almost optimal decision criterion. In the reward magnitude condition the two goals are mutually exclusive, i.e., Δ
<italic>c
<sub>R</sub>
</italic>
 > Δ
<italic>c
<sub>A</sub>
</italic>
 = 0. Hence
<italic>w</italic>
<sub>α</sub>
and
<italic>w
<sub>ρ</sub>
</italic>
do not add up, which causes a smaller and clearly suboptimal shift (see Maddox,
<xref ref-type="bibr" rid="B13">2002</xref>
for details). The behavior of the macaques in our task cannot be explained in this framework of competition between accuracy and reward maximization: the strong effects of the reward magnitude manipulation on decision criteria would suggest a large coefficient
<italic>w</italic>
<sub>ρ</sub>
for the reward criterion. However, the lack of an effect in the reward likelihood condition would suggest negligible weights for both accuracy and reward maximization.</p>
<p>It is possible to expand the COBRA hypothesis in order to accommodate the behavior of the macaques. We will refer to this extension more generally as the
<bold>criteria competition approach</bold>
. Parallel to the accuracy criterion shift we define the magnitude criterion shift Δ
<italic>c
<sub>M</sub>
</italic>
as a function of reward magnitude:
<disp-formula id="E10">
<label>(10)</label>
<mml:math id="M11">
<mml:mrow>
<mml:mo>Δ</mml:mo>
<mml:msub>
<mml:mi>c</mml:mi>
<mml:mi>M</mml:mi>
</mml:msub>
<mml:mo>=</mml:mo>
<mml:msub>
<mml:mi>f</mml:mi>
<mml:mo>σ</mml:mo>
</mml:msub>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mfrac>
<mml:mrow>
<mml:msub>
<mml:mi>r</mml:mi>
<mml:mrow>
<mml:mi>s</mml:mi>
<mml:mo><</mml:mo>
<mml:msub>
<mml:mi>s</mml:mi>
<mml:mi>c</mml:mi>
</mml:msub>
</mml:mrow>
</mml:msub>
</mml:mrow>
<mml:mrow>
<mml:msub>
<mml:mi>r</mml:mi>
<mml:mrow>
<mml:mi>s</mml:mi>
<mml:mo>></mml:mo>
<mml:msub>
<mml:mi>s</mml:mi>
<mml:mi>c</mml:mi>
</mml:msub>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
</disp-formula>
</p>
<p>Just as the accuracy criterion
<italic>c
<sub>A</sub>
</italic>
maximizes reward under the assumption of equal payoffs, the magnitude criterion
<italic>c
<sub>M</sub>
</italic>
maximizes reward under the assumption of equal base rates. Note that neither of the two criteria optimizes reward if base rates
<bold>and</bold>
payoffs are unequal at the same time. In such cases, the expected reward criterion
<italic>c
<sub>R</sub>
</italic>
is the only optimal choice. In the criterion competition model, the actual criterion
<italic>c</italic>
arises as the weighted average of all three criteria:
<disp-formula id="E11">
<label>(11)</label>
<mml:math id="M12">
<mml:mrow>
<mml:mi>c</mml:mi>
<mml:mo>=</mml:mo>
<mml:msub>
<mml:mi>s</mml:mi>
<mml:mi>c</mml:mi>
</mml:msub>
<mml:mo>+</mml:mo>
<mml:msub>
<mml:mi>w</mml:mi>
<mml:mo>α</mml:mo>
</mml:msub>
<mml:mo>Δ</mml:mo>
<mml:msub>
<mml:mi>c</mml:mi>
<mml:mi>A</mml:mi>
</mml:msub>
<mml:mo>+</mml:mo>
<mml:msub>
<mml:mi>w</mml:mi>
<mml:mo>ρ</mml:mo>
</mml:msub>
<mml:mo>Δ</mml:mo>
<mml:msub>
<mml:mi>c</mml:mi>
<mml:mi>R</mml:mi>
</mml:msub>
<mml:mo>+</mml:mo>
<mml:msub>
<mml:mi>w</mml:mi>
<mml:mo>μ</mml:mo>
</mml:msub>
<mml:mo>Δ</mml:mo>
<mml:msub>
<mml:mi>c</mml:mi>
<mml:mi>M</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</disp-formula>
</p>
<p>Behavior of the macaques in our task can be emulated by setting
<italic>w</italic>
<sub>μ</sub>
to a value larger than 1 and fixing both
<italic>w</italic>
<sub>α</sub>
and
<italic>w</italic>
<sub>ρ</sub>
to 0 (e.g.,
<italic>w</italic>
<sub>α</sub>
 = 0,
<italic>w</italic>
<sub>ρ</sub>
 = 0, <italic>w</italic>
<sub>μ</sub>
 = 2). The behavior of the human subjects can of course also be expressed in this larger framework. Due to the additional degree of freedom, there are a number of combinations of coefficients which can emulate the human behavior.</p>
</sec>
<sec>
<title>Category versus motor bias</title>
<p>Categorization tasks used in macaques typically confound perceptual category with the motor response used to signal this category (Feng et al.,
<xref ref-type="bibr" rid="B4">2009</xref>
). Hence, a bias in favor of a perceptual category may actually be represented as a motor bias favoring a particular motor response. So far, it is not known whether the neural mechanisms and the psychophysical effects of a category bias are identical to those of a motor bias. Thus, if humans acquired a category bias as favored by their instructions, and the monkeys a motor bias, the difference between the two species may actually reflect a difference between motor and category bias.</p>
<p>We tested this hypothesis by investigating whether decision bounds are affected equally by a motor and a category bias. We used a simple experimental procedure to dissociate category membership from motor response. This enabled us to compare the effects of a motor to a category bias. Overall, our results do not support the idea that category and motor biases have distinct effects on the placement of decision bounds. The initial finding of larger shifts in the category compared to the motor bias condition (Figure
<xref ref-type="fig" rid="F2">2</xref>
) could not be replicated later in the experiment. In summary, our results indicate that differences between human and macaque subjects cannot be attributed to differences between motor and category biases.</p>
</sec>
<sec>
<title>Overcompensation in the reward magnitude condition</title>
<p>In the following we will focus on the overcompensation that was consistently observed in the reward magnitude condition. First, we will review the mechanisms that have been put forward to explain the under-compensation of humans in similar tasks and explore whether these mechanisms may account for the overcompensation found in monkeys. Further, we will discuss two new approaches, one based on decision confidence and one based on an alternative optimization strategy which we refer to as operant matching.</p>
<sec>
<title>Utility function</title>
<p>One possible explanation of the monkeys’ overcompensation is based on the assumption that subjects maximize utility of the rewards, not reward volume
<italic>per se</italic>
. In addition, different shapes of the utility functions for the two species may explain the different behavior of human and macaque observers in such tasks. For example, if human subjects have a concave utility function, they will value big rewards relatively less, and the shift which optimizes expected utility is smaller than the one that optimizes expected reward. Indeed, humans show signs of concave utility functions in a number of situations (e.g., Kahneman and Tversky,
<xref ref-type="bibr" rid="B11">1979</xref>
). Additional support for this mechanism has been presented by Navalpakkam et al. (
<xref ref-type="bibr" rid="B15">2009</xref>
). When they encouraged human subjects to interpret rewards linearly by adding an extra cash prize for the subject with the best performance, they observed optimal criterion shifts.</p>
<p>Similarly, this mechanism may be responsible for the suboptimally large criterion shifts observed for the monkeys. However, in contrast to the concave utility function which explains the under-compensation of the humans, the monkeys’ overcompensation needs to be explained by convex utility functions. We rewrite Eq.
<xref ref-type="disp-formula" rid="E6">6</xref>
and replace the rewards
<italic>r
<sub>i</sub>
</italic>
with the utility of these rewards as denoted by
<italic>u</italic>
[
<italic>r
<sub>i</sub>
</italic>
]. In addition, we replace the variable decision criterion
<italic>c</italic>
with the observed decision criterion, i.e., PSE. We exploit the fact that at the PSE, the expected utility of both choices is equal:
<disp-formula id="E12">
<label>(12)</label>
<mml:math id="M13">
<mml:mtable columnalign="left">
<mml:mtr>
<mml:mtd>
<mml:mi>u</mml:mi>
<mml:mrow>
<mml:mo>[</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>r</mml:mi>
<mml:mrow>
<mml:mi>s</mml:mi>
<mml:mo><</mml:mo>
<mml:msub>
<mml:mi>s</mml:mi>
<mml:mi>c</mml:mi>
</mml:msub>
</mml:mrow>
</mml:msub>
</mml:mrow>
<mml:mo>]</mml:mo>
</mml:mrow>
<mml:munder>
<mml:munder>
<mml:mrow>
<mml:mi>P</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mi>s</mml:mi>
<mml:mo><</mml:mo>
<mml:msub>
<mml:mi>s</mml:mi>
<mml:mi>c</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mstyle displaystyle="true">
<mml:munderover>
<mml:mo></mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mn>3</mml:mn>
</mml:munderover>
<mml:mo>φ</mml:mo>
</mml:mstyle>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mfrac>
<mml:mrow>
<mml:mtext>PSE</mml:mtext>
<mml:mo></mml:mo>
<mml:msub>
<mml:mi>s</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo>σ</mml:mo>
</mml:mfrac>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mo stretchy="true"></mml:mo>
</mml:munder>
<mml:mrow>
<mml:msub>
<mml:mi>p</mml:mi>
<mml:mn>1</mml:mn>
</mml:msub>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>c</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:munder>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd>
<mml:mtext></mml:mtext>
<mml:mo>=</mml:mo>
<mml:mi>u</mml:mi>
<mml:mrow>
<mml:mo>[</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>r</mml:mi>
<mml:mrow>
<mml:mi>s</mml:mi>
<mml:mo>></mml:mo>
<mml:msub>
<mml:mi>s</mml:mi>
<mml:mi>c</mml:mi>
</mml:msub>
</mml:mrow>
</mml:msub>
</mml:mrow>
<mml:mo>]</mml:mo>
</mml:mrow>
<mml:munder>
<mml:munder>
<mml:mrow>
<mml:mi>P</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mi>s</mml:mi>
<mml:mo>></mml:mo>
<mml:msub>
<mml:mi>s</mml:mi>
<mml:mi>c</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mstyle displaystyle="true">
<mml:munderover>
<mml:mo></mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>4</mml:mn>
</mml:mrow>
<mml:mn>6</mml:mn>
</mml:munderover>
<mml:mo>φ</mml:mo>
</mml:mstyle>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mfrac>
<mml:mrow>
<mml:mtext>PSE</mml:mtext>
<mml:mo></mml:mo>
<mml:msub>
<mml:mi>s</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo>σ</mml:mo>
</mml:mfrac>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mo stretchy="true"></mml:mo>
</mml:munder>
<mml:mrow>
<mml:msub>
<mml:mi>p</mml:mi>
<mml:mn>2</mml:mn>
</mml:msub>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>c</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:munder>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:math>
</disp-formula>
</p>
<p>We approximate
<italic>u</italic>
[
<italic>r</italic>
] as a power function:
<italic>u</italic>
[
<italic>r</italic>
] = 
<italic>r
<sup>q</sup>
</italic>
. Hence we can rewrite Eq.
<xref ref-type="disp-formula" rid="E12">12</xref>
:
<disp-formula id="E13">
<label>(13)</label>
<mml:math id="M14">
<mml:mrow>
<mml:mfrac>
<mml:mrow>
<mml:mi>u</mml:mi>
<mml:mrow>
<mml:mo>[</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>r</mml:mi>
<mml:mn>1</mml:mn>
</mml:msub>
</mml:mrow>
<mml:mo>]</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:mi>u</mml:mi>
<mml:mrow>
<mml:mo>[</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>r</mml:mi>
<mml:mn>2</mml:mn>
</mml:msub>
</mml:mrow>
<mml:mo>]</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mfrac>
<mml:mo>=</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mfrac>
<mml:mrow>
<mml:msub>
<mml:mi>r</mml:mi>
<mml:mn>1</mml:mn>
</mml:msub>
</mml:mrow>
<mml:mrow>
<mml:msub>
<mml:mi>r</mml:mi>
<mml:mn>2</mml:mn>
</mml:msub>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mi>q</mml:mi>
</mml:msup>
<mml:mo>=</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:msub>
<mml:mi>p</mml:mi>
<mml:mn>2</mml:mn>
</mml:msub>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>c</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:msub>
<mml:mi>p</mml:mi>
<mml:mn>1</mml:mn>
</mml:msub>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>c</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
</mml:math>
</disp-formula>
</p>
<p>Since the right hand side of Eq.
<xref ref-type="disp-formula" rid="E13">13</xref>
can be estimated from the psychophysical performance in the neutral condition (see
<xref ref-type="sec" rid="s1">Materials and Methods</xref>
), it will give us an estimate of the ratio of the utilities that would have justified the observed shift of the decision criterion. This ratio can be compared to the ratio of the actual reward values,
<italic>r</italic>
<sub>1</sub>
/
<italic>r</italic>
<sub>2</sub>
. To do this, we solve Eq.
<xref ref-type="disp-formula" rid="E13">13</xref>
for the exponent
<italic>q</italic>
which will give us an estimate of the convexity or concavity of the utility function:
<disp-formula>
<mml:math id="M15">
<mml:mrow>
<mml:mi>q</mml:mi>
<mml:mo>=</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mi>log</mml:mi>
<mml:mo></mml:mo>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>p</mml:mi>
<mml:mn>2</mml:mn>
</mml:msub>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>c</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo>/</mml:mo>
<mml:msub>
<mml:mi>p</mml:mi>
<mml:mn>1</mml:mn>
</mml:msub>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>c</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:mi>log</mml:mi>
<mml:mo></mml:mo>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>r</mml:mi>
<mml:mn>1</mml:mn>
</mml:msub>
<mml:mo>/</mml:mo>
<mml:msub>
<mml:mi>r</mml:mi>
<mml:mn>2</mml:mn>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
</mml:math>
</disp-formula>
</p>
<p>For the three monkeys L, K, and C we find very similar exponents
<italic>q</italic>
of 5.15 ± 2.36, 5.23 ± 2.59, and 4.21 ± 3.54 (mean ± standard deviation). With the exception of a single data point from monkey C, all estimates of
<italic>q</italic>
are larger than 2, indicating a strong convexity in the utility function. These analyses show that in principle, the suboptimally large criterion shifts of the monkeys may be explained by convex utility functions. This interpretation is supported by other studies presenting evidence in favor of convex utility functions for fluid rewards in monkeys (McCoy and Platt,
<xref ref-type="bibr" rid="B14">2005</xref>
; Hayden et al.,
<xref ref-type="bibr" rid="B8">2008</xref>
). However, the value of the exponent
<italic>q</italic>
, which was estimated to be on the order of 5, seems rather large.</p>
<p>We further performed an independent test of whether our monkeys exhibit evidence in favor of convex utility functions within the particular setting of our task. In order to do so, subjects performed the identical speed-categorization task with a slightly modified reward schedule. Correct responses for one of the categories were always rewarded with a fixed number of 3 valve openings. The other category was rewarded randomly with either 2 or 4 valve openings. If the monkeys have convex utility functions, they should prefer the variable option with a 50% chance of either 2 or 4 valve openings over the fixed one. Sensory decision criteria of one of the animals (monkey K) shifted in line with these predictions. The other animal (monkey L), however, was indifferent to the two reward schedules. In summary, the overcompensation of the monkeys may at least in part be related to a convex utility function for fluid rewards delivered in units of valve openings.</p>
</sec>
<sec>
<title>Confidence estimate</title>
<p>The optimal criterion that maximizes reward depends not only on the fraction of reward magnitudes, but also on the discriminability of the stimuli, σ (see Eq.
<xref ref-type="disp-formula" rid="E6">6</xref>
): If the discriminability of the stimuli is low, i.e., σ is large and the psychometric function is flat, the optimal shift is large (see Figure
<xref ref-type="fig" rid="F4">4</xref>
). Hence, in order to produce an optimal criterion shift, the monkeys need to have a good estimate of σ. In the following we assume that the animals’ representation of stimulus discriminability is given by
<inline-formula>
<mml:math id="M16">
<mml:mover accent="true">
<mml:mo>σ</mml:mo>
<mml:mo>^</mml:mo>
</mml:mover>
</mml:math>
</inline-formula>
. We can reformulate Eq.
<xref ref-type="disp-formula" rid="E6">6</xref>
by substituting
<inline-formula>
<mml:math id="M17">
<mml:mover accent="true">
<mml:mo>σ</mml:mo>
<mml:mo>^</mml:mo>
</mml:mover>
</mml:math>
</inline-formula>
for σ and the observed PSE for the optimal criterion shift
<italic>c</italic>
.
<disp-formula id="E14">
<label>(14)</label>
<mml:math id="M18">
<mml:mrow>
<mml:msub>
<mml:mi>r</mml:mi>
<mml:mrow>
<mml:mi>s</mml:mi>
<mml:mo><</mml:mo>
<mml:msub>
<mml:mi>s</mml:mi>
<mml:mi>c</mml:mi>
</mml:msub>
</mml:mrow>
</mml:msub>
<mml:mi>P</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mi>s</mml:mi>
<mml:mo><</mml:mo>
<mml:msub>
<mml:mi>s</mml:mi>
<mml:mi>c</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mstyle displaystyle="true">
<mml:munderover>
<mml:mo></mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mn>3</mml:mn>
</mml:munderover>
<mml:mo>φ</mml:mo>
</mml:mstyle>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mfrac>
<mml:mrow>
<mml:mtext>PSE</mml:mtext>
<mml:mo></mml:mo>
<mml:msub>
<mml:mi>s</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mover accent="true">
<mml:mo>σ</mml:mo>
<mml:mo>^</mml:mo>
</mml:mover>
</mml:mfrac>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mo>=</mml:mo>
<mml:msub>
<mml:mi>r</mml:mi>
<mml:mrow>
<mml:mi>s</mml:mi>
<mml:mo>></mml:mo>
<mml:msub>
<mml:mi>s</mml:mi>
<mml:mi>c</mml:mi>
</mml:msub>
</mml:mrow>
</mml:msub>
<mml:mi>P</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mi>s</mml:mi>
<mml:mo>></mml:mo>
<mml:msub>
<mml:mi>s</mml:mi>
<mml:mi>c</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mstyle displaystyle="true">
<mml:munderover>
<mml:mo></mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>4</mml:mn>
</mml:mrow>
<mml:mn>6</mml:mn>
</mml:munderover>
<mml:mo>φ</mml:mo>
</mml:mstyle>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mfrac>
<mml:mrow>
<mml:mtext>PSE</mml:mtext>
<mml:mo></mml:mo>
<mml:msub>
<mml:mi>s</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mover accent="true">
<mml:mo>σ</mml:mo>
<mml:mo>^</mml:mo>
</mml:mover>
</mml:mfrac>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
</disp-formula>
</p>
<p>In most cases, Eq.
<xref ref-type="disp-formula" rid="E14">14</xref>
can be solved numerically for
<inline-formula>
<mml:math id="M19">
<mml:mover accent="true">
<mml:mo>σ</mml:mo>
<mml:mo>^</mml:mo>
</mml:mover>
</mml:math>
</inline-formula>
. However, no solution exists if the sign of the observed shift of the decision criterion does not match the sign of the optimal shift. For monkey L, averaged over all conditions, we find that a
<inline-formula>
<mml:math id="M20">
<mml:mover accent="true">
<mml:mo>σ</mml:mo>
<mml:mo>^</mml:mo>
</mml:mover>
</mml:math>
</inline-formula>
of 4.08 ± 1.03 explains the observed criterion placement. Compared to an actual value of stimulus discriminability of σ = 1.44 ± 0.40, this would correspond to a pronounced underestimation of the psychophysical ability. For monkey K, the estimated value is
<inline-formula>
<mml:math id="M21">
<mml:mrow>
<mml:mover accent="true">
<mml:mo>σ</mml:mo>
<mml:mo>^</mml:mo>
</mml:mover>
<mml:mo>=</mml:mo>
<mml:mn>2.19</mml:mn>
<mml:mo>±</mml:mo>
<mml:mn>0.29</mml:mn>
</mml:mrow>
</mml:math>
</inline-formula>
, compared to an actual value of σ = 0.88 ± 0.28. For monkey C, Eq.
<xref ref-type="disp-formula" rid="E14">14</xref>
cannot be solved in one instance. Averaged over the remaining instances, we find an estimated value of
<inline-formula>
<mml:math id="M22">
<mml:mrow>
<mml:mover accent="true">
<mml:mo>σ</mml:mo>
<mml:mo>^</mml:mo>
</mml:mover>
<mml:mo>=</mml:mo>
<mml:mn>5.84</mml:mn>
<mml:mo>±</mml:mo>
<mml:mn>1.67</mml:mn>
</mml:mrow>
</mml:math>
</inline-formula>
, compared to an actual value of σ = 2.20 ± 1.20. This analysis suggests that the monkeys’ overcompensation may in principle be due to a systematic underestimation of their psychophysical ability to discriminate the stimulus speeds.</p>
<p>The previous two sections have outlined that our results can be explained either by convex utility functions or under-confidence of the monkeys in their decision. A recent study by Kiani and Shadlen (
<xref ref-type="bibr" rid="B12">2009</xref>
) may help determine which of the two explanations is more likely to be accurate. In their study, monkeys engaged in a post-decision wagering task: after signaling their choice, monkeys were given an additional choice between sticking to their original, potentially wrong choice and a third option which featured a smaller but sure reward. Behavior in the post-decision wager is indicative of the confidence in their original choice: if they are sure about their decision, they should stick with the big prospective reward; if not, they might want to go with the small but sure reward. Similar to our task, it is possible to assess the optimality of the monkeys’ decision strategy. Their analyses suggest that the behavior of the monkeys can be explained either by a convex utility function or overconfidence of the monkeys in their performance. Note that both our and their data sets can be explained by convex utility functions. In contrast, erroneous confidence estimates do not provide a parsimonious explanation for both data sets: our data need to be explained by under-confidence of the subjects, theirs by overconfidence. Taken together, the two studies seem to suggest that convex utility functions are more likely than erroneous confidence estimates to play a role in causing the observed suboptimal behavior.</p>
</sec>
<sec>
<title>Operant matching criterion</title>
<p>The COBRA hypothesis and its extension, the criteria competition approach, have already been discussed in section
<xref ref-type="sec" rid="s2">“Comparison between human and macaque observers.”</xref>
In summary, we concluded that only the criteria competition approach may accommodate the findings from the two species.</p>
<p>Here, we will consider an additional expansion of the criterion competition approach which provides a novel explanation for the overcompensation observed for the monkeys. The explanation is based on the matching law first formulated by Herrnstein (
<xref ref-type="bibr" rid="B10">1961</xref>
). Hence, we refer to the mechanism as operant matching as opposed to the local “winner-take-all” mechanism which governs the behavior of the ideal observer. We define the value of a category,
<italic>V</italic>
(
<italic>C
<sub>i</sub>
</italic>
), as the likelihood of being correct when choosing this category times the reward magnitude associated with this category:
<disp-formula id="E15">
<label>(15)</label>
<mml:math id="M23">
<mml:mrow>
<mml:mi>V</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>C</mml:mi>
<mml:mrow>
<mml:mn>1</mml:mn>
<mml:mo>/</mml:mo>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mo>=</mml:mo>
<mml:mi>P</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mo>"</mml:mo>
<mml:mtext>correct"</mml:mtext>
<mml:mo>|</mml:mo>
<mml:mo>"</mml:mo>
<mml:mtext>choice</mml:mtext>
<mml:mo>=</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>/</mml:mo>
<mml:mn>2</mml:mn>
<mml:mo>"</mml:mo>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:msub>
<mml:mi>r</mml:mi>
<mml:mrow>
<mml:mn>1</mml:mn>
<mml:mo>/</mml:mo>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:math>
</disp-formula>
</p>
<p>The idea of operant matching is that the decision criterion is set such that the values of the two categories are equal:
<disp-formula id="E16">
<label>(16)</label>
<mml:math id="M24">
<mml:mrow>
<mml:mi>V</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>C</mml:mi>
<mml:mn>1</mml:mn>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mo>=</mml:mo>
<mml:mi>V</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>C</mml:mi>
<mml:mn>2</mml:mn>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
</disp-formula>
</p>
<p>Note that operant matching will produce category boundaries which can be quite distinct from those of the ideal observer analysis. For example, assume that the task is very easy and subjects perform at virtually 100% correct. In this case, the ideal observer analysis holds that the criterion should not be shifted at all. However, if the criterion is not shifted, the values of the two categories as defined by operant matching are not identical. In particular, because all responses are correct, the value of each category is identical to the reward associated with it, i.e.,
<italic>V</italic>
(
<italic>C
<sub>i</sub>
</italic>
) = 
<italic>r
<sub>i</sub>
</italic>
. Hence, operant matching holds that the criterion needs to be shifted until the values of both categories are equal.</p>
<p>The operant matching approach has certain limitations: Equation
<xref ref-type="disp-formula" rid="E16">16</xref>
cannot be solved if ∃
<italic>i</italic>
,
<italic>j</italic>
:
<italic>r
<sub>i</sub>
</italic>
/
<italic>r
<sub>j</sub>
</italic>
 < 
<italic>P</italic>
(
<italic>C
<sub>j</sub>
</italic>
). The following example illustrates the restriction: if we assume that there are only two categories which are equally likely, and that
<italic>r</italic>
<sub>1</sub>
 = 1 and
<italic>r</italic>
<sub>2</sub>
 = 3, we find that
<italic>r</italic>
<sub>1</sub>
/
<italic>r</italic>
<sub>2</sub>
 = 1/3 < 
<italic>P</italic>
(
<italic>C</italic>
<sub>1</sub>
) = 0.5 and Eq.
<xref ref-type="disp-formula" rid="E16">16</xref>
has no solution: no matter how far the decision criterion is shifted, the value of category 1 will always be less than or equal to 1. In contrast, the value of the second category will always be at least 1.5 (here we assume that
<italic>P</italic>
(“correct” | “choice = 
<italic>i</italic>
”) ≥ 
<italic>P</italic>
(
<italic>C
<sub>i</sub>
</italic>
) = 0.5).</p>
<p>We refer to the criterion shift predicted by operant matching as Δ
<italic>c
<sub>O</sub>
</italic>
. It can easily be incorporated into the criterion competition approach:
<disp-formula id="E17">
<label>(17)</label>
<mml:math id="M25">
<mml:mrow>
<mml:mi>c</mml:mi>
<mml:mo>=</mml:mo>
<mml:msub>
<mml:mi>s</mml:mi>
<mml:mi>c</mml:mi>
</mml:msub>
<mml:mo>+</mml:mo>
<mml:msub>
<mml:mi>w</mml:mi>
<mml:mo>α</mml:mo>
</mml:msub>
<mml:mo>Δ</mml:mo>
<mml:msub>
<mml:mi>c</mml:mi>
<mml:mi>A</mml:mi>
</mml:msub>
<mml:mo>+</mml:mo>
<mml:msub>
<mml:mi>w</mml:mi>
<mml:mo>ρ</mml:mo>
</mml:msub>
<mml:mo>Δ</mml:mo>
<mml:msub>
<mml:mi>c</mml:mi>
<mml:mi>R</mml:mi>
</mml:msub>
<mml:mo>+</mml:mo>
<mml:msub>
<mml:mi>w</mml:mi>
<mml:mo>μ</mml:mo>
</mml:msub>
<mml:mo>Δ</mml:mo>
<mml:msub>
<mml:mi>c</mml:mi>
<mml:mi>M</mml:mi>
</mml:msub>
<mml:mo>+</mml:mo>
<mml:msub>
<mml:mi>w</mml:mi>
<mml:mi>O</mml:mi>
</mml:msub>
<mml:mo>Δ</mml:mo>
<mml:msub>
<mml:mi>c</mml:mi>
<mml:mi>O</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</disp-formula>
</p>
<p>For all three monkeys, the global matching criterion predicted larger shifts than the criterion which optimizes rewards,
<italic>c
<sub>R</sub>
</italic>
. Hence, the differences between humans and monkeys may at least partially be caused by monkeys placing larger weight
<italic>w
<sub>O</sub>
</italic>
on the operant matching criterion.</p>
</sec>
</sec>
</sec>
<sec>
<title>Conclusion</title>
<p>In the current experiment we investigated sensory decision criteria of macaque monkeys in a biased decision making task in which one option was either more likely to be correct (prior reward likelihood bias) or associated with a larger reward if chosen correctly (reward magnitude bias). Our results show that decision criteria of naive monkeys over-adjust to the reward magnitude manipulation but fail to adjust at all to the reward likelihood manipulation. Importantly, the setting of decision criteria does
<bold>not</bold>
seem to be mediated by the unconditional expected value of the options as predicted by an ideal observer analysis. Rather, conditional reward magnitude alone determines the decision criteria of the monkeys in the task. This is in clear contrast to choice behavior in pure value-based decisions where the monkeys readily adjusted their behavior as a function of prior reward likelihood.</p>
</sec>
<sec>
<title>Conflict of Interest Statement</title>
<p>The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.</p>
</sec>
</body>
<back>
<ack>
<p>We want to thank Jack Grinband, Franco Pestilli, and Brian Lau for helpful discussions. Thanks to Brandon Murray for help with the data collection. We gratefully acknowledge funding by the DFG project TE819/1-1 to Tobias Teichert and NIH-MH059244 to Vincent P. Ferrera.</p>
</ack>
<ref-list>
<title>References</title>
<ref id="B1">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Bernoulli</surname>
<given-names>D.</given-names>
</name>
</person-group>
(
<year>1738</year>
).
<article-title>Exposition of a new theory on the measurement of risk</article-title>
.
<source>Econometrica</source>
<volume>22</volume>
,
<fpage>22</fpage>
<lpage>36</lpage>
(originally published 1738; translated 1954).</mixed-citation>
</ref>
<ref id="B2">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Busemeyer</surname>
<given-names>J. R.</given-names>
</name>
<name>
<surname>Myung</surname>
<given-names>I. J.</given-names>
</name>
</person-group>
(
<year>1992</year>
).
<article-title>An adaptive approach to human decision-making – learning-theory, decision-theory, and human-performance</article-title>
.
<source>J. Exp. Psychol. Gen.</source>
<volume>121</volume>
,
<fpage>177</fpage>
<lpage>194</lpage>
<pub-id pub-id-type="doi">10.1037/0096-3445.121.2.177</pub-id>
</mixed-citation>
</ref>
<ref id="B3">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Erev</surname>
<given-names>I.</given-names>
</name>
</person-group>
(
<year>1998</year>
).
<article-title>Signal detection by human observers: a cutoff reinforcement learning model of categorization decisions under uncertainty</article-title>
.
<source>Psychol. Rev.</source>
<volume>105</volume>
,
<fpage>280</fpage>
<lpage>298</lpage>
<pub-id pub-id-type="doi">10.1037/0033-295X.105.2.280</pub-id>
<pub-id pub-id-type="pmid">9669925</pub-id>
</mixed-citation>
</ref>
<ref id="B4">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Feng</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Holmes</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Rorie</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Newsome</surname>
<given-names>W. T.</given-names>
</name>
</person-group>
(
<year>2009</year>
).
<article-title>Can monkeys choose optimally when faced with noisy stimuli and unequal rewards?</article-title>
<source>PLoS Comput. Biol.</source>
<volume>5</volume>
,
<fpage>e1000284</fpage>
<pub-id pub-id-type="doi">10.1371/journal.pcbi.1000284</pub-id>
<pub-id pub-id-type="doi">10.1371/journal.pcbi.1000284</pub-id>
<pub-id pub-id-type="pmid">19214201</pub-id>
</mixed-citation>
</ref>
<ref id="B5">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Fiorillo</surname>
<given-names>C. D.</given-names>
</name>
<name>
<surname>Tobler</surname>
<given-names>P. N.</given-names>
</name>
<name>
<surname>Schultz</surname>
<given-names>W.</given-names>
</name>
</person-group>
(
<year>2005</year>
).
<article-title>Evidence that the delay-period activity of dopamine neurons corresponds to reward uncertainty rather than backpropagating TD errors</article-title>
.
<source>Behav. Brain Funct.</source>
<volume>1</volume>
,
<fpage>7</fpage>
<pub-id pub-id-type="doi">10.1186/1744-9081-1-7</pub-id>
<pub-id pub-id-type="pmid">15958162</pub-id>
</mixed-citation>
</ref>
<ref id="B6">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Gold</surname>
<given-names>J. I.</given-names>
</name>
<name>
<surname>Shadlen</surname>
<given-names>M. N.</given-names>
</name>
</person-group>
(
<year>2007</year>
).
<article-title>The neural basis of decision making</article-title>
.
<source>Annu. Rev. Neurosci.</source>
<volume>30</volume>
,
<fpage>535</fpage>
<lpage>574</lpage>
<pub-id pub-id-type="doi">10.1146/annurev.neuro.29.051605.113038</pub-id>
<pub-id pub-id-type="pmid">17600525</pub-id>
</mixed-citation>
</ref>
<ref id="B7">
<mixed-citation publication-type="book">
<person-group person-group-type="author">
<name>
<surname>Green</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Swets</surname>
<given-names>J.</given-names>
</name>
</person-group>
(
<year>1966</year>
).
<source>Signal Detection Theory and Psychophysics</source>
.
<publisher-loc>Oxford, England</publisher-loc>
:
<publisher-name>John Wiley</publisher-name>
</mixed-citation>
</ref>
<ref id="B8">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Hayden</surname>
<given-names>B. Y.</given-names>
</name>
<name>
<surname>Heilbronner</surname>
<given-names>S. R.</given-names>
</name>
<name>
<surname>Nair</surname>
<given-names>A. C.</given-names>
</name>
<name>
<surname>Platt</surname>
<given-names>M. L.</given-names>
</name>
</person-group>
(
<year>2008</year>
).
<article-title>Cognitive influences on risk-seeking by rhesus macaques</article-title>
.
<source>Judgm. Decis. Mak.</source>
<volume>3</volume>
,
<fpage>389</fpage>
<lpage>395</lpage>
<pub-id pub-id-type="pmid">19844596</pub-id>
</mixed-citation>
</ref>
<ref id="B9">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Healy</surname>
<given-names>A. F.</given-names>
</name>
<name>
<surname>Kubovy</surname>
<given-names>M.</given-names>
</name>
</person-group>
(
<year>1981</year>
).
<article-title>Probability matching and the formation of conservative decision rules in a numerical analog of signal-detection</article-title>
.
<source>J. Exp. Psychol. Hum. Learn.</source>
<volume>7</volume>
,
<fpage>344</fpage>
<lpage>354</lpage>
<pub-id pub-id-type="doi">10.1037/0278-7393.7.5.344</pub-id>
</mixed-citation>
</ref>
<ref id="B10">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Herrnstein</surname>
<given-names>R. J.</given-names>
</name>
</person-group>
(
<year>1961</year>
).
<article-title>Relative and absolute strength of response as a function of frequency of reinforcement</article-title>
.
<source>J. Exp. Anal. Behav.</source>
<volume>4</volume>
,
<fpage>267</fpage>
<lpage>272</lpage>
<pub-id pub-id-type="doi">10.1901/jeab.1961.4-267</pub-id>
<pub-id pub-id-type="pmid">13713775</pub-id>
</mixed-citation>
</ref>
<ref id="B11">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Kahneman</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Tversky</surname>
<given-names>A.</given-names>
</name>
</person-group>
(
<year>1979</year>
).
<article-title>Prospect theory – analysis of decision under risk</article-title>
.
<source>Econometrica</source>
<volume>47</volume>
,
<fpage>263</fpage>
<lpage>291</lpage>
<pub-id pub-id-type="doi">10.2307/1914185</pub-id>
</mixed-citation>
</ref>
<ref id="B12">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Kiani</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Shadlen</surname>
<given-names>M. N.</given-names>
</name>
</person-group>
(
<year>2009</year>
).
<article-title>Representation of confidence associated with a decision by neurons in the parietal cortex</article-title>
.
<source>Science</source>
<volume>324</volume>
,
<fpage>759</fpage>
<lpage>764</lpage>
<pub-id pub-id-type="doi">10.1126/science.1169405</pub-id>
<pub-id pub-id-type="pmid">19423820</pub-id>
</mixed-citation>
</ref>
<ref id="B13">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Maddox</surname>
<given-names>W. T.</given-names>
</name>
</person-group>
(
<year>2002</year>
).
<article-title>Toward a unified theory of decision criterion learning in perceptual categorization</article-title>
.
<source>J. Exp. Anal. Behav.</source>
<volume>78</volume>
,
<fpage>567</fpage>
<lpage>595</lpage>
<pub-id pub-id-type="doi">10.1901/jeab.2002.78-567</pub-id>
<pub-id pub-id-type="pmid">12507020</pub-id>
</mixed-citation>
</ref>
<ref id="B14">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>McCoy</surname>
<given-names>A. N.</given-names>
</name>
<name>
<surname>Platt</surname>
<given-names>M. L.</given-names>
</name>
</person-group>
(
<year>2005</year>
).
<article-title>Risk-sensitive neurons in macaque posterior cingulate cortex</article-title>
.
<source>Nat. Neurosci.</source>
<volume>8</volume>
,
<fpage>1220</fpage>
<lpage>1227</lpage>
<pub-id pub-id-type="doi">10.1038/nn1523</pub-id>
<pub-id pub-id-type="pmid">16116449</pub-id>
</mixed-citation>
</ref>
<ref id="B15">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Navalpakkam</surname>
<given-names>V.</given-names>
</name>
<name>
<surname>Koch</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>Perona</surname>
<given-names>P.</given-names>
</name>
</person-group>
(
<year>2009</year>
).
<article-title>Homo economicus in visual search</article-title>
.
<source>J. Vis.</source>
<volume>9</volume>
,
<fpage>1</fpage>
<lpage>16</lpage>
<pub-id pub-id-type="doi">10.1167/9.1.31</pub-id>
</mixed-citation>
</ref>
<ref id="B16">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Rangel</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Camerer</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>Montague</surname>
<given-names>P. R.</given-names>
</name>
</person-group>
(
<year>2008</year>
).
<article-title>A framework for studying the neurobiology of value-based decision making</article-title>
.
<source>Nat. Rev. Neurosci.</source>
<volume>9</volume>
,
<fpage>545</fpage>
<lpage>556</lpage>
<pub-id pub-id-type="doi">10.1038/nrn2357</pub-id>
<pub-id pub-id-type="pmid">18545266</pub-id>
</mixed-citation>
</ref>
<ref id="B17">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Romo</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Salinas</surname>
<given-names>E.</given-names>
</name>
</person-group>
(
<year>2003</year>
).
<article-title>Flutter discrimination: neural codes, perception, memory and decision making</article-title>
.
<source>Nat. Rev. Neurosci.</source>
<volume>4</volume>
,
<fpage>203</fpage>
<lpage>218</lpage>
<pub-id pub-id-type="doi">10.1038/nrn1058</pub-id>
<pub-id pub-id-type="pmid">12612633</pub-id>
</mixed-citation>
</ref>
<ref id="B18">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Schall</surname>
<given-names>J. D.</given-names>
</name>
</person-group>
(
<year>2003</year>
).
<article-title>Neural correlates of decision processes: neural and mental chronometry</article-title>
.
<source>Curr. Opin. Neurobiol.</source>
<volume>13</volume>
,
<fpage>182</fpage>
<lpage>186</lpage>
<pub-id pub-id-type="doi">10.1016/S0959-4388(03)00039-4</pub-id>
<pub-id pub-id-type="pmid">12744971</pub-id>
</mixed-citation>
</ref>
<ref id="B19">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Sugrue</surname>
<given-names>L. P.</given-names>
</name>
<name>
<surname>Corrado</surname>
<given-names>G. S.</given-names>
</name>
<name>
<surname>Newsome</surname>
<given-names>W. T.</given-names>
</name>
</person-group>
(
<year>2005</year>
).
<article-title>Choosing the greater of two goods: neural currencies for valuation and decision making</article-title>
.
<source>Nat. Rev. Neurosci.</source>
<volume>6</volume>
,
<fpage>363</fpage>
<lpage>375</lpage>
<pub-id pub-id-type="doi">10.1038/nrn1666</pub-id>
<pub-id pub-id-type="pmid">15832198</pub-id>
</mixed-citation>
</ref>
<ref id="B20">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Tobler</surname>
<given-names>P. N.</given-names>
</name>
<name>
<surname>Fiorillo</surname>
<given-names>C. D.</given-names>
</name>
<name>
<surname>Schultz</surname>
<given-names>W.</given-names>
</name>
</person-group>
(
<year>2005</year>
).
<article-title>Adaptive coding of reward value by dopamine neurons</article-title>
.
<source>Science</source>
<volume>307</volume>
,
<fpage>1642</fpage>
<lpage>1645</lpage>
<pub-id pub-id-type="doi">10.1126/science.1105370</pub-id>
<pub-id pub-id-type="pmid">15761155</pub-id>
</mixed-citation>
</ref>
<ref id="B21">
<mixed-citation publication-type="journal">
<person-group person-group-type="author">
<name>
<surname>Voss</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Rothermund</surname>
<given-names>K.</given-names>
</name>
<name>
<surname>Brandtstädter</surname>
<given-names>J.</given-names>
</name>
</person-group>
(
<year>2008</year>
).
<article-title>Interpreting ambiguous stimuli: separating perceptual and judgmental biases</article-title>
.
<source>J. Exp. Soc. Psychol.</source>
<volume>44</volume>
,
<fpage>1048</fpage>
<lpage>1056</lpage>
<pub-id pub-id-type="doi">10.1016/j.jesp.2007.10.009</pub-id>
</mixed-citation>
</ref>
</ref-list>
</back>
</pmc>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/HapticV1/Data/Pmc/Curation
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 001E60 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Pmc/Curation/biblio.hfd -nk 001E60 | SxmlIndent | more

To add a link to this page in the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    HapticV1
   |flux=    Pmc
   |étape=   Curation
   |type=    RBID
   |clé=     PMC:2996133
   |texte=   Suboptimal Integration of Reward Magnitude and Prior Reward Likelihood in Categorical Decisions by Monkeys
}}

To generate wiki pages

HfdIndexSelect -h $EXPLOR_AREA/Data/Pmc/Curation/RBID.i   -Sk "pubmed:21151367" \
       | HfdSelect -Kh $EXPLOR_AREA/Data/Pmc/Curation/biblio.hfd   \
       | NlmPubMed2Wicri -a HapticV1 

Wicri

This area was generated with Dilib version V0.6.23.
Data generation: Mon Jun 13 01:09:46 2016. Site generation: Wed Mar 6 09:54:07 2024