Exploration server for haptic devices

Warning: this site is under development!
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

On the Origins of Suboptimality in Human Probabilistic Inference

Internal identifier: 003078 (Ncbi/Merge); previous: 003077; next: 003079

On the Origins of Suboptimality in Human Probabilistic Inference

Authors: Luigi Acerbi [United Kingdom]; Sethu Vijayakumar [United Kingdom]; Daniel M. Wolpert [United Kingdom]

Source:

RBID: PMC:4063671

Abstract

Humans have been shown to combine noisy sensory information with previous experience (priors), in qualitative and sometimes quantitative agreement with the statistically-optimal predictions of Bayesian integration. However, when the prior distribution becomes more complex than a simple Gaussian, such as skewed or bimodal, training takes much longer and performance appears suboptimal. It is unclear whether such suboptimality arises from an imprecise internal representation of the complex prior, or from additional constraints in performing probabilistic computations on complex distributions, even when accurately represented. Here we probe the sources of suboptimality in probabilistic inference using a novel estimation task in which subjects are exposed to an explicitly provided distribution, thereby removing the need to remember the prior. Subjects had to estimate the location of a target given a noisy cue and a visual representation of the prior probability density over locations, which changed on each trial. Different classes of priors were examined (Gaussian, unimodal, bimodal). Subjects' performance was in qualitative agreement with the predictions of Bayesian Decision Theory although generally suboptimal. The degree of suboptimality was modulated by statistical features of the priors but was largely independent of the class of the prior and level of noise in the cue, suggesting that suboptimality in dealing with complex statistical features, such as bimodality, may be due to a problem of acquiring the priors rather than computing with them. We performed a factorial model comparison across a large set of Bayesian observer models to identify additional sources of noise and suboptimality. Our analysis rejects several models of stochastic behavior, including probability matching and sample-averaging strategies. Instead we show that subjects' response variability was mainly driven by a combination of a noisy estimation of the parameters of the priors, and by variability in the decision process, which we represent as a noisy or stochastic posterior.


URL:
DOI: 10.1371/journal.pcbi.1003661
PubMed: 24945142
PubMed Central: 4063671

Links to previous steps (curation, corpus, ...)


Links to Exploration step

PMC:4063671

The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">On the Origins of Suboptimality in Human Probabilistic Inference</title>
<author>
<name sortKey="Acerbi, Luigi" sort="Acerbi, Luigi" uniqKey="Acerbi L" first="Luigi" last="Acerbi">Luigi Acerbi</name>
<affiliation wicri:level="4">
<nlm:aff id="aff1">
<addr-line>Institute of Perception, Action and Behaviour, School of Informatics, University of Edinburgh, Edinburgh, United Kingdom</addr-line>
</nlm:aff>
<country xml:lang="fr">Royaume-Uni</country>
<wicri:regionArea>Institute of Perception, Action and Behaviour, School of Informatics, University of Edinburgh, Edinburgh</wicri:regionArea>
<placeName>
<settlement type="city">Édimbourg</settlement>
<region type="country">Écosse</region>
</placeName>
<orgName type="university">Université d'Édimbourg</orgName>
</affiliation>
<affiliation wicri:level="4">
<nlm:aff id="aff2">
<addr-line>Doctoral Training Centre in Neuroinformatics and Computational Neuroscience, School of Informatics, University of Edinburgh, Edinburgh, United Kingdom</addr-line>
</nlm:aff>
<country xml:lang="fr">Royaume-Uni</country>
<wicri:regionArea>Doctoral Training Centre in Neuroinformatics and Computational Neuroscience, School of Informatics, University of Edinburgh, Edinburgh</wicri:regionArea>
<placeName>
<settlement type="city">Édimbourg</settlement>
<region type="country">Écosse</region>
</placeName>
<orgName type="university">Université d'Édimbourg</orgName>
</affiliation>
</author>
<author>
<name sortKey="Vijayakumar, Sethu" sort="Vijayakumar, Sethu" uniqKey="Vijayakumar S" first="Sethu" last="Vijayakumar">Sethu Vijayakumar</name>
<affiliation wicri:level="4">
<nlm:aff id="aff1">
<addr-line>Institute of Perception, Action and Behaviour, School of Informatics, University of Edinburgh, Edinburgh, United Kingdom</addr-line>
</nlm:aff>
<country xml:lang="fr">Royaume-Uni</country>
<wicri:regionArea>Institute of Perception, Action and Behaviour, School of Informatics, University of Edinburgh, Edinburgh</wicri:regionArea>
<placeName>
<settlement type="city">Édimbourg</settlement>
<region type="country">Écosse</region>
</placeName>
<orgName type="university">Université d'Édimbourg</orgName>
</affiliation>
</author>
<author>
<name sortKey="Wolpert, Daniel M" sort="Wolpert, Daniel M" uniqKey="Wolpert D" first="Daniel M." last="Wolpert">Daniel M. Wolpert</name>
<affiliation wicri:level="4">
<nlm:aff id="aff3">
<addr-line>Computational and Biological Learning Lab, Department of Engineering, University of Cambridge, Cambridge, United Kingdom</addr-line>
</nlm:aff>
<country xml:lang="fr">Royaume-Uni</country>
<wicri:regionArea>Computational and Biological Learning Lab, Department of Engineering, University of Cambridge, Cambridge</wicri:regionArea>
<orgName type="university">Université de Cambridge</orgName>
<placeName>
<settlement type="city">Cambridge</settlement>
<region type="country">Angleterre</region>
<region type="région" nuts="1">Angleterre de l'Est</region>
</placeName>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">PMC</idno>
<idno type="pmid">24945142</idno>
<idno type="pmc">4063671</idno>
<idno type="url">http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4063671</idno>
<idno type="RBID">PMC:4063671</idno>
<idno type="doi">10.1371/journal.pcbi.1003661</idno>
<date when="2014">2014</date>
<idno type="wicri:Area/Pmc/Corpus">000541</idno>
<idno type="wicri:Area/Pmc/Curation">000541</idno>
<idno type="wicri:Area/Pmc/Checkpoint">000A63</idno>
<idno type="wicri:Area/Ncbi/Merge">003078</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en" level="a" type="main">On the Origins of Suboptimality in Human Probabilistic Inference</title>
<author>
<name sortKey="Acerbi, Luigi" sort="Acerbi, Luigi" uniqKey="Acerbi L" first="Luigi" last="Acerbi">Luigi Acerbi</name>
<affiliation wicri:level="4">
<nlm:aff id="aff1">
<addr-line>Institute of Perception, Action and Behaviour, School of Informatics, University of Edinburgh, Edinburgh, United Kingdom</addr-line>
</nlm:aff>
<country xml:lang="fr">Royaume-Uni</country>
<wicri:regionArea>Institute of Perception, Action and Behaviour, School of Informatics, University of Edinburgh, Edinburgh</wicri:regionArea>
<placeName>
<settlement type="city">Édimbourg</settlement>
<region type="country">Écosse</region>
</placeName>
<orgName type="university">Université d'Édimbourg</orgName>
</affiliation>
<affiliation wicri:level="4">
<nlm:aff id="aff2">
<addr-line>Doctoral Training Centre in Neuroinformatics and Computational Neuroscience, School of Informatics, University of Edinburgh, Edinburgh, United Kingdom</addr-line>
</nlm:aff>
<country xml:lang="fr">Royaume-Uni</country>
<wicri:regionArea>Doctoral Training Centre in Neuroinformatics and Computational Neuroscience, School of Informatics, University of Edinburgh, Edinburgh</wicri:regionArea>
<placeName>
<settlement type="city">Édimbourg</settlement>
<region type="country">Écosse</region>
</placeName>
<orgName type="university">Université d'Édimbourg</orgName>
</affiliation>
</author>
<author>
<name sortKey="Vijayakumar, Sethu" sort="Vijayakumar, Sethu" uniqKey="Vijayakumar S" first="Sethu" last="Vijayakumar">Sethu Vijayakumar</name>
<affiliation wicri:level="4">
<nlm:aff id="aff1">
<addr-line>Institute of Perception, Action and Behaviour, School of Informatics, University of Edinburgh, Edinburgh, United Kingdom</addr-line>
</nlm:aff>
<country xml:lang="fr">Royaume-Uni</country>
<wicri:regionArea>Institute of Perception, Action and Behaviour, School of Informatics, University of Edinburgh, Edinburgh</wicri:regionArea>
<placeName>
<settlement type="city">Édimbourg</settlement>
<region type="country">Écosse</region>
</placeName>
<orgName type="university">Université d'Édimbourg</orgName>
</affiliation>
</author>
<author>
<name sortKey="Wolpert, Daniel M" sort="Wolpert, Daniel M" uniqKey="Wolpert D" first="Daniel M." last="Wolpert">Daniel M. Wolpert</name>
<affiliation wicri:level="4">
<nlm:aff id="aff3">
<addr-line>Computational and Biological Learning Lab, Department of Engineering, University of Cambridge, Cambridge, United Kingdom</addr-line>
</nlm:aff>
<country xml:lang="fr">Royaume-Uni</country>
<wicri:regionArea>Computational and Biological Learning Lab, Department of Engineering, University of Cambridge, Cambridge</wicri:regionArea>
<orgName type="university">Université de Cambridge</orgName>
<placeName>
<settlement type="city">Cambridge</settlement>
<region type="country">Angleterre</region>
<region type="région" nuts="1">Angleterre de l'Est</region>
</placeName>
</affiliation>
</author>
</analytic>
<series>
<title level="j">PLoS Computational Biology</title>
<idno type="ISSN">1553-734X</idno>
<idno type="eISSN">1553-7358</idno>
<imprint>
<date when="2014">2014</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<textClass></textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">
<p>Humans have been shown to combine noisy sensory information with previous experience (priors), in qualitative and sometimes quantitative agreement with the statistically-optimal predictions of Bayesian integration. However, when the prior distribution becomes more complex than a simple Gaussian, such as skewed or bimodal, training takes much longer and performance appears suboptimal. It is unclear whether such suboptimality arises from an imprecise internal representation of the complex prior, or from additional constraints in performing probabilistic computations on complex distributions, even when accurately represented. Here we probe the sources of suboptimality in probabilistic inference using a novel estimation task in which subjects are exposed to an explicitly provided distribution, thereby removing the need to remember the prior. Subjects had to estimate the location of a target given a noisy cue and a visual representation of the prior probability density over locations, which changed on each trial. Different classes of priors were examined (Gaussian, unimodal, bimodal). Subjects' performance was in qualitative agreement with the predictions of Bayesian Decision Theory although generally suboptimal. The degree of suboptimality was modulated by statistical features of the priors but was largely independent of the class of the prior and level of noise in the cue, suggesting that suboptimality in dealing with complex statistical features, such as bimodality, may be due to a problem of acquiring the priors rather than computing with them. We performed a factorial model comparison across a large set of Bayesian observer models to identify additional sources of noise and suboptimality. Our analysis rejects several models of stochastic behavior, including probability matching and sample-averaging strategies. Instead we show that subjects' response variability was mainly driven by a combination of a noisy estimation of the parameters of the priors, and by variability in the decision process, which we represent as a noisy or stochastic posterior.</p>
</div>
</front>
<back>
<div1 type="bibliography">
<listBibl>
<biblStruct>
<analytic>
<author>
<name sortKey="Weiss, Y" uniqKey="Weiss Y">Y Weiss</name>
</author>
<author>
<name sortKey="Simoncelli, Ep" uniqKey="Simoncelli E">EP Simoncelli</name>
</author>
<author>
<name sortKey="Adelson, Eh" uniqKey="Adelson E">EH Adelson</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Stocker, Aa" uniqKey="Stocker A">AA Stocker</name>
</author>
<author>
<name sortKey="Simoncelli, Ep" uniqKey="Simoncelli E">EP Simoncelli</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Girshick, A" uniqKey="Girshick A">A Girshick</name>
</author>
<author>
<name sortKey="Landy, M" uniqKey="Landy M">M Landy</name>
</author>
<author>
<name sortKey="Simoncelli, E" uniqKey="Simoncelli E">E Simoncelli</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Chalk, M" uniqKey="Chalk M">M Chalk</name>
</author>
<author>
<name sortKey="Seitz, A" uniqKey="Seitz A">A Seitz</name>
</author>
<author>
<name sortKey="Series, P" uniqKey="Series P">P Seriès</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Miyazaki, M" uniqKey="Miyazaki M">M Miyazaki</name>
</author>
<author>
<name sortKey="Nozaki, D" uniqKey="Nozaki D">D Nozaki</name>
</author>
<author>
<name sortKey="Nakajima, Y" uniqKey="Nakajima Y">Y Nakajima</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Jazayeri, M" uniqKey="Jazayeri M">M Jazayeri</name>
</author>
<author>
<name sortKey="Shadlen, Mn" uniqKey="Shadlen M">MN Shadlen</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Ahrens, Mb" uniqKey="Ahrens M">MB Ahrens</name>
</author>
<author>
<name sortKey="Sahani, M" uniqKey="Sahani M">M Sahani</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Acerbi, L" uniqKey="Acerbi L">L Acerbi</name>
</author>
<author>
<name sortKey="Wolpert, Dm" uniqKey="Wolpert D">DM Wolpert</name>
</author>
<author>
<name sortKey="Vijayakumar, S" uniqKey="Vijayakumar S">S Vijayakumar</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Kording, Kp" uniqKey="Kording K">KP Kording</name>
</author>
<author>
<name sortKey="Wolpert, Dm" uniqKey="Wolpert D">DM Wolpert</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Tassinari, H" uniqKey="Tassinari H">H Tassinari</name>
</author>
<author>
<name sortKey="Hudson, T" uniqKey="Hudson T">T Hudson</name>
</author>
<author>
<name sortKey="Landy, M" uniqKey="Landy M">M Landy</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Berniker, M" uniqKey="Berniker M">M Berniker</name>
</author>
<author>
<name sortKey="Voss, M" uniqKey="Voss M">M Voss</name>
</author>
<author>
<name sortKey="Kording, K" uniqKey="Kording K">K Kording</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Adams, Wj" uniqKey="Adams W">WJ Adams</name>
</author>
<author>
<name sortKey="Graf, Ew" uniqKey="Graf E">EW Graf</name>
</author>
<author>
<name sortKey="Ernst, Mo" uniqKey="Ernst M">MO Ernst</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Sotiropoulos, G" uniqKey="Sotiropoulos G">G Sotiropoulos</name>
</author>
<author>
<name sortKey="Seitz, A" uniqKey="Seitz A">A Seitz</name>
</author>
<author>
<name sortKey="Series, P" uniqKey="Series P">P Seriès</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Kording, K" uniqKey="Kording K">K Kording</name>
</author>
<author>
<name sortKey="Wolpert, D" uniqKey="Wolpert D">D Wolpert</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Trommersh User, J" uniqKey="Trommersh User J">J Trommershäuser</name>
</author>
<author>
<name sortKey="Maloney, L" uniqKey="Maloney L">L Maloney</name>
</author>
<author>
<name sortKey="Landy, M" uniqKey="Landy M">M Landy</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Sundareswara, R" uniqKey="Sundareswara R">R Sundareswara</name>
</author>
<author>
<name sortKey="Schrater, Pr" uniqKey="Schrater P">PR Schrater</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Vul, E" uniqKey="Vul E">E Vul</name>
</author>
<author>
<name sortKey="Pashler, H" uniqKey="Pashler H">H Pashler</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Fiser, J" uniqKey="Fiser J">J Fiser</name>
</author>
<author>
<name sortKey="Berkes, P" uniqKey="Berkes P">P Berkes</name>
</author>
<author>
<name sortKey="Orban, G" uniqKey="Orban G">G Orbán</name>
</author>
<author>
<name sortKey="Lengyel, M" uniqKey="Lengyel M">M Lengyel</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Gekas, N" uniqKey="Gekas N">N Gekas</name>
</author>
<author>
<name sortKey="Chalk, M" uniqKey="Chalk M">M Chalk</name>
</author>
<author>
<name sortKey="Seitz, Ar" uniqKey="Seitz A">AR Seitz</name>
</author>
<author>
<name sortKey="Series, P" uniqKey="Series P">P Seriès</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Van Den Berg, R" uniqKey="Van Den Berg R">R van den Berg</name>
</author>
<author>
<name sortKey="Awh, E" uniqKey="Awh E">E Awh</name>
</author>
<author>
<name sortKey="Ma, Wj" uniqKey="Ma W">WJ Ma</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Kording, Kp" uniqKey="Kording K">KP Körding</name>
</author>
<author>
<name sortKey="Wolpert, Dm" uniqKey="Wolpert D">DM Wolpert</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Hudson, Te" uniqKey="Hudson T">TE Hudson</name>
</author>
<author>
<name sortKey="Maloney, Lt" uniqKey="Maloney L">LT Maloney</name>
</author>
<author>
<name sortKey="Landy, Ms" uniqKey="Landy M">MS Landy</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Campbell, L" uniqKey="Campbell L">L Campbell</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Wozny, Dr" uniqKey="Wozny D">DR Wozny</name>
</author>
<author>
<name sortKey="Beierholm, Ur" uniqKey="Beierholm U">UR Beierholm</name>
</author>
<author>
<name sortKey="Shams, L" uniqKey="Shams L">L Shams</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Zhang, H" uniqKey="Zhang H">H Zhang</name>
</author>
<author>
<name sortKey="Maloney, L" uniqKey="Maloney L">L Maloney</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Wichmann, Fa" uniqKey="Wichmann F">FA Wichmann</name>
</author>
<author>
<name sortKey="Hill, Nj" uniqKey="Hill N">NJ Hill</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Neal, R" uniqKey="Neal R">R Neal</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Spiegelhalter, Dj" uniqKey="Spiegelhalter D">DJ Spiegelhalter</name>
</author>
<author>
<name sortKey="Best, Ng" uniqKey="Best N">NG Best</name>
</author>
<author>
<name sortKey="Carlin, Bp" uniqKey="Carlin B">BP Carlin</name>
</author>
<author>
<name sortKey="Van Der Linde, A" uniqKey="Van Der Linde A">A Van Der Linde</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Stephan, Ke" uniqKey="Stephan K">KE Stephan</name>
</author>
<author>
<name sortKey="Penny, Wd" uniqKey="Penny W">WD Penny</name>
</author>
<author>
<name sortKey="Daunizeau, J" uniqKey="Daunizeau J">J Daunizeau</name>
</author>
<author>
<name sortKey="Moran, Rj" uniqKey="Moran R">RJ Moran</name>
</author>
<author>
<name sortKey="Friston, Kj" uniqKey="Friston K">KJ Friston</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Kass, Re" uniqKey="Kass R">RE Kass</name>
</author>
<author>
<name sortKey="Raftery, Ae" uniqKey="Raftery A">AE Raftery</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Battaglia, Pw" uniqKey="Battaglia P">PW Battaglia</name>
</author>
<author>
<name sortKey="Kersten, D" uniqKey="Kersten D">D Kersten</name>
</author>
<author>
<name sortKey="Schrater, Pr" uniqKey="Schrater P">PR Schrater</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Battaglia, Pw" uniqKey="Battaglia P">PW Battaglia</name>
</author>
<author>
<name sortKey="Hamrick, Jb" uniqKey="Hamrick J">JB Hamrick</name>
</author>
<author>
<name sortKey="Tenenbaum, Jb" uniqKey="Tenenbaum J">JB Tenenbaum</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Dakin, Sc" uniqKey="Dakin S">SC Dakin</name>
</author>
<author>
<name sortKey="Tibber, Ms" uniqKey="Tibber M">MS Tibber</name>
</author>
<author>
<name sortKey="Greenwood, Ja" uniqKey="Greenwood J">JA Greenwood</name>
</author>
<author>
<name sortKey="Morgan, Mj" uniqKey="Morgan M">MJ Morgan</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Kuss, M" uniqKey="Kuss M">M Kuss</name>
</author>
<author>
<name sortKey="J Kel, F" uniqKey="J Kel F">F Jäkel</name>
</author>
<author>
<name sortKey="Wichmann, Fa" uniqKey="Wichmann F">FA Wichmann</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Kahneman, D" uniqKey="Kahneman D">D Kahneman</name>
</author>
<author>
<name sortKey="Tversky, A" uniqKey="Tversky A">A Tversky</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Tversky, A" uniqKey="Tversky A">A Tversky</name>
</author>
<author>
<name sortKey="Kahneman, D" uniqKey="Kahneman D">D Kahneman</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Feldman, J" uniqKey="Feldman J">J Feldman</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Mamassian, P" uniqKey="Mamassian P">P Mamassian</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Zhang, H" uniqKey="Zhang H">H Zhang</name>
</author>
<author>
<name sortKey="Morvan, C" uniqKey="Morvan C">C Morvan</name>
</author>
<author>
<name sortKey="Maloney, Lt" uniqKey="Maloney L">LT Maloney</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Zhang, H" uniqKey="Zhang H">H Zhang</name>
</author>
<author>
<name sortKey="Daw, Nd" uniqKey="Daw N">ND Daw</name>
</author>
<author>
<name sortKey="Maloney, Lt" uniqKey="Maloney L">LT Maloney</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Trommersh User, J" uniqKey="Trommersh User J">J Trommershäuser</name>
</author>
<author>
<name sortKey="Gepshtein, S" uniqKey="Gepshtein S">S Gepshtein</name>
</author>
<author>
<name sortKey="Maloney, Lt" uniqKey="Maloney L">LT Maloney</name>
</author>
<author>
<name sortKey="Landy, Ms" uniqKey="Landy M">MS Landy</name>
</author>
<author>
<name sortKey="Banks, Ms" uniqKey="Banks M">MS Banks</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Gepshtein, S" uniqKey="Gepshtein S">S Gepshtein</name>
</author>
<author>
<name sortKey="Seydell, A" uniqKey="Seydell A">A Seydell</name>
</author>
<author>
<name sortKey="Trommersh User, J" uniqKey="Trommersh User J">J Trommershäuser</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Faisal, Aa" uniqKey="Faisal A">AA Faisal</name>
</author>
<author>
<name sortKey="Selen, Lp" uniqKey="Selen L">LP Selen</name>
</author>
<author>
<name sortKey="Wolpert, Dm" uniqKey="Wolpert D">DM Wolpert</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Ma, Wj" uniqKey="Ma W">WJ Ma</name>
</author>
<author>
<name sortKey="Beck, Jm" uniqKey="Beck J">JM Beck</name>
</author>
<author>
<name sortKey="Latham, Pe" uniqKey="Latham P">PE Latham</name>
</author>
<author>
<name sortKey="Pouget, A" uniqKey="Pouget A">A Pouget</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Beck, Jm" uniqKey="Beck J">JM Beck</name>
</author>
<author>
<name sortKey="Ma, Wj" uniqKey="Ma W">WJ Ma</name>
</author>
<author>
<name sortKey="Pitkow, X" uniqKey="Pitkow X">X Pitkow</name>
</author>
<author>
<name sortKey="Latham, Pe" uniqKey="Latham P">PE Latham</name>
</author>
<author>
<name sortKey="Pouget, A" uniqKey="Pouget A">A Pouget</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Gaissmaier, W" uniqKey="Gaissmaier W">W Gaissmaier</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Green, C" uniqKey="Green C">C Green</name>
</author>
<author>
<name sortKey="Benson, C" uniqKey="Benson C">C Benson</name>
</author>
<author>
<name sortKey="Kersten, D" uniqKey="Kersten D">D Kersten</name>
</author>
<author>
<name sortKey="Schrater, P" uniqKey="Schrater P">P Schrater</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Oldfield, Rc" uniqKey="Oldfield R">RC Oldfield</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Howard, Is" uniqKey="Howard I">IS Howard</name>
</author>
<author>
<name sortKey="Ingram, Jn" uniqKey="Ingram J">JN Ingram</name>
</author>
<author>
<name sortKey="Wolpert, Dm" uniqKey="Wolpert D">DM Wolpert</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Teuscher, F" uniqKey="Teuscher F">F Teuscher</name>
</author>
<author>
<name sortKey="Guiard, V" uniqKey="Guiard V">V Guiard</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Carreira Perpinan, Ma" uniqKey="Carreira Perpinan M">MA Carreira-Perpinan</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
<biblStruct></biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Gelman, A" uniqKey="Gelman A">A Gelman</name>
</author>
<author>
<name sortKey="Rubin, Db" uniqKey="Rubin D">DB Rubin</name>
</author>
</analytic>
</biblStruct>
<biblStruct>
<analytic>
<author>
<name sortKey="Greenhouse, Sw" uniqKey="Greenhouse S">SW Greenhouse</name>
</author>
<author>
<name sortKey="Geisser, S" uniqKey="Geisser S">S Geisser</name>
</author>
</analytic>
</biblStruct>
<biblStruct></biblStruct>
</listBibl>
</div1>
</back>
</TEI>
<pmc article-type="research-article">
<pmc-dir>properties open_access</pmc-dir>
<front>
<journal-meta>
<journal-id journal-id-type="nlm-ta">PLoS Comput Biol</journal-id>
<journal-id journal-id-type="iso-abbrev">PLoS Comput. Biol</journal-id>
<journal-id journal-id-type="publisher-id">plos</journal-id>
<journal-id journal-id-type="pmc">ploscomp</journal-id>
<journal-title-group>
<journal-title>PLoS Computational Biology</journal-title>
</journal-title-group>
<issn pub-type="ppub">1553-734X</issn>
<issn pub-type="epub">1553-7358</issn>
<publisher>
<publisher-name>Public Library of Science</publisher-name>
<publisher-loc>San Francisco, USA</publisher-loc>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="pmid">24945142</article-id>
<article-id pub-id-type="pmc">4063671</article-id>
<article-id pub-id-type="publisher-id">PCOMPBIOL-D-13-02082</article-id>
<article-id pub-id-type="doi">10.1371/journal.pcbi.1003661</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Research Article</subject>
</subj-group>
<subj-group subj-group-type="Discipline-v2">
<subject>Biology and Life Sciences</subject>
<subj-group>
<subject>Computational Biology</subject>
<subj-group>
<subject>Computational Neuroscience</subject>
</subj-group>
</subj-group>
<subj-group>
<subject>Neuroscience</subject>
<subj-group>
<subject>Cognitive Science</subject>
<subj-group>
<subject>Cognition</subject>
<subj-group>
<subject>Decision Making</subject>
</subj-group>
</subj-group>
</subj-group>
<subj-group>
<subject>Cognitive Neuroscience</subject>
</subj-group>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>On the Origins of Suboptimality in Human Probabilistic Inference</article-title>
<alt-title alt-title-type="running-head">Suboptimality in Probabilistic Inference</alt-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name>
<surname>Acerbi</surname>
<given-names>Luigi</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
<xref ref-type="aff" rid="aff2">
<sup>2</sup>
</xref>
<xref ref-type="corresp" rid="cor1">
<sup>*</sup>
</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Vijayakumar</surname>
<given-names>Sethu</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Wolpert</surname>
<given-names>Daniel M.</given-names>
</name>
<xref ref-type="aff" rid="aff3">
<sup>3</sup>
</xref>
</contrib>
</contrib-group>
<aff id="aff1">
<label>1</label>
<addr-line>Institute of Perception, Action and Behaviour, School of Informatics, University of Edinburgh, Edinburgh, United Kingdom</addr-line>
</aff>
<aff id="aff2">
<label>2</label>
<addr-line>Doctoral Training Centre in Neuroinformatics and Computational Neuroscience, School of Informatics, University of Edinburgh, Edinburgh, United Kingdom</addr-line>
</aff>
<aff id="aff3">
<label>3</label>
<addr-line>Computational and Biological Learning Lab, Department of Engineering, University of Cambridge, Cambridge, United Kingdom</addr-line>
</aff>
<contrib-group>
<contrib contrib-type="editor">
<name>
<surname>Beck</surname>
<given-names>Jeff</given-names>
</name>
<role>Editor</role>
<xref ref-type="aff" rid="edit1"></xref>
</contrib>
</contrib-group>
<aff id="edit1">
<addr-line>Duke University, United States of America</addr-line>
</aff>
<author-notes>
<corresp id="cor1">* E-mail:
<email>L.Acerbi@sms.ed.ac.uk</email>
</corresp>
<fn fn-type="conflict">
<p>The authors have declared that no competing interests exist.</p>
</fn>
<fn fn-type="con">
<p>Conceived and designed the experiments: LA SV DMW. Performed the experiments: LA. Analyzed the data: LA. Wrote the paper: LA SV DMW.</p>
</fn>
</author-notes>
<pub-date pub-type="collection">
<month>6</month>
<year>2014</year>
</pub-date>
<pub-date pub-type="epub">
<day>19</day>
<month>6</month>
<year>2014</year>
</pub-date>
<volume>10</volume>
<issue>6</issue>
<elocation-id>e1003661</elocation-id>
<history>
<date date-type="received">
<day>25</day>
<month>11</month>
<year>2013</year>
</date>
<date date-type="accepted">
<day>25</day>
<month>4</month>
<year>2014</year>
</date>
</history>
<permissions>
<copyright-statement>© 2014 Acerbi et al</copyright-statement>
<copyright-year>2014</copyright-year>
<copyright-holder>Acerbi et al</copyright-holder>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/">
<license-p>This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.</license-p>
</license>
</permissions>
<abstract>
<p>Humans have been shown to combine noisy sensory information with previous experience (priors), in qualitative and sometimes quantitative agreement with the statistically-optimal predictions of Bayesian integration. However, when the prior distribution becomes more complex than a simple Gaussian, such as skewed or bimodal, training takes much longer and performance appears suboptimal. It is unclear whether such suboptimality arises from an imprecise internal representation of the complex prior, or from additional constraints in performing probabilistic computations on complex distributions, even when accurately represented. Here we probe the sources of suboptimality in probabilistic inference using a novel estimation task in which subjects are exposed to an explicitly provided distribution, thereby removing the need to remember the prior. Subjects had to estimate the location of a target given a noisy cue and a visual representation of the prior probability density over locations, which changed on each trial. Different classes of priors were examined (Gaussian, unimodal, bimodal). Subjects' performance was in qualitative agreement with the predictions of Bayesian Decision Theory although generally suboptimal. The degree of suboptimality was modulated by statistical features of the priors but was largely independent of the class of the prior and level of noise in the cue, suggesting that suboptimality in dealing with complex statistical features, such as bimodality, may be due to a problem of acquiring the priors rather than computing with them. We performed a factorial model comparison across a large set of Bayesian observer models to identify additional sources of noise and suboptimality. Our analysis rejects several models of stochastic behavior, including probability matching and sample-averaging strategies. Instead we show that subjects' response variability was mainly driven by a combination of a noisy estimation of the parameters of the priors, and by variability in the decision process, which we represent as a noisy or stochastic posterior.</p>
</abstract>
<abstract abstract-type="summary">
<title>Author Summary</title>
<p>The process of decision making involves combining sensory information with statistics collected from prior experience. This combination is more likely to yield ‘statistically optimal’ behavior when our prior experiences conform to a simple and regular pattern. In contrast, if prior experience has complex patterns, we might require more trial-and-error before finding the optimal solution. This partly explains why, for example, a person deciding the appropriate clothes to wear for the weather on a June day in Italy has a higher chance of success than her counterpart in Scotland. Our study uses a novel experimental setup that examines the role of complexity of prior experience on suboptimal decision making. Participants are asked to find a specific target from an array of potential targets given a cue about its location. Importantly, the ‘prior’ information is presented explicitly so that subjects do not need to recall prior events. Participants' performance, albeit suboptimal, was mostly unaffected by the complexity of the prior distributions, suggesting that remembering the patterns of past events constitutes more of a challenge to decision making than manipulating the complex probabilistic information. We introduce a mathematical description that captures the pattern of human responses in our task better than previous accounts.</p>
</abstract>
<funding-group>
<funding-statement>This work was supported in part by grants EP/F500385/1 and BB/F529254/1 for the University of Edinburgh School of Informatics Doctoral Training Centre in Neuroinformatics and Computational Neuroscience from the UK Engineering and Physical Sciences Research Council, UK Biotechnology and Biological Sciences Research Council, and the UK Medical Research Council (LA). This work was also supported by the Wellcome Trust (DMW), the Human Frontiers Science Program (DMW), and the Royal Society Noreen Murray Professorship in Neurobiology to DMW. SV is supported through grants from Microsoft Research, Royal Academy of Engineering and EU FP7 programs. The work has made use of resources provided by the Edinburgh Compute and Data Facility, which has support from the eDIKT initiative. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.</funding-statement>
</funding-group>
<counts>
<page-count count="23"></page-count>
</counts>
</article-meta>
</front>
<body>
<sec id="s1">
<title>Introduction</title>
<p>Humans have been shown to integrate prior knowledge and sensory information in a probabilistic manner to obtain optimal (or nearly so) estimates of behaviorally relevant stimulus quantities, such as speed
<xref rid="pcbi.1003661-Weiss1" ref-type="bibr">[1]</xref>
,
<xref rid="pcbi.1003661-Stocker1" ref-type="bibr">[2]</xref>
, orientation
<xref rid="pcbi.1003661-Girshick1" ref-type="bibr">[3]</xref>
, direction of motion
<xref rid="pcbi.1003661-Chalk1" ref-type="bibr">[4]</xref>
, interval duration
<xref rid="pcbi.1003661-Miyazaki1" ref-type="bibr">[5]</xref>
<xref rid="pcbi.1003661-Acerbi1" ref-type="bibr">[8]</xref>
and position
<xref rid="pcbi.1003661-Kording1" ref-type="bibr">[9]</xref>
<xref rid="pcbi.1003661-Berniker1" ref-type="bibr">[11]</xref>
. Prior expectations about the values taken by the task-relevant variable are usually assumed to be learned either from statistics of the natural environment
<xref rid="pcbi.1003661-Weiss1" ref-type="bibr">[1]</xref>
<xref rid="pcbi.1003661-Girshick1" ref-type="bibr">[3]</xref>
or during the course of the experiment
<xref rid="pcbi.1003661-Chalk1" ref-type="bibr">[4]</xref>
<xref rid="pcbi.1003661-Jazayeri1" ref-type="bibr">[6]</xref>
,
<xref rid="pcbi.1003661-Acerbi1" ref-type="bibr">[8]</xref>
<xref rid="pcbi.1003661-Berniker1" ref-type="bibr">[11]</xref>
; the latter include studies in which a pre-existing prior is modified in the experimental context
<xref rid="pcbi.1003661-Adams1" ref-type="bibr">[12]</xref>
,
<xref rid="pcbi.1003661-Sotiropoulos1" ref-type="bibr">[13]</xref>
. Behavior in these perceptual and sensorimotor tasks is qualitatively and often quantitatively well described by Bayesian Decision Theory (BDT)
<xref rid="pcbi.1003661-Kording2" ref-type="bibr">[14]</xref>
,
<xref rid="pcbi.1003661-Trommershuser1" ref-type="bibr">[15]</xref>
.</p>
<p>The extent to which we are capable of performing probabilistic inference on complex distributions that go beyond simple Gaussians, and the algorithms and approximations that we might use, is still unclear
<xref rid="pcbi.1003661-Kording2" ref-type="bibr">[14]</xref>
. For example, it has been suggested that humans might approximate Bayesian computations by drawing random samples from the posterior distribution
<xref rid="pcbi.1003661-Sundareswara1" ref-type="bibr">[16]</xref>
<xref rid="pcbi.1003661-Fiser1" ref-type="bibr">[19]</xref>
. A major problem in testing hypotheses about human probabilistic inference is the difficulty in identifying the source of suboptimality, that is, separating any constraints and idiosyncrasies in performing Bayesian computations per se from any deficiencies in learning and recalling the correct prior. For example, previous work has examined Bayesian integration in the presence of experimentally-imposed bimodal priors
<xref rid="pcbi.1003661-Chalk1" ref-type="bibr">[4]</xref>
,
<xref rid="pcbi.1003661-Acerbi1" ref-type="bibr">[8]</xref>
,
<xref rid="pcbi.1003661-Kording1" ref-type="bibr">[9]</xref>
,
<xref rid="pcbi.1003661-Gekas1" ref-type="bibr">[20]</xref>
. Here the normative prescription of BDT under a wide variety of assumptions would be that responses should be biased towards one peak of the distribution or the other, depending on the current sensory information. However, for such bimodal priors, the emergence of Bayesian biases can require thousands of trials
<xref rid="pcbi.1003661-Kording1" ref-type="bibr">[9]</xref>
or be apparent only on pooled data
<xref rid="pcbi.1003661-Chalk1" ref-type="bibr">[4]</xref>
, and often data show at best a complex pattern of biases which is only in partial agreement with the underlying distribution
<xref rid="pcbi.1003661-Acerbi1" ref-type="bibr">[8]</xref>
,
<xref rid="pcbi.1003661-Gekas1" ref-type="bibr">[20]</xref>
. It is unknown whether this mismatch is due to the difficulty of learning statistical features of the bimodal distribution or if the bimodal prior is actually fully learned but our ability to perform Bayesian computation with it is limited. In the current study we look systematically at how people integrate uncertain cues with trial-dependent ‘prior’ distributions that are explicitly made available to the subjects. The priors were displayed as an array of potential targets distributed according to various density classes – Gaussian, unimodal or bimodal. Our paradigm allows full control over the generative model of the task and separates the aspect of computing with a probability distribution from the problem of learning and recalling a prior. We examine subjects' performance in manipulating probabilistic information as a function of the shape of the prior. Participants' behavior in the task is in qualitative agreement with Bayesian integration, although quite variable and generally suboptimal, but the degree of suboptimality does not differ significantly across different classes of distributions or levels of reliability of the cue. In particular, performance was not greatly affected by complexity of the distribution per se – for instance, people's performance with bimodal priors is analogous to that with Gaussian priors, in contrast to previous learning experiments
<xref rid="pcbi.1003661-Acerbi1" ref-type="bibr">[8]</xref>
,
<xref rid="pcbi.1003661-Kording1" ref-type="bibr">[9]</xref>
. This finding suggests that major deviations encountered in previous studies are likely to be primarily caused by the difficulty in learning complex statistical features rather than computing with them.</p>
<p>We systematically explore the sources of suboptimality and variability in subjects' responses by employing a methodology that has been recently called
<italic>factorial model comparison</italic>
<xref rid="pcbi.1003661-vandenBerg1" ref-type="bibr">[21]</xref>
. Using this approach we generate a set of models by combining different sources of suboptimality, such as different approximations in decision making with different forms of sensory noise, in a factorial manner. Our model comparison is able to reject some common models of variability in decision making, such as probability matching with the posterior distribution (posterior-matching) or a sampling-average strategy consisting of averaging a number of samples from the posterior distribution. The observer model that best describes the data is a Bayesian observer with a slightly mismatched representation of the likelihoods, with sensory noise in the estimation of the parameters of the prior, that occasionally lapses, and most importantly has a stochastic representation of the posterior that may represent additional variability in the inference process or in action selection.</p>
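<!--
The factorial construction described above can be illustrated concretely. A minimal sketch in Python; the factor names below are illustrative labels taken from the surrounding text, not the paper's actual model identifiers:

# Sketch of a factorial model space: each observer model is one combination
# of a decision strategy with candidate sources of noise and suboptimality.
from itertools import product

decision_strategies = ["bdt_optimal", "posterior_matching", "sample_averaging"]
prior_estimation = ["exact_prior_parameters", "noisy_prior_parameters"]
posterior = ["deterministic_posterior", "stochastic_posterior"]
lapse = ["no_lapse", "occasional_lapse"]

model_space = list(product(decision_strategies, prior_estimation, posterior, lapse))
print(len(model_space), "candidate observer models")  # 3 * 2 * 2 * 2 = 24

Each combination is then fit to every subject's data, and the models are compared to identify which factor levels the responses support.
-->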
</sec>
<sec id="s2">
<title>Results</title>
<p>Subjects were required to locate an unknown target given probabilistic information about its position along a target line (
<xref ref-type="fig" rid="pcbi-1003661-g001">Figure 1a–b</xref>
). Information consisted of a visual representation of the a priori probability distribution of targets for that trial and a noisy cue about the actual target position (
<xref ref-type="fig" rid="pcbi-1003661-g001">Figure 1b</xref>
). On each trial a hundred potential targets (dots) were displayed on a horizontal line according to a discrete representation of a trial-dependent ‘prior’ distribution
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e001.jpg"></inline-graphic>
</inline-formula>
. The true target, unknown to the subject, was chosen at random from the potential targets with uniform probability. A noisy cue with horizontal position
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e002.jpg"></inline-graphic>
</inline-formula>
, drawn from a normal distribution centered on the true target, provided partial information about target location. The cue had distance
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e003.jpg"></inline-graphic>
</inline-formula>
from the target line, which could be either a short distance, corresponding to low-variance added noise, or a long distance, corresponding to high-variance noise. Both the prior distribution and the cue remained on screen for the duration of the trial. (See
<xref ref-type="fig" rid="pcbi-1003661-g001">Figure 1c–d</xref>
for the generative model of the task.) The task for the subjects involved moving a circular cursor controlled by a manipulandum towards the target line, ending the movement at their best estimate for the position of the real target. A ‘success’ ensued if the true target was within the cursor radius.</p>
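<!--
To make the generative model concrete, here is a minimal single-trial simulation in Python. All numeric values (prior SD, cue noise levels) are placeholders, not the experiment's actual parameters; coordinates are relative, with the prior mean at 0:

# One trial of the task's generative model (placeholder values throughout).
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
sigma_prior = 0.10  # assumed prior SD, in standardized screen units

# 100 potential targets: a discrete representation of the prior, built from
# equally spaced quantiles of its inverse cdf (as described in Figure 1d).
dots = norm.ppf((np.arange(100) + 0.5) / 100, loc=0.0, scale=sigma_prior)

# The true target is drawn uniformly among the potential targets.
target = rng.choice(dots)

# The cue is Gaussian around the true target; its SD grows with the cue's
# distance from the target line ('short' vs 'long' throw).
sigma_cue = {"short": 0.06, "long": 0.15}["long"]  # placeholder noise levels
cue = rng.normal(target, sigma_cue)
print(f"target = {target:+.3f}, cue = {cue:+.3f}")
-->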
<fig id="pcbi-1003661-g001" orientation="portrait" position="float">
<object-id pub-id-type="doi">10.1371/journal.pcbi.1003661.g001</object-id>
<label>Figure 1</label>
<caption>
<title>Experimental procedure.</title>
<p>
<bold>a: Setup.</bold>
Subjects held the handle of a robotic manipulandum. The visual scene from a CRT monitor, including a cursor that tracked the hand position, was projected into the plane of the hand via a mirror.
<bold>b: Screen setup.</bold>
The screen showed a home position (grey circle), the cursor (red circle) here at the start of a trial, a line of potential targets (dots) and a visual cue (yellow dot). The task consisted in locating the true target among the array of potential targets, given the position of the noisy cue. The coordinate axis was not displayed on screen, and the target line is shaded here only for visualization purposes.
<bold>c: Generative model of the task.</bold>
On each trial the position of the hidden target
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e004.jpg"></inline-graphic>
</inline-formula>
was drawn from a discrete representation of the trial-dependent prior
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e005.jpg"></inline-graphic>
</inline-formula>
, whose shape was chosen randomly from a session-dependent class of distributions. The vertical distance of the cue from the target line,
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e006.jpg"></inline-graphic>
</inline-formula>
, was either ‘short’ or ‘long’, with equal probability. The horizontal position of the cue,
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e007.jpg"></inline-graphic>
</inline-formula>
, depended on
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e008.jpg"></inline-graphic>
</inline-formula>
and
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e009.jpg"></inline-graphic>
</inline-formula>
. The participants had to infer
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e010.jpg"></inline-graphic>
</inline-formula>
given
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e011.jpg"></inline-graphic>
</inline-formula>
,
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e012.jpg"></inline-graphic>
</inline-formula>
and the current prior
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e013.jpg"></inline-graphic>
</inline-formula>
.
<bold>d: Details of the generative model.</bold>
The potential targets constituted a discrete representation of the trial-dependent prior distribution
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e014.jpg"></inline-graphic>
</inline-formula>
; the discrete representation was built by taking equally spaced samples from the inverse of the cdf of the prior,
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e015.jpg"></inline-graphic>
</inline-formula>
. The true target (red dot) was chosen uniformly at random from the potential targets, and the horizontal position of the cue (yellow dot) was drawn from a Gaussian distribution,
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e016.jpg"></inline-graphic>
</inline-formula>
, centered on the true target
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e017.jpg"></inline-graphic>
</inline-formula>
and whose SD was proportional to the distance
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e018.jpg"></inline-graphic>
</inline-formula>
from the target line (either ‘short’ or ‘long’, depending on the trial, for respectively low-noise and high-noise cues). Here we show the location of the cue for a high-noise trial.
<bold>e: Components of Bayesian decision making.</bold>
According to Bayesian Decision Theory, a Bayesian ideal observer combines the prior distribution with the likelihood function to obtain a posterior distribution. The posterior is then convolved with the loss function (in this case whether the target will be encircled by the cursor) and the observer picks the ‘optimal’ target location
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e019.jpg"></inline-graphic>
</inline-formula>
(purple dot) that corresponds to the minimum of the expected loss (dashed line).</p>
</caption>
<graphic xlink:href="pcbi.1003661.g001"></graphic>
</fig>
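<!--
The ideal-observer computation of Figure 1e can be sketched in the same style. This reuses dots, cue and sigma_cue from the simulation above, and assumes a uniform prior over the displayed dots plus an all-or-nothing loss over the cursor radius; the radius value is a placeholder:

# Bayesian ideal observer sketch: posterior over the discrete potential
# targets, expected gain under the success criterion (true target within
# cursor radius r of the response), optimal response at maximum expected
# gain, i.e. minimum expected loss.
import numpy as np
from scipy.stats import norm

def bdt_response(dots, cue, sigma_cue, r=0.02):
    like = norm.pdf(cue, loc=dots, scale=sigma_cue)  # Gaussian likelihood
    post = like / like.sum()                         # uniform prior over dots
    candidates = np.linspace(dots.min(), dots.max(), 1001)
    # Expected gain at x = posterior mass within radius r of x.
    gain = np.array([post[np.abs(dots - x) <= r].sum() for x in candidates])
    return candidates[np.argmax(gain)]

# Example with a Gaussian prior discretized as in the earlier sketch:
dots = norm.ppf((np.arange(100) + 0.5) / 100, scale=0.10)
print(bdt_response(dots, cue=0.12, sigma_cue=0.15))
-->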
<p>To explain the task, subjects were told that each dot represented a child standing in a line in a courtyard, seen from a bird's eye view. On each trial a random child was chosen and, while the subject was ‘not looking’, the child threw a yellow ball (the cue) directly ahead of them towards the opposite wall. Due to their poor throwing skills, the farther they threw the ball, the less precise they were at landing it straight in front of them. The subject's task was to identify the child who threw the ball, after seeing the landing point of the ball, by encircling him or her with the cursor. Subjects were told that the child throwing the ball could be any of the children, chosen at random on each trial with equal probability.</p>
<p>Twenty-four subjects performed a training session in which the ‘prior’ distributions of targets shown on the screen (the set of children) corresponded to Gaussian distributions with a standard deviation (SD) that varied between trials (
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e020.jpg"></inline-graphic>
</inline-formula>
from 0.04 to 0.18 standardized screen units;
<xref ref-type="fig" rid="pcbi-1003661-g002">Figure 2a</xref>
). On each trial the location (mean) of the prior was chosen randomly from a uniform distribution. Half of the trials provided the subjects with a ‘short-distance’ cue about the position of the target (low noise:
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e021.jpg"></inline-graphic>
</inline-formula>
screen units; a short throw of the ball); the other half had a ‘long-distance’ cue (high noise:
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e022.jpg"></inline-graphic>
</inline-formula>
screen units; a long throw). The actual position of the target (the ‘child’ who threw the ball) was revealed at the end of each trial and a displayed score kept track of the number of ‘successes’ in the session (full performance feedback). The training session allowed subjects to learn the structure of the task in a setting in which humans are known to perform in qualitative and often quantitative agreement with Bayesian Decision Theory, i.e. under Gaussian priors
<xref rid="pcbi.1003661-Miyazaki1" ref-type="bibr">[5]</xref>
,
<xref rid="pcbi.1003661-Kording1" ref-type="bibr">[9]</xref>
<xref rid="pcbi.1003661-Berniker1" ref-type="bibr">[11]</xref>
). Note, however, that in contrast with the previous studies, our subjects were required on each trial to compute with a different Gaussian distribution (
<xref ref-type="fig" rid="pcbi-1003661-g002">Figure 2a</xref>
). The use of Gaussian priors in the training session allowed us to assess whether our subjects could use explicit priors in our novel experimental setup in the same way in which they have been shown to learn Gaussian priors through extended implicit practice.</p>
<fig id="pcbi-1003661-g002" orientation="portrait" position="float">
<object-id pub-id-type="doi">10.1371/journal.pcbi.1003661.g002</object-id>
<label>Figure 2</label>
<caption>
<title>Prior distributions.</title>
<p>Each panel shows the (unnormalized) probability density for a ‘prior’ distribution of targets, grouped by experimental session, with eight different priors per session. Within each session, priors are numbered in order of increasing differential entropy (i.e. increasing variance for Gaussian distributions). During the experiment, priors had a random location (mean drawn uniformly) and asymmetrical priors had probability 1/2 of being ‘flipped’. Target positions are shown in standardized screen units (from
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e023.jpg"></inline-graphic>
</inline-formula>
to
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e024.jpg"></inline-graphic>
</inline-formula>
).
<bold>a: Gaussian priors.</bold>
These priors were used for the training session, common to all subjects, and in the Gaussian test session. Standard deviations cover the range
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e025.jpg"></inline-graphic>
</inline-formula>
to
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e026.jpg"></inline-graphic>
</inline-formula>
screen units in equal increments.
<bold>b: Unimodal priors.</bold>
All unimodal priors have fixed SD
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e027.jpg"></inline-graphic>
</inline-formula>
screen units but different skewness and kurtosis (see
<xref ref-type="sec" rid="s4">Methods</xref>
for details).
<bold>c: Bimodal priors.</bold>
All priors in the bimodal session have fixed SD
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e028.jpg"></inline-graphic>
</inline-formula>
screen units but different relative weights and separation between the peaks (see
<xref ref-type="sec" rid="s4">Methods</xref>
).</p>
</caption>
<graphic xlink:href="pcbi.1003661.g002"></graphic>
</fig>
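<!--
The unimodal and bimodal priors of panels b and c are specified here only up to "see Methods". As a hedged illustration of the kind of construction involved, the sketch below builds a two-peaked density as a mixture of two Gaussians with a chosen peak separation and relative weight, rescaled to a fixed overall SD. The mixture form and every number are assumptions, not the paper's actual recipe:

# Bimodal prior sketch: two-component Gaussian mixture with given separation
# and weight, centered at 0 and rescaled so the overall SD is fixed.
import numpy as np

def bimodal_prior(separation, w1, sigma_comp, target_sd):
    w = np.array([w1, 1.0 - w1])                 # component weights
    mu = np.array([-separation / 2, separation / 2])
    mu = mu - w @ mu                             # center the mixture at 0
    var = w @ (sigma_comp**2 + mu**2)            # mixture variance (mean 0)
    scale = target_sd / np.sqrt(var)             # rescale to the target SD
    return mu * scale, w, sigma_comp * scale

means, weights, sd = bimodal_prior(separation=0.20, w1=0.5,
                                   sigma_comp=0.05, target_sd=0.11)
-->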
<p>After the training session, subjects were randomly divided into three groups (
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e029.jpg"></inline-graphic>
</inline-formula>
each) to perform a test session. Test sessions differed with respect to the class of prior distributions displayed during the session. For the ‘Gaussian test’ group, the distributions were the same eight Gaussian distributions of varying SD used during training (
<xref ref-type="fig" rid="pcbi-1003661-g002">Figure 2a</xref>
). For the ‘unimodal test’ group, on each trial the prior was randomly chosen from eight unimodal distributions with fixed SD (
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e030.jpg"></inline-graphic>
</inline-formula>
screen units) but with varying skewness and kurtosis (see
<xref ref-type="sec" rid="s4">Methods</xref>
and
<xref ref-type="fig" rid="pcbi-1003661-g002">Figure 2b</xref>
). For the ‘bimodal test’ group, priors were chosen from eight (mostly) bimodal distributions with fixed SD (again,
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e031.jpg"></inline-graphic>
</inline-formula>
screen units) but variable separation and weighting between peaks (see
<xref ref-type="sec" rid="s4">Methods</xref>
and
<xref ref-type="fig" rid="pcbi-1003661-g002">Figure 2c</xref>
). As in the training session, on each trial the mean of the prior was drawn randomly from a uniform distribution. To preserve global symmetry during the session, asymmetric priors were ‘flipped’ along their center of mass with a probability of
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e032.jpg"></inline-graphic>
</inline-formula>
. During the test session, at the end of each trial subjects were informed whether they ‘succeeded’ or ‘missed’ the target but the target's actual location was not displayed (partial feedback). The ‘Gaussian test’ group allowed us to verify that subjects’ behavior would not change after removal of full performance feedback. The ‘unimodal test’ and ‘bimodal test’ groups provided us with novel information on how subjects perform probabilistic inference with complex distributions. Moreover, non-Gaussian priors allowed us to evaluate several hypotheses about subjects’ behavior that are not testable with Gaussian distributions alone
<xref rid="pcbi.1003661-Krding1" ref-type="bibr">[22]</xref>
.</p>
<sec id="s2a">
<title>Human performance</title>
<p>We first performed a model-free analysis of subjects' performance.
<xref ref-type="fig" rid="pcbi-1003661-g003">Figure 3</xref>
shows three representative prior distributions and the pooled subjects' responses as a function of the cue position for low (red) and high (blue) noise cues. Note that pooled data are used here only for display and all subjects' datasets were analyzed individually. The cue positions and responses in
<xref ref-type="fig" rid="pcbi-1003661-g003">Figure 3</xref>
are reported in a coordinate system relative to the mean of the prior (set as
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e033.jpg"></inline-graphic>
</inline-formula>
). For all analyses we consider relative coordinates without loss of generality, having verified the assumption of translational invariance of our task (see Section 1 in
<xref ref-type="supplementary-material" rid="pcbi.1003661.s002">Text S1</xref>
).</p>
<fig id="pcbi-1003661-g003" orientation="portrait" position="float">
<object-id pub-id-type="doi">10.1371/journal.pcbi.1003661.g003</object-id>
<label>Figure 3</label>
<caption>
<title>Subjects' responses as a function of the position of the cue.</title>
<p>Each panel shows the pooled subjects' responses as a function of the position of the cue either for low-noise cues (red dots) or high-noise cues (blue dots). Each column corresponds to a representative prior distribution, shown at the top, for each different group (Gaussian, unimodal and bimodal). In the response plots, dashed lines correspond to the Bayes optimal strategy given the generative model of the task. The continuous lines are a kernel regression estimate of the mean response (see
<xref ref-type="sec" rid="s4">Methods</xref>
).
<bold>a</bold>
. Exemplar Gaussian prior (prior 4 in
<xref ref-type="fig" rid="pcbi-1003661-g002">Figure 2a</xref>
).
<bold>b</bold>
. Exemplar unimodal prior (platykurtic distribution: prior 4 in
<xref ref-type="fig" rid="pcbi-1003661-g002">Figure 2b</xref>
).
<bold>c</bold>
. Exemplar bimodal prior (prior 5 in
<xref ref-type="fig" rid="pcbi-1003661-g002">Figure 2c</xref>
). Note that in this case the mean response is not necessarily a good description of subjects' behavior, since the marginal distribution of responses for central positions of the cue is bimodal.</p>
</caption>
<graphic xlink:href="pcbi.1003661.g003"></graphic>
</fig>
<p>
<xref ref-type="fig" rid="pcbi-1003661-g003">Figure 3</xref>
shows that subjects' performance was affected by both details of the prior distribution and the cue. Also, subjects' mean performance (continuous lines in
<xref ref-type="fig" rid="pcbi-1003661-g003">Figure 3</xref>
) shows deviations from the predictions of an optimal Bayesian observer (dashed lines), suggesting that subjects' behavior may have been suboptimal.</p>
<sec id="s2a1">
<title>Linear integration with Gaussian priors</title>
<p>We examined how subjects performed in the task under the well-studied case of Gaussian priors
<xref rid="pcbi.1003661-Kording1" ref-type="bibr">[9]</xref>
,
<xref rid="pcbi.1003661-Tassinari1" ref-type="bibr">[10]</xref>
. Given a Gaussian prior with SD
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e034.jpg"></inline-graphic>
</inline-formula>
and a noisy cue with horizontal position
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e035.jpg"></inline-graphic>
</inline-formula>
and known variability
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e036.jpg"></inline-graphic>
</inline-formula>
(assuming Gaussian noise), the most likely target location can be computed through Bayes' theorem. In the relative coordinate system (
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e037.jpg"></inline-graphic>
</inline-formula>
), the optimal target location takes the simple linear form:
<disp-formula id="pcbi.1003661.e038">
<graphic xlink:href="pcbi.1003661.e038.jpg" position="anchor" orientation="portrait"></graphic>
<label>(1)</label>
</disp-formula>
where
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e039.jpg"></inline-graphic>
</inline-formula>
is the linear weight assigned to the cue.</p>
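<p>For concreteness, the optimal linear weight of Eq. 1 can be computed in a few lines. The Python sketch below assumes the standard conjugate-Gaussian result, with the prior mean at zero in relative coordinates; all numerical values are illustrative placeholders of our choosing, not the parameters used in the experiment.</p>
<preformat>
def optimal_cue_weight(sigma_prior, sigma_cue):
    # w = sigma_prior^2 / (sigma_prior^2 + sigma_cue^2): weight given to the cue
    return sigma_prior**2 / (sigma_prior**2 + sigma_cue**2)

def optimal_target(x_cue, sigma_prior, sigma_cue):
    # Posterior-mean estimate of the target location, relative to the prior mean
    return optimal_cue_weight(sigma_prior, sigma_cue) * x_cue

# Wider priors and less noisy cues both push the weight toward 1 (cue-only)
for sigma_prior in (0.04, 0.08, 0.16):          # placeholder prior SDs
    for sigma_cue, label in ((0.03, "low-noise"), (0.09, "high-noise")):
        w = optimal_cue_weight(sigma_prior, sigma_cue)
        print(f"sigma_prior={sigma_prior:.2f}, {label} cue: w={w:.2f}")
</preformat>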
<p>We compared subjects' behavior with the ‘optimal’ strategy predicted by Eq. 1 (see for instance
<xref ref-type="fig" rid="pcbi-1003661-g003">Figure 3a</xref>
; the dashed line corresponds to the optimal strategy). For each subject and each combination of
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e040.jpg"></inline-graphic>
</inline-formula>
and cue type (either ‘short’ or ‘long’, corresponding respectively to low-noise and high-noise cues), we fit the responses
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e041.jpg"></inline-graphic>
</inline-formula>
as a function of the cue position
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e042.jpg"></inline-graphic>
</inline-formula>
with a robust linear fit. The slopes of these fits for the training session are plotted in
<xref ref-type="fig" rid="pcbi-1003661-g004">Figure 4</xref>
; results were similar for the Gaussian test session. Statistical differences between different conditions were assessed using repeated-measures ANOVA (rm-ANOVA) with Greenhouse-Geisser correction (see
<xref ref-type="sec" rid="s4">Methods</xref>
).</p>
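<p>As an aside, a slope estimate of this kind can be obtained with standard robust regressors; the sketch below uses SciPy's Theil-Sen estimator on synthetic data as a stand-in for the robust fitting procedure detailed in Methods (the true slope of 0.6 and the noise levels are made up for illustration).</p>
<preformat>
import numpy as np
from scipy.stats import theilslopes

rng = np.random.default_rng(0)

# Synthetic trials in relative coordinates: responses follow a true slope of 0.6
x_cue = rng.normal(0.0, 0.10, size=150)
responses = 0.6 * x_cue + rng.normal(0.0, 0.02, size=150)

# Theil-Sen robust regression: slope, intercept and slope confidence bounds
slope, intercept, lo, hi = theilslopes(responses, x_cue)
print(f"slope w = {slope:.2f} (95% CI [{lo:.2f}, {hi:.2f}]), bias = {intercept:.3f}")
</preformat>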
<fig id="pcbi-1003661-g004" orientation="portrait" position="float">
<object-id pub-id-type="doi">10.1371/journal.pcbi.1003661.g004</object-id>
<label>Figure 4</label>
<caption>
<title>Response slopes for the training session.</title>
<p>Response slope
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e043.jpg"></inline-graphic>
</inline-formula>
as a function of the SD of the Gaussian prior distribution,
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e044.jpg"></inline-graphic>
</inline-formula>
, plotted respectively for trials with low noise (‘short’ cues, red line) and high noise (‘long’ cues, blue line). The response slope is equivalent to the linear weight assigned to the position of the cue (
<xref ref-type="disp-formula" rid="pcbi.1003661.e038">Eq. 1</xref>
). Dashed lines represent the Bayes optimal strategy given the generative model of the task in the two noise conditions.
<italic>Top</italic>
: Slopes for a representative subject in the training session (slope
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e045.jpg"></inline-graphic>
</inline-formula>
SE).
<italic>Bottom</italic>
: Average slopes across all subjects in the training session (
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e046.jpg"></inline-graphic>
</inline-formula>
, mean
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e047.jpg"></inline-graphic>
</inline-formula>
SE across subjects).</p>
</caption>
<graphic xlink:href="pcbi.1003661.g004"></graphic>
</fig>
<p>In general, subjects did not perform exactly as predicted by the optimal strategy (dashed lines), but they took into account the probabilistic nature of the task. Specifically, subjects tended to give more weight to low-noise cues than to high-noise ones (main effect: Low-noise cues, High-noise cues;
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e048.jpg"></inline-graphic>
</inline-formula>
,
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e049.jpg"></inline-graphic>
</inline-formula>
), and the weights were modulated by the width of the prior (main effect: prior width
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e050.jpg"></inline-graphic>
</inline-formula>
;
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e051.jpg"></inline-graphic>
</inline-formula>
,
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e052.jpg"></inline-graphic>
</inline-formula>
,
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e053.jpg"></inline-graphic>
</inline-formula>
), with wider priors inducing higher weighting of the cue. Interestingly, cue type and width of the prior seemed to influence the weights independently, as no significant interaction was found (interaction: prior width
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e054.jpg"></inline-graphic>
</inline-formula>
cue type;
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e055.jpg"></inline-graphic>
</inline-formula>
,
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e056.jpg"></inline-graphic>
</inline-formula>
,
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e057.jpg"></inline-graphic>
</inline-formula>
). Analogous patterns were found in the Gaussian test session.</p>
<p>We also examined the average bias of subjects' responses (intercept of linear fits), which is expected to be zero for the optimal strategy. On average subjects exhibited a small but significant rightward bias in the training session of
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e058.jpg"></inline-graphic>
</inline-formula>
screen units or
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e059.jpg"></inline-graphic>
</inline-formula>
mm (mean
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e060.jpg"></inline-graphic>
</inline-formula>
SE across subjects,
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e061.jpg"></inline-graphic>
</inline-formula>
). The average bias was only marginally different from zero in the test session:
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e062.jpg"></inline-graphic>
</inline-formula>
screen units (
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e063.jpg"></inline-graphic>
</inline-formula>
mm,
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e064.jpg"></inline-graphic>
</inline-formula>
).</p>
</sec>
<sec id="s2a2">
<title>Optimality index</title>
<p>We developed a general measure of performance that is applicable beyond the Gaussian case. An objective measure of performance in each trial is the success probability, that is, the probability that the target falls within one cursor radius of the given response (final position of the cursor) under the generative model of the task (see
<xref ref-type="sec" rid="s4">Methods</xref>
). We defined the
<italic>optimality index</italic>
for a trial as the success probability normalized by the maximal success probability (the success probability of an optimal response). The optimality index allows us to study variations in subjects' performance which are not trivially induced by variations in the difficulty of the task.
<xref ref-type="fig" rid="pcbi-1003661-g005">Figure 5</xref>
shows the optimality index averaged across subjects for different conditions, in different sessions. Data are also summarized in
<xref ref-type="table" rid="pcbi-1003661-t001">Table 1</xref>
. Priors in
<xref ref-type="fig" rid="pcbi-1003661-g005">Figure 5</xref>
are listed in order of differential entropy (which corresponds to increasing variance for Gaussian priors), with the exception of ‘unimodal test’ priors which are in order of increasing width of the main peak in the prior, as computed through a Laplace approximation. We chose this ordering for priors in the unimodal test session as it highlights the pattern in subjects' performance (see below).</p>
<fig id="pcbi-1003661-g005" orientation="portrait" position="float">
<object-id pub-id-type="doi">10.1371/journal.pcbi.1003661.g005</object-id>
<label>Figure 5</label>
<caption>
<title>Group mean optimality index.</title>
<p>Each bar represents the group-averaged optimality index for a specific session, for each prior (indexed from 1 to 8, see also
<xref ref-type="fig" rid="pcbi-1003661-g002">Figure 2</xref>
) and cue type, low-noise cues (red bars) or high-noise cues (blue bars). The optimality index in each trial is computed as the probability of locating the correct target based on the subjects' responses divided by the probability of locating the target for an optimal responder. The maximal optimality index is 1, for a Bayesian observer with a correct internal model of the task and no sensorimotor noise. Error bars are SE across subjects. Priors are arranged in order of differential entropy (i.e. increasing variance for Gaussian priors), except for ‘unimodal test’ priors, which are listed in order of increasing width of the main peak in the prior (see text). The dotted and dash-dotted lines represent the optimality index of a suboptimal observer that takes into account only the cue or only the prior, respectively. The shaded area is the zone of synergistic integration, in which an observer performs better than by using information from either the prior or the cue alone.</p>
</caption>
<graphic xlink:href="pcbi.1003661.g005"></graphic>
</fig>
<table-wrap id="pcbi-1003661-t001" orientation="portrait" position="float">
<object-id pub-id-type="doi">10.1371/journal.pcbi.1003661.t001</object-id>
<label>Table 1</label>
<caption>
<title>Group mean optimality index.</title>
</caption>
<alternatives>
<graphic id="pcbi-1003661-t001-1" xlink:href="pcbi.1003661.t001"></graphic>
<table frame="hsides" rules="groups">
<colgroup span="1">
<col align="left" span="1"></col>
<col align="center" span="1"></col>
<col align="center" span="1"></col>
<col align="center" span="1"></col>
</colgroup>
<thead>
<tr>
<td align="left" rowspan="1" colspan="1">Session</td>
<td align="left" rowspan="1" colspan="1">Low-noise cue</td>
<td align="left" rowspan="1" colspan="1">High-noise cue</td>
<td align="left" rowspan="1" colspan="1">All cues</td>
</tr>
</thead>
<tbody>
<tr>
<td align="left" rowspan="1" colspan="1">Gaussian training</td>
<td align="left" rowspan="1" colspan="1">
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e065.jpg"></inline-graphic>
</inline-formula>
</td>
<td align="left" rowspan="1" colspan="1">
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e066.jpg"></inline-graphic>
</inline-formula>
</td>
<td align="left" rowspan="1" colspan="1">
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e067.jpg"></inline-graphic>
</inline-formula>
</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1">Gaussian test</td>
<td align="left" rowspan="1" colspan="1">
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e068.jpg"></inline-graphic>
</inline-formula>
</td>
<td align="left" rowspan="1" colspan="1">
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e069.jpg"></inline-graphic>
</inline-formula>
</td>
<td align="left" rowspan="1" colspan="1">
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e070.jpg"></inline-graphic>
</inline-formula>
</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1">Unimodal test</td>
<td align="left" rowspan="1" colspan="1">
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e071.jpg"></inline-graphic>
</inline-formula>
</td>
<td align="left" rowspan="1" colspan="1">
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e072.jpg"></inline-graphic>
</inline-formula>
</td>
<td align="left" rowspan="1" colspan="1">
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e073.jpg"></inline-graphic>
</inline-formula>
</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1">Bimodal test</td>
<td align="left" rowspan="1" colspan="1">
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e074.jpg"></inline-graphic>
</inline-formula>
</td>
<td align="left" rowspan="1" colspan="1">
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e075.jpg"></inline-graphic>
</inline-formula>
</td>
<td align="left" rowspan="1" colspan="1">
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e076.jpg"></inline-graphic>
</inline-formula>
</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1">All sessions</td>
<td align="left" rowspan="1" colspan="1">
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e077.jpg"></inline-graphic>
</inline-formula>
</td>
<td align="left" rowspan="1" colspan="1">
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e078.jpg"></inline-graphic>
</inline-formula>
</td>
<td align="left" rowspan="1" colspan="1">
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e079.jpg"></inline-graphic>
</inline-formula>
</td>
</tr>
</tbody>
</table>
</alternatives>
<table-wrap-foot>
<fn id="nt101">
<label></label>
<p>Each entry reports mean
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e080.jpg"></inline-graphic>
</inline-formula>
SE of the group optimality index for a specific session and cue type, or averaged across all sessions/cues. See also
<xref ref-type="fig" rid="pcbi-1003661-g005">Figure 5</xref>
.</p>
</fn>
</table-wrap-foot>
</table-wrap>
<p>For a comparison,
<xref ref-type="fig" rid="pcbi-1003661-g005">Figure 5</xref>
also shows the optimality index of two suboptimal models that represent two extremal response strategies. Dash-dotted lines correspond to the optimality index of a Bayesian observer that maximizes the probability of locating the correct target considering only the prior distribution (see below for details). Conversely, dotted lines correspond to an observer that only uses the cue and ignores the prior: that is, the observer's response in a trial matches the current position of the cue. The shaded gray area specifies the ‘synergistic integration’ zone, in which the subject integrates information from both prior and cue in a way that leads to better performance than using either the prior or the cue alone. Qualitatively, behavior in the gray area can be regarded as ‘close to optimal’, whereas performance below the gray area is suboptimal. As is clear from
<xref ref-type="fig" rid="pcbi-1003661-g005">Figure 5</xref>
, in all sessions participants were sensitive to probabilistic information from both prior and cue – that is, performance was always above the minimum of the extremal models (dash-dotted and dotted lines) – in agreement with what we observed in
<xref ref-type="fig" rid="pcbi-1003661-g004">Figure 4</xref>
for Gaussian sessions, although their integration was generally suboptimal. Human subjects were analogously found to be suboptimal in a previous task that required taking into account explicit probabilistic information
<xref rid="pcbi.1003661-Hudson1" ref-type="bibr">[23]</xref>
.</p>
<p>We examined how the optimality index changed across different conditions. From the analysis of the training session, it seems that subjects were able to integrate low-noise and high-noise cues for priors of any width equally well, as we found no effect of cue type on performance (main effect: Low-noise cues, High-noise cues;
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e081.jpg"></inline-graphic>
</inline-formula>
,
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e082.jpg"></inline-graphic>
</inline-formula>
) and no significant interaction between cue types and prior width (interaction: prior width
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e083.jpg"></inline-graphic>
</inline-formula>
cue type;
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e084.jpg"></inline-graphic>
</inline-formula>
,
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e085.jpg"></inline-graphic>
</inline-formula>
,
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e086.jpg"></inline-graphic>
</inline-formula>
). However, relative performance was significantly affected by the width of the prior per se (main effect: prior width
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e087.jpg"></inline-graphic>
</inline-formula>
;
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e088.jpg"></inline-graphic>
</inline-formula>
,
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e089.jpg"></inline-graphic>
</inline-formula>
,
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e090.jpg"></inline-graphic>
</inline-formula>
); people tended to perform worse with wider priors, in a way that is not simply explained by the objective decrease in the probability of locating the correct target due to the reduced available information (see
<xref ref-type="sec" rid="s3">Discussion</xref>
).</p>
<p>Results in the Gaussian test session (
<xref ref-type="fig" rid="pcbi-1003661-g005">Figure 5</xref>
top right) replicated what we had obtained in the training session. Subjects' performance was not influenced by cue type (main effect: Low-noise cues, High-noise cues;
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e091.jpg"></inline-graphic>
</inline-formula>
,
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e092.jpg"></inline-graphic>
</inline-formula>
) nor by the interaction between cue types and prior width (interaction: prior width
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e093.jpg"></inline-graphic>
</inline-formula>
cue type;
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e094.jpg"></inline-graphic>
</inline-formula>
,
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e095.jpg"></inline-graphic>
</inline-formula>
,
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e096.jpg"></inline-graphic>
</inline-formula>
). Conversely, as before, the width of the prior affected performance significantly (main effect: prior width
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e097.jpg"></inline-graphic>
</inline-formula>
;
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e098.jpg"></inline-graphic>
</inline-formula>
,
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e099.jpg"></inline-graphic>
</inline-formula>
,
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e100.jpg"></inline-graphic>
</inline-formula>
); again, wider priors were associated with lower relative performance.</p>
<p>A similar pattern of results was found also for the bimodal test session (
<xref ref-type="fig" rid="pcbi-1003661-g005">Figure 5</xref>
bottom right). Performance was affected significantly by the shape of the prior (main effect: prior shape;
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e101.jpg"></inline-graphic>
</inline-formula>
,
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e102.jpg"></inline-graphic>
</inline-formula>
,
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e103.jpg"></inline-graphic>
</inline-formula>
) but otherwise participants integrated cues of different types with equal skill (main effect: Low-noise cues, High-noise cues;
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e104.jpg"></inline-graphic>
</inline-formula>
,
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e105.jpg"></inline-graphic>
</inline-formula>
; interaction: prior shape
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e106.jpg"></inline-graphic>
</inline-formula>
cue type;
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e107.jpg"></inline-graphic>
</inline-formula>
,
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e108.jpg"></inline-graphic>
</inline-formula>
,
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e109.jpg"></inline-graphic>
</inline-formula>
). However, in this case performance was not clearly correlated with a simple measure of the prior or of the average posterior (e.g. differential entropy).</p>
<p>Another scenario emerged in the unimodal test session (
<xref ref-type="fig" rid="pcbi-1003661-g005">Figure 5</xref>
bottom left). Here, subjects' performance was affected not only by the shape of the prior (main effect: prior shape;
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e110.jpg"></inline-graphic>
</inline-formula>
,
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e111.jpg"></inline-graphic>
</inline-formula>
,
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e112.jpg"></inline-graphic>
</inline-formula>
) but also by the type of cue (main effect: Low-noise cues, High-noise cues;
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e113.jpg"></inline-graphic>
</inline-formula>
,
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e114.jpg"></inline-graphic>
</inline-formula>
) and the specific combination of cue and prior (interaction: prior shape
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e115.jpg"></inline-graphic>
</inline-formula>
cue type;
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e116.jpg"></inline-graphic>
</inline-formula>
,
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e117.jpg"></inline-graphic>
</inline-formula>
,
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e118.jpg"></inline-graphic>
</inline-formula>
). Moreover, in this session performance improved for priors whose main peak was broader (see
<xref ref-type="sec" rid="s3">Discussion</xref>
).</p>
<p>Notwithstanding this heterogeneity of results, an overall comparison of participants' relative performance in test sessions (averaging results over priors) did not show statistically significant differences between groups (main effect: group;
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e119.jpg"></inline-graphic>
</inline-formula>
,
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e120.jpg"></inline-graphic>
</inline-formula>
) nor between the two levels of reliability of the cue (main effect: Low-noise cues, High-noise cues;
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e121.jpg"></inline-graphic>
</inline-formula>
,
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e122.jpg"></inline-graphic>
</inline-formula>
); only performance with high-noise cues in the unimodal session was, at most, marginally worse. In particular, relative performance in the Gaussian test and the bimodal test sessions was surprisingly similar, unlike in previous learning experiments (see
<xref ref-type="sec" rid="s3">Discussion</xref>
).</p>
</sec>
<sec id="s2a3">
<title>Effects of uncertainty on reaction time</title>
<p>Lastly, we examined the effect of uncertainty on subjects' reaction time (time to start movement after the ‘go’ beep) in each trial. Uncertainty was quantified as the SD of the posterior distribution in the current trial,
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e123.jpg"></inline-graphic>
</inline-formula>
(an alternative measure of spread, exponential entropy
<xref rid="pcbi.1003661-Campbell1" ref-type="bibr">[24]</xref>
, gave analogous results). We found that the average subjects' reaction time grew almost linearly with
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e124.jpg"></inline-graphic>
</inline-formula>
(
<xref ref-type="fig" rid="pcbi-1003661-g006">Figure 6</xref>
). The average change in reaction times (from lowest to highest uncertainty in the posterior) was substantial during the training session (
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e125.jpg"></inline-graphic>
</inline-formula>
ms, about
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e126.jpg"></inline-graphic>
</inline-formula>
change), although less so in subsequent test sessions.</p>
<fig id="pcbi-1003661-g006" orientation="portrait" position="float">
<object-id pub-id-type="doi">10.1371/journal.pcbi.1003661.g006</object-id>
<label>Figure 6</label>
<caption>
<title>Average reaction times as a function of the SD of the posterior distribution.</title>
<p>Each panel shows the average reaction times (mean
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e127.jpg"></inline-graphic>
</inline-formula>
SE across subjects) for a given session as a function of the SD of the posterior distribution,
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e128.jpg"></inline-graphic>
</inline-formula>
(individual data were smoothed with a kernel regression estimate, see
<xref ref-type="sec" rid="s4">Methods</xref>
). Dashed lines are robust linear fits to the reaction time data. For all sessions the slope of the linear regression is significantly different from zero (
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e129.jpg"></inline-graphic>
</inline-formula>
).</p>
</caption>
<graphic xlink:href="pcbi.1003661.g006"></graphic>
</fig>
</sec>
</sec>
<sec id="s2b">
<title>Suboptimal Bayesian observer models</title>
<p>Our model-free analysis showed that subjects' performance in the task was suboptimal. Here we examine the source of this apparent suboptimality. Subjects' performance is modelled with a family of Bayesian ideal observers which incorporate various hypotheses about the decision-making process and internal representation of the task, with the aim of teasing out the major sources of subjects' suboptimality; see
<xref ref-type="fig" rid="pcbi-1003661-g001">Figure 1e</xref>
for a depiction of the elements of decision making in a trial. All these observers are ‘Bayesian’ because they build a posterior distribution through Bayes' rule, but the operations they perform with the posterior can differ from the normative prescriptions of Bayesian Decision Theory (BDT).</p>
<p>We construct a large model set with a factorial approach that consists of combining independent model ‘factors’ that can take different ‘levels’
<xref rid="pcbi.1003661-Acerbi1" ref-type="bibr">[8]</xref>
,
<xref rid="pcbi.1003661-vandenBerg1" ref-type="bibr">[21]</xref>
. The basic factors we consider are:</p>
<list list-type="order">
<list-item>
<p>
<italic>Decision making</italic>
(3 levels): Bayesian Decision Theory (‘BDT’), stochastic posterior (‘SPK’), posterior probability matching (‘PPM’).</p>
</list-item>
<list-item>
<p>
<italic>Cue-estimation sensory noise</italic>
(2 levels): absent or present (‘S’).</p>
</list-item>
<list-item>
<p>
<italic>Noisy estimation of the prior</italic>
(2 levels): absent or present (‘P’).</p>
</list-item>
<list-item>
<p>
<italic>Lapse</italic>
(2 levels): absent or present (‘L’).</p>
</list-item>
</list>
<p>Observer models are identified by a model string; for example, ‘BDT-P-L’ indicates an observer model that follows BDT with a noisy estimate of the prior and suffers from occasional lapses. Our basic model set comprises 24 observer models; we also considered several variants of these models, described in the text. All main factors are explained in the following sections and summarized in
<xref ref-type="table" rid="pcbi-1003661-t002">Table 2</xref>
). The term ‘model component’ is used throughout the text to indicate both factors and levels.</p>
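<p>To make the naming scheme concrete, the sketch below simply enumerates the resulting 24 model strings (a minimal illustration of the factorial construction, not the authors' code):</p>
<preformat>
from itertools import product

decision_levels = ["BDT", "SPK", "PPM"]   # decision-making factor (3 levels)
optional_factors = ["S", "P", "L"]        # cue noise, prior noise, lapse (on/off)

models = []
for dm in decision_levels:
    for flags in product((False, True), repeat=len(optional_factors)):
        parts = [dm] + [f for f, on in zip(optional_factors, flags) if on]
        models.append("-".join(parts))

print(len(models))  # 24
print(models[:4])   # ['BDT', 'BDT-L', 'BDT-P', 'BDT-P-L']
</preformat>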
<table-wrap id="pcbi-1003661-t002" orientation="portrait" position="float">
<object-id pub-id-type="doi">10.1371/journal.pcbi.1003661.t002</object-id>
<label>Table 2</label>
<caption>
<title>Set of model factors.</title>
</caption>
<alternatives>
<graphic id="pcbi-1003661-t002-2" xlink:href="pcbi.1003661.t002"></graphic>
<table frame="hsides" rules="groups">
<colgroup span="1">
<col align="left" span="1"></col>
<col align="center" span="1"></col>
<col align="center" span="1"></col>
<col align="center" span="1"></col>
</colgroup>
<thead>
<tr>
<td align="left" rowspan="1" colspan="1">Label</td>
<td align="left" rowspan="1" colspan="1">Model description</td>
<td align="left" rowspan="1" colspan="1">
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e130.jpg"></inline-graphic>
</inline-formula>
parameters</td>
<td align="left" rowspan="1" colspan="1">Free parameters (
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e131.jpg"></inline-graphic>
</inline-formula>
)</td>
</tr>
</thead>
<tbody>
<tr>
<td align="left" rowspan="1" colspan="1">BDT</td>
<td align="left" rowspan="1" colspan="1">Decision making: BDT</td>
<td align="left" rowspan="1" colspan="1">4</td>
<td align="left" rowspan="1" colspan="1">
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e132.jpg"></inline-graphic>
</inline-formula>
</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1">PPM</td>
<td align="left" rowspan="1" colspan="1">Decision making: Posterior probability matching</td>
<td align="left" rowspan="1" colspan="1">4</td>
<td align="left" rowspan="1" colspan="1">
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e133.jpg"></inline-graphic>
</inline-formula>
</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1">SPK</td>
<td align="left" rowspan="1" colspan="1">Decision making: Stochastic posterior</td>
<td align="left" rowspan="1" colspan="1">6</td>
<td align="left" rowspan="1" colspan="1">
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e134.jpg"></inline-graphic>
</inline-formula>
</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1">PSA</td>
<td align="left" rowspan="1" colspan="1">Decision making: Posterior sampling average (
<sup>*</sup>
)</td>
<td align="left" rowspan="1" colspan="1">6</td>
<td align="left" rowspan="1" colspan="1">
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e135.jpg"></inline-graphic>
</inline-formula>
</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1">S</td>
<td align="left" rowspan="1" colspan="1">Cue-estimation noise</td>
<td align="left" rowspan="1" colspan="1">
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e136.jpg"></inline-graphic>
</inline-formula>
</td>
<td align="left" rowspan="1" colspan="1">
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e137.jpg"></inline-graphic>
</inline-formula>
</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1">P</td>
<td align="left" rowspan="1" colspan="1">Prior estimation noise</td>
<td align="left" rowspan="1" colspan="1">
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e138.jpg"></inline-graphic>
</inline-formula>
</td>
<td align="left" rowspan="1" colspan="1">
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e139.jpg"></inline-graphic>
</inline-formula>
</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1">L</td>
<td align="left" rowspan="1" colspan="1">Lapse</td>
<td align="left" rowspan="1" colspan="1">
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e140.jpg"></inline-graphic>
</inline-formula>
</td>
<td align="left" rowspan="1" colspan="1">
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e141.jpg"></inline-graphic>
</inline-formula>
</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1">MV</td>
<td align="left" rowspan="1" colspan="1">Gaussian approximation: mean/variance (
<sup>*</sup>
)</td>
<td align="left" rowspan="1" colspan="1"></td>
<td align="left" rowspan="1" colspan="1"></td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1">LA</td>
<td align="left" rowspan="1" colspan="1">Gaussian approximation: Laplace approximation (
<sup>*</sup>
)</td>
<td align="left" rowspan="1" colspan="1"></td>
<td align="left" rowspan="1" colspan="1"></td>
</tr>
</tbody>
</table>
</alternatives>
<table-wrap-foot>
<fn id="nt102">
<label></label>
<p>Table of all major model factors, identified by a label and short description. An observer model is built by choosing a model level for decision making and then optionally adding other components. For each model component the number of free parameters is specified. A ‘
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e142.jpg"></inline-graphic>
</inline-formula>
’ means that a parameter is specified independently for training and test sessions; otherwise parameters are shared across sessions. See main text and
<xref ref-type="sec" rid="s4">Methods</xref>
for the meaning of the various parameters. (
<sup>*</sup>
) These additional components appear in the comparison of alternative models of decision making.</p>
</fn>
</table-wrap-foot>
</table-wrap>
<sec id="s2b1">
<title>Decision making: Standard BDT observer (‘BDT’)</title>
<p>The ‘decision-making’ factor comprises model components with different assumptions about the decision process. We start by describing the ‘baseline’ Bayesian observer model, BDT, which follows standard BDT. Suboptimality, in this case, emerges if the observer’s internal estimates of the task parameters take values different from the true ones. As all subsequent models are variations of the BDT observer, we describe this model in some detail.</p>
<p>On each trial the information available to the observer comprises the ‘prior’ distribution
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e143.jpg"></inline-graphic>
</inline-formula>
, the cue position
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e144.jpg"></inline-graphic>
</inline-formula>
, and the distance
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e145.jpg"></inline-graphic>
</inline-formula>
of the cue from the target line, which is a proxy for cue variability,
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e146.jpg"></inline-graphic>
</inline-formula>
. The posterior distribution of target location,
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e147.jpg"></inline-graphic>
</inline-formula>
, is computed by multiplying together the prior with the likelihood function. For the moment we assume the observer has perfect access to the displayed cue location and prior, and knowledge that cue variability is normally distributed. However, we allow the observer's estimate of the variance of the likelihood (
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e148.jpg"></inline-graphic>
</inline-formula>
and
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e149.jpg"></inline-graphic>
</inline-formula>
) to mismatch the actual variance (
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e150.jpg"></inline-graphic>
</inline-formula>
and
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e151.jpg"></inline-graphic>
</inline-formula>
). Therefore the posterior is given by:
<disp-formula id="pcbi.1003661.e152">
<graphic xlink:href="pcbi.1003661.e152.jpg" position="anchor" orientation="portrait"></graphic>
<label>(2)</label>
</disp-formula>
where
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e153.jpg"></inline-graphic>
</inline-formula>
denotes a normal distribution with mean
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e154.jpg"></inline-graphic>
</inline-formula>
and variance
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e155.jpg"></inline-graphic>
</inline-formula>
.</p>
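<p>Numerically, Eq. 2 amounts to pointwise multiplication on a grid. A minimal sketch, with a placeholder bimodal prior and illustrative parameter values of our choosing:</p>
<preformat>
import numpy as np

def gauss(x, mu, var):
    return np.exp(-0.5 * (x - mu) ** 2 / var) / np.sqrt(2.0 * np.pi * var)

grid = np.linspace(-0.5, 0.5, 2001)
dx = grid[1] - grid[0]

# Placeholder mixture-of-two-Gaussians prior, as displayed on a given trial
prior = 0.5 * gauss(grid, -0.10, 0.04**2) + 0.5 * gauss(grid, 0.10, 0.04**2)

x_cue, sigma_hat = 0.05, 0.06   # cue position and *subjective* likelihood SD

posterior = prior * gauss(grid, x_cue, sigma_hat**2)  # Eq. 2, unnormalized
posterior /= np.sum(posterior) * dx
</preformat>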
<p>In general, for any given trial, the choice the subject makes (desired pointing location for
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e156.jpg"></inline-graphic>
</inline-formula>
) can be a probabilistic one, leading to a ‘target choice’ distribution. However, for standard BDT, the choice is deterministic given the trial parameters, leading to a ‘target choice’ distribution that collapses to a delta function:
<disp-formula id="pcbi.1003661.e157">
<graphic xlink:href="pcbi.1003661.e157.jpg" position="anchor" orientation="portrait"></graphic>
<label>(3)</label>
</disp-formula>
where
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e158.jpg"></inline-graphic>
</inline-formula>
is the ‘optimal’ target position that minimizes the observer's expected loss. The explicit task in our experiment is to place the target within the radius of the cursor, which is equivalent to a ‘square well’ loss function with a window size equal to the diameter of the cursor. For computational reasons, in our observer models we approximate the square well loss with an inverted Gaussian (see
<xref ref-type="sec" rid="s4">Methods</xref>
) that best matches the square well, with fixed SD
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e159.jpg"></inline-graphic>
</inline-formula>
screen units (see Section 3 in
<xref ref-type="supplementary-material" rid="pcbi.1003661.s002">Text S1</xref>
).</p>
<p>In our experiment all priors were mixtures of
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e160.jpg"></inline-graphic>
</inline-formula>
(mainly 1 or 2) Gaussian distributions of the form
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e161.jpg"></inline-graphic>
</inline-formula>
, with
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e162.jpg"></inline-graphic>
</inline-formula>
. It follows that the expected loss is a mixture of Gaussians itself, and the optimal target that minimizes the expected loss takes the form (see
<xref ref-type="sec" rid="s4">Methods</xref>
for details):
<disp-formula id="pcbi.1003661.e163">
<graphic xlink:href="pcbi.1003661.e163.jpg" position="anchor" orientation="portrait"></graphic>
<label>(4)</label>
</disp-formula>
where we defined:</p>
<p>
<disp-formula id="pcbi.1003661.e164">
<graphic xlink:href="pcbi.1003661.e164.jpg" position="anchor" orientation="portrait"></graphic>
<label>(5)</label>
</disp-formula>
</p>
<p>For a single-Gaussian prior (
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e165.jpg"></inline-graphic>
</inline-formula>
),
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e166.jpg"></inline-graphic>
</inline-formula>
and the posterior distribution is itself a Gaussian distribution with mean
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e167.jpg"></inline-graphic>
</inline-formula>
and variance
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e168.jpg"></inline-graphic>
</inline-formula>
, so that
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e169.jpg"></inline-graphic>
</inline-formula>
.</p>
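<p>Since the expected loss is a mixture of Gaussians with variances inflated by the loss width, the optimal target of Eq. 4 can be found by a one-dimensional minimization. A sketch under assumed values (the mixture components and the loss SD below are placeholders, not the paper's parameters):</p>
<preformat>
import numpy as np
from scipy.optimize import minimize_scalar

def gauss(x, mu, var):
    return np.exp(-0.5 * (x - mu) ** 2 / var) / np.sqrt(2.0 * np.pi * var)

# Posterior mixture components (weight, mean, variance): placeholder values
components = [(0.6, -0.10, 0.04**2), (0.4, 0.12, 0.05**2)]
var_loss = 0.03**2   # variance of the inverted-Gaussian loss (illustrative)

def expected_loss(r):
    # Negative mixture of Gaussians with inflated variances (cf. Eqs. 4-5)
    return -sum(w * gauss(r, mu, var + var_loss) for w, mu, var in components)

# Note: with a multimodal posterior the expected loss can have several local
# minima; a coarse grid search before refining would be more robust.
res = minimize_scalar(expected_loss, bounds=(-0.5, 0.5), method="bounded")
print(f"optimal target s* = {res.x:.3f}")
</preformat>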
<p>We assume that the subject's response is corrupted by motor noise, which we take to be normally distributed with SD
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e170.jpg"></inline-graphic>
</inline-formula>
. By convolving the target choice distribution (Eq. 3) with motor noise we obtain the final response distribution:
<disp-formula id="pcbi.1003661.e171">
<graphic xlink:href="pcbi.1003661.e171.jpg" position="anchor" orientation="portrait"></graphic>
<label>(6)</label>
</disp-formula>
</p>
<p>The calculation of the expected loss in Eq. 4 does not explicitly take into account the consequences of motor variability, but this approximation has minimal effects on the inference (see
<xref ref-type="sec" rid="s3">Discussion</xref>
).</p>
<p>The behavior of observer model BDT is completely described by Eqs. 4, 5 and 6. This observer model is
<italic>subjectively</italic>
Bayes optimal; the subject applies BDT to his or her internal model of the task, which might be wrong. Specifically, the observer will be close to
<italic>objective</italic>
optimality only if his or her estimates for the likelihood parameters,
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e172.jpg"></inline-graphic>
</inline-formula>
and
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e173.jpg"></inline-graphic>
</inline-formula>
, match the true likelihood parameters of the task (
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e174.jpg"></inline-graphic>
</inline-formula>
and
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e175.jpg"></inline-graphic>
</inline-formula>
). As extreme cases, if
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e176.jpg"></inline-graphic>
</inline-formula>
the BDT observer will ignore the prior and only use the noiseless cues (cue-only observer model; dotted lines in
<xref ref-type="fig" rid="pcbi-1003661-g005">Figure 5</xref>
), whereas for
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e177.jpg"></inline-graphic>
</inline-formula>
the observer will use only the probabilistic information contained in the priors (prior-only observer model; dash-dotted lines in
<xref ref-type="fig" rid="pcbi-1003661-g005">Figure 5</xref>
).</p>
</sec>
<sec id="s2b2">
<title>Decision making: Noisy decision makers (‘SPK’ and ‘PPM’)</title>
<p>An alternative to BDT is a family of observer models in which the decision-making process is probabilistic, either because of noise in the inference or stochasticity in action selection. We model these various sources of variability without distinction as stochastic computations that involve the posterior distribution.</p>
<p>We start our analysis by considering a specific model, SPK (stochastic posterior,
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e178.jpg"></inline-graphic>
</inline-formula>
-power), in which the observer minimizes the expected loss (Eq. 4) under a noisy, approximate representation of the posterior distribution, as opposed to the deterministic, exact posterior of BDT (
<xref ref-type="fig" rid="pcbi-1003661-g007">Figure 7a and 7d</xref>
); later we will consider other variants of stochastic computations. As before, we allow the SD of the likelihoods,
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e179.jpg"></inline-graphic>
</inline-formula>
and
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e180.jpg"></inline-graphic>
</inline-formula>
, to mismatch their true values. For mathematical and computational tractability, we do not directly simulate the noisy inference during the model comparison. Instead, we showed that different ways of introducing stochasticity in the inference process – either by adding noise to an explicit representation of the observer's posterior (
<xref ref-type="fig" rid="pcbi-1003661-g007">Figure 7b and 7e</xref>
), or by building a discrete approximation of the posterior via sampling (
<xref ref-type="fig" rid="pcbi-1003661-g007">Figure 7c and 7f</xref>
) – induce variability in the target choice that is well approximated by a power function of the posterior distribution itself; see
<xref ref-type="supplementary-material" rid="pcbi.1003661.s003">Text S2</xref>
for details.</p>
<fig id="pcbi-1003661-g007" orientation="portrait" position="float">
<object-id pub-id-type="doi">10.1371/journal.pcbi.1003661.g007</object-id>
<label>Figure 7</label>
<caption>
<title>Decision making with stochastic posterior distributions.</title>
<p>
<bold>a–c</bold>
: Each panel shows an example of how different models of stochasticity in the representation of the posterior distribution, and therefore in the computation of the expected loss, may affect decision making in a trial. In all cases, the observer chooses the subjectively optimal target
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e181.jpg"></inline-graphic>
</inline-formula>
(blue arrow) that minimizes the expected loss (purple line; see Eq. 4) given his or her current representation of the posterior (black lines or bars). The original posterior distribution is shown in panels b–f for comparison (shaded line).
<bold>a</bold>
: Original posterior distribution.
<bold>b</bold>
: Noisy posterior: the original posterior is corrupted by random multiplicative or Poisson-like noise (in this example, the noise has caused the observer to aim for the wrong peak).
<bold>c</bold>
: Sample-based posterior: a discrete approximation of the posterior is built by drawing samples from the original posterior (grey bars; samples are binned for visualization purposes).
<bold>d–f</bold>
: Each panel shows how stochasticity in the posterior affects the distribution of target choices
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e182.jpg"></inline-graphic>
</inline-formula>
(blue line).
<bold>d</bold>
: Without noise, the target choice distribution is a delta function peaked on the minimum of the expected loss, as per standard BDT.
<bold>e</bold>
: On each trial, the posterior is corrupted by different instances of noise, inducing a distribution of possible target choices
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e183.jpg"></inline-graphic>
</inline-formula>
(blue line). In our task, this distribution of target choices is very well approximated by a power function of the posterior distribution,
<xref ref-type="disp-formula" rid="pcbi.1003661.e185">Eq. 7</xref>
(red dashed line); see
<xref ref-type="supplementary-material" rid="pcbi.1003661.s003">Text S2</xref>
for details.
<bold>f</bold>
: Similarly, the target choice distribution induced by sampling (blue line) is fit very well by a power function of the posterior (red dashed line). Note the extremely close resemblance of panels e and f (the exponent of the power function is the same).</p>
</caption>
<graphic xlink:href="pcbi.1003661.g007"></graphic>
</fig>
<p>We therefore use the power-function approximation with power
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e184.jpg"></inline-graphic>
</inline-formula>
– hence the name of the model – to simulate the effects of a stochastic posterior on decision making, without committing to a specific interpretation. The target choice distribution in model SPK takes the form:
<disp-formula id="pcbi.1003661.e185">
<graphic xlink:href="pcbi.1003661.e185.jpg" position="anchor" orientation="portrait"></graphic>
<label>(7)</label>
</disp-formula>
where the power exponent
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e186.jpg"></inline-graphic>
</inline-formula>
is a free parameter inversely related to the amount of variability. Eq. 7 is convolved with motor noise to give the response distribution. The power function conveniently interpolates between a posterior-matching strategy (for
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e187.jpg"></inline-graphic>
</inline-formula>
) and a maximum a posteriori (MAP) solution (
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e188.jpg"></inline-graphic>
</inline-formula>
).</p>
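<p>Operationally, the SPK rule of Eq. 7 can be simulated on a grid: raise the posterior to the power κ, renormalize, and sample the target choice. In the sketch below (illustrative posterior and exponent of our choosing), κ = 1 recovers posterior probability matching, i.e. the PPM observer, while large κ approaches the deterministic BDT choice.</p>
<preformat>
import numpy as np

rng = np.random.default_rng(1)

def spk_target_choice(grid, posterior, kappa, n_draws=1):
    # Target choices drawn from the posterior raised to the power kappa (Eq. 7)
    p = posterior ** kappa
    p /= p.sum()
    return rng.choice(grid, size=n_draws, p=p)

grid = np.linspace(-0.5, 0.5, 1001)
posterior = np.exp(-0.5 * ((grid - 0.05) / 0.06) ** 2)   # placeholder posterior
posterior /= posterior.sum()

print(spk_target_choice(grid, posterior, kappa=4.0, n_draws=5))
</preformat>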
<p>We consider as a separate factor the specific case in which the power exponent
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e189.jpg"></inline-graphic>
</inline-formula>
is fixed to 1, yielding a posterior probability matching observer, PPM, that takes action according to a single draw from the posterior distribution
<xref rid="pcbi.1003661-Mamassian1" ref-type="bibr">[25]</xref>
,
<xref rid="pcbi.1003661-Wozny1" ref-type="bibr">[26]</xref>
.</p>
</sec>
<sec id="s2b3">
<title>Observer models with cue-estimation sensory noise (‘S’)</title>
<p>We consider a family of observer models, S, in which we drop the assumption that the observer perfectly knows the horizontal position of the cue. We model sensory variability by adding Gaussian noise to the internal measurement of
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e190.jpg"></inline-graphic>
</inline-formula>
, which we label
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e191.jpg"></inline-graphic>
</inline-formula>
:
<disp-formula id="pcbi.1003661.e192">
<graphic xlink:href="pcbi.1003661.e192.jpg" position="anchor" orientation="portrait"></graphic>
<label>(8)</label>
</disp-formula>
where
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e193.jpg"></inline-graphic>
</inline-formula>
,
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e194.jpg"></inline-graphic>
</inline-formula>
represent the variances of the estimates of the position of the cue, respectively for low-noise (short-distance) and high-noise (long-distance) cues. According to Weber's law, we assume that the measurement error is proportional to the distance from the target line
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e195.jpg"></inline-graphic>
</inline-formula>
, so that the ratio of
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e196.jpg"></inline-graphic>
</inline-formula>
to
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e197.jpg"></inline-graphic>
</inline-formula>
is equal to the ratio of
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e198.jpg"></inline-graphic>
</inline-formula>
to
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e199.jpg"></inline-graphic>
</inline-formula>
, and we need to specify only one of the two parameters (
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e200.jpg"></inline-graphic>
</inline-formula>
). Given that both the cue variability and the observer's measurement variability are normally distributed, their combined variability will still appear to the observer as a Gaussian distribution with variance
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e201.jpg"></inline-graphic>
</inline-formula>
, assuming independence. Therefore, the observer's internal model of the task is formally identical to the description we gave before by replacing
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e202.jpg"></inline-graphic>
</inline-formula>
with
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e203.jpg"></inline-graphic>
</inline-formula>
in Eq. 2 (see
<xref ref-type="sec" rid="s4">Methods</xref>
). Since the subject's internal measurement is not accessible during the experiment, the observed response probability is integrated over the hidden variable
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e204.jpg"></inline-graphic>
</inline-formula>
(Eq. 18 in Methods). A model with cue-estimation sensory noise (‘S’) tends to the equivalent observer model without noise for
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e205.jpg"></inline-graphic>
</inline-formula>
.</p>
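<p>The marginalization over the hidden measurement can be approximated by Monte Carlo: draw internal measurements around the true cue position and average the resulting response densities (cf. Eq. 8). The sketch below is our own; the simple shrinkage response model inside it is only a stand-in for a full observer model.</p>
<preformat>
import numpy as np

rng = np.random.default_rng(2)

def marginal_response_density(response_fn, x_cue, sigma_s, n_mc=2000):
    # Average the response density over noisy internal measurements x_m (Eq. 8)
    x_m = rng.normal(x_cue, sigma_s, size=n_mc)
    return np.mean([response_fn(xm) for xm in x_m], axis=0)

grid = np.linspace(-0.5, 0.5, 401)
w, sigma_motor = 0.6, 0.02   # illustrative cue weight and motor SD

def response_fn(x_m):
    # Stand-in response density: Gaussian around the shrunk measurement
    d = np.exp(-0.5 * ((grid - w * x_m) / sigma_motor) ** 2)
    return d / d.sum()

marginal = marginal_response_density(response_fn, x_cue=0.10, sigma_s=0.03)
</preformat>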
</sec>
<sec id="s2b4">
<title>Observer models with noisy estimation of the prior (‘P’)</title>
<p>We introduce a family of observer models, P, in which subjects have access only to noisy estimates of the parameters of the prior,
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e206.jpg"></inline-graphic>
</inline-formula>
. For this class of models we assume that estimation noise is structured along a task-relevant dimension.</p>
<p>Specifically, for Gaussian priors we assume that the observers take a noisy internal measurement of the SD of the prior,
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e207.jpg"></inline-graphic>
</inline-formula>
, which according to Weber's law follows a log-normal distribution:
<disp-formula id="pcbi.1003661.e208">
<graphic xlink:href="pcbi.1003661.e208.jpg" position="anchor" orientation="portrait"></graphic>
<label>(9)</label>
</disp-formula>
where
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e209.jpg"></inline-graphic>
</inline-formula>
, the true SD, is the log-scale parameter and
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e210.jpg"></inline-graphic>
</inline-formula>
is the shape parameter of the log-normally distributed measurement (respectively mean and SD in log space). We assume an analogous form of noise on the width of the platykurtic prior in the unimodal session. Conversely, we assume that for priors that are mixtures of two Gaussians the main source of error stems from assessing the relative importance of the two components. In this case we add log-normal noise to the weights of each component, which we assume to be estimated independently:
<disp-formula id="pcbi.1003661.e211">
<graphic xlink:href="pcbi.1003661.e211.jpg" position="anchor" orientation="portrait"></graphic>
<label>(10)</label>
</disp-formula>
where
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e212.jpg"></inline-graphic>
</inline-formula>
are the true mixing weights and
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e213.jpg"></inline-graphic>
</inline-formula>
is the noise parameter previously defined. Note that Eq. 10 is equivalent to adding normal noise with SD
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e214.jpg"></inline-graphic>
</inline-formula>
to the log weights ratio in the ‘natural’ log odds space
<xref rid="pcbi.1003661-Zhang1" ref-type="bibr">[27]</xref>
.</p>
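<p>Both noise models reduce to Gaussian noise in log space, which makes them easy to simulate. A sketch with made-up parameter values; renormalizing the noisy mixing weights so that they remain a proper mixture is our assumption:</p>
<preformat>
import numpy as np

rng = np.random.default_rng(3)

def noisy_prior_sd(sigma_prior, sigma_eta, n=5):
    # Log-normal measurement of the prior SD (Eq. 9)
    return np.exp(rng.normal(np.log(sigma_prior), sigma_eta, size=n))

def noisy_mixing_weights(weights, sigma_eta, n=5):
    # Independent log-normal noise on each mixing weight (Eq. 10), renormalized
    w = np.exp(rng.normal(np.log(weights), sigma_eta, size=(n, len(weights))))
    return w / w.sum(axis=1, keepdims=True)

print(noisy_prior_sd(0.08, sigma_eta=0.2))                      # placeholder SD
print(noisy_mixing_weights(np.array([0.35, 0.65]), sigma_eta=0.2))
</preformat>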
<p>The internal measurements of
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e215.jpg"></inline-graphic>
</inline-formula>
(or
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e216.jpg"></inline-graphic>
</inline-formula>
) are used by the observer in place of the true parameters of the priors in the inference process (e.g. Eq. 5). Since the subjects' internal measurements are not accessible during the experiment, the actual response probabilities are computed by integrating over the unobserved values of
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e217.jpg"></inline-graphic>
</inline-formula>
or
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e218.jpg"></inline-graphic>
</inline-formula>
(see
<xref ref-type="sec" rid="s4">Methods</xref>
). Note that for
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e219.jpg"></inline-graphic>
</inline-formula>
an observer model with prior noise (‘P’) tends to its corresponding version with no noise.</p>
<p>A different type of measurement noise on the prior density is represented by ‘unstructured’, pointwise noise, which can be shown to be indistinguishable from noise in the posterior under certain assumptions (see
<xref ref-type="supplementary-material" rid="pcbi.1003661.s003">Text S2</xref>
).</p>
</sec>
<sec id="s2b5">
<title>Observer models with lapse (‘L’)</title>
<p>The response variability exhibited by the subjects might be explained simply by occasional lapses. Observer models with a lapse term are common in psychophysics to account for missed stimuli and additional variability in the data
<xref rid="pcbi.1003661-Wichmann1" ref-type="bibr">[28]</xref>
. According to these models, in each trial the observer has a typically small, fixed probability
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e220.jpg"></inline-graphic>
</inline-formula>
(the
<italic>lapse rate</italic>
) of making a choice from a lapse probability distribution instead of the optimal target
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e221.jpg"></inline-graphic>
</inline-formula>
. As a representative lapse distribution we choose the prior distribution (prior-matching lapse). The target choice for an observer with lapse has distribution:
<disp-formula id="pcbi.1003661.e222">
<graphic xlink:href="pcbi.1003661.e222.jpg" position="anchor" orientation="portrait"></graphic>
<label>(11)</label>
</disp-formula>
where the first term on the right-hand side of the equation is the target choice distribution (either Eq. 3 or Eq. 7, depending on the decision-making factor), weighted by the probability of
<italic>not</italic>
making a lapse,
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e223.jpg"></inline-graphic>
</inline-formula>
. The second term is the lapse term, with probability
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e224.jpg"></inline-graphic>
</inline-formula>
, and it is clear that the observer model with lapse (‘L’) reduces to an observer with no lapse in the limit
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e225.jpg"></inline-graphic>
</inline-formula>
. Eq. 11 is then convolved with motor noise to provide the response distribution. We also tested a lapse model in which the lapse distribution was uniform over the range of the displayed prior distribution. Observer models with uniform lapse performed consistently worse than the prior-matching lapse model, so we only report the results of the latter.</p>
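<p>Eq. 11 amounts to a simple two-component mixture; a minimal sketch on a common grid of target positions (names are ours), whose output would then be convolved with motor noise as described in the text:</p>
<preformat>
import numpy as np

def choice_with_lapse(p_target, p_prior, lapse_rate):
    # Eq. 11 (sketch): with probability (1 - lambda) respond from the
    # model's target choice distribution; with probability lambda,
    # respond from the prior (prior-matching lapse).
    return (1.0 - lapse_rate) * np.asarray(p_target) \
           + lapse_rate * np.asarray(p_prior)
</preformat>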
</sec>
</sec>
<sec id="s2c">
<title>Model comparison</title>
<p>For each observer model
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e226.jpg"></inline-graphic>
</inline-formula>
and each subject’s dataset we evaluated the posterior distribution of parameters
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e227.jpg"></inline-graphic>
</inline-formula>
, where
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e228.jpg"></inline-graphic>
</inline-formula>
is in general a vector of model-dependent parameters (see
<xref ref-type="table" rid="pcbi-1003661-t002">Table 2</xref>
). Each subject's dataset comprised two sessions (training and test), for a total of about 1200 trials divided into 32 distinct conditions (8 priors
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e229.jpg"></inline-graphic>
</inline-formula>
2 noise levels
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e230.jpg"></inline-graphic>
</inline-formula>
2 sessions). In general, we assumed subjects shared the motor parameter
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e231.jpg"></inline-graphic>
</inline-formula>
across sessions. We also assumed that, from training to test sessions, subjects would use the same ratio of high-noise to low-noise cue variability (
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e232.jpg"></inline-graphic>
</inline-formula>
); so only one cue-noise parameter (
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e233.jpg"></inline-graphic>
</inline-formula>
) needed to be specified for the test session. Conversely, we assumed that the other noise-related parameters, if present (
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e234.jpg"></inline-graphic>
</inline-formula>
,
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e235.jpg"></inline-graphic>
</inline-formula>
,
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e236.jpg"></inline-graphic>
</inline-formula>
,
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e237.jpg"></inline-graphic>
</inline-formula>
), could change freely between sessions, reasoning that additional response variability may be affected by the presence or absence of feedback, or by the difference between training and test distributions. These assumptions were validated via a preliminary model comparison (see Section 5 in
<xref ref-type="supplementary-material" rid="pcbi.1003661.s002">Text S1</xref>
).
<xref ref-type="table" rid="pcbi-1003661-t002">Table 2</xref>
lists a summary of observer models and their free parameters.</p>
<p>The posterior distributions of the parameters were obtained through a slice sampling Monte Carlo method
<xref rid="pcbi.1003661-Neal1" ref-type="bibr">[29]</xref>
. In general, we assumed noninformative priors over the parameters, except for the motor noise parameter
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e238.jpg"></inline-graphic>
</inline-formula>
and the cue-estimation sensory noise parameter
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e239.jpg"></inline-graphic>
</inline-formula>
(when present), for which we determined a reasonable range of values through an independent experiment (see
<xref ref-type="sec" rid="s4">Methods</xref>
and
<xref ref-type="supplementary-material" rid="pcbi.1003661.s004">Text S3</xref>
). Via sampling we also computed for each dataset a measure of complexity and goodness of fit of each observer model, the Deviance Information Criterion (DIC)
<xref rid="pcbi.1003661-Spiegelhalter1" ref-type="bibr">[30]</xref>
, which we used as an approximation of the marginal likelihood to perform model comparison (see
<xref ref-type="sec" rid="s4">Methods</xref>
).</p>
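<p>For reference, DIC can be computed directly from the sampler's output; a minimal sketch, assuming the log likelihood has been evaluated at each posterior sample and at the posterior mean of the parameters:</p>
<preformat>
import numpy as np

def dic(log_lik_samples, log_lik_at_posterior_mean):
    # DIC = D_bar + p_D, where D_bar is the posterior mean deviance
    # and p_D = D_bar - D(theta_bar) is the effective number of
    # parameters. Lower DIC indicates a better model.
    deviance = -2.0 * np.asarray(log_lik_samples)
    d_bar = deviance.mean()
    p_d = d_bar - (-2.0 * log_lik_at_posterior_mean)
    return d_bar + p_d
</preformat>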
<p>We compared observer models according to a hierarchical Bayesian model selection (BMS) method that treats subjects and models as random effects
<xref rid="pcbi.1003661-Stephan1" ref-type="bibr">[31]</xref>
. That is, we assumed that multiple observer models could be present in the population, and we computed how likely it is that a specific model (or model level within a factor) generated the data of a randomly chosen subject, given the model evidence represented by the subjects' DIC scores (see
<xref ref-type="sec" rid="s4">Methods</xref>
for details). As a Bayesian metric of significance we used the exceedance probability
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e240.jpg"></inline-graphic>
</inline-formula>
of one model (or model level) being more likely than any other model (or model level within a factor). In
<xref ref-type="supplementary-material" rid="pcbi.1003661.s002">Text S1</xref>
we report instead a classical (frequentist) analysis of the group difference in DIC between models (GDIC), which assumes that all datasets have been generated by the same unknown observer model. In spite of different assumptions, BMS and GDIC agree on the most likely observer model, validating the robustness of our main findings. The two approaches differ with respect to model ranking because, as a ‘fixed effect’ method, GDIC does not account for group heterogeneity and outliers
<xref rid="pcbi.1003661-Stephan1" ref-type="bibr">[31]</xref>
(see Section 4 in
<xref ref-type="supplementary-material" rid="pcbi.1003661.s002">Text S1</xref>
for details). Finally, we assessed the impact of each factor on model performance by computing the average change in DIC associated with a given component.</p>
<sec id="s2c1">
<title>Results of model comparison</title>
<p>
<xref ref-type="fig" rid="pcbi-1003661-g008">Figure 8</xref>
shows the results of the BMS method applied to our model set.
<xref ref-type="fig" rid="pcbi-1003661-g008">Figure 8a</xref>
shows the model evidence for each individual model and subject. For each subject we computed the posterior probability of each observer model using DIC as an approximation of the marginal likelihood (see
<xref ref-type="sec" rid="s4">Methods</xref>
). We calculated model evidence as the Bayes factor (posterior probability ratio) between the subject's best model and a given model. In the graph we report model evidence on the same scale as DIC, that is, as twice the log Bayes factor. A difference of more than 10 on this scale is considered very strong evidence
<xref rid="pcbi.1003661-Kass1" ref-type="bibr">[32]</xref>
. Results for individual subjects show that model SPK-P-L (stochastic posterior with estimation noise on the prior and lapse) performed consistently better than other models for all conditions. A minority of subjects were also well represented by model SPK-P (same as above, but without the lapse component). All other models performed significantly worse. In particular, note that the richer SPK-S-P-L model was not supported, suggesting that sensory noise in the estimation of cue location was not needed to explain the data.
<xref ref-type="fig" rid="pcbi-1003661-g008">Figure 8b</xref>
confirms these results by showing the estimated probability of finding a given observer model in the population (assuming that multiple observer models could be present). Model SPK-P-L is significantly more represented (
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e241.jpg"></inline-graphic>
</inline-formula>
; exceedance probability
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e242.jpg"></inline-graphic>
</inline-formula>
), followed by model SPK-P (
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e243.jpg"></inline-graphic>
</inline-formula>
). For all other models the probability is essentially the same at
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e244.jpg"></inline-graphic>
</inline-formula>
. The probability of single model factors reproduced an analogous pattern (
<xref ref-type="fig" rid="pcbi-1003661-g008">Figure 8c</xref>
). The majority of subjects (more than
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e245.jpg"></inline-graphic>
</inline-formula>
in each case) are likely to use stochastic decision making (SPK), to have noise in the estimation of the priors (P), and to lapse occasionally (L). Only a minority (
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e246.jpg"></inline-graphic>
</inline-formula>
) would be described by an observer model with sensory noise in estimation of the cue. The model comparison yielded similar results, although with a more graded difference between models, when looking directly at DIC scores (see Section 4 in
<xref ref-type="supplementary-material" rid="pcbi.1003661.s002">Text S1</xref>
; lower is better).</p>
<fig id="pcbi-1003661-g008" orientation="portrait" position="float">
<object-id pub-id-type="doi">10.1371/journal.pcbi.1003661.g008</object-id>
<label>Figure 8</label>
<caption>
<title>Model comparison between individual models.</title>
<p>
<bold>a</bold>
: Each column represents a subject, divided by test group (all datasets include a Gaussian training session), each row an observer model identified by a model string (see
<xref ref-type="table" rid="pcbi-1003661-t002">Table 2</xref>
). Cell color indicates the model's evidence, here displayed as the Bayes factor against the best model for that subject (a higher value means worse performance of a given model relative to the best model). Models are sorted by their posterior likelihood for a randomly selected subject (see panel b). Numbers above cells specify the ranking of the most supported models with comparable evidence (difference of less than 10 in twice the log Bayes factor
<xref rid="pcbi.1003661-Kass1" ref-type="bibr">[32]</xref>
).
<bold>b</bold>
: Probability that a given model generated the data of a randomly chosen subject. Here and in panel c, brown bars represent the most supported models (or model levels within a factor). Asterisks indicate a significant exceedance probability, that is, the posterior probability that a given model (or model component) is more likely than any other model (or model component):
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e247.jpg"></inline-graphic>
</inline-formula>
.
<bold>c</bold>
: Probability that a given model level within a factor generated the data of a randomly chosen subject.</p>
</caption>
<graphic xlink:href="pcbi.1003661.g008"></graphic>
</fig>
<p>To assess the relative importance of each model component in determining model performance in another way, we measured the average contribution to DIC of each model level within a factor across all tested models (
<xref ref-type="fig" rid="pcbi-1003661-g004">Figure 4</xref>
in
<xref ref-type="supplementary-material" rid="pcbi.1003661.s002">Text S1</xref>
). In agreement with our previous findings, the lowest DIC (better score) in decision making is obtained by observer models containing the SPK factor. BDT increases (i.e. worsens) average DIC scores substantially (difference in DIC,
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e248.jpg"></inline-graphic>
</inline-formula>
DIC = 
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e249.jpg"></inline-graphic>
</inline-formula>
; mean
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e250.jpg"></inline-graphic>
</inline-formula>
SE across subjects) and PPM has devastating effects on model performance (
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e251.jpg"></inline-graphic>
</inline-formula>
DIC = 
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e252.jpg"></inline-graphic>
</inline-formula>
), where 10 points of
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e253.jpg"></inline-graphic>
</inline-formula>
DIC may already be considered strong evidence in favor of the model with the lower DIC
<xref rid="pcbi.1003661-Spiegelhalter1" ref-type="bibr">[30]</xref>
. Regarding the other factors (S, P, L), we found that, in general, removing a factor increases DIC (worse model performance; see Section 4 in
<xref ref-type="supplementary-material" rid="pcbi.1003661.s002">Text S1</xref>
for discussion about factor S). Overall, this analysis confirms the strong impact that an appropriate modelling of variability has on model performance (see Section 4 in
<xref ref-type="supplementary-material" rid="pcbi.1003661.s002">Text S1</xref>
for details).</p>
<p>We performed a number of analyses on an additional set of observer models to validate the finding that model SPK-P-L provides the best explanation for the data in our model set.</p>
<p>Firstly, in all the observer models described so far the subjects' parameters of the likelihood,
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e254.jpg"></inline-graphic>
</inline-formula>
and
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e255.jpg"></inline-graphic>
</inline-formula>
, were allowed to vary. Preliminary analysis had suggested that observer models with mismatching likelihoods always outperformed models with true likelihood parameters,
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e256.jpg"></inline-graphic>
</inline-formula>
and
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e257.jpg"></inline-graphic>
</inline-formula>
. We tested whether this was the case also with our current best model, or if we could assume instead that at least some subjects were using the true parameters. Model SPK-P-L-true performed considerably worse than its counterpart with mismatching likelihood parameters (
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e258.jpg"></inline-graphic>
</inline-formula>
with
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e259.jpg"></inline-graphic>
</inline-formula>
for the other model;
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e260.jpg"></inline-graphic>
</inline-formula>
DIC = 
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e261.jpg"></inline-graphic>
</inline-formula>
), suggesting that mismatching likelihoods are consistently necessary to explain our subjects' data.</p>
<p>We then checked whether the variability of subjects' estimates of the priors might instead have arisen from the discrete representation of the prior distribution in the experiment (see
<xref ref-type="fig" rid="pcbi-1003661-g001">Figure 1d</xref>
). We therefore considered a model SPK-D-L in which priors were not noisy, but the model component ‘D’ replaces the continuous representations of the priors with their true discrete representation (a mixture of a hundred narrow Gaussians corresponding to the dots shown on screen). Model SPK-D-L performed worse than model SPK-P-L (
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e262.jpg"></inline-graphic>
</inline-formula>
with
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e263.jpg"></inline-graphic>
</inline-formula>
for the other model;
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e264.jpg"></inline-graphic>
</inline-formula>
DIC = 
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e265.jpg"></inline-graphic>
</inline-formula>
) and, more interestingly, also worse than model SPK-L (
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e266.jpg"></inline-graphic>
</inline-formula>
with
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e267.jpg"></inline-graphic>
</inline-formula>
for the other model;
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e268.jpg"></inline-graphic>
</inline-formula>
DIC = 
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e269.jpg"></inline-graphic>
</inline-formula>
). The discrete representation of the prior, therefore, does not provide a better explanation for subjects' behavior.</p>
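<p>For concreteness, the ‘D’ component can be written as a kernel density over the displayed dots; a minimal sketch (the width of each narrow component is an assumption of ours, standing in for the value used in the analysis):</p>
<preformat>
import numpy as np
from scipy.stats import norm

def discrete_prior_pdf(x, dot_positions, dot_sd):
    # Model component 'D' (sketch): the prior as an equal-weight
    # mixture of narrow Gaussians, one centered on each displayed dot.
    x = np.atleast_1d(x)
    return norm.pdf(x[:, None],
                    loc=np.asarray(dot_positions)[None, :],
                    scale=dot_sd).mean(axis=1)
</preformat>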
<p>Lastly, we verified whether our subjects' behavior and apparent variability could be explained by a non-Bayesian iterative model applied to the training datasets. A basic iterative model failed to explain subjects' data (see Section 6 in
<xref ref-type="supplementary-material" rid="pcbi.1003661.s002">Text S1</xref>
and
<xref ref-type="sec" rid="s3">Discussion</xref>
).</p>
<p>In conclusion, all analyses identify as the main sources of subjects' suboptimal behavior the combined effect of both noise in estimating the shape of the ‘prior’ distributions and variability in the subsequent decision, plus some occasional lapses.</p>
</sec>
<sec id="s2c2">
<title>Comparison of alternative models of decision making</title>
<p>Our previous analyses suggest that subjects exhibit variability in decision making that conforms to some nontrivial transformation of the posterior distribution (such as a power function of the posterior, as expressed by model component SPK). We perform a second factorial model comparison that focuses on details of the decision-making process by including additional model components that describe different transformations of the posterior. We consider in this analysis the following factors (additions in italics):</p>
<list list-type="order">
<list-item>
<p>
<bold>Decision making</bold>
(4 levels): Bayesian Decision Theory (‘BDT’), stochastic posterior (‘SPK’), posterior probability matching (‘PPM’),
<italic>posterior sampling-average</italic>
(‘PSA’).</p>
</list-item>
<list-item>
<p>
<bold>Gaussian approximation of the posterior</bold>
(3 levels): no approximation,
<italic>mean/variance approximation</italic>
(‘MV’) or
<italic>Laplace approximation</italic>
(‘LA’).</p>
</list-item>
<list-item>
<p>
<bold>Lapse</bold>
(2 levels): absent or present (‘L’).</p>
</list-item>
</list>
<p>Our extended model set comprises 18 observer models since some combinations of model factors lead to equivalent observer models. In order to limit the combinatorial explosion of models, in this factorial analysis we do not include model factors S and P that were previously considered, since our main focus here is on decision making (but see below). All new model components are explained in this section and summarized in
<xref ref-type="table" rid="pcbi-1003661-t002">Table 2</xref>
.</p>
<p>Firstly, we illustrate an additional level for the decision-making factor. According to model PSA (posterior sampling-average), we assume that the observer chooses a target by taking the average of
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e270.jpg"></inline-graphic>
</inline-formula>
samples drawn from the posterior distribution
<xref rid="pcbi.1003661-Battaglia1" ref-type="bibr">[33]</xref>
. This corresponds to an observer with a sample-based posterior who applies a quadratic loss function when choosing the optimal target. For generality, we use an interpolation method to allow
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e271.jpg"></inline-graphic>
</inline-formula>
to be a real number (see
<xref ref-type="sec" rid="s4">Methods</xref>
).</p>
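<p>A sketch of the PSA decision rule on a gridded posterior (Python; names are ours). The handling of a non-integer number of samples, by randomizing between the two nearest integers, is a simple stand-in for the interpolation method described in Methods:</p>
<preformat>
import numpy as np

def psa_response(x_grid, posterior, k, rng):
    # Posterior sampling-average (sketch): respond with the mean of
    # k samples drawn from the posterior; non-integer k is realized
    # by randomizing the number of samples (our assumption).
    p = posterior / posterior.sum()
    k_floor = int(np.floor(k))
    n = max(1, k_floor + int(rng.random() < (k - k_floor)))
    samples = rng.choice(x_grid, size=n, p=p)
    return samples.mean()
</preformat>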
<p>We also introduce a new model factor according to which subjects may use a single Gaussian to approximate the full posterior. The mean/variance model (MV) assumes that subjects approximate the posterior with a Gaussian with matching low-order moments (mean and variance). For observer models that act according to BDT, model MV is equivalent to the assumption of a quadratic loss function during target selection, whose optimal target choice equals the mean of the posterior. Alternatively, a commonly used Gaussian approximation in Bayesian inference is the Laplace approximation (LA)
<xref rid="pcbi.1003661-MacKay1" ref-type="bibr">[34]</xref>
. In this case, the observer approximates the posterior with a single Gaussian centered on the mode of the posterior and whose variance depends on the local curvature at the mode (see
<xref ref-type="sec" rid="s4">Methods</xref>
). The main difference between the Laplace approximation and the other models is that the approximate posterior is usually narrower, since it takes into account only the main peak.</p>
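<p>Both Gaussian approximations can be computed from the posterior evaluated on a grid; a minimal sketch:</p>
<preformat>
import numpy as np

def mv_approx(x, post):
    # Mean/variance (MV) approximation: a Gaussian matching the
    # first two moments of the full posterior (given on a grid x).
    p = post / np.trapz(post, x)
    mu = np.trapz(x * p, x)
    var = np.trapz((x - mu) ** 2 * p, x)
    return mu, var

def laplace_approx(x, post):
    # Laplace approximation (LA): a Gaussian centered on the mode,
    # with variance set by the curvature of the log density there.
    i = np.argmax(post)
    logp = np.log(post + 1e-300)
    d2 = np.gradient(np.gradient(logp, x), x)  # numerical 2nd derivative
    return x[i], -1.0 / d2[i]
</preformat>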
<p>Crucially, the predictions of these additional model components differ only if the posterior distribution is non-Gaussian; these observer models represent different generalizations of how a noisy decision process could affect behavior beyond the Gaussian case. Therefore we include in this analysis only trials in which the theoretical posterior distribution is considerably non-Gaussian (see
<xref ref-type="sec" rid="s4">Methods</xref>
); this restriction immediately excludes from the analysis the training sessions and the Gaussian group, in which all priors and posteriors are strictly Gaussian.</p>
<p>
<xref ref-type="fig" rid="pcbi-1003661-g009">Figure 9</xref>
shows the results of the BMS method applied to this model set. As before, we consider first the model evidence for each individual model and subject (
<xref ref-type="fig" rid="pcbi-1003661-g009">Figure 9a</xref>
). Results are slightly different depending on the session (unimodal or bimodal), but in both cases model SPK-L (stochastic posterior with lapse) performs consistently better than the other tested models for all conditions. Only a couple of subjects are better described by a different approximation of the posterior (either PSA or SPK-MV-L). These results are summarized in
<xref ref-type="fig" rid="pcbi-1003661-g009">Figure 9b</xref>
, which shows the estimated probability that a given model would be responsible for generating the data of a randomly chosen subject. We show here results for both groups; a separate analysis of each group did not show qualitative differences. Model SPK-L is significantly more represented (
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e272.jpg"></inline-graphic>
</inline-formula>
; exceedance probability
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e273.jpg"></inline-graphic>
</inline-formula>
), followed by model PSA (
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e274.jpg"></inline-graphic>
</inline-formula>
) and SPK-MV-L (
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e275.jpg"></inline-graphic>
</inline-formula>
). For all other models the probability is essentially the same at
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e276.jpg"></inline-graphic>
</inline-formula>
. The probability of single model factors reproduces the pattern seen before (
<xref ref-type="fig" rid="pcbi-1003661-g009">Figure 9c</xref>
). The majority of subjects (more than
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e277.jpg"></inline-graphic>
</inline-formula>
in each case) are likely to use stochastic decision making (SPK), to use the full posterior (no Gaussian approximations), and to lapse occasionally (L).</p>
<fig id="pcbi-1003661-g009" orientation="portrait" position="float">
<object-id pub-id-type="doi">10.1371/journal.pcbi.1003661.g009</object-id>
<label>Figure 9</label>
<caption>
<title>Comparison between alternative models of decision making.</title>
<p>We tested a class of alternative models of decision making which differ with respect to predictions for non-Gaussian trials only.
<bold>a</bold>
: Each column represents a subject, divided by group (either unimodal or bimodal test session), each row an observer model identified by a model string (see
<xref ref-type="table" rid="pcbi-1003661-t002">Table 2</xref>
). Cell color indicates the model's evidence, here displayed as the Bayes factor against the best model for that subject (a higher value means worse performance of a given model relative to the best model). Models are sorted by their posterior likelihood for a randomly selected subject (see panel b). Numbers above cells specify the ranking of the most supported models with comparable evidence (difference of less than 10 in twice the log Bayes factor
<xref rid="pcbi.1003661-Kass1" ref-type="bibr">[32]</xref>
).
<bold>b</bold>
: Probability that a given model generated the data of a randomly chosen subject. Here and in panel c, brown bars represent the most supported models (or model levels within a factor). Asterisks indicate a significant exceedance probability, that is, the posterior probability that a given model (or model component) is more likely than any other model (or model component):
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e278.jpg"></inline-graphic>
</inline-formula>
,
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e279.jpg"></inline-graphic>
</inline-formula>
.
<bold>c</bold>
: Probability that a given model level within a factor generated the data of a randomly chosen subject. Label ‘
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e280.jpg"></inline-graphic>
</inline-formula>
GA’ stands for no Gaussian approximation (full posterior).</p>
</caption>
<graphic xlink:href="pcbi.1003661.g009"></graphic>
</fig>
<p>The model comparison performed on group DIC scores (GDIC) yielded mostly similar results, although with a more substantial difference between the unimodal group and the bimodal group (
<xref ref-type="fig" rid="pcbi-1003661-g003">Figure 3</xref>
in
<xref ref-type="supplementary-material" rid="pcbi.1003661.s002">Text S1</xref>
). In particular, group DIC scores fail to find significant differences between distinct types of approximation of the posterior in the unimodal case. The reason is that for several subjects in the unimodal group the differences between models are marginal, and GDIC does not have enough information to disambiguate between these models. Nonetheless, results in the bimodal case are unambiguous, and overall the SPK-L model emerges again as the best description of subjects' behavior (see Section 4 in
<xref ref-type="supplementary-material" rid="pcbi.1003661.s002">Text S1</xref>
for details).</p>
<p>As mentioned before, in order to limit model complexity we did not include model factors S and P in the current analysis. We can arguably ignore sensory noise in cue estimation, S, since it was already shown to have a marginal effect on subjects' behavior, but this is not the case for noisy estimation of the prior, P. We therefore need to verify that our main results about decision making in the case of non-Gaussian posteriors were not affected by the lack of this factor. We compared the four most represented models of the current analysis (
<xref ref-type="fig" rid="pcbi-1003661-g009">Figure 9b</xref>
) augmented with the P factor: SPK-P-L, PSA-P, SPK-MV-P-L and PSA-P-L. Model SPK-P-L was still the most representative model (
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e281.jpg"></inline-graphic>
</inline-formula>
, exceedance probability
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e282.jpg"></inline-graphic>
</inline-formula>
), showing that model factor P does not affect our conclusions on alternative models of decision making. We also found that model SPK-P-L obtained more evidence than any other model tested in this section (
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e283.jpg"></inline-graphic>
</inline-formula>
, exceedance probability
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e284.jpg"></inline-graphic>
</inline-formula>
), in agreement with the finding of our first factorial model comparison.</p>
<p>Finally, even though the majority of subjects' datasets are better described by the narrow loss function of the task, a few also support observer models that imply a quadratic loss. To explore this diversity, we examined an extended BDT model in which the loss width
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e285.jpg"></inline-graphic>
</inline-formula>
is a free parameter (see Section 3 in
<xref ref-type="supplementary-material" rid="pcbi.1003661.s002">Text S1</xref>
). This model performed slightly better than a BDT model with fixed
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e286.jpg"></inline-graphic>
</inline-formula>
, but no better than the equivalent SPK model, so our findings are not affected.</p>
<p>In summary, subjects' variability in our task is compatible with their manipulating the full shape of the posterior, corrupted by noise (SPK), while applying a close approximation of the task's loss function. Our analysis marks as unlikely the alternative models of decision making that instead use a quadratic loss or other low-order approximations of the posterior.</p>
</sec>
</sec>
<sec id="s2d">
<title>Analysis of best observer model</title>
<p>After establishing model SPK-P-L as the ‘best’ description of the data among the considered observer models, we examined its properties. First of all, we inspected the posterior distribution of the model parameters given the data for each subject. In almost all cases the marginalized posterior distributions were unimodal with a well-defined peak. We therefore summarized each posterior distribution with a point estimate (a robust mean) with minor loss of generality; group averages are listed in
<xref ref-type="table" rid="pcbi-1003661-t003">Table 3</xref>
. For the analyses in this section we ignored outlier parameter values that fell more than 3 SDs away from the group mean (this rule excluded at most one value per parameter). In general, we found a reasonable statistical agreement between parameters of different sessions, with some discrepancies in the unimodal test session only. In this section, inferred values are reported as mean
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e287.jpg"></inline-graphic>
</inline-formula>
SD across subjects.</p>
<table-wrap id="pcbi-1003661-t003" orientation="portrait" position="float">
<object-id pub-id-type="doi">10.1371/journal.pcbi.1003661.t003</object-id>
<label>Table 3</label>
<caption>
<title>Best observer model's estimated parameters.</title>
</caption>
<alternatives>
<graphic id="pcbi-1003661-t003-3" xlink:href="pcbi.1003661.t003"></graphic>
<table frame="hsides" rules="groups">
<colgroup span="1">
<col align="left" span="1"></col>
<col align="center" span="1"></col>
<col align="center" span="1"></col>
<col align="center" span="1"></col>
<col align="center" span="1"></col>
<col align="center" span="1"></col>
<col align="center" span="1"></col>
</colgroup>
<thead>
<tr>
<td align="left" rowspan="1" colspan="1">Session</td>
<td align="left" rowspan="1" colspan="1">
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e288.jpg"></inline-graphic>
</inline-formula>
</td>
<td align="left" rowspan="1" colspan="1">
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e289.jpg"></inline-graphic>
</inline-formula>
</td>
<td align="left" rowspan="1" colspan="1">
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e290.jpg"></inline-graphic>
</inline-formula>
</td>
<td align="left" rowspan="1" colspan="1">
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e291.jpg"></inline-graphic>
</inline-formula>
</td>
<td align="left" rowspan="1" colspan="1">
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e292.jpg"></inline-graphic>
</inline-formula>
</td>
<td align="left" rowspan="1" colspan="1">
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e293.jpg"></inline-graphic>
</inline-formula>
</td>
</tr>
</thead>
<tbody>
<tr>
<td align="left" rowspan="1" colspan="1">Gaussian training</td>
<td align="left" rowspan="1" colspan="1">
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e294.jpg"></inline-graphic>
</inline-formula>
</td>
<td align="left" rowspan="1" colspan="1">
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e295.jpg"></inline-graphic>
</inline-formula>
</td>
<td align="left" rowspan="1" colspan="1">
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e296.jpg"></inline-graphic>
</inline-formula>
</td>
<td align="left" rowspan="1" colspan="1">
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e297.jpg"></inline-graphic>
</inline-formula>
</td>
<td align="left" rowspan="1" colspan="1">
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e298.jpg"></inline-graphic>
</inline-formula>
</td>
<td align="left" rowspan="1" colspan="1">
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e299.jpg"></inline-graphic>
</inline-formula>
</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1">Gaussian test</td>
<td align="left" rowspan="1" colspan="1">
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e300.jpg"></inline-graphic>
</inline-formula>
</td>
<td align="left" rowspan="1" colspan="1">
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e301.jpg"></inline-graphic>
</inline-formula>
</td>
<td align="left" rowspan="1" colspan="1">
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e302.jpg"></inline-graphic>
</inline-formula>
</td>
<td align="left" rowspan="1" colspan="1">
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e303.jpg"></inline-graphic>
</inline-formula>
</td>
<td align="left" rowspan="1" colspan="1">
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e304.jpg"></inline-graphic>
</inline-formula>
</td>
<td align="left" rowspan="1" colspan="1">
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e305.jpg"></inline-graphic>
</inline-formula>
</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1">Unimodal test</td>
<td align="left" rowspan="1" colspan="1">
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e306.jpg"></inline-graphic>
</inline-formula>
</td>
<td align="left" rowspan="1" colspan="1">
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e307.jpg"></inline-graphic>
</inline-formula>
</td>
<td align="left" rowspan="1" colspan="1">
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e308.jpg"></inline-graphic>
</inline-formula>
</td>
<td align="left" rowspan="1" colspan="1">
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e309.jpg"></inline-graphic>
</inline-formula>
</td>
<td align="left" rowspan="1" colspan="1">
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e310.jpg"></inline-graphic>
</inline-formula>
</td>
<td align="left" rowspan="1" colspan="1">
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e311.jpg"></inline-graphic>
</inline-formula>
</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1">Bimodal test</td>
<td align="left" rowspan="1" colspan="1">
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e312.jpg"></inline-graphic>
</inline-formula>
</td>
<td align="left" rowspan="1" colspan="1">
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e313.jpg"></inline-graphic>
</inline-formula>
</td>
<td align="left" rowspan="1" colspan="1">
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e314.jpg"></inline-graphic>
</inline-formula>
</td>
<td align="left" rowspan="1" colspan="1">
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e315.jpg"></inline-graphic>
</inline-formula>
</td>
<td align="left" rowspan="1" colspan="1">
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e316.jpg"></inline-graphic>
</inline-formula>
</td>
<td align="left" rowspan="1" colspan="1">
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e317.jpg"></inline-graphic>
</inline-formula>
</td>
</tr>
<tr>
<td align="left" rowspan="1" colspan="1">True values</td>
<td align="left" rowspan="1" colspan="1"></td>
<td align="left" rowspan="1" colspan="1">
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e318.jpg"></inline-graphic>
</inline-formula>
</td>
<td align="left" rowspan="1" colspan="1">
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e319.jpg"></inline-graphic>
</inline-formula>
</td>
<td align="left" rowspan="1" colspan="1"></td>
<td align="left" rowspan="1" colspan="1"></td>
<td align="left" rowspan="1" colspan="1"></td>
</tr>
</tbody>
</table>
</alternatives>
<table-wrap-foot>
<fn id="nt103">
<label></label>
<p>Group-average estimated parameters for the ‘best’ observer model (SPK-P-L), grouped by session (mean
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e320.jpg"></inline-graphic>
</inline-formula>
SD across subjects). For each subject, the point estimates of the parameters were computed through a robust mean of the posterior distribution of the parameter given the data. For reference, we also report the true noise values of the cues,
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e321.jpg"></inline-graphic>
</inline-formula>
and
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e322.jpg"></inline-graphic>
</inline-formula>
. (
<sup>*</sup>
) We ignored values of
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e323.jpg"></inline-graphic>
</inline-formula>
.</p>
</fn>
</table-wrap-foot>
</table-wrap>
<p>The motor noise parameter
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e324.jpg"></inline-graphic>
</inline-formula>
took typical values of
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e325.jpg"></inline-graphic>
</inline-formula>
screen units (
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e326.jpg"></inline-graphic>
</inline-formula>
mm), somewhat larger on average than the values found in the sensorimotor estimation experiment, although still in a reasonable range (see
<xref ref-type="supplementary-material" rid="pcbi.1003661.s004">Text S3</xref>
). The inferred amount of motor noise is lower than estimates from previous studies in reaching and pointing (e.g.
<xref rid="pcbi.1003661-Tassinari1" ref-type="bibr">[10]</xref>
), but in our task subjects could adjust their end-point position.</p>
<p>The internal estimates of cue variability for low-noise and high-noise cues (
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e327.jpg"></inline-graphic>
</inline-formula>
and
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e328.jpg"></inline-graphic>
</inline-formula>
) were broadly scattered around the true values (
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e329.jpg"></inline-graphic>
</inline-formula>
and
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e330.jpg"></inline-graphic>
</inline-formula>
screen units). In general, individual values were in qualitative agreement with the true parameters but showed quantitative discrepancies. Differences were also manifest at the group level, as we found statistically significant disagreement for both low-noise and high-noise cues in the unimodal test session (
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e331.jpg"></inline-graphic>
</inline-formula>
-test,
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e332.jpg"></inline-graphic>
</inline-formula>
) and high-noise cues in the bimodal test session (
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e333.jpg"></inline-graphic>
</inline-formula>
). The ratio between the two likelihood parameters,
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e334.jpg"></inline-graphic>
</inline-formula>
, differed significantly from the true ratio,
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e335.jpg"></inline-graphic>
</inline-formula>
(
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e336.jpg"></inline-graphic>
</inline-formula>
).</p>
<p>A few subjects (
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e337.jpg"></inline-graphic>
</inline-formula>
) were very precise in their decision-making process, with a power function exponent
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e338.jpg"></inline-graphic>
</inline-formula>
. For the majority of subjects, however,
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e339.jpg"></inline-graphic>
</inline-formula>
took values between
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e340.jpg"></inline-graphic>
</inline-formula>
and
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e341.jpg"></inline-graphic>
</inline-formula>
(median
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e342.jpg"></inline-graphic>
</inline-formula>
), corresponding approximately to an amount of decision noise of
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e343.jpg"></inline-graphic>
</inline-formula>
of the variance of the posterior distribution (median
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e344.jpg"></inline-graphic>
</inline-formula>
). The range of exponents is compatible with values of
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e345.jpg"></inline-graphic>
</inline-formula>
(
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e346.jpg"></inline-graphic>
</inline-formula>
number of samples) previously reported in other experiments, such as a distance-estimation task
<xref rid="pcbi.1003661-Battaglia1" ref-type="bibr">[33]</xref>
or ‘intuitive physics’ judgments
<xref rid="pcbi.1003661-Battaglia2" ref-type="bibr">[35]</xref>
. In agreement with the results of our previous model comparison, the inferred exponents suggest that subjects' stochastic decision making followed the shape of a considerably narrower version of the posterior distribution (
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e347.jpg"></inline-graphic>
</inline-formula>
) which is not simply a form of posterior-matching (
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e348.jpg"></inline-graphic>
</inline-formula>
).</p>
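<p>The correspondence between the exponent and the amount of decision noise is exact in the Gaussian case; in generic notation (a standard identity, not taken from the paper), raising a Gaussian posterior to a power kappa and renormalizing gives</p>
<preformat>
[\mathcal{N}(x;\,\mu,\,\sigma^2)]^{\kappa}
  \propto \exp\left(-\frac{\kappa\,(x-\mu)^2}{2\sigma^2}\right)
  \propto \mathcal{N}(x;\,\mu,\,\sigma^2/\kappa),
</preformat>
<p>so the stochastic choice has SD sigma divided by the square root of kappa: the decision noise variance is a fraction 1/kappa of the posterior variance, with kappa equal to 1 recovering posterior probability matching and large kappa approaching deterministic BDT.</p>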
<p>Weber's fraction for the estimation of the parameters of the priors' density took typical values of
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e349.jpg"></inline-graphic>
</inline-formula>
, with similar means across conditions. These values denote quite a large amount of noise in estimating (or manipulating) properties of the priors. Nonetheless, such values are in qualitative agreement with a density/numerosity estimation experiment in which a change of
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e350.jpg"></inline-graphic>
</inline-formula>
in density or numerosity of a field of random dots was necessary for subjects to notice a difference in either property
<xref rid="pcbi.1003661-Dakin1" ref-type="bibr">[36]</xref>
. Although the two tasks are too different to allow a direct quantitative comparison, the thresholds measured in
<xref rid="pcbi.1003661-Dakin1" ref-type="bibr">[36]</xref>
suggest that density/numerosity estimation can indeed be as noisy as we found.</p>
<p>Finally, even though we did not set an informative prior over the parameter, the lapse rate took reasonably low values, as expected for a probability of occasional mistakes
<xref rid="pcbi.1003661-Wichmann1" ref-type="bibr">[28]</xref>
,
<xref rid="pcbi.1003661-Kuss1" ref-type="bibr">[37]</xref>
. We found
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e351.jpg"></inline-graphic>
</inline-formula>
, and the inferred lapse rate averaged over training and test session was less than
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e352.jpg"></inline-graphic>
</inline-formula>
for all but one subject.</p>
<p>We examined the best observer model's capability to reproduce our subjects' performance. For each subject and group, we generated
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e353.jpg"></inline-graphic>
</inline-formula>
datasets simulating the responses of the SPK-P-L observer model to the experimental trials experienced by the subject. For each simulated dataset, model parameters were sampled from the posterior distribution of the parameters given the data. For each condition (shape of prior and cue type) we then computed the optimality index and averaged it across simulated datasets. The model's ‘postdictions’ are plotted in
<xref ref-type="fig" rid="pcbi-1003661-g010">Figure 10</xref>
as continuous lines (SE are omitted for clarity) and appear to be in good agreement with the data. Note that the postdiction is not exactly a fit since (a) the parameters are not optimized specifically to minimize performance error, and (b) the whole posterior distribution of the parameters is used and not just a ‘best’ point estimate. As a comparison, we also plotted in
<xref ref-type="fig" rid="pcbi-1003661-g010">Figure 10</xref>
the postdiction for the best BDT observer model, BDT-P-L (dashed line). As the model comparison suggested, standard Bayesian Decision Theory fails to capture subjects' performance.</p>
<fig id="pcbi-1003661-g010" orientation="portrait" position="float">
<object-id pub-id-type="doi">10.1371/journal.pcbi.1003661.g010</object-id>
<label>Figure 10</label>
<caption>
<title>Model ‘postdiction’ of the optimality index.</title>
<p>Each bar represents the group-averaged optimality index for a specific session, for each prior (indexed from 1 to 8, see also
<xref ref-type="fig" rid="pcbi-1003661-g002">Figure 2</xref>
) and cue type, either low-noise cues (red bars) or high-noise cues (blue bars); see also
<xref ref-type="fig" rid="pcbi-1003661-g005">Figure 5</xref>
). Error bars are SE across subjects. The continuous line represents the ‘postdiction’ of the best suboptimal Bayesian observer model, SPK-P-L (see ‘Analysis of best observer model’ in the text). For comparison, the dashed line is the ‘postdiction’ of the best suboptimal observer model that follows Bayesian Decision Theory, BDT-P-L.</p>
</caption>
<graphic xlink:href="pcbi.1003661.g010"></graphic>
</fig>
<p>For each subject and session (training and test) we also plot the mean optimality index of the simulated sessions against the optimality index computed from the data, finding a good correlation (
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e354.jpg"></inline-graphic>
</inline-formula>
; see
<xref ref-type="fig" rid="pcbi-1003661-g011">Figure 11</xref>
).</p>
<fig id="pcbi-1003661-g011" orientation="portrait" position="float">
<object-id pub-id-type="doi">10.1371/journal.pcbi.1003661.g011</object-id>
<label>Figure 11</label>
<caption>
<title>Comparison of measured and simulated performance.</title>
<p>Comparison of the mean optimality index computed from the data and the simulated optimality index, according to the ‘postdiction’ of the best observer model (SPK-P-L). Each dot represents a single session for each subject (either training or test). The dashed line corresponds to equality between observed and simulated performance. Model-simulated performance is in good agreement with subjects' performance (
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e355.jpg"></inline-graphic>
</inline-formula>
).</p>
</caption>
<graphic xlink:href="pcbi.1003661.g011"></graphic>
</fig>
<p>Lastly, to gain insight into subjects' systematic response biases, we used our framework to nonparametrically reconstruct what the subjects' priors in the various conditions would look like
<xref rid="pcbi.1003661-Stocker1" ref-type="bibr">[2]</xref>
,
<xref rid="pcbi.1003661-Girshick1" ref-type="bibr">[3]</xref>
,
<xref rid="pcbi.1003661-Acerbi1" ref-type="bibr">[8]</xref>
,
<xref rid="pcbi.1003661-Kording1" ref-type="bibr">[9]</xref>
(see
<xref ref-type="sec" rid="s4">Methods</xref>
). Due to limited data per condition and computational constraints, we recovered the subjects' priors at the group level and for model SPK-L, without additional noise on the priors (P). The reconstructed average priors for distinct test sessions are shown in
<xref ref-type="fig" rid="pcbi-1003661-g012">Figure 12</xref>
. Reconstructed priors display a very good match with the true priors for the Gaussian session and show minor deviations in the other sessions. The ability of the model to reconstruct the priors (modulo residual idiosyncrasies) is indicative of how well the observer model captures subjects' sources of suboptimality.</p>
<fig id="pcbi-1003661-g012" orientation="portrait" position="float">
<object-id pub-id-type="doi">10.1371/journal.pcbi.1003661.g012</object-id>
<label>Figure 12</label>
<caption>
<title>Reconstructed prior distributions.</title>
<p>Each panel shows the (unnormalized) probability density for a ‘prior’ distribution of targets, grouped by test session, as per
<xref ref-type="fig" rid="pcbi-1003661-g002">Figure 2</xref>
. Purple lines are mean reconstructed priors (mean
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e356.jpg"></inline-graphic>
</inline-formula>
1 SD) according to observer model SPK-L.
<bold>a: Gaussian session.</bold>
Recovered priors in the Gaussian test session are very good approximations of the true priors (comparison between SD of the reconstructed priors and true SD:
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e357.jpg"></inline-graphic>
</inline-formula>
).
<bold>b: Unimodal session.</bold>
Recovered priors in the unimodal test session approximate the true priors (recovered SD:
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e358.jpg"></inline-graphic>
</inline-formula>
, true SD:
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e359.jpg"></inline-graphic>
</inline-formula>
screen units) although with systematic deviations in higher-order moments (comparison between moments of the reconstructed priors and true moments: skewness
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e360.jpg"></inline-graphic>
</inline-formula>
; kurtosis
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e361.jpg"></inline-graphic>
</inline-formula>
). Reconstructed priors are systematically less kurtotic (less peaked, lighter-tailed) than the true priors.
<bold>c: Bimodal session.</bold>
Recovered priors in the bimodal test session approximate the true priors with only minor systematic deviations (recovered SD:
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e362.jpg"></inline-graphic>
</inline-formula>
, true SD:
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e363.jpg"></inline-graphic>
</inline-formula>
screen units; coefficient of determination between moments of the reconstructed priors and true moments: skewness
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e364.jpg"></inline-graphic>
</inline-formula>
; kurtosis
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e365.jpg"></inline-graphic>
</inline-formula>
).</p>
</caption>
<graphic xlink:href="pcbi.1003661.g012"></graphic>
</fig>
</sec>
</sec>
<sec id="s3">
<title>Discussion</title>
<p>We have explored human performance in probabilistic inference (a target estimation task) for different classes of prior distributions and different levels of reliability of the cues. Crucially, in our setup subjects were required to perform Bayesian computations with explicitly provided probabilistic information, thereby removing the need either for statistical learning or for memory and recall of a prior distribution. We found that subjects performed suboptimally in our paradigm but that their relative degree of suboptimality was similar across different priors and different cue noise. Based on a generative model of the task we built a set of suboptimal Bayesian observer models. Different methods of model comparison among this large class of models converged in identifying a most likely observer model that deviates from the optimal Bayesian observer in the following points: (a) a mismatching representation of the likelihood parameters, (b) a noisy estimation of the parameters of the prior, (c) a few occasional lapses, and (d) a stochastic representation of the posterior (such that the target choice distribution is approximated by a power function of the posterior).</p>
<sec id="s3a">
<title>Human performance in probabilistic inference</title>
<p>Subjects integrated probabilistic information from both prior and cue in our task, but rarely exhibited the signature of full ‘synergistic integration’, i.e. a performance above that which could be obtained by using either the prior or the cue alone (see
<xref ref-type="fig" rid="pcbi-1003661-g005">Figure 5</xref>
). However, unlike most studies of Bayesian learning, on each trial in our study subjects were presented with a new prior. A previous study on movement planning with probabilistic information (and fewer conditions) similarly found that subjects violated conditions of optimality
<xref rid="pcbi.1003661-Hudson1" ref-type="bibr">[23]</xref>
.</p>
<p>More interestingly, in our data the relative degree of suboptimality did not show substantial differences across distinct classes of priors and noise levels of the cue (low-noise and high-noise). This finding suggests that human efficacy at probabilistic inference is only mildly affected by complexity of the prior per se, at least for the distributions we have used. Conversely, the process of learning priors is considerably affected by the class of the distribution: for instance, learning a bimodal prior (when it is learnt at all) can require thousands of trials
<xref rid="pcbi.1003661-Kording1" ref-type="bibr">[9]</xref>
, whereas mean and variance of a single Gaussian can be acquired reliably within a few hundred trials
<xref rid="pcbi.1003661-Berniker1" ref-type="bibr">[11]</xref>
.</p>
<p>Within the same session, subjects' relative performance was influenced by the specific shape of the prior. In particular, for Gaussian priors we found a systematic effect of the variance: subjects performed worse with wider priors, more than would be expected from the objective decrease in available information alone. Interestingly, neither noise in the estimation of the prior width (factor P) nor occasional lapses that follow the shape of the prior itself (factor L) is sufficient to explain this effect. Postdictions of model BDT-P-L show large systematic deviations from subjects' performance in the Gaussian sessions, whereas the best model with decision noise, SPK-P-L, is able to capture subjects' behavior; see top left and top right panels in
<xref ref-type="fig" rid="pcbi-1003661-g010">Figure 10</xref>
. Moreover, the Gaussian priors recovered under model SPK-L match the true priors extremely well, supporting the role of the stochastic posterior in fully explaining subjects' performance with Gaussian priors. The crucial aspect of model SPK may be that decision noise is proportional to the width of the posterior, and not merely to that of the prior.</p>
<p>In the unimodal test session, subjects' performance was positively correlated with the width of the main peak of the distribution. That is, non-Gaussian, narrow-peaked priors (such as priors 1 and 6 in
<xref ref-type="fig" rid="pcbi-1003661-g012">Figure 12b</xref>
) induced worse performance than broad and smooth distributions (e.g. priors 4 and 8). Subjects tended to ‘mistrust’ the prior, especially in the high-noise condition, giving excess weight to the cue (
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e366.jpg"></inline-graphic>
</inline-formula>
is significantly lower than it should be; see
<xref ref-type="table" rid="pcbi-1003661-t003">Table 3</xref>
), which can also be interpreted as an overestimation of the width of the prior. In agreement with this description, the reconstructed priors in
<xref ref-type="fig" rid="pcbi-1003661-g012">Figure 12b</xref>
show a general tendency to overestimate the width of the narrower peaks, as we found in a previous study of interval timing
<xref rid="pcbi.1003661-Acerbi1" ref-type="bibr">[8]</xref>
. This behavior is compatible with the well-known human tendency to underestimate (or underweight) the probability of highly probable outcomes and to overestimate (overweight) the frequency of rare events (see
<xref rid="pcbi.1003661-Zhang1" ref-type="bibr">[27]</xref>
,
<xref rid="pcbi.1003661-Kahneman1" ref-type="bibr">[38]</xref>
,
<xref rid="pcbi.1003661-Tversky1" ref-type="bibr">[39]</xref>
). Similar biases in estimating and manipulating prior distributions may be explained by a hyperprior that favors more entropic and, therefore, smoother priors, in order to avoid ‘overfitting’ to the environment
<xref rid="pcbi.1003661-Feldman1" ref-type="bibr">[40]</xref>
.</p>
</sec>
<sec id="s3b">
<title>Modelling suboptimality</title>
<p>In building our observer models we made several assumptions. For all models we assumed that the prior adopted by observers in Eq. 2 corresponded to a continuous approximation of the probability density function displayed on screen, or a noisy estimate thereof. We verified that using the original discrete representation does not improve model performance. Clearly, subjects may have been affected by the discretization of the prior in other ways, but we assumed that such errors could be absorbed by other model components. We also assumed that subjects quickly acquired a correct internal model of the probabilistic structure of the task, through practice and feedback, although quantitative details (i.e. model parameters) could be mismatched with respect to the true parameters. Formally, our observer models were not ‘actor’ models, in the sense that they did not take into account the motor error in the computation of the expected loss. However, this entails negligible loss of generality, since the motor term has no influence on the inference of the optimal target for single Gaussian priors, and has an empirically negligible impact for other priors at small values of the motor error
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e367.jpg"></inline-graphic>
</inline-formula>
(as those measured in our task; see
<xref ref-type="supplementary-material" rid="pcbi.1003661.s004">Text S3</xref>
).</p>
<p>Suboptimality was introduced into our observer models in three main ways: (a) miscalibration of the parameters of the likelihood; (b) models of approximate inference; and (c) additional stochasticity, either in the sensory inputs or in the decision-making process itself. Motor noise was another source of suboptimality, but its contribution was comparably low.</p>
<p>Miscalibration of the parameters of the likelihood means that the subjective estimates of the reliability of the cues (
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e368.jpg"></inline-graphic>
</inline-formula>
and
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e369.jpg"></inline-graphic>
</inline-formula>
) could differ from the true values (
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e370.jpg"></inline-graphic>
</inline-formula>
and
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e371.jpg"></inline-graphic>
</inline-formula>
). In fact, we found slight to moderate discrepancies, which became substantial in some conditions. Previous studies have investigated whether subjects have (or develop) a correct internal estimate of relevant noise parameters (i.e. the likelihood) which may correspond to their own sensory or motor variability plus some externally injected noise. In several cases subjects were found to have a miscalibrated model of their own variability which led to suboptimal behavior
<xref rid="pcbi.1003661-Battaglia1" ref-type="bibr">[33]</xref>
,
<xref rid="pcbi.1003661-Mamassian2" ref-type="bibr">[41]</xref>
<xref rid="pcbi.1003661-Zhang3" ref-type="bibr">[43]</xref>
, although there are cases in which subjects were able to develop correct estimates of such parameters
<xref rid="pcbi.1003661-Tassinari1" ref-type="bibr">[10]</xref>
,
<xref rid="pcbi.1003661-Trommershuser2" ref-type="bibr">[44]</xref>
,
<xref rid="pcbi.1003661-Gepshtein1" ref-type="bibr">[45]</xref>
.</p>
<p>More generally, it could be that subjects were not only using incorrect parameters for the task, but had built a wrong internal model or were employing approximations in the inference process. For our task, which has a relatively simple one-dimensional structure, we did not find evidence that subjects were using low-order approximations of the posterior distribution. Moreover, the ability of our models to recover the subjects' priors in good agreement with the true priors suggests that subjects' internal model of the task was not too discrepant from the true one.</p>
<p>A crucial element in all our models was the inclusion of extra sources of variability, in particular in decision making. Whereas most forms of added noise have a clear interpretation, such as sensory noise in the estimation of the cue location or in the estimation of the parameters of the prior, the so-called ‘stochastic posterior’ deserves an extended explanation.</p>
</sec>
<sec id="s3c">
<title>Understanding the stochastic posterior</title>
<p>We introduced the stochastic posterior model of decision making, SPK, with two intuitive interpretations, namely a noisy posterior or a sample-based approximation (see
<xref ref-type="fig" rid="pcbi-1003661-g007">Figure 7</xref>
and
<xref ref-type="supplementary-material" rid="pcbi.1003661.s003">Text S2</xref>
), but clearly any process that produces variability in the target choice distribution approximating a power function of the posterior is a candidate explanation. The stochastic posterior captures the main trait of decision noise, namely a variability that depends on the shape of the posterior
<xref rid="pcbi.1003661-Battaglia1" ref-type="bibr">[33]</xref>
, as opposed to other forms of noise that do not depend on the decision process. The outstanding open questions are therefore what kind of process lies behind the observed noise in decision making, and at which stage it arises, e.g. whether it is due to inference or to action selection
<xref rid="pcbi.1003661-Drugowitsch1" ref-type="bibr">[46]</xref>
.</p>
<p>A seemingly promising candidate for the source of noise in the inference is neuronal variability in the nervous system
<xref rid="pcbi.1003661-Faisal1" ref-type="bibr">[47]</xref>
. Although the noisy representation of the posterior distribution in
<xref ref-type="fig" rid="pcbi-1003661-g007">Figure 7b through a</xref>
population of units may be a simplistic cartoon, the posterior could be encoded in subtler ways (see for instance
<xref rid="pcbi.1003661-Ma1" ref-type="bibr">[48]</xref>
). However, neuronal noise itself may not be enough to explain the amount of observed variability (see
<xref ref-type="supplementary-material" rid="pcbi.1003661.s003">Text S2</xref>
. An extension of this hypothesis is that the noise may emerge because suboptimal computations magnify the underlying variability
<xref rid="pcbi.1003661-Beck1" ref-type="bibr">[49]</xref>
.</p>
<p>Alternatively, another scenario is represented by the sampling hypothesis, an approximate algorithm for probabilistic inference which could be implemented at the neural level
<xref rid="pcbi.1003661-Fiser1" ref-type="bibr">[19]</xref>
. Our analysis ruled out an observer whose decision-making process consists in taking the average of
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e372.jpg"></inline-graphic>
</inline-formula>
samples from the posterior, an operation that implicitly assumes a quadratic loss function, showing that averaging samples from the posterior is not a generally valid approach, although differences can be small for unimodal distributions. More generally, a sampling method should always take into account the loss function of the task, which in our case is closer to a delta function (a MAP solution) than to a quadratic loss. Our results are compatible with a proper sampling approach, in which an empirical distribution is built out of a small number of samples from the posterior, and the expected loss is then computed from the sampled distribution
<xref rid="pcbi.1003661-Fiser1" ref-type="bibr">[19]</xref>
.</p>
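<p>To make the distinction concrete, the following MATLAB sketch (our own illustration, with made-up parameter values) contrasts the two sample-based decision rules on a bimodal posterior: averaging the samples, as in the PSA models, versus minimizing the expected inverted Gaussian loss computed on the sampled empirical distribution.</p>
<preformat>
% Contrast two sample-based decision rules on a bimodal posterior
% (all parameter values are hypothetical, for illustration only).
mu = [0.35; 0.65]; sd = [0.03; 0.03]; w = [0.7; 0.3];  % mixture posterior
K = 5;            % number of posterior samples per decision
ell = 0.025;      % scale of the inverted Gaussian loss (cf. Eq. 16)
nTrials = 10000;
rAvg = zeros(nTrials,1); rLoss = zeros(nTrials,1);
for t = 1:nTrials
    comp = 1 + (rand(K,1) > w(1));          % mixture component per sample
    xs = mu(comp) + sd(comp).*randn(K,1);   % K samples from the posterior
    rAvg(t) = mean(xs);                     % sample averaging (PSA)
    % Expected inverted Gaussian gain evaluated on the sampled empirical
    % distribution, maximized over the sampled candidate targets:
    [~, idx] = max(arrayfun(@(r) mean(exp(-(r-xs).^2/(2*ell^2))), xs));
    rLoss(t) = xs(idx);
end
% PSA responses pile up between the modes; the expected-loss rule does not.
fprintf('mean response: PSA %.3f, expected loss %.3f\n', mean(rAvg), mean(rLoss));
</preformat>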
<p>As a more cognitive explanation, decision variability may have arisen because subjects adopted a probabilistic rather than a deterministic strategy in action selection, as a form of exploratory behavior. In reinforcement learning this is analogous to implementing a probabilistic policy as opposed to a deterministic one, with a ‘temperature’ parameter that governs the amount of variability
<xref rid="pcbi.1003661-Sutton1" ref-type="bibr">[50]</xref>
. Search strategies have been hypothesized to lie behind suboptimal behaviors that appear random, such as probability matching
<xref rid="pcbi.1003661-Gaissmaier1" ref-type="bibr">[51]</xref>
. While generic exploratory behavior is compatible with our findings, our analysis rejected a simple posterior-matching strategy
<xref rid="pcbi.1003661-Mamassian1" ref-type="bibr">[25]</xref>
,
<xref rid="pcbi.1003661-Wozny1" ref-type="bibr">[26]</xref>
.</p>
<p>All of these interpretations assume that there is some noise in the decision process itself. However, the noise could emerge from other sources, without the need to introduce deviations from standard BDT. For instance, variability in the experiment could arise from lack of stationarity: dependencies between trials, fluctuations of subjects' parameters or time-varying strategies would appear as additional noise in a stationary model
<xref rid="pcbi.1003661-Green1" ref-type="bibr">[52]</xref>
. We explored the possibility of nonstationary behavior without finding evidence for strong effects of nonstationarity (see Section 6 in
<xref ref-type="supplementary-material" rid="pcbi.1003661.s002">Text S1</xref>
). In particular, an iterative (trial-dependent) non-Bayesian model failed to fit the data in the training dataset better than the stochastic posterior model. Clearly, this does not exclude that different, possibly Bayesian, iterative models could explain the data better, but our task design with multiple alternating conditions and partial feedback should mitigate the effect of dependencies between trials, since each trial typically presents a different condition from the immediately preceding ones.</p>
<p>In summary, we show that a decision strategy implementing a ‘stochastic posterior’, which introduces variability in the computation of the expected loss, has several theoretical and empirical advantages when modelling subjects' performance, improving over previous models that implemented variability only through a ‘posterior-matching’ approach or that implicitly assumed a quadratic loss function (sampling-average methods).</p>
</sec>
</sec>
<sec sec-type="methods" id="s4">
<title>Methods</title>
<sec id="s4a">
<title>Ethics statement</title>
<p>The Cambridge Psychology Research Ethics Committee approved the experimental procedures and all subjects gave informed consent.</p>
</sec>
<sec id="s4b">
<title>Participants</title>
<p>Twenty-four subjects (10 male and 14 female; age range 18–33 years) participated in the study. All participants were naïve to the purpose of the study. All participants were right-handed according to the Edinburgh handedness inventory
<xref rid="pcbi.1003661-Oldfield1" ref-type="bibr">[53]</xref>
, with normal or corrected-to-normal vision and reported no neurological disorder. Participants were compensated for their time.</p>
</sec>
<sec id="s4c">
<title>Behavioral task</title>
<p>Subjects were required to reach to an unknown target given probabilistic information about its position. Information consisted of a visual representation of the a priori probability distribution of targets for that trial and a noisy cue about the actual target position.</p>
<p>Subjects held the handle of a robotic manipulandum (vBOT,
<xref rid="pcbi.1003661-Howard1" ref-type="bibr">[54]</xref>
). The visual scene from a CRT monitor (Dell UltraScan P1110, 21-inch, 100 Hz refresh rate) was projected into the plane of the hand via a mirror (
<xref ref-type="fig" rid="pcbi-1003661-g001">Figure 1a</xref>
) that prevented the subjects from seeing their hand. The workspace origin, coordinates
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e373.jpg"></inline-graphic>
</inline-formula>
, was
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e374.jpg"></inline-graphic>
</inline-formula>
cm from the torso of the subjects, with positive axes towards the right (
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e375.jpg"></inline-graphic>
</inline-formula>
axis) and away from the subject (
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e376.jpg"></inline-graphic>
</inline-formula>
axis). The workspace showed a home position (1.5 cm radius circle) at
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e377.jpg"></inline-graphic>
</inline-formula>
cm and a cursor (1.25 cm radius circle) that tracked the hand position.</p>
<p>On each trial 100 potential targets (0.1 cm radius dots) were shown around the target line at positions
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e378.jpg"></inline-graphic>
</inline-formula>
, for
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e379.jpg"></inline-graphic>
</inline-formula>
, where the
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e380.jpg"></inline-graphic>
</inline-formula>
formed a fixed discrete representation of the trial-dependent ‘prior’ distribution
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e381.jpg"></inline-graphic>
</inline-formula>
, obtained through a regular sample of the cdf (see
<xref ref-type="fig" rid="pcbi-1003661-g001">Figure 1d</xref>
), and the
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e382.jpg"></inline-graphic>
</inline-formula>
were small random offsets used to facilitate visualization (
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e383.jpg"></inline-graphic>
</inline-formula>
Uniform(−0.3, 0.3) cm). The true target was chosen by picking one of the potential targets at random with uniform probability. A cue (0.25 cm radius circle) was shown at position
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e384.jpg"></inline-graphic>
</inline-formula>
. The horizontal position
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e385.jpg"></inline-graphic>
</inline-formula>
provided a noisy estimate of the target position,
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e386.jpg"></inline-graphic>
</inline-formula>
, with
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e387.jpg"></inline-graphic>
</inline-formula>
the true (horizontal) position of the target,
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e388.jpg"></inline-graphic>
</inline-formula>
the cue variability and
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e389.jpg"></inline-graphic>
</inline-formula>
a normal random variable with zero mean and unit variance. The distance of the cue from the target line,
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e390.jpg"></inline-graphic>
</inline-formula>
, was linearly related to the cue variability: cues distant from the target line were noisier than cues close to it. In our setup, the noise level
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e391.jpg"></inline-graphic>
</inline-formula>
could take only two values: low for ‘short-distance’ cues,
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e392.jpg"></inline-graphic>
</inline-formula>
cm (
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e393.jpg"></inline-graphic>
</inline-formula>
cm), or high for ‘long-distance’ cues,
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e394.jpg"></inline-graphic>
</inline-formula>
cm (
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e395.jpg"></inline-graphic>
</inline-formula>
cm). Both the prior distribution and cue remained on the screen for the duration of a trial.</p>
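<p>A minimal MATLAB sketch of the trial generation just described (the prior, grid resolution and noise SDs below are illustrative stand-ins, not the exact task values):</p>
<preformat>
% Generate one trial: potential targets, true target and noisy cue
% (distances in normalized screen units; all values illustrative).
priorpdf = @(x) normpdf(x, 0.5, 0.08);        % example trial-dependent prior
xgrid = linspace(0, 1, 2001);
F = cumtrapz(xgrid, priorpdf(xgrid)); F = F/F(end);   % cdf on the grid
q = ((1:100) - 0.5)/100;                      % regular samples of the cdf
[Fu, iu] = unique(F);                         % guard against flat regions
targets = interp1(Fu, xgrid(iu), q);          % 100 potential target positions
xstar = targets(randi(100));                  % true target: uniform pick
if rand > 0.5, sigmacue = 0.06; else, sigmacue = 0.14; end   % short/long cue
xcue = xstar + sigmacue*randn;                % noisy cue position
</preformat>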
<p>After a ‘go’ beep, subjects were required to move the handle towards the target line, choosing an endpoint position such that the true target would be within the cursor radius. The manipulandum generated a spring force along the depth axis (
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e396.jpg"></inline-graphic>
</inline-formula>
N/cm) for cursor positions past the target line, preventing subjects from overshooting. The horizontal endpoint position of the movement (velocity of the cursor less than 0.5 cm/s), after contact with the target line, was recorded as the subject’s response
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e397.jpg"></inline-graphic>
</inline-formula>
for that trial.</p>
<p>At the end of each trial, subjects received visual feedback on whether their cursor had encircled the true target (a ‘success’) or missed it (partial feedback). On full feedback trials, the position of the true target was also shown (0.25 cm radius yellow circle). Feedback remained on screen for 1 s. Potential targets, cues and feedback then disappeared. A new trial started 500 ms after the subject had returned to the home position.</p>
<p>For simplicity, all distances in the experiment are reported in terms of standardized screen units (window width of 1.0), with
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e398.jpg"></inline-graphic>
</inline-formula>
and 0.01 screen units corresponding to 3 mm. In screen units, the cursor radius is
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e399.jpg"></inline-graphic>
</inline-formula>
and the SD of noise for short and long distance cues is respectively
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e400.jpg"></inline-graphic>
</inline-formula>
and
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e401.jpg"></inline-graphic>
</inline-formula>
.</p>
</sec>
<sec id="s4d">
<title>Experimental sessions</title>
<p>Subjects performed one practice block in which they were familiarized with the task (64 trials). The main experiment consisted of a training session with Gaussian priors (576 trials) followed by a test session with group-dependent priors (576–640 trials). Each session was divided into four runs. Subjects could take short breaks between runs and there was a mandatory 15-minute break between the training and test sessions.</p>
<p>Each session presented eight different types of priors and two cue noise levels (corresponding to either ‘short’ or ‘long’ cues), for a total of 16 different conditions (36–40 trials per condition). Trials from different conditions were presented in random order. Depending on the session and group, priors belonged to one of the following classes (see
<xref ref-type="fig" rid="pcbi-1003661-g002">Figure 2</xref>
):</p>
<sec id="s4d1">
<title>Gaussian priors</title>
<p>Eight Gaussian distributions with evenly spaced SDs between 0.04 and 0.18, i.e.
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e402.jpg"></inline-graphic>
</inline-formula>
screen units.</p>
</sec>
<sec id="s4d2">
<title>Unimodal priors</title>
<p>Eight unimodal priors with fixed SD
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e403.jpg"></inline-graphic>
</inline-formula>
and variable skewness and kurtosis. With the exception of platykurtic prior 4, which is a mixture of 11 Gaussians, and prior 8, which is a single Gaussian, all other priors were realized as mixtures of two Gaussians that locally maximize differential entropy for given values of the first four central moments. In the maximization we included a constraint on the SDs of the individual components so as to prevent degenerate solutions (
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e404.jpg"></inline-graphic>
</inline-formula>
screen units, for
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e405.jpg"></inline-graphic>
</inline-formula>
). Skewness and excess kurtosis were chosen to represent various shapes of unimodal distributions, within the strict bounds that exist between skewness and kurtosis of a unimodal distribution
<xref rid="pcbi.1003661-Teuscher1" ref-type="bibr">[55]</xref>
. The values of (skewness, kurtosis) for the eight distributions, in order of increasing differential entropy, were: 1:
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e406.jpg"></inline-graphic>
</inline-formula>
; 2:
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e407.jpg"></inline-graphic>
</inline-formula>
; 3:
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e408.jpg"></inline-graphic>
</inline-formula>
; 4:
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e409.jpg"></inline-graphic>
</inline-formula>
; 5:
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e410.jpg"></inline-graphic>
</inline-formula>
; 6:
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e411.jpg"></inline-graphic>
</inline-formula>
; 7:
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e412.jpg"></inline-graphic>
</inline-formula>
; 8:
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e413.jpg"></inline-graphic>
</inline-formula>
.</p>
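<p>As a sanity check on candidate priors of this kind, the central moments of a mixture of Gaussians can be computed numerically on a grid; a minimal MATLAB sketch with made-up component parameters:</p>
<preformat>
% Central moments of a mixture-of-Gaussians prior, computed on a grid
% (component parameters are illustrative, not the task values).
w = [0.6 0.4]; mu = [0.45 0.60]; s = [0.05 0.12];
x = linspace(0, 1, 5001);
p = w(1)*normpdf(x,mu(1),s(1)) + w(2)*normpdf(x,mu(2),s(2));
p = p / trapz(x, p);                             % normalize on the grid
m1 = trapz(x, x.*p);                             % mean
m2 = trapz(x, (x - m1).^2.*p);                   % variance
skew = trapz(x, (x - m1).^3.*p) / m2^1.5;        % skewness
exkurt = trapz(x, (x - m1).^4.*p) / m2^2 - 3;    % excess kurtosis
fprintf('SD %.3f, skewness %.2f, excess kurtosis %.2f\n', sqrt(m2), skew, exkurt);
</preformat>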
</sec>
<sec id="s4d3">
<title>Bimodal priors</title>
<p>Eight (mostly) bimodal priors with fixed SD
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e414.jpg"></inline-graphic>
</inline-formula>
and variable separation and relative weight. The priors were realized as mixtures of two Gaussians with equal variance:
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e415.jpg"></inline-graphic>
</inline-formula>
. Separation was computed as
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e416.jpg"></inline-graphic>
</inline-formula>
, and relative weight was defined as
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e417.jpg"></inline-graphic>
</inline-formula>
. The values of (separation, relative weight) for the eight distributions, in order of increasing differential entropy, were: 1:
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e418.jpg"></inline-graphic>
</inline-formula>
; 2:
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e419.jpg"></inline-graphic>
</inline-formula>
; 3:
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e420.jpg"></inline-graphic>
</inline-formula>
; 4:
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e421.jpg"></inline-graphic>
</inline-formula>
; 5:
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e422.jpg"></inline-graphic>
</inline-formula>
; 6:
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e423.jpg"></inline-graphic>
</inline-formula>
; 7:
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e424.jpg"></inline-graphic>
</inline-formula>
; 8:
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e425.jpg"></inline-graphic>
</inline-formula>
(the last distribution is a single Gaussian).</p>
<p>For all priors, the mean
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e426.jpg"></inline-graphic>
</inline-formula>
was drawn from a uniform distribution whose bounds were chosen such that the extremes of the discrete representation would fall within the active screen window (the actual screen size was larger than the active window). Also, asymmetric priors had
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e427.jpg"></inline-graphic>
</inline-formula>
probability of being flipped horizontally about the mean.</p>
</sec>
</sec>
<sec id="s4e">
<title>Data analysis</title>
<sec id="s4e1">
<title>Analysis of behavioral data</title>
<p>Data analysis was conducted in MATLAB 2010b (Mathworks, U.S.A.). To avoid edge artifacts in subjects' response, we discarded trials in which the cue position,
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e428.jpg"></inline-graphic>
</inline-formula>
, was outside the range of the discretized prior distribution (2691 out of 28672 trials: 9.4%). We included these trials in the experimental session in order to preserve the probabilistic relationships between variables of the task.</p>
<p>For each trial, we recorded the response location
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e429.jpg"></inline-graphic>
</inline-formula>
and the reaction time (RT), defined as the interval between the ‘go’ beep and the start of the subject's movement. For each subject and session we computed a nonlinear kernel regression estimate of the average RT as a function of the SD of the posterior distribution,
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e430.jpg"></inline-graphic>
</inline-formula>
. We only considered a range of
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e431.jpg"></inline-graphic>
</inline-formula>
for which all subjects had a significant density of data points. Results did not change qualitatively for other measures of spread of the posterior, such as the exponential entropy
<xref rid="pcbi.1003661-Campbell1" ref-type="bibr">[24]</xref>
.</p>
<p>All subjects' datasets are available online in
<xref ref-type="supplementary-material" rid="pcbi.1003661.s001">Dataset S1</xref>
.</p>
</sec>
<sec id="s4e2">
<title>Optimality index and success probability</title>
<p>We calculated the optimality index for each trial as the success probability for response
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e432.jpg"></inline-graphic>
</inline-formula>
,
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e433.jpg"></inline-graphic>
</inline-formula>
, divided by the maximal success probability
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e434.jpg"></inline-graphic>
</inline-formula>
, which we used to quantify performance of a subject (or an observer model). The optimality index of our subjects in the task is plotted in
<xref ref-type="fig" rid="pcbi-1003661-g005">Figure 5</xref>
and success probabilities are shown in
<xref ref-type="fig" rid="pcbi-1003661-g001">Figure 1</xref>
in
<xref ref-type="supplementary-material" rid="pcbi.1003661.s002">Text S1</xref>
.</p>
<p>The success probability
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e435.jpg"></inline-graphic>
</inline-formula>
in a given trial represents the probability of locating the correct target according to the generative model of the task (independent of the actual position of the target). For a trial with cue position
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e436.jpg"></inline-graphic>
</inline-formula>
, cue noise variance
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e437.jpg"></inline-graphic>
</inline-formula>
, and prior distribution
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e438.jpg"></inline-graphic>
</inline-formula>
, the success probability is defined as:
<disp-formula id="pcbi.1003661.e439">
<graphic xlink:href="pcbi.1003661.e439.jpg" position="anchor" orientation="portrait"></graphic>
<label>(12)</label>
</disp-formula>
where the integrand is the posterior distribution according to the continuous generative model of the task and
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e440.jpg"></inline-graphic>
</inline-formula>
is the diameter of the cursor. Solving the integral in Eq. 12 for a generic mixture-of-Gaussians prior,
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e441.jpg"></inline-graphic>
</inline-formula>
, we obtain:
<disp-formula id="pcbi.1003661.e442">
<graphic xlink:href="pcbi.1003661.e442.jpg" position="anchor" orientation="portrait"></graphic>
<label>(13)</label>
</disp-formula>
where the symbols
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e443.jpg"></inline-graphic>
</inline-formula>
,
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e444.jpg"></inline-graphic>
</inline-formula>
and
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e445.jpg"></inline-graphic>
</inline-formula>
have been defined in Eq. 5. The maximal success probability is simply computed as
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e446.jpg"></inline-graphic>
</inline-formula>
.</p>
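<p>For reference, Eq. 13 translates directly into a short MATLAB function (our own sketch; argument values such as the cursor diameter are to be supplied in screen units):</p>
<preformat>
function ps = successprob(r, xcue, sigma, w, mu, s, d)
% Success probability (Eqs. 12-13): posterior mass within the cursor
% window of diameter d centered on response r, for a mixture-of-Gaussians
% prior (weights w, means mu, SDs s) and Gaussian cue noise of SD sigma.
postvar = 1./(1./s.^2 + 1/sigma^2);          % per-component posterior variance
postmu = postvar.*(mu./s.^2 + xcue/sigma^2); % per-component posterior mean
gam = w.*normpdf(xcue, mu, sqrt(s.^2 + sigma^2));
gam = gam / sum(gam);                        % normalized mixture weights
ps = sum(gam.*(normcdf(r + d/2, postmu, sqrt(postvar)) ...
             - normcdf(r - d/2, postmu, sqrt(postvar))));
end
% The maximal success probability is obtained by maximizing ps over r.
</preformat>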
<p>Note that a metric based on the theoretical success probability is more appropriate than one based on the observed fraction of successes in a given sample of trials, as the latter introduces additional error due to mere chance (the observed fraction of successes fluctuates around the true success probability with binomial statistics, and the error can be substantial for small sample sizes).</p>
<p>The priors for the Gaussian, unimodal and bimodal sessions were chosen such that the average maximal success probability of each class was about the same (
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e447.jpg"></inline-graphic>
</inline-formula>
) making the task challenging and of comparable difficulty across sessions.</p>
</sec>
<sec id="s4e3">
<title>Computing the optimal target</title>
<p>According to Bayesian Decision Theory (BDT), the key quantity an observer needs to compute in order to make a decision is the (subjectively) expected loss for a given action. In our task, the action corresponds to a choice of a cursor position
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e448.jpg"></inline-graphic>
</inline-formula>
, and the expected loss takes the form:
<disp-formula id="pcbi.1003661.e449">
<graphic xlink:href="pcbi.1003661.e449.jpg" position="anchor" orientation="portrait"></graphic>
<label>(14)</label>
</disp-formula>
where
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e450.jpg"></inline-graphic>
</inline-formula>
is the subject's posterior distribution of target position, described by Eq. 2, and the loss associated with choosing position
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e451.jpg"></inline-graphic>
</inline-formula>
when the target location is
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e452.jpg"></inline-graphic>
</inline-formula>
is represented by loss function
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e453.jpg"></inline-graphic>
</inline-formula>
.</p>
<p>Our task has a clear ‘hit or miss’ structure that is represented by the square well function:
<disp-formula id="pcbi.1003661.e454">
<graphic xlink:href="pcbi.1003661.e454.jpg" position="anchor" orientation="portrait"></graphic>
<label>(15)</label>
</disp-formula>
where
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e455.jpg"></inline-graphic>
</inline-formula>
is the distance of the chosen response from the target, and
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e456.jpg"></inline-graphic>
</inline-formula>
is the size of the allowed window for locating the target (in the experiment, the cursor diameter). The square well loss allows for an analytical expression of the expected loss, but the optimal target still needs to be computed numerically. Therefore we use a smooth approximation of the square well loss, the inverted Gaussian loss:
<disp-formula id="pcbi.1003661.e457">
<graphic xlink:href="pcbi.1003661.e457.jpg" position="anchor" orientation="portrait"></graphic>
<label>(16)</label>
</disp-formula>
where the parameter
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e458.jpg"></inline-graphic>
</inline-formula>
governs the scale of the smoothed detection window. The Gaussian loss approximates the predictions of the square well loss in our task extremely well, to the point that performance under the two forms of loss is empirically indistinguishable (see Section 3 in
<xref ref-type="supplementary-material" rid="pcbi.1003661.s002">Text S1</xref>
). However, the Gaussian loss is computationally preferable as it allows much faster calculation of optimal behavior.</p>
<p>For the decision process, BDT assumes that observers choose the ‘optimal’ target position
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e459.jpg"></inline-graphic>
</inline-formula>
that minimizes the expected loss:
<disp-formula id="pcbi.1003661.e460">
<graphic xlink:href="pcbi.1003661.e460.jpg" position="anchor" orientation="portrait"></graphic>
<label>(17)</label>
</disp-formula>
where we have used Eqs. 2, 14 and 16. With some algebraic manipulations, Eq. 17 can be reformulated as Eq. 4. Given the form of the expected loss, the solution of Eq. 4 is equivalent to finding the maximum (mode) of a Gaussian mixture model. In general no analytical solution is known for more than one model component (
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e461.jpg"></inline-graphic>
</inline-formula>
), so we implemented a fast and accurate numerical solution adapting the algorithm in
<xref rid="pcbi.1003661-CarreiraPerpinan1" ref-type="bibr">[56]</xref>
.</p>
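<p>A minimal sketch of such a fixed-point scheme, in the spirit of the algorithm in
<xref rid="pcbi.1003661-CarreiraPerpinan1" ref-type="bibr">[56]</xref>
(starting points and iteration count are our own simplifications):</p>
<preformat>
function xmode = gmmode(w, mu, s)
% Global mode of a 1D mixture of Gaussians via mean-shift fixed-point
% iterations started from every component mean; the iterate with the
% highest density is returned. w, mu, s: weights, means and SDs.
mixpdf = @(x) sum(w.*normpdf(x, mu, s));
best = -Inf; xmode = mu(1);
for j = 1:numel(mu)
    x = mu(j);
    for it = 1:300                           % fixed-point iterations
        qj = w.*normpdf(x, mu, s)./s.^2;
        x = sum(qj.*mu) / sum(qj);           % stationary-point update
    end
    if mixpdf(x) > best, best = mixpdf(x); xmode = x; end
end
end
</preformat>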
</sec>
<sec id="s4e4">
<title>Computing the response probability</title>
<p>The probability of observing response
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e462.jpg"></inline-graphic>
</inline-formula>
in a trial,
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e463.jpg"></inline-graphic>
</inline-formula>
(e.g., Eq. 6) is the key quantity for our probabilistic modelling of the task. For basic observer models,
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e464.jpg"></inline-graphic>
</inline-formula>
is obtained as the convolution between a Gaussian distribution (motor noise) and a target choice distribution in closed form (e.g. a power function of a mixture of Gaussians), such as in Eqs. 3, 7 and 11. Response probabilities are integrated over latent variables of model factor S (
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e465.jpg"></inline-graphic>
</inline-formula>
; see Eq. 8) and of model factor P (
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e466.jpg"></inline-graphic>
</inline-formula>
and
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e467.jpg"></inline-graphic>
</inline-formula>
; see Eqs. 9 and 10). Integrations were performed analytically when possible or otherwise numerically (trapz in MATLAB or Gauss-Hermite quadrature method for non-analytical Gaussian integrals
<xref rid="pcbi.1003661-Press1" ref-type="bibr">[57]</xref>
). For instance, the observed response probability for model factor S takes the shape:
<disp-formula id="pcbi.1003661.e468">
<graphic xlink:href="pcbi.1003661.e468.jpg" position="anchor" orientation="portrait"></graphic>
<label>(18)</label>
</disp-formula>
where we are integrating over the hidden variables
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e469.jpg"></inline-graphic>
</inline-formula>
and
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e470.jpg"></inline-graphic>
</inline-formula>
. The target choice distribution
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e471.jpg"></inline-graphic>
</inline-formula>
depends on the decision-making model component (see e.g. Eqs. 3 and 7). Without loss of generality, we assumed that the observers are not aware of their internal variability. Predictions of model S do not change whether we assume that the observer is aware of his or her measurement error
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e472.jpg"></inline-graphic>
</inline-formula>
or not; the differences amount only to a redefinition of
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e473.jpg"></inline-graphic>
</inline-formula>
.</p>
<p>For a Gaussian prior with mean
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e474.jpg"></inline-graphic>
</inline-formula>
and variance
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e475.jpg"></inline-graphic>
</inline-formula>
, the response probability has the following closed form solution:
<disp-formula id="pcbi.1003661.e476">
<graphic xlink:href="pcbi.1003661.e476.jpg" position="anchor" orientation="portrait"></graphic>
<label>(19)</label>
</disp-formula>
with
<disp-formula id="pcbi.1003661.e477">
<graphic xlink:href="pcbi.1003661.e477.jpg" position="anchor" orientation="portrait"></graphic>
<label>(20)</label>
</disp-formula>
where
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e478.jpg"></inline-graphic>
</inline-formula>
is the noise parameter of the stochastic posterior in model component SPK (
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e479.jpg"></inline-graphic>
</inline-formula>
for PPM;
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e480.jpg"></inline-graphic>
</inline-formula>
for BDT) and
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e481.jpg"></inline-graphic>
</inline-formula>
is the sensory noise in estimation of the cue position in model S (
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e482.jpg"></inline-graphic>
</inline-formula>
for observer models without cue-estimation noise). For observer models P with noise on the prior, Eq. 19 was numerically integrated over different values of the internal measurement (here corresponding to log
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e483.jpg"></inline-graphic>
</inline-formula>
) with a Gauss-Hermite quadrature method
<xref rid="pcbi.1003661-Press1" ref-type="bibr">[57]</xref>
.</p>
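<p>For reference, Gauss-Hermite nodes and weights can be obtained from the eigendecomposition of the Jacobi matrix (the Golub-Welsch method); a minimal MATLAB sketch for Gaussian expectations of the kind used above:</p>
<preformat>
function I = ghquad(f, m, s, n)
% Approximate E[f(X)] for X ~ N(m, s^2) with n-point Gauss-Hermite
% quadrature; f must accept vector input elementwise.
k = 1:n-1;
J = diag(sqrt(k/2), 1) + diag(sqrt(k/2), -1);  % Jacobi matrix
[V, D] = eig(J);                               % symmetric eigenproblem
t = diag(D);                                   % Hermite nodes
wq = sqrt(pi)*V(1,:)'.^2;                      % quadrature weights
% Change of variables x = m + sqrt(2)*s*t absorbs the Gaussian density:
I = sum(wq.*f(m + sqrt(2)*s*t)) / sqrt(pi);
end
% Example: ghquad(@(x) x.^2, 0, 1, 20) returns 1, i.e. E[X^2] for N(0,1).
</preformat>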
<p>For non-Gaussian priors there is no closed form solution similar to Eq. 19 and the calculation of the response probability, depending on active model components, may require up to three nested numerical integrations. Therefore, for computational tractability, we occasionally restricted our analysis to a subset of observer models, as indicated in the main text.</p>
<p>For model class PSA (posterior sampling average), the target choice distribution is the probability distribution of the average of
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e484.jpg"></inline-graphic>
</inline-formula>
samples drawn from the posterior distribution. For a posterior that is a mixture of Gaussians and integer
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e485.jpg"></inline-graphic>
</inline-formula>
, it is possible to obtain an explicit expression whose number of terms grows exponentially in
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e486.jpg"></inline-graphic>
</inline-formula>
. Fortunately, this did not constitute a problem as observer models favored small values of
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e487.jpg"></inline-graphic>
</inline-formula>
(also, a Gaussian approximation applies for large values of
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e488.jpg"></inline-graphic>
</inline-formula>
due to the central limit theorem). Values of the distribution for non-integer
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e489.jpg"></inline-graphic>
</inline-formula>
were found by linear interpolation between adjacent integer values. For model class LA (Laplace approximation) we found the mode of the posterior numerically
<xref rid="pcbi.1003661-CarreiraPerpinan1" ref-type="bibr">[56]</xref>
and analytically evaluated the second derivative of the log posterior at the mode. The mean of the approximate Gaussian posterior was set to the mode and the variance to minus the inverse of the second derivative
<xref rid="pcbi.1003661-MacKay1" ref-type="bibr">[34]</xref>
.</p>
<p>For all models, when using the model-dependent response probability,
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e490.jpg"></inline-graphic>
</inline-formula>
trial
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e491.jpg"></inline-graphic>
</inline-formula>
, in the model comparison, we added a small regularization term:
<disp-formula id="pcbi.1003661.e492">
<graphic xlink:href="pcbi.1003661.e492.jpg" position="anchor" orientation="portrait"></graphic>
<label>(21)</label>
</disp-formula>
with
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e493.jpg"></inline-graphic>
</inline-formula>
(the value of the pdf of a normal distribution at 5 SDs from the mean). This change in probability is empirically negligible, but from the point of view of model comparison the regularization term introduces a lower bound
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e494.jpg"></inline-graphic>
</inline-formula>
on the log probability of a single trial, preventing single outliers from having unlimited weight on the log likelihood of a model and thereby increasing the robustness of the inference.</p>
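<p>A minimal sketch of this regularization (Eq. 21 gives the exact form used in the analysis; the additive variant below is a simple version with the same qualitative effect of bounding the log probability from below):</p>
<preformat>
% Regularized per-trial response probability and summed log likelihood.
delta = normpdf(5);                  % pdf of a standard normal at 5 SDs
preg = @(p) (p + delta)/(1 + delta); % bounded away from zero
loglike = @(p) sum(log(preg(p)));    % p: vector of per-trial probabilities
</preformat>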
</sec>
<sec id="s4e5">
<title>Sampling and model comparison</title>
<p>For each observer model and each subject's dataset (comprising the training and test sessions) we calculated the posterior distribution of the model parameters given the data, Pr(
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e495.jpg"></inline-graphic>
</inline-formula>
| data, model)
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e496.jpg"></inline-graphic>
</inline-formula>
Pr(data|
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e497.jpg"></inline-graphic>
</inline-formula>
, model) Pr(
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e498.jpg"></inline-graphic>
</inline-formula>
| model), where we assumed a factorized prior over parameters, Pr(
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e499.jpg"></inline-graphic>
</inline-formula>
| model)  = 
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e500.jpg"></inline-graphic>
</inline-formula>
Pr(
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e501.jpg"></inline-graphic>
</inline-formula>
| model). Having obtained independent measures of typical sensorimotor noise parameters of the subjects in a sensorimotor estimation experiment, we took informative log-normal priors on parameters
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e502.jpg"></inline-graphic>
</inline-formula>
and
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e503.jpg"></inline-graphic>
</inline-formula>
(when present), with log-scale respectively
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e504.jpg"></inline-graphic>
</inline-formula>
and
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e505.jpg"></inline-graphic>
</inline-formula>
screen units and shape parameters
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e506.jpg"></inline-graphic>
</inline-formula>
and
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e507.jpg"></inline-graphic>
</inline-formula>
(see
<xref ref-type="supplementary-material" rid="pcbi.1003661.s004">Text S3</xref>
; results did not depend crucially on the shape of the priors). For the other parameters we took a noninformative uniform prior
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e508.jpg"></inline-graphic>
</inline-formula>
Uniform[0, 1] (dimensionful parameters were measured in normalized screen units), with the exception of the
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e509.jpg"></inline-graphic>
</inline-formula>
and
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e510.jpg"></inline-graphic>
</inline-formula>
parameters. The
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e511.jpg"></inline-graphic>
</inline-formula>
parameter that regulates the noise in the prior could occasionally be quite large (see main text) so we adopted a broader range
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e512.jpg"></inline-graphic>
</inline-formula>
Uniform[0, 4] to avoid edge effects. A priori, the
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e513.jpg"></inline-graphic>
</inline-formula>
parameter that governs noise in decision making could take any positive nonzero value (with higher probability mass on lower values), so we assumed a prior
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e514.jpg"></inline-graphic>
</inline-formula>
Uniform[0, 1] on
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e515.jpg"></inline-graphic>
</inline-formula>
, which is equivalent to a prior
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e516.jpg"></inline-graphic>
</inline-formula>
, for
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e517.jpg"></inline-graphic>
</inline-formula>
. Formally, a value of
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e518.jpg"></inline-graphic>
</inline-formula>
less than one represents performance more variable than posterior-matching (for
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e519.jpg"></inline-graphic>
</inline-formula>
the posterior distribution tends to a uniform distribution). Results of the model comparison were essentially identical whether we allowed
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e520.jpg"></inline-graphic>
</inline-formula>
to be less than one or not. We took a prior
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e521.jpg"></inline-graphic>
</inline-formula>
on the positive real line since it is integrable; an improper prior such as a noninformative prior
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e522.jpg"></inline-graphic>
</inline-formula>
is not advisable in a comparison between models with non-shared parameters, due to the ‘marginalization paradox’
<xref rid="pcbi.1003661-Dawid1" ref-type="bibr">[58]</xref>
.</p>
<p>The posterior distribution of the parameters is proportional to the data likelihood, which was computed in logarithmic form as:
<disp-formula id="pcbi.1003661.e523">
<graphic xlink:href="pcbi.1003661.e523.jpg" position="anchor" orientation="portrait"></graphic>
<label>(22)</label>
</disp-formula>
where
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e524.jpg"></inline-graphic>
</inline-formula>
is the regularized probability of response given by Eq. 21, and trial
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e525.jpg"></inline-graphic>
</inline-formula>
represents all the relevant variables of the
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e526.jpg"></inline-graphic>
</inline-formula>
-th trial. Eq. 22 assumes that the trials are independent and that subjects' parameters are fixed throughout each session (stationarity). The possibility of dependencies between trials and nonstationarity in the data is explored in Section 6 of
<xref ref-type="supplementary-material" rid="pcbi.1003661.s002">Text S1</xref>
.</p>
<p>A convenient way to sample from a probability distribution whose unnormalized pdf is known (Eq. 22) is a Markov Chain Monte Carlo method (e.g. slice sampling
<xref rid="pcbi.1003661-Neal1" ref-type="bibr">[29]</xref>
). For each dataset and model, we ran three parallel chains with different starting points (
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e527.jpg"></inline-graphic>
</inline-formula>
to
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e528.jpg"></inline-graphic>
</inline-formula>
burn-in samples,
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e529.jpg"></inline-graphic>
</inline-formula>
to
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e530.jpg"></inline-graphic>
</inline-formula>
saved samples per chain, depending on model complexity) obtaining a total of
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e531.jpg"></inline-graphic>
</inline-formula>
to
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e532.jpg"></inline-graphic>
</inline-formula>
sampled parameter vectors. Marginal pdfs of sampled chains were visually checked for convergence. We also searched for the global minimum of the (minus log) marginal likelihood by running a minimization algorithm (fminsearch in MATLAB) from several starting points (30 to 100 random locations). With this information we verified that, as far as we could tell, the chains were not stuck in a local minimum. Finally, we computed Gelman and Rubin's potential scale reduction statistic
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e533.jpg"></inline-graphic>
</inline-formula>
for all parameters
<xref rid="pcbi.1003661-Gelman1" ref-type="bibr">[59]</xref>
. Large values of
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e534.jpg"></inline-graphic>
</inline-formula>
indicate convergence problems whereas values close to 1 suggest convergence. Longer chains were run whenever any of these checks raised suspicion of a convergence problem. In the end, the average
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e535.jpg"></inline-graphic>
</inline-formula>
(across parameters, participants and models) was 1.003 and almost all values were
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e536.jpg"></inline-graphic>
</inline-formula>
suggesting good convergence.</p>
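<p>For completeness, a basic implementation of the potential scale reduction statistic
<xref rid="pcbi.1003661-Gelman1" ref-type="bibr">[59]</xref>
for a single parameter (without later refinements such as chain splitting):</p>
<preformat>
function R = psrf(chains)
% Gelman-Rubin potential scale reduction factor for one parameter.
% chains: n-by-m matrix, n saved samples from each of m parallel chains.
n = size(chains, 1);
cm = mean(chains, 1);                  % per-chain means
B = n*var(cm);                         % between-chain variance
W = mean(var(chains, 0, 1));           % mean within-chain variance
vhat = (n - 1)/n*W + B/n;              % pooled variance estimate
R = sqrt(vhat/W);                      % values near 1 suggest convergence
end
</preformat>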
<p>Given the parameter samples, we computed the DIC score (deviance information criterion)
<xref rid="pcbi.1003661-Spiegelhalter1" ref-type="bibr">[30]</xref>
for each dataset and model. The DIC score combines a goodness-of-fit term and a penalty for model complexity, similarly to other model comparison metrics such as the Akaike Information Criterion (AIC) and the Bayesian Information Criterion (BIC), with the advantage that DIC takes into account an estimate of the effective complexity of the model and is particularly easy to compute from MCMC output. DIC scores are computed as:
<disp-formula id="pcbi.1003661.e537">
<graphic xlink:href="pcbi.1003661.e537.jpg" position="anchor" orientation="portrait"></graphic>
<label>(23)</label>
</disp-formula>
where
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e538.jpg"></inline-graphic>
</inline-formula>
is the
<italic>deviance</italic>
given parameter vector
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e539.jpg"></inline-graphic>
</inline-formula>
, the
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e540.jpg"></inline-graphic>
</inline-formula>
are MCMC parameter samples and
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e541.jpg"></inline-graphic>
</inline-formula>
is a ‘good’ parameter estimate for the model (e.g. the mean, median or another measure of central tendency of the sampled parameters). As a robust estimate of
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e542.jpg"></inline-graphic>
</inline-formula>
we computed a trimmed mean (discarding
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e543.jpg"></inline-graphic>
</inline-formula>
from each side, which eliminated outlier parameter values). DIC scores are meaningful only in a comparison, so we only report differences in DIC scores between models (
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e544.jpg"></inline-graphic>
</inline-formula>
DIC). Although a difference of 3–7 points has been suggested to be significant
<xref rid="pcbi.1003661-Spiegelhalter1" ref-type="bibr">[30]</xref>
, we adopt a conservative stance, requiring a difference in DIC scores of 10 or more for significance
<xref rid="pcbi.1003661-Battaglia1" ref-type="bibr">[33]</xref>
. In Section 4 of
<xref ref-type="supplementary-material" rid="pcbi.1003661.s002">Text S1</xref>
we report a set of model comparisons evaluated in terms of group DIC (GDIC). The assumption of GDIC is that all participants' datasets have been generated by the same observer model, and all subjects contribute equally to the evidence of each model.</p>
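<p>Given the saved samples, Eq. 23 reduces to a few lines (our own sketch; the trim fraction is illustrative, and loglikefun is a stand-in for the model's log likelihood function of Eq. 22):</p>
<preformat>
function score = dicscore(dev, thetas, loglikefun)
% DIC (Eq. 23) from MCMC output. dev: S-by-1 vector of deviances
% D(theta_s) = -2*log Pr(data|theta_s); thetas: S-by-P parameter samples.
thetabar = trimmean(thetas, 20);      % trimmed mean, 10% from each side
devbar = -2*loglikefun(thetabar);     % deviance at the central estimate
pD = mean(dev) - devbar;              % effective number of parameters
score = devbar + 2*pD;                % equals 2*mean(dev) - devbar
end
</preformat>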
<p>In the main text, instead, we compared models according to a hierarchical Bayesian model selection method (BMS)
<xref rid="pcbi.1003661-Stephan1" ref-type="bibr">[31]</xref>
that treats both subjects and models as random factors, that is, multiple observer models may be present in the population. BMS uses an iterative algorithm based on variational inference to compute model evidence from individual subjects' marginal likelihoods (or approximations thereof, such as DIC, with the marginal likelihood being
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e545.jpg"></inline-graphic>
</inline-formula>
DIC). BMS is particularly appealing because it naturally deals with group heterogeneity and outliers. Moreover, the output of the algorithm has an immediate interpretation as the probability that a given model is responsible for generating the data of a randomly chosen subject. BMS also allows us to easily compute the cumulative evidence for groups of models, and we used this feature to compare distinct levels within factors
<xref rid="pcbi.1003661-Stephan1" ref-type="bibr">[31]</xref>
. As a Bayesian metric of significance we report the exceedance probability
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e546.jpg"></inline-graphic>
</inline-formula>
of a model (or model level within a factor) being more likely than any other model (or level). We consider values of
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e547.jpg"></inline-graphic>
</inline-formula>
to be significant. The BMS algorithm is typically initialized with a symmetric Dirichlet distribution that represents a prior over model probabilities with no preference for any specific model
<xref rid="pcbi.1003661-Stephan1" ref-type="bibr">[31]</xref>
. Since we are comparing a large number of models generated by the factorial method, we chose for the concentration parameter of the Dirichlet distribution a value
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e548.jpg"></inline-graphic>
</inline-formula>
that corresponds to a weak prior belief that only a few observer models are actually present in the population (
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e549.jpg"></inline-graphic>
</inline-formula>
would correspond to the prior belief that only one model is true, similarly to GDIC, and
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e550.jpg"></inline-graphic>
</inline-formula>
that any number of models are true). Results are qualitatively independent of the specific choice of
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e551.jpg"></inline-graphic>
</inline-formula>
for a large range of values.</p>
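<p>A compact sketch of the variational update at the core of BMS
<xref rid="pcbi.1003661-Stephan1" ref-type="bibr">[31]</xref>
(our own minimal version; exceedance probabilities can then be estimated by Monte Carlo sampling from the resulting Dirichlet distribution):</p>
<preformat>
function [alpha, r] = bms(lme, alpha0)
% Group-level Bayesian model selection by variational inference.
% lme: N-by-K matrix of subjects' log model evidences (e.g. -DIC/2);
% alpha0: concentration of the symmetric Dirichlet prior over models.
[N, K] = size(lme);
alpha = alpha0*ones(1, K);
for iter = 1:1000
    % Per-subject posterior probability of each model given current alpha
    logu = lme + repmat(psi(alpha) - psi(sum(alpha)), N, 1);
    u = exp(logu - repmat(max(logu, [], 2), 1, K));   % avoid overflow
    u = u./repmat(sum(u, 2), 1, K);
    alphanew = alpha0 + sum(u, 1);                    % Dirichlet update
    if 1e-10 >= max(abs(alphanew - alpha)), alpha = alphanew; break; end
    alpha = alphanew;
end
r = alpha/sum(alpha);     % expected posterior model probabilities
end
</preformat>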
<p>When looking at alternative models of decision making in our second factorial model comparison, we excluded from the analysis ‘uninteresting’ trials in which the theoretical posterior distribution (Eq. 2 with the true values of
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e552.jpg"></inline-graphic>
</inline-formula>
and
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e553.jpg"></inline-graphic>
</inline-formula>
) was too close in shape to a Gaussian; since predictions of these models are identical for Gaussian posteriors, Gaussian trials constitute only a confound for the model comparison. A posterior distribution was considered ‘too close’ to a Gaussian if the Kullback-Leibler divergence between a Gaussian approximation with matching low-order moments and the full posterior was less than a threshold value of 0.02 nats (results were qualitatively independent of the chosen threshold). In general, this preprocessing step removed about 45–60% of trials from unimodal and bimodal sessions (clearly, Gaussian sessions were automatically excluded).</p>
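<p>A grid-based version of this screening criterion (the example posterior is made up; the direction of the divergence is our choice, here KL from the posterior to its Gaussian approximation):</p>
<preformat>
% Screen for near-Gaussian posteriors (threshold: 0.02 nats).
x = linspace(0, 1, 4001); dx = x(2) - x(1);
p = 0.7*normpdf(x, 0.45, 0.04) + 0.3*normpdf(x, 0.60, 0.04);  % example
p = p/(sum(p)*dx);                              % normalize on the grid
m = sum(x.*p)*dx;                               % matching mean...
v = sum((x - m).^2.*p)*dx;                      % ...and variance
g = normpdf(x, m, sqrt(v));                     % Gaussian approximation
kl = sum(p.*log((p + eps)./(g + eps)))*dx;      % KL divergence in nats
excludeTrial = (0.02 > kl);                     % too close to a Gaussian
</preformat>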
</sec>
<sec id="s4e6">
<title>Nonparametric reconstruction of the priors</title>
<p>We reconstructed the group priors as a means to visualize the subjects’ common systematic biases under a specific observer model (SPK-L). Each group prior
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e554.jpg"></inline-graphic>
</inline-formula>
was ‘nonparametrically’ represented by a mixture of Gaussians with a large number of components (
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e555.jpg"></inline-graphic>
</inline-formula>
). The components' means were equally spaced on a grid that spanned the range of the discrete representation of the prior; SDs were equal to the grid spacing. The mixing weights
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e556.jpg"></inline-graphic>
</inline-formula>
were free to vary to define the shape of the prior (we enforced symmetric values for symmetric distributions and constrained the weights to sum to one). The representation of the prior as a mixture of Gaussians allowed us to cover a large class of smooth distributions using the same framework as the rest of our study.</p>
<p>For this analysis we fixed subjects' parameters to the values inferred in our main model comparison for model SPK-L (i.e. to the robust means of the posterior of the parameters). For each prior in each group (Gaussian, unimodal and bimodal test sessions), we simultaneously inferred the shape of the nonparametric prior that explained each subject's dataset, assuming the same distribution
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e557.jpg"></inline-graphic>
</inline-formula>
for all subjects. Specifically, we sampled from the posterior distribution of the parameters of the group priors, Pr(
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e558.jpg"></inline-graphic>
</inline-formula>
data), with a flat prior over log values of the mixing weights
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e559.jpg"></inline-graphic>
</inline-formula>
. We ran 5 parallel chains with a burn-in of
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e560.jpg"></inline-graphic>
</inline-formula>
samples and
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e561.jpg"></inline-graphic>
</inline-formula>
samples per chain, for a total of
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e562.jpg"></inline-graphic>
</inline-formula>
sampled vectors of mixing weights (see previous section for details on sampling). Each sampled vector of mixing weights corresponds to a prior
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e563.jpg"></inline-graphic>
</inline-formula>
, for
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e564.jpg"></inline-graphic>
</inline-formula>
. Purple lines in
<xref ref-type="fig" rid="pcbi-1003661-g012">Figure 12</xref>
show the mean (
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e565.jpg"></inline-graphic>
</inline-formula>
1 SD) of the sampled priors, that is the average reconstructed priors (smoothed with a small Gaussian kernel for visualization purposes). For each sampled prior we also computed the first four central moments (mean, variance, skewness and kurtosis) and calculated the posterior average of the moments (see
<xref ref-type="fig" rid="pcbi-1003661-g012">Figure 12</xref>
).</p>
</sec>
<sec id="s4e7">
<title>Statistical analyses</title>
<p>All regressions in our analyses used a robust procedure, computed using Tukey's ‘bisquare’ weighting function (robustfit in MATLAB). Robust means were computed as trimmed means, discarding
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e566.jpg"></inline-graphic>
</inline-formula>
of values from each side of the sample. Statistical differences were assessed using repeated-measures ANOVA (rm-ANOVA) with Greenhouse-Geisser correction of the degrees of freedom in order to account for deviations from sphericity
<xref rid="pcbi.1003661-Greenhouse1" ref-type="bibr">[60]</xref>
. A logit transform was applied to the optimality index measure before performing rm-ANOVA in order to improve the normality of the data (results were qualitatively similar for non-transformed data). Nonlinear kernel regression estimates used to visualize mean data (
<xref ref-type="fig" rid="pcbi-1003661-g003">Figure 3</xref>
and
<xref ref-type="fig" rid="pcbi-1003661-g006">6</xref>
) were computed with a Nadaraya-Watson estimator with rule-of-thumb bandwidth
<xref rid="pcbi.1003661-Hrdle1" ref-type="bibr">[61]</xref>
. For all analyses the criterion for statistical significance was
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e567.jpg"></inline-graphic>
</inline-formula>
.</p>
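<p>A minimal MATLAB sketch of these procedures (x, y, oi, xq and h are illustrative placeholders; the actual trim percentage and bandwidth are those given above and in [61]):</p>
% Minimal sketch of the statistical procedures described above.
% x, y: predictor and response; oi: optimality index in (0,1); all placeholders.
b = robustfit(x, y);                % robust regression; the default weight
                                    % function is Tukey's bisquare
rmean = trimmean(y, 20);            % trimmed mean; 20 = total percent discarded
                                    % (10% per side, a placeholder value)
oi_logit = log(oi ./ (1 - oi));     % logit transform applied before rm-ANOVA
% Nadaraya-Watson kernel regression at query points xq with bandwidth h
% (Gaussian kernel assumed here)
Kh = @(u) exp(-0.5 * (u / h).^2);
yq = arrayfun(@(q) sum(Kh(q - x) .* y) / sum(Kh(q - x)), xq);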
</sec>
</sec>
</sec>
<sec sec-type="supplementary-material" id="s5">
<title>Supporting Information</title>
<supplementary-material content-type="local-data" id="pcbi.1003661.s001">
<label>Dataset S1</label>
<caption>
<p>
<bold>Subjects' datasets.</bold>
Subjects' datasets for the main experiment (
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e568.jpg"></inline-graphic>
</inline-formula>
, training and test sessions) and for the sensorimotor estimation experiment (
<inline-formula>
<inline-graphic xlink:href="pcbi.1003661.e569.jpg"></inline-graphic>
</inline-formula>
), with relevant metadata, in a single MATLAB data file.</p>
<p>(ZIP)</p>
</caption>
<media xlink:href="pcbi.1003661.s001.zip">
<caption>
<p>Click here for additional data file.</p>
</caption>
</media>
</supplementary-material>
<supplementary-material content-type="local-data" id="pcbi.1003661.s002">
<label>Text S1</label>
<caption>
<p>
<bold>Additional analyses and observer models.</bold>
This supporting text includes sections on: Translational invariance of subjects' behavior; Success probability; Inverted Gaussian loss function; Model comparison with DIC; Model comparison for different shared parameters between sessions; Nonstationary analysis.</p>
<p>(PDF)</p>
</caption>
<media xlink:href="pcbi.1003661.s002.pdf">
<caption>
<p>Click here for additional data file.</p>
</caption>
</media>
</supplementary-material>
<supplementary-material content-type="local-data" id="pcbi.1003661.s003">
<label>Text S2</label>
<caption>
<p>
<bold>Noisy probabilistic inference.</bold>
Description of the models of stochastic probabilistic inference (‘noisy posterior’ and ‘sample-based posterior’) and discussion about unstructured noise in the prior.</p>
<p>(PDF)</p>
</caption>
<media xlink:href="pcbi.1003661.s003.pdf">
<caption>
<p>Click here for additional data file.</p>
</caption>
</media>
</supplementary-material>
<supplementary-material content-type="local-data" id="pcbi.1003661.s004">
<label>Text S3</label>
<caption>
<p>
<bold>Sensorimotor estimation experiment.</bold>
<xref ref-type="sec" rid="s4">Methods</xref>
and results of the additional experiment to estimate the range of subjects' sensorimotor parameters.</p>
<p>(PDF)</p>
</caption>
<media xlink:href="pcbi.1003661.s004.pdf">
<caption>
<p>Click here for additional data file.</p>
</caption>
</media>
</supplementary-material>
</sec>
</body>
<back>
<ack>
<p>We thank Sophie Denève, Jan Drugowitsch, Megan A. K. Peters, Paul R. Schrater and Angela J. Yu for useful discussions, Sae Franklin for assistance with the experiments and James Ingram for technical assistance. We also thank the editor and two anonymous reviewers for helpful feedback.</p>
</ack>
<ref-list>
<title>References</title>
<ref id="pcbi.1003661-Weiss1">
<label>1</label>
<mixed-citation publication-type="journal">
<name>
<surname>Weiss</surname>
<given-names>Y</given-names>
</name>
,
<name>
<surname>Simoncelli</surname>
<given-names>EP</given-names>
</name>
,
<name>
<surname>Adelson</surname>
<given-names>EH</given-names>
</name>
(
<year>2002</year>
)
<article-title>Motion illusions as optimal percepts</article-title>
.
<source>Nat Neurosci</source>
<volume>5</volume>
:
<fpage>598</fpage>
<lpage>604</lpage>
.
<pub-id pub-id-type="pmid">12021763</pub-id>
</mixed-citation>
</ref>
<ref id="pcbi.1003661-Stocker1">
<label>2</label>
<mixed-citation publication-type="journal">
<name>
<surname>Stocker</surname>
<given-names>AA</given-names>
</name>
,
<name>
<surname>Simoncelli</surname>
<given-names>EP</given-names>
</name>
(
<year>2006</year>
)
<article-title>Noise characteristics and prior expectations in human visual speed perception</article-title>
.
<source>Nat Neurosci</source>
<volume>9</volume>
:
<fpage>578</fpage>
<lpage>585</lpage>
.
<pub-id pub-id-type="pmid">16547513</pub-id>
</mixed-citation>
</ref>
<ref id="pcbi.1003661-Girshick1">
<label>3</label>
<mixed-citation publication-type="journal">
<name>
<surname>Girshick</surname>
<given-names>A</given-names>
</name>
,
<name>
<surname>Landy</surname>
<given-names>M</given-names>
</name>
,
<name>
<surname>Simoncelli</surname>
<given-names>E</given-names>
</name>
(
<year>2011</year>
)
<article-title>Cardinal rules: visual orientation perception reflects knowledge of environmental statistics</article-title>
.
<source>Nat Neurosci</source>
<volume>14</volume>
:
<fpage>926</fpage>
<lpage>932</lpage>
.
<pub-id pub-id-type="pmid">21642976</pub-id>
</mixed-citation>
</ref>
<ref id="pcbi.1003661-Chalk1">
<label>4</label>
<mixed-citation publication-type="journal">
<name>
<surname>Chalk</surname>
<given-names>M</given-names>
</name>
,
<name>
<surname>Seitz</surname>
<given-names>A</given-names>
</name>
,
<name>
<surname>Seriès</surname>
<given-names>P</given-names>
</name>
(
<year>2010</year>
)
<article-title>Rapidly learned stimulus expectations alter perception of motion</article-title>
.
<source>J Vis</source>
<volume>10</volume>
:
<fpage>1</fpage>
<lpage>18</lpage>
.</mixed-citation>
</ref>
<ref id="pcbi.1003661-Miyazaki1">
<label>5</label>
<mixed-citation publication-type="journal">
<name>
<surname>Miyazaki</surname>
<given-names>M</given-names>
</name>
,
<name>
<surname>Nozaki</surname>
<given-names>D</given-names>
</name>
,
<name>
<surname>Nakajima</surname>
<given-names>Y</given-names>
</name>
(
<year>2005</year>
)
<article-title>Testing bayesian models of human coincidence timing</article-title>
.
<source>J Neurophysiol</source>
<volume>94</volume>
:
<fpage>395</fpage>
<lpage>399</lpage>
.
<pub-id pub-id-type="pmid">15716368</pub-id>
</mixed-citation>
</ref>
<ref id="pcbi.1003661-Jazayeri1">
<label>6</label>
<mixed-citation publication-type="journal">
<name>
<surname>Jazayeri</surname>
<given-names>M</given-names>
</name>
,
<name>
<surname>Shadlen</surname>
<given-names>MN</given-names>
</name>
(
<year>2010</year>
)
<article-title>Temporal context calibrates interval timing</article-title>
.
<source>Nat Neurosci</source>
<volume>13</volume>
:
<fpage>1020</fpage>
<lpage>1026</lpage>
.
<pub-id pub-id-type="pmid">20581842</pub-id>
</mixed-citation>
</ref>
<ref id="pcbi.1003661-Ahrens1">
<label>7</label>
<mixed-citation publication-type="journal">
<name>
<surname>Ahrens</surname>
<given-names>MB</given-names>
</name>
,
<name>
<surname>Sahani</surname>
<given-names>M</given-names>
</name>
(
<year>2011</year>
)
<article-title>Observers exploit stochastic models of sensory change to help judge the passage of time</article-title>
.
<source>Curr Biol</source>
<volume>21</volume>
:
<fpage>200</fpage>
<lpage>206</lpage>
.
<pub-id pub-id-type="pmid">21256018</pub-id>
</mixed-citation>
</ref>
<ref id="pcbi.1003661-Acerbi1">
<label>8</label>
<mixed-citation publication-type="journal">
<name>
<surname>Acerbi</surname>
<given-names>L</given-names>
</name>
,
<name>
<surname>Wolpert</surname>
<given-names>DM</given-names>
</name>
,
<name>
<surname>Vijayakumar</surname>
<given-names>S</given-names>
</name>
(
<year>2012</year>
)
<article-title>Internal representations of temporal statistics and feedback calibrate motor-sensory interval timing</article-title>
.
<source>PLoS Comput Biol</source>
<volume>8</volume>
:
<fpage>e1002771</fpage>
.
<pub-id pub-id-type="pmid">23209386</pub-id>
</mixed-citation>
</ref>
<ref id="pcbi.1003661-Kording1">
<label>9</label>
<mixed-citation publication-type="journal">
<name>
<surname>Kording</surname>
<given-names>KP</given-names>
</name>
,
<name>
<surname>Wolpert</surname>
<given-names>DM</given-names>
</name>
(
<year>2004</year>
)
<article-title>Bayesian integration in sensorimotor learning</article-title>
.
<source>Nature</source>
<volume>427</volume>
:
<fpage>244</fpage>
<lpage>247</lpage>
.
<pub-id pub-id-type="pmid">14724638</pub-id>
</mixed-citation>
</ref>
<ref id="pcbi.1003661-Tassinari1">
<label>10</label>
<mixed-citation publication-type="journal">
<name>
<surname>Tassinari</surname>
<given-names>H</given-names>
</name>
,
<name>
<surname>Hudson</surname>
<given-names>T</given-names>
</name>
,
<name>
<surname>Landy</surname>
<given-names>M</given-names>
</name>
(
<year>2006</year>
)
<article-title>Combining priors and noisy visual cues in a rapid pointing task</article-title>
.
<source>J Neurosci</source>
<volume>26</volume>
:
<fpage>10154</fpage>
<lpage>10163</lpage>
.
<pub-id pub-id-type="pmid">17021171</pub-id>
</mixed-citation>
</ref>
<ref id="pcbi.1003661-Berniker1">
<label>11</label>
<mixed-citation publication-type="journal">
<name>
<surname>Berniker</surname>
<given-names>M</given-names>
</name>
,
<name>
<surname>Voss</surname>
<given-names>M</given-names>
</name>
,
<name>
<surname>Kording</surname>
<given-names>K</given-names>
</name>
(
<year>2010</year>
)
<article-title>Learning priors for bayesian computations in the nervous system</article-title>
.
<source>PLoS One</source>
<volume>5</volume>
:
<fpage>e12686</fpage>
.
<pub-id pub-id-type="pmid">20844766</pub-id>
</mixed-citation>
</ref>
<ref id="pcbi.1003661-Adams1">
<label>12</label>
<mixed-citation publication-type="journal">
<name>
<surname>Adams</surname>
<given-names>WJ</given-names>
</name>
,
<name>
<surname>Graf</surname>
<given-names>EW</given-names>
</name>
,
<name>
<surname>Ernst</surname>
<given-names>MO</given-names>
</name>
(
<year>2004</year>
)
<article-title>Experience can change the ‘light-from-above’ prior</article-title>
.
<source>Nat Neurosci</source>
<volume>7</volume>
:
<fpage>1057</fpage>
<lpage>1058</lpage>
.</mixed-citation>
</ref>
<ref id="pcbi.1003661-Sotiropoulos1">
<label>13</label>
<mixed-citation publication-type="journal">
<name>
<surname>Sotiropoulos</surname>
<given-names>G</given-names>
</name>
,
<name>
<surname>Seitz</surname>
<given-names>A</given-names>
</name>
,
<name>
<surname>Seriès</surname>
<given-names>P</given-names>
</name>
(
<year>2011</year>
)
<article-title>Changing expectations about speed alters perceived motion direction</article-title>
.
<source>Curr Biol</source>
<volume>21</volume>
:
<fpage>R883</fpage>
<lpage>R884</lpage>
.
<pub-id pub-id-type="pmid">22075425</pub-id>
</mixed-citation>
</ref>
<ref id="pcbi.1003661-Kording2">
<label>14</label>
<mixed-citation publication-type="journal">
<name>
<surname>Kording</surname>
<given-names>K</given-names>
</name>
,
<name>
<surname>Wolpert</surname>
<given-names>D</given-names>
</name>
(
<year>2006</year>
)
<article-title>Bayesian decision theory in sensorimotor control</article-title>
.
<source>Trends Cogn Sci</source>
<volume>10</volume>
:
<fpage>319</fpage>
<lpage>326</lpage>
.
<pub-id pub-id-type="pmid">16807063</pub-id>
</mixed-citation>
</ref>
<ref id="pcbi.1003661-Trommershuser1">
<label>15</label>
<mixed-citation publication-type="journal">
<name>
<surname>Trommershäuser</surname>
<given-names>J</given-names>
</name>
,
<name>
<surname>Maloney</surname>
<given-names>L</given-names>
</name>
,
<name>
<surname>Landy</surname>
<given-names>M</given-names>
</name>
(
<year>2008</year>
)
<article-title>Decision making, movement planning and statistical decision theory</article-title>
.
<source>Trends Cogn Sci</source>
<volume>12</volume>
:
<fpage>291</fpage>
<lpage>297</lpage>
.
<pub-id pub-id-type="pmid">18614390</pub-id>
</mixed-citation>
</ref>
<ref id="pcbi.1003661-Sundareswara1">
<label>16</label>
<mixed-citation publication-type="journal">
<name>
<surname>Sundareswara</surname>
<given-names>R</given-names>
</name>
,
<name>
<surname>Schrater</surname>
<given-names>PR</given-names>
</name>
(
<year>2008</year>
)
<article-title>Perceptual multistability predicted by search model for bayesian decisions</article-title>
.
<source>J Vis</source>
<volume>8</volume>
:
<fpage>1</fpage>
<lpage>19</lpage>
.
<pub-id pub-id-type="pmid">18842083</pub-id>
</mixed-citation>
</ref>
<ref id="pcbi.1003661-Vul1">
<label>17</label>
<mixed-citation publication-type="journal">
<name>
<surname>Vul</surname>
<given-names>E</given-names>
</name>
,
<name>
<surname>Pashler</surname>
<given-names>H</given-names>
</name>
(
<year>2008</year>
)
<article-title>Measuring the crowd within: Probabilistic representations within individuals</article-title>
.
<source>Psychol Sci</source>
<volume>19</volume>
:
<fpage>645</fpage>
<lpage>647</lpage>
.
<pub-id pub-id-type="pmid">18727777</pub-id>
</mixed-citation>
</ref>
<ref id="pcbi.1003661-Vul2">
<label>18</label>
<mixed-citation publication-type="journal">Vul E, Goodman ND, Griffiths TL, Tenenbaum JB (2009) One and done? optimal decisions from very few samples. In: Proceedings of the 31st annual conference of the cognitive science society.
<volume>volume 1</volume>
, pp. 66–72.</mixed-citation>
</ref>
<ref id="pcbi.1003661-Fiser1">
<label>19</label>
<mixed-citation publication-type="journal">
<name>
<surname>Fiser</surname>
<given-names>J</given-names>
</name>
,
<name>
<surname>Berkes</surname>
<given-names>P</given-names>
</name>
,
<name>
<surname>Orbán</surname>
<given-names>G</given-names>
</name>
,
<name>
<surname>Lengyel</surname>
<given-names>M</given-names>
</name>
(
<year>2010</year>
)
<article-title>Statistically optimal perception and learning: from behavior to neural representations</article-title>
.
<source>Trends Cogn Sci</source>
<volume>14</volume>
:
<fpage>119</fpage>
<lpage>130</lpage>
.
<pub-id pub-id-type="pmid">20153683</pub-id>
</mixed-citation>
</ref>
<ref id="pcbi.1003661-Gekas1">
<label>20</label>
<mixed-citation publication-type="journal">
<name>
<surname>Gekas</surname>
<given-names>N</given-names>
</name>
,
<name>
<surname>Chalk</surname>
<given-names>M</given-names>
</name>
,
<name>
<surname>Seitz</surname>
<given-names>AR</given-names>
</name>
,
<name>
<surname>Seriès</surname>
<given-names>P</given-names>
</name>
(
<year>2013</year>
)
<article-title>Complexity and specificity of experimentally-induced expectations in motion perception</article-title>
.
<source>J Vis</source>
<volume>13</volume>
:
<fpage>1</fpage>
<lpage>18</lpage>
.</mixed-citation>
</ref>
<ref id="pcbi.1003661-vandenBerg1">
<label>21</label>
<mixed-citation publication-type="journal">
<name>
<surname>van den Berg</surname>
<given-names>R</given-names>
</name>
,
<name>
<surname>Awh</surname>
<given-names>E</given-names>
</name>
,
<name>
<surname>Ma</surname>
<given-names>WJ</given-names>
</name>
(
<year>2014</year>
)
<article-title>Factorial comparison of working memory models</article-title>
.
<source>Psychol Rev</source>
<volume>121</volume>
:
<fpage>124</fpage>
<lpage>149</lpage>
.
<pub-id pub-id-type="pmid">24490791</pub-id>
</mixed-citation>
</ref>
<ref id="pcbi.1003661-Krding1">
<label>22</label>
<mixed-citation publication-type="journal">
<name>
<surname>Körding</surname>
<given-names>KP</given-names>
</name>
,
<name>
<surname>Wolpert</surname>
<given-names>DM</given-names>
</name>
(
<year>2004</year>
)
<article-title>The loss function of sensorimotor learning</article-title>
.
<source>Proc Natl Acad Sci U S A</source>
<volume>101</volume>
:
<fpage>9839</fpage>
<lpage>9842</lpage>
.
<pub-id pub-id-type="pmid">15210973</pub-id>
</mixed-citation>
</ref>
<ref id="pcbi.1003661-Hudson1">
<label>23</label>
<mixed-citation publication-type="journal">
<name>
<surname>Hudson</surname>
<given-names>TE</given-names>
</name>
,
<name>
<surname>Maloney</surname>
<given-names>LT</given-names>
</name>
,
<name>
<surname>Landy</surname>
<given-names>MS</given-names>
</name>
(
<year>2007</year>
)
<article-title>Movement planning with probabilistic target information</article-title>
.
<source>J Neurophysiol</source>
<volume>98</volume>
:
<fpage>3034</fpage>
<lpage>3046</lpage>
.
<pub-id pub-id-type="pmid">17898140</pub-id>
</mixed-citation>
</ref>
<ref id="pcbi.1003661-Campbell1">
<label>24</label>
<mixed-citation publication-type="journal">
<name>
<surname>Campbell</surname>
<given-names>L</given-names>
</name>
(
<year>1966</year>
)
<article-title>Exponential entropy as a measure of extent of a distribution</article-title>
.
<source>Probab Theory Rel</source>
<volume>5</volume>
:
<fpage>217</fpage>
<lpage>225</lpage>
.</mixed-citation>
</ref>
<ref id="pcbi.1003661-Mamassian1">
<label>25</label>
<mixed-citation publication-type="other">Mamassian P, Landy MS, Maloney LT (2002) Bayesian modelling of visual perception. In: Rao R, Olshausen B, Lewicki M, editors, Probabilistic models of the brain: Perception and neural function, MIT Press. pp. 13–36.</mixed-citation>
</ref>
<ref id="pcbi.1003661-Wozny1">
<label>26</label>
<mixed-citation publication-type="journal">
<name>
<surname>Wozny</surname>
<given-names>DR</given-names>
</name>
,
<name>
<surname>Beierholm</surname>
<given-names>UR</given-names>
</name>
,
<name>
<surname>Shams</surname>
<given-names>L</given-names>
</name>
(
<year>2010</year>
)
<article-title>Probability matching as a computational strategy used in perception</article-title>
.
<source>PLoS Comput Biol</source>
<volume>6</volume>
:
<fpage>e1000871</fpage>
.
<pub-id pub-id-type="pmid">20700493</pub-id>
</mixed-citation>
</ref>
<ref id="pcbi.1003661-Zhang1">
<label>27</label>
<mixed-citation publication-type="journal">
<name>
<surname>Zhang</surname>
<given-names>H</given-names>
</name>
,
<name>
<surname>Maloney</surname>
<given-names>L</given-names>
</name>
(
<year>2012</year>
)
<article-title>Ubiquitous log odds: a common representation of probability and frequency distortion in perception, action, and cognition</article-title>
.
<source>Front Neurosci</source>
<volume>6</volume>
:
<fpage>1</fpage>
<lpage>14</lpage>
.
<pub-id pub-id-type="pmid">22294978</pub-id>
</mixed-citation>
</ref>
<ref id="pcbi.1003661-Wichmann1">
<label>28</label>
<mixed-citation publication-type="journal">
<name>
<surname>Wichmann</surname>
<given-names>FA</given-names>
</name>
,
<name>
<surname>Hill</surname>
<given-names>NJ</given-names>
</name>
(
<year>2001</year>
)
<article-title>The psychometric function: I. fitting, sampling, and goodness of fit</article-title>
.
<source>Percept Psychophys</source>
<volume>63</volume>
:
<fpage>1293</fpage>
<lpage>1313</lpage>
.
<pub-id pub-id-type="pmid">11800458</pub-id>
</mixed-citation>
</ref>
<ref id="pcbi.1003661-Neal1">
<label>29</label>
<mixed-citation publication-type="journal">
<name>
<surname>Neal</surname>
<given-names>R</given-names>
</name>
(
<year>2003</year>
)
<article-title>Slice sampling</article-title>
.
<source>Ann Stat</source>
<volume>31</volume>
:
<fpage>705</fpage>
<lpage>741</lpage>
.</mixed-citation>
</ref>
<ref id="pcbi.1003661-Spiegelhalter1">
<label>30</label>
<mixed-citation publication-type="journal">
<name>
<surname>Spiegelhalter</surname>
<given-names>DJ</given-names>
</name>
,
<name>
<surname>Best</surname>
<given-names>NG</given-names>
</name>
,
<name>
<surname>Carlin</surname>
<given-names>BP</given-names>
</name>
,
<name>
<surname>Van Der Linde</surname>
<given-names>A</given-names>
</name>
(
<year>2002</year>
)
<article-title>Bayesian measures of model complexity and fit</article-title>
.
<source>J R Stat Soc B</source>
<volume>64</volume>
:
<fpage>583</fpage>
<lpage>639</lpage>
.</mixed-citation>
</ref>
<ref id="pcbi.1003661-Stephan1">
<label>31</label>
<mixed-citation publication-type="journal">
<name>
<surname>Stephan</surname>
<given-names>KE</given-names>
</name>
,
<name>
<surname>Penny</surname>
<given-names>WD</given-names>
</name>
,
<name>
<surname>Daunizeau</surname>
<given-names>J</given-names>
</name>
,
<name>
<surname>Moran</surname>
<given-names>RJ</given-names>
</name>
,
<name>
<surname>Friston</surname>
<given-names>KJ</given-names>
</name>
(
<year>2009</year>
)
<article-title>Bayesian model selection for group studies</article-title>
.
<source>Neuroimage</source>
<volume>46</volume>
:
<fpage>1004</fpage>
<lpage>1017</lpage>
.
<pub-id pub-id-type="pmid">19306932</pub-id>
</mixed-citation>
</ref>
<ref id="pcbi.1003661-Kass1">
<label>32</label>
<mixed-citation publication-type="journal">
<name>
<surname>Kass</surname>
<given-names>RE</given-names>
</name>
,
<name>
<surname>Raftery</surname>
<given-names>AE</given-names>
</name>
(
<year>1995</year>
)
<article-title>Bayes factors</article-title>
.
<source>J Am Stat Assoc</source>
<volume>90</volume>
:
<fpage>773</fpage>
<lpage>795</lpage>
.</mixed-citation>
</ref>
<ref id="pcbi.1003661-Battaglia1">
<label>33</label>
<mixed-citation publication-type="journal">
<name>
<surname>Battaglia</surname>
<given-names>PW</given-names>
</name>
,
<name>
<surname>Kersten</surname>
<given-names>D</given-names>
</name>
,
<name>
<surname>Schrater</surname>
<given-names>PR</given-names>
</name>
(
<year>2011</year>
)
<article-title>How haptic size sensations improve distance perception</article-title>
.
<source>PLoS Comput Biol</source>
<volume>7</volume>
:
<fpage>e1002080</fpage>
.
<pub-id pub-id-type="pmid">21738457</pub-id>
</mixed-citation>
</ref>
<ref id="pcbi.1003661-MacKay1">
<label>34</label>
<mixed-citation publication-type="other">MacKay DJ (2003) Information theory, inference and learning algorithms. Cambridge University Press.</mixed-citation>
</ref>
<ref id="pcbi.1003661-Battaglia2">
<label>35</label>
<mixed-citation publication-type="journal">
<name>
<surname>Battaglia</surname>
<given-names>PW</given-names>
</name>
,
<name>
<surname>Hamrick</surname>
<given-names>JB</given-names>
</name>
,
<name>
<surname>Tenenbaum</surname>
<given-names>JB</given-names>
</name>
(
<year>2013</year>
)
<article-title>Simulation as an engine of physical scene understanding</article-title>
.
<source>Proc Natl Acad Sci U S A</source>
<volume>110</volume>
:
<fpage>18327</fpage>
<lpage>18332</lpage>
.
<pub-id pub-id-type="pmid">24145417</pub-id>
</mixed-citation>
</ref>
<ref id="pcbi.1003661-Dakin1">
<label>36</label>
<mixed-citation publication-type="journal">
<name>
<surname>Dakin</surname>
<given-names>SC</given-names>
</name>
,
<name>
<surname>Tibber</surname>
<given-names>MS</given-names>
</name>
,
<name>
<surname>Greenwood</surname>
<given-names>JA</given-names>
</name>
,
<name>
<surname>Morgan</surname>
<given-names>MJ</given-names>
</name>
,
<etal>et al</etal>
(
<year>2011</year>
)
<article-title>A common visual metric for approximate number and density</article-title>
.
<source>Proc Natl Acad Sci U S A</source>
<volume>108</volume>
:
<fpage>19552</fpage>
<lpage>19557</lpage>
.
<pub-id pub-id-type="pmid">22106276</pub-id>
</mixed-citation>
</ref>
<ref id="pcbi.1003661-Kuss1">
<label>37</label>
<mixed-citation publication-type="journal">
<name>
<surname>Kuss</surname>
<given-names>M</given-names>
</name>
,
<name>
<surname>Jäkel</surname>
<given-names>F</given-names>
</name>
,
<name>
<surname>Wichmann</surname>
<given-names>FA</given-names>
</name>
(
<year>2005</year>
)
<article-title>Bayesian inference for psychometric functions</article-title>
.
<source>J Vis</source>
<volume>5</volume>
:
<fpage>478</fpage>
<lpage>492</lpage>
.
<pub-id pub-id-type="pmid">16097878</pub-id>
</mixed-citation>
</ref>
<ref id="pcbi.1003661-Kahneman1">
<label>38</label>
<mixed-citation publication-type="journal">
<name>
<surname>Kahneman</surname>
<given-names>D</given-names>
</name>
,
<name>
<surname>Tversky</surname>
<given-names>A</given-names>
</name>
(
<year>1979</year>
)
<article-title>Prospect theory: An analysis of decision under risk</article-title>
.
<source>Econometrica</source>
<volume>47</volume>
:
<fpage>263</fpage>
<lpage>291</lpage>
.</mixed-citation>
</ref>
<ref id="pcbi.1003661-Tversky1">
<label>39</label>
<mixed-citation publication-type="journal">
<name>
<surname>Tversky</surname>
<given-names>A</given-names>
</name>
,
<name>
<surname>Kahneman</surname>
<given-names>D</given-names>
</name>
(
<year>1992</year>
)
<article-title>Advances in prospect theory: Cumulative representation of uncertainty</article-title>
.
<source>J Risk Uncertainty</source>
<volume>5</volume>
:
<fpage>297</fpage>
<lpage>323</lpage>
.</mixed-citation>
</ref>
<ref id="pcbi.1003661-Feldman1">
<label>40</label>
<mixed-citation publication-type="journal">
<name>
<surname>Feldman</surname>
<given-names>J</given-names>
</name>
(
<year>2013</year>
)
<article-title>Tuning your priors to the world</article-title>
.
<source>Top Cogn Sci</source>
<volume>5</volume>
:
<fpage>13</fpage>
<lpage>34</lpage>
.
<pub-id pub-id-type="pmid">23335572</pub-id>
</mixed-citation>
</ref>
<ref id="pcbi.1003661-Mamassian2">
<label>41</label>
<mixed-citation publication-type="journal">
<name>
<surname>Mamassian</surname>
<given-names>P</given-names>
</name>
(
<year>2008</year>
)
<article-title>Overconfidence in an objective anticipatory motor task</article-title>
.
<source>Psychol Sci</source>
<volume>19</volume>
:
<fpage>601</fpage>
<lpage>606</lpage>
.
<pub-id pub-id-type="pmid">18578851</pub-id>
</mixed-citation>
</ref>
<ref id="pcbi.1003661-Zhang2">
<label>42</label>
<mixed-citation publication-type="journal">
<name>
<surname>Zhang</surname>
<given-names>H</given-names>
</name>
,
<name>
<surname>Morvan</surname>
<given-names>C</given-names>
</name>
,
<name>
<surname>Maloney</surname>
<given-names>LT</given-names>
</name>
(
<year>2010</year>
)
<article-title>Gambling in the visual periphery: a conjoint-measurement analysis of human ability to judge visual uncertainty</article-title>
.
<source>PLoS Comput Biol</source>
<volume>6</volume>
:
<fpage>e1001023</fpage>
.
<pub-id pub-id-type="pmid">21152007</pub-id>
</mixed-citation>
</ref>
<ref id="pcbi.1003661-Zhang3">
<label>43</label>
<mixed-citation publication-type="journal">
<name>
<surname>Zhang</surname>
<given-names>H</given-names>
</name>
,
<name>
<surname>Daw</surname>
<given-names>ND</given-names>
</name>
,
<name>
<surname>Maloney</surname>
<given-names>LT</given-names>
</name>
(
<year>2013</year>
)
<article-title>Testing whether humans have an accurate model of their own motor uncertainty in a speeded reaching task</article-title>
.
<source>PLoS Comput Biol</source>
<volume>9</volume>
:
<fpage>e1003080</fpage>
.
<pub-id pub-id-type="pmid">23717198</pub-id>
</mixed-citation>
</ref>
<ref id="pcbi.1003661-Trommershuser2">
<label>44</label>
<mixed-citation publication-type="journal">
<name>
<surname>Trommershäuser</surname>
<given-names>J</given-names>
</name>
,
<name>
<surname>Gepshtein</surname>
<given-names>S</given-names>
</name>
,
<name>
<surname>Maloney</surname>
<given-names>LT</given-names>
</name>
,
<name>
<surname>Landy</surname>
<given-names>MS</given-names>
</name>
,
<name>
<surname>Banks</surname>
<given-names>MS</given-names>
</name>
(
<year>2005</year>
)
<article-title>Optimal compensation for changes in task-relevant movement variability</article-title>
.
<source>J Neurosci</source>
<volume>25</volume>
:
<fpage>7169</fpage>
<lpage>7178</lpage>
.
<pub-id pub-id-type="pmid">16079399</pub-id>
</mixed-citation>
</ref>
<ref id="pcbi.1003661-Gepshtein1">
<label>45</label>
<mixed-citation publication-type="journal">
<name>
<surname>Gepshtein</surname>
<given-names>S</given-names>
</name>
,
<name>
<surname>Seydell</surname>
<given-names>A</given-names>
</name>
,
<name>
<surname>Trommershäuser</surname>
<given-names>J</given-names>
</name>
(
<year>2007</year>
)
<article-title>Optimality of human movement under natural variations of visual-motor uncertainty</article-title>
.
<source>J Vis</source>
<volume>7</volume>
:
<fpage>1</fpage>
<lpage>18</lpage>
.
<pub-id pub-id-type="pmid">18217853</pub-id>
</mixed-citation>
</ref>
<ref id="pcbi.1003661-Drugowitsch1">
<label>46</label>
<mixed-citation publication-type="other">Drugowitsch J, Wyarta V, Koechlin E (2014). The origin and structure of behavioral variability in perceptual decision-making. Cosyne Abstracts 2014, Salt Lake City USA.</mixed-citation>
</ref>
<ref id="pcbi.1003661-Faisal1">
<label>47</label>
<mixed-citation publication-type="journal">
<name>
<surname>Faisal</surname>
<given-names>AA</given-names>
</name>
,
<name>
<surname>Selen</surname>
<given-names>LP</given-names>
</name>
,
<name>
<surname>Wolpert</surname>
<given-names>DM</given-names>
</name>
(
<year>2008</year>
)
<article-title>Noise in the nervous system</article-title>
.
<source>Nat Rev Neurosci</source>
<volume>9</volume>
:
<fpage>292</fpage>
<lpage>303</lpage>
.
<pub-id pub-id-type="pmid">18319728</pub-id>
</mixed-citation>
</ref>
<ref id="pcbi.1003661-Ma1">
<label>48</label>
<mixed-citation publication-type="journal">
<name>
<surname>Ma</surname>
<given-names>WJ</given-names>
</name>
,
<name>
<surname>Beck</surname>
<given-names>JM</given-names>
</name>
,
<name>
<surname>Latham</surname>
<given-names>PE</given-names>
</name>
,
<name>
<surname>Pouget</surname>
<given-names>A</given-names>
</name>
(
<year>2006</year>
)
<article-title>Bayesian inference with probabilistic population codes</article-title>
.
<source>Nat Neurosci</source>
<volume>9</volume>
:
<fpage>1432</fpage>
<lpage>1438</lpage>
.
<pub-id pub-id-type="pmid">17057707</pub-id>
</mixed-citation>
</ref>
<ref id="pcbi.1003661-Beck1">
<label>49</label>
<mixed-citation publication-type="journal">
<name>
<surname>Beck</surname>
<given-names>JM</given-names>
</name>
,
<name>
<surname>Ma</surname>
<given-names>WJ</given-names>
</name>
,
<name>
<surname>Pitkow</surname>
<given-names>X</given-names>
</name>
,
<name>
<surname>Latham</surname>
<given-names>PE</given-names>
</name>
,
<name>
<surname>Pouget</surname>
<given-names>A</given-names>
</name>
(
<year>2012</year>
)
<article-title>Not noisy, just wrong: the role of suboptimal inference in behavioral variability</article-title>
.
<source>Neuron</source>
<volume>74</volume>
:
<fpage>30</fpage>
<lpage>39</lpage>
.
<pub-id pub-id-type="pmid">22500627</pub-id>
</mixed-citation>
</ref>
<ref id="pcbi.1003661-Sutton1">
<label>50</label>
<mixed-citation publication-type="other">Sutton RS, Barto AG (1998) Reinforcement learning: An introduction. MIT press.</mixed-citation>
</ref>
<ref id="pcbi.1003661-Gaissmaier1">
<label>51</label>
<mixed-citation publication-type="journal">
<name>
<surname>Gaissmaier</surname>
<given-names>W</given-names>
</name>
,
<name>
<surname>Schooler</surname>
<given-names>LJ</given-names>
</name>
(
<year>2008</year>
)
<article-title>The smart potential behind probability matching</article-title>
.
<source>Cognition</source>
<volume>109</volume>
:
<fpage>416</fpage>
<lpage>422</lpage>
.
<pub-id pub-id-type="pmid">19019351</pub-id>
</mixed-citation>
</ref>
<ref id="pcbi.1003661-Green1">
<label>52</label>
<mixed-citation publication-type="journal">
<name>
<surname>Green</surname>
<given-names>C</given-names>
</name>
,
<name>
<surname>Benson</surname>
<given-names>C</given-names>
</name>
,
<name>
<surname>Kersten</surname>
<given-names>D</given-names>
</name>
,
<name>
<surname>Schrater</surname>
<given-names>P</given-names>
</name>
(
<year>2010</year>
)
<article-title>Alterations in choice behavior by manipulations of world model</article-title>
.
<source>Proc Natl Acad Sci U S A</source>
<volume>107</volume>
:
<fpage>16401</fpage>
<lpage>16406</lpage>
.
<pub-id pub-id-type="pmid">20805507</pub-id>
</mixed-citation>
</ref>
<ref id="pcbi.1003661-Oldfield1">
<label>53</label>
<mixed-citation publication-type="journal">
<name>
<surname>Oldfield</surname>
<given-names>RC</given-names>
</name>
(
<year>1971</year>
)
<article-title>The assessment and analysis of handedness: the Edinburgh inventory</article-title>
.
<source>Neuropsychologia</source>
<volume>9</volume>
:
<fpage>97</fpage>
<lpage>113</lpage>
.
<pub-id pub-id-type="pmid">5146491</pub-id>
</mixed-citation>
</ref>
<ref id="pcbi.1003661-Howard1">
<label>54</label>
<mixed-citation publication-type="journal">
<name>
<surname>Howard</surname>
<given-names>IS</given-names>
</name>
,
<name>
<surname>Ingram</surname>
<given-names>JN</given-names>
</name>
,
<name>
<surname>Wolpert</surname>
<given-names>DM</given-names>
</name>
(
<year>2009</year>
)
<article-title>A modular planar robotic manipulandum with endpoint torque control</article-title>
.
<source>J Neurosci Methods</source>
<volume>181</volume>
:
<fpage>199</fpage>
<lpage>211</lpage>
.
<pub-id pub-id-type="pmid">19450621</pub-id>
</mixed-citation>
</ref>
<ref id="pcbi.1003661-Teuscher1">
<label>55</label>
<mixed-citation publication-type="journal">
<name>
<surname>Teuscher</surname>
<given-names>F</given-names>
</name>
,
<name>
<surname>Guiard</surname>
<given-names>V</given-names>
</name>
(
<year>1995</year>
)
<article-title>Sharp inequalities between skewness and kurtosis for unimodal distributions</article-title>
.
<source>Stat Probabil Lett</source>
<volume>22</volume>
:
<fpage>257</fpage>
<lpage>260</lpage>
.</mixed-citation>
</ref>
<ref id="pcbi.1003661-CarreiraPerpinan1">
<label>56</label>
<mixed-citation publication-type="journal">
<name>
<surname>Carreira-Perpinan</surname>
<given-names>MA</given-names>
</name>
(
<year>2000</year>
)
<article-title>Mode-finding for mixtures of gaussian distributions</article-title>
.
<source>IEEE T Pattern Anal</source>
<volume>22</volume>
:
<fpage>1318</fpage>
<lpage>1323</lpage>
.</mixed-citation>
</ref>
<ref id="pcbi.1003661-Press1">
<label>57</label>
<mixed-citation publication-type="other">Press WH, Flannery BP, Teukolsky SA, Vetterling WT (2007) Numerical recipes 3rd edition: The art of scientific computing. Cambridge University Press.</mixed-citation>
</ref>
<ref id="pcbi.1003661-Dawid1">
<label>58</label>
<mixed-citation publication-type="other">Dawid A, Stone M, Zidek JV (1973) Marginalization paradoxes in bayesian and structural inference. J R Stat Soc B: 189–233.</mixed-citation>
</ref>
<ref id="pcbi.1003661-Gelman1">
<label>59</label>
<mixed-citation publication-type="journal">
<name>
<surname>Gelman</surname>
<given-names>A</given-names>
</name>
,
<name>
<surname>Rubin</surname>
<given-names>DB</given-names>
</name>
(
<year>1992</year>
)
<article-title>Inference from iterative simulation using multiple sequences</article-title>
.
<source>Stat Sci</source>
<volume>7</volume>
:
<fpage>457</fpage>
<lpage>472</lpage>
.</mixed-citation>
</ref>
<ref id="pcbi.1003661-Greenhouse1">
<label>60</label>
<mixed-citation publication-type="journal">
<name>
<surname>Greenhouse</surname>
<given-names>SW</given-names>
</name>
,
<name>
<surname>Geisser</surname>
<given-names>S</given-names>
</name>
(
<year>1959</year>
)
<article-title>On methods in the analysis of profile data</article-title>
.
<source>Psychometrika</source>
<volume>24</volume>
:
<fpage>95</fpage>
<lpage>112</lpage>
.</mixed-citation>
</ref>
<ref id="pcbi.1003661-Hrdle1">
<label>61</label>
<mixed-citation publication-type="other">Härdle W, Müller M, Sperlich S, Werwatz A (2004) Nonparametric and semiparametric models, An introduction. Springer.</mixed-citation>
</ref>
</ref-list>
</back>
</pmc>
<affiliations>
<list>
<country>
<li>Royaume-Uni</li>
</country>
<region>
<li>Angleterre</li>
<li>Angleterre de l'Est</li>
<li>Écosse</li>
</region>
<settlement>
<li>Cambridge</li>
<li>Édimbourg</li>
</settlement>
<orgName>
<li>Université d'Édimbourg</li>
<li>Université de Cambridge</li>
</orgName>
</list>
<tree>
<country name="Royaume-Uni">
<region name="Écosse">
<name sortKey="Acerbi, Luigi" sort="Acerbi, Luigi" uniqKey="Acerbi L" first="Luigi" last="Acerbi">Luigi Acerbi</name>
</region>
<name sortKey="Acerbi, Luigi" sort="Acerbi, Luigi" uniqKey="Acerbi L" first="Luigi" last="Acerbi">Luigi Acerbi</name>
<name sortKey="Vijayakumar, Sethu" sort="Vijayakumar, Sethu" uniqKey="Vijayakumar S" first="Sethu" last="Vijayakumar">Sethu Vijayakumar</name>
<name sortKey="Wolpert, Daniel M" sort="Wolpert, Daniel M" uniqKey="Wolpert D" first="Daniel M." last="Wolpert">Daniel M. Wolpert</name>
</country>
</tree>
</affiliations>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/HapticV1/Data/Ncbi/Merge
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 003078 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Ncbi/Merge/biblio.hfd -nk 003078 | SxmlIndent | more

To add a link to this page in the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    HapticV1
   |flux=    Ncbi
   |étape=   Merge
   |type=    RBID
   |clé=     PMC:4063671
   |texte=   On the Origins of Suboptimality in Human Probabilistic Inference
}}

To generate wiki pages

HfdIndexSelect -h $EXPLOR_AREA/Data/Ncbi/Merge/RBID.i   -Sk "pubmed:24945142" \
       | HfdSelect -Kh $EXPLOR_AREA/Data/Ncbi/Merge/biblio.hfd   \
       | NlmPubMed2Wicri -a HapticV1 

Wicri

This area was generated with Dilib version V0.6.23.
Data generation: Mon Jun 13 01:09:46 2016. Site generation: Wed Mar 6 09:54:07 2024