Exploration server for computer science research in Lorraine

Warning: this site is under development!
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

A reinforcement learning approach to instrumental contingency degradation in rats.

Internal identifier: 000120 (Ncbi/Checkpoint); previous: 000119; next: 000121

Authors: Alain Dutech [France]; Etienne Coutureau; Alain R. Marchand

Source: Journal of physiology, Paris (eISSN 1769-7115), 2011 Jan-Jun

RBID : pubmed:21907801

French descriptors

English descriptors

Abstract

Goal-directed action involves a representation of action consequences. Adapting to changes in action-outcome contingency requires the prefrontal region. Indeed, rats with lesions of the medial prefrontal cortex do not adapt their free operant response when food delivery becomes unrelated to lever-pressing. The present study explores the bases of this deficit through a combined behavioural and computational approach. We show that lesioned rats retain some behavioural flexibility and stop pressing if this action prevents food delivery. We attempt to model this phenomenon in a reinforcement learning framework. The model assumes that distinct action values are learned in an incremental manner in distinct states. The model represents states as n-tuples of events, emphasizing sequences rather than the continuous passage of time. Probabilities of lever-pressing and visits to the food magazine observed in the behavioural experiments are first analyzed as a function of these states, to identify sequences of events that influence action choice. Observed action probabilities appear to be essentially a function of the last event that occurred, with reward delivery and waiting significantly facilitating magazine visits and lever-pressing, respectively. Behavioural sequences of normal and lesioned rats are then fed into the model, action values are updated at each event transition according to the SARSA algorithm, and predicted action probabilities are derived through a softmax policy. The model captures the time course of learning, as well as the differential adaptation of normal and prefrontal lesioned rats to contingency degradation with the same parameters for both groups. The results suggest that simple temporal difference algorithms with low learning rates can largely account for instrumental learning and performance. Prefrontal lesioned rats appear to mainly differ from control rats in their low rates of visits to the magazine after a lever press, and their inability to initially detect weak contingency changes.
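
As a rough illustration of the modelling steps the abstract describes (states built from the last n events, action values updated by SARSA at each event transition, action probabilities read out through a softmax), here is a minimal Python sketch. The event and action labels, the reward coding, the function names, and the parameter values (alpha, gamma, temperature) are assumptions chosen for illustration; they are not taken from the paper.

```python
import math
from collections import defaultdict

# Hypothetical action labels; the paper's actual coding may differ.
ACTIONS = ("press", "visit", "wait")


def softmax_policy(q, state, temperature=1.0):
    """Return a dict of action probabilities in `state` from a softmax over Q."""
    prefs = [q[(state, a)] / temperature for a in ACTIONS]
    top = max(prefs)  # subtract the max for numerical stability
    exps = [math.exp(p - top) for p in prefs]
    total = sum(exps)
    return {a: e / total for a, e in zip(ACTIONS, exps)}


def sarsa_update(q, s, a, r, s_next, a_next, alpha=0.05, gamma=0.9):
    """On-policy TD update: Q(s,a) += alpha * (r + gamma*Q(s',a') - Q(s,a))."""
    td_error = r + gamma * q[(s_next, a_next)] - q[(s, a)]
    q[(s, a)] += alpha * td_error
    return td_error


def replay(steps, n=1, alpha=0.05, gamma=0.9):
    """Feed an observed behavioural sequence into the model.

    `steps` is a list of (last_event, action_taken, reward) triples.
    The state at each step is the tuple of the last `n` events (assumed coding).
    """
    q = defaultdict(float)
    history, states = [], []
    for event, _, _ in steps:
        history.append(event)
        states.append(tuple(history[-n:]))
    # One SARSA update per event transition, using the action actually observed next.
    for t in range(len(steps) - 1):
        _, a, r = steps[t]
        _, a_next, _ = steps[t + 1]
        sarsa_update(q, states[t], a, r, states[t + 1], a_next, alpha, gamma)
    return q


if __name__ == "__main__":
    # Toy sequence: after waiting the animal presses; after pressing it visits
    # the magazine and is rewarded. Purely synthetic data.
    toy = [("wait", "press", 0.0), ("press", "visit", 1.0)] * 50
    q = replay(toy, n=1)
    print(softmax_policy(q, ("press",)))
```

The low default learning rate (alpha=0.05) echoes the abstract's remark that temporal difference algorithms with low learning rates can account for the data, but the specific value is arbitrary. Setting n above 1 makes the state depend on longer event sequences.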

DOI: 10.1016/j.jphysparis.2011.07.017
PubMed: 21907801


Affiliations:


Links toward previous steps (curation, corpus...)


Links to Exploration step

pubmed:21907801

The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en">A reinforcement learning approach to instrumental contingency degradation in rats.</title>
<author>
<name sortKey="Dutech, Alain" sort="Dutech, Alain" uniqKey="Dutech A" first="Alain" last="Dutech">Alain Dutech</name>
<affiliation wicri:level="3">
<nlm:affiliation>LORIA/INRIA, Campus Scientifique, BP 239, 54506 Vandoeuvre les Nancy, France. alain.dutech@loria.fr</nlm:affiliation>
<country xml:lang="fr">France</country>
<wicri:regionArea>LORIA/INRIA, Campus Scientifique, BP 239, 54506 Vandoeuvre les Nancy</wicri:regionArea>
<placeName>
<region type="region" nuts="2">Grand Est</region>
<region type="old region" nuts="2">Lorraine (région)</region>
<settlement type="city">Vandœuvre-lès-Nancy</settlement>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Coutureau, Etienne" sort="Coutureau, Etienne" uniqKey="Coutureau E" first="Etienne" last="Coutureau">Etienne Coutureau</name>
</author>
<author>
<name sortKey="Marchand, Alain R" sort="Marchand, Alain R" uniqKey="Marchand A" first="Alain R" last="Marchand">Alain R. Marchand</name>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">PubMed</idno>
<date when="????">
<PubDate>
<MedlineDate>2011 Jan-Jun</MedlineDate>
</PubDate>
</date>
<idno type="doi">10.1016/j.jphysparis.2011.07.017</idno>
<idno type="RBID">pubmed:21907801</idno>
<idno type="pmid">21907801</idno>
<idno type="wicri:Area/PubMed/Corpus">000093</idno>
<idno type="wicri:explorRef" wicri:stream="PubMed" wicri:step="Corpus" wicri:corpus="PubMed">000093</idno>
<idno type="wicri:Area/PubMed/Curation">000093</idno>
<idno type="wicri:explorRef" wicri:stream="PubMed" wicri:step="Curation">000093</idno>
<idno type="wicri:Area/PubMed/Checkpoint">000181</idno>
<idno type="wicri:explorRef" wicri:stream="Checkpoint" wicri:step="PubMed">000181</idno>
<idno type="wicri:Area/Ncbi/Merge">000122</idno>
<idno type="wicri:Area/Ncbi/Curation">000120</idno>
<idno type="wicri:Area/Ncbi/Checkpoint">000120</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en">A reinforcement learning approach to instrumental contingency degradation in rats.</title>
<author>
<name sortKey="Dutech, Alain" sort="Dutech, Alain" uniqKey="Dutech A" first="Alain" last="Dutech">Alain Dutech</name>
<affiliation wicri:level="3">
<nlm:affiliation>LORIA/INRIA, Campus Scientifique, BP 239, 54506 Vandoeuvre les Nancy, France. alain.dutech@loria.fr</nlm:affiliation>
<country xml:lang="fr">France</country>
<wicri:regionArea>LORIA/INRIA, Campus Scientifique, BP 239, 54506 Vandoeuvre les Nancy</wicri:regionArea>
<placeName>
<region type="region" nuts="2">Grand Est</region>
<region type="old region" nuts="2">Lorraine (région)</region>
<settlement type="city">Vandœuvre-lès-Nancy</settlement>
</placeName>
</affiliation>
</author>
<author>
<name sortKey="Coutureau, Etienne" sort="Coutureau, Etienne" uniqKey="Coutureau E" first="Etienne" last="Coutureau">Etienne Coutureau</name>
</author>
<author>
<name sortKey="Marchand, Alain R" sort="Marchand, Alain R" uniqKey="Marchand A" first="Alain R" last="Marchand">Alain R. Marchand</name>
</author>
</analytic>
<series>
<title level="j">Journal of physiology, Paris</title>
<idno type="eISSN">1769-7115</idno>
</series>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<textClass>
<keywords scheme="KwdEn" xml:lang="en">
<term>Animals</term>
<term>Behavior, Animal (physiology)</term>
<term>Conditioning, Operant (physiology)</term>
<term>Extinction, Psychological (physiology)</term>
<term>Models, Neurological</term>
<term>Neurons (physiology)</term>
<term>Rats</term>
<term>Reinforcement (Psychology)</term>
<term>Touch (physiology)</term>
<term>Touch Perception (physiology)</term>
</keywords>
<keywords scheme="KwdFr" xml:lang="fr">
<term>Animaux</term>
<term>Comportement animal (physiologie)</term>
<term>Conditionnement opérant (physiologie)</term>
<term>Extinction (psychologie) (physiologie)</term>
<term>Modèles neurologiques</term>
<term>Neurones (physiologie)</term>
<term>Perception du toucher (physiologie)</term>
<term>Rats</term>
<term>Renforcement (psychologie)</term>
<term>Toucher (physiologie)</term>
</keywords>
<keywords scheme="MESH" qualifier="physiologie" xml:lang="fr">
<term>Comportement animal</term>
<term>Conditionnement opérant</term>
<term>Extinction (psychologie)</term>
<term>Neurones</term>
<term>Perception du toucher</term>
<term>Toucher</term>
</keywords>
<keywords scheme="MESH" qualifier="physiology" xml:lang="en">
<term>Behavior, Animal</term>
<term>Conditioning, Operant</term>
<term>Extinction, Psychological</term>
<term>Neurons</term>
<term>Touch</term>
<term>Touch Perception</term>
</keywords>
<keywords scheme="MESH" xml:lang="en">
<term>Animals</term>
<term>Models, Neurological</term>
<term>Rats</term>
<term>Reinforcement (Psychology)</term>
</keywords>
<keywords scheme="MESH" xml:lang="fr">
<term>Animaux</term>
<term>Modèles neurologiques</term>
<term>Rats</term>
<term>Renforcement (psychologie)</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">Goal-directed action involves a representation of action consequences. Adapting to changes in action-outcome contingency requires the prefrontal region. Indeed, rats with lesions of the medial prefrontal cortex do not adapt their free operant response when food delivery becomes unrelated to lever-pressing. The present study explores the bases of this deficit through a combined behavioural and computational approach. We show that lesioned rats retain some behavioural flexibility and stop pressing if this action prevents food delivery. We attempt to model this phenomenon in a reinforcement learning framework. The model assumes that distinct action values are learned in an incremental manner in distinct states. The model represents states as n-uplets of events, emphasizing sequences rather than the continuous passage of time. Probabilities of lever-pressing and visits to the food magazine observed in the behavioural experiments are first analyzed as a function of these states, to identify sequences of events that influence action choice. Observed action probabilities appear to be essentially function of the last event that occurred, with reward delivery and waiting significantly facilitating magazine visits and lever-pressing respectively. Behavioural sequences of normal and lesioned rats are then fed into the model, action values are updated at each event transition according to the SARSA algorithm, and predicted action probabilities are derived through a softmax policy. The model captures the time course of learning, as well as the differential adaptation of normal and prefrontal lesioned rats to contingency degradation with the same parameters for both groups. The results suggest that simple temporal difference algorithms with low learning rates can largely account for instrumental learning and performance. 
Prefrontal lesioned rats appear to mainly differ from control rats in their low rates of visits to the magazine after a lever press, and their inability to initially detect weak contingency changes.</div>
</front>
</TEI>
<affiliations>
<list>
<country>
<li>France</li>
</country>
<region>
<li>Grand Est</li>
<li>Lorraine (région)</li>
</region>
<settlement>
<li>Vandœuvre-lès-Nancy</li>
</settlement>
</list>
<tree>
<noCountry>
<name sortKey="Coutureau, Etienne" sort="Coutureau, Etienne" uniqKey="Coutureau E" first="Etienne" last="Coutureau">Etienne Coutureau</name>
<name sortKey="Marchand, Alain R" sort="Marchand, Alain R" uniqKey="Marchand A" first="Alain R" last="Marchand">Alain R. Marchand</name>
</noCountry>
<country name="France">
<region name="Grand Est">
<name sortKey="Dutech, Alain" sort="Dutech, Alain" uniqKey="Dutech A" first="Alain" last="Dutech">Alain Dutech</name>
</region>
</country>
</tree>
</affiliations>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Wicri/Lorraine/explor/InforLorV4/Data/Ncbi/Checkpoint
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000120 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Ncbi/Checkpoint/biblio.hfd -nk 000120 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Wicri/Lorraine
   |area=    InforLorV4
   |flux=    Ncbi
   |étape=   Checkpoint
   |type=    RBID
   |clé=     pubmed:21907801
   |texte=   A reinforcement learning approach to instrumental contingency degradation in rats.
}}

To generate wiki pages

HfdIndexSelect -h $EXPLOR_AREA/Data/Ncbi/Checkpoint/RBID.i   -Sk "pubmed:21907801" \
       | HfdSelect -Kh $EXPLOR_AREA/Data/Ncbi/Checkpoint/biblio.hfd   \
       | NlmPubMed2Wicri -a InforLorV4 

This area was generated with Dilib version V0.6.33.
Data generation: Mon Jun 10 21:56:28 2019. Site generation: Fri Feb 25 15:29:27 2022