Exploration server on computer science research in Lorraine

Please note: this site is under development.
Please note: this site is generated automatically from raw corpora.
The information it contains has therefore not been validated.

Cooperative co-learning: A model-based approach for solving multi agent Reinforcement problems

Internal identifier: 000714 (PascalFrancis/Corpus); previous: 000713; next: 000715

Cooperative co-learning: A model-based approach for solving multi agent Reinforcement problems

Authors: Bruno Scherrer; Francois Charpillet

Source:

RBID : Pascal:04-0171371

French descriptors

English descriptors

Abstract

Solving Multi-Agent Reinforcement Learning problems is a key issue. Indeed, the complexity of deriving multi-agent plans, especially when one uses an explicit model of the problem, increases dramatically with the number of agents. This paper introduces a general iterative heuristic: at each step one chooses a sub-group of agents and updates their policies to optimize the task, given that the rest of the agents have fixed plans. We analyse this process in a general setting and show how it can be applied to Markov Decision Processes, Partially Observable Markov Decision Processes and Decentralized Partially Observable Markov Decision Processes.
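
The heuristic described in the abstract is essentially block-coordinate ascent over agent policies. The following minimal Python sketch illustrates the idea on a toy, stateless team game; the team reward, sub-group size and number of agents are illustrative assumptions, and this is not the authors' implementation (which operates on MDP/POMDP/DEC-POMDP models rather than a one-shot game).

# Minimal illustrative sketch of cooperative co-learning (not the authors' code):
# at each step, pick a sub-group of agents and re-optimize their (here, stateless)
# policies while the remaining agents keep their current plans fixed.
import itertools
import random

N_AGENTS = 6
ACTIONS = [0, 1]

def team_reward(joint_action):
    # Toy cooperative objective (assumption): agents are rewarded for agreeing.
    return sum(a == b for a, b in itertools.combinations(joint_action, 2))

policies = [random.choice(ACTIONS) for _ in range(N_AGENTS)]  # one action per agent

for step in range(20):
    group = random.sample(range(N_AGENTS), 2)            # choose a sub-group of agents
    best_value, best_choice = None, None
    for choice in itertools.product(ACTIONS, repeat=len(group)):
        candidate = list(policies)
        for agent, action in zip(group, choice):
            candidate[agent] = action                     # vary only the sub-group
        value = team_reward(candidate)                    # others keep their fixed plans
        if best_value is None or value > best_value:
            best_value, best_choice = value, choice
    for agent, action in zip(group, best_choice):         # commit the improvement
        policies[agent] = action

print(policies, team_reward(policies))

Each step can only keep or improve the team reward, since the sub-group's previous joint choice is always among the candidates it evaluates.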

Record in standard format (ISO 2709)

See the documentation on the Inist Standard format.

pA  
A01 01  1    @0 1082-3409
A08 01  1  ENG  @1 Cooperative co-learning: A model-based approach for solving multi agent Reinforcement problems
A09 01  1  ENG  @1 14th IEEE international conference on tools with artificial intelligence (ICTAI 2002) : Washington DC, 4-6 November 2002
A11 01  1    @1 SCHERRER (Bruno)
A11 02  1    @1 CHARPILLET (Francois)
A14 01      @1 LORIA - INRIA Lorraine, Bâtiment LORIA, Campus scientifique B.P. 239 @2 54506 Vandoeuvre-lès-Nancy @3 FRA @Z 1 aut. @Z 2 aut.
A20       @1 463-468
A21       @1 2002
A23 01      @0 ENG
A26 01      @0 0-7695-1849-4
A43 01      @1 INIST @2 Y 37923 @5 354000117761100590
A44       @0 0000 @1 © 2004 INIST-CNRS. All rights reserved.
A45       @0 12 ref.
A47 01  1    @0 04-0171371
A60       @1 P @2 C
A61       @0 A
A64 01  1    @0 Proceedings - International Conference on Tools with Artificial Intelligence, TAI
A66 01      @0 USA
C01 01    ENG  @0 Solving Multi-Agent Reinforcement Learning Problems is a key issue. Indeed, the complexity of deriving multi-agent plans, especially when one uses an explicit model of the problem, is dramatically increasing with the number of agents. This papers introduces a general iterative heuristic: at each step one chooses a sub-group of agents and update their policies to optimize the task given the rest of agents have fixed plans. We analyse this process in a general purpose and show how it can be applied to Markov Decision Processes, Partially Observable Markov Decision Processes and Decentralized Partially Observable Markov Decision Processes.
C02 01  X    @0 001D02C02
C02 02  X    @0 001A02H01J
C03 01  X  FRE  @0 Apprentissage renforcé @5 01
C03 01  X  ENG  @0 Reinforcement learning @5 01
C03 01  X  SPA  @0 Aprendizaje reforzado @5 01
C03 02  3  FRE  @0 Apprentissage(intelligence artificielle) @5 02
C03 02  3  ENG  @0 Learning (artificial intelligence) @5 02
C03 03  X  FRE  @0 Système multiagent @5 03
C03 03  X  ENG  @0 Multiagent system @5 03
C03 03  X  SPA  @0 Sistema multiagente @5 03
C03 04  X  FRE  @0 Processus Markov @5 04
C03 04  X  ENG  @0 Markov process @5 04
C03 04  X  SPA  @0 Proceso Markov @5 04
C03 05  X  FRE  @0 Décision Markov @5 05
C03 05  X  ENG  @0 Markov decision @5 05
C03 05  X  SPA  @0 Decisión Markov @5 05
C03 06  X  FRE  @0 Apprentissage collectif @5 06
C03 06  X  ENG  @0 Collective learning @5 06
C03 06  X  SPA  @0 Aprendizaje colectivo @5 06
C03 07  X  FRE  @0 Coapprentissage @4 INC @5 82
N21       @1 117
N82       @1 PSI
pR  
A30 01  1  ENG  @1 ICTAI : international conference on tools with artificial intelligence @2 14 @3 Washington DC USA @4 2002-11-04

Inist format (server)

NO : PASCAL 04-0171371 INIST
ET : Cooperative co-learning: A model-based approach for solving multi agent Reinforcement problems
AU : SCHERRER (Bruno); CHARPILLET (Francois)
AF : LORIA - INRIA Lorraine, Bâtiment LORIA, Campus scientifique B.P. 239/54506 Vandoeuvre-lès-Nancy /France (1 aut., 2 aut.)
DT : Publication en série; Congrès; Niveau analytique
SO : Proceedings - International Conference on Tools with Artificial Intelligence, TAI; ISSN 1082-3409; Etats-Unis; Da. 2002; Pp. 463-468; Bibl. 12 ref.
LA : Anglais
EA : Solving Multi-Agent Reinforcement Learning Problems is a key issue. Indeed, the complexity of deriving multi-agent plans, especially when one uses an explicit model of the problem, is dramatically increasing with the number of agents. This papers introduces a general iterative heuristic: at each step one chooses a sub-group of agents and update their policies to optimize the task given the rest of agents have fixed plans. We analyse this process in a general purpose and show how it can be applied to Markov Decision Processes, Partially Observable Markov Decision Processes and Decentralized Partially Observable Markov Decision Processes.
CC : 001D02C02; 001A02H01J
FD : Apprentissage renforcé; Apprentissage(intelligence artificielle); Système multiagent; Processus Markov; Décision Markov; Apprentissage collectif; Coapprentissage
ED : Reinforcement learning; Learning (artificial intelligence); Multiagent system; Markov process; Markov decision; Collective learning
SD : Aprendizaje reforzado; Sistema multiagente; Proceso Markov; Decisión Markov; Aprendizaje colectivo
LO : INIST-Y 37923.354000117761100590
ID : 04-0171371

Links to Exploration step

Pascal:04-0171371

The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en" level="a">Cooperative co-learning: A model-based approach for solving multi agent Reinforcement problems</title>
<author>
<name sortKey="Scherrer, Bruno" sort="Scherrer, Bruno" uniqKey="Scherrer B" first="Bruno" last="Scherrer">Bruno Scherrer</name>
<affiliation>
<inist:fA14 i1="01">
<s1>LORIA - INRIA Lorraine, Bâtiment LORIA, Campus scientifique B.P. 239</s1>
<s2>54506 Vandoeuvre-lès-Nancy </s2>
<s3>FRA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
<author>
<name sortKey="Charpillet, Francois" sort="Charpillet, Francois" uniqKey="Charpillet F" first="Francois" last="Charpillet">Francois Charpillet</name>
<affiliation>
<inist:fA14 i1="01">
<s1>LORIA - INRIA Lorraine, Bâtiment LORIA, Campus scientifique B.P. 239</s1>
<s2>54506 Vandoeuvre-lès-Nancy </s2>
<s3>FRA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">INIST</idno>
<idno type="inist">04-0171371</idno>
<date when="2002">2002</date>
<idno type="stanalyst">PASCAL 04-0171371 INIST</idno>
<idno type="RBID">Pascal:04-0171371</idno>
<idno type="wicri:Area/PascalFrancis/Corpus">000714</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en" level="a">Cooperative co-learning: A model-based approach for solving multi agent Reinforcement problems</title>
<author>
<name sortKey="Scherrer, Bruno" sort="Scherrer, Bruno" uniqKey="Scherrer B" first="Bruno" last="Scherrer">Bruno Scherrer</name>
<affiliation>
<inist:fA14 i1="01">
<s1>LORIA - INRIA Lorraine, Bâtiment LORIA, Campus scientifique B.P. 239</s1>
<s2>54506 Vandoeuvre-lès-Nancy </s2>
<s3>FRA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
<author>
<name sortKey="Charpillet, Francois" sort="Charpillet, Francois" uniqKey="Charpillet F" first="Francois" last="Charpillet">Francois Charpillet</name>
<affiliation>
<inist:fA14 i1="01">
<s1>LORIA - INRIA Lorraine, Bâtiment LORIA, Campus scientifique B.P. 239</s1>
<s2>54506 Vandoeuvre-lès-Nancy </s2>
<s3>FRA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
</inist:fA14>
</affiliation>
</author>
</analytic>
<series>
<title level="j" type="main">Proceedings - International Conference on Tools with Artificial Intelligence, TAI</title>
<idno type="ISSN">1082-3409</idno>
<imprint>
<date when="2002">2002</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
<seriesStmt>
<title level="j" type="main">Proceedings - International Conference on Tools with Artificial Intelligence, TAI</title>
<idno type="ISSN">1082-3409</idno>
</seriesStmt>
</fileDesc>
<profileDesc>
<textClass>
<keywords scheme="KwdEn" xml:lang="en">
<term>Collective learning</term>
<term>Learning (artificial intelligence)</term>
<term>Markov decision</term>
<term>Markov process</term>
<term>Multiagent system</term>
<term>Reinforcement learning</term>
</keywords>
<keywords scheme="Pascal" xml:lang="fr">
<term>Apprentissage renforcé</term>
<term>Apprentissage(intelligence artificielle)</term>
<term>Système multiagent</term>
<term>Processus Markov</term>
<term>Décision Markov</term>
<term>Apprentissage collectif</term>
<term>Coapprentissage</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">Solving Multi-Agent Reinforcement Learning Problems is a key issue. Indeed, the complexity of deriving multi-agent plans, especially when one uses an explicit model of the problem, is dramatically increasing with the number of agents. This papers introduces a general iterative heuristic: at each step one chooses a sub-group of agents and update their policies to optimize the task given the rest of agents have fixed plans. We analyse this process in a general purpose and show how it can be applied to Markov Decision Processes, Partially Observable Markov Decision Processes and Decentralized Partially Observable Markov Decision Processes.</div>
</front>
</TEI>
<inist>
<standard h6="B">
<pA>
<fA01 i1="01" i2="1">
<s0>1082-3409</s0>
</fA01>
<fA08 i1="01" i2="1" l="ENG">
<s1>Cooperative co-learning: A model-based approach for solving multi agent Reinforcement problems</s1>
</fA08>
<fA09 i1="01" i2="1" l="ENG">
<s1>14th IEEE international conference on tools with artificial intelligence (ICTAI 2002) : Washington DC, 4-6 November 2002</s1>
</fA09>
<fA11 i1="01" i2="1">
<s1>SCHERRER (Bruno)</s1>
</fA11>
<fA11 i1="02" i2="1">
<s1>CHARPILLET (Francois)</s1>
</fA11>
<fA14 i1="01">
<s1>LORIA - INRIA Lorraine, Bâtiment LORIA, Campus scientifique B.P. 239</s1>
<s2>54506 Vandoeuvre-lès-Nancy </s2>
<s3>FRA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
</fA14>
<fA20>
<s1>463-468</s1>
</fA20>
<fA21>
<s1>2002</s1>
</fA21>
<fA23 i1="01">
<s0>ENG</s0>
</fA23>
<fA26 i1="01">
<s0>0-7695-1849-4</s0>
</fA26>
<fA43 i1="01">
<s1>INIST</s1>
<s2>Y 37923</s2>
<s5>354000117761100590</s5>
</fA43>
<fA44>
<s0>0000</s0>
<s1>© 2004 INIST-CNRS. All rights reserved.</s1>
</fA44>
<fA45>
<s0>12 ref.</s0>
</fA45>
<fA47 i1="01" i2="1">
<s0>04-0171371</s0>
</fA47>
<fA60>
<s1>P</s1>
<s2>C</s2>
</fA60>
<fA61>
<s0>A</s0>
</fA61>
<fA64 i1="01" i2="1">
<s0>Proceedings - International Conference on Tools with Artificial Intelligence, TAI</s0>
</fA64>
<fA66 i1="01">
<s0>USA</s0>
</fA66>
<fC01 i1="01" l="ENG">
<s0>Solving Multi-Agent Reinforcement Learning Problems is a key issue. Indeed, the complexity of deriving multi-agent plans, especially when one uses an explicit model of the problem, is dramatically increasing with the number of agents. This papers introduces a general iterative heuristic: at each step one chooses a sub-group of agents and update their policies to optimize the task given the rest of agents have fixed plans. We analyse this process in a general purpose and show how it can be applied to Markov Decision Processes, Partially Observable Markov Decision Processes and Decentralized Partially Observable Markov Decision Processes.</s0>
</fC01>
<fC02 i1="01" i2="X">
<s0>001D02C02</s0>
</fC02>
<fC02 i1="02" i2="X">
<s0>001A02H01J</s0>
</fC02>
<fC03 i1="01" i2="X" l="FRE">
<s0>Apprentissage renforcé</s0>
<s5>01</s5>
</fC03>
<fC03 i1="01" i2="X" l="ENG">
<s0>Reinforcement learning</s0>
<s5>01</s5>
</fC03>
<fC03 i1="01" i2="X" l="SPA">
<s0>Aprendizaje reforzado</s0>
<s5>01</s5>
</fC03>
<fC03 i1="02" i2="3" l="FRE">
<s0>Apprentissage(intelligence artificielle)</s0>
<s5>02</s5>
</fC03>
<fC03 i1="02" i2="3" l="ENG">
<s0>Learning (artificial intelligence)</s0>
<s5>02</s5>
</fC03>
<fC03 i1="03" i2="X" l="FRE">
<s0>Système multiagent</s0>
<s5>03</s5>
</fC03>
<fC03 i1="03" i2="X" l="ENG">
<s0>Multiagent system</s0>
<s5>03</s5>
</fC03>
<fC03 i1="03" i2="X" l="SPA">
<s0>Sistema multiagente</s0>
<s5>03</s5>
</fC03>
<fC03 i1="04" i2="X" l="FRE">
<s0>Processus Markov</s0>
<s5>04</s5>
</fC03>
<fC03 i1="04" i2="X" l="ENG">
<s0>Markov process</s0>
<s5>04</s5>
</fC03>
<fC03 i1="04" i2="X" l="SPA">
<s0>Proceso Markov</s0>
<s5>04</s5>
</fC03>
<fC03 i1="05" i2="X" l="FRE">
<s0>Décision Markov</s0>
<s5>05</s5>
</fC03>
<fC03 i1="05" i2="X" l="ENG">
<s0>Markov decision</s0>
<s5>05</s5>
</fC03>
<fC03 i1="05" i2="X" l="SPA">
<s0>Decisión Markov</s0>
<s5>05</s5>
</fC03>
<fC03 i1="06" i2="X" l="FRE">
<s0>Apprentissage collectif</s0>
<s5>06</s5>
</fC03>
<fC03 i1="06" i2="X" l="ENG">
<s0>Collective learning</s0>
<s5>06</s5>
</fC03>
<fC03 i1="06" i2="X" l="SPA">
<s0>Aprendizaje colectivo</s0>
<s5>06</s5>
</fC03>
<fC03 i1="07" i2="X" l="FRE">
<s0>Coapprentissage</s0>
<s4>INC</s4>
<s5>82</s5>
</fC03>
<fN21>
<s1>117</s1>
</fN21>
<fN82>
<s1>PSI</s1>
</fN82>
</pA>
<pR>
<fA30 i1="01" i2="1" l="ENG">
<s1>ICTAI : international conference on tools with artificial intelligence</s1>
<s2>14</s2>
<s3>Washington DC USA</s3>
<s4>2002-11-04</s4>
</fA30>
</pR>
</standard>
<server>
<NO>PASCAL 04-0171371 INIST</NO>
<ET>Cooperative co-learning: A model-based approach for solving multi agent Reinforcement problems</ET>
<AU>SCHERRER (Bruno); CHARPILLET (Francois)</AU>
<AF>LORIA - INRIA Lorraine, Bâtiment LORIA, Campus scientifique B.P. 239/54506 Vandoeuvre-lès-Nancy /France (1 aut., 2 aut.)</AF>
<DT>Publication en série; Congrès; Niveau analytique</DT>
<SO>Proceedings - International Conference on Tools with Artificial Intelligence, TAI; ISSN 1082-3409; Etats-Unis; Da. 2002; Pp. 463-468; Bibl. 12 ref.</SO>
<LA>Anglais</LA>
<EA>Solving Multi-Agent Reinforcement Learning Problems is a key issue. Indeed, the complexity of deriving multi-agent plans, especially when one uses an explicit model of the problem, is dramatically increasing with the number of agents. This papers introduces a general iterative heuristic: at each step one chooses a sub-group of agents and update their policies to optimize the task given the rest of agents have fixed plans. We analyse this process in a general purpose and show how it can be applied to Markov Decision Processes, Partially Observable Markov Decision Processes and Decentralized Partially Observable Markov Decision Processes.</EA>
<CC>001D02C02; 001A02H01J</CC>
<FD>Apprentissage renforcé; Apprentissage(intelligence artificielle); Système multiagent; Processus Markov; Décision Markov; Apprentissage collectif; Coapprentissage</FD>
<ED>Reinforcement learning; Learning (artificial intelligence); Multiagent system; Markov process; Markov decision; Collective learning</ED>
<SD>Aprendizaje reforzado; Sistema multiagente; Proceso Markov; Decisión Markov; Aprendizaje colectivo</SD>
<LO>INIST-Y 37923.354000117761100590</LO>
<ID>04-0171371</ID>
</server>
</inist>
</record>

To work with this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Wicri/Lorraine/explor/InforLorV4/Data/PascalFrancis/Corpus
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000714 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/PascalFrancis/Corpus/biblio.hfd -nk 000714 | SxmlIndent | more
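
In the second form, $EXPLOR_AREA presumably points to the exploration area itself, i.e. $WICRI_ROOT/Wicri/Lorraine/explor/InforLorV4, so that both commands address the same biblio.hfd file; this value is inferred from the path in the first form rather than documented here.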

To link to this page from the Wicri network

{{Explor lien
   |wiki=    Wicri/Lorraine
   |area=    InforLorV4
   |flux=    PascalFrancis
   |étape=   Corpus
   |type=    RBID
   |clé=     Pascal:04-0171371
   |texte=   Cooperative co-learning: A model-based approach for solving multi agent Reinforcement problems
}}

Wicri

This area was generated with Dilib version V0.6.33.
Data generation: Mon Jun 10 21:56:28 2019. Site generation: Fri Feb 25 15:29:27 2022