Exploration server on computer science research in Lorraine

Warning: this site is under development!
Warning: this site was generated automatically from raw corpora.
The information is therefore not validated.

Automatic Generation of an Agent's Basic Behaviors

Internal identifier: 003835 (Crin/Corpus); previous: 003834; next: 003836

Authors: Olivier Buffet; Alain Dutech; François Charpillet

Source:

RBID : CRIN:buffet03a

English descriptors

Abstract

The agent approach, as seen by \cite{Russell95}, intends to design "intelligent" behaviors. Yet, Reinforcement Learning (RL) methods often fail when confronted with complex tasks. We are therefore trying to develop a methodology for the automated design of agents (in the framework of Markov Decision Processes) in the case where the global task can be decomposed into simpler, possibly concurrent, sub-tasks. Our main idea is to automatically combine basic behaviors using RL methods. This led us to propose the two complementary mechanisms presented in this paper. The first mechanism builds a global policy using a weighted combination of basic policies (which are reusable), the weights being learned by the agent (using Simulated Annealing in our case). An agent designed this way is highly scalable: without further refinement of the global behavior, it can automatically combine several instances of the same basic behavior to take into account concurrent occurrences of the same sub-task. The second mechanism aims at creating new basic behaviors for combination. It is based on an incremental learning method that builds on the approximate solution obtained through the combination of older behaviors.
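
To make the first mechanism concrete, the following Python sketch (an illustration, not the authors' implementation) combines the action scores of basic behaviors with weights tuned by simulated annealing. The toy one-dimensional world, the hand-written go_to behavior, and all numeric parameters are assumptions made for the example.

import math
import random

ACTIONS = [-1, 0, +1]  # toy one-dimensional world: step left, stay, step right

def go_to(target):
    # Basic behavior: score each action by how close it brings the agent to `target`.
    def prefs(state):
        return [-abs((state + a) - target) for a in ACTIONS]
    return prefs

# Two instances of the same basic behavior, standing in for two concurrent sub-tasks.
BEHAVIORS = [go_to(+3), go_to(-3)]

def global_policy(state, weights):
    # The global policy is a weighted combination of the basic behaviors' action scores.
    scores = [sum(w * b(state)[i] for w, b in zip(weights, BEHAVIORS))
              for i in range(len(ACTIONS))]
    return ACTIONS[scores.index(max(scores))]

def episode_return(weights, start=0, steps=30):
    # Toy reward signal: +1 for every step the agent spends on either target.
    state, total = start, 0.0
    for _ in range(steps):
        state += global_policy(state, weights)
        total += 1.0 if state in (+3, -3) else 0.0
    return total

def anneal(iters=500, temp=1.0, cooling=0.99):
    # Simulated annealing over the combination weights (the quantity the agent learns).
    weights = [random.random() for _ in BEHAVIORS]
    best = episode_return(weights)
    for _ in range(iters):
        candidate = [max(0.0, w + random.gauss(0.0, 0.1)) for w in weights]
        ret = episode_return(candidate)
        # Accept improvements always; accept worse candidates with a temperature-dependent probability.
        if ret >= best or random.random() < math.exp((ret - best) / temp):
            weights, best = candidate, ret
        temp *= cooling
    return weights, best

weights, ret = anneal()
print("learned weights:", weights, "estimated return:", ret)

With two symmetric targets, the annealing typically drives the weights toward favoring one behavior, so the agent settles on one target; the point of the sketch is only that the combination weights, not the basic policies themselves, are what gets learned.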

Links to Exploration step

CRIN:buffet03a

The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en" wicri:score="309">Automatic Generation of an Agent's Basic Behaviors</title>
</titleStmt>
<publicationStmt>
<idno type="RBID">CRIN:buffet03a</idno>
<date when="2003" year="2003">2003</date>
<idno type="wicri:Area/Crin/Corpus">003835</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en">Automatic Generation of an Agent's Basic Behaviors</title>
<author>
<name sortKey="Buffet, Olivier" sort="Buffet, Olivier" uniqKey="Buffet O" first="Olivier" last="Buffet">Olivier Buffet</name>
</author>
<author>
<name sortKey="Dutech, Alain" sort="Dutech, Alain" uniqKey="Dutech A" first="Alain" last="Dutech">Alain Dutech</name>
</author>
<author>
<name sortKey="Charpillet, Francois" sort="Charpillet, Francois" uniqKey="Charpillet F" first="François" last="Charpillet">François Charpillet</name>
</author>
</analytic>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc>
<textClass>
<keywords scheme="KwdEn" xml:lang="en">
<term>adaptation</term>
<term>complex environments</term>
<term>markov decision processes</term>
<term>reinforcement learning</term>
<term>scalability</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en" wicri:score="4765">The agent approach, as seen by \cite{Russell95}, intends to design ``intelligent'' behaviors. Yet, Reinforcement Learning (RL) methods often fail when confronted with complex tasks. We are therefore trying to develop a methodology for the automated design of agents (in the framework of Markov Decision Processes) in the case where the global task can be decomposed into simpler -possibly concurrent- sub-tasks. Our main idea is to automatically combine basic behaviors using RL methods. This led us to propose two complementary mechanisms presented in the current paper. The first mechanism builds a global policy using a weighted combination of basic policies (which are reusable), the weights being learned by the agent (using Simulated Annealing in our case). An agent designed this way is highly scalable as, without further refinement of the global behavior, it can automatically combine several instances of the same basic behavior to take into account concurrent occurences of the same subtask. The second mechanism aims at creating new basic behaviors for combination. It is based on an incremental learning method that builds on the approximate solution obtained through the combination of older behaviors.</div>
</front>
</TEI>
<BibTex type="inproceedings">
<ref>buffet03a</ref>
<crinnumber>A03-R-095</crinnumber>
<category>3</category>
<equipe>MAIA</equipe>
<author>
<e>Buffet, Olivier</e>
<e>Dutech, Alain</e>
<e>Charpillet, François</e>
</author>
<title>Automatic Generation of an Agent's Basic Behaviors</title>
<booktitle>{Second International Joint Conference on Autonomous Agents and Multi-Agent Systems - AAMAS'03, Melbourne, Victoria, Australie}</booktitle>
<year>2003</year>
<editor>Rosenschein, Sandholm, Wooldridge and Yokoo</editor>
<pages>875-882</pages>
<month>Jul</month>
<publisher>ACM press</publisher>
<keywords>
<e>reinforcement learning</e>
<e>scalability</e>
<e>adaptation</e>
<e>complex environments</e>
<e>markov decision processes</e>
</keywords>
<abstract>The agent approach, as seen by \cite{Russell95}, intends to design ``intelligent'' behaviors. Yet, Reinforcement Learning (RL) methods often fail when confronted with complex tasks. We are therefore trying to develop a methodology for the automated design of agents (in the framework of Markov Decision Processes) in the case where the global task can be decomposed into simpler -possibly concurrent- sub-tasks. Our main idea is to automatically combine basic behaviors using RL methods. This led us to propose two complementary mechanisms presented in the current paper. The first mechanism builds a global policy using a weighted combination of basic policies (which are reusable), the weights being learned by the agent (using Simulated Annealing in our case). An agent designed this way is highly scalable as, without further refinement of the global behavior, it can automatically combine several instances of the same basic behavior to take into account concurrent occurrences of the same subtask. The second mechanism aims at creating new basic behaviors for combination. It is based on an incremental learning method that builds on the approximate solution obtained through the combination of older behaviors.</abstract>
</BibTex>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Wicri/Lorraine/explor/InforLorV4/Data/Crin/Corpus
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 003835 | SxmlIndent | more
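
Read as a pipeline, the command appears to extract record 003835 from the biblio.hfd hierarchical file (HfdSelect), indent the resulting XML (SxmlIndent), and page the output (more); this reading is inferred from the command line itself rather than from Dilib documentation.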

Or

HfdSelect -h $EXPLOR_AREA/Data/Crin/Corpus/biblio.hfd -nk 003835 | SxmlIndent | more
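
Here $EXPLOR_AREA is presumably the root of the exploration area (i.e. $WICRI_ROOT/Wicri/Lorraine/explor/InforLorV4), so that $EXPLOR_AREA/Data/Crin/Corpus points at the same directory as the $EXPLOR_STEP defined above.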

To add a link to this page within the Wicri network

{{Explor lien
   |wiki=    Wicri/Lorraine
   |area=    InforLorV4
   |flux=    Crin
   |étape=   Corpus
   |type=    RBID
   |clé=     CRIN:buffet03a
   |texte=   Automatic Generation of an Agent's Basic Behaviors
}}

Wicri

This area was generated with Dilib version V0.6.33.
Data generation: Mon Jun 10 21:56:28 2019. Site generation: Fri Feb 25 15:29:27 2022