Exploration server on computer science research in Lorraine

Warning: this site is under development!
Warning: this site is generated automatically from raw corpora.
The information has therefore not been validated.

The Matthieu Geist - Olivier Pietquin cluster

Terms

30  Matthieu Geist
17  Olivier Pietquin
95  Bruno Scherrer
23  Amine Boumaza

Associations

Freq.  Weight  Association
16  16  Matthieu Geist - Olivier Pietquin
14  14  Bruno Scherrer - Matthieu Geist
12  12  Amine Boumaza - Bruno Scherrer

Documents in order of relevance
000722 (2015) Bruno Scherrer [France] ; Mohammad Ghavamzadeh [France] ; Victor Gabillon [France] ; Boris Lesner [France] ; Matthieu Geist [France]. Approximate Modified Policy Iteration and its Application to the Game of Tetris
000936 (2014-09-15) Bruno Scherrer [France] ; Matthieu Geist [France]. Local Policy Search in a Convex Space and Conservative Policy Iteration as Boosted Policy Search
000B67 (2014-05-12) Bruno Scherrer [France] ; Matthieu Geist [France]. Quand l'optimalité locale implique une garantie globale : recherche locale de politique dans un espace convexe et algorithme d'itération sur les politiques conservatif vu comme une montée de gradient fonctionnel
000D16 (2014-01) Matthieu Geist [France] ; Bruno Scherrer [France]. Off-policy Learning with Eligibility Traces: A Survey
000F64 (2013-10-25) Matthieu Geist [France] ; Edouard Klein [France] ; Bilal Piot [France] ; Yann Guermeur [France] ; Olivier Pietquin [France]. Around Inverse Reinforcement Learning and Score-based Classification
001057 (2013-08-25) Lucie Daubigney [France] ; Matthieu Geist [France] ; Olivier Pietquin [France]. Particle Swarm Optimisation of Spoken Dialogue System Strategies
001066 (2013-08-22) Lucie Daubigney [France] ; Matthieu Geist [France] ; Olivier Pietquin [France]. Model-free POMDP optimisation of tutoring systems with echo-state networks
001123 (2013-07-01) Lucie Daubigney [France] ; Matthieu Geist [France] ; Olivier Pietquin [France]. Optimisation par essaims particulaires de stratégies de dialogue
001183 (2013-06-06) Bruno Scherrer [France] ; Matthieu Geist [France]. Policy Search: Any Local Optimum Enjoys a Global Performance Guarantee
001220 (2013-05-26) Lucie Daubigney [France] ; Matthieu Geist [France] ; Olivier Pietquin [France]. Random Projections: a Remedy for Overfitting Issues in Time Series Prediction with Echo State Networks
001445 (2013) Edouard Klein [France] ; Bilal Piot [France] ; Matthieu Geist [France] ; Olivier Pietquin [France]. Classification structurée pour l’apprentissage par renforcement inverse
001692 (2013) Edouard Klein [France] ; Bilal Piot [France] ; Matthieu Geist [France] ; Olivier Pietquin [France]. Classification structurée pour l'apprentissage par renforcement inverse
001750 (2013) Matthieu Geist [France] ; Bruno Scherrer [France]. Off-policy Learning with Eligibility Traces: A Survey
001835 (2012-12) Lucie Daubigney [France] ; Matthieu Geist [France] ; Senthilkumar Chandramohan [France] ; Olivier Pietquin [France]. A Comprehensive Reinforcement Learning Framework for Dialogue Management Optimisation
001A58 (2012-06-26) Matthieu Geist [France] ; Bruno Scherrer [France] ; Alessandro Lazaric [France] ; Mohammad Ghavamzadeh [France]. A Dantzig Selector Approach to Temporal Difference Learning
001A68 (2012-06-25) Bruno Scherrer [France] ; Mohammad Ghavamzadeh [France] ; Victor Gabillon [France] ; Matthieu Geist [France]. Approximate Modified Policy Iteration
001A91 (2012-06-04) Lucie Daubigney [France] ; Matthieu Geist [France] ; Olivier Pietquin [France]. Optimisation d'un tuteur intelligent à partir d'un jeu de données fixé
001B15 (2012-05-23) Edouard Klein [France] ; Bilal Piot [France] ; Matthieu Geist [France] ; Olivier Pietquin [France]. Classification structurée pour l'apprentissage par renforcement inverse
001B20 (2012-05-22) Matthieu Geist [France] ; Bruno Scherrer [France] ; Alessandro Lazaric [France] ; Mohammad Ghavamzadeh [France]. Un sélecteur de Dantzig pour l'apprentissage par différences temporelles
001B24 (2012-05-22) Bruno Scherrer [France] ; Victor Gabillon [France] ; Mohammad Ghavamzadeh [France] ; Matthieu Geist [France]. Approximations de l'Algorithme Itérations sur les Politiques Modifié
001B39 (2012-05-14) Bruno Scherrer [France] ; Victor Gabillon [France] ; Mohammad Ghavamzadeh [France] ; Matthieu Geist [France]. Approximate Modified Policy Iteration
001B96 (2012-03-25) Lucie Daubigney [France] ; Matthieu Geist [France] ; Olivier Pietquin [France]. Off-policy Learning in Large-scale POMDP-based Dialogue Systems
001C55 (2012-01) Lucie Daubigney [France] ; Matthieu Geist [France] ; Olivier Pietquin [France]. Apprentissage off-policy appliqué à un système de dialogue basé sur les PDMPO
001F54 (2011-12-18) Edouard Klein [France] ; Matthieu Geist [France] ; Olivier Pietquin [France]. Reducing the dimentionality of the reward space in the Inverse Reinforcement Learning problem
002138 (2011-09-09) Matthieu Geist [France] ; Bruno Scherrer [France]. l1-penalized projected Bellman residual
002139 (2011-09-09) Bruno Scherrer [France] ; Matthieu Geist [France]. Recursive Least-Squares Learning with Eligibility Traces
002141 (2011-09-09) Edouard Klein [France] ; Matthieu Geist [France] ; Olivier Pietquin [France]. Batch, Off-policy and Model-free Apprenticeship Learning
002279 (2011-06-23) Bruno Scherrer [France] ; Matthieu Geist [France]. Moindres carrés récursifs pour l'évaluation off-policy d'une politique avec traces d'éligibilité
002281 (2011-06-23) Edouard Klein [France] ; Matthieu Geist [France] ; Olivier Pietquin [France]. Apprentissage par imitation dans un cadre batch, off-policy et sans modèle
002303 (2011-06-16) Edouard Klein [France] ; Matthieu Geist [France] ; Olivier Pietquin [France]. Batch, Off-policy and Model-Free Apprenticeship Learning
003D49 (2008-12) Cesar Torres-Huitzil [Mexico] ; Bernard Girau [France] ; Amine Boumaza [France] ; Bruno Scherrer [France]. Embedded harmonic control for trajectory planning in large environments
004273 (2008) Amine Boumaza [France] ; Bruno Scherrer [France]. Analyse d’un algorithme d’intelligence en essaim pour le fourragement
004599 (2008) Bernard Girau [France] ; Amine Boumaza [France] ; Bruno Scherrer [France] ; Cesar Torres-Huitzil [Mexico]. Block-synchronous harmonic control for scalable trajectory planning
004648 (2007-11-21) Amine Boumaza [France] ; Bruno Scherrer [France]. Convergence and rate of convergence of simple ant models
004725 (2007-09-26) Amine Boumaza [France] ; Bruno Scherrer [France]. Convergence and Rate of Convergence of a Foraging Ant Model
004952 (2007-05) Amine Boumaza [France] ; Bruno Scherrer [France]. Convergence and rate of convergence of a simple ant model
004958 (2007-04-12) Amine Boumaza [France] ; Bruno Scherrer [France]. Optimal control subsumes harmonic control
004E85 (2007) Amine Boumaza [France] ; Bruno Scherrer [France]. Convergence and rate of convergence of a simple ant model
005694 (2006) Amine Boumaza [France] ; Bruno Scherrer [France]. Convergence et taux de convergence d'un algorithme fourmi simple
005706 (2006) Amine Boumaza [France] ; Bruno Scherrer [France]. Optimal control subsumes harmonic control
005989 (2005-06-22) Amine Boumaza [France] ; Bruno Scherrer [France]. Navigation, fonctions harmoniques et contrôle optimal stochastique
005D65 (2005) Amine Boumaza ; Bruno Scherrer. Navigation, fonctions harmoniques et contrôle optimal stochastique
000200 (2015-12-10) Maxime Amblard [France] ; Amine Boumaza [France]. Robots humains, avez-donc une réalité ?
000448 (2015-07-11) Iñaki Fernández Pérez [France] ; Amine Boumaza [France] ; François Charpillet [France]. Decentralized Innovation Marking for Neural Controllers in Embodied Evolution
000453 (2015-07-06) Manel Tagorti [France] ; Bruno Scherrer [France]. On the Rate of Convergence and Error Bounds for LSTD(λ)
000454 (2015-07-06) Boris Lesner [France] ; Bruno Scherrer [France]. Non-Stationary Approximate Modified Policy Iteration
000469 (2015-06-29) Iñaki Fernández Pérez [France] ; Amine Boumaza [France] ; François Charpillet [France]. Influence of Selection Pressure in Online, Distributed Evolutionary Robotics
000A15 (2014-07-29) Iñaki Fernández Pérez [France] ; Amine Boumaza [France] ; François Charpillet [France]. Comparison of Selection Methods in On-line Distributed Evolutionary Robotics
000A93 (2014-06-21) Bruno Scherrer [France]. Approximate Policy Iteration Schemes: A Comparison
000B92 (2014-05) Manel Tagorti [France] ; Bruno Scherrer [France]. Vitesse de convergence et borne d'erreur pour l'algorithme LSTD(λ)
000B93 (2014-05) Bruno Scherrer [France]. Une étude comparative de quelques schémas d'approximation de type iterations sur les politiques
000B95 (2014-05) Manel Tagorti [France] ; Bruno Scherrer [France]. Rate of Convergence and Error Bounds for LSTD(λ)
000D49 (2014) Eugene A. Feinberg [United States] ; Jefferson Huang [United States] ; Bruno Scherrer [France]. Modified policy iteration algorithms are not strongly polynomial for discounted dynamic programming
000F08 (2013-12-05) Bruno Scherrer [France]. Improved and Generalized Upper Bounds on the Complexity of Policy Iteration
000F09 (2013-12-05) Victor Gabillon [France] ; Mohammad Ghavamzadeh [France] ; Bruno Scherrer [France]. Approximate Dynamic Programming Finally Performs Well in the Game of Tetris
000F29 (2013-11-18) Alain Dutech [France] ; Bruno Scherrer [France] ; Christophe Thiery [France]. La carotte et le bâton... et Tetris
001120 (2013-07-01) Bruno Scherrer [France] ; Boris Lesner [France]. Sur l'utilisation de politiques non-stationnaires pour les processus de décision Markoviens à horizon infini
001122 (2013-07-01) Bruno Scherrer [France]. Quelques majorants de la complexité d'itérations sur les politiques
001130 (2013-07-01) Manel Tagorti [France] ; Bruno Scherrer [France] ; Olivier Buffet [France] ; Joerg Hoffmann [France]. Abstraction Pathologies In Markov Decision Processes
001172 (2013-06-10) Manel Tagorti [France] ; Bruno Scherrer [France] ; Olivier Buffet [France] ; Joerg Hoffmann [France]. Abstraction Pathologies In Markov Decision Processes
001194 (2013-06-03) Bruno Scherrer [France]. On the Performance Bounds of some Policy Search Dynamic Programming Algorithms
001244 (2013-04-19) Boris Lesner [France] ; Bruno Scherrer [France]. Tight Performance Bounds for Approximate Modified Policy Iteration with Non-Stationary Policies
001334 (2013-01-01) Bruno Scherrer [France]. Performance Bounds for Lambda Policy Iteration and Application to the Game of Tetris
001758 (2013) Amine Boumaza [France]. How to design good Tetris players
001825 (2012-12-03) Bruno Scherrer [France] ; Boris Lesner [France]. On the Use of Non-Stationary Policies for Stationary Infinite-Horizon Markov Decision Processes
001A35 (2012-07-07) Amine Boumaza [France] ; Armelle Brun [France]. From Neighbors to Global Neighbors in Collaborative Filtering: an Evolutionary Optimization Approach
001B91 (2012-03-26) Amine Boumaza [France] ; Armelle Brun [France]. Stochastic Search for Global Neighbors Selection in Collaborative Filtering
001C03 (2012-03-23) Bruno Scherrer [France]. On the Use of Non-Stationary Policies for Infinite-Horizon Discounted Markov Decision Processes
002267 (2011-06-28) Victor Gabillon [France] ; Alessandro Lazaric [France] ; Mohammad Ghavamzadeh [France] ; Bruno Scherrer [France]. Classification-based Policy Iteration with a Critic
002378 (2011-05-05) Victor Gabillon [France] ; Alessandro Lazaric [France] ; Mohammad Ghavamzadeh [France] ; Bruno Scherrer [France]. Classification-based Policy Iteration with a Critic
002841 (2011) Bruno Scherrer [France]. Performance Bounds for Lambda Policy Iteration and Application to the Game of Tetris
002C27 (2010-06-21) Bruno Scherrer [France]. Should one compute the Temporal Difference fix point or minimize the Bellman Residual? The unified oblique projection view
002C29 (2010-06-21) Christophe Thiery [France] ; Bruno Scherrer [France]. Least-Squares λ Policy Iteration: Bias-Variance Trade-off in Control Problems
002C72 (2010-06-01) Christophe Thiery [France] ; Bruno Scherrer [France]. Least-Squares λ Policy Iteration : optimisme et compromis biais-variance pour le contrôle optimal
002C73 (2010-06-01) Raghav Aras [France] ; Olivier Pietquin [France]. Optimal Average Reward Controllers For POMDPs
003231 (2010) Bruno Scherrer [France] ; Christophe Thiery [France]. Performance bound for Approximate Optimistic Policy Iteration
003232 (2010) Alain Dutech [France] ; Bruno Scherrer [France]. Partially Observable Markov Decision Processes
003565 (2009-06-02) Christophe Thiery [France] ; Bruno Scherrer [France]. Une approche modifiée de Lambda-Policy Iteration
003859 (2009) Christophe Thiery ; Bruno Scherrer. Construction d’un joueur artificiel pour Tetris
003C68 (2009) Christophe Thiery [France] ; Bruno Scherrer [France]. Improvements on Learning Tetris with Cross Entropy
003C92 (2009) Christophe Thiery [France] ; Bruno Scherrer [France]. Building Controllers for Tetris
003D48 (2008-12) Bruno Scherrer [France] ; Shie Mannor [Canada]. Error Reducing Sampling in Reinforcement Learning
003D50 (2008-12) Marek Petrik [United States] ; Bruno Scherrer [France]. Biasing Approximate Dynamic Programming with a Lower Discount Factor
004139 (2008) Alain Dutech [France] ; Bruno Scherrer [France] ; Christophe Thiery [France]. La carotte et le bâton... et Tetris
004474 (2008) Alain Dutech [France] ; Bruno Scherrer [France]. Processus décisionnels de Markov partiellement observables
004612 (2008) Amine Boumaza [France]. A distributed evolutionary approach for fast 3-D stereo reconstruction
004A00 (2007-02-12) Bernard Girau [France] ; Amine Boumaza [France]. Embedded harmonic control for dynamic trajectory planning on FPGA
004F65 (2006-10-23) Bruno Scherrer [France]. Une condition suffisante pour l'implémentation connexionniste asynchrone
005987 (2005-06-25) Amine Boumaza [France]. Learning environment dynamics from self-adaptation. A preliminary investigation
005C50 (2005) Bruno Scherrer [France]. Asynchronous Neurocomputing for optimal control and reinforcement learning with large state spaces
005D52 (2005) Bruno Scherrer. Asynchronous Neurocomputing for optimal control and reinforcement learning with large state spaces
005E49 (2005) Amine Boumaza. Learning environment dynamics from self-adaptation
006E36 (2004) Bruno Scherrer [France]. Approche connexionniste du contrôle optimal
007022 (2004) Bruno Scherrer [France] ; Shie Mannor [United States]. Error reducing sampling in reinforcement learning
007196 (2003-08) Bruno Scherrer [France]. Modular self-organization for a long-living autonomous agent
007272 (2003-04) Bruno Scherrer [France]. Parallel asynchronous distributed computations of optimal control in large state space Markov Decision Processes
007292 (2003-01-06) Bruno Scherrer [France]. Apprentissage de représentation et auto-organisation modulaire pour un agent autonome
007530 (2003) Bruno Scherrer. Modular self-organization for a long-living autonomous agent
007531 (2003) Bruno Scherrer. Modular self-organization for a long-living autonomous agent
007608