The complexity of policy evaluation for finite-horizon partially-observable Markov decision processes

Internal identifier: 000F70 (Istex/Corpus); previous: 000F69; next: 000F71

Authors: Martin Mundhenk; Judy Goldsmith; Eric Allender

Source:

Lecture Notes in Computer Science [0302-9743]; 1997.

RBID : ISTEX:88595D0300464B7D2C5C495EDAFBCDF1DAC76623

Abstract

A partially-observable Markov decision process (POMDP) is a generalization of a Markov decision process that allows for incomplete information regarding the state of the system. We consider several flavors of finite-horizon POMDPs. Our results concern the complexity of the policy evaluation and policy existence problems, which are characterized in terms of completeness for complexity classes. We prove a new upper bound for the policy evaluation problem for POMDPs, showing it is complete for Probabilistic Logspace. From this, we prove policy existence problems for several variants of unobservable, succinctly represented MDPs to be complete for NP^PP, a class for which not many natural problems are known to be complete.

Url:

https://api.istex.fr/document/88595D0300464B7D2C5C495EDAFBCDF1DAC76623/fulltext/pdf

DOI: 10.1007/BFb0029956

Links to Exploration step

ISTEX:88595D0300464B7D2C5C495EDAFBCDF1DAC76623

The document in XML format

<record><TEI wicri:istexFullTextTei="biblStruct"><teiHeader><fileDesc><titleStmt><title xml:lang="en">The complexity of policy evaluation for finite-horizon partially-observable Markov decision processes</title>
<author><name sortKey="Mundhenk, Martin" sort="Mundhenk, Martin" uniqKey="Mundhenk M" first="Martin" last="Mundhenk">Martin Mundhenk</name>
<affiliation><mods:affiliation>Universität Trier, FB IV - Informatik, D-54286, Trier, Germany</mods:affiliation>
</affiliation>
</author>
<author><name sortKey="Goldsmith, Judy" sort="Goldsmith, Judy" uniqKey="Goldsmith J" first="Judy" last="Goldsmith">Judy Goldsmith</name>
<affiliation><mods:affiliation>Dept. of Computer Science, University of Kentucky, 40506-0046, Lexington, KY</mods:affiliation>
</affiliation>
</author>
<author><name sortKey="Allender, Eric" sort="Allender, Eric" uniqKey="Allender E" first="Eric" last="Allender">Eric Allender</name>
<affiliation><mods:affiliation>Dept. of Computer Science, Rutgers University, 08855-1179, Piscataway, NJ</mods:affiliation>
</affiliation>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">ISTEX</idno>
<idno type="RBID">ISTEX:88595D0300464B7D2C5C495EDAFBCDF1DAC76623</idno>
<date when="1997" year="1997">1997</date>
<idno type="doi">10.1007/BFb0029956</idno>
<idno type="url">https://api.istex.fr/document/88595D0300464B7D2C5C495EDAFBCDF1DAC76623/fulltext/pdf</idno>
<idno type="wicri:Area/Istex/Corpus">000F70</idno>
<idno type="wicri:explorRef" wicri:stream="Istex" wicri:step="Corpus" wicri:corpus="ISTEX">000F70</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title level="a" type="main" xml:lang="en">The complexity of policy evaluation for finite-horizon partially-observable Markov decision processes</title>
<author><name sortKey="Mundhenk, Martin" sort="Mundhenk, Martin" uniqKey="Mundhenk M" first="Martin" last="Mundhenk">Martin Mundhenk</name>
<affiliation><mods:affiliation>Universität Trier, FB IV - Informatik, D-54286, Trier, Germany</mods:affiliation>
</affiliation>
</author>
<author><name sortKey="Goldsmith, Judy" sort="Goldsmith, Judy" uniqKey="Goldsmith J" first="Judy" last="Goldsmith">Judy Goldsmith</name>
<affiliation><mods:affiliation>Dept. of Computer Science, University of Kentucky, 40506-0046, Lexington, KY</mods:affiliation>
</affiliation>
</author>
<author><name sortKey="Allender, Eric" sort="Allender, Eric" uniqKey="Allender E" first="Eric" last="Allender">Eric Allender</name>
<affiliation><mods:affiliation>Dept. of Computer Science, Rutgers University, 08855-1179, Piscataway, NJ</mods:affiliation>
</affiliation>
</author>
</analytic>
<monogr></monogr>
<series><title level="s">Lecture Notes in Computer Science</title>
<imprint><date>1997</date>
</imprint>
<idno type="ISSN">0302-9743</idno>
<idno type="eISSN">1611-3349</idno>
</series>
<idno type="istex">88595D0300464B7D2C5C495EDAFBCDF1DAC76623</idno>
<idno type="DOI">10.1007/BFb0029956</idno>
<idno type="ChapterID">13</idno>
<idno type="ChapterID">Chap13</idno>
</biblStruct>
</sourceDesc>
<seriesStmt><idno type="ISSN">0302-9743</idno>
</seriesStmt>
</fileDesc>
<profileDesc><textClass></textClass>
<langUsage><language ident="en">en</language>
</langUsage>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">Abstract: A partially-observable Markov decision process (POMDP) is a generalization of a Markov decision process that allows for incomplete information regarding the state of the system. We consider several flavors of finite-horizon POMDPs. Our results concern the complexity of the policy evaluation and policy existence problems, which are characterized in terms of completeness for complexity classes. We prove a new upper bound for the policy evaluation problem for POMDPs, showing it is complete for Probabilistic Logspace. From this, we prove policy existence problems for several variants of unobservable, succinctly represented MDPs to be complete for NP^PP, a class for which not many natural problems are known to be complete.</div>
</front>
</TEI>
<istex><corpusName>springer</corpusName>
<author><json:item><name>Martin Mundhenk</name>
<affiliations><json:string>Universität Trier, FB IV - Informatik, D-54286, Trier, Germany</json:string>
</affiliations>
</json:item>
<json:item><name>Judy Goldsmith</name>
<affiliations><json:string>Dept. of Computer Science, University of Kentucky, 40506-0046, Lexington, KY</json:string>
</affiliations>
</json:item>
<json:item><name>Eric Allender</name>
<affiliations><json:string>Dept. of Computer Science, Rutgers University, 08855-1179, Piscataway, NJ</json:string>
</affiliations>
</json:item>
</author>
<language><json:string>eng</json:string>
</language>
<originalGenre><json:string>ReviewPaper</json:string>
</originalGenre>
<abstract>Abstract: A partially-observable Markov decision process (POMDP) is a generalization of a Markov decision process that allows for incomplete information regarding the state of the system. We consider several flavors of finite-horizon POMDPs. Our results concern the complexity of the policy evaluation and policy existence problems, which are characterized in terms of completeness for complexity classes. We prove a new upper bound for the policy evaluation problem for POMDPs, showing it is complete for Probabilistic Logspace. From this, we prove policy existence problems for several variants of unobservable, succinctly represented MDPs to be complete for NP^PP, a class for which not many natural problems are known to be complete.</abstract>
<qualityIndicators><score>5.584</score>
<pdfVersion>1.3</pdfVersion>
<pdfPageSize>440 x 666 pts</pdfPageSize>
<refBibsNative>false</refBibsNative>
<keywordCount>0</keywordCount>
<abstractCharCount>736</abstractCharCount>
<pdfWordCount>4276</pdfWordCount>
<pdfCharCount>22716</pdfCharCount>
<pdfPageCount>10</pdfPageCount>
<abstractWordCount>109</abstractWordCount>
</qualityIndicators>
<title>The complexity of policy evaluation for finite-horizon partially-observable Markov decision processes</title>
<chapterId><json:string>13</json:string>
<json:string>Chap13</json:string>
</chapterId>
<refBibs><json:item><author><json:item><name>E Allender</name>
</json:item>
<json:item><name>M Ogihara</name>
</json:item>
</author>
<host><volume>30</volume>
<pages><last>21</last>
<first>1</first>
</pages>
<issue>1</issue>
<author></author>
<title>RAIRO Theoretical Informatics and Applications</title>
<publicationDate>1996</publicationDate>
</host>
<title>Relationships among PL, #L, and the determinant</title>
<publicationDate>1996</publicationDate>
</json:item>
<json:item><author><json:item><name>C Àlvarez</name>
</json:item>
<json:item><name>B Jenner</name>
</json:item>
</author>
<host><volume>107</volume>
<pages><last>30</last>
<first>3</first>
</pages>
<author></author>
<title>Theoretical Computer Science</title>
<publicationDate>1993</publicationDate>
</host>
<title>A very hard log-space counting class</title>
<publicationDate>1993</publicationDate>
</json:item>
<json:item><author><json:item><name>J,L Balcázar</name>
</json:item>
</author>
<host><volume>86</volume>
<pages><last>188</last>
<first>171</first>
</pages>
<author></author>
<title>Artificial Intelligence</title>
<publicationDate>1996</publicationDate>
</host>
<title>The complexity of searching implicit graphs</title>
<publicationDate>1996</publicationDate>
</json:item>
<json:item><author><json:item><name>J,L Balcázar</name>
</json:item>
<json:item><name>A Lozano</name>
</json:item>
<json:item><name>J Torán</name>
</json:item>
</author>
<host><pages><last>377</last>
<first>351</first>
</pages>
<author></author>
<title>Computer Science</title>
<publicationDate>1992</publicationDate>
</host>
<title>The complexity of algorithmic problems on succinct instances</title>
<publicationDate>1992</publicationDate>
</json:item>
<json:item><author><json:item><name>D Beauquier</name>
</json:item>
<json:item><name>D Burago</name>
</json:item>
<json:item><name>A Slissenko</name>
</json:item>
</author>
<host><pages><last>200</last>
<first>191</first>
</pages>
<author></author>
<title>Math</title>
<publicationDate>1995</publicationDate>
</host>
<title>On the complexity of finite memory policies for Markov decision processes</title>
<publicationDate>1995</publicationDate>
</json:item>
<json:item><author><json:item><name>A Borodin</name>
</json:item>
<json:item><name>S Cook</name>
</json:item>
<json:item><name>N Pippenger</name>
</json:item>
</author>
<host><volume>58</volume>
<pages><last>136</last>
<first>113</first>
</pages>
<issue>1-3</issue>
<author></author>
<title>Information and Control</title>
<publicationDate>1983</publicationDate>
</host>
<title>Parallel computation for well-endowed rings and space-bounded probabilistic machines</title>
<publicationDate>1983</publicationDate>
</json:item>
<json:item><author><json:item><name>C Boutilier</name>
</json:item>
<json:item><name>D Poole</name>
</json:item>
</author>
<host><pages><last>1175</last>
<first>1168</first>
</pages>
<author></author>
<title>Proc. 13th National Conference on Artificial Intelligence</title>
<publicationDate>1996</publicationDate>
</host>
<title>Computing optimal policies for partially observable decision processes using compact representations</title>
<publicationDate>1996</publicationDate>
</json:item>
<json:item><author><json:item><name>T Bylander</name>
</json:item>
</author>
<host><volume>69</volume>
<pages><last>204</last>
<first>165</first>
</pages>
<author></author>
<title>Artificial Intelligence</title>
<publicationDate>1994</publicationDate>
</host>
<title>The computational complexity of propositional STRIPS planning</title>
<publicationDate>1994</publicationDate>
</json:item>
<json:item><author><json:item><name>K Erol</name>
</json:item>
<json:item><name>J Hendler</name>
</json:item>
<json:item><name>D Nau</name>
</json:item>
</author>
<host><author></author>
<title>Annals of Mathematics and Artificial Intelligence</title>
<publicationDate>1996</publicationDate>
</host>
<title>Complexity results for hierarchical task-network planning</title>
<publicationDate>1996</publicationDate>
</json:item>
<json:item><author><json:item><name>K Erol</name>
</json:item>
<json:item><name>D Nau</name>
</json:item>
<json:item><name>V,S Subrahmanian</name>
</json:item>
</author>
<host><volume>76</volume>
<pages><last>88</last>
<first>75</first>
</pages>
<author></author>
<title>Artificial Intelligence</title>
<publicationDate>1995</publicationDate>
</host>
<title>Complexity, decidability and undecidability results for domain-independent planning</title>
<publicationDate>1995</publicationDate>
</json:item>
<json:item><author><json:item><name>S Fenner</name>
</json:item>
<json:item><name>L Fortnow</name>
</json:item>
<json:item><name>S Kurtz</name>
</json:item>
</author>
<host><volume>48</volume>
<pages><last>148</last>
<first>116</first>
</pages>
<issue>1</issue>
<author></author>
<title>Journal of Computer and System Sciences</title>
<publicationDate>1994</publicationDate>
</host>
<title>Gap-definable counting classes</title>
<publicationDate>1994</publicationDate>
</json:item>
<json:item><author><json:item><name>H Galperin</name>
</json:item>
<json:item><name>A Wigderson</name>
</json:item>
</author>
<host><volume>56</volume>
<pages><last>198</last>
<first>183</first>
</pages>
<author></author>
<title>Information and Control</title>
<publicationDate>1983</publicationDate>
</host>
<title>Succinct representation of graphs</title>
<publicationDate>1983</publicationDate>
</json:item>
<json:item><author><json:item><name>J Goldsmith</name>
</json:item>
<json:item><name>M Littman</name>
</json:item>
<json:item><name>M Mundhenk</name>
</json:item>
</author>
<host><author></author>
<title>Proc. 13th Conf. on Uncertainty in AI</title>
<publicationDate>1997</publicationDate>
</host>
<title>The complexity of plan existence and evaluation in probabilistic domains</title>
<publicationDate>1997</publicationDate>
</json:item>
<json:item><host><author><json:item><name>J Goldsmith</name>
</json:item>
<json:item><name>C Lusena</name>
</json:item>
<json:item><name>M Mundhenk</name>
</json:item>
</author>
<title>The complexity of deterministically observable finite-horizon Markov decision processes</title>
<publicationDate>1996</publicationDate>
</host>
</json:item>
<json:item><author><json:item><name>H Jung</name>
</json:item>
</author>
<host><pages><last>291</last>
<first>281</first>
</pages>
<author></author>
<title>Proceedings 12th ICALP</title>
<publicationDate>1985</publicationDate>
</host>
<title>On probabilistic time and space</title>
<publicationDate>1985</publicationDate>
</json:item>
<json:item><author><json:item><name>R Ladner</name>
</json:item>
</author>
<host><volume>18</volume>
<pages><last>1097</last>
<first>1087</first>
</pages>
<author></author>
<title>SIAM Journal on Computing</title>
<publicationDate>1989</publicationDate>
</host>
<title>Polynomial space counting problems</title>
<publicationDate>1989</publicationDate>
</json:item>
<json:item><author><json:item><name>M,L Littman</name>
</json:item>
</author>
<host><author></author>
<title>Proc. 14th National Conference on AI. AAAI Press</title>
<publicationDate>1997</publicationDate>
</host>
<title>Probabilistic propositional planning: Representations and complexity</title>
<publicationDate>1997</publicationDate>
</json:item>
<json:item><author><json:item><name>W,S Lovejoy</name>
</json:item>
</author>
<host><volume>28</volume>
<pages><last>66</last>
<first>47</first>
</pages>
<author></author>
<title>Annals of Operations Research</title>
<publicationDate>1991</publicationDate>
</host>
<title>A survey of algorithmic methods for partially observed Markov decision processes</title>
<publicationDate>1991</publicationDate>
</json:item>
<json:item><host><author><json:item><name>C,H Papadimitriou</name>
</json:item>
</author>
<title>Computational Complexity</title>
<publicationDate>1994</publicationDate>
</host>
</json:item>
<json:item><author><json:item><name>C,H Papadimitriou</name>
</json:item>
<json:item><name>J,N Tsitsiklis</name>
</json:item>
</author>
<host><pages><last>654</last>
<first>639</first>
</pages>
<author></author>
<title>SIAM Journal of Control and Optimization</title>
<publicationDate>1986</publicationDate>
</host>
<title>Intractable problems in control theory</title>
<publicationDate>1986</publicationDate>
</json:item>
<json:item><author><json:item><name>C,H Papadimitriou</name>
</json:item>
<json:item><name>J,N Tsitsiklis</name>
</json:item>
</author>
<host><volume>12</volume>
<pages><last>450</last>
<first>441</first>
</pages>
<issue>3</issue>
<author></author>
<title>Mathematics of Operations Research</title>
<publicationDate>1987</publicationDate>
</host>
<title>The complexity of Markov decision processes</title>
<publicationDate>1987</publicationDate>
</json:item>
<json:item><host><author><json:item><name>M,L Puterman</name>
</json:item>
</author>
<title>Markov decision processes</title>
<publicationDate>1994</publicationDate>
</host>
</json:item>
<json:item><author><json:item><name>V Vinay</name>
</json:item>
</author>
<host><pages><last>284</last>
<first>270</first>
</pages>
<author></author>
<title>Proc. 6th Structure in Complexity Theory Conference</title>
<publicationDate>1991</publicationDate>
</host>
<title>Counting auxiliary pushdown automata and semi-unbounded arithmetic circuits</title>
<publicationDate>1991</publicationDate>
</json:item>
<json:item><author><json:item><name>K,W Wagner</name>
</json:item>
</author>
<host><volume>23</volume>
<pages><last>356</last>
<first>325</first>
</pages>
<author></author>
<title>Acta Informatica</title>
<publicationDate>1986</publicationDate>
</host>
<title>The complexity of combinatorial problems with succinct input representation</title>
<publicationDate>1986</publicationDate>
</json:item>
</refBibs>
<genre><json:string>conference</json:string>
</genre>
<serie><editor><json:item><name>Gerhard Goos</name>
</json:item>
<json:item><name>Juris Hartmanis</name>
</json:item>
<json:item><name>Jan van Leeuwen</name>
</json:item>
</editor>
<issn><json:string>0302-9743</json:string>
</issn>
<language><json:string>unknown</json:string>
</language>
<eissn><json:string>1611-3349</json:string>
</eissn>
<title>Lecture Notes in Computer Science</title>
<copyrightDate>1997</copyrightDate>
</serie>
<host><editor><json:item><name>Igor Prívara</name>
</json:item>
<json:item><name>Peter Ružička</name>
</json:item>
</editor>
<subject><json:item><value>Computer Science</value>
</json:item>
<json:item><value>Computer Science</value>
</json:item>
<json:item><value>Theory of Computation</value>
</json:item>
<json:item><value>Software Engineering</value>
</json:item>
<json:item><value>Programming Languages, Compilers, Interpreters</value>
</json:item>
<json:item><value>Discrete Mathematics in Computer Science</value>
</json:item>
</subject>
<isbn><json:string>978-3-540-63437-9</json:string>
</isbn>
<language><json:string>unknown</json:string>
</language>
<eissn><json:string>1611-3349</json:string>
</eissn>
<title>Mathematical Foundations of Computer Science 1997</title>
<bookId><json:string>3540634371</json:string>
</bookId>
<volume>1295</volume>
<pages><last>138</last>
<first>129</first>
</pages>
<issn><json:string>0302-9743</json:string>
</issn>
<genre><json:string>book-series</json:string>
</genre>
<eisbn><json:string>978-3-540-69547-9</json:string>
</eisbn>
<copyrightDate>1997</copyrightDate>
<doi><json:string>10.1007/BFb0029943</json:string>
</doi>
</host>
<publicationDate>2005</publicationDate>
<copyrightDate>1997</copyrightDate>
<doi><json:string>10.1007/BFb0029956</json:string>
</doi>
<id>88595D0300464B7D2C5C495EDAFBCDF1DAC76623</id>
<score>0.39468294</score>
<fulltext><json:item><extension>pdf</extension>
<original>true</original>
<mimetype>application/pdf</mimetype>
<uri>https://api.istex.fr/document/88595D0300464B7D2C5C495EDAFBCDF1DAC76623/fulltext/pdf</uri>
</json:item>
<json:item><extension>zip</extension>
<original>false</original>
<mimetype>application/zip</mimetype>
<uri>https://api.istex.fr/document/88595D0300464B7D2C5C495EDAFBCDF1DAC76623/fulltext/zip</uri>
</json:item>
<istex:fulltextTEI uri="https://api.istex.fr/document/88595D0300464B7D2C5C495EDAFBCDF1DAC76623/fulltext/tei"><teiHeader><fileDesc><titleStmt><title level="a" type="main" xml:lang="en">The complexity of policy evaluation for finite-horizon partially-observable Markov decision processes</title>
<respStmt><resp>Bibliographic references retrieved via GROBID</resp>
<name resp="ISTEX-API">ISTEX-API (INIST-CNRS)</name>
</respStmt>
</titleStmt>
<publicationStmt><authority>ISTEX</authority>
<publisher>Springer Berlin Heidelberg</publisher>
<pubPlace>Berlin, Heidelberg</pubPlace>
<availability><p>Springer-Verlag, 1997</p>
</availability>
<date>1997</date>
</publicationStmt>
<sourceDesc><biblStruct type="inbook"><analytic><title level="a" type="main" xml:lang="en">The complexity of policy evaluation for finite-horizon partially-observable Markov decision processes</title>
<author xml:id="author-1"><persName><forename type="first">Martin</forename>
<surname>Mundhenk</surname>
</persName>
<affiliation>Universität Trier, FB IV - Informatik, D-54286, Trier, Germany</affiliation>
</author>
<author xml:id="author-2"><persName><forename type="first">Judy</forename>
<surname>Goldsmith</surname>
</persName>
<affiliation>Dept. of Computer Science, University of Kentucky, 40506-0046, Lexington, KY</affiliation>
</author>
<author xml:id="author-3"><persName><forename type="first">Eric</forename>
<surname>Allender</surname>
</persName>
<affiliation>Dept. of Computer Science, Rutgers University, 08855-1179, Piscataway, NJ</affiliation>
</author>
</analytic>
<monogr><title level="m">Mathematical Foundations of Computer Science 1997</title>
<title level="m" type="sub">22nd International Symposium, MFCS '97 Bratislava, Slovakia, August 25–29, 1997 Proceedings</title>
<idno type="pISBN">978-3-540-63437-9</idno>
<idno type="eISBN">978-3-540-69547-9</idno>
<idno type="pISSN">0302-9743</idno>
<idno type="eISSN">1611-3349</idno>
<idno type="DOI">10.1007/BFb0029943</idno>
<idno type="book-ID">3540634371</idno>
<idno type="book-title-ID">46525</idno>
<idno type="book-volume-number">1295</idno>
<idno type="book-chapter-count">51</idno>
<editor><persName><forename type="first">Igor</forename>
<surname>Prívara</surname>
</persName>
</editor>
<editor><persName><forename type="first">Peter</forename>
<surname>Ružička</surname>
</persName>
</editor>
<imprint><publisher>Springer Berlin Heidelberg</publisher>
<pubPlace>Berlin, Heidelberg</pubPlace>
<date type="published" when="2005-06-17"></date>
<biblScope unit="volume">1295</biblScope>
<biblScope unit="page" from="129">129</biblScope>
<biblScope unit="page" to="138">138</biblScope>
</imprint>
</monogr>
<series><title level="s">Lecture Notes in Computer Science</title>
<editor><persName><forename type="first">Gerhard</forename>
<surname>Goos</surname>
</persName>
</editor>
<editor><persName><forename type="first">Juris</forename>
<surname>Hartmanis</surname>
</persName>
</editor>
<editor><persName><forename type="first">Jan</forename>
<surname>van Leeuwen</surname>
</persName>
</editor>
<biblScope><date>1997</date>
</biblScope>
<idno type="pISSN">0302-9743</idno>
<idno type="eISSN">1611-3349</idno>
<idno type="series-Id">558</idno>
</series>
<idno type="istex">88595D0300464B7D2C5C495EDAFBCDF1DAC76623</idno>
<idno type="DOI">10.1007/BFb0029956</idno>
<idno type="ChapterID">13</idno>
<idno type="ChapterID">Chap13</idno>
</biblStruct>
</sourceDesc>
</fileDesc>
<profileDesc><creation><date>1997</date>
</creation>
<langUsage><language ident="en">en</language>
</langUsage>
<abstract xml:lang="en"><p>Abstract: A partially-observable Markov decision process (POMDP) is a generalization of a Markov decision process that allows for incomplete information regarding the state of the system. We consider several flavors of finite-horizon POMDPs. Our results concern the complexity of the policy evaluation and policy existence problems, which are characterized in terms of completeness for complexity classes. We prove a new upper bound for the policy evaluation problem for POMDPs, showing it is complete for Probabilistic Logspace. From this, we prove policy existence problems for several variants of unobservable, succinctly represented MDPs to be complete for NP^PP, a class for which not many natural problems are known to be complete.</p>
</abstract>
<textClass><keywords scheme="Book-Subject-Collection"><list><label>SUCO11645</label>
<item><term>Computer Science</term>
</item>
</list>
</keywords>
</textClass>
<textClass><keywords scheme="Book-Subject-Group"><list><label>I</label>
<label>I16005</label>
<label>I14029</label>
<label>I14037</label>
<label>I17028</label>
<item><term>Computer Science</term>
</item>
<item><term>Theory of Computation</term>
</item>
<item><term>Software Engineering</term>
</item>
<item><term>Programming Languages, Compilers, Interpreters</term>
</item>
<item><term>Discrete Mathematics in Computer Science</term>
</item>
</list>
</keywords>
</textClass>
</profileDesc>
<revisionDesc><change when="2005-06-17">Published</change>
<change xml:id="refBibs-istex" who="#ISTEX-API" when="2016-11-22">References added</change>
<change xml:id="refBibs-istex" who="#ISTEX-API" when="2017-01-20">References added</change>
</revisionDesc>
</teiHeader>
</istex:fulltextTEI>
<json:item><extension>txt</extension>
<original>false</original>
<mimetype>text/plain</mimetype>
<uri>https://api.istex.fr/document/88595D0300464B7D2C5C495EDAFBCDF1DAC76623/fulltext/txt</uri>
</json:item>
</fulltext>
<metadata><istex:metadataXml wicri:clean="Springer, Publisher found" wicri:toSee="no header"><istex:xmlDeclaration>version="1.0" encoding="UTF-8"</istex:xmlDeclaration>
<istex:docType PUBLIC="-//Springer-Verlag//DTD A++ V2.4//EN" URI="http://devel.springer.de/A++/V2.4/DTD/A++V2.4.dtd" name="istex:docType"></istex:docType>
<istex:document><Publisher><PublisherInfo><PublisherName>Springer Berlin Heidelberg</PublisherName>
<PublisherLocation>Berlin, Heidelberg</PublisherLocation>
</PublisherInfo>
<Series><SeriesInfo TocLevels="0"><SeriesID>558</SeriesID>
<SeriesPrintISSN>0302-9743</SeriesPrintISSN>
<SeriesElectronicISSN>1611-3349</SeriesElectronicISSN>
<SeriesTitle Language="En">Lecture Notes in Computer Science</SeriesTitle>
<SeriesAbbreviatedTitle>Lect Notes Comput Sci</SeriesAbbreviatedTitle>
</SeriesInfo>
<SeriesHeader><EditorGroup><Editor><EditorName DisplayOrder="Western"><GivenName>Gerhard</GivenName>
<FamilyName>Goos</FamilyName>
</EditorName>
</Editor>
<Editor><EditorName DisplayOrder="Western"><GivenName>Juris</GivenName>
<FamilyName>Hartmanis</FamilyName>
</EditorName>
</Editor>
<Editor><EditorName DisplayOrder="Western"><GivenName>Jan</GivenName>
<Particle>van</Particle>
<FamilyName>Leeuwen</FamilyName>
</EditorName>
</Editor>
</EditorGroup>
</SeriesHeader>
<Book Language="En"><BookInfo MediaType="eBook" Language="En" BookProductType="Proceedings" TocLevels="0" NumberingStyle="Unnumbered"><BookID>3540634371</BookID>
<BookTitle>Mathematical Foundations of Computer Science 1997</BookTitle>
<BookSubTitle>22nd International Symposium, MFCS '97 Bratislava, Slovakia, August 25–29, 1997 Proceedings</BookSubTitle>
<BookVolumeNumber>1295</BookVolumeNumber>
<BookDOI>10.1007/BFb0029943</BookDOI>
<BookTitleID>46525</BookTitleID>
<BookPrintISBN>978-3-540-63437-9</BookPrintISBN>
<BookElectronicISBN>978-3-540-69547-9</BookElectronicISBN>
<BookChapterCount>51</BookChapterCount>
<BookCopyright><CopyrightHolderName>Springer-Verlag</CopyrightHolderName>
<CopyrightYear>1997</CopyrightYear>
</BookCopyright>
<BookSubjectGroup><BookSubject Code="I" Type="Primary">Computer Science</BookSubject>
<BookSubject Code="I16005" Priority="1" Type="Secondary">Theory of Computation</BookSubject>
<BookSubject Code="I14029" Priority="2" Type="Secondary">Software Engineering</BookSubject>
<BookSubject Code="I14037" Priority="3" Type="Secondary">Programming Languages, Compilers, Interpreters</BookSubject>
<BookSubject Code="I17028" Priority="4" Type="Secondary">Discrete Mathematics in Computer Science</BookSubject>
<SubjectCollection Code="SUCO11645">Computer Science</SubjectCollection>
</BookSubjectGroup>
</BookInfo>
<BookHeader><EditorGroup><Editor><EditorName DisplayOrder="Western"><GivenName>Igor</GivenName>
<FamilyName>Prívara</FamilyName>
</EditorName>
</Editor>
<Editor><EditorName DisplayOrder="Western"><GivenName>Peter</GivenName>
<FamilyName>Ružička</FamilyName>
</EditorName>
</Editor>
</EditorGroup>
</BookHeader>
<Chapter ID="Chap13" Language="En"><ChapterInfo ChapterType="ReviewPaper" NumberingStyle="Unnumbered" TocLevels="0" ContainsESM="No"><ChapterID>13</ChapterID>
<ChapterDOI>10.1007/BFb0029956</ChapterDOI>
<ChapterSequenceNumber>13</ChapterSequenceNumber>
<ChapterTitle Language="En">The complexity of policy evaluation for finite-horizon partially-observable Markov decision processes</ChapterTitle>
<ChapterCategory>Contributed Papers</ChapterCategory>
<ChapterFirstPage>129</ChapterFirstPage>
<ChapterLastPage>138</ChapterLastPage>
<ChapterCopyright><CopyrightHolderName>Springer-Verlag</CopyrightHolderName>
<CopyrightYear>1997</CopyrightYear>
</ChapterCopyright>
<ChapterHistory><OnlineDate><Year>2005</Year>
<Month>6</Month>
<Day>17</Day>
</OnlineDate>
</ChapterHistory>
<ChapterGrants Type="Regular"><MetadataGrant Grant="OpenAccess"></MetadataGrant>
<AbstractGrant Grant="OpenAccess"></AbstractGrant>
<BodyPDFGrant Grant="Restricted"></BodyPDFGrant>
<BodyHTMLGrant Grant="Restricted"></BodyHTMLGrant>
<BibliographyGrant Grant="Restricted"></BibliographyGrant>
<ESMGrant Grant="Restricted"></ESMGrant>
</ChapterGrants>
<ChapterContext><SeriesID>558</SeriesID>
<BookID>3540634371</BookID>
<BookTitle>Mathematical Foundations of Computer Science 1997</BookTitle>
</ChapterContext>
</ChapterInfo>
<ChapterHeader><AuthorGroup><Author AffiliationIDS="Aff1"><AuthorName DisplayOrder="Western"><GivenName>Martin</GivenName>
<FamilyName>Mundhenk</FamilyName>
</AuthorName>
</Author>
<Author AffiliationIDS="Aff2"><AuthorName DisplayOrder="Western"><GivenName>Judy</GivenName>
<FamilyName>Goldsmith</FamilyName>
</AuthorName>
</Author>
<Author AffiliationIDS="Aff3"><AuthorName DisplayOrder="Western"><GivenName>Eric</GivenName>
<FamilyName>Allender</FamilyName>
</AuthorName>
</Author>
<Affiliation ID="Aff1"><OrgName>Universität Trier, FB IV - Informatik</OrgName>
<OrgAddress><Postcode>D-54286</Postcode>
<City>Trier</City>
<Country>Germany</Country>
</OrgAddress>
</Affiliation>
<Affiliation ID="Aff2"><OrgDivision>Dept. of Computer Science</OrgDivision>
<OrgName>University of Kentucky</OrgName>
<OrgAddress><Postcode>40506-0046</Postcode>
<City>Lexington</City>
<State>KY</State>
</OrgAddress>
</Affiliation>
<Affiliation ID="Aff3"><OrgDivision>Dept. of Computer Science</OrgDivision>
<OrgName>Rutgers University</OrgName>
<OrgAddress><Postcode>08855-1179</Postcode>
<City>Piscataway</City>
<State>NJ</State>
</OrgAddress>
</Affiliation>
</AuthorGroup>
<Abstract ID="Abs1" Language="En"><Heading>Abstract</Heading>
<Para>A partially-observable Markov decision process (POMDP) is a generalization of a Markov decision process that allows for incomplete information regarding the state of the system. We consider several flavors of finite-horizon POMDPs. Our results concern the complexity of the policy evaluation and policy existence problems, which are characterized in terms of completeness for complexity classes.</Para>
<Para>We prove a new upper bound for the policy evaluation problem for POMDPs, showing it is complete for Probabilistic Logspace. From this, we prove policy existence problems for several variants of unobservable, succinctly represented MDPs to be complete for NP<Superscript>PP</Superscript>
, a class for which not many natural problems are known to be complete.</Para>
</Abstract>
<ArticleNote Type="Misc"><SimplePara>Supported in part by the Office of the Vice Chancellor for Research and Graduate Studies at the University of Kentucky, and by the Deutsche Forschungsgemeinschaft (DFG), grant Mu 1226/2-1. Part of the work was done at University of Kentucky.</SimplePara>
</ArticleNote>
<ArticleNote Type="Misc"><SimplePara>Supported in part by NSF grant CCR-9315354.</SimplePara>
</ArticleNote>
<ArticleNote Type="Misc"><SimplePara>Supported in part by NSF grant 9509603. Portions of the work were performed while at the Institute of Mathematical Sciences, Chennai (Madras), India, and at the Wilhelm-Schickard Institut für Informatik, Universität Tübingen (supported by DFG grant TU 7/117-1).</SimplePara>
</ArticleNote>
</ChapterHeader>
<NoBody></NoBody>
</Chapter>
</Book>
</Series>
</Publisher>
</istex:document>
</istex:metadataXml>
<mods version="3.6"><titleInfo lang="en"><title>The complexity of policy evaluation for finite-horizon partially-observable Markov decision processes</title>
</titleInfo>
<titleInfo type="alternative" contentType="CDATA" lang="en"><title>The complexity of policy evaluation for finite-horizon partially-observable Markov decision processes</title>
</titleInfo>
<name type="personal"><namePart type="given">Martin</namePart>
<namePart type="family">Mundhenk</namePart>
<affiliation>Universität Trier, FB IV - Informatik, D-54286, Trier, Germany</affiliation>
<role><roleTerm type="text">author</roleTerm>
</role>
</name>
<name type="personal"><namePart type="given">Judy</namePart>
<namePart type="family">Goldsmith</namePart>
<affiliation>Dept. of Computer Science, University of Kentucky, 40506-0046, Lexington, KY</affiliation>
<role><roleTerm type="text">author</roleTerm>
</role>
</name>
<name type="personal"><namePart type="given">Eric</namePart>
<namePart type="family">Allender</namePart>
<affiliation>Dept. of Computer Science, Rutgers University, 08855-1179, Piscataway, NJ</affiliation>
<role><roleTerm type="text">author</roleTerm>
</role>
</name>
<typeOfResource>text</typeOfResource>
<genre type="conference" displayLabel="ReviewPaper"></genre>
<originInfo><publisher>Springer Berlin Heidelberg</publisher>
<place><placeTerm type="text">Berlin, Heidelberg</placeTerm>
</place>
<dateIssued encoding="w3cdtf">2005-06-17</dateIssued>
<copyrightDate encoding="w3cdtf">1997</copyrightDate>
</originInfo>
<language><languageTerm type="code" authority="rfc3066">en</languageTerm>
<languageTerm type="code" authority="iso639-2b">eng</languageTerm>
</language>
<physicalDescription><internetMediaType>text/html</internetMediaType>
</physicalDescription>
<abstract lang="en">Abstract: A partially-observable Markov decision process (POMDP) is a generalization of a Markov decision process that allows for incomplete information regarding the state of the system. We consider several flavors of finite-horizon POMDPs. Our results concern the complexity of the policy evaluation and policy existence problems, which are characterized in terms of completeness for complexity classes. We prove a new upper bound for the policy evaluation problem for POMDPs, showing it is complete for Probabilistic Logspace. From this, we prove policy existence problems for several variants of unobservable, succinctly represented MDPs to be complete for NP^PP, a class for which not many natural problems are known to be complete.</abstract>
<relatedItem type="host"><titleInfo><title>Mathematical Foundations of Computer Science 1997</title>
<subTitle>22nd International Symposium, MFCS '97 Bratislava, Slovakia, August 25–29, 1997 Proceedings</subTitle>
</titleInfo>
<name type="personal"><namePart type="given">Igor</namePart>
<namePart type="family">Prívara</namePart>
<role><roleTerm type="text">editor</roleTerm>
</role>
</name>
<name type="personal"><namePart type="given">Peter</namePart>
<namePart type="family">Ružička</namePart>
<role><roleTerm type="text">editor</roleTerm>
</role>
</name>
<genre type="book-series" displayLabel="Proceedings"></genre>
<originInfo><copyrightDate encoding="w3cdtf">1997</copyrightDate>
<issuance>monographic</issuance>
</originInfo>
<subject><genre>Book-Subject-Collection</genre>
<topic authority="SpringerSubjectCodes" authorityURI="SUCO11645">Computer Science</topic>
</subject>
<subject><genre>Book-Subject-Group</genre>
<topic authority="SpringerSubjectCodes" authorityURI="I">Computer Science</topic>
<topic authority="SpringerSubjectCodes" authorityURI="I16005">Theory of Computation</topic>
<topic authority="SpringerSubjectCodes" authorityURI="I14029">Software Engineering</topic>
<topic authority="SpringerSubjectCodes" authorityURI="I14037">Programming Languages, Compilers, Interpreters</topic>
<topic authority="SpringerSubjectCodes" authorityURI="I17028">Discrete Mathematics in Computer Science</topic>
</subject>
<identifier type="DOI">10.1007/BFb0029943</identifier>
<identifier type="ISBN">978-3-540-63437-9</identifier>
<identifier type="eISBN">978-3-540-69547-9</identifier>
<identifier type="ISSN">0302-9743</identifier>
<identifier type="eISSN">1611-3349</identifier>
<identifier type="BookTitleID">46525</identifier>
<identifier type="BookID">3540634371</identifier>
<identifier type="BookChapterCount">51</identifier>
<identifier type="BookVolumeNumber">1295</identifier>
<part><date>1997</date>
<detail type="volume"><number>1295</number>
<caption>vol.</caption>
</detail>
<extent unit="pages"><start>129</start>
<end>138</end>
</extent>
</part>
<recordInfo><recordOrigin>Springer-Verlag, 1997</recordOrigin>
</recordInfo>
</relatedItem>
<relatedItem type="series"><titleInfo><title>Lecture Notes in Computer Science</title>
</titleInfo>
<name type="personal"><namePart type="given">Gerhard</namePart>
<namePart type="family">Goos</namePart>
<role><roleTerm type="text">editor</roleTerm>
</role>
</name>
<name type="personal"><namePart type="given">Juris</namePart>
<namePart type="family">Hartmanis</namePart>
<role><roleTerm type="text">editor</roleTerm>
</role>
</name>
<name type="personal"><namePart type="given">Jan</namePart>
<namePart type="family">van Leeuwen</namePart>
<role><roleTerm type="text">editor</roleTerm>
</role>
</name>
<originInfo><copyrightDate encoding="w3cdtf">1997</copyrightDate>
<issuance>serial</issuance>
</originInfo>
<identifier type="ISSN">0302-9743</identifier>
<identifier type="eISSN">1611-3349</identifier>
<identifier type="SeriesID">558</identifier>
<recordInfo><recordOrigin>Springer-Verlag, 1997</recordOrigin>
</recordInfo>
</relatedItem>
<identifier type="istex">88595D0300464B7D2C5C495EDAFBCDF1DAC76623</identifier>
<identifier type="DOI">10.1007/BFb0029956</identifier>
<identifier type="ChapterID">13</identifier>
<identifier type="ChapterID">Chap13</identifier>
<accessCondition type="use and reproduction" contentType="copyright">Springer-Verlag, 1997</accessCondition>
<recordInfo><recordContentSource>SPRINGER</recordContentSource>
<recordOrigin>Springer-Verlag, 1997</recordOrigin>
</recordInfo>
</mods>
</metadata>
</istex>
</record>

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Wicri/Rhénanie/explor/UnivTrevesV1/Data/Istex/Corpus

HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000F70 | SxmlIndent | more

HfdSelect -h $EXPLOR_AREA/Data/Istex/Corpus/biblio.hfd -nk 000F70 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Wicri/Rhénanie
   |area=    UnivTrevesV1
   |flux=    Istex
   |étape=   Corpus
   |type=    RBID
   |clé=     ISTEX:88595D0300464B7D2C5C495EDAFBCDF1DAC76623
   |texte=   The complexity of policy evaluation for finite-horizon partially-observable Markov decision processes
}}

This area was generated with Dilib version V0.6.31.
Data generation: Sat Jul 22 16:29:01 2017. Site generation: Wed Feb 28 14:55:37 2024

	Exploration server on the University of Trier
	Warning: this site is under development! It was generated automatically from raw corpora, so the information has not been validated.
