Matches in SemOpenAlex for { <https://semopenalex.org/work/W1640774615> ?p ?o ?g. }
- W1640774615 endingPage "436" @default.
- W1640774615 startingPage "427" @default.
- W1640774615 abstract "Reactive (memoryless) policies are sufficient in completely observable Markov decision processes (MDPs), but some kind of memory is usually necessary for optimal control of a partially observable MDP. Policies with finite memory can be represented as finite-state automata. In this paper, we extend Baird and Moore's VAPS algorithm to the problem of learning general finite-state automata. Because it performs stochastic gradient descent, this algorithm can be shown to converge to a locally optimal finite-state controller. We provide the details of the algorithm and then consider the question of under what conditions stochastic gradient descent will outperform exact gradient descent. We conclude with empirical results comparing the performance of stochastic and exact gradient descent, and showing the ability of our algorithm to extract the useful information contained in the sequence of past observations to compensate for the lack of observability at each time-step." @default.
- W1640774615 created "2016-06-24" @default.
- W1640774615 creator A5005123625 @default.
- W1640774615 creator A5012862284 @default.
- W1640774615 creator A5074569524 @default.
- W1640774615 creator A5091208273 @default.
- W1640774615 date "1999-07-30" @default.
- W1640774615 modified "2023-09-26" @default.
- W1640774615 title "Learning finite-state controllers for partially observable environments" @default.
- W1640774615 cites W1494689917 @default.
- W1640774615 cites W1515851193 @default.
- W1640774615 cites W1541084404 @default.
- W1640774615 cites W1542709260 @default.
- W1640774615 cites W1555801537 @default.
- W1640774615 cites W1558845053 @default.
- W1640774615 cites W1576452626 @default.
- W1640774615 cites W1583380718 @default.
- W1640774615 cites W1646707810 @default.
- W1640774615 cites W1657542410 @default.
- W1640774615 cites W1687873425 @default.
- W1640774615 cites W1701684472 @default.
- W1640774615 cites W180325379 @default.
- W1640774615 cites W2011418219 @default.
- W1640774615 cites W2012664173 @default.
- W1640774615 cites W2028145673 @default.
- W1640774615 cites W2034725503 @default.
- W1640774615 cites W2044375425 @default.
- W1640774615 cites W2061504687 @default.
- W1640774615 cites W2099873296 @default.
- W1640774615 cites W2100677568 @default.
- W1640774615 cites W2107726111 @default.
- W1640774615 cites W2109393574 @default.
- W1640774615 cites W2119567691 @default.
- W1640774615 cites W2121863487 @default.
- W1640774615 cites W2160067530 @default.
- W1640774615 cites W2164056559 @default.
- W1640774615 cites W2168359464 @default.
- W1640774615 cites W2912185451 @default.
- W1640774615 hasPublicationYear "1999" @default.
- W1640774615 type Work @default.
- W1640774615 sameAs 1640774615 @default.
- W1640774615 citedByCount "130" @default.
- W1640774615 countsByYear W16407746152012 @default.
- W1640774615 countsByYear W16407746152013 @default.
- W1640774615 countsByYear W16407746152014 @default.
- W1640774615 countsByYear W16407746152015 @default.
- W1640774615 countsByYear W16407746152016 @default.
- W1640774615 countsByYear W16407746152017 @default.
- W1640774615 countsByYear W16407746152018 @default.
- W1640774615 countsByYear W16407746152019 @default.
- W1640774615 countsByYear W16407746152020 @default.
- W1640774615 countsByYear W16407746152021 @default.
- W1640774615 crossrefType "proceedings-article" @default.
- W1640774615 hasAuthorship W1640774615A5005123625 @default.
- W1640774615 hasAuthorship W1640774615A5012862284 @default.
- W1640774615 hasAuthorship W1640774615A5074569524 @default.
- W1640774615 hasAuthorship W1640774615A5091208273 @default.
- W1640774615 hasConcept C105795698 @default.
- W1640774615 hasConcept C106189395 @default.
- W1640774615 hasConcept C11413529 @default.
- W1640774615 hasConcept C119857082 @default.
- W1640774615 hasConcept C121332964 @default.
- W1640774615 hasConcept C126255220 @default.
- W1640774615 hasConcept C153258448 @default.
- W1640774615 hasConcept C154945302 @default.
- W1640774615 hasConcept C159886148 @default.
- W1640774615 hasConcept C167822520 @default.
- W1640774615 hasConcept C203479927 @default.
- W1640774615 hasConcept C206688291 @default.
- W1640774615 hasConcept C2775924081 @default.
- W1640774615 hasConcept C2778112365 @default.
- W1640774615 hasConcept C28826006 @default.
- W1640774615 hasConcept C2983497884 @default.
- W1640774615 hasConcept C32848918 @default.
- W1640774615 hasConcept C33923547 @default.
- W1640774615 hasConcept C36299963 @default.
- W1640774615 hasConcept C41008148 @default.
- W1640774615 hasConcept C47446073 @default.
- W1640774615 hasConcept C48103436 @default.
- W1640774615 hasConcept C50644808 @default.
- W1640774615 hasConcept C54355233 @default.
- W1640774615 hasConcept C62520636 @default.
- W1640774615 hasConcept C6557445 @default.
- W1640774615 hasConcept C86803240 @default.
- W1640774615 hasConcept C98763669 @default.
- W1640774615 hasConceptScore W1640774615C105795698 @default.
- W1640774615 hasConceptScore W1640774615C106189395 @default.
- W1640774615 hasConceptScore W1640774615C11413529 @default.
- W1640774615 hasConceptScore W1640774615C119857082 @default.
- W1640774615 hasConceptScore W1640774615C121332964 @default.
- W1640774615 hasConceptScore W1640774615C126255220 @default.
- W1640774615 hasConceptScore W1640774615C153258448 @default.
- W1640774615 hasConceptScore W1640774615C154945302 @default.
- W1640774615 hasConceptScore W1640774615C159886148 @default.
- W1640774615 hasConceptScore W1640774615C167822520 @default.
- W1640774615 hasConceptScore W1640774615C203479927 @default.
- W1640774615 hasConceptScore W1640774615C206688291 @default.
- W1640774615 hasConceptScore W1640774615C2775924081 @default.