Matches in SemOpenAlex for { <https://semopenalex.org/work/W117060268> ?p ?o ?g. }
- W117060268 abstract "Policy-gradient algorithms are attractive as a scalable approach to learning approximate policies for controlling partially observable Markov decision processes (POMDPs). POMDPs can be used to model a wide variety of learning problems, from robot navigation to speech recognition to stock trading. The downside of this generality is that exact algorithms are computationally intractable, motivating approximate methods. Existing policy-gradient methods have worked well for problems admitting good memory-less solutions, but have failed to scale to large problems requiring memory. This paper develops novel algorithms for learning policies with memory. We present a new policy-gradient algorithm that uses an explicit model of the POMDP to estimate gradients, and demonstrate its effectiveness on problems with tens of thousands of states. We also describe three new Monte-Carlo algorithms that learn by interacting with their environment. We compare these algorithms on non-trivial POMDPs, including noisy robot navigation and multi-agent settings." @default.
- W117060268 created "2016-06-24" @default.
- W117060268 creator A5015132137 @default.
- W117060268 creator A5077176880 @default.
- W117060268 date "2002-01-01" @default.
- W117060268 modified "2023-09-23" @default.
- W117060268 title "Internal-State Policy-Gradient Algorithms for Partially Observable Markov Decision Processes" @default.
- W117060268 cites W11162148 @default.
- W117060268 cites W131709709 @default.
- W117060268 cites W1487408605 @default.
- W117060268 cites W1500894465 @default.
- W117060268 cites W1539216098 @default.
- W117060268 cites W1541084404 @default.
- W117060268 cites W1555477527 @default.
- W117060268 cites W1560263223 @default.
- W117060268 cites W1563317173 @default.
- W117060268 cites W1585398001 @default.
- W117060268 cites W159191692 @default.
- W117060268 cites W1594297126 @default.
- W117060268 cites W1602007439 @default.
- W117060268 cites W1606274310 @default.
- W117060268 cites W1640774615 @default.
- W117060268 cites W1657542410 @default.
- W117060268 cites W1657674574 @default.
- W117060268 cites W1701684472 @default.
- W117060268 cites W180325379 @default.
- W117060268 cites W1814308503 @default.
- W117060268 cites W1880549478 @default.
- W117060268 cites W1914583973 @default.
- W117060268 cites W1934019294 @default.
- W117060268 cites W1965537434 @default.
- W117060268 cites W1965786092 @default.
- W117060268 cites W1970602736 @default.
- W117060268 cites W1983016559 @default.
- W117060268 cites W2011418219 @default.
- W117060268 cites W2032100464 @default.
- W117060268 cites W2034725503 @default.
- W117060268 cites W2046765929 @default.
- W117060268 cites W2051195188 @default.
- W117060268 cites W2076420009 @default.
- W117060268 cites W2107726111 @default.
- W117060268 cites W2113889826 @default.
- W117060268 cites W2114501936 @default.
- W117060268 cites W2119717200 @default.
- W117060268 cites W2122695052 @default.
- W117060268 cites W2125510930 @default.
- W117060268 cites W2125838338 @default.
- W117060268 cites W2141899961 @default.
- W117060268 cites W2146398222 @default.
- W117060268 cites W2155027007 @default.
- W117060268 cites W2156737235 @default.
- W117060268 cites W2157140289 @default.
- W117060268 cites W2160067530 @default.
- W117060268 cites W2164056559 @default.
- W117060268 cites W2165805036 @default.
- W117060268 cites W2165959773 @default.
- W117060268 cites W2253356247 @default.
- W117060268 cites W2312609093 @default.
- W117060268 cites W27992807 @default.
- W117060268 cites W2898916015 @default.
- W117060268 cites W91039700 @default.
- W117060268 cites W2116009284 @default.
- W117060268 hasPublicationYear "2002" @default.
- W117060268 type Work @default.
- W117060268 sameAs 117060268 @default.
- W117060268 citedByCount "0" @default.
- W117060268 crossrefType "journal-article" @default.
- W117060268 hasAuthorship W117060268A5015132137 @default.
- W117060268 hasAuthorship W117060268A5077176880 @default.
- W117060268 hasConcept C105795698 @default.
- W117060268 hasConcept C106189395 @default.
- W117060268 hasConcept C11413529 @default.
- W117060268 hasConcept C119857082 @default.
- W117060268 hasConcept C121332964 @default.
- W117060268 hasConcept C126255220 @default.
- W117060268 hasConcept C154945302 @default.
- W117060268 hasConcept C15744967 @default.
- W117060268 hasConcept C159886148 @default.
- W117060268 hasConcept C163836022 @default.
- W117060268 hasConcept C17098449 @default.
- W117060268 hasConcept C2780767217 @default.
- W117060268 hasConcept C32848918 @default.
- W117060268 hasConcept C33923547 @default.
- W117060268 hasConcept C41008148 @default.
- W117060268 hasConcept C48044578 @default.
- W117060268 hasConcept C542102704 @default.
- W117060268 hasConcept C62520636 @default.
- W117060268 hasConcept C77088390 @default.
- W117060268 hasConcept C98763669 @default.
- W117060268 hasConceptScore W117060268C105795698 @default.
- W117060268 hasConceptScore W117060268C106189395 @default.
- W117060268 hasConceptScore W117060268C11413529 @default.
- W117060268 hasConceptScore W117060268C119857082 @default.
- W117060268 hasConceptScore W117060268C121332964 @default.
- W117060268 hasConceptScore W117060268C126255220 @default.
- W117060268 hasConceptScore W117060268C154945302 @default.
- W117060268 hasConceptScore W117060268C15744967 @default.
- W117060268 hasConceptScore W117060268C159886148 @default.
- W117060268 hasConceptScore W117060268C163836022 @default.
- W117060268 hasConceptScore W117060268C17098449 @default.