Matches in SemOpenAlex for { <https://semopenalex.org/work/W1921016406> ?p ?o ?g. }
- W1921016406 endingPage "28" @default.
- W1921016406 startingPage "1" @default.
- W1921016406 abstract "Markov decision processes (MDPs) provide a rich framework for planning under uncertainty. However, exactly solving a large MDP is usually intractable due to the “curse of dimensionality”: the state space grows exponentially with the number of state variables. Online algorithms tackle this problem by avoiding computing a policy for the entire state space. On the other hand, since an online algorithm has to find a near-optimal action online in almost real time, the computation time is often very limited. In the context of reinforcement learning, MAXQ is a value function decomposition method that exploits the underlying structure of the original MDP and decomposes it into a combination of smaller subproblems arranged over a task hierarchy. In this article, we present MAXQ-OP, a novel online planning algorithm for large MDPs that utilizes MAXQ hierarchical decomposition in online settings. Compared to traditional online planning algorithms, MAXQ-OP is able to reach much deeper states in the search tree with relatively less computation time by exploiting MAXQ hierarchical decomposition online. We empirically evaluate our algorithm in the standard Taxi domain, a common benchmark for MDPs, to show the effectiveness of our approach. We have also conducted a long-term case study in a highly complex simulated soccer domain and developed a team named WrightEagle that has won five world championships and finished as runner-up five times in the past 10 years of annual RoboCup Soccer Simulation 2D competitions. The results in the RoboCup domain confirm the scalability of MAXQ-OP to very large domains." @default.
- W1921016406 created "2016-06-24" @default.
- W1921016406 creator A5029102289 @default.
- W1921016406 creator A5083303426 @default.
- W1921016406 creator A5084710341 @default.
- W1921016406 date "2015-07-15" @default.
- W1921016406 modified "2023-09-25" @default.
- W1921016406 title "Online Planning for Large Markov Decision Processes with Hierarchical Decomposition" @default.
- W1921016406 cites W1586162706 @default.
- W1921016406 cites W1625390266 @default.
- W1921016406 cites W1899715700 @default.
- W1921016406 cites W1976800061 @default.
- W1921016406 cites W1997840820 @default.
- W1921016406 cites W2009533501 @default.
- W1921016406 cites W2028838947 @default.
- W1921016406 cites W2090170171 @default.
- W1921016406 cites W2097381042 @default.
- W1921016406 cites W2104641222 @default.
- W1921016406 cites W2105757562 @default.
- W1921016406 cites W2109910161 @default.
- W1921016406 cites W2116708838 @default.
- W1921016406 cites W2126316555 @default.
- W1921016406 cites W2127412976 @default.
- W1921016406 cites W2141754131 @default.
- W1921016406 cites W2161252410 @default.
- W1921016406 cites W2168359464 @default.
- W1921016406 cites W2168405694 @default.
- W1921016406 cites W2169977622 @default.
- W1921016406 cites W2735260544 @default.
- W1921016406 cites W3010488986 @default.
- W1921016406 cites W4249441547 @default.
- W1921016406 doi "https://doi.org/10.1145/2717316" @default.
- W1921016406 hasPublicationYear "2015" @default.
- W1921016406 type Work @default.
- W1921016406 sameAs 1921016406 @default.
- W1921016406 citedByCount "30" @default.
- W1921016406 countsByYear W19210164062016 @default.
- W1921016406 countsByYear W19210164062017 @default.
- W1921016406 countsByYear W19210164062018 @default.
- W1921016406 countsByYear W19210164062019 @default.
- W1921016406 countsByYear W19210164062020 @default.
- W1921016406 countsByYear W19210164062021 @default.
- W1921016406 countsByYear W19210164062022 @default.
- W1921016406 countsByYear W19210164062023 @default.
- W1921016406 crossrefType "journal-article" @default.
- W1921016406 hasAuthorship W1921016406A5029102289 @default.
- W1921016406 hasAuthorship W1921016406A5083303426 @default.
- W1921016406 hasAuthorship W1921016406A5084710341 @default.
- W1921016406 hasConcept C105795698 @default.
- W1921016406 hasConcept C106189395 @default.
- W1921016406 hasConcept C111030470 @default.
- W1921016406 hasConcept C119857082 @default.
- W1921016406 hasConcept C124101348 @default.
- W1921016406 hasConcept C13280743 @default.
- W1921016406 hasConcept C134306372 @default.
- W1921016406 hasConcept C151730666 @default.
- W1921016406 hasConcept C154945302 @default.
- W1921016406 hasConcept C159886148 @default.
- W1921016406 hasConcept C162324750 @default.
- W1921016406 hasConcept C185798385 @default.
- W1921016406 hasConcept C187736073 @default.
- W1921016406 hasConcept C205649164 @default.
- W1921016406 hasConcept C2779343474 @default.
- W1921016406 hasConcept C2780451532 @default.
- W1921016406 hasConcept C33923547 @default.
- W1921016406 hasConcept C36503486 @default.
- W1921016406 hasConcept C41008148 @default.
- W1921016406 hasConcept C48044578 @default.
- W1921016406 hasConcept C72434380 @default.
- W1921016406 hasConcept C77088390 @default.
- W1921016406 hasConcept C86803240 @default.
- W1921016406 hasConcept C97541855 @default.
- W1921016406 hasConceptScore W1921016406C105795698 @default.
- W1921016406 hasConceptScore W1921016406C106189395 @default.
- W1921016406 hasConceptScore W1921016406C111030470 @default.
- W1921016406 hasConceptScore W1921016406C119857082 @default.
- W1921016406 hasConceptScore W1921016406C124101348 @default.
- W1921016406 hasConceptScore W1921016406C13280743 @default.
- W1921016406 hasConceptScore W1921016406C134306372 @default.
- W1921016406 hasConceptScore W1921016406C151730666 @default.
- W1921016406 hasConceptScore W1921016406C154945302 @default.
- W1921016406 hasConceptScore W1921016406C159886148 @default.
- W1921016406 hasConceptScore W1921016406C162324750 @default.
- W1921016406 hasConceptScore W1921016406C185798385 @default.
- W1921016406 hasConceptScore W1921016406C187736073 @default.
- W1921016406 hasConceptScore W1921016406C205649164 @default.
- W1921016406 hasConceptScore W1921016406C2779343474 @default.
- W1921016406 hasConceptScore W1921016406C2780451532 @default.
- W1921016406 hasConceptScore W1921016406C33923547 @default.
- W1921016406 hasConceptScore W1921016406C36503486 @default.
- W1921016406 hasConceptScore W1921016406C41008148 @default.
- W1921016406 hasConceptScore W1921016406C48044578 @default.
- W1921016406 hasConceptScore W1921016406C72434380 @default.
- W1921016406 hasConceptScore W1921016406C77088390 @default.
- W1921016406 hasConceptScore W1921016406C86803240 @default.
- W1921016406 hasConceptScore W1921016406C97541855 @default.
- W1921016406 hasFunder F4320321001 @default.
- W1921016406 hasIssue "4" @default.