Matches in SemOpenAlex for { <https://semopenalex.org/work/W2964855005> ?p ?o ?g. }
Showing items 1 to 94 of 94 with 100 items per page.
- W2964855005 abstract "How to best explore in domains with sparse, delayed, and deceptive rewards is an important open problem for reinforcement learning (RL). This paper considers one such domain, the recently-proposed multi-agent benchmark of Pommerman. This domain is very challenging for RL --- past work has shown that model-free RL algorithms fail to achieve significant learning without artificially reducing the environment's complexity. In this paper, we illuminate reasons behind this failure by providing a thorough analysis on the hardness of random exploration in Pommerman. While model-free random exploration is typically futile, we develop a model-based automatic reasoning module that can be used for safer exploration by pruning actions that will surely lead the agent to death. We empirically demonstrate that this module can significantly improve learning." @default.
- W2964855005 created "2019-08-13" @default.
- W2964855005 creator A5005259414 @default.
- W2964855005 creator A5056815920 @default.
- W2964855005 creator A5070914351 @default.
- W2964855005 creator A5071731471 @default.
- W2964855005 date "2019-07-26" @default.
- W2964855005 modified "2023-09-27" @default.
- W2964855005 title "On Hard Exploration for Reinforcement Learning: a Case Study in Pommerman" @default.
- W2964855005 cites W1505937442 @default.
- W2964855005 cites W1969302761 @default.
- W2964855005 cites W1988526405 @default.
- W2964855005 cites W2020920737 @default.
- W2964855005 cites W2189058185 @default.
- W2964855005 cites W2257979135 @default.
- W2964855005 cites W2419612459 @default.
- W2964855005 cites W2558251412 @default.
- W2964855005 cites W2736601468 @default.
- W2964855005 cites W2766447205 @default.
- W2964855005 cites W2788049526 @default.
- W2964855005 cites W2902907165 @default.
- W2964855005 cites W2903445514 @default.
- W2964855005 cites W2904616874 @default.
- W2964855005 cites W2914261249 @default.
- W2964855005 cites W2962691671 @default.
- W2964855005 cites W2963184621 @default.
- W2964855005 cites W2963628590 @default.
- W2964855005 cites W2964299886 @default.
- W2964855005 hasPublicationYear "2019" @default.
- W2964855005 type Work @default.
- W2964855005 sameAs 2964855005 @default.
- W2964855005 citedByCount "0" @default.
- W2964855005 crossrefType "posted-content" @default.
- W2964855005 hasAuthorship W2964855005A5005259414 @default.
- W2964855005 hasAuthorship W2964855005A5056815920 @default.
- W2964855005 hasAuthorship W2964855005A5070914351 @default.
- W2964855005 hasAuthorship W2964855005A5071731471 @default.
- W2964855005 hasConcept C108010975 @default.
- W2964855005 hasConcept C119857082 @default.
- W2964855005 hasConcept C13280743 @default.
- W2964855005 hasConcept C134306372 @default.
- W2964855005 hasConcept C154945302 @default.
- W2964855005 hasConcept C185798385 @default.
- W2964855005 hasConcept C205649164 @default.
- W2964855005 hasConcept C2776654903 @default.
- W2964855005 hasConcept C33923547 @default.
- W2964855005 hasConcept C36503486 @default.
- W2964855005 hasConcept C38652104 @default.
- W2964855005 hasConcept C41008148 @default.
- W2964855005 hasConcept C6557445 @default.
- W2964855005 hasConcept C86803240 @default.
- W2964855005 hasConcept C97541855 @default.
- W2964855005 hasConceptScore W2964855005C108010975 @default.
- W2964855005 hasConceptScore W2964855005C119857082 @default.
- W2964855005 hasConceptScore W2964855005C13280743 @default.
- W2964855005 hasConceptScore W2964855005C134306372 @default.
- W2964855005 hasConceptScore W2964855005C154945302 @default.
- W2964855005 hasConceptScore W2964855005C185798385 @default.
- W2964855005 hasConceptScore W2964855005C205649164 @default.
- W2964855005 hasConceptScore W2964855005C2776654903 @default.
- W2964855005 hasConceptScore W2964855005C33923547 @default.
- W2964855005 hasConceptScore W2964855005C36503486 @default.
- W2964855005 hasConceptScore W2964855005C38652104 @default.
- W2964855005 hasConceptScore W2964855005C41008148 @default.
- W2964855005 hasConceptScore W2964855005C6557445 @default.
- W2964855005 hasConceptScore W2964855005C86803240 @default.
- W2964855005 hasConceptScore W2964855005C97541855 @default.
- W2964855005 hasLocation W29648550051 @default.
- W2964855005 hasOpenAccess W2964855005 @default.
- W2964855005 hasPrimaryLocation W29648550051 @default.
- W2964855005 hasRelatedWork W1535586732 @default.
- W2964855005 hasRelatedWork W2201750637 @default.
- W2964855005 hasRelatedWork W2272929109 @default.
- W2964855005 hasRelatedWork W2550182557 @default.
- W2964855005 hasRelatedWork W2935889419 @default.
- W2964855005 hasRelatedWork W2949243561 @default.
- W2964855005 hasRelatedWork W2963794592 @default.
- W2964855005 hasRelatedWork W2964464273 @default.
- W2964855005 hasRelatedWork W2982365794 @default.
- W2964855005 hasRelatedWork W2984518291 @default.
- W2964855005 hasRelatedWork W2991032634 @default.
- W2964855005 hasRelatedWork W3005607450 @default.
- W2964855005 hasRelatedWork W3005888089 @default.
- W2964855005 hasRelatedWork W3025362081 @default.
- W2964855005 hasRelatedWork W3081661003 @default.
- W2964855005 hasRelatedWork W3094736930 @default.
- W2964855005 hasRelatedWork W3116276887 @default.
- W2964855005 hasRelatedWork W3181701811 @default.
- W2964855005 hasRelatedWork W3200054052 @default.
- W2964855005 hasRelatedWork W76312321 @default.
- W2964855005 isParatext "false" @default.
- W2964855005 isRetracted "false" @default.
- W2964855005 magId "2964855005" @default.
- W2964855005 workType "article" @default.