Matches in SemOpenAlex for { <https://semopenalex.org/work/W4221164780> ?p ?o ?g. }
Showing items 1 to 63 of 63 with 100 items per page.
- W4221164780 abstract "In the maximum state entropy exploration framework, an agent interacts with a reward-free environment to learn a policy that maximizes the entropy of the expected state visitations it is inducing. Hazan et al. (2019) noted that the class of Markovian stochastic policies is sufficient for the maximum state entropy objective, and exploiting non-Markovianity is generally considered pointless in this setting. In this paper, we argue that non-Markovianity is instead paramount for maximum state entropy exploration in a finite-sample regime. Especially, we recast the objective to target the expected entropy of the induced state visitations in a single trial. Then, we show that the class of non-Markovian deterministic policies is sufficient for the introduced objective, while Markovian policies suffer non-zero regret in general. However, we prove that the problem of finding an optimal non-Markovian policy is NP-hard. Despite this negative result, we discuss avenues to address the problem in a tractable way and how non-Markovian exploration could benefit the sample efficiency of online reinforcement learning in future works." @default.
- W4221164780 created "2022-04-03" @default.
- W4221164780 creator A5017130830 @default.
- W4221164780 creator A5047410750 @default.
- W4221164780 creator A5052185248 @default.
- W4221164780 date "2022-02-07" @default.
- W4221164780 modified "2023-09-24" @default.
- W4221164780 title "The Importance of Non-Markovianity in Maximum State Entropy Exploration" @default.
- W4221164780 doi "https://doi.org/10.48550/arxiv.2202.03060" @default.
- W4221164780 hasPublicationYear "2022" @default.
- W4221164780 type Work @default.
- W4221164780 citedByCount "0" @default.
- W4221164780 crossrefType "posted-content" @default.
- W4221164780 hasAuthorship W4221164780A5017130830 @default.
- W4221164780 hasAuthorship W4221164780A5047410750 @default.
- W4221164780 hasAuthorship W4221164780A5052185248 @default.
- W4221164780 hasBestOaLocation W42211647801 @default.
- W4221164780 hasConcept C105795698 @default.
- W4221164780 hasConcept C106301342 @default.
- W4221164780 hasConcept C119857082 @default.
- W4221164780 hasConcept C121332964 @default.
- W4221164780 hasConcept C121864883 @default.
- W4221164780 hasConcept C126255220 @default.
- W4221164780 hasConcept C154945302 @default.
- W4221164780 hasConcept C159886148 @default.
- W4221164780 hasConcept C28826006 @default.
- W4221164780 hasConcept C33923547 @default.
- W4221164780 hasConcept C41008148 @default.
- W4221164780 hasConcept C50817715 @default.
- W4221164780 hasConcept C62520636 @default.
- W4221164780 hasConcept C9679016 @default.
- W4221164780 hasConcept C97541855 @default.
- W4221164780 hasConceptScore W4221164780C105795698 @default.
- W4221164780 hasConceptScore W4221164780C106301342 @default.
- W4221164780 hasConceptScore W4221164780C119857082 @default.
- W4221164780 hasConceptScore W4221164780C121332964 @default.
- W4221164780 hasConceptScore W4221164780C121864883 @default.
- W4221164780 hasConceptScore W4221164780C126255220 @default.
- W4221164780 hasConceptScore W4221164780C154945302 @default.
- W4221164780 hasConceptScore W4221164780C159886148 @default.
- W4221164780 hasConceptScore W4221164780C28826006 @default.
- W4221164780 hasConceptScore W4221164780C33923547 @default.
- W4221164780 hasConceptScore W4221164780C41008148 @default.
- W4221164780 hasConceptScore W4221164780C50817715 @default.
- W4221164780 hasConceptScore W4221164780C62520636 @default.
- W4221164780 hasConceptScore W4221164780C9679016 @default.
- W4221164780 hasConceptScore W4221164780C97541855 @default.
- W4221164780 hasLocation W42211647801 @default.
- W4221164780 hasOpenAccess W4221164780 @default.
- W4221164780 hasPrimaryLocation W42211647801 @default.
- W4221164780 hasRelatedWork W1987887771 @default.
- W4221164780 hasRelatedWork W2088584583 @default.
- W4221164780 hasRelatedWork W2136503687 @default.
- W4221164780 hasRelatedWork W2953025711 @default.
- W4221164780 hasRelatedWork W3092949629 @default.
- W4221164780 hasRelatedWork W3102693044 @default.
- W4221164780 hasRelatedWork W3211602134 @default.
- W4221164780 hasRelatedWork W4302357557 @default.
- W4221164780 hasRelatedWork W4308216308 @default.
- W4221164780 hasRelatedWork W1619097585 @default.
- W4221164780 isParatext "false" @default.
- W4221164780 isRetracted "false" @default.
- W4221164780 workType "article" @default.
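
The abstract above contrasts two objectives: the entropy of the expected state visitations, and the expected entropy of the state visitations in a single trial. The toy sketch below illustrates only that distinction. It is not the paper's method. It assumes trajectories are given directly as lists of state indices with known probabilities, rather than generated by an environment, and all function names are hypothetical.

```python
import math
from collections import Counter


def entropy(dist):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(p * math.log(p) for p in dist if p > 0)


def empirical_distribution(trajectory, n_states):
    """Fraction of time steps spent in each state during one trial."""
    counts = Counter(trajectory)
    return [counts.get(s, 0) / len(trajectory) for s in range(n_states)]


def entropy_of_expected_visitation(trajectories, probs, n_states):
    """Entropy of the visitation distribution averaged over trials."""
    expected = [0.0] * n_states
    for traj, p in zip(trajectories, probs):
        d = empirical_distribution(traj, n_states)
        for s in range(n_states):
            expected[s] += p * d[s]
    return entropy(expected)


def expected_single_trial_entropy(trajectories, probs, n_states):
    """Average, over trials, of the entropy of each trial's own visitation."""
    return sum(
        p * entropy(empirical_distribution(traj, n_states))
        for traj, p in zip(trajectories, probs)
    )


# Hypothetical behaviour A: each trial stays in one state, chosen by a coin flip.
stay_trajs = [[0, 0, 0, 0], [1, 1, 1, 1]]
stay_probs = [0.5, 0.5]

# Hypothetical behaviour B: every trial alternates between the two states.
alt_trajs = [[0, 1, 0, 1]]
alt_probs = [1.0]

for name, trajs, probs in [("stay", stay_trajs, stay_probs),
                           ("alternate", alt_trajs, alt_probs)]:
    h_exp = entropy_of_expected_visitation(trajs, probs, 2)
    exp_h = expected_single_trial_entropy(trajs, probs, 2)
    print(f"{name}: H(E[d]) = {h_exp:.3f}, E[H(d)] = {exp_h:.3f}")
```

In this toy setup both behaviours reach the same entropy of expected visitations (ln 2), but only the alternating one also reaches it within every single trial. The "stay" behaviour covers both states only on average across trials.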