Matches in SemOpenAlex for { <https://semopenalex.org/work/W2970720882> ?p ?o ?g. }
- W2970720882 endingPage "4899" @default.
- W2970720882 startingPage "4890" @default.
- W2970720882 abstract "The exploration bonus is an effective approach to manage the exploration-exploitation trade-off in Markov Decision Processes (MDPs). While it has been analyzed in infinite-horizon discounted and finite-horizon problems, we focus on designing and analyzing the exploration bonus in the more challenging infinite-horizon undiscounted setting. We first introduce SCAL+, a variant of SCAL (Fruit et al., 2018), that uses a suitable exploration bonus to solve any discrete unknown weakly-communicating MDP for which an upper bound $c$ on the span of the optimal bias function is known. We prove that SCAL+ enjoys the same regret guarantees as SCAL, which relies on the less efficient extended value iteration approach. Furthermore, we leverage the flexibility provided by the exploration bonus scheme to generalize SCAL+ to smooth MDPs with continuous state space and discrete actions. We show that the resulting algorithm (SCCAL+) achieves the same regret bound as UCCRL (Ortner and Ryabko, 2012) while being the first implementable algorithm for this setting." @default.
- W2970720882 created "2019-09-05" @default.
- W2970720882 creator A5008246659 @default.
- W2970720882 creator A5014791481 @default.
- W2970720882 creator A5015598243 @default.
- W2970720882 creator A5091526684 @default.
- W2970720882 date "2019-09-06" @default.
- W2970720882 modified "2023-09-29" @default.
- W2970720882 title "Exploration Bonus for Regret Minimization in Discrete and Continuous Average Reward MDPs" @default.
- W2970720882 cites W1701974503 @default.
- W2970720882 cites W1850488217 @default.
- W2970720882 cites W1854776945 @default.
- W2970720882 cites W1988526405 @default.
- W2970720882 cites W1994123361 @default.
- W2970720882 cites W2011815717 @default.
- W2970720882 cites W2046176422 @default.
- W2970720882 cites W2083459869 @default.
- W2970720882 cites W2119567691 @default.
- W2970720882 cites W2154204727 @default.
- W2970720882 cites W2663108269 @default.
- W2970720882 cites W2752560390 @default.
- W2970720882 cites W2769648743 @default.
- W2970720882 cites W2788488350 @default.
- W2970720882 cites W2832404192 @default.
- W2970720882 cites W2963276097 @default.
- W2970720882 cites W3196847620 @default.
- W2970720882 cites W53099289 @default.
- W2970720882 hasPublicationYear "2019" @default.
- W2970720882 type Work @default.
- W2970720882 sameAs 2970720882 @default.
- W2970720882 citedByCount "16" @default.
- W2970720882 countsByYear W29707208822020 @default.
- W2970720882 countsByYear W29707208822021 @default.
- W2970720882 crossrefType "proceedings-article" @default.
- W2970720882 hasAuthorship W2970720882A5008246659 @default.
- W2970720882 hasAuthorship W2970720882A5014791481 @default.
- W2970720882 hasAuthorship W2970720882A5015598243 @default.
- W2970720882 hasAuthorship W2970720882A5091526684 @default.
- W2970720882 hasConcept C105795698 @default.
- W2970720882 hasConcept C106189395 @default.
- W2970720882 hasConcept C119857082 @default.
- W2970720882 hasConcept C126255220 @default.
- W2970720882 hasConcept C134306372 @default.
- W2970720882 hasConcept C144237770 @default.
- W2970720882 hasConcept C14646407 @default.
- W2970720882 hasConcept C153083717 @default.
- W2970720882 hasConcept C154945302 @default.
- W2970720882 hasConcept C159886148 @default.
- W2970720882 hasConcept C2780598303 @default.
- W2970720882 hasConcept C33923547 @default.
- W2970720882 hasConcept C41008148 @default.
- W2970720882 hasConcept C50817715 @default.
- W2970720882 hasConcept C55689738 @default.
- W2970720882 hasConcept C72434380 @default.
- W2970720882 hasConcept C77553402 @default.
- W2970720882 hasConceptScore W2970720882C105795698 @default.
- W2970720882 hasConceptScore W2970720882C106189395 @default.
- W2970720882 hasConceptScore W2970720882C119857082 @default.
- W2970720882 hasConceptScore W2970720882C126255220 @default.
- W2970720882 hasConceptScore W2970720882C134306372 @default.
- W2970720882 hasConceptScore W2970720882C144237770 @default.
- W2970720882 hasConceptScore W2970720882C14646407 @default.
- W2970720882 hasConceptScore W2970720882C153083717 @default.
- W2970720882 hasConceptScore W2970720882C154945302 @default.
- W2970720882 hasConceptScore W2970720882C159886148 @default.
- W2970720882 hasConceptScore W2970720882C2780598303 @default.
- W2970720882 hasConceptScore W2970720882C33923547 @default.
- W2970720882 hasConceptScore W2970720882C41008148 @default.
- W2970720882 hasConceptScore W2970720882C50817715 @default.
- W2970720882 hasConceptScore W2970720882C55689738 @default.
- W2970720882 hasConceptScore W2970720882C72434380 @default.
- W2970720882 hasConceptScore W2970720882C77553402 @default.
- W2970720882 hasLocation W29707208821 @default.
- W2970720882 hasOpenAccess W2970720882 @default.
- W2970720882 hasPrimaryLocation W29707208821 @default.
- W2970720882 hasRelatedWork W1662803991 @default.
- W2970720882 hasRelatedWork W1850488217 @default.
- W2970720882 hasRelatedWork W2119567691 @default.
- W2970720882 hasRelatedWork W2137125320 @default.
- W2970720882 hasRelatedWork W2403303158 @default.
- W2970720882 hasRelatedWork W2489939061 @default.
- W2970720882 hasRelatedWork W2903569780 @default.
- W2970720882 hasRelatedWork W2944461362 @default.
- W2970720882 hasRelatedWork W2953199352 @default.
- W2970720882 hasRelatedWork W2962723383 @default.
- W2970720882 hasRelatedWork W2963049774 @default.
- W2970720882 hasRelatedWork W2963767098 @default.
- W2970720882 hasRelatedWork W2964054583 @default.
- W2970720882 hasRelatedWork W3035397247 @default.
- W2970720882 hasRelatedWork W3041070598 @default.
- W2970720882 hasRelatedWork W3091279148 @default.
- W2970720882 hasRelatedWork W3128235563 @default.
- W2970720882 hasRelatedWork W3157247518 @default.
- W2970720882 hasRelatedWork W3177489476 @default.
- W2970720882 hasRelatedWork W3209987675 @default.
- W2970720882 hasVolume "32" @default.
- W2970720882 isParatext "false" @default.
- W2970720882 isRetracted "false" @default.
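
The statements above all follow one shape: a leading dash, then subject, predicate, object, and a graph label such as `@default`, closed by a period. As a minimal sketch of how such a listing could be read into structured form, the Python below parses one line into a 4-tuple and groups objects by predicate. The names `LINE_RE`, `parse_line`, and `group_by_predicate` are hypothetical helpers introduced here for illustration; they are not part of SemOpenAlex or any library, and the line shape is assumed from this listing only.

```python
import re
from collections import defaultdict

# Assumed line shape (inferred from this listing, not from a formal spec):
#   - SUBJECT PREDICATE OBJECT @GRAPH.
# The object is either a double-quoted literal or a bare identifier.
LINE_RE = re.compile(
    r'^-\s+(?P<s>\S+)\s+(?P<p>\S+)\s+(?P<o>".*"|\S+)\s+@(?P<g>\S+?)\.$'
)


def parse_line(line):
    """Return (subject, predicate, object, graph), or None if the line does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    obj = match.group("o")
    # Strip the surrounding quotes from literals; bare identifiers pass through.
    if len(obj) >= 2 and obj.startswith('"') and obj.endswith('"'):
        obj = obj[1:-1]
    return match.group("s"), match.group("p"), obj, match.group("g")


def group_by_predicate(lines):
    """Collect the objects of all parseable lines under their predicate."""
    grouped = defaultdict(list)
    for line in lines:
        quad = parse_line(line)
        if quad is not None:
            grouped[quad[1]].append(quad[2])
    return dict(grouped)


if __name__ == "__main__":
    sample = [
        '- W2970720882 startingPage "4890" @default.',
        '- W2970720882 endingPage "4899" @default.',
        "- W2970720882 cites W1701974503 @default.",
        "- W2970720882 cites W1850488217 @default.",
    ]
    for predicate, objects in group_by_predicate(sample).items():
        print(predicate, objects)
```

Lines that do not fit the assumed shape, such as the "Matches in SemOpenAlex" header, are skipped rather than raising an error, so the whole listing can be passed in unchanged.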