Matches in SemOpenAlex for { <https://semopenalex.org/work/W2970870329> ?p ?o ?g. }
Showing items 1 to 78 of 78 with 100 items per page.
- W2970870329 endingPage "12213" @default.
- W2970870329 startingPage "12203" @default.
- W2970870329 abstract "State-of-the-art efficient model-based Reinforcement Learning (RL) algorithms typically act by iteratively solving empirical models, i.e., by performing full-planning on Markov Decision Processes (MDPs) built by the gathered experience. In this paper, we focus on model-based RL in the finite-state finite-horizon MDP setting and establish that exploring with greedy policies -- act by 1-step planning -- can achieve tight minimax performance in terms of regret, O(sqrt{HSAT}). Thus, full-planning in model-based RL can be avoided altogether without any performance degradation, and, by doing so, the computational complexity decreases by a factor of S. The results are based on a novel analysis of real-time dynamic programming, then extended to model-based RL. Specifically, we generalize existing algorithms that perform full-planning to such that act by 1-step planning. For these generalizations, we prove regret bounds with the same rate as their full-planning counterparts." @default.
- W2970870329 created "2019-09-05" @default.
- W2970870329 creator A5013843778 @default.
- W2970870329 creator A5018784842 @default.
- W2970870329 creator A5036260775 @default.
- W2970870329 creator A5090891199 @default.
- W2970870329 date "2019-05-01" @default.
- W2970870329 modified "2023-09-24" @default.
- W2970870329 title "Tight Regret Bounds for Model-Based Reinforcement Learning with Greedy Policies" @default.
- W2970870329 hasPublicationYear "2019" @default.
- W2970870329 type Work @default.
- W2970870329 sameAs 2970870329 @default.
- W2970870329 citedByCount "28" @default.
- W2970870329 countsByYear W29708703292019 @default.
- W2970870329 countsByYear W29708703292020 @default.
- W2970870329 countsByYear W29708703292021 @default.
- W2970870329 countsByYear W29708703292022 @default.
- W2970870329 crossrefType "proceedings-article" @default.
- W2970870329 hasAuthorship W2970870329A5013843778 @default.
- W2970870329 hasAuthorship W2970870329A5018784842 @default.
- W2970870329 hasAuthorship W2970870329A5036260775 @default.
- W2970870329 hasAuthorship W2970870329A5090891199 @default.
- W2970870329 hasConcept C105795698 @default.
- W2970870329 hasConcept C106189395 @default.
- W2970870329 hasConcept C119857082 @default.
- W2970870329 hasConcept C126255220 @default.
- W2970870329 hasConcept C149728462 @default.
- W2970870329 hasConcept C154945302 @default.
- W2970870329 hasConcept C159886148 @default.
- W2970870329 hasConcept C28761237 @default.
- W2970870329 hasConcept C33923547 @default.
- W2970870329 hasConcept C37404715 @default.
- W2970870329 hasConcept C41008148 @default.
- W2970870329 hasConcept C50817715 @default.
- W2970870329 hasConcept C97541855 @default.
- W2970870329 hasConceptScore W2970870329C105795698 @default.
- W2970870329 hasConceptScore W2970870329C106189395 @default.
- W2970870329 hasConceptScore W2970870329C119857082 @default.
- W2970870329 hasConceptScore W2970870329C126255220 @default.
- W2970870329 hasConceptScore W2970870329C149728462 @default.
- W2970870329 hasConceptScore W2970870329C154945302 @default.
- W2970870329 hasConceptScore W2970870329C159886148 @default.
- W2970870329 hasConceptScore W2970870329C28761237 @default.
- W2970870329 hasConceptScore W2970870329C33923547 @default.
- W2970870329 hasConceptScore W2970870329C37404715 @default.
- W2970870329 hasConceptScore W2970870329C41008148 @default.
- W2970870329 hasConceptScore W2970870329C50817715 @default.
- W2970870329 hasConceptScore W2970870329C97541855 @default.
- W2970870329 hasLocation W29708703291 @default.
- W2970870329 hasOpenAccess W2970870329 @default.
- W2970870329 hasPrimaryLocation W29708703291 @default.
- W2970870329 hasRelatedWork W1850488217 @default.
- W2970870329 hasRelatedWork W2111764152 @default.
- W2970870329 hasRelatedWork W2119567691 @default.
- W2970870329 hasRelatedWork W2119738618 @default.
- W2970870329 hasRelatedWork W2121863487 @default.
- W2970870329 hasRelatedWork W2512014291 @default.
- W2970870329 hasRelatedWork W2545659366 @default.
- W2970870329 hasRelatedWork W2946284958 @default.
- W2970870329 hasRelatedWork W2947223001 @default.
- W2970870329 hasRelatedWork W2962723383 @default.
- W2970870329 hasRelatedWork W2963049774 @default.
- W2970870329 hasRelatedWork W2963490519 @default.
- W2970870329 hasRelatedWork W2963582321 @default.
- W2970870329 hasRelatedWork W2963767098 @default.
- W2970870329 hasRelatedWork W2964054583 @default.
- W2970870329 hasRelatedWork W2971249033 @default.
- W2970870329 hasRelatedWork W2991929641 @default.
- W2970870329 hasRelatedWork W3015662311 @default.
- W2970870329 hasRelatedWork W3046395471 @default.
- W2970870329 hasRelatedWork W3088243654 @default.
- W2970870329 hasVolume "32" @default.
- W2970870329 isParatext "false" @default.
- W2970870329 isRetracted "false" @default.
- W2970870329 magId "2970870329" @default.
- W2970870329 workType "article" @default.
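
The listing above follows a regular shape: a leading dash, then subject, predicate, object, and an `@default` graph marker. As a minimal sketch, assuming that shape holds for every row, the rows can be grouped by predicate as shown here. The names `LINE_RE` and `parse_listing` are hypothetical helpers introduced only for illustration; they are not part of SemOpenAlex or any library.

```python
import re
from collections import defaultdict

# Hypothetical pattern for rows shaped like:
#   - SUBJECT PREDICATE OBJECT @GRAPH.
LINE_RE = re.compile(r'^-\s+(\S+)\s+(\S+)\s+(.*?)\s+@(\S+)\.$')


def parse_listing(lines):
    """Group objects by predicate; quoted literals lose their outer quotes."""
    grouped = defaultdict(list)
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # skip header or pagination lines that are not rows
        _subject, predicate, obj, _graph = match.groups()
        if len(obj) >= 2 and obj.startswith('"') and obj.endswith('"'):
            obj = obj[1:-1]
        grouped[predicate].append(obj)
    return dict(grouped)


# A few rows copied from the listing above.
sample = [
    '- W2970870329 startingPage "12203" @default.',
    '- W2970870329 endingPage "12213" @default.',
    '- W2970870329 creator A5013843778 @default.',
    '- W2970870329 creator A5018784842 @default.',
]
record = parse_listing(sample)

# Number of pages implied by the page range (inclusive).
page_count = int(record["endingPage"][0]) - int(record["startingPage"][0]) + 1
```

Grouping by predicate keeps repeated properties such as `creator` or `hasRelatedWork` together as lists, while single-valued properties such as `title` become one-element lists.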