Matches in SemOpenAlex for { <https://semopenalex.org/work/W2891166116> ?p ?o ?g. }
Showing items 1 to 89 of 89 with 100 items per page.
- W2891166116 endingPage "1789" @default.
- W2891166116 startingPage "1779" @default.
- W2891166116 abstract "Several applications of Reinforcement Learning suffer from instability due to high variance. This is especially prevalent in high dimensional domains. Regularization is a commonly used technique in machine learning to reduce variance, at the cost of introducing some bias. Most existing regularization techniques focus on spatial (perceptual) regularization. Yet in reinforcement learning, due to the nature of the Bellman equation, there is an opportunity to also exploit temporal regularization based on smoothness in value estimates over trajectories. This paper explores a class of methods for temporal regularization. We formally characterize the bias induced by this technique using Markov chain concepts. We illustrate the various characteristics of temporal regularization via a sequence of simple discrete and continuous MDPs, and show that the technique provides improvement even in high-dimensional Atari games." @default.
- W2891166116 created "2018-09-27" @default.
- W2891166116 creator A5024285845 @default.
- W2891166116 creator A5065836447 @default.
- W2891166116 creator A5069876930 @default.
- W2891166116 creator A5080591144 @default.
- W2891166116 date "2018-01-01" @default.
- W2891166116 modified "2023-09-23" @default.
- W2891166116 title "Temporal Regularization for Markov Decision Process" @default.
- W2891166116 hasPublicationYear "2018" @default.
- W2891166116 type Work @default.
- W2891166116 sameAs 2891166116 @default.
- W2891166116 citedByCount "8" @default.
- W2891166116 countsByYear W28911661162019 @default.
- W2891166116 countsByYear W28911661162020 @default.
- W2891166116 countsByYear W28911661162021 @default.
- W2891166116 crossrefType "proceedings-article" @default.
- W2891166116 hasAuthorship W2891166116A5024285845 @default.
- W2891166116 hasAuthorship W2891166116A5065836447 @default.
- W2891166116 hasAuthorship W2891166116A5069876930 @default.
- W2891166116 hasAuthorship W2891166116A5080591144 @default.
- W2891166116 hasConcept C105795698 @default.
- W2891166116 hasConcept C106189395 @default.
- W2891166116 hasConcept C119857082 @default.
- W2891166116 hasConcept C126255220 @default.
- W2891166116 hasConcept C134306372 @default.
- W2891166116 hasConcept C135252773 @default.
- W2891166116 hasConcept C141718189 @default.
- W2891166116 hasConcept C152442038 @default.
- W2891166116 hasConcept C154945302 @default.
- W2891166116 hasConcept C159886148 @default.
- W2891166116 hasConcept C163836022 @default.
- W2891166116 hasConcept C165696696 @default.
- W2891166116 hasConcept C17098449 @default.
- W2891166116 hasConcept C2776135515 @default.
- W2891166116 hasConcept C33923547 @default.
- W2891166116 hasConcept C38652104 @default.
- W2891166116 hasConcept C41008148 @default.
- W2891166116 hasConcept C97541855 @default.
- W2891166116 hasConcept C98763669 @default.
- W2891166116 hasConceptScore W2891166116C105795698 @default.
- W2891166116 hasConceptScore W2891166116C106189395 @default.
- W2891166116 hasConceptScore W2891166116C119857082 @default.
- W2891166116 hasConceptScore W2891166116C126255220 @default.
- W2891166116 hasConceptScore W2891166116C134306372 @default.
- W2891166116 hasConceptScore W2891166116C135252773 @default.
- W2891166116 hasConceptScore W2891166116C141718189 @default.
- W2891166116 hasConceptScore W2891166116C152442038 @default.
- W2891166116 hasConceptScore W2891166116C154945302 @default.
- W2891166116 hasConceptScore W2891166116C159886148 @default.
- W2891166116 hasConceptScore W2891166116C163836022 @default.
- W2891166116 hasConceptScore W2891166116C165696696 @default.
- W2891166116 hasConceptScore W2891166116C17098449 @default.
- W2891166116 hasConceptScore W2891166116C2776135515 @default.
- W2891166116 hasConceptScore W2891166116C33923547 @default.
- W2891166116 hasConceptScore W2891166116C38652104 @default.
- W2891166116 hasConceptScore W2891166116C41008148 @default.
- W2891166116 hasConceptScore W2891166116C97541855 @default.
- W2891166116 hasConceptScore W2891166116C98763669 @default.
- W2891166116 hasLocation W28911661161 @default.
- W2891166116 hasOpenAccess W2891166116 @default.
- W2891166116 hasPrimaryLocation W28911661161 @default.
- W2891166116 hasRelatedWork W1757796397 @default.
- W2891166116 hasRelatedWork W1771410628 @default.
- W2891166116 hasRelatedWork W2060063574 @default.
- W2891166116 hasRelatedWork W2108682071 @default.
- W2891166116 hasRelatedWork W2119567691 @default.
- W2891166116 hasRelatedWork W2268781370 @default.
- W2891166116 hasRelatedWork W2477485887 @default.
- W2891166116 hasRelatedWork W2619268125 @default.
- W2891166116 hasRelatedWork W2736601468 @default.
- W2891166116 hasRelatedWork W2766159815 @default.
- W2891166116 hasRelatedWork W2898630544 @default.
- W2891166116 hasRelatedWork W2963805014 @default.
- W2891166116 hasRelatedWork W2985639332 @default.
- W2891166116 hasRelatedWork W3049313187 @default.
- W2891166116 hasRelatedWork W3100507333 @default.
- W2891166116 hasRelatedWork W3125388860 @default.
- W2891166116 hasRelatedWork W3128842277 @default.
- W2891166116 hasRelatedWork W3199724058 @default.
- W2891166116 hasRelatedWork W3212584871 @default.
- W2891166116 hasRelatedWork W3212732692 @default.
- W2891166116 hasVolume "31" @default.
- W2891166116 isParatext "false" @default.
- W2891166116 isRetracted "false" @default.
- W2891166116 magId "2891166116" @default.
- W2891166116 workType "article" @default.