Matches in SemOpenAlex for { <https://semopenalex.org/work/W2898630544> ?p ?o ?g. }
- W2898630544 abstract "Several applications of Reinforcement Learning suffer from instability due to high variance. This is especially prevalent in high-dimensional domains. Regularization is a commonly used technique in machine learning to reduce variance, at the cost of introducing some bias. Most existing regularization techniques focus on spatial (perceptual) regularization. Yet in reinforcement learning, due to the nature of the Bellman equation, there is an opportunity to also exploit temporal regularization based on smoothness in value estimates over trajectories. This paper explores a class of methods for temporal regularization. We formally characterize the bias induced by this technique using Markov chain concepts. We illustrate the various characteristics of temporal regularization via a sequence of simple discrete and continuous MDPs, and show that the technique provides improvement even in high-dimensional Atari games." @default.
- W2898630544 created "2018-11-09" @default.
- W2898630544 creator A5024285845 @default.
- W2898630544 creator A5065836447 @default.
- W2898630544 creator A5069876930 @default.
- W2898630544 creator A5080591144 @default.
- W2898630544 date "2018-11-01" @default.
- W2898630544 modified "2023-09-23" @default.
- W2898630544 title "Temporal Regularization in Markov Decision Process" @default.
- W2898630544 cites W1518613644 @default.
- W2898630544 cites W1541084404 @default.
- W2898630544 cites W1553320709 @default.
- W2898630544 cites W1600293573 @default.
- W2898630544 cites W1646707810 @default.
- W2898630544 cites W1662803991 @default.
- W2898630544 cites W1969477885 @default.
- W2898630544 cites W2075268401 @default.
- W2898630544 cites W2098148222 @default.
- W2898630544 cites W2101363765 @default.
- W2898630544 cites W2117481033 @default.
- W2898630544 cites W2119567691 @default.
- W2898630544 cites W2121863487 @default.
- W2898630544 cites W2132782512 @default.
- W2898630544 cites W2145339207 @default.
- W2898630544 cites W2151161180 @default.
- W2898630544 cites W2160284799 @default.
- W2898630544 cites W2257979135 @default.
- W2898630544 cites W2296319761 @default.
- W2898630544 cites W2460675832 @default.
- W2898630544 cites W2619268125 @default.
- W2898630544 cites W2736601468 @default.
- W2898630544 cites W2750726423 @default.
- W2898630544 cites W2754517384 @default.
- W2898630544 cites W2810869885 @default.
- W2898630544 cites W2953055478 @default.
- W2898630544 cites W2962676505 @default.
- W2898630544 cites W2963068985 @default.
- W2898630544 cites W2110158343 @default.
- W2898630544 hasPublicationYear "2018" @default.
- W2898630544 type Work @default.
- W2898630544 sameAs 2898630544 @default.
- W2898630544 citedByCount "1" @default.
- W2898630544 countsByYear W28986305442021 @default.
- W2898630544 crossrefType "posted-content" @default.
- W2898630544 hasAuthorship W2898630544A5024285845 @default.
- W2898630544 hasAuthorship W2898630544A5065836447 @default.
- W2898630544 hasAuthorship W2898630544A5069876930 @default.
- W2898630544 hasAuthorship W2898630544A5080591144 @default.
- W2898630544 hasConcept C105795698 @default.
- W2898630544 hasConcept C106189395 @default.
- W2898630544 hasConcept C119857082 @default.
- W2898630544 hasConcept C126255220 @default.
- W2898630544 hasConcept C154945302 @default.
- W2898630544 hasConcept C159886148 @default.
- W2898630544 hasConcept C165696696 @default.
- W2898630544 hasConcept C2776135515 @default.
- W2898630544 hasConcept C33923547 @default.
- W2898630544 hasConcept C38652104 @default.
- W2898630544 hasConcept C41008148 @default.
- W2898630544 hasConcept C97541855 @default.
- W2898630544 hasConcept C98763669 @default.
- W2898630544 hasConceptScore W2898630544C105795698 @default.
- W2898630544 hasConceptScore W2898630544C106189395 @default.
- W2898630544 hasConceptScore W2898630544C119857082 @default.
- W2898630544 hasConceptScore W2898630544C126255220 @default.
- W2898630544 hasConceptScore W2898630544C154945302 @default.
- W2898630544 hasConceptScore W2898630544C159886148 @default.
- W2898630544 hasConceptScore W2898630544C165696696 @default.
- W2898630544 hasConceptScore W2898630544C2776135515 @default.
- W2898630544 hasConceptScore W2898630544C33923547 @default.
- W2898630544 hasConceptScore W2898630544C38652104 @default.
- W2898630544 hasConceptScore W2898630544C41008148 @default.
- W2898630544 hasConceptScore W2898630544C97541855 @default.
- W2898630544 hasConceptScore W2898630544C98763669 @default.
- W2898630544 hasLocation W28986305441 @default.
- W2898630544 hasOpenAccess W2898630544 @default.
- W2898630544 hasPrimaryLocation W28986305441 @default.
- W2898630544 hasRelatedWork W1577206135 @default.
- W2898630544 hasRelatedWork W2060063574 @default.
- W2898630544 hasRelatedWork W2268781370 @default.
- W2898630544 hasRelatedWork W2477485887 @default.
- W2898630544 hasRelatedWork W2619268125 @default.
- W2898630544 hasRelatedWork W2766159815 @default.
- W2898630544 hasRelatedWork W2891166116 @default.
- W2898630544 hasRelatedWork W2896102874 @default.
- W2898630544 hasRelatedWork W2949625232 @default.
- W2898630544 hasRelatedWork W2955611448 @default.
- W2898630544 hasRelatedWork W2963805014 @default.
- W2898630544 hasRelatedWork W2973391802 @default.
- W2898630544 hasRelatedWork W2985639332 @default.
- W2898630544 hasRelatedWork W3049313187 @default.
- W2898630544 hasRelatedWork W3100507333 @default.
- W2898630544 hasRelatedWork W3125388860 @default.
- W2898630544 hasRelatedWork W3128842277 @default.
- W2898630544 hasRelatedWork W3199724058 @default.
- W2898630544 hasRelatedWork W3212584871 @default.
- W2898630544 hasRelatedWork W3212732692 @default.
- W2898630544 isParatext "false" @default.
- W2898630544 isRetracted "false" @default.
- W2898630544 magId "2898630544" @default.