Matches in SemOpenAlex for { <https://semopenalex.org/work/W3118409705> ?p ?o ?g. }
- W3118409705 abstract "Non-stationarity can arise in Reinforcement Learning (RL) even in stationary environments. For example, most RL algorithms collect new data throughout training, using a non-stationary behaviour policy. Due to the transience of this non-stationarity, it is often not explicitly addressed in deep RL and a single neural network is continually updated. However, we find evidence that neural networks exhibit a memory effect, where these transient non-stationarities can permanently impact the latent representation and adversely affect generalisation performance. Consequently, to improve generalisation of deep RL agents, we propose Iterated Relearning (ITER). ITER augments standard RL training by repeated knowledge transfer of the current policy into a freshly initialised network, which thereby experiences less non-stationarity during training. Experimentally, we show that ITER improves performance on the challenging generalisation benchmarks ProcGen and Multiroom." @default.
- W3118409705 created "2021-01-18" @default.
- W3118409705 creator A5016496356 @default.
- W3118409705 creator A5020311328 @default.
- W3118409705 creator A5024136382 @default.
- W3118409705 creator A5045468603 @default.
- W3118409705 creator A5056879203 @default.
- W3118409705 date "2021-05-03" @default.
- W3118409705 modified "2023-09-26" @default.
- W3118409705 title "Transient Non-stationarity and Generalisation in Deep Reinforcement Learning" @default.
- W3118409705 cites W1821462560 @default.
- W3118409705 cites W2060277733 @default.
- W3118409705 cites W2119567691 @default.
- W3118409705 cites W2121863487 @default.
- W3118409705 cites W2153633111 @default.
- W3118409705 cites W2156371714 @default.
- W3118409705 cites W2168342951 @default.
- W3118409705 cites W2169136412 @default.
- W3118409705 cites W2194775991 @default.
- W3118409705 cites W2473930607 @default.
- W3118409705 cites W2605102758 @default.
- W3118409705 cites W2736601468 @default.
- W3118409705 cites W2786036274 @default.
- W3118409705 cites W2788388592 @default.
- W3118409705 cites W2789517807 @default.
- W3118409705 cites W2797527950 @default.
- W3118409705 cites W2809668646 @default.
- W3118409705 cites W2914731160 @default.
- W3118409705 cites W2916826721 @default.
- W3118409705 cites W2962764591 @default.
- W3118409705 cites W2962858248 @default.
- W3118409705 cites W2962957031 @default.
- W3118409705 cites W2963199420 @default.
- W3118409705 cites W2963560049 @default.
- W3118409705 cites W2963680188 @default.
- W3118409705 cites W2963850662 @default.
- W3118409705 cites W2963875819 @default.
- W3118409705 cites W2964222566 @default.
- W3118409705 cites W2964293126 @default.
- W3118409705 cites W2970214542 @default.
- W3118409705 cites W2981021427 @default.
- W3118409705 cites W2994073215 @default.
- W3118409705 cites W2999617596 @default.
- W3118409705 cites W3034971196 @default.
- W3118409705 cites W3103361051 @default.
- W3118409705 cites W3118608800 @default.
- W3118409705 hasPublicationYear "2021" @default.
- W3118409705 type Work @default.
- W3118409705 sameAs 3118409705 @default.
- W3118409705 citedByCount "4" @default.
- W3118409705 countsByYear W31184097052021 @default.
- W3118409705 crossrefType "proceedings-article" @default.
- W3118409705 hasAuthorship W3118409705A5016496356 @default.
- W3118409705 hasAuthorship W3118409705A5020311328 @default.
- W3118409705 hasAuthorship W3118409705A5024136382 @default.
- W3118409705 hasAuthorship W3118409705A5045468603 @default.
- W3118409705 hasAuthorship W3118409705A5056879203 @default.
- W3118409705 hasConcept C111919701 @default.
- W3118409705 hasConcept C119857082 @default.
- W3118409705 hasConcept C134306372 @default.
- W3118409705 hasConcept C140479938 @default.
- W3118409705 hasConcept C154945302 @default.
- W3118409705 hasConcept C17744445 @default.
- W3118409705 hasConcept C199539241 @default.
- W3118409705 hasConcept C2776359362 @default.
- W3118409705 hasConcept C2780799671 @default.
- W3118409705 hasConcept C33923547 @default.
- W3118409705 hasConcept C41008148 @default.
- W3118409705 hasConcept C50644808 @default.
- W3118409705 hasConcept C94625758 @default.
- W3118409705 hasConcept C97541855 @default.
- W3118409705 hasConceptScore W3118409705C111919701 @default.
- W3118409705 hasConceptScore W3118409705C119857082 @default.
- W3118409705 hasConceptScore W3118409705C134306372 @default.
- W3118409705 hasConceptScore W3118409705C140479938 @default.
- W3118409705 hasConceptScore W3118409705C154945302 @default.
- W3118409705 hasConceptScore W3118409705C17744445 @default.
- W3118409705 hasConceptScore W3118409705C199539241 @default.
- W3118409705 hasConceptScore W3118409705C2776359362 @default.
- W3118409705 hasConceptScore W3118409705C2780799671 @default.
- W3118409705 hasConceptScore W3118409705C33923547 @default.
- W3118409705 hasConceptScore W3118409705C41008148 @default.
- W3118409705 hasConceptScore W3118409705C50644808 @default.
- W3118409705 hasConceptScore W3118409705C94625758 @default.
- W3118409705 hasConceptScore W3118409705C97541855 @default.
- W3118409705 hasLocation W31184097051 @default.
- W3118409705 hasOpenAccess W3118409705 @default.
- W3118409705 hasPrimaryLocation W31184097051 @default.
- W3118409705 hasRelatedWork W122779128 @default.
- W3118409705 hasRelatedWork W2295413729 @default.
- W3118409705 hasRelatedWork W2795712614 @default.
- W3118409705 hasRelatedWork W2946421721 @default.
- W3118409705 hasRelatedWork W2952765942 @default.
- W3118409705 hasRelatedWork W2994753955 @default.
- W3118409705 hasRelatedWork W3001528895 @default.
- W3118409705 hasRelatedWork W3014176776 @default.
- W3118409705 hasRelatedWork W3016846277 @default.
- W3118409705 hasRelatedWork W3034752558 @default.
- W3118409705 hasRelatedWork W3035595221 @default.
- W3118409705 hasRelatedWork W3037179286 @default.
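
Each match above appears to follow the shape `- <subject> <predicate> <object> @<graph>.`, with literal values in double quotes. The following is a minimal sketch of how such lines could be split into fields and grouped by predicate. It assumes that line shape holds throughout; the names `LINE_RE`, `parse_match`, and `group_by_predicate` are hypothetical helpers introduced for illustration and are not part of SemOpenAlex.

```python
import re
from collections import defaultdict

# Assumed line shape (inferred from the listing, not an official format):
#   - <subject> <predicate> <object> @<graph>.
LINE_RE = re.compile(r'^- (\S+) (\S+) (.+) @(\S+)\.$')


def parse_match(line):
    """Split one listing line into (subject, predicate, object, graph).

    Returns None when the line does not follow the assumed shape.
    """
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    subject, predicate, obj, graph = m.groups()
    # Literal values are quoted in the listing; strip the surrounding quotes.
    if len(obj) >= 2 and obj[0] == '"' and obj[-1] == '"':
        obj = obj[1:-1]
    return subject, predicate, obj, graph


def group_by_predicate(lines):
    """Collect object values under their predicate, preserving listing order."""
    grouped = defaultdict(list)
    for line in lines:
        parsed = parse_match(line)
        if parsed is not None:
            _, predicate, obj, _ = parsed
            grouped[predicate].append(obj)
    return dict(grouped)


# A few lines copied from the listing above.
sample = [
    '- W3118409705 created "2021-01-18" @default.',
    '- W3118409705 creator A5016496356 @default.',
    '- W3118409705 creator A5020311328 @default.',
    '- W3118409705 citedByCount "4" @default.',
]
grouped = group_by_predicate(sample)
print(grouped["creator"])  # ['A5016496356', 'A5020311328']
```

Grouping by predicate makes the repeated entries (`creator`, `cites`, `hasConcept`, `hasRelatedWork`) available as ordered lists, while single-valued predicates such as `title` or `citedByCount` become one-element lists.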