Matches in SemOpenAlex for { <https://semopenalex.org/work/W3210047029> ?p ?o ?g. }
- W3210047029 abstract "Transfer in Reinforcement Learning aims to improve learning performance on target tasks using knowledge from experienced source tasks. Successor features (SF) are a prominent transfer mechanism in domains where the reward function changes between tasks. They make it possible to reevaluate the expected return of previously learned policies in a new target task and thereby transfer their knowledge. A limiting factor of the SF framework is its assumption that rewards linearly decompose into successor features and a reward weight vector. We propose a novel SF mechanism, $\xi$-learning, based on learning the cumulative discounted probability of successor features. Crucially, $\xi$-learning allows the expected return of policies to be reevaluated for general reward functions. We introduce two $\xi$-learning variations, prove its convergence, and provide a guarantee on its transfer performance. Experimental evaluations based on $\xi$-learning with function approximation demonstrate the prominent advantage of $\xi$-learning over available mechanisms, not only for general reward functions but also in the case of linearly decomposable reward functions." @default.
- W3210047029 created "2021-11-08" @default.
- W3210047029 creator A5042833606 @default.
- W3210047029 creator A5066621495 @default.
- W3210047029 date "2021-11-12" @default.
- W3210047029 modified "2023-09-25" @default.
- W3210047029 title "Xi-Learning: Successor Feature Transfer Learning for General Reward Functions" @default.
- W3210047029 cites W158722652 @default.
- W3210047029 cites W2056354534 @default.
- W3210047029 cites W2097381042 @default.
- W3210047029 cites W2121863487 @default.
- W3210047029 cites W2145339207 @default.
- W3210047029 cites W2147750403 @default.
- W3210047029 cites W2417089653 @default.
- W3210047029 cites W2804673281 @default.
- W3210047029 cites W2902907165 @default.
- W3210047029 cites W2904339729 @default.
- W3210047029 cites W2913485808 @default.
- W3210047029 cites W2914694967 @default.
- W3210047029 cites W2952371101 @default.
- W3210047029 cites W2952451928 @default.
- W3210047029 cites W2953319434 @default.
- W3210047029 cites W2962717849 @default.
- W3210047029 cites W2963019567 @default.
- W3210047029 cites W2981870800 @default.
- W3210047029 cites W2990858845 @default.
- W3210047029 cites W3022287533 @default.
- W3210047029 cites W3072315125 @default.
- W3210047029 cites W3085267010 @default.
- W3210047029 cites W3093511257 @default.
- W3210047029 cites W3117215073 @default.
- W3210047029 cites W3125995651 @default.
- W3210047029 cites W3135037631 @default.
- W3210047029 cites W3135452916 @default.
- W3210047029 cites W3139073295 @default.
- W3210047029 cites W3173451630 @default.
- W3210047029 cites W3207825988 @default.
- W3210047029 cites W567721252 @default.
- W3210047029 hasPublicationYear "2021" @default.
- W3210047029 type Work @default.
- W3210047029 sameAs 3210047029 @default.
- W3210047029 citedByCount "0" @default.
- W3210047029 crossrefType "posted-content" @default.
- W3210047029 hasAuthorship W3210047029A5042833606 @default.
- W3210047029 hasAuthorship W3210047029A5066621495 @default.
- W3210047029 hasBestOaLocation W32100470291 @default.
- W3210047029 hasConcept C119857082 @default.
- W3210047029 hasConcept C134306372 @default.
- W3210047029 hasConcept C138885662 @default.
- W3210047029 hasConcept C14036430 @default.
- W3210047029 hasConcept C150899416 @default.
- W3210047029 hasConcept C154945302 @default.
- W3210047029 hasConcept C162324750 @default.
- W3210047029 hasConcept C171041071 @default.
- W3210047029 hasConcept C2776401178 @default.
- W3210047029 hasConcept C2777303404 @default.
- W3210047029 hasConcept C2779178101 @default.
- W3210047029 hasConcept C33923547 @default.
- W3210047029 hasConcept C41008148 @default.
- W3210047029 hasConcept C41895202 @default.
- W3210047029 hasConcept C50522688 @default.
- W3210047029 hasConcept C75306776 @default.
- W3210047029 hasConcept C78458016 @default.
- W3210047029 hasConcept C86803240 @default.
- W3210047029 hasConcept C97541855 @default.
- W3210047029 hasConceptScore W3210047029C119857082 @default.
- W3210047029 hasConceptScore W3210047029C134306372 @default.
- W3210047029 hasConceptScore W3210047029C138885662 @default.
- W3210047029 hasConceptScore W3210047029C14036430 @default.
- W3210047029 hasConceptScore W3210047029C150899416 @default.
- W3210047029 hasConceptScore W3210047029C154945302 @default.
- W3210047029 hasConceptScore W3210047029C162324750 @default.
- W3210047029 hasConceptScore W3210047029C171041071 @default.
- W3210047029 hasConceptScore W3210047029C2776401178 @default.
- W3210047029 hasConceptScore W3210047029C2777303404 @default.
- W3210047029 hasConceptScore W3210047029C2779178101 @default.
- W3210047029 hasConceptScore W3210047029C33923547 @default.
- W3210047029 hasConceptScore W3210047029C41008148 @default.
- W3210047029 hasConceptScore W3210047029C41895202 @default.
- W3210047029 hasConceptScore W3210047029C50522688 @default.
- W3210047029 hasConceptScore W3210047029C75306776 @default.
- W3210047029 hasConceptScore W3210047029C78458016 @default.
- W3210047029 hasConceptScore W3210047029C86803240 @default.
- W3210047029 hasConceptScore W3210047029C97541855 @default.
- W3210047029 hasLocation W32100470291 @default.
- W3210047029 hasOpenAccess W3210047029 @default.
- W3210047029 hasPrimaryLocation W32100470291 @default.
- W3210047029 hasRelatedWork W1364030 @default.
- W3210047029 hasRelatedWork W2892939 @default.
- W3210047029 hasRelatedWork W4783353 @default.
- W3210047029 hasRelatedWork W5922077 @default.
- W3210047029 hasRelatedWork W5991500 @default.
- W3210047029 hasRelatedWork W8042183 @default.
- W3210047029 hasRelatedWork W8333676 @default.
- W3210047029 hasRelatedWork W8340350 @default.
- W3210047029 hasRelatedWork W867563 @default.
- W3210047029 hasRelatedWork W9958333 @default.
- W3210047029 isParatext "false" @default.
- W3210047029 isRetracted "false" @default.
- W3210047029 magId "3210047029" @default.