Matches in SemOpenAlex for { <https://semopenalex.org/work/W3087200438> ?p ?o ?g. }
- W3087200438 abstract "Existing model-based value expansion methods typically leverage a world model for value estimation with a fixed rollout horizon to assist policy learning. However, the fixed rollout with an inaccurate model has a potential to harm the learning process. In this paper, we investigate the idea of using the model knowledge for value expansion adaptively. We propose a novel method called Dynamic-horizon Model-based Value Expansion (DMVE) to adjust the world model usage with different rollout horizons. Inspired by reconstruction-based techniques that can be applied for visual data novelty detection, we utilize a world model with a reconstruction module for image feature extraction, in order to acquire more precise value estimation. The raw and the reconstructed images are both used to determine the appropriate horizon for adaptive value expansion. On several benchmark visual control tasks, experimental results show that DMVE outperforms all baselines in sample efficiency and final performance, indicating that DMVE can achieve more effective and accurate value estimation than state-of-the-art model-based methods." @default.
- W3087200438 created "2020-09-25" @default.
- W3087200438 creator A5003596182 @default.
- W3087200438 creator A5012412127 @default.
- W3087200438 creator A5047509839 @default.
- W3087200438 creator A5048989648 @default.
- W3087200438 creator A5049454999 @default.
- W3087200438 date "2020-09-21" @default.
- W3087200438 modified "2023-09-23" @default.
- W3087200438 title "Dynamic Horizon Value Estimation for Model-based Reinforcement Learning." @default.
- W3087200438 cites W1980035368 @default.
- W3087200438 cites W2102928860 @default.
- W3087200438 cites W2115627867 @default.
- W3087200438 cites W2119717200 @default.
- W3087200438 cites W2121103318 @default.
- W3087200438 cites W2121863487 @default.
- W3087200438 cites W2123979492 @default.
- W3087200438 cites W2140135625 @default.
- W3087200438 cites W2147800946 @default.
- W3087200438 cites W2151268438 @default.
- W3087200438 cites W2162717641 @default.
- W3087200438 cites W2165150801 @default.
- W3087200438 cites W2173248099 @default.
- W3087200438 cites W2268617045 @default.
- W3087200438 cites W2416041116 @default.
- W3087200438 cites W2738669288 @default.
- W3087200438 cites W2789824229 @default.
- W3087200438 cites W2889347284 @default.
- W3087200438 cites W2890208753 @default.
- W3087200438 cites W2892230114 @default.
- W3087200438 cites W2897498038 @default.
- W3087200438 cites W2900152462 @default.
- W3087200438 cites W2904693122 @default.
- W3087200438 cites W2939569248 @default.
- W3087200438 cites W2950517718 @default.
- W3087200438 cites W2962804251 @default.
- W3087200438 cites W2962872206 @default.
- W3087200438 cites W2962902376 @default.
- W3087200438 cites W2962951703 @default.
- W3087200438 cites W2963061824 @default.
- W3087200438 cites W2963430173 @default.
- W3087200438 cites W2963846183 @default.
- W3087200438 cites W2963890729 @default.
- W3087200438 cites W2963923407 @default.
- W3087200438 cites W2964043796 @default.
- W3087200438 cites W2964173023 @default.
- W3087200438 cites W2964220198 @default.
- W3087200438 cites W2964295739 @default.
- W3087200438 cites W2964340928 @default.
- W3087200438 cites W2966477753 @default.
- W3087200438 cites W2970277495 @default.
- W3087200438 cites W2995298643 @default.
- W3087200438 cites W3035020106 @default.
- W3087200438 cites W3036619998 @default.
- W3087200438 cites W3040472139 @default.
- W3087200438 cites W3040680121 @default.
- W3087200438 cites W3084241738 @default.
- W3087200438 hasPublicationYear "2020" @default.
- W3087200438 type Work @default.
- W3087200438 sameAs 3087200438 @default.
- W3087200438 citedByCount "0" @default.
- W3087200438 crossrefType "posted-content" @default.
- W3087200438 hasAuthorship W3087200438A5003596182 @default.
- W3087200438 hasAuthorship W3087200438A5012412127 @default.
- W3087200438 hasAuthorship W3087200438A5047509839 @default.
- W3087200438 hasAuthorship W3087200438A5048989648 @default.
- W3087200438 hasAuthorship W3087200438A5049454999 @default.
- W3087200438 hasConcept C111919701 @default.
- W3087200438 hasConcept C119857082 @default.
- W3087200438 hasConcept C126255220 @default.
- W3087200438 hasConcept C13280743 @default.
- W3087200438 hasConcept C14646407 @default.
- W3087200438 hasConcept C153083717 @default.
- W3087200438 hasConcept C154945302 @default.
- W3087200438 hasConcept C185798385 @default.
- W3087200438 hasConcept C205649164 @default.
- W3087200438 hasConcept C2776291640 @default.
- W3087200438 hasConcept C28761237 @default.
- W3087200438 hasConcept C33923547 @default.
- W3087200438 hasConcept C41008148 @default.
- W3087200438 hasConcept C97541855 @default.
- W3087200438 hasConcept C98045186 @default.
- W3087200438 hasConceptScore W3087200438C111919701 @default.
- W3087200438 hasConceptScore W3087200438C119857082 @default.
- W3087200438 hasConceptScore W3087200438C126255220 @default.
- W3087200438 hasConceptScore W3087200438C13280743 @default.
- W3087200438 hasConceptScore W3087200438C14646407 @default.
- W3087200438 hasConceptScore W3087200438C153083717 @default.
- W3087200438 hasConceptScore W3087200438C154945302 @default.
- W3087200438 hasConceptScore W3087200438C185798385 @default.
- W3087200438 hasConceptScore W3087200438C205649164 @default.
- W3087200438 hasConceptScore W3087200438C2776291640 @default.
- W3087200438 hasConceptScore W3087200438C28761237 @default.
- W3087200438 hasConceptScore W3087200438C33923547 @default.
- W3087200438 hasConceptScore W3087200438C41008148 @default.
- W3087200438 hasConceptScore W3087200438C97541855 @default.
- W3087200438 hasConceptScore W3087200438C98045186 @default.
- W3087200438 hasLocation W30872004381 @default.
- W3087200438 hasOpenAccess W3087200438 @default.
- W3087200438 hasPrimaryLocation W30872004381 @default.