Matches in SemOpenAlex for { <https://semopenalex.org/work/W3208987033> ?p ?o ?g. }
Showing items 1 to 98 of 98 with 100 items per page.
- W3208987033 abstract "Offline reinforcement learning (RL) harnesses the power of massive datasets for resolving sequential decision problems. Most existing papers only discuss defending against out-of-distribution (OOD) actions while we investigate a broader issue, the spurious correlations between epistemic uncertainty and decision-making, an essential factor that causes suboptimality. In this paper, we propose Spurious COrrelation REduction (SCORE) for offline RL, a practically effective and theoretically provable algorithm. We empirically show that SCORE achieves the SoTA performance with 3.1x acceleration on various tasks in a standard benchmark (D4RL). The proposed algorithm introduces an annealing behavior cloning regularizer to help produce a high-quality estimation of uncertainty which is critical for eliminating spurious correlations from suboptimality. Theoretically, we justify the rationality of the proposed method and prove its convergence to the optimal policy with a sublinear rate under mild assumptions." @default.
- W3208987033 created "2021-11-08" @default.
- W3208987033 creator A5025567827 @default.
- W3208987033 creator A5044788927 @default.
- W3208987033 creator A5048272675 @default.
- W3208987033 creator A5057139422 @default.
- W3208987033 creator A5058787076 @default.
- W3208987033 creator A5078210646 @default.
- W3208987033 creator A5088551822 @default.
- W3208987033 date "2021-10-24" @default.
- W3208987033 modified "2023-10-18" @default.
- W3208987033 title "SCORE: Spurious COrrelation REduction for Offline Reinforcement Learning" @default.
- W3208987033 cites W1771410628 @default.
- W3208987033 cites W2072931156 @default.
- W3208987033 cites W2130304665 @default.
- W3208987033 cites W2787938642 @default.
- W3208987033 cites W2962785728 @default.
- W3208987033 cites W2963704132 @default.
- W3208987033 cites W2963938771 @default.
- W3208987033 cites W3022566517 @default.
- W3208987033 cites W3048911479 @default.
- W3208987033 cites W3133362425 @default.
- W3208987033 doi "https://doi.org/10.48550/arxiv.2110.12468" @default.
- W3208987033 hasPublicationYear "2021" @default.
- W3208987033 type Work @default.
- W3208987033 sameAs 3208987033 @default.
- W3208987033 citedByCount "0" @default.
- W3208987033 crossrefType "posted-content" @default.
- W3208987033 hasAuthorship W3208987033A5025567827 @default.
- W3208987033 hasAuthorship W3208987033A5044788927 @default.
- W3208987033 hasAuthorship W3208987033A5048272675 @default.
- W3208987033 hasAuthorship W3208987033A5057139422 @default.
- W3208987033 hasAuthorship W3208987033A5058787076 @default.
- W3208987033 hasAuthorship W3208987033A5078210646 @default.
- W3208987033 hasAuthorship W3208987033A5088551822 @default.
- W3208987033 hasBestOaLocation W32089870331 @default.
- W3208987033 hasConcept C111335779 @default.
- W3208987033 hasConcept C11413529 @default.
- W3208987033 hasConcept C117160843 @default.
- W3208987033 hasConcept C117220453 @default.
- W3208987033 hasConcept C119857082 @default.
- W3208987033 hasConcept C126255220 @default.
- W3208987033 hasConcept C13280743 @default.
- W3208987033 hasConcept C134306372 @default.
- W3208987033 hasConcept C136764020 @default.
- W3208987033 hasConcept C154945302 @default.
- W3208987033 hasConcept C162324750 @default.
- W3208987033 hasConcept C185798385 @default.
- W3208987033 hasConcept C205649164 @default.
- W3208987033 hasConcept C2524010 @default.
- W3208987033 hasConcept C2777303404 @default.
- W3208987033 hasConcept C2780490138 @default.
- W3208987033 hasConcept C2986087404 @default.
- W3208987033 hasConcept C33923547 @default.
- W3208987033 hasConcept C41008148 @default.
- W3208987033 hasConcept C50522688 @default.
- W3208987033 hasConcept C97256817 @default.
- W3208987033 hasConcept C97541855 @default.
- W3208987033 hasConceptScore W3208987033C111335779 @default.
- W3208987033 hasConceptScore W3208987033C11413529 @default.
- W3208987033 hasConceptScore W3208987033C117160843 @default.
- W3208987033 hasConceptScore W3208987033C117220453 @default.
- W3208987033 hasConceptScore W3208987033C119857082 @default.
- W3208987033 hasConceptScore W3208987033C126255220 @default.
- W3208987033 hasConceptScore W3208987033C13280743 @default.
- W3208987033 hasConceptScore W3208987033C134306372 @default.
- W3208987033 hasConceptScore W3208987033C136764020 @default.
- W3208987033 hasConceptScore W3208987033C154945302 @default.
- W3208987033 hasConceptScore W3208987033C162324750 @default.
- W3208987033 hasConceptScore W3208987033C185798385 @default.
- W3208987033 hasConceptScore W3208987033C205649164 @default.
- W3208987033 hasConceptScore W3208987033C2524010 @default.
- W3208987033 hasConceptScore W3208987033C2777303404 @default.
- W3208987033 hasConceptScore W3208987033C2780490138 @default.
- W3208987033 hasConceptScore W3208987033C2986087404 @default.
- W3208987033 hasConceptScore W3208987033C33923547 @default.
- W3208987033 hasConceptScore W3208987033C41008148 @default.
- W3208987033 hasConceptScore W3208987033C50522688 @default.
- W3208987033 hasConceptScore W3208987033C97256817 @default.
- W3208987033 hasConceptScore W3208987033C97541855 @default.
- W3208987033 hasLocation W32089870331 @default.
- W3208987033 hasLocation W32089870332 @default.
- W3208987033 hasOpenAccess W3208987033 @default.
- W3208987033 hasPrimaryLocation W32089870331 @default.
- W3208987033 hasRelatedWork W3022038857 @default.
- W3208987033 hasRelatedWork W3132110306 @default.
- W3208987033 hasRelatedWork W3153007185 @default.
- W3208987033 hasRelatedWork W3212439828 @default.
- W3208987033 hasRelatedWork W4225307033 @default.
- W3208987033 hasRelatedWork W4225619808 @default.
- W3208987033 hasRelatedWork W4281626144 @default.
- W3208987033 hasRelatedWork W4286893825 @default.
- W3208987033 hasRelatedWork W4319083788 @default.
- W3208987033 hasRelatedWork W4361253176 @default.
- W3208987033 isParatext "false" @default.
- W3208987033 isRetracted "false" @default.
- W3208987033 magId "3208987033" @default.
- W3208987033 workType "article" @default.
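
The statements above all follow the shape of the query pattern in the header: one subject, then a predicate (`?p`), an object (`?o`), and a graph (`?g`). As a minimal sketch, assuming each statement is a single line of the form `- <subject> <predicate> <object> @<graph>.`, the listing could be grouped by predicate as shown below. The names `LINE_RE` and `parse_listing` are hypothetical helpers for illustration and are not part of SemOpenAlex.

```python
import re
from collections import defaultdict

# Assumed line shape (inferred from the listing, not from any official spec):
#   - <subject> <predicate> <object> @<graph>.
LINE_RE = re.compile(r'^- (\S+) (\S+) (.+) @(\S+?)\.$')


def parse_listing(lines):
    """Group object values by predicate for every line that matches LINE_RE."""
    by_predicate = defaultdict(list)
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # skip the header and any line that is not a statement
        _subject, predicate, obj, _graph = match.groups()
        # Literal values are wrapped in double quotes in the listing; drop them.
        by_predicate[predicate].append(obj.strip('"'))
    return dict(by_predicate)


# A few statements copied from the listing above.
sample = [
    '- W3208987033 created "2021-11-08" @default.',
    '- W3208987033 creator A5025567827 @default.',
    '- W3208987033 creator A5044788927 @default.',
    '- W3208987033 type Work @default.',
]

parsed = parse_listing(sample)
print(parsed["creator"])
```

Grouping by predicate keeps repeated properties such as `creator`, `cites`, or `hasConcept` together as lists, which matches how the listing repeats one line per value.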