Matches in SemOpenAlex for { <https://semopenalex.org/work/W3177049651> ?p ?o ?g. }
Showing items 1 to 70 of 70 with 100 items per page.
- W3177049651 endingPage "9773" @default.
- W3177049651 startingPage "9765" @default.
- W3177049651 abstract "Transferring knowledge among various environments is important for efficiently learning multiple tasks online. Most existing methods directly use the previously learned models or previously learned optimal policies to learn new tasks. However, these methods may be inefficient when the underlying models or optimal policies are substantially different across tasks. In this paper, we propose Template Learning (TempLe), a PAC-MDP method for multi-task reinforcement learning that could be applied to tasks with varying state/action space without prior knowledge of inter-task mappings. TempLe gains sample efficiency by extracting similarities of the transition dynamics across tasks even when their underlying models or optimal policies have limited commonalities. We present two algorithms for an 'online' and a 'finite-model' setting respectively. We prove that our proposed TempLe algorithms achieve much lower sample complexity than single-task learners or state-of-the-art multi-task methods. We show via systematically designed experiments that our TempLe method universally outperforms the state-of-the-art multi-task methods (PAC-MDP or not) in various settings and regimes." @default.
- W3177049651 created "2021-07-05" @default.
- W3177049651 creator A5030428283 @default.
- W3177049651 creator A5037162168 @default.
- W3177049651 creator A5073908930 @default.
- W3177049651 date "2021-05-18" @default.
- W3177049651 modified "2023-10-14" @default.
- W3177049651 title "TempLe: Learning Template of Transitions for Sample Efficient Multi-task RL" @default.
- W3177049651 doi "https://doi.org/10.1609/aaai.v35i11.17174" @default.
- W3177049651 hasPublicationYear "2021" @default.
- W3177049651 type Work @default.
- W3177049651 sameAs 3177049651 @default.
- W3177049651 citedByCount "3" @default.
- W3177049651 countsByYear W31770496512022 @default.
- W3177049651 countsByYear W31770496512023 @default.
- W3177049651 crossrefType "journal-article" @default.
- W3177049651 hasAuthorship W3177049651A5030428283 @default.
- W3177049651 hasAuthorship W3177049651A5037162168 @default.
- W3177049651 hasAuthorship W3177049651A5073908930 @default.
- W3177049651 hasBestOaLocation W31770496511 @default.
- W3177049651 hasConcept C105795698 @default.
- W3177049651 hasConcept C119857082 @default.
- W3177049651 hasConcept C154945302 @default.
- W3177049651 hasConcept C162324750 @default.
- W3177049651 hasConcept C185592680 @default.
- W3177049651 hasConcept C187736073 @default.
- W3177049651 hasConcept C198531522 @default.
- W3177049651 hasConcept C2778445095 @default.
- W3177049651 hasConcept C2780451532 @default.
- W3177049651 hasConcept C33923547 @default.
- W3177049651 hasConcept C41008148 @default.
- W3177049651 hasConcept C43617362 @default.
- W3177049651 hasConcept C72434380 @default.
- W3177049651 hasConcept C97541855 @default.
- W3177049651 hasConceptScore W3177049651C105795698 @default.
- W3177049651 hasConceptScore W3177049651C119857082 @default.
- W3177049651 hasConceptScore W3177049651C154945302 @default.
- W3177049651 hasConceptScore W3177049651C162324750 @default.
- W3177049651 hasConceptScore W3177049651C185592680 @default.
- W3177049651 hasConceptScore W3177049651C187736073 @default.
- W3177049651 hasConceptScore W3177049651C198531522 @default.
- W3177049651 hasConceptScore W3177049651C2778445095 @default.
- W3177049651 hasConceptScore W3177049651C2780451532 @default.
- W3177049651 hasConceptScore W3177049651C33923547 @default.
- W3177049651 hasConceptScore W3177049651C41008148 @default.
- W3177049651 hasConceptScore W3177049651C43617362 @default.
- W3177049651 hasConceptScore W3177049651C72434380 @default.
- W3177049651 hasConceptScore W3177049651C97541855 @default.
- W3177049651 hasIssue "11" @default.
- W3177049651 hasLocation W31770496511 @default.
- W3177049651 hasLocation W31770496512 @default.
- W3177049651 hasOpenAccess W3177049651 @default.
- W3177049651 hasPrimaryLocation W31770496511 @default.
- W3177049651 hasRelatedWork W1517383877 @default.
- W3177049651 hasRelatedWork W2952448454 @default.
- W3177049651 hasRelatedWork W2984671263 @default.
- W3177049651 hasRelatedWork W2996320348 @default.
- W3177049651 hasRelatedWork W2997970896 @default.
- W3177049651 hasRelatedWork W3035642820 @default.
- W3177049651 hasRelatedWork W3090436287 @default.
- W3177049651 hasRelatedWork W4287063340 @default.
- W3177049651 hasRelatedWork W4287647350 @default.
- W3177049651 hasRelatedWork W4319083788 @default.
- W3177049651 hasVolume "35" @default.
- W3177049651 isParatext "false" @default.
- W3177049651 isRetracted "false" @default.
- W3177049651 magId "3177049651" @default.
- W3177049651 workType "article" @default.
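
The entries above all follow the shape "- subject predicate object @graph.", matching the ?p ?o ?g pattern in the query at the top. A minimal Python sketch for reading such lines back into tuples follows. It assumes that exact text layout; the names LINE_RE, parse_listing_line and group_by_predicate are hypothetical helpers for illustration and are not part of SemOpenAlex or any library.

```python
import re
from collections import defaultdict

# Assumed layout of one listing line: "- <subject> <predicate> <object> @<graph>."
# This pattern is a guess based on the lines shown above, not an official format.
LINE_RE = re.compile(r'^- (\S+) (\S+) (.+) @(\S+)\.$')


def parse_listing_line(line):
    """Return (subject, predicate, object, graph), or None if the line does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    subject, predicate, obj, graph = match.groups()
    # Strip the surrounding double quotes from literals; bare identifiers stay as-is.
    if len(obj) >= 2 and obj.startswith('"') and obj.endswith('"'):
        obj = obj[1:-1]
    return subject, predicate, obj, graph


def group_by_predicate(lines):
    """Collect object values per predicate, skipping lines that do not parse."""
    grouped = defaultdict(list)
    for line in lines:
        parsed = parse_listing_line(line)
        if parsed is not None:
            _, predicate, obj, _ = parsed
            grouped[predicate].append(obj)
    return dict(grouped)


sample = [
    '- W3177049651 hasVolume "35" @default.',
    '- W3177049651 hasIssue "11" @default.',
    '- W3177049651 creator A5030428283 @default.',
    '- W3177049651 creator A5037162168 @default.',
]
fields = group_by_predicate(sample)
```

With the sample lines, fields maps each predicate to its values, so repeated predicates such as creator or hasConcept end up as lists rather than overwriting one another.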