Matches in SemOpenAlex for { <https://semopenalex.org/work/W3118161885> ?p ?o ?g. }
- W3118161885 endingPage "11032" @default.
- W3118161885 startingPage "11024" @default.
- W3118161885 abstract "Humans can abstract prior knowledge from very little data and use it to boost skill learning. In this paper, we propose routine-augmented policy learning (RAPL), which discovers routines composed of primitive actions from a single demonstration and uses discovered routines to augment policy learning. To discover routines from the demonstration, we first abstract routine candidates by identifying grammar over the demonstrated action trajectory. Then, the best routines measured by length and frequency are selected to form a routine library. We propose to learn policy simultaneously at primitive-level and routine-level with discovered routines, leveraging the temporal structure of routines. Our approach enables imitating expert behavior at multiple temporal scales for imitation learning and promotes reinforcement learning exploration. Extensive experiments on Atari games demonstrate that RAPL improves the state-of-the-art imitation learning method SQIL and reinforcement learning method A2C. Further, we show that discovered routines can generalize to unseen levels and difficulties on the CoinRun benchmark." @default.
- W3118161885 created "2021-01-05" @default.
- W3118161885 creator A5010972658 @default.
- W3118161885 creator A5040877128 @default.
- W3118161885 creator A5051951021 @default.
- W3118161885 creator A5071093940 @default.
- W3118161885 creator A5089593268 @default.
- W3118161885 date "2021-05-18" @default.
- W3118161885 modified "2023-10-16" @default.
- W3118161885 title "Augmenting Policy Learning with Routines Discovered from a Single Demonstration" @default.
- W3118161885 cites W1503821144 @default.
- W3118161885 cites W1504212531 @default.
- W3118161885 cites W1556824961 @default.
- W3118161885 cites W1569657508 @default.
- W3118161885 cites W1586944634 @default.
- W3118161885 cites W1592847719 @default.
- W3118161885 cites W1986014385 @default.
- W3118161885 cites W2020149918 @default.
- W3118161885 cites W2105686649 @default.
- W3118161885 cites W2106688224 @default.
- W3118161885 cites W2109910161 @default.
- W3118161885 cites W2168696874 @default.
- W3118161885 cites W2245825236 @default.
- W3118161885 cites W2337392266 @default.
- W3118161885 cites W2419788411 @default.
- W3118161885 cites W2461708070 @default.
- W3118161885 cites W2523728418 @default.
- W3118161885 cites W2592215206 @default.
- W3118161885 cites W2605016475 @default.
- W3118161885 cites W2614839826 @default.
- W3118161885 cites W2741122588 @default.
- W3118161885 cites W2766354286 @default.
- W3118161885 cites W2786036274 @default.
- W3118161885 cites W2803616302 @default.
- W3118161885 cites W2889970038 @default.
- W3118161885 cites W2898621204 @default.
- W3118161885 cites W2902117899 @default.
- W3118161885 cites W2903181768 @default.
- W3118161885 cites W2904157920 @default.
- W3118161885 cites W2912283104 @default.
- W3118161885 cites W2950462959 @default.
- W3118161885 cites W2963099939 @default.
- W3118161885 cites W2963262099 @default.
- W3118161885 cites W2963277051 @default.
- W3118161885 cites W2963376229 @default.
- W3118161885 cites W2963912551 @default.
- W3118161885 cites W2964043796 @default.
- W3118161885 cites W2964121744 @default.
- W3118161885 cites W2964250417 @default.
- W3118161885 cites W2964317067 @default.
- W3118161885 cites W2966794046 @default.
- W3118161885 cites W2977710579 @default.
- W3118161885 cites W2996668373 @default.
- W3118161885 cites W3037207827 @default.
- W3118161885 cites W3103780890 @default.
- W3118161885 cites W2291079750 @default.
- W3118161885 doi "https://doi.org/10.1609/aaai.v35i12.17316" @default.
- W3118161885 hasPublicationYear "2021" @default.
- W3118161885 type Work @default.
- W3118161885 sameAs 3118161885 @default.
- W3118161885 citedByCount "1" @default.
- W3118161885 countsByYear W31181618852021 @default.
- W3118161885 crossrefType "journal-article" @default.
- W3118161885 hasAuthorship W3118161885A5010972658 @default.
- W3118161885 hasAuthorship W3118161885A5040877128 @default.
- W3118161885 hasAuthorship W3118161885A5051951021 @default.
- W3118161885 hasAuthorship W3118161885A5071093940 @default.
- W3118161885 hasAuthorship W3118161885A5089593268 @default.
- W3118161885 hasBestOaLocation W31181618851 @default.
- W3118161885 hasConcept C119857082 @default.
- W3118161885 hasConcept C121332964 @default.
- W3118161885 hasConcept C126388530 @default.
- W3118161885 hasConcept C1276947 @default.
- W3118161885 hasConcept C13280743 @default.
- W3118161885 hasConcept C13662910 @default.
- W3118161885 hasConcept C154945302 @default.
- W3118161885 hasConcept C15744967 @default.
- W3118161885 hasConcept C185798385 @default.
- W3118161885 hasConcept C205649164 @default.
- W3118161885 hasConcept C2780791683 @default.
- W3118161885 hasConcept C41008148 @default.
- W3118161885 hasConcept C62520636 @default.
- W3118161885 hasConcept C77805123 @default.
- W3118161885 hasConcept C97541855 @default.
- W3118161885 hasConceptScore W3118161885C119857082 @default.
- W3118161885 hasConceptScore W3118161885C121332964 @default.
- W3118161885 hasConceptScore W3118161885C126388530 @default.
- W3118161885 hasConceptScore W3118161885C1276947 @default.
- W3118161885 hasConceptScore W3118161885C13280743 @default.
- W3118161885 hasConceptScore W3118161885C13662910 @default.
- W3118161885 hasConceptScore W3118161885C154945302 @default.
- W3118161885 hasConceptScore W3118161885C15744967 @default.
- W3118161885 hasConceptScore W3118161885C185798385 @default.
- W3118161885 hasConceptScore W3118161885C205649164 @default.
- W3118161885 hasConceptScore W3118161885C2780791683 @default.
- W3118161885 hasConceptScore W3118161885C41008148 @default.
- W3118161885 hasConceptScore W3118161885C62520636 @default.
- W3118161885 hasConceptScore W3118161885C77805123 @default.