Matches in SemOpenAlex for { <https://semopenalex.org/work/W3124056987> ?p ?o ?g. }
- W3124056987 abstract "Policy specification is a process by which a human can initialize a robot's behaviour and, in turn, warm-start policy optimization via Reinforcement Learning (RL). While policy specification/design is inherently a collaborative process, modern methods based on Learning from Demonstration or Deep RL lack the model interpretability and accessibility to be classified as such. Current state-of-the-art methods for policy specification rely on black-box models, which are an insufficient means of collaboration for non-expert users: these models provide no means of inspecting policies learnt by the agent and are not focused on creating a usable modality for teaching robot behaviour. In this paper, we propose a novel machine learning framework that enables humans to 1) specify, through natural language, interpretable policies in the form of easy-to-understand decision trees, 2) leverage these policies to warm-start reinforcement learning and 3) outperform baselines that lack our natural language initialization mechanism. We train our approach by collecting a first-of-its-kind corpus mapping free-form natural language policy descriptions to decision tree-based policies. We show that our novel framework translates natural language to decision trees with 96% and 97% accuracy on a held-out corpus across two domains, respectively. Finally, we validate that policies initialized with natural language commands are able to significantly outperform relevant baselines (p < 0.001) that do not benefit from our natural language-based warm-start technique." @default.
- W3124056987 created "2021-02-01" @default.
- W3124056987 creator A5008211323 @default.
- W3124056987 creator A5023539634 @default.
- W3124056987 creator A5051331381 @default.
- W3124056987 creator A5089421543 @default.
- W3124056987 date "2021-01-18" @default.
- W3124056987 modified "2023-09-27" @default.
- W3124056987 title "Interpretable Policy Specification and Synthesis through Natural Language and RL" @default.
- W3124056987 cites W1501005121 @default.
- W3124056987 cites W1522301498 @default.
- W3124056987 cites W1825675169 @default.
- W3124056987 cites W1999874108 @default.
- W3124056987 cites W2121517924 @default.
- W3124056987 cites W2224454470 @default.
- W3124056987 cites W2236233024 @default.
- W3124056987 cites W2444236087 @default.
- W3124056987 cites W2553882142 @default.
- W3124056987 cites W2594475271 @default.
- W3124056987 cites W2611818442 @default.
- W3124056987 cites W2736601468 @default.
- W3124056987 cites W2773656636 @default.
- W3124056987 cites W2787560479 @default.
- W3124056987 cites W2804020409 @default.
- W3124056987 cites W2804930149 @default.
- W3124056987 cites W2895560838 @default.
- W3124056987 cites W2944766483 @default.
- W3124056987 cites W2948609886 @default.
- W3124056987 cites W2949888546 @default.
- W3124056987 cites W2951420334 @default.
- W3124056987 cites W2962790223 @default.
- W3124056987 cites W2962851944 @default.
- W3124056987 cites W2963099939 @default.
- W3124056987 cites W2964125683 @default.
- W3124056987 cites W2964153729 @default.
- W3124056987 cites W2964308564 @default.
- W3124056987 cites W2981601849 @default.
- W3124056987 cites W2990963624 @default.
- W3124056987 cites W3006381853 @default.
- W3124056987 cites W3037467471 @default.
- W3124056987 cites W3097001800 @default.
- W3124056987 cites W46490633 @default.
- W3124056987 hasPublicationYear "2021" @default.
- W3124056987 type Work @default.
- W3124056987 sameAs 3124056987 @default.
- W3124056987 citedByCount "3" @default.
- W3124056987 countsByYear W31240569872021 @default.
- W3124056987 crossrefType "posted-content" @default.
- W3124056987 hasAuthorship W3124056987A5008211323 @default.
- W3124056987 hasAuthorship W3124056987A5023539634 @default.
- W3124056987 hasAuthorship W3124056987A5051331381 @default.
- W3124056987 hasAuthorship W3124056987A5089421543 @default.
- W3124056987 hasConcept C114466953 @default.
- W3124056987 hasConcept C119857082 @default.
- W3124056987 hasConcept C136764020 @default.
- W3124056987 hasConcept C153083717 @default.
- W3124056987 hasConcept C154945302 @default.
- W3124056987 hasConcept C195324797 @default.
- W3124056987 hasConcept C199360897 @default.
- W3124056987 hasConcept C204321447 @default.
- W3124056987 hasConcept C2779439875 @default.
- W3124056987 hasConcept C2780615836 @default.
- W3124056987 hasConcept C2781067378 @default.
- W3124056987 hasConcept C41008148 @default.
- W3124056987 hasConcept C97541855 @default.
- W3124056987 hasConceptScore W3124056987C114466953 @default.
- W3124056987 hasConceptScore W3124056987C119857082 @default.
- W3124056987 hasConceptScore W3124056987C136764020 @default.
- W3124056987 hasConceptScore W3124056987C153083717 @default.
- W3124056987 hasConceptScore W3124056987C154945302 @default.
- W3124056987 hasConceptScore W3124056987C195324797 @default.
- W3124056987 hasConceptScore W3124056987C199360897 @default.
- W3124056987 hasConceptScore W3124056987C204321447 @default.
- W3124056987 hasConceptScore W3124056987C2779439875 @default.
- W3124056987 hasConceptScore W3124056987C2780615836 @default.
- W3124056987 hasConceptScore W3124056987C2781067378 @default.
- W3124056987 hasConceptScore W3124056987C41008148 @default.
- W3124056987 hasConceptScore W3124056987C97541855 @default.
- W3124056987 hasLocation W31240569871 @default.
- W3124056987 hasOpenAccess W3124056987 @default.
- W3124056987 hasPrimaryLocation W31240569871 @default.
- W3124056987 hasRelatedWork W1602918461 @default.
- W3124056987 hasRelatedWork W2158615804 @default.
- W3124056987 hasRelatedWork W2292403815 @default.
- W3124056987 hasRelatedWork W2613239086 @default.
- W3124056987 hasRelatedWork W2705882626 @default.
- W3124056987 hasRelatedWork W2799054028 @default.
- W3124056987 hasRelatedWork W2895835746 @default.
- W3124056987 hasRelatedWork W2938578532 @default.
- W3124056987 hasRelatedWork W2940813702 @default.
- W3124056987 hasRelatedWork W2963508916 @default.
- W3124056987 hasRelatedWork W2966032590 @default.
- W3124056987 hasRelatedWork W3030535585 @default.
- W3124056987 hasRelatedWork W3037614835 @default.
- W3124056987 hasRelatedWork W3092418144 @default.
- W3124056987 hasRelatedWork W3093964511 @default.
- W3124056987 hasRelatedWork W3118630514 @default.
- W3124056987 hasRelatedWork W3124166961 @default.
- W3124056987 hasRelatedWork W3134613326 @default.
- W3124056987 hasRelatedWork W3183605017 @default.