Matches in SemOpenAlex for { <https://semopenalex.org/work/W3096356720> ?p ?o ?g. }
- W3096356720 abstract "A fundamental trait of intelligence is the ability to achieve goals in the face of novel circumstances, such as making decisions from new action choices. However, standard reinforcement learning assumes a fixed set of actions and requires expensive retraining when given a new action set. To make learning agents more adaptable, we introduce the problem of zero-shot generalization to new actions. We propose a two-stage framework where the agent first infers action representations from action information acquired separately from the task. A policy flexible to varying action sets is then trained with generalization objectives. We benchmark generalization on sequential tasks, such as selecting from an unseen tool-set to solve physical reasoning puzzles and stacking towers with novel 3D shapes. Videos and code are available at this https URL" @default.
- W3096356720 created "2020-11-09" @default.
- W3096356720 creator A5022376480 @default.
- W3096356720 creator A5054155451 @default.
- W3096356720 creator A5058691599 @default.
- W3096356720 date "2020-11-03" @default.
- W3096356720 modified "2023-09-27" @default.
- W3096356720 title "Generalization to New Actions in Reinforcement Learning" @default.
- W3096356720 cites W2044320213 @default.
- W3096356720 cites W2098774185 @default.
- W3096356720 cites W2101493843 @default.
- W3096356720 cites W2111296615 @default.
- W3096356720 cites W2131774270 @default.
- W3096356720 cites W2155027007 @default.
- W3096356720 cites W2156909104 @default.
- W3096356720 cites W2158782408 @default.
- W3096356720 cites W2163922914 @default.
- W3096356720 cites W2213612645 @default.
- W3096356720 cites W2215378786 @default.
- W3096356720 cites W2253991908 @default.
- W3096356720 cites W2465628802 @default.
- W3096356720 cites W2553882142 @default.
- W3096356720 cites W2619034550 @default.
- W3096356720 cites W2620290674 @default.
- W3096356720 cites W2736601468 @default.
- W3096356720 cites W2740828027 @default.
- W3096356720 cites W2753738274 @default.
- W3096356720 cites W2765349170 @default.
- W3096356720 cites W2785948534 @default.
- W3096356720 cites W2788388592 @default.
- W3096356720 cites W2796979132 @default.
- W3096356720 cites W2806859579 @default.
- W3096356720 cites W2886601525 @default.
- W3096356720 cites W2898436992 @default.
- W3096356720 cites W2899530572 @default.
- W3096356720 cites W2899771611 @default.
- W3096356720 cites W2903181768 @default.
- W3096356720 cites W2911820093 @default.
- W3096356720 cites W2913343212 @default.
- W3096356720 cites W2913988491 @default.
- W3096356720 cites W2937206389 @default.
- W3096356720 cites W2949117887 @default.
- W3096356720 cites W2963207497 @default.
- W3096356720 cites W2963344681 @default.
- W3096356720 cites W2964027856 @default.
- W3096356720 cites W2964055695 @default.
- W3096356720 cites W2964121744 @default.
- W3096356720 cites W2964158321 @default.
- W3096356720 cites W2964208960 @default.
- W3096356720 cites W2964342357 @default.
- W3096356720 cites W2970806862 @default.
- W3096356720 cites W2971631853 @default.
- W3096356720 cites W2994689640 @default.
- W3096356720 cites W2998718698 @default.
- W3096356720 cites W3037966979 @default.
- W3096356720 hasPublicationYear "2020" @default.
- W3096356720 type Work @default.
- W3096356720 sameAs 3096356720 @default.
- W3096356720 citedByCount "1" @default.
- W3096356720 countsByYear W30963567202019 @default.
- W3096356720 crossrefType "posted-content" @default.
- W3096356720 hasAuthorship W3096356720A5022376480 @default.
- W3096356720 hasAuthorship W3096356720A5054155451 @default.
- W3096356720 hasAuthorship W3096356720A5058691599 @default.
- W3096356720 hasConcept C119857082 @default.
- W3096356720 hasConcept C121332964 @default.
- W3096356720 hasConcept C13280743 @default.
- W3096356720 hasConcept C134306372 @default.
- W3096356720 hasConcept C144024400 @default.
- W3096356720 hasConcept C144133560 @default.
- W3096356720 hasConcept C154945302 @default.
- W3096356720 hasConcept C155202549 @default.
- W3096356720 hasConcept C162324750 @default.
- W3096356720 hasConcept C177148314 @default.
- W3096356720 hasConcept C177264268 @default.
- W3096356720 hasConcept C185798385 @default.
- W3096356720 hasConcept C187736073 @default.
- W3096356720 hasConcept C199360897 @default.
- W3096356720 hasConcept C205649164 @default.
- W3096356720 hasConcept C2776760102 @default.
- W3096356720 hasConcept C2778712577 @default.
- W3096356720 hasConcept C2779304628 @default.
- W3096356720 hasConcept C2780451532 @default.
- W3096356720 hasConcept C2780791683 @default.
- W3096356720 hasConcept C33923547 @default.
- W3096356720 hasConcept C36289849 @default.
- W3096356720 hasConcept C41008148 @default.
- W3096356720 hasConcept C62520636 @default.
- W3096356720 hasConcept C97541855 @default.
- W3096356720 hasConceptScore W3096356720C119857082 @default.
- W3096356720 hasConceptScore W3096356720C121332964 @default.
- W3096356720 hasConceptScore W3096356720C13280743 @default.
- W3096356720 hasConceptScore W3096356720C134306372 @default.
- W3096356720 hasConceptScore W3096356720C144024400 @default.
- W3096356720 hasConceptScore W3096356720C144133560 @default.
- W3096356720 hasConceptScore W3096356720C154945302 @default.
- W3096356720 hasConceptScore W3096356720C155202549 @default.
- W3096356720 hasConceptScore W3096356720C162324750 @default.
- W3096356720 hasConceptScore W3096356720C177148314 @default.
- W3096356720 hasConceptScore W3096356720C177264268 @default.