Matches in SemOpenAlex for { <https://semopenalex.org/work/W3085993909> ?p ?o ?g. }
Showing items 1 to 88 of 88 with 100 items per page.
- W3085993909 abstract "We propose a new method for training an agent via an evolutionary strategy (ES), in which we iteratively improve a set of samples to imitate: Starting with a random set, in every iteration we replace a subset of the samples with samples from the best trajectories discovered so far. The evaluation procedure for this set is to train, via supervised learning, a randomly initialised neural network (NN) to imitate the set and then execute the acquired policy against the environment. Our method is thus an ES based on a fitness function that expresses the effectiveness of imitating an evolving data subset. This is in contrast to other ES techniques that iterate over the weights of the policy directly. By observing the samples that the agent selects for learning, it is possible to interpret and evaluate the evolving strategy of the agent more explicitly than in NN learning. In our experiments, we trained an agent to solve the OpenAI Gym environment BipedalWalker-v3 by imitating an evolutionarily selected set of only 25 samples with a NN with only a few thousand parameters. We further test our method on the Procgen game Plunder and show here as well that the proposed method is an interpretable, small, robust and effective alternative to other ES or policy gradient methods." @default.
- W3085993909 created "2020-09-21" @default.
- W3085993909 creator A5030760534 @default.
- W3085993909 creator A5059315456 @default.
- W3085993909 date "2020-09-17" @default.
- W3085993909 modified "2023-09-27" @default.
- W3085993909 title "Evolutionary Selective Imitation: Interpretable Agents by Imitation Learning Without a Demonstrator." @default.
- W3085993909 cites W1986014385 @default.
- W3085993909 cites W2099397840 @default.
- W3085993909 cites W2104588723 @default.
- W3085993909 cites W2155307968 @default.
- W3085993909 cites W2167224731 @default.
- W3085993909 cites W2342840547 @default.
- W3085993909 cites W2462906003 @default.
- W3085993909 cites W2596367596 @default.
- W3085993909 cites W2778749116 @default.
- W3085993909 cites W2796284132 @default.
- W3085993909 cites W2904157920 @default.
- W3085993909 cites W2962702317 @default.
- W3085993909 cites W2962790223 @default.
- W3085993909 cites W2962843949 @default.
- W3085993909 cites W2963099939 @default.
- W3085993909 cites W2963367680 @default.
- W3085993909 cites W2963374347 @default.
- W3085993909 cites W2964263543 @default.
- W3085993909 cites W2994073215 @default.
- W3085993909 cites W2998396902 @default.
- W3085993909 cites W3106528330 @default.
- W3085993909 hasPublicationYear "2020" @default.
- W3085993909 type Work @default.
- W3085993909 sameAs 3085993909 @default.
- W3085993909 citedByCount "0" @default.
- W3085993909 crossrefType "posted-content" @default.
- W3085993909 hasAuthorship W3085993909A5030760534 @default.
- W3085993909 hasAuthorship W3085993909A5059315456 @default.
- W3085993909 hasConcept C119857082 @default.
- W3085993909 hasConcept C126388530 @default.
- W3085993909 hasConcept C14036430 @default.
- W3085993909 hasConcept C154945302 @default.
- W3085993909 hasConcept C15744967 @default.
- W3085993909 hasConcept C169903167 @default.
- W3085993909 hasConcept C177264268 @default.
- W3085993909 hasConcept C199360897 @default.
- W3085993909 hasConcept C41008148 @default.
- W3085993909 hasConcept C50644808 @default.
- W3085993909 hasConcept C77805123 @default.
- W3085993909 hasConcept C78458016 @default.
- W3085993909 hasConcept C86803240 @default.
- W3085993909 hasConceptScore W3085993909C119857082 @default.
- W3085993909 hasConceptScore W3085993909C126388530 @default.
- W3085993909 hasConceptScore W3085993909C14036430 @default.
- W3085993909 hasConceptScore W3085993909C154945302 @default.
- W3085993909 hasConceptScore W3085993909C15744967 @default.
- W3085993909 hasConceptScore W3085993909C169903167 @default.
- W3085993909 hasConceptScore W3085993909C177264268 @default.
- W3085993909 hasConceptScore W3085993909C199360897 @default.
- W3085993909 hasConceptScore W3085993909C41008148 @default.
- W3085993909 hasConceptScore W3085993909C50644808 @default.
- W3085993909 hasConceptScore W3085993909C77805123 @default.
- W3085993909 hasConceptScore W3085993909C78458016 @default.
- W3085993909 hasConceptScore W3085993909C86803240 @default.
- W3085993909 hasLocation W30859939091 @default.
- W3085993909 hasOpenAccess W3085993909 @default.
- W3085993909 hasPrimaryLocation W30859939091 @default.
- W3085993909 hasRelatedWork W1601125311 @default.
- W3085993909 hasRelatedWork W1931792391 @default.
- W3085993909 hasRelatedWork W1995959609 @default.
- W3085993909 hasRelatedWork W2061133144 @default.
- W3085993909 hasRelatedWork W2164892942 @default.
- W3085993909 hasRelatedWork W2418837394 @default.
- W3085993909 hasRelatedWork W2513603937 @default.
- W3085993909 hasRelatedWork W2560678327 @default.
- W3085993909 hasRelatedWork W2569026911 @default.
- W3085993909 hasRelatedWork W2785635021 @default.
- W3085993909 hasRelatedWork W2786381149 @default.
- W3085993909 hasRelatedWork W2788730383 @default.
- W3085993909 hasRelatedWork W2891781407 @default.
- W3085993909 hasRelatedWork W2930620797 @default.
- W3085993909 hasRelatedWork W2953043480 @default.
- W3085993909 hasRelatedWork W2957144066 @default.
- W3085993909 hasRelatedWork W2970479807 @default.
- W3085993909 hasRelatedWork W2998219625 @default.
- W3085993909 hasRelatedWork W3171575163 @default.
- W3085993909 hasRelatedWork W630842315 @default.
- W3085993909 isParatext "false" @default.
- W3085993909 isRetracted "false" @default.
- W3085993909 magId "3085993909" @default.
- W3085993909 workType "article" @default.
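
The loop described in the abstract of W3085993909 (start from a random sample set, replace a subset with samples from the best trajectory found so far, score the set by training a policy to imitate it and running that policy) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and not the authors' implementation: a least-squares linear fit stands in for the neural network, a toy 1-D "move x toward 0" task stands in for BipedalWalker-v3, a candidate set is kept only when its fitness improves, and the names `fit_linear`, `rollout` and `evolve` are hypothetical.

```python
import random


def fit_linear(samples):
    """Supervised step (stand-in for the NN): least-squares fit a = w*x + b."""
    n = len(samples)
    mean_x = sum(x for x, _ in samples) / n
    mean_a = sum(a for _, a in samples) / n
    var_x = sum((x - mean_x) ** 2 for x, _ in samples)
    cov = sum((x - mean_x) * (a - mean_a) for x, a in samples)
    w = 0.0 if var_x == 0 else cov / var_x
    b = mean_a - w * mean_x
    return lambda x: w * x + b


def rollout(policy, starts=(-3.0, -1.0, 2.0, 4.0), horizon=20):
    """Toy environment (assumption): reward is -|x| per step, actions clipped to [-1, 1].

    Returns the total return and the visited (state, action) pairs.
    """
    total, trajectory = 0.0, []
    for x in starts:
        for _ in range(horizon):
            a = max(-1.0, min(1.0, policy(x)))
            trajectory.append((x, a))
            x += a
            total -= abs(x)
    return total, trajectory


def evolve(n_samples=25, n_replace=5, iterations=200, seed=0):
    """Evolve a small set of (state, action) samples to imitate."""
    rng = random.Random(seed)
    # Start with a random sample set.
    samples = [(rng.uniform(-5, 5), rng.uniform(-1, 1)) for _ in range(n_samples)]
    best_fitness, best_traj = rollout(fit_linear(samples))
    history = [best_fitness]
    for _ in range(iterations):
        # Replace a subset with samples drawn from the best trajectory so far.
        candidate = list(samples)
        for i in rng.sample(range(n_samples), n_replace):
            candidate[i] = rng.choice(best_traj)
        # Fitness of a set = return of the policy trained to imitate it.
        fitness, traj = rollout(fit_linear(candidate))
        if fitness > best_fitness:  # accept only improvements (assumption)
            samples, best_fitness, best_traj = candidate, fitness, traj
        history.append(best_fitness)
    return samples, history


if __name__ == "__main__":
    final_samples, fitness_history = evolve()
    print("initial fitness:", fitness_history[0])
    print("final fitness:", fitness_history[-1])
    print("first evolved samples:", final_samples[:3])
```

Because the evolved object is the sample set rather than the policy weights, the final `samples` list can be inspected directly, which is the interpretability argument the abstract makes. Whether and how fast the toy fitness improves depends on the seed and the assumed acceptance rule.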