Matches in SemOpenAlex for { <https://semopenalex.org/work/W3206854305> ?p ?o ?g. }
- W3206854305 abstract "There has been rapidly growing interest in meta-learning as a method for increasing the flexibility and sample efficiency of reinforcement learning. One problem in this area of research, however, has been a scarcity of adequate benchmark tasks. In general, the structure underlying past benchmarks has either been too simple to be inherently interesting, or too ill-defined to support principled analysis. In the present work, we introduce a new benchmark for meta-RL research, emphasizing transparency and potential for in-depth analysis as well as structural richness. Alchemy is a 3D video game, implemented in Unity, which involves a latent causal structure that is resampled procedurally from episode to episode, affording structure learning, online inference, hypothesis testing and action sequencing based on abstract domain knowledge. We evaluate a pair of powerful RL agents on Alchemy and present an in-depth analysis of one of these agents. Results clearly indicate a frank and specific failure of meta-learning, providing validation for Alchemy as a challenging benchmark for meta-RL. Concurrent with this report, we are releasing Alchemy as public resource, together with a suite of analysis tools and sample agent trajectories." @default.
- W3206854305 created "2021-10-25" @default.
- W3206854305 creator A5005349213 @default.
- W3206854305 creator A5007418812 @default.
- W3206854305 creator A5010900024 @default.
- W3206854305 creator A5027859840 @default.
- W3206854305 creator A5031035504 @default.
- W3206854305 creator A5034114908 @default.
- W3206854305 creator A5034618143 @default.
- W3206854305 creator A5035836682 @default.
- W3206854305 creator A5036050246 @default.
- W3206854305 creator A5038295363 @default.
- W3206854305 creator A5044385855 @default.
- W3206854305 creator A5053035058 @default.
- W3206854305 creator A5054833182 @default.
- W3206854305 creator A5061272217 @default.
- W3206854305 creator A5083771180 @default.
- W3206854305 creator A5085539003 @default.
- W3206854305 creator A5089497713 @default.
- W3206854305 date "2021-02-04" @default.
- W3206854305 modified "2023-10-01" @default.
- W3206854305 title "Alchemy: A benchmark and analysis toolkit for meta-reinforcement learning agents" @default.
- W3206854305 cites W1515851193 @default.
- W3206854305 cites W1621791442 @default.
- W3206854305 cites W2064675550 @default.
- W3206854305 cites W2123399796 @default.
- W3206854305 cites W2123429050 @default.
- W3206854305 cites W2123713131 @default.
- W3206854305 cites W2128152674 @default.
- W3206854305 cites W2128775537 @default.
- W3206854305 cites W2144578442 @default.
- W3206854305 cites W2158782408 @default.
- W3206854305 cites W2174786457 @default.
- W3206854305 cites W2472819217 @default.
- W3206854305 cites W2480004914 @default.
- W3206854305 cites W2551887912 @default.
- W3206854305 cites W2578206533 @default.
- W3206854305 cites W2604763608 @default.
- W3206854305 cites W2769923555 @default.
- W3206854305 cites W2784596339 @default.
- W3206854305 cites W2785397462 @default.
- W3206854305 cites W2786036274 @default.
- W3206854305 cites W2787501667 @default.
- W3206854305 cites W2788904251 @default.
- W3206854305 cites W2789517807 @default.
- W3206854305 cites W2790414966 @default.
- W3206854305 cites W2796979132 @default.
- W3206854305 cites W2889987506 @default.
- W3206854305 cites W2891227010 @default.
- W3206854305 cites W2909534157 @default.
- W3206854305 cites W2914731160 @default.
- W3206854305 cites W2914898814 @default.
- W3206854305 cites W2923504512 @default.
- W3206854305 cites W2938321354 @default.
- W3206854305 cites W2944299231 @default.
- W3206854305 cites W2945020056 @default.
- W3206854305 cites W2963025296 @default.
- W3206854305 cites W2963305465 @default.
- W3206854305 cites W2963403868 @default.
- W3206854305 cites W2963680188 @default.
- W3206854305 cites W2963948533 @default.
- W3206854305 cites W2964296021 @default.
- W3206854305 cites W2967210407 @default.
- W3206854305 cites W2971014752 @default.
- W3206854305 cites W2982316857 @default.
- W3206854305 cites W2995181668 @default.
- W3206854305 cites W3030163527 @default.
- W3206854305 cites W3032377877 @default.
- W3206854305 cites W3034758614 @default.
- W3206854305 cites W3034946435 @default.
- W3206854305 cites W3035435378 @default.
- W3206854305 cites W3037871539 @default.
- W3206854305 cites W3047290399 @default.
- W3206854305 cites W3094497556 @default.
- W3206854305 cites W3109949687 @default.
- W3206854305 cites W3110161557 @default.
- W3206854305 cites W3118210634 @default.
- W3206854305 cites W3124628996 @default.
- W3206854305 cites W3125634603 @default.
- W3206854305 cites W3127593076 @default.
- W3206854305 cites W3166906164 @default.
- W3206854305 cites W3183390056 @default.
- W3206854305 cites W99485931 @default.
- W3206854305 cites W2426267443 @default.
- W3206854305 cites W2770298516 @default.
- W3206854305 hasPublicationYear "2021" @default.
- W3206854305 type Work @default.
- W3206854305 sameAs 3206854305 @default.
- W3206854305 citedByCount "5" @default.
- W3206854305 countsByYear W32068543052021 @default.
- W3206854305 crossrefType "posted-content" @default.
- W3206854305 hasAuthorship W3206854305A5005349213 @default.
- W3206854305 hasAuthorship W3206854305A5007418812 @default.
- W3206854305 hasAuthorship W3206854305A5010900024 @default.
- W3206854305 hasAuthorship W3206854305A5027859840 @default.
- W3206854305 hasAuthorship W3206854305A5031035504 @default.
- W3206854305 hasAuthorship W3206854305A5034114908 @default.
- W3206854305 hasAuthorship W3206854305A5034618143 @default.
- W3206854305 hasAuthorship W3206854305A5035836682 @default.
- W3206854305 hasAuthorship W3206854305A5036050246 @default.