Matches in SemOpenAlex for { <https://semopenalex.org/work/W3086118045> ?p ?o ?g. }
- W3086118045 endingPage "11791" @default.
- W3086118045 startingPage "11782" @default.
- W3086118045 abstract "We address the problem of efficient exploration for transition model learning in the relational model-based reinforcement learning setting without extrinsic goals or rewards. Inspired by human curiosity, we propose goal-literal babbling (GLIB), a simple and general method for exploration in such problems. GLIB samples relational conjunctive goals that can be understood as specific, targeted effects that the agent would like to achieve in the world, and plans to achieve these goals using the transition model being learned. We provide theoretical guarantees showing that exploration with GLIB will converge almost surely to the ground truth model. Experimentally, we find GLIB to strongly outperform existing methods in both prediction and planning on a range of tasks, encompassing standard PDDL and PPDDL planning benchmarks and a robotic manipulation task implemented in the PyBullet physics simulator. Video: https://youtu.be/F6lmrPT6TOY Code: https://git.io/JIsTB" @default.
- W3086118045 created "2020-09-21" @default.
- W3086118045 creator A5012862284 @default.
- W3086118045 creator A5063238208 @default.
- W3086118045 creator A5071093940 @default.
- W3086118045 creator A5073565150 @default.
- W3086118045 creator A5076794038 @default.
- W3086118045 date "2021-05-18" @default.
- W3086118045 modified "2023-09-27" @default.
- W3086118045 title "GLIB: Efficient Exploration for Relational Model-Based Reinforcement Learning via Goal-Literal Babbling" @default.
- W3086118045 cites W1003193527 @default.
- W3086118045 cites W127341816 @default.
- W3086118045 cites W1443446730 @default.
- W3086118045 cites W1491843047 @default.
- W3086118045 cites W1492014007 @default.
- W3086118045 cites W1505937442 @default.
- W3086118045 cites W15206906 @default.
- W3086118045 cites W1552549180 @default.
- W3086118045 cites W16599163 @default.
- W3086118045 cites W1990182336 @default.
- W3086118045 cites W2004303440 @default.
- W3086118045 cites W2033072307 @default.
- W3086118045 cites W2048537180 @default.
- W3086118045 cites W2097083865 @default.
- W3086118045 cites W2111384628 @default.
- W3086118045 cites W2116086091 @default.
- W3086118045 cites W2119521930 @default.
- W3086118045 cites W2130251175 @default.
- W3086118045 cites W2134100786 @default.
- W3086118045 cites W2149390907 @default.
- W3086118045 cites W2154022540 @default.
- W3086118045 cites W2187263253 @default.
- W3086118045 cites W2337392266 @default.
- W3086118045 cites W2489939061 @default.
- W3086118045 cites W2521075165 @default.
- W3086118045 cites W2555868854 @default.
- W3086118045 cites W2565678376 @default.
- W3086118045 cites W2616430965 @default.
- W3086118045 cites W2744921630 @default.
- W3086118045 cites W2900860440 @default.
- W3086118045 cites W2936168870 @default.
- W3086118045 cites W2963201535 @default.
- W3086118045 cites W2964342357 @default.
- W3086118045 cites W2965976698 @default.
- W3086118045 cites W3006576697 @default.
- W3086118045 cites W3020831056 @default.
- W3086118045 cites W3036282537 @default.
- W3086118045 cites W3101355526 @default.
- W3086118045 cites W42860436 @default.
- W3086118045 cites W570039843 @default.
- W3086118045 cites W640625031 @default.
- W3086118045 cites W1569963244 @default.
- W3086118045 doi "https://doi.org/10.1609/aaai.v35i13.17400" @default.
- W3086118045 hasPublicationYear "2021" @default.
- W3086118045 type Work @default.
- W3086118045 sameAs 3086118045 @default.
- W3086118045 citedByCount "1" @default.
- W3086118045 countsByYear W30861180452020 @default.
- W3086118045 crossrefType "journal-article" @default.
- W3086118045 hasAuthorship W3086118045A5012862284 @default.
- W3086118045 hasAuthorship W3086118045A5063238208 @default.
- W3086118045 hasAuthorship W3086118045A5071093940 @default.
- W3086118045 hasAuthorship W3086118045A5073565150 @default.
- W3086118045 hasAuthorship W3086118045A5076794038 @default.
- W3086118045 hasBestOaLocation W30861180451 @default.
- W3086118045 hasConcept C104317684 @default.
- W3086118045 hasConcept C127413603 @default.
- W3086118045 hasConcept C134589348 @default.
- W3086118045 hasConcept C138885662 @default.
- W3086118045 hasConcept C154945302 @default.
- W3086118045 hasConcept C177264268 @default.
- W3086118045 hasConcept C185592680 @default.
- W3086118045 hasConcept C194232998 @default.
- W3086118045 hasConcept C199360897 @default.
- W3086118045 hasConcept C201995342 @default.
- W3086118045 hasConcept C2776760102 @default.
- W3086118045 hasConcept C2780451532 @default.
- W3086118045 hasConcept C2780882242 @default.
- W3086118045 hasConcept C41008148 @default.
- W3086118045 hasConcept C41895202 @default.
- W3086118045 hasConcept C55493867 @default.
- W3086118045 hasConcept C97541855 @default.
- W3086118045 hasConceptScore W3086118045C104317684 @default.
- W3086118045 hasConceptScore W3086118045C127413603 @default.
- W3086118045 hasConceptScore W3086118045C134589348 @default.
- W3086118045 hasConceptScore W3086118045C138885662 @default.
- W3086118045 hasConceptScore W3086118045C154945302 @default.
- W3086118045 hasConceptScore W3086118045C177264268 @default.
- W3086118045 hasConceptScore W3086118045C185592680 @default.
- W3086118045 hasConceptScore W3086118045C194232998 @default.
- W3086118045 hasConceptScore W3086118045C199360897 @default.
- W3086118045 hasConceptScore W3086118045C201995342 @default.
- W3086118045 hasConceptScore W3086118045C2776760102 @default.
- W3086118045 hasConceptScore W3086118045C2780451532 @default.
- W3086118045 hasConceptScore W3086118045C2780882242 @default.
- W3086118045 hasConceptScore W3086118045C41008148 @default.
- W3086118045 hasConceptScore W3086118045C41895202 @default.
- W3086118045 hasConceptScore W3086118045C55493867 @default.