Matches in SemOpenAlex for { <https://semopenalex.org/work/W3210157506> ?p ?o ?g. }
Showing items 1 to 79 of 79 with 100 items per page.
- W3210157506 abstract "Learning meaningful behaviors in the absence of reward is a difficult problem in reinforcement learning. A desirable and challenging unsupervised objective is to learn a set of diverse skills that provide a thorough coverage of the state space while being directed, i.e., reliably reaching distinct regions of the environment. In this paper, we build on the mutual information framework for skill discovery and introduce UPSIDE, which addresses the coverage-directedness trade-off in the following ways: 1) We design policies with a decoupled structure of a directed skill, trained to reach a specific region, followed by a diffusing part that induces a local coverage. 2) We optimize policies by maximizing their number under the constraint that each of them reaches distinct regions of the environment (i.e., they are sufficiently discriminable) and prove that this serves as a lower bound to the original mutual information objective. 3) Finally, we compose the learned directed skills into a growing tree that adaptively covers the environment. We illustrate in several navigation and control environments how the skills learned by UPSIDE solve sparse-reward downstream tasks better than existing baselines." @default.
- W3210157506 created "2021-11-08" @default.
- W3210157506 creator A5002110131 @default.
- W3210157506 creator A5014791481 @default.
- W3210157506 creator A5027101473 @default.
- W3210157506 creator A5071798388 @default.
- W3210157506 date "2022-04-25" @default.
- W3210157506 modified "2023-10-01" @default.
- W3210157506 title "Direct then Diffuse: Incremental Unsupervised Skill Discovery for State Covering and Goal Reaching" @default.
- W3210157506 cites W115285041 @default.
- W3210157506 cites W2145339207 @default.
- W3210157506 cites W2158782408 @default.
- W3210157506 cites W2556477470 @default.
- W3210157506 cites W2592215206 @default.
- W3210157506 cites W2606433045 @default.
- W3210157506 cites W2614839826 @default.
- W3210157506 cites W2883433335 @default.
- W3210157506 cites W2903327785 @default.
- W3210157506 cites W2953326529 @default.
- W3210157506 cites W2962730405 @default.
- W3210157506 cites W2962902376 @default.
- W3210157506 cites W2963438456 @default.
- W3210157506 cites W2963639957 @default.
- W3210157506 cites W2964190622 @default.
- W3210157506 cites W2964217298 @default.
- W3210157506 cites W2994995984 @default.
- W3210157506 cites W2995363872 @default.
- W3210157506 cites W2995636097 @default.
- W3210157506 cites W3027456239 @default.
- W3210157506 cites W3034731451 @default.
- W3210157506 cites W3043848240 @default.
- W3210157506 cites W3098815154 @default.
- W3210157506 cites W3111981703 @default.
- W3210157506 cites W3113994363 @default.
- W3210157506 cites W3121879877 @default.
- W3210157506 cites W3124128931 @default.
- W3210157506 cites W3167112175 @default.
- W3210157506 cites W3167815619 @default.
- W3210157506 cites W3170741447 @default.
- W3210157506 cites W3173228761 @default.
- W3210157506 cites W3173335063 @default.
- W3210157506 cites W3183413276 @default.
- W3210157506 cites W3205151669 @default.
- W3210157506 hasPublicationYear "2022" @default.
- W3210157506 type Work @default.
- W3210157506 sameAs 3210157506 @default.
- W3210157506 citedByCount "0" @default.
- W3210157506 crossrefType "proceedings-article" @default.
- W3210157506 hasAuthorship W3210157506A5002110131 @default.
- W3210157506 hasAuthorship W3210157506A5014791481 @default.
- W3210157506 hasAuthorship W3210157506A5027101473 @default.
- W3210157506 hasAuthorship W3210157506A5071798388 @default.
- W3210157506 hasBestOaLocation W32101575061 @default.
- W3210157506 hasConcept C154945302 @default.
- W3210157506 hasConcept C199360897 @default.
- W3210157506 hasConcept C41008148 @default.
- W3210157506 hasConcept C48103436 @default.
- W3210157506 hasConceptScore W3210157506C154945302 @default.
- W3210157506 hasConceptScore W3210157506C199360897 @default.
- W3210157506 hasConceptScore W3210157506C41008148 @default.
- W3210157506 hasConceptScore W3210157506C48103436 @default.
- W3210157506 hasLocation W32101575061 @default.
- W3210157506 hasLocation W32101575062 @default.
- W3210157506 hasOpenAccess W3210157506 @default.
- W3210157506 hasPrimaryLocation W32101575061 @default.
- W3210157506 hasRelatedWork W1596801655 @default.
- W3210157506 hasRelatedWork W2130043461 @default.
- W3210157506 hasRelatedWork W2350741829 @default.
- W3210157506 hasRelatedWork W2358668433 @default.
- W3210157506 hasRelatedWork W2376932109 @default.
- W3210157506 hasRelatedWork W2382290278 @default.
- W3210157506 hasRelatedWork W2390279801 @default.
- W3210157506 hasRelatedWork W2748952813 @default.
- W3210157506 hasRelatedWork W2899084033 @default.
- W3210157506 hasRelatedWork W2530322880 @default.
- W3210157506 isParatext "false" @default.
- W3210157506 isRetracted "false" @default.
- W3210157506 magId "3210157506" @default.
- W3210157506 workType "article" @default.
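
The listing above follows a regular shape: a subject, a predicate, an object, and a graph label (`@default`) on each line. As a minimal sketch, assuming this page has been saved as plain text, the lines could be grouped by predicate as shown below. The names `LINE_RE` and `parse_listing` are hypothetical helpers introduced for illustration; they are not part of SemOpenAlex or any library.

```python
import re
from collections import defaultdict

# Assumed shape of one listing line: "- SUBJECT PREDICATE OBJECT @GRAPH."
# This pattern is inferred from the listing itself, not from a published format.
LINE_RE = re.compile(r'^- (\S+) (\S+) (.+) @(\S+)\.$')


def parse_listing(lines):
    """Group objects by predicate for every line that matches LINE_RE."""
    by_predicate = defaultdict(list)
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # skip the header and anything not shaped like a statement
        _subject, predicate, obj, _graph = match.groups()
        # Drop the surrounding double quotes that mark literal values.
        if len(obj) >= 2 and obj[0] == '"' and obj[-1] == '"':
            obj = obj[1:-1]
        by_predicate[predicate].append(obj)
    return dict(by_predicate)


# A few lines copied from the listing above.
sample = [
    'Matches in SemOpenAlex for { <https://semopenalex.org/work/W3210157506> ?p ?o ?g. }',
    '- W3210157506 created "2021-11-08" @default.',
    '- W3210157506 creator A5002110131 @default.',
    '- W3210157506 creator A5014791481 @default.',
    '- W3210157506 citedByCount "0" @default.',
]

record = parse_listing(sample)
print(record["creator"])
```

Repeated predicates such as `creator`, `cites`, and `hasRelatedWork` end up as lists, which is why every value is stored in a list rather than as a single string.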