Matches in SemOpenAlex for { <https://semopenalex.org/work/W3096277050> ?p ?o ?g. }
- W3096277050 abstract "Traditionally, reinforcement learning methods predict the next action based on the current state. However, in many situations, directly applying actions to control systems or robots is dangerous and may lead to unexpected behaviors because actions are rather low-level. In this paper, we propose a novel hierarchical reinforcement learning framework without explicit action. Our meta policy tries to manipulate the next optimal state, and the actual action is produced by the inverse dynamics model. To stabilize the training process, we integrate adversarial learning and an information bottleneck into our framework. Under our framework, widely available state-only demonstrations can be exploited effectively for imitation learning. Also, prior knowledge and constraints can be applied to the meta policy. We test our algorithm on simulation tasks and in combination with imitation learning. The experimental results show the reliability and robustness of our algorithms." @default.
- W3096277050 created "2020-11-09" @default.
- W3096277050 creator A5008812639 @default.
- W3096277050 creator A5019389336 @default.
- W3096277050 creator A5020849033 @default.
- W3096277050 creator A5045099154 @default.
- W3096277050 creator A5047533842 @default.
- W3096277050 creator A5061104105 @default.
- W3096277050 creator A5071921002 @default.
- W3096277050 creator A5072355910 @default.
- W3096277050 creator A5091070290 @default.
- W3096277050 date "2020-11-02" @default.
- W3096277050 modified "2023-09-26" @default.
- W3096277050 title "NEARL: Non-Explicit Action Reinforcement Learning for Robotic Control." @default.
- W3096277050 cites W1980035368 @default.
- W3096277050 cites W2104733512 @default.
- W3096277050 cites W2121103318 @default.
- W3096277050 cites W2121863487 @default.
- W3096277050 cites W2131831090 @default.
- W3096277050 cites W2158782408 @default.
- W3096277050 cites W2174803659 @default.
- W3096277050 cites W2296673577 @default.
- W3096277050 cites W2342840547 @default.
- W3096277050 cites W2404067440 @default.
- W3096277050 cites W2604382266 @default.
- W3096277050 cites W2616964725 @default.
- W3096277050 cites W2729615412 @default.
- W3096277050 cites W2736601468 @default.
- W3096277050 cites W2738778707 @default.
- W3096277050 cites W2757631751 @default.
- W3096277050 cites W2769112066 @default.
- W3096277050 cites W2772709170 @default.
- W3096277050 cites W2805762288 @default.
- W3096277050 cites W2884247313 @default.
- W3096277050 cites W2886140408 @default.
- W3096277050 cites W2892806280 @default.
- W3096277050 cites W2894766094 @default.
- W3096277050 cites W2937051932 @default.
- W3096277050 cites W2962715211 @default.
- W3096277050 cites W2962787969 @default.
- W3096277050 cites W2962986780 @default.
- W3096277050 cites W2963411833 @default.
- W3096277050 cites W2964460729 @default.
- W3096277050 cites W2970003882 @default.
- W3096277050 cites W2970749192 @default.
- W3096277050 cites W2981030070 @default.
- W3096277050 cites W2993089717 @default.
- W3096277050 cites W3013077052 @default.
- W3096277050 cites W3016047885 @default.
- W3096277050 cites W3037207827 @default.
- W3096277050 hasPublicationYear "2020" @default.
- W3096277050 type Work @default.
- W3096277050 sameAs 3096277050 @default.
- W3096277050 citedByCount "0" @default.
- W3096277050 crossrefType "posted-content" @default.
- W3096277050 hasAuthorship W3096277050A5008812639 @default.
- W3096277050 hasAuthorship W3096277050A5019389336 @default.
- W3096277050 hasAuthorship W3096277050A5020849033 @default.
- W3096277050 hasAuthorship W3096277050A5045099154 @default.
- W3096277050 hasAuthorship W3096277050A5047533842 @default.
- W3096277050 hasAuthorship W3096277050A5061104105 @default.
- W3096277050 hasAuthorship W3096277050A5071921002 @default.
- W3096277050 hasAuthorship W3096277050A5072355910 @default.
- W3096277050 hasAuthorship W3096277050A5091070290 @default.
- W3096277050 hasConcept C104317684 @default.
- W3096277050 hasConcept C119857082 @default.
- W3096277050 hasConcept C121332964 @default.
- W3096277050 hasConcept C145420912 @default.
- W3096277050 hasConcept C149635348 @default.
- W3096277050 hasConcept C154945302 @default.
- W3096277050 hasConcept C183759332 @default.
- W3096277050 hasConcept C185592680 @default.
- W3096277050 hasConcept C2780513914 @default.
- W3096277050 hasConcept C2780791683 @default.
- W3096277050 hasConcept C33923547 @default.
- W3096277050 hasConcept C37736160 @default.
- W3096277050 hasConcept C41008148 @default.
- W3096277050 hasConcept C51672120 @default.
- W3096277050 hasConcept C55493867 @default.
- W3096277050 hasConcept C62520636 @default.
- W3096277050 hasConcept C63479239 @default.
- W3096277050 hasConcept C88610354 @default.
- W3096277050 hasConcept C90509273 @default.
- W3096277050 hasConcept C97541855 @default.
- W3096277050 hasConceptScore W3096277050C104317684 @default.
- W3096277050 hasConceptScore W3096277050C119857082 @default.
- W3096277050 hasConceptScore W3096277050C121332964 @default.
- W3096277050 hasConceptScore W3096277050C145420912 @default.
- W3096277050 hasConceptScore W3096277050C149635348 @default.
- W3096277050 hasConceptScore W3096277050C154945302 @default.
- W3096277050 hasConceptScore W3096277050C183759332 @default.
- W3096277050 hasConceptScore W3096277050C185592680 @default.
- W3096277050 hasConceptScore W3096277050C2780513914 @default.
- W3096277050 hasConceptScore W3096277050C2780791683 @default.
- W3096277050 hasConceptScore W3096277050C33923547 @default.
- W3096277050 hasConceptScore W3096277050C37736160 @default.
- W3096277050 hasConceptScore W3096277050C41008148 @default.
- W3096277050 hasConceptScore W3096277050C51672120 @default.
- W3096277050 hasConceptScore W3096277050C55493867 @default.
- W3096277050 hasConceptScore W3096277050C62520636 @default.