Matches in SemOpenAlex for { <https://semopenalex.org/work/W3034311880> ?p ?o ?g. }
Showing items 1 to 77 of 77 with 100 items per page.
- W3034311880 abstract "Reinforcement learning agents usually learn from scratch, which requires a large number of interactions with the environment. This is quite different from the learning process of human. When faced with a new task, human naturally have the common sense and use the prior knowledge to derive an initial policy and guide the learning process afterwards. Although the prior knowledge may be not fully applicable to the new task, the learning process is significantly sped up since the initial policy ensures a quick-start of learning and intermediate guidance allows to avoid unnecessary exploration. Taking this inspiration, we propose knowledge guided policy network (KoGuN), a novel framework that combines human prior suboptimal knowledge with reinforcement learning. Our framework consists of a fuzzy rule controller to represent human knowledge and a refine module to finetune suboptimal prior knowledge. The proposed framework is end-to-end and can be combined with existing policy-based reinforcement learning algorithm. We conduct experiments on several control tasks. The empirical results show that our approach, which combines suboptimal human knowledge and RL, achieves significant improvement on learning efficiency of flat RL algorithms, even with very low-performance human prior knowledge." @default.
- W3034311880 created "2020-06-19" @default.
- W3034311880 creator A5019579988 @default.
- W3034311880 creator A5029360035 @default.
- W3034311880 creator A5029760958 @default.
- W3034311880 creator A5047509839 @default.
- W3034311880 creator A5082417441 @default.
- W3034311880 creator A5084937471 @default.
- W3034311880 creator A5074584486 @default.
- W3034311880 date "2020-07-01" @default.
- W3034311880 modified "2023-09-27" @default.
- W3034311880 title "KoGuN: Accelerating Deep Reinforcement Learning via Integrating Human Suboptimal Knowledge" @default.
- W3034311880 doi "https://doi.org/10.24963/ijcai.2020/317" @default.
- W3034311880 hasPublicationYear "2020" @default.
- W3034311880 type Work @default.
- W3034311880 sameAs 3034311880 @default.
- W3034311880 citedByCount "11" @default.
- W3034311880 countsByYear W30343118802020 @default.
- W3034311880 countsByYear W30343118802021 @default.
- W3034311880 countsByYear W30343118802023 @default.
- W3034311880 crossrefType "proceedings-article" @default.
- W3034311880 hasAuthorship W3034311880A5019579988 @default.
- W3034311880 hasAuthorship W3034311880A5029360035 @default.
- W3034311880 hasAuthorship W3034311880A5029760958 @default.
- W3034311880 hasAuthorship W3034311880A5047509839 @default.
- W3034311880 hasAuthorship W3034311880A5074584486 @default.
- W3034311880 hasAuthorship W3034311880A5082417441 @default.
- W3034311880 hasAuthorship W3034311880A5084937471 @default.
- W3034311880 hasBestOaLocation W30343118801 @default.
- W3034311880 hasConcept C111919701 @default.
- W3034311880 hasConcept C119857082 @default.
- W3034311880 hasConcept C127413603 @default.
- W3034311880 hasConcept C154945302 @default.
- W3034311880 hasConcept C199190896 @default.
- W3034311880 hasConcept C201995342 @default.
- W3034311880 hasConcept C203479927 @default.
- W3034311880 hasConcept C2780451532 @default.
- W3034311880 hasConcept C2781235140 @default.
- W3034311880 hasConcept C41008148 @default.
- W3034311880 hasConcept C58166 @default.
- W3034311880 hasConcept C6557445 @default.
- W3034311880 hasConcept C86803240 @default.
- W3034311880 hasConcept C97541855 @default.
- W3034311880 hasConcept C98045186 @default.
- W3034311880 hasConceptScore W3034311880C111919701 @default.
- W3034311880 hasConceptScore W3034311880C119857082 @default.
- W3034311880 hasConceptScore W3034311880C127413603 @default.
- W3034311880 hasConceptScore W3034311880C154945302 @default.
- W3034311880 hasConceptScore W3034311880C199190896 @default.
- W3034311880 hasConceptScore W3034311880C201995342 @default.
- W3034311880 hasConceptScore W3034311880C203479927 @default.
- W3034311880 hasConceptScore W3034311880C2780451532 @default.
- W3034311880 hasConceptScore W3034311880C2781235140 @default.
- W3034311880 hasConceptScore W3034311880C41008148 @default.
- W3034311880 hasConceptScore W3034311880C58166 @default.
- W3034311880 hasConceptScore W3034311880C6557445 @default.
- W3034311880 hasConceptScore W3034311880C86803240 @default.
- W3034311880 hasConceptScore W3034311880C97541855 @default.
- W3034311880 hasConceptScore W3034311880C98045186 @default.
- W3034311880 hasLocation W30343118801 @default.
- W3034311880 hasLocation W30343118802 @default.
- W3034311880 hasOpenAccess W3034311880 @default.
- W3034311880 hasPrimaryLocation W30343118801 @default.
- W3034311880 hasRelatedWork W2134289401 @default.
- W3034311880 hasRelatedWork W2156006853 @default.
- W3034311880 hasRelatedWork W2557694176 @default.
- W3034311880 hasRelatedWork W2961085424 @default.
- W3034311880 hasRelatedWork W3022038857 @default.
- W3034311880 hasRelatedWork W3196155444 @default.
- W3034311880 hasRelatedWork W4306321456 @default.
- W3034311880 hasRelatedWork W4312812851 @default.
- W3034311880 hasRelatedWork W4319083788 @default.
- W3034311880 hasRelatedWork W4379471189 @default.
- W3034311880 isParatext "false" @default.
- W3034311880 isRetracted "false" @default.
- W3034311880 magId "3034311880" @default.
- W3034311880 workType "article" @default.
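
Each row above follows the pattern `- subject predicate object @graph.`, matching the `?p ?o ?g` variables in the query at the top. As a minimal sketch, assuming that layout holds for every row and that quoted literals contain no embedded line breaks, the rows could be split into tuples as follows. The names `Quad`, `LINE_RE`, and `parse_line` are hypothetical helpers introduced for illustration; they are not part of SemOpenAlex.

```python
import re
from typing import NamedTuple, Optional


class Quad(NamedTuple):
    """One row of the listing: subject, predicate, object, graph."""
    subject: str
    predicate: str
    obj: str
    graph: str


# Assumed row layout: "- <subject> <predicate> <object> @<graph>."
# The object is either a double-quoted literal or a single bare token.
LINE_RE = re.compile(
    r'^-\s+(?P<s>\S+)\s+(?P<p>\S+)\s+(?P<o>".*"|\S+)\s+@(?P<g>\S+?)\.$'
)


def parse_line(line: str) -> Optional[Quad]:
    """Return a Quad for a listing row, or None for non-row lines."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    obj = match.group("o")
    # Strip the surrounding quotes from literal values.
    if len(obj) >= 2 and obj.startswith('"') and obj.endswith('"'):
        obj = obj[1:-1]
    return Quad(match.group("s"), match.group("p"), obj, match.group("g"))


if __name__ == "__main__":
    sample = [
        "Matches in SemOpenAlex for { ... }",  # header line, skipped
        '- W3034311880 citedByCount "11" @default.',
        "- W3034311880 type Work @default.",
        "- W3034311880 hasConcept C41008148 @default.",
    ]
    for quad in filter(None, map(parse_line, sample)):
        print(quad)
```

Header and pagination lines do not start with `- `, so `parse_line` returns `None` for them and they can be filtered out before further processing.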