Matches in SemOpenAlex for { <https://semopenalex.org/work/W4313887415> ?p ?o ?g. }
- W4313887415 endingPage "761" @default.
- W4313887415 startingPage "748" @default.
- W4313887415 abstract "Achieving high performance in a multi-domain dialogue system with low computation is undoubtedly challenging. Previous works applying an end-to-end approach have been very successful. However, the computational cost remains a major issue since the large-sized language model using GPT-2 is required. Meanwhile, the optimization for individual components in the dialogue system has not shown promising results, especially for the component of dialogue management due to the complexity of multi-domain state and action representation. To cope with these issues, this article presents an efficient guidance learning where the imitation learning and the hierarchical reinforcement learning (HRL) with human-in-the-loop are performed to achieve high performance via an inexpensive dialogue agent. The behavior cloning with auxiliary tasks is exploited to identify the important features in latent representation. In particular, the proposed HRL is designed to treat each goal of a dialogue with the corresponding sub-policy so as to provide efficient dialogue policy learning by utilizing the guidance from humans through action pruning and action evaluation, as well as the reward obtained from the interaction with the simulated user in the environment. Experimental results on the ConvLab-2 framework show that the proposed method achieves state-of-the-art performance in dialogue policy optimization and outperforms the GPT-2 based solutions in end-to-end system evaluation." @default.
- W4313887415 created "2023-01-10" @default.
- W4313887415 creator A5043243811 @default.
- W4313887415 creator A5061908942 @default.
- W4313887415 date "2023-01-01" @default.
- W4313887415 modified "2023-10-18" @default.
- W4313887415 title "Hierarchical Reinforcement Learning With Guidance for Multi-Domain Dialogue Policy" @default.
- W4313887415 cites W2119717200 @default.
- W4313887415 cites W2169203263 @default.
- W4313887415 cites W2605982830 @default.
- W4313887415 cites W2739936944 @default.
- W4313887415 cites W2798494119 @default.
- W4313887415 cites W2798914047 @default.
- W4313887415 cites W2806600904 @default.
- W4313887415 cites W2915295540 @default.
- W4313887415 cites W2953071719 @default.
- W4313887415 cites W2963412005 @default.
- W4313887415 cites W2963433587 @default.
- W4313887415 cites W2963797754 @default.
- W4313887415 cites W2964006684 @default.
- W4313887415 cites W2964180249 @default.
- W4313887415 cites W2970828515 @default.
- W4313887415 cites W2985067290 @default.
- W4313887415 cites W2988647680 @default.
- W4313887415 cites W2997108628 @default.
- W4313887415 cites W2997771882 @default.
- W4313887415 cites W3009593063 @default.
- W4313887415 cites W3034782127 @default.
- W4313887415 cites W3037879762 @default.
- W4313887415 cites W3095200696 @default.
- W4313887415 cites W3099140719 @default.
- W4313887415 cites W3100128199 @default.
- W4313887415 cites W3105184920 @default.
- W4313887415 cites W3114038149 @default.
- W4313887415 cites W3117383065 @default.
- W4313887415 cites W3161450564 @default.
- W4313887415 cites W3197004354 @default.
- W4313887415 cites W3208178363 @default.
- W4313887415 cites W4210790685 @default.
- W4313887415 cites W4225322339 @default.
- W4313887415 cites W4294974598 @default.
- W4313887415 cites W4312554247 @default.
- W4313887415 doi "https://doi.org/10.1109/taslp.2023.3235202" @default.
- W4313887415 hasPublicationYear "2023" @default.
- W4313887415 type Work @default.
- W4313887415 citedByCount "2" @default.
- W4313887415 countsByYear W43138874152023 @default.
- W4313887415 crossrefType "journal-article" @default.
- W4313887415 hasAuthorship W4313887415A5043243811 @default.
- W4313887415 hasAuthorship W4313887415A5061908942 @default.
- W4313887415 hasBestOaLocation W43138874151 @default.
- W4313887415 hasConcept C107457646 @default.
- W4313887415 hasConcept C108010975 @default.
- W4313887415 hasConcept C11413529 @default.
- W4313887415 hasConcept C119857082 @default.
- W4313887415 hasConcept C121050878 @default.
- W4313887415 hasConcept C121332964 @default.
- W4313887415 hasConcept C126388530 @default.
- W4313887415 hasConcept C134306372 @default.
- W4313887415 hasConcept C154945302 @default.
- W4313887415 hasConcept C15744967 @default.
- W4313887415 hasConcept C168167062 @default.
- W4313887415 hasConcept C17744445 @default.
- W4313887415 hasConcept C199360897 @default.
- W4313887415 hasConcept C199539241 @default.
- W4313887415 hasConcept C2776359362 @default.
- W4313887415 hasConcept C2780626000 @default.
- W4313887415 hasConcept C2780791683 @default.
- W4313887415 hasConcept C33923547 @default.
- W4313887415 hasConcept C36503486 @default.
- W4313887415 hasConcept C41008148 @default.
- W4313887415 hasConcept C48103436 @default.
- W4313887415 hasConcept C62520636 @default.
- W4313887415 hasConcept C6557445 @default.
- W4313887415 hasConcept C77805123 @default.
- W4313887415 hasConcept C86803240 @default.
- W4313887415 hasConcept C94625758 @default.
- W4313887415 hasConcept C97355855 @default.
- W4313887415 hasConcept C97541855 @default.
- W4313887415 hasConceptScore W4313887415C107457646 @default.
- W4313887415 hasConceptScore W4313887415C108010975 @default.
- W4313887415 hasConceptScore W4313887415C11413529 @default.
- W4313887415 hasConceptScore W4313887415C119857082 @default.
- W4313887415 hasConceptScore W4313887415C121050878 @default.
- W4313887415 hasConceptScore W4313887415C121332964 @default.
- W4313887415 hasConceptScore W4313887415C126388530 @default.
- W4313887415 hasConceptScore W4313887415C134306372 @default.
- W4313887415 hasConceptScore W4313887415C154945302 @default.
- W4313887415 hasConceptScore W4313887415C15744967 @default.
- W4313887415 hasConceptScore W4313887415C168167062 @default.
- W4313887415 hasConceptScore W4313887415C17744445 @default.
- W4313887415 hasConceptScore W4313887415C199360897 @default.
- W4313887415 hasConceptScore W4313887415C199539241 @default.
- W4313887415 hasConceptScore W4313887415C2776359362 @default.
- W4313887415 hasConceptScore W4313887415C2780626000 @default.
- W4313887415 hasConceptScore W4313887415C2780791683 @default.
- W4313887415 hasConceptScore W4313887415C33923547 @default.
- W4313887415 hasConceptScore W4313887415C36503486 @default.