Matches in SemOpenAlex for { <https://semopenalex.org/work/W3026554384> ?p ?o ?g. }
- W3026554384 abstract "Open-domain dialog system evaluation is one of the most important challenges in dialog research. Existing automatic evaluation metrics, such as BLEU, are mostly reference-based. They calculate the difference between the generated response and a limited number of available references. Likert-score-based self-reported user rating is widely adopted by social conversational systems, such as Amazon Alexa Prize chatbots. However, self-reported user rating suffers from bias and variance among different users. To alleviate this problem, we formulate dialog evaluation as a comparison task. We also propose an automatic evaluation model CMADE (Comparison Model for Automatic Dialog Evaluation) that automatically cleans self-reported user ratings as it trains on them. Specifically, we first use a self-supervised method to learn better dialog feature representation, and then use KNN and Shapley to remove confusing samples. Our experiments show that CMADE achieves 89.2% accuracy in the dialog comparison task." @default.
- W3026554384 created "2020-05-29" @default.
- W3026554384 creator A5005779176 @default.
- W3026554384 creator A5061025828 @default.
- W3026554384 creator A5076286335 @default.
- W3026554384 date "2020-05-21" @default.
- W3026554384 modified "2023-09-23" @default.
- W3026554384 title "Beyond User Self-Reported Likert Scale Ratings: A Comparison Model for Automatic Dialog Evaluation" @default.
- W3026554384 cites W1591706642 @default.
- W3026554384 cites W1596984324 @default.
- W3026554384 cites W1601787291 @default.
- W3026554384 cites W1968958679 @default.
- W3026554384 cites W1978776260 @default.
- W3026554384 cites W1981688737 @default.
- W3026554384 cites W1990406041 @default.
- W3026554384 cites W2037789405 @default.
- W3026554384 cites W2049248196 @default.
- W3026554384 cites W2101105183 @default.
- W3026554384 cites W2110619046 @default.
- W3026554384 cites W2123301721 @default.
- W3026554384 cites W2128877075 @default.
- W3026554384 cites W2143331230 @default.
- W3026554384 cites W2150901716 @default.
- W3026554384 cites W2154652894 @default.
- W3026554384 cites W2183341477 @default.
- W3026554384 cites W2239239723 @default.
- W3026554384 cites W2250645967 @default.
- W3026554384 cites W2440214111 @default.
- W3026554384 cites W2577366047 @default.
- W3026554384 cites W2729339203 @default.
- W3026554384 cites W2782940392 @default.
- W3026554384 cites W2883555934 @default.
- W3026554384 cites W2898658996 @default.
- W3026554384 cites W2902856285 @default.
- W3026554384 cites W2905227118 @default.
- W3026554384 cites W2911446408 @default.
- W3026554384 cites W2911899023 @default.
- W3026554384 cites W2912748147 @default.
- W3026554384 cites W2919624000 @default.
- W3026554384 cites W2921189944 @default.
- W3026554384 cites W2939984132 @default.
- W3026554384 cites W2943766373 @default.
- W3026554384 cites W2948210185 @default.
- W3026554384 cites W2951098724 @default.
- W3026554384 cites W2962821719 @default.
- W3026554384 cites W2962852048 @default.
- W3026554384 cites W2963326483 @default.
- W3026554384 cites W2963341956 @default.
- W3026554384 cites W2963403868 @default.
- W3026554384 cites W2963527228 @default.
- W3026554384 cites W2963672599 @default.
- W3026554384 cites W2963802733 @default.
- W3026554384 cites W2963852396 @default.
- W3026554384 cites W2963903950 @default.
- W3026554384 cites W2964178377 @default.
- W3026554384 cites W2964352131 @default.
- W3026554384 cites W2971883198 @default.
- W3026554384 cites W2972900451 @default.
- W3026554384 cites W2979618422 @default.
- W3026554384 cites W2979722627 @default.
- W3026554384 cites W2979937837 @default.
- W3026554384 cites W2990141119 @default.
- W3026554384 cites W2995247309 @default.
- W3026554384 cites W2995672047 @default.
- W3026554384 cites W3018730156 @default.
- W3026554384 cites W3094459920 @default.
- W3026554384 cites W3103981637 @default.
- W3026554384 cites W3125596464 @default.
- W3026554384 doi "https://doi.org/10.48550/arxiv.2005.10716" @default.
- W3026554384 hasPublicationYear "2020" @default.
- W3026554384 type Work @default.
- W3026554384 sameAs 3026554384 @default.
- W3026554384 citedByCount "4" @default.
- W3026554384 countsByYear W30265543842020 @default.
- W3026554384 countsByYear W30265543842021 @default.
- W3026554384 crossrefType "posted-content" @default.
- W3026554384 hasAuthorship W3026554384A5005779176 @default.
- W3026554384 hasAuthorship W3026554384A5061025828 @default.
- W3026554384 hasAuthorship W3026554384A5076286335 @default.
- W3026554384 hasBestOaLocation W30265543841 @default.
- W3026554384 hasConcept C105776082 @default.
- W3026554384 hasConcept C105795698 @default.
- W3026554384 hasConcept C107457646 @default.
- W3026554384 hasConcept C119857082 @default.
- W3026554384 hasConcept C121955636 @default.
- W3026554384 hasConcept C134306372 @default.
- W3026554384 hasConcept C136764020 @default.
- W3026554384 hasConcept C138885662 @default.
- W3026554384 hasConcept C144133560 @default.
- W3026554384 hasConcept C154945302 @default.
- W3026554384 hasConcept C162324750 @default.
- W3026554384 hasConcept C173853756 @default.
- W3026554384 hasConcept C187736073 @default.
- W3026554384 hasConcept C190954187 @default.
- W3026554384 hasConcept C196083921 @default.
- W3026554384 hasConcept C204321447 @default.
- W3026554384 hasConcept C2776401178 @default.
- W3026554384 hasConcept C2780451532 @default.
- W3026554384 hasConcept C33923547 @default.
- W3026554384 hasConcept C36503486 @default.