Matches in SemOpenAlex for { <https://semopenalex.org/work/W4385571211> ?p ?o ?g. }
- W4385571211 abstract "The pretrained language model (PLM) based metrics have been successfully used in evaluating language generation tasks. Recent studies of the human evaluation community show that considering both major errors (e.g. mistranslated tokens) and minor errors (e.g. imperfections in fluency) can produce high-quality judgments. This inspires us to approach the final goal of the automatic metrics (human-like evaluations) by fine-grained error analysis. In this paper, we argue that the ability to estimate sentence confidence is the tip of the iceberg for PLM-based metrics. And it can be used to refine the generated sentence toward higher confidence and more reference-grounded, where the costs of refining and approaching reference are used to determine the major and minor errors, respectively. To this end, we take BARTScore as the testbed and present an innovative solution to marry the unexploited sentence refining capacity of BARTScore and human-like error analysis, where the final score consists of both the evaluations of major and minor errors. Experiments show that our solution consistently and significantly improves BARTScore, and outperforms top-scoring metrics in 19/25 test settings. Analyses demonstrate our method robustly and efficiently approaches human-like evaluations, enjoying better interpretability. Our code and scripts will be publicly released in https://github.com/Coldmist-Lu/ErrorAnalysis_NLGEvaluation." @default.
- W4385571211 created "2023-08-05" @default.
- W4385571211 creator A5010462322 @default.
- W4385571211 creator A5031077283 @default.
- W4385571211 creator A5053159289 @default.
- W4385571211 creator A5065438419 @default.
- W4385571211 creator A5074103823 @default.
- W4385571211 creator A5086939495 @default.
- W4385571211 date "2023-01-01" @default.
- W4385571211 modified "2023-09-26" @default.
- W4385571211 title "Toward Human-Like Evaluation for Natural Language Generation with Error Analysis" @default.
- W4385571211 doi "https://doi.org/10.18653/v1/2023.acl-long.324" @default.
- W4385571211 hasPublicationYear "2023" @default.
- W4385571211 type Work @default.
- W4385571211 citedByCount "0" @default.
- W4385571211 crossrefType "proceedings-article" @default.
- W4385571211 hasAuthorship W4385571211A5010462322 @default.
- W4385571211 hasAuthorship W4385571211A5031077283 @default.
- W4385571211 hasAuthorship W4385571211A5053159289 @default.
- W4385571211 hasAuthorship W4385571211A5065438419 @default.
- W4385571211 hasAuthorship W4385571211A5074103823 @default.
- W4385571211 hasAuthorship W4385571211A5086939495 @default.
- W4385571211 hasBestOaLocation W43855712111 @default.
- W4385571211 hasConcept C119857082 @default.
- W4385571211 hasConcept C136764020 @default.
- W4385571211 hasConcept C138885662 @default.
- W4385571211 hasConcept C154945302 @default.
- W4385571211 hasConcept C17744445 @default.
- W4385571211 hasConcept C195324797 @default.
- W4385571211 hasConcept C199360897 @default.
- W4385571211 hasConcept C199539241 @default.
- W4385571211 hasConcept C204321447 @default.
- W4385571211 hasConcept C2777413886 @default.
- W4385571211 hasConcept C2777530160 @default.
- W4385571211 hasConcept C2777601683 @default.
- W4385571211 hasConcept C2779760435 @default.
- W4385571211 hasConcept C2781067378 @default.
- W4385571211 hasConcept C31395832 @default.
- W4385571211 hasConcept C41008148 @default.
- W4385571211 hasConcept C41895202 @default.
- W4385571211 hasConcept C61423126 @default.
- W4385571211 hasConceptScore W4385571211C119857082 @default.
- W4385571211 hasConceptScore W4385571211C136764020 @default.
- W4385571211 hasConceptScore W4385571211C138885662 @default.
- W4385571211 hasConceptScore W4385571211C154945302 @default.
- W4385571211 hasConceptScore W4385571211C17744445 @default.
- W4385571211 hasConceptScore W4385571211C195324797 @default.
- W4385571211 hasConceptScore W4385571211C199360897 @default.
- W4385571211 hasConceptScore W4385571211C199539241 @default.
- W4385571211 hasConceptScore W4385571211C204321447 @default.
- W4385571211 hasConceptScore W4385571211C2777413886 @default.
- W4385571211 hasConceptScore W4385571211C2777530160 @default.
- W4385571211 hasConceptScore W4385571211C2777601683 @default.
- W4385571211 hasConceptScore W4385571211C2779760435 @default.
- W4385571211 hasConceptScore W4385571211C2781067378 @default.
- W4385571211 hasConceptScore W4385571211C31395832 @default.
- W4385571211 hasConceptScore W4385571211C41008148 @default.
- W4385571211 hasConceptScore W4385571211C41895202 @default.
- W4385571211 hasConceptScore W4385571211C61423126 @default.
- W4385571211 hasLocation W43855712111 @default.
- W4385571211 hasOpenAccess W4385571211 @default.
- W4385571211 hasPrimaryLocation W43855712111 @default.
- W4385571211 hasRelatedWork W159132833 @default.
- W4385571211 hasRelatedWork W2130109619 @default.
- W4385571211 hasRelatedWork W2293457016 @default.
- W4385571211 hasRelatedWork W3006943036 @default.
- W4385571211 hasRelatedWork W4200511449 @default.
- W4385571211 hasRelatedWork W4206534706 @default.
- W4385571211 hasRelatedWork W4229079080 @default.
- W4385571211 hasRelatedWork W4299487748 @default.
- W4385571211 hasRelatedWork W4307308917 @default.
- W4385571211 hasRelatedWork W1872130062 @default.
- W4385571211 isParatext "false" @default.
- W4385571211 isRetracted "false" @default.
- W4385571211 workType "article" @default.