Matches in SemOpenAlex for { <https://semopenalex.org/work/W4385572641> ?p ?o ?g. }
Showing items 1 to 61 of 61 with 100 items per page.
- W4385572641 abstract "A popular approach to unveiling the black box of neural NLP models is to leverage saliency methods, which assign scalar importance scores to each input component. A common practice for evaluating whether an interpretability method is faithful has been to use evaluation-by-agreement – if multiple methods agree on an explanation, its credibility increases. However, recent work has found that saliency methods exhibit weak rank correlations even when applied to the same model instance and advocated for alternative diagnostic methods. In our work, we demonstrate that rank correlation is not a good fit for evaluating agreement and argue that Pearson-r is a better-suited alternative. We further show that regularization techniques that increase faithfulness of attention explanations also increase agreement between saliency methods. By connecting our findings to instance categories based on training dynamics, we show that the agreement of saliency method explanations is very low for easy-to-learn instances. Finally, we connect the improvement in agreement across instance categories to local representation space statistics of instances, paving the way for work on analyzing which intrinsic model properties improve their predisposition to interpretability methods." @default.
- W4385572641 created "2023-08-05" @default.
- W4385572641 creator A5009562561 @default.
- W4385572641 creator A5020691628 @default.
- W4385572641 creator A5067851924 @default.
- W4385572641 date "2023-01-01" @default.
- W4385572641 modified "2023-09-24" @default.
- W4385572641 title "Easy to Decide, Hard to Agree: Reducing Disagreements Between Saliency Methods" @default.
- W4385572641 doi "https://doi.org/10.18653/v1/2023.findings-acl.582" @default.
- W4385572641 hasPublicationYear "2023" @default.
- W4385572641 type Work @default.
- W4385572641 citedByCount "0" @default.
- W4385572641 crossrefType "proceedings-article" @default.
- W4385572641 hasAuthorship W4385572641A5009562561 @default.
- W4385572641 hasAuthorship W4385572641A5020691628 @default.
- W4385572641 hasAuthorship W4385572641A5067851924 @default.
- W4385572641 hasBestOaLocation W43855726411 @default.
- W4385572641 hasConcept C107673813 @default.
- W4385572641 hasConcept C119857082 @default.
- W4385572641 hasConcept C153083717 @default.
- W4385572641 hasConcept C153180895 @default.
- W4385572641 hasConcept C154945302 @default.
- W4385572641 hasConcept C17744445 @default.
- W4385572641 hasConcept C199539241 @default.
- W4385572641 hasConcept C2776135515 @default.
- W4385572641 hasConcept C2776359362 @default.
- W4385572641 hasConcept C2780224610 @default.
- W4385572641 hasConcept C2781067378 @default.
- W4385572641 hasConcept C32834561 @default.
- W4385572641 hasConcept C41008148 @default.
- W4385572641 hasConcept C94625758 @default.
- W4385572641 hasConceptScore W4385572641C107673813 @default.
- W4385572641 hasConceptScore W4385572641C119857082 @default.
- W4385572641 hasConceptScore W4385572641C153083717 @default.
- W4385572641 hasConceptScore W4385572641C153180895 @default.
- W4385572641 hasConceptScore W4385572641C154945302 @default.
- W4385572641 hasConceptScore W4385572641C17744445 @default.
- W4385572641 hasConceptScore W4385572641C199539241 @default.
- W4385572641 hasConceptScore W4385572641C2776135515 @default.
- W4385572641 hasConceptScore W4385572641C2776359362 @default.
- W4385572641 hasConceptScore W4385572641C2780224610 @default.
- W4385572641 hasConceptScore W4385572641C2781067378 @default.
- W4385572641 hasConceptScore W4385572641C32834561 @default.
- W4385572641 hasConceptScore W4385572641C41008148 @default.
- W4385572641 hasConceptScore W4385572641C94625758 @default.
- W4385572641 hasLocation W43855726411 @default.
- W4385572641 hasOpenAccess W4385572641 @default.
- W4385572641 hasPrimaryLocation W43855726411 @default.
- W4385572641 hasRelatedWork W2042327336 @default.
- W4385572641 hasRelatedWork W2321141263 @default.
- W4385572641 hasRelatedWork W2543161807 @default.
- W4385572641 hasRelatedWork W3006943036 @default.
- W4385572641 hasRelatedWork W3023163568 @default.
- W4385572641 hasRelatedWork W3136084287 @default.
- W4385572641 hasRelatedWork W4200511449 @default.
- W4385572641 hasRelatedWork W4206534706 @default.
- W4385572641 hasRelatedWork W4229079080 @default.
- W4385572641 hasRelatedWork W4299487748 @default.
- W4385572641 isParatext "false" @default.
- W4385572641 isRetracted "false" @default.
- W4385572641 workType "article" @default.
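
The abstract above argues that rank correlation is a poor fit for measuring agreement between saliency methods and that Pearson-r is better suited. The following is a minimal sketch of that intuition, not the authors' evaluation code: the two score lists are hypothetical values made up for illustration, and the helper functions `pearson`, `ranks`, and `spearman` are written here from the standard definitions. Both hypothetical methods single out the same token as important, but they order the near-zero scores of the remaining tokens differently.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def ranks(values):
    """1-based ranks in ascending order; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical per-token saliency scores (assumed values, not from the paper).
# Both methods mark token 0 as by far the most important; the remaining
# tokens get near-zero scores whose relative order is reversed.
method_a = [0.90, 0.01, 0.02, 0.03, 0.04, 0.05]
method_b = [0.85, 0.05, 0.04, 0.03, 0.02, 0.01]

r_pearson = pearson(method_a, method_b)
r_spearman = spearman(method_a, method_b)

print(f"Pearson-r:  {r_pearson:.3f}")
print(f"Spearman-r: {r_spearman:.3f}")
```

In this toy case Pearson-r is close to 1 while the Spearman coefficient is negative, because the rank-based measure gives the ordering of the negligible scores the same weight as the one token both methods agree on.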