Matches in SemOpenAlex for { <https://semopenalex.org/work/W4381714069> ?p ?o ?g. }
Showing items 1 to 59 of 59 with 100 items per page.
- W4381714069 abstract "We study feature interactions in the context of feature attribution methods for post-hoc interpretability. In interpretability research, getting to grips with feature interactions is increasingly recognised as an important challenge, because interacting features are key to the success of neural networks. Feature interactions allow a model to build up hierarchical representations for its input, and might provide an ideal starting point for the investigation into linguistic structure in language models. However, uncovering the exact role that these interactions play is also difficult, and a diverse range of interaction attribution methods has been proposed. In this paper, we focus on the question which of these methods most faithfully reflects the inner workings of the target models. We work out a grey box methodology, in which we train models to perfection on a formal language classification task, using PCFGs. We show that under specific configurations, some methods are indeed able to uncover the grammatical rules acquired by a model. Based on these findings we extend our evaluation to a case study on language models, providing novel insights into the linguistic structure that these models have acquired." @default.
- W4381714069 created "2023-06-24" @default.
- W4381714069 creator A5007928903 @default.
- W4381714069 creator A5056188012 @default.
- W4381714069 date "2023-06-21" @default.
- W4381714069 modified "2023-10-14" @default.
- W4381714069 title "Feature Interactions Reveal Linguistic Structure in Language Models" @default.
- W4381714069 doi "https://doi.org/10.48550/arxiv.2306.12181" @default.
- W4381714069 hasPublicationYear "2023" @default.
- W4381714069 type Work @default.
- W4381714069 citedByCount "0" @default.
- W4381714069 crossrefType "posted-content" @default.
- W4381714069 hasAuthorship W4381714069A5007928903 @default.
- W4381714069 hasAuthorship W4381714069A5056188012 @default.
- W4381714069 hasBestOaLocation W43817140691 @default.
- W4381714069 hasConcept C120665830 @default.
- W4381714069 hasConcept C121332964 @default.
- W4381714069 hasConcept C137293760 @default.
- W4381714069 hasConcept C138885662 @default.
- W4381714069 hasConcept C151730666 @default.
- W4381714069 hasConcept C154945302 @default.
- W4381714069 hasConcept C192209626 @default.
- W4381714069 hasConcept C204321447 @default.
- W4381714069 hasConcept C2776401178 @default.
- W4381714069 hasConcept C2779343474 @default.
- W4381714069 hasConcept C2781067378 @default.
- W4381714069 hasConcept C41008148 @default.
- W4381714069 hasConcept C41895202 @default.
- W4381714069 hasConcept C86803240 @default.
- W4381714069 hasConceptScore W4381714069C120665830 @default.
- W4381714069 hasConceptScore W4381714069C121332964 @default.
- W4381714069 hasConceptScore W4381714069C137293760 @default.
- W4381714069 hasConceptScore W4381714069C138885662 @default.
- W4381714069 hasConceptScore W4381714069C151730666 @default.
- W4381714069 hasConceptScore W4381714069C154945302 @default.
- W4381714069 hasConceptScore W4381714069C192209626 @default.
- W4381714069 hasConceptScore W4381714069C204321447 @default.
- W4381714069 hasConceptScore W4381714069C2776401178 @default.
- W4381714069 hasConceptScore W4381714069C2779343474 @default.
- W4381714069 hasConceptScore W4381714069C2781067378 @default.
- W4381714069 hasConceptScore W4381714069C41008148 @default.
- W4381714069 hasConceptScore W4381714069C41895202 @default.
- W4381714069 hasConceptScore W4381714069C86803240 @default.
- W4381714069 hasLocation W43817140691 @default.
- W4381714069 hasOpenAccess W4381714069 @default.
- W4381714069 hasPrimaryLocation W43817140691 @default.
- W4381714069 hasRelatedWork W142374489 @default.
- W4381714069 hasRelatedWork W1803932089 @default.
- W4381714069 hasRelatedWork W1985007624 @default.
- W4381714069 hasRelatedWork W2176369193 @default.
- W4381714069 hasRelatedWork W2359001871 @default.
- W4381714069 hasRelatedWork W2909124124 @default.
- W4381714069 hasRelatedWork W2916492174 @default.
- W4381714069 hasRelatedWork W3107474891 @default.
- W4381714069 hasRelatedWork W4293093780 @default.
- W4381714069 hasRelatedWork W2584532118 @default.
- W4381714069 isParatext "false" @default.
- W4381714069 isRetracted "false" @default.
- W4381714069 workType "article" @default.