Matches in SemOpenAlex for { <https://semopenalex.org/work/W4287393007> ?p ?o ?g. }
Showing items 1 to 75 of 75 with 100 items per page.
- W4287393007 abstract "Recently, it was shown that most popular IR measures are not interval-scaled, implying that decades of experimental IR research used potentially improper methods, which may have produced questionable results. However, it was unclear if and to what extent these findings apply to actual evaluations and this opened a debate in the community with researchers standing on opposite positions about whether this should be considered an issue (or not) and to what extent. In this paper, we first give an introduction to the representational measurement theory explaining why certain operations and significance tests are permissible only with scales of a certain level. For that, we introduce the notion of meaningfulness specifying the conditions under which the truth (or falsity) of a statement is invariant under permissible transformations of a scale. Furthermore, we show how the recall base and the length of the run may make comparison and aggregation across topics problematic. Then we propose a straightforward and powerful approach for turning an evaluation measure into an interval scale, and describe an experimental evaluation of the differences between using the original measures and the interval-scaled ones. For all the regarded measures - namely Precision, Recall, Average Precision, (Normalized) Discounted Cumulative Gain, Rank-Biased Precision and Reciprocal Rank - we observe substantial effects, both on the order of average values and on the outcome of significance tests. For the latter, previously significant differences turn out to be insignificant, while insignificant ones become significant. The effect varies remarkably between the tests considered but overall, on average, we observed a 25% change in the decision about which systems are significantly different and which are not." @default.
- W4287393007 created "2022-07-25" @default.
- W4287393007 creator A5054035663 @default.
- W4287393007 creator A5069843101 @default.
- W4287393007 creator A5083058200 @default.
- W4287393007 date "2021-01-07" @default.
- W4287393007 modified "2023-10-18" @default.
- W4287393007 title "Towards Meaningful Statements in IR Evaluation. Mapping Evaluation Measures to Interval Scales" @default.
- W4287393007 doi "https://doi.org/10.48550/arxiv.2101.02668" @default.
- W4287393007 hasPublicationYear "2021" @default.
- W4287393007 type Work @default.
- W4287393007 citedByCount "0" @default.
- W4287393007 crossrefType "posted-content" @default.
- W4287393007 hasAuthorship W4287393007A5054035663 @default.
- W4287393007 hasAuthorship W4287393007A5069843101 @default.
- W4287393007 hasAuthorship W4287393007A5083058200 @default.
- W4287393007 hasBestOaLocation W42873930071 @default.
- W4287393007 hasConcept C100660578 @default.
- W4287393007 hasConcept C105795698 @default.
- W4287393007 hasConcept C111472728 @default.
- W4287393007 hasConcept C114614502 @default.
- W4287393007 hasConcept C121332964 @default.
- W4287393007 hasConcept C138885662 @default.
- W4287393007 hasConcept C149782125 @default.
- W4287393007 hasConcept C154945302 @default.
- W4287393007 hasConcept C15744967 @default.
- W4287393007 hasConcept C164226766 @default.
- W4287393007 hasConcept C180747234 @default.
- W4287393007 hasConcept C2777685122 @default.
- W4287393007 hasConcept C2777742833 @default.
- W4287393007 hasConcept C2778067643 @default.
- W4287393007 hasConcept C2778755073 @default.
- W4287393007 hasConcept C33923547 @default.
- W4287393007 hasConcept C41008148 @default.
- W4287393007 hasConcept C41895202 @default.
- W4287393007 hasConcept C55786151 @default.
- W4287393007 hasConcept C62520636 @default.
- W4287393007 hasConcept C81669768 @default.
- W4287393007 hasConceptScore W4287393007C100660578 @default.
- W4287393007 hasConceptScore W4287393007C105795698 @default.
- W4287393007 hasConceptScore W4287393007C111472728 @default.
- W4287393007 hasConceptScore W4287393007C114614502 @default.
- W4287393007 hasConceptScore W4287393007C121332964 @default.
- W4287393007 hasConceptScore W4287393007C138885662 @default.
- W4287393007 hasConceptScore W4287393007C149782125 @default.
- W4287393007 hasConceptScore W4287393007C154945302 @default.
- W4287393007 hasConceptScore W4287393007C15744967 @default.
- W4287393007 hasConceptScore W4287393007C164226766 @default.
- W4287393007 hasConceptScore W4287393007C180747234 @default.
- W4287393007 hasConceptScore W4287393007C2777685122 @default.
- W4287393007 hasConceptScore W4287393007C2777742833 @default.
- W4287393007 hasConceptScore W4287393007C2778067643 @default.
- W4287393007 hasConceptScore W4287393007C2778755073 @default.
- W4287393007 hasConceptScore W4287393007C33923547 @default.
- W4287393007 hasConceptScore W4287393007C41008148 @default.
- W4287393007 hasConceptScore W4287393007C41895202 @default.
- W4287393007 hasConceptScore W4287393007C55786151 @default.
- W4287393007 hasConceptScore W4287393007C62520636 @default.
- W4287393007 hasConceptScore W4287393007C81669768 @default.
- W4287393007 hasLocation W42873930071 @default.
- W4287393007 hasOpenAccess W4287393007 @default.
- W4287393007 hasPrimaryLocation W42873930071 @default.
- W4287393007 hasRelatedWork W1968513441 @default.
- W4287393007 hasRelatedWork W2017586301 @default.
- W4287393007 hasRelatedWork W2025511434 @default.
- W4287393007 hasRelatedWork W2036617191 @default.
- W4287393007 hasRelatedWork W2075730884 @default.
- W4287393007 hasRelatedWork W2318653344 @default.
- W4287393007 hasRelatedWork W2948461793 @default.
- W4287393007 hasRelatedWork W3118555138 @default.
- W4287393007 hasRelatedWork W4234996786 @default.
- W4287393007 hasRelatedWork W4287393007 @default.
- W4287393007 isParatext "false" @default.
- W4287393007 isRetracted "false" @default.
- W4287393007 workType "article" @default.
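
The abstract listed above describes meaningfulness as the truth (or falsity) of a statement being invariant under the permissible transformations of a scale. The following is a minimal illustrative sketch of that idea, not the paper's method: the per-topic scores, the function name `a_beats_b`, and the two transformations are all hypothetical, chosen only to show that a statement about averages survives a positive affine transformation (permissible for interval scales) but can flip under a merely order-preserving one (permissible for ordinal scales).

```python
import math
from statistics import mean

# Hypothetical per-topic scores for two systems (illustrative values only).
system_a = [0.1, 0.1, 1.0]
system_b = [0.3, 0.3, 0.3]


def a_beats_b(a, b, transform=lambda x: x):
    """Return True if the mean transformed score of a exceeds that of b."""
    return mean([transform(x) for x in a]) > mean([transform(x) for x in b])


# Statement under test: "system A has a higher average score than system B".
original = a_beats_b(system_a, system_b)

# Positive affine map x -> 2x + 5: permissible for an interval scale,
# so the truth of the statement should not change.
affine = a_beats_b(system_a, system_b, lambda x: 2 * x + 5)

# Square root: strictly increasing, so it preserves every pairwise order
# (permissible for an ordinal scale), yet it is not affine.
monotone = a_beats_b(system_a, system_b, math.sqrt)

print(original, affine, monotone)
```

With these assumed scores the comparison of averages holds on the original values and after the affine map, but reverses after the square root. This is why averaging is only a meaningful operation if the scores can be treated as at least interval-scaled.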