Matches in SemOpenAlex for { <https://semopenalex.org/work/W3154686231> ?p ?o ?g. }
Showing items 1 to 85 of 85 with 100 items per page.
- W3154686231 abstract "While previous work in comparing statistical significance tests for IR system evaluation has focused on paired data tests (e.g., for evaluating two systems using a common test collection), two-sample tests must be used when the reproducibility of IR experiments across different test collections must be examined. Using real runs and a test collection from the NTCIR-15 WWW-3 Task, the present study compares the properties of three two-sample significance tests for comparing two systems: Student's t-test (i.e., the classical parametric test), the Wilcoxon rank sum test (i.e., the classical nonparametric test), and the randomisation test (i.e., a population-free method that utilises modern computational power). In terms of the false positive rate (i.e., the chance of detecting a statistical significance even though the two samples of evaluation measure scores come from the same system), the three tests behave similarly, although the Wilcoxon rank sum test appears to be slightly more robust than the other two for very small topic set sizes (e.g., 10 topics each) with a large significance level (e.g., α=0.10). On the other hand, the t-test and the Wilcoxon rank sum test are very similar to each other from the following two viewpoints: How often do they both detect a nonexistent difference? and How often do they both overlook a true difference? Compared to the two classical significance tests, the randomisation test behaves markedly differently in terms of the above two viewpoints. Hence, we suggest that researchers should at least be aware of the above properties of the three two-sample tests when choosing from them." @default.
- W3154686231 created "2021-04-26" @default.
- W3154686231 creator A5023595778 @default.
- W3154686231 date "2021-07-11" @default.
- W3154686231 modified "2023-09-27" @default.
- W3154686231 title "On the Two-Sample Randomisation Test for IR Evaluation" @default.
- W3154686231 cites W2017292914 @default.
- W3154686231 cites W2058896506 @default.
- W3154686231 cites W2075893676 @default.
- W3154686231 cites W2076227143 @default.
- W3154686231 cites W2077046902 @default.
- W3154686231 cites W2336806308 @default.
- W3154686231 cites W2515650098 @default.
- W3154686231 cites W2956058978 @default.
- W3154686231 cites W3100184886 @default.
- W3154686231 cites W3172635446 @default.
- W3154686231 doi "https://doi.org/10.1145/3404835.3463002" @default.
- W3154686231 hasPublicationYear "2021" @default.
- W3154686231 type Work @default.
- W3154686231 sameAs 3154686231 @default.
- W3154686231 citedByCount "1" @default.
- W3154686231 countsByYear W31546862312021 @default.
- W3154686231 crossrefType "proceedings-article" @default.
- W3154686231 hasAuthorship W3154686231A5023595778 @default.
- W3154686231 hasConcept C102366305 @default.
- W3154686231 hasConcept C105795698 @default.
- W3154686231 hasConcept C114614502 @default.
- W3154686231 hasConcept C117251300 @default.
- W3154686231 hasConcept C12868164 @default.
- W3154686231 hasConcept C129848803 @default.
- W3154686231 hasConcept C142362112 @default.
- W3154686231 hasConcept C151730666 @default.
- W3154686231 hasConcept C153349607 @default.
- W3154686231 hasConcept C154945302 @default.
- W3154686231 hasConcept C164226766 @default.
- W3154686231 hasConcept C185592680 @default.
- W3154686231 hasConcept C198531522 @default.
- W3154686231 hasConcept C206041023 @default.
- W3154686231 hasConcept C2776035091 @default.
- W3154686231 hasConcept C2777267654 @default.
- W3154686231 hasConcept C33923547 @default.
- W3154686231 hasConcept C41008148 @default.
- W3154686231 hasConcept C43617362 @default.
- W3154686231 hasConcept C65409693 @default.
- W3154686231 hasConcept C86803240 @default.
- W3154686231 hasConcept C87007009 @default.
- W3154686231 hasConceptScore W3154686231C102366305 @default.
- W3154686231 hasConceptScore W3154686231C105795698 @default.
- W3154686231 hasConceptScore W3154686231C114614502 @default.
- W3154686231 hasConceptScore W3154686231C117251300 @default.
- W3154686231 hasConceptScore W3154686231C12868164 @default.
- W3154686231 hasConceptScore W3154686231C129848803 @default.
- W3154686231 hasConceptScore W3154686231C142362112 @default.
- W3154686231 hasConceptScore W3154686231C151730666 @default.
- W3154686231 hasConceptScore W3154686231C153349607 @default.
- W3154686231 hasConceptScore W3154686231C154945302 @default.
- W3154686231 hasConceptScore W3154686231C164226766 @default.
- W3154686231 hasConceptScore W3154686231C185592680 @default.
- W3154686231 hasConceptScore W3154686231C198531522 @default.
- W3154686231 hasConceptScore W3154686231C206041023 @default.
- W3154686231 hasConceptScore W3154686231C2776035091 @default.
- W3154686231 hasConceptScore W3154686231C2777267654 @default.
- W3154686231 hasConceptScore W3154686231C33923547 @default.
- W3154686231 hasConceptScore W3154686231C41008148 @default.
- W3154686231 hasConceptScore W3154686231C43617362 @default.
- W3154686231 hasConceptScore W3154686231C65409693 @default.
- W3154686231 hasConceptScore W3154686231C86803240 @default.
- W3154686231 hasConceptScore W3154686231C87007009 @default.
- W3154686231 hasLocation W31546862311 @default.
- W3154686231 hasOpenAccess W3154686231 @default.
- W3154686231 hasPrimaryLocation W31546862311 @default.
- W3154686231 hasRelatedWork W200873173 @default.
- W3154686231 hasRelatedWork W2015235728 @default.
- W3154686231 hasRelatedWork W2036600297 @default.
- W3154686231 hasRelatedWork W2050225804 @default.
- W3154686231 hasRelatedWork W2163998574 @default.
- W3154686231 hasRelatedWork W2592359670 @default.
- W3154686231 hasRelatedWork W2621969506 @default.
- W3154686231 hasRelatedWork W4243177274 @default.
- W3154686231 hasRelatedWork W4251018085 @default.
- W3154686231 hasRelatedWork W584167426 @default.
- W3154686231 isParatext "false" @default.
- W3154686231 isRetracted "false" @default.
- W3154686231 magId "3154686231" @default.
- W3154686231 workType "article" @default.