Matches in SemOpenAlex for { <https://semopenalex.org/work/W3044602607> ?p ?o ?g. }
- W3044602607 abstract "Term frequency-inverse document frequency, or tf-idf for short, is a numerical measure that is widely used in information retrieval to quantify the importance of a term of interest in one out of many documents. While tf-idf was originally proposed as a heuristic, much work has been devoted over the years to placing it on a solid theoretical foundation. Following in this tradition, we here advance the first justification for tf-idf that is grounded in statistical hypothesis testing. More precisely, we first show that the one-tailed version of Fisher's exact test, also known as the hypergeometric test, corresponds well with a common tf-idf variant on selected real-data information retrieval tasks. We then set forth a mathematical argument that suggests the tf-idf variant approximates the negative logarithm of the one-tailed Fisher's exact test P-value (i.e., a hypergeometric distribution tail probability). The Fisher's exact test interpretation of this common tf-idf variant furnishes the working statistician with a ready explanation of tf-idf's long-established effectiveness." @default.
- W3044602607 created "2020-07-29" @default.
- W3044602607 creator A5047365455 @default.
- W3044602607 creator A5048570936 @default.
- W3044602607 date "2020-02-26" @default.
- W3044602607 modified "2023-09-23" @default.
- W3044602607 title "Fisher's exact test explains a popular metric in information retrieval." @default.
- W3044602607 cites W1513915383 @default.
- W3044602607 cites W1521626219 @default.
- W3044602607 cites W1532325895 @default.
- W3044602607 cites W1965667542 @default.
- W3044602607 cites W1966365186 @default.
- W3044602607 cites W1981159181 @default.
- W3044602607 cites W1985213868 @default.
- W3044602607 cites W1985697096 @default.
- W3044602607 cites W2003298360 @default.
- W3044602607 cites W2014415866 @default.
- W3044602607 cites W2024932032 @default.
- W3044602607 cites W2037745957 @default.
- W3044602607 cites W2043292582 @default.
- W3044602607 cites W2043909051 @default.
- W3044602607 cites W2075006521 @default.
- W3044602607 cites W2093390569 @default.
- W3044602607 cites W2096152098 @default.
- W3044602607 cites W2096468639 @default.
- W3044602607 cites W2096561521 @default.
- W3044602607 cites W2101746535 @default.
- W3044602607 cites W2105157020 @default.
- W3044602607 cites W2133465414 @default.
- W3044602607 cites W2135631383 @default.
- W3044602607 cites W2144211451 @default.
- W3044602607 cites W2152274187 @default.
- W3044602607 cites W2154873880 @default.
- W3044602607 cites W2158139273 @default.
- W3044602607 cites W2164530213 @default.
- W3044602607 cites W2166574880 @default.
- W3044602607 cites W2171030481 @default.
- W3044602607 cites W24524376 @default.
- W3044602607 cites W2558405088 @default.
- W3044602607 cites W2560014971 @default.
- W3044602607 cites W2969322340 @default.
- W3044602607 cites W3101256217 @default.
- W3044602607 cites W3120740533 @default.
- W3044602607 cites W202663801 @default.
- W3044602607 hasPublicationYear "2020" @default.
- W3044602607 type Work @default.
- W3044602607 sameAs 3044602607 @default.
- W3044602607 citedByCount "0" @default.
- W3044602607 crossrefType "posted-content" @default.
- W3044602607 hasAuthorship W3044602607A5047365455 @default.
- W3044602607 hasAuthorship W3044602607A5048570936 @default.
- W3044602607 hasConcept C105795698 @default.
- W3044602607 hasConcept C11413529 @default.
- W3044602607 hasConcept C121332964 @default.
- W3044602607 hasConcept C134306372 @default.
- W3044602607 hasConcept C154945302 @default.
- W3044602607 hasConcept C162324750 @default.
- W3044602607 hasConcept C173801870 @default.
- W3044602607 hasConcept C176217482 @default.
- W3044602607 hasConcept C176671685 @default.
- W3044602607 hasConcept C191093355 @default.
- W3044602607 hasConcept C21547014 @default.
- W3044602607 hasConcept C23123220 @default.
- W3044602607 hasConcept C29406490 @default.
- W3044602607 hasConcept C33923547 @default.
- W3044602607 hasConcept C39927690 @default.
- W3044602607 hasConcept C41008148 @default.
- W3044602607 hasConcept C61797465 @default.
- W3044602607 hasConcept C62520636 @default.
- W3044602607 hasConcept C87007009 @default.
- W3044602607 hasConceptScore W3044602607C105795698 @default.
- W3044602607 hasConceptScore W3044602607C11413529 @default.
- W3044602607 hasConceptScore W3044602607C121332964 @default.
- W3044602607 hasConceptScore W3044602607C134306372 @default.
- W3044602607 hasConceptScore W3044602607C154945302 @default.
- W3044602607 hasConceptScore W3044602607C162324750 @default.
- W3044602607 hasConceptScore W3044602607C173801870 @default.
- W3044602607 hasConceptScore W3044602607C176217482 @default.
- W3044602607 hasConceptScore W3044602607C176671685 @default.
- W3044602607 hasConceptScore W3044602607C191093355 @default.
- W3044602607 hasConceptScore W3044602607C21547014 @default.
- W3044602607 hasConceptScore W3044602607C23123220 @default.
- W3044602607 hasConceptScore W3044602607C29406490 @default.
- W3044602607 hasConceptScore W3044602607C33923547 @default.
- W3044602607 hasConceptScore W3044602607C39927690 @default.
- W3044602607 hasConceptScore W3044602607C41008148 @default.
- W3044602607 hasConceptScore W3044602607C61797465 @default.
- W3044602607 hasConceptScore W3044602607C62520636 @default.
- W3044602607 hasConceptScore W3044602607C87007009 @default.
- W3044602607 hasLocation W30446026071 @default.
- W3044602607 hasOpenAccess W3044602607 @default.
- W3044602607 hasPrimaryLocation W30446026071 @default.
- W3044602607 hasRelatedWork W1480443223 @default.
- W3044602607 hasRelatedWork W2074902906 @default.
- W3044602607 hasRelatedWork W2091913736 @default.
- W3044602607 hasRelatedWork W2805779696 @default.
- W3044602607 isParatext "false" @default.
- W3044602607 isRetracted "false" @default.
- W3044602607 magId "3044602607" @default.
- W3044602607 workType "article" @default.
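
The abstract above relates a tf-idf variant to the negative logarithm of a one-tailed Fisher's exact test P-value, i.e. a hypergeometric tail probability. A minimal sketch of the two quantities follows. It rests on assumptions not stated in this record: the variant is taken to be tf * log(N/df), the test is set up at the token level (k occurrences of the term in a document of n tokens, K occurrences in a collection of N tokens), and all function names and numbers are hypothetical illustrations, not the authors' code or data.

```python
import math


def hypergeom_tail(k, K, n, N):
    """Upper-tail probability P(X >= k) for X ~ Hypergeometric(N, K, n).

    Assumed setup (not taken from the record): N tokens in the collection,
    K of them are the term of interest, and a document draws n tokens.
    """
    upper = min(n, K)
    total = math.comb(N, n)
    tail = sum(
        math.comb(K, i) * math.comb(N - K, n - i)
        for i in range(k, upper + 1)
    )
    return tail / total


def tf_idf(tf, df, num_docs):
    """One common tf-idf variant (an assumption here): tf * log(N / df)."""
    return tf * math.log(num_docs / df)


# Hypothetical numbers, chosen only to show how the two scores are computed.
p_value = hypergeom_tail(k=3, K=20, n=100, N=10_000)
print("negative log P-value:", -math.log(p_value))
print("tf-idf:", tf_idf(tf=3, df=5, num_docs=100))
```

The sketch only computes each score; whether and how closely they agree depends on the exact variant and test setup used in the paper, which this record does not specify.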