Matches in SemOpenAlex for { <https://semopenalex.org/work/W201536428> ?p ?o ?g. }
- W201536428 abstract "Search engines have become a crucial tool for finding information in repositories containing large amounts of textual data in unstructured form (e.g., the Web). However, the task of ad hoc information retrieval, that is, finding documents within a corpus that are relevant to an information need specified using a query, remains a hard challenge. The language modeling approach to information retrieval provides an effective framework for approaching various problems and has yielded impressive empirical performance. However, most previous work on language models for information retrieval focuses on document-specific characteristics to estimate documents' language models, and therefore does not take into account the structure of the surrounding corpus, a potentially rich source of additional information. We present a novel perspective for approaching the task of ad hoc retrieval: information provided by document-based language models can be enhanced by the incorporation of information drawn from clusters of similar documents that are created offline. We present several retrieval algorithms that are natural instantiations of this idea and that post performance that is substantially better than that of the standard language modeling approach. We also show that the best performing of these algorithms posts state-of-the-art performance for structural re-ranking, wherein an initially retrieved subset of the documents is re-ranked to obtain high precision specifically among the first few documents, using inter-document similarities within the list as an extra information source. As further exploration of the re-ranking approach just described, and inspired by the PageRank and HITS (hubs and authorities) algorithms for Web search, we propose a graph-based framework that applies to document collections lacking hyperlink information. 
Specifically, centrality induced over graphs wherein links represent asymmetric language-model-based inter-document similarities constitutes the basis of effective re-ranking algorithms. Combining our two paradigms for similarity representation (i.e., clusters of documents and links representing language-model-based inter-item similarities) helps to improve the effectiveness of centrality-based approaches. For example, document authoritativeness as induced by the HITS algorithm over cluster-document graphs is a highly effective re-ranking criterion. Furthermore, authoritative clusters are shown to contain a high percentage of relevant documents." @default.
- W201536428 created "2016-06-24" @default.
- W201536428 creator A5041210904 @default.
- W201536428 creator A5076876084 @default.
- W201536428 date "2006-01-01" @default.
- W201536428 modified "2023-09-22" @default.
- W201536428 title "Inter-document similarities, language models, and ad hoc information retrieval" @default.
- W201536428 cites W132343450 @default.
- W201536428 cites W1483301080 @default.
- W201536428 cites W1486980688 @default.
- W201536428 cites W1490796714 @default.
- W201536428 cites W1497443639 @default.
- W201536428 cites W1501307387 @default.
- W201536428 cites W1514324734 @default.
- W201536428 cites W1514403774 @default.
- W201536428 cites W1515852583 @default.
- W201536428 cites W1525595230 @default.
- W201536428 cites W1526730373 @default.
- W201536428 cites W1534714852 @default.
- W201536428 cites W1585620735 @default.
- W201536428 cites W1592871157 @default.
- W201536428 cites W1594759534 @default.
- W201536428 cites W1660390307 @default.
- W201536428 cites W1803641895 @default.
- W201536428 cites W1880262756 @default.
- W201536428 cites W1904228841 @default.
- W201536428 cites W1928657940 @default.
- W201536428 cites W1963658069 @default.
- W201536428 cites W1964348731 @default.
- W201536428 cites W1965657003 @default.
- W201536428 cites W1971987737 @default.
- W201536428 cites W1972645849 @default.
- W201536428 cites W1975422446 @default.
- W201536428 cites W1975998118 @default.
- W201536428 cites W1979459060 @default.
- W201536428 cites W1981825277 @default.
- W201536428 cites W1985180788 @default.
- W201536428 cites W1988914905 @default.
- W201536428 cites W1989468977 @default.
- W201536428 cites W1990388042 @default.
- W201536428 cites W1992795877 @default.
- W201536428 cites W1993972354 @default.
- W201536428 cites W1996764654 @default.
- W201536428 cites W1999817920 @default.
- W201536428 cites W2000569744 @default.
- W201536428 cites W2003170434 @default.
- W201536428 cites W2005422315 @default.
- W201536428 cites W2006681603 @default.
- W201536428 cites W2015338694 @default.
- W201536428 cites W2018557178 @default.
- W201536428 cites W201917955 @default.
- W201536428 cites W2019976352 @default.
- W201536428 cites W2021680564 @default.
- W201536428 cites W2021986193 @default.
- W201536428 cites W2022286021 @default.
- W201536428 cites W2026953311 @default.
- W201536428 cites W2027445772 @default.
- W201536428 cites W2028709054 @default.
- W201536428 cites W2030603245 @default.
- W201536428 cites W2041565863 @default.
- W201536428 cites W2042980227 @default.
- W201536428 cites W2043909051 @default.
- W201536428 cites W2047221353 @default.
- W201536428 cites W2048045485 @default.
- W201536428 cites W2053549370 @default.
- W201536428 cites W2058553017 @default.
- W201536428 cites W2061198046 @default.
- W201536428 cites W2062270497 @default.
- W201536428 cites W2063392856 @default.
- W201536428 cites W2064580901 @default.
- W201536428 cites W2066636486 @default.
- W201536428 cites W2066867064 @default.
- W201536428 cites W2067802667 @default.
- W201536428 cites W2068905009 @default.
- W201536428 cites W2073722401 @default.
- W201536428 cites W2074449313 @default.
- W201536428 cites W2077122984 @default.
- W201536428 cites W2079168273 @default.
- W201536428 cites W2083745421 @default.
- W201536428 cites W2084048649 @default.
- W201536428 cites W2084334506 @default.
- W201536428 cites W2093390569 @default.
- W201536428 cites W2095368471 @default.
- W201536428 cites W2097308346 @default.
- W201536428 cites W2097802284 @default.
- W201536428 cites W2099194852 @default.
- W201536428 cites W2099437808 @default.
- W201536428 cites W2100260099 @default.
- W201536428 cites W2100506586 @default.
- W201536428 cites W2100958137 @default.
- W201536428 cites W2102046030 @default.
- W201536428 cites W2102270039 @default.
- W201536428 cites W2109749916 @default.
- W201536428 cites W2110189854 @default.
- W201536428 cites W2110891728 @default.
- W201536428 cites W2111212948 @default.
- W201536428 cites W2111557120 @default.
- W201536428 cites W2114512077 @default.
- W201536428 cites W2114524997 @default.
- W201536428 cites W2114804204 @default.