Matches in SemOpenAlex for { <https://semopenalex.org/work/W2040254860> ?p ?o ?g. }
- W2040254860 endingPage "36" @default.
- W2040254860 startingPage "1" @default.
- W2040254860 abstract "The inverted index is the backbone of modern web search engines. For each word in a collection of web documents, the index records the list of documents where this word occurs. Given a set of query words, the job of a search engine is to output a ranked list of the most relevant documents containing the query. However, if the query consists of an arbitrary string (which can be a partial word, multiword phrase, or more generally any sequence of characters), then word boundaries are no longer relevant and we need a different approach. In string retrieval settings, we are given a set D = {d_1, d_2, d_3, …, d_D} of D strings with n characters in total taken from an alphabet set Σ = [σ], and the task of the search engine, for a given query pattern P of length p, is to report the “most relevant” strings in D containing P. The query may also consist of two or more patterns. The notion of relevance can be captured by a function score(P, d_r), which indicates how relevant document d_r is to the pattern P. Some example score functions are the frequency of pattern occurrences, proximity between pattern occurrences, or pattern-independent PageRank of the document. The first formal framework to study such kinds of retrieval problems was given by Muthukrishnan [SODA 2002]. He considered two metrics for relevance: frequency and proximity. He took a threshold-based approach on these metrics and gave data structures that use O(n log n) words of space. We study this problem in a somewhat more natural top-k framework. Here, k is a part of the query, and the top k most relevant (highest-scoring) documents are to be reported in sorted order of score. We present the first linear-space framework (i.e., using O(n) words of space) that is capable of handling arbitrary score functions with near-optimal O(p + k log k) query time. The query time can be made optimal O(p + k) if sorted order is not necessary.
Further, we derive compact space and succinct space indexes (for some specific score functions). This space compression comes at the cost of higher query time. At last, we extend our framework to handle the case of multiple patterns. Apart from providing a robust framework, our results also improve many earlier results in index space or query time or both." @default.
- W2040254860 created "2016-06-24" @default.
- W2040254860 creator A5003340402 @default.
- W2040254860 creator A5013112058 @default.
- W2040254860 creator A5049669178 @default.
- W2040254860 creator A5087213950 @default.
- W2040254860 date "2014-04-01" @default.
- W2040254860 modified "2023-10-16" @default.
- W2040254860 title "Space-Efficient Frameworks for Top-k String Retrieval" @default.
- W2040254860 cites W110689283 @default.
- W2040254860 cites W1504477191 @default.
- W2040254860 cites W1518925092 @default.
- W2040254860 cites W1608371561 @default.
- W2040254860 cites W1970194312 @default.
- W2040254860 cites W1974033543 @default.
- W2040254860 cites W1985108724 @default.
- W2040254860 cites W1986970296 @default.
- W2040254860 cites W1989749956 @default.
- W2040254860 cites W1996641400 @default.
- W2040254860 cites W2007791040 @default.
- W2040254860 cites W2014318353 @default.
- W2040254860 cites W2027252317 @default.
- W2040254860 cites W2038142281 @default.
- W2040254860 cites W2044014345 @default.
- W2040254860 cites W2049415039 @default.
- W2040254860 cites W2065209187 @default.
- W2040254860 cites W2073921136 @default.
- W2040254860 cites W2080106004 @default.
- W2040254860 cites W2085218027 @default.
- W2040254860 cites W2085933841 @default.
- W2040254860 cites W2088386938 @default.
- W2040254860 cites W2090021115 @default.
- W2040254860 cites W2093918274 @default.
- W2040254860 cites W2099649694 @default.
- W2040254860 cites W2107082304 @default.
- W2040254860 cites W2111046826 @default.
- W2040254860 cites W2111340560 @default.
- W2040254860 cites W2118274795 @default.
- W2040254860 cites W2121252285 @default.
- W2040254860 cites W2141957180 @default.
- W2040254860 cites W2144759920 @default.
- W2040254860 cites W2149530645 @default.
- W2040254860 cites W2151453116 @default.
- W2040254860 cites W2158874082 @default.
- W2040254860 cites W2159647614 @default.
- W2040254860 cites W2165621523 @default.
- W2040254860 cites W2170899819 @default.
- W2040254860 cites W2173123188 @default.
- W2040254860 cites W2191209163 @default.
- W2040254860 cites W227213435 @default.
- W2040254860 cites W2533248932 @default.
- W2040254860 cites W2912556076 @default.
- W2040254860 cites W3124327069 @default.
- W2040254860 cites W4242124034 @default.
- W2040254860 cites W4292081093 @default.
- W2040254860 cites W4361867045 @default.
- W2040254860 cites W80537337 @default.
- W2040254860 doi "https://doi.org/10.1145/2590774" @default.
- W2040254860 hasPublicationYear "2014" @default.
- W2040254860 type Work @default.
- W2040254860 sameAs 2040254860 @default.
- W2040254860 citedByCount "29" @default.
- W2040254860 countsByYear W20402548602014 @default.
- W2040254860 countsByYear W20402548602015 @default.
- W2040254860 countsByYear W20402548602016 @default.
- W2040254860 countsByYear W20402548602017 @default.
- W2040254860 countsByYear W20402548602018 @default.
- W2040254860 countsByYear W20402548602019 @default.
- W2040254860 countsByYear W20402548602020 @default.
- W2040254860 countsByYear W20402548602021 @default.
- W2040254860 countsByYear W20402548602022 @default.
- W2040254860 countsByYear W20402548602023 @default.
- W2040254860 crossrefType "journal-article" @default.
- W2040254860 hasAuthorship W2040254860A5003340402 @default.
- W2040254860 hasAuthorship W2040254860A5013112058 @default.
- W2040254860 hasAuthorship W2040254860A5049669178 @default.
- W2040254860 hasAuthorship W2040254860A5087213950 @default.
- W2040254860 hasConcept C111919701 @default.
- W2040254860 hasConcept C130590232 @default.
- W2040254860 hasConcept C154945302 @default.
- W2040254860 hasConcept C157486923 @default.
- W2040254860 hasConcept C158154518 @default.
- W2040254860 hasConcept C161156560 @default.
- W2040254860 hasConcept C162324750 @default.
- W2040254860 hasConcept C177264268 @default.
- W2040254860 hasConcept C17744445 @default.
- W2040254860 hasConcept C187736073 @default.
- W2040254860 hasConcept C189430467 @default.
- W2040254860 hasConcept C199360897 @default.
- W2040254860 hasConcept C199539241 @default.
- W2040254860 hasConcept C204321447 @default.
- W2040254860 hasConcept C23123220 @default.
- W2040254860 hasConcept C2524010 @default.
- W2040254860 hasConcept C2776224158 @default.
- W2040254860 hasConcept C2778572836 @default.
- W2040254860 hasConcept C2780451532 @default.
- W2040254860 hasConcept C33923547 @default.
- W2040254860 hasConcept C37914503 @default.