Matches in SemOpenAlex for { <https://semopenalex.org/work/W2904105590> ?p ?o ?g. }
- W2904105590 abstract "Given a database of vectors, a cosine threshold query returns all vectors in the database having cosine similarity to a query vector above a given threshold θ. These queries arise naturally in many applications, such as document retrieval, image search, and mass spectrometry. The present paper considers the efficient evaluation of such queries, providing novel optimality guarantees and exhibiting good performance on real datasets. We take as a starting point Fagin's well-known Threshold Algorithm (TA), which can be used to answer cosine threshold queries as follows: an inverted index is first built from the database vectors during pre-processing; at query time, the algorithm traverses the index partially to gather a set of candidate vectors to be later verified for θ-similarity. However, directly applying TA in its raw form misses significant optimization opportunities. Indeed, we first show that one can take advantage of the fact that the vectors can be assumed to be normalized, to obtain an improved, tight stopping condition for index traversal and to efficiently compute it incrementally. Then we show that one can take advantage of data skewness to obtain better traversal strategies. In particular, we show a novel traversal strategy that exploits a common data skewness condition which holds in multiple domains including mass spectrometry, documents, and image databases. We show that under the skewness assumption, the new traversal strategy has a strong, near-optimal performance guarantee. The techniques developed in the paper are quite general since they can be applied to a large class of similarity functions beyond cosine." @default.
- W2904105590 created "2018-12-22" @default.
- W2904105590 creator A5012309357 @default.
- W2904105590 creator A5028888374 @default.
- W2904105590 creator A5059167302 @default.
- W2904105590 creator A5074134616 @default.
- W2904105590 creator A5087034314 @default.
- W2904105590 date "2018-12-18" @default.
- W2904105590 modified "2023-09-27" @default.
- W2904105590 title "Index-based, High-dimensional, Cosine Threshold Querying with Optimality Guarantees" @default.
- W2904105590 cites W1500864743 @default.
- W2904105590 cites W1532325895 @default.
- W2904105590 cites W1541459201 @default.
- W2904105590 cites W1736726159 @default.
- W2904105590 cites W1965370151 @default.
- W2904105590 cites W1972972498 @default.
- W2904105590 cites W1977046819 @default.
- W2904105590 cites W1980344365 @default.
- W2904105590 cites W1997944350 @default.
- W2904105590 cites W2009688537 @default.
- W2904105590 cites W2010416066 @default.
- W2904105590 cites W2018618445 @default.
- W2904105590 cites W2019082936 @default.
- W2904105590 cites W2026465178 @default.
- W2904105590 cites W2039742379 @default.
- W2904105590 cites W2068199476 @default.
- W2904105590 cites W2073637473 @default.
- W2904105590 cites W2074197587 @default.
- W2904105590 cites W2076703245 @default.
- W2904105590 cites W2086504823 @default.
- W2904105590 cites W2090682579 @default.
- W2904105590 cites W2097776316 @default.
- W2904105590 cites W2099253838 @default.
- W2904105590 cites W2110026675 @default.
- W2904105590 cites W2112560981 @default.
- W2904105590 cites W2113765377 @default.
- W2904105590 cites W2118323718 @default.
- W2904105590 cites W2123544675 @default.
- W2904105590 cites W2124222502 @default.
- W2904105590 cites W2130729839 @default.
- W2904105590 cites W2133296809 @default.
- W2904105590 cites W2133406951 @default.
- W2904105590 cites W2141649964 @default.
- W2904105590 cites W2147717514 @default.
- W2904105590 cites W2150414292 @default.
- W2904105590 cites W2154610494 @default.
- W2904105590 cites W2155502235 @default.
- W2904105590 cites W2159477220 @default.
- W2904105590 cites W2164520297 @default.
- W2904105590 cites W2171034893 @default.
- W2904105590 cites W2171790913 @default.
- W2904105590 cites W2197084977 @default.
- W2904105590 cites W2221389368 @default.
- W2904105590 cites W2244863692 @default.
- W2904105590 cites W2250382098 @default.
- W2904105590 cites W2401610261 @default.
- W2904105590 cites W2468625129 @default.
- W2904105590 cites W2504691963 @default.
- W2904105590 cites W2519691041 @default.
- W2904105590 cites W2529628230 @default.
- W2904105590 cites W2560304626 @default.
- W2904105590 cites W2567214097 @default.
- W2904105590 cites W2607310450 @default.
- W2904105590 cites W2612210001 @default.
- W2904105590 cites W2612493630 @default.
- W2904105590 cites W2666600683 @default.
- W2904105590 cites W2735058673 @default.
- W2904105590 cites W2795250672 @default.
- W2904105590 cites W2798277753 @default.
- W2904105590 cites W2798766386 @default.
- W2904105590 cites W2963593740 @default.
- W2904105590 cites W2964066722 @default.
- W2904105590 cites W3105233790 @default.
- W2904105590 cites W3105727767 @default.
- W2904105590 doi "https://doi.org/10.48550/arxiv.1812.07695" @default.
- W2904105590 hasPublicationYear "2018" @default.
- W2904105590 type Work @default.
- W2904105590 sameAs 2904105590 @default.
- W2904105590 citedByCount "0" @default.
- W2904105590 crossrefType "posted-content" @default.
- W2904105590 hasAuthorship W2904105590A5012309357 @default.
- W2904105590 hasAuthorship W2904105590A5028888374 @default.
- W2904105590 hasAuthorship W2904105590A5059167302 @default.
- W2904105590 hasAuthorship W2904105590A5074134616 @default.
- W2904105590 hasAuthorship W2904105590A5087034314 @default.
- W2904105590 hasBestOaLocation W29041055901 @default.
- W2904105590 hasConcept C103278499 @default.
- W2904105590 hasConcept C105795698 @default.
- W2904105590 hasConcept C11413529 @default.
- W2904105590 hasConcept C115961682 @default.
- W2904105590 hasConcept C122342681 @default.
- W2904105590 hasConcept C124101348 @default.
- W2904105590 hasConcept C140745168 @default.
- W2904105590 hasConcept C154945302 @default.
- W2904105590 hasConcept C177264268 @default.
- W2904105590 hasConcept C178009071 @default.
- W2904105590 hasConcept C199360897 @default.
- W2904105590 hasConcept C2524010 @default.
- W2904105590 hasConcept C2780762811 @default.
- W2904105590 hasConcept C33923547 @default.
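
The baseline procedure described in the abstract (build an inverted index, traverse it partially to collect candidates, then verify each candidate against the threshold) can be illustrated with a minimal sketch. This is not the paper's method: it uses only the plain TA-style stopping bound, not the improved tight stopping condition or the skew-aware traversal strategy the abstract mentions. The function names (`normalize`, `build_index`, `cosine_threshold_query`), the round-robin traversal order, the assumption of non-negative unit-length vectors, and the use of `>= theta` with `theta > 0` are all illustrative assumptions.

```python
import math
from collections import defaultdict


def normalize(v):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def dot(a, b):
    """Inner product; equals cosine similarity for unit vectors."""
    return sum(x * y for x, y in zip(a, b))


def build_index(db):
    """Inverted index: dimension -> [(value, vector_id)], sorted by value descending."""
    index = defaultdict(list)
    for vid, vec in enumerate(db):
        for dim, val in enumerate(vec):
            if val > 0:
                index[dim].append((val, vid))
    for postings in index.values():
        postings.sort(reverse=True)
    return index


def cosine_threshold_query(db, index, q, theta):
    """Return ids of vectors whose cosine similarity to q is >= theta.

    Assumes non-negative, unit-length vectors and theta > 0.
    """
    dims = [d for d, qd in enumerate(q) if qd > 0]
    pos = {d: 0 for d in dims}  # current read position in each posting list
    candidates = set()
    while True:
        # Upper bound on q . s for any vector s not yet seen in any list:
        # in each list, an unseen vector's coordinate is at most the value
        # at the current position (or 0 if the list is exhausted).
        bound = 0.0
        for d in dims:
            postings = index.get(d, [])
            b = postings[pos[d]][0] if pos[d] < len(postings) else 0.0
            bound += q[d] * b
        if bound < theta:
            break  # no unseen vector can reach the threshold
        advanced = False
        for d in dims:  # round-robin: advance every list by one entry
            postings = index.get(d, [])
            if pos[d] < len(postings):
                candidates.add(postings[pos[d]][1])
                pos[d] += 1
                advanced = True
        if not advanced:
            break  # all lists exhausted
    # Verification phase: compute the exact similarity for each candidate.
    return sorted(vid for vid in candidates if dot(q, db[vid]) >= theta)


# Hypothetical toy data, chosen only to exercise the sketch.
db = [normalize(v) for v in [[1, 0, 0], [0.9, 0.1, 0], [0, 1, 0], [0.5, 0.5, 0.7]]]
index = build_index(db)
q = normalize([1, 0.1, 0])
result = cosine_threshold_query(db, index, q, 0.9)
```

On this toy input the traversal stops after reading two entries per list, so only three of the four vectors are gathered as candidates before verification.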