Matches in SemOpenAlex for { <https://semopenalex.org/work/W2950458457> ?p ?o ?g. }
- W2950458457 abstract "We consider the problem of approximate set similarity search under Braun-Blanquet similarity $B(\mathbf{x}, \mathbf{y}) = |\mathbf{x} \cap \mathbf{y}| / \max(|\mathbf{x}|, |\mathbf{y}|)$. The $(b_1, b_2)$-approximate Braun-Blanquet similarity search problem is to preprocess a collection of sets $P$ such that, given a query set $\mathbf{q}$, if there exists $\mathbf{x} \in P$ with $B(\mathbf{q}, \mathbf{x}) \geq b_1$, then we can efficiently return $\mathbf{x}' \in P$ with $B(\mathbf{q}, \mathbf{x}') > b_2$. We present a simple data structure that solves this problem with space usage $O(n^{1+\rho}\log n + \sum_{\mathbf{x} \in P}|\mathbf{x}|)$ and query time $O(|\mathbf{q}|n^{\rho} \log n)$ where $n = |P|$ and $\rho = \log(1/b_1)/\log(1/b_2)$. Making use of existing lower bounds for locality-sensitive hashing by O'Donnell et al. (TOCT 2014), we show that this value of $\rho$ is tight across the parameter space, i.e., for every choice of constants $0 < b_2 < b_1 < 1$. In the case where all sets have the same size, our solution strictly improves upon the value of $\rho$ that can be obtained through the use of state-of-the-art data-independent techniques in the Indyk-Motwani locality-sensitive hashing framework (STOC 1998) such as Broder's MinHash (CCS 1997) for Jaccard similarity and Andoni et al.'s cross-polytope LSH (NIPS 2015) for cosine similarity. Surprisingly, even though our solution is data-independent, for a large part of the parameter space we outperform the currently best data-dependent method by Andoni and Razenshteyn (STOC 2015)." @default.
- W2950458457 created "2019-06-27" @default.
- W2950458457 creator A5014293815 @default.
- W2950458457 creator A5083645347 @default.
- W2950458457 date "2016-12-22" @default.
- W2950458457 modified "2023-09-27" @default.
- W2950458457 title "Set Similarity Search Beyond MinHash" @default.
- W2950458457 cites W1455310343 @default.
- W2950458457 cites W1542553486 @default.
- W2950458457 cites W164887593 @default.
- W2950458457 cites W1898304433 @default.
- W2950458457 cites W190065572 @default.
- W2950458457 cites W1965996575 @default.
- W2950458457 cites W1994945255 @default.
- W2950458457 cites W2002359780 @default.
- W2950458457 cites W2011737794 @default.
- W2950458457 cites W2012833704 @default.
- W2950458457 cites W2097776316 @default.
- W2950458457 cites W2103012681 @default.
- W2950458457 cites W2123485784 @default.
- W2950458457 cites W2132069633 @default.
- W2950458457 cites W2134212491 @default.
- W2950458457 cites W2147717514 @default.
- W2950458457 cites W2152565070 @default.
- W2950458457 cites W2241750177 @default.
- W2950458457 cites W2295744963 @default.
- W2950458457 cites W2308071406 @default.
- W2950458457 cites W2608053747 @default.
- W2950458457 cites W2899702797 @default.
- W2950458457 cites W2914771567 @default.
- W2950458457 cites W2949388608 @default.
- W2950458457 cites W2952088295 @default.
- W2950458457 cites W2953209208 @default.
- W2950458457 cites W2963563740 @default.
- W2950458457 cites W3105727767 @default.
- W2950458457 hasPublicationYear "2016" @default.
- W2950458457 type Work @default.
- W2950458457 sameAs 2950458457 @default.
- W2950458457 citedByCount "4" @default.
- W2950458457 countsByYear W29504584572017 @default.
- W2950458457 countsByYear W29504584572018 @default.
- W2950458457 crossrefType "posted-content" @default.
- W2950458457 hasAuthorship W2950458457A5014293815 @default.
- W2950458457 hasAuthorship W2950458457A5083645347 @default.
- W2950458457 hasConcept C103278499 @default.
- W2950458457 hasConcept C105795698 @default.
- W2950458457 hasConcept C111919701 @default.
- W2950458457 hasConcept C114614502 @default.
- W2950458457 hasConcept C115961682 @default.
- W2950458457 hasConcept C121332964 @default.
- W2950458457 hasConcept C154945302 @default.
- W2950458457 hasConcept C203519979 @default.
- W2950458457 hasConcept C2778572836 @default.
- W2950458457 hasConcept C2780762811 @default.
- W2950458457 hasConcept C33923547 @default.
- W2950458457 hasConcept C38652104 @default.
- W2950458457 hasConcept C41008148 @default.
- W2950458457 hasConcept C67388219 @default.
- W2950458457 hasConcept C73555534 @default.
- W2950458457 hasConcept C74270461 @default.
- W2950458457 hasConcept C99138194 @default.
- W2950458457 hasConceptScore W2950458457C103278499 @default.
- W2950458457 hasConceptScore W2950458457C105795698 @default.
- W2950458457 hasConceptScore W2950458457C111919701 @default.
- W2950458457 hasConceptScore W2950458457C114614502 @default.
- W2950458457 hasConceptScore W2950458457C115961682 @default.
- W2950458457 hasConceptScore W2950458457C121332964 @default.
- W2950458457 hasConceptScore W2950458457C154945302 @default.
- W2950458457 hasConceptScore W2950458457C203519979 @default.
- W2950458457 hasConceptScore W2950458457C2778572836 @default.
- W2950458457 hasConceptScore W2950458457C2780762811 @default.
- W2950458457 hasConceptScore W2950458457C33923547 @default.
- W2950458457 hasConceptScore W2950458457C38652104 @default.
- W2950458457 hasConceptScore W2950458457C41008148 @default.
- W2950458457 hasConceptScore W2950458457C67388219 @default.
- W2950458457 hasConceptScore W2950458457C73555534 @default.
- W2950458457 hasConceptScore W2950458457C74270461 @default.
- W2950458457 hasConceptScore W2950458457C99138194 @default.
- W2950458457 hasLocation W29504584571 @default.
- W2950458457 hasOpenAccess W2950458457 @default.
- W2950458457 hasPrimaryLocation W29504584571 @default.
- W2950458457 hasRelatedWork W1782245179 @default.
- W2950458457 hasRelatedWork W2018741301 @default.
- W2950458457 hasRelatedWork W2072340529 @default.
- W2950458457 hasRelatedWork W2282723428 @default.
- W2950458457 hasRelatedWork W2394626294 @default.
- W2950458457 hasRelatedWork W2401929940 @default.
- W2950458457 hasRelatedWork W2586700818 @default.
- W2950458457 hasRelatedWork W2611562969 @default.
- W2950458457 hasRelatedWork W2951279353 @default.
- W2950458457 hasRelatedWork W2952460245 @default.
- W2950458457 hasRelatedWork W2952603777 @default.
- W2950458457 hasRelatedWork W2953299275 @default.
- W2950458457 hasRelatedWork W2963722385 @default.
- W2950458457 hasRelatedWork W3015560540 @default.
- W2950458457 hasRelatedWork W3037318061 @default.
- W2950458457 hasRelatedWork W3047118271 @default.
- W2950458457 hasRelatedWork W3098927475 @default.
- W2950458457 hasRelatedWork W3110680648 @default.
- W2950458457 hasRelatedWork W3206714591 @default.
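
The two quantities defined in the abstract, the Braun-Blanquet similarity $B(\mathbf{x}, \mathbf{y})$ and the exponent $\rho = \log(1/b_1)/\log(1/b_2)$, can be evaluated directly. The following is a minimal sketch for illustration only; it is not the data structure from the paper. The function names `braun_blanquet` and `rho` are hypothetical, and returning 0.0 for two empty sets is an assumed convention that the abstract does not specify.

```python
import math


def braun_blanquet(x: set, y: set) -> float:
    """Braun-Blanquet similarity |x ∩ y| / max(|x|, |y|)."""
    if not x and not y:
        # Assumed convention: the abstract does not define this case.
        return 0.0
    return len(x & y) / max(len(x), len(y))


def rho(b1: float, b2: float) -> float:
    """Exponent rho = log(1/b1) / log(1/b2), for constants 0 < b2 < b1 < 1."""
    if not 0 < b2 < b1 < 1:
        raise ValueError("requires 0 < b2 < b1 < 1")
    return math.log(1 / b1) / math.log(1 / b2)


if __name__ == "__main__":
    x = {1, 2, 3, 4}
    y = {3, 4, 5, 6, 7, 8}
    # Intersection {3, 4} has size 2; the larger set has size 6.
    print(braun_blanquet(x, y))
    # With b1 = 0.5 and b2 = 0.25, rho = log 2 / log 4.
    print(rho(0.5, 0.25))
```

With $b_1 = 0.5$ and $b_2 = 0.25$ the exponent is $\log 2 / \log 4 = 1/2$, so the stated query time $O(|\mathbf{q}| n^{\rho} \log n)$ would scale as roughly $|\mathbf{q}| \sqrt{n} \log n$ for those thresholds.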