Matches in SemOpenAlex for { <https://semopenalex.org/work/W2120058970> ?p ?o ?g. }
- W2120058970 abstract "This paper offers a novel look at using a dimensionality-reduction technique called simhash to detect similar document pairs in large-scale collections. We show that this algorithm produces interesting intermediate data, which is normally discarded, that can be used to predict which of the bits in the final hash are more susceptible to being flipped in similar documents. This paves the way for a probabilistic search technique in the Hamming space of simhashes that can be significantly faster and more space-efficient than the existing simhash approaches. We show that with 95% recall compared to deterministic search of prior work, our method exhibits 4-14 times faster lookup and requires 2-10 times less RAM on our collection of 70M web pages." @default.
- W2120058970 created "2016-06-24" @default.
- W2120058970 creator A5076734500 @default.
- W2120058970 creator A5085002707 @default.
- W2120058970 date "2011-01-01" @default.
- W2120058970 modified "2023-09-23" @default.
- W2120058970 title "Probabilistic near-duplicate detection using simhash" @default.
- W2120058970 cites W1530595269 @default.
- W2120058970 cites W1974942389 @default.
- W2120058970 cites W1991800036 @default.
- W2120058970 cites W2012833704 @default.
- W2120058970 cites W2048779798 @default.
- W2120058970 cites W2053017876 @default.
- W2120058970 cites W2067432306 @default.
- W2120058970 cites W2081193615 @default.
- W2120058970 cites W2085922539 @default.
- W2120058970 cites W2118269922 @default.
- W2120058970 cites W2145349611 @default.
- W2120058970 cites W2145990704 @default.
- W2120058970 cites W2147717514 @default.
- W2120058970 cites W2148781362 @default.
- W2120058970 cites W2148885851 @default.
- W2120058970 cites W2152565070 @default.
- W2120058970 cites W2161449253 @default.
- W2120058970 cites W2397770138 @default.
- W2120058970 cites W2913932916 @default.
- W2120058970 cites W87694687 @default.
- W2120058970 doi "https://doi.org/10.1145/2063576.2063737" @default.
- W2120058970 hasPublicationYear "2011" @default.
- W2120058970 type Work @default.
- W2120058970 sameAs 2120058970 @default.
- W2120058970 citedByCount "27" @default.
- W2120058970 countsByYear W21200589702012 @default.
- W2120058970 countsByYear W21200589702013 @default.
- W2120058970 countsByYear W21200589702014 @default.
- W2120058970 countsByYear W21200589702015 @default.
- W2120058970 countsByYear W21200589702016 @default.
- W2120058970 countsByYear W21200589702017 @default.
- W2120058970 countsByYear W21200589702019 @default.
- W2120058970 countsByYear W21200589702020 @default.
- W2120058970 countsByYear W21200589702021 @default.
- W2120058970 countsByYear W21200589702022 @default.
- W2120058970 crossrefType "proceedings-article" @default.
- W2120058970 hasAuthorship W2120058970A5076734500 @default.
- W2120058970 hasAuthorship W2120058970A5085002707 @default.
- W2120058970 hasBestOaLocation W21200589702 @default.
- W2120058970 hasConcept C111335779 @default.
- W2120058970 hasConcept C111919701 @default.
- W2120058970 hasConcept C11413529 @default.
- W2120058970 hasConcept C124101348 @default.
- W2120058970 hasConcept C154945302 @default.
- W2120058970 hasConcept C157125643 @default.
- W2120058970 hasConcept C193319292 @default.
- W2120058970 hasConcept C23123220 @default.
- W2120058970 hasConcept C2524010 @default.
- W2120058970 hasConcept C2778572836 @default.
- W2120058970 hasConcept C2779494224 @default.
- W2120058970 hasConcept C33923547 @default.
- W2120058970 hasConcept C38652104 @default.
- W2120058970 hasConcept C41008148 @default.
- W2120058970 hasConcept C49937458 @default.
- W2120058970 hasConcept C57273362 @default.
- W2120058970 hasConcept C67388219 @default.
- W2120058970 hasConcept C70518039 @default.
- W2120058970 hasConcept C73150493 @default.
- W2120058970 hasConcept C80444323 @default.
- W2120058970 hasConcept C99138194 @default.
- W2120058970 hasConceptScore W2120058970C111335779 @default.
- W2120058970 hasConceptScore W2120058970C111919701 @default.
- W2120058970 hasConceptScore W2120058970C11413529 @default.
- W2120058970 hasConceptScore W2120058970C124101348 @default.
- W2120058970 hasConceptScore W2120058970C154945302 @default.
- W2120058970 hasConceptScore W2120058970C157125643 @default.
- W2120058970 hasConceptScore W2120058970C193319292 @default.
- W2120058970 hasConceptScore W2120058970C23123220 @default.
- W2120058970 hasConceptScore W2120058970C2524010 @default.
- W2120058970 hasConceptScore W2120058970C2778572836 @default.
- W2120058970 hasConceptScore W2120058970C2779494224 @default.
- W2120058970 hasConceptScore W2120058970C33923547 @default.
- W2120058970 hasConceptScore W2120058970C38652104 @default.
- W2120058970 hasConceptScore W2120058970C41008148 @default.
- W2120058970 hasConceptScore W2120058970C49937458 @default.
- W2120058970 hasConceptScore W2120058970C57273362 @default.
- W2120058970 hasConceptScore W2120058970C67388219 @default.
- W2120058970 hasConceptScore W2120058970C70518039 @default.
- W2120058970 hasConceptScore W2120058970C73150493 @default.
- W2120058970 hasConceptScore W2120058970C80444323 @default.
- W2120058970 hasConceptScore W2120058970C99138194 @default.
- W2120058970 hasLocation W21200589701 @default.
- W2120058970 hasLocation W21200589702 @default.
- W2120058970 hasOpenAccess W2120058970 @default.
- W2120058970 hasPrimaryLocation W21200589701 @default.
- W2120058970 hasRelatedWork W1579524835 @default.
- W2120058970 hasRelatedWork W1998116167 @default.
- W2120058970 hasRelatedWork W2012833704 @default.
- W2120058970 hasRelatedWork W2015509951 @default.
- W2120058970 hasRelatedWork W2069270778 @default.
- W2120058970 hasRelatedWork W2085922539 @default.
- W2120058970 hasRelatedWork W2092617647 @default.
- W2120058970 hasRelatedWork W2096929438 @default.