Matches in SemOpenAlex for { <https://semopenalex.org/work/W2019412213> ?p ?o ?g. }
- W2019412213 endingPage "1929" @default.
- W2019412213 startingPage "1916" @default.
- W2019412213 abstract "Similarity joins play an important role in many application areas, such as data integration and cleaning, record linkage, and pattern recognition. In this paper, we study efficient algorithms for similarity joins with an edit distance constraint. Currently, the most prevalent approach is based on extracting overlapping grams from strings and considering only strings that share a certain number of grams as candidates. Unlike these existing approaches, we propose a novel approach to edit similarity join based on extracting nonoverlapping substrings, or chunks, from strings. We propose a class of chunking schemes based on the notion of a tail-restricted chunk boundary dictionary. A new algorithm, VChunkJoin, is designed by integrating existing filtering methods and several new filters unique to our chunk-based method. We also design a greedy algorithm to automatically select a good chunking scheme for a given data set. We demonstrate experimentally that the new algorithm is faster than alternative methods yet occupies less space." @default.
- W2019412213 created "2016-06-24" @default.
- W2019412213 creator A5030822803 @default.
- W2019412213 creator A5036148682 @default.
- W2019412213 creator A5048552211 @default.
- W2019412213 creator A5052993469 @default.
- W2019412213 creator A5056923399 @default.
- W2019412213 date "2013-08-01" @default.
- W2019412213 modified "2023-09-23" @default.
- W2019412213 title "VChunkJoin: An Efficient Algorithm for Edit Similarity Joins" @default.
- W2019412213 cites W1970026646 @default.
- W2019412213 cites W1974995373 @default.
- W2019412213 cites W1993831977 @default.
- W2019412213 cites W2000482994 @default.
- W2019412213 cites W2001496424 @default.
- W2019412213 cites W2007682403 @default.
- W2019412213 cites W2009036829 @default.
- W2019412213 cites W2011632873 @default.
- W2019412213 cites W2024605621 @default.
- W2019412213 cites W2067566391 @default.
- W2019412213 cites W2071735063 @default.
- W2019412213 cites W2072173758 @default.
- W2019412213 cites W2085678516 @default.
- W2019412213 cites W2089326542 @default.
- W2019412213 cites W2097184821 @default.
- W2019412213 cites W2097776316 @default.
- W2019412213 cites W2099370490 @default.
- W2019412213 cites W2100548092 @default.
- W2019412213 cites W2107293766 @default.
- W2019412213 cites W2111295912 @default.
- W2019412213 cites W2112099725 @default.
- W2019412213 cites W2115214414 @default.
- W2019412213 cites W2115215982 @default.
- W2019412213 cites W2119455368 @default.
- W2019412213 cites W2121516976 @default.
- W2019412213 cites W2123020735 @default.
- W2019412213 cites W2127675794 @default.
- W2019412213 cites W2129750215 @default.
- W2019412213 cites W2139660688 @default.
- W2019412213 cites W2141469207 @default.
- W2019412213 cites W2147033904 @default.
- W2019412213 cites W2148578434 @default.
- W2019412213 cites W2150916025 @default.
- W2019412213 cites W2162592052 @default.
- W2019412213 cites W2167847032 @default.
- W2019412213 cites W3146259567 @default.
- W2019412213 doi "https://doi.org/10.1109/tkde.2012.79" @default.
- W2019412213 hasPublicationYear "2013" @default.
- W2019412213 type Work @default.
- W2019412213 sameAs 2019412213 @default.
- W2019412213 citedByCount "38" @default.
- W2019412213 countsByYear W20194122132013 @default.
- W2019412213 countsByYear W20194122132014 @default.
- W2019412213 countsByYear W20194122132015 @default.
- W2019412213 countsByYear W20194122132016 @default.
- W2019412213 countsByYear W20194122132017 @default.
- W2019412213 countsByYear W20194122132018 @default.
- W2019412213 countsByYear W20194122132019 @default.
- W2019412213 countsByYear W20194122132020 @default.
- W2019412213 countsByYear W20194122132022 @default.
- W2019412213 countsByYear W20194122132023 @default.
- W2019412213 crossrefType "journal-article" @default.
- W2019412213 hasAuthorship W2019412213A5030822803 @default.
- W2019412213 hasAuthorship W2019412213A5036148682 @default.
- W2019412213 hasAuthorship W2019412213A5048552211 @default.
- W2019412213 hasAuthorship W2019412213A5052993469 @default.
- W2019412213 hasAuthorship W2019412213A5056923399 @default.
- W2019412213 hasConcept C103278499 @default.
- W2019412213 hasConcept C11413529 @default.
- W2019412213 hasConcept C115961682 @default.
- W2019412213 hasConcept C124101348 @default.
- W2019412213 hasConcept C154945302 @default.
- W2019412213 hasConcept C177264268 @default.
- W2019412213 hasConcept C182407805 @default.
- W2019412213 hasConcept C199360897 @default.
- W2019412213 hasConcept C203357204 @default.
- W2019412213 hasConcept C2778692605 @default.
- W2019412213 hasConcept C41008148 @default.
- W2019412213 hasConcept C44359876 @default.
- W2019412213 hasConcept C80444323 @default.
- W2019412213 hasConceptScore W2019412213C103278499 @default.
- W2019412213 hasConceptScore W2019412213C11413529 @default.
- W2019412213 hasConceptScore W2019412213C115961682 @default.
- W2019412213 hasConceptScore W2019412213C124101348 @default.
- W2019412213 hasConceptScore W2019412213C154945302 @default.
- W2019412213 hasConceptScore W2019412213C177264268 @default.
- W2019412213 hasConceptScore W2019412213C182407805 @default.
- W2019412213 hasConceptScore W2019412213C199360897 @default.
- W2019412213 hasConceptScore W2019412213C203357204 @default.
- W2019412213 hasConceptScore W2019412213C2778692605 @default.
- W2019412213 hasConceptScore W2019412213C41008148 @default.
- W2019412213 hasConceptScore W2019412213C44359876 @default.
- W2019412213 hasConceptScore W2019412213C80444323 @default.
- W2019412213 hasIssue "8" @default.
- W2019412213 hasLocation W20194122131 @default.
- W2019412213 hasOpenAccess W2019412213 @default.
- W2019412213 hasPrimaryLocation W20194122131 @default.
- W2019412213 hasRelatedWork W1564399566 @default.