Matches in SemOpenAlex for { <https://semopenalex.org/work/W2040887688> ?p ?o ?g. }
- W2040887688 endingPage "184" @default.
- W2040887688 startingPage "170" @default.
- W2040887688 abstract "Entity matching (EM) is the task of identifying records that refer to the same entity from different sources. EM is widely used in real-world applications such as data integration and data cleaning, but the naive method of EM leads to exhaustive pair-wise comparisons. To enhance the efficiency of EM, we transform EM into the top-k query problem of identifying the best k results for a given match function, and propose a new EM algorithm using pre-materialized lists, which refer to the sorted lists of record pairs. Our proposed algorithm identifies the EM results with sub-linear cost using the materialized lists. Because it requires us to materialize the sorted lists with all record pairs, however, this approach can be impractical. To address this problem, we reduce the size of the materialized lists so that they store only 1% of all pairs without sacrificing EM accuracy. This method is inspired by the notion of skyline queries. In addition, we extend our proposed framework to collective entity matching that exploits both attributes and the reference relationships across records. Experimental results show that the proposed algorithms are an order of magnitude faster than the state-of-the-art algorithms without compromising accuracy." @default.
- W2040887688 created "2016-06-24" @default.
- W2040887688 creator A5001967366 @default.
- W2040887688 creator A5024873601 @default.
- W2040887688 creator A5065423554 @default.
- W2040887688 date "2014-03-01" @default.
- W2040887688 modified "2023-09-25" @default.
- W2040887688 title "Efficient entity matching using materialized lists" @default.
- W2040887688 cites W130948412 @default.
- W2040887688 cites W1559390933 @default.
- W2040887688 cites W1612155886 @default.
- W2040887688 cites W1920916604 @default.
- W2040887688 cites W1965264121 @default.
- W2040887688 cites W1981590391 @default.
- W2040887688 cites W2006997130 @default.
- W2040887688 cites W2024770506 @default.
- W2040887688 cites W2034190452 @default.
- W2040887688 cites W2036216970 @default.
- W2040887688 cites W2040348939 @default.
- W2040887688 cites W2046020929 @default.
- W2040887688 cites W2055405704 @default.
- W2040887688 cites W2067566391 @default.
- W2040887688 cites W2073471108 @default.
- W2040887688 cites W2083039049 @default.
- W2040887688 cites W2110529497 @default.
- W2040887688 cites W2111971349 @default.
- W2040887688 cites W2116396741 @default.
- W2040887688 cites W2123561513 @default.
- W2040887688 cites W2134206624 @default.
- W2040887688 cites W2140894345 @default.
- W2040887688 cites W2148019918 @default.
- W2040887688 cites W2154454189 @default.
- W2040887688 cites W2154785834 @default.
- W2040887688 cites W2170188482 @default.
- W2040887688 cites W2170902582 @default.
- W2040887688 cites W2171574281 @default.
- W2040887688 doi "https://doi.org/10.1016/j.ins.2013.08.045" @default.
- W2040887688 hasPublicationYear "2014" @default.
- W2040887688 type Work @default.
- W2040887688 sameAs 2040887688 @default.
- W2040887688 citedByCount "6" @default.
- W2040887688 countsByYear W20408876882014 @default.
- W2040887688 countsByYear W20408876882015 @default.
- W2040887688 countsByYear W20408876882018 @default.
- W2040887688 countsByYear W20408876882019 @default.
- W2040887688 countsByYear W20408876882020 @default.
- W2040887688 crossrefType "journal-article" @default.
- W2040887688 hasAuthorship W2040887688A5001967366 @default.
- W2040887688 hasAuthorship W2040887688A5024873601 @default.
- W2040887688 hasAuthorship W2040887688A5065423554 @default.
- W2040887688 hasConcept C105795698 @default.
- W2040887688 hasConcept C124101348 @default.
- W2040887688 hasConcept C14036430 @default.
- W2040887688 hasConcept C148840519 @default.
- W2040887688 hasConcept C162324750 @default.
- W2040887688 hasConcept C165064840 @default.
- W2040887688 hasConcept C165696696 @default.
- W2040887688 hasConcept C187736073 @default.
- W2040887688 hasConcept C23123220 @default.
- W2040887688 hasConcept C2780451532 @default.
- W2040887688 hasConcept C2780757406 @default.
- W2040887688 hasConcept C33923547 @default.
- W2040887688 hasConcept C38652104 @default.
- W2040887688 hasConcept C41008148 @default.
- W2040887688 hasConcept C54239708 @default.
- W2040887688 hasConcept C77088390 @default.
- W2040887688 hasConcept C78458016 @default.
- W2040887688 hasConcept C86803240 @default.
- W2040887688 hasConcept C98199447 @default.
- W2040887688 hasConceptScore W2040887688C105795698 @default.
- W2040887688 hasConceptScore W2040887688C124101348 @default.
- W2040887688 hasConceptScore W2040887688C14036430 @default.
- W2040887688 hasConceptScore W2040887688C148840519 @default.
- W2040887688 hasConceptScore W2040887688C162324750 @default.
- W2040887688 hasConceptScore W2040887688C165064840 @default.
- W2040887688 hasConceptScore W2040887688C165696696 @default.
- W2040887688 hasConceptScore W2040887688C187736073 @default.
- W2040887688 hasConceptScore W2040887688C23123220 @default.
- W2040887688 hasConceptScore W2040887688C2780451532 @default.
- W2040887688 hasConceptScore W2040887688C2780757406 @default.
- W2040887688 hasConceptScore W2040887688C33923547 @default.
- W2040887688 hasConceptScore W2040887688C38652104 @default.
- W2040887688 hasConceptScore W2040887688C41008148 @default.
- W2040887688 hasConceptScore W2040887688C54239708 @default.
- W2040887688 hasConceptScore W2040887688C77088390 @default.
- W2040887688 hasConceptScore W2040887688C78458016 @default.
- W2040887688 hasConceptScore W2040887688C86803240 @default.
- W2040887688 hasConceptScore W2040887688C98199447 @default.
- W2040887688 hasLocation W20408876881 @default.
- W2040887688 hasOpenAccess W2040887688 @default.
- W2040887688 hasPrimaryLocation W20408876881 @default.
- W2040887688 hasRelatedWork W1486838520 @default.
- W2040887688 hasRelatedWork W2023608319 @default.
- W2040887688 hasRelatedWork W2145015575 @default.
- W2040887688 hasRelatedWork W2293853248 @default.
- W2040887688 hasRelatedWork W2331788134 @default.
- W2040887688 hasRelatedWork W2727889391 @default.
- W2040887688 hasRelatedWork W2896609224 @default.
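
The abstract above frames entity matching as a top-k query over pre-sorted lists of record pairs. As a rough illustration of that general idea only, the sketch below uses a generic threshold-style top-k scan (in the spirit of Fagin's Threshold Algorithm), not the algorithm from the paper itself. The function name `top_k_pairs`, the pair identifiers, the integer similarity scores, and the choice of a plain sum as the match function are all assumptions made for this example.

```python
import heapq


def top_k_pairs(sorted_lists, k):
    """Return the k best-scoring pairs and the scan depth that was needed.

    Assumptions (hypothetical, for illustration only):
    - each list holds (pair_id, score) tuples sorted by score, descending;
    - every pair appears in every list;
    - the match function is the sum of the per-attribute scores.
    """
    # Dictionaries give random access to a pair's score in each list.
    lookups = [dict(lst) for lst in sorted_lists]
    seen = set()
    best = []  # min-heap of (total_score, pair_id), at most k entries
    depth_used = 0
    length = len(sorted_lists[0]) if sorted_lists else 0

    for depth in range(length):
        depth_used = depth + 1
        threshold = 0  # upper bound on the total of any pair not yet seen
        for lst in sorted_lists:
            pair_id, score = lst[depth]
            threshold += score
            if pair_id in seen:
                continue
            seen.add(pair_id)
            total = sum(lookup[pair_id] for lookup in lookups)
            if len(best) < k:
                heapq.heappush(best, (total, pair_id))
            elif total > best[0][0]:
                heapq.heapreplace(best, (total, pair_id))
        # Stop early once the k-th best total cannot be beaten.
        if len(best) == k and best[0][0] >= threshold:
            break

    return sorted(best, reverse=True), depth_used


# Hypothetical similarity scores (0 to 100) for four candidate record pairs.
name_sim = [("p1", 90), ("p2", 80), ("p3", 30), ("p4", 10)]
addr_sim = [("p2", 90), ("p1", 70), ("p4", 20), ("p3", 10)]

top, depth = top_k_pairs([name_sim, addr_sim], k=1)
print(top, depth)  # [(170, 'p2')] 2
```

In this toy input the scan stops after two of the four positions in each list, which is the kind of early termination that sorted, materialized lists make possible. The paper's own list-reduction step (keeping about 1% of pairs) is not modeled here.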