Matches in SemOpenAlex for { <https://semopenalex.org/work/W2038585855> ?p ?o ?g. }
- W2038585855 abstract "This paper explores combinatorial optimization for problems of max-weight graph matching on multi-partite graphs, which arise in integrating multiple data sources. Entity resolution, the data integration problem of performing noisy joins on structured data, typically proceeds by first hashing each record into zero or more blocks, scoring pairs of records that are co-blocked for similarity, and then matching pairs of sufficient similarity. In the most common case of matching two sources, it is often desirable for the final matching to be one-to-one (a record may be matched with at most one other); members of the database and statistical record linkage communities accomplish such matchings in the final stage by weighted bipartite graph matching on similarity scores. Such matchings are intuitively appealing: they leverage a natural global property of many real-world entity stores (that of being nearly deduped) and are known to provide significant improvements to precision and recall. Unfortunately, unlike the bipartite case, exact max-weight matching on multi-partite graphs is known to be NP-hard. Our two-fold algorithmic contributions approximate multi-partite max-weight matching: our first algorithm borrows optimization techniques common to Bayesian probabilistic inference; our second is a greedy approximation algorithm. In addition to a theoretical guarantee on the latter, we present comparisons on a real-world ER problem from Bing significantly larger than typically found in the literature, on publication data, and on a series of synthetic problems. Our results quantify significant improvements due to exploiting multiple sources, which are made possible by global one-to-one constraints linking otherwise independent matching sub-problems. We also discover that our algorithms are complementary: one being much more robust under noise, and the other being simple to implement and very fast to run." @default.
- W2038585855 created "2016-06-24" @default.
- W2038585855 creator A5025344952 @default.
- W2038585855 creator A5078824132 @default.
- W2038585855 creator A5090652507 @default.
- W2038585855 date "2014-02-03" @default.
- W2038585855 modified "2023-09-29" @default.
- W2038585855 title "Principled Graph Matching Algorithms for Integrating Multiple Data Sources" @default.
- W2038585855 cites W1511986666 @default.
- W2038585855 cites W1536860849 @default.
- W2038585855 cites W1563742761 @default.
- W2038585855 cites W1589277632 @default.
- W2038585855 cites W1742677423 @default.
- W2038585855 cites W1761401273 @default.
- W2038585855 cites W177885090 @default.
- W2038585855 cites W1979629649 @default.
- W2038585855 cites W1981590391 @default.
- W2038585855 cites W1998475090 @default.
- W2038585855 cites W2025575149 @default.
- W2038585855 cites W2034190452 @default.
- W2038585855 cites W2041439319 @default.
- W2038585855 cites W2065841482 @default.
- W2038585855 cites W2067566391 @default.
- W2038585855 cites W2069728146 @default.
- W2038585855 cites W2092708731 @default.
- W2038585855 cites W2099091355 @default.
- W2038585855 cites W2104511295 @default.
- W2038585855 cites W2107499437 @default.
- W2038585855 cites W2108991785 @default.
- W2038585855 cites W2123561513 @default.
- W2038585855 cites W2133676910 @default.
- W2038585855 cites W2145007893 @default.
- W2038585855 cites W2145492473 @default.
- W2038585855 cites W2164456230 @default.
- W2038585855 cites W2164625277 @default.
- W2038585855 cites W2170987079 @default.
- W2038585855 cites W2340149117 @default.
- W2038585855 cites W2402132408 @default.
- W2038585855 cites W61518770 @default.
- W2038585855 cites W1564630549 @default.
- W2038585855 hasPublicationYear "2014" @default.
- W2038585855 type Work @default.
- W2038585855 sameAs 2038585855 @default.
- W2038585855 citedByCount "0" @default.
- W2038585855 crossrefType "posted-content" @default.
- W2038585855 hasAuthorship W2038585855A5025344952 @default.
- W2038585855 hasAuthorship W2038585855A5078824132 @default.
- W2038585855 hasAuthorship W2038585855A5090652507 @default.
- W2038585855 hasConcept C105795698 @default.
- W2038585855 hasConcept C11413529 @default.
- W2038585855 hasConcept C124101348 @default.
- W2038585855 hasConcept C132525143 @default.
- W2038585855 hasConcept C153083717 @default.
- W2038585855 hasConcept C154945302 @default.
- W2038585855 hasConcept C165064840 @default.
- W2038585855 hasConcept C197657726 @default.
- W2038585855 hasConcept C2776214188 @default.
- W2038585855 hasConcept C33923547 @default.
- W2038585855 hasConcept C41008148 @default.
- W2038585855 hasConcept C61455927 @default.
- W2038585855 hasConcept C72545166 @default.
- W2038585855 hasConcept C80444323 @default.
- W2038585855 hasConceptScore W2038585855C105795698 @default.
- W2038585855 hasConceptScore W2038585855C11413529 @default.
- W2038585855 hasConceptScore W2038585855C124101348 @default.
- W2038585855 hasConceptScore W2038585855C132525143 @default.
- W2038585855 hasConceptScore W2038585855C153083717 @default.
- W2038585855 hasConceptScore W2038585855C154945302 @default.
- W2038585855 hasConceptScore W2038585855C165064840 @default.
- W2038585855 hasConceptScore W2038585855C197657726 @default.
- W2038585855 hasConceptScore W2038585855C2776214188 @default.
- W2038585855 hasConceptScore W2038585855C33923547 @default.
- W2038585855 hasConceptScore W2038585855C41008148 @default.
- W2038585855 hasConceptScore W2038585855C61455927 @default.
- W2038585855 hasConceptScore W2038585855C72545166 @default.
- W2038585855 hasConceptScore W2038585855C80444323 @default.
- W2038585855 hasLocation W20385858551 @default.
- W2038585855 hasOpenAccess W2038585855 @default.
- W2038585855 hasPrimaryLocation W20385858551 @default.
- W2038585855 hasRelatedWork W1973535531 @default.
- W2038585855 hasRelatedWork W1992112216 @default.
- W2038585855 hasRelatedWork W2048202272 @default.
- W2038585855 hasRelatedWork W2087829218 @default.
- W2038585855 hasRelatedWork W2099225351 @default.
- W2038585855 hasRelatedWork W2188600765 @default.
- W2038585855 hasRelatedWork W2590154857 @default.
- W2038585855 hasRelatedWork W2606791715 @default.
- W2038585855 hasRelatedWork W2611719484 @default.
- W2038585855 hasRelatedWork W2736701013 @default.
- W2038585855 hasRelatedWork W2737061368 @default.
- W2038585855 hasRelatedWork W2763434573 @default.
- W2038585855 hasRelatedWork W2793119763 @default.
- W2038585855 hasRelatedWork W2796285681 @default.
- W2038585855 hasRelatedWork W2898123547 @default.
- W2038585855 hasRelatedWork W2951438725 @default.
- W2038585855 hasRelatedWork W2964166089 @default.
- W2038585855 hasRelatedWork W3031338162 @default.
- W2038585855 hasRelatedWork W3032304977 @default.
- W2038585855 hasRelatedWork W3206583875 @default.
- W2038585855 isParatext "false" @default.
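
The abstract above mentions a greedy approximation to multi-partite max-weight matching under a one-to-one constraint. The record does not give the algorithm itself, so the following is only a minimal sketch of one plausible greedy scheme, not the authors' method. It assumes records are (source, id) pairs, that candidate pairs have already been blocked, scored, and filtered by a similarity threshold, and that "one-to-one" means a matched cluster may hold at most one record per source. The function name greedy_one_to_one and the data layout are hypothetical.

```python
from typing import Dict, Hashable, Iterable, List, Set, Tuple

# Assumed layout (not from the record): a record is (source name, record id),
# and an edge is (record, record, similarity score).
Record = Tuple[str, Hashable]
Edge = Tuple[Record, Record, float]


def greedy_one_to_one(edges: Iterable[Edge]) -> List[Set[Record]]:
    """Greedily cluster records, allowing at most one record per source
    in each cluster. Edges are visited from highest to lowest score."""
    cluster_of: Dict[Record, int] = {}   # record -> id of its current cluster
    members: Dict[int, Set[Record]] = {} # cluster id -> records in the cluster
    next_id = 0

    for u, v, _score in sorted(edges, key=lambda e: e[2], reverse=True):
        # Put each unseen record into its own singleton cluster.
        for r in (u, v):
            if r not in cluster_of:
                cluster_of[r] = next_id
                members[next_id] = {r}
                next_id += 1

        cu, cv = cluster_of[u], cluster_of[v]
        if cu == cv:
            continue  # already matched together

        # One-to-one constraint: the merged cluster may not contain
        # two records from the same source.
        sources_u = {source for source, _ in members[cu]}
        sources_v = {source for source, _ in members[cv]}
        if sources_u & sources_v:
            continue  # merging would violate the constraint; skip this edge

        # Merge cluster cv into cluster cu.
        for r in members[cv]:
            cluster_of[r] = cu
        members[cu] |= members[cv]
        del members[cv]

    return list(members.values())


# Toy example with three sources A, B, C (scores are made up).
edges = [
    (("A", 1), ("B", 1), 0.9),
    (("B", 1), ("C", 1), 0.8),
    (("A", 2), ("B", 1), 0.7),  # rejected: that cluster already has an A record
    (("A", 2), ("C", 2), 0.6),
]
clusters = greedy_one_to_one(edges)
```

On the toy input, the sketch yields two clusters: one containing (A, 1), (B, 1), (C, 1) and one containing (A, 2), (C, 2). The third edge is skipped because accepting it would place two records from source A in the same cluster, which shows how the global constraint links otherwise independent pairwise decisions.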