Matches in SemOpenAlex for { <https://semopenalex.org/work/W2052698082> ?p ?o ?g. }
- W2052698082 abstract "In entity matching, a fundamental issue while training a classifier to label pairs of entities as either duplicates or non-duplicates is the one of selecting informative training examples. Although active learning presents an attractive solution to this problem, previous approaches minimize the misclassification rate (0-1 loss) of the classifier, which is an unsuitable metric for entity matching due to class imbalance (i.e., many more non-duplicate pairs than duplicate pairs). To address this, a recent paper [1] proposes to maximize recall of the classifier under the constraint that its precision should be greater than a specified threshold. However, the proposed technique requires the labels of all n input pairs in the worst-case. Our main result is an active learning algorithm that approximately maximizes recall of the classifier while respecting a precision constraint with provably sub-linear label complexity (under certain distributional assumptions). Our algorithm uses as a black-box any active learning module that minimizes 0-1 loss. We show that label complexity of our algorithm is at most log n times the label complexity of the black-box, and also bound the difference in the recall of classifier learnt by our algorithm and the recall of the optimal classifier satisfying the precision constraint. We provide an empirical evaluation of our algorithm on several real-world matching data sets that demonstrates the effectiveness of our approach." @default.
- W2052698082 created "2016-06-24" @default.
- W2052698082 creator A5003599004 @default.
- W2052698082 creator A5013608601 @default.
- W2052698082 creator A5041045187 @default.
- W2052698082 creator A5090835762 @default.
- W2052698082 date "2012-01-01" @default.
- W2052698082 modified "2023-09-28" @default.
- W2052698082 title "Active sampling for entity matching" @default.
- W2052698082 cites W1518784700 @default.
- W2052698082 cites W1572210044 @default.
- W2052698082 cites W1600192391 @default.
- W2052698082 cites W2005314985 @default.
- W2052698082 cites W2036216970 @default.
- W2052698082 cites W2042932437 @default.
- W2052698082 cites W2056707879 @default.
- W2052698082 cites W2067566391 @default.
- W2052698082 cites W2095644746 @default.
- W2052698082 cites W2099416425 @default.
- W2052698082 cites W2104511295 @default.
- W2052698082 cites W2108991785 @default.
- W2052698082 cites W2114232233 @default.
- W2052698082 cites W2117756453 @default.
- W2052698082 cites W2117974736 @default.
- W2052698082 cites W2119320829 @default.
- W2052698082 cites W2122689903 @default.
- W2052698082 cites W2125943921 @default.
- W2052698082 cites W2128518360 @default.
- W2052698082 cites W2142021955 @default.
- W2052698082 cites W2142261479 @default.
- W2052698082 cites W2143124645 @default.
- W2052698082 cites W2151023586 @default.
- W2052698082 cites W2164456230 @default.
- W2052698082 cites W2165484066 @default.
- W2052698082 cites W2166886563 @default.
- W2052698082 cites W2167595980 @default.
- W2052698082 cites W2170902582 @default.
- W2052698082 cites W2903158431 @default.
- W2052698082 cites W2951528191 @default.
- W2052698082 cites W3120740533 @default.
- W2052698082 cites W46452414 @default.
- W2052698082 cites W61518770 @default.
- W2052698082 doi "https://doi.org/10.1145/2339530.2339707" @default.
- W2052698082 hasPublicationYear "2012" @default.
- W2052698082 type Work @default.
- W2052698082 sameAs 2052698082 @default.
- W2052698082 citedByCount "77" @default.
- W2052698082 countsByYear W20526980822012 @default.
- W2052698082 countsByYear W20526980822013 @default.
- W2052698082 countsByYear W20526980822014 @default.
- W2052698082 countsByYear W20526980822015 @default.
- W2052698082 countsByYear W20526980822016 @default.
- W2052698082 countsByYear W20526980822017 @default.
- W2052698082 countsByYear W20526980822018 @default.
- W2052698082 countsByYear W20526980822019 @default.
- W2052698082 countsByYear W20526980822020 @default.
- W2052698082 countsByYear W20526980822021 @default.
- W2052698082 countsByYear W20526980822022 @default.
- W2052698082 countsByYear W20526980822023 @default.
- W2052698082 crossrefType "proceedings-article" @default.
- W2052698082 hasAuthorship W2052698082A5003599004 @default.
- W2052698082 hasAuthorship W2052698082A5013608601 @default.
- W2052698082 hasAuthorship W2052698082A5041045187 @default.
- W2052698082 hasAuthorship W2052698082A5090835762 @default.
- W2052698082 hasBestOaLocation W20526980822 @default.
- W2052698082 hasConcept C100660578 @default.
- W2052698082 hasConcept C11413529 @default.
- W2052698082 hasConcept C119857082 @default.
- W2052698082 hasConcept C124101348 @default.
- W2052698082 hasConcept C138885662 @default.
- W2052698082 hasConcept C153180895 @default.
- W2052698082 hasConcept C154945302 @default.
- W2052698082 hasConcept C311688 @default.
- W2052698082 hasConcept C41008148 @default.
- W2052698082 hasConcept C41895202 @default.
- W2052698082 hasConcept C81669768 @default.
- W2052698082 hasConcept C95623464 @default.
- W2052698082 hasConceptScore W2052698082C100660578 @default.
- W2052698082 hasConceptScore W2052698082C11413529 @default.
- W2052698082 hasConceptScore W2052698082C119857082 @default.
- W2052698082 hasConceptScore W2052698082C124101348 @default.
- W2052698082 hasConceptScore W2052698082C138885662 @default.
- W2052698082 hasConceptScore W2052698082C153180895 @default.
- W2052698082 hasConceptScore W2052698082C154945302 @default.
- W2052698082 hasConceptScore W2052698082C311688 @default.
- W2052698082 hasConceptScore W2052698082C41008148 @default.
- W2052698082 hasConceptScore W2052698082C41895202 @default.
- W2052698082 hasConceptScore W2052698082C81669768 @default.
- W2052698082 hasConceptScore W2052698082C95623464 @default.
- W2052698082 hasLocation W20526980821 @default.
- W2052698082 hasLocation W20526980822 @default.
- W2052698082 hasOpenAccess W2052698082 @default.
- W2052698082 hasPrimaryLocation W20526980821 @default.
- W2052698082 hasRelatedWork W1547612978 @default.
- W2052698082 hasRelatedWork W1981590391 @default.
- W2052698082 hasRelatedWork W2028818097 @default.
- W2052698082 hasRelatedWork W2031250218 @default.
- W2052698082 hasRelatedWork W2035615211 @default.
- W2052698082 hasRelatedWork W2043772275 @default.
- W2052698082 hasRelatedWork W2053653724 @default.