Matches in SemOpenAlex for { <https://semopenalex.org/work/W2510118238> ?p ?o ?g. }
Showing items 1 to 67 of 67 with 100 items per page.
- W2510118238 abstract "Record deduplication is the task of identifying, in a data repository, records that refer to the same real-world entity or object despite misspelled words, typos, different writing styles, or even different schema representations or data types. The existing system provides an Unsupervised Duplication Detection (UDD) method that can be used to identify and remove duplicate records from different data sources. Starting from the non-duplicate set, two cooperating classifiers, a Weighted Component Similarity Summing Classifier (WCSS) and a Support Vector Machine (SVM), are used to iteratively identify duplicate records among the non-duplicate records, and a genetic programming (GP) approach to record deduplication is presented. This GP-based approach is also able to automatically find effective deduplication functions. We propose to employ learnable text distance functions for each database field, and show that such measures are capable of adapting to the specific notion of similarity that is appropriate for the field’s domain. We present two learnable text similarity measures suitable for this task: an extended variant of learnable string edit distance, and a novel vector-space based measure that employs a Support Vector Machine (SVM) for training. Experimental results on a range of datasets show" @default.
- W2510118238 created "2016-09-16" @default.
- W2510118238 creator A5029677534 @default.
- W2510118238 date "2013-01-01" @default.
- W2510118238 modified "2023-09-27" @default.
- W2510118238 title "Effective Duplicate Detection Using Generational Evolutionary Algorithm" @default.
- W2510118238 cites W1660390307 @default.
- W2510118238 cites W1881647329 @default.
- W2510118238 cites W2009570821 @default.
- W2510118238 cites W2073471108 @default.
- W2510118238 cites W2085099553 @default.
- W2510118238 cites W2124996875 @default.
- W2510118238 cites W2153977007 @default.
- W2510118238 cites W2610179052 @default.
- W2510118238 hasPublicationYear "2013" @default.
- W2510118238 type Work @default.
- W2510118238 sameAs 2510118238 @default.
- W2510118238 citedByCount "0" @default.
- W2510118238 crossrefType "journal-article" @default.
- W2510118238 hasAuthorship W2510118238A5029677534 @default.
- W2510118238 hasConcept C12267149 @default.
- W2510118238 hasConcept C124101348 @default.
- W2510118238 hasConcept C153180895 @default.
- W2510118238 hasConcept C154945302 @default.
- W2510118238 hasConcept C2776517306 @default.
- W2510118238 hasConcept C32587265 @default.
- W2510118238 hasConcept C41008148 @default.
- W2510118238 hasConcept C44359876 @default.
- W2510118238 hasConcept C77088390 @default.
- W2510118238 hasConcept C95623464 @default.
- W2510118238 hasConceptScore W2510118238C12267149 @default.
- W2510118238 hasConceptScore W2510118238C124101348 @default.
- W2510118238 hasConceptScore W2510118238C153180895 @default.
- W2510118238 hasConceptScore W2510118238C154945302 @default.
- W2510118238 hasConceptScore W2510118238C2776517306 @default.
- W2510118238 hasConceptScore W2510118238C32587265 @default.
- W2510118238 hasConceptScore W2510118238C41008148 @default.
- W2510118238 hasConceptScore W2510118238C44359876 @default.
- W2510118238 hasConceptScore W2510118238C77088390 @default.
- W2510118238 hasConceptScore W2510118238C95623464 @default.
- W2510118238 hasLocation W25101182381 @default.
- W2510118238 hasOpenAccess W2510118238 @default.
- W2510118238 hasPrimaryLocation W25101182381 @default.
- W2510118238 hasRelatedWork W1985463782 @default.
- W2510118238 hasRelatedWork W2028431844 @default.
- W2510118238 hasRelatedWork W2139945125 @default.
- W2510118238 hasRelatedWork W2145487765 @default.
- W2510118238 hasRelatedWork W2153977007 @default.
- W2510118238 hasRelatedWork W2161370387 @default.
- W2510118238 hasRelatedWork W2164456230 @default.
- W2510118238 hasRelatedWork W2167011595 @default.
- W2510118238 hasRelatedWork W2169948249 @default.
- W2510118238 hasRelatedWork W2183969224 @default.
- W2510118238 hasRelatedWork W2479258387 @default.
- W2510118238 hasRelatedWork W2558583977 @default.
- W2510118238 hasRelatedWork W2891808172 @default.
- W2510118238 hasRelatedWork W2949770028 @default.
- W2510118238 hasRelatedWork W2971101812 @default.
- W2510118238 hasRelatedWork W3207840959 @default.
- W2510118238 hasRelatedWork W2185763298 @default.
- W2510118238 hasRelatedWork W2187540230 @default.
- W2510118238 hasRelatedWork W2556484267 @default.
- W2510118238 hasRelatedWork W3141997893 @default.
- W2510118238 isParatext "false" @default.
- W2510118238 isRetracted "false" @default.
- W2510118238 magId "2510118238" @default.
- W2510118238 workType "article" @default.
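
The abstract above mentions per-field text distance functions and a weighted summing of component similarities. A minimal sketch of that general idea follows, assuming a plain (non-learned) Levenshtein distance per field, hand-picked field weights, and a fixed decision threshold. The function names, weights, and threshold are hypothetical and for illustration only; this is not the paper's WCSS, SVM, or GP method, and it does not implement the learnable measures the abstract describes.

```python
from typing import Dict


def levenshtein(a: str, b: str) -> int:
    """Edit distance with unit cost for insert, delete, and substitute."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete from a
                            curr[j - 1] + 1,      # insert into a
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def field_similarity(a: str, b: str) -> float:
    """Map edit distance to a similarity in [0, 1] after light normalization."""
    a, b = a.strip().lower(), b.strip().lower()
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0  # two empty fields are treated as identical
    return 1.0 - levenshtein(a, b) / longest


def weighted_similarity(r1: Dict[str, str], r2: Dict[str, str],
                        weights: Dict[str, float]) -> float:
    """Weighted average of per-field similarities (weights are assumed, not learned)."""
    total = sum(weights.values())
    score = sum(w * field_similarity(r1.get(f, ""), r2.get(f, ""))
                for f, w in weights.items())
    return score / total


def is_duplicate(r1: Dict[str, str], r2: Dict[str, str],
                 weights: Dict[str, float], threshold: float = 0.85) -> bool:
    """Flag a pair as duplicate when the weighted score reaches the (assumed) threshold."""
    return weighted_similarity(r1, r2, weights) >= threshold


if __name__ == "__main__":
    rec_a = {"name": "Jon Smith", "city": "Boston"}
    rec_b = {"name": "John Smith", "city": "Boston"}
    field_weights = {"name": 0.7, "city": 0.3}  # hypothetical weights
    print(weighted_similarity(rec_a, rec_b, field_weights))
    print(is_duplicate(rec_a, rec_b, field_weights))
```

In a learned setting, the per-field weights and the distance costs would be fitted from labeled or bootstrapped pairs rather than fixed by hand as they are here.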