Matches in SemOpenAlex for { <https://semopenalex.org/work/W2914585587> ?p ?o ?g. }
Showing items 1 to 74 of 74 with 100 items per page.
- W2914585587 abstract "As one critical task in the data analysis pipeline, data cleaning is notoriously human labor-intensive and error-prone. Knowledge base-assisted data cleaning has proved a powerful tool for finding and fixing data defects; however, its applicability is inevitably bounded by the natural limitations of knowledge bases. Meanwhile, although a vast number of knowledge sources exist in the form of free-text corpora (e.g., Wikipedia), transforming them into formats usable by existing data cleaning tools can be prohibitively costly and error-prone, if not at all impossible. Here, we present DeepClean, the first end-to-end data cleaning framework powered by free-text knowledge sources. At a high level, DeepClean leverages a knowledge source through its question-answering (QA) interface and achieves high-quality cleaning via iterative question asking. Specifically, DeepClean detects and repairs data defects in three stages: (i) Pattern extraction - it automatically discovers the semantic types of the data attributes as well as their correlations; (ii) Question generation - it translates each data tuple into a minimal set of validation questions; (iii) Completion and repair - by checking the answers returned by the knowledge source against the data values, it identifies erroneous cases and suggests possible fixes. Through extensive empirical studies, we demonstrate that DeepClean is applicable to a range of domains, and can effectively repair a variety of data defects, highlighting data cleaning powered by free-text knowledge sources as a promising direction for future research." @default.
- W2914585587 created "2019-02-21" @default.
- W2914585587 creator A5022999126 @default.
- W2914585587 creator A5039596514 @default.
- W2914585587 creator A5067516624 @default.
- W2914585587 creator A5068080767 @default.
- W2914585587 date "2018-10-01" @default.
- W2914585587 modified "2023-09-27" @default.
- W2914585587 title "DeepClean: Data Cleaning via Question Asking" @default.
- W2914585587 cites W140075217 @default.
- W2914585587 cites W1544827683 @default.
- W2914585587 cites W1793121960 @default.
- W2914585587 cites W1823076153 @default.
- W2914585587 cites W1970544520 @default.
- W2914585587 cites W1992479406 @default.
- W2914585587 cites W2020278455 @default.
- W2914585587 cites W2044469685 @default.
- W2914585587 cites W2063103859 @default.
- W2914585587 cites W2078132546 @default.
- W2914585587 cites W2089206172 @default.
- W2914585587 cites W2092364718 @default.
- W2914585587 cites W2094728533 @default.
- W2914585587 cites W2106895292 @default.
- W2914585587 cites W2111869785 @default.
- W2914585587 cites W2122865749 @default.
- W2914585587 cites W2131774270 @default.
- W2914585587 cites W2142472956 @default.
- W2914585587 cites W2161163216 @default.
- W2914585587 cites W2164187405 @default.
- W2914585587 cites W2167333415 @default.
- W2914585587 cites W2183774130 @default.
- W2914585587 cites W2250539671 @default.
- W2914585587 cites W2251913848 @default.
- W2914585587 cites W2427527485 @default.
- W2914585587 cites W2591700809 @default.
- W2914585587 cites W2626154462 @default.
- W2914585587 cites W2962985038 @default.
- W2914585587 cites W2964121744 @default.
- W2914585587 cites W2964212344 @default.
- W2914585587 cites W2964308564 @default.
- W2914585587 cites W3101556001 @default.
- W2914585587 cites W3104486441 @default.
- W2914585587 cites W86887328 @default.
- W2914585587 cites W2123340686 @default.
- W2914585587 doi "https://doi.org/10.1109/dsaa.2018.00039" @default.
- W2914585587 hasPublicationYear "2018" @default.
- W2914585587 type Work @default.
- W2914585587 sameAs 2914585587 @default.
- W2914585587 citedByCount "1" @default.
- W2914585587 countsByYear W29145855872022 @default.
- W2914585587 crossrefType "proceedings-article" @default.
- W2914585587 hasAuthorship W2914585587A5022999126 @default.
- W2914585587 hasAuthorship W2914585587A5039596514 @default.
- W2914585587 hasAuthorship W2914585587A5067516624 @default.
- W2914585587 hasAuthorship W2914585587A5068080767 @default.
- W2914585587 hasConcept C41008148 @default.
- W2914585587 hasConceptScore W2914585587C41008148 @default.
- W2914585587 hasLocation W29145855871 @default.
- W2914585587 hasOpenAccess W2914585587 @default.
- W2914585587 hasPrimaryLocation W29145855871 @default.
- W2914585587 hasRelatedWork W2049775471 @default.
- W2914585587 hasRelatedWork W2093578348 @default.
- W2914585587 hasRelatedWork W2350741829 @default.
- W2914585587 hasRelatedWork W2358668433 @default.
- W2914585587 hasRelatedWork W2376932109 @default.
- W2914585587 hasRelatedWork W2382290278 @default.
- W2914585587 hasRelatedWork W2390279801 @default.
- W2914585587 hasRelatedWork W2748952813 @default.
- W2914585587 hasRelatedWork W2899084033 @default.
- W2914585587 hasRelatedWork W3004735627 @default.
- W2914585587 isParatext "false" @default.
- W2914585587 isRetracted "false" @default.
- W2914585587 magId "2914585587" @default.
- W2914585587 workType "article" @default.
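
Each row above pairs the subject W2914585587 with a predicate (?p), an object (?o), and a graph (?g), following the pattern in the query at the top. As a minimal sketch, assuming the rows keep exactly the textual shape shown here ("- subject predicate object @graph."), they could be read back into tuples as follows. The names Row, ROW_RE, and parse_row are hypothetical helpers introduced for illustration; they are not part of SemOpenAlex or any library.

```python
import re
from typing import NamedTuple, Optional


class Row(NamedTuple):
    """One listing row: subject, predicate, object, graph (hypothetical type)."""
    subject: str
    predicate: str
    obj: str
    graph: str


# Assumed row shape: "- <subject> <predicate> <object> @<graph>."
# The object may be a quoted literal containing spaces, so it is matched greedily.
ROW_RE = re.compile(r'^- (\S+) (\S+) (.+) @(\S+)\.$')


def parse_row(line: str) -> Optional[Row]:
    """Return a Row for a listing line, or None if the line is not a row."""
    match = ROW_RE.match(line.strip())
    if match is None:
        return None
    subject, predicate, obj, graph = match.groups()
    # Drop the surrounding double quotes of a literal value, if present.
    if len(obj) >= 2 and obj[0] == '"' and obj[-1] == '"':
        obj = obj[1:-1]
    return Row(subject, predicate, obj, graph)


# A few rows copied from the listing above.
sample = [
    '- W2914585587 title "DeepClean: Data Cleaning via Question Asking" @default.',
    '- W2914585587 cites W140075217 @default.',
    '- W2914585587 hasPublicationYear "2018" @default.',
]

rows = [parse_row(line) for line in sample]
cited = [row.obj for row in rows if row is not None and row.predicate == "cites"]
```

Grouping the parsed rows by predicate (for example, collecting every "cites" object) recovers the per-property view of the record. Lines that are not rows, such as the pagination line, yield None and can be skipped.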