Matches in SemOpenAlex for { <https://semopenalex.org/work/W2000809552> ?p ?o ?g. }
- W2000809552 abstract "A lot of systems and applications are data-driven, and the correctness of their operation relies heavily on the correctness of their data. While existing data cleaning techniques can be quite effective at purging datasets of errors, they disregard the fact that a lot of errors are systematic, inherent to the process that produces the data, and thus will keep occurring unless the problem is corrected at its source. In contrast to traditional data cleaning, in this paper we focus on data diagnosis: explaining where and how the errors happen in a data generative process. We develop a large-scale diagnostic framework called DATA X-RAY. Our contributions are three-fold. First, we transform the diagnosis problem to the problem of finding common properties among erroneous elements, with minimal domain-specific assumptions. Second, we use Bayesian analysis to derive a cost model that implements three intuitive principles of good diagnoses. Third, we design an efficient, highly-parallelizable algorithm for performing data diagnosis on large-scale data. We evaluate our cost model and algorithm using both real-world and synthetic data, and show that our diagnostic framework produces better diagnoses and is orders of magnitude more efficient than existing techniques." @default.
- W2000809552 created "2016-06-24" @default.
- W2000809552 creator A5001402526 @default.
- W2000809552 creator A5019048013 @default.
- W2000809552 creator A5082396860 @default.
- W2000809552 date "2015-05-27" @default.
- W2000809552 modified "2023-10-12" @default.
- W2000809552 title "Data X-Ray" @default.
- W2000809552 cites W1565102206 @default.
- W2000809552 cites W1981873012 @default.
- W2000809552 cites W1983178058 @default.
- W2000809552 cites W1985694287 @default.
- W2000809552 cites W1988545508 @default.
- W2000809552 cites W1996430422 @default.
- W2000809552 cites W2004915807 @default.
- W2000809552 cites W2007682403 @default.
- W2000809552 cites W2008896880 @default.
- W2000809552 cites W2016753842 @default.
- W2000809552 cites W2022953760 @default.
- W2000809552 cites W2023052779 @default.
- W2000809552 cites W2024834471 @default.
- W2000809552 cites W2028438572 @default.
- W2000809552 cites W2044469685 @default.
- W2000809552 cites W2047182010 @default.
- W2000809552 cites W2054967933 @default.
- W2000809552 cites W2063103859 @default.
- W2000809552 cites W2074663022 @default.
- W2000809552 cites W2087183379 @default.
- W2000809552 cites W2088958550 @default.
- W2000809552 cites W2089634871 @default.
- W2000809552 cites W2096544401 @default.
- W2000809552 cites W2107080109 @default.
- W2000809552 cites W2110411158 @default.
- W2000809552 cites W2112840274 @default.
- W2000809552 cites W2115010808 @default.
- W2000809552 cites W2124026371 @default.
- W2000809552 cites W2127767383 @default.
- W2000809552 cites W2133409729 @default.
- W2000809552 cites W2133986470 @default.
- W2000809552 cites W2133990480 @default.
- W2000809552 cites W2135046866 @default.
- W2000809552 cites W2144767994 @default.
- W2000809552 cites W2149706766 @default.
- W2000809552 cites W2155160033 @default.
- W2000809552 cites W2157054705 @default.
- W2000809552 cites W2164334663 @default.
- W2000809552 cites W2167541073 @default.
- W2000809552 cites W2169585110 @default.
- W2000809552 cites W2171332293 @default.
- W2000809552 cites W2261544779 @default.
- W2000809552 cites W2293299776 @default.
- W2000809552 cites W2402668406 @default.
- W2000809552 cites W2625773614 @default.
- W2000809552 cites W3123128004 @default.
- W2000809552 cites W333550619 @default.
- W2000809552 cites W4250102289 @default.
- W2000809552 doi "https://doi.org/10.1145/2723372.2750549" @default.
- W2000809552 hasPublicationYear "2015" @default.
- W2000809552 type Work @default.
- W2000809552 sameAs 2000809552 @default.
- W2000809552 citedByCount "69" @default.
- W2000809552 countsByYear W20008095522015 @default.
- W2000809552 countsByYear W20008095522016 @default.
- W2000809552 countsByYear W20008095522017 @default.
- W2000809552 countsByYear W20008095522018 @default.
- W2000809552 countsByYear W20008095522019 @default.
- W2000809552 countsByYear W20008095522020 @default.
- W2000809552 countsByYear W20008095522021 @default.
- W2000809552 countsByYear W20008095522022 @default.
- W2000809552 countsByYear W20008095522023 @default.
- W2000809552 crossrefType "proceedings-article" @default.
- W2000809552 hasAuthorship W2000809552A5001402526 @default.
- W2000809552 hasAuthorship W2000809552A5019048013 @default.
- W2000809552 hasAuthorship W2000809552A5082396860 @default.
- W2000809552 hasBestOaLocation W20008095521 @default.
- W2000809552 hasConcept C111919701 @default.
- W2000809552 hasConcept C11413529 @default.
- W2000809552 hasConcept C120665830 @default.
- W2000809552 hasConcept C121332964 @default.
- W2000809552 hasConcept C124101348 @default.
- W2000809552 hasConcept C134306372 @default.
- W2000809552 hasConcept C142724271 @default.
- W2000809552 hasConcept C148047603 @default.
- W2000809552 hasConcept C192209626 @default.
- W2000809552 hasConcept C2778755073 @default.
- W2000809552 hasConcept C33923547 @default.
- W2000809552 hasConcept C36503486 @default.
- W2000809552 hasConcept C41008148 @default.
- W2000809552 hasConcept C534262118 @default.
- W2000809552 hasConcept C55439883 @default.
- W2000809552 hasConcept C62520636 @default.
- W2000809552 hasConcept C71924100 @default.
- W2000809552 hasConcept C98045186 @default.
- W2000809552 hasConceptScore W2000809552C111919701 @default.
- W2000809552 hasConceptScore W2000809552C11413529 @default.
- W2000809552 hasConceptScore W2000809552C120665830 @default.
- W2000809552 hasConceptScore W2000809552C121332964 @default.
- W2000809552 hasConceptScore W2000809552C124101348 @default.
- W2000809552 hasConceptScore W2000809552C134306372 @default.
- W2000809552 hasConceptScore W2000809552C142724271 @default.