Matches in SemOpenAlex for { <https://semopenalex.org/work/W3107002183> ?p ?o ?g. }
- W3107002183 abstract "With the rapid development of internet technology, dirty data are commonly observed in various real scenarios, e.g., owing to unreliable sensor readings, transmission, and collection from heterogeneous sources. To deal with their negative effects on downstream applications, data cleaning approaches are designed to preprocess the dirty data before the applications are run. The idea of most data cleaning methods is to identify or correct dirty data by referring to the values of their neighbors, which share the same information. Unfortunately, owing to data sparsity and heterogeneity, the number of neighbors based on the equality relationship is rather limited, especially in the presence of data values with variances. To tackle this problem, distance-based data cleaning approaches propose to consider similarity neighbors based on value distance. By tolerating small variations, enriched similarity neighbors can be identified and used for data cleaning tasks. At the same time, the distance relationship between tuples, which contains more information and includes the equality relationship, is also helpful in guiding data cleaning. Therefore, distance-based technology plays an important role in the data cleaning area, and we also have reason to believe that distance-based data cleaning technology will attract more attention in data preprocessing research in the future. Hence this survey provides a classification of four main data cleaning tasks, i.e., rule profiling, error detection, data repair and data imputation, and comprehensively reviews the state of the art for each class." @default.
- W3107002183 created "2020-12-07" @default.
- W3107002183 creator A5057209439 @default.
- W3107002183 creator A5065922956 @default.
- W3107002183 date "2020-11-23" @default.
- W3107002183 modified "2023-09-23" @default.
- W3107002183 title "Distance-based Data Cleaning: A Survey (Technical Report)." @default.
- W3107002183 cites W107081620 @default.
- W3107002183 cites W131681758 @default.
- W3107002183 cites W1502001434 @default.
- W3107002183 cites W1519775745 @default.
- W3107002183 cites W1552339598 @default.
- W3107002183 cites W1673310716 @default.
- W3107002183 cites W172453251 @default.
- W3107002183 cites W1781748254 @default.
- W3107002183 cites W1966836840 @default.
- W3107002183 cites W1985239372 @default.
- W3107002183 cites W1985258161 @default.
- W3107002183 cites W1987111416 @default.
- W3107002183 cites W1992673035 @default.
- W3107002183 cites W2002928429 @default.
- W3107002183 cites W2017508768 @default.
- W3107002183 cites W2017978889 @default.
- W3107002183 cites W2040808478 @default.
- W3107002183 cites W2045054164 @default.
- W3107002183 cites W2049058890 @default.
- W3107002183 cites W2055621992 @default.
- W3107002183 cites W2059009730 @default.
- W3107002183 cites W2061240327 @default.
- W3107002183 cites W2079223746 @default.
- W3107002183 cites W2079701324 @default.
- W3107002183 cites W2090132955 @default.
- W3107002183 cites W2099637074 @default.
- W3107002183 cites W2100986433 @default.
- W3107002183 cites W2108132403 @default.
- W3107002183 cites W2111043320 @default.
- W3107002183 cites W2119367950 @default.
- W3107002183 cites W2127218421 @default.
- W3107002183 cites W2127672769 @default.
- W3107002183 cites W2140215983 @default.
- W3107002183 cites W2142541325 @default.
- W3107002183 cites W2144182447 @default.
- W3107002183 cites W2145346822 @default.
- W3107002183 cites W2147805208 @default.
- W3107002183 cites W2160642098 @default.
- W3107002183 cites W2162449239 @default.
- W3107002183 cites W2164187405 @default.
- W3107002183 cites W2166549982 @default.
- W3107002183 cites W2167489506 @default.
- W3107002183 cites W2167546040 @default.
- W3107002183 cites W2169940602 @default.
- W3107002183 cites W2183774130 @default.
- W3107002183 cites W2186349203 @default.
- W3107002183 cites W2282784388 @default.
- W3107002183 cites W2298871042 @default.
- W3107002183 cites W2405686381 @default.
- W3107002183 cites W2421610675 @default.
- W3107002183 cites W2482320976 @default.
- W3107002183 cites W2500909930 @default.
- W3107002183 cites W2591700809 @default.
- W3107002183 cites W2604625137 @default.
- W3107002183 cites W2605029335 @default.
- W3107002183 cites W2616903276 @default.
- W3107002183 cites W2732517469 @default.
- W3107002183 cites W2748435103 @default.
- W3107002183 cites W2791676989 @default.
- W3107002183 cites W2901288224 @default.
- W3107002183 cites W2912031392 @default.
- W3107002183 cites W2918901610 @default.
- W3107002183 cites W2935765604 @default.
- W3107002183 cites W2951565755 @default.
- W3107002183 cites W3031359560 @default.
- W3107002183 cites W3080402700 @default.
- W3107002183 hasPublicationYear "2020" @default.
- W3107002183 type Work @default.
- W3107002183 sameAs 3107002183 @default.
- W3107002183 citedByCount "0" @default.
- W3107002183 crossrefType "posted-content" @default.
- W3107002183 hasAuthorship W3107002183A5057209439 @default.
- W3107002183 hasAuthorship W3107002183A5065922956 @default.
- W3107002183 hasConcept C103278499 @default.
- W3107002183 hasConcept C10551718 @default.
- W3107002183 hasConcept C105795698 @default.
- W3107002183 hasConcept C115961682 @default.
- W3107002183 hasConcept C118615104 @default.
- W3107002183 hasConcept C118930307 @default.
- W3107002183 hasConcept C119857082 @default.
- W3107002183 hasConcept C124101348 @default.
- W3107002183 hasConcept C133462117 @default.
- W3107002183 hasConcept C154945302 @default.
- W3107002183 hasConcept C2522767166 @default.
- W3107002183 hasConcept C33923547 @default.
- W3107002183 hasConcept C34736171 @default.
- W3107002183 hasConcept C41008148 @default.
- W3107002183 hasConcept C58041806 @default.
- W3107002183 hasConcept C9357733 @default.
- W3107002183 hasConceptScore W3107002183C103278499 @default.
- W3107002183 hasConceptScore W3107002183C10551718 @default.
- W3107002183 hasConceptScore W3107002183C105795698 @default.
- W3107002183 hasConceptScore W3107002183C115961682 @default.