Matches in SemOpenAlex for { <https://semopenalex.org/work/W4280523009> ?p ?o ?g. }
- W4280523009 abstract "Taking domain knowledge into account is a long-standing issue in AI, especially nowadays where huge amounts of data are collected in the hope of delivering new insights and value. Let us consider the following scenario. Let D(y, x1, …, xn) be a dataset, Alice a data scientist, Bob a domain expert and y = f(x1, …, xn) a function known by Bob from his background knowledge. We are interested in the following simple yet crucial questions for Alice: how to define the satisfaction of f in D and how difficult is it to measure that satisfaction? It turns out that those problems are related to functional dependencies (FDs) and especially FD measurements used to quantify their satisfaction in a dataset such as the g3 indicator. In this paper, we examine the computation of g3 with crisp FDs (a.k.a. exact FDs) and a large class of non-crisp FDs replacing strict equality by more flexible predicates. Interestingly, it is known that the computation of g3 with crisp FDs is polynomial but turns out to be NP-Hard for non-crisp FDs. In this paper, we propose different exact and approximate solutions for the computation of g3 for both types. First, for crisp FDs with very large datasets, we propose solutions based on uniform and stratified random sampling. Second, for non-crisp FDs we present a detailed computation pipeline with various computation optimizations, including approximation algorithms and adaptations of recent developments in sublinear algorithms for NP-Hard problems. We also propose an in-depth experimental study of the algorithms presented in terms of time performances and approximation accuracy.
All the algorithms are also made available through FASTG3, an open-source Python library designed to be intuitive and efficient thanks to an underlying C++ implementation." @default.
- W4280523009 created "2022-05-22" @default.
- W4280523009 creator A5057464580 @default.
- W4280523009 creator A5063350327 @default.
- W4280523009 creator A5064659680 @default.
- W4280523009 date "2022-05-01" @default.
- W4280523009 modified "2023-10-01" @default.
- W4280523009 title "Assessing the Existence of a Function in a Dataset with the g3 Indicator" @default.
- W4280523009 cites W1539682630 @default.
- W4280523009 cites W1573493856 @default.
- W4280523009 cites W1605346166 @default.
- W4280523009 cites W1724849505 @default.
- W4280523009 cites W1971459440 @default.
- W4280523009 cites W1980155175 @default.
- W4280523009 cites W1994962776 @default.
- W4280523009 cites W2003665833 @default.
- W4280523009 cites W2027002571 @default.
- W4280523009 cites W2030970869 @default.
- W4280523009 cites W2044108498 @default.
- W4280523009 cites W2048450291 @default.
- W4280523009 cites W2051200809 @default.
- W4280523009 cites W2065528935 @default.
- W4280523009 cites W2102489964 @default.
- W4280523009 cites W2102998074 @default.
- W4280523009 cites W2103012681 @default.
- W4280523009 cites W2104266030 @default.
- W4280523009 cites W2109330224 @default.
- W4280523009 cites W2115500858 @default.
- W4280523009 cites W2119367950 @default.
- W4280523009 cites W2119885577 @default.
- W4280523009 cites W2120220749 @default.
- W4280523009 cites W2122448854 @default.
- W4280523009 cites W2133409729 @default.
- W4280523009 cites W2137118456 @default.
- W4280523009 cites W2143698439 @default.
- W4280523009 cites W2145937415 @default.
- W4280523009 cites W2153531471 @default.
- W4280523009 cites W2166549982 @default.
- W4280523009 cites W2186686397 @default.
- W4280523009 cites W2551370001 @default.
- W4280523009 cites W2732517469 @default.
- W4280523009 cites W2953253371 @default.
- W4280523009 cites W299839057 @default.
- W4280523009 cites W3010460567 @default.
- W4280523009 cites W3086465290 @default.
- W4280523009 cites W3185107419 @default.
- W4280523009 doi "https://doi.org/10.1109/icde53745.2022.00050" @default.
- W4280523009 hasPublicationYear "2022" @default.
- W4280523009 type Work @default.
- W4280523009 citedByCount "1" @default.
- W4280523009 countsByYear W42805230092023 @default.
- W4280523009 crossrefType "proceedings-article" @default.
- W4280523009 hasAuthorship W4280523009A5057464580 @default.
- W4280523009 hasAuthorship W4280523009A5063350327 @default.
- W4280523009 hasAuthorship W4280523009A5064659680 @default.
- W4280523009 hasBestOaLocation W42805230092 @default.
- W4280523009 hasConcept C11413529 @default.
- W4280523009 hasConcept C118615104 @default.
- W4280523009 hasConcept C121158502 @default.
- W4280523009 hasConcept C134306372 @default.
- W4280523009 hasConcept C14036430 @default.
- W4280523009 hasConcept C154945302 @default.
- W4280523009 hasConcept C161191863 @default.
- W4280523009 hasConcept C199360897 @default.
- W4280523009 hasConcept C2778222013 @default.
- W4280523009 hasConcept C33923547 @default.
- W4280523009 hasConcept C36503486 @default.
- W4280523009 hasConcept C41008148 @default.
- W4280523009 hasConcept C45374587 @default.
- W4280523009 hasConcept C78458016 @default.
- W4280523009 hasConcept C80444323 @default.
- W4280523009 hasConcept C86803240 @default.
- W4280523009 hasConceptScore W4280523009C11413529 @default.
- W4280523009 hasConceptScore W4280523009C118615104 @default.
- W4280523009 hasConceptScore W4280523009C121158502 @default.
- W4280523009 hasConceptScore W4280523009C134306372 @default.
- W4280523009 hasConceptScore W4280523009C14036430 @default.
- W4280523009 hasConceptScore W4280523009C154945302 @default.
- W4280523009 hasConceptScore W4280523009C161191863 @default.
- W4280523009 hasConceptScore W4280523009C199360897 @default.
- W4280523009 hasConceptScore W4280523009C2778222013 @default.
- W4280523009 hasConceptScore W4280523009C33923547 @default.
- W4280523009 hasConceptScore W4280523009C36503486 @default.
- W4280523009 hasConceptScore W4280523009C41008148 @default.
- W4280523009 hasConceptScore W4280523009C45374587 @default.
- W4280523009 hasConceptScore W4280523009C78458016 @default.
- W4280523009 hasConceptScore W4280523009C80444323 @default.
- W4280523009 hasConceptScore W4280523009C86803240 @default.
- W4280523009 hasLocation W42805230091 @default.
- W4280523009 hasLocation W42805230092 @default.
- W4280523009 hasLocation W42805230093 @default.
- W4280523009 hasLocation W42805230094 @default.
- W4280523009 hasLocation W42805230095 @default.
- W4280523009 hasLocation W42805230096 @default.
- W4280523009 hasOpenAccess W4280523009 @default.
- W4280523009 hasPrimaryLocation W42805230091 @default.
- W4280523009 hasRelatedWork W1572523360 @default.
- W4280523009 hasRelatedWork W1991466308 @default.
- W4280523009 hasRelatedWork W2362192218 @default.
- W4280523009 hasRelatedWork W2373204995 @default.
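
The abstract states that g3 for crisp FDs can be computed in polynomial time. As a rough illustration only, the sketch below uses the standard definition of g3 (the minimum fraction of tuples to remove so that the FD holds exactly): group rows by their x-values, keep the most frequent y in each group, and count the rest as violations. The function name `g3_crisp` and the row layout (a list of dicts) are assumptions made for this example; this is not the FASTG3 API and not necessarily the authors' implementation.

```python
from collections import Counter, defaultdict


def g3_crisp(rows, lhs, rhs):
    """Return g3 for the crisp FD lhs -> rhs over `rows` (a list of dicts).

    Hypothetical helper for illustration: g3 is the minimum fraction of
    rows that must be removed so that equal lhs values imply equal rhs.
    """
    if not rows:
        return 0.0  # convention assumed here: an empty dataset has no violations

    # For each distinct combination of lhs values, count the rhs values seen.
    groups = defaultdict(Counter)
    for row in rows:
        key = tuple(row[attr] for attr in lhs)
        groups[key][row[rhs]] += 1

    # In each group, the rows carrying the majority rhs value can be kept.
    kept = sum(max(counts.values()) for counts in groups.values())
    return 1.0 - kept / len(rows)


rows = [
    {"x1": 1, "x2": "a", "y": 10},
    {"x1": 1, "x2": "a", "y": 10},
    {"x1": 1, "x2": "a", "y": 20},  # disagrees with the two rows above
    {"x1": 2, "x2": "b", "y": 30},
]
print(g3_crisp(rows, ["x1", "x2"], "y"))  # 0.25: one of four rows must go
```

A single pass over the rows with hash-based grouping is enough here, which is consistent with the polynomial bound mentioned in the abstract. The non-crisp case, where equality is replaced by a more flexible predicate, does not decompose into independent groups this way.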