Matches in SemOpenAlex for { <https://semopenalex.org/work/W4226173407> ?p ?o ?g. }
Showing items 1 to 63 of 63 with 100 items per page.
- W4226173407 abstract "Missing data is a fundamental obstacle in the practice of data science. This paper surveys a few conventions for imputation as available in the Automunge open source python library platform for tabular data preprocessing, including ML infill in which auto ML models are trained for target features from partitioned extracts of a training set. A series of validation experiments were performed to benchmark imputation scenarios towards downstream model performance, in which it was found for the given benchmark sets that in many cases ML infill outperformed for both numeric and categoric target features, and was otherwise at minimum within noise distributions of the other imputation scenarios. Evidence also suggested supplementing ML infill with the addition of support columns with boolean integer markers signaling presence of infill was usually beneficial to downstream model performance. We consider these results sufficient to recommend defaulting to ML infill for tabular learning, and further recommend supplementing imputations with support columns signaling presence of infill, each as can be prepared with push-button operation in the Automunge library. Our contributions include an auto ML derived missing data imputation library for tabular learning in the python ecosystem, fully integrated into a preprocessing platform with an extensive library of feature transformations, with a novel production friendly implementation that bases imputation models on a designated train set for consistent basis towards additional data." @default.
- W4226173407 created "2022-05-05" @default.
- W4226173407 creator A5009417935 @default.
- W4226173407 date "2022-02-18" @default.
- W4226173407 modified "2023-10-18" @default.
- W4226173407 title "Missing Data Infill with Automunge" @default.
- W4226173407 doi "https://doi.org/10.48550/arxiv.2202.09484" @default.
- W4226173407 hasPublicationYear "2022" @default.
- W4226173407 type Work @default.
- W4226173407 citedByCount "0" @default.
- W4226173407 crossrefType "posted-content" @default.
- W4226173407 hasAuthorship W4226173407A5009417935 @default.
- W4226173407 hasBestOaLocation W42261734071 @default.
- W4226173407 hasConcept C10551718 @default.
- W4226173407 hasConcept C119857082 @default.
- W4226173407 hasConcept C124101348 @default.
- W4226173407 hasConcept C127313418 @default.
- W4226173407 hasConcept C127413603 @default.
- W4226173407 hasConcept C13280743 @default.
- W4226173407 hasConcept C153180895 @default.
- W4226173407 hasConcept C154945302 @default.
- W4226173407 hasConcept C185798385 @default.
- W4226173407 hasConcept C199360897 @default.
- W4226173407 hasConcept C2781219549 @default.
- W4226173407 hasConcept C34736171 @default.
- W4226173407 hasConcept C41008148 @default.
- W4226173407 hasConcept C519991488 @default.
- W4226173407 hasConcept C58041806 @default.
- W4226173407 hasConcept C66938386 @default.
- W4226173407 hasConcept C9357733 @default.
- W4226173407 hasConceptScore W4226173407C10551718 @default.
- W4226173407 hasConceptScore W4226173407C119857082 @default.
- W4226173407 hasConceptScore W4226173407C124101348 @default.
- W4226173407 hasConceptScore W4226173407C127313418 @default.
- W4226173407 hasConceptScore W4226173407C127413603 @default.
- W4226173407 hasConceptScore W4226173407C13280743 @default.
- W4226173407 hasConceptScore W4226173407C153180895 @default.
- W4226173407 hasConceptScore W4226173407C154945302 @default.
- W4226173407 hasConceptScore W4226173407C185798385 @default.
- W4226173407 hasConceptScore W4226173407C199360897 @default.
- W4226173407 hasConceptScore W4226173407C2781219549 @default.
- W4226173407 hasConceptScore W4226173407C34736171 @default.
- W4226173407 hasConceptScore W4226173407C41008148 @default.
- W4226173407 hasConceptScore W4226173407C519991488 @default.
- W4226173407 hasConceptScore W4226173407C58041806 @default.
- W4226173407 hasConceptScore W4226173407C66938386 @default.
- W4226173407 hasConceptScore W4226173407C9357733 @default.
- W4226173407 hasLocation W42261734071 @default.
- W4226173407 hasOpenAccess W4226173407 @default.
- W4226173407 hasPrimaryLocation W42261734071 @default.
- W4226173407 hasRelatedWork W2022884247 @default.
- W4226173407 hasRelatedWork W2163930361 @default.
- W4226173407 hasRelatedWork W2412415446 @default.
- W4226173407 hasRelatedWork W2777581963 @default.
- W4226173407 hasRelatedWork W2784019465 @default.
- W4226173407 hasRelatedWork W2926599221 @default.
- W4226173407 hasRelatedWork W2997537609 @default.
- W4226173407 hasRelatedWork W4226173407 @default.
- W4226173407 hasRelatedWork W4237962661 @default.
- W4226173407 hasRelatedWork W53754145 @default.
- W4226173407 isParatext "false" @default.
- W4226173407 isRetracted "false" @default.
- W4226173407 workType "article" @default.