Matches in SemOpenAlex for { <https://semopenalex.org/work/W2950682718> ?p ?o ?g. }
Showing items 1 to 92 of 92 with 100 items per page.
- W2950682718 abstract "Organizations routinely accumulate semi-structured log datasets generated as the output of code; these datasets remain unused and uninterpreted, and occupy wasted space - this phenomenon has been colloquially referred to as the data lake problem. One approach to leverage these semi-structured datasets is to convert them into a structured relational format, following which they can be analyzed in conjunction with other datasets. We present Datamaran, a tool that extracts structure from semi-structured log datasets with no human supervision. Datamaran automatically identifies field and record endpoints, separates the structured parts from the unstructured noise or formatting, and can tease apart multiple structures from within a dataset, in order to efficiently extract structured relational datasets from semi-structured log datasets, at scale with high accuracy. Compared to other unsupervised log dataset extraction tools developed in prior work, Datamaran does not require the record boundaries to be known beforehand, making it much more applicable to the noisy log files that are ubiquitous in data lakes. Datamaran can successfully extract structured information from all datasets used in prior work, and can achieve 95% extraction accuracy on automatically collected log datasets from GitHub - a substantial 66% increase in accuracy compared to unsupervised schemes from prior work. Our user study further demonstrates that the extraction results of Datamaran are closer to the desired structure than competing algorithms." @default.
- W2950682718 created "2019-06-27" @default.
- W2950682718 creator A5013608601 @default.
- W2950682718 creator A5033996984 @default.
- W2950682718 creator A5090828085 @default.
- W2950682718 date "2017-08-29" @default.
- W2950682718 modified "2023-09-27" @default.
- W2950682718 title "Navigating the Data Lake with Datamaran: Automatically Extracting Structure from Log Datasets" @default.
- W2950682718 cites W1538375546 @default.
- W2950682718 cites W1553019137 @default.
- W2950682718 cites W1562942180 @default.
- W2950682718 cites W1982280055 @default.
- W2950682718 cites W200042785 @default.
- W2950682718 cites W2018506753 @default.
- W2950682718 cites W2023673418 @default.
- W2950682718 cites W2055228862 @default.
- W2950682718 cites W2064766209 @default.
- W2950682718 cites W2093559286 @default.
- W2950682718 cites W2093752301 @default.
- W2950682718 cites W2102098892 @default.
- W2950682718 cites W2103931177 @default.
- W2950682718 cites W2106950427 @default.
- W2950682718 cites W2108223890 @default.
- W2950682718 cites W2115461474 @default.
- W2950682718 cites W2124410446 @default.
- W2950682718 cites W2132525863 @default.
- W2950682718 cites W2135767707 @default.
- W2950682718 cites W2137435551 @default.
- W2950682718 cites W2144951274 @default.
- W2950682718 cites W2145007893 @default.
- W2950682718 cites W2146105230 @default.
- W2950682718 cites W2150721933 @default.
- W2950682718 cites W2164119735 @default.
- W2950682718 cites W2424304400 @default.
- W2950682718 cites W2438792749 @default.
- W2950682718 hasPublicationYear "2017" @default.
- W2950682718 type Work @default.
- W2950682718 sameAs 2950682718 @default.
- W2950682718 citedByCount "0" @default.
- W2950682718 crossrefType "posted-content" @default.
- W2950682718 hasAuthorship W2950682718A5013608601 @default.
- W2950682718 hasAuthorship W2950682718A5033996984 @default.
- W2950682718 hasAuthorship W2950682718A5090828085 @default.
- W2950682718 hasConcept C124101348 @default.
- W2950682718 hasConcept C153083717 @default.
- W2950682718 hasConcept C154945302 @default.
- W2950682718 hasConcept C177264268 @default.
- W2950682718 hasConcept C195807954 @default.
- W2950682718 hasConcept C199360897 @default.
- W2950682718 hasConcept C23123220 @default.
- W2950682718 hasConcept C2776760102 @default.
- W2950682718 hasConcept C40077939 @default.
- W2950682718 hasConcept C41008148 @default.
- W2950682718 hasConcept C5655090 @default.
- W2950682718 hasConceptScore W2950682718C124101348 @default.
- W2950682718 hasConceptScore W2950682718C153083717 @default.
- W2950682718 hasConceptScore W2950682718C154945302 @default.
- W2950682718 hasConceptScore W2950682718C177264268 @default.
- W2950682718 hasConceptScore W2950682718C195807954 @default.
- W2950682718 hasConceptScore W2950682718C199360897 @default.
- W2950682718 hasConceptScore W2950682718C23123220 @default.
- W2950682718 hasConceptScore W2950682718C2776760102 @default.
- W2950682718 hasConceptScore W2950682718C40077939 @default.
- W2950682718 hasConceptScore W2950682718C41008148 @default.
- W2950682718 hasConceptScore W2950682718C5655090 @default.
- W2950682718 hasLocation W29506827181 @default.
- W2950682718 hasOpenAccess W2950682718 @default.
- W2950682718 hasPrimaryLocation W29506827181 @default.
- W2950682718 hasRelatedWork W1171978009 @default.
- W2950682718 hasRelatedWork W191651381 @default.
- W2950682718 hasRelatedWork W2007059545 @default.
- W2950682718 hasRelatedWork W2042898098 @default.
- W2950682718 hasRelatedWork W2072495471 @default.
- W2950682718 hasRelatedWork W2169411532 @default.
- W2950682718 hasRelatedWork W2189575506 @default.
- W2950682718 hasRelatedWork W2345102896 @default.
- W2950682718 hasRelatedWork W2471982557 @default.
- W2950682718 hasRelatedWork W248898913 @default.
- W2950682718 hasRelatedWork W2529771924 @default.
- W2950682718 hasRelatedWork W2578054369 @default.
- W2950682718 hasRelatedWork W2589223628 @default.
- W2950682718 hasRelatedWork W3011610465 @default.
- W2950682718 hasRelatedWork W3042356186 @default.
- W2950682718 hasRelatedWork W3100214273 @default.
- W2950682718 hasRelatedWork W3198076218 @default.
- W2950682718 hasRelatedWork W70182023 @default.
- W2950682718 hasRelatedWork W2182615025 @default.
- W2950682718 hasRelatedWork W3085750834 @default.
- W2950682718 isParatext "false" @default.
- W2950682718 isRetracted "false" @default.
- W2950682718 magId "2950682718" @default.
- W2950682718 workType "article" @default.
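
The listing above can be read as one record per line: a subject, a predicate, an object (either a bare identifier or a quoted literal), and a graph name. As a rough sketch only, the Python below groups objects by predicate. The line shape, the `LINE` pattern, and the `parse_listing` helper are assumptions made for illustration from the lines shown here; they are not part of SemOpenAlex or any of its tooling.

```python
import re
from collections import defaultdict

# Assumed line shape (inferred from the listing, not from any specification):
#   - SUBJECT PREDICATE OBJECT @GRAPH.
# where OBJECT is a bare identifier or a double-quoted literal.
LINE = re.compile(r'^- (\S+) (\S+) (?:"(.*)"|(\S+)) @(\S+?)\.$')


def parse_listing(lines):
    """Group object values by predicate for every line that fits the shape."""
    by_predicate = defaultdict(list)
    for line in lines:
        match = LINE.match(line.strip())
        if match is None:
            continue  # skip headers and anything that does not fit the shape
        _subject, predicate, literal, identifier, _graph = match.groups()
        # Exactly one of literal / identifier is set, depending on the object form.
        by_predicate[predicate].append(literal if literal is not None else identifier)
    return dict(by_predicate)


sample = [
    '- W2950682718 created "2019-06-27" @default.',
    '- W2950682718 creator A5013608601 @default.',
    '- W2950682718 creator A5033996984 @default.',
    '- W2950682718 citedByCount "0" @default.',
]
record = parse_listing(sample)
print(record["creator"])
```

Multi-valued predicates such as `creator`, `cites`, and `hasRelatedWork` come back as lists, so a caller can count citations or related works with `len(...)` without special-casing single-valued fields.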