Matches in SemOpenAlex for { <https://semopenalex.org/work/W848556834> ?p ?o ?g. }
Showing items 1 to 67 of 67 with 100 items per page.
- W848556834 endingPage "152" @default.
- W848556834 startingPage "152" @default.
- W848556834 abstract "Provenance is metadata that describes the lineage of a data product. Lineage is invaluable in advancing the reuse and reproducibility of scientific results in e-Science. Through the availability of provenance, future researchers can make valid assessments of data quality or consider the trustworthiness of the data. The shift towards 'Big Data' has presented challenges in provenance driven by data volume and variety, and the need for making data more valuable and veracious. This dissertation examines provenance quality, capture, and representation particularly for highly voluminous provenance that occurs with growing frequency in large-scale science. This work has at its core a framework and methodology that identify three dimensions of provenance quality: correctness, completeness, and relevance. Based on the proposed quality dimensions, the framework supports provenance quality analysis at the node/edge, graph, and multi-graph levels, which includes analysis of annotations, timestamps and the structure of provenance traces. A supporting contribution is the design and generation of a pseudo-realistic provenance workload that consists of 48,000 provenance traces, forming a provenance database 10 Gigabytes in size. This workload is composed of provenance from 6 varied realistic workflows and includes a failure model that introduces several types of failures into provenance data including workflow executions that experienced failures and workflow executions that experienced faults in message passing communication between application and provenance system, the latter resulting in dropped provenance. Provenance in High Performance Computing is directly addressed with the design of a cache storage solution that supports multi-level provenance capture with minimum collection overhead. A distributed NoSQL database stores the collected provenance. Evaluation is carried out through experiments performed on two production systems at the National Energy Research Scientific Computing Center. The final contribution is in the experimental evaluation of two storage approaches for provenance, graph and relational databases, and the impact on retrieval for provenance specific realistic queries. Results carried out at scale and using real-world provenance traces show that graph databases are better suited for the retrieval of large provenance graphs by ID and relational databases provide a better option for provenance graphs that are of great depth in evaluated scenarios." @default.
- W848556834 created "2016-06-24" @default.
- W848556834 creator A5038012602 @default.
- W848556834 creator A5049801956 @default.
- W848556834 date "2014-01-01" @default.
- W848556834 modified "2023-09-23" @default.
- W848556834 title "Quality, retrieval and analysis of provenance in large-scale data" @default.
- W848556834 cites W2145791740 @default.
- W848556834 hasPublicationYear "2014" @default.
- W848556834 type Work @default.
- W848556834 sameAs 848556834 @default.
- W848556834 citedByCount "1" @default.
- W848556834 countsByYear W8485568342015 @default.
- W848556834 crossrefType "journal-article" @default.
- W848556834 hasAuthorship W848556834A5038012602 @default.
- W848556834 hasAuthorship W848556834A5049801956 @default.
- W848556834 hasConcept C124101348 @default.
- W848556834 hasConcept C127313418 @default.
- W848556834 hasConcept C136764020 @default.
- W848556834 hasConcept C177212765 @default.
- W848556834 hasConcept C23123220 @default.
- W848556834 hasConcept C2522767166 @default.
- W848556834 hasConcept C2780049196 @default.
- W848556834 hasConcept C41008148 @default.
- W848556834 hasConcept C5900021 @default.
- W848556834 hasConcept C77088390 @default.
- W848556834 hasConcept C93518851 @default.
- W848556834 hasConceptScore W848556834C124101348 @default.
- W848556834 hasConceptScore W848556834C127313418 @default.
- W848556834 hasConceptScore W848556834C136764020 @default.
- W848556834 hasConceptScore W848556834C177212765 @default.
- W848556834 hasConceptScore W848556834C23123220 @default.
- W848556834 hasConceptScore W848556834C2522767166 @default.
- W848556834 hasConceptScore W848556834C2780049196 @default.
- W848556834 hasConceptScore W848556834C41008148 @default.
- W848556834 hasConceptScore W848556834C5900021 @default.
- W848556834 hasConceptScore W848556834C77088390 @default.
- W848556834 hasConceptScore W848556834C93518851 @default.
- W848556834 hasLocation W8485568341 @default.
- W848556834 hasOpenAccess W848556834 @default.
- W848556834 hasPrimaryLocation W8485568341 @default.
- W848556834 hasRelatedWork W1538517548 @default.
- W848556834 hasRelatedWork W1550026538 @default.
- W848556834 hasRelatedWork W1997250112 @default.
- W848556834 hasRelatedWork W1997702077 @default.
- W848556834 hasRelatedWork W2007982454 @default.
- W848556834 hasRelatedWork W2027172230 @default.
- W848556834 hasRelatedWork W2032190725 @default.
- W848556834 hasRelatedWork W2095427462 @default.
- W848556834 hasRelatedWork W2115373755 @default.
- W848556834 hasRelatedWork W2377340254 @default.
- W848556834 hasRelatedWork W2497704682 @default.
- W848556834 hasRelatedWork W2738640851 @default.
- W848556834 hasRelatedWork W2743945610 @default.
- W848556834 hasRelatedWork W2766840679 @default.
- W848556834 hasRelatedWork W2766945679 @default.
- W848556834 hasRelatedWork W2890494195 @default.
- W848556834 hasRelatedWork W2972340514 @default.
- W848556834 hasRelatedWork W9937398 @default.
- W848556834 hasRelatedWork W2531316959 @default.
- W848556834 hasRelatedWork W3195916141 @default.
- W848556834 isParatext "false" @default.
- W848556834 isRetracted "false" @default.
- W848556834 magId "848556834" @default.
- W848556834 workType "article" @default.