Matches in SemOpenAlex for { <https://semopenalex.org/work/W2201763571> ?p ?o ?g. }
Showing items 1 to 74 of 74 with 100 items per page.
- W2201763571 abstract "In the era of data-intensive scientific discovery, data analysis is critical for scientists to identify essential information from the mountains of data generated by large-scale simulations or experiments. A generic operation in scientific data analysis is to combine information from multiple data sets, which are stored in heterogeneous file formats. This operation is typically known as a Join in database management field. Currently, a join operation involving multiple data sets in different file formats is time-consuming because of the need to prepare data (i.e., to convert data into a uniform format or to ingest into a database) and to run the join algorithms. Furthermore, data processing languages, such as SQL (Structured Query Language), can not easily express typical scientific analysis tasks such as interpolation. In this paper, we propose three techniques to address these challenges: a two-level data model to process data from different file formats without converting to a uniform format, a data organization structure known as Multi-Dimensional Binning (MDBin), and a join processing algorithm known as Spatially Clustered Join (SCJoin). Together, these techniques allow scientific data files to be used for query processing with less I/O cost and fast query response time without the extra cost to perform file format conversion and data ingestion. Evaluation of our proposed techniques in joining and interpolating data sets generated by a plasma physics simulation studying space weather phenomenon showed up to 8X improvement over FastQuery. Querying with our solution outperforms SciDB, a popular array data management system for scientific data, by 43X-143X. We also demonstrate that our methods scale to 64K CPU cores in analyzing 32TB data on a large-scale supercomputing system." @default.
- W2201763571 created "2016-06-24" @default.
- W2201763571 creator A5042447742 @default.
- W2201763571 creator A5043129695 @default.
- W2201763571 creator A5062233562 @default.
- W2201763571 date "2015-10-01" @default.
- W2201763571 modified "2023-10-16" @default.
- W2201763571 title "Spatially clustered join on heterogeneous scientific data sets" @default.
- W2201763571 cites W1569878055 @default.
- W2201763571 cites W1961332808 @default.
- W2201763571 cites W1986056848 @default.
- W2201763571 cites W1993172344 @default.
- W2201763571 cites W2002779106 @default.
- W2201763571 cites W2013896112 @default.
- W2201763571 cites W2014830756 @default.
- W2201763571 cites W2021593960 @default.
- W2201763571 cites W2032756564 @default.
- W2201763571 cites W2036090066 @default.
- W2201763571 cites W2048026762 @default.
- W2201763571 cites W2053087996 @default.
- W2201763571 cites W2060098952 @default.
- W2201763571 cites W2084757499 @default.
- W2201763571 cites W2124105793 @default.
- W2201763571 cites W2138143219 @default.
- W2201763571 cites W2140980002 @default.
- W2201763571 cites W2155603580 @default.
- W2201763571 cites W2156077349 @default.
- W2201763571 cites W2161692763 @default.
- W2201763571 cites W2295596515 @default.
- W2201763571 cites W2413469520 @default.
- W2201763571 cites W2498603544 @default.
- W2201763571 cites W4247616591 @default.
- W2201763571 doi "https://doi.org/10.1109/bigdata.2015.7363778" @default.
- W2201763571 hasPublicationYear "2015" @default.
- W2201763571 type Work @default.
- W2201763571 sameAs 2201763571 @default.
- W2201763571 citedByCount "6" @default.
- W2201763571 countsByYear W22017635712016 @default.
- W2201763571 countsByYear W22017635712017 @default.
- W2201763571 countsByYear W22017635712019 @default.
- W2201763571 countsByYear W22017635712020 @default.
- W2201763571 crossrefType "proceedings-article" @default.
- W2201763571 hasAuthorship W2201763571A5042447742 @default.
- W2201763571 hasAuthorship W2201763571A5043129695 @default.
- W2201763571 hasAuthorship W2201763571A5062233562 @default.
- W2201763571 hasBestOaLocation W22017635712 @default.
- W2201763571 hasConcept C114614502 @default.
- W2201763571 hasConcept C2776124973 @default.
- W2201763571 hasConcept C33923547 @default.
- W2201763571 hasConcept C41008148 @default.
- W2201763571 hasConcept C80444323 @default.
- W2201763571 hasConceptScore W2201763571C114614502 @default.
- W2201763571 hasConceptScore W2201763571C2776124973 @default.
- W2201763571 hasConceptScore W2201763571C33923547 @default.
- W2201763571 hasConceptScore W2201763571C41008148 @default.
- W2201763571 hasConceptScore W2201763571C80444323 @default.
- W2201763571 hasLocation W22017635711 @default.
- W2201763571 hasLocation W22017635712 @default.
- W2201763571 hasOpenAccess W2201763571 @default.
- W2201763571 hasPrimaryLocation W22017635711 @default.
- W2201763571 hasRelatedWork W160933117 @default.
- W2201763571 hasRelatedWork W189327844 @default.
- W2201763571 hasRelatedWork W1969186495 @default.
- W2201763571 hasRelatedWork W1971287449 @default.
- W2201763571 hasRelatedWork W2355525742 @default.
- W2201763571 hasRelatedWork W2373422372 @default.
- W2201763571 hasRelatedWork W2375916644 @default.
- W2201763571 hasRelatedWork W2392825881 @default.
- W2201763571 hasRelatedWork W2795113968 @default.
- W2201763571 hasRelatedWork W4234476925 @default.
- W2201763571 isParatext "false" @default.
- W2201763571 isRetracted "false" @default.
- W2201763571 magId "2201763571" @default.
- W2201763571 workType "article" @default.
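
The abstract above describes joining data sets by organizing them into multi-dimensional bins and clustering the join spatially. The record gives no algorithmic detail, so the following is only a minimal sketch of the general idea of a grid-binned join, not the paper's MDBin or SCJoin. The names `bin_key` and `binned_join`, the `bin_size` parameter, and the sample data are all hypothetical and chosen for illustration.

```python
from collections import defaultdict
from math import floor


def bin_key(coords, bin_size):
    """Map a coordinate tuple to the integer index of its grid cell."""
    return tuple(floor(c / bin_size) for c in coords)


def binned_join(left, right, bin_size):
    """Pair records from two data sets whose coordinates share a grid cell.

    left, right: iterables of (coords, value), where coords is a tuple of floats.
    Returns a list of (left_value, right_value) pairs.
    Illustrative only; this is not the SCJoin algorithm from the paper.
    """
    # Group the right-hand records by grid cell once.
    bins = defaultdict(list)
    for coords, value in right:
        bins[bin_key(coords, bin_size)].append(value)

    # Probe only the matching cell for each left-hand record.
    pairs = []
    for coords, value in left:
        for other in bins.get(bin_key(coords, bin_size), []):
            pairs.append((value, other))
    return pairs


# Hypothetical sample data: particle positions and field samples in 2D.
particles = [((0.2, 0.7), "p0"), ((1.5, 0.1), "p1"), ((3.9, 3.9), "p2")]
field = [((0.5, 0.5), "f00"), ((1.5, 0.5), "f10"), ((2.5, 2.5), "f22")]
result = binned_join(particles, field, bin_size=1.0)
```

Grouping one side by cell first means each record on the other side is compared only against records in its own cell rather than against the whole data set. A real system would also need to handle neighbouring cells and on-disk layout, which this sketch leaves out.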