Matches in SemOpenAlex for { <https://semopenalex.org/work/W2810019908> ?p ?o ?g. }
- W2810019908 abstract "Author(s): Sun, Liwen | Advisor(s): Franklin, Michael J | Abstract: As data volumes continue to expand, analytics approaches that require exhaustively scanning data sets become untenable. For this reason, modern analytics systems employ data skipping techniques to avoid looking at large volumes of irrelevant data. By maintaining some metadata for each block of data, a query may skip a data block if the metadata indicates that the block does not contain relevant data. The effectiveness of data skipping, however, depends on how the underlying data are organized into blocks. In this dissertation, we propose a fine-grained data layout framework, called 'Generalized Skipping-Oriented Partitioning and Replication' (GSOP-R), which aims to maximize query performance through aggressive data skipping. Based on observations of real-world analytics workloads, we find that the workload patterns can be summarized as a succinct set of features. The GSOP-R framework uses these features to transform the incoming data into a small set of feature vectors, and then performs clustering algorithms using the feature vectors instead of the actual data. A resulting GSOP-R layout scheme is highly flexible. For instance, it allows different columns to be horizontally partitioned in different ways and supports replication of only parts of rows or columns. We developed several designs for GSOP-R on Apache Spark and Apache Parquet and then evaluated their performance using two public benchmarks and several real-world workloads. Our results show that GSOP-R can reduce the amount of data scanned and improve end-to-end query response times over the state-of-the-art techniques by a factor of 2 to 9." @default.
- W2810019908 created "2018-07-10" @default.
- W2810019908 creator A5087740768 @default.
- W2810019908 date "2017-01-01" @default.
- W2810019908 modified "2023-09-27" @default.
- W2810019908 title "Skipping-oriented Data Design for Large-Scale Analytics" @default.
- W2810019908 cites W1538267625 @default.
- W2810019908 cites W1786193469 @default.
- W2810019908 cites W1851390469 @default.
- W2810019908 cites W1967601791 @default.
- W2810019908 cites W1987748328 @default.
- W2810019908 cites W1997375126 @default.
- W2810019908 cites W2003580796 @default.
- W2810019908 cites W2006296837 @default.
- W2810019908 cites W2016381774 @default.
- W2810019908 cites W2035355147 @default.
- W2810019908 cites W2039910065 @default.
- W2810019908 cites W2063601856 @default.
- W2810019908 cites W2071989194 @default.
- W2810019908 cites W2075345089 @default.
- W2810019908 cites W2086959852 @default.
- W2810019908 cites W2105252819 @default.
- W2810019908 cites W2106867122 @default.
- W2810019908 cites W2110086534 @default.
- W2810019908 cites W2114854276 @default.
- W2810019908 cites W2117546628 @default.
- W2810019908 cites W2120970098 @default.
- W2810019908 cites W2121947440 @default.
- W2810019908 cites W2130598739 @default.
- W2810019908 cites W2131975293 @default.
- W2810019908 cites W2133741724 @default.
- W2810019908 cites W2137156898 @default.
- W2810019908 cites W2139072600 @default.
- W2810019908 cites W2141012957 @default.
- W2810019908 cites W2144839430 @default.
- W2810019908 cites W2145269645 @default.
- W2810019908 cites W2160342152 @default.
- W2810019908 cites W2165467455 @default.
- W2810019908 cites W2167631575 @default.
- W2810019908 cites W2169188831 @default.
- W2810019908 cites W2171543585 @default.
- W2810019908 cites W2182237911 @default.
- W2810019908 cites W2574861468 @default.
- W2810019908 cites W3027691100 @default.
- W2810019908 cites W3138367763 @default.
- W2810019908 hasPublicationYear "2017" @default.
- W2810019908 type Work @default.
- W2810019908 sameAs 2810019908 @default.
- W2810019908 citedByCount "0" @default.
- W2810019908 crossrefType "journal-article" @default.
- W2810019908 hasAuthorship W2810019908A5087740768 @default.
- W2810019908 hasConcept C105795698 @default.
- W2810019908 hasConcept C111919701 @default.
- W2810019908 hasConcept C124101348 @default.
- W2810019908 hasConcept C12590798 @default.
- W2810019908 hasConcept C135598885 @default.
- W2810019908 hasConcept C136764020 @default.
- W2810019908 hasConcept C154945302 @default.
- W2810019908 hasConcept C175801342 @default.
- W2810019908 hasConcept C177264268 @default.
- W2810019908 hasConcept C199360897 @default.
- W2810019908 hasConcept C23123220 @default.
- W2810019908 hasConcept C2524010 @default.
- W2810019908 hasConcept C2777210771 @default.
- W2810019908 hasConcept C2778476105 @default.
- W2810019908 hasConcept C2781215313 @default.
- W2810019908 hasConcept C33923547 @default.
- W2810019908 hasConcept C41008148 @default.
- W2810019908 hasConcept C58489278 @default.
- W2810019908 hasConcept C73555534 @default.
- W2810019908 hasConcept C75684735 @default.
- W2810019908 hasConcept C77088390 @default.
- W2810019908 hasConcept C79158427 @default.
- W2810019908 hasConcept C93518851 @default.
- W2810019908 hasConceptScore W2810019908C105795698 @default.
- W2810019908 hasConceptScore W2810019908C111919701 @default.
- W2810019908 hasConceptScore W2810019908C124101348 @default.
- W2810019908 hasConceptScore W2810019908C12590798 @default.
- W2810019908 hasConceptScore W2810019908C135598885 @default.
- W2810019908 hasConceptScore W2810019908C136764020 @default.
- W2810019908 hasConceptScore W2810019908C154945302 @default.
- W2810019908 hasConceptScore W2810019908C175801342 @default.
- W2810019908 hasConceptScore W2810019908C177264268 @default.
- W2810019908 hasConceptScore W2810019908C199360897 @default.
- W2810019908 hasConceptScore W2810019908C23123220 @default.
- W2810019908 hasConceptScore W2810019908C2524010 @default.
- W2810019908 hasConceptScore W2810019908C2777210771 @default.
- W2810019908 hasConceptScore W2810019908C2778476105 @default.
- W2810019908 hasConceptScore W2810019908C2781215313 @default.
- W2810019908 hasConceptScore W2810019908C33923547 @default.
- W2810019908 hasConceptScore W2810019908C41008148 @default.
- W2810019908 hasConceptScore W2810019908C58489278 @default.
- W2810019908 hasConceptScore W2810019908C73555534 @default.
- W2810019908 hasConceptScore W2810019908C75684735 @default.
- W2810019908 hasConceptScore W2810019908C77088390 @default.
- W2810019908 hasConceptScore W2810019908C79158427 @default.
- W2810019908 hasConceptScore W2810019908C93518851 @default.
- W2810019908 hasLocation W28100199081 @default.
- W2810019908 hasOpenAccess W2810019908 @default.
- W2810019908 hasPrimaryLocation W28100199081 @default.
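
The abstract recorded above describes data skipping: each block of data carries some metadata, and a query skips a block when that metadata shows the block cannot contain relevant rows. The following is a minimal Python sketch of that idea only, not the GSOP-R implementation. It assumes the per-block metadata is a min/max range over a single integer column (the abstract says only "some metadata"), and it uses plain sorting as a stand-in for the feature-based clustering the dissertation proposes. The names `Block`, `make_blocks`, and `range_query` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Block:
    """A block of rows plus the metadata used for skipping (assumed: min/max)."""
    rows: List[int]
    min_val: int
    max_val: int


def make_blocks(values: List[int], block_size: int) -> List[Block]:
    """Split values into fixed-size blocks and record min/max per block."""
    blocks = []
    for start in range(0, len(values), block_size):
        chunk = values[start:start + block_size]
        blocks.append(Block(chunk, min(chunk), max(chunk)))
    return blocks


def range_query(blocks: List[Block], lo: int, hi: int) -> Tuple[List[int], int]:
    """Return values in [lo, hi] and the number of blocks actually scanned."""
    matches: List[int] = []
    scanned = 0
    for block in blocks:
        # Skip the block when its min/max range cannot overlap [lo, hi].
        if block.max_val < lo or block.min_val > hi:
            continue
        scanned += 1
        matches.extend(v for v in block.rows if lo <= v <= hi)
    return matches, scanned


if __name__ == "__main__":
    data = [1, 90, 2, 91, 3, 92, 4, 93]
    interleaved = make_blocks(data, block_size=4)
    clustered = make_blocks(sorted(data), block_size=4)
    print(range_query(interleaved, 1, 4))  # both blocks overlap the range
    print(range_query(clustered, 1, 4))    # the second block is skipped
```

Running the same query against the two layouts illustrates the abstract's point that skipping effectiveness depends on how rows are organized into blocks: the interleaved layout forces both blocks to be scanned, while the clustered layout lets the second block be skipped.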