Matches in SemOpenAlex for { <https://semopenalex.org/work/W2912256776> ?p ?o ?g. }
- W2912256776 endingPage "44" @default.
- W2912256776 startingPage "1" @default.
- W2912256776 abstract "Parallel dataflow engines such as Apache Hadoop, Apache Spark, and Apache Flink are an established alternative to relational databases for modern data analysis applications. A characteristic of these systems is a scalable programming model based on distributed collections and parallel transformations expressed by means of second-order functions such as map and reduce. Notable examples are Flink’s DataSet and Spark’s RDD programming abstractions. These programming models are realized as EDSLs—domain specific languages embedded in a general-purpose host language such as Java, Scala, or Python. This approach has several advantages over traditional external DSLs such as SQL or XQuery. First, syntactic constructs from the host language (e.g., anonymous functions syntax, value definitions, and fluent syntax via method chaining) can be reused in the EDSL. This eases the learning curve for developers already familiar with the host language. Second, it allows for seamless integration of library methods written in the host language via the function parameters passed to the parallel dataflow operators. This reduces the effort for developing analytics dataflows that go beyond pure SQL and require domain-specific logic. At the same time, however, state-of-the-art parallel dataflow EDSLs exhibit a number of shortcomings. First, one of the main advantages of an external DSL such as SQL—the high-level, declarative Select-From-Where syntax—is either lost completely or mimicked in a non-standard way. Second, execution aspects such as caching, join order, and partial aggregation have to be decided by the programmer. Optimizing them automatically is very difficult due to the limited program context available in the intermediate representation of the DSL. In this article, we argue that the limitations listed above are a side effect of the adopted type-based embedding approach. As a solution, we propose an alternative EDSL design based on quotations. 
We present a DSL embedded in Scala and discuss its compiler pipeline, intermediate representation, and some of the enabled optimizations. We promote the algebraic type of bags in union representation as a model for distributed collections and its associated structural recursion scheme and monad as a model for parallel collection processing. At the source code level, Scala’s comprehension syntax over a bag monad can be used to encode Select-From-Where expressions in a standard way. At the intermediate representation level, maintaining comprehensions as a first-class citizen can be used to simplify the design and implementation of holistic dataflow optimizations that accommodate nesting and control flow. The proposed DSL design therefore reconciles the benefits of embedded parallel dataflow DSLs with the declarativity and optimization potential of external DSLs like SQL." @default.
- W2912256776 created "2019-02-21" @default.
- W2912256776 creator A5002413906 @default.
- W2912256776 creator A5050846585 @default.
- W2912256776 creator A5061693378 @default.
- W2912256776 date "2019-01-29" @default.
- W2912256776 modified "2023-10-01" @default.
- W2912256776 title "Representations and Optimizations for Embedded Parallel Dataflow Languages" @default.
- W2912256776 cites W1556345310 @default.
- W2912256776 cites W1561487379 @default.
- W2912256776 cites W1575324001 @default.
- W2912256776 cites W1966981171 @default.
- W2912256776 cites W1969621165 @default.
- W2912256776 cites W1970372442 @default.
- W2912256776 cites W1978924650 @default.
- W2912256776 cites W1995041765 @default.
- W2912256776 cites W1995618084 @default.
- W2912256776 cites W1997143185 @default.
- W2912256776 cites W2007074710 @default.
- W2912256776 cites W2007397391 @default.
- W2912256776 cites W2012056301 @default.
- W2912256776 cites W2026049208 @default.
- W2912256776 cites W2036971997 @default.
- W2912256776 cites W2038412523 @default.
- W2912256776 cites W2041920040 @default.
- W2912256776 cites W2045380064 @default.
- W2912256776 cites W2060280513 @default.
- W2912256776 cites W2088675571 @default.
- W2912256776 cites W2098935637 @default.
- W2912256776 cites W2100830825 @default.
- W2912256776 cites W2108996071 @default.
- W2912256776 cites W2110086534 @default.
- W2912256776 cites W2112866468 @default.
- W2912256776 cites W2114409719 @default.
- W2912256776 cites W2117362077 @default.
- W2912256776 cites W2117818027 @default.
- W2912256776 cites W2119871735 @default.
- W2912256776 cites W2122339407 @default.
- W2912256776 cites W2122988820 @default.
- W2912256776 cites W2130872776 @default.
- W2912256776 cites W2143609451 @default.
- W2912256776 cites W2146620757 @default.
- W2912256776 cites W2149127686 @default.
- W2912256776 cites W2151251992 @default.
- W2912256776 cites W2152149667 @default.
- W2912256776 cites W2153329411 @default.
- W2912256776 cites W2154697693 @default.
- W2912256776 cites W2170616854 @default.
- W2912256776 cites W2185907055 @default.
- W2912256776 cites W2247317079 @default.
- W2912256776 cites W2295608275 @default.
- W2912256776 cites W2295914203 @default.
- W2912256776 cites W2339920159 @default.
- W2912256776 cites W2751351862 @default.
- W2912256776 cites W3005057573 @default.
- W2912256776 cites W4243105236 @default.
- W2912256776 doi "https://doi.org/10.1145/3281629" @default.
- W2912256776 hasPublicationYear "2019" @default.
- W2912256776 type Work @default.
- W2912256776 sameAs 2912256776 @default.
- W2912256776 citedByCount "11" @default.
- W2912256776 countsByYear W29122567762018 @default.
- W2912256776 countsByYear W29122567762019 @default.
- W2912256776 countsByYear W29122567762020 @default.
- W2912256776 countsByYear W29122567762021 @default.
- W2912256776 countsByYear W29122567762022 @default.
- W2912256776 crossrefType "journal-article" @default.
- W2912256776 hasAuthorship W2912256776A5002413906 @default.
- W2912256776 hasAuthorship W2912256776A5050846585 @default.
- W2912256776 hasAuthorship W2912256776A5061693378 @default.
- W2912256776 hasBestOaLocation W29122567762 @default.
- W2912256776 hasConcept C109701466 @default.
- W2912256776 hasConcept C135257023 @default.
- W2912256776 hasConcept C154945302 @default.
- W2912256776 hasConcept C199360897 @default.
- W2912256776 hasConcept C2781215313 @default.
- W2912256776 hasConcept C34165917 @default.
- W2912256776 hasConcept C41008148 @default.
- W2912256776 hasConcept C42383842 @default.
- W2912256776 hasConcept C510870499 @default.
- W2912256776 hasConcept C519991488 @default.
- W2912256776 hasConcept C548217200 @default.
- W2912256776 hasConcept C60048249 @default.
- W2912256776 hasConcept C96324660 @default.
- W2912256776 hasConceptScore W2912256776C109701466 @default.
- W2912256776 hasConceptScore W2912256776C135257023 @default.
- W2912256776 hasConceptScore W2912256776C154945302 @default.
- W2912256776 hasConceptScore W2912256776C199360897 @default.
- W2912256776 hasConceptScore W2912256776C2781215313 @default.
- W2912256776 hasConceptScore W2912256776C34165917 @default.
- W2912256776 hasConceptScore W2912256776C41008148 @default.
- W2912256776 hasConceptScore W2912256776C42383842 @default.
- W2912256776 hasConceptScore W2912256776C510870499 @default.
- W2912256776 hasConceptScore W2912256776C519991488 @default.
- W2912256776 hasConceptScore W2912256776C548217200 @default.
- W2912256776 hasConceptScore W2912256776C60048249 @default.
- W2912256776 hasConceptScore W2912256776C96324660 @default.
- W2912256776 hasFunder F4320320300 @default.