Matches in SemOpenAlex for { <https://semopenalex.org/work/W2139874739> ?p ?o ?g. }
- W2139874739 abstract "Frequent sequence mining is one of the fundamental building blocks in data mining. While the problem has been extensively studied, few of the available techniques are sufficiently scalable to handle datasets with billions of sequences; such large-scale datasets arise, for instance, in text mining and session analysis. In this paper, we propose MG-FSM, a scalable algorithm for frequent sequence mining on MapReduce. MG-FSM can handle so-called gap constraints, which can be used to limit the output to a controlled set of frequent sequences. At its heart, MG-FSM partitions the input in a way that allows us to mine each partition independently using any existing frequent sequence mining algorithm. We introduce the notion of w-equivalency, which is a generalization of the notion of a projected database used by many frequent pattern mining algorithms. We also present a number of optimization techniques that minimize partition size, and therefore computational and communication costs, while still maintaining correctness. Our experimental study in the context of text mining suggests that MG-FSM is significantly more efficient and scalable than alternative approaches." @default.
- W2139874739 created "2016-06-24" @default.
- W2139874739 creator A5039440854 @default.
- W2139874739 creator A5041227178 @default.
- W2139874739 creator A5064647811 @default.
- W2139874739 creator A5088973453 @default.
- W2139874739 date "2013-01-01" @default.
- W2139874739 modified "2023-09-26" @default.
- W2139874739 title "Mind the gap" @default.
- W2139874739 cites W1494776307 @default.
- W2139874739 cites W1608194207 @default.
- W2139874739 cites W1631260214 @default.
- W2139874739 cites W1641039719 @default.
- W2139874739 cites W1990800901 @default.
- W2139874739 cites W1999281295 @default.
- W2139874739 cites W2008475755 @default.
- W2139874739 cites W2025122101 @default.
- W2139874739 cites W2032175749 @default.
- W2139874739 cites W2032226242 @default.
- W2139874739 cites W2045495924 @default.
- W2139874739 cites W2056910091 @default.
- W2139874739 cites W2063771604 @default.
- W2139874739 cites W2076489648 @default.
- W2139874739 cites W2109664771 @default.
- W2139874739 cites W2110893883 @default.
- W2139874739 cites W2115482638 @default.
- W2139874739 cites W2120101509 @default.
- W2139874739 cites W2122182354 @default.
- W2139874739 cites W2137548844 @default.
- W2139874739 cites W2155113671 @default.
- W2139874739 cites W2156026066 @default.
- W2139874739 cites W2160484851 @default.
- W2139874739 cites W2166559705 @default.
- W2139874739 cites W2166951489 @default.
- W2139874739 cites W2168196587 @default.
- W2139874739 cites W2168859760 @default.
- W2139874739 cites W2170036644 @default.
- W2139874739 cites W2173213060 @default.
- W2139874739 cites W2621280964 @default.
- W2139874739 cites W3158986179 @default.
- W2139874739 doi "https://doi.org/10.1145/2463676.2465285" @default.
- W2139874739 hasPublicationYear "2013" @default.
- W2139874739 type Work @default.
- W2139874739 sameAs 2139874739 @default.
- W2139874739 citedByCount "47" @default.
- W2139874739 countsByYear W21398747392013 @default.
- W2139874739 countsByYear W21398747392014 @default.
- W2139874739 countsByYear W21398747392015 @default.
- W2139874739 countsByYear W21398747392016 @default.
- W2139874739 countsByYear W21398747392017 @default.
- W2139874739 countsByYear W21398747392018 @default.
- W2139874739 countsByYear W21398747392019 @default.
- W2139874739 countsByYear W21398747392020 @default.
- W2139874739 countsByYear W21398747392021 @default.
- W2139874739 countsByYear W21398747392022 @default.
- W2139874739 countsByYear W21398747392023 @default.
- W2139874739 crossrefType "proceedings-article" @default.
- W2139874739 hasAuthorship W2139874739A5039440854 @default.
- W2139874739 hasAuthorship W2139874739A5041227178 @default.
- W2139874739 hasAuthorship W2139874739A5064647811 @default.
- W2139874739 hasAuthorship W2139874739A5088973453 @default.
- W2139874739 hasConcept C11413529 @default.
- W2139874739 hasConcept C114614502 @default.
- W2139874739 hasConcept C121332964 @default.
- W2139874739 hasConcept C124101348 @default.
- W2139874739 hasConcept C134306372 @default.
- W2139874739 hasConcept C136764020 @default.
- W2139874739 hasConcept C149490388 @default.
- W2139874739 hasConcept C151730666 @default.
- W2139874739 hasConcept C176775163 @default.
- W2139874739 hasConcept C177148314 @default.
- W2139874739 hasConcept C177264268 @default.
- W2139874739 hasConcept C193524817 @default.
- W2139874739 hasConcept C197046077 @default.
- W2139874739 hasConcept C199360897 @default.
- W2139874739 hasConcept C2778112365 @default.
- W2139874739 hasConcept C2778755073 @default.
- W2139874739 hasConcept C2779182362 @default.
- W2139874739 hasConcept C2779343474 @default.
- W2139874739 hasConcept C33923547 @default.
- W2139874739 hasConcept C35578498 @default.
- W2139874739 hasConcept C41008148 @default.
- W2139874739 hasConcept C42812 @default.
- W2139874739 hasConcept C48044578 @default.
- W2139874739 hasConcept C54355233 @default.
- W2139874739 hasConcept C55439883 @default.
- W2139874739 hasConcept C62520636 @default.
- W2139874739 hasConcept C77088390 @default.
- W2139874739 hasConcept C80444323 @default.
- W2139874739 hasConcept C81440476 @default.
- W2139874739 hasConcept C86803240 @default.
- W2139874739 hasConcept C87146676 @default.
- W2139874739 hasConceptScore W2139874739C11413529 @default.
- W2139874739 hasConceptScore W2139874739C114614502 @default.
- W2139874739 hasConceptScore W2139874739C121332964 @default.
- W2139874739 hasConceptScore W2139874739C124101348 @default.
- W2139874739 hasConceptScore W2139874739C134306372 @default.
- W2139874739 hasConceptScore W2139874739C136764020 @default.
- W2139874739 hasConceptScore W2139874739C149490388 @default.
- W2139874739 hasConceptScore W2139874739C151730666 @default.