Matches in SemOpenAlex for { <https://semopenalex.org/work/W631696767> ?p ?o ?g. }
- W631696767 abstract "Data mining is an essential part of knowledge discovery, and performs the extraction of useful information from a collection of data, so as to assist human beings in making necessary decisions. This thesis describes research in the field of itemset mining, which performs the extraction of a set of items that occur together in a dataset, based on a user-specified threshold. Recent focus of itemset mining has been on the discovery of closed itemsets from high-dimensional datasets, characterised by relatively few rows and a relatively large number of columns. A closed itemset is the maximal set of items common to a set of rows. Because running time increases exponentially with the average row length, mining closed itemsets from such datasets renders most column enumeration-based algorithms impractical. Existing row enumeration-based algorithms also struggle to reach large cardinality closed itemsets. This is due to the implementation of the support constraint, which is based on the frequency of occurrence of the itemset. Frequent closed itemsets are usually smaller in size and larger in number, and hence take up much of the memory space. Unfortunately, large cardinality closed itemsets are likely to be more informative than small cardinality closed itemsets in this type of dataset. The research investigates the area of large cardinality closed itemset discovery by examining and analysing the literature and identifying both strengths and weaknesses of existing approaches. Based on this synthesis, a new algorithm, termed DisClose, has been designed and developed to discover large cardinality (colossal) closed itemsets from high-dimensional datasets. The algorithm's strategy begins by enumerating large cardinality itemsets and, from these, builds smaller itemsets. This is done by applying a bottom-up search of the row-enumeration tree.
A minimum cardinality threshold has been proposed to identify colossal closed itemsets and to further reduce the search space. A novel closedness-checking method has been proposed which uses a unique generator to immediately discover closed itemsets without the need to check if each new closed itemset has previously been found. These approaches have been combined using a Compact Row-Tree (CR-Tree) data structure designed to assist in the efficient discovery of the colossal closed itemsets. For evaluation purposes, four state-of-the-art algorithms have been selected for comparison. Experimental results show that the DisClose algorithm is scalable and can efficiently extract colossal closed itemsets in the considered dataset, even at low support thresholds where existing algorithms fail to find them." @default.
- W631696767 created "2016-06-24" @default.
- W631696767 creator A5072002836 @default.
- W631696767 date "2012-10-09" @default.
- W631696767 modified "2023-09-24" @default.
- W631696767 title "DisClose: Discovering Colossal Closed Itemsets from High Dimensional Datasets via a Compact Row-Tree" @default.
- W631696767 cites W110175884 @default.
- W631696767 cites W111044289 @default.
- W631696767 cites W1492437814 @default.
- W631696767 cites W1503729935 @default.
- W631696767 cites W1528218942 @default.
- W631696767 cites W1553696291 @default.
- W631696767 cites W1576562444 @default.
- W631696767 cites W1577170573 @default.
- W631696767 cites W1583082330 @default.
- W631696767 cites W159524162 @default.
- W631696767 cites W172019652 @default.
- W631696767 cites W1721137721 @default.
- W631696767 cites W1892399053 @default.
- W631696767 cites W1975967982 @default.
- W631696767 cites W1979180881 @default.
- W631696767 cites W1979943645 @default.
- W631696767 cites W1982932027 @default.
- W631696767 cites W1984606279 @default.
- W631696767 cites W1990951910 @default.
- W631696767 cites W1996249351 @default.
- W631696767 cites W2007087405 @default.
- W631696767 cites W2009380894 @default.
- W631696767 cites W2011141305 @default.
- W631696767 cites W2012451152 @default.
- W631696767 cites W2012659293 @default.
- W631696767 cites W2016814901 @default.
- W631696767 cites W2018481848 @default.
- W631696767 cites W2026562765 @default.
- W631696767 cites W2030969394 @default.
- W631696767 cites W2032427901 @default.
- W631696767 cites W2037965136 @default.
- W631696767 cites W2040072657 @default.
- W631696767 cites W2041989192 @default.
- W631696767 cites W2058012762 @default.
- W631696767 cites W2059837064 @default.
- W631696767 cites W2060406328 @default.
- W631696767 cites W2064853889 @default.
- W631696767 cites W2064983661 @default.
- W631696767 cites W2066771339 @default.
- W631696767 cites W2068987343 @default.
- W631696767 cites W2069356553 @default.
- W631696767 cites W2080632942 @default.
- W631696767 cites W2092072097 @default.
- W631696767 cites W2093397547 @default.
- W631696767 cites W2094974204 @default.
- W631696767 cites W2097800052 @default.
- W631696767 cites W2099404336 @default.
- W631696767 cites W2101476067 @default.
- W631696767 cites W2102794349 @default.
- W631696767 cites W2109363337 @default.
- W631696767 cites W2109606974 @default.
- W631696767 cites W2114000666 @default.
- W631696767 cites W2117530081 @default.
- W631696767 cites W2117769694 @default.
- W631696767 cites W2118020653 @default.
- W631696767 cites W2118843309 @default.
- W631696767 cites W2125208521 @default.
- W631696767 cites W2126400629 @default.
- W631696767 cites W2129555316 @default.
- W631696767 cites W2132280281 @default.
- W631696767 cites W2134937628 @default.
- W631696767 cites W2138073318 @default.
- W631696767 cites W2138660495 @default.
- W631696767 cites W2139936633 @default.
- W631696767 cites W2140190241 @default.
- W631696767 cites W2141115288 @default.
- W631696767 cites W2143945853 @default.
- W631696767 cites W2145539275 @default.
- W631696767 cites W2147864684 @default.
- W631696767 cites W2149901417 @default.
- W631696767 cites W2151953639 @default.
- W631696767 cites W2154005727 @default.
- W631696767 cites W2158454296 @default.
- W631696767 cites W2159257592 @default.
- W631696767 cites W2162034534 @default.
- W631696767 cites W2162379288 @default.
- W631696767 cites W2163598528 @default.
- W631696767 cites W2164197909 @default.
- W631696767 cites W2166559705 @default.
- W631696767 cites W2167190345 @default.
- W631696767 cites W2171612826 @default.
- W631696767 cites W2172122080 @default.
- W631696767 cites W2172186225 @default.
- W631696767 cites W2210278139 @default.
- W631696767 cites W2613161123 @default.
- W631696767 cites W85121894 @default.
- W631696767 cites W2124704116 @default.
- W631696767 hasPublicationYear "2012" @default.
- W631696767 type Work @default.
- W631696767 sameAs 631696767 @default.
- W631696767 citedByCount "0" @default.
- W631696767 crossrefType "dissertation" @default.
- W631696767 hasAuthorship W631696767A5072002836 @default.
- W631696767 hasConcept C113174947 @default.