Matches in SemOpenAlex for { <https://semopenalex.org/work/W134561036> ?p ?o ?g. }
- W134561036 abstract "Classification is a core method widely studied in machine learning, statistics, and data mining. Many classification methods have been proposed in the literature, such as Support Vector Machines, Decision Trees, and Bayesian Networks, most of which assume that the input data is in a feature vector representation. However, in some classification problems, the predefined feature space is not discriminative enough to distinguish between different classes. More seriously, in many other applications, the input data has very complex structures but no initial feature vector representation, such as transaction data (e.g., customer shopping transactions), sequences (e.g., protein sequences and software execution traces), graphs (e.g., chemical compounds and molecules, social and biological networks), semi-structured data (e.g., XML documents), and text data. For both scenarios, a primary question is how to construct a discriminative and compact feature set on the basis of which classification can achieve good performance. Although many kernel-based approaches have been proposed to transform the feature space and to measure the similarity between two data objects, the implicit definition of the feature space makes the kernel-based approach hard to interpret, and its high computational complexity makes it hard to scale to large problem sizes. A concrete example of complex structural data classification is classifying chemical compounds into various classes (e.g., toxic vs. nontoxic, active vs. inactive), where a key challenge is how to construct discriminative graph features. While simple features such as atoms and links are too simple to preserve the structural information, graph kernel methods make it hard to interpret the classifiers.
In this dissertation, I proposed to use frequent patterns as higher-order and discriminative features to characterize data, especially complex structural data, and thus enhance the classification power. Towards this goal, I designed a framework of discriminative frequent pattern-based classification which has been shown to improve the classification performance significantly. Theoretical analysis is provided to reveal the association between a feature's frequency and its discriminative power, thus demonstrating that frequent patterns are good candidates for discriminative features. Due to the explosive nature of frequent pattern mining, frequent pattern-based feature construction could be a computational bottleneck if the whole set of frequent patterns w.r.t. a minimum support threshold is generated. To overcome this computational bottleneck, I proposed two solutions, DDPMine and LEAP, which directly mine the most discriminative features without generating the complete set. Both methods have been shown to improve efficiency while maintaining the classification accuracy. I further applied the discriminative frequent pattern-based classification to classifying chemical compounds with very skewed class distribution, which poses challenges for both feature construction and model learning. An ensemble framework that includes ensembles in both the data space and the feature space is proposed to handle these challenges and is shown to achieve good classification performance. In conclusion, the framework of discriminative frequent pattern-based classification could lead to a highly accurate, efficient and interpretable classifier on complex data. The pattern-based classification technique would have great impact in a wide range of applications including text categorization, chemical compound classification, and software behavior analysis." @default.
- W134561036 created "2016-06-24" @default.
- W134561036 creator A5019539533 @default.
- W134561036 creator A5085614235 @default.
- W134561036 date "2008-01-01" @default.
- W134561036 modified "2023-10-11" @default.
- W134561036 title "Towards accurate and efficient classification: a discriminative and frequent pattern-based approach" @default.
- W134561036 cites W131532953 @default.
- W134561036 cites W1480643256 @default.
- W134561036 cites W1484413656 @default.
- W134561036 cites W1489380107 @default.
- W134561036 cites W1528547644 @default.
- W134561036 cites W1531743498 @default.
- W134561036 cites W1534477342 @default.
- W134561036 cites W1549565124 @default.
- W134561036 cites W1556507321 @default.
- W134561036 cites W1558063662 @default.
- W134561036 cites W1570448133 @default.
- W134561036 cites W1597561788 @default.
- W134561036 cites W1623342295 @default.
- W134561036 cites W1641749581 @default.
- W134561036 cites W1835509607 @default.
- W134561036 cites W1975068437 @default.
- W134561036 cites W1985104873 @default.
- W134561036 cites W2020816856 @default.
- W134561036 cites W2029817244 @default.
- W134561036 cites W2029896651 @default.
- W134561036 cites W2038812321 @default.
- W134561036 cites W2039444222 @default.
- W134561036 cites W2053724458 @default.
- W134561036 cites W2064853889 @default.
- W134561036 cites W2066277072 @default.
- W134561036 cites W2083305840 @default.
- W134561036 cites W2084787613 @default.
- W134561036 cites W2106072141 @default.
- W134561036 cites W2106202847 @default.
- W134561036 cites W2107844279 @default.
- W134561036 cites W2108150886 @default.
- W134561036 cites W2110034858 @default.
- W134561036 cites W2111254498 @default.
- W134561036 cites W2115412287 @default.
- W134561036 cites W2116780029 @default.
- W134561036 cites W2119821739 @default.
- W134561036 cites W2129027474 @default.
- W134561036 cites W2136507529 @default.
- W134561036 cites W2136593687 @default.
- W134561036 cites W2153622112 @default.
- W134561036 cites W2153635508 @default.
- W134561036 cites W2153747975 @default.
- W134561036 cites W2154642793 @default.
- W134561036 cites W2161723275 @default.
- W134561036 cites W2166559705 @default.
- W134561036 cites W2168137269 @default.
- W134561036 cites W2170726034 @default.
- W134561036 cites W2170960297 @default.
- W134561036 cites W2435251607 @default.
- W134561036 cites W2738782639 @default.
- W134561036 cites W64908097 @default.
- W134561036 cites W72752446 @default.
- W134561036 cites W2034618876 @default.
- W134561036 cites W2056614917 @default.
- W134561036 hasPublicationYear "2008" @default.
- W134561036 type Work @default.
- W134561036 sameAs 134561036 @default.
- W134561036 citedByCount "2" @default.
- W134561036 countsByYear W1345610362013 @default.
- W134561036 countsByYear W1345610362021 @default.
- W134561036 crossrefType "journal-article" @default.
- W134561036 hasAuthorship W134561036A5019539533 @default.
- W134561036 hasAuthorship W134561036A5085614235 @default.
- W134561036 hasConcept C100595998 @default.
- W134561036 hasConcept C119857082 @default.
- W134561036 hasConcept C122280245 @default.
- W134561036 hasConcept C12267149 @default.
- W134561036 hasConcept C124101348 @default.
- W134561036 hasConcept C138885662 @default.
- W134561036 hasConcept C153180895 @default.
- W134561036 hasConcept C154945302 @default.
- W134561036 hasConcept C2776401178 @default.
- W134561036 hasConcept C41008148 @default.
- W134561036 hasConcept C41895202 @default.
- W134561036 hasConcept C75866337 @default.
- W134561036 hasConcept C83665646 @default.
- W134561036 hasConcept C97931131 @default.
- W134561036 hasConceptScore W134561036C100595998 @default.
- W134561036 hasConceptScore W134561036C119857082 @default.
- W134561036 hasConceptScore W134561036C122280245 @default.
- W134561036 hasConceptScore W134561036C12267149 @default.
- W134561036 hasConceptScore W134561036C124101348 @default.
- W134561036 hasConceptScore W134561036C138885662 @default.
- W134561036 hasConceptScore W134561036C153180895 @default.
- W134561036 hasConceptScore W134561036C154945302 @default.
- W134561036 hasConceptScore W134561036C2776401178 @default.
- W134561036 hasConceptScore W134561036C41008148 @default.
- W134561036 hasConceptScore W134561036C41895202 @default.
- W134561036 hasConceptScore W134561036C75866337 @default.
- W134561036 hasConceptScore W134561036C83665646 @default.
- W134561036 hasConceptScore W134561036C97931131 @default.
- W134561036 hasLocation W1345610361 @default.
- W134561036 hasOpenAccess W134561036 @default.
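
The abstract of W134561036 describes using frequent patterns as discriminative features. The following is only a minimal sketch of that general idea on toy transaction data. It is not DDPMine or LEAP: it enumerates all frequent itemsets by brute force (the very bottleneck the abstract says those methods avoid) and then ranks them by information gain. The choice of information gain as the discriminative measure, the toy data, and all function names and thresholds are assumptions made for illustration.

```python
from itertools import combinations
from math import log2


def entropy(pos, neg):
    """Binary entropy (in bits) of a set with `pos` positive and `neg` negative labels."""
    total = pos + neg
    h = 0.0
    for count in (pos, neg):
        if count:
            p = count / total
            h -= p * log2(p)
    return h


def frequent_itemsets(transactions, min_support, max_len=2):
    """Brute-force enumeration of itemsets with relative support >= min_support."""
    items = sorted({item for t in transactions for item in t})
    n = len(transactions)
    result = {}
    for k in range(1, max_len + 1):
        for candidate in combinations(items, k):
            pattern = frozenset(candidate)
            support = sum(1 for t in transactions if pattern <= t) / n
            if support >= min_support:
                result[pattern] = support
    return result


def information_gain(pattern, transactions, labels):
    """Entropy reduction from splitting the data on presence of `pattern`."""
    n = len(transactions)
    pos = sum(labels)
    covered = [y for t, y in zip(transactions, labels) if pattern <= t]
    uncovered = [y for t, y in zip(transactions, labels) if not pattern <= t]
    h_after = 0.0
    for part in (covered, uncovered):
        if part:
            p = sum(part)
            h_after += len(part) / n * entropy(p, len(part) - p)
    return entropy(pos, n - pos) - h_after


def select_features(transactions, labels, min_support=0.3, top_k=3):
    """Return the top_k frequent patterns as (pattern, gain), highest gain first."""
    patterns = frequent_itemsets(transactions, min_support)
    scored = [(p, information_gain(p, transactions, labels)) for p in patterns]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


def to_feature_vector(transaction, patterns):
    """Binary vector: 1 where the transaction contains the pattern, else 0."""
    return [int(p <= transaction) for p in patterns]


# Hypothetical toy data: the positive class is exactly "contains both a and b".
transactions = [frozenset(t) for t in
                ("abc", "ab", "abd", "ac", "bc", "cd", "ad", "bd")]
labels = [1, 1, 1, 0, 0, 0, 0, 0]

selected = select_features(transactions, labels)
for pattern, gain in selected:
    print(sorted(pattern), round(gain, 3))
```

In this toy set no single item separates the classes, while the two-item pattern does, which is the sense in which a frequent pattern can act as a higher-order feature. The vectors from `to_feature_vector` could then be passed to any standard classifier.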