Matches in SemOpenAlex for { <https://semopenalex.org/work/W2251735955> ?p ?o ?g. }
- W2251735955 abstract "Data clustering plays an important role in many disciplines, including data mining, machine learning, bioinformatics, pattern recognition, and other fields, where there is a need to learn the inherent grouping structure of data in an unsupervised manner. Many clustering approaches have been proposed in the literature, with different quality/complexity tradeoffs. Each clustering algorithm works on its own domain space, and none offers an optimal solution for all datasets of different properties, sizes, structures, and distributions. Challenges in data clustering include identifying the proper number of clusters, scalability of the clustering approach, robustness to noise, tackling distributed datasets, and handling clusters of different configurations. This thesis addresses some of these challenges through cooperation between multiple clustering approaches. We introduce a Cooperative Clustering (CC) model that involves multiple clustering techniques; the goal of the cooperative model is to increase the homogeneity of objects within clusters through cooperation by developing two data structures: a cooperative contingency graph and a histogram representation of pair-wise similarities. The two data structures are designed to find the matching sub-clusters between different clusterings and to obtain the final set of cooperative clusters through a merging process. Obtaining the co-occurring objects from the different clusterings enables the cooperative model to group objects based on agreement among the invoked clustering techniques. In addition, merging this set of sub-clusters using histograms offers a new way of grouping objects into more homogeneous clusters. 
The cooperative model is consistent, reusable, and scalable in terms of the number of adopted clustering approaches. In order to deal with noisy data, a novel Cooperative Clustering Outliers Detection (CCOD) algorithm is implemented by applying the cooperation methodology for better detection of outliers in data. The new detection approach is designed in four phases: (1) Global Non-cooperative Clustering, (2) Cooperative Clustering, (3) Possible Outliers Detection, and finally (4) Candidate Outliers Detection. The detection of outliers is established in a bottom-up scenario. The thesis also addresses cooperative clustering in distributed Peer-to-Peer (P2P) networks. Mining large and inherently distributed datasets poses many challenges, one of which is the extraction of a global model as a global summary of the clustering solutions generated from all nodes, for the purpose of interpreting the clustering quality of the distributed dataset as if it were located at one node. We developed a distributed cooperative model and architecture that work on a two-tier super-peer P2P network. The model is called Distributed Cooperative Clustering in Super-peer P2P Networks (DCCP2P). This model aims at producing one clustering solution across the whole network. It specifically addresses scalability of network size, and consequently the distributed clustering complexity, by modeling the distributed clustering problem as two layers of peer neighborhoods and super-peers. Summarization of the global distributed clusters is achieved through a distributed version of the cooperative clustering model. Three clustering algorithms, k-means (KM), bisecting k-means (BKM), and Partitioning Around Medoids (PAM), are invoked in the cooperative model. 
Results on various gene expression and text document datasets with different properties, configurations, and degrees of outliers reveal that: (i) the cooperative clustering model achieves significant improvement in the quality of the clustering solutions compared to that of the non-cooperative individual approaches; (ii) the cooperative detection algorithm discovers the nonconforming objects in data with better accuracy than the contemporary approaches; and (iii) the distributed cooperative model attains the same or even better quality than the centralized approach and achieves decent speedup as the number of nodes increases. The distributed model offers a high degree of flexibility, scalability, and interpretability for large distributed repositories. Achieving the same results using current methodologies requires pooling the data first at one central location, which is sometimes not feasible." @default.
- W2251735955 created "2016-06-24" @default.
- W2251735955 creator A5058900041 @default.
- W2251735955 date "2008-01-01" @default.
- W2251735955 modified "2023-09-23" @default.
- W2251735955 title "Cooperative clustering model and its applications" @default.
- W2251735955 cites W147860157 @default.
- W2251735955 cites W1492327544 @default.
- W2251735955 cites W1493454437 @default.
- W2251735955 cites W1507051419 @default.
- W2251735955 cites W1509098796 @default.
- W2251735955 cites W1509342228 @default.
- W2251735955 cites W1510543252 @default.
- W2251735955 cites W1515613432 @default.
- W2251735955 cites W1521624597 @default.
- W2251735955 cites W1524787014 @default.
- W2251735955 cites W1530232915 @default.
- W2251735955 cites W1547566968 @default.
- W2251735955 cites W1548779692 @default.
- W2251735955 cites W1549779541 @default.
- W2251735955 cites W1556860917 @default.
- W2251735955 cites W1560562394 @default.
- W2251735955 cites W1575476631 @default.
- W2251735955 cites W1592702894 @default.
- W2251735955 cites W1603122341 @default.
- W2251735955 cites W1651093245 @default.
- W2251735955 cites W1673310716 @default.
- W2251735955 cites W1870625491 @default.
- W2251735955 cites W190215796 @default.
- W2251735955 cites W1963624219 @default.
- W2251735955 cites W1975007803 @default.
- W2251735955 cites W1977556410 @default.
- W2251735955 cites W1984874056 @default.
- W2251735955 cites W1992419399 @default.
- W2251735955 cites W1995450389 @default.
- W2251735955 cites W1996747841 @default.
- W2251735955 cites W2014351296 @default.
- W2251735955 cites W2018199316 @default.
- W2251735955 cites W2022460594 @default.
- W2251735955 cites W2036987424 @default.
- W2251735955 cites W2045064676 @default.
- W2251735955 cites W2049631158 @default.
- W2251735955 cites W2050439513 @default.
- W2251735955 cites W2057712948 @default.
- W2251735955 cites W2058312826 @default.
- W2251735955 cites W2058929792 @default.
- W2251735955 cites W2061122559 @default.
- W2251735955 cites W2071301466 @default.
- W2251735955 cites W2075900721 @default.
- W2251735955 cites W2091091431 @default.
- W2251735955 cites W2095897464 @default.
- W2251735955 cites W2098162425 @default.
- W2251735955 cites W2100958137 @default.
- W2251735955 cites W2101961340 @default.
- W2251735955 cites W2102015928 @default.
- W2251735955 cites W2108323654 @default.
- W2251735955 cites W2109363337 @default.
- W2251735955 cites W2111596024 @default.
- W2251735955 cites W2112617148 @default.
- W2251735955 cites W2115116401 @default.
- W2251735955 cites W2118268275 @default.
- W2251735955 cites W2119773896 @default.
- W2251735955 cites W2123980993 @default.
- W2251735955 cites W2125543909 @default.
- W2251735955 cites W2129281431 @default.
- W2251735955 cites W2131687179 @default.
- W2251735955 cites W2134511197 @default.
- W2251735955 cites W2135000328 @default.
- W2251735955 cites W2135187880 @default.
- W2251735955 cites W2136278509 @default.
- W2251735955 cites W2137763598 @default.
- W2251735955 cites W2141465109 @default.
- W2251735955 cites W2144182447 @default.
- W2251735955 cites W2144709747 @default.
- W2251735955 cites W2147128677 @default.
- W2251735955 cites W2149433841 @default.
- W2251735955 cites W2150926065 @default.
- W2251735955 cites W2152820192 @default.
- W2251735955 cites W2153233077 @default.
- W2251735955 cites W2160974515 @default.
- W2251735955 cites W2165232124 @default.
- W2251735955 cites W2165612380 @default.
- W2251735955 cites W2168807180 @default.
- W2251735955 cites W2171975443 @default.
- W2251735955 cites W2565546808 @default.
- W2251735955 cites W2611831635 @default.
- W2251735955 cites W2799061466 @default.
- W2251735955 cites W2913066018 @default.
- W2251735955 cites W3041642014 @default.
- W2251735955 cites W52853789 @default.
- W2251735955 cites W90763689 @default.
- W2251735955 cites W94995267 @default.
- W2251735955 cites W1523439439 @default.
- W2251735955 cites W1768935529 @default.
- W2251735955 hasPublicationYear "2008" @default.
- W2251735955 type Work @default.
- W2251735955 sameAs 2251735955 @default.
- W2251735955 citedByCount "4" @default.
- W2251735955 countsByYear W22517359552013 @default.
- W2251735955 countsByYear W22517359552015 @default.
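
The abstract describes grouping objects that co-occur in the same cluster across several clusterings. As a rough illustration of that idea only, the sketch below groups object indices by the tuple of labels they receive from each clustering, so every group is a sub-cluster on which all clusterings agree. The function name `agreed_subclusters` and the toy label lists are hypothetical; this is not the thesis's cooperative contingency graph or its histogram-based merging step, which the record does not specify.

```python
from collections import defaultdict


def agreed_subclusters(labelings):
    """Group object indices by their label tuple across all clusterings.

    `labelings` is a list of equal-length label sequences, one per
    clustering (hypothetical input format). Objects that share the same
    tuple were placed together by every clustering.
    """
    n = len(labelings[0])
    if any(len(labels) != n for labels in labelings):
        raise ValueError("all labelings must cover the same objects")

    groups = defaultdict(list)
    for i in range(n):
        # Key = the cluster this object was assigned to in each clustering.
        key = tuple(labels[i] for labels in labelings)
        groups[key].append(i)
    return dict(groups)


# Toy example: two clusterings of six objects (made-up labels).
km_labels = [0, 0, 0, 1, 1, 1]
pam_labels = [0, 0, 1, 1, 1, 1]
subclusters = agreed_subclusters([km_labels, pam_labels])
for key, members in sorted(subclusters.items()):
    print(key, members)
```

In this toy input the two clusterings disagree only on object 2, so it ends up in its own sub-cluster; a merging step such as the one the abstract mentions would then decide which larger group it joins.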