Matches in SemOpenAlex for { <https://semopenalex.org/work/W3165344121> ?p ?o ?g. }
Showing items 1 to 69 of 69 with 100 items per page.
- W3165344121 abstract "Clustering is a branch of machine learning consisting in dividing a dataset into several groups, called clusters. Each cluster contains data with similar characteristics. Several clustering approaches exist that differ in complexity and efficiency due to the multitude of clustering applications. In this thesis, we are mainly interested in centroid-based methods, more specifically k-means and density-based methods. In each approach, we have made contributions that address different problems. Due to the growth of the amount of data produced by different sources (sensors, social networks, information systems...), it is necessary to design fast algorithms to manage this growth. One of the best-known problems in clustering is the k-means problem. It is considered NP-hard in the number of points and clusters. Lloyd’s heuristic has approximated the solution to this problem. This is one of the ten most used methods in data mining because of its algorithmic simplicity. Nevertheless, this iterative heuristic does not propose an optimization strategy that avoids repetitive calculations. Versions based on geometric reasoning have partially addressed this problem. In this manuscript, we proposed a strategy to reduce unnecessary computations in Lloyd’s version and the versions based on geometric reasoning. It consists mainly in identifying, by estimation, the stable points, i.e., they no longer contribute to improving the solution during the iterative process of k-means. Thus, calculations related to stable points are avoided. K-means requires a priori, from users, the value of the number of K clusters. It is necessary for K to be the closest to the ground truth. Otherwise, the result of partitioning is of low quality or even unusable. We proposed Kd-means, an algorithm based on a hierarchical approach. It consists in hierarchizing data in a Kd-tree data structure and then merging sub-groups of points recursively in the bottom-up direction using new inter-group merging criteria that we have developed. These criteria guide the merging process to estimate K closest to real and produce clusters with a more complex shape than sphericity. Through experimentation, Kd-means has clearly shown its superiority over its competitors in execution time, clustering quality and K estimation. The density-based approach’s challenges are the high dimensionality of the points, the difficulty to separate low-density clusters from groups of outliers, and the separation of close clusters of the same density. To address these challenges, we have developed DECWA, a method based on a probabilistic approach. In DECWA, we proposed 1) a strategy of dividing a dataset into sub-groups where each of them follows its probability law; 2) followed by another strategy that merges subgroups, similar in probability law, into final clusters. Experimentally, DECWA, in high-dimensional spaces, produces a good quality clustering compared to its competitors" @default.
- W3165344121 created "2021-06-07" @default.
- W3165344121 creator A5058607906 @default.
- W3165344121 date "2021-01-13" @default.
- W3165344121 modified "2023-09-25" @default.
- W3165344121 title "New Partition-based and Density-based approaches for improving clustering" @default.
- W3165344121 hasPublicationYear "2021" @default.
- W3165344121 type Work @default.
- W3165344121 sameAs 3165344121 @default.
- W3165344121 citedByCount "0" @default.
- W3165344121 crossrefType "dissertation" @default.
- W3165344121 hasAuthorship W3165344121A5058607906 @default.
- W3165344121 hasConcept C111472728 @default.
- W3165344121 hasConcept C11413529 @default.
- W3165344121 hasConcept C114614502 @default.
- W3165344121 hasConcept C119857082 @default.
- W3165344121 hasConcept C124101348 @default.
- W3165344121 hasConcept C138885662 @default.
- W3165344121 hasConcept C146599234 @default.
- W3165344121 hasConcept C154945302 @default.
- W3165344121 hasConcept C173801870 @default.
- W3165344121 hasConcept C33923547 @default.
- W3165344121 hasConcept C41008148 @default.
- W3165344121 hasConcept C42812 @default.
- W3165344121 hasConcept C73555534 @default.
- W3165344121 hasConcept C75553542 @default.
- W3165344121 hasConcept C80444323 @default.
- W3165344121 hasConceptScore W3165344121C111472728 @default.
- W3165344121 hasConceptScore W3165344121C11413529 @default.
- W3165344121 hasConceptScore W3165344121C114614502 @default.
- W3165344121 hasConceptScore W3165344121C119857082 @default.
- W3165344121 hasConceptScore W3165344121C124101348 @default.
- W3165344121 hasConceptScore W3165344121C138885662 @default.
- W3165344121 hasConceptScore W3165344121C146599234 @default.
- W3165344121 hasConceptScore W3165344121C154945302 @default.
- W3165344121 hasConceptScore W3165344121C173801870 @default.
- W3165344121 hasConceptScore W3165344121C33923547 @default.
- W3165344121 hasConceptScore W3165344121C41008148 @default.
- W3165344121 hasConceptScore W3165344121C42812 @default.
- W3165344121 hasConceptScore W3165344121C73555534 @default.
- W3165344121 hasConceptScore W3165344121C75553542 @default.
- W3165344121 hasConceptScore W3165344121C80444323 @default.
- W3165344121 hasLocation W31653441211 @default.
- W3165344121 hasOpenAccess W3165344121 @default.
- W3165344121 hasPrimaryLocation W31653441211 @default.
- W3165344121 hasRelatedWork W179374575 @default.
- W3165344121 hasRelatedWork W2063958902 @default.
- W3165344121 hasRelatedWork W2182037563 @default.
- W3165344121 hasRelatedWork W2183980339 @default.
- W3165344121 hasRelatedWork W2186589717 @default.
- W3165344121 hasRelatedWork W2404230925 @default.
- W3165344121 hasRelatedWork W2556539676 @default.
- W3165344121 hasRelatedWork W2559341273 @default.
- W3165344121 hasRelatedWork W2598125667 @default.
- W3165344121 hasRelatedWork W2612232036 @default.
- W3165344121 hasRelatedWork W2811059409 @default.
- W3165344121 hasRelatedWork W2979792990 @default.
- W3165344121 hasRelatedWork W2996899071 @default.
- W3165344121 hasRelatedWork W3011897559 @default.
- W3165344121 hasRelatedWork W3025751404 @default.
- W3165344121 hasRelatedWork W3146449272 @default.
- W3165344121 hasRelatedWork W3159200155 @default.
- W3165344121 hasRelatedWork W8124646 @default.
- W3165344121 hasRelatedWork W2185743328 @default.
- W3165344121 hasRelatedWork W2444586519 @default.
- W3165344121 isParatext "false" @default.
- W3165344121 isRetracted "false" @default.
- W3165344121 magId "3165344121" @default.
- W3165344121 workType "dissertation" @default.