Matches in SemOpenAlex for { <https://semopenalex.org/work/W80917968> ?p ?o ?g. }
- W80917968 abstract "Proceedings of the 2003 SIAM International Conference on Data Mining (SDM), pp. 47-58. Finding Clusters of Different Sizes, Shapes, and Densities in Noisy, High Dimensional Data. Levent Ertöz, Michael Steinbach, and Vipin Kumar. Chapter DOI: https://doi.org/10.1137/1.9781611972733.5. Abstract: Finding clusters in data, especially high dimensional data, is challenging when the clusters are of widely differing shapes, sizes, and densities, and when the data contains noise and outliers. We present a novel clustering technique that addresses these issues. Our algorithm first finds the nearest neighbors of each data point and then redefines the similarity between pairs of points in terms of how many nearest neighbors the two points share. Using this definition of similarity, our algorithm identifies core points and then builds clusters around the core points. The use of a shared nearest neighbor definition of similarity alleviates problems with varying densities and high dimensionality, while the use of core points handles problems with shape and size. While our algorithm can find the “dense” clusters that other clustering algorithms find, it also finds clusters that these approaches overlook, i.e., clusters of low or medium density which represent relatively uniform regions “surrounded” by non-uniform or higher density areas. We experimentally show that our algorithm performs better than traditional methods (e.g., K-means, DBSCAN, CURE) on a variety of data sets: KDD Cup '99 network intrusion data, NASA Earth science time series data, and two-dimensional point sets. The run-time complexity of our technique is O(n^2) if the similarity matrix has to be constructed. However, we discuss a number of optimizations that allow the algorithm to handle large data sets efficiently.
Details: Published: 2003. ISBN: 978-0-89871-545-3. eISBN: 978-1-61197-273-3. DOI: https://doi.org/10.1137/1.9781611972733. Book Series Name: Proceedings. Book Code: PR112. Book Pages: xiv + 347. Key words: cluster analysis, shared nearest neighbor, time series, network intrusion, spatial data" @default.
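The abstract above describes the clustering steps only in outline: nearest neighbors, shared-neighbor similarity, core points, then clusters built around them. The following is a minimal sketch of that outline, not the authors' implementation. The function name `snn_cluster`, the parameters `k`, `eps`, and `min_pts`, the restriction of similarity to mutual neighbors, and the rule for attaching non-core points are all assumptions made for illustration. The brute-force distance matrix corresponds to the O(n^2) case mentioned in the abstract.

```python
import numpy as np

def snn_cluster(X, k=4, eps=3, min_pts=3):
    """Shared-nearest-neighbor clustering sketch (hypothetical parameters)."""
    X = np.asarray(X, dtype=float)
    n = len(X)

    # Step 1: k nearest neighbors of every point, brute force (O(n^2)).
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)          # a point is not its own neighbor
    knn = np.argsort(d, axis=1)[:, :k]
    member = np.zeros((n, n), dtype=bool)
    member[np.arange(n)[:, None], knn] = True   # member[i, j]: j is in kNN(i)

    # Step 2: similarity = number of shared nearest neighbors.
    # Assumption: only pairs that list each other as neighbors get a score.
    shared = member.astype(int) @ member.astype(int).T
    sim = np.where(member & member.T, shared, 0)

    # Step 3: core points have at least min_pts strongly similar points.
    strong = sim >= eps
    core = strong.sum(axis=1) >= min_pts

    # Step 4: grow clusters by linking strongly similar core points.
    labels = np.full(n, -1)              # -1 marks noise / unassigned
    cluster = 0
    for i in np.flatnonzero(core):
        if labels[i] != -1:
            continue
        labels[i] = cluster
        stack = [i]
        while stack:
            p = stack.pop()
            for q in np.flatnonzero(strong[p] & core & (labels == -1)):
                labels[q] = cluster
                stack.append(q)
        cluster += 1

    # Step 5 (assumption): attach each non-core point to its most similar
    # core point if that similarity reaches eps; otherwise leave it as noise.
    for i in np.flatnonzero(~core):
        cand = np.where(core, sim[i], 0)
        j = int(np.argmax(cand))
        if cand[j] >= eps:
            labels[i] = labels[j]
    return labels

# Two tight groups of five points plus one isolated point.
blob = np.array([[0, 0], [1, 0], [0, 1], [1, 1], [0.5, 0.5]])
points = np.vstack([blob, blob + 100, [[50, 50]]])
print(snn_cluster(points).tolist())
# → [0, 0, 0, 0, 0, 1, 1, 1, 1, 1, -1]
```

In this toy input each group has exactly k + 1 points, so every pair inside a group shares k - 1 = 3 neighbors and all ten grouped points become core points. The isolated point appears in no other point's neighbor list, so its similarity to everything is zero and it stays labeled as noise. `eps` should be at least 1 so that a zero similarity never attaches a point to a cluster.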
- W80917968 created "2016-06-24" @default.
- W80917968 creator A5022878003 @default.
- W80917968 creator A5049521806 @default.
- W80917968 creator A5089436894 @default.
- W80917968 date "2003-05-01" @default.
- W80917968 modified "2023-10-10" @default.
- W80917968 title "Finding Clusters of Different Sizes, Shapes, and Densities in Noisy, High Dimensional Data" @default.
- W80917968 cites W1502904630 @default.
- W80917968 cites W1515923189 @default.
- W80917968 cites W1521439890 @default.
- W80917968 cites W1522930108 @default.
- W80917968 cites W1529320607 @default.
- W80917968 cites W1620204465 @default.
- W80917968 cites W1651093245 @default.
- W80917968 cites W1660133578 @default.
- W80917968 cites W1673310716 @default.
- W80917968 cites W1790954942 @default.
- W80917968 cites W1971318281 @default.
- W80917968 cites W1971784203 @default.
- W80917968 cites W1996764654 @default.
- W80917968 cites W2030951871 @default.
- W80917968 cites W2036216970 @default.
- W80917968 cites W2039576845 @default.
- W80917968 cites W206036920 @default.
- W80917968 cites W2070412788 @default.
- W80917968 cites W2089923519 @default.
- W80917968 cites W2099581008 @default.
- W80917968 cites W2126751256 @default.
- W80917968 cites W2127391575 @default.
- W80917968 cites W2131687179 @default.
- W80917968 cites W2141585940 @default.
- W80917968 cites W2144182447 @default.
- W80917968 cites W2160642098 @default.
- W80917968 cites W2164122986 @default.
- W80917968 cites W2562836854 @default.
- W80917968 cites W2799061466 @default.
- W80917968 cites W2999729612 @default.
- W80917968 cites W9613553 @default.
- W80917968 doi "https://doi.org/10.1137/1.9781611972733.5" @default.
- W80917968 hasPublicationYear "2003" @default.
- W80917968 type Work @default.
- W80917968 sameAs 80917968 @default.
- W80917968 citedByCount "494" @default.
- W80917968 countsByYear W809179682012 @default.
- W80917968 countsByYear W809179682013 @default.
- W80917968 countsByYear W809179682014 @default.
- W80917968 countsByYear W809179682015 @default.
- W80917968 countsByYear W809179682016 @default.
- W80917968 countsByYear W809179682017 @default.
- W80917968 countsByYear W809179682018 @default.
- W80917968 countsByYear W809179682019 @default.
- W80917968 countsByYear W809179682020 @default.
- W80917968 countsByYear W809179682021 @default.
- W80917968 countsByYear W809179682022 @default.
- W80917968 countsByYear W809179682023 @default.
- W80917968 crossrefType "proceedings-article" @default.
- W80917968 hasAuthorship W80917968A5022878003 @default.
- W80917968 hasAuthorship W80917968A5049521806 @default.
- W80917968 hasAuthorship W80917968A5089436894 @default.
- W80917968 hasBestOaLocation W809179682 @default.
- W80917968 hasConcept C102164700 @default.
- W80917968 hasConcept C103278499 @default.
- W80917968 hasConcept C104047586 @default.
- W80917968 hasConcept C111030470 @default.
- W80917968 hasConcept C113238511 @default.
- W80917968 hasConcept C11413529 @default.
- W80917968 hasConcept C115961682 @default.
- W80917968 hasConcept C116738811 @default.
- W80917968 hasConcept C124101348 @default.
- W80917968 hasConcept C143724316 @default.
- W80917968 hasConcept C151730666 @default.
- W80917968 hasConcept C153180895 @default.
- W80917968 hasConcept C154945302 @default.
- W80917968 hasConcept C21080849 @default.
- W80917968 hasConcept C2164484 @default.
- W80917968 hasConcept C2524010 @default.
- W80917968 hasConcept C28719098 @default.
- W80917968 hasConcept C33704608 @default.
- W80917968 hasConcept C33923547 @default.
- W80917968 hasConcept C41008148 @default.
- W80917968 hasConcept C46576248 @default.
- W80917968 hasConcept C73555534 @default.
- W80917968 hasConcept C76155785 @default.
- W80917968 hasConcept C79337645 @default.
- W80917968 hasConcept C86803240 @default.
- W80917968 hasConcept C94641424 @default.
- W80917968 hasConceptScore W80917968C102164700 @default.
- W80917968 hasConceptScore W80917968C103278499 @default.
- W80917968 hasConceptScore W80917968C104047586 @default.
- W80917968 hasConceptScore W80917968C111030470 @default.
- W80917968 hasConceptScore W80917968C113238511 @default.
- W80917968 hasConceptScore W80917968C11413529 @default.
- W80917968 hasConceptScore W80917968C115961682 @default.
- W80917968 hasConceptScore W80917968C116738811 @default.
- W80917968 hasConceptScore W80917968C124101348 @default.
- W80917968 hasConceptScore W80917968C143724316 @default.
- W80917968 hasConceptScore W80917968C151730666 @default.
- W80917968 hasConceptScore W80917968C153180895 @default.
- W80917968 hasConceptScore W80917968C154945302 @default.