Matches in SemOpenAlex for { <https://semopenalex.org/work/W2953282389> ?p ?o ?g. }
- W2953282389 abstract "Clustering of data points is a fundamental tool in data analysis. We consider points X in a relaxed metric space, where the triangle inequality holds within a constant factor. A clustering of X is a partition of X defined by a set of points Q (centroids), according to the closest centroid. The cost of clustering X by Q is V(Q) = ∑_{x ∈ X} d_{xQ}. This formulation generalizes classic k-means clustering, which uses squared distances. Two basic tasks, parametrized by k ≥ 1, are cost estimation, which returns (approximate) V(Q) for queries Q such that |Q| = k, and clustering, which returns an (approximate) minimizer of V(Q) of size |Q| = k. When the data set X is very large, we seek efficient constructions of small samples that can act as surrogates for performing these tasks. Existing constructions that provide quality guarantees, however, are either worst-case, and unable to benefit from structure of real data sets, or make explicit strong assumptions on the structure. We show here how to avoid both these pitfalls using adaptive designs. The core of our design are the novel one2all probabilities, computed for a set M of centroids and α ≥ 1: The clustering cost of each Q with cost V(Q) ≥ V(M)/α can be estimated well from a sample of size O(α |M| ε^{-2}). For cost estimation, we apply one2all with a bicriteria approximate M, while adaptively balancing |M| and α to optimize sample size per quality. For clustering, we present a wrapper that adaptively applies a base clustering algorithm to a sample S, using the smallest sample that provides the desired statistical guarantees on quality. We demonstrate experimentally the huge gains of using our adaptive instead of worst-case methods." @default.
- W2953282389 created "2019-06-27" @default.
- W2953282389 creator A5006699796 @default.
- W2953282389 creator A5026385549 @default.
- W2953282389 creator A5066485203 @default.
- W2953282389 date "2018-04-29" @default.
- W2953282389 modified "2023-09-25" @default.
- W2953282389 title "Clustering Small Samples With Quality Guarantees: Adaptivity With One2all PPS" @default.
- W2953282389 cites W1530581016 @default.
- W2953282389 cites W1571664355 @default.
- W2953282389 cites W1876468466 @default.
- W2953282389 cites W1937109390 @default.
- W2953282389 cites W1956536100 @default.
- W2953282389 cites W1965996575 @default.
- W2953282389 cites W1981773323 @default.
- W2953282389 cites W2001947543 @default.
- W2953282389 cites W2029685080 @default.
- W2953282389 cites W2045555847 @default.
- W2953282389 cites W2045964207 @default.
- W2953282389 cites W2058295780 @default.
- W2953282389 cites W2073459066 @default.
- W2953282389 cites W2077499269 @default.
- W2953282389 cites W2086959852 @default.
- W2953282389 cites W2092236286 @default.
- W2953282389 cites W2094048240 @default.
- W2953282389 cites W2133157266 @default.
- W2953282389 cites W2142035328 @default.
- W2953282389 cites W2150593711 @default.
- W2953282389 cites W2199495299 @default.
- W2953282389 cites W2229238337 @default.
- W2953282389 cites W2247953682 @default.
- W2953282389 cites W2750384547 @default.
- W2953282389 cites W2963320491 @default.
- W2953282389 doi "https://doi.org/10.1609/aaai.v32i1.11772" @default.
- W2953282389 hasPublicationYear "2018" @default.
- W2953282389 type Work @default.
- W2953282389 sameAs 2953282389 @default.
- W2953282389 citedByCount "0" @default.
- W2953282389 crossrefType "journal-article" @default.
- W2953282389 hasAuthorship W2953282389A5006699796 @default.
- W2953282389 hasAuthorship W2953282389A5026385549 @default.
- W2953282389 hasAuthorship W2953282389A5066485203 @default.
- W2953282389 hasBestOaLocation W29532823891 @default.
- W2953282389 hasConcept C105795698 @default.
- W2953282389 hasConcept C11413529 @default.
- W2953282389 hasConcept C114614502 @default.
- W2953282389 hasConcept C115328559 @default.
- W2953282389 hasConcept C124101348 @default.
- W2953282389 hasConcept C146599234 @default.
- W2953282389 hasConcept C162324750 @default.
- W2953282389 hasConcept C176217482 @default.
- W2953282389 hasConcept C177264268 @default.
- W2953282389 hasConcept C182964748 @default.
- W2953282389 hasConcept C199360897 @default.
- W2953282389 hasConcept C207968372 @default.
- W2953282389 hasConcept C21547014 @default.
- W2953282389 hasConcept C2524010 @default.
- W2953282389 hasConcept C33704608 @default.
- W2953282389 hasConcept C33923547 @default.
- W2953282389 hasConcept C41008148 @default.
- W2953282389 hasConcept C42812 @default.
- W2953282389 hasConcept C73555534 @default.
- W2953282389 hasConcept C94641424 @default.
- W2953282389 hasConceptScore W2953282389C105795698 @default.
- W2953282389 hasConceptScore W2953282389C11413529 @default.
- W2953282389 hasConceptScore W2953282389C114614502 @default.
- W2953282389 hasConceptScore W2953282389C115328559 @default.
- W2953282389 hasConceptScore W2953282389C124101348 @default.
- W2953282389 hasConceptScore W2953282389C146599234 @default.
- W2953282389 hasConceptScore W2953282389C162324750 @default.
- W2953282389 hasConceptScore W2953282389C176217482 @default.
- W2953282389 hasConceptScore W2953282389C177264268 @default.
- W2953282389 hasConceptScore W2953282389C182964748 @default.
- W2953282389 hasConceptScore W2953282389C199360897 @default.
- W2953282389 hasConceptScore W2953282389C207968372 @default.
- W2953282389 hasConceptScore W2953282389C21547014 @default.
- W2953282389 hasConceptScore W2953282389C2524010 @default.
- W2953282389 hasConceptScore W2953282389C33704608 @default.
- W2953282389 hasConceptScore W2953282389C33923547 @default.
- W2953282389 hasConceptScore W2953282389C41008148 @default.
- W2953282389 hasConceptScore W2953282389C42812 @default.
- W2953282389 hasConceptScore W2953282389C73555534 @default.
- W2953282389 hasConceptScore W2953282389C94641424 @default.
- W2953282389 hasIssue "1" @default.
- W2953282389 hasLocation W29532823891 @default.
- W2953282389 hasLocation W29532823892 @default.
- W2953282389 hasOpenAccess W2953282389 @default.
- W2953282389 hasPrimaryLocation W29532823891 @default.
- W2953282389 hasRelatedWork W2017920586 @default.
- W2953282389 hasRelatedWork W2045437074 @default.
- W2953282389 hasRelatedWork W2151036524 @default.
- W2953282389 hasRelatedWork W2187506573 @default.
- W2953282389 hasRelatedWork W2942177010 @default.
- W2953282389 hasRelatedWork W2953024232 @default.
- W2953282389 hasRelatedWork W2953282389 @default.
- W2953282389 hasRelatedWork W4298875530 @default.
- W2953282389 hasRelatedWork W4301079385 @default.
- W2953282389 hasRelatedWork W2992603957 @default.
- W2953282389 hasVolume "32" @default.
- W2953282389 isParatext "false" @default.
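
The clustering cost V(Q) defined in the abstract above can be illustrated with a minimal Python sketch. It assumes Euclidean points and takes d_{xQ} to be the distance from x to its closest centroid in Q, raised to a power (power=2 corresponds to the squared distances of k-means). The function names, the `power` parameter, and the generic inverse-probability estimator `weighted_cost` are assumptions made for illustration; this is not the paper's one2all construction, and the sample probabilities are supplied by the caller.

```python
import math

def dist_to_set(x, Q, power=1):
    """Distance from point x to its closest centroid in Q, raised to `power`."""
    return min(math.dist(x, q) for q in Q) ** power

def clustering_cost(X, Q, power=1):
    """V(Q): sum over x in X of the (powered) distance to the closest centroid."""
    return sum(dist_to_set(x, Q, power) for x in X)

def assign(X, Q):
    """Partition X by closest centroid: one centroid index per point."""
    return [min(range(len(Q)), key=lambda j: math.dist(x, Q[j])) for x in X]

def weighted_cost(sample, Q, power=1):
    """Inverse-probability estimate of V(Q).

    `sample` holds (point, inclusion_probability) pairs. How the
    probabilities are chosen is left open here (hypothetical input).
    """
    return sum(dist_to_set(x, Q, power) / p for x, p in sample)

# Toy data, chosen only for illustration.
X = [(0.0, 0.0), (0.0, 1.0), (4.0, 0.0), (4.0, 3.0)]
Q = [(0.0, 0.0), (4.0, 0.0)]

print(clustering_cost(X, Q))           # plain distances
print(clustering_cost(X, Q, power=2))  # squared distances, as in k-means
print(assign(X, Q))
```

When every point is sampled with probability 1, `weighted_cost` reduces to `clustering_cost`; with smaller probabilities it estimates the same quantity from fewer points, which is the role the abstract assigns to a surrogate sample.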