Matches in SemOpenAlex for { <https://semopenalex.org/work/W2189162653> ?p ?o ?g. }
- W2189162653 abstract "Multidimensional similarity join finds pairs of multidimensional points that are within some small distance of each other. The ε-k-d-B tree has been proposed as a data structure that scales better as the number of dimensions increases compared to previous data structures. We present a cost model of the ε-k-d-B tree and use it to optimize the leaf size. We present novel parallel algorithms for the similarity join using the ε-k-d-B tree. A load-balancing strategy based on equi-depth histograms is shown to work well for uniform or low-skew situations, whereas another based on weighted equi-depth histograms works far better for high-skew datasets. The latter strategy is only slightly slower than the former strategy for low-skew datasets. Further, its cost is proportional to the overall cost of the similarity join. The ε-k-d-B tree is a new multidimensional index structure that has been proposed for performing similarity join on high-dimensional points (2). It has been shown to be considerably superior to other structures for performing the similarity join on high-dimensional points. In this paper, we present a cost model for performing similarity join using the ε-k-d-B tree. We use our cost model to dynamically determine the leaf size threshold. This threshold has a significant effect on the cost of the similarity join operation. Our experimental results show that our model is reasonably effective. This cost model is also useful for its parallelization. The parallelization of similarity join is difficult because of skewed amounts of work required in different parts of the tree. The amount of work required for different parts of the tree can be a superlinear function of the number of associated points. In this paper, we present a novel sampling-based scheme for the parallelization of this problem. Our scheme uses a subset of data to estimate the amounts of work required based on the cost model discussed earlier. A comparison with a simplistic scheme based on assigning approximately equal numbers of points to different numbers of processors shows that our scheme performs significantly better in the presence of data skews, even for 16 processors. The rest of this paper is organized as follows. In Section 2, we describe how to determine the optimal or near-optimal leaf size of the ε-k-d-B tree. In Section 3, we describe several parallel algorithms for computing the similarity join and a novel load-balancing strategy suitable for parallelizing problems which are sensitive to the presence of data skew and are not iterative in nature. Section 4 presents experimental results. Section 5 presents our conclusions." @default.
- W2189162653 created "2016-06-24" @default.
- W2189162653 creator A5002766037 @default.
- W2189162653 creator A5030983403 @default.
- W2189162653 creator A5077570468 @default.
- W2189162653 date "1998-01-01" @default.
- W2189162653 modified "2023-10-11" @default.
- W2189162653 title "An Efficient Parallel Algorithm for High Dimensional Si" @default.
- W2189162653 cites W1573284583 @default.
- W2189162653 cites W2093191240 @default.
- W2189162653 cites W2542460421 @default.
- W2189162653 cites W86694589 @default.
- W2189162653 hasPublicationYear "1998" @default.
- W2189162653 type Work @default.
- W2189162653 sameAs 2189162653 @default.
- W2189162653 citedByCount "0" @default.
- W2189162653 crossrefType "journal-article" @default.
- W2189162653 hasAuthorship W2189162653A5002766037 @default.
- W2189162653 hasAuthorship W2189162653A5030983403 @default.
- W2189162653 hasAuthorship W2189162653A5077570468 @default.
- W2189162653 hasConcept C103278499 @default.
- W2189162653 hasConcept C105795698 @default.
- W2189162653 hasConcept C106278948 @default.
- W2189162653 hasConcept C113174947 @default.
- W2189162653 hasConcept C11413529 @default.
- W2189162653 hasConcept C114614502 @default.
- W2189162653 hasConcept C115961682 @default.
- W2189162653 hasConcept C124101348 @default.
- W2189162653 hasConcept C154945302 @default.
- W2189162653 hasConcept C159620131 @default.
- W2189162653 hasConcept C203689450 @default.
- W2189162653 hasConcept C2776124973 @default.
- W2189162653 hasConcept C33923547 @default.
- W2189162653 hasConcept C41008148 @default.
- W2189162653 hasConcept C43711488 @default.
- W2189162653 hasConcept C53533937 @default.
- W2189162653 hasConcept C76155785 @default.
- W2189162653 hasConceptScore W2189162653C103278499 @default.
- W2189162653 hasConceptScore W2189162653C105795698 @default.
- W2189162653 hasConceptScore W2189162653C106278948 @default.
- W2189162653 hasConceptScore W2189162653C113174947 @default.
- W2189162653 hasConceptScore W2189162653C11413529 @default.
- W2189162653 hasConceptScore W2189162653C114614502 @default.
- W2189162653 hasConceptScore W2189162653C115961682 @default.
- W2189162653 hasConceptScore W2189162653C124101348 @default.
- W2189162653 hasConceptScore W2189162653C154945302 @default.
- W2189162653 hasConceptScore W2189162653C159620131 @default.
- W2189162653 hasConceptScore W2189162653C203689450 @default.
- W2189162653 hasConceptScore W2189162653C2776124973 @default.
- W2189162653 hasConceptScore W2189162653C33923547 @default.
- W2189162653 hasConceptScore W2189162653C41008148 @default.
- W2189162653 hasConceptScore W2189162653C43711488 @default.
- W2189162653 hasConceptScore W2189162653C53533937 @default.
- W2189162653 hasConceptScore W2189162653C76155785 @default.
- W2189162653 hasLocation W21891626531 @default.
- W2189162653 hasOpenAccess W2189162653 @default.
- W2189162653 hasPrimaryLocation W21891626531 @default.
- W2189162653 hasRelatedWork W1490377673 @default.
- W2189162653 hasRelatedWork W1558727706 @default.
- W2189162653 hasRelatedWork W2061980994 @default.
- W2189162653 hasRelatedWork W2134627110 @default.
- W2189162653 hasRelatedWork W2174057495 @default.
- W2189162653 hasRelatedWork W2182642465 @default.
- W2189162653 hasRelatedWork W2270660075 @default.
- W2189162653 hasRelatedWork W2294331997 @default.
- W2189162653 hasRelatedWork W2361690768 @default.
- W2189162653 hasRelatedWork W2387874800 @default.
- W2189162653 hasRelatedWork W2486151422 @default.
- W2189162653 hasRelatedWork W2522037286 @default.
- W2189162653 hasRelatedWork W2542460421 @default.
- W2189162653 hasRelatedWork W2942208613 @default.
- W2189162653 hasRelatedWork W2950240972 @default.
- W2189162653 hasRelatedWork W2952007262 @default.
- W2189162653 hasRelatedWork W2955853211 @default.
- W2189162653 hasRelatedWork W2963146809 @default.
- W2189162653 hasRelatedWork W3090078726 @default.
- W2189162653 hasRelatedWork W3143307485 @default.
- W2189162653 isParatext "false" @default.
- W2189162653 isRetracted "false" @default.
- W2189162653 magId "2189162653" @default.
- W2189162653 workType "article" @default.