Matches in SemOpenAlex for { <https://semopenalex.org/work/W2022942700> ?p ?o ?g. }
- W2022942700 abstract "Random sampling is one of the most fundamental data management tools available. However, most current research involving sampling considers the problem of how to use a sample, and not how to compute one. The implicit assumption is that a sample is a small data structure that is easily maintained as new data are encountered, even though simple statistical arguments demonstrate that very large samples of gigabytes or terabytes in size can be necessary to provide high accuracy. No existing work tackles the problem of maintaining very large, disk-based samples from a data management perspective, and no techniques now exist for maintaining very large samples in an online manner from streaming data. In this paper, we present online algorithms for maintaining on-disk samples that are gigabytes or terabytes in size. The algorithms are designed for streaming data, or for any environment where a large sample must be maintained online in a single pass through a data set. The algorithms meet the strict requirement that the sample always be a true, statistically random sample (without replacement) of all of the data processed thus far. Our algorithms are also suitable for biased or unequal probability sampling." @default.
- W2022942700 created "2016-06-24" @default.
- W2022942700 creator A5022320149 @default.
- W2022942700 creator A5027798417 @default.
- W2022942700 creator A5043658441 @default.
- W2022942700 date "2004-06-13" @default.
- W2022942700 modified "2023-09-27" @default.
- W2022942700 title "Online maintenance of very large random samples" @default.
- W2022942700 cites W1992023276 @default.
- W2022942700 cites W2037701287 @default.
- W2022942700 cites W2040793975 @default.
- W2022942700 cites W2068625010 @default.
- W2022942700 cites W2078907037 @default.
- W2022942700 cites W2081189989 @default.
- W2022942700 cites W2090403603 @default.
- W2022942700 cites W2113046370 @default.
- W2022942700 cites W2118924829 @default.
- W2022942700 cites W2119885577 @default.
- W2022942700 cites W2145385038 @default.
- W2022942700 cites W2148706674 @default.
- W2022942700 cites W2154165046 @default.
- W2022942700 cites W2296677182 @default.
- W2022942700 cites W4229903866 @default.
- W2022942700 cites W4231287357 @default.
- W2022942700 cites W4234667859 @default.
- W2022942700 cites W4241185933 @default.
- W2022942700 cites W4250212406 @default.
- W2022942700 doi "https://doi.org/10.1145/1007568.1007603" @default.
- W2022942700 hasPublicationYear "2004" @default.
- W2022942700 type Work @default.
- W2022942700 sameAs 2022942700 @default.
- W2022942700 citedByCount "42" @default.
- W2022942700 countsByYear W20229427002012 @default.
- W2022942700 countsByYear W20229427002013 @default.
- W2022942700 countsByYear W20229427002014 @default.
- W2022942700 countsByYear W20229427002015 @default.
- W2022942700 countsByYear W20229427002016 @default.
- W2022942700 countsByYear W20229427002017 @default.
- W2022942700 countsByYear W20229427002018 @default.
- W2022942700 countsByYear W20229427002019 @default.
- W2022942700 countsByYear W20229427002020 @default.
- W2022942700 crossrefType "proceedings-article" @default.
- W2022942700 hasAuthorship W2022942700A5022320149 @default.
- W2022942700 hasAuthorship W2022942700A5027798417 @default.
- W2022942700 hasAuthorship W2022942700A5043658441 @default.
- W2022942700 hasConcept C41008148 @default.
- W2022942700 hasConceptScore W2022942700C41008148 @default.
- W2022942700 hasLocation W20229427001 @default.
- W2022942700 hasOpenAccess W2022942700 @default.
- W2022942700 hasPrimaryLocation W20229427001 @default.
- W2022942700 hasRelatedWork W1596801655 @default.
- W2022942700 hasRelatedWork W2130043461 @default.
- W2022942700 hasRelatedWork W2350741829 @default.
- W2022942700 hasRelatedWork W2358668433 @default.
- W2022942700 hasRelatedWork W2376932109 @default.
- W2022942700 hasRelatedWork W2382290278 @default.
- W2022942700 hasRelatedWork W2390279801 @default.
- W2022942700 hasRelatedWork W2748952813 @default.
- W2022942700 hasRelatedWork W2899084033 @default.
- W2022942700 hasRelatedWork W2530322880 @default.
- W2022942700 isParatext "false" @default.
- W2022942700 isRetracted "false" @default.
- W2022942700 magId "2022942700" @default.
- W2022942700 workType "article" @default.