Matches in SemOpenAlex for { <https://semopenalex.org/work/W3171504606> ?p ?o ?g. }
- W3171504606 abstract "We consider the problem of estimating the number of distinct elements in a large data set (or, equivalently, the support size of the distribution induced by the data set) from a random sample of its elements. The problem occurs in many applications, including biology, genomics, computer systems and linguistics. A line of research spanning the last decade resulted in algorithms that estimate the support up to $\pm \varepsilon n$ from a sample of size $O(\log^2(1/\varepsilon) \cdot n/\log n)$, where $n$ is the data set size. Unfortunately, this bound is known to be tight, limiting further improvements to the complexity of this problem. In this paper we consider estimation algorithms augmented with a machine-learning-based predictor that, given any element, returns an estimation of its frequency. We show that if the predictor is correct up to a constant approximation factor, then the sample complexity can be reduced significantly, to \[ \log(1/\varepsilon) \cdot n^{1-\Theta(1/\log(1/\varepsilon))}. \] We evaluate the proposed algorithms on a collection of data sets, using the neural-network based estimators from Hsu et al. (ICLR'19) as predictors. Our experiments demonstrate substantial (up to 3x) improvements in the estimation accuracy compared to the state-of-the-art algorithm." @default.
- W3171504606 created "2021-06-22" @default.
- W3171504606 creator A5029495902 @default.
- W3171504606 creator A5032573706 @default.
- W3171504606 creator A5041567023 @default.
- W3171504606 creator A5076056716 @default.
- W3171504606 creator A5086071515 @default.
- W3171504606 creator A5086741302 @default.
- W3171504606 date "2021-06-15" @default.
- W3171504606 modified "2023-09-27" @default.
- W3171504606 title "Learning-based Support Estimation in Sublinear Time" @default.
- W3171504606 cites W1512638263 @default.
- W3171504606 cites W1557882449 @default.
- W3171504606 cites W2096669689 @default.
- W3171504606 cites W2102942501 @default.
- W3171504606 cites W2103126020 @default.
- W3171504606 cites W2124055802 @default.
- W3171504606 cites W2127090196 @default.
- W3171504606 cites W2146368895 @default.
- W3171504606 cites W2170990775 @default.
- W3171504606 cites W2253327025 @default.
- W3171504606 cites W2545606300 @default.
- W3171504606 cites W2554091414 @default.
- W3171504606 cites W2595294663 @default.
- W3171504606 cites W2607264901 @default.
- W3171504606 cites W2769478807 @default.
- W3171504606 cites W2798303859 @default.
- W3171504606 cites W2890643081 @default.
- W3171504606 cites W2891784792 @default.
- W3171504606 cites W2909813108 @default.
- W3171504606 cites W2952763926 @default.
- W3171504606 cites W2962771342 @default.
- W3171504606 cites W2963017284 @default.
- W3171504606 cites W2963213486 @default.
- W3171504606 cites W2963785501 @default.
- W3171504606 cites W2963836097 @default.
- W3171504606 cites W2964316188 @default.
- W3171504606 cites W2996022682 @default.
- W3171504606 cites W3102722370 @default.
- W3171504606 hasPublicationYear "2021" @default.
- W3171504606 type Work @default.
- W3171504606 sameAs 3171504606 @default.
- W3171504606 citedByCount "2" @default.
- W3171504606 countsByYear W31715046062021 @default.
- W3171504606 countsByYear W31715046062023 @default.
- W3171504606 crossrefType "posted-content" @default.
- W3171504606 hasAuthorship W3171504606A5029495902 @default.
- W3171504606 hasAuthorship W3171504606A5032573706 @default.
- W3171504606 hasAuthorship W3171504606A5041567023 @default.
- W3171504606 hasAuthorship W3171504606A5076056716 @default.
- W3171504606 hasAuthorship W3171504606A5086071515 @default.
- W3171504606 hasAuthorship W3171504606A5086741302 @default.
- W3171504606 hasConcept C105795698 @default.
- W3171504606 hasConcept C110121322 @default.
- W3171504606 hasConcept C11413529 @default.
- W3171504606 hasConcept C114614502 @default.
- W3171504606 hasConcept C117160843 @default.
- W3171504606 hasConcept C129848803 @default.
- W3171504606 hasConcept C134306372 @default.
- W3171504606 hasConcept C154945302 @default.
- W3171504606 hasConcept C177264268 @default.
- W3171504606 hasConcept C185429906 @default.
- W3171504606 hasConcept C199360897 @default.
- W3171504606 hasConcept C2777027219 @default.
- W3171504606 hasConcept C2778445095 @default.
- W3171504606 hasConcept C33923547 @default.
- W3171504606 hasConcept C41008148 @default.
- W3171504606 hasConcept C58489278 @default.
- W3171504606 hasConcept C63553672 @default.
- W3171504606 hasConcept C77553402 @default.
- W3171504606 hasConceptScore W3171504606C105795698 @default.
- W3171504606 hasConceptScore W3171504606C110121322 @default.
- W3171504606 hasConceptScore W3171504606C11413529 @default.
- W3171504606 hasConceptScore W3171504606C114614502 @default.
- W3171504606 hasConceptScore W3171504606C117160843 @default.
- W3171504606 hasConceptScore W3171504606C129848803 @default.
- W3171504606 hasConceptScore W3171504606C134306372 @default.
- W3171504606 hasConceptScore W3171504606C154945302 @default.
- W3171504606 hasConceptScore W3171504606C177264268 @default.
- W3171504606 hasConceptScore W3171504606C185429906 @default.
- W3171504606 hasConceptScore W3171504606C199360897 @default.
- W3171504606 hasConceptScore W3171504606C2777027219 @default.
- W3171504606 hasConceptScore W3171504606C2778445095 @default.
- W3171504606 hasConceptScore W3171504606C33923547 @default.
- W3171504606 hasConceptScore W3171504606C41008148 @default.
- W3171504606 hasConceptScore W3171504606C58489278 @default.
- W3171504606 hasConceptScore W3171504606C63553672 @default.
- W3171504606 hasConceptScore W3171504606C77553402 @default.
- W3171504606 hasLocation W31715046061 @default.
- W3171504606 hasOpenAccess W3171504606 @default.
- W3171504606 hasPrimaryLocation W31715046061 @default.
- W3171504606 hasRelatedWork W1724772559 @default.
- W3171504606 hasRelatedWork W1813877927 @default.
- W3171504606 hasRelatedWork W1828331400 @default.
- W3171504606 hasRelatedWork W1961501705 @default.
- W3171504606 hasRelatedWork W2015649155 @default.
- W3171504606 hasRelatedWork W2062817469 @default.
- W3171504606 hasRelatedWork W2150024778 @default.
- W3171504606 hasRelatedWork W2293992861 @default.
- W3171504606 hasRelatedWork W2902308568 @default.