Matches in SemOpenAlex for { <https://semopenalex.org/work/W3126867710> ?p ?o ?g. }
- W3126867710 abstract "We consider the problem of estimating the number of distinct elements in a large data set (or, equivalently, the support size of the distribution induced by the data set) from a random sample of its elements. The problem occurs in many applications, including biology, genomics, computer systems and linguistics. A line of research spanning the last decade resulted in algorithms that estimate the support up to $\pm \varepsilon n$ from a sample of size $O(\log^2(1/\varepsilon) \cdot n/\log n)$, where $n$ is the data set size. Unfortunately, this bound is known to be tight, limiting further improvements to the complexity of this problem. In this paper we consider estimation algorithms augmented with a machine-learning-based predictor that, given any element, returns an estimation of its frequency. We show that if the predictor is correct up to a constant approximation factor, then the sample complexity can be reduced significantly, to $\log(1/\varepsilon) \cdot n^{1-\Theta(1/\log(1/\varepsilon))}$. We evaluate the proposed algorithms on a collection of data sets, using the neural-network-based estimators from Hsu et al. (ICLR'19) as predictors. Our experiments demonstrate substantial (up to 3x) improvements in the estimation accuracy compared to the state-of-the-art algorithm." @default.
- W3126867710 created "2021-02-15" @default.
- W3126867710 creator A5029495902 @default.
- W3126867710 creator A5032573706 @default.
- W3126867710 creator A5041567023 @default.
- W3126867710 creator A5076056716 @default.
- W3126867710 creator A5086071515 @default.
- W3126867710 creator A5086741302 @default.
- W3126867710 date "2021-05-03" @default.
- W3126867710 modified "2023-09-26" @default.
- W3126867710 title "Learning-based Support Estimation in Sublinear Time" @default.
- W3126867710 cites W1512638263 @default.
- W3126867710 cites W1557882449 @default.
- W3126867710 cites W2096669689 @default.
- W3126867710 cites W2102942501 @default.
- W3126867710 cites W2103126020 @default.
- W3126867710 cites W2124055802 @default.
- W3126867710 cites W2127090196 @default.
- W3126867710 cites W2146368895 @default.
- W3126867710 cites W2170990775 @default.
- W3126867710 cites W2545606300 @default.
- W3126867710 cites W2554091414 @default.
- W3126867710 cites W2595294663 @default.
- W3126867710 cites W2607264901 @default.
- W3126867710 cites W2769478807 @default.
- W3126867710 cites W2890643081 @default.
- W3126867710 cites W2891784792 @default.
- W3126867710 cites W2909813108 @default.
- W3126867710 cites W2952763926 @default.
- W3126867710 cites W2962771342 @default.
- W3126867710 cites W2963017284 @default.
- W3126867710 cites W2963213486 @default.
- W3126867710 cites W2963785501 @default.
- W3126867710 cites W2963836097 @default.
- W3126867710 cites W2964316188 @default.
- W3126867710 cites W2996022682 @default.
- W3126867710 cites W3102722370 @default.
- W3126867710 hasPublicationYear "2021" @default.
- W3126867710 type Work @default.
- W3126867710 sameAs 3126867710 @default.
- W3126867710 citedByCount "4" @default.
- W3126867710 countsByYear W31268677102019 @default.
- W3126867710 countsByYear W31268677102020 @default.
- W3126867710 countsByYear W31268677102021 @default.
- W3126867710 crossrefType "proceedings-article" @default.
- W3126867710 hasAuthorship W3126867710A5029495902 @default.
- W3126867710 hasAuthorship W3126867710A5032573706 @default.
- W3126867710 hasAuthorship W3126867710A5041567023 @default.
- W3126867710 hasAuthorship W3126867710A5076056716 @default.
- W3126867710 hasAuthorship W3126867710A5086071515 @default.
- W3126867710 hasAuthorship W3126867710A5086741302 @default.
- W3126867710 hasConcept C105795698 @default.
- W3126867710 hasConcept C11413529 @default.
- W3126867710 hasConcept C117160843 @default.
- W3126867710 hasConcept C118615104 @default.
- W3126867710 hasConcept C129848803 @default.
- W3126867710 hasConcept C134306372 @default.
- W3126867710 hasConcept C154945302 @default.
- W3126867710 hasConcept C177264268 @default.
- W3126867710 hasConcept C179799912 @default.
- W3126867710 hasConcept C185429906 @default.
- W3126867710 hasConcept C199360897 @default.
- W3126867710 hasConcept C33923547 @default.
- W3126867710 hasConcept C41008148 @default.
- W3126867710 hasConcept C50644808 @default.
- W3126867710 hasConcept C58489278 @default.
- W3126867710 hasConcept C77553402 @default.
- W3126867710 hasConceptScore W3126867710C105795698 @default.
- W3126867710 hasConceptScore W3126867710C11413529 @default.
- W3126867710 hasConceptScore W3126867710C117160843 @default.
- W3126867710 hasConceptScore W3126867710C118615104 @default.
- W3126867710 hasConceptScore W3126867710C129848803 @default.
- W3126867710 hasConceptScore W3126867710C134306372 @default.
- W3126867710 hasConceptScore W3126867710C154945302 @default.
- W3126867710 hasConceptScore W3126867710C177264268 @default.
- W3126867710 hasConceptScore W3126867710C179799912 @default.
- W3126867710 hasConceptScore W3126867710C185429906 @default.
- W3126867710 hasConceptScore W3126867710C199360897 @default.
- W3126867710 hasConceptScore W3126867710C33923547 @default.
- W3126867710 hasConceptScore W3126867710C41008148 @default.
- W3126867710 hasConceptScore W3126867710C50644808 @default.
- W3126867710 hasConceptScore W3126867710C58489278 @default.
- W3126867710 hasConceptScore W3126867710C77553402 @default.
- W3126867710 hasLocation W31268677101 @default.
- W3126867710 hasOpenAccess W3126867710 @default.
- W3126867710 hasPrimaryLocation W31268677101 @default.
- W3126867710 hasRelatedWork W1813877927 @default.
- W3126867710 hasRelatedWork W2062817469 @default.
- W3126867710 hasRelatedWork W2112452856 @default.
- W3126867710 hasRelatedWork W2150024778 @default.
- W3126867710 hasRelatedWork W2293992861 @default.
- W3126867710 hasRelatedWork W2529520435 @default.
- W3126867710 hasRelatedWork W2883374993 @default.
- W3126867710 hasRelatedWork W2918136617 @default.
- W3126867710 hasRelatedWork W2950618930 @default.
- W3126867710 hasRelatedWork W2952046009 @default.
- W3126867710 hasRelatedWork W2952793838 @default.
- W3126867710 hasRelatedWork W2963527939 @default.
- W3126867710 hasRelatedWork W2963977217 @default.
- W3126867710 hasRelatedWork W2964164735 @default.