Matches in SemOpenAlex for { <https://semopenalex.org/work/W2762763087> ?p ?o ?g. }
- W2762763087 endingPage "41" @default.
- W2762763087 startingPage "1" @default.
- W2762763087 abstract "We show that a class of statistical properties of distributions, which includes such practically relevant properties as entropy, the number of distinct elements, and distance metrics between pairs of distributions, can be estimated given a sublinear sized sample. Specifically, given a sample consisting of independent draws from any distribution over at most k distinct elements, these properties can be estimated accurately using a sample of size O(k / log k). For these estimation tasks, this performance is optimal, to constant factors. Complementing these theoretical results, we also demonstrate that our estimators perform exceptionally well, in practice, for a variety of estimation tasks, on a variety of natural distributions, for a wide range of parameters. The key step in our approach is to first use the sample to characterize the “unseen” portion of the distribution, effectively reconstructing this portion of the distribution as accurately as if one had a logarithmic factor larger sample. This goes beyond such tools as the Good-Turing frequency estimation scheme, which estimates the total probability mass of the unobserved portion of the distribution: We seek to estimate the shape of the unobserved portion of the distribution. This work can be seen as introducing a robust, general, and theoretically principled framework that, for many practical applications, essentially amplifies the sample size by a logarithmic factor; we expect that it may be fruitfully used as a component within larger machine learning and statistical analysis systems." @default.
- W2762763087 created "2017-10-20" @default.
- W2762763087 creator A5036230157 @default.
- W2762763087 creator A5079503799 @default.
- W2762763087 date "2017-10-04" @default.
- W2762763087 modified "2023-09-24" @default.
- W2762763087 title "Estimating the Unseen" @default.
- W2762763087 cites W1595687138 @default.
- W2762763087 cites W1654945559 @default.
- W2762763087 cites W1971405816 @default.
- W2762763087 cites W1980179247 @default.
- W2762763087 cites W1982516282 @default.
- W2762763087 cites W1982918157 @default.
- W2762763087 cites W1987754412 @default.
- W2762763087 cites W1989151402 @default.
- W2762763087 cites W1992068214 @default.
- W2762763087 cites W2022257958 @default.
- W2762763087 cites W2045740840 @default.
- W2762763087 cites W2058991275 @default.
- W2762763087 cites W2069241007 @default.
- W2762763087 cites W2073479529 @default.
- W2762763087 cites W2076381458 @default.
- W2762763087 cites W2079473986 @default.
- W2762763087 cites W2082092506 @default.
- W2762763087 cites W2094608047 @default.
- W2762763087 cites W2095306947 @default.
- W2762763087 cites W2097580994 @default.
- W2762763087 cites W2101985079 @default.
- W2762763087 cites W2114771311 @default.
- W2762763087 cites W2134169350 @default.
- W2762763087 cites W2135247172 @default.
- W2762763087 cites W2135827220 @default.
- W2762763087 cites W2146368895 @default.
- W2762763087 cites W2159784709 @default.
- W2762763087 cites W2419099043 @default.
- W2762763087 cites W2510474575 @default.
- W2762763087 cites W2545606300 @default.
- W2762763087 cites W2554091414 @default.
- W2762763087 cites W2899702797 @default.
- W2762763087 cites W2949196898 @default.
- W2762763087 cites W2963608890 @default.
- W2762763087 cites W4233471163 @default.
- W2762763087 cites W4256422854 @default.
- W2762763087 doi "https://doi.org/10.1145/3125643" @default.
- W2762763087 hasPublicationYear "2017" @default.
- W2762763087 type Work @default.
- W2762763087 sameAs 2762763087 @default.
- W2762763087 citedByCount "33" @default.
- W2762763087 countsByYear W27627630872016 @default.
- W2762763087 countsByYear W27627630872018 @default.
- W2762763087 countsByYear W27627630872019 @default.
- W2762763087 countsByYear W27627630872020 @default.
- W2762763087 countsByYear W27627630872021 @default.
- W2762763087 countsByYear W27627630872022 @default.
- W2762763087 countsByYear W27627630872023 @default.
- W2762763087 crossrefType "journal-article" @default.
- W2762763087 hasAuthorship W2762763087A5036230157 @default.
- W2762763087 hasAuthorship W2762763087A5079503799 @default.
- W2762763087 hasBestOaLocation W27627630871 @default.
- W2762763087 hasConcept C105795698 @default.
- W2762763087 hasConcept C106301342 @default.
- W2762763087 hasConcept C11413529 @default.
- W2762763087 hasConcept C117160843 @default.
- W2762763087 hasConcept C118615104 @default.
- W2762763087 hasConcept C121332964 @default.
- W2762763087 hasConcept C129848803 @default.
- W2762763087 hasConcept C134306372 @default.
- W2762763087 hasConcept C136197465 @default.
- W2762763087 hasConcept C149441793 @default.
- W2762763087 hasConcept C159985019 @default.
- W2762763087 hasConcept C185429906 @default.
- W2762763087 hasConcept C185592680 @default.
- W2762763087 hasConcept C192562407 @default.
- W2762763087 hasConcept C198531522 @default.
- W2762763087 hasConcept C199360897 @default.
- W2762763087 hasConcept C204323151 @default.
- W2762763087 hasConcept C2777027219 @default.
- W2762763087 hasConcept C33923547 @default.
- W2762763087 hasConcept C39927690 @default.
- W2762763087 hasConcept C41008148 @default.
- W2762763087 hasConcept C43617362 @default.
- W2762763087 hasConcept C62520636 @default.
- W2762763087 hasConceptScore W2762763087C105795698 @default.
- W2762763087 hasConceptScore W2762763087C106301342 @default.
- W2762763087 hasConceptScore W2762763087C11413529 @default.
- W2762763087 hasConceptScore W2762763087C117160843 @default.
- W2762763087 hasConceptScore W2762763087C118615104 @default.
- W2762763087 hasConceptScore W2762763087C121332964 @default.
- W2762763087 hasConceptScore W2762763087C129848803 @default.
- W2762763087 hasConceptScore W2762763087C134306372 @default.
- W2762763087 hasConceptScore W2762763087C136197465 @default.
- W2762763087 hasConceptScore W2762763087C149441793 @default.
- W2762763087 hasConceptScore W2762763087C159985019 @default.
- W2762763087 hasConceptScore W2762763087C185429906 @default.
- W2762763087 hasConceptScore W2762763087C185592680 @default.
- W2762763087 hasConceptScore W2762763087C192562407 @default.
- W2762763087 hasConceptScore W2762763087C198531522 @default.
- W2762763087 hasConceptScore W2762763087C199360897 @default.