Matches in SemOpenAlex for { <https://semopenalex.org/work/W2140304953> ?p ?o ?g. }
Showing items 1 to 95 of 95 with 100 items per page.
- W2140304953 abstract "Language models are probability distributions over a set of unilingual natural language text and are used in many natural language processing tasks such as statistical machine translation, information retrieval, and speech processing. Because more well-formed training data yields a better model, and because text is increasingly available via the Internet, the size of language modelling n-gram data sets has grown exponentially over the past few years. The latest data sets available can no longer fit on a single computer. A recent investigation reported the first known use of a probabilistic data structure to create a randomised language model capable of storing probability information for massive n-gram sets in a fraction of the space normally needed. We report and compare the properties of lossy language models using two probabilistic data structures: the Bloom filter and the lossy dictionary. The Bloom filter has exceptionally low space requirements and returns only one-sided, false positive errors, but it is computationally slow at scale, which is a potential drawback for a structure queried millions of times per sentence. Lossy dictionaries have low space requirements and are very fast, but have two-sided error, returning both false positives and false negatives. We also investigate combining the properties of the Bloom filter and the lossy dictionary and find that this can be done to create a fast lossy LM with low one-sided error." @default.
- W2140304953 created "2016-06-24" @default.
- W2140304953 creator A5010226962 @default.
- W2140304953 date "2007-01-01" @default.
- W2140304953 modified "2023-09-27" @default.
- W2140304953 title "Bloom Filter and Lossy Dictionary Based Language Models" @default.
- W2140304953 cites W1542553486 @default.
- W2140304953 cites W1568513494 @default.
- W2140304953 cites W1574901103 @default.
- W2140304953 cites W1601018556 @default.
- W2140304953 cites W1631260214 @default.
- W2140304953 cites W182831726 @default.
- W2140304953 cites W1934041838 @default.
- W2140304953 cites W1993284846 @default.
- W2140304953 cites W2033672007 @default.
- W2140304953 cites W2069074882 @default.
- W2140304953 cites W2079145130 @default.
- W2140304953 cites W2081869611 @default.
- W2140304953 cites W2099111195 @default.
- W2140304953 cites W2106540279 @default.
- W2140304953 cites W2109664771 @default.
- W2140304953 cites W2113788796 @default.
- W2140304953 cites W2122056984 @default.
- W2140304953 cites W2123845384 @default.
- W2140304953 cites W2126540423 @default.
- W2140304953 cites W2132083787 @default.
- W2140304953 cites W2144679335 @default.
- W2140304953 cites W2149741699 @default.
- W2140304953 cites W2153653739 @default.
- W2140304953 cites W2154124206 @default.
- W2140304953 cites W2156088814 @default.
- W2140304953 cites W2158195707 @default.
- W2140304953 cites W2169660454 @default.
- W2140304953 cites W2487908531 @default.
- W2140304953 cites W2913618476 @default.
- W2140304953 cites W2914028059 @default.
- W2140304953 cites W3145128584 @default.
- W2140304953 cites W2797816625 @default.
- W2140304953 hasPublicationYear "2007" @default.
- W2140304953 type Work @default.
- W2140304953 sameAs 2140304953 @default.
- W2140304953 citedByCount "2" @default.
- W2140304953 crossrefType "journal-article" @default.
- W2140304953 hasAuthorship W2140304953A5010226962 @default.
- W2140304953 hasConcept C106131492 @default.
- W2140304953 hasConcept C11413529 @default.
- W2140304953 hasConcept C137293760 @default.
- W2140304953 hasConcept C147224247 @default.
- W2140304953 hasConcept C154945302 @default.
- W2140304953 hasConcept C165021410 @default.
- W2140304953 hasConcept C177264268 @default.
- W2140304953 hasConcept C199360897 @default.
- W2140304953 hasConcept C31972630 @default.
- W2140304953 hasConcept C41008148 @default.
- W2140304953 hasConcept C49937458 @default.
- W2140304953 hasConcept C58489278 @default.
- W2140304953 hasConceptScore W2140304953C106131492 @default.
- W2140304953 hasConceptScore W2140304953C11413529 @default.
- W2140304953 hasConceptScore W2140304953C137293760 @default.
- W2140304953 hasConceptScore W2140304953C147224247 @default.
- W2140304953 hasConceptScore W2140304953C154945302 @default.
- W2140304953 hasConceptScore W2140304953C165021410 @default.
- W2140304953 hasConceptScore W2140304953C177264268 @default.
- W2140304953 hasConceptScore W2140304953C199360897 @default.
- W2140304953 hasConceptScore W2140304953C31972630 @default.
- W2140304953 hasConceptScore W2140304953C41008148 @default.
- W2140304953 hasConceptScore W2140304953C49937458 @default.
- W2140304953 hasConceptScore W2140304953C58489278 @default.
- W2140304953 hasLocation W21403049531 @default.
- W2140304953 hasOpenAccess W2140304953 @default.
- W2140304953 hasPrimaryLocation W21403049531 @default.
- W2140304953 hasRelatedWork W127782651 @default.
- W2140304953 hasRelatedWork W130884385 @default.
- W2140304953 hasRelatedWork W1585096967 @default.
- W2140304953 hasRelatedWork W1591887422 @default.
- W2140304953 hasRelatedWork W2071315630 @default.
- W2140304953 hasRelatedWork W2131528687 @default.
- W2140304953 hasRelatedWork W2157073831 @default.
- W2140304953 hasRelatedWork W2166270474 @default.
- W2140304953 hasRelatedWork W2407467516 @default.
- W2140304953 hasRelatedWork W2883416004 @default.
- W2140304953 hasRelatedWork W2892312507 @default.
- W2140304953 hasRelatedWork W289703195 @default.
- W2140304953 hasRelatedWork W2914699162 @default.
- W2140304953 hasRelatedWork W2914864687 @default.
- W2140304953 hasRelatedWork W2952613254 @default.
- W2140304953 hasRelatedWork W3021136755 @default.
- W2140304953 hasRelatedWork W3082494908 @default.
- W2140304953 hasRelatedWork W3107671861 @default.
- W2140304953 hasRelatedWork W3211815492 @default.
- W2140304953 hasRelatedWork W2184468823 @default.
- W2140304953 isParatext "false" @default.
- W2140304953 isRetracted "false" @default.
- W2140304953 magId "2140304953" @default.
- W2140304953 workType "article" @default.
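
The abstract in this record describes the Bloom filter as returning only one-sided, false positive errors. The following is a minimal sketch of that property, not the method used in the work itself: the class name `BloomFilter`, the parameters `num_bits` and `num_hashes`, the salted SHA-256 hashing scheme, and the sample n-grams are all assumptions made for illustration. An item that was added is always reported as present, while an item that was never added may occasionally be reported as present too.

```python
import hashlib


class BloomFilter:
    """Illustrative Bloom filter over strings (all names here are hypothetical)."""

    def __init__(self, num_bits: int, num_hashes: int) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        # One bit per slot, packed into bytes and initialised to zero.
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item: str):
        # Assumption: derive k bit positions by salting SHA-256 with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item: str) -> None:
        # Set every bit position associated with the item.
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: str) -> bool:
        # True only if every associated bit is set. Added items always pass
        # (no false negatives); unseen items may pass by chance (false positives).
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


# Hypothetical n-grams standing in for a language model's training data.
ngrams = ["the cat sat", "cat sat on", "sat on the"]
bf = BloomFilter(num_bits=1024, num_hashes=3)
for ngram in ngrams:
    bf.add(ngram)

stored_ok = all(ngram in bf for ngram in ngrams)
```

Each membership query costs `num_hashes` hash evaluations, which is consistent with the abstract's remark that query speed can become a concern for a structure consulted many times per sentence.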