Matches in SemOpenAlex for { <https://semopenalex.org/work/W2108701800> ?p ?o ?g. }
Showing items 1 to 99 of 99 with 100 items per page.
- W2108701800 abstract "Statistical language models should improve as the size of the n-grams increases from 3 to 5 or higher. However, the number of parameters and calculations, and the storage requirement, increase very rapidly if we attempt to store all possible combinations of n-grams. To avoid these problems, the reduced n-gram approach previously developed by O'Boyle (1993) can be applied. A reduced n-gram language model can store an entire corpus's phrase-history length within feasible storage limits. Another theoretical advantage of reduced n-grams is that they are closer to being semantically complete than traditional models, which include all n-grams. In our experiments, the reduced n-gram Zipf curves are first presented and compared with previously obtained conventional n-grams for both English and Chinese. The reduced n-gram model is then applied to large English and Chinese corpora. For English, we can reduce the model sizes, compared to 7-gram traditional model sizes, by factors of 14.6 for a 40-million-word corpus and 11.0 for a 500-million-word corpus while obtaining 5.8% and 4.2% improvements in perplexity. For Chinese, we gain a 16.9% perplexity reduction and we reduce the model size by a factor larger than 11.2. This paper is a step towards the modeling of English and Chinese using semantically complete phrases in an n-gram model." @default.
- W2108701800 created "2016-06-24" @default.
- W2108701800 creator A5005337593 @default.
- W2108701800 creator A5023410995 @default.
- W2108701800 creator A5026517050 @default.
- W2108701800 creator A5058587854 @default.
- W2108701800 date "2006-01-01" @default.
- W2108701800 modified "2023-10-17" @default.
- W2108701800 title "Reduced n-gram models for English and Chinese corpora" @default.
- W2108701800 cites W158414620 @default.
- W2108701800 cites W1757803263 @default.
- W2108701800 cites W1903115690 @default.
- W2108701800 cites W1934041838 @default.
- W2108701800 cites W1972331134 @default.
- W2108701800 cites W2008422177 @default.
- W2108701800 cites W2024490156 @default.
- W2108701800 cites W2060603187 @default.
- W2108701800 cites W2079656678 @default.
- W2108701800 cites W2090618725 @default.
- W2108701800 cites W2115054880 @default.
- W2108701800 cites W2116625254 @default.
- W2108701800 cites W2124008567 @default.
- W2108701800 cites W2127882143 @default.
- W2108701800 cites W2129940916 @default.
- W2108701800 cites W2134237567 @default.
- W2108701800 cites W2144219418 @default.
- W2108701800 cites W2151297548 @default.
- W2108701800 cites W2157981187 @default.
- W2108701800 cites W45102278 @default.
- W2108701800 cites W84821037 @default.
- W2108701800 doi "https://doi.org/10.3115/1273073.1273113" @default.
- W2108701800 hasPublicationYear "2006" @default.
- W2108701800 type Work @default.
- W2108701800 sameAs 2108701800 @default.
- W2108701800 citedByCount "5" @default.
- W2108701800 countsByYear W21087018002014 @default.
- W2108701800 countsByYear W21087018002019 @default.
- W2108701800 crossrefType "proceedings-article" @default.
- W2108701800 hasAuthorship W2108701800A5005337593 @default.
- W2108701800 hasAuthorship W2108701800A5023410995 @default.
- W2108701800 hasAuthorship W2108701800A5026517050 @default.
- W2108701800 hasAuthorship W2108701800A5058587854 @default.
- W2108701800 hasBestOaLocation W21087018001 @default.
- W2108701800 hasConcept C100279451 @default.
- W2108701800 hasConcept C105795698 @default.
- W2108701800 hasConcept C117884012 @default.
- W2108701800 hasConcept C125932096 @default.
- W2108701800 hasConcept C137293760 @default.
- W2108701800 hasConcept C138885662 @default.
- W2108701800 hasConcept C154945302 @default.
- W2108701800 hasConcept C161369605 @default.
- W2108701800 hasConcept C204321447 @default.
- W2108701800 hasConcept C2524010 @default.
- W2108701800 hasConcept C2776224158 @default.
- W2108701800 hasConcept C3018428822 @default.
- W2108701800 hasConcept C33923547 @default.
- W2108701800 hasConcept C41008148 @default.
- W2108701800 hasConcept C41895202 @default.
- W2108701800 hasConcept C523546767 @default.
- W2108701800 hasConcept C54355233 @default.
- W2108701800 hasConcept C86803240 @default.
- W2108701800 hasConcept C90805587 @default.
- W2108701800 hasConceptScore W2108701800C100279451 @default.
- W2108701800 hasConceptScore W2108701800C105795698 @default.
- W2108701800 hasConceptScore W2108701800C117884012 @default.
- W2108701800 hasConceptScore W2108701800C125932096 @default.
- W2108701800 hasConceptScore W2108701800C137293760 @default.
- W2108701800 hasConceptScore W2108701800C138885662 @default.
- W2108701800 hasConceptScore W2108701800C154945302 @default.
- W2108701800 hasConceptScore W2108701800C161369605 @default.
- W2108701800 hasConceptScore W2108701800C204321447 @default.
- W2108701800 hasConceptScore W2108701800C2524010 @default.
- W2108701800 hasConceptScore W2108701800C2776224158 @default.
- W2108701800 hasConceptScore W2108701800C3018428822 @default.
- W2108701800 hasConceptScore W2108701800C33923547 @default.
- W2108701800 hasConceptScore W2108701800C41008148 @default.
- W2108701800 hasConceptScore W2108701800C41895202 @default.
- W2108701800 hasConceptScore W2108701800C523546767 @default.
- W2108701800 hasConceptScore W2108701800C54355233 @default.
- W2108701800 hasConceptScore W2108701800C86803240 @default.
- W2108701800 hasConceptScore W2108701800C90805587 @default.
- W2108701800 hasLocation W21087018001 @default.
- W2108701800 hasLocation W21087018002 @default.
- W2108701800 hasOpenAccess W2108701800 @default.
- W2108701800 hasPrimaryLocation W21087018001 @default.
- W2108701800 hasRelatedWork W1542956019 @default.
- W2108701800 hasRelatedWork W2088421073 @default.
- W2108701800 hasRelatedWork W2121227244 @default.
- W2108701800 hasRelatedWork W2594077621 @default.
- W2108701800 hasRelatedWork W2906970013 @default.
- W2108701800 hasRelatedWork W2959686711 @default.
- W2108701800 hasRelatedWork W2971281071 @default.
- W2108701800 hasRelatedWork W3084943335 @default.
- W2108701800 hasRelatedWork W3107474891 @default.
- W2108701800 hasRelatedWork W3126081632 @default.
- W2108701800 isParatext "false" @default.
- W2108701800 isRetracted "false" @default.
- W2108701800 magId "2108701800" @default.
- W2108701800 workType "article" @default.