Matches in SemOpenAlex for { <https://semopenalex.org/work/W2077307609> ?p ?o ?g. }
Showing items 1 to 77 of 77 with 100 items per page.
- W2077307609 endingPage "405" @default.
- W2077307609 startingPage "385" @default.
- W2077307609 abstract "The practical challenge of creating a Hungarian e-mail reader has initiated our work on statistical text analysis. The starting point was statistical analysis for automatic discrimination of the language of texts. Later it was extended to automatic re-generation of diacritic signs and more detailed language structure analysis. A parallel study of three different languages (Hungarian, German and English) using text corpora of a similar size gives a possibility for the exploration of both similarities and differences. Corpora of publicly available Internet sources were used. The corpus size was the same (approximately 20 Mbytes, 2.5-3.5 million word forms) for all languages. Besides traditional corpus coverage, word length and occurrence statistics, some new features about prosodic boundaries (sentence initial and final positions, preceding and following a comma) were also computed. Among others, it was found that the coverage of corpora by the most frequent words follows a parallel logarithmic rule for all languages in the 40-85% coverage range, known as Zipf's law in linguistics. The functions are much nearer for English and German than for Hungarian. Further conclusions are also drawn. The language detection and diacritic regeneration applications are discussed in detail with implications on Hungarian speech generation. Diverse further application domains, such as predictive text input, word hyphenation, language modelling in speech recognition, corpus-based speech synthesis, etc. are also foreseen." @default.
- W2077307609 created "2016-06-24" @default.
- W2077307609 creator A5031237235 @default.
- W2077307609 creator A5069988513 @default.
- W2077307609 date "2002-11-01" @default.
- W2077307609 modified "2023-09-28" @default.
- W2077307609 title "Multilingual statistical text analysis, Zipf's law and Hungarian speech generation" @default.
- W2077307609 cites W2037139563 @default.
- W2077307609 doi "https://doi.org/10.1556/aling.49.2002.3-4.8" @default.
- W2077307609 hasPublicationYear "2002" @default.
- W2077307609 type Work @default.
- W2077307609 sameAs 2077307609 @default.
- W2077307609 citedByCount "19" @default.
- W2077307609 countsByYear W20773076092013 @default.
- W2077307609 countsByYear W20773076092014 @default.
- W2077307609 countsByYear W20773076092015 @default.
- W2077307609 countsByYear W20773076092017 @default.
- W2077307609 countsByYear W20773076092018 @default.
- W2077307609 countsByYear W20773076092020 @default.
- W2077307609 countsByYear W20773076092022 @default.
- W2077307609 countsByYear W20773076092023 @default.
- W2077307609 crossrefType "journal-article" @default.
- W2077307609 hasAuthorship W2077307609A5031237235 @default.
- W2077307609 hasAuthorship W2077307609A5069988513 @default.
- W2077307609 hasBestOaLocation W20773076091 @default.
- W2077307609 hasConcept C105795698 @default.
- W2077307609 hasConcept C125932096 @default.
- W2077307609 hasConcept C138885662 @default.
- W2077307609 hasConcept C154775046 @default.
- W2077307609 hasConcept C154945302 @default.
- W2077307609 hasConcept C161831844 @default.
- W2077307609 hasConcept C204321447 @default.
- W2077307609 hasConcept C2524010 @default.
- W2077307609 hasConcept C2777530160 @default.
- W2077307609 hasConcept C28719098 @default.
- W2077307609 hasConcept C33923547 @default.
- W2077307609 hasConcept C41008148 @default.
- W2077307609 hasConcept C41895202 @default.
- W2077307609 hasConcept C532629269 @default.
- W2077307609 hasConcept C90805587 @default.
- W2077307609 hasConceptScore W2077307609C105795698 @default.
- W2077307609 hasConceptScore W2077307609C125932096 @default.
- W2077307609 hasConceptScore W2077307609C138885662 @default.
- W2077307609 hasConceptScore W2077307609C154775046 @default.
- W2077307609 hasConceptScore W2077307609C154945302 @default.
- W2077307609 hasConceptScore W2077307609C161831844 @default.
- W2077307609 hasConceptScore W2077307609C204321447 @default.
- W2077307609 hasConceptScore W2077307609C2524010 @default.
- W2077307609 hasConceptScore W2077307609C2777530160 @default.
- W2077307609 hasConceptScore W2077307609C28719098 @default.
- W2077307609 hasConceptScore W2077307609C33923547 @default.
- W2077307609 hasConceptScore W2077307609C41008148 @default.
- W2077307609 hasConceptScore W2077307609C41895202 @default.
- W2077307609 hasConceptScore W2077307609C532629269 @default.
- W2077307609 hasConceptScore W2077307609C90805587 @default.
- W2077307609 hasIssue "3-4" @default.
- W2077307609 hasLocation W20773076091 @default.
- W2077307609 hasLocation W20773076092 @default.
- W2077307609 hasOpenAccess W2077307609 @default.
- W2077307609 hasPrimaryLocation W20773076091 @default.
- W2077307609 hasRelatedWork W1585034923 @default.
- W2077307609 hasRelatedWork W1978971213 @default.
- W2077307609 hasRelatedWork W2125145484 @default.
- W2077307609 hasRelatedWork W2246782773 @default.
- W2077307609 hasRelatedWork W3107474891 @default.
- W2077307609 hasRelatedWork W3124436655 @default.
- W2077307609 hasRelatedWork W4237451078 @default.
- W2077307609 hasRelatedWork W4237455847 @default.
- W2077307609 hasRelatedWork W4288055551 @default.
- W2077307609 hasRelatedWork W1551406738 @default.
- W2077307609 hasVolume "49" @default.
- W2077307609 isParatext "false" @default.
- W2077307609 isRetracted "false" @default.
- W2077307609 magId "2077307609" @default.
- W2077307609 workType "article" @default.
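
The abstract above refers to the coverage of a corpus by its most frequent words. As a rough illustration of that statistic only, the following is a minimal Python sketch. It is not the authors' method and does not use their corpora; the function name `coverage_by_rank`, the simple `\w+` tokenizer, and the toy sample sentence are all assumptions made for this example.

```python
from collections import Counter
import re


def coverage_by_rank(text):
    """Return cumulative corpus coverage (0..1) achieved by the k most
    frequent word forms, for k = 1, 2, ... (hypothetical helper)."""
    # Assumption: a word form is any run of Unicode word characters,
    # lower-cased. Real corpus studies may tokenize differently.
    words = re.findall(r"\w+", text.lower())
    total = len(words)
    coverage = []
    running = 0
    # most_common() yields word forms in descending order of frequency.
    for _, freq in Counter(words).most_common():
        running += freq
        coverage.append(running / total)
    return coverage


# Toy input, chosen only to make the arithmetic easy to follow.
sample = "the cat and the dog and the bird"
cov = coverage_by_rank(sample)
for rank, share in enumerate(cov, start=1):
    print(f"top {rank} word forms cover {share:.1%} of the sample")
```

On a real corpus, plotting the returned coverage values against the logarithm of the rank would be one way to inspect the roughly logarithmic relationship that the abstract describes for the 40-85% coverage range.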