Matches in SemOpenAlex for { <https://semopenalex.org/work/W4300183229> ?p ?o ?g. }
Showing items 1 to 63 of 63 with 100 items per page.
- W4300183229 abstract "The formation of sentences is a highly structured and history-dependent process. The probability of using a specific word in a sentence strongly depends on the 'history' of word-usage earlier in that sentence. We study a simple history-dependent model of text generation assuming that the sample-space of word usage reduces along sentence formation, on average. We first show that the model explains the approximate Zipf law found in word frequencies as a direct consequence of sample-space reduction. We then empirically quantify the amount of sample-space reduction in the sentences of ten famous English books, by analysis of corresponding word-transition tables that capture which words can follow any given word in a text. We find a highly nested structure in these transition tables and show that this `nestedness' is tightly related to the power law exponents of the observed word frequency distributions. With the proposed model it is possible to understand that the nestedness of a text can be the origin of the actual scaling exponent, and that deviations from the exact Zipf law can be understood by variations of the degree of nestedness on a book-by-book basis. On a theoretical level we are able to show that in case of weak nesting, Zipf's law breaks down in a fast transition. Unlike previous attempts to understand Zipf's law in language the sample-space reducing model is not based on assumptions of multiplicative, preferential, or self-organised critical mechanisms behind language formation, but simply used the empirically quantifiable parameter 'nestedness' to understand the statistics of word frequencies." @default.
- W4300183229 created "2022-10-03" @default.
- W4300183229 creator A5019906371 @default.
- W4300183229 creator A5025590875 @default.
- W4300183229 creator A5045565551 @default.
- W4300183229 creator A5090815103 @default.
- W4300183229 date "2014-07-17" @default.
- W4300183229 modified "2023-09-27" @default.
- W4300183229 title "Understanding Zipf's law of word frequencies through sample-space collapse in sentence formation" @default.
- W4300183229 doi "https://doi.org/10.48550/arxiv.1407.4610" @default.
- W4300183229 hasPublicationYear "2014" @default.
- W4300183229 type Work @default.
- W4300183229 citedByCount "0" @default.
- W4300183229 crossrefType "posted-content" @default.
- W4300183229 hasAuthorship W4300183229A5019906371 @default.
- W4300183229 hasAuthorship W4300183229A5025590875 @default.
- W4300183229 hasAuthorship W4300183229A5045565551 @default.
- W4300183229 hasAuthorship W4300183229A5090815103 @default.
- W4300183229 hasBestOaLocation W43001832291 @default.
- W4300183229 hasConcept C105795698 @default.
- W4300183229 hasConcept C111919701 @default.
- W4300183229 hasConcept C121332964 @default.
- W4300183229 hasConcept C125932096 @default.
- W4300183229 hasConcept C154945302 @default.
- W4300183229 hasConcept C175293574 @default.
- W4300183229 hasConcept C198531522 @default.
- W4300183229 hasConcept C2524010 @default.
- W4300183229 hasConcept C2777530160 @default.
- W4300183229 hasConcept C2778572836 @default.
- W4300183229 hasConcept C33923547 @default.
- W4300183229 hasConcept C41008148 @default.
- W4300183229 hasConcept C90805587 @default.
- W4300183229 hasConcept C97355855 @default.
- W4300183229 hasConceptScore W4300183229C105795698 @default.
- W4300183229 hasConceptScore W4300183229C111919701 @default.
- W4300183229 hasConceptScore W4300183229C121332964 @default.
- W4300183229 hasConceptScore W4300183229C125932096 @default.
- W4300183229 hasConceptScore W4300183229C154945302 @default.
- W4300183229 hasConceptScore W4300183229C175293574 @default.
- W4300183229 hasConceptScore W4300183229C198531522 @default.
- W4300183229 hasConceptScore W4300183229C2524010 @default.
- W4300183229 hasConceptScore W4300183229C2777530160 @default.
- W4300183229 hasConceptScore W4300183229C2778572836 @default.
- W4300183229 hasConceptScore W4300183229C33923547 @default.
- W4300183229 hasConceptScore W4300183229C41008148 @default.
- W4300183229 hasConceptScore W4300183229C90805587 @default.
- W4300183229 hasConceptScore W4300183229C97355855 @default.
- W4300183229 hasLocation W43001832291 @default.
- W4300183229 hasOpenAccess W4300183229 @default.
- W4300183229 hasPrimaryLocation W43001832291 @default.
- W4300183229 hasRelatedWork W1994463867 @default.
- W4300183229 hasRelatedWork W2007915090 @default.
- W4300183229 hasRelatedWork W2023581076 @default.
- W4300183229 hasRelatedWork W2041504988 @default.
- W4300183229 hasRelatedWork W2069979223 @default.
- W4300183229 hasRelatedWork W2075174955 @default.
- W4300183229 hasRelatedWork W2104826134 @default.
- W4300183229 hasRelatedWork W2793285345 @default.
- W4300183229 hasRelatedWork W2917710364 @default.
- W4300183229 hasRelatedWork W39918333 @default.
- W4300183229 isParatext "false" @default.
- W4300183229 isRetracted "false" @default.
- W4300183229 workType "article" @default.