Matches in SemOpenAlex for { <https://semopenalex.org/work/W2144115476> ?p ?o ?g. }
- W2144115476 abstract "This thesis presents work in two areas: Language Technology and Linguistic Typology. In the field of Language Technology, a specific problem is addressed: Can a computer extract a description of word conjugation in a natural language using only written text in the language? The problem is often referred to as Unsupervised Learning of Morphology and has a variety of applications, including Machine Translation, Document Categorization and Information Retrieval. The problem is also relevant for linguistic theory. We give a comprehensive survey of work done so far on the problem and then describe a new approach to the problem as well as a number of applications. The idea is that concatenative affixation, i.e., how stems and affixes are strung together to form words, can, with some success, be modelled simplistically. Essentially, words consist of high-frequency strings (“affixes”) attached to low-frequency strings (“stems”), e.g., as in the English play-ing. Case studies show how this naive model can be used for stemming, language identification and bootstrapping language description. There are around 7 000 languages in the world, exhibiting a bewildering structural diversity. Linguistic Typology is the subfield of linguistics that aims to understand this diversity. Many of the languages in the world today are spoken only by relatively small groups of people and are threatened by extinction, and it is therefore a priority to record them. Language documentation is, and has been, an extremely decentralised activity, carried out not only by linguists but also by missionaries, travellers, anthropologists, etc., chiefly over the past 200 years. There is no central record of which and how many languages have been described.
To meet the priority, we have attempted to list those languages that are the most poorly described and that do not belong to a language family in which some other language is decently described – a task requiring both analysis and diligence. Next, the thesis includes typological work on one of the more tractable aspects of language structure, namely numeral systems, i.e., normed expressions used to denote exact quantities. In one of the first surveys to cover the whole world, we look at rare number bases among numeral systems. One major rarity is base-6-36 systems, which are attested only in South/Southwest New Guinea, and we make a special inquiry into their emergence. Traditionally, linguists have had headaches over what counts as a language as opposed to a dialect, and have therefore been reluctant to give counts of the number of languages in a given area. One chapter of the present thesis shows that, contrary to popular belief, there is an intuitively sound way to count languages (as opposed to dialects). The only requirement is that, for each pair of varieties, we are told whether they are mutually intelligible or not." @default.
- W2144115476 created "2016-06-24" @default.
- W2144115476 creator A5007975524 @default.
- W2144115476 date "2009-11-16" @default.
- W2144115476 modified "2023-09-27" @default.
- W2144115476 title "Unsupervised Learning of Morphology and the Languages of the World" @default.
- W2144115476 cites W115200403 @default.
- W2144115476 cites W116558967 @default.
- W2144115476 cites W133006580 @default.
- W2144115476 cites W143309896 @default.
- W2144115476 cites W147140176 @default.
- W2144115476 cites W148262266 @default.
- W2144115476 cites W1488053726 @default.
- W2144115476 cites W1493114355 @default.
- W2144115476 cites W1498040058 @default.
- W2144115476 cites W1499122646 @default.
- W2144115476 cites W1504127138 @default.
- W2144115476 cites W1507520185 @default.
- W2144115476 cites W1510918033 @default.
- W2144115476 cites W1511503440 @default.
- W2144115476 cites W1514110799 @default.
- W2144115476 cites W1514872638 @default.
- W2144115476 cites W1517294575 @default.
- W2144115476 cites W1518891745 @default.
- W2144115476 cites W1523430623 @default.
- W2144115476 cites W1523476347 @default.
- W2144115476 cites W1525736366 @default.
- W2144115476 cites W1527606239 @default.
- W2144115476 cites W1533946607 @default.
- W2144115476 cites W1536822721 @default.
- W2144115476 cites W1542706181 @default.
- W2144115476 cites W1543512527 @default.
- W2144115476 cites W1546647706 @default.
- W2144115476 cites W1549218967 @default.
- W2144115476 cites W1549563197 @default.
- W2144115476 cites W1551341032 @default.
- W2144115476 cites W1559257965 @default.
- W2144115476 cites W1559354850 @default.
- W2144115476 cites W1563745091 @default.
- W2144115476 cites W1563946600 @default.
- W2144115476 cites W1568413275 @default.
- W2144115476 cites W1572167796 @default.
- W2144115476 cites W1573465296 @default.
- W2144115476 cites W1577204225 @default.
- W2144115476 cites W1601511876 @default.
- W2144115476 cites W1604228412 @default.
- W2144115476 cites W1605549267 @default.
- W2144115476 cites W1607400208 @default.
- W2144115476 cites W165579794 @default.
- W2144115476 cites W1656787195 @default.
- W2144115476 cites W1660390307 @default.
- W2144115476 cites W1723077045 @default.
- W2144115476 cites W1727944201 @default.
- W2144115476 cites W173654124 @default.
- W2144115476 cites W1798374601 @default.
- W2144115476 cites W1836521361 @default.
- W2144115476 cites W1840791892 @default.
- W2144115476 cites W195496810 @default.
- W2144115476 cites W1962179941 @default.
- W2144115476 cites W1966566430 @default.
- W2144115476 cites W1967172308 @default.
- W2144115476 cites W1968951234 @default.
- W2144115476 cites W1969608442 @default.
- W2144115476 cites W1971074050 @default.
- W2144115476 cites W1971105673 @default.
- W2144115476 cites W1971285215 @default.
- W2144115476 cites W1974978048 @default.
- W2144115476 cites W1975139803 @default.
- W2144115476 cites W1975638594 @default.
- W2144115476 cites W1978497800 @default.
- W2144115476 cites W1980179459 @default.
- W2144115476 cites W1981745693 @default.
- W2144115476 cites W1985057352 @default.
- W2144115476 cites W1985289892 @default.
- W2144115476 cites W1986786314 @default.
- W2144115476 cites W1987748324 @default.
- W2144115476 cites W1988069386 @default.
- W2144115476 cites W1988213781 @default.
- W2144115476 cites W1988244042 @default.
- W2144115476 cites W1992448985 @default.
- W2144115476 cites W1992466330 @default.
- W2144115476 cites W1995860467 @default.
- W2144115476 cites W1996046141 @default.
- W2144115476 cites W1996376727 @default.
- W2144115476 cites W1997658299 @default.
- W2144115476 cites W1998449347 @default.
- W2144115476 cites W2000099783 @default.
- W2144115476 cites W2002089154 @default.
- W2144115476 cites W2007976151 @default.
- W2144115476 cites W2008263650 @default.
- W2144115476 cites W2010982212 @default.
- W2144115476 cites W2011039300 @default.
- W2144115476 cites W2011585943 @default.
- W2144115476 cites W2012376027 @default.
- W2144115476 cites W2012469826 @default.
- W2144115476 cites W2014303786 @default.
- W2144115476 cites W201532657 @default.
- W2144115476 cites W2015541386 @default.
- W2144115476 cites W2016965993 @default.
- W2144115476 cites W2018637948 @default.