Matches in SemOpenAlex for { <https://semopenalex.org/work/W1528161830> ?p ?o ?g. }
- W1528161830 abstract "The goal of this dissertation is to introduce a method for deriving morphemes from Arabic words using stem patterns, a feature of Arabic morphology. The motivations are three-fold: modeling with morphemes rather than words should help address the out-of-vocabulary problem; working with stem patterns should prove to be a cross-dialectally valid method for deriving morphemes using a small amount of linguistic knowledge; and the stem patterns should allow for the prediction of short vowel sequences that are missing from the text. The out-of-vocabulary problem is acute in Modern Standard Arabic due to its rich morphology, including a large inventory of inflectional affixes and clitics that combine in many ways to increase the rate of vocabulary growth. The problem of creating tools that work across dialects is challenging due to the many differences between regional dialects and formal Arabic, and because of the lack of text resources on which to train natural language processing (NLP) tools. The short vowels, while missing from standard orthography, provide information that is crucial to both acoustic modeling and grammatical inference, and therefore must be inserted into the text to train the most predictive NLP models. While other morpheme derivation methods exist that address one or two of the above challenges, none addresses all three with a single solution. The stem pattern derivation method is tested in the task of automatic speech recognition (ASR), and compared to three other morpheme derivation methods as well as word-based language models. We find that the utility of morphemes in increasing word accuracy scores on the ASR task is highly dependent on the ASR system’s ability to accommodate the morphemes in the acoustic and pronunciation models. 
In experiments involving both Modern Standard Arabic and Levantine Conversational Arabic data, we find that knowledge-light methods of morpheme derivation may work as well as knowledge-rich methods. We also find that morpheme derivation methods that produce a single morpheme hypothesis per word yield stronger models than those that spread probability mass across several hypotheses per word; however, the multi-hypothesis model may be strengthened by applying informed weights to the predicted morpheme sequences. Furthermore, we exploit the flexibility of Finite State Machines, with which the stem pattern derivation method is implemented, to predict short vowels. The result is a comprehensive exploration not only of the stem pattern derivation method, but of the use of morphemes in Arabic language modeling for automatic speech recognition." @default.
- W1528161830 created "2016-06-24" @default.
- W1528161830 creator A5018515711 @default.
- W1528161830 creator A5056667180 @default.
- W1528161830 creator A5057378436 @default.
- W1528161830 date "2010-01-01" @default.
- W1528161830 modified "2023-09-26" @default.
- W1528161830 title "Arabic language modeling with stem-derived morphemes for automatic speech recognition" @default.
- W1528161830 cites W14331692 @default.
- W1528161830 cites W150057594 @default.
- W1528161830 cites W1517381888 @default.
- W1528161830 cites W1564002882 @default.
- W1528161830 cites W1580585137 @default.
- W1528161830 cites W1582482241 @default.
- W1528161830 cites W1606494205 @default.
- W1528161830 cites W1607161872 @default.
- W1528161830 cites W1631260214 @default.
- W1528161830 cites W1707124376 @default.
- W1528161830 cites W1719940802 @default.
- W1528161830 cites W1895315011 @default.
- W1528161830 cites W2009639677 @default.
- W1528161830 cites W2020079054 @default.
- W1528161830 cites W2022858625 @default.
- W1528161830 cites W2041349462 @default.
- W1528161830 cites W2042777049 @default.
- W1528161830 cites W2045298992 @default.
- W1528161830 cites W2050065334 @default.
- W1528161830 cites W2054647223 @default.
- W1528161830 cites W2063475534 @default.
- W1528161830 cites W2063718015 @default.
- W1528161830 cites W2069712814 @default.
- W1528161830 cites W2073942425 @default.
- W1528161830 cites W2077444666 @default.
- W1528161830 cites W2082716575 @default.
- W1528161830 cites W2096244280 @default.
- W1528161830 cites W2096257473 @default.
- W1528161830 cites W2097341304 @default.
- W1528161830 cites W2097661835 @default.
- W1528161830 cites W2100373303 @default.
- W1528161830 cites W2100976324 @default.
- W1528161830 cites W2102062345 @default.
- W1528161830 cites W2103589071 @default.
- W1528161830 cites W2114747788 @default.
- W1528161830 cites W2115084322 @default.
- W1528161830 cites W2123261808 @default.
- W1528161830 cites W2124973918 @default.
- W1528161830 cites W2125838338 @default.
- W1528161830 cites W2128014038 @default.
- W1528161830 cites W2132714218 @default.
- W1528161830 cites W2134141008 @default.
- W1528161830 cites W2134607064 @default.
- W1528161830 cites W2139813717 @default.
- W1528161830 cites W2142983806 @default.
- W1528161830 cites W2144810223 @default.
- W1528161830 cites W2147895967 @default.
- W1528161830 cites W2158069733 @default.
- W1528161830 cites W2159781504 @default.
- W1528161830 cites W2173213060 @default.
- W1528161830 cites W2276283915 @default.
- W1528161830 cites W2312598869 @default.
- W1528161830 cites W2328556883 @default.
- W1528161830 cites W2592403716 @default.
- W1528161830 cites W26431120 @default.
- W1528161830 cites W2950186769 @default.
- W1528161830 cites W3183153947 @default.
- W1528161830 cites W431139010 @default.
- W1528161830 cites W6397387 @default.
- W1528161830 cites W73234223 @default.
- W1528161830 cites W95829446 @default.
- W1528161830 hasPublicationYear "2010" @default.
- W1528161830 type Work @default.
- W1528161830 sameAs 1528161830 @default.
- W1528161830 citedByCount "2" @default.
- W1528161830 countsByYear W15281618302015 @default.
- W1528161830 crossrefType "book-chapter" @default.
- W1528161830 hasAuthorship W1528161830A5018515711 @default.
- W1528161830 hasAuthorship W1528161830A5056667180 @default.
- W1528161830 hasAuthorship W1528161830A5057378436 @default.
- W1528161830 hasConcept C138885662 @default.
- W1528161830 hasConcept C154945302 @default.
- W1528161830 hasConcept C165297611 @default.
- W1528161830 hasConcept C204321447 @default.
- W1528161830 hasConcept C2776214188 @default.
- W1528161830 hasConcept C2777601683 @default.
- W1528161830 hasConcept C2778243841 @default.
- W1528161830 hasConcept C28490314 @default.
- W1528161830 hasConcept C41008148 @default.
- W1528161830 hasConcept C41895202 @default.
- W1528161830 hasConcept C80875076 @default.
- W1528161830 hasConcept C96455323 @default.
- W1528161830 hasConceptScore W1528161830C138885662 @default.
- W1528161830 hasConceptScore W1528161830C154945302 @default.
- W1528161830 hasConceptScore W1528161830C165297611 @default.
- W1528161830 hasConceptScore W1528161830C204321447 @default.
- W1528161830 hasConceptScore W1528161830C2776214188 @default.
- W1528161830 hasConceptScore W1528161830C2777601683 @default.
- W1528161830 hasConceptScore W1528161830C2778243841 @default.
- W1528161830 hasConceptScore W1528161830C28490314 @default.
- W1528161830 hasConceptScore W1528161830C41008148 @default.
- W1528161830 hasConceptScore W1528161830C41895202 @default.