Matches in SemOpenAlex for { <https://semopenalex.org/work/W82340936> ?p ?o ?g. }
- W82340936 abstract "Part-of-speech tagging, like any supervised statistical NLP task, is more difficult when test sets are very different from training sets, for example when tagging across genres or language varieties. We examined the problem of POS tagging of different varieties of Mandarin Chinese (PRC-Mainland, PRC-Hong Kong, and Taiwan). An analytic study first showed that unknown words were a major source of difficulty in cross-variety tagging. Unknown words in English tend to be proper nouns. By contrast, we found that Mandarin unknown words were mostly common nouns and verbs. We showed these results are caused by the high frequency of morphological compounding in Mandarin; in this sense Mandarin is more like German than English. Based on this analysis, we propose a variety of new morphological unknown-word features for POS tagging, extending earlier work by others on unknown-word tagging in English and German. Our features were implemented in a maximum entropy Markov model. Our system achieves state-of-the-art performance in Mandarin tagging, including improving unknown-word tagging performance on unseen varieties in Chinese Treebank 5.0 from 61% to 80% correct." @default.
- W82340936 created "2016-06-24" @default.
- W82340936 creator A5036266399 @default.
- W82340936 creator A5041717466 @default.
- W82340936 creator A5046006076 @default.
- W82340936 date "2005-01-01" @default.
- W82340936 modified "2023-09-25" @default.
- W82340936 title "Morphological features help POS tagging of unknown words across language varieties." @default.
- W82340936 cites W1575907248 @default.
- W82340936 cites W1632114991 @default.
- W82340936 cites W1773803948 @default.
- W82340936 cites W1934019294 @default.
- W82340936 cites W1996430422 @default.
- W82340936 cites W2000566875 @default.
- W82340936 cites W2056321066 @default.
- W82340936 cites W2135843243 @default.
- W82340936 cites W2155280192 @default.
- W82340936 cites W2425667873 @default.
- W82340936 cites W3200634688 @default.
- W82340936 cites W76271513 @default.
- W82340936 hasPublicationYear "2005" @default.
- W82340936 type Work @default.
- W82340936 sameAs 82340936 @default.
- W82340936 citedByCount "54" @default.
- W82340936 countsByYear W823409362012 @default.
- W82340936 countsByYear W823409362013 @default.
- W82340936 countsByYear W823409362014 @default.
- W82340936 countsByYear W823409362015 @default.
- W82340936 countsByYear W823409362016 @default.
- W82340936 countsByYear W823409362017 @default.
- W82340936 countsByYear W823409362018 @default.
- W82340936 countsByYear W823409362019 @default.
- W82340936 countsByYear W823409362020 @default.
- W82340936 countsByYear W823409362021 @default.
- W82340936 countsByYear W823409362022 @default.
- W82340936 crossrefType "journal-article" @default.
- W82340936 hasAuthorship W82340936A5036266399 @default.
- W82340936 hasAuthorship W82340936A5041717466 @default.
- W82340936 hasAuthorship W82340936A5046006076 @default.
- W82340936 hasConcept C121934690 @default.
- W82340936 hasConcept C136197465 @default.
- W82340936 hasConcept C138885662 @default.
- W82340936 hasConcept C138954614 @default.
- W82340936 hasConcept C154775046 @default.
- W82340936 hasConcept C154945302 @default.
- W82340936 hasConcept C204321447 @default.
- W82340936 hasConcept C41008148 @default.
- W82340936 hasConcept C41895202 @default.
- W82340936 hasConcept C90805587 @default.
- W82340936 hasConcept C9679016 @default.
- W82340936 hasConceptScore W82340936C121934690 @default.
- W82340936 hasConceptScore W82340936C136197465 @default.
- W82340936 hasConceptScore W82340936C138885662 @default.
- W82340936 hasConceptScore W82340936C138954614 @default.
- W82340936 hasConceptScore W82340936C154775046 @default.
- W82340936 hasConceptScore W82340936C154945302 @default.
- W82340936 hasConceptScore W82340936C204321447 @default.
- W82340936 hasConceptScore W82340936C41008148 @default.
- W82340936 hasConceptScore W82340936C41895202 @default.
- W82340936 hasConceptScore W82340936C90805587 @default.
- W82340936 hasConceptScore W82340936C9679016 @default.
- W82340936 hasLocation W823409361 @default.
- W82340936 hasOpenAccess W82340936 @default.
- W82340936 hasPrimaryLocation W823409361 @default.
- W82340936 hasRelatedWork W108437174 @default.
- W82340936 hasRelatedWork W112498106 @default.
- W82340936 hasRelatedWork W1535015163 @default.
- W82340936 hasRelatedWork W1575907248 @default.
- W82340936 hasRelatedWork W1632114991 @default.
- W82340936 hasRelatedWork W174553219 @default.
- W82340936 hasRelatedWork W1773803948 @default.
- W82340936 hasRelatedWork W1996430422 @default.
- W82340936 hasRelatedWork W2008652694 @default.
- W82340936 hasRelatedWork W2092654472 @default.
- W82340936 hasRelatedWork W2096204319 @default.
- W82340936 hasRelatedWork W2099873701 @default.
- W82340936 hasRelatedWork W2121227244 @default.
- W82340936 hasRelatedWork W2128634885 @default.
- W82340936 hasRelatedWork W2135843243 @default.
- W82340936 hasRelatedWork W2139621418 @default.
- W82340936 hasRelatedWork W2147880316 @default.
- W82340936 hasRelatedWork W2152561660 @default.
- W82340936 hasRelatedWork W2170986599 @default.
- W82340936 hasRelatedWork W2399720833 @default.
- W82340936 isParatext "false" @default.
- W82340936 isRetracted "false" @default.
- W82340936 magId "82340936" @default.
- W82340936 workType "article" @default.