Matches in SemOpenAlex for { <https://semopenalex.org/work/W3022409609> ?p ?o ?g. }
Showing items 1 to 95 of 95 with 100 items per page.
- W3022409609 abstract "Informal romanization is an idiosyncratic process used by humans in informal digital communication to encode non-Latin script languages into Latin character sets found on common keyboards. Character substitution choices differ between users but have been shown to be governed by the same main principles observed across a variety of languages: namely, character pairs are often associated through phonetic or visual similarity. We propose a noisy-channel WFST cascade model for deciphering the original non-Latin script from observed romanized text in an unsupervised fashion. We train our model directly on romanized data from two languages: Egyptian Arabic and Russian. We demonstrate that adding inductive bias through phonetic and visual priors on character mappings substantially improves the model's performance on both languages, yielding results much closer to the supervised skyline. Finally, we introduce a new dataset of romanized Russian, collected from a Russian social network website and partially annotated for our experiments." @default.
- W3022409609 created "2020-05-13" @default.
- W3022409609 creator A5004582255 @default.
- W3022409609 creator A5017455302 @default.
- W3022409609 creator A5061241015 @default.
- W3022409609 date "2020-05-05" @default.
- W3022409609 modified "2023-09-23" @default.
- W3022409609 title "Phonetic and Visual Priors for Decipherment of Informal Romanization" @default.
- W3022409609 cites W1582482241 @default.
- W3022409609 cites W1797288984 @default.
- W3022409609 cites W1966863499 @default.
- W3022409609 cites W2008225289 @default.
- W3022409609 cites W2083055907 @default.
- W3022409609 cites W2104463314 @default.
- W3022409609 cites W2105738468 @default.
- W3022409609 cites W2113641473 @default.
- W3022409609 cites W2132714218 @default.
- W3022409609 cites W2134859678 @default.
- W3022409609 cites W2163489213 @default.
- W3022409609 cites W2166660646 @default.
- W3022409609 cites W2250414785 @default.
- W3022409609 cites W2250967669 @default.
- W3022409609 cites W2296073425 @default.
- W3022409609 cites W2306118375 @default.
- W3022409609 cites W2746762542 @default.
- W3022409609 cites W2757281913 @default.
- W3022409609 cites W2913739034 @default.
- W3022409609 cites W2914442349 @default.
- W3022409609 cites W2963602293 @default.
- W3022409609 cites W2980576249 @default.
- W3022409609 cites W2995197202 @default.
- W3022409609 cites W3088578619 @default.
- W3022409609 hasPublicationYear "2020" @default.
- W3022409609 type Work @default.
- W3022409609 sameAs 3022409609 @default.
- W3022409609 citedByCount "0" @default.
- W3022409609 crossrefType "posted-content" @default.
- W3022409609 hasAuthorship W3022409609A5004582255 @default.
- W3022409609 hasAuthorship W3022409609A5017455302 @default.
- W3022409609 hasAuthorship W3022409609A5061241015 @default.
- W3022409609 hasConcept C106930687 @default.
- W3022409609 hasConcept C115961682 @default.
- W3022409609 hasConcept C138885662 @default.
- W3022409609 hasConcept C154945302 @default.
- W3022409609 hasConcept C204321447 @default.
- W3022409609 hasConcept C2524010 @default.
- W3022409609 hasConcept C2778467380 @default.
- W3022409609 hasConcept C2780144916 @default.
- W3022409609 hasConcept C2780719617 @default.
- W3022409609 hasConcept C2780861071 @default.
- W3022409609 hasConcept C2987247673 @default.
- W3022409609 hasConcept C32717103 @default.
- W3022409609 hasConcept C33923547 @default.
- W3022409609 hasConcept C41008148 @default.
- W3022409609 hasConcept C41895202 @default.
- W3022409609 hasConceptScore W3022409609C106930687 @default.
- W3022409609 hasConceptScore W3022409609C115961682 @default.
- W3022409609 hasConceptScore W3022409609C138885662 @default.
- W3022409609 hasConceptScore W3022409609C154945302 @default.
- W3022409609 hasConceptScore W3022409609C204321447 @default.
- W3022409609 hasConceptScore W3022409609C2524010 @default.
- W3022409609 hasConceptScore W3022409609C2778467380 @default.
- W3022409609 hasConceptScore W3022409609C2780144916 @default.
- W3022409609 hasConceptScore W3022409609C2780719617 @default.
- W3022409609 hasConceptScore W3022409609C2780861071 @default.
- W3022409609 hasConceptScore W3022409609C2987247673 @default.
- W3022409609 hasConceptScore W3022409609C32717103 @default.
- W3022409609 hasConceptScore W3022409609C33923547 @default.
- W3022409609 hasConceptScore W3022409609C41008148 @default.
- W3022409609 hasConceptScore W3022409609C41895202 @default.
- W3022409609 hasOpenAccess W3022409609 @default.
- W3022409609 hasRelatedWork W1899713395 @default.
- W3022409609 hasRelatedWork W2085612714 @default.
- W3022409609 hasRelatedWork W2101098006 @default.
- W3022409609 hasRelatedWork W2125670750 @default.
- W3022409609 hasRelatedWork W2139755556 @default.
- W3022409609 hasRelatedWork W2250300493 @default.
- W3022409609 hasRelatedWork W2294053401 @default.
- W3022409609 hasRelatedWork W2621360146 @default.
- W3022409609 hasRelatedWork W2831901288 @default.
- W3022409609 hasRelatedWork W2889151028 @default.
- W3022409609 hasRelatedWork W2964010678 @default.
- W3022409609 hasRelatedWork W3034771443 @default.
- W3022409609 hasRelatedWork W3093995887 @default.
- W3022409609 hasRelatedWork W3095706668 @default.
- W3022409609 hasRelatedWork W3108039100 @default.
- W3022409609 hasRelatedWork W3111726896 @default.
- W3022409609 hasRelatedWork W3118330591 @default.
- W3022409609 hasRelatedWork W3120160660 @default.
- W3022409609 hasRelatedWork W3161021334 @default.
- W3022409609 hasRelatedWork W3118302739 @default.
- W3022409609 isParatext "false" @default.
- W3022409609 isRetracted "false" @default.
- W3022409609 magId "3022409609" @default.
- W3022409609 workType "article" @default.
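
The listing above is the result of matching the quad pattern `{ <https://semopenalex.org/work/W3022409609> ?p ?o ?g. }`. As a rough sketch, the same pattern could be written as a standard SPARQL query along the following lines. The helper name `build_quad_query` and the `limit` parameter are hypothetical and chosen for illustration only. The listing names no endpoint, so the sketch only builds the query text. Whether a `GRAPH ?g` clause also matches the `@default` graph shown in the listing depends on how the store is configured.

```python
def build_quad_query(work_iri: str, limit: int = 100) -> str:
    """Build a SPARQL query listing predicate, object, and graph for one subject.

    Hypothetical helper for illustration; it is not part of any SemOpenAlex tooling.
    """
    # Reject strings that are not plain absolute IRIs, so the value can be
    # placed between angle brackets without breaking the query syntax.
    forbidden = set('<>"{}|\\^` ')
    if not work_iri.startswith(("http://", "https://")) or forbidden & set(work_iri):
        raise ValueError(f"not a usable IRI: {work_iri!r}")
    if limit < 1:
        raise ValueError("limit must be a positive integer")

    # GRAPH ?g binds the named graph of each match, mirroring the ?g
    # position in the quad pattern at the top of the listing.
    return (
        "SELECT ?p ?o ?g WHERE {\n"
        f"  GRAPH ?g {{ <{work_iri}> ?p ?o . }}\n"
        "}\n"
        f"LIMIT {limit}"
    )


query = build_quad_query("https://semopenalex.org/work/W3022409609")
print(query)
```

The default `limit` of 100 mirrors the page size reported in the listing. The resulting string can be submitted to any SPARQL 1.1 endpoint that serves this data.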