Matches in SemOpenAlex for { <https://semopenalex.org/work/W3002109251> ?p ?o ?g. }
Showing items 1 to 78 of 78 with 100 items per page.
- W3002109251 endingPage "21" @default.
- W3002109251 startingPage "1" @default.
- W3002109251 abstract "The evolution of information and communication technology has markedly influenced communication between correspondents. This evolution has facilitated the transmission of information and has engendered new forms of written communication (email, chat, SMS, comments, etc.). Most of these messages and comments are written in Latin script, also called Arabizi. Moreover, the language used in social media and SMS messaging is characterized by the use of informal and non-standard vocabulary, such as repeated letters for emphasis, typos, non-standard abbreviations, and nonlinguistic content like emoticons. Since the Tunisian dialect suffers from the unavailability of basic tools and linguistic resources compared to Modern Standard Arabic, we resort to the use of these written sources as a starting point to build large corpora automatically. In the context of natural language processing and to benefit from these networks’ data, transliterating from Arabizi to Arabic script is a necessary step because most recently available tools for processing the Tunisian dialect expect Arabic script input. Indeed, the transliteration task can help construct and enrich parallel corpora and dictionaries for the Tunisian dialect and can be useful for developing various natural language processing applications such as sentiment analysis, opinion mining, topic detection, and machine translation. In this article, we focus on converting the Tunisian dialect text that is written in Latin script to Arabic script following the Conventional Orthography for Dialectal Arabic. Then, we propose two models to transliterate Arabizi into Arabic script for the Tunisian dialect, namely a rule-based model and a discriminative model as a sequence classification task based on conditional random fields. In the first model, we use a set of transliteration rules to convert the Tunisian dialect Arabizi texts to Arabic script. In the second model, transliteration is performed both at word and character levels. In the end, our models got a character error rate of 10.47%." @default.
- W3002109251 created "2020-01-30" @default.
- W3002109251 creator A5001984598 @default.
- W3002109251 creator A5025454924 @default.
- W3002109251 creator A5037224525 @default.
- W3002109251 creator A5091661528 @default.
- W3002109251 date "2019-11-28" @default.
- W3002109251 modified "2023-10-01" @default.
- W3002109251 title "Transliteration of Arabizi into Arabic Script for Tunisian Dialect" @default.
- W3002109251 cites W1659740212 @default.
- W3002109251 cites W2097497389 @default.
- W3002109251 cites W2104463314 @default.
- W3002109251 cites W2333367237 @default.
- W3002109251 cites W2581564515 @default.
- W3002109251 cites W2760231680 @default.
- W3002109251 cites W2906891164 @default.
- W3002109251 doi "https://doi.org/10.1145/3364319" @default.
- W3002109251 hasPublicationYear "2019" @default.
- W3002109251 type Work @default.
- W3002109251 sameAs 3002109251 @default.
- W3002109251 citedByCount "11" @default.
- W3002109251 countsByYear W30021092512020 @default.
- W3002109251 countsByYear W30021092512021 @default.
- W3002109251 countsByYear W30021092512022 @default.
- W3002109251 crossrefType "journal-article" @default.
- W3002109251 hasAuthorship W3002109251A5001984598 @default.
- W3002109251 hasAuthorship W3002109251A5025454924 @default.
- W3002109251 hasAuthorship W3002109251A5037224525 @default.
- W3002109251 hasAuthorship W3002109251A5091661528 @default.
- W3002109251 hasConcept C138885662 @default.
- W3002109251 hasConcept C150670947 @default.
- W3002109251 hasConcept C154945302 @default.
- W3002109251 hasConcept C166957645 @default.
- W3002109251 hasConcept C204321447 @default.
- W3002109251 hasConcept C2777323237 @default.
- W3002109251 hasConcept C2778243841 @default.
- W3002109251 hasConcept C2779343474 @default.
- W3002109251 hasConcept C41008148 @default.
- W3002109251 hasConcept C41895202 @default.
- W3002109251 hasConcept C520968082 @default.
- W3002109251 hasConcept C554936623 @default.
- W3002109251 hasConcept C95457728 @default.
- W3002109251 hasConcept C96455323 @default.
- W3002109251 hasConceptScore W3002109251C138885662 @default.
- W3002109251 hasConceptScore W3002109251C150670947 @default.
- W3002109251 hasConceptScore W3002109251C154945302 @default.
- W3002109251 hasConceptScore W3002109251C166957645 @default.
- W3002109251 hasConceptScore W3002109251C204321447 @default.
- W3002109251 hasConceptScore W3002109251C2777323237 @default.
- W3002109251 hasConceptScore W3002109251C2778243841 @default.
- W3002109251 hasConceptScore W3002109251C2779343474 @default.
- W3002109251 hasConceptScore W3002109251C41008148 @default.
- W3002109251 hasConceptScore W3002109251C41895202 @default.
- W3002109251 hasConceptScore W3002109251C520968082 @default.
- W3002109251 hasConceptScore W3002109251C554936623 @default.
- W3002109251 hasConceptScore W3002109251C95457728 @default.
- W3002109251 hasConceptScore W3002109251C96455323 @default.
- W3002109251 hasIssue "2" @default.
- W3002109251 hasLocation W30021092511 @default.
- W3002109251 hasOpenAccess W3002109251 @default.
- W3002109251 hasPrimaryLocation W30021092511 @default.
- W3002109251 hasRelatedWork W1538473846 @default.
- W3002109251 hasRelatedWork W2166660646 @default.
- W3002109251 hasRelatedWork W2505414515 @default.
- W3002109251 hasRelatedWork W2901664186 @default.
- W3002109251 hasRelatedWork W2963265201 @default.
- W3002109251 hasRelatedWork W3002109251 @default.
- W3002109251 hasRelatedWork W3095671335 @default.
- W3002109251 hasRelatedWork W3118330591 @default.
- W3002109251 hasRelatedWork W3120115961 @default.
- W3002109251 hasRelatedWork W4289243833 @default.
- W3002109251 hasVolume "19" @default.
- W3002109251 isParatext "false" @default.
- W3002109251 isRetracted "false" @default.
- W3002109251 magId "3002109251" @default.
- W3002109251 workType "article" @default.