Matches in SemOpenAlex for { <https://semopenalex.org/work/W4283331272> ?p ?o ?g. }
- W4283331272 endingPage "101192" @default.
- W4283331272 startingPage "101192" @default.
- W4283331272 abstract "User-generated texts on the web are freely available and lucrative sources of data for language technology researchers. Unfortunately, these texts are often dominated by informal writing styles, and the language used in user-generated content poses processing difficulties for natural language tools. The resulting performance drops and processing issues can be addressed either by adapting language tools to user-generated content or by normalizing noisy texts before they are processed. In this article, we propose a Turkish text normalizer that maps non-standard words to their appropriate standard forms using a graph-based methodology and a context-tailoring approach. Our normalizer benefits from both contextual and lexical similarities between normalization pairs as identified by a graph-based subnormalizer and a transformation-based subnormalizer. The performance of our normalizer is demonstrated on a tweet dataset in the most comprehensive intrinsic and extrinsic evaluations reported so far for Turkish. We present the first graph-based solution to Turkish text normalization with a novel context-tailoring approach, which advances the state of the art by outperforming other publicly available normalizers. For the first time in the literature, we measure the extent to which the accuracy of a Turkish language processing tool is affected by normalizing noisy texts before they are processed. An analysis of these extrinsic evaluations, which cover more than one Turkish NLP task (i.e., part-of-speech tagging and dependency parsing), reveals that Turkish language tools are not robust to noisy texts and that a normalizer leads to remarkable performance improvements when used as a preprocessing tool in this morphologically rich language." @default.
- W4283331272 created "2022-06-24" @default.
- W4283331272 creator A5011391948 @default.
- W4283331272 creator A5080465274 @default.
- W4283331272 date "2022-11-01" @default.
- W4283331272 modified "2023-10-14" @default.
- W4283331272 title "Graph-based Turkish text normalization and its impact on noisy text processing" @default.
- W4283331272 cites W1978878825 @default.
- W4283331272 cites W1979839410 @default.
- W4283331272 cites W2012149514 @default.
- W4283331272 cites W2016443085 @default.
- W4283331272 cites W2027056760 @default.
- W4283331272 cites W2055828199 @default.
- W4283331272 cites W2059067870 @default.
- W4283331272 cites W2096044384 @default.
- W4283331272 cites W2101200183 @default.
- W4283331272 cites W2133503566 @default.
- W4283331272 cites W2148122886 @default.
- W4283331272 cites W2164107060 @default.
- W4283331272 cites W2171313960 @default.
- W4283331272 cites W2218943781 @default.
- W4283331272 cites W2371227879 @default.
- W4283331272 cites W2399101711 @default.
- W4283331272 cites W2509799460 @default.
- W4283331272 cites W2584538970 @default.
- W4283331272 cites W2615975817 @default.
- W4283331272 cites W2621251601 @default.
- W4283331272 cites W274041255 @default.
- W4283331272 cites W2793141750 @default.
- W4283331272 cites W2806253224 @default.
- W4283331272 cites W2891326422 @default.
- W4283331272 cites W2891905999 @default.
- W4283331272 cites W2896950236 @default.
- W4283331272 cites W2899555336 @default.
- W4283331272 cites W2914820290 @default.
- W4283331272 cites W2924677654 @default.
- W4283331272 cites W2953006368 @default.
- W4283331272 cites W2981458766 @default.
- W4283331272 cites W3098567718 @default.
- W4283331272 cites W3158656338 @default.
- W4283331272 cites W655994999 @default.
- W4283331272 cites W2806417975 @default.
- W4283331272 doi "https://doi.org/10.1016/j.jestch.2022.101192" @default.
- W4283331272 hasPublicationYear "2022" @default.
- W4283331272 type Work @default.
- W4283331272 citedByCount "0" @default.
- W4283331272 crossrefType "journal-article" @default.
- W4283331272 hasAuthorship W4283331272A5011391948 @default.
- W4283331272 hasAuthorship W4283331272A5080465274 @default.
- W4283331272 hasBestOaLocation W42833312721 @default.
- W4283331272 hasConcept C132525143 @default.
- W4283331272 hasConcept C136886441 @default.
- W4283331272 hasConcept C138885662 @default.
- W4283331272 hasConcept C144024400 @default.
- W4283331272 hasConcept C154945302 @default.
- W4283331272 hasConcept C164883195 @default.
- W4283331272 hasConcept C186644900 @default.
- W4283331272 hasConcept C19165224 @default.
- W4283331272 hasConcept C202444582 @default.
- W4283331272 hasConcept C204321447 @default.
- W4283331272 hasConcept C2779500292 @default.
- W4283331272 hasConcept C2781121862 @default.
- W4283331272 hasConcept C33923547 @default.
- W4283331272 hasConcept C41008148 @default.
- W4283331272 hasConcept C41895202 @default.
- W4283331272 hasConcept C75174853 @default.
- W4283331272 hasConcept C80444323 @default.
- W4283331272 hasConceptScore W4283331272C132525143 @default.
- W4283331272 hasConceptScore W4283331272C136886441 @default.
- W4283331272 hasConceptScore W4283331272C138885662 @default.
- W4283331272 hasConceptScore W4283331272C144024400 @default.
- W4283331272 hasConceptScore W4283331272C154945302 @default.
- W4283331272 hasConceptScore W4283331272C164883195 @default.
- W4283331272 hasConceptScore W4283331272C186644900 @default.
- W4283331272 hasConceptScore W4283331272C19165224 @default.
- W4283331272 hasConceptScore W4283331272C202444582 @default.
- W4283331272 hasConceptScore W4283331272C204321447 @default.
- W4283331272 hasConceptScore W4283331272C2779500292 @default.
- W4283331272 hasConceptScore W4283331272C2781121862 @default.
- W4283331272 hasConceptScore W4283331272C33923547 @default.
- W4283331272 hasConceptScore W4283331272C41008148 @default.
- W4283331272 hasConceptScore W4283331272C41895202 @default.
- W4283331272 hasConceptScore W4283331272C75174853 @default.
- W4283331272 hasConceptScore W4283331272C80444323 @default.
- W4283331272 hasLocation W42833312721 @default.
- W4283331272 hasOpenAccess W4283331272 @default.
- W4283331272 hasPrimaryLocation W42833312721 @default.
- W4283331272 hasRelatedWork W2020540721 @default.
- W4283331272 hasRelatedWork W2250574586 @default.
- W4283331272 hasRelatedWork W2293457016 @default.
- W4283331272 hasRelatedWork W2351267244 @default.
- W4283331272 hasRelatedWork W2502722637 @default.
- W4283331272 hasRelatedWork W2888625260 @default.
- W4283331272 hasRelatedWork W4210956481 @default.
- W4283331272 hasRelatedWork W4283331272 @default.
- W4283331272 hasRelatedWork W4287887149 @default.
- W4283331272 hasRelatedWork W65617392 @default.
- W4283331272 hasVolume "35" @default.