Matches in SemOpenAlex for { <https://semopenalex.org/work/W4281934001> ?p ?o ?g. }
Showing items 1 to 89 of 89 with 100 items per page.
- W4281934001 abstract "Data sparsity is a main problem hindering the development of code-switching (CS) NLP systems. In this paper, we investigate data augmentation techniques for synthesizing dialectal Arabic-English CS text. We perform lexical replacements using word-aligned parallel corpora where CS points are either randomly chosen or learnt using a sequence-to-sequence model. We compare these approaches against dictionary-based replacements. We assess the quality of the generated sentences through human evaluation and evaluate the effectiveness of data augmentation on machine translation (MT), automatic speech recognition (ASR), and speech translation (ST) tasks. Results show that using a predictive model results in more natural CS sentences compared to the random approach, as reported in human judgements. In the downstream tasks, despite the random approach generating more data, both approaches perform equally (outperforming dictionary-based replacements). Overall, data augmentation achieves 34% improvement in perplexity, 5.2% relative improvement on WER for ASR task, +4.0-5.1 BLEU points on MT task, and +2.1-2.2 BLEU points on ST over a baseline trained on available data without augmentation." @default.
- W4281934001 created "2022-06-13" @default.
- W4281934001 creator A5004299578 @default.
- W4281934001 creator A5020700841 @default.
- W4281934001 creator A5080036358 @default.
- W4281934001 creator A5084517393 @default.
- W4281934001 date "2022-05-25" @default.
- W4281934001 modified "2023-09-28" @default.
- W4281934001 title "Investigating Lexical Replacements for Arabic-English Code-Switched Data Augmentation" @default.
- W4281934001 doi "https://doi.org/10.48550/arxiv.2205.12649" @default.
- W4281934001 hasPublicationYear "2022" @default.
- W4281934001 type Work @default.
- W4281934001 citedByCount "0" @default.
- W4281934001 crossrefType "posted-content" @default.
- W4281934001 hasAuthorship W4281934001A5004299578 @default.
- W4281934001 hasAuthorship W4281934001A5020700841 @default.
- W4281934001 hasAuthorship W4281934001A5080036358 @default.
- W4281934001 hasAuthorship W4281934001A5084517393 @default.
- W4281934001 hasBestOaLocation W42819340011 @default.
- W4281934001 hasConcept C100279451 @default.
- W4281934001 hasConcept C104317684 @default.
- W4281934001 hasConcept C105580179 @default.
- W4281934001 hasConcept C137293760 @default.
- W4281934001 hasConcept C138885662 @default.
- W4281934001 hasConcept C149364088 @default.
- W4281934001 hasConcept C154945302 @default.
- W4281934001 hasConcept C162324750 @default.
- W4281934001 hasConcept C177264268 @default.
- W4281934001 hasConcept C185592680 @default.
- W4281934001 hasConcept C187736073 @default.
- W4281934001 hasConcept C199360897 @default.
- W4281934001 hasConcept C203005215 @default.
- W4281934001 hasConcept C204321447 @default.
- W4281934001 hasConcept C2776760102 @default.
- W4281934001 hasConcept C2778112365 @default.
- W4281934001 hasConcept C2780451532 @default.
- W4281934001 hasConcept C28490314 @default.
- W4281934001 hasConcept C35639132 @default.
- W4281934001 hasConcept C41008148 @default.
- W4281934001 hasConcept C41895202 @default.
- W4281934001 hasConcept C54355233 @default.
- W4281934001 hasConcept C55493867 @default.
- W4281934001 hasConcept C622187 @default.
- W4281934001 hasConcept C86803240 @default.
- W4281934001 hasConcept C90805587 @default.
- W4281934001 hasConcept C96455323 @default.
- W4281934001 hasConceptScore W4281934001C100279451 @default.
- W4281934001 hasConceptScore W4281934001C104317684 @default.
- W4281934001 hasConceptScore W4281934001C105580179 @default.
- W4281934001 hasConceptScore W4281934001C137293760 @default.
- W4281934001 hasConceptScore W4281934001C138885662 @default.
- W4281934001 hasConceptScore W4281934001C149364088 @default.
- W4281934001 hasConceptScore W4281934001C154945302 @default.
- W4281934001 hasConceptScore W4281934001C162324750 @default.
- W4281934001 hasConceptScore W4281934001C177264268 @default.
- W4281934001 hasConceptScore W4281934001C185592680 @default.
- W4281934001 hasConceptScore W4281934001C187736073 @default.
- W4281934001 hasConceptScore W4281934001C199360897 @default.
- W4281934001 hasConceptScore W4281934001C203005215 @default.
- W4281934001 hasConceptScore W4281934001C204321447 @default.
- W4281934001 hasConceptScore W4281934001C2776760102 @default.
- W4281934001 hasConceptScore W4281934001C2778112365 @default.
- W4281934001 hasConceptScore W4281934001C2780451532 @default.
- W4281934001 hasConceptScore W4281934001C28490314 @default.
- W4281934001 hasConceptScore W4281934001C35639132 @default.
- W4281934001 hasConceptScore W4281934001C41008148 @default.
- W4281934001 hasConceptScore W4281934001C41895202 @default.
- W4281934001 hasConceptScore W4281934001C54355233 @default.
- W4281934001 hasConceptScore W4281934001C55493867 @default.
- W4281934001 hasConceptScore W4281934001C622187 @default.
- W4281934001 hasConceptScore W4281934001C86803240 @default.
- W4281934001 hasConceptScore W4281934001C90805587 @default.
- W4281934001 hasConceptScore W4281934001C96455323 @default.
- W4281934001 hasLocation W42819340011 @default.
- W4281934001 hasOpenAccess W4281934001 @default.
- W4281934001 hasPrimaryLocation W42819340011 @default.
- W4281934001 hasRelatedWork W1539050421 @default.
- W4281934001 hasRelatedWork W170118632 @default.
- W4281934001 hasRelatedWork W1885811119 @default.
- W4281934001 hasRelatedWork W2044223291 @default.
- W4281934001 hasRelatedWork W2084301656 @default.
- W4281934001 hasRelatedWork W2104127567 @default.
- W4281934001 hasRelatedWork W2132122285 @default.
- W4281934001 hasRelatedWork W2791940393 @default.
- W4281934001 hasRelatedWork W2883550961 @default.
- W4281934001 hasRelatedWork W3107474891 @default.
- W4281934001 isParatext "false" @default.
- W4281934001 isRetracted "false" @default.
- W4281934001 workType "article" @default.
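
Each entry in the listing above follows the shape `- SUBJECT PREDICATE OBJECT @GRAPH.`, where the object is either a quoted literal or a bare identifier. As a minimal sketch, assuming that shape holds for every entry, the lines could be turned into structured records as follows. The names `Quad`, `parse_line`, and `group_by_predicate` are hypothetical helpers introduced for illustration; they are not part of SemOpenAlex or any library.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple, Optional


class Quad(NamedTuple):
    """One listing entry (hypothetical record type for this sketch)."""
    subject: str
    predicate: str
    obj: str
    graph: str
    is_literal: bool  # True when the object was a quoted string


# Assumed line shape: "- SUBJECT PREDICATE OBJECT @GRAPH."
_LINE = re.compile(r'^-\s+(\S+)\s+(\S+)\s+(.+?)\s+@(\S+?)\.$')


def parse_line(line: str) -> Optional[Quad]:
    """Parse one listing line; return None for lines that do not match."""
    match = _LINE.match(line.strip())
    if match is None:
        return None
    subject, predicate, obj, graph = match.groups()
    is_literal = len(obj) >= 2 and obj.startswith('"') and obj.endswith('"')
    if is_literal:
        obj = obj[1:-1]  # drop the surrounding quotes
    return Quad(subject, predicate, obj, graph, is_literal)


def group_by_predicate(lines: Iterable[str]) -> Dict[str, List[str]]:
    """Collect object values per predicate, skipping non-matching lines."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for line in lines:
        quad = parse_line(line)
        if quad is not None:
            grouped[quad.predicate].append(quad.obj)
    return dict(grouped)


if __name__ == "__main__":
    sample = [
        '- W4281934001 created "2022-06-13" @default.',
        '- W4281934001 creator A5004299578 @default.',
        '- W4281934001 creator A5020700841 @default.',
        '- W4281934001 type Work @default.',
    ]
    for predicate, objects in group_by_predicate(sample).items():
        print(predicate, objects)
```

Grouping by predicate keeps multi-valued properties such as `creator`, `hasConcept`, and `hasRelatedWork` together as lists, while single-valued ones such as `title` or `doi` end up as one-element lists. Header lines that do not start with `- ` are skipped rather than raising an error.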