Matches in SemOpenAlex for { <https://semopenalex.org/work/W4221146105> ?p ?o ?g. }
Showing items 1 to 61 of 61 with 100 items per page.
- W4221146105 abstract "Recent progress in NLP is driven by pretrained models leveraging massive datasets and has predominantly benefited the world's political and economic superpowers. Technologically underserved languages are left behind because they lack such resources. Hundreds of underserved languages, nevertheless, have available data sources in the form of interlinear glossed text (IGT) from language documentation efforts. IGT remains underutilized in NLP work, perhaps because its annotations are only semi-structured and often language-specific. With this paper, we make the case that IGT data can be leveraged successfully provided that target language expertise is available. We specifically advocate for collaboration with documentary linguists. Our paper provides a roadmap for successful projects utilizing IGT data: (1) It is essential to define which NLP tasks can be accomplished with the given IGT data and how these will benefit the speech community. (2) Great care and target language expertise is required when converting the data into structured formats commonly employed in NLP. (3) Task-specific and user-specific evaluation can help to ascertain that the tools which are created benefit the target language speech community. We illustrate each step through a case study on developing a morphological reinflection system for the Tsimchianic language Gitksan." @default.
- W4221146105 created "2022-04-03" @default.
- W4221146105 creator A5006644924 @default.
- W4221146105 creator A5015990971 @default.
- W4221146105 creator A5027026724 @default.
- W4221146105 creator A5033471648 @default.
- W4221146105 creator A5039794310 @default.
- W4221146105 creator A5047499562 @default.
- W4221146105 creator A5086887575 @default.
- W4221146105 date "2022-03-17" @default.
- W4221146105 modified "2023-10-16" @default.
- W4221146105 title "Dim Wihl Gat Tun: The Case for Linguistic Expertise in NLP for Underdocumented Languages" @default.
- W4221146105 doi "https://doi.org/10.48550/arxiv.2203.09632" @default.
- W4221146105 hasPublicationYear "2022" @default.
- W4221146105 type Work @default.
- W4221146105 citedByCount "0" @default.
- W4221146105 crossrefType "posted-content" @default.
- W4221146105 hasAuthorship W4221146105A5006644924 @default.
- W4221146105 hasAuthorship W4221146105A5015990971 @default.
- W4221146105 hasAuthorship W4221146105A5027026724 @default.
- W4221146105 hasAuthorship W4221146105A5033471648 @default.
- W4221146105 hasAuthorship W4221146105A5039794310 @default.
- W4221146105 hasAuthorship W4221146105A5047499562 @default.
- W4221146105 hasAuthorship W4221146105A5086887575 @default.
- W4221146105 hasBestOaLocation W42211461051 @default.
- W4221146105 hasConcept C127413603 @default.
- W4221146105 hasConcept C138885662 @default.
- W4221146105 hasConcept C154945302 @default.
- W4221146105 hasConcept C199360897 @default.
- W4221146105 hasConcept C201995342 @default.
- W4221146105 hasConcept C204321447 @default.
- W4221146105 hasConcept C2780451532 @default.
- W4221146105 hasConcept C41008148 @default.
- W4221146105 hasConcept C41895202 @default.
- W4221146105 hasConcept C56666940 @default.
- W4221146105 hasConceptScore W4221146105C127413603 @default.
- W4221146105 hasConceptScore W4221146105C138885662 @default.
- W4221146105 hasConceptScore W4221146105C154945302 @default.
- W4221146105 hasConceptScore W4221146105C199360897 @default.
- W4221146105 hasConceptScore W4221146105C201995342 @default.
- W4221146105 hasConceptScore W4221146105C204321447 @default.
- W4221146105 hasConceptScore W4221146105C2780451532 @default.
- W4221146105 hasConceptScore W4221146105C41008148 @default.
- W4221146105 hasConceptScore W4221146105C41895202 @default.
- W4221146105 hasConceptScore W4221146105C56666940 @default.
- W4221146105 hasLocation W42211461051 @default.
- W4221146105 hasOpenAccess W4221146105 @default.
- W4221146105 hasPrimaryLocation W42211461051 @default.
- W4221146105 hasRelatedWork W1572279561 @default.
- W4221146105 hasRelatedWork W2071085159 @default.
- W4221146105 hasRelatedWork W2081647779 @default.
- W4221146105 hasRelatedWork W2368651715 @default.
- W4221146105 hasRelatedWork W2590682089 @default.
- W4221146105 hasRelatedWork W2611614995 @default.
- W4221146105 hasRelatedWork W2789919619 @default.
- W4221146105 hasRelatedWork W2897778959 @default.
- W4221146105 hasRelatedWork W3107474891 @default.
- W4221146105 hasRelatedWork W3185852197 @default.
- W4221146105 isParatext "false" @default.
- W4221146105 isRetracted "false" @default.
- W4221146105 workType "article" @default.