Matches in SemOpenAlex for { <https://semopenalex.org/work/W4226207254> ?p ?o ?g. }
- W4226207254 endingPage "413" @default.
- W4226207254 startingPage "393" @default.
- W4226207254 abstract "Common designs of model evaluation typically focus on monolingual settings, where different models are compared according to their performance on a single data set that is assumed to be representative of all possible data for the task at hand. While this may be reasonable for a large data set, this assumption is difficult to maintain in low-resource scenarios, where artifacts of the data collection can yield data sets that are outliers, potentially making conclusions about model performance coincidental. To address these concerns, we investigate model generalizability in crosslinguistic low-resource scenarios. Using morphological segmentation as the test case, we compare three broad classes of models with different parameterizations, taking data from 11 languages across 6 language families. In each experimental setting, we evaluate all models on a first data set, then examine their performance consistency when introducing new randomly sampled data sets with the same size and when applying the trained models to unseen test sets of varying sizes. The results demonstrate that the extent of model generalization depends on the characteristics of the data set, and does not necessarily rely heavily on the data set size. Among the characteristics that we studied, the ratio of morpheme overlap and that of the average number of morphemes per word between the training and test sets are the two most prominent factors. Our findings suggest that future work should adopt random sampling to construct data sets with different sizes in order to make more responsible claims about model evaluation." @default.
- W4226207254 created "2022-05-05" @default.
- W4226207254 creator A5063501461 @default.
- W4226207254 creator A5085014772 @default.
- W4226207254 date "2022-01-01" @default.
- W4226207254 modified "2023-09-25" @default.
- W4226207254 title "Data-driven Model Generalizability in Crosslinguistic Low-resource Morphological Segmentation" @default.
- W4226207254 cites W14331692 @default.
- W4226207254 cites W1533169541 @default.
- W4226207254 cites W1632114991 @default.
- W4226207254 cites W2008652694 @default.
- W4226207254 cites W2013556410 @default.
- W4226207254 cites W2051434435 @default.
- W4226207254 cites W2094544353 @default.
- W4226207254 cites W2105378193 @default.
- W4226207254 cites W2110485445 @default.
- W4226207254 cites W2117621558 @default.
- W4226207254 cites W2157331557 @default.
- W4226207254 cites W2173928607 @default.
- W4226207254 cites W2251565024 @default.
- W4226207254 cites W2317357401 @default.
- W4226207254 cites W2461808544 @default.
- W4226207254 cites W2549835527 @default.
- W4226207254 cites W2562053465 @default.
- W4226207254 cites W2564287196 @default.
- W4226207254 cites W2798935874 @default.
- W4226207254 cites W2888922637 @default.
- W4226207254 cites W2893191669 @default.
- W4226207254 cites W2911227954 @default.
- W4226207254 cites W2918996109 @default.
- W4226207254 cites W2921890305 @default.
- W4226207254 cites W2923014074 @default.
- W4226207254 cites W2951243568 @default.
- W4226207254 cites W2951286828 @default.
- W4226207254 cites W2951815953 @default.
- W4226207254 cites W2963212250 @default.
- W4226207254 cites W2963748441 @default.
- W4226207254 cites W2963834860 @default.
- W4226207254 cites W2963940534 @default.
- W4226207254 cites W2963969878 @default.
- W4226207254 cites W2970283086 @default.
- W4226207254 cites W2970529259 @default.
- W4226207254 cites W2983089349 @default.
- W4226207254 cites W3035064549 @default.
- W4226207254 cites W3035267217 @default.
- W4226207254 cites W3098135552 @default.
- W4226207254 cites W3099624838 @default.
- W4226207254 cites W3100501376 @default.
- W4226207254 cites W3100679627 @default.
- W4226207254 cites W3103536442 @default.
- W4226207254 cites W3104469895 @default.
- W4226207254 cites W3104492344 @default.
- W4226207254 cites W3114610051 @default.
- W4226207254 cites W3155192455 @default.
- W4226207254 cites W3155808134 @default.
- W4226207254 cites W3166161990 @default.
- W4226207254 cites W3168656614 @default.
- W4226207254 cites W3174896143 @default.
- W4226207254 cites W3199526956 @default.
- W4226207254 doi "https://doi.org/10.1162/tacl_a_00467" @default.
- W4226207254 hasPublicationYear "2022" @default.
- W4226207254 type Work @default.
- W4226207254 citedByCount "0" @default.
- W4226207254 crossrefType "journal-article" @default.
- W4226207254 hasAuthorship W4226207254A5063501461 @default.
- W4226207254 hasAuthorship W4226207254A5085014772 @default.
- W4226207254 hasBestOaLocation W42262072541 @default.
- W4226207254 hasConcept C105795698 @default.
- W4226207254 hasConcept C119857082 @default.
- W4226207254 hasConcept C134306372 @default.
- W4226207254 hasConcept C154945302 @default.
- W4226207254 hasConcept C165297611 @default.
- W4226207254 hasConcept C16910744 @default.
- W4226207254 hasConcept C169903167 @default.
- W4226207254 hasConcept C177148314 @default.
- W4226207254 hasConcept C177264268 @default.
- W4226207254 hasConcept C199360897 @default.
- W4226207254 hasConcept C204321447 @default.
- W4226207254 hasConcept C27158222 @default.
- W4226207254 hasConcept C2776436953 @default.
- W4226207254 hasConcept C33923547 @default.
- W4226207254 hasConcept C41008148 @default.
- W4226207254 hasConcept C58489278 @default.
- W4226207254 hasConcept C79337645 @default.
- W4226207254 hasConcept C89600930 @default.
- W4226207254 hasConceptScore W4226207254C105795698 @default.
- W4226207254 hasConceptScore W4226207254C119857082 @default.
- W4226207254 hasConceptScore W4226207254C134306372 @default.
- W4226207254 hasConceptScore W4226207254C154945302 @default.
- W4226207254 hasConceptScore W4226207254C165297611 @default.
- W4226207254 hasConceptScore W4226207254C16910744 @default.
- W4226207254 hasConceptScore W4226207254C169903167 @default.
- W4226207254 hasConceptScore W4226207254C177148314 @default.
- W4226207254 hasConceptScore W4226207254C177264268 @default.
- W4226207254 hasConceptScore W4226207254C199360897 @default.
- W4226207254 hasConceptScore W4226207254C204321447 @default.
- W4226207254 hasConceptScore W4226207254C27158222 @default.
- W4226207254 hasConceptScore W4226207254C2776436953 @default.