Matches in SemOpenAlex for { <https://semopenalex.org/work/W4309969340> ?p ?o ?g. }
Showing items 1 to 56 of 56 with 100 items per page.
- W4309969340 abstract "Crowdsourcing platforms are often used to collect datasets for training machine learning models, despite higher levels of inaccurate labeling compared to expert labeling. There are two common strategies to manage the impact of such noise. The first involves aggregating redundant annotations, but comes at the expense of labeling substantially fewer examples. Secondly, prior works have also considered using the entire annotation budget to label as many examples as possible and subsequently apply denoising algorithms to implicitly clean the dataset. We find a middle ground and propose an approach which reserves a fraction of annotations to explicitly clean up highly probable error samples to optimize the annotation process. In particular, we allocate a large portion of the labeling budget to form an initial dataset used to train a model. This model is then used to identify specific examples that appear most likely to be incorrect, which we spend the remaining budget to relabel. Experiments across three model variations and four natural language processing tasks show our approach outperforms or matches both label aggregation and advanced denoising methods designed to handle noisy labels when allocated the same finite annotation budget." @default.
- W4309969340 created "2022-11-30" @default.
- W4309969340 creator A5067390670 @default.
- W4309969340 creator A5084866278 @default.
- W4309969340 creator A5086343573 @default.
- W4309969340 date "2021-10-15" @default.
- W4309969340 modified "2023-10-11" @default.
- W4309969340 title "Clean or Annotate: How to Spend a Limited Data Collection Budget" @default.
- W4309969340 doi "https://doi.org/10.48550/arxiv.2110.08355" @default.
- W4309969340 hasPublicationYear "2021" @default.
- W4309969340 type Work @default.
- W4309969340 citedByCount "0" @default.
- W4309969340 crossrefType "posted-content" @default.
- W4309969340 hasAuthorship W4309969340A5067390670 @default.
- W4309969340 hasAuthorship W4309969340A5084866278 @default.
- W4309969340 hasAuthorship W4309969340A5086343573 @default.
- W4309969340 hasBestOaLocation W43099693401 @default.
- W4309969340 hasConcept C111919701 @default.
- W4309969340 hasConcept C115961682 @default.
- W4309969340 hasConcept C119857082 @default.
- W4309969340 hasConcept C124101348 @default.
- W4309969340 hasConcept C136764020 @default.
- W4309969340 hasConcept C154945302 @default.
- W4309969340 hasConcept C2776321320 @default.
- W4309969340 hasConcept C41008148 @default.
- W4309969340 hasConcept C62230096 @default.
- W4309969340 hasConcept C98045186 @default.
- W4309969340 hasConcept C99498987 @default.
- W4309969340 hasConceptScore W4309969340C111919701 @default.
- W4309969340 hasConceptScore W4309969340C115961682 @default.
- W4309969340 hasConceptScore W4309969340C119857082 @default.
- W4309969340 hasConceptScore W4309969340C124101348 @default.
- W4309969340 hasConceptScore W4309969340C136764020 @default.
- W4309969340 hasConceptScore W4309969340C154945302 @default.
- W4309969340 hasConceptScore W4309969340C2776321320 @default.
- W4309969340 hasConceptScore W4309969340C41008148 @default.
- W4309969340 hasConceptScore W4309969340C62230096 @default.
- W4309969340 hasConceptScore W4309969340C98045186 @default.
- W4309969340 hasConceptScore W4309969340C99498987 @default.
- W4309969340 hasLocation W43099693401 @default.
- W4309969340 hasLocation W43099693402 @default.
- W4309969340 hasOpenAccess W4309969340 @default.
- W4309969340 hasPrimaryLocation W43099693401 @default.
- W4309969340 hasRelatedWork W1543060214 @default.
- W4309969340 hasRelatedWork W2110165259 @default.
- W4309969340 hasRelatedWork W2768021031 @default.
- W4309969340 hasRelatedWork W2832644133 @default.
- W4309969340 hasRelatedWork W2914515217 @default.
- W4309969340 hasRelatedWork W3114228236 @default.
- W4309969340 hasRelatedWork W3174177938 @default.
- W4309969340 hasRelatedWork W4254346324 @default.
- W4309969340 hasRelatedWork W4287549221 @default.
- W4309969340 hasRelatedWork W4288474950 @default.
- W4309969340 isParatext "false" @default.
- W4309969340 isRetracted "false" @default.
- W4309969340 workType "article" @default.