Matches in SemOpenAlex for { <https://semopenalex.org/work/W3146670357> ?p ?o ?g. }
- W3146670357 abstract "We algorithmically identify label errors in the test sets of 10 of the most commonly used computer vision, natural language, and audio datasets, and subsequently study the potential for these label errors to affect benchmark results. Errors in test sets are numerous and widespread: we estimate an average of 3.4% errors across the 10 datasets, where for example 2916 label errors comprise 6% of the ImageNet validation set. Putative label errors are found using confident learning and then human-validated via crowdsourcing (54% of the algorithmically-flagged candidates are indeed erroneously labeled). Surprisingly, we find that lower-capacity models may be practically more useful than higher-capacity models in real-world datasets with high proportions of erroneously labeled data. For example, on ImageNet with corrected labels: ResNet-18 outperforms ResNet-50 if the prevalence of originally mislabeled test examples increases by just 6%. On CIFAR-10 with corrected labels: VGG-11 outperforms VGG-19 if the prevalence of originally mislabeled test examples increases by 5%. Traditionally, ML practitioners choose which model to deploy based on test accuracy; our findings advise caution here, proposing that judging models over correctly labeled test sets may be more useful, especially for noisy real-world datasets." @default.
- W3146670357 created "2021-04-13" @default.
- W3146670357 creator A5071352839 @default.
- W3146670357 creator A5074303284 @default.
- W3146670357 creator A5091720485 @default.
- W3146670357 date "2021-03-26" @default.
- W3146670357 modified "2023-10-11" @default.
- W3146670357 title "Pervasive Label Errors in Test Sets Destabilize Machine Learning Benchmarks" @default.
- W3146670357 cites W1576445103 @default.
- W3146670357 cites W1686810756 @default.
- W3146670357 cites W2027731328 @default.
- W3146670357 cites W2107189314 @default.
- W3146670357 cites W2108598243 @default.
- W3146670357 cites W2112796928 @default.
- W3146670357 cites W2113290770 @default.
- W3146670357 cites W2113459411 @default.
- W3146670357 cites W2167460663 @default.
- W3146670357 cites W2194775991 @default.
- W3146670357 cites W2593116425 @default.
- W3146670357 cites W2598946096 @default.
- W3146670357 cites W2606712314 @default.
- W3146670357 cites W2618574054 @default.
- W3146670357 cites W2743200750 @default.
- W3146670357 cites W2752971446 @default.
- W3146670357 cites W2799090584 @default.
- W3146670357 cites W2897080984 @default.
- W3146670357 cites W2943997064 @default.
- W3146670357 cites W2946963372 @default.
- W3146670357 cites W2948920178 @default.
- W3146670357 cites W2962739340 @default.
- W3146670357 cites W2962766164 @default.
- W3146670357 cites W2962843773 @default.
- W3146670357 cites W2962900737 @default.
- W3146670357 cites W2963096987 @default.
- W3146670357 cites W2963113424 @default.
- W3146670357 cites W2963703197 @default.
- W3146670357 cites W2964081807 @default.
- W3146670357 cites W2964273174 @default.
- W3146670357 cites W2964292098 @default.
- W3146670357 cites W2964309657 @default.
- W3146670357 cites W2970038028 @default.
- W3146670357 cites W2973562770 @default.
- W3146670357 cites W2989457543 @default.
- W3146670357 cites W3026649487 @default.
- W3146670357 cites W3035314311 @default.
- W3146670357 cites W3035343530 @default.
- W3146670357 cites W3035702670 @default.
- W3146670357 cites W3042879175 @default.
- W3146670357 cites W3109044914 @default.
- W3146670357 cites W3118608800 @default.
- W3146670357 cites W3118813946 @default.
- W3146670357 cites W3134631405 @default.
- W3146670357 cites W3156669901 @default.
- W3146670357 cites W3176502563 @default.
- W3146670357 cites W369786348 @default.
- W3146670357 cites W9014458 @default.
- W3146670357 cites W2962915384 @default.
- W3146670357 hasPublicationYear "2021" @default.
- W3146670357 type Work @default.
- W3146670357 sameAs 3146670357 @default.
- W3146670357 citedByCount "25" @default.
- W3146670357 countsByYear W31466703572020 @default.
- W3146670357 countsByYear W31466703572021 @default.
- W3146670357 countsByYear W31466703572022 @default.
- W3146670357 crossrefType "posted-content" @default.
- W3146670357 hasAuthorship W3146670357A5071352839 @default.
- W3146670357 hasAuthorship W3146670357A5074303284 @default.
- W3146670357 hasAuthorship W3146670357A5091720485 @default.
- W3146670357 hasConcept C119857082 @default.
- W3146670357 hasConcept C124101348 @default.
- W3146670357 hasConcept C13280743 @default.
- W3146670357 hasConcept C136764020 @default.
- W3146670357 hasConcept C151730666 @default.
- W3146670357 hasConcept C154945302 @default.
- W3146670357 hasConcept C169903167 @default.
- W3146670357 hasConcept C177264268 @default.
- W3146670357 hasConcept C185798385 @default.
- W3146670357 hasConcept C199360897 @default.
- W3146670357 hasConcept C204321447 @default.
- W3146670357 hasConcept C205649164 @default.
- W3146670357 hasConcept C2777267654 @default.
- W3146670357 hasConcept C41008148 @default.
- W3146670357 hasConcept C62230096 @default.
- W3146670357 hasConcept C86803240 @default.
- W3146670357 hasConceptScore W3146670357C119857082 @default.
- W3146670357 hasConceptScore W3146670357C124101348 @default.
- W3146670357 hasConceptScore W3146670357C13280743 @default.
- W3146670357 hasConceptScore W3146670357C136764020 @default.
- W3146670357 hasConceptScore W3146670357C151730666 @default.
- W3146670357 hasConceptScore W3146670357C154945302 @default.
- W3146670357 hasConceptScore W3146670357C169903167 @default.
- W3146670357 hasConceptScore W3146670357C177264268 @default.
- W3146670357 hasConceptScore W3146670357C185798385 @default.
- W3146670357 hasConceptScore W3146670357C199360897 @default.
- W3146670357 hasConceptScore W3146670357C204321447 @default.
- W3146670357 hasConceptScore W3146670357C205649164 @default.
- W3146670357 hasConceptScore W3146670357C2777267654 @default.
- W3146670357 hasConceptScore W3146670357C41008148 @default.
- W3146670357 hasConceptScore W3146670357C62230096 @default.
- W3146670357 hasConceptScore W3146670357C86803240 @default.
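
Each match above follows the layout `- subject predicate object @graph.`, which mirrors the `?p ?o ?g` variables in the query pattern. As a minimal sketch, the listing could be read into tuples as shown below. The names `LINE_RE`, `parse_match`, and `group_by_predicate` are hypothetical, and the line layout is an assumption inferred from this listing rather than a documented SemOpenAlex export format.

```python
import re
from collections import defaultdict

# Assumed layout, inferred from the listing: "- SUBJECT PREDICATE OBJECT @GRAPH."
# The object may be a bare identifier or a double-quoted literal containing spaces.
LINE_RE = re.compile(r'^- (\S+) (\S+) (.+) @(\S+?)\.?$')


def parse_match(line):
    """Return (subject, predicate, object, graph), or None if the line does not fit."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    subject, predicate, obj, graph = m.groups()
    # Drop the surrounding quotes of a literal value.
    if len(obj) >= 2 and obj[0] == '"' and obj[-1] == '"':
        obj = obj[1:-1]
    return subject, predicate, obj, graph


def group_by_predicate(lines):
    """Collect object values per predicate, skipping lines that do not parse."""
    grouped = defaultdict(list)
    for line in lines:
        parsed = parse_match(line)
        if parsed is not None:
            _, predicate, obj, _ = parsed
            grouped[predicate].append(obj)
    return dict(grouped)


if __name__ == "__main__":
    sample = [
        '- W3146670357 created "2021-04-13" @default.',
        '- W3146670357 cites W1576445103 @default.',
        '- W3146670357 cites W9014458 @default.',
    ]
    for predicate, values in group_by_predicate(sample).items():
        print(predicate, values)
```

Grouping by predicate makes it straightforward to count, for instance, how many `cites` entries the work has, or to pair each `hasConcept` value with its `hasConceptScore` entry.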