Matches in SemOpenAlex for { <https://semopenalex.org/work/W3120706522> ?p ?o ?g. }
- W3120706522 abstract "Despite impressive performance on standard benchmarks, deep neural networks are often brittle when deployed in real-world systems. Consequently, recent research has focused on testing the robustness of such models, resulting in a diverse set of evaluation methodologies ranging from adversarial attacks to rule-based data transformations. In this work, we identify challenges with evaluating NLP systems and propose a solution in the form of Robustness Gym (RG), a simple and extensible evaluation toolkit that unifies 4 standard evaluation paradigms: subpopulations, transformations, evaluation sets, and adversarial attacks. By providing a common platform for evaluation, Robustness Gym enables practitioners to compare results from all 4 evaluation paradigms with just a few clicks, and to easily develop and share novel evaluation methods using a built-in set of abstractions. To validate Robustness Gym's utility to practitioners, we conducted a real-world case study with a sentiment-modeling team, revealing performance degradations of 18%+. To verify that Robustness Gym can aid novel research analyses, we perform the first study of state-of-the-art commercial and academic named entity linking (NEL) systems, as well as a fine-grained analysis of state-of-the-art summarization models. For NEL, commercial systems struggle to link rare entities and lag their academic counterparts by 10%+, while state-of-the-art summarization models struggle on examples that require abstraction and distillation, degrading by 9%+. Robustness Gym can be found at this https URL" @default.
- W3120706522 created "2021-01-18" @default.
- W3120706522 creator A5001987532 @default.
- W3120706522 creator A5032046813 @default.
- W3120706522 creator A5041381106 @default.
- W3120706522 creator A5044735229 @default.
- W3120706522 creator A5052900932 @default.
- W3120706522 creator A5062516561 @default.
- W3120706522 creator A5063897942 @default.
- W3120706522 creator A5070267604 @default.
- W3120706522 creator A5076580255 @default.
- W3120706522 date "2021-01-13" @default.
- W3120706522 modified "2023-09-27" @default.
- W3120706522 title "Robustness Gym: Unifying the NLP Evaluation Landscape" @default.
- W3120706522 cites W1506806321 @default.
- W3120706522 cites W1555759181 @default.
- W3120706522 cites W2089383796 @default.
- W3120706522 cites W2123142779 @default.
- W3120706522 cites W2131241448 @default.
- W3120706522 cites W2136297100 @default.
- W3120706522 cites W2182361439 @default.
- W3120706522 cites W2219598741 @default.
- W3120706522 cites W2251869843 @default.
- W3120706522 cites W2785366763 @default.
- W3120706522 cites W2790415926 @default.
- W3120706522 cites W2796868841 @default.
- W3120706522 cites W2798280648 @default.
- W3120706522 cites W2798665661 @default.
- W3120706522 cites W2799064010 @default.
- W3120706522 cites W2886614482 @default.
- W3120706522 cites W2889468083 @default.
- W3120706522 cites W2896807716 @default.
- W3120706522 cites W2899361264 @default.
- W3120706522 cites W2911435132 @default.
- W3120706522 cites W2913991091 @default.
- W3120706522 cites W2949615363 @default.
- W3120706522 cites W2949858875 @default.
- W3120706522 cites W2949911172 @default.
- W3120706522 cites W2951458896 @default.
- W3120706522 cites W2951623480 @default.
- W3120706522 cites W2953084091 @default.
- W3120706522 cites W2956281901 @default.
- W3120706522 cites W2963351832 @default.
- W3120706522 cites W2963394326 @default.
- W3120706522 cites W2963542100 @default.
- W3120706522 cites W2963607157 @default.
- W3120706522 cites W2963661177 @default.
- W3120706522 cites W2963846996 @default.
- W3120706522 cites W2963969878 @default.
- W3120706522 cites W2969442125 @default.
- W3120706522 cites W2969670093 @default.
- W3120706522 cites W2970611986 @default.
- W3120706522 cites W2970846123 @default.
- W3120706522 cites W2971034336 @default.
- W3120706522 cites W2971296908 @default.
- W3120706522 cites W2973631113 @default.
- W3120706522 cites W2974284957 @default.
- W3120706522 cites W2977235550 @default.
- W3120706522 cites W2984812384 @default.
- W3120706522 cites W2990704537 @default.
- W3120706522 cites W3005040148 @default.
- W3120706522 cites W3006437051 @default.
- W3120706522 cites W3014564055 @default.
- W3120706522 cites W3015830489 @default.
- W3120706522 cites W3016839026 @default.
- W3120706522 cites W3022116759 @default.
- W3120706522 cites W3023955814 @default.
- W3120706522 cites W3025408396 @default.
- W3120706522 cites W3030163527 @default.
- W3120706522 cites W3030218064 @default.
- W3120706522 cites W3034715004 @default.
- W3120706522 cites W3034850762 @default.
- W3120706522 cites W3034999214 @default.
- W3120706522 cites W3035507081 @default.
- W3120706522 cites W3037492894 @default.
- W3120706522 cites W3044324512 @default.
- W3120706522 cites W3045321166 @default.
- W3120706522 cites W3082274269 @default.
- W3120706522 cites W3094024085 @default.
- W3120706522 cites W3098350697 @default.
- W3120706522 cites W3098824823 @default.
- W3120706522 cites W3098987177 @default.
- W3120706522 cites W3100279624 @default.
- W3120706522 cites W3101662419 @default.
- W3120706522 cites W3104423855 @default.
- W3120706522 cites W3104911444 @default.
- W3120706522 cites W3112486745 @default.
- W3120706522 cites W9657784 @default.
- W3120706522 cites W2525127255 @default.
- W3120706522 cites W2971350322 @default.
- W3120706522 hasPublicationYear "2021" @default.
- W3120706522 type Work @default.
- W3120706522 sameAs 3120706522 @default.
- W3120706522 citedByCount "29" @default.
- W3120706522 countsByYear W31207065222020 @default.
- W3120706522 countsByYear W31207065222021 @default.
- W3120706522 crossrefType "posted-content" @default.
- W3120706522 hasAuthorship W3120706522A5001987532 @default.
- W3120706522 hasAuthorship W3120706522A5032046813 @default.
- W3120706522 hasAuthorship W3120706522A5041381106 @default.