Matches in SemOpenAlex for { <https://semopenalex.org/work/W3100355250> ?p ?o ?g. }
- W3100355250 abstract "Pretrained neural language models (LMs) are prone to generating racist, sexist, or otherwise toxic language, which hinders their safe deployment. We investigate the extent to which pretrained LMs can be prompted to generate toxic language, and the effectiveness of controllable text generation algorithms at preventing such toxic degeneration. We create and release RealToxicityPrompts, a dataset of 100K naturally occurring, sentence-level prompts derived from a large corpus of English web text, paired with toxicity scores from a widely-used toxicity classifier. Using RealToxicityPrompts, we find that pretrained LMs can degenerate into toxic text even from seemingly innocuous prompts. We empirically assess several controllable generation methods, and find that while data- or compute-intensive methods (e.g., adaptive pretraining on non-toxic data) are more effective at steering away from toxicity than simpler solutions (e.g., banning “bad” words), no current method is failsafe against neural toxic degeneration. To pinpoint the potential cause of such persistent toxic degeneration, we analyze two web text corpora used to pretrain several LMs (including GPT-2; Radford et al., 2019), and find a significant amount of offensive, factually unreliable, and otherwise toxic content. Our work provides a test bed for evaluating toxic generations by LMs and stresses the need for better data selection processes for pretraining." @default.
- W3100355250 created "2020-11-23" @default.
- W3100355250 creator A5015128745 @default.
- W3100355250 creator A5022328505 @default.
- W3100355250 creator A5045464993 @default.
- W3100355250 creator A5075783850 @default.
- W3100355250 creator A5088517824 @default.
- W3100355250 date "2020-01-01" @default.
- W3100355250 modified "2023-10-13" @default.
- W3100355250 title "RealToxicityPrompts: Evaluating Neural Toxic Degeneration in Language Models" @default.
- W3100355250 cites W1535357018 @default.
- W3100355250 cites W1566289585 @default.
- W3100355250 cites W1736726159 @default.
- W3100355250 cites W1987545044 @default.
- W3100355250 cites W2012942264 @default.
- W3100355250 cites W2112796928 @default.
- W3100355250 cites W2165279024 @default.
- W3100355250 cites W2301953046 @default.
- W3100355250 cites W2317486271 @default.
- W3100355250 cites W2489130013 @default.
- W3100355250 cites W2493916176 @default.
- W3100355250 cites W2563826943 @default.
- W3100355250 cites W2585712495 @default.
- W3100355250 cites W2597603852 @default.
- W3100355250 cites W2607388700 @default.
- W3100355250 cites W2608438166 @default.
- W3100355250 cites W2734862619 @default.
- W3100355250 cites W2791170418 @default.
- W3100355250 cites W2911227954 @default.
- W3100355250 cites W2915023677 @default.
- W3100355250 cites W2926555354 @default.
- W3100355250 cites W2946930197 @default.
- W3100355250 cites W2947160092 @default.
- W3100355250 cites W2949678053 @default.
- W3100355250 cites W2962784628 @default.
- W3100355250 cites W2962788902 @default.
- W3100355250 cites W2963078909 @default.
- W3100355250 cites W2963205619 @default.
- W3100355250 cites W2963283805 @default.
- W3100355250 cites W2963341956 @default.
- W3100355250 cites W2963403868 @default.
- W3100355250 cites W2963790016 @default.
- W3100355250 cites W2963955897 @default.
- W3100355250 cites W2964235839 @default.
- W3100355250 cites W2965373594 @default.
- W3100355250 cites W2968297680 @default.
- W3100355250 cites W2970283086 @default.
- W3100355250 cites W2970395295 @default.
- W3100355250 cites W2970562804 @default.
- W3100355250 cites W2970971581 @default.
- W3100355250 cites W2971008823 @default.
- W3100355250 cites W2971173235 @default.
- W3100355250 cites W2971307358 @default.
- W3100355250 cites W2972413484 @default.
- W3100355250 cites W2972486443 @default.
- W3100355250 cites W2972657613 @default.
- W3100355250 cites W2972668795 @default.
- W3100355250 cites W2972735048 @default.
- W3100355250 cites W2973049837 @default.
- W3100355250 cites W2981852735 @default.
- W3100355250 cites W2982756474 @default.
- W3100355250 cites W2996287690 @default.
- W3100355250 cites W2997195635 @default.
- W3100355250 cites W3013451997 @default.
- W3100355250 cites W3030163527 @default.
- W3100355250 cites W3031851887 @default.
- W3100355250 cites W3033639609 @default.
- W3100355250 cites W3034238904 @default.
- W3100355250 cites W3034937117 @default.
- W3100355250 cites W3035296331 @default.
- W3100355250 cites W3037831233 @default.
- W3100355250 cites W3043315258 @default.
- W3100355250 cites W3047056223 @default.
- W3100355250 cites W3086249591 @default.
- W3100355250 cites W3088059392 @default.
- W3100355250 cites W3100279624 @default.
- W3100355250 cites W3102914525 @default.
- W3100355250 cites W3127086053 @default.
- W3100355250 cites W3133874049 @default.
- W3100355250 doi "https://doi.org/10.18653/v1/2020.findings-emnlp.301" @default.
- W3100355250 hasPublicationYear "2020" @default.
- W3100355250 type Work @default.
- W3100355250 sameAs 3100355250 @default.
- W3100355250 citedByCount "102" @default.
- W3100355250 countsByYear W31003552502012 @default.
- W3100355250 countsByYear W31003552502020 @default.
- W3100355250 countsByYear W31003552502021 @default.
- W3100355250 countsByYear W31003552502022 @default.
- W3100355250 countsByYear W31003552502023 @default.
- W3100355250 crossrefType "proceedings-article" @default.
- W3100355250 hasAuthorship W3100355250A5015128745 @default.
- W3100355250 hasAuthorship W3100355250A5022328505 @default.
- W3100355250 hasAuthorship W3100355250A5045464993 @default.
- W3100355250 hasAuthorship W3100355250A5075783850 @default.
- W3100355250 hasAuthorship W3100355250A5088517824 @default.
- W3100355250 hasBestOaLocation W31003552501 @default.
- W3100355250 hasConcept C119857082 @default.
- W3100355250 hasConcept C12267149 @default.
- W3100355250 hasConcept C127413603 @default.
- W3100355250 hasConcept C137293760 @default.