Matches in SemOpenAlex for { <https://semopenalex.org/work/W3199369541> ?p ?o ?g. }
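The pattern above fixes the subject to one work IRI and leaves predicate, object, and graph open. As a minimal sketch, the same lookup could be written as a SPARQL SELECT over the default graph, since every match listed here carries `@default`. The helper name `build_work_query` is hypothetical and not part of SemOpenAlex. Only the IRI prefix comes from the pattern above. No endpoint is contacted; the function only builds the query string.

```python
def build_work_query(work_id: str) -> str:
    """Return a SPARQL SELECT listing every predicate/object of one work.

    Assumes work identifiers look like 'W' followed by digits, as in the
    listing, and that the statements live in the default graph.
    """
    if not (work_id.startswith("W") and work_id[1:].isdigit()):
        raise ValueError(f"not a work identifier: {work_id!r}")
    # IRI prefix taken from the pattern shown in the page header.
    iri = f"https://semopenalex.org/work/{work_id}"
    return f"SELECT ?p ?o WHERE {{ <{iri}> ?p ?o . }}"


query = build_work_query("W3199369541")
print(query)
```

The resulting string can be passed to any SPARQL client; which endpoint to send it to is left open here.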
- W3199369541 abstract "Large language models (LM) generate remarkably fluent text and can be efficiently adapted across NLP tasks. Measuring and guaranteeing the quality of generated text in terms of safety is imperative for deploying LMs in the real world; to this end, prior work often relies on automatic evaluation of LM toxicity. We critically discuss this approach, evaluate several toxicity mitigation strategies with respect to both automatic and human evaluation, and analyze consequences of toxicity mitigation in terms of model bias and LM quality. We demonstrate that while basic intervention strategies can effectively optimize previously established automatic metrics on the RealToxicityPrompts dataset, this comes at the cost of reduced LM coverage for both texts about, and dialects of, marginalized groups. Additionally, we find that human raters often disagree with high automatic toxicity scores after strong toxicity reduction interventions -- highlighting further the nuances involved in careful evaluation of LM toxicity." @default.
- W3199369541 created "2021-09-27" @default.
- W3199369541 creator A5005117726 @default.
- W3199369541 creator A5013834379 @default.
- W3199369541 creator A5020758501 @default.
- W3199369541 creator A5049998479 @default.
- W3199369541 creator A5059226057 @default.
- W3199369541 creator A5067022550 @default.
- W3199369541 creator A5069590583 @default.
- W3199369541 creator A5076474156 @default.
- W3199369541 creator A5089424256 @default.
- W3199369541 creator A5091853559 @default.
- W3199369541 date "2021-09-15" @default.
- W3199369541 modified "2023-09-28" @default.
- W3199369541 title "Challenges in Detoxifying Language Models" @default.
- W3199369541 cites W1964164866 @default.
- W3199369541 cites W2099057450 @default.
- W3199369541 cites W2473344385 @default.
- W3199369541 cites W2473555522 @default.
- W3199369541 cites W2540646130 @default.
- W3199369541 cites W2585712495 @default.
- W3199369541 cites W2595653137 @default.
- W3199369541 cites W2791170418 @default.
- W3199369541 cites W2803517648 @default.
- W3199369541 cites W2920807444 @default.
- W3199369541 cites W2922580172 @default.
- W3199369541 cites W2942370121 @default.
- W3199369541 cites W2948223045 @default.
- W3199369541 cites W2949678053 @default.
- W3199369541 cites W2962937198 @default.
- W3199369541 cites W2962990575 @default.
- W3199369541 cites W2963206148 @default.
- W3199369541 cites W2963250244 @default.
- W3199369541 cites W2963341956 @default.
- W3199369541 cites W2963494889 @default.
- W3199369541 cites W2964110616 @default.
- W3199369541 cites W2964121744 @default.
- W3199369541 cites W2964235839 @default.
- W3199369541 cites W2971307358 @default.
- W3199369541 cites W2972735048 @default.
- W3199369541 cites W2982756474 @default.
- W3199369541 cites W2993398598 @default.
- W3199369541 cites W2996287690 @default.
- W3199369541 cites W3017311573 @default.
- W3199369541 cites W3034238904 @default.
- W3199369541 cites W3034937117 @default.
- W3199369541 cites W3082274269 @default.
- W3199369541 cites W3093233911 @default.
- W3199369541 cites W3100355250 @default.
- W3199369541 cites W3101767999 @default.
- W3199369541 cites W3102924767 @default.
- W3199369541 cites W3115772171 @default.
- W3199369541 cites W3124120397 @default.
- W3199369541 cites W3134354193 @default.
- W3199369541 cites W3135734416 @default.
- W3199369541 cites W3135773605 @default.
- W3199369541 cites W3153490941 @default.
- W3199369541 cites W3153611199 @default.
- W3199369541 cites W3155742828 @default.
- W3199369541 cites W3156216837 @default.
- W3199369541 cites W3184144760 @default.
- W3199369541 cites W3185376810 @default.
- W3199369541 cites W3190860428 @default.
- W3199369541 cites W3216852152 @default.
- W3199369541 cites W78136081 @default.
- W3199369541 cites W80056832 @default.
- W3199369541 hasPublicationYear "2021" @default.
- W3199369541 type Work @default.
- W3199369541 sameAs 3199369541 @default.
- W3199369541 citedByCount "2" @default.
- W3199369541 countsByYear W31993695412021 @default.
- W3199369541 crossrefType "posted-content" @default.
- W3199369541 hasAuthorship W3199369541A5005117726 @default.
- W3199369541 hasAuthorship W3199369541A5013834379 @default.
- W3199369541 hasAuthorship W3199369541A5020758501 @default.
- W3199369541 hasAuthorship W3199369541A5049998479 @default.
- W3199369541 hasAuthorship W3199369541A5059226057 @default.
- W3199369541 hasAuthorship W3199369541A5067022550 @default.
- W3199369541 hasAuthorship W3199369541A5069590583 @default.
- W3199369541 hasAuthorship W3199369541A5076474156 @default.
- W3199369541 hasAuthorship W3199369541A5089424256 @default.
- W3199369541 hasAuthorship W3199369541A5091853559 @default.
- W3199369541 hasConcept C111472728 @default.
- W3199369541 hasConcept C112930515 @default.
- W3199369541 hasConcept C127413603 @default.
- W3199369541 hasConcept C138885662 @default.
- W3199369541 hasConcept C144133560 @default.
- W3199369541 hasConcept C154945302 @default.
- W3199369541 hasConcept C178790620 @default.
- W3199369541 hasConcept C185592680 @default.
- W3199369541 hasConcept C18762648 @default.
- W3199369541 hasConcept C2779530757 @default.
- W3199369541 hasConcept C29730261 @default.
- W3199369541 hasConcept C41008148 @default.
- W3199369541 hasConcept C78519656 @default.
- W3199369541 hasConceptScore W3199369541C111472728 @default.
- W3199369541 hasConceptScore W3199369541C112930515 @default.
- W3199369541 hasConceptScore W3199369541C127413603 @default.
- W3199369541 hasConceptScore W3199369541C138885662 @default.
- W3199369541 hasConceptScore W3199369541C144133560 @default.