Matches in SemOpenAlex for { <https://semopenalex.org/work/W4220851215> ?p ?o ?g. }
- W4220851215 endingPage "100205" @default.
- W4220851215 startingPage "100205" @default.
- W4220851215 abstract "Toxic comment classification models are often found biased towards identity terms, i.e., terms characterizing a specific group of people such as “Muslim” and “black”. Such bias is commonly reflected in false positive predictions, i.e., non-toxic comments with identity terms. In this work, we propose a novel approach to debias the model in toxic comment classification, leveraging the notion of subjectivity level of a comment and the presence of identity terms. We hypothesize that toxic comments containing identity terms are more likely to be expressions of subjective feelings or opinions. Therefore, the subjectivity level of a comment containing identity terms can be helpful for classifying toxic comments and mitigating the identity term bias. To implement this idea, we propose a model based on BERT and study two different methods of measuring the subjectivity level. The first method uses a lexicon-based tool. The second method is based on the idea of calculating the embedding similarity between a comment and a relevant Wikipedia text of the identity term in the comment. We thoroughly evaluate our method on an extensive collection of four datasets collected from different social media platforms. Our results show that: (1) our models that incorporate both features of subjectivity and identity terms consistently outperform strong SOTA baselines, with our best performing model achieving an improvement in F1 of 4.75% over a Twitter dataset; (2) our idea of measuring subjectivity based on the similarity to the relevant Wikipedia text is very effective on toxic comment classification as our model using this has achieved the best performance on 3 out of 4 datasets while obtaining comparable performance on the remaining dataset. We further test our method on RoBERTa to evaluate the generality of our method and the results show the biggest improvement in F1 of up to 1.29% (on a dataset from a white supremacist online forum)." @default.
- W4220851215 created "2022-04-03" @default.
- W4220851215 creator A5053944287 @default.
- W4220851215 creator A5063065717 @default.
- W4220851215 creator A5080524660 @default.
- W4220851215 date "2022-05-01" @default.
- W4220851215 modified "2023-10-16" @default.
- W4220851215 title "Utilizing subjectivity level to mitigate identity term bias in toxic comments classification" @default.
- W4220851215 cites W1986154307 @default.
- W4220851215 cites W1986614398 @default.
- W4220851215 cites W1994361353 @default.
- W4220851215 cites W2015717769 @default.
- W4220851215 cites W2065100127 @default.
- W4220851215 cites W2077513286 @default.
- W4220851215 cites W2101883461 @default.
- W4220851215 cites W2114524997 @default.
- W4220851215 cites W2152303055 @default.
- W4220851215 cites W2181854537 @default.
- W4220851215 cites W2311430799 @default.
- W4220851215 cites W2470673105 @default.
- W4220851215 cites W2473555522 @default.
- W4220851215 cites W2511234952 @default.
- W4220851215 cites W2607719644 @default.
- W4220851215 cites W2622814278 @default.
- W4220851215 cites W2740168486 @default.
- W4220851215 cites W2791170418 @default.
- W4220851215 cites W2911227954 @default.
- W4220851215 cites W2914097099 @default.
- W4220851215 cites W2920807444 @default.
- W4220851215 cites W2949678053 @default.
- W4220851215 cites W2953884906 @default.
- W4220851215 cites W2956090150 @default.
- W4220851215 cites W2962990575 @default.
- W4220851215 cites W2963116854 @default.
- W4220851215 cites W2964110616 @default.
- W4220851215 cites W2964235839 @default.
- W4220851215 cites W2972735048 @default.
- W4220851215 cites W2981977471 @default.
- W4220851215 cites W3034282334 @default.
- W4220851215 cites W3034723486 @default.
- W4220851215 cites W3035507081 @default.
- W4220851215 cites W3037546592 @default.
- W4220851215 cites W3037831233 @default.
- W4220851215 cites W3045555759 @default.
- W4220851215 cites W3049565363 @default.
- W4220851215 cites W3105928338 @default.
- W4220851215 cites W3129361063 @default.
- W4220851215 cites W3133702157 @default.
- W4220851215 cites W3134678353 @default.
- W4220851215 cites W3135514117 @default.
- W4220851215 cites W3135855479 @default.
- W4220851215 cites W3155742828 @default.
- W4220851215 cites W3166550823 @default.
- W4220851215 cites W3172314079 @default.
- W4220851215 cites W3176580738 @default.
- W4220851215 cites W3208372599 @default.
- W4220851215 cites W3212862986 @default.
- W4220851215 cites W80056832 @default.
- W4220851215 doi "https://doi.org/10.1016/j.osnem.2022.100205" @default.
- W4220851215 hasPublicationYear "2022" @default.
- W4220851215 type Work @default.
- W4220851215 citedByCount "2" @default.
- W4220851215 countsByYear W42208512152022 @default.
- W4220851215 countsByYear W42208512152023 @default.
- W4220851215 crossrefType "journal-article" @default.
- W4220851215 hasAuthorship W4220851215A5053944287 @default.
- W4220851215 hasAuthorship W4220851215A5063065717 @default.
- W4220851215 hasAuthorship W4220851215A5080524660 @default.
- W4220851215 hasConcept C103278499 @default.
- W4220851215 hasConcept C107038049 @default.
- W4220851215 hasConcept C111472728 @default.
- W4220851215 hasConcept C115961682 @default.
- W4220851215 hasConcept C119857082 @default.
- W4220851215 hasConcept C121332964 @default.
- W4220851215 hasConcept C122980154 @default.
- W4220851215 hasConcept C138885662 @default.
- W4220851215 hasConcept C154945302 @default.
- W4220851215 hasConcept C15744967 @default.
- W4220851215 hasConcept C202889954 @default.
- W4220851215 hasConcept C204321447 @default.
- W4220851215 hasConcept C23123220 @default.
- W4220851215 hasConcept C2522767166 @default.
- W4220851215 hasConcept C2778121359 @default.
- W4220851215 hasConcept C2778355321 @default.
- W4220851215 hasConcept C41008148 @default.
- W4220851215 hasConcept C61797465 @default.
- W4220851215 hasConcept C62520636 @default.
- W4220851215 hasConcept C77805123 @default.
- W4220851215 hasConceptScore W4220851215C103278499 @default.
- W4220851215 hasConceptScore W4220851215C107038049 @default.
- W4220851215 hasConceptScore W4220851215C111472728 @default.
- W4220851215 hasConceptScore W4220851215C115961682 @default.
- W4220851215 hasConceptScore W4220851215C119857082 @default.
- W4220851215 hasConceptScore W4220851215C121332964 @default.
- W4220851215 hasConceptScore W4220851215C122980154 @default.
- W4220851215 hasConceptScore W4220851215C138885662 @default.
- W4220851215 hasConceptScore W4220851215C154945302 @default.
- W4220851215 hasConceptScore W4220851215C15744967 @default.