Matches in SemOpenAlex for { <https://semopenalex.org/work/W4283461411> ?p ?o ?g. }
Showing items 1 to 74 of 74 with 100 items per page.
- W4283461411 endingPage "108416" @default.
- W4283461411 startingPage "108416" @default.
- W4283461411 abstract "Toxic language in social media is a newly emerging virtual disorder of human society. Detecting toxic language is an NLP task that requires a dataset of utterances [1]. For the Bangla language, very few datasets have been developed on toxicity or similar concepts [2]. This dataset was developed using user-generated content from Facebook and is intended to cover the demographic and thematic distribution of Bangla toxic language generated on the web. In total, 2207590 comments were collected and annotated, and about 1959 unique bigrams were extracted from them as utterances, which serve as the base entries of a toxic language dataset. The core derivatives of the dataset are bigram-based wordlists, which are annotated inductively and divided into 8 thematic classes that give some idea of the toxicity variations found in the Bengali community. These thematic classes predominantly cover political hate speech [3] and misogynistic bullying. The thematic labels can serve as class labels in text classification through machine learning. In addition to the thematic classification labels, the dataset includes additional features such as approximate meanings in English, IPA transliteration, real occurrences in the source pages, spelling standards, and degree of toxicity. As this is a dataset of utterances, its entries are de-identified and anonymous, and it poses no difficulties for public disclosure. We therefore present this dataset as a toxic lexicon (Toxlex): an exhaustive wordlist that is a curated, value-added, and analyzed dataset which can be used as classifier material to detect toxicity in social media." @default.
- W4283461411 created "2022-06-26" @default.
- W4283461411 creator A5047231876 @default.
- W4283461411 date "2022-08-01" @default.
- W4283461411 modified "2023-10-14" @default.
- W4283461411 title "ToxLex_bn: A curated dataset of bangla toxic language derived from Facebook comment" @default.
- W4283461411 cites W3096800297 @default.
- W4283461411 doi "https://doi.org/10.1016/j.dib.2022.108416" @default.
- W4283461411 hasPubMedId "https://pubmed.ncbi.nlm.nih.gov/35811647" @default.
- W4283461411 hasPublicationYear "2022" @default.
- W4283461411 type Work @default.
- W4283461411 citedByCount "1" @default.
- W4283461411 countsByYear W42834614112023 @default.
- W4283461411 crossrefType "journal-article" @default.
- W4283461411 hasAuthorship W4283461411A5047231876 @default.
- W4283461411 hasBestOaLocation W42834614111 @default.
- W4283461411 hasConcept C108757681 @default.
- W4283461411 hasConcept C136764020 @default.
- W4283461411 hasConcept C137546455 @default.
- W4283461411 hasConcept C138885662 @default.
- W4283461411 hasConcept C154945302 @default.
- W4283461411 hasConcept C19235068 @default.
- W4283461411 hasConcept C204321447 @default.
- W4283461411 hasConcept C205649164 @default.
- W4283461411 hasConcept C2777801307 @default.
- W4283461411 hasConcept C2778121359 @default.
- W4283461411 hasConcept C41008148 @default.
- W4283461411 hasConcept C41895202 @default.
- W4283461411 hasConcept C518677369 @default.
- W4283461411 hasConcept C520968082 @default.
- W4283461411 hasConcept C58640448 @default.
- W4283461411 hasConcept C66402592 @default.
- W4283461411 hasConcept C93692415 @default.
- W4283461411 hasConcept C95623464 @default.
- W4283461411 hasConceptScore W4283461411C108757681 @default.
- W4283461411 hasConceptScore W4283461411C136764020 @default.
- W4283461411 hasConceptScore W4283461411C137546455 @default.
- W4283461411 hasConceptScore W4283461411C138885662 @default.
- W4283461411 hasConceptScore W4283461411C154945302 @default.
- W4283461411 hasConceptScore W4283461411C19235068 @default.
- W4283461411 hasConceptScore W4283461411C204321447 @default.
- W4283461411 hasConceptScore W4283461411C205649164 @default.
- W4283461411 hasConceptScore W4283461411C2777801307 @default.
- W4283461411 hasConceptScore W4283461411C2778121359 @default.
- W4283461411 hasConceptScore W4283461411C41008148 @default.
- W4283461411 hasConceptScore W4283461411C41895202 @default.
- W4283461411 hasConceptScore W4283461411C518677369 @default.
- W4283461411 hasConceptScore W4283461411C520968082 @default.
- W4283461411 hasConceptScore W4283461411C58640448 @default.
- W4283461411 hasConceptScore W4283461411C66402592 @default.
- W4283461411 hasConceptScore W4283461411C93692415 @default.
- W4283461411 hasConceptScore W4283461411C95623464 @default.
- W4283461411 hasLocation W42834614111 @default.
- W4283461411 hasLocation W42834614112 @default.
- W4283461411 hasLocation W42834614113 @default.
- W4283461411 hasLocation W42834614114 @default.
- W4283461411 hasOpenAccess W4283461411 @default.
- W4283461411 hasPrimaryLocation W42834614111 @default.
- W4283461411 hasRelatedWork W180823474 @default.
- W4283461411 hasRelatedWork W2095452022 @default.
- W4283461411 hasRelatedWork W2346975490 @default.
- W4283461411 hasRelatedWork W2903456573 @default.
- W4283461411 hasRelatedWork W3011677438 @default.
- W4283461411 hasRelatedWork W3011860286 @default.
- W4283461411 hasRelatedWork W3015200540 @default.
- W4283461411 hasRelatedWork W3192589309 @default.
- W4283461411 hasRelatedWork W4226097025 @default.
- W4283461411 hasRelatedWork W825857799 @default.
- W4283461411 hasVolume "43" @default.
- W4283461411 isParatext "false" @default.
- W4283461411 isRetracted "false" @default.
- W4283461411 workType "article" @default.
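
The listing above follows a regular "subject predicate object @graph." shape, so it can be read into tuples and grouped by predicate. The following is a minimal sketch under that assumption: the line grammar is inferred from this page only, not taken from any SemOpenAlex specification, and the names `LINE_RE`, `parse_listing`, and `group_by_predicate` are hypothetical helpers introduced for illustration.

```python
import re
from collections import defaultdict

# Assumed shape of one item line: "- SUBJECT PREDICATE OBJECT @GRAPH."
# The object is either a double-quoted literal or a bare identifier.
LINE_RE = re.compile(
    r'^-\s+(?P<s>\S+)\s+(?P<p>\S+)\s+(?P<o>".*"|\S+)\s+@(?P<g>\S+)\.$'
)


def parse_listing(lines):
    """Return a list of (subject, predicate, object, graph) tuples."""
    quads = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # skip header lines and anything not shaped like an item
        obj = match.group("o")
        if obj.startswith('"'):
            obj = obj[1:-1]  # drop the quotes around a literal value
        quads.append((match.group("s"), match.group("p"), obj, match.group("g")))
    return quads


def group_by_predicate(quads):
    """Map each predicate to the list of its objects, in listing order."""
    grouped = defaultdict(list)
    for _subject, predicate, obj, _graph in quads:
        grouped[predicate].append(obj)
    return dict(grouped)


# A few lines copied from the listing, plus one header line that is skipped.
sample = [
    "Showing items 1 to 74 of",
    '- W4283461411 hasVolume "43" @default.',
    "- W4283461411 hasConcept C108757681 @default.",
    "- W4283461411 hasConcept C136764020 @default.",
    '- W4283461411 crossrefType "journal-article" @default.',
]

quads = parse_listing(sample)
by_predicate = group_by_predicate(quads)
print(by_predicate["hasConcept"])
```

Grouping by predicate makes the repeated properties (`hasConcept`, `hasConceptScore`, `hasRelatedWork`, `hasLocation`) easy to count or iterate over, while single-valued properties such as `hasVolume` end up as one-element lists.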