Matches in SemOpenAlex for { <https://semopenalex.org/work/W4366201187> ?p ?o ?g. }
- W4366201187 endingPage "45205" @default.
- W4366201187 startingPage "45194" @default.
- W4366201187 abstract "Within the Natural Language Processing (NLP) framework, Named Entity Recognition (NER) is regarded as the basis for extracting key information to understand texts in any language. As Bangla is a highly inflectional, morphologically rich, and resource-scarce language, building a balanced NER corpus with large and diverse entities is a demanding task. However, previously developed Bangla NER systems are limited to recognizing only three familiar entities: person, location, and organization. To address this significant limitation, we introduce a novel Bangla NER dataset, B-NER, which was created using 22,144 manually annotated Bangla sentences collected from Bangla newspapers and Bangla Wikipedia. This dataset includes a total of 9,895 unique words, which were manually categorized into eight different entity types: person, organization, event, artifact, time indicator, natural phenomenon, geopolitical entity, and geographical location. Inter-annotator agreement experiments were conducted to validate the quality of annotations performed by three annotators, resulting in a Kappa score of 0.82. In this paper, we provide an outline of the annotation guideline illustrated with examples, discuss the B-NER dataset properties, and present benchmark evaluations of the dataset. To establish that B-NER is more comprehensive and balanced in comparison to other publicly accessible datasets, we conducted cross-dataset modeling and validation, i.e., we trained an NER model on one dataset and tested it on another, and found that the model trained on B-NER performed best in that setting. Furthermore, we performed exhaustive benchmark evaluations based on Bidirectional LSTM with fastText embeddings and sentence transformer models. Among these models, fine-tuned IndicBERT achieved noticeable results with a Macro-F1 of 86%. This dataset and baseline results will be publicly available under a CC-BY 4.0 license in the CoNLL-2002 format to facilitate further research on Bangla NER." @default.
- W4366201187 created "2023-04-19" @default.
- W4366201187 creator A5022813420 @default.
- W4366201187 creator A5023922327 @default.
- W4366201187 creator A5028396045 @default.
- W4366201187 creator A5038033301 @default.
- W4366201187 creator A5062822385 @default.
- W4366201187 creator A5079101835 @default.
- W4366201187 date "2023-01-01" @default.
- W4366201187 modified "2023-10-06" @default.
- W4366201187 title "B-NER: A Novel Bangla Named Entity Recognition Dataset With Largest Entities and Its Baseline Evaluation" @default.
- W4366201187 cites W1587977534 @default.
- W4366201187 cites W1623072288 @default.
- W4366201187 cites W1988995507 @default.
- W4366201187 cites W2010949961 @default.
- W4366201187 cites W2020278455 @default.
- W4366201187 cites W2064675550 @default.
- W4366201187 cites W2067326963 @default.
- W4366201187 cites W2068882115 @default.
- W4366201187 cites W2097978833 @default.
- W4366201187 cites W2101070712 @default.
- W4366201187 cites W2123556395 @default.
- W4366201187 cites W2124592697 @default.
- W4366201187 cites W2135843243 @default.
- W4366201187 cites W2138780451 @default.
- W4366201187 cites W2143933463 @default.
- W4366201187 cites W2148540243 @default.
- W4366201187 cites W2164840730 @default.
- W4366201187 cites W2169423212 @default.
- W4366201187 cites W2340954483 @default.
- W4366201187 cites W2625800120 @default.
- W4366201187 cites W2786223050 @default.
- W4366201187 cites W2886146035 @default.
- W4366201187 cites W2913498662 @default.
- W4366201187 cites W2921362618 @default.
- W4366201187 cites W2955175590 @default.
- W4366201187 cites W2962902328 @default.
- W4366201187 cites W2963625095 @default.
- W4366201187 cites W2966394569 @default.
- W4366201187 cites W2970509139 @default.
- W4366201187 cites W3013836620 @default.
- W4366201187 cites W3022556802 @default.
- W4366201187 cites W3035625205 @default.
- W4366201187 cites W3102295886 @default.
- W4366201187 cites W4290714600 @default.
- W4366201187 doi "https://doi.org/10.1109/access.2023.3267746" @default.
- W4366201187 hasPublicationYear "2023" @default.
- W4366201187 type Work @default.
- W4366201187 citedByCount "1" @default.
- W4366201187 countsByYear W43662011872023 @default.
- W4366201187 crossrefType "journal-article" @default.
- W4366201187 hasAuthorship W4366201187A5022813420 @default.
- W4366201187 hasAuthorship W4366201187A5023922327 @default.
- W4366201187 hasAuthorship W4366201187A5028396045 @default.
- W4366201187 hasAuthorship W4366201187A5038033301 @default.
- W4366201187 hasAuthorship W4366201187A5062822385 @default.
- W4366201187 hasAuthorship W4366201187A5079101835 @default.
- W4366201187 hasBestOaLocation W43662011871 @default.
- W4366201187 hasConcept C111368507 @default.
- W4366201187 hasConcept C12725497 @default.
- W4366201187 hasConcept C127313418 @default.
- W4366201187 hasConcept C13280743 @default.
- W4366201187 hasConcept C148524875 @default.
- W4366201187 hasConcept C154945302 @default.
- W4366201187 hasConcept C162324750 @default.
- W4366201187 hasConcept C185798385 @default.
- W4366201187 hasConcept C187736073 @default.
- W4366201187 hasConcept C19235068 @default.
- W4366201187 hasConcept C204321447 @default.
- W4366201187 hasConcept C205649164 @default.
- W4366201187 hasConcept C23123220 @default.
- W4366201187 hasConcept C2777530160 @default.
- W4366201187 hasConcept C2779135771 @default.
- W4366201187 hasConcept C2780451532 @default.
- W4366201187 hasConcept C41008148 @default.
- W4366201187 hasConcept C4554734 @default.
- W4366201187 hasConcept C96711827 @default.
- W4366201187 hasConceptScore W4366201187C111368507 @default.
- W4366201187 hasConceptScore W4366201187C12725497 @default.
- W4366201187 hasConceptScore W4366201187C127313418 @default.
- W4366201187 hasConceptScore W4366201187C13280743 @default.
- W4366201187 hasConceptScore W4366201187C148524875 @default.
- W4366201187 hasConceptScore W4366201187C154945302 @default.
- W4366201187 hasConceptScore W4366201187C162324750 @default.
- W4366201187 hasConceptScore W4366201187C185798385 @default.
- W4366201187 hasConceptScore W4366201187C187736073 @default.
- W4366201187 hasConceptScore W4366201187C19235068 @default.
- W4366201187 hasConceptScore W4366201187C204321447 @default.
- W4366201187 hasConceptScore W4366201187C205649164 @default.
- W4366201187 hasConceptScore W4366201187C23123220 @default.
- W4366201187 hasConceptScore W4366201187C2777530160 @default.
- W4366201187 hasConceptScore W4366201187C2779135771 @default.
- W4366201187 hasConceptScore W4366201187C2780451532 @default.
- W4366201187 hasConceptScore W4366201187C41008148 @default.
- W4366201187 hasConceptScore W4366201187C4554734 @default.
- W4366201187 hasConceptScore W4366201187C96711827 @default.
- W4366201187 hasLocation W43662011871 @default.
- W4366201187 hasOpenAccess W4366201187 @default.