Matches in SemOpenAlex for { <https://semopenalex.org/work/W4385571879> ?p ?o ?g. }
- W4385571879 abstract "Although pre-trained named entity recognition (NER) models are highly accurate on modern corpora, they underperform on historical texts due to differences in language and OCR errors. In this work, we develop a new NER corpus of 3.6M sentences from late medieval charters written mainly in Czech, Latin, and German. We show that we can start with a list of known historical figures and locations and an unannotated corpus of historical texts, and use information retrieval techniques to automatically bootstrap a NER-annotated corpus. Using our corpus, we train a NER model that achieves entity-level Precision of 72.81–93.98% with 58.14–81.77% Recall on a manually-annotated test dataset. Furthermore, we show that using a weighted loss function helps to combat class imbalance in token classification tasks. To make it easy for others to reproduce and build upon our work, we publicly release our corpus, models, and experimental code." @default.
- W4385571879 created "2023-08-05" @default.
- W4385571879 creator A5043467916 @default.
- W4385571879 creator A5054652735 @default.
- W4385571879 creator A5062524005 @default.
- W4385571879 creator A5088482978 @default.
- W4385571879 creator A5092596302 @default.
- W4385571879 date "2023-01-01" @default.
- W4385571879 modified "2023-10-10" @default.
- W4385571879 title "People and Places of Historical Europe: Bootstrapping Annotation Pipeline and a New Corpus of Named Entities in Late Medieval Texts" @default.
- W4385571879 doi "https://doi.org/10.18653/v1/2023.findings-acl.887" @default.
- W4385571879 hasPublicationYear "2023" @default.
- W4385571879 type Work @default.
- W4385571879 citedByCount "1" @default.
- W4385571879 crossrefType "proceedings-article" @default.
- W4385571879 hasAuthorship W4385571879A5043467916 @default.
- W4385571879 hasAuthorship W4385571879A5054652735 @default.
- W4385571879 hasAuthorship W4385571879A5062524005 @default.
- W4385571879 hasAuthorship W4385571879A5088482978 @default.
- W4385571879 hasAuthorship W4385571879A5092596302 @default.
- W4385571879 hasBestOaLocation W43855718791 @default.
- W4385571879 hasConcept C106159729 @default.
- W4385571879 hasConcept C138885662 @default.
- W4385571879 hasConcept C154775046 @default.
- W4385571879 hasConcept C154945302 @default.
- W4385571879 hasConcept C162324750 @default.
- W4385571879 hasConcept C187736073 @default.
- W4385571879 hasConcept C199360897 @default.
- W4385571879 hasConcept C204321447 @default.
- W4385571879 hasConcept C207609745 @default.
- W4385571879 hasConcept C23123220 @default.
- W4385571879 hasConcept C2474386 @default.
- W4385571879 hasConcept C2776321320 @default.
- W4385571879 hasConcept C2777842544 @default.
- W4385571879 hasConcept C2779135771 @default.
- W4385571879 hasConcept C2780451532 @default.
- W4385571879 hasConcept C38652104 @default.
- W4385571879 hasConcept C41008148 @default.
- W4385571879 hasConcept C41895202 @default.
- W4385571879 hasConcept C43521106 @default.
- W4385571879 hasConcept C48145219 @default.
- W4385571879 hasConcept C81669768 @default.
- W4385571879 hasConceptScore W4385571879C106159729 @default.
- W4385571879 hasConceptScore W4385571879C138885662 @default.
- W4385571879 hasConceptScore W4385571879C154775046 @default.
- W4385571879 hasConceptScore W4385571879C154945302 @default.
- W4385571879 hasConceptScore W4385571879C162324750 @default.
- W4385571879 hasConceptScore W4385571879C187736073 @default.
- W4385571879 hasConceptScore W4385571879C199360897 @default.
- W4385571879 hasConceptScore W4385571879C204321447 @default.
- W4385571879 hasConceptScore W4385571879C207609745 @default.
- W4385571879 hasConceptScore W4385571879C23123220 @default.
- W4385571879 hasConceptScore W4385571879C2474386 @default.
- W4385571879 hasConceptScore W4385571879C2776321320 @default.
- W4385571879 hasConceptScore W4385571879C2777842544 @default.
- W4385571879 hasConceptScore W4385571879C2779135771 @default.
- W4385571879 hasConceptScore W4385571879C2780451532 @default.
- W4385571879 hasConceptScore W4385571879C38652104 @default.
- W4385571879 hasConceptScore W4385571879C41008148 @default.
- W4385571879 hasConceptScore W4385571879C41895202 @default.
- W4385571879 hasConceptScore W4385571879C43521106 @default.
- W4385571879 hasConceptScore W4385571879C48145219 @default.
- W4385571879 hasConceptScore W4385571879C81669768 @default.
- W4385571879 hasLocation W43855718791 @default.
- W4385571879 hasLocation W43855718792 @default.
- W4385571879 hasOpenAccess W4385571879 @default.
- W4385571879 hasPrimaryLocation W43855718791 @default.
- W4385571879 hasRelatedWork W1534274833 @default.
- W4385571879 hasRelatedWork W156620619 @default.
- W4385571879 hasRelatedWork W1598221548 @default.
- W4385571879 hasRelatedWork W1963695443 @default.
- W4385571879 hasRelatedWork W2081850291 @default.
- W4385571879 hasRelatedWork W2616249226 @default.
- W4385571879 hasRelatedWork W2914363205 @default.
- W4385571879 hasRelatedWork W2947569483 @default.
- W4385571879 hasRelatedWork W3031263788 @default.
- W4385571879 hasRelatedWork W3117246195 @default.
- W4385571879 isParatext "false" @default.
- W4385571879 isRetracted "false" @default.
- W4385571879 workType "article" @default.