Matches in SemOpenAlex for { <https://semopenalex.org/work/W3168603268> ?p ?o ?g. }
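The header above shows the quad pattern behind this listing: the work IRI is fixed while the predicate (?p), object (?o) and named graph (?g) are left open. A minimal SPARQL sketch of the same lookup, assuming the public SemOpenAlex endpoint at https://semopenalex.org/sparql, could look like this:

    SELECT ?p ?o ?g
    WHERE {
      # bind ?p/?o to every property-value pair of the work,
      # and ?g to the graph that asserts it
      GRAPH ?g {
        <https://semopenalex.org/work/W3168603268> ?p ?o .
      }
    }

Each row below corresponds to one ?p/?o binding; the trailing "@default" appears to name the graph the triple comes from. Depending on how the store exposes its default graph, the GRAPH wrapper may need to be dropped or the graph IRI given explicitly.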
- W3168603268 abstract "Given the biodiversity crisis, we more than ever need to access information on multiple taxa (e.g. distribution, traits, diet) in the scientific literature to understand, map and predict all-inclusive biodiversity. Tools are needed to automatically extract useful information from the ever-growing corpus of ecological texts and feed this information to open data repositories. A prerequisite is the ability to recognise mentions of taxa in text, a special case of named entity recognition (NER). In recent years, deep learning-based NER systems have become ubiquitous, yielding state-of-the-art results in the general and biomedical domains. However, no such tool is available to ecologists wishing to extract information from the biodiversity literature. We propose a new tool called TaxoNERD that provides two deep neural network (DNN) models to recognise taxon mentions in ecological documents. To achieve high performance, DNN-based NER models usually need to be trained on a large corpus of manually annotated text. Creating such a gold standard corpus (GSC) is a laborious and costly process, with the result that GSCs in the ecological domain tend to be too small to learn an accurate DNN model from scratch. To address this issue, we leverage existing DNN models pretrained on large biomedical corpora using transfer learning. The performance of our models is evaluated on four GSCs and compared to the most popular taxonomic NER tools. Our experiments suggest that existing taxonomic NER tools are not suited to the extraction of ecological information from text as they performed poorly on ecologically oriented corpora, either because they do not take account of the variability of taxon naming practices, or because they do not generalise well to the ecological domain. Conversely, a domain-specific DNN-based tool like TaxoNERD outperformed the other approaches on an ecological information extraction task. Efforts are needed in order to raise ecological information extraction to the same level of performance as its biomedical counterpart. One promising direction is to leverage the huge corpus of unlabelled ecological texts to learn a language representation model that could benefit downstream tasks. These efforts could be highly beneficial to ecologists in the long term." @default.
- W3168603268 created "2021-06-22" @default.
- W3168603268 creator A5055750415 @default.
- W3168603268 creator A5072813166 @default.
- W3168603268 date "2021-06-09" @default.
- W3168603268 modified "2023-09-24" @default.
- W3168603268 title "TaxoNERD: deep neural models for the recognition of taxonomic entities in the ecological and evolutionary literature" @default.
- W3168603268 cites W1532940542 @default.
- W3168603268 cites W1747861911 @default.
- W3168603268 cites W1987248861 @default.
- W3168603268 cites W1991154713 @default.
- W3168603268 cites W2009264880 @default.
- W3168603268 cites W2022484638 @default.
- W3168603268 cites W2029854189 @default.
- W3168603268 cites W2041241689 @default.
- W3168603268 cites W2066527056 @default.
- W3168603268 cites W2071879021 @default.
- W3168603268 cites W2094591616 @default.
- W3168603268 cites W2100627415 @default.
- W3168603268 cites W2124714582 @default.
- W3168603268 cites W2164306581 @default.
- W3168603268 cites W2200557314 @default.
- W3168603268 cites W2250539671 @default.
- W3168603268 cites W2296283641 @default.
- W3168603268 cites W2396881363 @default.
- W3168603268 cites W2493916176 @default.
- W3168603268 cites W2604710187 @default.
- W3168603268 cites W2802711901 @default.
- W3168603268 cites W2883471235 @default.
- W3168603268 cites W2909683104 @default.
- W3168603268 cites W2911489562 @default.
- W3168603268 cites W2914274808 @default.
- W3168603268 cites W2944277284 @default.
- W3168603268 cites W2949176808 @default.
- W3168603268 cites W2950021574 @default.
- W3168603268 cites W2963738950 @default.
- W3168603268 cites W2983315135 @default.
- W3168603268 cites W3011594683 @default.
- W3168603268 cites W3082330004 @default.
- W3168603268 cites W3100452049 @default.
- W3168603268 cites W3105741911 @default.
- W3168603268 cites W3106224367 @default.
- W3168603268 doi "https://doi.org/10.1101/2021.06.08.444426" @default.
- W3168603268 hasPublicationYear "2021" @default.
- W3168603268 type Work @default.
- W3168603268 sameAs 3168603268 @default.
- W3168603268 citedByCount "2" @default.
- W3168603268 countsByYear W31686032682022 @default.
- W3168603268 crossrefType "posted-content" @default.
- W3168603268 hasAuthorship W3168603268A5055750415 @default.
- W3168603268 hasAuthorship W3168603268A5072813166 @default.
- W3168603268 hasBestOaLocation W31686032681 @default.
- W3168603268 hasConcept C108583219 @default.
- W3168603268 hasConcept C119857082 @default.
- W3168603268 hasConcept C134306372 @default.
- W3168603268 hasConcept C153083717 @default.
- W3168603268 hasConcept C154945302 @default.
- W3168603268 hasConcept C162324750 @default.
- W3168603268 hasConcept C187736073 @default.
- W3168603268 hasConcept C18903297 @default.
- W3168603268 hasConcept C204321447 @default.
- W3168603268 hasConcept C23123220 @default.
- W3168603268 hasConcept C2522767166 @default.
- W3168603268 hasConcept C2779135771 @default.
- W3168603268 hasConcept C2780451532 @default.
- W3168603268 hasConcept C33923547 @default.
- W3168603268 hasConcept C36503486 @default.
- W3168603268 hasConcept C41008148 @default.
- W3168603268 hasConcept C50644808 @default.
- W3168603268 hasConcept C71640776 @default.
- W3168603268 hasConcept C86803240 @default.
- W3168603268 hasConceptScore W3168603268C108583219 @default.
- W3168603268 hasConceptScore W3168603268C119857082 @default.
- W3168603268 hasConceptScore W3168603268C134306372 @default.
- W3168603268 hasConceptScore W3168603268C153083717 @default.
- W3168603268 hasConceptScore W3168603268C154945302 @default.
- W3168603268 hasConceptScore W3168603268C162324750 @default.
- W3168603268 hasConceptScore W3168603268C187736073 @default.
- W3168603268 hasConceptScore W3168603268C18903297 @default.
- W3168603268 hasConceptScore W3168603268C204321447 @default.
- W3168603268 hasConceptScore W3168603268C23123220 @default.
- W3168603268 hasConceptScore W3168603268C2522767166 @default.
- W3168603268 hasConceptScore W3168603268C2779135771 @default.
- W3168603268 hasConceptScore W3168603268C2780451532 @default.
- W3168603268 hasConceptScore W3168603268C33923547 @default.
- W3168603268 hasConceptScore W3168603268C36503486 @default.
- W3168603268 hasConceptScore W3168603268C41008148 @default.
- W3168603268 hasConceptScore W3168603268C50644808 @default.
- W3168603268 hasConceptScore W3168603268C71640776 @default.
- W3168603268 hasConceptScore W3168603268C86803240 @default.
- W3168603268 hasLocation W31686032681 @default.
- W3168603268 hasOpenAccess W3168603268 @default.
- W3168603268 hasPrimaryLocation W31686032681 @default.
- W3168603268 hasRelatedWork W2787045460 @default.
- W3168603268 hasRelatedWork W2993873509 @default.
- W3168603268 hasRelatedWork W3014300295 @default.
- W3168603268 hasRelatedWork W3128216712 @default.
- W3168603268 hasRelatedWork W4223943233 @default.
- W3168603268 hasRelatedWork W4225161397 @default.
- W3168603268 hasRelatedWork W4299487748 @default.
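The short property names in the listing (cites, hasConcept, title, ...) stand for IRIs in the SemOpenAlex/OpenAlex vocabularies. As a hedged follow-up sketch, the query below dereferences the cited works and the assigned concepts to human-readable labels; the prefixes (cito:, dcterms:, skos:, and a SemOpenAlex ontology namespace) are assumptions, not taken from the listing, and should be checked against the endpoint's actual schema:

    PREFIX cito:    <http://purl.org/spar/cito/>
    PREFIX dcterms: <http://purl.org/dc/terms/>
    PREFIX skos:    <http://www.w3.org/2004/02/skos/core#>
    PREFIX soa:     <https://semopenalex.org/ontology/>

    SELECT ?cited ?citedTitle ?concept ?conceptLabel
    WHERE {
      {
        # titles of the works cited by W3168603268
        <https://semopenalex.org/work/W3168603268> cito:cites ?cited .
        ?cited dcterms:title ?citedTitle .
      }
      UNION
      {
        # labels of the concepts assigned to W3168603268
        <https://semopenalex.org/work/W3168603268> soa:hasConcept ?concept .
        ?concept skos:prefLabel ?conceptLabel .
      }
    }

An analogous query for the two works recorded as citing W3168603268 (citedByCount "2") would invert the citation pattern, i.e. ?citing cito:cites <https://semopenalex.org/work/W3168603268>.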