Matches in SemOpenAlex for { <https://semopenalex.org/work/W2009264880> ?p ?o ?g. }
- W2009264880 abstract "A scientific name for an organism can be associated with almost all biological data. Name identification is an important step in many text mining tasks aiming to extract useful information from biological, biomedical and biodiversity text sources. A scientific name acts as an important metadata element to link biological information. We present NetiNeti (Name Extraction from Textual Information-Name Extraction for Taxonomic Indexing), a machine learning based approach for recognition of scientific names including the discovery of new species names from text that will also handle misspellings, OCR errors and other variations in names. The system generates candidate names using rules for scientific names and applies probabilistic machine learning methods to classify names based on structural features of candidate names and features derived from their contexts. NetiNeti can also disambiguate scientific names from other names using the contextual information. We evaluated NetiNeti on legacy biodiversity texts and biomedical literature (MEDLINE). NetiNeti performs better (precision = 98.9% and recall = 70.5%) compared to a popular dictionary based approach (precision = 97.5% and recall = 54.3%) on a 600-page biodiversity book that was manually marked by an annotator. On a small set of PubMed Central's full text articles annotated with scientific names, the precision and recall values are 98.5% and 96.2% respectively. NetiNeti found more than 190,000 unique binomial and trinomial names in more than 1,880,000 PubMed records when used on the full MEDLINE database. NetiNeti also successfully identifies almost all of the new species names mentioned within web pages. We present NetiNeti, a machine learning based approach for identification and discovery of scientific names. The system implementing the approach can be accessed at http://namefinding.ubio.org." @default.
- W2009264880 created "2016-06-24" @default.
- W2009264880 creator A5033709727 @default.
- W2009264880 creator A5037804623 @default.
- W2009264880 creator A5076179481 @default.
- W2009264880 date "2012-08-22" @default.
- W2009264880 modified "2023-10-16" @default.
- W2009264880 title "NetiNeti: discovery of scientific names from text using machine learning methods" @default.
- W2009264880 cites W1557074680 @default.
- W2009264880 cites W1833977909 @default.
- W2009264880 cites W1967383767 @default.
- W2009264880 cites W2001792610 @default.
- W2009264880 cites W2023839984 @default.
- W2009264880 cites W2045993505 @default.
- W2009264880 cites W2097678794 @default.
- W2009264880 cites W2100627415 @default.
- W2009264880 cites W2101265630 @default.
- W2009264880 cites W2117077912 @default.
- W2009264880 cites W2120609672 @default.
- W2009264880 cites W2120736575 @default.
- W2009264880 cites W2139934466 @default.
- W2009264880 cites W2140785063 @default.
- W2009264880 cites W2141869602 @default.
- W2009264880 cites W2142741334 @default.
- W2009264880 cites W2160842254 @default.
- W2009264880 cites W2161661053 @default.
- W2009264880 cites W2165488387 @default.
- W2009264880 cites W2477367047 @default.
- W2009264880 cites W3122465213 @default.
- W2009264880 cites W4293775970 @default.
- W2009264880 doi "https://doi.org/10.1186/1471-2105-13-211" @default.
- W2009264880 hasPubMedCentralId "https://www.ncbi.nlm.nih.gov/pmc/articles/3542245" @default.
- W2009264880 hasPubMedId "https://pubmed.ncbi.nlm.nih.gov/22913485" @default.
- W2009264880 hasPublicationYear "2012" @default.
- W2009264880 type Work @default.
- W2009264880 sameAs 2009264880 @default.
- W2009264880 citedByCount "34" @default.
- W2009264880 countsByYear W20092648802012 @default.
- W2009264880 countsByYear W20092648802013 @default.
- W2009264880 countsByYear W20092648802014 @default.
- W2009264880 countsByYear W20092648802015 @default.
- W2009264880 countsByYear W20092648802016 @default.
- W2009264880 countsByYear W20092648802017 @default.
- W2009264880 countsByYear W20092648802018 @default.
- W2009264880 countsByYear W20092648802019 @default.
- W2009264880 countsByYear W20092648802020 @default.
- W2009264880 countsByYear W20092648802021 @default.
- W2009264880 countsByYear W20092648802022 @default.
- W2009264880 countsByYear W20092648802023 @default.
- W2009264880 crossrefType "journal-article" @default.
- W2009264880 hasAuthorship W2009264880A5033709727 @default.
- W2009264880 hasAuthorship W2009264880A5037804623 @default.
- W2009264880 hasAuthorship W2009264880A5076179481 @default.
- W2009264880 hasBestOaLocation W20092648801 @default.
- W2009264880 hasConcept C100660578 @default.
- W2009264880 hasConcept C116834253 @default.
- W2009264880 hasConcept C136764020 @default.
- W2009264880 hasConcept C138885662 @default.
- W2009264880 hasConcept C151730666 @default.
- W2009264880 hasConcept C154945302 @default.
- W2009264880 hasConcept C162324750 @default.
- W2009264880 hasConcept C187736073 @default.
- W2009264880 hasConcept C195807954 @default.
- W2009264880 hasConcept C204321447 @default.
- W2009264880 hasConcept C23123220 @default.
- W2009264880 hasConcept C2779135771 @default.
- W2009264880 hasConcept C2780451532 @default.
- W2009264880 hasConcept C2781083858 @default.
- W2009264880 hasConcept C41008148 @default.
- W2009264880 hasConcept C41895202 @default.
- W2009264880 hasConcept C59822182 @default.
- W2009264880 hasConcept C75165309 @default.
- W2009264880 hasConcept C81669768 @default.
- W2009264880 hasConcept C86803240 @default.
- W2009264880 hasConcept C93518851 @default.
- W2009264880 hasConceptScore W2009264880C100660578 @default.
- W2009264880 hasConceptScore W2009264880C116834253 @default.
- W2009264880 hasConceptScore W2009264880C136764020 @default.
- W2009264880 hasConceptScore W2009264880C138885662 @default.
- W2009264880 hasConceptScore W2009264880C151730666 @default.
- W2009264880 hasConceptScore W2009264880C154945302 @default.
- W2009264880 hasConceptScore W2009264880C162324750 @default.
- W2009264880 hasConceptScore W2009264880C187736073 @default.
- W2009264880 hasConceptScore W2009264880C195807954 @default.
- W2009264880 hasConceptScore W2009264880C204321447 @default.
- W2009264880 hasConceptScore W2009264880C23123220 @default.
- W2009264880 hasConceptScore W2009264880C2779135771 @default.
- W2009264880 hasConceptScore W2009264880C2780451532 @default.
- W2009264880 hasConceptScore W2009264880C2781083858 @default.
- W2009264880 hasConceptScore W2009264880C41008148 @default.
- W2009264880 hasConceptScore W2009264880C41895202 @default.
- W2009264880 hasConceptScore W2009264880C59822182 @default.
- W2009264880 hasConceptScore W2009264880C75165309 @default.
- W2009264880 hasConceptScore W2009264880C81669768 @default.
- W2009264880 hasConceptScore W2009264880C86803240 @default.
- W2009264880 hasConceptScore W2009264880C93518851 @default.
- W2009264880 hasIssue "1" @default.
- W2009264880 hasLocation W20092648801 @default.
- W2009264880 hasLocation W20092648802 @default.
- W2009264880 hasLocation W20092648803 @default.