Matches in SemOpenAlex for { <https://semopenalex.org/work/W2022377934> ?p ?o ?g. }
Showing items 1 to 98 of 98 with 100 items per page.
- W2022377934 endingPage "e58" @default.
- W2022377934 startingPage "e47" @default.
- W2022377934 abstract "This paper explores the benefits of using n-grams and semantic features for the classification of disease outbreak reports, in the context of the BioCaster disease outbreak report text mining system. A novel feature of this work is the use of a general purpose semantic tagger – the USAS tagger – to generate features. We outline the application context for this work (the BioCaster epidemiological text mining system), before going on to describe the experimental data used in our classification experiments (the 1000 document BioCaster corpus). Three broad groups of features are used in this work: Named Entity based features, n-gram features, and features derived from the USAS semantic tagger. Three standard machine learning algorithms – Naïve Bayes, the Support Vector Machine algorithm, and the C4.5 decision tree algorithm – were used for classifying experimental data (that is, the BioCaster corpus). Feature selection was performed using the χ² feature selection algorithm. Standard text classification performance metrics – Accuracy, Precision, Recall, Specificity and F-score – are reported. A feature representation composed of unigrams, bigrams, trigrams and features derived from a semantic tagger, in conjunction with the Naïve Bayes algorithm and feature selection yielded the highest classification accuracy (and F-score). This result was statistically significant compared to a baseline unigram representation and to previous work on the same task. However, it was feature selection rather than semantic tagging that contributed most to the improved performance. This study has shown that for the classification of disease outbreak reports, a combination of bag-of-words, n-grams and semantic features, in conjunction with feature selection, increases classification accuracy at a statistically significant level compared to previous work in this domain." @default.
- W2022377934 created "2016-06-24" @default.
- W2022377934 creator A5038899864 @default.
- W2022377934 creator A5073413742 @default.
- W2022377934 creator A5082051013 @default.
- W2022377934 creator A5090787812 @default.
- W2022377934 date "2009-12-01" @default.
- W2022377934 modified "2023-10-01" @default.
- W2022377934 title "Classifying disease outbreak reports using n-grams and semantic features" @default.
- W2022377934 cites W2006491909 @default.
- W2022377934 cites W2031253971 @default.
- W2022377934 cites W2083589838 @default.
- W2022377934 cites W2103235794 @default.
- W2022377934 cites W2126385932 @default.
- W2022377934 cites W4241931738 @default.
- W2022377934 doi "https://doi.org/10.1016/j.ijmedinf.2009.03.010" @default.
- W2022377934 hasPubMedId "https://pubmed.ncbi.nlm.nih.gov/19447070" @default.
- W2022377934 hasPublicationYear "2009" @default.
- W2022377934 type Work @default.
- W2022377934 sameAs 2022377934 @default.
- W2022377934 citedByCount "49" @default.
- W2022377934 countsByYear W20223779342012 @default.
- W2022377934 countsByYear W20223779342013 @default.
- W2022377934 countsByYear W20223779342014 @default.
- W2022377934 countsByYear W20223779342015 @default.
- W2022377934 countsByYear W20223779342016 @default.
- W2022377934 countsByYear W20223779342017 @default.
- W2022377934 countsByYear W20223779342018 @default.
- W2022377934 countsByYear W20223779342019 @default.
- W2022377934 countsByYear W20223779342020 @default.
- W2022377934 countsByYear W20223779342021 @default.
- W2022377934 countsByYear W20223779342022 @default.
- W2022377934 countsByYear W20223779342023 @default.
- W2022377934 crossrefType "journal-article" @default.
- W2022377934 hasAuthorship W2022377934A5038899864 @default.
- W2022377934 hasAuthorship W2022377934A5073413742 @default.
- W2022377934 hasAuthorship W2022377934A5082051013 @default.
- W2022377934 hasAuthorship W2022377934A5090787812 @default.
- W2022377934 hasConcept C108757681 @default.
- W2022377934 hasConcept C119857082 @default.
- W2022377934 hasConcept C12267149 @default.
- W2022377934 hasConcept C124101348 @default.
- W2022377934 hasConcept C137546455 @default.
- W2022377934 hasConcept C138885662 @default.
- W2022377934 hasConcept C148483581 @default.
- W2022377934 hasConcept C151730666 @default.
- W2022377934 hasConcept C153180895 @default.
- W2022377934 hasConcept C154945302 @default.
- W2022377934 hasConcept C204321447 @default.
- W2022377934 hasConcept C2776401178 @default.
- W2022377934 hasConcept C2779343474 @default.
- W2022377934 hasConcept C41008148 @default.
- W2022377934 hasConcept C41895202 @default.
- W2022377934 hasConcept C52001869 @default.
- W2022377934 hasConcept C81669768 @default.
- W2022377934 hasConcept C84525736 @default.
- W2022377934 hasConcept C86803240 @default.
- W2022377934 hasConceptScore W2022377934C108757681 @default.
- W2022377934 hasConceptScore W2022377934C119857082 @default.
- W2022377934 hasConceptScore W2022377934C12267149 @default.
- W2022377934 hasConceptScore W2022377934C124101348 @default.
- W2022377934 hasConceptScore W2022377934C137546455 @default.
- W2022377934 hasConceptScore W2022377934C138885662 @default.
- W2022377934 hasConceptScore W2022377934C148483581 @default.
- W2022377934 hasConceptScore W2022377934C151730666 @default.
- W2022377934 hasConceptScore W2022377934C153180895 @default.
- W2022377934 hasConceptScore W2022377934C154945302 @default.
- W2022377934 hasConceptScore W2022377934C204321447 @default.
- W2022377934 hasConceptScore W2022377934C2776401178 @default.
- W2022377934 hasConceptScore W2022377934C2779343474 @default.
- W2022377934 hasConceptScore W2022377934C41008148 @default.
- W2022377934 hasConceptScore W2022377934C41895202 @default.
- W2022377934 hasConceptScore W2022377934C52001869 @default.
- W2022377934 hasConceptScore W2022377934C81669768 @default.
- W2022377934 hasConceptScore W2022377934C84525736 @default.
- W2022377934 hasConceptScore W2022377934C86803240 @default.
- W2022377934 hasIssue "12" @default.
- W2022377934 hasLocation W20223779341 @default.
- W2022377934 hasLocation W20223779342 @default.
- W2022377934 hasOpenAccess W2022377934 @default.
- W2022377934 hasPrimaryLocation W20223779341 @default.
- W2022377934 hasRelatedWork W1470425429 @default.
- W2022377934 hasRelatedWork W2183598011 @default.
- W2022377934 hasRelatedWork W2971898452 @default.
- W2022377934 hasRelatedWork W2985924212 @default.
- W2022377934 hasRelatedWork W3006552719 @default.
- W2022377934 hasRelatedWork W3186233728 @default.
- W2022377934 hasRelatedWork W3210877509 @default.
- W2022377934 hasRelatedWork W4377964522 @default.
- W2022377934 hasRelatedWork W4384345534 @default.
- W2022377934 hasRelatedWork W2345184372 @default.
- W2022377934 hasVolume "78" @default.
- W2022377934 isParatext "false" @default.
- W2022377934 isRetracted "false" @default.
- W2022377934 magId "2022377934" @default.
- W2022377934 workType "article" @default.
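
The abstract in this record mentions unigram, bigram and trigram features and reports Accuracy, Precision, Recall, Specificity and F-score. As a rough illustration only, the sketch below shows one common way to count n-grams and to derive those five metrics from confusion-matrix counts. The function names `ngram_features` and `binary_metrics`, the sample sentence, and the counts are all hypothetical; none of them comes from the paper or from the BioCaster system, and the record gives no details of the authors' actual implementation.

```python
from collections import Counter


def ngram_features(tokens, max_n=3):
    """Count every n-gram of length 1..max_n in a list of tokens.

    Hypothetical helper: a plain bag-of-n-grams, not the paper's feature set.
    """
    counts = Counter()
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return counts


def binary_metrics(tp, fp, tn, fn):
    """Standard binary classification metrics from confusion-matrix counts.

    Each ratio falls back to 0.0 when its denominator is zero.
    """
    total = tp + fp + tn + fn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    if precision + recall:
        f_score = 2 * precision * recall / (precision + recall)
    else:
        f_score = 0.0
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f_score": f_score,
    }


if __name__ == "__main__":
    # Made-up sentence and counts, for illustration only.
    features = ngram_features("cholera outbreak reported in region".split())
    print(sorted(features))
    print(binary_metrics(tp=40, fp=10, tn=45, fn=5))
```

The F-score here is the balanced F1 (harmonic mean of precision and recall); the record does not say which variant the paper used, so that choice is an assumption.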