Matches in SemOpenAlex for { <https://semopenalex.org/work/W4281690387> ?p ?o ?g. }
Showing items 1 to 49 of 49 with 100 items per page.
- W4281690387 abstract "Document classification is the detection of specific content of interest in text documents. In contrast to data-driven machine learning classifiers, knowledge-based classifiers can be constructed from domain-specific knowledge, which usually takes the form of a collection of subject-related keywords. While typical knowledge-based classifiers compute a prediction score based on keyword abundance, they generally suffer from noisy detections due to the lack of a guiding principle for gauging the keyword matches. In this paper, we propose a novel knowledge-based model equipped with Shannon Entropy, which measures the richness of information and favors uniform and diverse keyword matches. Without invoking any positive sample, this method provides a simple and explainable solution for document classification. We show that the Shannon Entropy significantly improves the recall at a fixed level of false positive rate. We also show that the model is more robust to changes in the data distribution at inference time than traditional machine learning, particularly when the positive training samples are very limited." @default.
- W4281690387 created "2022-06-13" @default.
- W4281690387 creator A5076947036 @default.
- W4281690387 date "2022-06-06" @default.
- W4281690387 modified "2023-09-23" @default.
- W4281690387 title "Knowledge-based Document Classification with Shannon Entropy" @default.
- W4281690387 doi "https://doi.org/10.48550/arxiv.2206.02363" @default.
- W4281690387 hasPublicationYear "2022" @default.
- W4281690387 type Work @default.
- W4281690387 citedByCount "0" @default.
- W4281690387 crossrefType "posted-content" @default.
- W4281690387 hasAuthorship W4281690387A5076947036 @default.
- W4281690387 hasBestOaLocation W42816903871 @default.
- W4281690387 hasConcept C106301342 @default.
- W4281690387 hasConcept C119857082 @default.
- W4281690387 hasConcept C121332964 @default.
- W4281690387 hasConcept C153180895 @default.
- W4281690387 hasConcept C154945302 @default.
- W4281690387 hasConcept C2776214188 @default.
- W4281690387 hasConcept C41008148 @default.
- W4281690387 hasConcept C51632099 @default.
- W4281690387 hasConcept C62520636 @default.
- W4281690387 hasConcept C9679016 @default.
- W4281690387 hasConceptScore W4281690387C106301342 @default.
- W4281690387 hasConceptScore W4281690387C119857082 @default.
- W4281690387 hasConceptScore W4281690387C121332964 @default.
- W4281690387 hasConceptScore W4281690387C153180895 @default.
- W4281690387 hasConceptScore W4281690387C154945302 @default.
- W4281690387 hasConceptScore W4281690387C2776214188 @default.
- W4281690387 hasConceptScore W4281690387C41008148 @default.
- W4281690387 hasConceptScore W4281690387C51632099 @default.
- W4281690387 hasConceptScore W4281690387C62520636 @default.
- W4281690387 hasConceptScore W4281690387C9679016 @default.
- W4281690387 hasLocation W42816903871 @default.
- W4281690387 hasOpenAccess W4281690387 @default.
- W4281690387 hasPrimaryLocation W42816903871 @default.
- W4281690387 hasRelatedWork W103782288 @default.
- W4281690387 hasRelatedWork W1527859954 @default.
- W4281690387 hasRelatedWork W2511279186 @default.
- W4281690387 hasRelatedWork W2753840555 @default.
- W4281690387 hasRelatedWork W2799803467 @default.
- W4281690387 hasRelatedWork W2897410528 @default.
- W4281690387 hasRelatedWork W2948131761 @default.
- W4281690387 hasRelatedWork W2950051839 @default.
- W4281690387 hasRelatedWork W4280575929 @default.
- W4281690387 hasRelatedWork W4298557927 @default.
- W4281690387 isParatext "false" @default.
- W4281690387 isRetracted "false" @default.
- W4281690387 workType "article" @default.
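
The abstract above says that Shannon entropy "favors uniform and diverse keyword matches". The following is a minimal sketch of that idea only, not the paper's actual scoring model: the function name `keyword_entropy`, the regex tokenizer, and the sample keyword list are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def keyword_entropy(text, keywords):
    """Shannon entropy (in bits) of the keyword-match distribution in `text`.

    Hypothetical helper for illustration; the paper's model may differ.
    """
    # Assumed tokenizer: lowercase alphanumeric runs.
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    wanted = {k.lower() for k in keywords}

    # Count how often each keyword is matched.
    counts = Counter(t for t in tokens if t in wanted)
    total = sum(counts.values())
    if total == 0:
        return 0.0  # no matches, no information

    # H = sum_i p_i * log2(1 / p_i), where p_i is the share of matches
    # that belong to keyword i.
    return sum((c / total) * math.log2(total / c) for c in counts.values())


keywords = ["loan", "credit", "interest", "mortgage"]  # assumed example list

# Many hits on a single keyword: entropy is zero.
repetitive = keyword_entropy("loan loan loan loan", keywords)

# The same number of hits spread evenly over four keywords: entropy is maximal.
diverse = keyword_entropy("loan credit interest mortgage", keywords)
```

Both sample texts contain four keyword hits, so a purely abundance-based score would rank them equally, while the entropy separates the repetitive match pattern from the diverse one.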