Matches in SemOpenAlex for { <https://semopenalex.org/work/W1991966777> ?p ?o ?g. }
- W1991966777 endingPage "362" @default.
- W1991966777 startingPage "351" @default.
- W1991966777 abstract "Text categorization (TC) has become the key technology for finding relevant and timely information in a volume of digital documents, and feature selection techniques have been proposed to overcome the high dimensionality that causes high computational complexity and low accuracy in TC tasks. Chi-square statistics (CHI) is one of the most efficient feature selection methods; however, it has two weaknesses. (1) It is document frequency based, and only counts whether a term occurs or not. In practice, a high-frequency term occurring in few documents is often regarded as a discriminator in the corpus. (2) It does not consider the term distribution. A term has more discriminating power for a specific category when its difference in degree of distribution is lower. In this paper, we propose a modified CHI feature selection approach, called term frequency and distribution based CHI, to overcome these weaknesses. We use sample variance to calculate the term distribution, and improve the classic CHI with maximum term frequency. Extensive comparative experiments on three corpora show that the proposed approach is comparable to the classic feature selection methods in terms of macro-F1 and micro-F1." @default.
- W1991966777 created "2016-06-24" @default.
- W1991966777 creator A5028825049 @default.
- W1991966777 creator A5046462412 @default.
- W1991966777 creator A5053920875 @default.
- W1991966777 creator A5065644867 @default.
- W1991966777 creator A5074238751 @default.
- W1991966777 creator A5075258792 @default.
- W1991966777 date "2015-03-20" @default.
- W1991966777 modified "2023-10-18" @default.
- W1991966777 title "Chi-square Statistics Feature Selection Based on Term Frequency and Distribution for Text Categorization" @default.
- W1991966777 cites W1966947440 @default.
- W1991966777 cites W1973637371 @default.
- W1991966777 cites W1976420489 @default.
- W1991966777 cites W1999635750 @default.
- W1991966777 cites W1999928587 @default.
- W1991966777 cites W2010997871 @default.
- W1991966777 cites W2011578637 @default.
- W1991966777 cites W2024031808 @default.
- W1991966777 cites W2027147487 @default.
- W1991966777 cites W2040884411 @default.
- W1991966777 cites W2055784192 @default.
- W1991966777 cites W2057455558 @default.
- W1991966777 cites W2059586463 @default.
- W1991966777 cites W2063249739 @default.
- W1991966777 cites W2068833644 @default.
- W1991966777 cites W2069602768 @default.
- W1991966777 cites W2071496623 @default.
- W1991966777 cites W2080401345 @default.
- W1991966777 cites W2089870669 @default.
- W1991966777 cites W2092158574 @default.
- W1991966777 cites W2099606292 @default.
- W1991966777 cites W2102294253 @default.
- W1991966777 cites W2108150588 @default.
- W1991966777 cites W2118020653 @default.
- W1991966777 cites W2119479037 @default.
- W1991966777 cites W2119821739 @default.
- W1991966777 cites W2154053567 @default.
- W1991966777 cites W2155975780 @default.
- W1991966777 cites W2162223169 @default.
- W1991966777 doi "https://doi.org/10.1080/03772063.2015.1021385" @default.
- W1991966777 hasPublicationYear "2015" @default.
- W1991966777 type Work @default.
- W1991966777 sameAs 1991966777 @default.
- W1991966777 citedByCount "23" @default.
- W1991966777 countsByYear W19919667772016 @default.
- W1991966777 countsByYear W19919667772017 @default.
- W1991966777 countsByYear W19919667772018 @default.
- W1991966777 countsByYear W19919667772019 @default.
- W1991966777 countsByYear W19919667772020 @default.
- W1991966777 countsByYear W19919667772021 @default.
- W1991966777 countsByYear W19919667772022 @default.
- W1991966777 countsByYear W19919667772023 @default.
- W1991966777 crossrefType "journal-article" @default.
- W1991966777 hasAuthorship W1991966777A5028825049 @default.
- W1991966777 hasAuthorship W1991966777A5046462412 @default.
- W1991966777 hasAuthorship W1991966777A5053920875 @default.
- W1991966777 hasAuthorship W1991966777A5065644867 @default.
- W1991966777 hasAuthorship W1991966777A5074238751 @default.
- W1991966777 hasAuthorship W1991966777A5075258792 @default.
- W1991966777 hasConcept C105795698 @default.
- W1991966777 hasConcept C111030470 @default.
- W1991966777 hasConcept C121332964 @default.
- W1991966777 hasConcept C121955636 @default.
- W1991966777 hasConcept C124101348 @default.
- W1991966777 hasConcept C138885662 @default.
- W1991966777 hasConcept C144133560 @default.
- W1991966777 hasConcept C148483581 @default.
- W1991966777 hasConcept C153180895 @default.
- W1991966777 hasConcept C154945302 @default.
- W1991966777 hasConcept C196083921 @default.
- W1991966777 hasConcept C2776401178 @default.
- W1991966777 hasConcept C2779803651 @default.
- W1991966777 hasConcept C33923547 @default.
- W1991966777 hasConcept C41008148 @default.
- W1991966777 hasConcept C41895202 @default.
- W1991966777 hasConcept C51921466 @default.
- W1991966777 hasConcept C61797465 @default.
- W1991966777 hasConcept C62520636 @default.
- W1991966777 hasConcept C76155785 @default.
- W1991966777 hasConcept C81917197 @default.
- W1991966777 hasConcept C94124525 @default.
- W1991966777 hasConcept C94915269 @default.
- W1991966777 hasConceptScore W1991966777C105795698 @default.
- W1991966777 hasConceptScore W1991966777C111030470 @default.
- W1991966777 hasConceptScore W1991966777C121332964 @default.
- W1991966777 hasConceptScore W1991966777C121955636 @default.
- W1991966777 hasConceptScore W1991966777C124101348 @default.
- W1991966777 hasConceptScore W1991966777C138885662 @default.
- W1991966777 hasConceptScore W1991966777C144133560 @default.
- W1991966777 hasConceptScore W1991966777C148483581 @default.
- W1991966777 hasConceptScore W1991966777C153180895 @default.
- W1991966777 hasConceptScore W1991966777C154945302 @default.
- W1991966777 hasConceptScore W1991966777C196083921 @default.
- W1991966777 hasConceptScore W1991966777C2776401178 @default.
- W1991966777 hasConceptScore W1991966777C2779803651 @default.
- W1991966777 hasConceptScore W1991966777C33923547 @default.
- W1991966777 hasConceptScore W1991966777C41008148 @default.