Matches in SemOpenAlex for { <https://semopenalex.org/work/W2045424838> ?p ?o ?g. }
- W2045424838 endingPage "10" @default.
- W2045424838 startingPage "1" @default.
- W2045424838 abstract "Feature selection techniques play an important role in text categorization (TC), especially for large-scale TC tasks. Many new and improved methods have been proposed, and most of them are based on document frequency, such as the well-known chi-square statistic and information gain. These document-frequency methods, however, have two shortcomings: (1) they are not reliable for low-frequency terms, that is, low-frequency terms are filtered out because of their smaller weights; and (2) they only count whether a term occurs within a document and ignore term frequency. In practice, a high-frequency term (other than a stop word) that occurs in only a few documents is often regarded as a discriminator in real-life corpora. To address these drawbacks, the paper focuses on how to construct a feature selection function based on term frequency, and proposes a new approach using Student's t-test. The t-test function is used to measure how much the distribution of a term's frequency differs between a specific category and the entire corpus. Extensive comparative experiments on two text corpora using three classifiers show that the proposed approach is comparable to the state-of-the-art feature selection methods in terms of macro-F1 and micro-F1. On micro-F1 in particular, our method achieves slightly better performance on Reuters with kNN and SVM classifiers than χ² and IG." @default.
- W2045424838 created "2016-06-24" @default.
- W2045424838 creator A5003759585 @default.
- W2045424838 creator A5005118720 @default.
- W2045424838 creator A5006320644 @default.
- W2045424838 creator A5078691114 @default.
- W2045424838 creator A5081675173 @default.
- W2045424838 date "2014-08-01" @default.
- W2045424838 modified "2023-09-23" @default.
- W2045424838 title "t-Test feature selection approach based on term frequency for text categorization" @default.
- W2045424838 cites W1972640883 @default.
- W2045424838 cites W1975980877 @default.
- W2045424838 cites W1978394996 @default.
- W2045424838 cites W1982827240 @default.
- W2045424838 cites W1999928587 @default.
- W2045424838 cites W2068833644 @default.
- W2045424838 cites W2073901485 @default.
- W2045424838 cites W2081891767 @default.
- W2045424838 cites W2090091537 @default.
- W2045424838 cites W2097169398 @default.
- W2045424838 cites W2099975513 @default.
- W2045424838 cites W2100253618 @default.
- W2045424838 cites W2105330923 @default.
- W2045424838 cites W2105393881 @default.
- W2045424838 cites W2114404520 @default.
- W2045424838 cites W2118020653 @default.
- W2045424838 cites W2136272402 @default.
- W2045424838 cites W2138550913 @default.
- W2045424838 cites W2153962014 @default.
- W2045424838 cites W2165008816 @default.
- W2045424838 cites W2166317455 @default.
- W2045424838 cites W4205110562 @default.
- W2045424838 cites W4233499618 @default.
- W2045424838 cites W4236137412 @default.
- W2045424838 cites W4239510810 @default.
- W2045424838 doi "https://doi.org/10.1016/j.patrec.2014.02.013" @default.
- W2045424838 hasPublicationYear "2014" @default.
- W2045424838 type Work @default.
- W2045424838 sameAs 2045424838 @default.
- W2045424838 citedByCount "92" @default.
- W2045424838 countsByYear W20454248382015 @default.
- W2045424838 countsByYear W20454248382016 @default.
- W2045424838 countsByYear W20454248382017 @default.
- W2045424838 countsByYear W20454248382018 @default.
- W2045424838 countsByYear W20454248382019 @default.
- W2045424838 countsByYear W20454248382020 @default.
- W2045424838 countsByYear W20454248382021 @default.
- W2045424838 countsByYear W20454248382022 @default.
- W2045424838 countsByYear W20454248382023 @default.
- W2045424838 crossrefType "journal-article" @default.
- W2045424838 hasAuthorship W2045424838A5003759585 @default.
- W2045424838 hasAuthorship W2045424838A5005118720 @default.
- W2045424838 hasAuthorship W2045424838A5006320644 @default.
- W2045424838 hasAuthorship W2045424838A5078691114 @default.
- W2045424838 hasAuthorship W2045424838A5081675173 @default.
- W2045424838 hasConcept C119857082 @default.
- W2045424838 hasConcept C121332964 @default.
- W2045424838 hasConcept C138885662 @default.
- W2045424838 hasConcept C148483581 @default.
- W2045424838 hasConcept C151730666 @default.
- W2045424838 hasConcept C153180895 @default.
- W2045424838 hasConcept C154945302 @default.
- W2045424838 hasConcept C204321447 @default.
- W2045424838 hasConcept C2776401178 @default.
- W2045424838 hasConcept C2777267654 @default.
- W2045424838 hasConcept C2986744138 @default.
- W2045424838 hasConcept C41008148 @default.
- W2045424838 hasConcept C41895202 @default.
- W2045424838 hasConcept C61797465 @default.
- W2045424838 hasConcept C62520636 @default.
- W2045424838 hasConcept C81917197 @default.
- W2045424838 hasConcept C86803240 @default.
- W2045424838 hasConcept C94124525 @default.
- W2045424838 hasConceptScore W2045424838C119857082 @default.
- W2045424838 hasConceptScore W2045424838C121332964 @default.
- W2045424838 hasConceptScore W2045424838C138885662 @default.
- W2045424838 hasConceptScore W2045424838C148483581 @default.
- W2045424838 hasConceptScore W2045424838C151730666 @default.
- W2045424838 hasConceptScore W2045424838C153180895 @default.
- W2045424838 hasConceptScore W2045424838C154945302 @default.
- W2045424838 hasConceptScore W2045424838C204321447 @default.
- W2045424838 hasConceptScore W2045424838C2776401178 @default.
- W2045424838 hasConceptScore W2045424838C2777267654 @default.
- W2045424838 hasConceptScore W2045424838C2986744138 @default.
- W2045424838 hasConceptScore W2045424838C41008148 @default.
- W2045424838 hasConceptScore W2045424838C41895202 @default.
- W2045424838 hasConceptScore W2045424838C61797465 @default.
- W2045424838 hasConceptScore W2045424838C62520636 @default.
- W2045424838 hasConceptScore W2045424838C81917197 @default.
- W2045424838 hasConceptScore W2045424838C86803240 @default.
- W2045424838 hasConceptScore W2045424838C94124525 @default.
- W2045424838 hasLocation W20454248381 @default.
- W2045424838 hasOpenAccess W2045424838 @default.
- W2045424838 hasPrimaryLocation W20454248381 @default.
- W2045424838 hasRelatedWork W2111353337 @default.
- W2045424838 hasRelatedWork W2144227738 @default.
- W2045424838 hasRelatedWork W2348570206 @default.
- W2045424838 hasRelatedWork W2355149094 @default.
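
The abstract above describes scoring a term by comparing its frequency distribution in one category with its distribution in the entire corpus. The following is a minimal sketch of that idea, not the paper's actual formula: it assumes a Welch-style two-sample t statistic over per-document term counts, and the function name `t_score` and both argument names are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def t_score(tf_in_category, tf_in_corpus):
    """Welch-style t statistic for one term (illustrative assumption only).

    tf_in_category: per-document counts of the term in one category.
    tf_in_corpus:   per-document counts of the term over the whole corpus.
    Each list needs at least two values so a sample variance exists.
    """
    n_cat, n_all = len(tf_in_category), len(tf_in_corpus)
    mean_cat, mean_all = mean(tf_in_category), mean(tf_in_corpus)
    var_cat, var_all = variance(tf_in_category), variance(tf_in_corpus)

    # Standard error of the difference between the two means.
    std_err = sqrt(var_cat / n_cat + var_all / n_all)
    if std_err == 0:
        # Both samples are constant, so there is no spread to test against.
        return 0.0
    return abs(mean_cat - mean_all) / std_err


# A term that is frequent inside the category but rare elsewhere.
concentrated = t_score([3, 4, 5], [0, 0, 1, 3, 4, 5])
# A term whose mean frequency is the same in the category and the corpus.
uniform = t_score([0, 1, 0], [0, 1, 0, 0, 1, 0])
```

Under these assumptions, a larger score means the term's frequency in the category departs more from its corpus-wide behaviour, so terms could be ranked by score and the top ones kept as features.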