Matches in SemOpenAlex for { <https://semopenalex.org/work/W100013670> ?p ?o ?g. }
Showing items 1 to 77 of 77 with 100 items per page.
- W100013670 abstract "A selective-phrase-based preprocessor for use in generating the feature set of an e-mail message prior to its classification as spam or non-spam is presented. The proposed preprocessor subsumes the phrase-based preprocessor used by Yerazunis and the word-based preprocessor employed by Graham. In Yerazunis' phrase-based approach, a sliding window of W contiguous words is used to generate phrases, and all sub-phrases of each W-word phrase are used as features. In Graham's word-based method, just the individual words in an e-mail message are used as features. The primary goal of this investigation was to determine whether the classification accuracy attained by Yerazunis' preprocessor can be achieved at a lower computational cost by selecting a much smaller but promising set of phrases from which to generate the sub-phrases that constitute the features. The proposed preprocessor first identifies a small number f of words from the input text (namely, those that are most likely to appear in spam messages). For each such word, it then selects B distinct phrases of W contiguous words that contain that word and uses all sub-phrases of each such phrase as features. A secondary goal of the research was to investigate the sensitivity of the classification accuracy to f, B, and W. The methods used by the proposed preprocessor and the preprocessors devised by Yerazunis and Graham were tested on a benchmark corpus of e-mail messages. The classification accuracy and other metrics were measured and reported. The results indicated that the classification accuracy of the proposed preprocessor was comparable to that of Yerazunis' preprocessor while utilizing a much smaller set of features. In addition, the research indicated optimum values for f, B, and W: The best accuracies were achieved at (f, B, W) = (2, 1, 2) and (f, B, W) = (2, 2, 2)." @default.
- W100013670 created "2016-06-24" @default.
- W100013670 creator A5048729068 @default.
- W100013670 creator A5067379693 @default.
- W100013670 date "2008-01-01" @default.
- W100013670 modified "2023-09-26" @default.
- W100013670 title "A selective-phrase-based preprocessor for improved spam filtering" @default.
- W100013670 hasPublicationYear "2008" @default.
- W100013670 type Work @default.
- W100013670 sameAs 100013670 @default.
- W100013670 citedByCount "0" @default.
- W100013670 crossrefType "journal-article" @default.
- W100013670 hasAuthorship W100013670A5048729068 @default.
- W100013670 hasAuthorship W100013670A5067379693 @default.
- W100013670 hasConcept C13280743 @default.
- W100013670 hasConcept C138885662 @default.
- W100013670 hasConcept C154945302 @default.
- W100013670 hasConcept C177264268 @default.
- W100013670 hasConcept C185798385 @default.
- W100013670 hasConcept C188338183 @default.
- W100013670 hasConcept C199360897 @default.
- W100013670 hasConcept C204321447 @default.
- W100013670 hasConcept C205649164 @default.
- W100013670 hasConcept C2524010 @default.
- W100013670 hasConcept C2776224158 @default.
- W100013670 hasConcept C2776401178 @default.
- W100013670 hasConcept C28490314 @default.
- W100013670 hasConcept C33923547 @default.
- W100013670 hasConcept C34736171 @default.
- W100013670 hasConcept C41008148 @default.
- W100013670 hasConcept C41895202 @default.
- W100013670 hasConcept C90805587 @default.
- W100013670 hasConceptScore W100013670C13280743 @default.
- W100013670 hasConceptScore W100013670C138885662 @default.
- W100013670 hasConceptScore W100013670C154945302 @default.
- W100013670 hasConceptScore W100013670C177264268 @default.
- W100013670 hasConceptScore W100013670C185798385 @default.
- W100013670 hasConceptScore W100013670C188338183 @default.
- W100013670 hasConceptScore W100013670C199360897 @default.
- W100013670 hasConceptScore W100013670C204321447 @default.
- W100013670 hasConceptScore W100013670C205649164 @default.
- W100013670 hasConceptScore W100013670C2524010 @default.
- W100013670 hasConceptScore W100013670C2776224158 @default.
- W100013670 hasConceptScore W100013670C2776401178 @default.
- W100013670 hasConceptScore W100013670C28490314 @default.
- W100013670 hasConceptScore W100013670C33923547 @default.
- W100013670 hasConceptScore W100013670C34736171 @default.
- W100013670 hasConceptScore W100013670C41008148 @default.
- W100013670 hasConceptScore W100013670C41895202 @default.
- W100013670 hasConceptScore W100013670C90805587 @default.
- W100013670 hasLocation W1000136701 @default.
- W100013670 hasOpenAccess W100013670 @default.
- W100013670 hasPrimaryLocation W1000136701 @default.
- W100013670 hasRelatedWork W114321176 @default.
- W100013670 hasRelatedWork W2045457204 @default.
- W100013670 hasRelatedWork W2067797441 @default.
- W100013670 hasRelatedWork W2122167483 @default.
- W100013670 hasRelatedWork W2202116757 @default.
- W100013670 hasRelatedWork W2208753469 @default.
- W100013670 hasRelatedWork W2357046830 @default.
- W100013670 hasRelatedWork W2397568968 @default.
- W100013670 hasRelatedWork W3020147673 @default.
- W100013670 hasRelatedWork W3021596267 @default.
- W100013670 hasRelatedWork W3095995985 @default.
- W100013670 hasRelatedWork W3098953385 @default.
- W100013670 hasRelatedWork W3115621631 @default.
- W100013670 hasRelatedWork W2124322817 @default.
- W100013670 hasRelatedWork W2815028661 @default.
- W100013670 hasRelatedWork W2851004265 @default.
- W100013670 hasRelatedWork W2862491776 @default.
- W100013670 hasRelatedWork W2928429109 @default.
- W100013670 hasRelatedWork W2959900528 @default.
- W100013670 hasRelatedWork W3086311423 @default.
- W100013670 isParatext "false" @default.
- W100013670 isRetracted "false" @default.
- W100013670 magId "100013670" @default.
- W100013670 workType "article" @default.
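
The three feature generators compared in the abstract above (word-based, sliding-window phrase-based, and selective-phrase-based) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. The abstract does not define "sub-phrase", so the sketch assumes it means any contiguous sub-sequence of a phrase. It does not say how the f spam-prone words are scored or which B phrases are picked, so the sketch assumes a hypothetical `spam_score` lookup (word to estimated spam likelihood, with unseen words scored 0.0) and takes the first B distinct windows in order of appearance. All function and parameter names are hypothetical.

```python
from typing import Dict, List, Set, Tuple

Phrase = Tuple[str, ...]


def sub_phrases(phrase: Phrase) -> Set[Phrase]:
    """Assumed meaning of 'sub-phrase': every contiguous sub-sequence."""
    n = len(phrase)
    return {phrase[i:j] for i in range(n) for j in range(i + 1, n + 1)}


def windows(words: List[str], W: int) -> List[Phrase]:
    """All windows of W contiguous words (one short window if the text is shorter than W)."""
    return [tuple(words[s:s + W]) for s in range(max(1, len(words) - W + 1))]


def word_features(words: List[str]) -> Set[Phrase]:
    """Word-based features: just the individual words."""
    return {(w,) for w in words}


def sliding_window_features(words: List[str], W: int) -> Set[Phrase]:
    """Phrase-based features: sub-phrases of every W-word window."""
    feats: Set[Phrase] = set()
    for window in windows(words, W):
        feats |= sub_phrases(window)
    return feats


def selective_features(
    words: List[str],
    spam_score: Dict[str, float],  # hypothetical: word -> estimated spam likelihood
    f: int,
    B: int,
    W: int,
) -> Set[Phrase]:
    """Selective features: sub-phrases of at most B windows around each of the f top-scored words."""
    distinct = list(dict.fromkeys(words))  # de-duplicate, keep first-seen order
    # Assumption: unseen words score 0.0; ties keep first-seen order (stable sort).
    ranked = sorted(distinct, key=lambda w: spam_score.get(w, 0.0), reverse=True)
    feats: Set[Phrase] = set()
    for target in ranked[:f]:
        chosen: List[Phrase] = []
        for window in windows(words, W):
            if target in window and window not in chosen:
                chosen.append(window)
                if len(chosen) == B:
                    break
        for window in chosen:
            feats |= sub_phrases(window)
    return feats
```

Under these assumptions, setting W = 1 in `sliding_window_features` reduces it to `word_features`, and choosing f and B large enough in `selective_features` reproduces the full sliding-window feature set, which is one reading of the abstract's claim that the proposed preprocessor subsumes the other two. With small values such as (f, B, W) = (2, 1, 2), only a handful of windows are expanded, which is where the reduced feature count comes from.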