Matches in SemOpenAlex for { <https://semopenalex.org/work/W5470710> ?p ?o ?g. }
Showing items 1 to 87 of 87 with 100 items per page.
- W5470710 abstract "One of the biggest challenges in building effective anti-spam solutions is designing systems to defend against the ever-evolving bag of tricks spammers use to defeat them. Because of this, spam filters that work well today may not work well tomorrow. The adversarial nature of the spam problem makes large, up-to-date, and diverse e-mail corpora critical for the development and evaluation of new anti-spam filtering technologies. Gathering large collections of messages can actually be quite easy, especially in the context of a large corporate or ISP environment. The challenge is not necessarily in collecting enough mail, however, but in collecting a representative distribution of mail types as seen “in the wild” and in then accurately labeling the hundreds of thousands or millions of accumulated messages as spam or non-spam. In the field of machine learning, Uncertainty Sampling is a well-known Active Learning algorithm which uses a collaborative model to minimize the human effort required to label large datasets. While conventional Uncertainty Sampling has been shown to be very effective, it is also computationally very expensive since the learner must reclassify all the unlabeled instances during each learning iteration. We propose a new algorithm, Approximate Uncertainty Sampling (AUS), which is nearly as efficacious as Uncertainty Sampling, but has substantially lower computational complexity. The reduced computational cost allows Approximate Uncertainty Sampling to be applied to labeling larger datasets and also makes it possible to update the learned model more frequently. Approximate Uncertainty Sampling encourages the building of larger, more topical, and more realistic example e-mail corpora for evaluating new anti-spam filters.
While we focus on the binary labeling of large volumes of e-mail messages, as with Uncertainty Sampling, Approximate Uncertainty Sampling can be used with a wide range of underlying classification algorithms for a variety of categorization tasks." @default.
- W5470710 created "2016-06-24" @default.
- W5470710 creator A5038017609 @default.
- W5470710 creator A5045414075 @default.
- W5470710 creator A5070745247 @default.
- W5470710 date "2006-01-01" @default.
- W5470710 modified "2023-09-25" @default.
- W5470710 title "Fast Uncertainty Sampling for Labeling Large E-mail Corpora." @default.
- W5470710 cites W114599062 @default.
- W5470710 cites W1484084878 @default.
- W5470710 cites W1513874326 @default.
- W5470710 cites W1756896031 @default.
- W5470710 cites W2018810220 @default.
- W5470710 cites W2080021732 @default.
- W5470710 cites W2085989833 @default.
- W5470710 cites W2091110562 @default.
- W5470710 cites W2122520221 @default.
- W5470710 cites W2149684865 @default.
- W5470710 cites W2426031434 @default.
- W5470710 cites W2949071206 @default.
- W5470710 cites W3112138688 @default.
- W5470710 hasPublicationYear "2006" @default.
- W5470710 type Work @default.
- W5470710 sameAs 5470710 @default.
- W5470710 citedByCount "15" @default.
- W5470710 countsByYear W54707102012 @default.
- W5470710 countsByYear W54707102013 @default.
- W5470710 countsByYear W54707102015 @default.
- W5470710 crossrefType "proceedings-article" @default.
- W5470710 hasAuthorship W5470710A5038017609 @default.
- W5470710 hasAuthorship W5470710A5045414075 @default.
- W5470710 hasAuthorship W5470710A5070745247 @default.
- W5470710 hasConcept C106131492 @default.
- W5470710 hasConcept C119857082 @default.
- W5470710 hasConcept C124101348 @default.
- W5470710 hasConcept C140779682 @default.
- W5470710 hasConcept C151730666 @default.
- W5470710 hasConcept C154945302 @default.
- W5470710 hasConcept C202444582 @default.
- W5470710 hasConcept C2779343474 @default.
- W5470710 hasConcept C31972630 @default.
- W5470710 hasConcept C33923547 @default.
- W5470710 hasConcept C37736160 @default.
- W5470710 hasConcept C41008148 @default.
- W5470710 hasConcept C86803240 @default.
- W5470710 hasConcept C9652623 @default.
- W5470710 hasConceptScore W5470710C106131492 @default.
- W5470710 hasConceptScore W5470710C119857082 @default.
- W5470710 hasConceptScore W5470710C124101348 @default.
- W5470710 hasConceptScore W5470710C140779682 @default.
- W5470710 hasConceptScore W5470710C151730666 @default.
- W5470710 hasConceptScore W5470710C154945302 @default.
- W5470710 hasConceptScore W5470710C202444582 @default.
- W5470710 hasConceptScore W5470710C2779343474 @default.
- W5470710 hasConceptScore W5470710C31972630 @default.
- W5470710 hasConceptScore W5470710C33923547 @default.
- W5470710 hasConceptScore W5470710C37736160 @default.
- W5470710 hasConceptScore W5470710C41008148 @default.
- W5470710 hasConceptScore W5470710C86803240 @default.
- W5470710 hasConceptScore W5470710C9652623 @default.
- W5470710 hasLocation W54707101 @default.
- W5470710 hasOpenAccess W5470710 @default.
- W5470710 hasPrimaryLocation W54707101 @default.
- W5470710 hasRelatedWork W1483816357 @default.
- W5470710 hasRelatedWork W1484084878 @default.
- W5470710 hasRelatedWork W1513874326 @default.
- W5470710 hasRelatedWork W1514707997 @default.
- W5470710 hasRelatedWork W1514940655 @default.
- W5470710 hasRelatedWork W1528361845 @default.
- W5470710 hasRelatedWork W1978633512 @default.
- W5470710 hasRelatedWork W2041597619 @default.
- W5470710 hasRelatedWork W2080021732 @default.
- W5470710 hasRelatedWork W2085989833 @default.
- W5470710 hasRelatedWork W2107008379 @default.
- W5470710 hasRelatedWork W2125667824 @default.
- W5470710 hasRelatedWork W2138079527 @default.
- W5470710 hasRelatedWork W2154890143 @default.
- W5470710 hasRelatedWork W2169384781 @default.
- W5470710 hasRelatedWork W2426031434 @default.
- W5470710 hasRelatedWork W2570764145 @default.
- W5470710 hasRelatedWork W2903158431 @default.
- W5470710 hasRelatedWork W2917688635 @default.
- W5470710 hasRelatedWork W2953947325 @default.
- W5470710 isParatext "false" @default.
- W5470710 isRetracted "false" @default.
- W5470710 magId "5470710" @default.
- W5470710 workType "article" @default.
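
The abstract above describes conventional Uncertainty Sampling as a loop in which the learner rescores every unlabeled message on each iteration and asks a human to label the ones it is least sure about. The following is a minimal Python sketch of that conventional loop only, under stated assumptions: the one-dimensional feature, the toy threshold model `fit_threshold`, and the names `most_uncertain`, `uncertainty_sampling`, `oracle`, and `seed` are all hypothetical and do not come from the paper. The record does not specify how Approximate Uncertainty Sampling reduces the cost, so no attempt is made to reproduce it here.

```python
import math
from typing import Callable, List, Sequence, Tuple

Labeled = List[Tuple[float, bool]]  # (feature, is_spam) pairs; the feature is a toy 1-D score


def most_uncertain(probs: Sequence[float], k: int) -> List[int]:
    """Return the indices of the k probabilities closest to 0.5."""
    order = sorted(range(len(probs)), key=lambda i: abs(probs[i] - 0.5))
    return order[:k]


def fit_threshold(labeled: Labeled) -> Callable[[float], float]:
    """Toy model (an assumption): a logistic curve centred between the classes."""
    spam = [x for x, y in labeled if y]
    ham = [x for x, y in labeled if not y]
    if not spam or not ham:
        return lambda x: 0.5  # no information yet: everything is equally uncertain
    t = (min(spam) + max(ham)) / 2
    return lambda x: 1.0 / (1.0 + math.exp(-(x - t)))


def uncertainty_sampling(
    pool: Sequence[float],
    oracle: Callable[[float], bool],
    seed: Labeled,
    rounds: int,
    batch_size: int,
) -> Tuple[Labeled, List[float]]:
    """Conventional pool-based uncertainty sampling."""
    labeled = list(seed)
    unlabeled = list(pool)
    for _ in range(rounds):
        if not unlabeled:
            break
        model = fit_threshold(labeled)
        # The expensive step named in the abstract: the whole pool is rescored.
        probs = [model(x) for x in unlabeled]
        picked = most_uncertain(probs, batch_size)
        # Pop from the highest index down so earlier indices stay valid.
        for i in sorted(picked, reverse=True):
            x = unlabeled.pop(i)
            labeled.append((x, oracle(x)))  # the human labeling step
    return labeled, unlabeled
```

In this sketch the per-round cost grows with the size of the unlabeled pool, which is the cost the abstract says the proposed algorithm lowers.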