Matches in SemOpenAlex for { <https://semopenalex.org/work/W3146688313> ?p ?o ?g. }
- W3146688313 abstract "Text classification is a significant branch of natural language processing, and has many applications including document classification and sentiment analysis. Unsurprisingly, those who do text classification are concerned with the run-time of their algorithms, many of which depend on the size of the corpus' vocabulary due to their bag-of-words representation. Although many studies have examined the effect of preprocessing techniques on vocabulary size and accuracy, none have examined how these methods affect a model's run-time. To fill this gap, we provide a comprehensive study that examines how preprocessing techniques affect the vocabulary size, model performance, and model run-time, evaluating ten techniques over four models and two datasets. We show that some individual methods can reduce run-time with no loss of accuracy, while some combinations of methods can trade 2-5% of the accuracy for up to a 65% reduction of run-time. Furthermore, some combinations of preprocessing techniques can even provide a 15% reduction in run-time while simultaneously improving model accuracy." @default.
- W3146688313 created "2021-04-13" @default.
- W3146688313 creator A5029074659 @default.
- W3146688313 creator A5056096860 @default.
- W3146688313 creator A5087816276 @default.
- W3146688313 date "2021-04-08" @default.
- W3146688313 modified "2023-09-26" @default.
- W3146688313 title "Exploring the Relationship Between Algorithm Performance, Vocabulary, and Run-Time in Text Classification" @default.
- W3146688313 cites W122956856 @default.
- W3146688313 cites W128199165 @default.
- W3146688313 cites W15334911 @default.
- W3146688313 cites W1541781468 @default.
- W3146688313 cites W1596717185 @default.
- W3146688313 cites W179179905 @default.
- W3146688313 cites W1880262756 @default.
- W3146688313 cites W1975158701 @default.
- W3146688313 cites W1976511452 @default.
- W3146688313 cites W1980867644 @default.
- W3146688313 cites W1985258161 @default.
- W3146688313 cites W1989894969 @default.
- W3146688313 cites W2014545475 @default.
- W3146688313 cites W2080445080 @default.
- W3146688313 cites W2081980673 @default.
- W3146688313 cites W2101234009 @default.
- W3146688313 cites W2103333826 @default.
- W3146688313 cites W2135161317 @default.
- W3146688313 cites W2141251646 @default.
- W3146688313 cites W2149684865 @default.
- W3146688313 cites W2162077817 @default.
- W3146688313 cites W2387314750 @default.
- W3146688313 cites W2435251607 @default.
- W3146688313 cites W2512317583 @default.
- W3146688313 cites W2543728926 @default.
- W3146688313 cites W2742034229 @default.
- W3146688313 cites W2898641722 @default.
- W3146688313 cites W2950476268 @default.
- W3146688313 cites W3021743785 @default.
- W3146688313 cites W3026397464 @default.
- W3146688313 cites W3031961393 @default.
- W3146688313 cites W3034480528 @default.
- W3146688313 cites W3034837209 @default.
- W3146688313 cites W3088524227 @default.
- W3146688313 cites W3098649723 @default.
- W3146688313 cites W3124741351 @default.
- W3146688313 cites W3203149905 @default.
- W3146688313 hasPublicationYear "2021" @default.
- W3146688313 type Work @default.
- W3146688313 sameAs 3146688313 @default.
- W3146688313 citedByCount "0" @default.
- W3146688313 crossrefType "posted-content" @default.
- W3146688313 hasAuthorship W3146688313A5029074659 @default.
- W3146688313 hasAuthorship W3146688313A5056096860 @default.
- W3146688313 hasAuthorship W3146688313A5087816276 @default.
- W3146688313 hasConcept C10551718 @default.
- W3146688313 hasConcept C111335779 @default.
- W3146688313 hasConcept C11413529 @default.
- W3146688313 hasConcept C119857082 @default.
- W3146688313 hasConcept C124101348 @default.
- W3146688313 hasConcept C138885662 @default.
- W3146688313 hasConcept C154945302 @default.
- W3146688313 hasConcept C17744445 @default.
- W3146688313 hasConcept C199539241 @default.
- W3146688313 hasConcept C204321447 @default.
- W3146688313 hasConcept C2524010 @default.
- W3146688313 hasConcept C2776359362 @default.
- W3146688313 hasConcept C2777601683 @default.
- W3146688313 hasConcept C33923547 @default.
- W3146688313 hasConcept C34736171 @default.
- W3146688313 hasConcept C41008148 @default.
- W3146688313 hasConcept C41895202 @default.
- W3146688313 hasConcept C66402592 @default.
- W3146688313 hasConcept C94625758 @default.
- W3146688313 hasConceptScore W3146688313C10551718 @default.
- W3146688313 hasConceptScore W3146688313C111335779 @default.
- W3146688313 hasConceptScore W3146688313C11413529 @default.
- W3146688313 hasConceptScore W3146688313C119857082 @default.
- W3146688313 hasConceptScore W3146688313C124101348 @default.
- W3146688313 hasConceptScore W3146688313C138885662 @default.
- W3146688313 hasConceptScore W3146688313C154945302 @default.
- W3146688313 hasConceptScore W3146688313C17744445 @default.
- W3146688313 hasConceptScore W3146688313C199539241 @default.
- W3146688313 hasConceptScore W3146688313C204321447 @default.
- W3146688313 hasConceptScore W3146688313C2524010 @default.
- W3146688313 hasConceptScore W3146688313C2776359362 @default.
- W3146688313 hasConceptScore W3146688313C2777601683 @default.
- W3146688313 hasConceptScore W3146688313C33923547 @default.
- W3146688313 hasConceptScore W3146688313C34736171 @default.
- W3146688313 hasConceptScore W3146688313C41008148 @default.
- W3146688313 hasConceptScore W3146688313C41895202 @default.
- W3146688313 hasConceptScore W3146688313C66402592 @default.
- W3146688313 hasConceptScore W3146688313C94625758 @default.
- W3146688313 hasLocation W31466883131 @default.
- W3146688313 hasOpenAccess W3146688313 @default.
- W3146688313 hasPrimaryLocation W31466883131 @default.
- W3146688313 hasRelatedWork W129655323 @default.
- W3146688313 hasRelatedWork W190279687 @default.
- W3146688313 hasRelatedWork W2011027654 @default.
- W3146688313 hasRelatedWork W2045424838 @default.
- W3146688313 hasRelatedWork W2082965028 @default.
- W3146688313 hasRelatedWork W21726264 @default.