Matches in SemOpenAlex for { <https://semopenalex.org/work/W4309836720> ?p ?o ?g. }
Showing items 1 to 91 of 91 with 100 items per page.
- W4309836720 endingPage "58" @default.
- W4309836720 startingPage "42" @default.
- W4309836720 abstract "We address class imbalance problems. These are classification problems where the target variable is binary, and one class dominates over the other. A central objective in these problems is to identify features that yield models with high precision/recall values, the standard yardsticks for assessing such models. Our features are extracted from the textual data inherent in such problems. We use n-gram frequencies as features and introduce a discrepancy score that measures the efficacy of an n-gram in highlighting the minority class. The frequency counts of n-grams with the highest discrepancy scores are used as features to construct models with the desired metrics. According to the best practices followed by the services industry, many customer support tickets will get audited and tagged as “contract-compliant” whereas some will be tagged as “over-delivered”. Based on in-field data, we use a random forest classifier and perform a randomized grid search over the model hyperparameters. The model scoring is performed using a scoring function. Our objective is to minimize the follow-up costs by optimizing the recall score while maintaining a base-level precision score. The final optimized model achieves an acceptable recall score while staying above the target precision. We validate our feature selection method by comparing our model with one constructed using frequency counts of n-grams chosen randomly. We propose extensions of our feature extraction method to general classification (binary and multi-class) and regression problems. The discrepancy score is one measure of dissimilarity of distributions and other (more general) measures that we formulate could potentially yield more effective models." @default.
- W4309836720 created "2022-11-29" @default.
- W4309836720 creator A5016933352 @default.
- W4309836720 creator A5033347955 @default.
- W4309836720 creator A5071818365 @default.
- W4309836720 date "2022-11-23" @default.
- W4309836720 modified "2023-10-18" @default.
- W4309836720 title "Extracting Features from Textual Data in Class Imbalance Problems" @default.
- W4309836720 cites W102369970 @default.
- W4309836720 cites W1766594731 @default.
- W4309836720 cites W2008815175 @default.
- W4309836720 cites W2023847169 @default.
- W4309836720 cites W2066443039 @default.
- W4309836720 cites W2104167780 @default.
- W4309836720 cites W2107327607 @default.
- W4309836720 cites W2118978333 @default.
- W4309836720 cites W2120457925 @default.
- W4309836720 cites W2125877832 @default.
- W4309836720 cites W2147813562 @default.
- W4309836720 cites W2604756720 @default.
- W4309836720 cites W2806416578 @default.
- W4309836720 cites W2896236534 @default.
- W4309836720 cites W2922656120 @default.
- W4309836720 cites W3049415291 @default.
- W4309836720 cites W3083850392 @default.
- W4309836720 cites W4293718652 @default.
- W4309836720 doi "https://doi.org/10.4995/jclr.2022.18200" @default.
- W4309836720 hasPublicationYear "2022" @default.
- W4309836720 type Work @default.
- W4309836720 citedByCount "0" @default.
- W4309836720 crossrefType "journal-article" @default.
- W4309836720 hasAuthorship W4309836720A5016933352 @default.
- W4309836720 hasAuthorship W4309836720A5033347955 @default.
- W4309836720 hasAuthorship W4309836720A5071818365 @default.
- W4309836720 hasBestOaLocation W43098367201 @default.
- W4309836720 hasConcept C100660578 @default.
- W4309836720 hasConcept C10485038 @default.
- W4309836720 hasConcept C119857082 @default.
- W4309836720 hasConcept C12267149 @default.
- W4309836720 hasConcept C124101348 @default.
- W4309836720 hasConcept C138885662 @default.
- W4309836720 hasConcept C148483581 @default.
- W4309836720 hasConcept C148524875 @default.
- W4309836720 hasConcept C153180895 @default.
- W4309836720 hasConcept C154945302 @default.
- W4309836720 hasConcept C169258074 @default.
- W4309836720 hasConcept C2777212361 @default.
- W4309836720 hasConcept C41008148 @default.
- W4309836720 hasConcept C41895202 @default.
- W4309836720 hasConcept C66905080 @default.
- W4309836720 hasConcept C81669768 @default.
- W4309836720 hasConcept C8642999 @default.
- W4309836720 hasConcept C95623464 @default.
- W4309836720 hasConceptScore W4309836720C100660578 @default.
- W4309836720 hasConceptScore W4309836720C10485038 @default.
- W4309836720 hasConceptScore W4309836720C119857082 @default.
- W4309836720 hasConceptScore W4309836720C12267149 @default.
- W4309836720 hasConceptScore W4309836720C124101348 @default.
- W4309836720 hasConceptScore W4309836720C138885662 @default.
- W4309836720 hasConceptScore W4309836720C148483581 @default.
- W4309836720 hasConceptScore W4309836720C148524875 @default.
- W4309836720 hasConceptScore W4309836720C153180895 @default.
- W4309836720 hasConceptScore W4309836720C154945302 @default.
- W4309836720 hasConceptScore W4309836720C169258074 @default.
- W4309836720 hasConceptScore W4309836720C2777212361 @default.
- W4309836720 hasConceptScore W4309836720C41008148 @default.
- W4309836720 hasConceptScore W4309836720C41895202 @default.
- W4309836720 hasConceptScore W4309836720C66905080 @default.
- W4309836720 hasConceptScore W4309836720C81669768 @default.
- W4309836720 hasConceptScore W4309836720C8642999 @default.
- W4309836720 hasConceptScore W4309836720C95623464 @default.
- W4309836720 hasLocation W43098367201 @default.
- W4309836720 hasLocation W43098367202 @default.
- W4309836720 hasOpenAccess W4309836720 @default.
- W4309836720 hasPrimaryLocation W43098367201 @default.
- W4309836720 hasRelatedWork W3090711245 @default.
- W4309836720 hasRelatedWork W3096565539 @default.
- W4309836720 hasRelatedWork W3113421719 @default.
- W4309836720 hasRelatedWork W3188307501 @default.
- W4309836720 hasRelatedWork W4200551482 @default.
- W4309836720 hasRelatedWork W4295309597 @default.
- W4309836720 hasRelatedWork W4320494184 @default.
- W4309836720 hasRelatedWork W4322775603 @default.
- W4309836720 hasRelatedWork W4362544620 @default.
- W4309836720 hasRelatedWork W4382864507 @default.
- W4309836720 hasVolume "6" @default.
- W4309836720 isParatext "false" @default.
- W4309836720 isRetracted "false" @default.
- W4309836720 workType "article" @default.
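
The abstract above describes ranking n-grams by a discrepancy score and using the counts of the top-ranked n-grams as model features. The record does not give the score's formula, so the following is only a minimal sketch under a stated assumption: the score is taken here to be the n-gram's relative frequency in the minority class minus its relative frequency in the majority class. All function names and the sample ticket texts are hypothetical, and the random forest and hyperparameter search steps are not shown.

```python
from collections import Counter


def ngrams(text, n=2):
    """Split text on whitespace and return its word n-grams."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def relative_frequencies(texts, n=2):
    """Relative frequency of each n-gram across a collection of texts."""
    counts = Counter()
    for text in texts:
        counts.update(ngrams(text, n))
    total = sum(counts.values())
    return {gram: c / total for gram, c in counts.items()} if total else {}


def discrepancy_scores(minority_texts, majority_texts, n=2):
    """ASSUMPTION: score = minority relative frequency - majority relative
    frequency. The source abstract does not define the actual score."""
    minority = relative_frequencies(minority_texts, n)
    majority = relative_frequencies(majority_texts, n)
    grams = set(minority) | set(majority)
    return {g: minority.get(g, 0.0) - majority.get(g, 0.0) for g in grams}


def top_ngrams(scores, k):
    """The k highest-scoring n-grams; ties are broken alphabetically."""
    return sorted(scores, key=lambda g: (-scores[g], g))[:k]


def featurize(text, selected, n=2):
    """Frequency counts of the selected n-grams in one text."""
    counts = Counter(ngrams(text, n))
    return [counts[g] for g in selected]


# Hypothetical ticket texts: the minority class stands in for "over-delivered".
minority_texts = ["extra work done free", "extra work done again"]
majority_texts = [
    "ticket closed as agreed",
    "ticket closed on time",
    "ticket closed as agreed",
]

scores = discrepancy_scores(minority_texts, majority_texts)
selected = top_ngrams(scores, 2)
features = featurize("extra work done free", selected)
```

The resulting feature vectors could then be passed to any classifier; the abstract names a random forest tuned by randomized search, optimizing recall subject to a minimum precision.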