Matches in SemOpenAlex for { <https://semopenalex.org/work/W4382196087> ?p ?o ?g. }
- W4382196087 endingPage "120253" @default.
- W4382196087 startingPage "120253" @default.
- W4382196087 abstract "Machine learning (ML) is increasingly used in cognitive, computational and clinical neuroscience. The reliable and efficient application of ML requires a sound understanding of its subtleties and limitations. Training ML models on datasets with imbalanced classes is a particularly common problem, and it can have severe consequences if not adequately addressed. With the neuroscience ML user in mind, this paper provides a didactic assessment of the class imbalance problem and illustrates its impact through systematic manipulation of data imbalance ratios in (i) simulated data and (ii) brain data recorded with electroencephalography (EEG), magnetoencephalography (MEG) and functional magnetic resonance imaging (fMRI). Our results illustrate how the widely used Accuracy (Acc) metric, which measures the overall proportion of successful predictions, yields misleadingly high performance as class imbalance increases. Because Acc weights the per-class ratios of correct predictions proportionally to class size, it largely disregards the performance on the minority class. A binary classification model that learns to systematically vote for the majority class will yield an artificially high decoding accuracy that directly reflects the imbalance between the two classes, rather than any genuine generalizable ability to discriminate between them. We show that other evaluation metrics, such as the Area Under the Curve (AUC) of the Receiver Operating Characteristic (ROC) and the less common Balanced Accuracy (BAcc) metric, defined as the arithmetic mean of sensitivity and specificity, provide more reliable performance evaluations for imbalanced data. Our findings also highlight the robustness of Random Forest (RF), and the benefits of using stratified cross-validation and hyperparameter optimization to tackle data imbalance. Critically, for neuroscience ML applications that seek to minimize overall classification error, we recommend the routine use of BAcc, which in the specific case of balanced data is equivalent to using standard Acc, and readily extends to multi-class settings. Importantly, we present a list of recommendations for dealing with imbalanced data, as well as open-source code to allow the neuroscience community to replicate and extend our observations and explore alternative approaches to coping with imbalanced data." @default.
- W4382196087 created "2023-06-28" @default.
- W4382196087 creator A5009196214 @default.
- W4382196087 creator A5029793290 @default.
- W4382196087 creator A5030563709 @default.
- W4382196087 creator A5034130184 @default.
- W4382196087 creator A5044191334 @default.
- W4382196087 creator A5048056141 @default.
- W4382196087 creator A5050169032 @default.
- W4382196087 creator A5051870567 @default.
- W4382196087 creator A5062242803 @default.
- W4382196087 creator A5062572752 @default.
- W4382196087 creator A5063302976 @default.
- W4382196087 creator A5063380287 @default.
- W4382196087 creator A5068477151 @default.
- W4382196087 creator A5073677867 @default.
- W4382196087 creator A5074484345 @default.
- W4382196087 creator A5077611371 @default.
- W4382196087 creator A5083394397 @default.
- W4382196087 creator A5071948436 @default.
- W4382196087 date "2023-08-01" @default.
- W4382196087 modified "2023-10-15" @default.
- W4382196087 title "Class imbalance should not throw you off balance: Choosing the right classifiers and performance metrics for brain decoding with imbalanced data" @default.
- W4382196087 cites W1824528708 @default.
- W4382196087 cites W1892018222 @default.
- W4382196087 cites W1912982817 @default.
- W4382196087 cites W1941659294 @default.
- W4382196087 cites W1986959058 @default.
- W4382196087 cites W1997319821 @default.
- W4382196087 cites W2003922371 @default.
- W4382196087 cites W2011301426 @default.
- W4382196087 cites W2015452969 @default.
- W4382196087 cites W2033184625 @default.
- W4382196087 cites W2053724458 @default.
- W4382196087 cites W2064959667 @default.
- W4382196087 cites W2072735345 @default.
- W4382196087 cites W2084187427 @default.
- W4382196087 cites W2089632738 @default.
- W4382196087 cites W2118320651 @default.
- W4382196087 cites W2118978333 @default.
- W4382196087 cites W2130915922 @default.
- W4382196087 cites W2142402086 @default.
- W4382196087 cites W2148143831 @default.
- W4382196087 cites W2150306018 @default.
- W4382196087 cites W2151591509 @default.
- W4382196087 cites W2151669316 @default.
- W4382196087 cites W2158485497 @default.
- W4382196087 cites W2162800060 @default.
- W4382196087 cites W2338318698 @default.
- W4382196087 cites W2407212869 @default.
- W4382196087 cites W2412809073 @default.
- W4382196087 cites W2562319768 @default.
- W4382196087 cites W2582043155 @default.
- W4382196087 cites W2738724892 @default.
- W4382196087 cites W2767106145 @default.
- W4382196087 cites W2781900029 @default.
- W4382196087 cites W2789758093 @default.
- W4382196087 cites W2800788706 @default.
- W4382196087 cites W2801253848 @default.
- W4382196087 cites W2803718659 @default.
- W4382196087 cites W2911964244 @default.
- W4382196087 cites W2914877698 @default.
- W4382196087 cites W2918408501 @default.
- W4382196087 cites W2936503027 @default.
- W4382196087 cites W2950202952 @default.
- W4382196087 cites W2962938532 @default.
- W4382196087 cites W2967348491 @default.
- W4382196087 cites W2978368159 @default.
- W4382196087 cites W2981111316 @default.
- W4382196087 cites W2989219518 @default.
- W4382196087 cites W3010894680 @default.
- W4382196087 cites W3029588370 @default.
- W4382196087 cites W3043374725 @default.
- W4382196087 cites W3045004532 @default.
- W4382196087 cites W3046749939 @default.
- W4382196087 cites W3049795607 @default.
- W4382196087 cites W3092041776 @default.
- W4382196087 cites W3116370331 @default.
- W4382196087 cites W3150635270 @default.
- W4382196087 cites W3163243254 @default.
- W4382196087 cites W3197459106 @default.
- W4382196087 cites W3203356073 @default.
- W4382196087 cites W4239510810 @default.
- W4382196087 cites W4376848438 @default.
- W4382196087 cites W3139562178 @default.
- W4382196087 doi "https://doi.org/10.1016/j.neuroimage.2023.120253" @default.
- W4382196087 hasPubMedId "https://pubmed.ncbi.nlm.nih.gov/37385392" @default.
- W4382196087 hasPublicationYear "2023" @default.
- W4382196087 type Work @default.
- W4382196087 citedByCount "0" @default.
- W4382196087 crossrefType "journal-article" @default.
- W4382196087 hasAuthorship W4382196087A5009196214 @default.
- W4382196087 hasAuthorship W4382196087A5029793290 @default.
- W4382196087 hasAuthorship W4382196087A5030563709 @default.
- W4382196087 hasAuthorship W4382196087A5034130184 @default.
- W4382196087 hasAuthorship W4382196087A5044191334 @default.
- W4382196087 hasAuthorship W4382196087A5048056141 @default.
- W4382196087 hasAuthorship W4382196087A5050169032 @default.
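
The abstract of W4382196087 describes Balanced Accuracy (BAcc) as the arithmetic mean of sensitivity and specificity, and notes that a classifier which always votes for the majority class gets a misleadingly high Accuracy (Acc). The following is a minimal sketch of that contrast in plain Python. It is not the authors' released code: the function names, the 90/10 class split, and the label coding (1 = minority class) are assumptions chosen only for illustration.

```python
def accuracy(y_true, y_pred):
    """Overall proportion of correct predictions (Acc)."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def balanced_accuracy(y_true, y_pred, positive=1):
    """Arithmetic mean of sensitivity and specificity (BAcc), binary case.

    Assumes both classes occur in y_true; otherwise a ratio is undefined.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    sensitivity = tp / (tp + fn)  # recall on the positive class
    specificity = tn / (tn + fp)  # recall on the negative class
    return (sensitivity + specificity) / 2


# Hypothetical imbalanced data: 90 majority (0) and 10 minority (1) trials.
y_true = [0] * 90 + [1] * 10
# A degenerate model that always predicts the majority class.
y_pred = [0] * 100

acc = accuracy(y_true, y_pred)
bacc = balanced_accuracy(y_true, y_pred)
print(f"Acc = {acc:.2f}, BAcc = {bacc:.2f}")
```

In this sketch Acc simply mirrors the class proportions, while BAcc stays at chance level because sensitivity on the minority class is zero. With equal class sizes the two metrics coincide, which matches the equivalence stated in the abstract.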