Matches in SemOpenAlex for { <https://semopenalex.org/work/W4313598914> ?p ?o ?g. }
- W4313598914 endingPage "e1010820" @default.
- W4313598914 startingPage "e1010820" @default.
- W4313598914 abstract "In recent years, unsupervised analysis of microbiome data, such as microbial network analysis and clustering, has increased in popularity. Many new statistical and computational methods have been proposed for these tasks. This multiplicity of analysis strategies poses a challenge for researchers, who are often unsure which method(s) to use and might be tempted to try different methods on their dataset to look for the best ones. However, if only the best results are selectively reported, this may cause over-optimism: the best method is overly fitted to the specific dataset, and the results might be non-replicable on validation data. Such effects will ultimately hinder research progress. Yet so far, these topics have been given little attention in the context of unsupervised microbiome analysis. In our illustrative study, we aim to quantify over-optimism effects in this context. We model the approach of a hypothetical microbiome researcher who undertakes four unsupervised research tasks: clustering of bacterial genera, hub detection in microbial networks, differential microbial network analysis, and clustering of samples. While these tasks are unsupervised, the researcher might still have certain expectations as to what constitutes interesting results. We translate these expectations into concrete evaluation criteria that the hypothetical researcher might want to optimize. We then randomly split an exemplary dataset from the American Gut Project into discovery and validation sets multiple times. For each research task, multiple method combinations (e.g., methods for data normalization, network generation, and/or clustering) are tried on the discovery data, and the combination that yields the best result according to the evaluation criterion is chosen. While the hypothetical researcher might only report this result, we also apply the best method combination to the validation dataset. The results are then compared between discovery and validation data. In all four research tasks, there are notable over-optimism effects; the results on the validation data set are worse compared to the discovery data, averaged over multiple random splits into discovery/validation data. Our study thus highlights the importance of validation and replication in microbiome analysis to obtain reliable results and demonstrates that the issue of over-optimism goes beyond the context of statistical testing and fishing for significance." @default.
- W4313598914 created "2023-01-06" @default.
- W4313598914 creator A5010487429 @default.
- W4313598914 creator A5020021560 @default.
- W4313598914 creator A5040820559 @default.
- W4313598914 creator A5082908973 @default.
- W4313598914 creator A5085691045 @default.
- W4313598914 date "2023-01-06" @default.
- W4313598914 modified "2023-10-14" @default.
- W4313598914 title "Over-optimism in unsupervised microbiome analysis: Insights from network learning and clustering" @default.
- W4313598914 cites W1493454437 @default.
- W4313598914 cites W1897139626 @default.
- W4313598914 cites W1968105193 @default.
- W4313598914 cites W1987184081 @default.
- W4313598914 cites W1987971958 @default.
- W4313598914 cites W2005852722 @default.
- W4313598914 cites W2028221693 @default.
- W4313598914 cites W2032230795 @default.
- W4313598914 cites W2044712133 @default.
- W4313598914 cites W2047020168 @default.
- W4313598914 cites W2047940964 @default.
- W4313598914 cites W2052969143 @default.
- W4313598914 cites W2053801811 @default.
- W4313598914 cites W2056944867 @default.
- W4313598914 cites W2083717261 @default.
- W4313598914 cites W2111358533 @default.
- W4313598914 cites W2112408821 @default.
- W4313598914 cites W2118629634 @default.
- W4313598914 cites W2123402141 @default.
- W4313598914 cites W2131681506 @default.
- W4313598914 cites W2135303340 @default.
- W4313598914 cites W2144981148 @default.
- W4313598914 cites W2152239989 @default.
- W4313598914 cites W2161498332 @default.
- W4313598914 cites W2164005910 @default.
- W4313598914 cites W2242390630 @default.
- W4313598914 cites W2281227836 @default.
- W4313598914 cites W2322006099 @default.
- W4313598914 cites W2342543340 @default.
- W4313598914 cites W2503647350 @default.
- W4313598914 cites W2547512372 @default.
- W4313598914 cites W2562137041 @default.
- W4313598914 cites W2611957242 @default.
- W4313598914 cites W2752667320 @default.
- W4313598914 cites W2754041971 @default.
- W4313598914 cites W2762425175 @default.
- W4313598914 cites W2775152143 @default.
- W4313598914 cites W2779812635 @default.
- W4313598914 cites W2785509713 @default.
- W4313598914 cites W2794407684 @default.
- W4313598914 cites W2802287725 @default.
- W4313598914 cites W2804854320 @default.
- W4313598914 cites W2805044645 @default.
- W4313598914 cites W2885319825 @default.
- W4313598914 cites W2908338457 @default.
- W4313598914 cites W2913835110 @default.
- W4313598914 cites W2914251418 @default.
- W4313598914 cites W2940897462 @default.
- W4313598914 cites W2945015580 @default.
- W4313598914 cites W2970428830 @default.
- W4313598914 cites W3009013481 @default.
- W4313598914 cites W3013926312 @default.
- W4313598914 cites W3016700505 @default.
- W4313598914 cites W3088303111 @default.
- W4313598914 cites W3098834468 @default.
- W4313598914 cites W3106625866 @default.
- W4313598914 cites W3110476256 @default.
- W4313598914 cites W3115999643 @default.
- W4313598914 cites W3131993763 @default.
- W4313598914 cites W3143324940 @default.
- W4313598914 cites W3153302987 @default.
- W4313598914 cites W3164872214 @default.
- W4313598914 cites W3179596469 @default.
- W4313598914 cites W3201611947 @default.
- W4313598914 cites W4205364647 @default.
- W4313598914 cites W4206080966 @default.
- W4313598914 cites W4206912861 @default.
- W4313598914 cites W4213163295 @default.
- W4313598914 cites W4214929190 @default.
- W4313598914 cites W4230931443 @default.
- W4313598914 cites W4235169531 @default.
- W4313598914 doi "https://doi.org/10.1371/journal.pcbi.1010820" @default.
- W4313598914 hasPubMedId "https://pubmed.ncbi.nlm.nih.gov/36608142" @default.
- W4313598914 hasPublicationYear "2023" @default.
- W4313598914 type Work @default.
- W4313598914 citedByCount "1" @default.
- W4313598914 countsByYear W43135989142023 @default.
- W4313598914 crossrefType "journal-article" @default.
- W4313598914 hasAuthorship W4313598914A5010487429 @default.
- W4313598914 hasAuthorship W4313598914A5020021560 @default.
- W4313598914 hasAuthorship W4313598914A5040820559 @default.
- W4313598914 hasAuthorship W4313598914A5082908973 @default.
- W4313598914 hasAuthorship W4313598914A5085691045 @default.
- W4313598914 hasBestOaLocation W43135989141 @default.
- W4313598914 hasConcept C119857082 @default.
- W4313598914 hasConcept C124101348 @default.
- W4313598914 hasConcept C136886441 @default.
- W4313598914 hasConcept C143121216 @default.
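
The discovery/validation procedure summarized in the abstract of this record can be illustrated with a toy simulation. The sketch below is not the authors' code and uses no microbiome data: the dataset is pure Gaussian noise, each "method" is simply a feature index, and the evaluation criterion (absolute mean of that feature) is a hypothetical stand-in. The function name `over_optimism_gap` and all of its parameters are assumptions made for illustration. It only shows why selecting the best of many candidates on the discovery half tends to look better there than on the validation half.

```python
import random
import statistics


def criterion(rows, method):
    """Hypothetical evaluation criterion: absolute mean of one feature."""
    return abs(statistics.fmean(row[method] for row in rows))


def over_optimism_gap(n_samples=100, n_methods=20, n_splits=200, seed=0):
    """Return (mean discovery score, mean validation score) of the selected method."""
    rng = random.Random(seed)
    # Pure-noise data: no "method" (here, a feature index) is truly better.
    data = [[rng.gauss(0.0, 1.0) for _ in range(n_methods)]
            for _ in range(n_samples)]
    discovery_scores, validation_scores = [], []
    for _ in range(n_splits):
        # Randomly split the same dataset into discovery and validation halves.
        shuffled = data[:]
        rng.shuffle(shuffled)
        half = n_samples // 2
        discovery, validation = shuffled[:half], shuffled[half:]
        # Try every method on the discovery data and keep the best one.
        best = max(range(n_methods), key=lambda m: criterion(discovery, m))
        discovery_scores.append(criterion(discovery, best))
        # Apply the selected method, unchanged, to the validation data.
        validation_scores.append(criterion(validation, best))
    return statistics.fmean(discovery_scores), statistics.fmean(validation_scores)


disc, val = over_optimism_gap()
print(f"discovery: {disc:.3f}  validation: {val:.3f}  gap: {disc - val:.3f}")
```

Averaging over many random splits, as in the abstract, keeps the comparison from depending on a single lucky or unlucky split. In this toy setting, the gap between the two averages is entirely a selection effect, since the data contain no real structure.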