Matches in SemOpenAlex for { <https://semopenalex.org/work/W4283690880> ?p ?o ?g. }
- W4283690880 abstract "In recent years, unsupervised analysis of microbiome data, such as microbial network analysis and clustering, has increased in popularity. Many new statistical and computational methods have been proposed for these tasks. This multiplicity of analysis strategies poses a challenge for researchers, who are often unsure which method(s) to use and might be tempted to try different methods on their dataset to look for the “best” ones. However, if only the best results are selectively reported, this may cause over-optimism: the “best” method is overly fitted to the specific dataset, and the results might be non-replicable on validation data. Such effects will ultimately hinder research progress. Yet so far, these topics have been given little attention in the context of unsupervised microbiome analysis. In our illustrative study, we aim to quantify over-optimism effects in this context. We model the approach of a hypothetical microbiome researcher who undertakes three unsupervised research tasks: clustering of bacterial genera, hub detection in microbial networks, and differential microbial network analysis. While these tasks are unsupervised, the researcher might still have certain expectations as to what constitutes interesting results. We translate these expectations into concrete evaluation criteria that the hypothetical researcher might want to optimize. We then randomly split an exemplary dataset from the American Gut Project into discovery and validation sets multiple times. For each research task, multiple method combinations (e.g., methods for data normalization, network generation, and/or clustering) are tried on the discovery data, and the combination that yields the best result according to the evaluation criterion is chosen. While the hypothetical researcher might only report this result, we also apply the “best” method combination to the validation dataset. The results are then compared between discovery and validation data. In all three research tasks, there are notable over-optimism effects; the results on the validation data set are worse compared to the discovery data, averaged over multiple random splits into discovery/validation data. Our study thus highlights the importance of validation and replication in microbiome analysis to obtain reliable results and demonstrates that the issue of over-optimism goes beyond the context of statistical testing and fishing for significance." @default.
- W4283690880 created "2022-06-30" @default.
- W4283690880 creator A5015152016 @default.
- W4283690880 creator A5020021560 @default.
- W4283690880 creator A5040820559 @default.
- W4283690880 creator A5082908973 @default.
- W4283690880 creator A5085691045 @default.
- W4283690880 date "2022-06-28" @default.
- W4283690880 modified "2023-09-27" @default.
- W4283690880 title "Over-optimism in unsupervised microbiome analysis: Insights from network learning and clustering" @default.
- W4283690880 cites W2005852722 @default.
- W4283690880 cites W2047940964 @default.
- W4283690880 cites W2056944867 @default.
- W4283690880 cites W2083717261 @default.
- W4283690880 cites W2112408821 @default.
- W4283690880 cites W2123402141 @default.
- W4283690880 cites W2131681506 @default.
- W4283690880 cites W2144981148 @default.
- W4283690880 cites W2152239989 @default.
- W4283690880 cites W2161498332 @default.
- W4283690880 cites W2164005910 @default.
- W4283690880 cites W2281227836 @default.
- W4283690880 cites W2339856074 @default.
- W4283690880 cites W2547512372 @default.
- W4283690880 cites W2562137041 @default.
- W4283690880 cites W2611957242 @default.
- W4283690880 cites W2621415147 @default.
- W4283690880 cites W2754041971 @default.
- W4283690880 cites W2762425175 @default.
- W4283690880 cites W2779812635 @default.
- W4283690880 cites W2794407684 @default.
- W4283690880 cites W2794904294 @default.
- W4283690880 cites W2804854320 @default.
- W4283690880 cites W2805044645 @default.
- W4283690880 cites W2913835110 @default.
- W4283690880 cites W2914251418 @default.
- W4283690880 cites W2945015580 @default.
- W4283690880 cites W3013926312 @default.
- W4283690880 cites W3016700505 @default.
- W4283690880 cites W3098834468 @default.
- W4283690880 cites W3106625866 @default.
- W4283690880 cites W3115999643 @default.
- W4283690880 cites W3127723704 @default.
- W4283690880 cites W3131993763 @default.
- W4283690880 cites W3135625811 @default.
- W4283690880 cites W3143324940 @default.
- W4283690880 cites W3153302987 @default.
- W4283690880 cites W3161876650 @default.
- W4283690880 cites W4206912861 @default.
- W4283690880 cites W4213163295 @default.
- W4283690880 cites W4235169531 @default.
- W4283690880 cites W4282834942 @default.
- W4283690880 doi "https://doi.org/10.1101/2022.06.24.497500" @default.
- W4283690880 hasPublicationYear "2022" @default.
- W4283690880 type Work @default.
- W4283690880 citedByCount "1" @default.
- W4283690880 countsByYear W42836908802022 @default.
- W4283690880 crossrefType "posted-content" @default.
- W4283690880 hasAuthorship W4283690880A5015152016 @default.
- W4283690880 hasAuthorship W4283690880A5020021560 @default.
- W4283690880 hasAuthorship W4283690880A5040820559 @default.
- W4283690880 hasAuthorship W4283690880A5082908973 @default.
- W4283690880 hasAuthorship W4283690880A5085691045 @default.
- W4283690880 hasBestOaLocation W42836908801 @default.
- W4283690880 hasConcept C119857082 @default.
- W4283690880 hasConcept C136886441 @default.
- W4283690880 hasConcept C143121216 @default.
- W4283690880 hasConcept C144024400 @default.
- W4283690880 hasConcept C151730666 @default.
- W4283690880 hasConcept C154945302 @default.
- W4283690880 hasConcept C15744967 @default.
- W4283690880 hasConcept C19165224 @default.
- W4283690880 hasConcept C2522767166 @default.
- W4283690880 hasConcept C2779343474 @default.
- W4283690880 hasConcept C2780586970 @default.
- W4283690880 hasConcept C41008148 @default.
- W4283690880 hasConcept C60644358 @default.
- W4283690880 hasConcept C73555534 @default.
- W4283690880 hasConcept C77805123 @default.
- W4283690880 hasConcept C8038995 @default.
- W4283690880 hasConcept C86803240 @default.
- W4283690880 hasConceptScore W4283690880C119857082 @default.
- W4283690880 hasConceptScore W4283690880C136886441 @default.
- W4283690880 hasConceptScore W4283690880C143121216 @default.
- W4283690880 hasConceptScore W4283690880C144024400 @default.
- W4283690880 hasConceptScore W4283690880C151730666 @default.
- W4283690880 hasConceptScore W4283690880C154945302 @default.
- W4283690880 hasConceptScore W4283690880C15744967 @default.
- W4283690880 hasConceptScore W4283690880C19165224 @default.
- W4283690880 hasConceptScore W4283690880C2522767166 @default.
- W4283690880 hasConceptScore W4283690880C2779343474 @default.
- W4283690880 hasConceptScore W4283690880C2780586970 @default.
- W4283690880 hasConceptScore W4283690880C41008148 @default.
- W4283690880 hasConceptScore W4283690880C60644358 @default.
- W4283690880 hasConceptScore W4283690880C73555534 @default.
- W4283690880 hasConceptScore W4283690880C77805123 @default.
- W4283690880 hasConceptScore W4283690880C8038995 @default.
- W4283690880 hasConceptScore W4283690880C86803240 @default.
- W4283690880 hasLocation W42836908801 @default.
- W4283690880 hasLocation W42836908802 @default.
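
The discovery/validation procedure described in the abstract above can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the data are pure Gaussian noise rather than American Gut Project samples, each hypothetical "method" simply inspects one feature column, and the evaluation criterion (absolute mean of that column) is an assumption chosen only to keep the example short. All function names (`split_once`, `criterion`, `over_optimism`) are invented for illustration.

```python
import random
import statistics


def split_once(samples, rng):
    """Randomly split the samples into equal-sized discovery and validation halves."""
    shuffled = samples[:]
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def criterion(samples, method):
    """Hypothetical evaluation criterion: absolute mean of the feature that
    the stand-in 'method' looks at (larger looks more 'interesting')."""
    return abs(statistics.fmean([row[method] for row in samples]))


def over_optimism(samples, n_methods, n_splits, seed=0):
    """For each random split, pick the method that scores best on the discovery
    half, then record how much its score drops on the validation half."""
    rng = random.Random(seed)
    gaps = []
    for _ in range(n_splits):
        discovery, validation = split_once(samples, rng)
        best = max(range(n_methods), key=lambda m: criterion(discovery, m))
        gaps.append(criterion(discovery, best) - criterion(validation, best))
    return gaps


# Assumed toy data: 100 samples, 20 noise features with no real signal.
data_rng = random.Random(42)
samples = [[data_rng.gauss(0, 1) for _ in range(20)] for _ in range(100)]

gaps = over_optimism(samples, n_methods=20, n_splits=50)
mean_gap = statistics.fmean(gaps)
print(f"mean discovery-minus-validation gap: {mean_gap:.3f}")
```

Because the toy data contain no signal, any positive average gap comes purely from selecting the best-looking method on the discovery half, which is the kind of over-optimism effect the abstract describes averaging over multiple random splits.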