Matches in SemOpenAlex for { <https://semopenalex.org/work/W2898204126> ?p ?o ?g. }
- W2898204126 abstract "The identification of body site-specific microbial biomarkers and their use for classification tasks have promising applications in medicine, microbial ecology, and forensics. Previous studies have characterized site-specific microbiota and shown that sample origin can be accurately predicted by microbial content. However, these studies were usually restricted to single datasets with consistent experimental methods and conditions, as well as comparatively small sample numbers. The effects of study-specific biases and statistical power on classification performance and biomarker identification thus remain poorly understood. Furthermore, reliable detection in mixtures of different body sites or with noise from environmental contamination has rarely been investigated thus far. Finally, the impact of ecological associations between microbes on biomarker discovery was usually not considered in previous work. Here we present the analysis of one of the largest cross-study sequencing datasets of microbial communities from human body sites (15,082 samples from 57 publicly available studies). We show that training a Random Forest Classifier on this aggregated dataset increases prediction performance for body sites by 35% compared to a single-study classifier. Using simulated datasets, we further demonstrate that the source of different microbial contributions in mixtures of different body sites or with soil can be detected starting at 1% of the total microbial community. We apply a biomarker selection method that excludes indirect environmental associations driven by microbe-microbe associations, yielding a parsimonious set of highly predictive taxa including novel biomarkers and excluding many previously reported taxa. We find a considerable fraction of unclassified biomarkers (“microbial dark matter”) and observe that negatively associated taxa have a surprisingly high impact on classification performance. We further detect a significant enrichment of rod-shaped, motile, and sporulating taxa for feces biomarkers, consistent with a highly competitive environment. Our machine learning model shows strong body site classification performance, both in single-source samples and mixtures, making it promising for tasks requiring high accuracy, such as forensic applications. We report a core set of ecologically informed biomarkers, inferred across a wide range of experimental protocols and conditions, providing the most concise, general, and least biased overview of body site-associated microbes to date." @default.
- W2898204126 created "2018-11-02" @default.
- W2898204126 creator A5000334281 @default.
- W2898204126 creator A5034959018 @default.
- W2898204126 creator A5043584296 @default.
- W2898204126 creator A5067475853 @default.
- W2898204126 creator A5072646321 @default.
- W2898204126 date "2018-10-24" @default.
- W2898204126 modified "2023-10-01" @default.
- W2898204126 title "Ecologically informed microbial biomarkers and accurate classification of mixed and unmixed samples in an extensive cross-study of human body sites" @default.
- W2898204126 cites W1512978277 @default.
- W2898204126 cites W1811186957 @default.
- W2898204126 cites W1964582198 @default.
- W2898204126 cites W1979672417 @default.
- W2898204126 cites W1980764720 @default.
- W2898204126 cites W1989889539 @default.
- W2898204126 cites W2013371150 @default.
- W2898204126 cites W2016123152 @default.
- W2898204126 cites W2023695505 @default.
- W2898204126 cites W2031611770 @default.
- W2898204126 cites W2032230795 @default.
- W2898204126 cites W2040539560 @default.
- W2898204126 cites W2042833415 @default.
- W2898204126 cites W2047014799 @default.
- W2898204126 cites W2050411378 @default.
- W2898204126 cites W2051163560 @default.
- W2898204126 cites W2054407804 @default.
- W2898204126 cites W2066011810 @default.
- W2898204126 cites W2086455456 @default.
- W2898204126 cites W2089509878 @default.
- W2898204126 cites W2104266030 @default.
- W2898204126 cites W2106479432 @default.
- W2898204126 cites W2110065044 @default.
- W2898204126 cites W2115701239 @default.
- W2898204126 cites W2116041602 @default.
- W2898204126 cites W2116218627 @default.
- W2898204126 cites W2118996188 @default.
- W2898204126 cites W2120033072 @default.
- W2898204126 cites W2124927293 @default.
- W2898204126 cites W2125826054 @default.
- W2898204126 cites W2128769815 @default.
- W2898204126 cites W2130725058 @default.
- W2898204126 cites W2131186249 @default.
- W2898204126 cites W2133856765 @default.
- W2898204126 cites W2141152740 @default.
- W2898204126 cites W2160464661 @default.
- W2898204126 cites W2162088497 @default.
- W2898204126 cites W2164308071 @default.
- W2898204126 cites W2166123725 @default.
- W2898204126 cites W2170951896 @default.
- W2898204126 cites W2171406218 @default.
- W2898204126 cites W2173732482 @default.
- W2898204126 cites W2473355215 @default.
- W2898204126 cites W2506843120 @default.
- W2898204126 cites W2582499436 @default.
- W2898204126 cites W2748559568 @default.
- W2898204126 cites W2756958651 @default.
- W2898204126 cites W2771045365 @default.
- W2898204126 cites W2911964244 @default.
- W2898204126 cites W4241167997 @default.
- W2898204126 cites W8935401 @default.
- W2898204126 doi "https://doi.org/10.1186/s40168-018-0565-6" @default.
- W2898204126 hasPubMedCentralId "https://www.ncbi.nlm.nih.gov/pmc/articles/6201589" @default.
- W2898204126 hasPubMedId "https://pubmed.ncbi.nlm.nih.gov/30355348" @default.
- W2898204126 hasPublicationYear "2018" @default.
- W2898204126 type Work @default.
- W2898204126 sameAs 2898204126 @default.
- W2898204126 citedByCount "21" @default.
- W2898204126 countsByYear W28982041262019 @default.
- W2898204126 countsByYear W28982041262020 @default.
- W2898204126 countsByYear W28982041262021 @default.
- W2898204126 countsByYear W28982041262022 @default.
- W2898204126 countsByYear W28982041262023 @default.
- W2898204126 crossrefType "journal-article" @default.
- W2898204126 hasAuthorship W2898204126A5000334281 @default.
- W2898204126 hasAuthorship W2898204126A5034959018 @default.
- W2898204126 hasAuthorship W2898204126A5043584296 @default.
- W2898204126 hasAuthorship W2898204126A5067475853 @default.
- W2898204126 hasAuthorship W2898204126A5072646321 @default.
- W2898204126 hasBestOaLocation W28982041261 @default.
- W2898204126 hasConcept C104317684 @default.
- W2898204126 hasConcept C116834253 @default.
- W2898204126 hasConcept C119857082 @default.
- W2898204126 hasConcept C124535831 @default.
- W2898204126 hasConcept C15151743 @default.
- W2898204126 hasConcept C154945302 @default.
- W2898204126 hasConcept C169258074 @default.
- W2898204126 hasConcept C18903297 @default.
- W2898204126 hasConcept C2781197716 @default.
- W2898204126 hasConcept C41008148 @default.
- W2898204126 hasConcept C46111723 @default.
- W2898204126 hasConcept C523546767 @default.
- W2898204126 hasConcept C54355233 @default.
- W2898204126 hasConcept C69562835 @default.
- W2898204126 hasConcept C70721500 @default.
- W2898204126 hasConcept C81407943 @default.
- W2898204126 hasConcept C86803240 @default.
- W2898204126 hasConcept C95623464 @default.
- W2898204126 hasConceptScore W2898204126C104317684 @default.
- W2898204126 hasConceptScore W2898204126C116834253 @default.