Matches in SemOpenAlex for { <https://semopenalex.org/work/W3104465172> ?p ?o ?g. }
- W3104465172 abstract "Microbiomes are complex ecological systems that play crucial roles in understanding natural phenomena from human disease to climate change. Especially in human gut microbiome studies, where collecting clinical samples can be arduous, the number of taxa considered in any one study often exceeds the number of samples ten- to one hundred-fold. This discrepancy decreases the power of studies to identify meaningful differences between samples, increases the likelihood of false positive results, and subsequently limits reproducibility. Despite the vast collections of microbiome data already available, biome-specific patterns of microbial structure are not currently leveraged to inform studies. Instead, most microbiome survey studies focus on differential abundance testing per taxon in pursuit of specific biomarkers for a given phenotype. This methodology assumes differences in individual species, genera, or families can be used to distinguish between microbial communities and ignores community-level response. In this paper, we propose to leverage public microbiome databases to shift the analysis paradigm from a focus on taxonomic counts to a focus on comprehensive properties that more completely characterize microbial community members’ function and environmental relationships. We learn these properties by applying an embedding algorithm to quantify taxa co-occurrence patterns in over 18,000 samples from the American Gut Project (AGP) microbiome crowdsourcing effort. The resulting set of embeddings transforms human gut microbiome data from thousands of taxa counts to a latent variable landscape of only one hundred “properties”, or contextual relationships. 
We then compare the predictive power of models trained using properties, normalized taxonomic count data, and another commonly used dimensionality reduction method, Principal Component Analysis, in categorizing samples from individuals with inflammatory bowel disease (IBD) and healthy controls. We show that predictive models trained using property data are the most accurate, robust, and generalizable, and that property-based models can be trained on one dataset and deployed on another with positive results. Furthermore, we find that these properties can be interpreted in the context of current knowledge; properties correlate significantly with known metabolic pathways, and distances between taxa in “property space” roughly correlate with their phylogenetic distances. Using these properties, we are able to extract known and new bacterial metabolic pathways associated with inflammatory bowel disease across two completely independent studies. More broadly, this paper explores a reframing of the microbiome analysis mindset, from taxonomic counts to comprehensive community-level properties. By providing a set of pre-trained embeddings, we allow any V4 16S amplicon study to leverage and apply the publicly informed properties presented to increase the statistical power, reproducibility, and generalizability of analysis." @default.
- W3104465172 created "2020-11-23" @default.
- W3104465172 creator A5002060091 @default.
- W3104465172 creator A5068423857 @default.
- W3104465172 date "2019-09-05" @default.
- W3104465172 modified "2023-09-25" @default.
- W3104465172 title "Decoding the Language of Microbiomes: Leveraging Patterns in 16S Public Data using Word-Embedding Techniques and Applications in Inflammatory Bowel Disease" @default.
- W3104465172 cites W1523266228 @default.
- W3104465172 cites W1965092590 @default.
- W3104465172 cites W1968052582 @default.
- W3104465172 cites W1971058409 @default.
- W3104465172 cites W2008771322 @default.
- W3104465172 cites W2016098250 @default.
- W3104465172 cites W2026006003 @default.
- W3104465172 cites W2032236023 @default.
- W3104465172 cites W2056279562 @default.
- W3104465172 cites W2064837899 @default.
- W3104465172 cites W2085284704 @default.
- W3104465172 cites W2085646557 @default.
- W3104465172 cites W2099856240 @default.
- W3104465172 cites W2111162388 @default.
- W3104465172 cites W2117029190 @default.
- W3104465172 cites W2128769815 @default.
- W3104465172 cites W2135785255 @default.
- W3104465172 cites W2139736670 @default.
- W3104465172 cites W2144981148 @default.
- W3104465172 cites W2163286994 @default.
- W3104465172 cites W2179438025 @default.
- W3104465172 cites W2232186956 @default.
- W3104465172 cites W2250539671 @default.
- W3104465172 cites W2250879510 @default.
- W3104465172 cites W2401404581 @default.
- W3104465172 cites W2555180552 @default.
- W3104465172 cites W2588997555 @default.
- W3104465172 cites W2592367925 @default.
- W3104465172 cites W2757588961 @default.
- W3104465172 cites W2776445964 @default.
- W3104465172 cites W2787055217 @default.
- W3104465172 cites W2795091386 @default.
- W3104465172 cites W2801146255 @default.
- W3104465172 cites W2802622649 @default.
- W3104465172 cites W2805009121 @default.
- W3104465172 cites W2806007080 @default.
- W3104465172 cites W2888613215 @default.
- W3104465172 cites W2901555018 @default.
- W3104465172 cites W2902183439 @default.
- W3104465172 cites W2903406988 @default.
- W3104465172 cites W2905645620 @default.
- W3104465172 cites W2910372325 @default.
- W3104465172 cites W2912729292 @default.
- W3104465172 cites W2918851553 @default.
- W3104465172 cites W2921959633 @default.
- W3104465172 cites W2938574745 @default.
- W3104465172 cites W2943169224 @default.
- W3104465172 cites W2946648915 @default.
- W3104465172 cites W2946715306 @default.
- W3104465172 cites W2947170763 @default.
- W3104465172 cites W2947640277 @default.
- W3104465172 cites W2948003108 @default.
- W3104465172 cites W2949346354 @default.
- W3104465172 cites W2949947154 @default.
- W3104465172 cites W2963702053 @default.
- W3104465172 cites W2963923670 @default.
- W3104465172 cites W4294216483 @default.
- W3104465172 doi "https://doi.org/10.1101/748152" @default.
- W3104465172 hasPublicationYear "2019" @default.
- W3104465172 type Work @default.
- W3104465172 sameAs 3104465172 @default.
- W3104465172 citedByCount "1" @default.
- W3104465172 countsByYear W31044651722021 @default.
- W3104465172 crossrefType "posted-content" @default.
- W3104465172 hasAuthorship W3104465172A5002060091 @default.
- W3104465172 hasAuthorship W3104465172A5068423857 @default.
- W3104465172 hasBestOaLocation W31044651721 @default.
- W3104465172 hasConcept C110872660 @default.
- W3104465172 hasConcept C143121216 @default.
- W3104465172 hasConcept C18903297 @default.
- W3104465172 hasConcept C189592816 @default.
- W3104465172 hasConcept C190944805 @default.
- W3104465172 hasConcept C2522767166 @default.
- W3104465172 hasConcept C2776321320 @default.
- W3104465172 hasConcept C41008148 @default.
- W3104465172 hasConcept C60644358 @default.
- W3104465172 hasConcept C70721500 @default.
- W3104465172 hasConcept C71640776 @default.
- W3104465172 hasConcept C86803240 @default.
- W3104465172 hasConcept C89920630 @default.
- W3104465172 hasConcept C91478284 @default.
- W3104465172 hasConceptScore W3104465172C110872660 @default.
- W3104465172 hasConceptScore W3104465172C143121216 @default.
- W3104465172 hasConceptScore W3104465172C18903297 @default.
- W3104465172 hasConceptScore W3104465172C189592816 @default.
- W3104465172 hasConceptScore W3104465172C190944805 @default.
- W3104465172 hasConceptScore W3104465172C2522767166 @default.
- W3104465172 hasConceptScore W3104465172C2776321320 @default.
- W3104465172 hasConceptScore W3104465172C41008148 @default.
- W3104465172 hasConceptScore W3104465172C60644358 @default.
- W3104465172 hasConceptScore W3104465172C70721500 @default.
- W3104465172 hasConceptScore W3104465172C71640776 @default.
- W3104465172 hasConceptScore W3104465172C86803240 @default.