Matches in SemOpenAlex for { <https://semopenalex.org/work/W4251502010> ?p ?o ?g. }
- W4251502010 abstract "Abstract Background: The advent of metagenomic sequencing provides microbial abundance patterns that can be leveraged for sample origin prediction. Supervised machine learning classification approaches have been reported to predict sample origin accurately when the origin has been previously sampled. Using metagenomic datasets provided by the 2019 CAMDA challenge, we evaluated the influence of technical, analytical and machine learning approaches for result interpretation and source prediction of new origins. Results: Comparison between 16S rRNA amplicon and shotgun sequencing approaches as well as metagenomic analytical tools showed differences in measured microbial abundance of the same samples, especially for organisms present at low abundance. Shotgun sequence data analyzed using Kraken2 and Bracken taxonomic annotation had higher detection sensitivity than did other methods. As classification models are limited to labeling previously trained origins, we proposed an alternative approach using Lasso-regularized multivariate regression to predict geographic coordinates for comparison. In both models, the prediction errors were much higher in Leave-1-city-out than in 10-fold cross validation, the former of which more realistically reflected the greater difficulty of accurately predicting samples from new origins than from pre-trained origins. The challenge was further confirmed using mystery samples obtained from new origins. Overall, prediction performances between regression and classification models, as measured by mean squared error, were comparable on mystery samples. Due to higher prediction errors for samples from new origins, we provided an additional strategy based on prediction ambiguity to infer whether a sample is from a new origin for practical applications. Lastly, we showed increased prediction error when data from a different sequencing protocol were included as training data.
Conclusions: Here we highlighted the capacity of predicting sample origin accurately with pre-trained origins and the challenge of predicting new origins through both regression and classification models. Overall, the work provided a summary evaluation of sequencing techniques, protocol, taxonomic analytical approaches, and machine learning approaches to inform future designs in metagenomic prediction of sample origin." @default.
- W4251502010 created "2022-05-12" @default.
- W4251502010 creator A5030973241 @default.
- W4251502010 creator A5032832981 @default.
- W4251502010 date "2020-03-02" @default.
- W4251502010 modified "2023-10-16" @default.
- W4251502010 title "Systematic evaluation of supervised machine learning for sample origin prediction using metagenomic sequencing data" @default.
- W4251502010 doi "https://doi.org/10.21203/rs.3.rs-15502/v1" @default.
- W4251502010 hasPublicationYear "2020" @default.
- W4251502010 type Work @default.
- W4251502010 citedByCount "0" @default.
- W4251502010 crossrefType "posted-content" @default.
- W4251502010 hasAuthorship W4251502010A5030973241 @default.
- W4251502010 hasAuthorship W4251502010A5032832981 @default.
- W4251502010 hasBestOaLocation W42515020101 @default.
- W4251502010 hasConcept C104317684 @default.
- W4251502010 hasConcept C105795698 @default.
- W4251502010 hasConcept C119857082 @default.
- W4251502010 hasConcept C124101348 @default.
- W4251502010 hasConcept C136764020 @default.
- W4251502010 hasConcept C15151743 @default.
- W4251502010 hasConcept C154945302 @default.
- W4251502010 hasConcept C161584116 @default.
- W4251502010 hasConcept C185592680 @default.
- W4251502010 hasConcept C198531522 @default.
- W4251502010 hasConcept C33923547 @default.
- W4251502010 hasConcept C37616216 @default.
- W4251502010 hasConcept C41008148 @default.
- W4251502010 hasConcept C43617362 @default.
- W4251502010 hasConcept C55493867 @default.
- W4251502010 hasConcept C83546350 @default.
- W4251502010 hasConcept C86803240 @default.
- W4251502010 hasConceptScore W4251502010C104317684 @default.
- W4251502010 hasConceptScore W4251502010C105795698 @default.
- W4251502010 hasConceptScore W4251502010C119857082 @default.
- W4251502010 hasConceptScore W4251502010C124101348 @default.
- W4251502010 hasConceptScore W4251502010C136764020 @default.
- W4251502010 hasConceptScore W4251502010C15151743 @default.
- W4251502010 hasConceptScore W4251502010C154945302 @default.
- W4251502010 hasConceptScore W4251502010C161584116 @default.
- W4251502010 hasConceptScore W4251502010C185592680 @default.
- W4251502010 hasConceptScore W4251502010C198531522 @default.
- W4251502010 hasConceptScore W4251502010C33923547 @default.
- W4251502010 hasConceptScore W4251502010C37616216 @default.
- W4251502010 hasConceptScore W4251502010C41008148 @default.
- W4251502010 hasConceptScore W4251502010C43617362 @default.
- W4251502010 hasConceptScore W4251502010C55493867 @default.
- W4251502010 hasConceptScore W4251502010C83546350 @default.
- W4251502010 hasConceptScore W4251502010C86803240 @default.
- W4251502010 hasLocation W42515020101 @default.
- W4251502010 hasLocation W42515020102 @default.
- W4251502010 hasOpenAccess W4251502010 @default.
- W4251502010 hasPrimaryLocation W42515020101 @default.
- W4251502010 hasRelatedWork W10576317 @default.
- W4251502010 hasRelatedWork W11562254 @default.
- W4251502010 hasRelatedWork W13692438 @default.
- W4251502010 hasRelatedWork W14506204 @default.
- W4251502010 hasRelatedWork W4528552 @default.
- W4251502010 hasRelatedWork W5673233 @default.
- W4251502010 hasRelatedWork W5755083 @default.
- W4251502010 hasRelatedWork W7597497 @default.
- W4251502010 hasRelatedWork W8610196 @default.
- W4251502010 hasRelatedWork W9303900 @default.
- W4251502010 isParatext "false" @default.
- W4251502010 isRetracted "false" @default.
- W4251502010 workType "article" @default.
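
The abstract in this record contrasts Leave-1-city-out with 10-fold cross validation. As a rough illustration of why the two differ, the sketch below builds both kinds of split in plain Python. It is not the authors' implementation: the function names `leave_one_city_out` and `k_fold` and the city labels are hypothetical, chosen only for this example.

```python
from collections import defaultdict


def leave_one_city_out(cities):
    """Yield (held_out_city, train_idx, test_idx), one split per distinct city.

    Every sample from the held-out city goes to the test set, so the model
    never sees that origin during training.
    """
    by_city = defaultdict(list)
    for idx, city in enumerate(cities):
        by_city[city].append(idx)
    for held_out, test_idx in by_city.items():
        train_idx = [i for c, idxs in by_city.items() if c != held_out for i in idxs]
        yield held_out, train_idx, test_idx


def k_fold(n_samples, k):
    """Yield (train_idx, test_idx) for plain k-fold splits that ignore city labels.

    Samples from the same city can land on both sides of a split, so every
    test origin has usually been seen in training.
    """
    for fold in range(k):
        test_idx = list(range(fold, n_samples, k))
        held = set(test_idx)
        train_idx = [i for i in range(n_samples) if i not in held]
        yield train_idx, test_idx


# Hypothetical labels: six samples from three made-up cities.
cities = ["A", "A", "B", "B", "C", "C"]
city_splits = list(leave_one_city_out(cities))
fold_splits = list(k_fold(len(cities), 3))
```

In the city-wise splits, no training index shares a city with the test set, whereas the k-fold splits mix cities freely. That is consistent with the abstract's observation that errors were much higher under Leave-1-city-out.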