Matches in SemOpenAlex for { <https://semopenalex.org/work/W2007124761> ?p ?o ?g. }
Showing items 1 to 74 of 74 with 100 items per page.
- W2007124761 abstract "Motivation: A major challenge facing metagenomics is the development of tools for the characterization of functional and taxonomic content of vast amounts of short metagenome reads. The efficacy of clustering methods depends on the number of reads in the dataset, the read length and relative abundances of source genomes in the microbial community. Results: In this paper, we formulate an unsupervised naive Bayes multi-species, multi-dimensional mixture model for reads from a metagenome. We use the proposed model to cluster metagenomic reads by their species of origin and to characterize the abundance of each species. We model the distribution of word counts along a genome as a Gaussian for shorter, frequent words and as a Poisson for longer words that are rare. We employ either a mixture of Gaussians or mixture of Poissons to model reads within each bin. An additional reason to use these distributions is their flexibility and ease of parameter estimation. Such a paradigm characterizes the compositional heterogeneity of the words along a genome, signifying its genome signature. Further, we handle the high-dimensionality and sparsity associated with the data by grouping the set of words comprising the reads, resulting in a two-way mixture model. Finally, we derive an unsupervised Expectation Maximization algorithm for the models. Our method provides a general statistical framework for modeling metagenome reads. We demonstrate the accuracy and applicability of this method on simulated and real metagenomes. Our method can accurately cluster reads as short as 100 bp and estimate the species abundance as well. Our method outperforms LikelyBin, another unsupervised composition-based binning method for metagenomes, on datasets of varying abundances, divergences and read lengths." @default.
- W2007124761 created "2016-06-24" @default.
- W2007124761 creator A5020757869 @default.
- W2007124761 creator A5023786448 @default.
- W2007124761 date "2011-08-01" @default.
- W2007124761 modified "2023-09-26" @default.
- W2007124761 title "A two-way multi-dimensional mixture model for clustering metagenomic sequences" @default.
- W2007124761 cites W1840964494 @default.
- W2007124761 cites W1969017314 @default.
- W2007124761 cites W1993293753 @default.
- W2007124761 cites W2018821242 @default.
- W2007124761 cites W2047531378 @default.
- W2007124761 cites W2051093690 @default.
- W2007124761 cites W2068308871 @default.
- W2007124761 cites W2081676964 @default.
- W2007124761 cites W2092145460 @default.
- W2007124761 cites W2098290178 @default.
- W2007124761 cites W2103970813 @default.
- W2007124761 cites W2107000647 @default.
- W2007124761 cites W2107594154 @default.
- W2007124761 cites W2116895571 @default.
- W2007124761 cites W2117428599 @default.
- W2007124761 cites W2123472832 @default.
- W2007124761 cites W2124637227 @default.
- W2007124761 cites W2132415967 @default.
- W2007124761 cites W2145137617 @default.
- W2007124761 cites W2147378425 @default.
- W2007124761 cites W2150337627 @default.
- W2007124761 cites W2153990926 @default.
- W2007124761 cites W2163352587 @default.
- W2007124761 cites W2166278306 @default.
- W2007124761 doi "https://doi.org/10.1145/2147805.2147826" @default.
- W2007124761 hasPublicationYear "2011" @default.
- W2007124761 type Work @default.
- W2007124761 sameAs 2007124761 @default.
- W2007124761 citedByCount "4" @default.
- W2007124761 countsByYear W20071247612012 @default.
- W2007124761 countsByYear W20071247612013 @default.
- W2007124761 crossrefType "proceedings-article" @default.
- W2007124761 hasAuthorship W2007124761A5020757869 @default.
- W2007124761 hasAuthorship W2007124761A5023786448 @default.
- W2007124761 hasConcept C104317684 @default.
- W2007124761 hasConcept C124101348 @default.
- W2007124761 hasConcept C15151743 @default.
- W2007124761 hasConcept C154945302 @default.
- W2007124761 hasConcept C41008148 @default.
- W2007124761 hasConcept C55493867 @default.
- W2007124761 hasConcept C73555534 @default.
- W2007124761 hasConcept C86803240 @default.
- W2007124761 hasConceptScore W2007124761C104317684 @default.
- W2007124761 hasConceptScore W2007124761C124101348 @default.
- W2007124761 hasConceptScore W2007124761C15151743 @default.
- W2007124761 hasConceptScore W2007124761C154945302 @default.
- W2007124761 hasConceptScore W2007124761C41008148 @default.
- W2007124761 hasConceptScore W2007124761C55493867 @default.
- W2007124761 hasConceptScore W2007124761C73555534 @default.
- W2007124761 hasConceptScore W2007124761C86803240 @default.
- W2007124761 hasLocation W20071247611 @default.
- W2007124761 hasOpenAccess W2007124761 @default.
- W2007124761 hasPrimaryLocation W20071247611 @default.
- W2007124761 hasRelatedWork W1849651648 @default.
- W2007124761 hasRelatedWork W1979871427 @default.
- W2007124761 hasRelatedWork W1999627569 @default.
- W2007124761 hasRelatedWork W2348097614 @default.
- W2007124761 hasRelatedWork W2387405106 @default.
- W2007124761 hasRelatedWork W2392374020 @default.
- W2007124761 hasRelatedWork W2740219386 @default.
- W2007124761 hasRelatedWork W3107474891 @default.
- W2007124761 hasRelatedWork W4282933932 @default.
- W2007124761 hasRelatedWork W763609066 @default.
- W2007124761 isParatext "false" @default.
- W2007124761 isRetracted "false" @default.
- W2007124761 magId "2007124761" @default.
- W2007124761 workType "article" @default.
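
The abstract above describes clustering reads with a naive Bayes mixture of Poissons fitted by Expectation Maximization. As a rough illustration of that general idea only, the following is a minimal sketch of EM for a plain multi-dimensional Poisson mixture. It is not the authors' two-way model: it does no word grouping, has no Gaussian variant, and uses a simplistic initialization. The function name `em_poisson_mixture`, its parameters, and the toy word counts are all hypothetical.

```python
import math


def em_poisson_mixture(counts, k, n_iter=50):
    """Fit a k-component Poisson mixture to rows of word counts with EM.

    counts: list of equal-length lists of non-negative integers (reads x words).
    Words are treated as independent given the component (naive Bayes).
    Returns (weights, rates, labels).
    """
    n, d = len(counts), len(counts[0])
    # Assumed initialization: take evenly spaced rows as starting rates,
    # shifted by 0.5 so that no rate starts at zero.
    rates = [[counts[(j * n) // k][w] + 0.5 for w in range(d)] for j in range(k)]
    weights = [1.0 / k] * k
    resp = []
    for _ in range(n_iter):
        # E-step: responsibilities from Poisson log-likelihoods.
        # The log(x!) term is the same for every component, so it is dropped.
        resp = []
        for x in counts:
            logp = [
                math.log(weights[j])
                + sum(x[w] * math.log(rates[j][w]) - rates[j][w] for w in range(d))
                for j in range(k)
            ]
            m = max(logp)  # subtract the maximum for numerical stability
            p = [math.exp(v - m) for v in logp]
            s = sum(p)
            resp.append([v / s for v in p])
        # M-step: re-estimate mixing weights and per-word rates.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = max(nj / n, 1e-12)
            for w in range(d):
                total = sum(r[j] * x[w] for r, x in zip(resp, counts))
                rates[j][w] = max(total / max(nj, 1e-12), 1e-9)
    # Hard assignment: the most responsible component for each read.
    labels = [max(range(k), key=lambda j: r[j]) for r in resp]
    return weights, rates, labels


# Toy data (made up): six "reads" over three "words", two obvious groups.
reads = [
    [10, 1, 1], [9, 2, 1], [11, 1, 2],
    [1, 10, 9], [2, 9, 10], [1, 11, 10],
]
weights, rates, labels = em_poisson_mixture(reads, k=2)
print(labels)
print([round(w, 3) for w in weights])
```

In this sketch the estimated mixing weights play the role of relative abundances and the hard labels play the role of bins. Real metagenomic word counts would call for a more careful initialization and a convergence check.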