Matches in SemOpenAlex for { <https://semopenalex.org/work/W63172340> ?p ?o ?g. }
- W63172340 abstract "In this dissertation, we address three different problems in high-throughput metagenomics and cheminformatics. (1) Metagenomics studies the genomic content of an entire microbial community by simultaneously sequencing all genomes in an environmental sample. The advent of next-generation sequencing (NGS) technologies has drastically reduced sequencing time and cost, leading to the generation of millions of sequences (reads) in a single run. An important problem in metagenomic analysis is to determine and quantify species (or genomes) in a metagenomic sample. The problem is challenging due to an unknown number of genomes and their abundance ratios, presence of repeats and sequencing errors, and the short length of NGS reads. We propose two algorithms to address these challenges. First, we present an algorithm for separating short paired-end reads from genomes with similar abundance levels. Second, we propose a method to accurately estimate the abundance levels of species. The algorithm automatically determines the number of abundance groups in a metagenomic dataset and bins the reads into these groups. (2) NGS coupled with metagenomics has led to the rapid growth of sequence databases and enabled a new branch of microbiology called comparative metagenomics. It is a fast-growing field that requires the development of novel supervised learning techniques. In particular, the problem of microbial community classification may have useful applications enabling efficient organization and search in rapidly growing metagenomic databases, detection of disease phenotypes in clinical samples, and forensic identification. We propose a novel supervised classification method for metagenomic samples that takes advantage of the natural structure in microbial community data encoded by a phylogenetic tree. (3) In modern drug discovery, ultra-high-throughput screening is applied to millions of drug-like compounds in one experiment.
Hierarchical clustering is an important step in the drug discovery process. Standard implementations of the exact algorithm for hierarchical clustering require O(n^2) time and O(n^2) memory. Even though approximate hierarchical clustering methods overcome this problem, they either rely on embedding into spaces that are not biologically sensible, or produce very low resolution hierarchical structures. We present a hybrid hierarchical clustering algorithm requiring approximately O(n sqrt(n)) time and O(n sqrt(n)) memory while still preserving the most desirable properties of the exact algorithm." @default.
- W63172340 created "2016-06-24" @default.
- W63172340 creator A5013571436 @default.
- W63172340 date "2013-01-01" @default.
- W63172340 modified "2023-09-27" @default.
- W63172340 title "Some Clustering and Classification Problems in High-Throughput Metagenomics and Cheminformatics" @default.
- W63172340 cites W1490760466 @default.
- W63172340 cites W1497745584 @default.
- W63172340 cites W1524688041 @default.
- W63172340 cites W1539550196 @default.
- W63172340 cites W1548486355 @default.
- W63172340 cites W1553258895 @default.
- W63172340 cites W1562379206 @default.
- W63172340 cites W1562607193 @default.
- W63172340 cites W1566768190 @default.
- W63172340 cites W1783384641 @default.
- W63172340 cites W1811186957 @default.
- W63172340 cites W1966711026 @default.
- W63172340 cites W1966822396 @default.
- W63172340 cites W1969017314 @default.
- W63172340 cites W1969346416 @default.
- W63172340 cites W1970062872 @default.
- W63172340 cites W1970554427 @default.
- W63172340 cites W1974482866 @default.
- W63172340 cites W1975931252 @default.
- W63172340 cites W1978478796 @default.
- W63172340 cites W1981556499 @default.
- W63172340 cites W1985372952 @default.
- W63172340 cites W1985894132 @default.
- W63172340 cites W1988654243 @default.
- W63172340 cites W1989889539 @default.
- W63172340 cites W1992419399 @default.
- W63172340 cites W1993784588 @default.
- W63172340 cites W1995393174 @default.
- W63172340 cites W1998601670 @default.
- W63172340 cites W2000259070 @default.
- W63172340 cites W2007124761 @default.
- W63172340 cites W2016381774 @default.
- W63172340 cites W2017727472 @default.
- W63172340 cites W2018055814 @default.
- W63172340 cites W2022892561 @default.
- W63172340 cites W2023829651 @default.
- W63172340 cites W2027496182 @default.
- W63172340 cites W2029437806 @default.
- W63172340 cites W2030644393 @default.
- W63172340 cites W2031611770 @default.
- W63172340 cites W2032230795 @default.
- W63172340 cites W2033403400 @default.
- W63172340 cites W2035890032 @default.
- W63172340 cites W2040163542 @default.
- W63172340 cites W2043398720 @default.
- W63172340 cites W2046149102 @default.
- W63172340 cites W2048818637 @default.
- W63172340 cites W2055043387 @default.
- W63172340 cites W2055057012 @default.
- W63172340 cites W2061789405 @default.
- W63172340 cites W2065427498 @default.
- W63172340 cites W2067045493 @default.
- W63172340 cites W2072970694 @default.
- W63172340 cites W2075716829 @default.
- W63172340 cites W2082203850 @default.
- W63172340 cites W2082361336 @default.
- W63172340 cites W2089509878 @default.
- W63172340 cites W2089923519 @default.
- W63172340 cites W2091146986 @default.
- W63172340 cites W2096525273 @default.
- W63172340 cites W2097936772 @default.
- W63172340 cites W2101234009 @default.
- W63172340 cites W2104149855 @default.
- W63172340 cites W2104318549 @default.
- W63172340 cites W2106398669 @default.
- W63172340 cites W2106651224 @default.
- W63172340 cites W2107000647 @default.
- W63172340 cites W2107233609 @default.
- W63172340 cites W2107854630 @default.
- W63172340 cites W2108718991 @default.
- W63172340 cites W2113601822 @default.
- W63172340 cites W2116895571 @default.
- W63172340 cites W2117544284 @default.
- W63172340 cites W2119663322 @default.
- W63172340 cites W2120636855 @default.
- W63172340 cites W2121564430 @default.
- W63172340 cites W2122189635 @default.
- W63172340 cites W2123837003 @default.
- W63172340 cites W2124351063 @default.
- W63172340 cites W2124637227 @default.
- W63172340 cites W2126809954 @default.
- W63172340 cites W2127218421 @default.
- W63172340 cites W2127651281 @default.
- W63172340 cites W2128114769 @default.
- W63172340 cites W2131988453 @default.
- W63172340 cites W2139398630 @default.
- W63172340 cites W2140604849 @default.
- W63172340 cites W2141012957 @default.
- W63172340 cites W2142103600 @default.
- W63172340 cites W2142740566 @default.
- W63172340 cites W2145336165 @default.
- W63172340 cites W2146577751 @default.
- W63172340 cites W2147378425 @default.
- W63172340 cites W2149573313 @default.