Matches in SemOpenAlex for { <https://semopenalex.org/work/W2566346075> ?p ?o ?g. }
- W2566346075 endingPage "309" @default.
- W2566346075 startingPage "300" @default.
- W2566346075 abstract "We are rapidly approaching the point where we have sequenced millions of human genomes. There is a pressing need for new data structures to store raw sequencing data and efficient algorithms for population scale analysis. Current reference-based data formats do not fully exploit the redundancy in population sequencing nor take advantage of shared genetic variation. In recent years, the Burrows-Wheeler transform (BWT) and FM-index have been widely employed as a full-text searchable index for read alignment and de novo assembly. We introduce the concept of a population BWT and use it to store and index the sequencing reads of 2705 samples from the 1000 Genomes Project. A key feature is that, as more genomes are added, identical read sequences are increasingly observed, and compression becomes more efficient. We assess the support in the 1000 Genomes read data for every base position of two human reference assembly versions, identifying that 3.2 Mbp with population support was lost in the transition from GRCh37 with 13.7 Mbp added to GRCh38. We show that the vast majority of variant alleles can be uniquely described by overlapping 31-mers and show how rapid and accurate SNP and indel genotyping can be carried out across the genomes in the population BWT. We use the population BWT to carry out nonreference queries to search for the presence of all known viral genomes and discover human T-lymphotropic virus 1 integrations in six samples in a recognized epidemiological distribution." @default.
- W2566346075 created "2017-01-06" @default.
- W2566346075 creator A5003935660 @default.
- W2566346075 creator A5004195878 @default.
- W2566346075 creator A5030669774 @default.
- W2566346075 creator A5050290060 @default.
- W2566346075 creator A5050782615 @default.
- W2566346075 creator A5071625927 @default.
- W2566346075 creator A5078769186 @default.
- W2566346075 creator A5082040351 @default.
- W2566346075 date "2016-12-16" @default.
- W2566346075 modified "2023-09-23" @default.
- W2566346075 title "Using reference-free compressed data structures to analyze sequencing reads from thousands of human genomes" @default.
- W2566346075 cites W1489349053 @default.
- W2566346075 cites W1879624628 @default.
- W2566346075 cites W1907868928 @default.
- W2566346075 cites W1982767969 @default.
- W2566346075 cites W2000493448 @default.
- W2566346075 cites W2004736524 @default.
- W2566346075 cites W2010361633 @default.
- W2566346075 cites W2039301421 @default.
- W2566346075 cites W2069066547 @default.
- W2566346075 cites W2085800524 @default.
- W2566346075 cites W2087689337 @default.
- W2566346075 cites W2096465161 @default.
- W2566346075 cites W2103441770 @default.
- W2566346075 cites W2104124522 @default.
- W2566346075 cites W2104549677 @default.
- W2566346075 cites W2104846587 @default.
- W2566346075 cites W2107966497 @default.
- W2566346075 cites W2108234281 @default.
- W2566346075 cites W2111350087 @default.
- W2566346075 cites W2120330409 @default.
- W2566346075 cites W2122707695 @default.
- W2566346075 cites W2123845384 @default.
- W2566346075 cites W2124535406 @default.
- W2566346075 cites W2124985265 @default.
- W2566346075 cites W2131106408 @default.
- W2566346075 cites W2132731072 @default.
- W2566346075 cites W2147544359 @default.
- W2566346075 cites W2157155233 @default.
- W2566346075 cites W2159084616 @default.
- W2566346075 cites W2159213603 @default.
- W2566346075 cites W2159954944 @default.
- W2566346075 cites W2170486072 @default.
- W2566346075 cites W2170727800 @default.
- W2566346075 cites W2180506325 @default.
- W2566346075 cites W2264578395 @default.
- W2566346075 cites W2266239166 @default.
- W2566346075 cites W2296597603 @default.
- W2566346075 cites W2323608404 @default.
- W2566346075 cites W2952870794 @default.
- W2566346075 cites W4210849789 @default.
- W2566346075 doi "https://doi.org/10.1101/gr.211748.116" @default.
- W2566346075 hasPubMedCentralId "https://www.ncbi.nlm.nih.gov/pmc/articles/5287235" @default.
- W2566346075 hasPubMedId "https://pubmed.ncbi.nlm.nih.gov/27986821" @default.
- W2566346075 hasPublicationYear "2016" @default.
- W2566346075 type Work @default.
- W2566346075 sameAs 2566346075 @default.
- W2566346075 citedByCount "23" @default.
- W2566346075 countsByYear W25663460752017 @default.
- W2566346075 countsByYear W25663460752018 @default.
- W2566346075 countsByYear W25663460752019 @default.
- W2566346075 countsByYear W25663460752020 @default.
- W2566346075 countsByYear W25663460752021 @default.
- W2566346075 countsByYear W25663460752022 @default.
- W2566346075 countsByYear W25663460752023 @default.
- W2566346075 crossrefType "journal-article" @default.
- W2566346075 hasAuthorship W2566346075A5003935660 @default.
- W2566346075 hasAuthorship W2566346075A5004195878 @default.
- W2566346075 hasAuthorship W2566346075A5030669774 @default.
- W2566346075 hasAuthorship W2566346075A5050290060 @default.
- W2566346075 hasAuthorship W2566346075A5050782615 @default.
- W2566346075 hasAuthorship W2566346075A5071625927 @default.
- W2566346075 hasAuthorship W2566346075A5078769186 @default.
- W2566346075 hasAuthorship W2566346075A5082040351 @default.
- W2566346075 hasBestOaLocation W25663460751 @default.
- W2566346075 hasConcept C104317684 @default.
- W2566346075 hasConcept C119054055 @default.
- W2566346075 hasConcept C135763542 @default.
- W2566346075 hasConcept C141231307 @default.
- W2566346075 hasConcept C144024400 @default.
- W2566346075 hasConcept C149923435 @default.
- W2566346075 hasConcept C150194340 @default.
- W2566346075 hasConcept C153209595 @default.
- W2566346075 hasConcept C162317418 @default.
- W2566346075 hasConcept C189206191 @default.
- W2566346075 hasConcept C18949551 @default.
- W2566346075 hasConcept C192953774 @default.
- W2566346075 hasConcept C197077220 @default.
- W2566346075 hasConcept C2279292 @default.
- W2566346075 hasConcept C2908647359 @default.
- W2566346075 hasConcept C54355233 @default.
- W2566346075 hasConcept C70721500 @default.
- W2566346075 hasConcept C86803240 @default.
- W2566346075 hasConcept C97425143 @default.
- W2566346075 hasConceptScore W2566346075C104317684 @default.
- W2566346075 hasConceptScore W2566346075C119054055 @default.