Matches in SemOpenAlex for { <https://semopenalex.org/work/W2782869019> ?p ?o ?g. }
- W2782869019 abstract "k-mer profiling has been one of the trending approaches to analyze read data generated by high-throughput sequencing technologies. The tasks of k-mer profiling include, but are not limited to, counting the frequencies and determining the occurrences of short sequences in a dataset. The notion of k-mer has been extensively used to build de Bruijn graphs in genome or transcriptome assembly, which requires examining all possible k-mers presented in the dataset. Recently, an alternative way of profiling has been proposed, which constructs a set of representative k-mers as genomic markers and profiles their occurrences in the sequencing data. This technique has been applied in both transcript quantification through RNA-Seq and taxonomic classification of metagenomic reads. Most of these applications use a set of fixed-size k-mers since the majority of existing k-mer counters are inadequate to process genomic sequences with variable-length k-mers. However, choosing the appropriate k is challenging, as it varies for different applications. As a pioneer work to profile a set of variable-length k-mers, we propose TahcoRoll in order to enhance the Aho-Corasick algorithm. More specifically, we use one bit to represent each nucleotide, and integrate the rolling hash technique to construct an efficient in-memory data structure for this task. Using both synthetic and real datasets, results show that TahcoRoll outperforms existing approaches in either or both time and memory efficiency without using any disk space. In addition, compared to the most efficient state-of-the-art k-mer counters, such as KMC and MSBWT, TahcoRoll is the only approach that can process long read data from both PacBio and Oxford Nanopore on a commodity desktop computer. The source code of TahcoRoll is implemented in C++14, and available at https://github.com/chelseaju/TahcoRoll.git." @default.
- W2782869019 created "2018-01-26" @default.
- W2782869019 creator A5048749901 @default.
- W2782869019 creator A5049640298 @default.
- W2782869019 creator A5060007972 @default.
- W2782869019 creator A5073504925 @default.
- W2782869019 creator A5086214932 @default.
- W2782869019 date "2017-12-06" @default.
- W2782869019 modified "2023-09-23" @default.
- W2782869019 title "TahcoRoll: An Efficient Approach for Signature Profiling in Genomic Data through Variable-Length k-mers" @default.
- W2782869019 cites W1971581375 @default.
- W2782869019 cites W1972418517 @default.
- W2782869019 cites W1984081611 @default.
- W2782869019 cites W2003347102 @default.
- W2782869019 cites W2014464285 @default.
- W2782869019 cites W2027667941 @default.
- W2782869019 cites W2037444377 @default.
- W2782869019 cites W2051304374 @default.
- W2782869019 cites W2053229585 @default.
- W2782869019 cites W2054841963 @default.
- W2782869019 cites W2057253402 @default.
- W2782869019 cites W2065128082 @default.
- W2782869019 cites W2080234606 @default.
- W2782869019 cites W2080722955 @default.
- W2782869019 cites W2096128575 @default.
- W2782869019 cites W2099964107 @default.
- W2782869019 cites W2112876600 @default.
- W2782869019 cites W2115546424 @default.
- W2782869019 cites W2121873871 @default.
- W2782869019 cites W2135515858 @default.
- W2782869019 cites W2159954944 @default.
- W2782869019 cites W2163584430 @default.
- W2782869019 cites W2171003081 @default.
- W2782869019 cites W2266239166 @default.
- W2782869019 cites W2323326409 @default.
- W2782869019 cites W2411730464 @default.
- W2782869019 cites W2616931575 @default.
- W2782869019 cites W2952932047 @default.
- W2782869019 cites W4230266413 @default.
- W2782869019 doi "https://doi.org/10.1101/229708" @default.
- W2782869019 hasPublicationYear "2017" @default.
- W2782869019 type Work @default.
- W2782869019 sameAs 2782869019 @default.
- W2782869019 citedByCount "0" @default.
- W2782869019 crossrefType "posted-content" @default.
- W2782869019 hasAuthorship W2782869019A5048749901 @default.
- W2782869019 hasAuthorship W2782869019A5049640298 @default.
- W2782869019 hasAuthorship W2782869019A5060007972 @default.
- W2782869019 hasAuthorship W2782869019A5073504925 @default.
- W2782869019 hasAuthorship W2782869019A5086214932 @default.
- W2782869019 hasBestOaLocation W27828690191 @default.
- W2782869019 hasConcept C104317684 @default.
- W2782869019 hasConcept C111919701 @default.
- W2782869019 hasConcept C11413529 @default.
- W2782869019 hasConcept C124101348 @default.
- W2782869019 hasConcept C132525143 @default.
- W2782869019 hasConcept C141231307 @default.
- W2782869019 hasConcept C187191949 @default.
- W2782869019 hasConcept C20218877 @default.
- W2782869019 hasConcept C2279292 @default.
- W2782869019 hasConcept C38652104 @default.
- W2782869019 hasConcept C41008148 @default.
- W2782869019 hasConcept C54355233 @default.
- W2782869019 hasConcept C70721500 @default.
- W2782869019 hasConcept C80444323 @default.
- W2782869019 hasConcept C86803240 @default.
- W2782869019 hasConcept C99138194 @default.
- W2782869019 hasConceptScore W2782869019C104317684 @default.
- W2782869019 hasConceptScore W2782869019C111919701 @default.
- W2782869019 hasConceptScore W2782869019C11413529 @default.
- W2782869019 hasConceptScore W2782869019C124101348 @default.
- W2782869019 hasConceptScore W2782869019C132525143 @default.
- W2782869019 hasConceptScore W2782869019C141231307 @default.
- W2782869019 hasConceptScore W2782869019C187191949 @default.
- W2782869019 hasConceptScore W2782869019C20218877 @default.
- W2782869019 hasConceptScore W2782869019C2279292 @default.
- W2782869019 hasConceptScore W2782869019C38652104 @default.
- W2782869019 hasConceptScore W2782869019C41008148 @default.
- W2782869019 hasConceptScore W2782869019C54355233 @default.
- W2782869019 hasConceptScore W2782869019C70721500 @default.
- W2782869019 hasConceptScore W2782869019C80444323 @default.
- W2782869019 hasConceptScore W2782869019C86803240 @default.
- W2782869019 hasConceptScore W2782869019C99138194 @default.
- W2782869019 hasLocation W27828690191 @default.
- W2782869019 hasOpenAccess W2782869019 @default.
- W2782869019 hasPrimaryLocation W27828690191 @default.
- W2782869019 hasRelatedWork W2027667941 @default.
- W2782869019 hasRelatedWork W2111307596 @default.
- W2782869019 hasRelatedWork W2138785419 @default.
- W2782869019 hasRelatedWork W2163584430 @default.
- W2782869019 hasRelatedWork W2182215453 @default.
- W2782869019 hasRelatedWork W2251282885 @default.
- W2782869019 hasRelatedWork W2542276527 @default.
- W2782869019 hasRelatedWork W2733391109 @default.
- W2782869019 hasRelatedWork W2736647588 @default.
- W2782869019 hasRelatedWork W2806639262 @default.
- W2782869019 hasRelatedWork W2890941206 @default.
- W2782869019 hasRelatedWork W2898982100 @default.
- W2782869019 hasRelatedWork W2945220023 @default.
- W2782869019 hasRelatedWork W3045874575 @default.
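
The abstract of W2782869019 describes profiling a set of variable-length k-mers by building on the Aho-Corasick algorithm. As a rough illustration of that baseline idea only, the sketch below counts occurrences of a mixed-length k-mer set in reads using a plain, textbook Aho-Corasick automaton. It is not TahcoRoll: the one-bit nucleotide encoding and the rolling hash mentioned in the abstract are not reproduced, and the names `build_automaton` and `count_kmers`, as well as the sample reads and k-mers, are hypothetical and chosen for this example.

```python
from collections import deque


def build_automaton(patterns):
    """Build a textbook Aho-Corasick automaton (illustrative, not TahcoRoll)."""
    goto = [{}]   # goto[node][char] -> child node
    fail = [0]    # failure link of each node
    out = [[]]    # patterns that end at each node (including via failure links)

    # Phase 1: insert every pattern into a trie.
    for pattern in patterns:
        node = 0
        for ch in pattern:
            nxt = goto[node].get(ch)
            if nxt is None:
                goto.append({})
                fail.append(0)
                out.append([])
                nxt = len(goto) - 1
                goto[node][ch] = nxt
            node = nxt
        out[node].append(pattern)

    # Phase 2: breadth-first pass to set failure links and merge outputs.
    queue = deque(goto[0].values())  # depth-1 nodes keep fail = 0
    while queue:
        node = queue.popleft()
        for ch, nxt in goto[node].items():
            f = fail[node]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            # Shorter patterns that are suffixes of this path also match here.
            out[nxt] = out[nxt] + out[fail[nxt]]
            queue.append(nxt)
    return goto, fail, out


def count_kmers(reads, kmers):
    """Count occurrences of each (variable-length) k-mer across all reads."""
    kmers = list(dict.fromkeys(kmers))  # drop duplicates, keep order
    goto, fail, out = build_automaton(kmers)
    counts = {kmer: 0 for kmer in kmers}
    for read in reads:
        node = 0
        for ch in read:
            while node and ch not in goto[node]:
                node = fail[node]
            node = goto[node].get(ch, 0)
            for kmer in out[node]:
                counts[kmer] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical toy input: k-mers of lengths 3, 4, and 1.
    print(count_kmers(["ACGTACGT", "TTACG"], ["ACG", "CGTA", "T"]))
```

Each read is scanned once regardless of how many k-mer lengths are in the set, which is the property that makes an automaton attractive for variable-length profiling compared with running a fixed-size counter once per k.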