Matches in SemOpenAlex for { <https://semopenalex.org/work/W3018162653> ?p ?o ?g. }
- W3018162653 endingPage "394" @default.
- W3018162653 startingPage "381" @default.
- W3018162653 abstract "Given the popularity and elegance of k-mer-based tools, finding a space-efficient way to represent a set of k-mers is important for improving the scalability of bioinformatics analyses. One popular approach is to convert the set of k-mers into the more compact set of unitigs. We generalize this approach and formulate it as the problem of finding a smallest spectrum-preserving string set (SPSS) representation. We show that this problem is equivalent to finding a smallest path cover in a compacted de Bruijn graph. Using this reduction, we prove a lower bound on the size of the optimal SPSS and propose a greedy method called UST (Unitig-STitch) that results in a smaller representation than unitigs and is nearly optimal with respect to our lower bound. We demonstrate the usefulness of the SPSS formulation with two applications of UST. The first one is a compression algorithm, UST-Compress, which, we show, can store a set of k-mers by using an order-of-magnitude less disk space than other lossless compression tools. The second one is an exact static k-mer membership index, UST-FM, which, we show, improves index size by 10%-44% compared with other state-of-the-art low-memory indices." @default.
- W3018162653 created "2020-05-01" @default.
- W3018162653 creator A5021640421 @default.
- W3018162653 creator A5055479093 @default.
- W3018162653 date "2021-04-01" @default.
- W3018162653 modified "2023-10-06" @default.
- W3018162653 title "Representation of k-Mer Sets Using Spectrum-Preserving String Sets" @default.
- W3018162653 cites W1822485921 @default.
- W3018162653 cites W1964377951 @default.
- W3018162653 cites W2011657487 @default.
- W3018162653 cites W2057253402 @default.
- W3018162653 cites W2096128575 @default.
- W3018162653 cites W2101250487 @default.
- W3018162653 cites W2127768708 @default.
- W3018162653 cites W2133531097 @default.
- W3018162653 cites W2158678815 @default.
- W3018162653 cites W2166588423 @default.
- W3018162653 cites W2278452282 @default.
- W3018162653 cites W2438121987 @default.
- W3018162653 cites W2500932352 @default.
- W3018162653 cites W2531091319 @default.
- W3018162653 cites W2538355508 @default.
- W3018162653 cites W2583363792 @default.
- W3018162653 cites W2735897904 @default.
- W3018162653 cites W2759261668 @default.
- W3018162653 cites W2763390627 @default.
- W3018162653 cites W2788228074 @default.
- W3018162653 cites W2809002316 @default.
- W3018162653 cites W2887897252 @default.
- W3018162653 cites W2913847081 @default.
- W3018162653 cites W2922407406 @default.
- W3018162653 cites W2937044610 @default.
- W3018162653 cites W2949074212 @default.
- W3018162653 cites W2949672921 @default.
- W3018162653 cites W2962686126 @default.
- W3018162653 cites W2967769217 @default.
- W3018162653 cites W2972805712 @default.
- W3018162653 cites W3085875665 @default.
- W3018162653 cites W3150463527 @default.
- W3018162653 cites W2899681763 @default.
- W3018162653 doi "https://doi.org/10.1089/cmb.2020.0431" @default.
- W3018162653 hasPubMedCentralId "https://www.ncbi.nlm.nih.gov/pmc/articles/8066325" @default.
- W3018162653 hasPubMedId "https://pubmed.ncbi.nlm.nih.gov/33290137" @default.
- W3018162653 hasPublicationYear "2021" @default.
- W3018162653 type Work @default.
- W3018162653 sameAs 3018162653 @default.
- W3018162653 citedByCount "20" @default.
- W3018162653 countsByYear W30181626532021 @default.
- W3018162653 countsByYear W30181626532022 @default.
- W3018162653 countsByYear W30181626532023 @default.
- W3018162653 crossrefType "journal-article" @default.
- W3018162653 hasAuthorship W3018162653A5021640421 @default.
- W3018162653 hasAuthorship W3018162653A5055479093 @default.
- W3018162653 hasBestOaLocation W30181626532 @default.
- W3018162653 hasConcept C111335779 @default.
- W3018162653 hasConcept C11413529 @default.
- W3018162653 hasConcept C114614502 @default.
- W3018162653 hasConcept C118615104 @default.
- W3018162653 hasConcept C132525143 @default.
- W3018162653 hasConcept C134306372 @default.
- W3018162653 hasConcept C157486923 @default.
- W3018162653 hasConcept C177264268 @default.
- W3018162653 hasConcept C17744445 @default.
- W3018162653 hasConcept C199360897 @default.
- W3018162653 hasConcept C199539241 @default.
- W3018162653 hasConcept C20218877 @default.
- W3018162653 hasConcept C2524010 @default.
- W3018162653 hasConcept C2776359362 @default.
- W3018162653 hasConcept C33923547 @default.
- W3018162653 hasConcept C37914503 @default.
- W3018162653 hasConcept C41008148 @default.
- W3018162653 hasConcept C48044578 @default.
- W3018162653 hasConcept C77088390 @default.
- W3018162653 hasConcept C77553402 @default.
- W3018162653 hasConcept C78548338 @default.
- W3018162653 hasConcept C80444323 @default.
- W3018162653 hasConcept C81081738 @default.
- W3018162653 hasConcept C94625758 @default.
- W3018162653 hasConceptScore W3018162653C111335779 @default.
- W3018162653 hasConceptScore W3018162653C11413529 @default.
- W3018162653 hasConceptScore W3018162653C114614502 @default.
- W3018162653 hasConceptScore W3018162653C118615104 @default.
- W3018162653 hasConceptScore W3018162653C132525143 @default.
- W3018162653 hasConceptScore W3018162653C134306372 @default.
- W3018162653 hasConceptScore W3018162653C157486923 @default.
- W3018162653 hasConceptScore W3018162653C177264268 @default.
- W3018162653 hasConceptScore W3018162653C17744445 @default.
- W3018162653 hasConceptScore W3018162653C199360897 @default.
- W3018162653 hasConceptScore W3018162653C199539241 @default.
- W3018162653 hasConceptScore W3018162653C20218877 @default.
- W3018162653 hasConceptScore W3018162653C2524010 @default.
- W3018162653 hasConceptScore W3018162653C2776359362 @default.
- W3018162653 hasConceptScore W3018162653C33923547 @default.
- W3018162653 hasConceptScore W3018162653C37914503 @default.
- W3018162653 hasConceptScore W3018162653C41008148 @default.
- W3018162653 hasConceptScore W3018162653C48044578 @default.
- W3018162653 hasConceptScore W3018162653C77088390 @default.
- W3018162653 hasConceptScore W3018162653C77553402 @default.