Matches in SemOpenAlex for { <https://semopenalex.org/work/W3175899557> ?p ?o ?g. }
- W3175899557 abstract "K-mer based methods have become prevalent in many areas of bioinformatics. In applications such as database search, they often work with large multi-terabyte-sized datasets. Storing such large datasets is a detriment to tool developers, tool users, and reproducibility efforts. General purpose compressors like gzip, or those designed for read data, are sub-optimal because they do not take into account the specific redundancy pattern in k-mer sets. In our earlier work (Rahman and Medvedev, RECOMB 2020), we presented an algorithm UST-Compress that uses a spectrum-preserving string set representation to compress a set of k-mers to disk. In this paper, we present two improved methods for disk compression of k-mer sets, called ESS-Compress and ESS-Tip-Compress. They use a more relaxed notion of string set representation to further remove redundancy from the representation of UST-Compress. We explore their behavior both theoretically and on real data. We show that they improve the compression sizes achieved by UST-Compress by up to 27 percent, across a breadth of datasets. We also derive lower bounds on how well this type of compression strategy can hope to do." @default.
- W3175899557 created "2021-07-05" @default.
- W3175899557 creator A5055479093 @default.
- W3175899557 creator A5058310668 @default.
- W3175899557 creator A5066909672 @default.
- W3175899557 date "2021-06-21" @default.
- W3175899557 modified "2023-10-14" @default.
- W3175899557 title "Disk compression of k-mer sets" @default.
- W3175899557 cites W1964377951 @default.
- W3175899557 cites W2057253402 @default.
- W3175899557 cites W2096128575 @default.
- W3175899557 cites W2120902911 @default.
- W3175899557 cites W2127768708 @default.
- W3175899557 cites W2158678815 @default.
- W3175899557 cites W2159954944 @default.
- W3175899557 cites W2266239166 @default.
- W3175899557 cites W2438121987 @default.
- W3175899557 cites W2531091319 @default.
- W3175899557 cites W2538355508 @default.
- W3175899557 cites W2543131432 @default.
- W3175899557 cites W2583363792 @default.
- W3175899557 cites W2596935532 @default.
- W3175899557 cites W2809649683 @default.
- W3175899557 cites W2883729812 @default.
- W3175899557 cites W2884435343 @default.
- W3175899557 cites W2903772466 @default.
- W3175899557 cites W2913847081 @default.
- W3175899557 cites W2922407406 @default.
- W3175899557 cites W2949074212 @default.
- W3175899557 cites W2949503312 @default.
- W3175899557 cites W2950150251 @default.
- W3175899557 cites W2951230321 @default.
- W3175899557 cites W2951594653 @default.
- W3175899557 cites W2959670948 @default.
- W3175899557 cites W2964150838 @default.
- W3175899557 cites W2969627863 @default.
- W3175899557 cites W2979522844 @default.
- W3175899557 cites W2994050678 @default.
- W3175899557 cites W3000185529 @default.
- W3175899557 cites W3013820283 @default.
- W3175899557 cites W3147478436 @default.
- W3175899557 cites W6247929 @default.
- W3175899557 doi "https://doi.org/10.1186/s13015-021-00192-7" @default.
- W3175899557 hasPubMedCentralId "https://www.ncbi.nlm.nih.gov/pmc/articles/8218509" @default.
- W3175899557 hasPubMedId "https://pubmed.ncbi.nlm.nih.gov/34154632" @default.
- W3175899557 hasPublicationYear "2021" @default.
- W3175899557 type Work @default.
- W3175899557 sameAs 3175899557 @default.
- W3175899557 citedByCount "9" @default.
- W3175899557 countsByYear W31758995572021 @default.
- W3175899557 countsByYear W31758995572022 @default.
- W3175899557 countsByYear W31758995572023 @default.
- W3175899557 crossrefType "journal-article" @default.
- W3175899557 hasAuthorship W3175899557A5055479093 @default.
- W3175899557 hasAuthorship W3175899557A5058310668 @default.
- W3175899557 hasAuthorship W3175899557A5066909672 @default.
- W3175899557 hasBestOaLocation W31758995571 @default.
- W3175899557 hasConcept C111919701 @default.
- W3175899557 hasConcept C11413529 @default.
- W3175899557 hasConcept C124101348 @default.
- W3175899557 hasConcept C127413603 @default.
- W3175899557 hasConcept C152124472 @default.
- W3175899557 hasConcept C157486923 @default.
- W3175899557 hasConcept C159985019 @default.
- W3175899557 hasConcept C171146098 @default.
- W3175899557 hasConcept C177264268 @default.
- W3175899557 hasConcept C17744445 @default.
- W3175899557 hasConcept C180016635 @default.
- W3175899557 hasConcept C192562407 @default.
- W3175899557 hasConcept C199360897 @default.
- W3175899557 hasConcept C199539241 @default.
- W3175899557 hasConcept C199683683 @default.
- W3175899557 hasConcept C25797200 @default.
- W3175899557 hasConcept C2776359362 @default.
- W3175899557 hasConcept C33923547 @default.
- W3175899557 hasConcept C37914503 @default.
- W3175899557 hasConcept C41008148 @default.
- W3175899557 hasConcept C511840579 @default.
- W3175899557 hasConcept C78548338 @default.
- W3175899557 hasConcept C80444323 @default.
- W3175899557 hasConcept C81081738 @default.
- W3175899557 hasConcept C94625758 @default.
- W3175899557 hasConceptScore W3175899557C111919701 @default.
- W3175899557 hasConceptScore W3175899557C11413529 @default.
- W3175899557 hasConceptScore W3175899557C124101348 @default.
- W3175899557 hasConceptScore W3175899557C127413603 @default.
- W3175899557 hasConceptScore W3175899557C152124472 @default.
- W3175899557 hasConceptScore W3175899557C157486923 @default.
- W3175899557 hasConceptScore W3175899557C159985019 @default.
- W3175899557 hasConceptScore W3175899557C171146098 @default.
- W3175899557 hasConceptScore W3175899557C177264268 @default.
- W3175899557 hasConceptScore W3175899557C17744445 @default.
- W3175899557 hasConceptScore W3175899557C180016635 @default.
- W3175899557 hasConceptScore W3175899557C192562407 @default.
- W3175899557 hasConceptScore W3175899557C199360897 @default.
- W3175899557 hasConceptScore W3175899557C199539241 @default.
- W3175899557 hasConceptScore W3175899557C199683683 @default.
- W3175899557 hasConceptScore W3175899557C25797200 @default.
- W3175899557 hasConceptScore W3175899557C2776359362 @default.
- W3175899557 hasConceptScore W3175899557C33923547 @default.