Matches in SemOpenAlex for { <https://semopenalex.org/work/W2020255088> ?p ?o ?g. }
- W2020255088 abstract "Next Generation Sequencing technologies have revolutionized many fields in biology by reducing the time and cost required for sequencing. As a result, large amounts of sequencing data are being generated. A typical sequencing data file may occupy tens or even hundreds of gigabytes of disk space, prohibitively large for many users. This data consists of both the nucleotide sequences and per-base quality scores that indicate the level of confidence in the readout of these sequences. Quality scores account for about half of the required disk space in the commonly used FASTQ format (before compression), and therefore the compression of the quality scores can significantly reduce storage requirements and speed up analysis and transmission of sequencing data. In this paper, we present a new scheme for the lossy compression of the quality scores, to address the problem of storage. Our framework allows the user to specify the rate (bits per quality score) prior to compression, independent of the data to be compressed. Our algorithm can work at any rate, unlike other lossy compression algorithms. We envisage our algorithm as being part of a more general compression scheme that works with the entire FASTQ file. Numerical experiments show that we can achieve a better mean squared error (MSE) for small rates (bits per quality score) than other lossy compression schemes. For the organism PhiX, whose assembled genome is known and assumed to be correct, we show that it is possible to achieve a significant reduction in size with little compromise in performance on downstream applications (e.g., alignment). QualComp is an open source software package, written in C and freely available for download at https://sourceforge.net/projects/qualcomp." @default.
- W2020255088 created "2016-06-24" @default.
- W2020255088 creator A5001926130 @default.
- W2020255088 creator A5007666166 @default.
- W2020255088 creator A5019098931 @default.
- W2020255088 creator A5040875765 @default.
- W2020255088 creator A5043344688 @default.
- W2020255088 creator A5045294377 @default.
- W2020255088 date "2013-06-08" @default.
- W2020255088 modified "2023-10-09" @default.
- W2020255088 title "QualComp: a new lossy compressor for quality scores based on rate distortion theory" @default.
- W2020255088 cites W1969346416 @default.
- W2020255088 cites W1987370597 @default.
- W2020255088 cites W2000466469 @default.
- W2020255088 cites W2008459483 @default.
- W2020255088 cites W2044586667 @default.
- W2020255088 cites W2051929999 @default.
- W2020255088 cites W2097205777 @default.
- W2020255088 cites W2099111195 @default.
- W2020255088 cites W2103441770 @default.
- W2020255088 cites W2108234281 @default.
- W2020255088 cites W2108278098 @default.
- W2020255088 cites W2110800670 @default.
- W2020255088 cites W2111044311 @default.
- W2020255088 cites W2112113834 @default.
- W2020255088 cites W2116041602 @default.
- W2020255088 cites W2117608012 @default.
- W2020255088 cites W2119180969 @default.
- W2020255088 cites W2124985265 @default.
- W2020255088 cites W2125826054 @default.
- W2020255088 cites W2133212095 @default.
- W2020255088 cites W2137661542 @default.
- W2020255088 cites W2138196010 @default.
- W2020255088 cites W2140752095 @default.
- W2020255088 cites W2142003566 @default.
- W2020255088 cites W2146728635 @default.
- W2020255088 cites W2147492358 @default.
- W2020255088 cites W2150593711 @default.
- W2020255088 cites W2159084616 @default.
- W2020255088 cites W2159683766 @default.
- W2020255088 cites W2165272228 @default.
- W2020255088 cites W2166588423 @default.
- W2020255088 cites W2168909179 @default.
- W2020255088 doi "https://doi.org/10.1186/1471-2105-14-187" @default.
- W2020255088 hasPubMedCentralId "https://www.ncbi.nlm.nih.gov/pmc/articles/3698011" @default.
- W2020255088 hasPubMedId "https://pubmed.ncbi.nlm.nih.gov/23758828" @default.
- W2020255088 hasPublicationYear "2013" @default.
- W2020255088 type Work @default.
- W2020255088 sameAs 2020255088 @default.
- W2020255088 citedByCount "50" @default.
- W2020255088 countsByYear W20202550882013 @default.
- W2020255088 countsByYear W20202550882014 @default.
- W2020255088 countsByYear W20202550882015 @default.
- W2020255088 countsByYear W20202550882016 @default.
- W2020255088 countsByYear W20202550882017 @default.
- W2020255088 countsByYear W20202550882018 @default.
- W2020255088 countsByYear W20202550882019 @default.
- W2020255088 countsByYear W20202550882020 @default.
- W2020255088 countsByYear W20202550882021 @default.
- W2020255088 countsByYear W20202550882022 @default.
- W2020255088 countsByYear W20202550882023 @default.
- W2020255088 crossrefType "journal-article" @default.
- W2020255088 hasAuthorship W2020255088A5001926130 @default.
- W2020255088 hasAuthorship W2020255088A5007666166 @default.
- W2020255088 hasAuthorship W2020255088A5019098931 @default.
- W2020255088 hasAuthorship W2020255088A5040875765 @default.
- W2020255088 hasAuthorship W2020255088A5043344688 @default.
- W2020255088 hasAuthorship W2020255088A5045294377 @default.
- W2020255088 hasBestOaLocation W20202550881 @default.
- W2020255088 hasConcept C111919701 @default.
- W2020255088 hasConcept C113775141 @default.
- W2020255088 hasConcept C11413529 @default.
- W2020255088 hasConcept C115961682 @default.
- W2020255088 hasConcept C124101348 @default.
- W2020255088 hasConcept C127413603 @default.
- W2020255088 hasConcept C13481523 @default.
- W2020255088 hasConcept C154945302 @default.
- W2020255088 hasConcept C159985019 @default.
- W2020255088 hasConcept C165021410 @default.
- W2020255088 hasConcept C171146098 @default.
- W2020255088 hasConcept C180016635 @default.
- W2020255088 hasConcept C192562407 @default.
- W2020255088 hasConcept C25797200 @default.
- W2020255088 hasConcept C2776029614 @default.
- W2020255088 hasConcept C41008148 @default.
- W2020255088 hasConcept C511840579 @default.
- W2020255088 hasConcept C78548338 @default.
- W2020255088 hasConcept C81081738 @default.
- W2020255088 hasConcept C9417928 @default.
- W2020255088 hasConcept C94835093 @default.
- W2020255088 hasConceptScore W2020255088C111919701 @default.
- W2020255088 hasConceptScore W2020255088C113775141 @default.
- W2020255088 hasConceptScore W2020255088C11413529 @default.
- W2020255088 hasConceptScore W2020255088C115961682 @default.
- W2020255088 hasConceptScore W2020255088C124101348 @default.
- W2020255088 hasConceptScore W2020255088C127413603 @default.
- W2020255088 hasConceptScore W2020255088C13481523 @default.
- W2020255088 hasConceptScore W2020255088C154945302 @default.
- W2020255088 hasConceptScore W2020255088C159985019 @default.
- W2020255088 hasConceptScore W2020255088C165021410 @default.