Matches in SemOpenAlex for { <https://semopenalex.org/work/W4387665655> ?p ?o ?g. }
Showing items 1 to 93 of 93 with 100 items per page.
- W4387665655 abstract "Substrings of length k, commonly referred to as k-mers, play a vital role in sequence analysis, reducing the search space by providing anchors between queries and references. However, k-mers are limited to exact matches between sequences. This has led to alternative constructs, such as spaced k-mers, that can match across substitutions. We recently introduced a class of new constructs, strobemers, that can match across substitutions and smaller insertions and deletions. Randstrobes, the most sensitive strobemer proposed in (Sahlin, 2021), has been incorporated into several bioinformatics applications such as read classification, short read mapping, and read overlap detection. Randstrobes are constructed by linking together k-mers in a pseudo-random fashion and depend on a hash function, a link function, and a comparator for their construction. Recently, we showed that the more random this linking appears (measured in entropy), the more efficient the seeds for sequence similarity analysis. The level of pseudo-randomness will depend on the hashing, linking, and comparison operators. However, no study has investigated the efficacy of the underlying operators to produce randstrobes. In this study, we propose several new construction methods. One of our proposed methods is based on a Binary Search Tree (BST), which lowers the time complexity and practical runtime to other methods for some parametrizations. To our knowledge, we are also the first to describe and study the types of biases that occur during construction. We designed three metrics to measure the bias. Using these new evaluation metrics, we uncovered biases and limitations in previous methods and showed that our proposed methods have favorable speed and sampling uniformity to previously proposed methods. Lastly, guided by our results, we change the seed construction in strobealign, a short-read mapper, and find that the results change substantially. Also, we suggest combining the two versions to improve accuracy for the shortest reads in our evaluated datasets. Our evaluation highlights sampling biases that can occur and provides guidance on which operators to use when implementing randstrobes." @default.
- W4387665655 created "2023-10-17" @default.
- W4387665655 creator A5010860687 @default.
- W4387665655 creator A5014584023 @default.
- W4387665655 creator A5017535979 @default.
- W4387665655 creator A5018738102 @default.
- W4387665655 creator A5040820400 @default.
- W4387665655 creator A5043281019 @default.
- W4387665655 creator A5082939519 @default.
- W4387665655 creator A5084971906 @default.
- W4387665655 creator A5087432763 @default.
- W4387665655 creator A5088798234 @default.
- W4387665655 date "2023-10-16" @default.
- W4387665655 modified "2023-10-17" @default.
- W4387665655 title "Designing efficient randstrobes for sequence similarity analyses" @default.
- W4387665655 doi "https://doi.org/10.1101/2023.10.11.561924" @default.
- W4387665655 hasPublicationYear "2023" @default.
- W4387665655 type Work @default.
- W4387665655 citedByCount "0" @default.
- W4387665655 crossrefType "posted-content" @default.
- W4387665655 hasAuthorship W4387665655A5010860687 @default.
- W4387665655 hasAuthorship W4387665655A5014584023 @default.
- W4387665655 hasAuthorship W4387665655A5017535979 @default.
- W4387665655 hasAuthorship W4387665655A5018738102 @default.
- W4387665655 hasAuthorship W4387665655A5040820400 @default.
- W4387665655 hasAuthorship W4387665655A5043281019 @default.
- W4387665655 hasAuthorship W4387665655A5082939519 @default.
- W4387665655 hasAuthorship W4387665655A5084971906 @default.
- W4387665655 hasAuthorship W4387665655A5087432763 @default.
- W4387665655 hasAuthorship W4387665655A5088798234 @default.
- W4387665655 hasBestOaLocation W43876656551 @default.
- W4387665655 hasConcept C103278499 @default.
- W4387665655 hasConcept C105795698 @default.
- W4387665655 hasConcept C106301342 @default.
- W4387665655 hasConcept C11413529 @default.
- W4387665655 hasConcept C115961682 @default.
- W4387665655 hasConcept C121332964 @default.
- W4387665655 hasConcept C124101348 @default.
- W4387665655 hasConcept C125112378 @default.
- W4387665655 hasConcept C14036430 @default.
- W4387665655 hasConcept C154945302 @default.
- W4387665655 hasConcept C162319229 @default.
- W4387665655 hasConcept C182407805 @default.
- W4387665655 hasConcept C199360897 @default.
- W4387665655 hasConcept C2778112365 @default.
- W4387665655 hasConcept C33923547 @default.
- W4387665655 hasConcept C38652104 @default.
- W4387665655 hasConcept C41008148 @default.
- W4387665655 hasConcept C54355233 @default.
- W4387665655 hasConcept C62520636 @default.
- W4387665655 hasConcept C78458016 @default.
- W4387665655 hasConcept C80444323 @default.
- W4387665655 hasConcept C86803240 @default.
- W4387665655 hasConcept C99138194 @default.
- W4387665655 hasConceptScore W4387665655C103278499 @default.
- W4387665655 hasConceptScore W4387665655C105795698 @default.
- W4387665655 hasConceptScore W4387665655C106301342 @default.
- W4387665655 hasConceptScore W4387665655C11413529 @default.
- W4387665655 hasConceptScore W4387665655C115961682 @default.
- W4387665655 hasConceptScore W4387665655C121332964 @default.
- W4387665655 hasConceptScore W4387665655C124101348 @default.
- W4387665655 hasConceptScore W4387665655C125112378 @default.
- W4387665655 hasConceptScore W4387665655C14036430 @default.
- W4387665655 hasConceptScore W4387665655C154945302 @default.
- W4387665655 hasConceptScore W4387665655C162319229 @default.
- W4387665655 hasConceptScore W4387665655C182407805 @default.
- W4387665655 hasConceptScore W4387665655C199360897 @default.
- W4387665655 hasConceptScore W4387665655C2778112365 @default.
- W4387665655 hasConceptScore W4387665655C33923547 @default.
- W4387665655 hasConceptScore W4387665655C38652104 @default.
- W4387665655 hasConceptScore W4387665655C41008148 @default.
- W4387665655 hasConceptScore W4387665655C54355233 @default.
- W4387665655 hasConceptScore W4387665655C62520636 @default.
- W4387665655 hasConceptScore W4387665655C78458016 @default.
- W4387665655 hasConceptScore W4387665655C80444323 @default.
- W4387665655 hasConceptScore W4387665655C86803240 @default.
- W4387665655 hasConceptScore W4387665655C99138194 @default.
- W4387665655 hasLocation W43876656551 @default.
- W4387665655 hasOpenAccess W4387665655 @default.
- W4387665655 hasPrimaryLocation W43876656551 @default.
- W4387665655 hasRelatedWork W1583922594 @default.
- W4387665655 hasRelatedWork W1953626159 @default.
- W4387665655 hasRelatedWork W1974038726 @default.
- W4387665655 hasRelatedWork W2945511280 @default.
- W4387665655 hasRelatedWork W2951756867 @default.
- W4387665655 hasRelatedWork W2998448420 @default.
- W4387665655 hasRelatedWork W3035605494 @default.
- W4387665655 hasRelatedWork W4280502676 @default.
- W4387665655 hasRelatedWork W4299420056 @default.
- W4387665655 hasRelatedWork W4304731099 @default.
- W4387665655 isParatext "false" @default.
- W4387665655 isRetracted "false" @default.
- W4387665655 workType "article" @default.
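
The abstract in this record says randstrobes are built by linking k-mers through a hash function, a link function, and a comparator. The Python sketch below illustrates that three-part structure for two-strobe seeds only; it is not the paper's method or the strobealign implementation. The names `kmer_hash`, `link`, and `randstrobes`, the parameters `w_min` and `w_max`, the BLAKE2b hash, the XOR link, and the "smallest value wins" comparator are all assumptions chosen for illustration.

```python
import hashlib


def kmer_hash(kmer: str) -> int:
    """Hash function (assumed): a stable 64-bit hash of a k-mer."""
    digest = hashlib.blake2b(kmer.encode("ascii"), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def link(h1: int, h2: int) -> int:
    """Link function (assumed): combine the hashes of two strobes with XOR."""
    return h1 ^ h2


def randstrobes(seq: str, k: int, w_min: int, w_max: int) -> list[tuple[int, int]]:
    """Return (first_pos, second_pos) pairs for order-2 randstrobe-like seeds.

    For the k-mer at position i, candidates for the second strobe start at
    positions i + w_min .. i + w_max (clipped to the sequence). The comparator
    (assumed) picks the candidate with the smallest linked value.
    """
    n_kmers = len(seq) - k + 1
    hashes = [kmer_hash(seq[i:i + k]) for i in range(n_kmers)]
    seeds = []
    for i in range(n_kmers):
        start = i + w_min
        end = min(i + w_max, n_kmers - 1)
        if start > end:
            break  # no room left for a second strobe
        # Comparator: minimum over the window; ties go to the leftmost candidate.
        j = min(range(start, end + 1), key=lambda c: link(hashes[i], hashes[c]))
        seeds.append((i, j))
    return seeds


if __name__ == "__main__":
    for first, second in randstrobes("ACGTACGTTGCAACGTAGCT", k=3, w_min=3, w_max=6):
        print(first, second)
```

This version scans every window linearly, so it costs on the order of the window size per seed. The abstract mentions a Binary Search Tree based method that lowers the time complexity for some parametrizations; that method is not reproduced here.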