Matches in SemOpenAlex for { <https://semopenalex.org/work/W3044435074> ?p ?o ?g. }
- W3044435074 abstract "Analysis of genetic sequences is usually based on finding similar parts of sequences, e.g. DNA reads and/or genomes. For big data, this is typically done via “seeds”: simple similarities (e.g. exact matches) that can be found quickly. For huge data, sparse seeding is useful, where we only consider seeds at a subset of positions in a sequence. Here we study a simple sparse-seeding method: using seeds at positions of certain “words” (e.g. ac, at, gc, or gt). Sensitivity is maximized by using words with minimal overlaps. That is because, in a random sequence, minimally-overlapping words are anti-clumped. We provide evidence that this is often superior to acclaimed “minimizer” sparse-seeding methods. Our approach can be unified with design of inexact (spaced and subset) seeds, further boosting sensitivity. Thus, we present a promising approach to sequence similarity search, with open questions on how to optimize it." @default.
- W3044435074 created "2020-07-29" @default.
- W3044435074 creator A5012626407 @default.
- W3044435074 creator A5055679303 @default.
- W3044435074 creator A5055829507 @default.
- W3044435074 date "2020-07-26" @default.
- W3044435074 modified "2023-09-30" @default.
- W3044435074 title "Minimally-overlapping words for sequence similarity search" @default.
- W3044435074 cites W1759861793 @default.
- W3044435074 cites W1964377951 @default.
- W3044435074 cites W1972051891 @default.
- W3044435074 cites W2106532429 @default.
- W3044435074 cites W2109616780 @default.
- W3044435074 cites W2111071896 @default.
- W3044435074 cites W2111295912 @default.
- W3044435074 cites W2125266506 @default.
- W3044435074 cites W2128591967 @default.
- W3044435074 cites W2132632499 @default.
- W3044435074 cites W2136543531 @default.
- W3044435074 cites W2139371638 @default.
- W3044435074 cites W2144560237 @default.
- W3044435074 cites W2159954944 @default.
- W3044435074 cites W2160464048 @default.
- W3044435074 cites W2162758337 @default.
- W3044435074 cites W2763390627 @default.
- W3044435074 cites W2789843538 @default.
- W3044435074 cites W2792368734 @default.
- W3044435074 cites W2892689606 @default.
- W3044435074 cites W2950572599 @default.
- W3044435074 cites W2951822379 @default.
- W3044435074 cites W3007172120 @default.
- W3044435074 cites W3098719287 @default.
- W3044435074 cites W3100437611 @default.
- W3044435074 doi "https://doi.org/10.1101/2020.07.24.220616" @default.
- W3044435074 hasPublicationYear "2020" @default.
- W3044435074 type Work @default.
- W3044435074 sameAs 3044435074 @default.
- W3044435074 citedByCount "3" @default.
- W3044435074 countsByYear W30444350742021 @default.
- W3044435074 countsByYear W30444350742022 @default.
- W3044435074 crossrefType "posted-content" @default.
- W3044435074 hasAuthorship W3044435074A5012626407 @default.
- W3044435074 hasAuthorship W3044435074A5055679303 @default.
- W3044435074 hasAuthorship W3044435074A5055829507 @default.
- W3044435074 hasBestOaLocation W30444350741 @default.
- W3044435074 hasConcept C103278499 @default.
- W3044435074 hasConcept C111472728 @default.
- W3044435074 hasConcept C11413529 @default.
- W3044435074 hasConcept C115961682 @default.
- W3044435074 hasConcept C127413603 @default.
- W3044435074 hasConcept C138885662 @default.
- W3044435074 hasConcept C153180895 @default.
- W3044435074 hasConcept C154945302 @default.
- W3044435074 hasConcept C21200559 @default.
- W3044435074 hasConcept C24326235 @default.
- W3044435074 hasConcept C2778112365 @default.
- W3044435074 hasConcept C2780586882 @default.
- W3044435074 hasConcept C33923547 @default.
- W3044435074 hasConcept C36248471 @default.
- W3044435074 hasConcept C41008148 @default.
- W3044435074 hasConcept C46686674 @default.
- W3044435074 hasConcept C54355233 @default.
- W3044435074 hasConcept C6557445 @default.
- W3044435074 hasConcept C86803240 @default.
- W3044435074 hasConceptScore W3044435074C103278499 @default.
- W3044435074 hasConceptScore W3044435074C111472728 @default.
- W3044435074 hasConceptScore W3044435074C11413529 @default.
- W3044435074 hasConceptScore W3044435074C115961682 @default.
- W3044435074 hasConceptScore W3044435074C127413603 @default.
- W3044435074 hasConceptScore W3044435074C138885662 @default.
- W3044435074 hasConceptScore W3044435074C153180895 @default.
- W3044435074 hasConceptScore W3044435074C154945302 @default.
- W3044435074 hasConceptScore W3044435074C21200559 @default.
- W3044435074 hasConceptScore W3044435074C24326235 @default.
- W3044435074 hasConceptScore W3044435074C2778112365 @default.
- W3044435074 hasConceptScore W3044435074C2780586882 @default.
- W3044435074 hasConceptScore W3044435074C33923547 @default.
- W3044435074 hasConceptScore W3044435074C36248471 @default.
- W3044435074 hasConceptScore W3044435074C41008148 @default.
- W3044435074 hasConceptScore W3044435074C46686674 @default.
- W3044435074 hasConceptScore W3044435074C54355233 @default.
- W3044435074 hasConceptScore W3044435074C6557445 @default.
- W3044435074 hasConceptScore W3044435074C86803240 @default.
- W3044435074 hasLocation W30444350741 @default.
- W3044435074 hasLocation W304443507410 @default.
- W3044435074 hasLocation W304443507411 @default.
- W3044435074 hasLocation W304443507412 @default.
- W3044435074 hasLocation W304443507413 @default.
- W3044435074 hasLocation W304443507414 @default.
- W3044435074 hasLocation W304443507415 @default.
- W3044435074 hasLocation W304443507416 @default.
- W3044435074 hasLocation W30444350742 @default.
- W3044435074 hasLocation W30444350743 @default.
- W3044435074 hasLocation W30444350744 @default.
- W3044435074 hasLocation W30444350745 @default.
- W3044435074 hasLocation W30444350746 @default.
- W3044435074 hasLocation W30444350747 @default.
- W3044435074 hasLocation W30444350748 @default.
- W3044435074 hasLocation W30444350749 @default.
- W3044435074 hasOpenAccess W3044435074 @default.
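
The sparse-seeding idea described in the abstract above (placing seeds only at positions where one of a few chosen "words" begins) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `word_positions` and `sparse_seeds`, the fixed seed length, and the use of plain exact-match substrings as seeds are all assumptions made for illustration. The word set ac, at, gc, gt is the example given in the abstract.

```python
# Illustrative sketch only: function names and the seed_len parameter are
# hypothetical, not taken from the paper or from any library.

EXAMPLE_WORDS = ("ac", "at", "gc", "gt")  # example word set from the abstract


def word_positions(seq, words=EXAMPLE_WORDS):
    """Return the 0-based positions in seq where any of the words begins."""
    seq = seq.lower()
    return [i for i in range(len(seq))
            if any(seq.startswith(w, i) for w in words)]


def sparse_seeds(seq, seed_len, words=EXAMPLE_WORDS):
    """Return (position, substring) pairs for exact-match seeds of length
    seed_len that start at a word position and fit inside seq."""
    seq = seq.lower()
    return [(i, seq[i:i + seed_len])
            for i in word_positions(seq, words)
            if i + seed_len <= len(seq)]


if __name__ == "__main__":
    print(word_positions("acgtacgt"))
    print(sparse_seeds("acgtacgt", 4))
```

In this example word set, every word starts with a or g and ends with c or t, so no word can begin at the position immediately after another word begins. The selected positions are therefore never adjacent, which is a small-scale view of the "anti-clumped" property the abstract refers to.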