Matches in SemOpenAlex for { <https://semopenalex.org/work/W4381887729> ?p ?o ?g. }
- W4381887729 abstract "Sequence annotation plays a crucial role in bioinformatics, and relies on large-scale sequence database searches with sequence alignment tools. The accuracy of sequence alignment tools depends on their ability to differentiate between true homology and similarity due to random or biased processes. Tool developers rely on benchmarks to evaluate accuracy; these benchmarks consist of trusted homologs for measuring sensitivity and some form of decoy sequences for measuring false labeling. Various methods exist for producing decoys, including shuffling and random sampling; however, these methods may underestimate the rate of false positives by disregarding sources of bias often found in biological sequences. An alternative approach is the use of reversed biological sequences as decoys, because these preserve the distribution of letters and the existence of tandem repeats, while disrupting the original sequence’s functional properties. However, even randomly shuffled sequences contain palindromic elements that are on average longer than the longest common substrings shared between shuffled variants of the same sequence, leading to concerns that reversed sequences may retain signal of homology that has been thought to be destroyed during reversal. Motivated by concerns that this may cause the false positive rate of sequence annotation tools to be overestimated by benchmarks containing reversed sequences, we investigate the effect of reversal on sequence alignment score. We demonstrate that alignment between sequences and their reversals, even shuffled sequences with no notable repetitive or low-complexity regions, produces higher scores than alignment between truly unrelated sequences. We further demonstrate that this elevated score of alignment to reversals holds true even when the reversed sequences are subjected to moderate levels of mutation. As a result of these observations, we counsel against using reversed sequences as false-annotation benchmarks for sequence database search tools." @default.
- W4381887729 created "2023-06-25" @default.
- W4381887729 creator A5073174813 @default.
- W4381887729 creator A5092253519 @default.
- W4381887729 date "2023-06-22" @default.
- W4381887729 modified "2023-10-18" @default.
- W4381887729 title "WAS IT A MATch I SAW? Approximate palindromes lead to overstated false match rates in benchmarks using reversed sequences" @default.
- W4381887729 cites W1498512229 @default.
- W4381887729 cites W1967191258 @default.
- W4381887729 cites W1969465720 @default.
- W4381887729 cites W1996112765 @default.
- W4381887729 cites W2033807426 @default.
- W4381887729 cites W2041460881 @default.
- W4381887729 cites W2045067706 @default.
- W4381887729 cites W2057197272 @default.
- W4381887729 cites W2094519647 @default.
- W4381887729 cites W2097632784 @default.
- W4381887729 cites W2106263380 @default.
- W4381887729 cites W2112814753 @default.
- W4381887729 cites W2118581189 @default.
- W4381887729 cites W2122665495 @default.
- W4381887729 cites W2135621733 @default.
- W4381887729 cites W2136145671 @default.
- W4381887729 cites W2138122982 @default.
- W4381887729 cites W2143210482 @default.
- W4381887729 cites W2151831732 @default.
- W4381887729 cites W2158714788 @default.
- W4381887729 cites W2166637863 @default.
- W4381887729 cites W2212764171 @default.
- W4381887729 cites W2303521084 @default.
- W4381887729 cites W2888145406 @default.
- W4381887729 cites W2950954328 @default.
- W4381887729 cites W2984761660 @default.
- W4381887729 cites W3129580143 @default.
- W4381887729 cites W3143063265 @default.
- W4381887729 cites W3156366691 @default.
- W4381887729 cites W4236236547 @default.
- W4381887729 cites W4309506674 @default.
- W4381887729 cites W938539187 @default.
- W4381887729 doi "https://doi.org/10.1101/2023.06.19.545636" @default.
- W4381887729 hasPublicationYear "2023" @default.
- W4381887729 type Work @default.
- W4381887729 citedByCount "1" @default.
- W4381887729 crossrefType "posted-content" @default.
- W4381887729 hasAuthorship W4381887729A5073174813 @default.
- W4381887729 hasAuthorship W4381887729A5092253519 @default.
- W4381887729 hasBestOaLocation W43818877291 @default.
- W4381887729 hasConcept C104317684 @default.
- W4381887729 hasConcept C11413529 @default.
- W4381887729 hasConcept C153180895 @default.
- W4381887729 hasConcept C154945302 @default.
- W4381887729 hasConcept C167625842 @default.
- W4381887729 hasConcept C167927819 @default.
- W4381887729 hasConcept C177264268 @default.
- W4381887729 hasConcept C180384323 @default.
- W4381887729 hasConcept C182407805 @default.
- W4381887729 hasConcept C199360897 @default.
- W4381887729 hasConcept C2776321320 @default.
- W4381887729 hasConcept C2778112365 @default.
- W4381887729 hasConcept C41008148 @default.
- W4381887729 hasConcept C44667518 @default.
- W4381887729 hasConcept C45484198 @default.
- W4381887729 hasConcept C54355233 @default.
- W4381887729 hasConcept C64869954 @default.
- W4381887729 hasConcept C70721500 @default.
- W4381887729 hasConcept C86803240 @default.
- W4381887729 hasConcept C98108389 @default.
- W4381887729 hasConceptScore W4381887729C104317684 @default.
- W4381887729 hasConceptScore W4381887729C11413529 @default.
- W4381887729 hasConceptScore W4381887729C153180895 @default.
- W4381887729 hasConceptScore W4381887729C154945302 @default.
- W4381887729 hasConceptScore W4381887729C167625842 @default.
- W4381887729 hasConceptScore W4381887729C167927819 @default.
- W4381887729 hasConceptScore W4381887729C177264268 @default.
- W4381887729 hasConceptScore W4381887729C180384323 @default.
- W4381887729 hasConceptScore W4381887729C182407805 @default.
- W4381887729 hasConceptScore W4381887729C199360897 @default.
- W4381887729 hasConceptScore W4381887729C2776321320 @default.
- W4381887729 hasConceptScore W4381887729C2778112365 @default.
- W4381887729 hasConceptScore W4381887729C41008148 @default.
- W4381887729 hasConceptScore W4381887729C44667518 @default.
- W4381887729 hasConceptScore W4381887729C45484198 @default.
- W4381887729 hasConceptScore W4381887729C54355233 @default.
- W4381887729 hasConceptScore W4381887729C64869954 @default.
- W4381887729 hasConceptScore W4381887729C70721500 @default.
- W4381887729 hasConceptScore W4381887729C86803240 @default.
- W4381887729 hasConceptScore W4381887729C98108389 @default.
- W4381887729 hasLocation W43818877291 @default.
- W4381887729 hasOpenAccess W4381887729 @default.
- W4381887729 hasPrimaryLocation W43818877291 @default.
- W4381887729 hasRelatedWork W1783789643 @default.
- W4381887729 hasRelatedWork W1977143711 @default.
- W4381887729 hasRelatedWork W1992408360 @default.
- W4381887729 hasRelatedWork W2124984338 @default.
- W4381887729 hasRelatedWork W2152267201 @default.
- W4381887729 hasRelatedWork W2350198651 @default.
- W4381887729 hasRelatedWork W2796706644 @default.
- W4381887729 hasRelatedWork W3000366369 @default.
- W4381887729 hasRelatedWork W4381887729 @default.
- W4381887729 hasRelatedWork W88386512 @default.
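
The comparison described in the abstract (alignment of a sequence to its own reversal versus alignment to a shuffled copy) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' method: the Smith-Waterman scoring parameters, the DNA alphabet, the sequence length, and the helper names `local_alignment_score`, `shuffled`, and `mean_decoy_scores` are all assumptions made for this example. No particular output is claimed here; the abstract reports the effect for real alignment tools, and a toy scorer on short random sequences may or may not reproduce it.

```python
import random


def local_alignment_score(a, b, match=2, mismatch=-1, gap=-2):
    """Smith-Waterman local alignment score with a linear gap penalty.

    The scoring values are illustrative assumptions, not taken from the paper.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            # Score for aligning a[i-1] with b[j-1].
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Local alignment: a cell never drops below zero.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


def shuffled(seq, rng):
    """Return a random permutation of seq (same letter composition)."""
    letters = list(seq)
    rng.shuffle(letters)
    return "".join(letters)


def mean_decoy_scores(length=200, trials=20, alphabet="ACGT", seed=0):
    """Mean score of random sequences against reversed and shuffled decoys."""
    rng = random.Random(seed)
    reversed_total = 0
    shuffled_total = 0
    for _ in range(trials):
        seq = "".join(rng.choice(alphabet) for _ in range(length))
        reversed_total += local_alignment_score(seq, seq[::-1])
        shuffled_total += local_alignment_score(seq, shuffled(seq, rng))
    return reversed_total / trials, shuffled_total / trials


if __name__ == "__main__":
    rev_mean, shuf_mean = mean_decoy_scores()
    print(f"mean score vs reversed decoy: {rev_mean:.2f}")
    print(f"mean score vs shuffled decoy: {shuf_mean:.2f}")
```

Both decoy types keep the letter composition of the original sequence, so any difference between the two means comes from letter order alone, which is the property the abstract is concerned with.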