Matches in SemOpenAlex for { <https://semopenalex.org/work/W1815618914> ?p ?o ?g. }
- W1815618914 abstract "Percentage Identity (PID) is frequently quoted in discussion of sequence alignments since it appears simple and easy to understand. However, although there are several different ways to calculate percentage identity and each may yield a different result for the same alignment, the method of calculation is rarely reported. Accordingly, quantification of the variation in PID caused by the different calculations would help in interpreting PID values in the literature. In this study, the variation in PID was quantified systematically on a reference set of 1028 alignments generated by comparison of the protein three-dimensional structures. Since the alignment algorithm may also affect the range of PID, this study also considered the effect of algorithm, and the combination of algorithm and PID method. The maximum variation in PID due to the calculation method was 11.5% while the effect of alignment algorithm on PID was up to 14.6% across three popular alignment methods. The combined effect of alignment algorithm and PID calculation gave a variation of up to 22% on the test data, with an average of 5.3% +/- 2.8% for sequence pairs with < 30% identity. In order to see which PID method was most highly correlated with structural similarity, four different PID calculations were compared to similarity scores (Sc) from the comparison of the corresponding protein three-dimensional structures. The highest correlation coefficient for a PID calculation was 0.80. In contrast, the more sophisticated Z-score calculated by reference to randomized sequences gave a correlation coefficient of 0.84. Although it is well known amongst expert sequence analysts that PID is a poor score for discriminating between protein sequences, the apparent simplicity of the percentage identity score encourages its widespread use in establishing cutoffs for structural similarity.
This paper illustrates that not only is PID a poor measure of sequence similarity when compared to the Z-score, but that there is also a large uncertainty in reported PID values. Since better alternatives to PID exist to quantify sequence similarity, these should be quoted where possible in preference to PID. The findings presented here should prove helpful to those new to sequence analysis, and in warning those who seek to interpret the value of a PID reported in the literature." @default.
- W1815618914 created "2016-06-24" @default.
- W1815618914 creator A5051938893 @default.
- W1815618914 creator A5082043302 @default.
- W1815618914 date "2006-09-19" @default.
- W1815618914 modified "2023-10-14" @default.
- W1815618914 title "Quantification of the variation in percentage identity for protein sequence alignments" @default.
- W1815618914 cites W1502621331 @default.
- W1815618914 cites W1524940515 @default.
- W1815618914 cites W1527979595 @default.
- W1815618914 cites W1567621547 @default.
- W1815618914 cites W1604980477 @default.
- W1815618914 cites W2005313213 @default.
- W1815618914 cites W2054211501 @default.
- W1815618914 cites W2074231493 @default.
- W1815618914 cites W2075610668 @default.
- W1815618914 cites W2084367350 @default.
- W1815618914 cites W2094519647 @default.
- W1815618914 cites W2099254366 @default.
- W1815618914 cites W2106882534 @default.
- W1815618914 cites W2119511059 @default.
- W1815618914 cites W2142131700 @default.
- W1815618914 doi "https://doi.org/10.1186/1471-2105-7-415" @default.
- W1815618914 hasPubMedCentralId "https://www.ncbi.nlm.nih.gov/pmc/articles/1592310" @default.
- W1815618914 hasPubMedId "https://pubmed.ncbi.nlm.nih.gov/16984632" @default.
- W1815618914 hasPublicationYear "2006" @default.
- W1815618914 type Work @default.
- W1815618914 sameAs 1815618914 @default.
- W1815618914 citedByCount "51" @default.
- W1815618914 countsByYear W18156189142012 @default.
- W1815618914 countsByYear W18156189142013 @default.
- W1815618914 countsByYear W18156189142014 @default.
- W1815618914 countsByYear W18156189142015 @default.
- W1815618914 countsByYear W18156189142016 @default.
- W1815618914 countsByYear W18156189142017 @default.
- W1815618914 countsByYear W18156189142018 @default.
- W1815618914 countsByYear W18156189142019 @default.
- W1815618914 countsByYear W18156189142020 @default.
- W1815618914 countsByYear W18156189142021 @default.
- W1815618914 countsByYear W18156189142022 @default.
- W1815618914 countsByYear W18156189142023 @default.
- W1815618914 crossrefType "journal-article" @default.
- W1815618914 hasAuthorship W1815618914A5051938893 @default.
- W1815618914 hasAuthorship W1815618914A5082043302 @default.
- W1815618914 hasBestOaLocation W18156189141 @default.
- W1815618914 hasConcept C103278499 @default.
- W1815618914 hasConcept C105795698 @default.
- W1815618914 hasConcept C11413529 @default.
- W1815618914 hasConcept C115961682 @default.
- W1815618914 hasConcept C117220453 @default.
- W1815618914 hasConcept C121332964 @default.
- W1815618914 hasConcept C154945302 @default.
- W1815618914 hasConcept C159985019 @default.
- W1815618914 hasConcept C177264268 @default.
- W1815618914 hasConcept C192562407 @default.
- W1815618914 hasConcept C199360897 @default.
- W1815618914 hasConcept C204323151 @default.
- W1815618914 hasConcept C24890656 @default.
- W1815618914 hasConcept C2524010 @default.
- W1815618914 hasConcept C2778334786 @default.
- W1815618914 hasConcept C2778355321 @default.
- W1815618914 hasConcept C2780092901 @default.
- W1815618914 hasConcept C33923547 @default.
- W1815618914 hasConcept C41008148 @default.
- W1815618914 hasConcept C44870925 @default.
- W1815618914 hasConcept C47116090 @default.
- W1815618914 hasConcept C536315585 @default.
- W1815618914 hasConcept C89838059 @default.
- W1815618914 hasConcept C97355855 @default.
- W1815618914 hasConceptScore W1815618914C103278499 @default.
- W1815618914 hasConceptScore W1815618914C105795698 @default.
- W1815618914 hasConceptScore W1815618914C11413529 @default.
- W1815618914 hasConceptScore W1815618914C115961682 @default.
- W1815618914 hasConceptScore W1815618914C117220453 @default.
- W1815618914 hasConceptScore W1815618914C121332964 @default.
- W1815618914 hasConceptScore W1815618914C154945302 @default.
- W1815618914 hasConceptScore W1815618914C159985019 @default.
- W1815618914 hasConceptScore W1815618914C177264268 @default.
- W1815618914 hasConceptScore W1815618914C192562407 @default.
- W1815618914 hasConceptScore W1815618914C199360897 @default.
- W1815618914 hasConceptScore W1815618914C204323151 @default.
- W1815618914 hasConceptScore W1815618914C24890656 @default.
- W1815618914 hasConceptScore W1815618914C2524010 @default.
- W1815618914 hasConceptScore W1815618914C2778334786 @default.
- W1815618914 hasConceptScore W1815618914C2778355321 @default.
- W1815618914 hasConceptScore W1815618914C2780092901 @default.
- W1815618914 hasConceptScore W1815618914C33923547 @default.
- W1815618914 hasConceptScore W1815618914C41008148 @default.
- W1815618914 hasConceptScore W1815618914C44870925 @default.
- W1815618914 hasConceptScore W1815618914C47116090 @default.
- W1815618914 hasConceptScore W1815618914C536315585 @default.
- W1815618914 hasConceptScore W1815618914C89838059 @default.
- W1815618914 hasConceptScore W1815618914C97355855 @default.
- W1815618914 hasIssue "1" @default.
- W1815618914 hasLocation W18156189141 @default.
- W1815618914 hasLocation W18156189142 @default.
- W1815618914 hasLocation W18156189143 @default.
- W1815618914 hasLocation W18156189144 @default.
- W1815618914 hasLocation W18156189145 @default.
- W1815618914 hasLocation W18156189146 @default.