Matches in SemOpenAlex for { <https://semopenalex.org/work/W2068850418> ?p ?o ?g. }
- W2068850418 endingPage "501" @default.
- W2068850418 startingPage "449" @default.
- W2068850418 abstract "In view of the fact that the appearance of novel protein domain architectures (DA) is closely associated with biological innovations, there is a growing interest in the genome-scale reconstruction of the evolutionary history of the domain architectures of multidomain proteins. In such analyses, however, it is usually ignored that a significant proportion of Metazoan sequences analyzed is mispredicted and that this may seriously affect the validity of the conclusions. To estimate the contribution of errors in gene prediction to differences in DA of predicted proteins, we have used the high quality manually curated UniProtKB/Swiss-Prot database as a reference. For genome-scale analysis of domain architectures of predicted proteins we focused on RefSeq, EnsEMBL and NCBI’s GNOMON predicted sequences of Metazoan species with completely sequenced genomes. Comparison of the DA of UniProtKB/Swiss-Prot sequences of worm, fly, zebrafish, frog, chick, mouse, rat and orangutan with those of human Swiss-Prot entries has identified relatively few cases where orthologs had different DA, although the percentage with different DA increased with evolutionary distance. In contrast with this, comparison of the DA of human, orangutan, rat, mouse, chicken, frog, zebrafish, worm and fly RefSeq, EnsEMBL and NCBI’s GNOMON predicted protein sequences with those of the corresponding/orthologous human Swiss-Prot entries identified a significantly higher proportion of domain architecture differences than in the case of the comparison of Swiss-Prot entries. Analysis of RefSeq, EnsEMBL and NCBI’s GNOMON predicted protein sequences with DAs different from those of their Swiss-Prot orthologs confirmed that the higher rate of domain architecture differences is due to errors in gene prediction, the majority of which could be corrected with our FixPred protocol.
We have also demonstrated that contamination of databases with incomplete, abnormal or mispredicted sequences introduces a bias in DA differences inasmuch as it increases the proportion of terminal over internal DA differences. Here we have shown that in the case of RefSeq, EnsEMBL and NCBI’s GNOMON predicted protein sequences of Metazoan species, the contribution of gene prediction errors to domain architecture differences of orthologs is comparable to or greater than that due to true gene rearrangements. We have also demonstrated that domain architecture comparison may serve as a useful tool for the quality control of gene predictions and may thus guide the correction of sequence errors. Our findings caution that earlier genome-scale studies based on comparison of predicted (frequently mispredicted) protein sequences may have led to some erroneous conclusions about the evolution of novel domain architectures of multidomain proteins. A reassessment of the DA evolution of orthologous and paralogous proteins is presented in an accompanying paper [1]." @default.
- W2068850418 created "2016-06-24" @default.
- W2068850418 creator A5001383415 @default.
- W2068850418 creator A5014730291 @default.
- W2068850418 creator A5052204988 @default.
- W2068850418 creator A5066228161 @default.
- W2068850418 creator A5071827404 @default.
- W2068850418 creator A5082077647 @default.
- W2068850418 date "2011-07-13" @default.
- W2068850418 modified "2023-09-29" @default.
- W2068850418 title "Reassessing Domain Architecture Evolution of Metazoan Proteins: Major Impact of Gene Prediction Errors" @default.
- W2068850418 cites W1550285268 @default.
- W2068850418 cites W1570477334 @default.
- W2068850418 cites W1581315874 @default.
- W2068850418 cites W1871466069 @default.
- W2068850418 cites W1980051733 @default.
- W2068850418 cites W1981194976 @default.
- W2068850418 cites W1982970002 @default.
- W2068850418 cites W1985471440 @default.
- W2068850418 cites W1990056477 @default.
- W2068850418 cites W1993506738 @default.
- W2068850418 cites W1997113958 @default.
- W2068850418 cites W1998931368 @default.
- W2068850418 cites W2000998778 @default.
- W2068850418 cites W2008028183 @default.
- W2068850418 cites W2008116827 @default.
- W2068850418 cites W2008856488 @default.
- W2068850418 cites W2009239589 @default.
- W2068850418 cites W2009505131 @default.
- W2068850418 cites W2016412003 @default.
- W2068850418 cites W2029833898 @default.
- W2068850418 cites W2031170804 @default.
- W2068850418 cites W2031903428 @default.
- W2068850418 cites W2035333716 @default.
- W2068850418 cites W2043092219 @default.
- W2068850418 cites W2048306434 @default.
- W2068850418 cites W2054049732 @default.
- W2068850418 cites W2056912811 @default.
- W2068850418 cites W2074007968 @default.
- W2068850418 cites W2080764633 @default.
- W2068850418 cites W2086240273 @default.
- W2068850418 cites W2094713937 @default.
- W2068850418 cites W2098354205 @default.
- W2068850418 cites W2102882929 @default.
- W2068850418 cites W2108787198 @default.
- W2068850418 cites W2117887553 @default.
- W2068850418 cites W2122281839 @default.
- W2068850418 cites W2128653811 @default.
- W2068850418 cites W2131232002 @default.
- W2068850418 cites W2132122858 @default.
- W2068850418 cites W2137096454 @default.
- W2068850418 cites W2137823340 @default.
- W2068850418 cites W2141847640 @default.
- W2068850418 cites W2142045367 @default.
- W2068850418 cites W2143335572 @default.
- W2068850418 cites W2153452535 @default.
- W2068850418 cites W2161211166 @default.
- W2068850418 cites W2161394740 @default.
- W2068850418 cites W2161745371 @default.
- W2068850418 cites W2164549814 @default.
- W2068850418 cites W4210430020 @default.
- W2068850418 cites W4210623056 @default.
- W2068850418 cites W4211208250 @default.
- W2068850418 cites W4231041617 @default.
- W2068850418 doi "https://doi.org/10.3390/genes2030449" @default.
- W2068850418 hasPubMedCentralId "https://www.ncbi.nlm.nih.gov/pmc/articles/3927609" @default.
- W2068850418 hasPubMedId "https://pubmed.ncbi.nlm.nih.gov/24710207" @default.
- W2068850418 hasPublicationYear "2011" @default.
- W2068850418 type Work @default.
- W2068850418 sameAs 2068850418 @default.
- W2068850418 citedByCount "19" @default.
- W2068850418 countsByYear W20688504182013 @default.
- W2068850418 countsByYear W20688504182014 @default.
- W2068850418 countsByYear W20688504182016 @default.
- W2068850418 countsByYear W20688504182018 @default.
- W2068850418 countsByYear W20688504182019 @default.
- W2068850418 countsByYear W20688504182022 @default.
- W2068850418 countsByYear W20688504182023 @default.
- W2068850418 crossrefType "journal-article" @default.
- W2068850418 hasAuthorship W2068850418A5001383415 @default.
- W2068850418 hasAuthorship W2068850418A5014730291 @default.
- W2068850418 hasAuthorship W2068850418A5052204988 @default.
- W2068850418 hasAuthorship W2068850418A5066228161 @default.
- W2068850418 hasAuthorship W2068850418A5071827404 @default.
- W2068850418 hasAuthorship W2068850418A5082077647 @default.
- W2068850418 hasBestOaLocation W20688504181 @default.
- W2068850418 hasConcept C104317684 @default.
- W2068850418 hasConcept C141231307 @default.
- W2068850418 hasConcept C141674004 @default.
- W2068850418 hasConcept C144292202 @default.
- W2068850418 hasConcept C151810110 @default.
- W2068850418 hasConcept C189206191 @default.
- W2068850418 hasConcept C194167682 @default.
- W2068850418 hasConcept C199360897 @default.
- W2068850418 hasConcept C202264299 @default.
- W2068850418 hasConcept C27591593 @default.
- W2068850418 hasConcept C2776878037 @default.
- W2068850418 hasConcept C2777904410 @default.