Matches in SemOpenAlex for { <https://semopenalex.org/work/W2737578470> ?p ?o ?g. }
- W2737578470 abstract "Computational prediction of protein function constitutes one of the more complex problems in Bioinformatics, because of the diversity of functions and mechanisms that proteins exert in nature. This issue is reinforced especially for proteins that share very low primary or tertiary structure similarity to existing annotated proteomes. In this sense, new alignment-free (AF) tools are needed to overcome the inherent limitations of classic alignment-based approaches to this issue. We have recently introduced AF protein-numerical-encoding programs (TI2BioP and ProtDCal), whose sequence-based features have been successfully applied to detect remote protein homologs, post-translational modifications and antibacterial peptides. Here we aim to demonstrate the applicability of 4 AF protein descriptor families, implemented in our programs, for the identification of enzyme-like proteins. At the same time, the use of our novel family of 3D–structure-based descriptors is introduced for the first time. The Dobson & Doig (D&D) benchmark dataset is used for the evaluation of our AF protein descriptors, because of its proven structural diversity that permits one to emulate an experiment within the twilight zone of alignment-based methods (pair-wise identity <30%). The performance of our sequence-based predictor was further assessed using a subset of formerly uncharacterized proteins which currently represent a benchmark annotation dataset. Four protein descriptor families (sequence-composition-based (0D), linear-topology-based (1D), pseudo-fold-topology-based (2D) and 3D–structure features (3D)) were assessed using the D&D benchmark dataset. We show that only the families of ProtDCal’s descriptors (0D, 1D and 3D) encode significant information for enzymes and non-enzymes discrimination. The obtained 3D–structure-based classifier ranked first among several other SVM-based methods assessed in this dataset.
Furthermore, the model leveraging 1D descriptors showed a higher success rate than EzyPred on a benchmark annotation dataset from the Shewanella oneidensis proteome. The applicability of ProtDCal as a general-purpose AF protein modelling method is illustrated through the discrimination between two comprehensive protein functional classes. The observed performances using the highly diverse D&D dataset, and the set of formerly uncharacterized (hard-to-annotate) proteins of Shewanella oneidensis, place our methodology in the top range of methods to model and predict protein function using alignment-free approaches." @default.
- W2737578470 created "2017-07-31" @default.
- W2737578470 creator A5000044766 @default.
- W2737578470 creator A5000961076 @default.
- W2737578470 creator A5017842519 @default.
- W2737578470 creator A5035402210 @default.
- W2737578470 creator A5054389046 @default.
- W2737578470 creator A5078223816 @default.
- W2737578470 date "2017-07-21" @default.
- W2737578470 modified "2023-10-18" @default.
- W2737578470 title "Exploring general-purpose protein features for distinguishing enzymes and non-enzymes within the twilight zone" @default.
- W2737578470 cites W112873001 @default.
- W2737578470 cites W1179283095 @default.
- W2737578470 cites W1496257230 @default.
- W2737578470 cites W1970613018 @default.
- W2737578470 cites W1971896762 @default.
- W2737578470 cites W1972863608 @default.
- W2737578470 cites W1975304761 @default.
- W2737578470 cites W1976159745 @default.
- W2737578470 cites W1976656799 @default.
- W2737578470 cites W1978914444 @default.
- W2737578470 cites W1980397590 @default.
- W2737578470 cites W1980497258 @default.
- W2737578470 cites W1982910530 @default.
- W2737578470 cites W1985718949 @default.
- W2737578470 cites W1992311227 @default.
- W2737578470 cites W1995808589 @default.
- W2737578470 cites W1995875735 @default.
- W2737578470 cites W1998490996 @default.
- W2737578470 cites W2014731953 @default.
- W2737578470 cites W2023306209 @default.
- W2737578470 cites W2026666393 @default.
- W2737578470 cites W2043886357 @default.
- W2737578470 cites W2043906184 @default.
- W2737578470 cites W2049012227 @default.
- W2737578470 cites W2050160299 @default.
- W2737578470 cites W2053423550 @default.
- W2737578470 cites W2054068479 @default.
- W2737578470 cites W2054453488 @default.
- W2737578470 cites W2055043387 @default.
- W2737578470 cites W2056053405 @default.
- W2737578470 cites W2063088204 @default.
- W2737578470 cites W2064664178 @default.
- W2737578470 cites W2066759237 @default.
- W2737578470 cites W2076022581 @default.
- W2737578470 cites W2079105106 @default.
- W2737578470 cites W2079193139 @default.
- W2737578470 cites W2087064593 @default.
- W2737578470 cites W2087866582 @default.
- W2737578470 cites W2092750499 @default.
- W2737578470 cites W2101220662 @default.
- W2737578470 cites W2101940264 @default.
- W2737578470 cites W2107749303 @default.
- W2737578470 cites W2111973517 @default.
- W2737578470 cites W2115595474 @default.
- W2737578470 cites W2117077088 @default.
- W2737578470 cites W2119421613 @default.
- W2737578470 cites W2126819771 @default.
- W2737578470 cites W2128653811 @default.
- W2737578470 cites W2131421334 @default.
- W2737578470 cites W2132292391 @default.
- W2737578470 cites W2133990480 @default.
- W2737578470 cites W2141885858 @default.
- W2737578470 cites W2145957695 @default.
- W2737578470 cites W2151831732 @default.
- W2737578470 cites W2152800101 @default.
- W2737578470 cites W2155161278 @default.
- W2737578470 cites W2155524195 @default.
- W2737578470 cites W2161550872 @default.
- W2737578470 cites W2166637863 @default.
- W2737578470 cites W2176982668 @default.
- W2737578470 cites W2190044768 @default.
- W2737578470 cites W2203756811 @default.
- W2737578470 cites W2296592747 @default.
- W2737578470 cites W2489559155 @default.
- W2737578470 cites W2557395024 @default.
- W2737578470 cites W2584459311 @default.
- W2737578470 cites W2978725006 @default.
- W2737578470 cites W4210531204 @default.
- W2737578470 cites W4213345021 @default.
- W2737578470 cites W4235848672 @default.
- W2737578470 doi "https://doi.org/10.1186/s12859-017-1758-x" @default.
- W2737578470 hasPubMedCentralId "https://www.ncbi.nlm.nih.gov/pmc/articles/5521120" @default.
- W2737578470 hasPubMedId "https://pubmed.ncbi.nlm.nih.gov/28732462" @default.
- W2737578470 hasPublicationYear "2017" @default.
- W2737578470 type Work @default.
- W2737578470 sameAs 2737578470 @default.
- W2737578470 citedByCount "10" @default.
- W2737578470 countsByYear W27375784702018 @default.
- W2737578470 countsByYear W27375784702019 @default.
- W2737578470 countsByYear W27375784702021 @default.
- W2737578470 countsByYear W27375784702022 @default.
- W2737578470 crossrefType "journal-article" @default.
- W2737578470 hasAuthorship W2737578470A5000044766 @default.
- W2737578470 hasAuthorship W2737578470A5000961076 @default.
- W2737578470 hasAuthorship W2737578470A5017842519 @default.
- W2737578470 hasAuthorship W2737578470A5035402210 @default.
- W2737578470 hasAuthorship W2737578470A5054389046 @default.
- W2737578470 hasAuthorship W2737578470A5078223816 @default.
- W2737578470 hasBestOaLocation W27375784701 @default.