Matches in SemOpenAlex for { <https://semopenalex.org/work/W3000561236> ?p ?o ?g. }
- W3000561236 abstract "Protein secondary structure prediction remains a vital topic with improving accuracy and broad applications. By using deep learning algorithms, prediction methods not relying on structure templates were recently reported to reach as high as 87% accuracy on 3 labels (helix, sheet or coil). Due to the lack of a widely accepted standard in secondary structure predictor development and evaluation, a fair comparison of predictors is challenging. A detailed examination of factors that contribute to higher accuracy is also lacking. In this paper, we present: (1) a new test set, Test2018, consisting of proteins from structures released in 2018 that are less than 25% similar to any protein published before 2018; (2) a 4-layer convolutional neural network, SecNet, with an input window of ±14 amino acids, which was trained on proteins less than 25% identical to proteins in Test2018 and the commonly used CB513 test set; (3) a detailed ablation study where we reverse one algorithmic choice at a time in SecNet and evaluate the effect on the prediction accuracy; (4) new 4- and 5-label prediction alphabets that may be more practical for tertiary structure prediction methods. The 3-label accuracy of the leading predictors on both Test2018 and CB513 is 81-82%, while SecNet’s accuracy is 84% for both sets. The ablation study of different factors (evolutionary information, neural network architecture, and training hyper-parameters) suggests the best accuracy results are achieved with good choices for each of them, while the neural network architecture is not as critical as long as it is not too simple. Protocols for generating and using unbiased test, validation, and training sets are provided. Our data sets, including input features and assigned labels, and SecNet software including third-party dependencies and databases, are downloadable from dunbrack.fccc.edu/ss and github.com/sh-maxim/ss." @default.
- W3000561236 created "2020-01-23" @default.
- W3000561236 creator A5053977807 @default.
- W3000561236 creator A5059847153 @default.
- W3000561236 creator A5075902779 @default.
- W3000561236 date "2020-01-18" @default.
- W3000561236 modified "2023-09-23" @default.
- W3000561236 title "Multifaceted analysis of training and testing convolutional neural networks for protein secondary structure prediction" @default.
- W3000561236 cites W1483749025 @default.
- W3000561236 cites W1491040459 @default.
- W3000561236 cites W1506812273 @default.
- W3000561236 cites W1548178570 @default.
- W3000561236 cites W1596947964 @default.
- W3000561236 cites W1606410455 @default.
- W3000561236 cites W1963885440 @default.
- W3000561236 cites W1967112996 @default.
- W3000561236 cites W1969007008 @default.
- W3000561236 cites W1971886065 @default.
- W3000561236 cites W198082522 @default.
- W3000561236 cites W1981132436 @default.
- W3000561236 cites W1985818354 @default.
- W3000561236 cites W1987328021 @default.
- W3000561236 cites W1992805060 @default.
- W3000561236 cites W1998723057 @default.
- W3000561236 cites W2001660190 @default.
- W3000561236 cites W2008708467 @default.
- W3000561236 cites W2011026683 @default.
- W3000561236 cites W2013136212 @default.
- W3000561236 cites W2013425283 @default.
- W3000561236 cites W2013734705 @default.
- W3000561236 cites W2018770680 @default.
- W3000561236 cites W2022964681 @default.
- W3000561236 cites W2029476353 @default.
- W3000561236 cites W2040502656 @default.
- W3000561236 cites W2044999719 @default.
- W3000561236 cites W2047819993 @default.
- W3000561236 cites W2049695588 @default.
- W3000561236 cites W2051210555 @default.
- W3000561236 cites W2056958748 @default.
- W3000561236 cites W2057157558 @default.
- W3000561236 cites W2057289558 @default.
- W3000561236 cites W2060178110 @default.
- W3000561236 cites W2062166748 @default.
- W3000561236 cites W2064281058 @default.
- W3000561236 cites W2067522896 @default.
- W3000561236 cites W2079928393 @default.
- W3000561236 cites W2084281591 @default.
- W3000561236 cites W2089597488 @default.
- W3000561236 cites W2091518964 @default.
- W3000561236 cites W2095450147 @default.
- W3000561236 cites W2099886875 @default.
- W3000561236 cites W2104972430 @default.
- W3000561236 cites W2108642468 @default.
- W3000561236 cites W2111705855 @default.
- W3000561236 cites W2112607018 @default.
- W3000561236 cites W2119423166 @default.
- W3000561236 cites W2121702291 @default.
- W3000561236 cites W2123333780 @default.
- W3000561236 cites W2124371326 @default.
- W3000561236 cites W2124464974 @default.
- W3000561236 cites W2126389772 @default.
- W3000561236 cites W2126871981 @default.
- W3000561236 cites W2127322768 @default.
- W3000561236 cites W2127479412 @default.
- W3000561236 cites W2130060890 @default.
- W3000561236 cites W2131474431 @default.
- W3000561236 cites W2133142974 @default.
- W3000561236 cites W2134299061 @default.
- W3000561236 cites W2139582206 @default.
- W3000561236 cites W2141803423 @default.
- W3000561236 cites W2141915739 @default.
- W3000561236 cites W2141920771 @default.
- W3000561236 cites W2145991251 @default.
- W3000561236 cites W2146296021 @default.
- W3000561236 cites W2147209844 @default.
- W3000561236 cites W2148204968 @default.
- W3000561236 cites W2148557779 @default.
- W3000561236 cites W2153153865 @default.
- W3000561236 cites W2153187042 @default.
- W3000561236 cites W2156798505 @default.
- W3000561236 cites W2158714788 @default.
- W3000561236 cites W2159940018 @default.
- W3000561236 cites W2161958780 @default.
- W3000561236 cites W2162685411 @default.
- W3000561236 cites W2169243280 @default.
- W3000561236 cites W2288234278 @default.
- W3000561236 cites W2399215454 @default.
- W3000561236 cites W2518750490 @default.
- W3000561236 cites W2520998691 @default.
- W3000561236 cites W2540069603 @default.
- W3000561236 cites W2567587907 @default.
- W3000561236 cites W2574496196 @default.
- W3000561236 cites W2607268717 @default.
- W3000561236 cites W2768698520 @default.
- W3000561236 cites W2791790018 @default.
- W3000561236 cites W2795336883 @default.
- W3000561236 cites W2802048770 @default.
- W3000561236 cites W2809801122 @default.
- W3000561236 cites W2887029338 @default.
- W3000561236 cites W2905446269 @default.
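
The abstract of W3000561236 mentions an input window of ±14 amino acids and a 3-label accuracy (helix, sheet or coil). The sketch below illustrates what those two terms could mean in practice. It is not the SecNet code: the function names, the `-` padding symbol, and the single-letter labels `H`, `E`, `C` are assumptions made here for illustration only.

```python
from typing import List

HALF_WINDOW = 14  # from the abstract: input window of +/-14 amino acids
PAD = "-"         # assumed padding symbol for positions past the chain ends


def residue_windows(sequence: str, half_window: int = HALF_WINDOW) -> List[str]:
    """Return one window of 2*half_window + 1 residues centred on each position."""
    padded = PAD * half_window + sequence + PAD * half_window
    width = 2 * half_window + 1
    return [padded[i:i + width] for i in range(len(sequence))]


def q3_accuracy(predicted: str, observed: str) -> float:
    """Fraction of residues whose 3-label class is predicted correctly.

    Labels are assumed to be single letters: H (helix), E (sheet), C (coil).
    """
    if len(predicted) != len(observed):
        raise ValueError("label strings must have equal length")
    if not observed:
        raise ValueError("label strings must be non-empty")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(observed)
```

With a half-window of 14, each residue is represented by 29 positions, and residues near either end of the chain see padding in place of the missing neighbours. How SecNet itself encodes those positions (for example with evolutionary profiles) is not stated in this record.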