Matches in SemOpenAlex for { <https://semopenalex.org/work/W4387497319> ?p ?o ?g. }
- W4387497319 endingPage "107970" @default.
- W4387497319 startingPage "107970" @default.
- W4387497319 abstract "The identification of hotspot residues at protein-DNA binding interfaces plays a crucial role in areas such as drug discovery and disease treatment. Although experimental methods such as alanine scanning mutagenesis have been developed to determine the hotspot residues on protein-DNA interfaces, they are both inefficient and costly. It is therefore necessary to develop efficient and accurate computational methods for predicting hotspot residues. Several computational methods have been developed; however, they are mainly based on handcrafted features, which may not capture all the information in proteins. In this regard, we propose a model called PDH-EH, which utilizes fused features of embeddings extracted from a protein language model (PLM) and handcrafted features. After extracting 1141-dimensional features in total, we used mRMR to select the optimal feature subset. Based on the optimal feature subset, several different learning algorithms such as Random Forest, Support Vector Machine, and XGBoost were used to build the models. The cross-validation results on the training dataset show that the model built using Random Forest achieves the highest AUROC. Further evaluation on the independent test set shows that our model outperforms the existing state-of-the-art models. Moreover, the effectiveness and interpretability of embeddings extracted from the PLM were demonstrated in our analysis. The code and datasets used in this study are available at: https://github.com/lixiangli01/PDH-EH." @default.
- W4387497319 created "2023-10-11" @default.
- W4387497319 creator A5014548132 @default.
- W4387497319 creator A5030620239 @default.
- W4387497319 creator A5048714264 @default.
- W4387497319 creator A5084858860 @default.
- W4387497319 creator A5092277199 @default.
- W4387497319 date "2023-10-01" @default.
- W4387497319 modified "2023-10-12" @default.
- W4387497319 title "Protein-DNA interface hotspots prediction based on fusion features of embeddings of protein language model and handcrafted features" @default.
- W4387497319 cites W1503104790 @default.
- W4387497319 cites W1966716734 @default.
- W4387497319 cites W197372360 @default.
- W4387497319 cites W1980825481 @default.
- W4387497319 cites W2008708467 @default.
- W4387497319 cites W2025101852 @default.
- W4387497319 cites W2029582401 @default.
- W4387497319 cites W2033893117 @default.
- W4387497319 cites W2046001773 @default.
- W4387497319 cites W2048917743 @default.
- W4387497319 cites W2050720866 @default.
- W4387497319 cites W2051628563 @default.
- W4387497319 cites W2055837091 @default.
- W4387497319 cites W2058527555 @default.
- W4387497319 cites W2076911240 @default.
- W4387497319 cites W2088038235 @default.
- W4387497319 cites W2103525038 @default.
- W4387497319 cites W2118737745 @default.
- W4387497319 cites W2131987814 @default.
- W4387497319 cites W2133462743 @default.
- W4387497319 cites W2138104170 @default.
- W4387497319 cites W2139262886 @default.
- W4387497319 cites W2141624700 @default.
- W4387497319 cites W2141627824 @default.
- W4387497319 cites W2143766664 @default.
- W4387497319 cites W2149997286 @default.
- W4387497319 cites W2152365098 @default.
- W4387497319 cites W2154053567 @default.
- W4387497319 cites W2156125289 @default.
- W4387497319 cites W2158714788 @default.
- W4387497319 cites W2363637013 @default.
- W4387497319 cites W2605361745 @default.
- W4387497319 cites W2777941847 @default.
- W4387497319 cites W2794024218 @default.
- W4387497319 cites W2798552692 @default.
- W4387497319 cites W2888728157 @default.
- W4387497319 cites W2904602199 @default.
- W4387497319 cites W2911964244 @default.
- W4387497319 cites W2935793619 @default.
- W4387497319 cites W2948934927 @default.
- W4387497319 cites W2949342052 @default.
- W4387497319 cites W2964278775 @default.
- W4387497319 cites W3081249922 @default.
- W4387497319 cites W3084524758 @default.
- W4387497319 cites W3085584475 @default.
- W4387497319 cites W3102476541 @default.
- W4387497319 cites W3118410381 @default.
- W4387497319 cites W3120031819 @default.
- W4387497319 cites W3155248445 @default.
- W4387497319 cites W3177500196 @default.
- W4387497319 cites W3187281825 @default.
- W4387497319 cites W4206247924 @default.
- W4387497319 cites W4206950245 @default.
- W4387497319 cites W4281613484 @default.
- W4387497319 cites W4321791966 @default.
- W4387497319 cites W4362596075 @default.
- W4387497319 doi "https://doi.org/10.1016/j.compbiolchem.2023.107970" @default.
- W4387497319 hasPublicationYear "2023" @default.
- W4387497319 type Work @default.
- W4387497319 citedByCount "0" @default.
- W4387497319 crossrefType "journal-article" @default.
- W4387497319 hasAuthorship W4387497319A5014548132 @default.
- W4387497319 hasAuthorship W4387497319A5030620239 @default.
- W4387497319 hasAuthorship W4387497319A5048714264 @default.
- W4387497319 hasAuthorship W4387497319A5084858860 @default.
- W4387497319 hasAuthorship W4387497319A5092277199 @default.
- W4387497319 hasConcept C119857082 @default.
- W4387497319 hasConcept C12267149 @default.
- W4387497319 hasConcept C124101348 @default.
- W4387497319 hasConcept C127313418 @default.
- W4387497319 hasConcept C146481406 @default.
- W4387497319 hasConcept C153180895 @default.
- W4387497319 hasConcept C154945302 @default.
- W4387497319 hasConcept C169258074 @default.
- W4387497319 hasConcept C2781067378 @default.
- W4387497319 hasConcept C41008148 @default.
- W4387497319 hasConcept C8058405 @default.
- W4387497319 hasConceptScore W4387497319C119857082 @default.
- W4387497319 hasConceptScore W4387497319C12267149 @default.
- W4387497319 hasConceptScore W4387497319C124101348 @default.
- W4387497319 hasConceptScore W4387497319C127313418 @default.
- W4387497319 hasConceptScore W4387497319C146481406 @default.
- W4387497319 hasConceptScore W4387497319C153180895 @default.
- W4387497319 hasConceptScore W4387497319C154945302 @default.
- W4387497319 hasConceptScore W4387497319C169258074 @default.
- W4387497319 hasConceptScore W4387497319C2781067378 @default.
- W4387497319 hasConceptScore W4387497319C41008148 @default.
- W4387497319 hasConceptScore W4387497319C8058405 @default.
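
The matches above all follow the same shape: subject, predicate, object, then the graph name. As a minimal sketch, assuming that line layout holds as displayed here (it is this listing's layout, not an official RDF serialization), the lines could be split into 4-tuples as follows. The function name `parse_match` and the sample lines are illustrative choices, not part of SemOpenAlex.

```python
from collections import Counter


def parse_match(line):
    """Split one '- subject predicate object @graph.' line into a 4-tuple.

    Assumes the layout used in this listing; this is a hypothetical helper,
    not a SemOpenAlex API.
    """
    body = line.strip()
    if body.startswith("- "):
        body = body[2:]
    if body.endswith("."):
        body = body[:-1]
    # Subject and predicate never contain spaces; the object may (e.g. a title).
    subject, predicate, rest = body.split(" ", 2)
    obj, graph = rest.rsplit(" ", 1)
    # Literal values are quoted in the listing; entity IDs are bare.
    if len(obj) >= 2 and obj[0] == '"' and obj[-1] == '"':
        obj = obj[1:-1]
    return subject, predicate, obj, graph.lstrip("@")


# A few lines copied from the listing above.
sample = [
    '- W4387497319 created "2023-10-11" @default.',
    '- W4387497319 creator A5014548132 @default.',
    '- W4387497319 cites W1503104790 @default.',
    '- W4387497319 cites W1966716734 @default.',
]

quads = [parse_match(line) for line in sample]
predicate_counts = Counter(predicate for _, predicate, _, _ in quads)
```

Counting by predicate this way gives a quick summary of a record, for example how many `cites` or `hasConcept` matches a work has.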