Matches in SemOpenAlex for { <https://semopenalex.org/work/W4285405240> ?p ?o ?g. }
- W4285405240 endingPage "107732" @default.
- W4285405240 startingPage "107732" @default.
- W4285405240 abstract "A promoter is a DNA sequence that initiates transcription and regulates when and where genes are expressed in an organism. Because of their importance in molecular biology, identifying DNA promoters is a challenging task that can provide useful information about their functions and related diseases. Over the past decade, several computational models have been developed to predict promoters from high-throughput sequencing data. Although some useful predictors have been proposed, shortfalls remain in those models, and there is an urgent need to improve predictive performance to meet practical requirements. In this study, we propose a novel architecture that combines transformer-based natural language processing (NLP) with explainable machine learning to address this problem. More specifically, a pre-trained Bidirectional Encoder Representations from Transformers (BERT) model was employed to encode DNA sequences, and SHapley Additive exPlanations (SHAP) analysis served as a feature selection step to identify the top-ranked BERT encodings. In the final stage, different machine learning classifiers were trained on the top features to produce the prediction outcomes. This study predicted not only DNA promoters but also their activity (strong or weak promoters). Overall, experiments showed accuracies of 85.5% and 76.9% for these two levels, respectively. Our model outperformed previously published predictors on the same dataset on most evaluation metrics. We named our predictor BERT-Promoter; it is freely available at https://github.com/khanhlee/bert-promoter." @default.
- W4285405240 created "2022-07-14" @default.
- W4285405240 creator A5026899391 @default.
- W4285405240 creator A5029628876 @default.
- W4285405240 creator A5063219501 @default.
- W4285405240 creator A5079500481 @default.
- W4285405240 date "2022-08-01" @default.
- W4285405240 modified "2023-10-15" @default.
- W4285405240 title "BERT-Promoter: An improved sequence-based predictor of DNA promoter using BERT pre-trained model and SHAP feature selection" @default.
- W4285405240 cites W1825157170 @default.
- W4285405240 cites W1922068952 @default.
- W4285405240 cites W2023962152 @default.
- W4285405240 cites W2072025288 @default.
- W4285405240 cites W2119455089 @default.
- W4285405240 cites W2125970547 @default.
- W4285405240 cites W2130919037 @default.
- W4285405240 cites W2137458297 @default.
- W4285405240 cites W2148503087 @default.
- W4285405240 cites W2175894204 @default.
- W4285405240 cites W2232370058 @default.
- W4285405240 cites W2905321002 @default.
- W4285405240 cites W2911489562 @default.
- W4285405240 cites W2932524182 @default.
- W4285405240 cites W2963739921 @default.
- W4285405240 cites W2967960129 @default.
- W4285405240 cites W2969662046 @default.
- W4285405240 cites W2987092090 @default.
- W4285405240 cites W3000887590 @default.
- W4285405240 cites W3021134454 @default.
- W4285405240 cites W3087291937 @default.
- W4285405240 cites W3120223152 @default.
- W4285405240 cites W3129125493 @default.
- W4285405240 cites W3131121088 @default.
- W4285405240 cites W3153138486 @default.
- W4285405240 doi "https://doi.org/10.1016/j.compbiolchem.2022.107732" @default.
- W4285405240 hasPubMedId "https://pubmed.ncbi.nlm.nih.gov/35863177" @default.
- W4285405240 hasPublicationYear "2022" @default.
- W4285405240 type Work @default.
- W4285405240 citedByCount "28" @default.
- W4285405240 countsByYear W42854052402022 @default.
- W4285405240 countsByYear W42854052402023 @default.
- W4285405240 crossrefType "journal-article" @default.
- W4285405240 hasAuthorship W4285405240A5026899391 @default.
- W4285405240 hasAuthorship W4285405240A5029628876 @default.
- W4285405240 hasAuthorship W4285405240A5063219501 @default.
- W4285405240 hasAuthorship W4285405240A5079500481 @default.
- W4285405240 hasConcept C101762097 @default.
- W4285405240 hasConcept C104317684 @default.
- W4285405240 hasConcept C111919701 @default.
- W4285405240 hasConcept C118505674 @default.
- W4285405240 hasConcept C119857082 @default.
- W4285405240 hasConcept C138885662 @default.
- W4285405240 hasConcept C148483581 @default.
- W4285405240 hasConcept C150194340 @default.
- W4285405240 hasConcept C154945302 @default.
- W4285405240 hasConcept C179926584 @default.
- W4285405240 hasConcept C41008148 @default.
- W4285405240 hasConcept C41895202 @default.
- W4285405240 hasConcept C51679486 @default.
- W4285405240 hasConcept C54355233 @default.
- W4285405240 hasConcept C552990157 @default.
- W4285405240 hasConcept C66746571 @default.
- W4285405240 hasConcept C70721500 @default.
- W4285405240 hasConcept C86803240 @default.
- W4285405240 hasConceptScore W4285405240C101762097 @default.
- W4285405240 hasConceptScore W4285405240C104317684 @default.
- W4285405240 hasConceptScore W4285405240C111919701 @default.
- W4285405240 hasConceptScore W4285405240C118505674 @default.
- W4285405240 hasConceptScore W4285405240C119857082 @default.
- W4285405240 hasConceptScore W4285405240C138885662 @default.
- W4285405240 hasConceptScore W4285405240C148483581 @default.
- W4285405240 hasConceptScore W4285405240C150194340 @default.
- W4285405240 hasConceptScore W4285405240C154945302 @default.
- W4285405240 hasConceptScore W4285405240C179926584 @default.
- W4285405240 hasConceptScore W4285405240C41008148 @default.
- W4285405240 hasConceptScore W4285405240C41895202 @default.
- W4285405240 hasConceptScore W4285405240C51679486 @default.
- W4285405240 hasConceptScore W4285405240C54355233 @default.
- W4285405240 hasConceptScore W4285405240C552990157 @default.
- W4285405240 hasConceptScore W4285405240C66746571 @default.
- W4285405240 hasConceptScore W4285405240C70721500 @default.
- W4285405240 hasConceptScore W4285405240C86803240 @default.
- W4285405240 hasFunder F4320322795 @default.
- W4285405240 hasLocation W42854052401 @default.
- W4285405240 hasLocation W42854052402 @default.
- W4285405240 hasOpenAccess W4285405240 @default.
- W4285405240 hasPrimaryLocation W42854052401 @default.
- W4285405240 hasRelatedWork W1969771092 @default.
- W4285405240 hasRelatedWork W1989130879 @default.
- W4285405240 hasRelatedWork W2063982682 @default.
- W4285405240 hasRelatedWork W2103419012 @default.
- W4285405240 hasRelatedWork W2275988210 @default.
- W4285405240 hasRelatedWork W2354198838 @default.
- W4285405240 hasRelatedWork W2468279273 @default.
- W4285405240 hasRelatedWork W2795723184 @default.
- W4285405240 hasRelatedWork W2988126442 @default.
- W4285405240 hasRelatedWork W3045696640 @default.
- W4285405240 hasVolume "99" @default.
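
The listing above follows a regular shape: a leading "- ", then subject, predicate, object, and a trailing "@graph." marker. As a minimal sketch, assuming that shape holds for every row, the lines could be split into tuples as shown here. The names Quad, parse_line, and group_by_predicate are hypothetical helpers introduced for illustration; they are not part of any SemOpenAlex API, and the regular expression is only a guess at the page's text layout.

```python
import re
from typing import Dict, Iterable, List, NamedTuple, Optional


class Quad(NamedTuple):
    """One row of the listing (hypothetical container, not a SemOpenAlex type)."""
    subject: str
    predicate: str
    obj: str
    graph: str


# Assumed row layout: - <subject> <predicate> <object> @<graph>.
# The object is either a double-quoted literal or a bare identifier.
_ROW = re.compile(
    r'^-\s+(\S+)\s+(\S+)\s+(?:"(.*)"|(\S+))\s+@(\w+)\.\s*$',
    re.S,
)


def parse_line(line: str) -> Optional[Quad]:
    """Return a Quad for a listing row, or None for non-row lines such as the header."""
    match = _ROW.match(line.strip())
    if match is None:
        return None
    subject, predicate, literal, identifier, graph = match.groups()
    obj = literal if literal is not None else identifier
    return Quad(subject, predicate, obj, graph)


def group_by_predicate(lines: Iterable[str]) -> Dict[str, List[str]]:
    """Collect object values per predicate, skipping lines that do not parse."""
    grouped: Dict[str, List[str]] = {}
    for line in lines:
        quad = parse_line(line)
        if quad is not None:
            grouped.setdefault(quad.predicate, []).append(quad.obj)
    return grouped
```

For example, parse_line('- W4285405240 hasVolume "99" @default.') yields Quad(subject='W4285405240', predicate='hasVolume', obj='99', graph='default'), and group_by_predicate applied to the full listing would gather all cites targets under the key "cites". Quoted literals that themselves contain double quotes would need a stricter parser than this sketch.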