Matches in SemOpenAlex for { <https://semopenalex.org/work/W4296907865> ?p ?o ?g. }
- W4296907865 abstract "A good understanding of protein function and structure in computational biology contributes to the understanding of human beings. Because only a limited number of proteins are annotated structurally and functionally, the scientific community has embraced self-supervised pre-training on large amounts of unlabeled protein sequences for protein embedding learning. However, a protein is usually represented by individual amino acids with a limited vocabulary size (e.g. 20 amino acid types), without considering the strong local semantics that exist in protein sequences. In this work, we propose a novel pre-training modeling approach, SPRoBERTa. We first present an unsupervised protein tokenizer to learn protein representations with local fragment patterns. Then, a novel framework for a deep pre-training model is introduced to learn protein embeddings. After pre-training, our method can be easily fine-tuned for different protein tasks, including amino acid-level prediction tasks (e.g. secondary structure prediction), amino acid pair-level prediction tasks (e.g. contact prediction) and protein-level prediction tasks (e.g. remote homology prediction, protein function prediction). Experiments show that our approach achieves significant improvements in all tasks and outperforms previous methods. We also provide detailed ablation studies and analysis for our protein tokenizer and training framework." @default.
- W4296907865 created "2022-09-24" @default.
- W4296907865 creator A5007225481 @default.
- W4296907865 creator A5020025718 @default.
- W4296907865 creator A5021772140 @default.
- W4296907865 creator A5028284977 @default.
- W4296907865 creator A5049944728 @default.
- W4296907865 creator A5055397380 @default.
- W4296907865 creator A5070990160 @default.
- W4296907865 creator A5074951383 @default.
- W4296907865 creator A5088666942 @default.
- W4296907865 date "2022-09-22" @default.
- W4296907865 modified "2023-09-26" @default.
- W4296907865 title "SPRoBERTa: protein embedding learning with local fragment modeling" @default.
- W4296907865 cites W2033339460 @default.
- W4296907865 cites W2076048958 @default.
- W4296907865 cites W2092672051 @default.
- W4296907865 cites W2104972430 @default.
- W4296907865 cites W2121879602 @default.
- W4296907865 cites W2163584430 @default.
- W4296907865 cites W2889498145 @default.
- W4296907865 cites W2898402099 @default.
- W4296907865 cites W2949867299 @default.
- W4296907865 cites W2950374603 @default.
- W4296907865 cites W2953008890 @default.
- W4296907865 cites W2962784628 @default.
- W4296907865 cites W2963026768 @default.
- W4296907865 cites W2963250244 @default.
- W4296907865 cites W2963979492 @default.
- W4296907865 cites W2980789587 @default.
- W4296907865 cites W2995514860 @default.
- W4296907865 cites W2999481648 @default.
- W4296907865 cites W3083386021 @default.
- W4296907865 cites W3133458480 @default.
- W4296907865 cites W3146944767 @default.
- W4296907865 cites W3164046276 @default.
- W4296907865 cites W3177500196 @default.
- W4296907865 cites W3177828909 @default.
- W4296907865 cites W3197123494 @default.
- W4296907865 cites W4205773061 @default.
- W4296907865 doi "https://doi.org/10.1093/bib/bbac401" @default.
- W4296907865 hasPubMedId "https://pubmed.ncbi.nlm.nih.gov/36136367" @default.
- W4296907865 hasPublicationYear "2022" @default.
- W4296907865 type Work @default.
- W4296907865 citedByCount "1" @default.
- W4296907865 countsByYear W42969078652023 @default.
- W4296907865 crossrefType "journal-article" @default.
- W4296907865 hasAuthorship W4296907865A5007225481 @default.
- W4296907865 hasAuthorship W4296907865A5020025718 @default.
- W4296907865 hasAuthorship W4296907865A5021772140 @default.
- W4296907865 hasAuthorship W4296907865A5028284977 @default.
- W4296907865 hasAuthorship W4296907865A5049944728 @default.
- W4296907865 hasAuthorship W4296907865A5055397380 @default.
- W4296907865 hasAuthorship W4296907865A5070990160 @default.
- W4296907865 hasAuthorship W4296907865A5074951383 @default.
- W4296907865 hasAuthorship W4296907865A5088666942 @default.
- W4296907865 hasConcept C104317684 @default.
- W4296907865 hasConcept C119857082 @default.
- W4296907865 hasConcept C14036430 @default.
- W4296907865 hasConcept C154945302 @default.
- W4296907865 hasConcept C162324750 @default.
- W4296907865 hasConcept C18051474 @default.
- W4296907865 hasConcept C187736073 @default.
- W4296907865 hasConcept C207060522 @default.
- W4296907865 hasConcept C2780451532 @default.
- W4296907865 hasConcept C2986374874 @default.
- W4296907865 hasConcept C41008148 @default.
- W4296907865 hasConcept C41608201 @default.
- W4296907865 hasConcept C47701112 @default.
- W4296907865 hasConcept C54355233 @default.
- W4296907865 hasConcept C55493867 @default.
- W4296907865 hasConcept C70721500 @default.
- W4296907865 hasConcept C86803240 @default.
- W4296907865 hasConceptScore W4296907865C104317684 @default.
- W4296907865 hasConceptScore W4296907865C119857082 @default.
- W4296907865 hasConceptScore W4296907865C14036430 @default.
- W4296907865 hasConceptScore W4296907865C154945302 @default.
- W4296907865 hasConceptScore W4296907865C162324750 @default.
- W4296907865 hasConceptScore W4296907865C18051474 @default.
- W4296907865 hasConceptScore W4296907865C187736073 @default.
- W4296907865 hasConceptScore W4296907865C207060522 @default.
- W4296907865 hasConceptScore W4296907865C2780451532 @default.
- W4296907865 hasConceptScore W4296907865C2986374874 @default.
- W4296907865 hasConceptScore W4296907865C41008148 @default.
- W4296907865 hasConceptScore W4296907865C41608201 @default.
- W4296907865 hasConceptScore W4296907865C47701112 @default.
- W4296907865 hasConceptScore W4296907865C54355233 @default.
- W4296907865 hasConceptScore W4296907865C55493867 @default.
- W4296907865 hasConceptScore W4296907865C70721500 @default.
- W4296907865 hasConceptScore W4296907865C86803240 @default.
- W4296907865 hasIssue "6" @default.
- W4296907865 hasLocation W42969078651 @default.
- W4296907865 hasLocation W42969078652 @default.
- W4296907865 hasOpenAccess W4296907865 @default.
- W4296907865 hasPrimaryLocation W42969078651 @default.
- W4296907865 hasRelatedWork W1831126377 @default.
- W4296907865 hasRelatedWork W1973555128 @default.
- W4296907865 hasRelatedWork W1985222378 @default.
- W4296907865 hasRelatedWork W1998014636 @default.
- W4296907865 hasRelatedWork W2015534509 @default.