Matches in SemOpenAlex for { <https://semopenalex.org/work/W2034115630> ?p ?o ?g. }
- W2034115630 endingPage "e1001047" @default.
- W2034115630 startingPage "e1001047" @default.
- W2034115630 abstract "Virtually every molecular biologist has searched a protein or DNA sequence database to find sequences that are evolutionarily related to a given query. Pairwise sequence comparison methods--i.e., measures of similarity between query and target sequences--provide the engine for sequence database search and have been the subject of 30 years of computational research. For the difficult problem of detecting remote evolutionary relationships between protein sequences, the most successful pairwise comparison methods involve building local models (e.g., profile hidden Markov models) of protein sequences. However, recent work in massive data domains like web search and natural language processing demonstrates the advantage of exploiting the global structure of the data space. Motivated by this work, we present a large-scale algorithm called ProtEmbed, which learns an embedding of protein sequences into a low-dimensional semantic space. Evolutionarily related proteins are embedded in close proximity, and additional pieces of evidence, such as 3D structural similarity or class labels, can be incorporated into the learning process. We find that ProtEmbed achieves superior accuracy to widely used pairwise sequence methods like PSI-BLAST and HHSearch for remote homology detection; it also outperforms our previous RankProp algorithm, which incorporates global structure in the form of a protein similarity network. Finally, the ProtEmbed embedding space can be visualized, both at the global level and local to a given query, yielding intuition about the structure of protein sequence space." @default.
- W2034115630 created "2016-06-24" @default.
- W2034115630 creator A5057375933 @default.
- W2034115630 creator A5058301235 @default.
- W2034115630 creator A5076635608 @default.
- W2034115630 creator A5085231879 @default.
- W2034115630 date "2011-01-27" @default.
- W2034115630 modified "2023-09-24" @default.
- W2034115630 title "Detecting Remote Evolutionary Relationships among Proteins by Large-Scale Semantic Embedding" @default.
- W2034115630 cites W1968697272 @default.
- W2034115630 cites W2055043387 @default.
- W2034115630 cites W2082667898 @default.
- W2034115630 cites W2085277871 @default.
- W2034115630 cites W2087064593 @default.
- W2034115630 cites W2096748155 @default.
- W2034115630 cites W2105381419 @default.
- W2034115630 cites W2126016150 @default.
- W2034115630 cites W2127338593 @default.
- W2034115630 cites W2133075481 @default.
- W2034115630 cites W2145358391 @default.
- W2034115630 cites W2147667050 @default.
- W2034115630 cites W2152688507 @default.
- W2034115630 cites W2158714788 @default.
- W2034115630 cites W2161056921 @default.
- W2034115630 doi "https://doi.org/10.1371/journal.pcbi.1001047" @default.
- W2034115630 hasPubMedCentralId "https://www.ncbi.nlm.nih.gov/pmc/articles/3029239" @default.
- W2034115630 hasPubMedId "https://pubmed.ncbi.nlm.nih.gov/21298082" @default.
- W2034115630 hasPublicationYear "2011" @default.
- W2034115630 type Work @default.
- W2034115630 sameAs 2034115630 @default.
- W2034115630 citedByCount "26" @default.
- W2034115630 countsByYear W20341156302012 @default.
- W2034115630 countsByYear W20341156302013 @default.
- W2034115630 countsByYear W20341156302014 @default.
- W2034115630 countsByYear W20341156302015 @default.
- W2034115630 countsByYear W20341156302016 @default.
- W2034115630 countsByYear W20341156302017 @default.
- W2034115630 countsByYear W20341156302018 @default.
- W2034115630 countsByYear W20341156302019 @default.
- W2034115630 countsByYear W20341156302020 @default.
- W2034115630 countsByYear W20341156302022 @default.
- W2034115630 crossrefType "journal-article" @default.
- W2034115630 hasAuthorship W2034115630A5057375933 @default.
- W2034115630 hasAuthorship W2034115630A5058301235 @default.
- W2034115630 hasAuthorship W2034115630A5076635608 @default.
- W2034115630 hasAuthorship W2034115630A5085231879 @default.
- W2034115630 hasBestOaLocation W20341156301 @default.
- W2034115630 hasConcept C103278499 @default.
- W2034115630 hasConcept C104317684 @default.
- W2034115630 hasConcept C115961682 @default.
- W2034115630 hasConcept C124101348 @default.
- W2034115630 hasConcept C130318100 @default.
- W2034115630 hasConcept C136475424 @default.
- W2034115630 hasConcept C154945302 @default.
- W2034115630 hasConcept C184898388 @default.
- W2034115630 hasConcept C41008148 @default.
- W2034115630 hasConcept C41584329 @default.
- W2034115630 hasConcept C41608201 @default.
- W2034115630 hasConcept C47701112 @default.
- W2034115630 hasConcept C54355233 @default.
- W2034115630 hasConcept C55493867 @default.
- W2034115630 hasConcept C58773245 @default.
- W2034115630 hasConcept C80444323 @default.
- W2034115630 hasConcept C86803240 @default.
- W2034115630 hasConceptScore W2034115630C103278499 @default.
- W2034115630 hasConceptScore W2034115630C104317684 @default.
- W2034115630 hasConceptScore W2034115630C115961682 @default.
- W2034115630 hasConceptScore W2034115630C124101348 @default.
- W2034115630 hasConceptScore W2034115630C130318100 @default.
- W2034115630 hasConceptScore W2034115630C136475424 @default.
- W2034115630 hasConceptScore W2034115630C154945302 @default.
- W2034115630 hasConceptScore W2034115630C184898388 @default.
- W2034115630 hasConceptScore W2034115630C41008148 @default.
- W2034115630 hasConceptScore W2034115630C41584329 @default.
- W2034115630 hasConceptScore W2034115630C41608201 @default.
- W2034115630 hasConceptScore W2034115630C47701112 @default.
- W2034115630 hasConceptScore W2034115630C54355233 @default.
- W2034115630 hasConceptScore W2034115630C55493867 @default.
- W2034115630 hasConceptScore W2034115630C58773245 @default.
- W2034115630 hasConceptScore W2034115630C80444323 @default.
- W2034115630 hasConceptScore W2034115630C86803240 @default.
- W2034115630 hasIssue "1" @default.
- W2034115630 hasLocation W20341156301 @default.
- W2034115630 hasLocation W20341156302 @default.
- W2034115630 hasLocation W20341156303 @default.
- W2034115630 hasLocation W20341156304 @default.
- W2034115630 hasLocation W20341156305 @default.
- W2034115630 hasOpenAccess W2034115630 @default.
- W2034115630 hasPrimaryLocation W20341156301 @default.
- W2034115630 hasRelatedWork W2034115630 @default.
- W2034115630 hasRelatedWork W2038246283 @default.
- W2034115630 hasRelatedWork W2098781440 @default.
- W2034115630 hasRelatedWork W2165135917 @default.
- W2034115630 hasRelatedWork W2280186906 @default.
- W2034115630 hasRelatedWork W2605245466 @default.
- W2034115630 hasRelatedWork W2735639039 @default.
- W2034115630 hasRelatedWork W2950239980 @default.
- W2034115630 hasRelatedWork W4205565320 @default.