Matches in SemOpenAlex for { <https://semopenalex.org/work/W1995918845> ?p ?o ?g. }
- W1995918845 endingPage "550" @default.
- W1995918845 startingPage "527" @default.
- W1995918845 abstract "We introduce novel profile-based string kernels for use with support vector machines (SVMs) for the problems of protein classification and remote homology detection. These kernels use probabilistic profiles, such as those produced by the PSI-BLAST algorithm, to define position-dependent mutation neighborhoods along protein sequences for inexact matching of k-length subsequences (k-mers) in the data. By use of an efficient data structure, the kernels are fast to compute once the profiles have been obtained. For example, the time needed to run PSI-BLAST in order to build the profiles is significantly longer than both the kernel computation time and the SVM training time. We present remote homology detection experiments based on the SCOP database where we show that profile-based string kernels used with SVM classifiers strongly outperform all recently presented supervised SVM methods. We further examine how to incorporate predicted secondary structure information into the profile kernel to obtain a small but significant performance improvement. We also show how we can use the learned SVM classifier to extract discriminative sequence motifs (short regions of the original profile that contribute almost all the weight of the SVM classification score) and show that these discriminative motifs correspond to meaningful structural features in the protein data. The use of PSI-BLAST profiles can be seen as a semi-supervised learning technique, since PSI-BLAST leverages unlabeled data from a large sequence database to build more informative profiles. Recently presented cluster kernels give general semi-supervised methods for improving SVM protein classification performance. We show that our profile kernel results also outperform cluster kernels while providing much better scalability to large datasets." @default.
- W1995918845 created "2016-06-24" @default.
- W1995918845 creator A5001093531 @default.
- W1995918845 creator A5003097366 @default.
- W1995918845 creator A5013936779 @default.
- W1995918845 creator A5036209662 @default.
- W1995918845 creator A5066338313 @default.
- W1995918845 creator A5072341135 @default.
- W1995918845 creator A5085231879 @default.
- W1995918845 date "2005-06-01" @default.
- W1995918845 modified "2023-09-26" @default.
- W1995918845 title "PROFILE-BASED STRING KERNELS FOR REMOTE HOMOLOGY DETECTION AND MOTIF EXTRACTION" @default.
- W1995918845 cites W1969051510 @default.
- W1995918845 cites W1985818354 @default.
- W1995918845 cites W2005690321 @default.
- W1995918845 cites W2008708467 @default.
- W1995918845 cites W2013570924 @default.
- W1995918845 cites W2084787613 @default.
- W1995918845 cites W2087347434 @default.
- W1995918845 cites W2102122585 @default.
- W1995918845 cites W2105801262 @default.
- W1995918845 cites W2106868411 @default.
- W1995918845 cites W2124709175 @default.
- W1995918845 cites W2140713760 @default.
- W1995918845 cites W2150627302 @default.
- W1995918845 cites W2153187042 @default.
- W1995918845 cites W2158714788 @default.
- W1995918845 cites W2165495989 @default.
- W1995918845 cites W2165979580 @default.
- W1995918845 cites W2166884322 @default.
- W1995918845 cites W2283504545 @default.
- W1995918845 cites W4238658571 @default.
- W1995918845 doi "https://doi.org/10.1142/s021972000500120x" @default.
- W1995918845 hasPubMedId "https://pubmed.ncbi.nlm.nih.gov/16108083" @default.
- W1995918845 hasPublicationYear "2005" @default.
- W1995918845 type Work @default.
- W1995918845 sameAs 1995918845 @default.
- W1995918845 citedByCount "155" @default.
- W1995918845 countsByYear W19959188452012 @default.
- W1995918845 countsByYear W19959188452013 @default.
- W1995918845 countsByYear W19959188452014 @default.
- W1995918845 countsByYear W19959188452015 @default.
- W1995918845 countsByYear W19959188452016 @default.
- W1995918845 countsByYear W19959188452017 @default.
- W1995918845 countsByYear W19959188452018 @default.
- W1995918845 countsByYear W19959188452019 @default.
- W1995918845 countsByYear W19959188452020 @default.
- W1995918845 countsByYear W19959188452021 @default.
- W1995918845 countsByYear W19959188452022 @default.
- W1995918845 countsByYear W19959188452023 @default.
- W1995918845 crossrefType "journal-article" @default.
- W1995918845 hasAuthorship W1995918845A5001093531 @default.
- W1995918845 hasAuthorship W1995918845A5003097366 @default.
- W1995918845 hasAuthorship W1995918845A5013936779 @default.
- W1995918845 hasAuthorship W1995918845A5036209662 @default.
- W1995918845 hasAuthorship W1995918845A5066338313 @default.
- W1995918845 hasAuthorship W1995918845A5072341135 @default.
- W1995918845 hasAuthorship W1995918845A5085231879 @default.
- W1995918845 hasConcept C114614502 @default.
- W1995918845 hasConcept C119857082 @default.
- W1995918845 hasConcept C122280245 @default.
- W1995918845 hasConcept C12267149 @default.
- W1995918845 hasConcept C153180895 @default.
- W1995918845 hasConcept C154945302 @default.
- W1995918845 hasConcept C33923547 @default.
- W1995918845 hasConcept C41008148 @default.
- W1995918845 hasConcept C55851704 @default.
- W1995918845 hasConcept C74193536 @default.
- W1995918845 hasConcept C75866337 @default.
- W1995918845 hasConcept C95623464 @default.
- W1995918845 hasConcept C97931131 @default.
- W1995918845 hasConceptScore W1995918845C114614502 @default.
- W1995918845 hasConceptScore W1995918845C119857082 @default.
- W1995918845 hasConceptScore W1995918845C122280245 @default.
- W1995918845 hasConceptScore W1995918845C12267149 @default.
- W1995918845 hasConceptScore W1995918845C153180895 @default.
- W1995918845 hasConceptScore W1995918845C154945302 @default.
- W1995918845 hasConceptScore W1995918845C33923547 @default.
- W1995918845 hasConceptScore W1995918845C41008148 @default.
- W1995918845 hasConceptScore W1995918845C55851704 @default.
- W1995918845 hasConceptScore W1995918845C74193536 @default.
- W1995918845 hasConceptScore W1995918845C75866337 @default.
- W1995918845 hasConceptScore W1995918845C95623464 @default.
- W1995918845 hasConceptScore W1995918845C97931131 @default.
- W1995918845 hasIssue "03" @default.
- W1995918845 hasLocation W19959188451 @default.
- W1995918845 hasLocation W19959188452 @default.
- W1995918845 hasOpenAccess W1995918845 @default.
- W1995918845 hasPrimaryLocation W19959188451 @default.
- W1995918845 hasRelatedWork W1489359949 @default.
- W1995918845 hasRelatedWork W1550105856 @default.
- W1995918845 hasRelatedWork W1558903433 @default.
- W1995918845 hasRelatedWork W1987904880 @default.
- W1995918845 hasRelatedWork W1995918845 @default.
- W1995918845 hasRelatedWork W2020816856 @default.
- W1995918845 hasRelatedWork W2107725114 @default.
- W1995918845 hasRelatedWork W2348964713 @default.
- W1995918845 hasRelatedWork W2380639250 @default.
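
The abstract describes kernels that use a probabilistic profile to define a position-dependent mutation neighborhood around each k-mer, then compare sequences by the k-mers those neighborhoods share. The record does not give the formula, so what follows is only a minimal brute-force sketch under stated assumptions: a profile is taken to be a list of per-position dictionaries mapping symbols to probabilities, a k-mer is assumed to belong to a window's neighborhood when its negative log-likelihood under that window is below a threshold `sigma`, and the kernel is taken to be the inner product of neighborhood-membership counts. The function names, the `sigma` parameter, and the toy alphabet are illustrative and not taken from the paper, and the enumeration over all k-mers stands in for the efficient data structure the abstract mentions.

```python
import math
from collections import Counter
from itertools import product


def neighborhood(profile_window, alphabet, sigma):
    """Return k-mers whose negative log-likelihood under the window is < sigma.

    profile_window: list of dicts {symbol: probability}, one per position
    (assumed representation of a profile slice).
    """
    members = []
    for kmer in product(alphabet, repeat=len(profile_window)):
        cost = 0.0
        for column, symbol in zip(profile_window, kmer):
            p = column.get(symbol, 0.0)
            if p <= 0.0:
                # A symbol with zero probability can never be in the neighborhood.
                cost = math.inf
                break
            cost -= math.log(p)
        if cost < sigma:
            members.append("".join(kmer))
    return members


def feature_map(profile, alphabet, k, sigma):
    """Count, for each k-mer, how many length-k windows include it."""
    counts = Counter()
    for start in range(len(profile) - k + 1):
        counts.update(neighborhood(profile[start:start + k], alphabet, sigma))
    return counts


def profile_kernel(profile_x, profile_y, alphabet, k, sigma):
    """Inner product of the two neighborhood-count feature vectors."""
    fx = feature_map(profile_x, alphabet, k, sigma)
    fy = feature_map(profile_y, alphabet, k, sigma)
    return sum(count * fy[kmer] for kmer, count in fx.items())


if __name__ == "__main__":
    # Toy two-letter alphabet; real protein profiles have 20 amino acids.
    x = [{"A": 1.0}, {"A": 0.5, "B": 0.5}, {"A": 1.0}]
    y = [{"A": 1.0}, {"B": 1.0}]
    print(profile_kernel(x, y, "AB", k=2, sigma=1.0))
```

Enumerating every k-mer costs time exponential in k, so this version is only meant to make the definition concrete on small inputs.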