Matches in SemOpenAlex for { <https://semopenalex.org/work/W97281011> ?p ?o ?g. }
- W97281011 abstract "The need for accurate, automated protein classification methods continues to increase as advances in biotechnology uncover new proteins at a fast rate. G-protein coupled receptors (GPCRs) are a particularly difficult superfamily of proteins to classify due to the extreme diversity among its members; yet they are an important subject in pharmacological research, being the target of approximately 60% of current drugs (Muller, 2000). A comparison of BLAST, k-NN, HMM and SVM with alignment-based features by Karchin et al. (2002) suggested that classifiers at the complexity of SVM are needed to attain high accuracy in GPCR subfamily classification. Here, analogous to document classification, we applied Decision Tree and Naive Bayes classifiers with chi-square feature selection on n-gram counts to the GPCR family and subfamily classification task. Using the dataset and evaluation protocol from the previous study, we found that the Naive Bayes classifier surpassed the reported accuracy of SVM by 4.8% and 6.1% in level I and II subfamily classification, with accuracies of 93.2% and 92.4% respectively. The Decision Tree, while inferior to SVM, still outperforms HMM in both level I and II subfamily classification. Moreover, the n-grams selected by chi-square feature selection show evidence of biological importance. Thus, the document classification approach has resulted in a simpler, more accurate and more interpretable classifier." @default.
- W97281011 created "2016-06-24" @default.
- W97281011 creator A5062362044 @default.
- W97281011 creator A5065265042 @default.
- W97281011 creator A5088836369 @default.
- W97281011 date "2003-01-01" @default.
- W97281011 modified "2023-09-24" @default.
- W97281011 title "Document Classification of Protein Sequences" @default.
- W97281011 cites W1498183065 @default.
- W97281011 cites W1527979595 @default.
- W97281011 cites W1536270671 @default.
- W97281011 cites W1539117683 @default.
- W97281011 cites W1604806933 @default.
- W97281011 cites W1806366435 @default.
- W97281011 cites W1971143006 @default.
- W97281011 cites W1983493107 @default.
- W97281011 cites W1986014242 @default.
- W97281011 cites W1990215053 @default.
- W97281011 cites W1990554768 @default.
- W97281011 cites W1994025789 @default.
- W97281011 cites W1994402758 @default.
- W97281011 cites W1995260859 @default.
- W97281011 cites W1997199912 @default.
- W97281011 cites W2002566401 @default.
- W97281011 cites W2020816856 @default.
- W97281011 cites W2036295979 @default.
- W97281011 cites W2036667662 @default.
- W97281011 cites W2037060512 @default.
- W97281011 cites W2041862730 @default.
- W97281011 cites W2054333829 @default.
- W97281011 cites W2055043387 @default.
- W97281011 cites W2065315036 @default.
- W97281011 cites W2073110681 @default.
- W97281011 cites W2074231493 @default.
- W97281011 cites W2080336769 @default.
- W97281011 cites W2087064593 @default.
- W97281011 cites W2087885007 @default.
- W97281011 cites W2098191315 @default.
- W97281011 cites W2099946731 @default.
- W97281011 cites W2102122585 @default.
- W97281011 cites W2102941368 @default.
- W97281011 cites W2105516199 @default.
- W97281011 cites W2106882534 @default.
- W97281011 cites W2107427637 @default.
- W97281011 cites W2107725114 @default.
- W97281011 cites W2110743132 @default.
- W97281011 cites W2117019496 @default.
- W97281011 cites W2117249420 @default.
- W97281011 cites W2117619142 @default.
- W97281011 cites W2121082582 @default.
- W97281011 cites W2124073224 @default.
- W97281011 cites W2124145810 @default.
- W97281011 cites W2124158580 @default.
- W97281011 cites W2127648442 @default.
- W97281011 cites W2131724426 @default.
- W97281011 cites W2134043769 @default.
- W97281011 cites W2135453151 @default.
- W97281011 cites W2136925037 @default.
- W97281011 cites W2140820106 @default.
- W97281011 cites W2141172196 @default.
- W97281011 cites W2141885858 @default.
- W97281011 cites W2143173841 @default.
- W97281011 cites W2147285856 @default.
- W97281011 cites W2148853951 @default.
- W97281011 cites W2158714788 @default.
- W97281011 cites W2163181687 @default.
- W97281011 cites W2165495989 @default.
- W97281011 cites W2171963266 @default.
- W97281011 cites W2283504545 @default.
- W97281011 cites W2435251607 @default.
- W97281011 cites W3130236387 @default.
- W97281011 cites W51161921 @default.
- W97281011 cites W53898165 @default.
- W97281011 cites W2784619191 @default.
- W97281011 hasPublicationYear "2003" @default.
- W97281011 type Work @default.
- W97281011 sameAs 97281011 @default.
- W97281011 citedByCount "0" @default.
- W97281011 crossrefType "journal-article" @default.
- W97281011 hasAuthorship W97281011A5062362044 @default.
- W97281011 hasAuthorship W97281011A5065265042 @default.
- W97281011 hasAuthorship W97281011A5088836369 @default.
- W97281011 hasConcept C104317684 @default.
- W97281011 hasConcept C119857082 @default.
- W97281011 hasConcept C12267149 @default.
- W97281011 hasConcept C139532973 @default.
- W97281011 hasConcept C148483581 @default.
- W97281011 hasConcept C153180895 @default.
- W97281011 hasConcept C154945302 @default.
- W97281011 hasConcept C41008148 @default.
- W97281011 hasConcept C50929876 @default.
- W97281011 hasConcept C52001869 @default.
- W97281011 hasConcept C54355233 @default.
- W97281011 hasConcept C84525736 @default.
- W97281011 hasConcept C86803240 @default.
- W97281011 hasConcept C95623464 @default.
- W97281011 hasConceptScore W97281011C104317684 @default.
- W97281011 hasConceptScore W97281011C119857082 @default.
- W97281011 hasConceptScore W97281011C12267149 @default.
- W97281011 hasConceptScore W97281011C139532973 @default.