Matches in SemOpenAlex for { <https://semopenalex.org/work/W2080737377> ?p ?o ?g. }
- W2080737377 abstract "Structural similarities among proteins can provide valuable insight into their functional mechanisms and relationships. As the number of available three-dimensional (3D) protein structures increases, a greater variety of studies can be conducted with increasing efficiency, among which is the design of protein structural alphabets. Structural alphabets allow us to characterize local structures of proteins and describe the global folding structure of a protein using a one-dimensional (1D) sequence. Thus, 1D sequences can be used to identify structural similarities among proteins using standard sequence alignment tools such as BLAST or FASTA. We used self-organizing maps in combination with a minimum spanning tree algorithm to determine the optimum size of a structural alphabet and applied the k-means algorithm to group protein fragments into clusters. The centroids of these clusters defined the structural alphabet. We also developed a flexible matrix training system to build a substitution matrix (TRISUM-169) for our alphabet. Based on FASTA and using TRISUM-169 as the substitution matrix, we developed the SA-FAST alignment tool. We compared the performance of SA-FAST with that of various search tools in database-scale search tasks and found that SA-FAST was highly competitive in all tests conducted. Further, we evaluated the performance of our structural alphabet in recognizing specific structural domains of EGF and EGF-like proteins. Our method successfully recovered more EGF sub-domains using our structural alphabet than when using other structural alphabets. SA-FAST can be found at http://140.113.166.178/safast/ . The goal of this project was two-fold. First, we wanted to introduce a modular design pipeline to those who have been working with structural alphabets. Secondly, we wanted to open the door to researchers who have done substantial work in biological sequences but have yet to enter the field of protein structure research. 
Our experiments showed that by transforming the structural representations from 3D to 1D, several 1D-based tools can be applied to structural analysis, including similarity searches and structural motif finding." @default.
- W2080737377 created "2016-06-24" @default.
- W2080737377 creator A5004146365 @default.
- W2080737377 creator A5029550300 @default.
- W2080737377 date "2008-08-22" @default.
- W2080737377 modified "2023-10-10" @default.
- W2080737377 title "Protein structure search and local structure characterization" @default.
- W2080737377 cites W1519235666 @default.
- W2080737377 cites W1977556410 @default.
- W2080737377 cites W2000912174 @default.
- W2080737377 cites W2008393730 @default.
- W2080737377 cites W2022058405 @default.
- W2080737377 cites W2029667189 @default.
- W2080737377 cites W2049540991 @default.
- W2080737377 cites W2050330741 @default.
- W2080737377 cites W2057289558 @default.
- W2080737377 cites W2060589515 @default.
- W2080737377 cites W2062318714 @default.
- W2080737377 cites W2064184555 @default.
- W2080737377 cites W2070693796 @default.
- W2080737377 cites W2082120777 @default.
- W2080737377 cites W2088854697 @default.
- W2080737377 cites W2095450147 @default.
- W2080737377 cites W2100981947 @default.
- W2080737377 cites W2100990314 @default.
- W2080737377 cites W2106976556 @default.
- W2080737377 cites W2110802877 @default.
- W2080737377 cites W2125725297 @default.
- W2080737377 cites W2126016150 @default.
- W2080737377 cites W2128535557 @default.
- W2080737377 cites W2130479394 @default.
- W2080737377 cites W2135484509 @default.
- W2080737377 cites W2139445777 @default.
- W2080737377 cites W2141408390 @default.
- W2080737377 cites W2141915739 @default.
- W2080737377 cites W2143210482 @default.
- W2080737377 cites W2147351132 @default.
- W2080737377 cites W2147859827 @default.
- W2080737377 cites W2148823738 @default.
- W2080737377 cites W2150627302 @default.
- W2080737377 cites W2152326664 @default.
- W2080737377 cites W2160299673 @default.
- W2080737377 cites W2168211076 @default.
- W2080737377 cites W2170471837 @default.
- W2080737377 cites W4238388397 @default.
- W2080737377 doi "https://doi.org/10.1186/1471-2105-9-349" @default.
- W2080737377 hasPubMedCentralId "https://www.ncbi.nlm.nih.gov/pmc/articles/2529324" @default.
- W2080737377 hasPubMedId "https://pubmed.ncbi.nlm.nih.gov/18721472" @default.
- W2080737377 hasPublicationYear "2008" @default.
- W2080737377 type Work @default.
- W2080737377 sameAs 2080737377 @default.
- W2080737377 citedByCount "16" @default.
- W2080737377 countsByYear W20807373772012 @default.
- W2080737377 countsByYear W20807373772013 @default.
- W2080737377 countsByYear W20807373772014 @default.
- W2080737377 countsByYear W20807373772019 @default.
- W2080737377 crossrefType "journal-article" @default.
- W2080737377 hasAuthorship W2080737377A5004146365 @default.
- W2080737377 hasAuthorship W2080737377A5029550300 @default.
- W2080737377 hasBestOaLocation W20807373771 @default.
- W2080737377 hasConcept C104317684 @default.
- W2080737377 hasConcept C119599485 @default.
- W2080737377 hasConcept C127413603 @default.
- W2080737377 hasConcept C146599234 @default.
- W2080737377 hasConcept C154945302 @default.
- W2080737377 hasConcept C167625842 @default.
- W2080737377 hasConcept C199360897 @default.
- W2080737377 hasConcept C2776545253 @default.
- W2080737377 hasConcept C2778112365 @default.
- W2080737377 hasConcept C2778220771 @default.
- W2080737377 hasConcept C41008148 @default.
- W2080737377 hasConcept C45484198 @default.
- W2080737377 hasConcept C4668613 @default.
- W2080737377 hasConcept C47701112 @default.
- W2080737377 hasConcept C509619924 @default.
- W2080737377 hasConcept C54355233 @default.
- W2080737377 hasConcept C55493867 @default.
- W2080737377 hasConcept C69131567 @default.
- W2080737377 hasConcept C70721500 @default.
- W2080737377 hasConcept C86803240 @default.
- W2080737377 hasConceptScore W2080737377C104317684 @default.
- W2080737377 hasConceptScore W2080737377C119599485 @default.
- W2080737377 hasConceptScore W2080737377C127413603 @default.
- W2080737377 hasConceptScore W2080737377C146599234 @default.
- W2080737377 hasConceptScore W2080737377C154945302 @default.
- W2080737377 hasConceptScore W2080737377C167625842 @default.
- W2080737377 hasConceptScore W2080737377C199360897 @default.
- W2080737377 hasConceptScore W2080737377C2776545253 @default.
- W2080737377 hasConceptScore W2080737377C2778112365 @default.
- W2080737377 hasConceptScore W2080737377C2778220771 @default.
- W2080737377 hasConceptScore W2080737377C41008148 @default.
- W2080737377 hasConceptScore W2080737377C45484198 @default.
- W2080737377 hasConceptScore W2080737377C4668613 @default.
- W2080737377 hasConceptScore W2080737377C47701112 @default.
- W2080737377 hasConceptScore W2080737377C509619924 @default.
- W2080737377 hasConceptScore W2080737377C54355233 @default.
- W2080737377 hasConceptScore W2080737377C55493867 @default.
- W2080737377 hasConceptScore W2080737377C69131567 @default.
- W2080737377 hasConceptScore W2080737377C70721500 @default.
- W2080737377 hasConceptScore W2080737377C86803240 @default.
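
The abstract above describes encoding a 3D structure as a 1D sequence by mapping each protein fragment to the nearest cluster centroid of a structural alphabet. The following is a minimal sketch of that encoding step only, not the authors' implementation. The centroids, the two-value fragment descriptors, and the function names `nearest_letter` and `encode_structure` are hypothetical; in the paper the centroids come from k-means clustering rather than being hard-coded.

```python
import math

def nearest_letter(fragment, centroids):
    """Return the alphabet letter whose centroid is closest (Euclidean) to the fragment."""
    return min(centroids, key=lambda letter: math.dist(fragment, centroids[letter]))

def encode_structure(fragments, centroids):
    """Map a list of fragment descriptors to a 1D string, one letter per fragment."""
    return "".join(nearest_letter(f, centroids) for f in fragments)

# Hypothetical toy alphabet: three letters, each with a 2-value centroid.
# Real descriptors would be derived from 3D backbone coordinates.
centroids = {"A": (0.0, 0.0), "B": (1.0, 1.0), "C": (-1.0, 2.0)}

# Hypothetical fragment descriptors for one protein chain.
fragments = [(0.1, -0.1), (0.9, 1.2), (-0.8, 1.9), (0.2, 0.1)]

encoded = encode_structure(fragments, centroids)
print(encoded)  # prints "ABCA"
```

A string produced this way could then be passed to a sequence alignment tool together with a substitution matrix defined over the same alphabet, which is the role the abstract assigns to FASTA and TRISUM-169.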