Matches in SemOpenAlex for { <https://semopenalex.org/work/W3115809040> ?p ?o ?g. }
- W3115809040 endingPage "i725" @default.
- W3115809040 startingPage "i718" @default.
- W3115809040 abstract "Motivation: As the number of experimentally solved protein structures rises, it becomes increasingly appealing to use structural information for predictive tasks involving proteins. Due to the large variation in protein sizes, folds and topologies, an attractive approach is to embed protein structures into fixed-length vectors, which can be used in machine learning algorithms aimed at predicting and understanding functional and physical properties. Many existing embedding approaches are alignment based, which is both time-consuming and ineffective for distantly related proteins. On the other hand, library- or model-based approaches depend on a small library of fragments or require the use of a trained model, both of which may not generalize well. Results: We present Geometricus, a novel and universally applicable approach to embedding proteins in a fixed-dimensional space. The approach is fast, accurate, and interpretable. Geometricus uses a set of 3D moment invariants to discretize fragments of protein structures into shape-mers, which are then counted to describe the full structure as a vector of counts. We demonstrate the applicability of this approach in various tasks, ranging from fast structure similarity search, unsupervised clustering and structure classification across proteins from different superfamilies as well as within the same family. Availability and implementation: Python code available at https://git.wur.nl/durai001/geometricus." @default.
- W3115809040 created "2021-01-05" @default.
- W3115809040 creator A5015936316 @default.
- W3115809040 creator A5040003101 @default.
- W3115809040 creator A5078060263 @default.
- W3115809040 creator A5080533789 @default.
- W3115809040 date "2020-12-01" @default.
- W3115809040 modified "2023-10-14" @default.
- W3115809040 title "Geometricus represents protein structures as shape-mers derived from moment invariants" @default.
- W3115809040 cites W1483099488 @default.
- W3115809040 cites W1516459493 @default.
- W3115809040 cites W1596576919 @default.
- W3115809040 cites W1971650367 @default.
- W3115809040 cites W1974126272 @default.
- W3115809040 cites W1974525959 @default.
- W3115809040 cites W2007981068 @default.
- W3115809040 cites W2021712510 @default.
- W3115809040 cites W2076443836 @default.
- W3115809040 cites W2085277871 @default.
- W3115809040 cites W2089085945 @default.
- W3115809040 cites W2091506983 @default.
- W3115809040 cites W2098871350 @default.
- W3115809040 cites W2108329881 @default.
- W3115809040 cites W2123681603 @default.
- W3115809040 cites W2136567909 @default.
- W3115809040 cites W2137382111 @default.
- W3115809040 cites W2152839250 @default.
- W3115809040 cites W2157965775 @default.
- W3115809040 cites W2159498975 @default.
- W3115809040 cites W2167212630 @default.
- W3115809040 cites W2173918725 @default.
- W3115809040 cites W2225513621 @default.
- W3115809040 cites W2276732348 @default.
- W3115809040 cites W2315943170 @default.
- W3115809040 cites W2385735394 @default.
- W3115809040 cites W2888784731 @default.
- W3115809040 cites W2895146543 @default.
- W3115809040 cites W2903357856 @default.
- W3115809040 cites W2913820882 @default.
- W3115809040 cites W2950806586 @default.
- W3115809040 cites W2980789587 @default.
- W3115809040 cites W2999044305 @default.
- W3115809040 cites W4210531204 @default.
- W3115809040 doi "https://doi.org/10.1093/bioinformatics/btaa839" @default.
- W3115809040 hasPubMedId "https://pubmed.ncbi.nlm.nih.gov/33381814" @default.
- W3115809040 hasPublicationYear "2020" @default.
- W3115809040 type Work @default.
- W3115809040 sameAs 3115809040 @default.
- W3115809040 citedByCount "19" @default.
- W3115809040 countsByYear W31158090402020 @default.
- W3115809040 countsByYear W31158090402021 @default.
- W3115809040 countsByYear W31158090402022 @default.
- W3115809040 countsByYear W31158090402023 @default.
- W3115809040 crossrefType "journal-article" @default.
- W3115809040 hasAuthorship W3115809040A5015936316 @default.
- W3115809040 hasAuthorship W3115809040A5040003101 @default.
- W3115809040 hasAuthorship W3115809040A5078060263 @default.
- W3115809040 hasAuthorship W3115809040A5080533789 @default.
- W3115809040 hasBestOaLocation W31158090401 @default.
- W3115809040 hasConcept C103278499 @default.
- W3115809040 hasConcept C111919701 @default.
- W3115809040 hasConcept C11413529 @default.
- W3115809040 hasConcept C115961682 @default.
- W3115809040 hasConcept C13336665 @default.
- W3115809040 hasConcept C134306372 @default.
- W3115809040 hasConcept C154945302 @default.
- W3115809040 hasConcept C177264268 @default.
- W3115809040 hasConcept C199360897 @default.
- W3115809040 hasConcept C2524010 @default.
- W3115809040 hasConcept C33923547 @default.
- W3115809040 hasConcept C41008148 @default.
- W3115809040 hasConcept C41608201 @default.
- W3115809040 hasConcept C47701112 @default.
- W3115809040 hasConcept C519991488 @default.
- W3115809040 hasConcept C55493867 @default.
- W3115809040 hasConcept C73000952 @default.
- W3115809040 hasConcept C73555534 @default.
- W3115809040 hasConcept C80444323 @default.
- W3115809040 hasConcept C86803240 @default.
- W3115809040 hasConceptScore W3115809040C103278499 @default.
- W3115809040 hasConceptScore W3115809040C111919701 @default.
- W3115809040 hasConceptScore W3115809040C11413529 @default.
- W3115809040 hasConceptScore W3115809040C115961682 @default.
- W3115809040 hasConceptScore W3115809040C13336665 @default.
- W3115809040 hasConceptScore W3115809040C134306372 @default.
- W3115809040 hasConceptScore W3115809040C154945302 @default.
- W3115809040 hasConceptScore W3115809040C177264268 @default.
- W3115809040 hasConceptScore W3115809040C199360897 @default.
- W3115809040 hasConceptScore W3115809040C2524010 @default.
- W3115809040 hasConceptScore W3115809040C33923547 @default.
- W3115809040 hasConceptScore W3115809040C41008148 @default.
- W3115809040 hasConceptScore W3115809040C41608201 @default.
- W3115809040 hasConceptScore W3115809040C47701112 @default.
- W3115809040 hasConceptScore W3115809040C519991488 @default.
- W3115809040 hasConceptScore W3115809040C55493867 @default.
- W3115809040 hasConceptScore W3115809040C73000952 @default.
- W3115809040 hasConceptScore W3115809040C73555534 @default.
- W3115809040 hasConceptScore W3115809040C80444323 @default.