Matches in SemOpenAlex for { <https://semopenalex.org/work/W2148488210> ?p ?o ?g. }
- W2148488210 endingPage "2301" @default.
- W2148488210 startingPage "2295" @default.
- W2148488210 abstract "Motivation: Structural knowledge, extracted from the Protein Data Bank (PDB), underlies numerous potential functions and prediction methods. The PDB, however, is highly biased: many proteins have more than one entry, while entire protein families are represented by a single structure, or even not at all. The standard solution to this problem is to limit the studies to non-redundant subsets of the PDB. While alleviating biases, this solution hides the many-to-many relations between sequences and structures. That is, non-redundant datasets conceal the diversity of sequences that share the same fold and the existence of multiple conformations for the same protein. A particularly disturbing aspect of non-redundant subsets is that they hardly benefit from the rapid pace of protein structure determination, as most newly solved structures fall within existing families. Results: In this study we explore the concept of redundancy-weighted datasets, originally suggested by Miyazawa and Jernigan. Redundancy-weighted datasets include all available structures and associate them (or features thereof) with weights that are inversely proportional to the number of their homologs. Here, we provide the first systematic comparison of redundancy-weighted datasets with non-redundant ones. We test three weighting schemes and show that the distributions of structural features that they produce are smoother (having higher entropy) compared with the distributions inferred from non-redundant datasets. We further show that these smoothed distributions are both more robust and more correct than their non-redundant counterparts. We suggest that the better distributions, inferred using redundancy-weighting, may improve the accuracy of knowledge-based potentials and increase the power of protein structure prediction methods. Consequently, they may enhance model-driven molecular biology. Contact: cheny@il.ibm.com or chen.keasar@gmail.com" @default.
- W2148488210 created "2016-06-24" @default.
- W2148488210 creator A5019080704 @default.
- W2148488210 creator A5022114369 @default.
- W2148488210 creator A5039615104 @default.
- W2148488210 creator A5072753313 @default.
- W2148488210 creator A5080435987 @default.
- W2148488210 date "2014-04-25" @default.
- W2148488210 modified "2023-10-11" @default.
- W2148488210 title "Redundancy-weighting for better inference of protein structural features" @default.
- W2148488210 cites W1499450468 @default.
- W2148488210 cites W1966041739 @default.
- W2148488210 cites W1968543088 @default.
- W2148488210 cites W1975017420 @default.
- W2148488210 cites W1986347357 @default.
- W2148488210 cites W1989228404 @default.
- W2148488210 cites W2008708467 @default.
- W2148488210 cites W2015292449 @default.
- W2148488210 cites W2044145005 @default.
- W2148488210 cites W2049695588 @default.
- W2148488210 cites W2051872583 @default.
- W2148488210 cites W2071486470 @default.
- W2148488210 cites W2074231493 @default.
- W2148488210 cites W2091614031 @default.
- W2148488210 cites W2091778154 @default.
- W2148488210 cites W2093916754 @default.
- W2148488210 cites W2095450147 @default.
- W2148488210 cites W2106154817 @default.
- W2148488210 cites W2106882534 @default.
- W2148488210 cites W2107037205 @default.
- W2148488210 cites W2110821410 @default.
- W2148488210 cites W2118245746 @default.
- W2148488210 cites W2131204071 @default.
- W2148488210 cites W2139319441 @default.
- W2148488210 cites W2141120048 @default.
- W2148488210 cites W2142529984 @default.
- W2148488210 cites W2146950091 @default.
- W2148488210 cites W2148109411 @default.
- W2148488210 cites W2148645951 @default.
- W2148488210 cites W2153153865 @default.
- W2148488210 cites W2156125289 @default.
- W2148488210 cites W2157965775 @default.
- W2148488210 cites W2158714788 @default.
- W2148488210 cites W2161776897 @default.
- W2148488210 cites W2168211076 @default.
- W2148488210 cites W2171546617 @default.
- W2148488210 doi "https://doi.org/10.1093/bioinformatics/btu242" @default.
- W2148488210 hasPubMedCentralId "https://www.ncbi.nlm.nih.gov/pmc/articles/4192046" @default.
- W2148488210 hasPubMedId "https://pubmed.ncbi.nlm.nih.gov/24771517" @default.
- W2148488210 hasPublicationYear "2014" @default.
- W2148488210 type Work @default.
- W2148488210 sameAs 2148488210 @default.
- W2148488210 citedByCount "11" @default.
- W2148488210 countsByYear W21484882102015 @default.
- W2148488210 countsByYear W21484882102016 @default.
- W2148488210 countsByYear W21484882102017 @default.
- W2148488210 countsByYear W21484882102018 @default.
- W2148488210 countsByYear W21484882102019 @default.
- W2148488210 countsByYear W21484882102020 @default.
- W2148488210 countsByYear W21484882102022 @default.
- W2148488210 countsByYear W21484882102023 @default.
- W2148488210 crossrefType "journal-article" @default.
- W2148488210 hasAuthorship W2148488210A5019080704 @default.
- W2148488210 hasAuthorship W2148488210A5022114369 @default.
- W2148488210 hasAuthorship W2148488210A5039615104 @default.
- W2148488210 hasAuthorship W2148488210A5072753313 @default.
- W2148488210 hasAuthorship W2148488210A5080435987 @default.
- W2148488210 hasBestOaLocation W21484882101 @default.
- W2148488210 hasConcept C106301342 @default.
- W2148488210 hasConcept C111919701 @default.
- W2148488210 hasConcept C11413529 @default.
- W2148488210 hasConcept C119145174 @default.
- W2148488210 hasConcept C121332964 @default.
- W2148488210 hasConcept C124101348 @default.
- W2148488210 hasConcept C126838900 @default.
- W2148488210 hasConcept C152124472 @default.
- W2148488210 hasConcept C153180895 @default.
- W2148488210 hasConcept C154945302 @default.
- W2148488210 hasConcept C183115368 @default.
- W2148488210 hasConcept C2776214188 @default.
- W2148488210 hasConcept C41008148 @default.
- W2148488210 hasConcept C47701112 @default.
- W2148488210 hasConcept C55493867 @default.
- W2148488210 hasConcept C62520636 @default.
- W2148488210 hasConcept C65556437 @default.
- W2148488210 hasConcept C71924100 @default.
- W2148488210 hasConcept C86803240 @default.
- W2148488210 hasConceptScore W2148488210C106301342 @default.
- W2148488210 hasConceptScore W2148488210C111919701 @default.
- W2148488210 hasConceptScore W2148488210C11413529 @default.
- W2148488210 hasConceptScore W2148488210C119145174 @default.
- W2148488210 hasConceptScore W2148488210C121332964 @default.
- W2148488210 hasConceptScore W2148488210C124101348 @default.
- W2148488210 hasConceptScore W2148488210C126838900 @default.
- W2148488210 hasConceptScore W2148488210C152124472 @default.
- W2148488210 hasConceptScore W2148488210C153180895 @default.
- W2148488210 hasConceptScore W2148488210C154945302 @default.
- W2148488210 hasConceptScore W2148488210C183115368 @default.