Matches in SemOpenAlex for { <https://semopenalex.org/work/W3087655841> ?p ?o ?g. }
- W3087655841 endingPage "5681" @default.
- W3087655841 startingPage "5667" @default.
- W3087655841 abstract "Although massive data is quickly accumulating on protein sequence and structure, there is a small and limited number of protein architectural types (or structural folds). This study is addressing the following question: how well could one reveal underlying sequence-structure relationships and design protein sequences for an arbitrary, potentially novel, structural fold? In response to the question, we have developed novel deep generative models, namely, semisupervised gcWGAN (guided, conditional, Wasserstein Generative Adversarial Networks). To overcome training difficulties and improve design qualities, we build our models on conditional Wasserstein GAN (WGAN) that uses Wasserstein distance in the loss function. Our major contributions include (1) constructing a low-dimensional and generalizable representation of the fold space for the conditional input, (2) developing an ultrafast sequence-to-fold predictor (or oracle) and incorporating its feedback into WGAN as a loss to guide model training, and (3) exploiting sequence data with and without paired structures to enable a semisupervised training strategy. Assessed by the oracle over 100 novel folds not in the training set, gcWGAN generates more successful designs and covers 3.5 times more target folds compared to a competing data-driven method (cVAE). Assessed by sequence- and structure-based predictors, gcWGAN designs are physically and biologically sound. Assessed by a structure predictor over representative novel folds, including one not even part of basis folds, gcWGAN designs have comparable or better fold accuracy yet much more sequence diversity and novelty than cVAE. The ultrafast data-driven model is further shown to boost the success of a principle-driven de novo method (RosettaDesign), through generating design seeds and tailoring design space. In conclusion, gcWGAN explores uncharted sequence space to design proteins by learning generalizable principles from current sequence-structure data. Data, source codes, and trained models are available at https://github.com/Shen-Lab/gcWGAN." @default.
- W3087655841 created "2020-09-25" @default.
- W3087655841 creator A5008603338 @default.
- W3087655841 creator A5028650738 @default.
- W3087655841 creator A5057485825 @default.
- W3087655841 creator A5069807825 @default.
- W3087655841 date "2020-09-18" @default.
- W3087655841 modified "2023-09-26" @default.
- W3087655841 title "De Novo Protein Design for Novel Folds Using Guided Conditional Wasserstein Generative Adversarial Networks" @default.
- W3087655841 cites W141681130 @default.
- W3087655841 cites W1493324970 @default.
- W3087655841 cites W1587559447 @default.
- W3087655841 cites W1965186869 @default.
- W3087655841 cites W1975304761 @default.
- W3087655841 cites W1978447477 @default.
- W3087655841 cites W1979762151 @default.
- W3087655841 cites W1996576490 @default.
- W3087655841 cites W1996733064 @default.
- W3087655841 cites W2003229787 @default.
- W3087655841 cites W2013425283 @default.
- W3087655841 cites W2045777307 @default.
- W3087655841 cites W2059681116 @default.
- W3087655841 cites W2101712920 @default.
- W3087655841 cites W2102245393 @default.
- W3087655841 cites W2102461176 @default.
- W3087655841 cites W2106775540 @default.
- W3087655841 cites W2107867854 @default.
- W3087655841 cites W2114340287 @default.
- W3087655841 cites W2114850508 @default.
- W3087655841 cites W2115540209 @default.
- W3087655841 cites W2120836664 @default.
- W3087655841 cites W2121627241 @default.
- W3087655841 cites W2132644745 @default.
- W3087655841 cites W2136724628 @default.
- W3087655841 cites W2141795045 @default.
- W3087655841 cites W2144686793 @default.
- W3087655841 cites W2145350307 @default.
- W3087655841 cites W2170471837 @default.
- W3087655841 cites W2201713963 @default.
- W3087655841 cites W2252678535 @default.
- W3087655841 cites W2340987618 @default.
- W3087655841 cites W2519539312 @default.
- W3087655841 cites W2579798392 @default.
- W3087655841 cites W2735621019 @default.
- W3087655841 cites W2784920021 @default.
- W3087655841 cites W2785273668 @default.
- W3087655841 cites W2809642602 @default.
- W3087655841 cites W2809879025 @default.
- W3087655841 cites W2883583470 @default.
- W3087655841 cites W2889498145 @default.
- W3087655841 cites W2891007938 @default.
- W3087655841 cites W2891841439 @default.
- W3087655841 cites W2895487334 @default.
- W3087655841 cites W2898392948 @default.
- W3087655841 cites W2898664946 @default.
- W3087655841 cites W2899747610 @default.
- W3087655841 cites W2902353954 @default.
- W3087655841 cites W2908391177 @default.
- W3087655841 cites W2949867299 @default.
- W3087655841 cites W2951282333 @default.
- W3087655841 cites W2979389376 @default.
- W3087655841 cites W2999044305 @default.
- W3087655841 cites W3010339412 @default.
- W3087655841 cites W3098128018 @default.
- W3087655841 cites W3100751385 @default.
- W3087655841 doi "https://doi.org/10.1021/acs.jcim.0c00593" @default.
- W3087655841 hasPubMedCentralId "https://www.ncbi.nlm.nih.gov/pmc/articles/7775287" @default.
- W3087655841 hasPubMedId "https://pubmed.ncbi.nlm.nih.gov/32945673" @default.
- W3087655841 hasPublicationYear "2020" @default.
- W3087655841 type Work @default.
- W3087655841 sameAs 3087655841 @default.
- W3087655841 citedByCount "39" @default.
- W3087655841 countsByYear W30876558412020 @default.
- W3087655841 countsByYear W30876558412021 @default.
- W3087655841 countsByYear W30876558412022 @default.
- W3087655841 countsByYear W30876558412023 @default.
- W3087655841 crossrefType "journal-article" @default.
- W3087655841 hasAuthorship W3087655841A5008603338 @default.
- W3087655841 hasAuthorship W3087655841A5028650738 @default.
- W3087655841 hasAuthorship W3087655841A5057485825 @default.
- W3087655841 hasAuthorship W3087655841A5069807825 @default.
- W3087655841 hasBestOaLocation W30876558412 @default.
- W3087655841 hasConcept C11413529 @default.
- W3087655841 hasConcept C115903868 @default.
- W3087655841 hasConcept C119857082 @default.
- W3087655841 hasConcept C153180895 @default.
- W3087655841 hasConcept C154945302 @default.
- W3087655841 hasConcept C167966045 @default.
- W3087655841 hasConcept C2778112365 @default.
- W3087655841 hasConcept C39890363 @default.
- W3087655841 hasConcept C41008148 @default.
- W3087655841 hasConcept C54355233 @default.
- W3087655841 hasConcept C55166926 @default.
- W3087655841 hasConcept C86803240 @default.
- W3087655841 hasConceptScore W3087655841C11413529 @default.
- W3087655841 hasConceptScore W3087655841C115903868 @default.
- W3087655841 hasConceptScore W3087655841C119857082 @default.
- W3087655841 hasConceptScore W3087655841C153180895 @default.