Matches in SemOpenAlex for { <https://semopenalex.org/work/W4360990468> ?p ?o ?g. }
- W4360990468 abstract "Abstract Proteins are utilised in various biotechnological applications, often requiring the optimisation of protein properties by introducing specific amino acid exchanges. Deep mutational scanning (DMS) is an effective high-throughput method for evaluating the effects of these exchanges on protein function. DMS data can then inform the training of a neural network to predict the impact of mutations. Most approaches employ some representation of the protein sequence for training and prediction. As proteins are characterised by complex structures and intricate residue interaction networks, directly providing structural information as input reduces the need to learn these features from the data. We introduce a method for encoding protein structures as stacked 2D contact maps, which capture residue interactions, their evolutionary conservation, and mutation-induced interaction changes. Furthermore, we explored techniques to augment neural network training performance on smaller DMS datasets. To validate our approach, we trained three neural network architectures originally used for image analysis on three DMS datasets, and we compared their performances with networks trained solely on protein sequences. The results confirm the effectiveness of the protein structure encoding in machine learning efforts on DMS data. Using structural representations as direct input to the networks, along with data augmentation and pre-training, significantly reduced demands on training data size and improved prediction performance, especially on smaller datasets, while performance on large datasets was on par with state-of-the-art sequence convolutional neural networks. The methods highlighted here have the potential to expand the applicability of DMS by reducing experimental requirements and by making it accessible to cases where true high-throughput screening is not feasible. Additionally, we present an open-source, user-friendly software tool to make these data analysis techniques accessible, particularly to biotechnology and protein engineering researchers who wish to apply them to their mutagenesis data. Author summary We introduce a novel approach to improve predictions of protein properties based on deep mutational scanning (DMS) data. Our method utilizes a unique data representation for protein structures through stacked 2D contact maps, which capture residue interactions and alterations in these interactions due to amino acid substitutions, as well as evolutionary conservation. We also tested several techniques to enhance the performance of neural networks when trained on small datasets. By training three neural network architectures on three public DMS datasets, we demonstrate the value of integrating protein structure information in training neural networks to predict the effects of amino acid substitutions. Utilizing structural information along with data augmentation and pre-training significantly improves performance on smaller datasets compared to state-of-the-art sequence convolutional neural networks while achieving similar performance on larger datasets. Thus, our approach could substantially reduce the experimental input needed and provide valuable predictions even for smaller DMS datasets when high-throughput screening is not feasible." @default.
- W4360990468 created "2023-03-30" @default.
- W4360990468 creator A5000823779 @default.
- W4360990468 creator A5021953327 @default.
- W4360990468 creator A5027940353 @default.
- W4360990468 creator A5033733724 @default.
- W4360990468 date "2023-03-27" @default.
- W4360990468 modified "2023-09-29" @default.
- W4360990468 title "Flattening the curve - How to get better results with small deep-mutational-scanning datasets" @default.
- W4360990468 cites W1995808589 @default.
- W4360990468 cites W2003378333 @default.
- W4360990468 cites W2014159272 @default.
- W4360990468 cites W2057520802 @default.
- W4360990468 cites W2060588922 @default.
- W4360990468 cites W2077835601 @default.
- W4360990468 cites W2097270746 @default.
- W4360990468 cites W2112796928 @default.
- W4360990468 cites W2114029728 @default.
- W4360990468 cites W2146256775 @default.
- W4360990468 cites W2146341019 @default.
- W4360990468 cites W2151887741 @default.
- W4360990468 cites W2245592118 @default.
- W4360990468 cites W2739455021 @default.
- W4360990468 cites W2774216375 @default.
- W4360990468 cites W2890223884 @default.
- W4360990468 cites W2950672524 @default.
- W4360990468 cites W2963446712 @default.
- W4360990468 cites W2975488016 @default.
- W4360990468 cites W2987741655 @default.
- W4360990468 cites W3018421652 @default.
- W4360990468 cites W3043310823 @default.
- W4360990468 cites W3048714202 @default.
- W4360990468 cites W3144239152 @default.
- W4360990468 cites W3144701084 @default.
- W4360990468 cites W3146944767 @default.
- W4360990468 cites W3177828909 @default.
- W4360990468 cites W3186179742 @default.
- W4360990468 cites W3215514970 @default.
- W4360990468 cites W4280615473 @default.
- W4360990468 cites W4285661751 @default.
- W4360990468 cites W4293539946 @default.
- W4360990468 doi "https://doi.org/10.1101/2023.03.27.534314" @default.
- W4360990468 hasPublicationYear "2023" @default.
- W4360990468 type Work @default.
- W4360990468 citedByCount "0" @default.
- W4360990468 crossrefType "posted-content" @default.
- W4360990468 hasAuthorship W4360990468A5000823779 @default.
- W4360990468 hasAuthorship W4360990468A5021953327 @default.
- W4360990468 hasAuthorship W4360990468A5027940353 @default.
- W4360990468 hasAuthorship W4360990468A5033733724 @default.
- W4360990468 hasBestOaLocation W43609904681 @default.
- W4360990468 hasConcept C104317684 @default.
- W4360990468 hasConcept C108583219 @default.
- W4360990468 hasConcept C119857082 @default.
- W4360990468 hasConcept C124101348 @default.
- W4360990468 hasConcept C125411270 @default.
- W4360990468 hasConcept C153180895 @default.
- W4360990468 hasConcept C154945302 @default.
- W4360990468 hasConcept C2986374874 @default.
- W4360990468 hasConcept C41008148 @default.
- W4360990468 hasConcept C50644808 @default.
- W4360990468 hasConcept C55493867 @default.
- W4360990468 hasConcept C81363708 @default.
- W4360990468 hasConcept C86803240 @default.
- W4360990468 hasConceptScore W4360990468C104317684 @default.
- W4360990468 hasConceptScore W4360990468C108583219 @default.
- W4360990468 hasConceptScore W4360990468C119857082 @default.
- W4360990468 hasConceptScore W4360990468C124101348 @default.
- W4360990468 hasConceptScore W4360990468C125411270 @default.
- W4360990468 hasConceptScore W4360990468C153180895 @default.
- W4360990468 hasConceptScore W4360990468C154945302 @default.
- W4360990468 hasConceptScore W4360990468C2986374874 @default.
- W4360990468 hasConceptScore W4360990468C41008148 @default.
- W4360990468 hasConceptScore W4360990468C50644808 @default.
- W4360990468 hasConceptScore W4360990468C55493867 @default.
- W4360990468 hasConceptScore W4360990468C81363708 @default.
- W4360990468 hasConceptScore W4360990468C86803240 @default.
- W4360990468 hasLocation W43609904681 @default.
- W4360990468 hasOpenAccess W4360990468 @default.
- W4360990468 hasPrimaryLocation W43609904681 @default.
- W4360990468 hasRelatedWork W2337926734 @default.
- W4360990468 hasRelatedWork W2738221750 @default.
- W4360990468 hasRelatedWork W3021430260 @default.
- W4360990468 hasRelatedWork W3156786002 @default.
- W4360990468 hasRelatedWork W4312417841 @default.
- W4360990468 hasRelatedWork W4320802194 @default.
- W4360990468 hasRelatedWork W4321369474 @default.
- W4360990468 hasRelatedWork W4366224123 @default.
- W4360990468 hasRelatedWork W4381487685 @default.
- W4360990468 hasRelatedWork W564581980 @default.
- W4360990468 isParatext "false" @default.
- W4360990468 isRetracted "false" @default.
- W4360990468 workType "article" @default.