Matches in SemOpenAlex for { <https://semopenalex.org/work/W2140356722> ?p ?o ?g. }
- W2140356722 abstract "Abstract Background Recombinant protein production is a useful biotechnology to produce a large quantity of highly soluble proteins. Currently, the most widely used production system is to fuse a target protein into different vectors in Escherichia coli (E. coli). However, the production efficacy of different vectors varies for different target proteins. Trial-and-error is still the common practice to find out the efficacy of a vector for a given target protein. Previous studies are limited in that they assumed that proteins would be over-expressed and focused only on the solubility of expressed proteins. In fact, many pairings of vectors and proteins result in no expression. Results In this study, we applied machine learning to train prediction models to predict whether a vector-protein pairing will express or not express in E. coli. For expressed cases, the models further predict whether the expressed proteins would be soluble. We collected a set of real cases from the clients of our recombinant protein production core facility, where six different vectors were designed and studied. This set of cases is used in both training and evaluation of our models. We evaluate three different models based on support vector machines (SVM) and their ensembles. Unlike many previous works, these models consider the sequence of the target protein as well as the sequence of the whole fusion vector as the features. We show that a model that classifies a case into one of the three classes (no expression, inclusion body and soluble) outperforms a model that considers the nested structure of the three classes, while a model that can take advantage of the hierarchical structure of the three classes performs slightly worse than, but comparably to, the best model. Meanwhile, we show that our best method achieves higher prediction accuracy than previous works. 
Lastly, we briefly present two methods to use the trained model in the design of the recombinant protein production systems to improve the chance of high soluble protein production. Conclusion In this paper, we show that a machine learning approach to the prediction of the efficacy of a vector for a target protein in a recombinant protein production system is promising and may complement traditional knowledge-driven study of the efficacy. We will release our program to share with other labs in the public domain when this paper is published." @default.
- W2140356722 created "2016-06-24" @default.
- W2140356722 creator A5019463541 @default.
- W2140356722 creator A5030865164 @default.
- W2140356722 creator A5037335706 @default.
- W2140356722 creator A5040718770 @default.
- W2140356722 creator A5056615015 @default.
- W2140356722 creator A5070191522 @default.
- W2140356722 date "2010-01-01" @default.
- W2140356722 modified "2023-10-10" @default.
- W2140356722 title "Learning to predict expression efficacy of vectors in recombinant protein production" @default.
- W2140356722 cites W1977779119 @default.
- W2140356722 cites W1987534021 @default.
- W2140356722 cites W1996423252 @default.
- W2140356722 cites W2014566476 @default.
- W2140356722 cites W2017526934 @default.
- W2140356722 cites W2032699674 @default.
- W2140356722 cites W2058268622 @default.
- W2140356722 cites W2067602907 @default.
- W2140356722 cites W2068446924 @default.
- W2140356722 cites W2095900655 @default.
- W2140356722 cites W2097152121 @default.
- W2140356722 cites W2097423696 @default.
- W2140356722 cites W2101924701 @default.
- W2140356722 cites W2107270211 @default.
- W2140356722 cites W2109449939 @default.
- W2140356722 cites W2115629999 @default.
- W2140356722 cites W2119246926 @default.
- W2140356722 cites W2137490166 @default.
- W2140356722 cites W2144587714 @default.
- W2140356722 cites W2148406855 @default.
- W2140356722 cites W2149367664 @default.
- W2140356722 cites W2151430326 @default.
- W2140356722 cites W2152770371 @default.
- W2140356722 cites W2156909104 @default.
- W2140356722 cites W2164511624 @default.
- W2140356722 cites W2172000360 @default.
- W2140356722 cites W2255304239 @default.
- W2140356722 cites W2735870257 @default.
- W2140356722 cites W4319425375 @default.
- W2140356722 doi "https://doi.org/10.1186/1471-2105-11-s1-s21" @default.
- W2140356722 hasPubMedCentralId "https://www.ncbi.nlm.nih.gov/pmc/articles/3009492" @default.
- W2140356722 hasPubMedId "https://pubmed.ncbi.nlm.nih.gov/20122193" @default.
- W2140356722 hasPublicationYear "2010" @default.
- W2140356722 type Work @default.
- W2140356722 sameAs 2140356722 @default.
- W2140356722 citedByCount "30" @default.
- W2140356722 countsByYear W21403567222012 @default.
- W2140356722 countsByYear W21403567222014 @default.
- W2140356722 countsByYear W21403567222015 @default.
- W2140356722 countsByYear W21403567222018 @default.
- W2140356722 countsByYear W21403567222019 @default.
- W2140356722 countsByYear W21403567222020 @default.
- W2140356722 countsByYear W21403567222021 @default.
- W2140356722 countsByYear W21403567222022 @default.
- W2140356722 countsByYear W21403567222023 @default.
- W2140356722 crossrefType "journal-article" @default.
- W2140356722 hasAuthorship W2140356722A5019463541 @default.
- W2140356722 hasAuthorship W2140356722A5030865164 @default.
- W2140356722 hasAuthorship W2140356722A5037335706 @default.
- W2140356722 hasAuthorship W2140356722A5040718770 @default.
- W2140356722 hasAuthorship W2140356722A5056615015 @default.
- W2140356722 hasAuthorship W2140356722A5070191522 @default.
- W2140356722 hasBestOaLocation W21403567221 @default.
- W2140356722 hasConcept C10010492 @default.
- W2140356722 hasConcept C104317684 @default.
- W2140356722 hasConcept C119857082 @default.
- W2140356722 hasConcept C12267149 @default.
- W2140356722 hasConcept C123894998 @default.
- W2140356722 hasConcept C154945302 @default.
- W2140356722 hasConcept C167625842 @default.
- W2140356722 hasConcept C177264268 @default.
- W2140356722 hasConcept C199360897 @default.
- W2140356722 hasConcept C203750385 @default.
- W2140356722 hasConcept C2778112365 @default.
- W2140356722 hasConcept C40767141 @default.
- W2140356722 hasConcept C41008148 @default.
- W2140356722 hasConcept C547475151 @default.
- W2140356722 hasConcept C55493867 @default.
- W2140356722 hasConcept C70721500 @default.
- W2140356722 hasConcept C86803240 @default.
- W2140356722 hasConcept C92087593 @default.
- W2140356722 hasConceptScore W2140356722C10010492 @default.
- W2140356722 hasConceptScore W2140356722C104317684 @default.
- W2140356722 hasConceptScore W2140356722C119857082 @default.
- W2140356722 hasConceptScore W2140356722C12267149 @default.
- W2140356722 hasConceptScore W2140356722C123894998 @default.
- W2140356722 hasConceptScore W2140356722C154945302 @default.
- W2140356722 hasConceptScore W2140356722C167625842 @default.
- W2140356722 hasConceptScore W2140356722C177264268 @default.
- W2140356722 hasConceptScore W2140356722C199360897 @default.
- W2140356722 hasConceptScore W2140356722C203750385 @default.
- W2140356722 hasConceptScore W2140356722C2778112365 @default.
- W2140356722 hasConceptScore W2140356722C40767141 @default.
- W2140356722 hasConceptScore W2140356722C41008148 @default.
- W2140356722 hasConceptScore W2140356722C547475151 @default.
- W2140356722 hasConceptScore W2140356722C55493867 @default.
- W2140356722 hasConceptScore W2140356722C70721500 @default.
- W2140356722 hasConceptScore W2140356722C86803240 @default.
- W2140356722 hasConceptScore W2140356722C92087593 @default.