Matches in SemOpenAlex for { <https://semopenalex.org/work/W4379875520> ?p ?o ?g. }
- W4379875520 abstract "The vast size of chemical space necessitates computational approaches to automate and accelerate the design of molecular sequences to guide experimental efforts for drug discovery. Genetic algorithms provide a useful framework to incrementally generate molecules by applying mutations to known chemical structures. Recently, masked language models have been applied to automate the mutation process by leveraging large compound libraries to learn commonly occurring chemical sequences (i.e., using tokenization) and predict rearrangements (i.e., using mask prediction). Here, we consider how language models can be adapted to improve molecule generation for different optimization tasks. We use two different generation strategies for comparison, fixed and adaptive. The fixed strategy uses a pre-trained model to generate mutations; the adaptive strategy trains the language model on each new generation of molecules selected for target properties during optimization. Our results show that the adaptive strategy allows the language model to more closely fit the distribution of molecules in the population. Therefore, for enhanced fitness optimization, we suggest the use of the fixed strategy during an initial phase followed by the use of the adaptive strategy. We demonstrate the impact of adaptive training by searching for molecules that optimize both heuristic metrics, drug-likeness and synthesizability, as well as predicted protein binding affinity from a surrogate model. Our results show that the adaptive strategy provides a significant improvement in fitness optimization compared to the fixed pre-trained model, empowering the application of language models to molecular design tasks." @default.
- W4379875520 created "2023-06-09" @default.
- W4379875520 creator A5002592084 @default.
- W4379875520 creator A5018744978 @default.
- W4379875520 creator A5019868556 @default.
- W4379875520 creator A5023856625 @default.
- W4379875520 creator A5032222900 @default.
- W4379875520 creator A5057253596 @default.
- W4379875520 creator A5072715140 @default.
- W4379875520 date "2023-06-08" @default.
- W4379875520 modified "2023-09-26" @default.
- W4379875520 title "Adaptive language model training for molecular design" @default.
- W4379875520 cites W1592238003 @default.
- W4379875520 cites W1975147762 @default.
- W4379875520 cites W1998693213 @default.
- W4379875520 cites W2027478081 @default.
- W4379875520 cites W2034549041 @default.
- W4379875520 cites W2066810295 @default.
- W4379875520 cites W2080635178 @default.
- W4379875520 cites W2107160601 @default.
- W4379875520 cites W2110791536 @default.
- W4379875520 cites W2121879602 @default.
- W4379875520 cites W2160592148 @default.
- W4379875520 cites W2406943157 @default.
- W4379875520 cites W2461470610 @default.
- W4379875520 cites W2578240541 @default.
- W4379875520 cites W2774185825 @default.
- W4379875520 cites W2790808809 @default.
- W4379875520 cites W2907657781 @default.
- W4379875520 cites W2914542247 @default.
- W4379875520 cites W2916581152 @default.
- W4379875520 cites W2939314313 @default.
- W4379875520 cites W2953128081 @default.
- W4379875520 cites W2953302413 @default.
- W4379875520 cites W2959938226 @default.
- W4379875520 cites W2973114758 @default.
- W4379875520 cites W2979826702 @default.
- W4379875520 cites W2989615256 @default.
- W4379875520 cites W2998571806 @default.
- W4379875520 cites W3006889321 @default.
- W4379875520 cites W3008443627 @default.
- W4379875520 cites W3014476516 @default.
- W4379875520 cites W3098269892 @default.
- W4379875520 cites W3113182646 @default.
- W4379875520 cites W3114291043 @default.
- W4379875520 cites W3116865743 @default.
- W4379875520 cites W3129831491 @default.
- W4379875520 cites W3133325765 @default.
- W4379875520 cites W3158755582 @default.
- W4379875520 cites W3165630607 @default.
- W4379875520 cites W3167905129 @default.
- W4379875520 cites W3209056694 @default.
- W4379875520 cites W3214740101 @default.
- W4379875520 cites W4205410305 @default.
- W4379875520 cites W4210592951 @default.
- W4379875520 cites W4221074165 @default.
- W4379875520 cites W4281619372 @default.
- W4379875520 cites W4303478269 @default.
- W4379875520 doi "https://doi.org/10.1186/s13321-023-00719-7" @default.
- W4379875520 hasPubMedId "https://pubmed.ncbi.nlm.nih.gov/37291633" @default.
- W4379875520 hasPublicationYear "2023" @default.
- W4379875520 type Work @default.
- W4379875520 citedByCount "0" @default.
- W4379875520 crossrefType "journal-article" @default.
- W4379875520 hasAuthorship W4379875520A5002592084 @default.
- W4379875520 hasAuthorship W4379875520A5018744978 @default.
- W4379875520 hasAuthorship W4379875520A5019868556 @default.
- W4379875520 hasAuthorship W4379875520A5023856625 @default.
- W4379875520 hasAuthorship W4379875520A5032222900 @default.
- W4379875520 hasAuthorship W4379875520A5057253596 @default.
- W4379875520 hasAuthorship W4379875520A5072715140 @default.
- W4379875520 hasBestOaLocation W43798755201 @default.
- W4379875520 hasConcept C119857082 @default.
- W4379875520 hasConcept C137293760 @default.
- W4379875520 hasConcept C144024400 @default.
- W4379875520 hasConcept C149923435 @default.
- W4379875520 hasConcept C154945302 @default.
- W4379875520 hasConcept C173801870 @default.
- W4379875520 hasConcept C2908647359 @default.
- W4379875520 hasConcept C41008148 @default.
- W4379875520 hasConcept C60644358 @default.
- W4379875520 hasConcept C74187038 @default.
- W4379875520 hasConcept C86803240 @default.
- W4379875520 hasConcept C8880873 @default.
- W4379875520 hasConcept C91852762 @default.
- W4379875520 hasConcept C99726746 @default.
- W4379875520 hasConceptScore W4379875520C119857082 @default.
- W4379875520 hasConceptScore W4379875520C137293760 @default.
- W4379875520 hasConceptScore W4379875520C144024400 @default.
- W4379875520 hasConceptScore W4379875520C149923435 @default.
- W4379875520 hasConceptScore W4379875520C154945302 @default.
- W4379875520 hasConceptScore W4379875520C173801870 @default.
- W4379875520 hasConceptScore W4379875520C2908647359 @default.
- W4379875520 hasConceptScore W4379875520C41008148 @default.
- W4379875520 hasConceptScore W4379875520C60644358 @default.
- W4379875520 hasConceptScore W4379875520C74187038 @default.
- W4379875520 hasConceptScore W4379875520C86803240 @default.
- W4379875520 hasConceptScore W4379875520C8880873 @default.
- W4379875520 hasConceptScore W4379875520C91852762 @default.
- W4379875520 hasConceptScore W4379875520C99726746 @default.
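
The abstract in this record contrasts a fixed and an adaptive generation strategy inside a genetic algorithm. The following is a minimal sketch of that contrast, not the authors' implementation. It assumes a toy setup: molecules are plain strings of single-character tokens, the masked language model is replaced by a unigram token-frequency model, and the fitness function is a placeholder. All function names (`train_unigram`, `mutate`, `toy_fitness`, `evolve`) are hypothetical.

```python
import random
from collections import Counter


def train_unigram(population):
    """Toy stand-in for a masked language model: token frequencies only."""
    counts = Counter(tok for mol in population for tok in mol)
    tokens = sorted(counts)
    weights = [counts[t] for t in tokens]
    return tokens, weights


def mutate(mol, model, rng):
    """Mask one random position and fill it with a token sampled from the model."""
    tokens, weights = model
    i = rng.randrange(len(mol))
    new_tok = rng.choices(tokens, weights=weights, k=1)[0]
    return mol[:i] + new_tok + mol[i + 1:]


def toy_fitness(mol):
    """Placeholder score (count of 'N' tokens); not a real chemistry metric."""
    return mol.count("N")


def evolve(population, generations, adaptive, seed=0):
    """Run a small genetic loop with a fixed or an adaptive mutation model."""
    rng = random.Random(seed)
    # Both strategies start from a model trained on the starting library.
    model = train_unigram(population)
    best_history = []
    for _ in range(generations):
        children = [mutate(m, model, rng) for m in population]
        # Parents stay in the pool, so the best score never decreases.
        pool = sorted(population + children, key=toy_fitness, reverse=True)
        population = pool[:len(population)]
        best_history.append(toy_fitness(population[0]))
        if adaptive:
            # Adaptive strategy: retrain on the molecules just selected.
            model = train_unigram(population)
    return population, best_history


if __name__ == "__main__":
    start = ["CCO", "CCN", "CNC", "OCC"]
    for adaptive in (False, True):
        pop, hist = evolve(start, generations=5, adaptive=adaptive)
        print("adaptive" if adaptive else "fixed", pop, hist)
```

With `adaptive=False` the mutation model stays as trained on the starting library. With `adaptive=True` it is retrained after every selection step, so token frequencies drift toward the selected population, which is the effect the abstract describes as fitting the population distribution more closely.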