Matches in SemOpenAlex for { <https://semopenalex.org/work/W4313485037> ?p ?o ?g. }
- W4313485037 endingPage "2056" @default.
- W4313485037 startingPage "2046" @default.
- W4313485037 abstract "Lipophilicity, as measured by the partition coefficient between octanol and water (log P), is a key parameter in early drug discovery research. However, measuring log P experimentally is difficult for specific compounds and log P ranges. The resulting lack of reliable experimental data impedes development of accurate in silico models for such compounds. In certain discovery projects at Novartis focused on such compounds, a quantum mechanics (QM)-based tool for log P estimation has emerged as a valuable supplement to experimental measurements and as a preferred alternative to existing empirical models. However, this QM-based approach incurs a substantial computational cost, limiting its applicability to small series and prohibiting quick, interactive ideation. This work explores a set of machine learning models (Random Forest, Lasso, XGBoost, Chemprop, and Chemprop3D) to learn calculated log P values on both a public data set and an in-house data set to obtain a computationally affordable, QM-based estimation of drug lipophilicity. The message-passing neural network model Chemprop emerged as the best performing model with mean absolute errors of 0.44 and 0.34 log units for scaffold split test sets of the public and in-house data sets, respectively. Analysis of learning curves suggests that a further decrease in the test set error can be achieved by increasing the training set size. While models directly trained on experimental data perform better at approximating experimentally determined log P values than models trained on calculated values, we discuss the potential advantages of using calculated log P values going beyond the limits of experimental quantitation. We analyze the impact of the data set splitting strategy and gain insights into model failure modes. Potential use cases for the presented models include pre-screening of large compound collections and prioritization of compounds for full QM calculations." @default.
- W4313485037 created "2023-01-06" @default.
- W4313485037 creator A5015356572 @default.
- W4313485037 creator A5028135456 @default.
- W4313485037 creator A5051380538 @default.
- W4313485037 creator A5057292715 @default.
- W4313485037 creator A5070449568 @default.
- W4313485037 date "2023-01-04" @default.
- W4313485037 modified "2023-10-16" @default.
- W4313485037 title "Machine Learning for Fast, Quantum Mechanics-Based Approximation of Drug Lipophilicity" @default.
- W4313485037 cites W1541037889 @default.
- W4313485037 cites W1977775349 @default.
- W4313485037 cites W1999118725 @default.
- W4313485037 cites W2005220728 @default.
- W4313485037 cites W2011201847 @default.
- W4313485037 cites W2011301426 @default.
- W4313485037 cites W2019078804 @default.
- W4313485037 cites W2019678805 @default.
- W4313485037 cites W2037761619 @default.
- W4313485037 cites W2038344991 @default.
- W4313485037 cites W2044834685 @default.
- W4313485037 cites W2045193322 @default.
- W4313485037 cites W2047112187 @default.
- W4313485037 cites W2055928473 @default.
- W4313485037 cites W2060531713 @default.
- W4313485037 cites W2061650979 @default.
- W4313485037 cites W2063140507 @default.
- W4313485037 cites W2068946470 @default.
- W4313485037 cites W2072398524 @default.
- W4313485037 cites W2073328513 @default.
- W4313485037 cites W2077576411 @default.
- W4313485037 cites W2078374001 @default.
- W4313485037 cites W2079391392 @default.
- W4313485037 cites W2080635178 @default.
- W4313485037 cites W2086957099 @default.
- W4313485037 cites W2089545684 @default.
- W4313485037 cites W2090996511 @default.
- W4313485037 cites W2091274215 @default.
- W4313485037 cites W2093226833 @default.
- W4313485037 cites W2103757398 @default.
- W4313485037 cites W2135046866 @default.
- W4313485037 cites W2135732933 @default.
- W4313485037 cites W2147122257 @default.
- W4313485037 cites W2238155711 @default.
- W4313485037 cites W2342249984 @default.
- W4313485037 cites W2394923207 @default.
- W4313485037 cites W2405343124 @default.
- W4313485037 cites W2477754403 @default.
- W4313485037 cites W2587086259 @default.
- W4313485037 cites W2594183968 @default.
- W4313485037 cites W2605801743 @default.
- W4313485037 cites W2756205407 @default.
- W4313485037 cites W2802385907 @default.
- W4313485037 cites W2888503762 @default.
- W4313485037 cites W2900090807 @default.
- W4313485037 cites W2911418701 @default.
- W4313485037 cites W2911997094 @default.
- W4313485037 cites W2915655911 @default.
- W4313485037 cites W2932297636 @default.
- W4313485037 cites W2964007201 @default.
- W4313485037 cites W2966357564 @default.
- W4313485037 cites W2991028530 @default.
- W4313485037 cites W2991463149 @default.
- W4313485037 cites W2994986327 @default.
- W4313485037 cites W3008775157 @default.
- W4313485037 cites W3018495986 @default.
- W4313485037 cites W3025424660 @default.
- W4313485037 cites W3048565185 @default.
- W4313485037 cites W3058018199 @default.
- W4313485037 cites W3099878876 @default.
- W4313485037 cites W3101744125 @default.
- W4313485037 cites W3102476541 @default.
- W4313485037 cites W3109302196 @default.
- W4313485037 cites W3157403564 @default.
- W4313485037 cites W3167275903 @default.
- W4313485037 cites W3175735871 @default.
- W4313485037 cites W3182441856 @default.
- W4313485037 cites W3187163767 @default.
- W4313485037 cites W3205538317 @default.
- W4313485037 cites W3216996162 @default.
- W4313485037 cites W4206367183 @default.
- W4313485037 cites W4213342439 @default.
- W4313485037 cites W4221027158 @default.
- W4313485037 cites W4224950134 @default.
- W4313485037 cites W4236370975 @default.
- W4313485037 cites W4242372416 @default.
- W4313485037 cites W4282053982 @default.
- W4313485037 cites W4308683713 @default.
- W4313485037 cites W4321769975 @default.
- W4313485037 cites W765815668 @default.
- W4313485037 doi "https://doi.org/10.1021/acsomega.2c05607" @default.
- W4313485037 hasPubMedId "https://pubmed.ncbi.nlm.nih.gov/36687099" @default.
- W4313485037 hasPublicationYear "2023" @default.
- W4313485037 type Work @default.
- W4313485037 citedByCount "6" @default.
- W4313485037 countsByYear W43134850372023 @default.
- W4313485037 crossrefType "journal-article" @default.
- W4313485037 hasAuthorship W4313485037A5015356572 @default.