Matches in SemOpenAlex for { <https://semopenalex.org/work/W3143418323> ?p ?o ?g. }
- W3143418323 endingPage "1865" @default.
- W3143418323 startingPage "1856" @default.
- W3143418323 abstract "Conspectus: Numerous disciplines, such as image recognition and language translation, have been revolutionized by using machine learning (ML) to leverage big data. In organic synthesis, providing accurate chemical reactivity predictions with supervised ML could assist chemists with reaction prediction, optimization, and mechanistic interrogation. To apply supervised ML to chemical reactions, one needs to define the object of prediction (e.g., yield, enantioselectivity, solubility, or a recommendation) and represent reactions with descriptive data. Our group’s effort has focused on representing chemical reactions using DFT-derived physical features of the reacting molecules and conditions, which serve as features for building supervised ML models. In this Account, we present a review and perspective on three studies conducted by our group where ML models have been employed to predict reaction yield. First, we focus on a small reaction data set where 16 phosphine ligands were evaluated in a single Ni-catalyzed Suzuki–Miyaura cross-coupling reaction, and the reaction yield was modeled with linear regression. In this setting, where the regression complexity is strongly limited by the amount of available data, we emphasize the importance of identifying single features that are directly relevant to reactivity. Next, we focus on models trained on two larger data sets obtained with high-throughput experimentation (HTE). With hundreds to thousands of reactions available, more complex models can be explored, for example, models that algorithmically perform feature selection from a broad set of candidate features. We examine how a variety of ML algorithms model these data sets and how well these models generalize to out-of-sample substrates. 
Specifically, we compare the ML models that use DFT-based featurization to a baseline model that is obtained with features that carry no physical information, that is, random features, and to a naive non-ML model that averages yields of reactions that share the same conditions and substrate combinations. We find that for only one of the two data sets, DFT-based featurization leads to a significant, although moderate, out-of-sample prediction improvement. The source of this improvement was further isolated to specific features which allowed us to formulate a testable mechanistic hypothesis that was validated experimentally. Finally, we offer remarks on supervised ML model building on HTE data sets focusing on algorithmic improvements in model training. Statistical methods in chemistry have a rich history, but only recently has ML gained widespread attention in reaction development. As the untapped potential of ML is explored, novel tools are likely to arise from future research. Our studies suggest that supervised ML can lead to improved predictions of reaction yield over simpler modeling methods and facilitate mechanistic understanding of reaction dynamics. However, further research and development is required to establish ML as an indispensable tool in reactivity modeling." @default.
- W3143418323 created "2021-04-13" @default.
- W3143418323 creator A5029221011 @default.
- W3143418323 creator A5029465199 @default.
- W3143418323 creator A5077633918 @default.
- W3143418323 creator A5080368878 @default.
- W3143418323 date "2021-03-31" @default.
- W3143418323 modified "2023-10-17" @default.
- W3143418323 title "Predicting Reaction Yields via Supervised Learning" @default.
- W3143418323 cites W1973723982 @default.
- W3143418323 cites W1981622697 @default.
- W3143418323 cites W1988037271 @default.
- W3143418323 cites W1988195734 @default.
- W3143418323 cites W1996851544 @default.
- W3143418323 cites W2017730959 @default.
- W3143418323 cites W2033757486 @default.
- W3143418323 cites W2039609876 @default.
- W3143418323 cites W2052226480 @default.
- W3143418323 cites W2068113002 @default.
- W3143418323 cites W2068337719 @default.
- W3143418323 cites W2071411894 @default.
- W3143418323 cites W2080635178 @default.
- W3143418323 cites W2112081648 @default.
- W3143418323 cites W2114256688 @default.
- W3143418323 cites W2114704115 @default.
- W3143418323 cites W2117897510 @default.
- W3143418323 cites W2125847307 @default.
- W3143418323 cites W2139078293 @default.
- W3143418323 cites W2324777999 @default.
- W3143418323 cites W2475857747 @default.
- W3143418323 cites W2478815458 @default.
- W3143418323 cites W2581221175 @default.
- W3143418323 cites W2591867326 @default.
- W3143418323 cites W2616399381 @default.
- W3143418323 cites W2778051509 @default.
- W3143418323 cites W2784432193 @default.
- W3143418323 cites W2785942661 @default.
- W3143418323 cites W2787894218 @default.
- W3143418323 cites W2791355014 @default.
- W3143418323 cites W2794822175 @default.
- W3143418323 cites W2799620402 @default.
- W3143418323 cites W2900743800 @default.
- W3143418323 cites W2901661444 @default.
- W3143418323 cites W2901942917 @default.
- W3143418323 cites W2911964244 @default.
- W3143418323 cites W29374554 @default.
- W3143418323 cites W2945258205 @default.
- W3143418323 cites W2949064041 @default.
- W3143418323 cites W2969507301 @default.
- W3143418323 cites W2972597827 @default.
- W3143418323 cites W2975047661 @default.
- W3143418323 cites W2986375585 @default.
- W3143418323 cites W3012519883 @default.
- W3143418323 cites W3015160609 @default.
- W3143418323 cites W3023042104 @default.
- W3143418323 cites W3023459419 @default.
- W3143418323 cites W3034639166 @default.
- W3143418323 cites W3036777101 @default.
- W3143418323 cites W3042471731 @default.
- W3143418323 cites W3043132288 @default.
- W3143418323 cites W3091739897 @default.
- W3143418323 cites W3102693939 @default.
- W3143418323 cites W3102909511 @default.
- W3143418323 cites W3110901318 @default.
- W3143418323 cites W3128474010 @default.
- W3143418323 cites W3138801267 @default.
- W3143418323 cites W4210741823 @default.
- W3143418323 doi "https://doi.org/10.1021/acs.accounts.0c00770" @default.
- W3143418323 hasPubMedId "https://pubmed.ncbi.nlm.nih.gov/33788552" @default.
- W3143418323 hasPublicationYear "2021" @default.
- W3143418323 type Work @default.
- W3143418323 sameAs 3143418323 @default.
- W3143418323 citedByCount "58" @default.
- W3143418323 countsByYear W31434183232021 @default.
- W3143418323 countsByYear W31434183232022 @default.
- W3143418323 countsByYear W31434183232023 @default.
- W3143418323 crossrefType "journal-article" @default.
- W3143418323 hasAuthorship W3143418323A5029221011 @default.
- W3143418323 hasAuthorship W3143418323A5029465199 @default.
- W3143418323 hasAuthorship W3143418323A5077633918 @default.
- W3143418323 hasAuthorship W3143418323A5080368878 @default.
- W3143418323 hasConcept C119857082 @default.
- W3143418323 hasConcept C124101348 @default.
- W3143418323 hasConcept C148483581 @default.
- W3143418323 hasConcept C153083717 @default.
- W3143418323 hasConcept C154945302 @default.
- W3143418323 hasConcept C41008148 @default.
- W3143418323 hasConceptScore W3143418323C119857082 @default.
- W3143418323 hasConceptScore W3143418323C124101348 @default.
- W3143418323 hasConceptScore W3143418323C148483581 @default.
- W3143418323 hasConceptScore W3143418323C153083717 @default.
- W3143418323 hasConceptScore W3143418323C154945302 @default.
- W3143418323 hasConceptScore W3143418323C41008148 @default.
- W3143418323 hasFunder F4320337393 @default.
- W3143418323 hasIssue "8" @default.
- W3143418323 hasLocation W31434183231 @default.
- W3143418323 hasOpenAccess W3143418323 @default.
- W3143418323 hasPrimaryLocation W31434183231 @default.