Deep learning for enantioselectivity predictions in catalytic asymmetric β-C–H bond activation reactions†
Abstract
The growth of catalytic asymmetric C–H bond activation reactions, and of the seemingly disparate domain of machine learning (ML), has been unprecedented. Recognizing the potential of both technologies, we herein examine the utility of modern ML for one of the most recent Pd-catalyzed enantioselective β-C(sp3)–H functionalization reactions using chiral amino acid ligands. The focus is on a practically relevant small-data problem consisting of 240 such reactions, wherein substituted cycloalkanes undergo enantioselective C–H functionalization to form arylated/alkenylated products. Molecular descriptors from a mechanistically important metal–ligand–substrate complex are used for the first time to build various ML models to predict % ee. The Deep Neural Network (DNN) offers accurate predictions with a root mean square error (RMSE) of 6.3 ± 0.9% ee. The RMSEs of out-of-bag predictions on three different reactions, namely, the enantioselective arylation of cyclobutyl carboxylic amide, the alkenylation of isobutyric acid, and the C(sp3)–H arylation of free cyclopropylmethylamine, are found to be 7.8, 5.0, and 7.1% ee, respectively. This high generalizability of the DNN model suggests that it could be deployed for the planning and design of asymmetric catalysis in small-data settings. The application of explainability tools based on feature attribution methods to the DNN has identified important molecular features that impact the % ee. The chemical insights gathered can be employed effectively in planning the synthesis of new molecular targets.
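For readers less familiar with the error metric quoted above, the following is a minimal sketch of how an RMSE in % ee, and a "mean ± spread" summary over repeated runs, could be computed. The function names `rmse` and `summarize_runs`, the use of the sample standard deviation as the spread, and all numerical values are illustrative assumptions; they are not the data or the evaluation protocol of this work.

```python
import math
import statistics


def rmse(y_true, y_pred):
    """Root mean square error between observed and predicted % ee values."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def summarize_runs(rmse_values):
    """Mean and sample standard deviation of per-run RMSEs.

    Assumption: the spread is reported as a sample standard deviation;
    the source does not state which estimator is behind the "±" value.
    """
    return statistics.mean(rmse_values), statistics.stdev(rmse_values)


# Hypothetical observed and predicted % ee for three reactions.
observed = [90.0, 50.0, 10.0]
predicted = [87.0, 54.0, 10.0]
single_run_rmse = rmse(observed, predicted)

# Hypothetical RMSEs from three independent train/test splits.
mean_rmse, spread = summarize_runs([6.0, 7.0, 5.0])
print(f"single-run RMSE: {single_run_rmse:.2f} % ee")
print(f"over runs: {mean_rmse:.1f} ± {spread:.1f} % ee")
```

Because the errors are squared before averaging, a few badly mispredicted reactions raise the RMSE more than many small deviations would, which is worth keeping in mind when comparing values across small test sets.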