Yuya Tsutsui a, Issei Yanaka b, Kazuhiro Takeda *b, Masaru Kondo c, Shinobu Takizawa d, Ryosuke Kojima e, Akihito Konishi *af and Makoto Yasuda *af
a Department of Applied Chemistry, Graduate School of Engineering, Osaka University, Suita, 565-0871, Japan. E-mail: a-koni@chem.eng.osaka-u.ac.jp; yasuda@chem.eng.osaka-u.ac.jp
b Department of Engineering, Graduate School of Integrated Science and Technology, Shizuoka University, Hamamatsu, 432-8561, Japan. E-mail: takeda.kazuhiro@shizuoka.ac.jp
c School of Pharmaceutical Sciences, University of Shizuoka, 52-1 Yada, Suruga-ku, Shizuoka 422-8526, Japan
d SANKEN, Osaka University, Ibaraki-shi, 567-0047, Japan
e Department of Biomedical Data Intelligence, Graduate School of Medicine, Kyoto University, Sakyo-ku, 606-8507, Japan
f Innovative Catalysis Science Division, Institute for Open and Transdisciplinary Research Initiatives (ICS-OTRI), Osaka University, Suita, 565-0871, Japan
First published on 5th April 2024
Selective recognition between hydrocarbon moieties is a longstanding issue. Although we developed a π-pocket Lewis acid catalyst with high selectivity for aromatic aldehydes over aliphatic ones, a general strategy for catalyst design remains elusive. As an approach that transfers the molecular recognition based on multiple cooperative non-covalent interactions within the π-pocket to a rational catalyst design, herein, we demonstrate Lewis acid catalysts showing improved selectivity through the support of an ensemble algorithm combining random forest, AdaBoost, and XGBoost as a machine learning (ML) approach. Using 7963 explanatory variables extracted from model hetero-Diels–Alder reactions, the ensemble algorithm predicted the chemoselectivity of unlearned catalysts, and experiments confirmed the prediction. The proposed catalyst shows the highest selective recognition, reminiscent of enzymatic catalytic activity. Additionally, a SHapley Additive exPlanations (SHAP) method suggested that the selectivity originates from the polarizability and three-dimensional size of the catalyst. This insight leads to rational design guidelines for Lewis acid catalysts with dispersion forces.
Distinguishing between aliphatic and aromatic aldehydes remains a longstanding issue because attaining selectivity can be a direct synthetic method for complicated molecules. Aliphatics and aromatics exhibit different properties. However, when aliphatic and aromatic moieties contain the same functional group, the properties of the functional group contribute strongly. Consequently, it is almost impossible for molecular catalysts to distinguish between them. Since aliphatics and aromatics are both hydrocarbons, their polarities are the same unless there is a noticeable difference in steric factors (Fig. 1B-v). Our research aims to develop catalysts that discriminate between aliphatic and aromatic aldehydes because Lewis acid mediated electrophilic reactions of carbonyls are the most fundamental and important reactions for carbon–carbon bond formation in the construction of many useful molecules. We realized a catalyst that selectively recognizes aromatic aldehydes by forming an aromatic π-pocket shaped skeleton around the Lewis acid site, which attracts carbonyl groups (Fig. 1C). A cage-shaped triphenolic ligand has established its effectiveness in controlling the Lewis acidity of a boron3,4 or an aluminum atom.5 The decoration of the ligand framework endows the cage-shaped catalyst with tailored Lewis acidity,6–9 chirality,10 and photoactivation.11,12 In some cases, the π-pocket catalyst shows high selectivity for aromatic aldehydes compared to aliphatic ones.8 However, a general strategy for catalyst design has yet to be established. Although we speculate that the π-pocket moiety has affinity for aromatic compounds due to the π–π or CH–π interaction, the details remain unclear. Numerous experiments confirm a correlation between the catalyst structure and the selectivity, but conventional knowledge such as the steric or electrostatic environment of the π-pocket cannot explain this correlation. 
This may be because molecular recognition is defined by multiple cooperative non-covalent interactions within the π-pocket.9
Herein, we propose a new cage-shaped borate catalyst showing improved selectivity for aromatic compounds through the support of machine learning (ML). Recent advances in ML applications to organic synthetic chemistry13–17 have significantly contributed to predictions of yields and selectivity,13,18 sequential searches for optimal reaction conditions,19–21 reverse structure searches for catalysts, ligands, or transient states,14,15,22–24 the design of asymmetric catalysts,25–33 predictions of site-selectivity for C–H functionalization catalyzed by a pocket-shaped Rh complex,34 and estimations of the substrate specificity of enzymes.35 Although these studies employed various algorithms, including linear algorithms (e.g., multivariate linear regression, Lasso,36 Ridge,37 and PLS38), non-linear non-tree-based algorithms (e.g., GP,39 MLP,40 and SVR41) and non-linear tree-based algorithms (e.g., DT,42 RF,43 and XGB44), they all used individual algorithms to construct comprehensive models. In contrast, we propose an ensemble of algorithms to achieve a stable and small root mean squared error for unlearned data (QRMSE) of the predicted selectivity. Our ensemble algorithm combines multiple non-linear tree-based algorithms: RF,43 AB,45 and XGB.44 Since the underlying patterns and relationships among the multiple chemical factors of the π-pocket catalyst should be analyzed and explored by data-driven methods, the application of ML may provide insight into the design of the π-pocket. If a high-performance model can be constructed to represent these relationships, it could predict the performance of catalysts with new structures or of existing catalysts under new reaction conditions. Furthermore, it may extract the factors contributing to catalyst performance. Such information will not only elucidate reaction mechanisms but also aid in the inverse analysis of catalyst structures to achieve the required performance.
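The idea of combining several regressors and scoring them on unlearned data can be sketched in a few lines. This is only an illustration under stated assumptions: the text does not say how the RF, AB, and XGB outputs are combined, so an unweighted mean is assumed here, and every number below is invented rather than taken from the study. The helper names (`rmse`, `ensemble_mean`, `prediction_range`) are hypothetical.

```python
from math import sqrt
from statistics import mean


def rmse(observed, predicted):
    """Root mean squared error between observed and predicted values."""
    return sqrt(mean((o - p) ** 2 for o, p in zip(observed, predicted)))


def ensemble_mean(per_model):
    """Unweighted mean of the per-model predictions for each sample (assumed combiner)."""
    return [mean(values) for values in zip(*per_model.values())]


def prediction_range(per_model):
    """Spread (max minus min) of the per-model predictions for each sample."""
    return [max(values) - min(values) for values in zip(*per_model.values())]


# Invented held-out selectivities (% aromatic adduct) and invented model outputs.
observed = [60.0, 55.0, 72.0, 80.0]
per_model = {
    "RF": [58.0, 58.0, 69.0, 77.0],
    "AB": [63.0, 53.0, 70.0, 84.0],
    "XGB": [59.0, 57.0, 75.0, 78.0],
}

combined = ensemble_mean(per_model)
for name, predicted in per_model.items():
    print(f"{name}: RMSE = {rmse(observed, predicted):.2f}")
print(f"ensemble: RMSE = {rmse(observed, combined):.2f}")
print("per-sample prediction range:", prediction_range(per_model))
```

For an unweighted mean, the ensemble error on held-out data can never exceed the average of the individual model errors, which is one reason averaging tends to give a more stable score than any single model. The per-sample range is a simple way to see how much the models disagree about a given catalyst.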
To briefly describe the π-pocket environment, we divided these catalysts into four categories based on the components of their π-pocket. Category 1 (1AB–1DB) has π-pockets composed of heteroaromatics. Category 2 (1EB–1GB) possesses alkylated aryl groups. The π-pockets of category 3 (1HB–1JB) are built by polycyclic aromatic hydrocarbons. In category 4 (1KB–1MB), the aromatic moieties of the π-pockets are replaced with alkyl groups. The predictions indicated the importance of the aromatic moieties in the π-pocket (Fig. 2E, also see Fig. S19 and Table S21†). Catalysts in category 4 (1KB–1MB) showed little or no selectivity. The other catalysts in categories 1, 2, and 3 were predicted to show preferred selectivity for the aromatic aldehyde 3b. The catalysts in categories 2 (1EB–1GB) and 3 (1HB–1JB) exhibited moderate selectivities for 3b, and the small differences in selectivity were estimated for each borate. In contrast, the catalysts in category 1 demonstrated that the benzo-fusion into the heterole moieties effectively improved the predicted selectivity. Although the catalyst with a π-pocket consisting of a simple furan (1BB) or thiophene (1DB) showed poor selectivity (4a/4b = 43:57 (for 1BB) or 41:59 (for 1DB)), the predicted selectivity was enhanced for 1AB (4a/4b = 33:67) or 1CB (4a/4b = 34:66) due to the benzo-fusions to the heterole moieties. Minor structural changes can significantly improve the selectivity. Our curiosity regarding the prediction as well as the high synthetic accessibility of the catalysts in category 1 prompted us to experimentally investigate their selective recognition of aromatic aldehydes. In particular, complex 1AB, which has a π-pocket composed of 2-benzofuryl moieties, had the highest predicted selectivity and the narrowest range between the maximum and minimum prediction (Fig. 2D). Consequently, complex 1AB·thf was determined to be a viable experimental target (Table S21†).
The cage-shaped boron complexes with the π-pocket composed of 2-benzofuryl moieties 1AB·L (L = tetrahydrofuran (thf), pyridine (py), or 3,5-dibromopyridine (dbp)) were synthesized according to our previous synthetic procedures (Scheme S1†).9 To compare the chemoselectivity of 1AB·L, complex 1BB·thf, which has a π-pocket composed of 2-furyl moieties, and several modified cage-shaped borates 1AI–AIVB·L with 2-benzofuryl-based π-pockets were also synthesized (Scheme S1†). All cage-shaped borates were fully characterized by 1H, 13C, and 11B NMR spectroscopy. The ORTEP drawings indicated that the three 2-benzofuryl groups effectively built a C3-symmetric π-pocket around the boron center (Fig. 3). One significant difference in the geometry between 1AB and 1bB is the dihedral angle of the component aryl group of the π-pocket against the phenoxy moiety. The large angle (ave. 51.5°) of the phenyl group in 1bB·thf led to a twisted biaryl substructure,9 whereas the small angle (ave. 13.3°) of the 2-benzofuryl group in 1AB·dbp led to a coplanarized biaryl substructure. The observed difference is attributed to the presence or absence of steric hindrance due to the hydrogen atoms at the ortho-positions in each biaryl substructure. The ligand-exchange rate of the cage-shaped borates investigated by 1H NMR measurements provided further information about the effect of the 2-benzofuryl-based π-pocket on the catalytic activity. Dimethylaminopyridine (DMAP) complexes of 1AB were dissolved in pyridine-d5, and the ligand dissociation rate was measured during the ligand exchange from DMAP to pyridine. Table S1† summarizes the results. The kinetic analysis gave activation parameters of 1AB: ΔG‡(293 K) = 29.1 kcal mol−1, ΔH‡ = 29.5 kcal mol−1, ΔS‡ = 1.46 cal K−1 mol−1, and k = 1.23 × 10−9 s−1. 
The observed parameters are similar to those of 1bB: ΔG‡(293 K) = 29.0 kcal mol−1, ΔH‡ = 31.2 kcal mol−1, ΔS‡ = 7.52 cal K−1 mol−1, and k = 1.16 × 10−9 s−1,7 suggesting that the catalytic turnover efficiency does not significantly differ between 1AB and 1bB.
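The activation parameters quoted above can be cross-checked with two textbook relations: ΔG‡ = ΔH‡ − TΔS‡ and the Eyring equation k = (k_B T / h) exp(−ΔG‡ / RT). The sketch below assumes a transmission coefficient of 1 and a first-order process; the text does not state how k was derived, so this is a consistency check rather than a reproduction of the authors' analysis. Function names are hypothetical.

```python
from math import exp

K_B = 1.380649e-23    # Boltzmann constant, J/K
H = 6.62607015e-34    # Planck constant, J s
R = 1.987204e-3       # gas constant, kcal/(mol K)


def gibbs_activation(delta_h, delta_s, temperature):
    """Activation free energy in kcal/mol from dH (kcal/mol) and dS (cal/(K mol))."""
    return delta_h - temperature * delta_s / 1000.0


def eyring_rate(delta_g, temperature):
    """First-order rate constant in 1/s from the Eyring equation (kappa = 1 assumed)."""
    return (K_B * temperature / H) * exp(-delta_g / (R * temperature))


# dH and dS values as quoted in the text, evaluated at 293 K.
for label, delta_h, delta_s in (("1AB", 29.5, 1.46), ("1bB", 31.2, 7.52)):
    delta_g = gibbs_activation(delta_h, delta_s, 293.0)
    rate = eyring_rate(delta_g, 293.0)
    print(f"{label}: dG = {delta_g:.1f} kcal/mol, k = {rate:.2e} 1/s")
```

Both sets of ΔH‡ and ΔS‡ reproduce the quoted ΔG‡ values to one decimal place, and the Eyring estimate lands in the same 10^−9 s^−1 range as the reported rate constants.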
Borates 1AB·thf, 1BB·thf, and 1AI–AIVB·thf were applied as Lewis acid catalysts in competitive hetero-Diels–Alder reactions between 3a and various aromatic aldehydes 3b–f with diene 2. The adduct yields (4a + 4b–f) are listed in Table S2 in the ESI.† These borates sufficiently catalyzed all the reactions to give the corresponding adducts in acceptable yields.
Next, we compared the chemoselectivity for 3b–3f with that for 3a (Fig. 4). The ML-predicted borate 1AB·thf demonstrated chemoselectivity for benzaldehyde 3b over butanal 3a, giving the corresponding adducts (4a/4b) in a ratio of 26:74 (purple bar in Fig. 4). This is an improvement over the selectivity of conventional o-phenylated 1bB·thf (4a/4b = 30:70, blue bar in Fig. 4).8,9 Catalyst 1BB·thf, which ML predicted to have poor selectivity, indeed showed poor selectivity experimentally (4a/4b = 46:54, pink bar in Fig. 4), confirming the importance of benzo-fusions to the furan moieties. Modified catalyst 1AIB·thf, in which a methyl group was introduced at the 3-position of the 2-benzofuryl group of 1AB, showed lower selectivity for 3b over 3a than 1AB·thf (4a/4b = 34:66, green bar in Fig. 4), implying that the methyl groups introduced into the 2-benzofuryl groups shrink the π-pocket and inhibit the substrate uptake. Borates 1AII–AIVB·thf, with π-pockets constructed from π-extended naphthofuryl groups, exhibited chemoselectivity for 3b over 3a comparable to that of 1AB·thf (Fig. 4A). For benzaldehyde derivatives bearing an electron-donating group (3c and 3d, Fig. 4B and C), ML-predicted borate 1AB·thf showed improved chemoselectivity for aromatic aldehydes over 3a, relative to the catalytic system of 1bB·thf. Notably, the competitive reaction catalyzed by 1AB·thf between butanal 3a and anisaldehyde 3c, which exhibited relatively low reactivity due to the electron-donating group, showed slightly enhanced chemoselectivity (4a/4c = 45:55) compared to that catalyzed by 1bB·thf (4a/4c = 49:51). Even allowing for experimental error, the slightly enhanced chemoselectivity for 4c was a common trend in the series of catalysts with a furan-based π-pocket (1AB·thf, 1BB·thf and 1AII–AIVB·thf). Our previous report9 also showed such a favorable combination of π-pocket and aromatic aldehyde, and the investigation of its origin is ongoing.
For the competitive reaction between aromatic aldehydes bearing electron-withdrawing groups (3e and 3f, Fig. 4D and E) and butanal 3a catalyzed by 1AB·thf and 1AII–AIVB·thf, higher selectivity for aromatic aldehydes over that for 3a was generally observed. The highest selectivities for 3f (4a/4f = 9:91–7:93) achieved with 1AB·thf and 1AII–AIVB·thf were comparable to our previously reported results.9
Fig. 4 Observed chemoselectivity and total yield (4a + 4b–4f) in the competitive hetero-Diels–Alder reactions of 3a and various benzaldehyde derivatives 3b–f. rt = 25 °C.
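To get a feel for the energy scale behind these product ratios, a ratio can be converted into an apparent free-energy difference with ΔΔG = RT ln(major/minor). This is a rough sketch that assumes the adduct ratio directly reflects the relative rates of the two competing pathways at 25 °C. That assumption is a simplification and is not a claim made in the text. The function name `apparent_ddg` is hypothetical.

```python
from math import log

R = 1.987204e-3  # gas constant, kcal/(mol K)


def apparent_ddg(minor, major, temperature=298.15):
    """Apparent free-energy difference (kcal/mol) implied by a minor:major product ratio."""
    return R * temperature * log(major / minor)


# Ratios quoted in the text (aliphatic adduct : aromatic adduct).
ratios = {
    "1AB·thf, 4a/4b": (26, 74),
    "1bB·thf, 4a/4b": (30, 70),
    "1BB·thf, 4a/4b": (46, 54),
    "best case, 4a/4f": (7, 93),
}
for label, (minor, major) in ratios.items():
    print(f"{label} = {minor}:{major} -> {apparent_ddg(minor, major):.2f} kcal/mol")
```

Under this simplified picture, even the best ratios correspond to differences of less than 2 kcal/mol, which is consistent with selectivity governed by weak non-covalent interactions.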
Theoretical calculations provided insight into the higher chemoselectivity assisted by ML-predicted borate 1AB. Fig. S8† summarizes the computational results of the reaction mechanisms for the hetero-Diels–Alder reactions of 2 with 3a/b catalyzed by borate 1AB·thf. Like our catalytic reaction of 1bB·thf,9 the hetero-Diels–Alder reaction can be divided into three steps: (1) preorganization (reactants → IM1 → IM2) to form the inclusion complex 1AB·3⊃2, which takes up the substrates into the π-pocket, (2) C–C bond formation (IM2 → TS1 → IM3) between 2 and 3 in the π-pocket, and (3) subsequent C–O bond formation (IM3 → TS2 → products) to afford the adduct–borate complex 1AB·5. Although step 3 is the rate-determining step as it shows the highest activation energy at TS2 (ΔG‡(3a) = 8.8 kcal mol−1 and ΔG‡(3b) = 9.6 kcal mol−1), the observed chemoselectivity is hard to explain using the difference between the activation energies of 3a and 3b. This implies that step 3 with low and similar activation energies barely participates in the chemoselectivity caused by the π-pocket of 1AB. The situation is identical to that of our previous study.9 Alternatively, we found a significant difference in the stabilization energy (ΔES) for the inclusion complex 1AB·3⊃2 in the preorganization step. For reactions catalyzed by 1bB and 1AB, the stabilization energy of the inclusion complex with 3b was always larger than that with 3a.9 However, the energy difference in ΔES between 3a and 3b (ΔΔES = |ΔES(3b) − ΔES(3a)|) was larger in the catalytic system of 1AB (ΔΔES = 6.6 kcal mol−1) than that in 1bB (ΔΔES = 5.5 kcal mol−1).9 The enhanced stabilization in the inclusion complex 1AB·3b⊃2 was attributed to the large dispersion energy. Among the compared systems 1AB·3a⊃2, 1bB·3a⊃2, and 1bB·3b⊃2, the inclusion complex 1AB·3b⊃2 had the largest dispersion energy calculated at the B3LYP-D3(BJ)/6-31G** level51 (Table 1). 
From the NCI plots,52,53 a slightly larger NCI area was demonstrated in the π-pocket of 1AB·3b⊃2 (Fig. S10 and S11†). Notably, borate 1AB was proposed by the ML based on structural and electronic factors of the related borates themselves and not those of the inclusion complex with the substrates. Hence, our established algorithm may be extended to predict the essential intermediates in the preorganization step that determine the chemoselectivity driven by the π-pocket concept.
The chemoselectivity of borate 1AB·thf was most clearly demonstrated in the intramolecular recognition of aromatic moieties. We investigated hetero-Diels–Alder reactions of 2 with dialdehyde 6, where the aromatic and aliphatic carbonyl groups were separated by an amide group spacer, as model systems (Table 2). The reaction of dialdehyde 6 prepared from a β-alanine derivative with 2 showed higher selectivity. Borate 1AB·thf successfully recognized the aromatic moiety of 6 and exhibited excellent selectivity under the standard conditions (7a/7b/7c = 8:82:10, entry 1). Our previous borates 1aB·thf and 1bB·thf did not match the result of 1AB·thf; instead, they gave poor product ratios (7a/7b/7c = 39:22:39 (1aB·thf, entry 3) and 20:58:22 (1bB·thf, entry 4)). Conventional Lewis acids did not show catalytic activity or the desired selectivity (entries 5–7). The ratio of the products given by 1AB·thf under the standard conditions improved to 7a/7b/7c = 8:90:2 (entry 2) when using the flow system.9 Although aldehyde 6 contained a secondary amide group, which can act as a strong anchor toward the Lewis acidic center, borate 1AB remarkably recognized the aromatic moiety in 6. The behavior of 1AB is reminiscent of a certain kind of enzymatic catalytic activity based on selective molecular recognition.54,55 Hence, 1AB holds promise as a catalyst for late-stage functionalization of complex biomolecules bearing various functional groups.
Entry | Catalyst | Condition | Yield/% (7a + 7b + 7c) | Ratio 7a/7b/7c |
---|---|---|---|---|
1 | 1AB·thf | Batch | 62 | 8/82/10 |
2 | 1AB·thf | Flow system | 25 | 8/90/2 |
3 | 1aB·thf | Batch | 47 | 39/22/39 |
4 | 1bB·thf | Batch | 72 | 20/58/22 |
5 | BF3·Et2O | Batch | 7 | 34/65/1 |
6 | TiCl4 | Batch | 17 | 81/9/10 |
7 | SnCl4 | Batch | 16 | 49/14/37 |

a rt = 25 °C.
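Because the table reports a combined yield together with a product ratio, the yield of each individual product can be estimated by splitting the total according to the ratio. The short sketch below does this for the tabulated entries; it assumes the ratio applies to the combined isolated yield, and the function name `product_yields` is hypothetical.

```python
def product_yields(total_yield, ratio):
    """Split a combined yield (%) into per-product yields according to a product ratio."""
    whole = sum(ratio)
    return tuple(total_yield * part / whole for part in ratio)


# (catalyst, condition, combined yield in %, ratio 7a/7b/7c) as tabulated above.
entries = [
    ("1AB·thf", "Batch", 62, (8, 82, 10)),
    ("1AB·thf", "Flow system", 25, (8, 90, 2)),
    ("1aB·thf", "Batch", 47, (39, 22, 39)),
    ("1bB·thf", "Batch", 72, (20, 58, 22)),
    ("BF3·Et2O", "Batch", 7, (34, 65, 1)),
    ("TiCl4", "Batch", 17, (81, 9, 10)),
    ("SnCl4", "Batch", 16, (49, 14, 37)),
]
for catalyst, condition, total, ratio in entries:
    y7a, y7b, y7c = product_yields(total, ratio)
    print(f"{catalyst} ({condition}): 7a {y7a:.1f}%, 7b {y7b:.1f}%, 7c {y7c:.1f}%")
```

Read this way, the flow system gives the cleanest product ratio for 1AB·thf, while the batch run delivers more 7b in absolute terms because its combined yield is higher.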
Further analysis of the ML-proposed predictions rationalized the observed highest performance of catalyst 1AB. We evaluated the contribution of each of the employed molecular descriptors to the predicted chemoselectivity using a SHapley Additive exPlanations (SHAP) method. The SHAP method was introduced in cooperative game theory to assess the contribution of each feature.56 Fig. 5A summarizes the top five extracted molecular descriptors (also see Fig. S22†). The top two molecular descriptors (2_SCBO and 4_TDB08p) contributed significantly to the predicted chemoselectivity, while the other variables had a modest contribution. Herein, 2_SCBO (sum of conventional bond orders (H-depleted)) corresponds to the three-dimensional size of a substrate weighted by the number of constituent covalent bonds, while 4_TDB08p (three-dimensional topological distance-based descriptors – lag 8 weighted by polarizability) corresponds to the three-dimensional size of a catalyst weighted by its molecular polarizability. 2_SCBO decisively influenced the chemoselectivity. The selectivity for aromatic aldehydes with substituents and fewer hydrogen atoms, such as pentafluorobenzaldehyde 3e and 4-cyanobenzaldehyde 3f, was high compared to that for butanal 3a. Among the employed aldehydes, the increase in the conventional bond order is intuitively associated with the lower LUMO levels of the carbonyl, promoting selective hetero-Diels–Alder reactions.
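The attribution idea behind SHAP can be illustrated with an exact Shapley calculation on a toy model. The sketch below is not the authors' workflow and does not use the SHAP library: it averages each feature's marginal contribution over every feature ordering, which is only feasible for a handful of features. The toy model, its coefficients, and the input values are all invented; only the descriptor names are borrowed from the text.

```python
from itertools import permutations


def shapley_values(model, sample, baseline):
    """Exact Shapley values for one sample, averaged over all feature orderings."""
    names = list(sample)
    totals = {name: 0.0 for name in names}
    orderings = list(permutations(names))
    for order in orderings:
        current = dict(baseline)          # start from the reference input
        previous = model(current)
        for name in order:
            current[name] = sample[name]  # switch this feature to its actual value
            updated = model(current)
            totals[name] += updated - previous
            previous = updated
    return {name: total / len(orderings) for name, total in totals.items()}


def toy_model(features):
    """Invented selectivity model with an interaction term (illustration only)."""
    return (50.0 + 2.0 * features["SCBO"] + 100.0 * features["TDB08p"]
            + 0.1 * features["other"] + 5.0 * features["SCBO"] * features["TDB08p"])


sample = {"SCBO": 9.0, "TDB08p": 0.05, "other": 1.0}
baseline = {"SCBO": 5.0, "TDB08p": -0.10, "other": 0.0}

contributions = shapley_values(toy_model, sample, baseline)
for name, value in sorted(contributions.items(), key=lambda item: -abs(item[1])):
    print(f"{name}: {value:+.3f}")
```

By construction, the contributions sum to the difference between the model output for the sample and for the baseline, which is the property that lets a ranking such as the one in Fig. 5A be read as a decomposition of the prediction.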
Although the large SCBO contribution of the substrate to the predicted chemoselectivity was expected, the contribution of the TDB08p of the catalyst is truly thought provoking. We previously noted that catalysts possessing a π-pocket constructed by meta-substituted phenyl (1lB) or 1-(1mB)/2-naphthyl (1nB) moieties showed higher selectivity for aromatic aldehydes than catalysts with a π-pocket constructed by para-substituted (1c–hB, and their π-extended analogues 1o–qB) or 3,5-disubstituted (1i–kB) aromatic moieties. Considering the importance of polarizability in characterizing the molecular descriptor TDB08p, the lower symmetric substituent patterns of the π-pocket should sustain the averaged molecular polarizability of the catalyst, realizing high selectivity for aromatic aldehydes. Fig. 5B clearly shows the correlation. Catalysts 1lB, 1mB, and 1nB with positive TDB08p values larger than that of 1dB possessing para-cyano groups showed enhanced chemoselectivities. For the predicted reactions catalyzed by 1AB·thf and 1BB·thf in CH2Cl2, Table S22† provides further evidence of the importance of the contribution of TDB08p. The TDB08p value of 1AB·thf (+0.0524) is the most positive among all catalysts. In contrast, the value of 1BB·thf (−0.121) suggests an enduring negative effect on selectivity. The DFT calculations also supported the difference in molecular polarizability between 1AB and 1BB (1AB: 5.01 Debye; 1BB: 2.48 Debye).
A π-extended aromatic moiety with large polarizability is advantageous to promote non-covalent interactions within the π-pocket space in the reaction step. Notably, understanding the molecular structure–property relationship with the aid of the ML-based insight elucidated the previously unidentified origin of the chemoselectivity of the π-pocket. The size and polarizability of the π-pocket are crucial to determine the relationship. These findings provide insight to design π-pockets as molecular recognition sites.
The present study not only introduces new borate-based Lewis acid catalysts with π-pocket cavities but also highlights the importance of weak and multiple dispersion forces working within aromatic cavities. The combination of the experimental studies with the DFT calculations, the ML approach, and the SHAP analysis proposed an essential factor for the Lewis acid catalyst showing peculiar selectivity driven by the π-pocket: molecular polarizability. We believe that this strategy, assisted by the ML approach, broadens the design of other catalysts exhibiting selectivity based on dispersion forces, which enables distinguishing between carbon frameworks and a direct synthetic methodology for useful organic molecules.
Footnote
† Electronic supplementary information (ESI) available: Synthetic procedures, and spectroscopic, computational, and machine learning data. CCDC 2297702 (1AB·dbp) and 2297703 (1AIIB·py). For ESI and crystallographic data in CIF or other electronic format see DOI: https://doi.org/10.1039/d4ob00408f
This journal is © The Royal Society of Chemistry 2024