Joseph E. Schneider, McKenna K. Goetz and John S. Anderson*
Department of Chemistry, University of Chicago, Chicago, IL 60637, USA. E-mail: jsanderson@uchicago.edu
First published on 29th January 2021
Transition metal oxo species are key intermediates for the activation of strong C–H bonds. As such, there has been interest in understanding which structural or electronic parameters of metal oxo complexes determine their reactivity. Factors such as ground state thermodynamics, spin state, steric environment, oxygen radical character, and asynchronicity have all been cited as key contributors, yet there is no consensus on when each of these parameters is significant or the relative magnitude of their effects. Herein, we present a thorough statistical analysis of parameters that have been proposed to influence transition metal oxo mediated C–H activation. We used density functional theory (DFT) to compute parameters for transition metal oxo complexes and analyzed their ability to explain and predict an extensive data set of experimentally determined reaction barriers. We found that, in general, only thermodynamic parameters play a statistically significant role. Notably, however, there are independent and significant contributions from the oxidation potential and basicity of the oxo complexes which suggest a more complicated thermodynamic picture than what has been shown previously.
A large body of work supports that the free energy of reaction (ΔGPCET) is central to transition metal oxo mediated C–H activation and also offers a great deal of explanatory and predictive power.4–7 Recently however, additional properties have been cited as important although it is not clear if any have a widespread effect on reactivity. Individual cases support the influence of O-centered spin density,8 spin state,9–11 steric environment,12–14 the free energies of proton and electron transfer (ΔGPT and ΔGET),15–20 or the asynchronicity (η) of the reaction,21–24 but there is a lack of consensus regarding their generality and relative importance (Scheme 1).8,25–27 Very few studies have explored these parameters outside of a narrow range of complexes,4,6,10,20,21,28 and none have statistically examined the significance of parameters other than ΔGPCET on the reactivity of a broad set of metal oxo complexes.
We previously found an atypical dependence on ΔGPT in the concerted C–H activation reactivity of a terminal CoIII oxo complex which contrasts with the expected rate dependence on ΔGPCET.15 Given the disparity of this result with the literature, we sought to understand the interplay of characteristics affecting a broad range of transition metal oxo mediated PCET reactions using multivariable linear free energy relationships (LFERs). These models can be used to relate experimentally determined data, such as reaction rates, to multiple predictor variables simultaneously. LFER models have recently been used as versatile tools to optimize organic methodology, predict reaction barrier heights, and investigate underlying mechanisms.29–33,116,117
We have applied this analysis to examine trends in rates of PCET mediated C–H activation for a broad dataset of previously reported metal oxo complexes. This analysis enables a statistical examination of several hypotheses regarding what parameters of metal oxo species determine their PCET reactivity. Unsurprisingly, we observe that ΔGPCET is the most important factor. However, we also observe a significant role for ΔGPT and ΔGET beyond and independent of their contribution to ΔGPCET. Furthermore, the other parameters investigated do not have broad significance. These results suggest that thermodynamic factors are generally the dominant contributors to transition metal oxo C–H activation reactivity, but also demonstrate that thermodynamic parameters beyond the commonly invoked ΔGPCET are influential.
We examined the effect of each of these parameters on experimental reaction barriers by building multivariable free energy models via ordinary least squares regression of the barrier heights against the parameters. Each model consists of a set of coefficients (with variable units such that the product with the respective parameter gives units of kcal mol−1) and an intercept (with units kcal mol−1). These models were used to generate predicted reaction barriers for each data point, which could be compared with experimental reaction barriers to assess the utility of the model. Because ΔGPCET has strong theoretical and experimental support for affecting reaction barrier heights,4–6 we analyzed each parameter in combination with ΔGPCET and compared the resulting model to regression against ΔGPCET alone.
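The regression described above can be sketched in a few lines. This is a minimal illustration of ordinary least squares with an intercept, not the scripts supplied in the ESI; the function names (`fit_ols`, `predict`, `r_squared`) are hypothetical, and the normal equations are solved by plain Gaussian elimination so that no external library is assumed.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix so that row operations act on both sides at once.
    m = [list(a[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(m[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (m[row][n] - tail) / m[row][row]
    return x


def fit_ols(rows, barriers):
    """Return [intercept, coef_1, coef_2, ...] minimizing the squared error.

    rows: one tuple of parameter values per reaction (e.g. free energies).
    barriers: experimental barrier heights in the same order.
    """
    design = [[1.0] + list(r) for r in rows]  # leading 1.0 carries the intercept
    p = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)] for i in range(p)]
    xty = [sum(d[i] * y for d, y in zip(design, barriers)) for i in range(p)]
    return solve_linear_system(xtx, xty)


def predict(beta, row):
    """Predicted barrier for one reaction from fitted coefficients."""
    return beta[0] + sum(c * v for c, v in zip(beta[1:], row))


def r_squared(observed, predicted):
    """Fraction of the variance in the observations explained by the model."""
    mean = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot
```

Comparing `r_squared` for a fit against ΔGPCET alone with a fit that adds a second parameter mirrors the comparison made in the text; because in-sample R2 cannot decrease when parameters are added, that comparison needs the cross-validation and F-test metrics as well.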
We evaluated each regression based on R2, leave-one-out (LOO) R2 (sometimes referred to as Q2), and a statistical F-test.58–61 R2 is a goodness-of-fit measure which quantifies the amount of variation explained by a model. The predictive ability of a model is gauged with LOO R2, in which each data point is left out and predicted by the remaining data points and the goodness of fit is then reevaluated. Critically, unlike regular R2, this metric does not necessarily improve with an increase in parameters; overfitted models with too many parameters perform poorly with LOO R2. For each R2, a value close to 1 indicates a good fit. Finally, we report the p-value from an F-test on each model, which gives the probability that the observed correlation arises from statistical noise. The lower this p-value is, the more significant a given parameter. Additionally, the calculation of p-values considers the number of parameters added to a model, so, as with LOO R2, an F-test is not biased in favor of adding more parameters.
A summary of our findings is presented in Table 1. In line with previous reports, we find a strong correlation between the experimental reaction barriers and ΔGPCET. This parameter alone explains 70% of the variation in reaction barriers within the training set (R2 = 0.70) and has high predictive ability (LOO R2 = 0.60). Interestingly, most other parameters do not significantly improve the model. While we do observe a small correlation with %BV steric metrics, the magnitude of the effect is too small to be statistically significant. Compared to the ΔGPCET only model, spin-based parameters and |η| barely improve R2 and perform similarly or worse in LOO cross-validation. While it is difficult to rule out the importance of these parameters in individual cases, an F-test indicates they do not have a statistically significant effect across our entire data set.
Parameter(s) regressed with ΔGPCET | R2 (training set on DHAa) | LOOc R2 (training set on DHAa) | p-valued (training set on DHAa) | R2 (all data for multiple substratesb) | LOOe R2 (all data for multiple substratesb) |
---|---|---|---|---|---|
ΔGPCET only | 0.70 | 0.60 | <0.001f | 0.45 | 0.36 |
%BV steric metrics | 0.77 | 0.64 | 0.15 | 0.48 | 0.28 |
Oxo spin density | 0.70 | 0.55 | 0.78 | 0.53 | 0.37 |
Spin excitation | 0.71 | 0.50 | 0.49 | 0.50 | 0.39 |
|η| | 0.73 | 0.53 | 0.22 | 0.50 | 0.30 |
ΔGPT, ΔGET | 0.86 | 0.71 | 0.0082; 0.023g; 0.0038h | 0.64 | 0.50 |
a A subset of the reactions of 17 metal oxo complexes with DHA. b Excluding outlier metal oxo complexes (Ru oxos and oxo complexes of 13-TMC); substrates are DHA, 1,4-cyclohexadiene, xanthene, and fluorene. c Leave-one-out. d From an F-test where the null hypothesis is that only ΔGPCET has an effect. e Leave-one-out, slightly modified such that all reactions for a given metal oxo are left out together. f From an F-test where the null hypothesis is that ΔGPCET has no effect. g From an F-test where the null hypothesis is that ΔGPT has no effect. h From an F-test where the null hypothesis is that ΔGET has no effect.
In contrast, addition of ΔGPT and ΔGET does significantly improve the fit. For this {ΔGPCET, ΔGPT, ΔGET} model, R2 increases from 0.70 to 0.86 and LOO R2 increases from 0.60 to 0.71, indicating both better explanation of the available data and better predictive ability. An F-test gives p < 0.01, which suggests the observed effect is statistically significant. The equation from this fit is ΔG‡ = 0.31ΔGPCET + 0.07ΔGPT + 0.12ΔGET − 0.26 (all coefficients unitless; free energies and intercept in kcal mol−1). Typically, ΔGPCET is a negative value while ΔGPT and ΔGET are positive values. Thus, the coefficients' positive signs mean a more exergonic reaction will have a lower barrier while increases in ΔGPT and ΔGET will raise the barrier. The larger coefficient of ΔGPCET indicates the reaction barrier is most sensitive to this free energy. Satisfyingly, the ΔGPCET coefficient agrees with experimental data: for metal oxo complexes that have a demonstrated trend of log(kobs) vs. substrate bond dissociation free energy (BDFE), the average slope of ΔG‡ vs. substrate BDFE is ∼0.3 (see Table S1†), very similar to the ΔGPCET coefficient obtained in our analysis. The intercept of −0.26 contains contributions to the average error not accounted for by the three free energies.
The significance of ΔGPT and ΔGET is intriguing because the literature discussion of these values has often been framed in terms of how they contribute to ΔGPCET rather than in terms of their intrinsic contribution to reaction barrier heights.16–19,27 However, ΔGPT and ΔGET as defined here are the energies to form the intermediates involved in stepwise reactivity – the protonated metal oxo with the deprotonated substrate, or the reduced metal oxo with the oxidized substrate (Scheme 1). Critically, ΔGPT and ΔGET do not form a full thermodynamic cycle with ΔGPCET and thus are fundamentally distinct. This fact is statistically supported by poor correlations between ΔGPCET and ΔGPT and between ΔGPCET and ΔGET (−0.12 and 0.31, respectively, see Regression S6†). Finally, we find that ΔGPT and ΔGET have importance independent of a contribution to ΔGPCET as clearly demonstrated by the LOO R2s and F-tests. All of our analyses therefore suggest that the combination of ΔGPT and ΔGET is an independent and significant contributor to C–H activation barrier heights.
While the observation of a dependence on ΔGPT and ΔGET that arises from our linear regressions is principally empirical, it is consistent with prior theoretical models in the literature. The physical underpinning of this dependency on ΔGPT and ΔGET is likely due to mixing of proton transfer and electron transfer intermediates into the concerted transition state despite these intermediates never being fully realized.62–64 Within transition state theory, this can be envisioned by a More-O’Ferrall–Jencks plot in which the transition state lies not on a one-dimensional line connecting reactant and product but on a two-dimensional plane containing reactant, product, and intermediates.65,66 In this case, the intermediates arise from proton transfer and electron transfer, and when either ΔGPT or ΔGET lowers in energy the transition state can adopt structural and electronic components of these intermediates resulting in a lower barrier height. While the use of these classical structure–energy relationships to analyse PCET reactions has been questioned recently,26,27 proton transfer and electron transfer states and their energies also have roles in nonadiabatic rate theories of PCET which treat proton transfer in a quantum mechanical fashion.67,68 Therefore, the use of ΔGPT and ΔGET to predict barrier heights of PCET is consistent with prior theoretical foundations.
Assigning a direct role for ΔGPT and ΔGET is in line with recent computational studies of PCET transition states which invoke off-ΔGPCET diagonal thermodynamic terms from Scheme 1, such as asynchronicity (η), as key contributors to DFT derived reaction barriers.21–24 Asynchronicity is derived not from the sum of ΔGPT and ΔGET, but rather their difference. In contrast, we find that the sum of ΔGPT and ΔGET has a more significant effect than |η|. The reason for this discrepancy is unclear, but a possible explanation is that experimental noise prevents us from observing a comparatively more subtle trend between |η| and the experimental reaction barrier heights. Furthermore, the well-controlled series of complexes previously investigated for asynchronicity may have too little variation in (ΔGPT + ΔGET) to manifest the effects we observe here.
Another way in which our data may not be amenable to investigating the effect of |η| is the variable reorganization energy of the metal oxo complexes examined here. |η| is specifically framed as an adjustment to the Marcus reorganization energy;21 therefore |η|'s effect may only be clear when reorganization is properly accounted for. While it is clear that the reorganization energy is important to PCET reactivity, there is no established way to compute it without computationally expensive transition state geometries.11,67–69 We have made multiple attempts to derive reorganization or deformation parameters using the optimized metal oxo and metal hydroxide geometries and frequencies, but none of these parameters have statistically significant contributions to predicted reaction barriers with or without |η| (see ESI†). Therefore, a combination of noise in the experimental data and our inability to compute a reliable reorganization parameter could preclude us from observing an effect of |η| on the barrier heights. Nonetheless, previous studies as well as this current work offer increasing support that off-ΔGPCET diagonal thermodynamic terms such as ΔGPT and ΔGET have important effects on reactivity independent of ΔGPCET.
The all-thermodynamic model we find here provides insights and possible alternative explanations for previously reported trends in PCET reactivity. In one study,12 steric and spin state effects were invoked to explain the comparatively high reactivity of the S = 2 complex [FeIV(O)(TMG2dien)(CH3CN)]2+. A higher rate of C–H activation as compared to S = 2 [FeIV(O)(TMG3tren)]2+ was ascribed to reduced steric hindrance in the TMG2dien complex,13 and the higher rate of C–H activation as compared to the S = 1 complexes [FeIV(O)(N4Py)]2+ and [FeIV(O)(TMC)(CH3CN)]2+ was ascribed to the S = 2 spin state in the TMG2dien complex.36,44 However, it was noted that the even faster reactivity of [FeIV(O)(Me3NTB)(CH3CN)]2+, which is S = 1 and has a similar %BV profile to [FeIV(O)(TMG2dien)(CH3CN)]2+,34 is not easily explained by either hypothesis. Our analysis suggests that the thermodynamic properties of these complexes may provide an alternative explanation in these comparisons (see Table S9†). The Me3NTB complex has by far the most exergonic reaction with DHA (ΔGPCET = −16 kcal mol−1), followed by the TMG2dien complex (ΔGPCET = −9 kcal mol−1), followed by the complexes of TMG3tren, TMC, and N4Py (ΔGPCET = −7, −6, and −6 kcal mol−1, respectively). Thus, thermodynamic parameters would predict the Me3NTB complex to have the lowest reaction barrier and fastest rate of reaction, with the TMG2dien complex being the next most reactive, and the remaining complexes the least reactive, as is observed experimentally.
In another study, it was observed that the rates of PCET reactions performed by [FeIV(O)(TMC)(X)]n+ increase with more strongly donating axial ligands X.36 Variation in ΔGPCET was ruled out as a cause of this trend, as it was calculated to be similar for all complexes investigated. It was suggested that the accessibility of a high-spin state may explain this variation in the rates, as the energy of the quintet excited state decreased with stronger X ligands. However, our calculations indicate that while stronger axial donors increase ΔGET, ΔGPT decreases more substantially (see Table S9†). In our model, these changes result in a net decrease in the reaction barrier, suggesting that despite a similar ΔGPCET, the reactivity trend could be explained by thermodynamic effects. These analyses do not rule out that spin state or steric effects may be important in the previous studies, but suggest that thermodynamics may also play an important role.
The fit of the training data to {ΔGPCET, ΔGPT, ΔGET} and this model's performance on the test set is depicted graphically in Fig. 1. It is clear that the reaction barriers for most metal oxo complexes in the test set are well predicted. Nonetheless, several metal oxo complexes (given unique symbols in Fig. 1) deserve further discussion.
The {ΔGPCET, ΔGPT, ΔGET} model performs worst in predicting reaction barriers for the FeIV oxo and CoIV oxo complexes of the ligand 13-TMC.49,70 Essentially no barrier is predicted for these reactions, which is not what is observed experimentally. This is due to a large negative calculated ΔGPCET in both cases; in fact, these complexes are outliers even in the ΔGPCET only fit (see Regression S1†). The cause of this discrepancy is not entirely clear. However, it appears to be specific to the ligand scaffold rather than the identity of the metal center, which suggests this discrepancy could arise from ambiguity in the primary coordination sphere of these complexes. No structural characterization is reported for the FeIV complex, and while a short Co–O bond is identified by EXAFS for the CoIV complex, it is difficult to conclusively determine the primary coordination sphere. Any discrepancy in coordination sphere would render our calculated parameters incorrect, potentially explaining their inability to predict the experimental reaction barriers.
The reaction barrier is overestimated for all Ru oxo complexes, and for three of them by more than two kcal mol−1. As Ru is the only second row transition metal in our data set, we suspect this overestimation is due to a consistent difference between first and second row transition metals rather than Ru examples not following the same trends. For instance, it is possible that the Ru oxo complexes have relatively low structural reorganization energy or that relativistic effects influence the coefficients. It may also simply be a change in the systematic DFT error upon going to the second row. Regardless, regression of barriers from the kinetics of an individual Ru oxo complex reacting with several different substrates reveals there is a trend with ΔGPCET, ΔGPT, and ΔGET with similar coefficients to those obtained from the more general model with multiple different oxo complexes (see ESI†). This supports that the same trends in free energies are at play in the Ru oxo complexes.
Interestingly, the {ΔGPCET, ΔGPT, ΔGET} model only moderately underestimates the reaction barrier (by ∼2 kcal mol−1) for a terminal CoIII oxo complex which has unusual trends in its reactivity with various substrates.15 Unlike most metal oxo complexes, the reactivity of this complex does not have a clear trend with ΔGPCET; its kinetics are instead dominated by ΔGPT. Therefore, its adherence to trends in {ΔGPCET, ΔGPT, ΔGET} as seen for the broad set of metal oxo complexes deserves further investigation. We regressed the experimental reaction barriers for the reactivity of this complex with several substrates against only ΔGPT as well as against {ΔGPT, ΔGPCET} (Fig. 2). We find that the inclusion of ΔGPCET significantly improves the model, increasing R2 from 0.94 to 0.97 and LOO R2 from 0.93 to 0.95 and having an F-test p-value of 0.02 (see ESI†). However, the relative weighting of the contribution from ΔGPCET is quite different than for the broader set of complexes.
In the broader set we observe that ΔGPCET has a larger effect on the reaction barriers than either ΔGPT or ΔGET, which is reflected in the larger coefficient for the ΔGPCET term than for the ΔGPT and ΔGET terms in the fit equation (Fig. 1). In contrast, ΔGPT has a greater effect than ΔGPCET on the reaction barriers for the CoIII oxo complex, again reflected in the magnitude of their coefficients (Fig. 2). Furthermore, the addition of ΔGET significantly improves the model for the broader set of metal oxo complexes (Table 1) but is insignificant for the series of substrates reacting with the CoIII oxo complex (p-value > 0.05, see Regressions S42 and S43†). Overall, this CoIII oxo complex is not so dissimilar from the broader set of metal oxo complexes in that the same thermodynamic free energies explain the reactivity of both. However, this individual case demonstrates a different weighting of parameters than those observed in the broad set.
Our analysis of the CoIII reactivity rests on the assumption that the coefficients of the model do not change appreciably from substrate to substrate. To test this assumption, we extended our analysis of the larger set of metal oxo complexes to include reactivity with 1,4-cyclohexadiene (CHD), fluorene, and xanthene in addition to DHA. We refit the model with reported data for the reactions between each substrate and all metal oxo complexes (excluding the previously discussed Ru and 13-TMC oxo complexes). As with our regressions for DHA alone, the inclusion of ΔGPT and ΔGET notably improves the fit (Table 1, Fig. 3). Other parameters offer comparatively little improvement to the fit and do not perform well by LOO cross validation. The equation for this model is ΔG‡ = 0.23ΔGPCET + 0.04ΔGPT + 0.10ΔGET + 2.10, which is satisfyingly similar to the equation of the fit to DHA data alone, supporting the assumption that the coefficients of the model are not appreciably affected by the identity of the substrate.
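As a worked illustration of how the multi-substrate fit equation is applied, the sketch below evaluates it for a single reaction. The coefficients and intercept are those quoted in the preceding paragraph; the function name `predicted_barrier` and the input free energies are hypothetical values chosen only to show the arithmetic, not data from this study.

```python
def predicted_barrier(dg_pcet, dg_pt, dg_et,
                      coef_pcet=0.23, coef_pt=0.04, coef_et=0.10,
                      intercept=2.10):
    """Predicted barrier (kcal/mol) from the three free energies (kcal/mol).

    Default coefficients are the multi-substrate fit quoted in the text.
    """
    return coef_pcet * dg_pcet + coef_pt * dg_pt + coef_et * dg_et + intercept


# Hypothetical inputs: an exergonic PCET step with uphill PT and ET steps.
barrier = predicted_barrier(dg_pcet=-10.0, dg_pt=50.0, dg_et=60.0)

# Making the PCET step 1 kcal/mol more exergonic lowers the prediction
# by the PCET coefficient.
shift = predicted_barrier(-11.0, 50.0, 60.0) - barrier
```

With these illustrative inputs the three terms contribute −2.3, +2.0, and +6.0 kcal mol−1 before the intercept is added, which shows how a modest coefficient on a large ΔGET can still matter.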
While the relative importance of these thermodynamic parameters can vary between specific cases, this study on a broad set of metal oxo complexes suggests that thermodynamic parameters provide the most general contribution to reaction barriers. Furthermore, while a strong dependence on ΔGPCET is observed, as is expected based on literature precedent, significant and independent contributions from ΔGPT and ΔGET are observed. This conclusion adds to the growing body of literature supporting the importance of thermodynamic parameters beyond ΔGPCET.
All rate constants utilized here were reported as k2 values with the exception of several rate constants used in the CoIII oxo reactivity analysis.15 In this case, for substrates which did not have a reported k2, the pseudo-first order rate constant kobs at 0.0125 M of substrate was divided by 0.0125 M to obtain an approximate k2. We used all substrates with reported kinetic data in this analysis except for 1,1,3,3-tetraphenylpropene. This substrate reacts unusually slowly, which we believe to be due to large steric hindrance of the reacting C–H bond. The remaining substrates were sterically similar enough that there is no steric effect on their kinetics (see Regression S41†).
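The conversion from a pseudo-first order rate constant to an approximate second order rate constant described above amounts to a single division. A minimal sketch follows; the function name `approximate_k2` and the example kobs value are hypothetical, while the default concentration is the 0.0125 M stated in the text.

```python
def approximate_k2(k_obs, substrate_conc=0.0125):
    """Approximate second order rate constant (M^-1 s^-1).

    k_obs: pseudo-first order rate constant (s^-1).
    substrate_conc: substrate concentration (M) at which k_obs was measured.
    """
    if substrate_conc <= 0:
        raise ValueError("substrate concentration must be positive")
    return k_obs / substrate_conc


# Hypothetical k_obs, for illustration only.
k2 = approximate_k2(2.5e-4)
```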
(1) |
The second term in eqn (1) is an approximation for the free energy of association of the metal oxo and the substrate.97 This adjustment allows us to compare kinetic data collected at different temperatures. As C–H bonds are poor hydrogen bond donors, we assume that the cost of association is purely entropic (or at least that enthalpic components vary minimally between different metal oxo complexes and substrates) and further assume this entropy cost is solely the loss of translational entropy. This neglects the loss of rotational entropy and the gain of low frequency metal oxo-substrate vibrational modes, but these effects will partially cancel. Regression with ΔGPCET and RT does not fit DHA reaction barrier heights significantly better than a fit to ΔGPCET alone (see Regression S10†), indicating that this adjustment satisfactorily accounts for the temperature dependence of the reaction barrier.
We do not take into account hydrogen bonding between the metal oxo complexes and protic solvents as we were unable to derive a suitably accurate correction. However, in the ESI† we demonstrate that our best attempt to do so does not change the main conclusions herein (see Table S11†).28,98–100
Unfortunately, several of our optimized structures have small imaginary frequencies (see ESI†), which we believe is due to numerical noise of CPCM solvation. Occasionally these frequencies lie below −100 cm−1 but in each of these cases the mode is isolated to a soft dihedral motion, e.g. methyl rotation on an acetonitrile ligand. We used the absolute value of these frequencies when calculating the thermodynamic enthalpy and entropy values, believing that to be a better approximation for these modes than either nonexistence or a frequency of 0 cm−1. We were unable to reoptimize these structures to remove the imaginary frequencies.
In many cases, the correct ground state multiplicity of a species was not immediately clear. In such cases we confirmed our initial assignment by running ten geometry optimization cycles on alternate spin states and confirming these alternate assignments were several kcal mol−1 uphill of the assigned spin state. In a few cases where the energy was within 5 kcal mol−1 and the optimization was not close to convergence, we fully optimized the alternate spin state. Whenever two spin states had nearly the same energy, we chose the higher spin state as the ground state due to the typically higher entropy of high spin states.
To quantify the steric environment around each metal oxo center or substrate reactive C–H bond, we calculated percent buried volume (%BV) steric metrics using the online SambVca web application.52 We centered the calculation on the oxygen atom (for oxos) or the transferring hydrogen (for substrates), defined the negative z-axis as going through the metal center (for oxos) or the reacting carbon center (for substrates), and defined the xz plane as containing another atom bonded to the metal or carbon (the first such atom in the .xyz file). We had the center oxygen or hydrogen atom deleted from the calculation, included hydrogen atoms in the calculation, and left all other settings at their default values (using Bondi radii scaled by 1.17, a sphere radius of 3.5 angstroms, and a mesh setting of 0.10 angstroms). The application returns a total percent buried volume, as well as that for individual quadrants of the sphere. For metal oxo complexes, we used the total percent buried volume (%BV Tot) and the standard deviation of these four quadrants (%BV Dev) in our regressions in order to capture both overall steric bulk and how evenly distributed this bulk is around the metal oxo moiety. For substrates, we solely used %BV Tot. See the ESI† for a further discussion of steric parameters and their effect on reaction barrier heights.
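The two steric descriptors can be assembled from the quadrant output as sketched below. The function name `steric_metrics` and the example quadrant values are hypothetical, and the use of the population standard deviation is an assumption, since the text does not state whether the population or sample form was used.

```python
from statistics import pstdev


def steric_metrics(total_bv, quadrant_bvs):
    """Return %BV Tot and %BV Dev for one metal oxo complex.

    total_bv: total percent buried volume reported by the application.
    quadrant_bvs: the four per-quadrant percent buried volumes.
    """
    if len(quadrant_bvs) != 4:
        raise ValueError("expected exactly four quadrant values")
    # Assumption: population standard deviation over the four quadrants.
    return {"%BV Tot": total_bv, "%BV Dev": pstdev(quadrant_bvs)}


# Hypothetical values: an unevenly shielded oxo versus an evenly shielded one.
uneven = steric_metrics(45.0, [30.0, 40.0, 50.0, 60.0])
even = steric_metrics(45.0, [45.0, 45.0, 45.0, 45.0])
```

Two complexes with the same %BV Tot can therefore differ in %BV Dev, which is the distinction the second descriptor is meant to capture.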
To evaluate the effect of spin and spin state on reactivity, we used two parameters that have been discussed in the literature: spin density on the oxo ligand and the energy to excite to a higher spin state.8,10 Atomic spin populations were determined via IBO analysis using the freely available IBO Viewer software.53,54 We recorded the spin density on the metal and on oxygen for each metal oxo complex as well as how much spin both atoms gain upon PCET reduction; we also tabulated similar values for the IBO charges. In the regression analysis we solely used the spin density on the oxo ligand. The “Spin Excitation Energy” is the vertical energy from the ground spin state of the initial oxo complex to the lowest lying excited spin state that is within one spin multiplicity of the resulting metal hydroxide ground spin state. If the ground spin state is already one spin multiplicity greater or lower than the product hydroxide spin state, then the spin excitation energy is taken to be zero. For instance, in the case of a triplet FeIV oxo reacting to give a sextet FeIII hydroxide the spin excitation energy is the energy of the quintet FeIV oxo relative to the triplet FeIV oxo at the ground state optimized geometry. This is the scenario for most FeIV oxos in the data set. But in the case of the two non-heme FeIV quintet oxos,12,13 the spin excitation energy is zero because the ground spin state is already within one spin multiplicity of the sextet hydroxide product. Essentially, the spin excitation energy is the energy needed to reach a spin surface on which reduction to the metal hydroxide's ground spin state is spin allowed. While this simple metric ignores the nuances of two state reactivity theory (such as the spin inversion probability) it is relatively simple to compute and has precedent as a quantitative measure of PCET reactivity.10,36
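The selection rule for the spin excitation energy can be written as a short function. This is a sketch of the definition given above under the reading that "within one spin multiplicity" means the multiplicities (2S + 1) differ by exactly one; the function name, argument layout, and example energies are hypothetical.

```python
def spin_excitation_energy(ground_mult, product_mult, vertical_energies):
    """Spin excitation energy (kcal/mol) for a metal oxo complex.

    ground_mult: multiplicity (2S + 1) of the oxo ground state.
    product_mult: multiplicity of the metal hydroxide ground state.
    vertical_energies: dict mapping excited-state multiplicity to its energy
        relative to the oxo ground state at the ground state geometry.
    """
    # Ground state already connects to the product spin surface.
    if abs(ground_mult - product_mult) == 1:
        return 0.0
    # Otherwise take the lowest excited state adjacent to the product state.
    candidates = [energy for mult, energy in vertical_energies.items()
                  if abs(mult - product_mult) == 1]
    if not candidates:
        raise ValueError("no computed spin state adjacent to the product state")
    return min(candidates)


# Triplet Fe(IV) oxo giving a sextet Fe(III) hydroxide; energies are hypothetical.
triplet_case = spin_excitation_energy(3, 6, {5: 8.0, 7: 30.0})
# Quintet Fe(IV) oxo giving the same sextet product.
quintet_case = spin_excitation_energy(5, 6, {3: 12.0})
```

In the triplet example the quintet state is selected, matching the FeIV case described in the text, and the quintet ground state example returns zero.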
For each metal oxo-substrate combination assessed here, we tabulated the free energies of proton coupled electron transfer (ΔGPCET, eqn (2)), proton transfer (ΔGPT, eqn (3)), electron transfer (ΔGET, eqn (4)), and the asynchronicity as defined by Srnec and coworkers (η, eqn (5)):21
ΔGPCET = GM–OH + GC· − GMO − GC–H | (2) |
ΔGPT = GM–OH+ + GC:− − GMO − GC–H | (3) |
ΔGET = GM–O− + GC–H+ − GMO − GC–H | (4) |
(5) |
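Eqns (2)–(4) share the same reactant terms and differ only in the pair of products, which the sketch below makes explicit. The dictionary keys and numerical values are hypothetical placeholders for computed free energies; asynchronicity is left out of the sketch.

```python
def pcet_free_energies(g):
    """Return (dG_PCET, dG_PT, dG_ET) from species free energies.

    g maps hypothetical species labels to free energies in kcal/mol:
      "MO", "C-H"            reactants (metal oxo, substrate)
      "M-OH", "C_radical"    PCET products, eqn (2)
      "M-OH+", "C_anion"     proton transfer products, eqn (3)
      "M-O-", "C-H+"         electron transfer products, eqn (4)
    """
    reactants = g["MO"] + g["C-H"]
    dg_pcet = g["M-OH"] + g["C_radical"] - reactants
    dg_pt = g["M-OH+"] + g["C_anion"] - reactants
    dg_et = g["M-O-"] + g["C-H+"] - reactants
    return dg_pcet, dg_pt, dg_et


# Hypothetical free energies chosen only to illustrate the bookkeeping.
example = {
    "MO": -100.0, "C-H": -50.0,
    "M-OH": -130.0, "C_radical": -30.0,
    "M-OH+": -110.0, "C_anion": 10.0,
    "M-O-": -120.0, "C-H+": 30.0,
}
dg_pcet, dg_pt, dg_et = pcet_free_energies(example)
```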
The simplest metrics reported from these models are the mean square error (MSE) and the goodness of fit R2.58–61 These both give an indication of how well a model fits the available data but are prone to overfitting; more complicated models can only improve these metrics, regardless of whether or not the model is actually better.
We also evaluated each model with cross validation (CV) metrics, which can become worse upon overfitting. In K-fold cross validation, the training data is further subdivided into K subsets, and each subset is predicted by the K − 1 remaining subsets.59,61 When K is the number of data points, i.e. each data point being predicted by the rest of the data points, this is known as leave-one-out (LOO) cross validation. These predicted data points can be used to calculate the MSE and R2 as above. The MSE from LOO cross validation is an approximately unbiased estimate of the expected error of a test set; however, it has high variability from training set to training set because each prediction uses nearly every point in a given training set. By repeatedly subdividing into larger groups and averaging the resultant K-fold MSEs, one obtains a pessimistic but less variable estimate of the expected test error. As we see similar trends for both LOO and 5-fold CV, we only report LOO R2 in the main text but show all metrics in the ESI.†
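Leave-one-out cross validation can be sketched compactly for a single-predictor fit; the multivariable models in this work follow the same loop with a larger design matrix. The function names are hypothetical, and computing LOO R2 as one minus the ratio of the summed squared LOO prediction errors to the total sum of squares about the full-data mean is one common convention, assumed here.

```python
def fit_line(xs, ys):
    """Closed-form least squares slope and intercept for one predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(observed, predicted):
    """1 - (residual sum of squares) / (total sum of squares)."""
    mean = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def in_sample_r_squared(xs, ys):
    """R2 when every point is predicted by a fit that included it."""
    slope, intercept = fit_line(xs, ys)
    return r_squared(ys, [slope * x + intercept for x in xs])


def loo_r_squared(xs, ys):
    """R2 when each point is predicted by a fit to all the other points."""
    predictions = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        slope, intercept = fit_line(train_x, train_y)
        predictions.append(slope * xs[i] + intercept)
    return r_squared(ys, predictions)
```

Because each held-out point is predicted by a model that never saw it, `loo_r_squared` falls below `in_sample_r_squared` whenever the data are noisy, and the gap widens for overfitted models.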
Another way to determine the significance of the model is to use a statistical F-test.58,60 This allows one to compare an unrestricted model with a more restricted one (fewer parameters used as regressors, or no parameters regressed, or restrictions placed on the relationship between coefficients, etc.). In the language of hypothesis testing, the null hypothesis is that the unrestricted model offers no improvement on the restricted model and the alternate hypothesis is that there is an improvement. When both models are fit to the data, the unrestricted model will have less total squared error than the restricted model. Assuming said error of each data point is normally distributed (or that there is enough data such that the error is approximately normally distributed), that the average error is zero, and that the model is properly formulated, it is possible to determine the probability that this reduction in total squared error is spurious. This probability is known as the p-value. The test relies on a well-defined number of degrees of freedom in both the restricted and unrestricted model to draw out what the statistical distribution of total squared error ought to be.
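The test statistic for comparing a restricted and an unrestricted model can be computed directly from their total squared errors. The sketch below uses the standard nested-model F statistic; the function name and the numerical inputs are hypothetical, and parameter counts are assumed to include the intercept.

```python
def nested_f_statistic(rss_restricted, rss_unrestricted, n_points,
                       n_params_restricted, n_params_unrestricted):
    """Return (F, numerator dof, denominator dof) for two nested OLS models.

    rss_*: total squared error of each fitted model.
    n_params_*: number of fitted parameters, including the intercept.
    """
    q = n_params_unrestricted - n_params_restricted
    dof = n_points - n_params_unrestricted
    if q <= 0 or dof <= 0:
        raise ValueError("models must be nested with residual degrees of freedom")
    # Error reduction per added parameter, relative to the remaining
    # error per residual degree of freedom.
    f_stat = ((rss_restricted - rss_unrestricted) / q) / (rss_unrestricted / dof)
    return f_stat, q, dof


# Hypothetical errors: adding two parameters to a two-parameter model
# fitted to 17 data points.
f_stat, q, dof = nested_f_statistic(30.0, 14.0, 17, 2, 4)
```

The p-value is the upper-tail probability of an F distribution with (q, dof) degrees of freedom evaluated at the statistic; if SciPy is available, `scipy.stats.f.sf(f_stat, q, dof)` returns it.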
For regressions on multiple substrates at once, the unequal weighting of different metal oxo complexes (depending on how many substrates are reported for them) renders these statistical metrics unreliable.61 We ameliorate this issue for LOO cross validation by leaving out all reaction barriers for a given metal oxo complex together rather than one at a time. That is, we leave one metal oxo complex out and predict its reaction barrier heights based on all other metal oxos' reaction barrier heights rather than leave one reaction barrier height out and predict this barrier based off all other barriers. We accordingly only report LOO CV metrics for this set of regressions.
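The modified leave-one-out scheme only changes how the data are split: all reactions of one metal oxo complex are held out together. A minimal sketch of the splitting step follows; the function name and the complex labels are hypothetical.

```python
def leave_one_group_out(groups):
    """Yield (held_out_label, train_indices, test_indices) per distinct label.

    groups: one label per reaction, naming the metal oxo complex involved.
    """
    labels = []
    for label in groups:
        if label not in labels:  # keep first-seen order
            labels.append(label)
    for held_out in labels:
        test = [i for i, g in enumerate(groups) if g == held_out]
        train = [i for i, g in enumerate(groups) if g != held_out]
        yield held_out, train, test


# Hypothetical data set: three complexes with 2, 1, and 3 reported substrates.
splits = list(leave_one_group_out(["A", "A", "B", "C", "C", "C"]))
```

Each split is then used as in ordinary LOO cross validation: the model is fit on the training indices and used to predict every barrier in the held-out group, so a complex with many reported substrates cannot help predict itself.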
Footnote |
† Electronic supplementary information (ESI) available: Detailed summary of LFER models; additional methodological details; further analysis of hydrogen bonding, steric effects, and reorganization effects; spreadsheets containing experimental and computational data; optimized geometries and the output of frequency calculations; and python scripts used to perform the regression. See DOI: 10.1039/d0sc06058e |
This journal is © The Royal Society of Chemistry 2021 |