Unveiling CO2 reactivity with data-driven methods†
Abstract
Carbon dioxide is a versatile C1 building block in organic synthesis. Understanding its reactivity is crucial for predicting reaction outcomes and identifying suitable substrates for the creation of value-added chemicals and drugs. A recent study [Li et al., J. Am. Chem. Soc., 2020, 142, 8383] estimated the reactivity of CO2 in the form of Mayr's electrophilicity parameter E on the basis of a single carboxylation reaction. The disagreement between experiment (E = −16.3) and computation (E = −11.4) corresponds to a deviation of up to ten orders of magnitude in bimolecular rate constants of carboxylation reactions according to the Mayr–Patz equation, log k = s_N(E + N). Here, we introduce a data-driven approach incorporating supervised learning, quantum chemistry, and uncertainty quantification to resolve this discrepancy. The dataset used for reducing the uncertainty in E(CO2) represents 15 carboxylation reactions in DMSO. However, experimental data are available for only one of these reactions. To ensure reliable predictions, we selected a training set composed of this reaction and 19 additional reactions involving heteroallenes other than CO2 for which experimental data are available. With the new data-driven protocol, we can narrow down the electrophilicity of carbon dioxide to E(CO2) = −14.6(5) with 95% confidence, and suggest an electrophile-specific sensitivity parameter s_E(CO2) = 0.81(6), resulting in an extended reactivity equation, log k = s_E s_N(E + N) [Mayr, Tetrahedron, 2015, 71, 5095].
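
The two reactivity equations quoted above can be evaluated with a few lines of Python. The following is only a minimal sketch: the function names `log_k` and `log_k_gap` are hypothetical, and the nucleophile parameters `N_EXAMPLE` and `S_N_EXAMPLE` are illustrative placeholders rather than values from this work. Setting s_E = 1 recovers the original Mayr–Patz equation.

```python
def log_k(E, N, s_N, s_E=1.0):
    """Extended reactivity equation: log k = s_E * s_N * (E + N).

    With the default s_E = 1.0 this reduces to the Mayr-Patz
    equation, log k = s_N * (E + N).
    """
    return s_E * s_N * (E + N)


def log_k_gap(E_a, E_b, s_N, s_E=1.0):
    """Difference in log k between two electrophilicity estimates.

    The nucleophilicity N cancels, so the gap depends only on the
    sensitivity parameters and on the difference E_a - E_b.
    """
    return s_E * s_N * (E_a - E_b)


# Electrophilicity estimates for CO2 quoted in the abstract.
E_EXPERIMENT = -16.3
E_COMPUTATION = -11.4
E_DATA_DRIVEN = -14.6
S_E_DATA_DRIVEN = 0.81

# ASSUMPTION: hypothetical nucleophile, chosen only for illustration.
N_EXAMPLE = 20.0
S_N_EXAMPLE = 0.7

if __name__ == "__main__":
    # Mayr-Patz predictions for the two earlier estimates of E(CO2).
    print(log_k(E_EXPERIMENT, N_EXAMPLE, S_N_EXAMPLE))
    print(log_k(E_COMPUTATION, N_EXAMPLE, S_N_EXAMPLE))
    # Extended equation with the data-driven E and s_E.
    print(log_k(E_DATA_DRIVEN, N_EXAMPLE, S_N_EXAMPLE, S_E_DATA_DRIVEN))
    # Gap in log k caused by the disagreement between the two estimates.
    print(log_k_gap(E_COMPUTATION, E_EXPERIMENT, S_N_EXAMPLE))
```

Because N cancels in the difference, the spread in predicted log k between two estimates of E scales linearly with the nucleophile-specific sensitivity s_N.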