Evan Komp* and Stéphanie Valleau*
Chemical Engineering, University of Washington, 3781 Okanogan Ln, Seattle, WA 98195, USA. E-mail: evankomp@uw.edu; valleau@uw.edu
First published on 14th June 2022
We have generated an open-source dataset of over 30 000 gas-phase partition functions for organic chemistry species. With this data, a machine learning deep neural network estimator was trained to predict partition functions of unknown gas-phase organic transition states. This estimator relies only on reactant and product geometries and partition functions. A second machine learning deep neural network was trained to predict partition functions of chemical species from their geometry. Our models accurately predict the logarithm of test set partition functions with a maximum mean absolute error of 2.7%. This approach thus provides a means to reduce the cost of computing reaction rate constants ab initio. The models were also used to compute transition state theory reaction rate constant prefactors, and the results were in quantitative agreement with the corresponding ab initio calculations, with an accuracy of 98.3% on the log scale.
Once input features have been computed, a trained machine learning (ML) estimator can predict chemical properties within seconds.7–15 Hence, in the last few years several efforts have been made to leverage ML for kinetics.21 Models have been trained to predict reaction energies,22 bond dissociation energies21 and, recently, activation energies.23,24 Machine learning has also been used to predict reaction products,25–28 yields29,30 and ideal synthesis conditions.31 Recently, ML estimators were trained to predict quantum reaction rate constants for one-dimensional minimum energy paths (MEPs).20,32 These efforts are encouraging; their main limitation in application is the size of the training datasets and the lack of large kinetic datasets.19 ML has also been used to help accelerate the search for MEPs,16–18 and larger datasets of transition states33,34 and quantum rate constants20 have recently become available.
In this work we investigated the use of ML to predict transition state partition functions without knowledge of the transition state geometry, as well as to predict partition functions when geometries are known (Fig. 1). Together with an activation energy predictor,23,24 an accurate ML predictor of partition functions could circumvent the need to find a transition state structure when using transition state theory (TST) to compute reaction rate constants. This would enable the rapid evaluation of kinetics for large networks of reactions. In addition, quantum reaction rate constant theories such as the flux–flux autocorrelation function still require the reactant partition function.35 Hence this predictor would help accelerate the calculation of quantum reaction rate constants. Lastly, it could be used to accelerate the evaluation of all other thermodynamic quantities which rely on knowledge of the partition function: free energy, entropy, pressure, heat capacity, etc. We generated a dataset of over 30 k partition functions for gas-phase organic chemistry species, extracted from an existing dataset36 of reactant, product, and transition state structures for unimolecular reactions. With this data, two deep neural network (DNN) partition function estimators were optimized. The first DNN, “Qest”, predicts the natural logarithm of the molecular partition function from the featurized molecular geometry and the inverse temperature, 1/T. The second DNN, “QesTS”, predicts the natural logarithm of the partition function of an unknown transition state, log QTS(T), from the difference between product and reactant featurized geometries, the reactant and product partition functions on the log scale, and 1/T. These models were then used to predict partition functions for the computation of TST reaction rate constants. Our reaction rate constant predictions were also compared to ab initio TST reaction rate constants. The various steps of our workflow, our results, and conclusions are discussed in the following subsections.
Fig. 1 (Panel I) Partition functions were computed for a set of DFT optimized reactant, product, and transition state geometries taken from a dataset of 11 961 unimolecular reactions.36 (Panel II) With this data, a deep neural network (DNN), Qest, was trained to predict a structure's partition function from its geometry and the inverse temperature, 1/T. A second DNN, QesTS, was trained to predict the partition function of an unknown transition state from the reactant and product structures and partition functions, along with 1/T.
Q(T) ≈ Qel(T) Qtrans(T) Qrot(T) Qvib(T)   (1)
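For reference, under the ideal-gas, rigid-rotor, harmonic-oscillator approximation the individual factors take the familiar textbook forms below (shown for a nonlinear molecule; these standard expressions are given for orientation, and the exact versions used in this work are those of Table S1):

```latex
% Standard RRHO partition function factors for a nonlinear molecule
% (textbook forms; the expressions used in this work are in Table S1).
\begin{align}
  Q_{\mathrm{trans}}(T) &= \left(\frac{2\pi m k_{B} T}{h^{2}}\right)^{3/2} V,\\
  Q_{\mathrm{rot}}(T)   &= \frac{\sqrt{\pi}}{\sigma}
      \left(\frac{8\pi^{2} k_{B} T}{h^{2}}\right)^{3/2}\sqrt{I_{A} I_{B} I_{C}},\\
  Q_{\mathrm{vib}}(T)   &= \prod_{i=1}^{3N-6}
      \frac{e^{-h\nu_{i}/2k_{B}T}}{1 - e^{-h\nu_{i}/k_{B}T}}
      \quad\text{(zero-point energy included)},\\
  Q_{\mathrm{el}}(T)    &\approx g_{0},
\end{align}
```

where σ is the rotational symmetry number, I_A, I_B, I_C are the principal moments of inertia, ν_i the harmonic frequencies, and g_0 the ground state electronic degeneracy.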
See Table S1 in the ESI† for the equations of each partition function term in eqn (1). Although some products consisted of two or more distinct molecules, these had been treated as single structures when energies and Hessians were computed in the original dataset.36 We used these single geometries and vibrational frequencies to compute the partition functions. A more accurate representation would require separating each product geometry into the individual molecular geometries. When computing the vibrational partition functions, zero-point energies were included. For the rigid rotor rotational partition function, we computed the moments of inertia from the geometries and determined the symmetry number by identifying invariant symmetry operations using pymatgen,37 together with our own software to determine which symmetry operations were proper. We accounted for the exception of linear molecules, for which C2 rotations are returned as improper reflections by pymatgen.
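As an illustration of the symmetry number determination, the minimal sketch below uses pymatgen's PointGroupAnalyzer to enumerate point-group operations and counts the proper ones (rotation matrices with determinant +1); this is a reconstruction under the assumptions stated in the comments, not the authors' actual software:

```python
import numpy as np
from pymatgen.core import Molecule
from pymatgen.symmetry.analyzer import PointGroupAnalyzer

def rotational_symmetry_number(mol: Molecule) -> int:
    """Count the proper operations (det R = +1, identity included) among
    the point-group operations; this count equals the rotational symmetry
    number sigma for nonlinear molecules. The linear-molecule exception
    mentioned in the text (C2 rotations returned as improper reflections)
    is not handled in this sketch."""
    pga = PointGroupAnalyzer(mol)
    return sum(
        1
        for op in pga.get_symmetry_operations()
        if np.isclose(np.linalg.det(op.rotation_matrix), 1.0)
    )

# Example: water (C2v) has two proper operations, E and C2, so sigma = 2.
h2o = Molecule(
    ["O", "H", "H"],
    [[0.0, 0.0, 0.117], [0.0, 0.757, -0.469], [0.0, -0.757, -0.469]],
)
print(rotational_symmetry_number(h2o))  # -> 2
```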
Partition functions were computed over a set of 50 temperatures randomly sampled for each reaction from a uniform distribution of 1/T within the range of T = (50, 2000) K such that each reaction was considered at a unique set of temperatures. A histogram of the natural logarithm of the partition functions for all structures at all temperatures is shown in Fig. 2. The histogram shows a smaller count of low values of the natural logarithm of the partition function. This is due to the dominating contribution of the vibrational partition function to eqn (1). Our data has been made open source and is available to download from ref. 38.
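The temperature sampling described above amounts to drawing 1/T uniformly between the bounds and inverting; a minimal sketch (the seed and array handling are our assumptions):

```python
import numpy as np

T_MIN, T_MAX = 50.0, 2000.0    # K
rng = np.random.default_rng()  # seed not specified in the text

def sample_temperatures(n: int = 50) -> np.ndarray:
    """Draw n temperatures uniformly in 1/T on [1/T_MAX, 1/T_MIN]."""
    inv_T = rng.uniform(1.0 / T_MAX, 1.0 / T_MIN, size=n)
    return 1.0 / inv_T

temperatures = sample_temperatures()  # one unique set per reaction
```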
Fig. 2 Histogram of all partition function values computed in our dataset for 35 883 organic chemistry species. Each entry is either a reactant, product, or transition state at one of 50 temperatures T ∈ (50, 2000) K, sampled uniformly with respect to 1/T. Structures that exhibit the smallest and largest partition functions in the dataset are depicted in subpanels (A) and (B). The shape of this histogram is largely due to the dominating contribution of the vibrational partition function in eqn (1).
The entire dataset was split into a hold-out test set (10%) and a development set (90%). The hold-out set was used to test the final ML predictors. The development set was further split into 5 folds for cross validation during model optimization. Both the hold-out and the fold splitting were conducted using the Murcko scaffold of the reactant molecule.39 Here the scaffold is a representation of the structural backbone of the molecule. Splitting by scaffold ensured that identical and similar molecules in the dataset were grouped together, producing a better estimate of the models' accuracy on unseen data when conducting cross validation or testing. For Qest, the input consisted of a molecule's featurized structure (reactant, product, or transition state) and the inverse temperature, while the target was the corresponding natural logarithm of the partition function at that temperature. For QesTS, the inputs were the difference between product and reactant featurized structures, the product and reactant partition functions, and the inverse temperature, while the target was the natural logarithm of the transition state partition function.
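A scaffold-based split of this kind can be sketched with RDKit's Murcko scaffold utility, grouping entries by scaffold and assigning whole groups to the hold-out set (the grouping heuristic and fractions below are illustrative assumptions, not the exact procedure of ref. 39):

```python
from collections import defaultdict
from rdkit.Chem.Scaffolds import MurckoScaffold

def scaffold_split(smiles_list, test_fraction=0.1):
    """Group molecules by Murcko scaffold, then fill the test set with
    whole scaffold groups so similar molecules never straddle the split."""
    groups = defaultdict(list)
    for idx, smi in enumerate(smiles_list):
        groups[MurckoScaffold.MurckoScaffoldSmiles(smiles=smi)].append(idx)

    # Assign the largest scaffold groups to the development set first,
    # letting smaller (rarer) scaffolds accumulate in the test set.
    dev_idx, test_idx = [], []
    n_test_target = int(test_fraction * len(smiles_list))
    for group in sorted(groups.values(), key=len, reverse=True):
        if len(test_idx) + len(group) <= n_test_target:
            test_idx.extend(group)
        else:
            dev_idx.extend(group)
    return dev_idx, test_idx
```

The same grouping can be reused to build the 5 cross-validation folds by distributing scaffold groups across folds instead of across two sets.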
We found that the EncodedBonds41 featurizer with min–max scaling and target normalization was optimal for the Qest model. For the QesTS model we employed the same input feature representation as for Qest, i.e. EncodedBonds. We also screened data standardization schemes and found that the model performed best without standardization. We believe that this is because the distribution of values for the difference between product and reactant feature vectors is peaked around zero; in addition, the distribution of partition function values for transition states is narrower than the overall partition function distribution for all species. For more information see ESI, section 2.†
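For the Qest preprocessing, min–max scaling of the inputs and normalization of the target can be expressed with scikit-learn; in this sketch X stands in for the EncodedBonds feature matrix with the 1/T column appended, and y for log Q (both placeholders):

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler, StandardScaler

# Placeholder data standing in for EncodedBonds features (+ 1/T) and log Q.
X = np.random.rand(1000, 256)
y = 100.0 * np.random.rand(1000)

x_scaler = MinMaxScaler().fit(X)                   # each feature -> [0, 1]
y_scaler = StandardScaler().fit(y.reshape(-1, 1))  # normalize the target

X_scaled = x_scaler.transform(X)
y_scaled = y_scaler.transform(y.reshape(-1, 1)).ravel()
# At prediction time the target transform is inverted with
# y_scaler.inverse_transform(...) to recover log Q.
```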
With these optimal input features, we carried out a search over DNN hyperparameters to identify the best activation function, regularization, bias and weight initialization, learning rate, and neuron configuration for Qest and QesTS. For more information see ESI, section 3.†
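A random search over such a space might look like the following Keras sketch; the candidate values and layer configurations are illustrative assumptions, not the grids screened in this work:

```python
import random
import tensorflow as tf

SEARCH_SPACE = {
    "activation": ["relu", "tanh", "elu"],
    "l2": [0.0, 1e-5, 1e-4],
    "initializer": ["glorot_uniform", "he_normal"],
    "learning_rate": [1e-4, 1e-3, 1e-2],
    "layers": [(256, 128), (512, 256, 128), (128, 128)],
}

def build_model(cfg, n_features):
    """Build a fully connected regressor for log Q from one sampled
    hyperparameter configuration."""
    model = tf.keras.Sequential([tf.keras.Input(shape=(n_features,))])
    for units in cfg["layers"]:
        model.add(tf.keras.layers.Dense(
            units,
            activation=cfg["activation"],
            kernel_initializer=cfg["initializer"],
            kernel_regularizer=tf.keras.regularizers.l2(cfg["l2"]),
        ))
    model.add(tf.keras.layers.Dense(1))
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=cfg["learning_rate"]),
        loss="mae",
    )
    return model

# Sample one configuration; in practice each sample is scored by
# 5-fold cross validation on the development set (loop omitted).
cfg = {k: random.choice(v) for k, v in SEARCH_SPACE.items()}
model = build_model(cfg, n_features=256)
```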
In Fig. 3 we show parity plots of the final QesTS (panel A) and Qest (panel B) model predictions against the test set. In both cases we see a low test set mean absolute error (MAE) of 4.01 (2.0%) and 4.37 (2.1%) on the logarithm of the chemical species or transition state partition function for QesTS and Qest, respectively. Percentages express the MAE relative to the test set standard deviation. In both cases the spread of the distribution of predictions around the identity line is quite narrow. The distribution of errors is shown in ESI, section 4.† This indicates that both DNN models accurately predict the partition functions.
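The percentage metric used here and below is the MAE divided by the standard deviation of the test set targets, e.g.:

```python
import numpy as np

def relative_mae(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """MAE expressed as a percentage of the test set standard deviation."""
    return 100.0 * np.mean(np.abs(y_true - y_pred)) / np.std(y_true)
```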
To verify that the Qest model was learning from the molecular structures in the dataset and not simply from the temperature, we created a “null” linear model logQ(T) = m × 1/T + b and fitted it to the development set. The use of this null model for both reactant and transition state partition functions corresponds to the commonly used assumption that the ratio of partition functions is equal to 1.0. The null model performed poorly on the test set with an MAE of 31.8%. Our Qest model error was much lower: 1.9%, confirming that the Qest model learned from the molecular structure.
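The null model is a single least-squares line in 1/T, fitted with no structural information, e.g.:

```python
import numpy as np

def fit_null_model(inv_T: np.ndarray, log_Q: np.ndarray):
    """Fit log Q(T) = m * (1/T) + b over the development set
    (inv_T and log_Q are the stacked 1/T and log partition function
    values for all structures and temperatures)."""
    m, b = np.polyfit(inv_T, log_Q, deg=1)
    return lambda x: m * x + b
```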
The prediction MAEs on the test set for Qest, QesTS, and the null model are listed in Table 1.
|  | Null MAE | Qest MAE | QesTS MAE |
| --- | --- | --- | --- |
| log Q | 72.5 (34.5%) | 4.37 (2.1%) | N/A |
| log QTS | 72.1 (35.1%) | 4.25 (2.1%) | 4.01 (2.0%) |
Given that both models predicted with a low error on the test set, we combined them to predict transition state partition functions from machine-learned reactant and product partition functions. We will discuss this “Double” model in the next section. The trained models are available on GitHub.42
To understand how machine-learned partition functions influence the accuracy of the reaction rate constant, three approaches were considered. In the first, both reactant and transition state partition functions were predicted from their geometries using Qest. In the second approach, the reactant and product geometries and computed ab initio partition functions were used to predict the transition state partition function with QesTS. This approach avoids the time-consuming search for a transition state geometry. In the last approach, Qest was used to predict reactant and product partition functions, which were then used, together with the geometries, as input to QesTS to predict the transition state partition function. This method, hereafter referred to as the Double predictor, requires only knowledge of the reactant and product geometries and needs neither computed partition functions nor transition state geometries.
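Schematically, the Double predictor chains the two networks and feeds the result into the TST expression k_TST(T) = (k_B T/h)(Q_TS/Q_R) exp(−E_a/k_B T). In the sketch below the model objects and their predict signatures are placeholders for the trained Qest and QesTS networks, and E_a is taken per molecule in joules:

```python
import numpy as np

KB = 1.380649e-23   # Boltzmann constant, J/K
H = 6.62607015e-34  # Planck constant, J s

def double_tst_rate(qest, qests, feat_R, feat_P, inv_T, Ea):
    """Double predictor: Qest supplies the reactant (and product) log
    partition functions, QesTS the transition state one; both enter the
    TST rate constant. Model inputs and signatures are assumptions."""
    log_qR = qest.predict([feat_R, inv_T])
    log_qP = qest.predict([feat_P, inv_T])
    log_qTS = qests.predict([feat_P - feat_R, log_qR, log_qP, inv_T])
    T = 1.0 / inv_T
    # k_TST(T) = (kB T / h) * (Q_TS / Q_R) * exp(-Ea / (kB T))
    return (KB * T / H) * np.exp(log_qTS - log_qR) * np.exp(-Ea / (KB * T))
```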
The MAE of the three methods for predicting test set transition state partition functions and TST reaction rate constants is listed in Table 2. For the prediction of transition state partition functions (Table 2, column 1) all models have a similar MAE. The Double approach is the least accurate with an MAE of 2.7%, while the QesTS approach has the lowest MAE of 2.0%.
|  | log QTS(T) MAE | log kTST(T) MAE | Required inputs for k(T) |
| --- | --- | --- | --- |
| Qest | 4.25 (2.1%) | 4.50 (1.7%) | Reactant and transition state structures, temperature, Ea |
| QesTS | 4.01 (2.0%) | 4.01 (1.5%) | Reactant and product structures and partition functions, temperature, Ea |
| Double | 5.58 (2.7%) | 4.50 (1.7%) | Reactant and product structures, temperature, Ea |
The fact that QesTS is the best predictor is not surprising, given that it was trained specifically to predict transition state partition functions while Qest was not. Further, QesTS receives the reactant and product partition functions as input. In column 2 of Table 2 we see that when computing reaction rate constants with the predicted partition functions, the Double and Qest approaches have the same MAE, while the QesTS approach remains the most accurate with an MAE of 4.01. The improvement in the Double approach's rate constant accuracy, relative to its partition function accuracy, is most likely due to a cancellation of errors between the Qest-predicted reactant and the QesTS-predicted transition state partition functions. Nonetheless, we can achieve the same average accuracy in the reaction rate constant when using only reactant and product structures and no information on the transition state geometry. The only remaining cost is providing the value of the activation energy. Here it is worth noting that machine learning has successfully been employed to predict activation energies.23,24
In Fig. 4, panel A we plot the error of the predicted transition state partition function with respect to the test set partition functions as a function of temperature for Qest, QesTS, and Double. In panel B we show the logarithm of the ratio of predicted transition state theory prefactors, QTS/QR, with respect to the computed prefactors. We see some error cancellation: the error in the ratio of partition functions (panel B) is smaller than that for the individual values (panel A). Errors for all models tend to increase at lower temperatures, even though low temperatures were sampled frequently when generating the data. This may be because the partition function is more sensitive to small changes in temperature at low temperatures, making it more difficult to learn. Regardless, these errors do not significantly impact the predicted value of the reaction rate constant, with average errors orders of magnitude lower than the value of the logarithm of the reaction rate constant. This is also seen in Fig. 5, where we show the average error compared to the reaction rate constant for the ring breaking of gamma-butyrolactone.
In Fig. 6 panel A, we show a plot of the transition state theory reaction rate constant for another reaction randomly selected from the test set, the ring opening of tetrahydro-2H-pyran-2-imine. Here the partition functions were computed using the original ab initio data (green line) and predicted with the ML models. In panel B, we show the absolute error, i.e. the absolute difference between true and predicted values. We see that the error of the models is more than an order of magnitude smaller than the value of the logarithm of the reaction rate constant, which confirms the accuracy of our trained models.
The models we have created, in tandem with activation energy predictors, provide an approach to predict reaction rate constants without the need to search for minimum energy paths. Our models enable a more rapid estimation of reaction dynamics in the context of coupled reactions, reaction networks, and reactor design. We also note that very recent work has used ML to predict transition state structures directly.49,50 Should these models become accurate on a broader scale, they could provide an alternative path towards predicting TST rate constants.
In the future, we aim to move beyond unimolecular reactants and consider bimolecular reactions as well as reactions which occur in the presence of a solvent. We also plan to investigate other machine learning approaches, for instance end-to-end message passing models, given their recent success in predicting adjacent molecular and reaction quantities.51,52
Footnote
† Electronic supplementary information (ESI) available: Model architecture, optimization, and additional performance plots. See https://doi.org/10.1039/d2sc01334g
This journal is © The Royal Society of Chemistry 2022 |