Jingbai
Li
a,
Patrick
Reiser
b,
Benjamin R.
Boswell
d,
André
Eberhard
c,
Noah Z.
Burns
*d,
Pascal
Friederich
*bc and
Steven A.
Lopez
*a
aDepartment of Chemistry and Chemical Biology, Northeastern University, Boston, MA 02115, USA. E-mail: s.lopez@northeastern.edu
bInstitute of Nanotechnology, Karlsruhe Institute of Technology, Karlsruhe, Germany. E-mail: pascal.friederich@kit.edu
cInstitute of Theoretical Informatics, Karlsruhe Institute of Technology, Karlsruhe, Germany
dDepartment of Chemistry, Stanford University, Stanford, CA, USA. E-mail: nburns@stanford.edu
First published on 10th March 2021
Photochemical reactions are widely used by academic and industrial researchers to construct complex molecular architectures via mechanisms that often require harsh reaction conditions. Photodynamics simulations provide time-resolved snapshots of molecular excited-state structures required to understand and predict reactivities and chemoselectivities. Molecular excited-states are often nearly degenerate and require computationally intensive multiconfigurational quantum mechanical methods, especially at conical intersections. Non-adiabatic molecular dynamics require thousands of these computations per trajectory, which limits simulations to ∼1 picosecond for most organic photochemical reactions. Westermayr et al. recently introduced a neural-network-based method to accelerate the predictions of electronic properties and pushed the simulation limit to 1 ns for the model system, methylenimmonium cation (CH2NH2+). We have adapted this methodology to develop the Python-based, Python Rapid Artificial Intelligence Ab Initio Molecular Dynamics (PyRAI2MD) software for the cis–trans isomerization of trans-hexafluoro-2-butene and the 4π-electrocyclic ring-closing of a norbornyl hexacyclodiene. We performed a 10 ns simulation for trans-hexafluoro-2-butene in just 2 days. The same simulation would take approximately 58 years with traditional multiconfigurational photodynamics simulations. We generated training data by combining Wigner sampling, geometrical interpolations, and short-time quantum chemical trajectories to adaptively sample sparse data regions along reaction coordinates. The final data set of the cis–trans isomerization and the 4π-electrocyclic ring-closing model has 6207 and 6267 data points, respectively. The training errors in energy using feedforward neural networks achieved chemical accuracy (0.023–0.032 eV). The neural network photodynamics simulations of trans-hexafluoro-2-butene agree with the quantum chemical calculations showing the formation of the cis-product and reactive carbene intermediate. The neural network trajectories of the norbornyl cyclohexadiene corroborate the low-yielding syn-product, which was absent in the quantum chemical trajectories, and revealed subsequent thermal reactions in 1 ns.
Unravelling the origin of chemoselectivity and stereoselectivity in organic photochemical reactions is challenging because of the short-lived molecular excited states. Quantum chemical calculations offer insight into the bonding changes that occur along a reaction coordinate and non-adiabatic molecular dynamics (NAMD) simulations to gain mechanistic insights and develop structure–reactivity relationships in complex photochemical reactions. The nuclei-electron coupling and time-dependent terms increase the complexity of Hamiltonian in NAMD and, thus, computation time. Multiple methods developed in the last two decades simplified the time-dependent molecular wave functions, e.g. ab initio multiple spawning (AIMS),8,9 and fewest switches surface hopping (FSSH).10–12 However, computing high dimensional PESs with the requisite multiconfigurational methods is extremely resource-intensive. For example, most quantum chemical software packages encode an upper limit to active space of 16 electrons and 16 orbitals for complete active space self-consistent field (CASSCF) calculations. A typical NAMD experiment requires thousands of such calculations, resulting in a maximum simulation time on the order of 1 picosecond, at a computational cost of approximately 101–104 wall-clock hours. Unfortunately, many direct excitation photochemical reactions have low quantum yields and long excited-state lifetimes. This results in prohibitively expensive NAMD simulations, thus limiting broad understanding of excited-state structure–reactivity relationships in photochemistry.
An increasing number of studies reported that fitting machine learning (ML) potentials could substantially accelerate the NAMD simulation.13–15 Aleotti et al. have parameterized ad hoc force fields for a 10 ps dynamic simulation of azobenzene.16 Westermeyr et al. have trained multilayer feedforward neural networks (NNs) to enable 1 ns simulation of methylenimmonium cation (CH2NH2+) in 59 days.17 Later, they have extended the application of the NN model in predicting excited-states electronic properties for other small molecules, such as SO2 and CSH2 employing a deep continuous-filter convolutional-layer neural network, SchNet, combined with SHARC.18 Recently, Ha et al. have trained the SchNet model to study the excited-state dynamics of penta-2,4-dieniminium cation.19 Applying ML-based NAMD on complex molecules continues to challenge theorists because of additional conformational flexibility and increase in available degrees of freedom. Training ML models becomes increasingly difficult with the rapidly growing required size of training datasets, especially when based on atom-wise molecular representations.
Here, we address the need for our combined ML and NAMD approach (ML-NAMD) that expands the scope of organic photochemical reactions. We demonstrate the high accuracy in the trained NN and low computational cost in the ML-NAMD simulations. Our trajectory statistics support that the ML-NAMD predicted photochemical product distribution is in good agreement with quantum chemical results. Our initial ML-NAMD simulations focused on the first such photodynamics of hexafluoro-2-butene (1), which is mechanistically well understood due to its characteristic vertical ππ* excitation from HOMO to LUMO (the π and π*-orbital). Hexafluoro-2-butene is a nontoxic and not flammable industrial working fluid used as a refrigerant and a foam-blowing agent.20 We also applied the ML-NAMD simulations to the 4π-electrocyclic ring-closing of norbornyl cyclohexadiene (3), a previously unknown and complex photochemical system.
In the cis–trans isomerization of trans-1, we selected an active space of 2 π-electrons and 2 π-type orbitals (i.e. (2,2)) for the CC bond. The geometries of trans-1 and cis-1 were optimized with the cc-pVDZ basis set.22 A vibrational analysis confirmed only positive frequencies. We located two minimum energy crossing points (MECPs) corresponding to the trans → cis (MECP-trans-1) and the reverse reaction (MECP-cis-1). The NAMD simulations with CASSCF(2,2)/cc-pVDZ used the fewest switches surface hopping algorithm (FSSH)10–12 in Hammes-Schiffer/Tully (HST) scheme10 with decoherence correction of 0.1 Hartree23 implemented in OpenMolcas 19.11.21 We generated 1500 initial geometries and associated velocities near the equilibrium geometry of trans-1 with Wigner sampling at 300 K. The NAMD trajectories were propagated at 300 K (Nosé–Hoover thermostat24) from the S1 Franck–Condon region for ∼500 fs with 0.5 fs timesteps. It is important to note that applying a thermostat can bias the excited-state dynamics because the excited-state lifetime of a few hundred femtoseconds often does not suffice for thermalization. We used a thermostat to reduce the kinetic energy gained from surface hopping, which compensates for the overestimated excitation energy of CASSCF calculation. We collected 1371 of the 1500 trajectories that reached the ground-state within 500 fs, whereas the others remained in the excited-state or showed molecular structures with unexpected broken bonds resulting from incorrectly converged CASSCF calculations.
The 4π-electrocyclic ring-closing of 3 used 4 π-electrons and 3 π-type orbitals active space (i.e., (4,3)) for the conjugated bonds. As Martinez and co-workers suggested in a previous study on cyclohexadiene,25 we removed the π*-orbital in A2 symmetry to ensure consistent CASSCF state ordering with CASPT2 calculations. The geometries of 3 and possible ladderene products were optimized with the ANO-S-VDZP basis set.26–29 Frequency calculations showed all positive values. We set up the same NAMD simulations for 3 with CASSCF(4,3)/ANO-S-VDZP while the simulations are done in 1000 fs. We only propagated 250 trajectories at 300 K because the NAMD simulation time increased significantly to 17 days. 240 of the 250 trajectories were completed at the ground-state and were used for analysis.
Fig. 1 Three initial training set generation approaches in PyRAI2MD. Wigner sampling explores the conformational space near the equilibrium geometries. Geometry interpolation collects the data along with reaction coordinates from reactant to product. Trajectories samples the data from quantum chemical trajectories. The 2D PES data can also supplement the data sampling. Detailed information about the initial training set generation is available in the ESI.† |
Our initial training set generation scheme combines Wigner sampling, geometrical interpolation, and trajectories. The Wigner sampling approach generates training data by sampling reactant and product structures. The geometry interpolation approach accesses the reaction coordinate diagram with the optimized reactant, product, and minimum energy crossing point (MECP) geometries. It systematically varies the reactant geometry to that of the minimum energy crossing point and then to that of the product in equal increments of Z-matrix coordinate parameters. We expanded the range of sampled structures in the training data by applying the Wigner sampled geometrical perturbations to the interpolated reaction coordinate diagram. The trajectories approach samples conformations from short-time quantum chemical NAMD trajectories (50–100 fs). For instance, we included every 10th snapshot of the first 50 fs of 132 CASSCF NAMD trajectories of 1 corresponding to the trans → cis isomerization.
Fig. 2 The computational architecture of PyRAI2MD. The bold boxes are the initial stage in NAMD and ML kernel. |
The workflow in the NAMD kernel starts with the Wigner sampled initial conditions. The following procedures, red arrows in Fig. 2, iteratively compute the energies, forces, NACs, and surface hopping probability, propagating the NAMD trajectory. We generalized the input and output format of computing energies, forces, and NACs in the NAMD kernel. This feature enables efficient communication between the ML kernel and our chosen external quantum chemical program, OpenMolcas 19.11.21 The workflow in ML kernel reads a training set or an existing model. It provides a convenient interface to train models and make predictions marked by blue arrows in Fig. 2. The green arrows in Fig. 2 illustrate the adaptive sampling workflow incorporating the NAMD and ML kernels. It first loads two trained ML models to run multiple trajectories in the NAMD kernel. The trajectories explore the unsampled conformational space while the two ML models evaluate the prediction uncertainty on-the-fly, as described in Section 2.4 and ESI.† The sampled data were added into the initial set to train a new model.
This pathway is consistent with prior theoretical studies on ethylene intramolecular hydrogen migration reactions.38–40 To test whether the NN-driven adaptive sampling method can discover the missing data for the alternative carbene pathway without prior knowledge or human bias, the initial training set only included the information about the trans → cis isomerization of trans-1.
Hyperparameters | Energies, forces | NACsa | ||
---|---|---|---|---|
NN1 | NN2 | NN1 | NN2 | |
a Here the NAC only represents the interstate coupling term. b The MAE and R2 of forces are shown in the parenthesis. | ||||
Hidden layers | 3 | 5 | 5 | 3 |
Neurons/layer | 400 | 300 | 300 | 600 |
Batch size | 64 | 128 | 128 | 128 |
MAE | 0.023(0.14)b | 0.025(0.15)b | 0.182 | 0.170 |
R 2 | 0.9989(0.9357)b | 0.9984(0.9536)b | 0.7137 | 0.7423 |
The MAE of predicted energies is 0.023–0.025 that meets the ‘chemical accuracy’ threshold (1 kcal mol−1 = 0.043 eV). The R2 values are 0.9984–0.9989 in NN1 and NN2, suggesting an almost linear correlation between the NN prediction and QC reference. The MAE and R2 in force calculation are 0.14–0.15 eV Å−1 and 0.9357–0.9536. These values are consistent with previous reports on SO2 and CH2NH2+ photochemistry.18 The MAE and R2 in the interstate coupling term of the NACs are 0.170–0.182 eV Å−1 and 0.7137–0.7423, respectively. The low R2 relative to energies and forces suggests possible overfitting of NACs (the training MAE are 0.014–0.015 eV Å−1 and the training R2 are 0.9969–0.9970) due to the indefinite term of the antiderivative of NACs and unavailable training data for it.
The state populations of QC trajectories (Fig. 3a) suggest the S1 half-life is 33.5 fs. The NN FSSH trajectories depending on predicted NACs show rather slower decay of the S1 state. The S1 half-life is 71.0 fs and the MAE in population between the NN FSSH and QC trajectories is 0.105. The NN ZNSH trajectories using NN energies and forces predict that the S1 half-life is 29.0 fs with the MAE in population of 2.5%. The good agreement between the NN ZNSH and QC trajectories indicates the NN energies and forces are sufficiently accurate to detect surface hopping events. The overestimated S1 lifetime in NN FSSH trajectories is attributed to the errors in the NAC model. It seems to underestimate the surface hopping probability, which keeps trajectories at S1 in a longer time until it finds a smaller energy gap for a hop to S0. The S1/S0 gap is similar for QC FSSH and the NNs ZNSH (0.51 eV and 0.46 eV, respectively) while the NN FSSH is 0.12 eV. Due to the closer match, we focused on the ZNSH trajectories in the subsequent discussions.
We tested the limits of our newly established ML-NAMD model by running unprecedented 10 ns simulations with a 0.5 fs time step. We obtained 89 NN ZNSH trajectories and the average state populations are shown in Fig. 3b. The state populations in the first 500 fs also include the 5820 trajectories mentioned above. The non-radiative S1/S0 relaxation completes within 1 ps. The flat population curve after 103 fs indicates all trajectories stayed in S0 up to 10 ns, without surface hopping up to the S1 state. The total 2 × 107 iterations were accomplished in an average of 50 hours on a single CPU. The 10 ns simulation with CASSCF(2,2)/cc-pVDZ would otherwise require approximately 58 years of wall time. We have included the ML-NAMD trajectory video in ESI.†
To quantify the difference between the photochemical reaction pathways enumerated by the QC and NN trajectories, we plot the trajectories corresponding to the trans → cis isomerization of trans-1 and their energies with respect to the ∠H–C–C–H and ∠C–C–C–C dihedral angles in Fig. 4.
Fig. 4a illustrates the trace of the QC trajectories that bifurcates toward cis-1 and trans-1. All trajectories started with a narrow spreading in the range of ∠C–C–C–C = 150–180° and ∠H–C–C–H = 0–60° (upper right corner). 248 surface hopping events turned the trajectories along the direction of the ∠H–C–C–H axis toward cis-1. 698 trajectories took the same path returning to trans-1. The trans:cis ratio is 2.8:1 in the QC trajectories. The NN predicted trajectories in Fig. 3b recreated the topology of the reference. The NN predicted 772 trajectories transformed to cis-1 and 2680 trajectories back to trans-1, resulting in a corresponding ratio of 3.5:1. The NN and QC trajectories show virtually identical evolution of the average ∠C–C–C–C and ∠H–C–C–H angles during the 500 fs simulation (Fig. S14†).
The QC and NN trajectories undergoing the trans → carbene pathway are shown in Fig. 5a and b with 5 snapshots from a QC trajectory in Fig. 5c, including the surface hopping point.
425 QC trajectories formed 2 in a trans:carbene ratio of 1.6:1. The trajectories in Fig. 5a display that the transformation shared the path with trans → cis isomerization from the Franck–Condon region to the S1/S0 crossing region. The S1/S0 surface hopping happened in the area of ∠C–C–C–C = 150–180° and ∠H–C–C–H = 0–60°. In Fig. 5c, the hydrogen migrated to the other carbon atom from 58 to 70 fs. The ∠H–C–C–H angle, 50° at 58 fs, increased to 86° at 70 fs. The CC π-bond broke resulting in a nonplanar geometry (∠C–C–C–C = 164° and ∠H–C–C–H = 113° at 75 fs). We intentionally omitted the trans → carbene pathway data in the initial training set, yet the adaptive sampling located 347 structures with significantly broken C–H bonds (>1.40 Å) resembling 2 (5.6% in the final training set). Fig. 5b shows the 2368 NN trajectories toward 2. The predicted trans:carbene ratio is 1.1:1.
We analyzed the crossing region and surface hopping points for the QC and NN trajectories. Fig. 6 shows the spatial distribution of the latest S1/S0 surface hopping point in each trajectory and examples of surface hopping geometries along the S1/S0 crossing seam.
Fig. 6a projects the latest S1/S0 surface hopping points from the CASSCF(2,2)/cc-pVDZ trajectories into the 2D conformational space of 1. The dense areas represent the S1/S0 crossing region as defined by the two reaction coordinates. In the density map (Fig. 6a, right panel), the surface hopping points accumulated at two regions, ∠C–C–C–C = 170°; ∠H–C–C–H = 60° and ∠C–C–C–C = 180°; ∠H–C–C–H = 40°, which have larger ∠C–C–C–C angels than MECP-trans-1. Beside the hopping points near MECP-trans-1, Fig. 6c shows two pyramidalized surface hopping geometries at A (∠C–C–C–C = 180° and ∠H–C–C–H = 80°) and B (∠C–C–C–C = 180° and ∠H–C–C–H = 20°). We noted C–H stretching surface hopping points at D (∠C–C–C–C = 180° and ∠H–C–C–H = 180°); the C–H distance is 2.24 Å. The surface hopping geometries at E (∠C–C–C–C = 110° and ∠H–C–C–H = 140°) have shown the formation of 2, which suggests that excited-state hydrogen abstraction is theoretically possible. Fig. 6b depicts the distribution and density of NN predicted S1/S0 surface hopping points. These surface hopping events occurred closely at ∠C–C–C–C = 180°; ∠H–C–C–H = 40° agreeing with those observed in QC trajectories. The significant increase in the number of NN trajectories provided an increased statistical significance of our crossing region analysis. The NNs located a twisting surface hopping geometry at C (∠C–C–C–C = 120° and ∠H–C–C–H = 90°). The density map (Fig. 6b, right panel) suggests it is a relatively rare surface hopping event as few points were located at C. The NN could detect the rare event because of the substantially increased number of trajectories (5820) with a nearly negible increase in computational cost.
Hyperparameters | Energies, forces | NACsa | ||
---|---|---|---|---|
NN1 | NN2 | NN1 | NN2 | |
a Here the NAC only represents the interstate coupling term. b The MAE and R2 of forces are shown in the parenthesis. | ||||
Hidden layers | 3 | 4 | 3 | 4 |
Neurons/layer | 400 | 300 | 500 | 300 |
Batch size | 128 | 128 | 128 | 128 |
MAE | 0.027(0.12)b | 0.031(0.13)b | 0.078 | 0.081 |
R 2 | 0.9996(0.9934)b | 0.9991(0.9858)b | 0.7956 | 0.5732 |
The MAE of the energy predictions for compound 3 are 0.027–0.031 eV, which approaches chemical accuracy (1 kcal mol−1 or 0.043 eV). The R2 values are 0.9991–0.9996 and demonstrate a good correlation between the predicted and the target energies. The MAE of the forces ranges from 0.12–0.13 eV Å−1, with R2 values ranging from 0.9991–0.9996. They are 0.02 eV Å−1 smaller than in case of compound 1, suggesting improved fitting of forces. The MAE of the S1/S0-coupling are 0.078–0.081 eV Å−1, which are smaller than 50% of that of 1. Nevertheless, possible overfitting of the S1/S0-coupling still occurs as the R2 values (0.5723–0.7956) are relatively low in the validation set (the MAE and R2 are 0.036–0.040 eV and 0.9217–0.9452 in the training set).
We compared the QC and NN state populations to assess the errors of our trained NN in predicting excited-state dynamics of 3 (Fig. 7).
Fig. 7 shows that the S1 half-life of the QC trajectories is 108 fs. The NN FSSH trajectories overestimated it to be 175 fs. The resulting MAE in population between NN FSSH and QC trajectories is 14.8% in the first 500 fs. In contrast, the NN ZNSH trajectories agree better with the QC trajectories. The predicted S1 half-is 105 fs and the MAE in population is 4.5%. The delayed surface hopping events in the NN FSSH trajectories resulted from the underestimation of NACs as discussed in Section 4.3. This causes the NN FSSH trajectories to spend longer times on approaching regions with smaller energy gaps. The average surface hopping S1/S0 energy gap is 0.22, which is notably smaller than 0.45 eV in the QC trajectories (0.26 eV in the NN ZNSH trajectories). Thus, the following discussion of the ML-NAMD simulations of 3 will focus on the NN ZNSH trajectories.
The QC and NN trajectories are characterized with the distance R between the two carbon atoms closing the cyclohexadiene ring and the bending angle θ of the cyclohexadiene plane, shown in Fig. 8a and b. The optimized geometries of observed products and intermediates are shown in Fig. 8c.
The QC and NN trajectories show two types of pathways. We distinguished the syn- and anti-pathways with the bending angle of the cyclohexadiene plane, θ = 120–150° and θ = 210–240°, respectively. 2 of 240 QC trajectories formed anti-4 in 1 ps corresponding to a low yield of 0.8%, while a pathway to syn-4 is absent due to the insufficient number of trajectories. Our calculations suggest that the major outcome of the photoreaction of 3 return to the reactant (Fig. S19†). The higher experimental yield of anti-4 (28%) and syn-4 (4%) is because the reactant can be re-excited to have another chance of ring-closing during the 4 hours of irradiation. Our QC-NAMD simulations also suggest a less direct pathway involving reactive intermediates, 3a and 3b (Fig. 8a and c). The QC trajectories suggest a yield of 3a and 3b are 13% and 7% (Fig. 8a). Among those non-productive trajectories (Fig. S19†), 44% of the trajectories involved syn-configurations, and 35% formed anti-configuration before they reverted to 3. Collectively, from the S1-FC regions, 57% of the trajectories followed the syn-pathway, and 43% of the trajectories pursued the anti-pathway. The preferential syn-pathway in the QC-NAMD simulations agree with the minimum energy path (MEP) calculations of 3 (Fig. S17†) where the syn-pathway is steeper descent than the anti-pathway.
The increase in NN trajectories in Fig. 8b identified a pathway to syn-4, with a 0.2% yield. The yield of anti-4, 3a, and 3b are 0.7%, 11%, and 5%, respectively, resulting in 84% non-productive trajectories reverting to 3. The formations of syn-4 and anti-4 start near 100 fs and complete near 200 fs (Fig. S20†). We compare the yield of syn-4 and anti-4 in the NN trajectories and the experiment by zooming in the 100–200 fs region and normalizing the yield and time scale. The predicted yield curves show excellent agreement with the experiment data (Fig. S21†). Among the non-productive trajectories, 53% preferred the syn-pathway, and 31% went through the anti-pathway (Fig. S19†). Overall, the NN trajectories are consistent with QC trajectories and MEP calculations that the syn-pathway (64%) is preferred over the anti-pathway (36%). It is also worth mentioning that we had not provided any information about the intermediates 3a and 3b in the initial training set since we did not recognize them until running QC-NAMD. The NN effectively collected the necessary training data through adaptive sampling.
We then analyzed the QC and NN surface hopping geometries to understand the behavior of NN trajectories in the crossing regions. Fig. 9a and b plot the distributions and density of the last S1/S0 surface hopping geometries in the QC and NN trajectories. Fig. 9c illustrates examples of three surface hopping geometries.
The 2D projection of surface hopping geometries from the QC trajectories shows three distinct regions (Fig. 9a). The crossing region around F and H resemble the MECP geometries along the syn- and anti-pathway, MECP-syn-3 and MECP-anti-3 (Fig. S16†). These geometries show shortened C–C distances (R = 2.2–2.6 Å) and bent cyclohexadiene plane (θ = 140–160° near F and θ = 210–230° near H) that are distorted toward the 4π-electrocyclic ring-closing. The region F is denser than H indicating the syn-pathway is preferred. Another crossing region G corresponds to cyclohexadiene-like surface hopping geometries (R = 2.8–3.2 Å, θ = 150–210°). These surface hopping events occur via the stretching of the cyclohexadiene plane. Fig. 9b plots surface hopping geometries in the NN trajectories. The distributions are similar in the three crossing regions. According to the density map, the NN tend to predict more surface hopping at the ring-closing regions (F and H) than the stretching region (G).
In the non-productive trajectories, we observed thermal conversions from 3a and 3b to 3. Fig. 10 depicts the QC trajectories snapshots showing the conversion from 3a to 3 after it hops to the ground-state.
Fig. 10 Snapshots of the formation of transient intermediate 3a in a CASSCF(4,3)/ANO-S-VDZP trajectory. The trajectory first formed 3a at 122.5 fs and then converted to 3 at 331.5 fs. |
The snapshot in Fig. 10 shows a first appearance of 3a is at 122.5 fs. It completely transformed to 3 at 331.5 fs suggesting a lifetime of 209 fs. The trajectory statistics concludes that 0.8% of the QC trajectories undergo 3a → 3 conversion and 0.8% go for 3b → 3. The NN trajectories confirm the 3a → 3 and 3b → 3 thermal conversions in 5% and 3% of the trajectories, respectively. The remaining 3a and 3b at the end of the NAMD simulations suggest the lifetime of 3a and 3b can be longer than 1 ps. That encourages us to explore subsequent thermal reactions of the photodynamics of 3 in nanosecond scale.
We continued to propagate 984 NN trajectories from 1 ps to 1 ns, which completes in just 4.8 hours per trajectory. We monitored the interconversion from 3a and 3b to 3 with the ∠H–C–C–H dihedral angles a, as shown in Fig. 11.
The top panel in Fig. 11 illustrates trajectories that formed the aforementioned short-lived 3a and 3b, which convert to 3 within 1 ps. The dihedral angle a is below 60° in the initial 100 fs. The formation of 3a and 3b rotate the CC bond leading to a = 180° and the continued conversion to 3 recovery the planar cyclohexadiene with a < 30°. The bottom panel in Fig. 11 shows the trajectories involved with a long-lived 3a and 3b. The trajectories retain the dihedral angle a = 180° at 1 ps. By tracking reaction coordinate a, we note that the earliest conversion occurred right after 1 ps and the last one appeared at 0.9 ns. In the end of 1 ns simulation, the yield of 3a reduced to 1% and the 3b completely converted to 3. Measuring the ∼0.9 ns lifetime of these reactive intermediates would not be possible without ML-NAMD simulations; this dynamic effect offers important mechanistic insight into photochemical ladderene formation reactions.
For the cis–trans isomerization of trans-1, the NNs effectively accelerated the computations of energies, forces, and NACs for trans-1 in 3.4 × 104-fold compared to the CASSCF calculations. The 10 ns ML-NAMD simulations from the S1 Franck–Condon regions of the Wigner sampled initial conditions at 300 K on a single CPU completed in 2 days 2 hours on average of 89 trajectories, which would have approximately required 58 years for running the CASSCF(2,2)/cc-pVDZ calculations. The analysis of 5820 NN trajectories in 500 fs shows consistent results with the QC trajectories, predicting the formation of cis-1 (trans:cis = 3.5:1) and 2 (trans:carbene = 1.1:1). The NN acceleration becomes more significant (2.1 × 105-fold) for complex organic molecules such as 3 because of the considerably increasing cost of CASSCF calculations with more electrons and larger active space. Our results of 3954 NN trajectories in 1 ps confirmed the low-yield of syn-4 (0.2%) and anti-4 (0.7%) that support our experimental results. The continued 1 ns NN photodynamics simulations revealed almost complete thermal conversion from 3a and 3b to 3, which explain their absence in experiment.
We have demonstrated that the PyRAI2MD ML-NAMD approach is able to not only reproduce experimental results, but can be used to identify new light-induced mechanistic pathways. The identification of reactive intermediates such as the carbene 2 and isomers 3a and 3b help to explain experimental results. Our study provides a path forward to quantify the lifetimes of reactive long-lived excited-states and reactive intermediates currently inaccessible with quantum chemical methods alone. We anticipate that these PyRAI2MD-enabled simulations will enable other researchers to understand photochemical reactivities and selectivities with thousands of nanosecond trajectories, facilitating connections between excited- and ground-state mechanistic pathways.
Footnote |
† Electronic supplementary information (ESI) available: Machine learning model, the initial training set generation, NACs phase corrections, grid search, and adaptive sampling, the computational cost of ML-NAMD, data and code availability, the QC calculations of hexafluoro-2-butene and norbornyl cyclohexadiene systems, the Cartesian coordinates of optimized geometries, and the experimental details for the norbornyl cyclohexadiene system. See DOI: 10.1039/d0sc05610c |
This journal is © The Royal Society of Chemistry 2021 |