Seung Soo
Kim
and
Young Min
Rhee
*
Department of Chemistry, Korea Advanced Institute of Science and Technology (KAIST), Daejeon 34141, Korea. E-mail: ymrhee@kaist.ac.kr
First published on 20th February 2024
Interpolation of potential energy surfaces (PESs) can provide a practical route to performing molecular dynamics simulations with a reliability matching a high-level quantum chemical calculation. An obstacle to its widespread use is perhaps the lack of general and optimal interpolation settings that can be applied in a black-box manner for any given molecular system. How to set up the weights for interpolation is one such task, and we still need to diversify the approaches in order to treat various systems. Here, we develop a new interpolation weighting scheme, which allows us to choose the weighting coordinates in a system-specific manner, by amplifying the contribution from specific internal coordinates. The new weighting scheme with an appropriate selection of coordinates is proved to be effective in reducing the interpolation error along the reaction pathway. As a demonstration, we consider the photoactive yellow protein chromophore system, as it constitutes itself as an interesting target that bears long-standing questions related to excited-state dynamics inside protein environments. We build its two-state diabatic interpolated PES with the new weighting scheme. We indeed see the utility of our scheme by conducting nonadiabatic molecular dynamics simulations with the required semi-global PES based on a limited number of data points.
To reduce the computational cost associated with the direct evaluation of PES information, tactics of modeling PESs have been widely pursued through various fitting,18–22 interpolation,23–25 and machine-learning techniques.26–32 Among these, the PES models based on modified Shepard's interpolation (MSI)23,33–43 have been known for their data-driven features and formal simplicity. MSI approximates the semi-global PES as a weighted average of the Taylor expansions around all data points. Although the original model was proposed for gas-phase reactive dynamics,23,33,34 the scheme was continually extended to cover a wide variety of situations.44–57 The extensions include the interpolation of multiple and coupled electronic states for addressing excited-state dynamics simulations,46–49 and the linking with molecular mechanics (MM) potentials to efficiently model environmental effects in condensed-phase dynamics.51–53,56 The interpolation mechanics/molecular mechanics (IM/MM)54,58 approach and the electrostatically embedded multiconfigurational molecular mechanics/molecular mechanics50–52 can be considered as two examples with such extensions. In recent applications, such approaches were successfully employed to simulate the excited-state intramolecular proton transfer of a dye molecule in solution,59 nonadiabatic dynamics of fluorescent proteins,60 and excited-state dynamics of light-harvesting complexes.61–63 Thus, together with the recently booming machine-learning approaches,64–67 the MSI technique provides a potential framework for efficiently constructing PESs for condensed-phase molecular dynamics simulations. However, as in other data-driven approaches, the method is still challenged by two major aspects: (1) building a reliable dataset along a targeted reaction pathway, and (2) finding the optimal interpolation form for a given dataset. In this study, we will concentrate on the latter of these two challenges.
One of the key issues that determine the overall quality of MSI is the choice of the interpolating weight function.39,42 Usually, a weight is assigned to each data point and is chosen to be inversely proportional to the geometric distance from the data point. Thus, the primary concern in choosing the weight function is the definition of the weighting coordinates, whose differences for a pair of conformations are translated into the distance between the conformations. In fact, the choice of the weighting coordinates had been the focus of some early methodological studies.36,37,39,41 While internal weighting coordinates can be more directly related to the molecular force constant, as in conventional MM force fields, Cartesian weighting coordinates give a more uniformly distributed dataset points in space. A previous study showed that this uniformity can be important for faster convergence of PES construction, especially for planar systems.41 However, properly choosing weighting coordinates is essentially a chemistry problem and is related to the intrinsic character of the targeted system such as its reaction coordinates and molecular geometry. Indeed, a small geometric distortion in some molecules may induce a large change in their potential and derivatives, while in others, even a large distortion only generates a minor change. For instance, in the case of retinal chromophores in bovine rhodopsin,68,69 which has been repeatedly studied as a model system for photosensory proteins, the hydrogen out-of-plane (HOOP) coordinate has been pointed out as a critical mode during its ultrafast photoisomerization reaction.70,71 From a purely geometric perspective, the HOOP motion does not produce a large displacement and is therefore “insignificant” compared to the isomerization coordinate. However, multiple studies have suggested that the HOOP coordinate is important in controlling the orbital overlap between the conjugated π-fragments and that its velocity component strongly affects the direction of isomerization especially in the vicinity of its transition state.70,72,73 Moreover, studies that targeted biological chromophores in other photosensory proteins also reported similar observations regarding similarly defined coordinates.74,75 Therefore, when one models such chromophores with MSI, one will need to assign the interpolation weights such that special care is provided to the HOOP coordinate. More generally, it becomes necessary to work in a system-specific way toward assigning the weighting coordinates. The aim of this work is to try out such a framework on top of the conventional MSI method with an eventual purpose of establishing a generalized strategy.
Accordingly, in this work, we present a new weighting scheme to improve the quality of interpolated PESs. We apply the new scheme to construct a coupled diabatic PES model of the thioester derivative of p-coumaric acid in photoactive yellow protein (PYP).76,77 In the new scheme, the weighting coordinates are divided into two groups: conventional Cartesian weighting coordinates and additional internal one(s) that are intuitively chosen from a chemical sense in a system-specific way. In general, we expect one or two such coordinates will likely be enough in most cases and pick one for p-coumaric acid. By applying the scheme to this PYP chromophore, we show that the mixed and redundant weighting coordinates actually improve the overall quality of MSI, proving that the new weighting scheme provides more accurate yet simple enough formal framework for future applications of the MSI technique. With the constructed diabatic PES, we simulate nonadiabatic surface hopping dynamics of the PYP chromophore in an aqueous solution, and show that dynamic electron correlations are important for reliably simulating excited-state dynamics of the chromophore, which will likely be the case for other organic chromophores.78–80
UIM/MM = UIM + UMM + UIM–MM + Ustab | (1) |
![]() | (2) |
![]() | (3) |
Finally, the displacement vector Δn from the nth data point geometry Zndat to the current geometry Z is defined as
![]() | (4) |
![]() | (5) |
![]() | (6) |
In line with the previous report,53 we choose p = 6 for all one-part weight functions used in this study. Additionally, a little more sophisticated case involves a two-part weight function:37,57
![]() | (7) |
![]() | (8) |
Here, a is the number of such customized weighting coordinates, and δi,n(zi) is the effective distance with respect to the ith customized weighting coordinate, zi. One should note that this modified distance depends on both the Cartesian and internal coordinates of the current geometry. Of course, the Cartesian and the internal coordinates will be correlated in some way, but this correlation does not cause any mathematical trouble as long as the modified distance is a positive-definite function around the data point. We also note that the weighting distance with redundant coordinates has been successfully adopted in earlier reports.37,39 The definition of the effective distance is straightforward, and similar to eqn (4), it can be defined as
![]() | (9) |
In general, there is no limit on how many customized weighting coordinates can be used, and the number a should be selected to well represent the chemistry of the system of interest. In many situations, interesting dynamics rather follow a low dimensional space (e.g., reaction along a minimum energy path), and we expect that a single additional weighting coordinate may work well enough in many cases. In this case with a = 1, in combination with the simple form of the weight function (eqn (6)), the unnormalized weight can be written as
![]() | (10) |
Using the two-part weighting function (eqn (7)) is also possible, and we will indeed use it in analyzing the detailed effects of adopting the customized weighting coordinate in results and discussion. Throughout these analyses, the errors of the Taylor expansions from the data points that mainly participated the IM energy (eqn (2)) need to be monitored. Toward this end, the inverse participation ratio (IPR)35 can be adopted, which is defined as
![]() | (11) |
By definition, the IPR value converges to unity as the running geometry coincides with one of the data point geometries, while it reaches N as the distances from all data points becomes divergently large. Therefore, IPR represents the effective number of main contributors in a weighted average shown in eqn (2).
For MS-CASPT2 calculations, 32 upper SA-CASSCF orbitals were explicitly correlated following the standard implementation of MOLPRO. A level shift of 0.2 a.u. was employed to facilitate the convergence of MS-CASPT2. Similar to the SA-CASSCF case, the MS-CASPT2 energies and their analytical gradients were computed for all data points but neither second order derivatives nor nonadiabatic coupling vectors were calculated. Instead, the recently proposed α-CASSCF method91 was introduced to practically address the dynamic correlation in a more efficient way. In α-CASSCF, the SA-CASSCF energies are scaled with a single empirical parameter α. This parameter is set to minimize the RMS error between the relative α-CASSCF energy and the relative MS-CASPT2 energy at a few critical geometries. Following the earlier fitting procedures,91 the empirical parameter was determined to be 0.7 for the PYP chromophore. Thus, all the energies and derivatives obtained within the SA-CASSCF were rescaled with this empirical parameter to give the approximate second order Taylor's expansions for interpolated PES at the MS-CASPT2 level, and then the MS-CASPT2 energies and gradients were used to give the first order correction term to the approximate expansions. This approach can be regarded as the diabatic variant of the dual-interpolation scheme.33 The detailed explanation of this approach can also be found in the previous report.55
To utilize an interpolated PES in studying condensed-phase nonadiabatic dynamics, one needs to handle state-dependent electrostatic interactions involving the chromophore. This can be achieved with diabatic atomic partial charges.56 In this work, we adopted the restricted electrostatic potential (RESP) fitting approach93 in a state-specific manner as follows. First, the geometry was optimized at the MS-CASPT2 level in the S0 ground state. At the optimized geometry, RESP fitting was processed for all SA-CASSCF one-particle reduced densities as well as the transition densities to generate a set of reference diabatic atomic partial charges as listed in Table S2 (ESI†). The sizes of the diabatic partial charges are insensitive to geometric changes, and the calculated RESP charges were kept fixed at all geometries toward calculating the electrostatic interactions between the IM and MM regions. Following earlier works,56,60 the Lennard-Jones interactions between the two regions were assumed to be state-independent after adjusting the Lennard-Jones parameters for the PYP chromophore simply by referring to the generalized amber force fields (GAFF).94
During the course of IM/MM MD simulations, the integration time step was 0.5 fs and no LINCS constraints were used. After switching to the IM/MM potential, each starting configuration was relaxed in the ground state for 20 ps, and the relaxed configurations thus obtained were used in the subsequent excited-state IM/MM NAMD simulations. Physically, this tactic corresponds to simulating the condition right after the Franck–Condon transition of the equilibrium ensemble. At time zero, the trajectory started in the S1 state and was propagated for 4.0 ps with an integration time step of 0.5 fs.
![]() | (12) |
![]() | ||
Fig. 1 Chemical structure of the PYP chromophore with the definitions of the key internal coordinates, with τ(A–B–C–D) denoting the torsional angle involving the atoms A, B, C, and D. |
In the SA-CASSCF calculations, three orbitals with four electrons were selected for the active space. In fact, this choice has been proven to be useful in effectively modeling the charge-transfer diabatic states of monomethine dyes such as the anionic green fluorescent protein chromophore.101 Although the PYP chromophore does not belong to the group of monomethine dyes, we observed that the same choice was still valid for describing its charge transfer along α and β. Indeed, when we plot the active orbitals after Boys localization,102 we can clearly see that each one is well localized within the three well-distinguished fragments of phenoxy, bridge, and tail moieties (Fig. 2a). Furthermore, these localized active orbitals remain isomorphic in different geometries, and this isomorphism is particularly useful in defining the charge-localized diabatic states. Following the block-diagonalization formalism,103 the S0 and the S1 states can be transformed into two charge-localized diabatic states, which we name here as P- and Q-states, respectively. As the names suggest, the chemical properties of these diabatic states closely resemble the phenolic and quinonic resonance structures.104 The predominant electronic configuration in each charge-localized diabatic state is depicted in Fig. 2b. We also note that the similarly charge-localized diabatic states were defined for describing the charge transfer dynamics in monomethine dyes.101,105
![]() | ||
Fig. 2 Electronic structure of the PYP chromophore. (a) Boys-localized active space orbitals in representative geometries, obtained from SA2-CAS(4,3)SCF calculations. The orbitals remain isomorphic in planar, α-twisted, β-twisted, and carbonyl-twisted geometries. The definitions of the α and the β torsions can be found in Fig. 1. (b) The predominant electronic configurations in diabatic states with the Boys-localized active orbitals. |
The resulting diabatic potential energy curves along α and β are depicted in Fig. 3a. In the planar geometry, the two diabatic states have similar energies with a large diabatic coupling (∼2.0 eV). As the geometry twists along either α or β, each diabatic state converges into one of the adiabatic states, meaning that S0 and S1 charges localize in different patterns. In other words, charge transfer occurs along with the change in α or β torsion. Particularly, S1 becomes the Q state in the α-twisted geometry, while it becomes P in the β-twisted geometry. Of course, these charge-localization patterns are consistent with what was reported in an earlier quantum chemical study.104
Although SA-CASSCF may provide a correct description of the electron configuration associated with the charge transfer and π-bond breaking, its deficiency in dynamic correlation remains a problem. This is mainly important concerning reaction barriers corresponding to bond breaking. In the case of the PYP chromophore, an early analysis showed that the isomerization reaction barrier in S1 is largely affected by the inclusion of dynamic correlation.106 Of course, a natural remedy will be to adopt perturbative correction to recover the missing correlation, and we of course applied it in this study. However, using it every time we collect a data point for interpolation was deemed to be excessively costly especially in terms of database construction. Therefore, we combined it with the recently proposed α-CASSCF, an empirical correction to SA-CASSCF with a single parameter to closely mimic MS-CASPT2 results.91 The diabatic potential energy curves based on α-CASSCF and MS-CASPT2 are shown in Fig. 3b and c, and we can notice that the α-CASSCF diabatic potential energy curves indeed closely mimic the MS-CASPT2 ones, although α-CASSCF was originally developed with regard to adiabatic energies. We reason that at least for the PYP chromophore case the predominant electron configurations of the two diabatic states remain the same in both α-CASSCF and MS-CASPT2 and thus their adiabatic-to-diabatic transformation matrices will be very close to each other, resulting also in a good agreement in the diabatic picture. Of course, there are some discrepancies between α-CASSCF and MS-CASPT2, especially on the energy barrier related to β-twisting (Fig. 3b and c). How this discrepancy can possibly affect the excited-state MD simulations will be discussed in Section 4.4.
Finally, it is worth mentioning that a similar tendency was also found for the diabatic potential energy curves along the HOOP coordinate (Fig. S3, ESI†) with the Boys-localized active space orbitals remaining isomorphic even with HOOP-twisting (Fig. S4, ESI†). Thus, the current diabatization procedure appears to be robust at least for the changes in the three important internal coordinates denoted in Fig. 1.
The sampling conditions in all bins are summarized in Table 1. As shown in this table, the bins were distinguished with (i) the adiabatic electronic state for sampling simulations, (ii) the initial geometry of sampling simulations, and (iii) whether a restraining potential was used for sampling to prioritize a specific region in the conformational space. The first bin was filled up by simulating in S0 toward sampling well in the vicinity of the Frank-Condon region. Of course, in this sampling, the trajectory was initiated from the S0-optimized geometry and no restraining potential was applied. Next, samplings were conducted on the S1 state to cover the α/β twisting pathways. Because there are four S1 state minima with clockwise and anti-clockwise twisting on both α and β, data points in their vicinities were sampled through four independent bins. Samplings for these bins were conducted with no restraint potential. Finally, a series of samplings with varying dihedral restraints were performed to cover the region between the S0-optimized geometry and the four S1-state minima. In these samplings, the initial configurations were generated from the S0-optimized geometry by rigidly rotating α and β into bin-specific values. Dihedral restraints were applied to concentrate the sampling toward the targeted vicinity. The details of dihedral restraints can be found in Table S3 (ESI†).
Bin index | State | Initial geometry | Restraining potential | ΔNa |
---|---|---|---|---|
a The number of the final data points. b There are four twisted minima in the S1 state. c The geometries were generated by rigidly rotating the S0-optimized geometry through α and β. Detailed sampling conditions for each bin can be found in Table S3 in ESI. | ||||
1 | S0 | S0-optimized | No | 100 |
2–5 | S1 | S1-optimizedb | No | 400 |
6–21 | S1 | Rigidly-rotatedc | Yes | 1600 |
The sampling MD simulations were conducted in the gas phase, starting with random initial velocities generated corresponding to a temperature of 150 K. The sampling was purely based on our earlier distance-based scheme.49,54,55 In this scheme, when the sampling trajectory reached a point where its Cartesian distances from all data points exceeded a pre-set cutoff distance, the simulation was stopped and the reached conformation was added as a new data point. The cutoff distance was set to 1.0 Å in all sampling. The sampling continued until 100 data points were collected in each bin, leading to 2100 data points in total. How the interpolation data points after the entire sampling are distributed in space is depicted in Fig. 4, using the projection on the (α, β) plane. We can see that our tactic of dividing into independent bins indeed promoted a relatively even distribution of the data points along the reaction pathway. Hereafter, we will refer to the interpolation database only with the initial points (N = 21) as the “initial interpolation database”, while we will call the completed database with N = 2100 as the “interpolation database”.
Fig. 5 presents the interpolated S1 PESs projected on the (β, HOOP)-plane in the vicinity of the S0-optimized geometry, with and without the customized weighting coordinate. For direct comparisons, the reference QM PES based on α-CASSCF is also drawn in the figure. The plots for the S0 state PESs can also be found in Fig. S5 (ESI†). From Fig. 5a, we can already see some additional irregularities on the PES without the customized weighting scheme, which are not eliminated even after increasing the database size. Especially, the two plots show qualitative differences in the upper right and the lower left regions, and the S1 state energy is significantly underestimated without using our new weighting scheme. This underestimation will seriously distort simulated trajectories toward fictitiously low energy sides,58 and this aspect clearly demonstrates the need for adopting the customized weighting coordinate. The irregularities do not disappear with the growth in the database size to N = 2100 in Fig. 5b and c, whether we adopt one-part or two-part weighting schemes, even though the underestimations in the S1 potential were partly alleviated. With this observation, it is natural to anticipate that blindly increasing the database size with the GROW scheme will not provide any reliability unless the target-customized weighting coordinate is introduced. One might wonder that the error could have been induced by a limited number of ill-behaved data points in the region with large HOOP values. Namely, the error might be more about the locations of the data points rather than the choice of the weighting coordinate. The lack of reliability even with the two-part weight function shown in Fig. 5c straightforwardly negates this possibility. In general, adopting the two-part weight function using confidence radii has been proven to be effective in reducing the error associated with ill-behaved data points. Indeed, for the PESs with customized weighting coordinates, improvements from Fig. 5b to c are quite noticeable. The procedures for determining the confidence radii for the two-part weight function can be found in ESI.†
While the contour maps of the PESs in Fig. 5 give us insights into the interpolation errors as functions of two key twisting degrees of freedom, it remains to be seen whether the observed interpolation errors may have meaningful impacts on the dynamics followed by MD simulations. To address this question, a series of test configurations were generated with MD simulations for additional comparisons of the interpolated PESs with and without the customized weighting coordinate. Because using one or the other interpolated PES for this configuration generation may not be fair for the comparison, we extracted 100 chromophore geometries from trajectory snapshots of a condensed-phase MD simulation based on a pure MM potential, by adopting the same MM-level simulation protocol as described in Section 3.3. Then, each geometry was further manipulated by rigidly rotating the α and the β torsional angles into pre-defined values to collect the actual test configurations that cover the entire α/β range. The pre-defined angle values were α = ±10, ±20, …, ±90 deg paired with β = 180 deg, and α = 0 deg paired with β = 90, 100, …, 270 deg, resulting in 37 angle pairs to generate a total of 3700 test configurations.
The IM errors with and without the customized weighting coordinate at these test configurations are presented in Fig. 6 for both S0 and S1 states together with their gap. The RMS errors can also be found in Table 2. While the error levels from the two PESs are almost the same in the case of S0, there is a significant difference with the S1 state. The lower level of errors with S0 can be easily understood from the chemical nature of the chromophore. The chromophore is more rigid in S0 and its PES is highly harmonic with a single energy minimum in 90° < β < 270°, as shown in Fig. S5 (ESI†). Thus, the S0 Hessian values will not vary much over the tested configurations, and even an incorrect weighting within interpolation will suffer to a lesser extent. In contrast, the S1 PES is much more complicated with multiple minima and strongly varying Hessian (Fig. 5), and correct weighting becomes more of an issue. Especially, due to the inevitable discrepancy between MM and IM potentials, some of the test configurations will unavoidably demand extrapolating situations, and Hessian errors induced by a less reliable weighting scheme will become a larger concern. This explains why the interpolation error is larger in S1 and how our scheme with the added weighting coordinate improves the surface fidelity. Finally, in the case of the S0–S1 gap energy, we can observe a significant cancellation of errors and the resulting RMS errors are as small as 0.050 eV and 0.088 eV, with and without the customized weighting coordinate. The error cancellation is related to spectator degrees of freedom such as C–H bond stretching, for which the associated interpolation errors are very similar in both electronic states. Similar cancellation of interpolation errors between S0 and S1 states was also observed in an earlier study.57
Customized weighting | S0 | S1 | S0–S1 gap |
---|---|---|---|
Not used | 0.108 | 0.174 | 0.088 |
Used | 0.103 | 0.137 | 0.050 |
Potential energy curves along the α/β coordinates of some selected test configurations are plotted in Fig. 7. In all cases in the figure, the most significant difference between the IM PESs with and without the customized weighting is observed for the β-twisting barrier in the S1 state. This is indeed expected from what we already observed in Fig. 5b. In any case, this implies that the IM errors resulting from not using our new scheme may have a serious effect on the excited state twisting dynamics in the condensed phase. In contrast, along α, there is almost no difference between the two interpolated potential curves whether the customized weighting is used or not; although likely in some extrapolation cases, there also are noticeable errors in the α-twisted geometries as shown in Fig. 7c, and the customized weighting again remedies the issue quite satisfactorily. These facts show that the added weighting coordinate, which is related to the HOOP coordinate, does not only improve the β-twisting behavior, but also the α-twisting.
![]() | ||
Fig. 7 Interpolated potential energy curves along α and β for some selected test configurations with (blue) and without (black) the customized weighting scheme. For comparison, reference QM results are also shown (red). Arrows mark the points that will be adopted for further analyses in Fig. 8. |
To characterize how our new weighting scheme improves the energy prediction with the same set of database, we picked up three configurations with large energy errors from Fig. 7. At each configuration, we estimated the Taylor expansion errors from all individual data points and associated them with their contributions to the final interpolation. The results are pictorially provided in Fig. 8. Here, for each of the three configurations, we are showing a series of properties: the conventional Cartesian distance between the configuration and a given data point (horizontal axis), the added distance between the same pair in terms of the customized weighting coordinate (vertical axis), how importantly the given data point is contributing to the final interpolation (circle size), and the size of the energy error in the Taylor expansion from the given data point (color). For the energy error, we adopted the S0–S1 gap energy. We are showing only the results from the top-contributing data points, whose number was determined from the IPR value in eqn (11). Fig. 8 clearly shows that the main contributors to the IM error without the customized weighting coordinate are the terms with small Cartesian distances and large customized weighting distances, regardless of the IPR values or the identity among the three chosen configuration. When the customized weighting is introduced in the interpolation, such erroneous points do not contribute any more, and the IM errors are reduced to below 0.2 eV.
![]() | ||
Fig. 8 Errors of the Talyor expansions on the S0–S1 gap energy from the mainly participating data points to the final Shepard's interpolation. The horizontal and the vertical axes denote the distances of each data point in terms of conventional Cartesian coordinates and in terms of the newly added customized weighting coordinate, respectively. Left/right panels show the cases without/with including customized weighting coordinate to the interpolation weights. The configurations in (a), (b), and (c) correspond to the points marked with arrows in Fig. 7. The size of a circle is proportional to the relative magnitude of the interpolation weight, and the size of the error is represented in colors. The IPR value for each weighted summation, which corresponds to the number of circles in each panel, is also shown for reference. |
Before moving to the next section, let us discuss the choice of the added weighting coordinates. In fact, one simple numerical test for this choice can be a correlation analysis between the interpolation error and the displacement in the direction of a particular internal coordinate by following the definition of δi,n in eqn (9), potentially without any tolerance parameter. Indeed, we performed this simple test based on Pearson's correlation coefficient with the test configurations, and the test did suggest that the torsion of H13–C11–C12–H15 would be the most plausible one. The values of the correlation coefficients are listed in Tables S4–S7 (ESI†). The largest correlation was found with the torsion of H13–C11–C12–H15 with a correlation coefficient of 0.3723, while the ones for most of the other degrees of freedoms were below 0.2. Interestingly, other internal coordinates with significant correlations were also closely related to the isomerization and/or the charge transfer of the chromophore: C2–O17 bond stretching (0.3326), C5–C11 bond stretching (0.3074), and H13–C11–C12–C14 proper dihedral (0.2962). Thus, our intuition-based selection can be mathematically rationalized quite well. More importantly, this test suggests a systematic way of selecting the customized coordinate(s) in a more general sense. Even still, we emphasize that the actual choice will still heavily rely on the intrinsic character of a given chemical system and its reaction pathway, and we expect that the decision should be made based on chemistry.
In relation to this, there is a possibility of directly adopting HOOP as the customized coordinate. Not surprisingly, the effective distance in HOOP also strongly correlated with the interpolation error. The computed correlation coefficient for the HOOP coordinate was 0.3361, at a slightly lower level than the H13–C11–C12–H15 torsion. Therefore, we expect similar results would be attained even when HOOP is selected as the customized coordinate. Besides, because HOOP is defined as a difference between two torsional angles, its use will likely introduce unnecessary complications compared to using H13–C11–C12–H15. We also emphasize that the definition of effective distance shown in eqn (9) can indeed inclusively handle the case of choosing a linear combination of multiple internal coordinates such as HOOP. In general, when one finds multiple internal coordinates that are similarly correlated with the interpolation error, adopting them together or taking their linear combination may be the right tactic to follow. We anticipate that some symmetric molecules would behave in this manner.
Probably, what is more important than each numerical value is the fact that the α-CASSCF level dynamics appeared more closely to the SA-CASSCF level dynamics, rather than the MS-CASPT2 one. This was caused by the fact that the β-twisting barrier is underestimated by both SA-CASSCF and α-CASSCF, which can be easily seen in Fig. 3 and was also already discussed in an earlier report.106 The hopping geometries projected on the (α, β) plane presented in Fig. 10 also show that a significant fraction of the S1 state population undergoes decay through the β-twisting channel with SA-CASSCF and α-CASSCF, whereas transitions through that channel are negligibly observed with MS-CASPT2. These aspects clearly underscore the necessity of comparative analyses on the reaction pathways obtained with different levels of theories.
![]() | ||
Fig. 10 Distributions of the hopping geometries from IM/MM NAMD simulations based on (a) SA-CAS(4,3)SCF, (b) α-CAS(4,3)SCF, and (c) MS-CAS(4,3)PT2 levels of theories. |
Because the energy gap is one of the key deciding factors for hopping with NAMD simulations, we further inspected the degree of agreements between the IM S0–S1 gap energies and their reference QM values at the collected hopping geometries. The results are provided in Fig. 11, and they show that there are no exceptionally outlying IM errors. The RMS errors from the correlation plots are 0.056, 0.020, and 0.033 eV for the cases of the SA-CASSCF, α-CASSCF, and MS-CASPT2, respectively. These RMS errors are comparable to the 0.050 eV value found in Table 2, computed with the 3700 test configurations. The increased error with SA-CASSCF is likely due to the level discrepancy between the one used for conducting IM/MM simulations and the one used for the database sampling. In the case of MS-CASPT2, additional errors could have stemmed from the approximate nature of the adopted dual-interpolation scheme.33 In fact, this effect of the dual interpolation is more pronounced on the IM errors in individual S0 and S1 state energies (Table S8, ESI†), but they tend to cancel out each other toward the S0–S1 gap estimations. In any case, the definitely low level of RMS error of 0.033 eV is encouraging toward performing NAMD simulations with the dual-interpolation scheme. Of course, one should still care to achieve an enough fidelity with enough data points along the possible excited-state decay channels.
Indeed, the excited-state dynamics of the PYP complex continues to pose long-standing questions from both experimental and theoretical perspectives. Specifically, for example, both the detailed twisting pathway of the PYP chromophore in the complex and the key interactions within the complex that control the reaction barrier along the twisting pathway remain elusive.71,106,107 With our new weighting scheme and our new potential model that is as reliable as up to the MS-CASPT2 level of theory, we expect that new simulations of the excited-state dynamics of PYP can be performed. We hope that this may present an opportunity to answer some of the puzzling questions lingering around PYP, and we will report such results in due course.
Footnote |
† Electronic supplementary information (ESI) available. See DOI: https://doi.org/10.1039/d3cp05643k |
This journal is © the Owner Societies 2024 |