Esmae J.
Woods
ab and
David J.
Wales
*b
aCavendish Laboratory, Department of Physics, University of Cambridge, Cambridge CB3 0HE, UK
bYusuf Hamied Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, UK. E-mail: dw34@cam.ac.uk
First published on 22nd November 2023
In this contribution we consider theory and associated computational tools to treat the kinetics associated with competing pathways on multifunnel energy landscapes. Multifunnel landscapes are associated with molecular switches and multifunctional materials, and are expected to exhibit multiple relaxation time scales and associated thermodynamic signatures in the heat capacity. Our focus here is on the first passage time distribution, which is encoded in a kinetic transition network containing all the locally stable states and the pathways between them. This network can be renormalised to reduce the dimensionality, while exactly conserving the mean first passage time and approximately conserving the full distribution. The structure of the reduced network can be visualised using disconnectivity graphs. We show how features in the first passage time distribution can be associated with specific kinetic traps, and how the appearance of competing relaxation time scales depends on the starting conditions. The theory is tested for two model landscapes and applied to an atomic cluster and a disordered peptide. Our most important contribution is probably the reconstruction of the full distribution for long time scales, where numerical problems prevent direct calculations. Here we combine accurate treatment of the mean first passage time with the reliable part of the distribution corresponding to faster time scales. Hence we now have a fundamental understanding of both thermodynamic and kinetic signatures of multifunnel landscapes.
In principle, we know how to calculate the full FPT distribution given a kinetic transition network of local minima and the transition states that connect them. Here, we assume that such a network has already been obtained using rare events techniques, either based on explicit dynamics10–18 or geometry optimisation in the discrete path sampling approach.19,20 If all the rates associated with transitions in the network are available, then we can set up a master equation for the global kinetics.21
The MFPT can be calculated deterministically using the graph transformation22–26 (GT) approach, which is numerically robust. Computation of the FPT either requires eigendecomposition of a transition matrix, or direct simulations based on kinetic Monte Carlo (kMC) methods.27–29 The dimensionality of the networks obtained in discrete path sampling typically contain between 104 and 106 local minima, which can be too large for eigendecomposition. However, we have shown that graph transformation can produce a renormalised network of amenable size where the FPT distribution is preserved, so long as key minima are retained.30 This partial graph transformation procedure forms the basis of all the calculations for larger networks in the present report.
Extracting the full FPT is problematical for systems that feature long time scales because the corresponding transition network is ill-conditioned. The linear algebra approaches that underpin eigendecomposition then break down because of numerical issues,24 while standard kMC methods encounter multiple recrossings (‘flickering’3,31) and require unfeasibly long trajectories. Kinetic path sampling (kPS),3,32 addresses the flickering problem using graph transformation to analyse escape from subnetworks, but requires a careful description of the metastable sets of nodes,31 which may itself be a difficult problem in practice.33–35 Here we focus instead on schemes to overcome the numerical problems of eigendecomposition.
We begin with a summary of the theory in Section II, starting from the underlying master equation. Next, we investigate the FPT itself in more detail, and show that peaks in the FPT may be associated with particular local minima in the landscape, providing an assignment for the specific relaxation modes (Section IIIA). This assignment uses the right eigenvector of the transition matrix. The left eigenvector allows us to scan the number of peaks in the FPT for all the possible sources, diagnosing starting conditions that would highlight alternative relaxation time scales (Section IIIB). We test this framework using a model landscape that features four distinct time scales, and show how the MFPT as a function of the observation time scale displays features that correspond to peaks in the FPT. The peak assignments are verified using direct kinetic Monte Carlo simulations, which are feasible in this test case if the temperature is not too low.
We then describe partial graph transformation for producing a renormalised network that approximately preserves the FPT with suitable dimensionality for eigendecomposition (Section IV). The renormalised network can be visualised using free energy disconnectivity graphs,36–39 which provide direct insight into the structure of the landscape and the competing relaxation time scales. The free energies are defined to reproduce the stationary distribution of the renormalised landscape30 and the transition rates between the states that are retained. Results are presented for a double-funnel37,40,41 atomic cluster.
The last two sections describe strategies for calculating the full FPT in cases where eigendecomposition breaks down. We first consider fitting individual peaks using results that cover a temperature range where the eigenvalues and eigenvectors are reliable (Section V). This approach is used to help validate an alternative hybrid scheme in Section VI. The hybrid approach combines the strengths of eigendecomposition and graph transformation, using eigenvalues and eigenvectors at the shorter time scale end of the spectrum, which remain accurate even when the results for slow relaxations are incorrect. Assuming that there is one missing slow relaxation, we can use the accurate MFPT obtained from graph transformation to deduce the missing part of the FPT.
The hybrid approach appears to be particularly powerful, and enables us to present results for the atomic cluster benchmark over 40 orders of magnitude. In particular, we can probe temperatures that cover the key change in morphology from icosahedral to close-packing.
(1) |
We denote the product states by the set and partition Ω as with . We may also specify initial states, as a subset of . The first passage time distribution to can be treated by setting all escape rates from to zero and considering the substochastic matrix for the subset of the full transition matrix Q containing the interstate transition rates within . is the corresponding subset of D including the escape rates to . The kinetics up to absorption are unchanged, so we analyse the master equation
(2) |
For a reversible Markov process eigendecomposition gives
(3) |
(4) |
Moments of the p(t) distribution can be calculated from eqn (3); in particular, the mean first passage time (MFPT) is
(5) |
(6) |
Our previous investigation showed how kinetic traps in the energy landscape produce signatures in the MFPT as a function of the observation timescale, tobs. Integrating θp(θ) up to tobs and including the corresponding normalisation z(tobs) gives:6
(7) |
(8) |
(9) |
(10) |
(11) |
The individual terms in the sum over eigenmodes that defines each have a maximum at y = lnλ−1, where the peak height is A/e, and the curvature (second derivative) is −A/e. Each term can be approximated quite well as λAexp[y − λexp(y)] ≈ Aexp[−(y + lnλ)2/2]/e, where the Gaussian has an integrated peak area about 8% less than the correct value.
For well separated time scales the positions and peak heights in correspond closely to the values for the individual terms. Since the peak height contains a factor that is the sum of right eigenvector components, wR(s′), with a second factor that depends on the overlap of wL and the initial probability distribution.
Hence the relative values of wR(s′) are related to the time spent in state s′, while the sum modulates the peak height in at lnθ ≈ lnλ−1. The components wR(s′) are not themselves observables, and they can have different signs. However, we find that the relative magnitudes do indeed correlate with the time spent in particular states when we break down the number of visits to each state for the different peaks.
To illustrate this correspondence we return to a model landscape that features four distinct relaxation time scales.6 The disconnectivity graph36,37 is shown in Fig. 1. The energy and all other quantities are measured in reduced units for this example. At a temperature T = 1.5 the longest kinetic Monte Carlo (kMC) trajectories do not exceed 1010 steps for a standard rejection-free27–29 scheme. For comparison with the eigendecomposition results 105 kMC trajectories were run, which is sufficient to converge the first passage time distribution. The number of visits to each local minimum was retained for each trajectory, providing a breakdown of the peaks in .
Fig. 1 Disconnectivity graph36,37 for a model energy landscape that features multiple kinetic traps.6 The scale bar corresponds to ten energy units and the local minima (sink) and (source) are marked. |
Fig. 2 presents the mean first passage time as a function of the observation time scale, calculated from eqn (7). The arrows in this figure correspond to the predicted plateau values from eqn (8), which correspond to time scales that progressively access deeper kinetic traps in the landscape as tobs increases. This figure also shows with the time scales corresponding to lnλ−1 and the plateau values for both marked. The steps in match the peak positions closely, except for the slowest relaxation. For this landscape A is essentially zero for all the modes except the slowest four, which correspond to A/λ values of 5.2, 2.0 × 106, 3.9 × 108, and 3.2 × 109, respectively.
Fig. 2 Results for the model landscape illustrated in Fig. 1 at T = 1.5, left: mean first passage time as a function of the observation time scale cutoff, tobs, calculated from the analytical eigendecomposition formulation in eqn (7). The red arrows mark the position of steps in predicted from eqn (8). Right: The probability distribution . The blue arrows correspond to the four longest time scales defined by lnλ−1 for eigenvalues of the Markov transition matrix . The red arrows correspond to the same times as for the steps in . |
The number of visits to the four sets of minima in Fig. 1 that constitute kinetic traps for relaxation to can be separated by assigning each of the 105 kMC trajectories to one of the four peaks in shown in Fig. 2. The results of this decomposition are shown in Fig. 3. We find that the four peaks can indeed be assigned to trajectories that access the traps in order of increasing depth. The longest trajectories for the peak centred at lnθ = 24 visit the deepest trap; the trajectories corresponding to the peak centred at lnθ = 20 visit all the traps except the deepest, etc.
Fig. 3 Breakdown of kinetic Monte Carlo results for visits to the four sets of minima that constitute kinetic traps for relaxation to in the landscape illustrated in Fig. 1. The four panels (a)–(d) are the visit statistics for kMC trajectories with first passage times corresponding to the four peaks in shown in Fig. 2, with (a) corresponding to the longest time scale, and (d) the fastest. Each panel contains separate histograms for the number of visits to the four traps, coloured orange, green, blue and red, which correspond to the four sets of minima from left to right and from highest to lowest energy in Fig. 1, starting with the two states that include the source . |
Having verified the peak assignment we can now compare this breakdown with the components of wR(s′) for the four eigenmodes associated with the four peaks. The objective is to determine whether the peaks could be assigned to local minima, s′, by inspecting the relative values of wR(s′), instead of performing kMC runs, which become computationally unfeasible as the temperature decreases and the relaxation time scales increase exponentially. For this analysis we scale the components of wR so that the largest value is 100. If we start from the slowest relaxation we find that a systematic assignment scheme appears to emerge. For this slowest mode the components for the two states in the deepest trap in Fig. 1 are 100, and the next largest magnitude component is less than 1. Moving to the peak at lnθ = 20, we find components of 100 for the nine states in the next deepest trap, values of −55 for the two minima we have just associated with the slowest relaxation, and a next largest value of 2.3. We therefore assign the third peak in to the minima in the third trap.
For the second peak at lnθ = 16, the ten minima in the second set of trap states have wR coefficients of 100, and the values for the states assigned to the third and fourth peaks are −70 and −21, respectively. Finally, for the small peak at lnθ = 5, it is the source state that has the maximum component of 100 (with 97 for the partner minimum, which is only connected to ), while the next largest magnitude is −21 for states assigned to the second peak.
Hence it appears that we can assign the peaks to states using the largest components of wR. For the example considered here, the next largest components in magnitude for wR have the opposite sign, and correspond to the states assigned to relaxation mode + 1, ordering the relaxation times from fastest to slowest. If the terms in the sum that contribute to can be treated separately, then the peak height at lnλ−1 for depends on the sum . The components of wR can take opposite signs, but our results suggest that the states making the largest contribution correspond to the kinetic trap that determines the time scale in question.
Having performed an eigendecomposition we can evaluate all the A for each source state and identify the eigenmodes that make a significant contribution to the sum that determines . Restricting the sum to terms above a given threshold provides an efficient calculation of the distribution (and its moments) for each source state, since the number of terms required is usually small. The number of peaks in above a chosen cutoff height can be counted during the calculation of each distribution, thus identifying sources that produce interesting structure.
Fig. 4 illustrates first passage time distributions to starting from a selected minimum in each of the four sets constituting kinetic traps. As the system is initialised in progressively deeper traps, the number of peaks decreases, and when a peak disappears its probability amplitude is transferred to the next fastest peak. For example, when the source is changed from to a minimum in the next set of states to the right in Fig. 1, the small peak around lnθ = 5 disappears, and the amplitude of the peak around lnθ = 16 increases slightly.
Fig. 4 First passage time distributions to starting from selected individual minima in the four different kinetic traps in Fig. 1 from left to right: the shallowest trap containing (solid grey), the second trap (dashed light blue), the third trap (dotted blue), and the deepest trap (dashdot dark blue). The contribution of each eigenvalue to the first passage time distribution depends on the initial condition of the network. |
(12) |
(13) |
The mean first passage time is certainly important, providing a phenomenological rate constant via the reciprocal. However, it cannot report on multiple relaxation time scales,6 which we associate with energy landscapes featuring multiple funnels and potentially multifunctional systems.7–9 Instead, the rate constant calculated in this way is likely to be dominated by the slowest relaxation process, which might lie beyond the experimental time scale. Enhanced kinetic Monte Carlo schemes, such as Monte Carlo with absorbing Markov chains (MCAMC)58,59 and kinetic path sampling,3,32,60 can provide the full first passage time distribution. However, they require a definition of metastable macrostates, and become computationally expensive when these states are large. Instead, we have recently introduced the partial graph transformation approach (pGT),30 and we showed how judicious choice of the set of removed states can yield a reduced network that retains the full first passage time distribution quite well.
The equilibrium occupation probability distribution for a pGT network Ω where nodes in set have been removed, , can be obtained by applying the balance condition.30 The result is
(14) |
(15) |
The first passage time (FPT) results presented in our previous contribution6 employed minimal subsets of much larger kinetic transition networks to demonstrate how multifunnelled landscapes with traps produce multiple features in . Here, we tackle the full database using the partial graph transformation approach, without using community definitions.
We have now found better ways to determine which states to retain with pGT, in order to properly preserve the full FPT distribution. Various possibilities were considered, and we will focus on the most successful approaches for brevity. The minimal starting information required is the identity of the source and sink states and the minima at the bottom of competing funnels in the landscape.
We first show how the simple model landscape shown in Fig. 1 can be reduced. We must retain at least one state from each kinetic trap, and the source and sinks and . However, we cannot arbitrarily choose which state to keep, because when a state is removed via GT, the waiting time for all the neighbouring states increases. This effect can increase the time taken to traverse a short-time path as states are removed. For example, the path corresponding to the shortest time peak in Fig. 2 corresponds to the direct transition . To protect this path from the effects of GT, we must keep all the neighbours of otherwise the waiting time to leave increases, and the shortest time peak shifts to longer times. For this simple landscape, we retain and all of its neighbours, which explicitly includes and at least one state from each kinetic trap. Visualisations of the resulting network are shown in Fig. 5, alongside the first passage time distributions generated from the full and pGT reduced networks, which are essentially identical.
Fig. 5 Partial graph transformation of the model landscape shown in Fig. 1. We retain state and all its neighbours, which explicitly includes the sink , and at least one state from each kinetic trap. (a) Disconnectivity graph for the reduced network, computed as described in Section IV. (b) and (c) Network visualisations showing connections between states, using Gephi,65 for the full and reduced networks, respectively. As states are removed their neighbours gain self-transitions. The colour scheme is the same as in Fig. 4. (d) The first passage time distributions for for the full (solid grey) and the reduced (dashed dark blue) network are essentially identical. |
To extend this process to larger, more complicated networks, we perform Dijkstra's shortest path algorithm62 for all pairs of states within the sources and sinks list, using edge weights equal to the branching probabilities in the original network, −lnBij.63,64 This analysis is equivalent to finding the path that makes the largest contribution to the rate constant when intervening minima are placed in steady state, in other words, the fastest path in the steady state regime.19 Using a straightforward depth-first algorithm we then calculate the minimum number of steps (via transition states) for every minimum to the minima in the fastest paths, with separate lists for the number of steps to the path endpoints, and to the full path. To study the convergence of the FPT as more minima are retained, states can be systematically added for an increasing number of steps (see the ESI,† for figures illustrating convergence).
Keeping minima on the fastest paths helps to preserve shorter time peaks in the FPT distribution, but retaining only these minima is insufficient. Retaining minima surrounding the fastest paths helps to protect them from the effects of GT and includes more routes for the flow of probability. Removal of a node via GT increases the waiting time for all directly connected nodes. If adjacent minima to the fastest paths are not retained then, as more minima are removed, the time taken to traverse the fastest path will increase.
To demonstrate the power of pGT, we revisit the nine community network presented in ref. 30, for which the disconnectivity graph is shown in Fig. 6. Instead of looking at transitions between metastable communities, we compute the FPT distribution between a high energy minimum and one of minima at the bottom of a basin (Fig. 7). When only the minima at the bottom of the nine trapping basins, plus the source minimum are retained, only the long time peaks in the FPT distribution are preserved. To maintain the full FPT distribution under pGT, we compute the fastest path between the source and the sink, which are labelled and in Fig. 6, respectively. We then additionally retain the fastest path and its first and second neighbouring minima, to produce a reduced network of 84 states, which is shown in blue in Fig. 6. The FPT distributions computed from the full 994 state network and the pGT reduced 84 state network match, which shows that we can reduce the size of a network by an order of magnitude, and still preserve the full FPT distribution. As for the previous model landscape, we work in reduced units. All these calculations correspond to T = 1, where eigendecomposition is tractable on the full 994 state network.
Fig. 6 Disconnectivity graphs36,37 for a model energy landscape consisting of nine traps.30 The scale bar corresponds to five energy units and the source and sink states and are marked. (a) Graph for the full system. The states that are on the fastest path between and and first and second neighbouring minima surrounding the fastest path, are marked in blue. (b) Graph for the pGT reduced system, computed using the method described in Section IV. We retain the global minimum of each trap, and all the states marked in blue in (a). |
Fig. 7 First passage time distributions for relaxation from to (left) and to (right), of the nine community network at T = 1.0, as shown in Fig. 6. The distributions are computed using eigendecomposition for the full network with 994 minima (thick grey line), and two different partial GT reduced networks consisting of, 10 states including the nine trap global minima and state (thin light blue line), and a network of 84 states that additionally contains the states on the fastest path between and and the first and second neighbouring minima surrounding the fastest path (dashed blue line). Only the minima at the bottom of the traps are required to retain long time behaviour, whilst minima on and around the fastest path help pick out the short time peaks. |
As well as reducing the network size, partial graph transformation can decrease the number of eigenmodes that contribute significantly to the first passage time distribution. The five eigenvalues smallest in magnitude agree to three significant figures for the full and reduced networks in Fig. 6. The slowest peak can be approximated by the eigenmode with the smallest magnitude eigenvalue, and the slowest two peaks can be reproduced from the modes corresponding to the five smallest magnitude eigenvalues. To reproduce the full distribution, the summation over eigenmodes in eqn (4) must include the 40 eigenmodes with the largest |A| values for the full network, compared to just 28 modes for the pGT reduced network. Alternatively, if we sum over the modes in order of decreasing magnitude for λ, 234 terms are required to reproduce the distribution for the full network, and only 39 modes for the pGT reduced system. (see ESI,† for more information).
Here we considered a database with 71887 minima in the connected component of interest, and 123561 transition states, which is available for download from the Cambridge Landscape Database.83 As in previous work, five low energy minima from the Oh funnel and 395 minima from the C5v funnel are chosen for the and states. Results corresponding to these sets are colour-coded red and blue , respectively, in the figures.
Our objective was to reduce the number of minima in the database using graph transformation as far as possible, while preserving the full first passage time distribution to a good approximation. We chose to run this part of the analysis at , which is somewhat higher than the equilibrium temperature for the Oh and C5v sets, but still low enough for broken ergodicity effects and separate relaxation time scales to be clearly manifested. At this temperature it is feasible to calculate converged first passage time distributions using a leapfrog kinetic Monte Carlo scheme6 and 200000 trajectories. We additionally ran 400000 trajectories for the first passage time to , and found that the distribution did not change. Standard kinetic Monte Carlo runs are feasible for the reduced network described below, and give very similar results for the first passage time distributions in each case, providing a further consistency check.
We investigate the FPT for relaxation from a high energy minimum to both the and sets. We compute the shortest (fastest) path, as defined in Section IVA, between all pairs of the lowest minima in and and the starting minimum at , producing a set of pathway minima, . We then find the distance in terms of the shortest discrete path from every other minimum to and separately, the distance to the combined set of the lowest minima in , and the starting minimum. We retain the 2500 states with the highest equilibrium occupation probabilities from the combined first- and second-neighbour set, from the latter definition, as well as every minimum in itself. The branching probabilities employed in the fastest path calculation are temperature dependent, and we analysed the effect of changing the temperature from . The retained states do not change very much over a wide range of T, and the effect on the calculated FPT is negligible (see ESI†). The results for this database of 2506 states are shown in Fig. 8, with the corresponding free energy disconnectivity graph in Fig. 9. The FPT results for the renormalised database can be converged even closer to the KMC results shown in Fig. 8 if more states are retained in the renormalised database (ESI†). Using the properties of the left and right eigenvectors of the transition matrix , as described in the previous section, we can now assign the peaks in this first passage time distribution and understand their heights.
Fig. 8 First passage time distributions for relaxation to the and (right) states of the LJ38 cluster at where ε is the pair well depth. The initial probability distribution corresponds to a single higher energy minimum, selected to highlight the separation of relaxation time scales. The histogram corresponds to 200000 kinetic Monte Carlo trajectories run for the full database with 71887 connected minima using the leapfrog algorithm.6 The solid curves were calculated using the eigendecomposition formulation in eqn (4) for a database of 2506 minima resulting from partial graph transformation, as described in the text. |
Fig. 9 Disconnectivity graph36,37 for the LJ38 database renormalised down to 2506 states, which produces the distributions in Fig. 8. The free energies of the minima and transition states are defined to reproduce the equilibrium occupation probabilities and rates for the network that results after partial graph transformation at , defined in eqn (15). Branches corresponding to minima in the original and sets are coloured red and blue, respectively. The source state for the FPT calculations is a higher energy minimum; its position in the graph is marked with a magenta arrow. |
Analysis of wR provides further insight into the first passage time distribution and global dynamics of the LJ38 cluster. In this case, we find that the peaks in Fig. 8 corresponding to the slow relaxation time scale can be assigned to a single eigenvalue of the transition matrix , and the components of the corresponding eigenvector are strongly localised on low-lying minima at the bottom of the competing funnel that acts as a kinetic trap. In fact, there are eigenvalues corresponding to slower relaxation times that do not produce features in , as we noted before.42 These eigenvalues correspond to minima that are almost disconnected from the rest of the network because they are separated by very high barriers. For most initial probability distributions is too small to produce a significant feature in for these modes. The right eigenvector is localised on the corresponding dead-end minimum (or minima), providing a straightforward interpretation.
In contrast to the simple model landscape shown in Fig. 1, the peaks corresponding to the faster relaxation time scales in Fig. 8 do not correspond to a single eigenvalue of . For relaxation to the fcc structure the fast peak at around lnθ = 3 is a result of contributions from 13 eigenmodes with |A/e| > 0.01. For relaxation to the states the largest peak around lnθ = 0.6 results from two eigenvalues with A/e values of 0.243 and − 0.154. The shoulder peak around lnθ = 4 results from 45 modes with |A/e| > 0.01, again including positive and negative contributions. The decomposition of the faster relaxation peak as a superposition results from the denser spectrum of eigenvalues for the transition matrix compared to the slow relaxation. We expect to see a similar effect in molecular and condensed matter systems if a slow relaxation time scale is associated with a well-defined rate-determining step, while faster relaxation can occur via a number of alternative pathways.
As we have found before, whilst full eigendecomposition of the transition matrix provides over 2000 eigenvalues, only a small number of them contribute significantly to the first passage time distribution. The distribution for relaxation to can be reproduced from eqn (4) by summing over only the 200 eigenvalues with the largest |A| values. In contrast, if we sum over the modes in order of decreasing magnitude for λ, we require 1934 terms to reproduce the FPT distribution (see ESI†).
This scan over sources following state reduction with partial GT provides an automated analysis of the dependence on initial conditions, which can then be used for further analysis and to suggest experiments that will highlight multiple relaxation time scales.
Some examples are shown for the LJ38 cluster in Fig. 10. is illustrated for four sources that produce multiple peaks, including the source that was used for the analysis in Section IVB. Identifying initial conditions that produce the most interesting structure in the first passage time distribution, coupled with assignments based on the right eigenvector, has enabled us to analyse kinetic transition networks for a wide variety of systems. Some further examples are presented below. New studies, ranging from proteins and nucleic acids the photosystem II light-harvesting supercomplex, will be presented elsewhere.
Fig. 10 First passage time distributions for relaxation to the state of the LJ38 cluster at for initial probability distributions localised in four different minima, as labelled in the legend. The selected minima were chosen to highlight structure in the distribution. Starting minimum 604 corresponds to the results in Fig. 8. |
For the model landscape in Fig. 1 accurate fits can be obtained with an Arrhenius form for λ and a simple polynomial for A (see ESI†). However, for the more realistic case of the LJ38 cluster fitting to the individual eigenmodes is problematic, because the spectrum of λ is far more dense, especially for the faster relaxations. However, we have found an alternative approach that works well. For a temperature range where the number of peaks in the FPT distribution is constant, we treat each peak position and height as if they were the result of a single eigenmode α with effective values of −lnλα and Aα/e. These effective amplitudes and eigenvalues provide good approximations to the true peaks, particularly when the range of eigenvalues contributing to a given peak is small. To test the procedure we compare the reference FPT distribution with the full fit using all temperatures, and for fits where the data is divided into two parts to produce high and low temperature fits. The performance of the high temperature fit at low temperature is the main point of interest, and some example results are shown for LJ38 in Fig. 11. Eigendecomposition was performed at regular temperature increments of 0.001 in reduced units, in the range from 0.1 to 0.15. The fitted distributions are shown for four temperatures, and indicate that the true distributions can be reproduced with useful accuracy.
Fig. 11 First passage time distributions for the LJ38 database for relaxation to the state renormalised down to 2506 states at values of (a) 0.10, (b) 0.12, (c) 0.13, and (d) 0.14. The starting state is the same as for Fig. 8. There are four plots in each panel, showing the full eigendecomposition results (full black), fits to the full temperature range (long orange dashed), fits to the high temperature range (green dashed) and fits to the low temperature range (blue dashed), as described in the text. |
(16) |
The different eigendecomposition routines performed similary for the LJ38 network, and, with the exception of mode 3 DSAUPD, produced accurate FPT distributions down to around T = 0.07. Mode 3 does not reproduce the short time peaks properly unless T > 0.1, and the computation for the slowest mode fails at a similar temperature to the other routines. The lowest temperatures we could reach with the other routines for LJ38 were:
We are grateful to a referee for suggesting alternative eigendecomposition approaches based on Krylov subspace iteration, which might be combined with hybrid schemes in future work. A related hybrid method, employing Krylov projection for the slowest relaxation time and some faster relaxation modes, has recently been used to analyse vacancy kinetics in aluminum.86 The numerical stability was sufficient to access the largest (numerically smallest) eigenvalue for relevant temperatures in that work, potentially providing another way to reach longer time scales for our applications. The assignments and dependence of first passage time peaks on initial conditions might also be connected with the Krylov projection, and we will explore these possibilities in the future.
If the eigenvalues and eigenvectors are trustworthy up to mode α, in order of increasing time scale, then the remaining amplitude in the first passage time distribution is . Assuming that the distribution has one missing peak at long time, we can calculate an effective value for a missing eigenvalue, λ*, that gives a mean first passage time in agreement with the graph transformation result, :
(17) |
Fig. 12 First passage time distributions for the model landscape shown in Fig. 1. At the lower temperatures considered here, standard eigendecomposition for the full network breaks down. The amplitude of the first peak in Fig. 2 becomes very small, and we focus on the three peaks corresponding to the slower relaxation modes. Left: Comparison of standard eigendecomposition for the pGT reduced network with the hybrid method applied to the full network. The plots are for the hybrid method at T = 0.85 (solid grey) and T = 0.80 (solid light blue), and eigendecomposition T = 0.85 (dashed dark blue) and T = 0.80 (dashed blue). Right: The distributions change smoothly with temperature: T = 0.84 (grey) T = 0.80 (light blue), T = 0.76 (blue) and T = 0.72 (dark blue). |
For LJ38 we compare the hybrid scheme with the peak fitting approach of Section V. The results in Fig. 13 are for , and illustrate the good agreement between the two methods in terms of both peak positions and heights. Here, the peak fitting employed temperatures from to 0.1 at intervals of 0.001. This shows that incorrect computation of eigenmodes with the smallest magnitude eigenvalues, does not mean that all eigenmodes are incorrect. We also verified that the hybrid scheme produced consistent results for cutoffs in λα ranging over several orders of magnitude, so long as the cutoff is not too small.
Fig. 13 First passage time distributions for the LJ38 database renormalised down to 2506 states comparing the hybrid scheme (Section VI solid red curves) with fits to higher temperature results (Section V dashed blue curves). (a) and (b) Are for relaxation to the and states; the starting state is the same as for Fig. 8. For these examples , in the regime where eigendecomposition begins to produce positive eigenvalues for the modes that don't significantly contribute to the FPT, and below the temperature where the truncated octahedral () and incomplete icosahedral () morphologies are in equilibrium, which is around . |
Fig. 14 shows results for LJ38 with the hybrid scheme for to 0.06 at intervals of 0.01. The systematic variation in the FPT is consistent for a wide range of eigenvalue cutoff thresholds. These results cover time scales for the slow relaxation that vary by over 20 orders of magnitude. Plotting the slow peak position or −lnλ*versus 1/T produces straight line plots for relaxation to both and (see ESI†), which shows how these kinetics manifest an apparently simple Arrhenius temperature dependence. The peak position corresponding to the faster relaxation is much less sensitive to temperature, because it corresponds to a lower effective energy barrier. In fact, this peak results from a superposition of contributions from a number of different eigenmodes, and we would not necessarily expect the temperature dependence to be Arrhenius.
Fig. 14 First passage time distributions for the LJ38 database renormalised down to 2506 states calculated with the hybrid scheme described in Section VI. (a) and (b) are for relaxation to the and states; the starting state is the same as for Fig. 8. In each case the values range from 0.15 to 0.04 in steps of 0.01, and the peak corresponding to the slow relaxation (trapping) steadily shifts to larger values of lnθ. |
Approximating the missing amplitude by a single effective mode should produce an accurate representation of the FPT if the long time kinetics is dominated by one slow process. Otherwise we will have a single long time peak that simply reproduces the MFPT by construction. We can check whether a large spectral gap opens up between the two slowest relaxations as the temperature decreases in the range where full eigendecomposition is possible. The systematic variation of the slow peak position in Fig. 14 provides a direct visual report on this spectral gap condition.
Some networks support modes that do not produce detectable peaks in the FPT (Section IIIA); the corresponding amplitudes are negligible for initial conditions of interest. Hence the relevant spectral gap may correspond to the slowest two modes that result in resolvable peaks. We note that it may still be possible to define a quasi-stationary distribution87,88 (QSD) if this gap is large enough, and the projection of the initial condition onto slower modes is sufficiently small. The QSD is defined as the occupation probability distribution in the long time limit conditional on not being adsorbed in the product states.87,88 Usually, this distribution is proportional to wR1. However, for some sources and sinks, the MFPT is significantly smaller than the mixing time for the full network, if we define this quantity as (the Kemeny constant89–91). If there are slow modes with negligible projection onto the initial conditions then a QSD may still exist on practical experimental time scales, but proportional to the right eigenvector for the slowest visible relaxation mode. This effective QSD will then correspond to the slowest resolvable peak in the FPT distribution.
The hybrid approach seems particularly attractive, since it does not involve fitting, and it should work for distributions that feature a well separated slow relaxation time scale, where choosing a suitable cutoff for the eigenvalues is straightforward. Such distributions are likely to be typical for systems featuring rare event dynamics and competing structures, with a wide variety of potential applications in biophysics and materials science.86
For the FPT analysis we used an existing kinetic transition network created using discrete path sampling;19,20 for full details of this database we refer readers to the original report.92 The potential employed for this particular database was the properly symmetrised104 AMBER ff99SB-ILDN force field,105 with implicit solvent GB-Neck2,106 which produced the best agreement with experiment.92 Similar multifunnel landscapes, featuring a variety of low energy structures with variable α-helix content, were obtained with alternative potentials.92
We surveyed the FPT distributions for different choices of target product state and source state. The distributions can exhibit multiple relaxation times, with peaks at ln(θ/s) values ranging between around −27 to 12. Fig. 16 shows one specific example corresponding to the source and target states highlighted in the landscapes illustrated in Fig. 15. Here we constructed the list of retained minima by calculating the Dijkstra fastest path between source and sink, again using edge weights that depend on branching probabilities, as described in Section IVA. We found that the two main peaks appeared converged when minima on this path and up to two steps away (in terms of connections via transition states) were included. The smaller peak at longer time appeared when one more minimum was added, corresponding to a possible trap connected to the same funnel as the source, giving a renormalised database of 234 states.
Fig. 15 Disconnectivity graphs for the CD4 receptor peptide. Left: Original potential energy graph92 with the secondary structure of selected low energy minima illustrated. Right: the free energy graph for a renormalised landscape at 302 K containing 3580 selected minima. The FPT calculations considered relaxation to a low-lying helical minimum indicated in the graphs with a magenta arrow. The source state above it is highlighted with an arrow of the same color. |
To assess the convergence of the FPT we compared with systematically larger sets of retained minima obtained from the fastest paths between the nine potential energy funnel bottoms in Fig. 15. Here we used the minima on the fastest paths, then added minima one step away, two steps away, etc. The FPT was identical to the result obtained for the smaller set for 3500 minima and above, at which point all members of the smaller set were included.
Fig. 16 First passage time distribution at 302 K for the CD4 database renormalised down to 234 states calculated with the hybrid scheme described in Section VI. These results correspond to the free energy disconnectivity graph in Fig. 15. θ is the first passage time in seconds. |
For this relatively complex multifunnel landscape we find that the FPT distribution depends strongly on the choice of initial and final states. Hence it would be essential to know what these end points are for a given experiment in order to make a proper comparison. If the experiment can probe alternative end points then it would be possible to build up a picture of the global landscape in a systematic fashion. For CD4 we have found combinations that produce up to three well-defined peaks in the FPT distribution. Although the spectrum of relaxation times defined by the eigenvalues λ is not affected by the source state, the amplitudes in the FPT depend on the corresponding component of the left eigenvector. Only a few of these amplitudes are likely to be resolvable for a particular source, but in principle it is possible to probe the whole spectrum if the initial state can be controlled. Theory could play an important role in designing and interpreting such experiments in the future. For example, understanding the subtle differences between alternative low-lying minima, and how these structures interconvert, could be helpful in drug design efforts that mimic the interaction of the cytoplasmic tail of CD4 with HIV-1 accessory proteins.107,108
Landscapes featuring rare events and multiple time scales are particularly interesting, partly because of potential applications, such as molecular switches and multifunctional materials. However, the eigendecomposition approach suffers from numerical problems for such systems, preventing access to the longest relaxation time, which may be the most important property. We have presented two approaches to tackle this problem. The first uses a fit to the peaks of calculated in the tractable temperature range. The second approach is a hybrid scheme, which combines eigendecomposition for faster relaxations with graph transformation for the mean first passage time. The two methods give consistent results for time scales that are beyond the reach of eigendecomposition. The hybrid scheme is particularly attractive, requiring only a choice for a time scale that lies between the slowest relaxation mode and the faster modes. This approach is now being applied to existing kinetic transition networks to extract new insight into the competing pathways. The kinetic information associated with alternative paths is averaged out when a single rate is associated with the mean first passage time. However, we now have a powerful tool to investigate competing pathways, which may be useful in future projects to understand and design new molecules and materials with target properties that depend upon their dynamical response.
Multifunnel landscapes may also exhibit thermodynamic signatures of broken ergodicity, especially low temperature peaks in the heat capacity.40,67,71,73–78,80,81 These features can be associated with specific sets of local minima,109 where transitions occur between alternative structures. Analysing the FPT provides a complementary approach to probe the structure of the landscape based on kinetics. Combining these thermodynamic and kinetic perspectives will enable us to understand and predict molecular properties at a fundamental level, resolved in terms of the locally stable structures and the pathways that connect them.
Footnote |
† Electronic supplementary information (ESI) available. See DOI: https://doi.org/10.1039/d3cp04199a |
This journal is © the Owner Societies 2024 |