This Open Access Article is licensed under a Creative Commons Attribution 3.0 Unported Licence

Physics-informed neural networks and beyond: enforcing physical constraints in quantum dissipative dynamics

Arif Ullah *a, Yu Huang a, Ming Yang a and Pavlo O. Dral *bc
aSchool of Physics and Optoelectronic Engineering, Anhui University, Hefei, 230601, Anhui, China. E-mail: arif@ahu.edu.cn
bState Key Laboratory of Physical Chemistry of Solid Surfaces, College of Chemistry and Chemical Engineering, Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, Innovation Laboratory for Sciences and Technologies of Energy Materials of Fujian Province (IKKEM), Xiamen University, Xiamen, 361005, Fujian, China. E-mail: dral@xmu.edu.cn
cInstitute of Physics, Faculty of Physics, Astronomy, and Informatics, Nicolaus Copernicus University in Toruń, ul. Grudziądzka 5, 87-100 Toruń, Poland

Received 13th June 2024, Accepted 4th September 2024

First published on 5th September 2024


Abstract

Neural networks (NNs) accelerate simulations of quantum dissipative dynamics. Ensuring that these simulations adhere to fundamental physical laws is crucial, but has been largely ignored in state-of-the-art NN approaches. We show that this may lead to implausible results, as measured by violations of trace conservation. To recover the correct physical behavior, we develop physics-informed NNs (PINNs) that mitigate the violations to a large extent. Beyond that, we propose a novel uncertainty-aware approach that enforces perfect trace conservation by design, surpassing PINNs.


I. Introduction

Open quantum systems are ubiquitous in nature and are relevant across various domains, such as loss of coherence in quantum information,1 quantum memory,2 quantum transport,3 proton tunnelling in DNA4 and energy transfer in photosynthetic systems.5 Because it is a many-body problem, the exact characterization of an open quantum system is not feasible owing to the exponential growth of the Hilbert space dimension and the large number of environment degrees of freedom. However, the problem becomes more tractable by tracing out the environment degrees of freedom, TrE(·), or by treating the environment6 and/or system within the classical phase space.7,8 Numerous approaches have been developed to investigate open quantum systems, ranging from entirely classical9,10 to fully quantum methods.11–18 While each of these approaches has been successful in its own right, they suffer from limitations such as the inability to account for quantum effects or the need for significant computational resources, which arises from the very small discretization step required for stability. Furthermore, the comprehensive treatment of environmental effects, especially in highly non-Markovian scenarios, contributes significantly to the computational overhead.

Neural networks (NNs) present an efficient approach to learn complex spatio-temporal dynamics in high-dimensional space. NNs and other machine learning (ML) methods have proven to be proficient at predicting future time evolution of quantum states as a function of historical dynamics.19–27 In addition, NNs can directly predict the future quantum states as a function of time and/or simulation parameters.28–32

However, a crucial aspect of quantum simulations is adherence to fundamental physical principles. In simulating open quantum systems, it is essential for an approach to uphold the core physical principle of conserving the trace (the sum of probabilities for all possible states) of the reduced density matrix (RDM, ρ̃S), which should always be equal to 1, i.e., TrS(ρ̃S) = 1, where TrS represents the trace over the system degrees of freedom.

Despite the appeal of NNs, existing research on ML-based simulations of quantum dissipative dynamics has largely ignored trace conservation.19–30,32 To the best of our knowledge, only one study has mentioned, albeit in the context of a relatively simple system (spin-boson), that ML models, given sufficient data, were able to implicitly learn trace conservation to a reasonable degree.21 However, we cannot expect that this always holds, especially in much more complex situations and when it is difficult to obtain an ample amount of data for implicit learning of trace conservation. In general, ML models can implicitly learn physical laws from the data, but if left unchecked (unconstrained) or applied to situations too far from the training data, they can also fail spectacularly.

Physics-Informed Neural Networks (PINNs), introduced in 2017,33–35 present a promising solution to this problem.36,37 By incorporating physical constraints directly into the neural network architecture, PINNs ensure that the model's predictions adhere to underlying physical laws. This approach has been successfully applied across various fields, including fluid dynamics,38,39 seismic inversions in 2D acoustic media,40 chemical simulations,41 quantum dynamics,42 and electronic structure calculations.43

In this paper, we explore whether NNs inherently conserve trace and demonstrate that unconstrained models can lead to unphysical results due to trace violations. To address this, we develop physics-informed neural networks that significantly reduce trace conservation violations. However, we find that even with the integration of physical knowledge, physics-informed NNs alone are not sufficient. To ensure correct physical behavior, we introduce an uncertainty-aware hard constraint (U-aware HC) approach that enforces perfect trace conservation by design.

The subsequent sections of this paper are structured as follows. In the “Theory and methodology” section, we establish the foundational theory of open quantum systems and detail the various NN models employed in our study, including physics-agnostic and unconstrained NNs. We highlight the trace violations by these models and introduce physics-informed NNs (PINNs). Additionally, we discuss the associated loss functions used for training these models and introduce the U-aware HC constraint for rigorous enforcement of physical constraints. Following that, in the “Results and discussion” section, we present our findings, comparing the performance of our PINN approach and HC constraint against existing models. We discuss the effectiveness of these approaches in enforcing physical laws and achieving accurate simulations. Finally, in the “Concluding remarks” section, we summarize our key findings, explore the broader implications of our study, and outline potential future research directions.

II. Theory and methodology

Let us consider an open quantum system (S) with n states coupled to an outside environment (E). The dynamics of the composite system (S + E) is governed by the Liouville–von Neumann equation (ℏ = 1)
 
$$\dot{\rho}(t) = -i[H, \rho(t)], \tag{1}$$
where H and ρ(t) represent Hamiltonian and density matrix of the composite system, respectively. As the composite system is an isolated system, the dynamics is unitary. Assuming the initial state of the system and environment is uncoupled (i.e., ρ(0) = ρS(0) ⊗ ρE(0)), the non-unitary reduced dynamics of the system can be extracted by taking a partial trace over environment degrees of freedom, i.e.,
 
$$\tilde{\rho}_S(t) = \mathrm{Tr}_E\left(U(t,0)\,\rho(0)\,U^{\dagger}(t,0)\right), \tag{2}$$
where ρ̃S(t) is the reduced density matrix (RDM) of the system at time t, TrE is the partial trace over the environment degrees of freedom, and U(t,0) (U†(t,0)) is the forward (backward) propagation operator in time. While most real-world systems technically qualify as “open” due to their environment, the immense complexity arising from all possible environmental interactions (known as the curse of dimensionality) makes exact theoretical solutions impractical. In the following, we present a brief theory of two broadly studied pedagogical systems: the two-state spin-boson (SB) model and the Fenna–Matthews–Olson (FMO) complex.
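The partial trace in eqn (2) and the resulting unit trace of the RDM can be illustrated numerically. The following is a minimal sketch, assuming a toy 2-state system, a 4-level environment, and a random unitary standing in for U(t,0); the function names and dimensions are illustrative and not taken from the original work.

```python
import numpy as np

def partial_trace_env(rho, dim_s, dim_e):
    """Trace out the environment from a density matrix on S (x) E."""
    rho = rho.reshape(dim_s, dim_e, dim_s, dim_e)
    # Sum over the paired environment indices (axes 1 and 3).
    return np.einsum("iaja->ij", rho)

def random_unitary(dim, rng):
    """Random unitary from the QR decomposition of a complex Gaussian matrix."""
    z = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    q, _ = np.linalg.qr(z)
    return q

rng = np.random.default_rng(0)
dim_s, dim_e = 2, 4  # illustrative sizes only

# Uncoupled initial state rho(0) = rho_S(0) (x) rho_E(0).
rho_s0 = np.array([[1.0, 0.0], [0.0, 0.0]], dtype=complex)
rho_e0 = np.eye(dim_e, dtype=complex) / dim_e
rho0 = np.kron(rho_s0, rho_e0)

# Stand-in for U(t,0); a real simulation would use exp(-iHt).
U = random_unitary(dim_s * dim_e, rng)
rho_t = U @ rho0 @ U.conj().T

rdm = partial_trace_env(rho_t, dim_s, dim_e)
trace_rdm = np.trace(rdm).real  # equals 1 up to round-off
```

Because the composite evolution is unitary and the partial trace only sums over environment indices, the trace of the RDM stays at one for any choice of U; this is the property that the NN models discussed below are expected to reproduce.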

SB model: the SB model describes the temporal evolution of a qubit system (two-state system) interacting with an environmental bath comprising multiple independent harmonic oscillators. The system's total Hamiltonian, expressed in the basis of the excited (|e〉) and ground (|g〉) states, is given by:

 
image file: d4dd00153b-t1.tif(3)
where σz and σx are the Pauli matrices, ε is the energy bias of the qubit, and Δ is the coupling strength between states. The environment's creation and annihilation operators for the kth mode are b†k and bk, respectively, with ωk being the mode's frequency. The coupling strength between the system and the kth environmental mode is denoted by ck. The environmental influence on the system is characterized by an ohmic spectral density function with a Drude–Lorentz cutoff:44
 
image file: d4dd00153b-t2.tif(4)
with λ being the reorganization energy and γ representing the characteristic frequency or the reciprocal of the environmental relaxation time, γ = 1/τ.

FMO complex: the FMO complex, a trimer in green sulfur bacteria, plays a crucial role in photosynthesis. Each monomer in the complex contains chlorophyll molecules that act as energy transfer sites, typically numbering seven or eight.45 The energy transfer within an FMO monomer is described by the Frenkel exciton model Hamiltonian:46

 
image file: d4dd00153b-t3.tif(5)
where N is the number of chlorophyll sites, εn is the on-site energy, and Jnm is the coupling strength between sites n and m. The environmental contribution is represented by Pk,n and Qk,n, the momentum and coordinate of the kth mode interacting with site n, with ωk,n as the mode's frequency. The identity matrix I ensures dimensional consistency. ck,n is the coupling strength between the kth mode and site n, and λn is the reorganization energy for site n.

For our analysis, we utilize the same ohmic spectral density function with a Drude–Lorentz cutoff as in eqn (4), assuming a uniform spectral density across all sites.

A. NN-accelerated quantum dissipative dynamics

Within the NN framework, learning the time evolution of an n-dimensional RDM can be defined as learning a mapping function image file: d4dd00153b-t4.tif which takes a vector of descriptive variables, image file: d4dd00153b-t5.tif and maps it to the corresponding target RDM, image file: d4dd00153b-t6.tif. NN approaches for this task can be categorized into two main types: recursive and non-recursive, depending on how they handle the descriptor and inference.
Recursive methods. In recursive NN methodologies,19–21 a mapping function, denoted as Ψrec, is employed to predict future RDMs based on their past history. This approach mimics traditional quantum dynamics, where the evolution at any given time explicitly depends on the current state and implicitly on the past states. Mathematically, a recursive method can be described as:
 
image file: d4dd00153b-t7.tif(6)
where {·} represents a sequence containing the history of RDMs, denoted as […, ρ̃S(tm−1), ρ̃S(tm)], with n being the number of time steps and r the dimensionality of ρ̃S. Recursive methods make predictions iteratively. The predicted RDM (ρ̃S) at time t is added to the history, and the oldest one is discarded to maintain a fixed-size memory. This updated history becomes the new input for the next prediction.
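The sliding-window procedure described above can be sketched as follows. This is a minimal illustration, assuming a generic callable `model` that maps a history window to the next set of populations; `toy_model` is a hypothetical placeholder and not a trained network.

```python
from collections import deque

import numpy as np

def propagate_recursive(model, seed_history, n_predictions):
    """Roll a one-step predictor forward with a fixed-size memory window."""
    # A deque with maxlen drops the oldest entry whenever a new one is appended.
    history = deque((np.asarray(x, dtype=float) for x in seed_history),
                    maxlen=len(seed_history))
    trajectory = []
    for _ in range(n_predictions):
        window = np.stack(list(history))  # shape: (memory, n_states)
        next_rdm = model(window)          # prediction for the next time step
        trajectory.append(next_rdm)
        history.append(next_rdm)          # newest in, oldest out
    return np.stack(trajectory)

def toy_model(window):
    """Hypothetical stand-in: relax the latest populations toward 0.5."""
    last = window[-1]
    return last + 0.1 * (0.5 - last)

seed = [[1.0, 0.0], [0.9, 0.1]]  # illustrative seed dynamics for two states
trajectory = propagate_recursive(toy_model, seed, n_predictions=5)
```

Since each prediction is fed back as input, any trace violation in one step enters the history of all subsequent steps, which is one reason trace conservation matters for recursive schemes.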
Non-recursive methods. Non-recursive methods, as seen in ref. 28 and 29, learn the mapping function Ψ as a function of simulation parameters and/or temporal information. The time-dependent non-recursive method used in ref. 28 establishes a mapping function (ΨAIQD) between the RDM and the simulation parameters, including time t. Mathematically:
 
image file: d4dd00153b-t8.tif(7)
where p is a vector containing simulation parameters (e.g., temperature, frequency, coupling strength). This approach allows for parallel computation of all time steps since the prediction for each step does not depend on the output of the previous step. In time-independent non-recursive methodology,29 the mapping function ΨOSTL predicts the entire trajectory of the RDM for a set of time steps t1 to tk in one go:
 
image file: d4dd00153b-t9.tif(8)
where the descriptor includes only the simulation parameters.
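To make the time-dependent non-recursive idea concrete, the sketch below builds one descriptor (p, t) per time point and evaluates all of them in a single batched call. The parameter values, the ordering of the descriptor entries, and `toy_psi` are assumptions made purely for illustration; they do not reproduce the trained models of ref. 28 and 29.

```python
import numpy as np

def aiqd_descriptors(params, times):
    """Build one descriptor row (p, t) per requested time point."""
    params = np.asarray(params, dtype=float)
    times = np.asarray(times, dtype=float)
    tiled = np.tile(params, (times.size, 1))
    return np.column_stack([tiled, times])

def toy_psi(descriptors):
    """Hypothetical stand-in for a trained model: (p, t) -> two populations."""
    gamma = descriptors[:, 0]
    t = descriptors[:, -1]
    p1 = 0.5 * (1.0 + np.exp(-gamma * t))
    return np.column_stack([p1, 1.0 - p1])

params = [9.0, 0.6, 1.0]           # hypothetical (gamma/Delta, lambda/Delta, beta*Delta)
times = np.linspace(0.0, 1.0, 5)
X = aiqd_descriptors(params, times)
populations = toy_psi(X)           # every time point evaluated independently
```

Because no row depends on the output of another, the time points can be computed in parallel; a time-independent model in the spirit of eqn (8) would instead take only `params` and return the whole trajectory at once.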

B. Limitations of existing NNs for open quantum systems: purely data-driven approaches

In the NN framework, establishing the mapping function Ψ between the descriptor x and its target dynamics can be approached in two ways: purely data-driven or based on known physical laws and constraints. Unfortunately, current research, including our own work,19,28,29 on machine learning (ML)-based simulations of open quantum systems, relies exclusively on data-driven approximations of the mapping function Ψ. As a result, these models often fail to capture underlying physical laws, leading to non-physical RDMs that violate trace conservation.

Here, we classify purely data-driven NNs into two categories: “physics-agnostic NNs” and “unconstrained NNs”. Physics-agnostic NNs are models that are not exposed to the complete data and thus remain unaware of the underlying physical laws and constraints. Unconstrained NNs, in contrast, are exposed to the entire data but do not incorporate physical constraints in their loss functions.

To emphasize the issue of trace violation by these data-driven NNs, we show their performance in Fig. 1 with two examples: the relaxation dynamics within the SB model and the excitation energy transfer (EET) in the 7-site Fenna–Matthews–Olson (FMO) complex. As shown, these data-driven models fail to conserve the trace in both processes. In each case, we utilize convolutional neural networks (CNNs) and OSTL-based recursive dynamics propagation (Rec-OSTL):

 
image file: d4dd00153b-t10.tif(9)


Fig. 1 Trace violations in quantum dissipative dynamics using machine learning. Panels (A and C) in their respective order illustrate trace violations in a physics-agnostic scenario for a symmetric SB model and the 7-site FMO complex, where a CNN is trained for each state (site). Panels (B and D) demonstrate the improvement achieved with the unconstrained multi-output CNN for the same two systems. In all cases, an initial dynamics of length 0.2 (in the respective time units), with ideal trace conservation, serves as the seed for model predictions, derived from reference calculations. For the symmetric SB model, results are shown for an unseen dynamics with a characteristic frequency γ/Δ = 9.0, system-bath coupling λ/Δ = 0.6, and inverse temperature βΔ = 1.0. For the FMO complex, the initial excitation is considered on site-1, with parameters γ = 400 cm−1, λ = 40 cm−1, and temperature T = 90 K. Further details on training and prediction can be found in the Results and discussion section.

We use the MLQD package47 and train the models on data from the QD3SET-1 database48 (see Results and discussion section for details). The training approach mirrors state-of-the-art methods reported previously.20,29

In essence, for the physics-agnostic scenario (Fig. 1A and C), we train individual CNNs for each diagonal RDM element, employing a loss function that gauges the deviation of NN-predicted values image file: d4dd00153b-t11.tif from their reference counterparts ρ̃S,nn:

 
image file: d4dd00153b-t12.tif(10)
where M is the number of training points and m is the training point index.

As these models are not exposed to the dynamics of all states, they lack knowledge of trace conservation. We show that a much better solution is the unconstrained NN—a single, multi-output CNN designed to learn all RDM elements, incorporating a loss function that aggregates errors across all states (sites) (Fig. 1B and D):

 
image file: d4dd00153b-t13.tif(11)

However, despite being exposed to the dynamics of all states, this solution still exhibits minor but noticeable trace violations. It is important to note that trace violations can be reduced to some extent with additional training, as demonstrated in Fig. S1 of the ESI. However, further improvement becomes limited as the model approaches the point of overfitting. Additionally, our observations indicate that the improvement in trace conservation with increasing memory time tm is somewhat unpredictable and does not follow a consistent trend. Despite this, there was a noticeable improvement in the accuracy of the dynamics predictions, as shown in Table S1.
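The trace violations discussed here can be quantified directly from the predicted diagonal elements. The sketch below is a minimal diagnostic, assuming the predictions are stored as an array of shape (time steps, states); the numbers are made up for illustration and are not results from this work.

```python
import numpy as np

def trace_deviation(populations):
    """Return Tr(rho) - 1 at each time step from predicted diagonal elements."""
    populations = np.asarray(populations, dtype=float)
    return populations.sum(axis=1) - 1.0

# Hypothetical predictions for a two-state system at three time steps.
predicted = np.array([
    [0.60, 0.41],
    [0.55, 0.44],
    [0.50, 0.50],
])
deviation = trace_deviation(predicted)
max_violation = np.max(np.abs(deviation))
```

Monitoring `max_violation` along a propagated trajectory gives a single number that can be compared across physics-agnostic, unconstrained, and constrained models.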

C. Our proposed solution: PINNs and beyond

In the preceding subsection, we explored the shortcomings of the state-of-the-art purely data-driven NNs for simulating open quantum systems where they often struggle to enforce fundamental physical laws like trace conservation. This leads to inaccurate and non-physical results. To address this limitation, we first explore PINNs which integrate physical constraints into the loss function inspired by similar ideas in the literature.49,50 In our case, we include the additional loss term image file: d4dd00153b-t14.tif penalizing the NN for the deviations from the trace conservation:
 
image file: d4dd00153b-t15.tif(12)
where
 
image file: d4dd00153b-t16.tif(13)

In these equations, we can tune image file: d4dd00153b-t17.tif and the deviations from trace conservation by weight factors α and η, respectively. Here we use α = 2.0 and η = 1.0. Note that the unconstrained NN with the loss defined by eqn (11) is a special case of the PINN with α = 1.0 and η = 0.
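The structure of such a composite loss can be sketched as follows. This is an illustration only: it assumes mean-squared forms for both the data term and the trace penalty and restricts the inputs to diagonal RDM elements, which may differ from the exact expressions in eqn (12) and (13). The function names are hypothetical.

```python
import numpy as np

def data_loss(pred, ref):
    """Mean-squared deviation from the reference values (assumed form)."""
    pred, ref = np.asarray(pred, dtype=float), np.asarray(ref, dtype=float)
    return float(np.mean((pred - ref) ** 2))

def trace_loss(pred):
    """Mean-squared deviation of the predicted trace from one (assumed form)."""
    trace = np.asarray(pred, dtype=float).sum(axis=-1)
    return float(np.mean((trace - 1.0) ** 2))

def pinn_loss(pred, ref, alpha=2.0, eta=1.0):
    """Weighted sum of the data term and the soft trace penalty."""
    return alpha * data_loss(pred, ref) + eta * trace_loss(pred)

# Hypothetical diagonal elements: rows are training points, columns are states.
ref = np.array([[0.70, 0.30], [0.60, 0.40]])
pred = np.array([[0.72, 0.30], [0.59, 0.40]])

loss_pinn = pinn_loss(pred, ref)                               # alpha = 2.0, eta = 1.0
loss_unconstrained = pinn_loss(pred, ref, alpha=1.0, eta=0.0)  # data term only
```

Setting α = 1.0 and η = 0 removes the penalty and recovers a purely data-driven loss, mirroring the special case noted above. The penalty only discourages trace violations during training; it does not make them impossible.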

While PINNs significantly improve trace conservation compared to purely data-driven NNs, they can still exhibit minor violations (as we will demonstrate later). This is because the physical constraints incorporated within the PINN loss function are “soft”: PINNs are nudged towards satisfying the constraints during training, but the constraints are not strictly enforced.49,51

To overcome the limitations of PINNs, we propose a novel approach that enforces trace conservation by design. This approach utilizes a U-aware HC (uncertainty-aware hard-coded) constraint, guaranteeing strict adherence to physical laws during simulations. Unlike PINNs, the U-aware HC constraint operates outside of the loss function. This allows for a more direct and rigorous enforcement of the trace conservation law, rectifying potential violations during the simulation process.

The key idea is as follows: after making predictions with machine learning models, there will inevitably be a deviation from perfect trace conservation. We can calculate this residual deviation for each time step as:

 
$$\Delta\mathrm{Tr}(t) = 1 - \sum_{n} \tilde{\rho}_{S,nn}(t). \tag{14}$$

We can redistribute the residual deviations between each state as:

 
$$\tilde{\rho}^{\mathrm{HC}}_{S,nn}(t) = \tilde{\rho}_{S,nn}(t) + w_n(t)\,\Delta\mathrm{Tr}(t). \tag{15}$$

Here, the state-specific weighting factors wn must be chosen such that the trace equals one, i.e., they must sum to one over all states. The choice should also be statistically motivated: different states may be predicted with different uncertainty, and more certain predictions should receive smaller corrections (smaller weighting factors). Hence, we also need state-specific uncertainty quantification (UQ) of the NN predictions. A similar problem arises in the prediction of partial atomic charges by statistical models, which do not necessarily add up to integer total charges: the suggested solution was likewise to redistribute the deviation from the correct total charge over the atoms based on the UQ calculated as the disagreement between the models in an ensemble.52,53 This shows how solutions developed in one research field can inspire solutions in a seemingly unrelated one.

Here we introduce a new approach for UQ. We train an additional, auxiliary multi-output CNN with the same loss function as the main PINN, but we shift the reference values by a prior p2 (we assume that the main PINN model is trained with prior p1 = 0). In other words, we train the CNN on ρ̃S + p2J (J is the all-ones matrix, i.e., a matrix with all elements equal to 1), with the predictions given by:

 
$$\tilde{\rho}^{\mathrm{aux}}_{S,nn}(t) = \tilde{\rho}^{\text{aux-NN}}_{S,nn}(t) - p_2 J. \tag{16}$$

The UQ metric is then given as the absolute deviation of ρ̃S,nn^aux(t) from the main model predictions:

 
$$D_{nn}(t) = \left|\tilde{\rho}^{\mathrm{aux}}_{S,nn}(t) - \tilde{\rho}_{S,nn}(t)\right|. \tag{17}$$

We suggest obtaining the state-specific weighting factors wn as the normalized distances:

 
$$w_n(t) = \frac{D_{nn}(t)}{\sum_{m} D_{mm}(t)}. \tag{18}$$

The implementation of eqn (15) with the weighting factors defined by eqn (18) ensures that TrS(ρ̃S^HC(t)) = 1. It is crucial to distinguish our proposed U-aware HC constraint-based approach from the conventional trace normalization technique, ρ̃S/TrS(ρ̃S), commonly employed in non-trace-conserving traditional methods.
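The correction step can be sketched in a few lines. The sketch assumes that the residual is ΔTr(t) = 1 − Σn ρ̃S,nn(t) and that the weights are the distances of eqn (17) normalized to sum to one, which is what the requirement of unit trace implies. The function name, the example numbers, and the uniform-weight fallback for the case where both models agree exactly are illustrative assumptions not taken from the original work.

```python
import numpy as np

def u_aware_hc(pop_main, pop_aux_raw, p2):
    """Redistribute the trace residual according to state-specific uncertainty.

    pop_main:    main-model diagonal elements, shape (n_steps, n_states)
    pop_aux_raw: auxiliary-model outputs trained on values shifted by p2
    """
    pop_main = np.asarray(pop_main, dtype=float)
    pop_aux = np.asarray(pop_aux_raw, dtype=float) - p2      # remove the prior, eqn (16)
    dist = np.abs(pop_aux - pop_main)                        # UQ metric, eqn (17)
    delta_tr = 1.0 - pop_main.sum(axis=1, keepdims=True)     # trace residual (assumed form)
    dist_sum = dist.sum(axis=1, keepdims=True)
    n_states = pop_main.shape[1]
    # Normalized distances; fall back to uniform weights if the models agree
    # exactly (this fallback is an assumption, not part of the source).
    safe_sum = np.where(dist_sum > 0, dist_sum, 1.0)
    weights = np.where(dist_sum > 0, dist / safe_sum, 1.0 / n_states)
    return pop_main + weights * delta_tr                     # corrected values, eqn (15)

# Hypothetical predictions at one time step; the main model has trace 1.02.
main = np.array([[0.62, 0.40]])
aux_raw = np.array([[0.69, 0.505]])
corrected = u_aware_hc(main, aux_raw, p2=0.1)
```

Summing the corrected values over states gives the original trace plus ΔTr times the sum of the weights, which is one by construction. In this example the first state, where the two models disagree more, absorbs most of the correction.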

Two features distinguish the proposed U-aware HC constraint approach:

• Generality: the U-aware HC constraint approach is a purely machine learning-based approach and is not limited to trace conservation. It can be tailored to enforce various physical constraints across diverse domains within machine learning studies. For example, it could be used to ensure the preservation of total charge in simulations of molecular systems, especially when learning individual charges for each atom.

• Uncertainty-aware correction: the U-aware HC constraint approach goes beyond simple normalization by incorporating a UQ metric (eqn (17)) along with a weighting factor (eqn (18)). This allows for targeted corrections: states (or sites) with greater uncertainty (deviations) receive larger corrections, while those with smaller deviations receive smaller adjustments. This ensures a refined correction process tailored to the level of uncertainty observed.
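The difference between the two corrections can be seen in a small numerical comparison. The populations and UQ values below are hypothetical and chosen only to make the contrast visible; the weights are taken as the UQ values normalized to sum to one.

```python
import numpy as np

pop = np.array([0.90, 0.12])            # hypothetical predictions, trace = 1.02
uncertainty = np.array([0.001, 0.019])  # hypothetical UQ metric per state

# Conventional trace normalization: every state is rescaled by the same factor.
normalized = pop / pop.sum()

# Uncertainty-aware redistribution of the trace residual.
weights = uncertainty / uncertainty.sum()
hc = pop + weights * (1.0 - pop.sum())

shift_norm = normalized - pop
shift_hc = hc - pop
```

Both results have unit trace, but normalization changes the large, confidently predicted population the most, whereas the uncertainty-aware correction places most of the adjustment on the state with the larger uncertainty.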

III. Results and discussion

In this section, we evaluate the effectiveness of PINNs and our proposed U-aware HC constraint in enforcing trace conservation during simulations. We compare their performance against state-of-the-art purely data-driven neural networks commonly used in open quantum system simulations. For a comprehensive assessment, we utilize two distinct processes as benchmarks: relaxation dynamics within the spin-boson (SB) model and the excitation energy transfer (EET) process within the 7-site FMO complex.

For the SB model, we acquire high-quality training data from the publicly available QD3SET-1 database.48 This database provides pre-computed dynamics obtained with the hierarchical equations of motion (HEOM) approach.11,54,55 The specific training dataset, denoted by image file: d4dd00153b-t20.tif, consists of 1000 trajectories simulated across a four-dimensional parameter space spanning the energy bias, bath reorganization energy (system-bath coupling strength), bath characteristic frequency, and inverse temperature (represented by ε/Δ, λ/Δ, γ/Δ, and βΔ, respectively). In a similar manner, training data for the 7-site FMO complex were extracted from the QD3SET-1 database. This dataset comprises 1000 training instances, capturing the dynamics for both possible initial excitation sites (site-1 and site-6) within the complex. In this dataset, the dynamics is propagated for a range of simulation parameters chosen from a three-dimensional space image file: d4dd00153b-t21.tif. The method used for propagation is the trace-conserving local thermalizing Lindblad master equation (LTLME)56 with the system Hamiltonian parameterized by Adolphs and Renger.57

 
image file: d4dd00153b-t22.tif(19)
where the diagonal offset is 12 210 cm−1.

For the training process, we adopted OSTL-based recursive dynamics propagation (eqn (9)), where the RDM ρ̃S(t) at each time step is transformed into a 1D vector of dimension M = number of sites + (2 × number of upper off-diagonal terms). As in the RDM image file: d4dd00153b-t23.tif only the upper off-diagonal terms are learned. In addition, the real and imaginary parts of each off-diagonal term are separated. More details can be found in ref. 47. The target is multi-time-step dynamics with the same shape as the input. Here we predict the dynamics of 20 time steps in one shot, which is then fed back to the model recursively to predict the next 20 time steps. In all cases, we trained a CNN model implemented in the MLQD package,47 and the uncertainty-aware HC constraint is incorporated with priors set to (p1, p2) = (0, 0.1).
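The flattening of an RDM into a real 1D vector can be sketched as follows. The ordering used here (diagonal elements, then real parts, then imaginary parts of the upper off-diagonal terms) is an assumption for illustration and may differ from the layout used in the MLQD package; the helper names are hypothetical.

```python
import numpy as np

def rdm_to_vector(rdm):
    """Flatten a Hermitian RDM into diagonals plus Re/Im of the upper triangle."""
    n = rdm.shape[0]
    iu = np.triu_indices(n, k=1)
    upper = rdm[iu]
    return np.concatenate([rdm.diagonal().real, upper.real, upper.imag])

def vector_to_rdm(vec, n):
    """Rebuild the full RDM, filling the lower triangle by Hermitian symmetry."""
    iu = np.triu_indices(n, k=1)
    n_off = len(iu[0])
    rdm = np.zeros((n, n), dtype=complex)
    rdm[np.diag_indices(n)] = vec[:n]
    upper = vec[n:n + n_off] + 1j * vec[n + n_off:]
    rdm[iu] = upper
    rdm[(iu[1], iu[0])] = upper.conj()
    return rdm

# Illustrative two-state RDM.
rho = np.array([[0.7, 0.1 + 0.2j],
                [0.1 - 0.2j, 0.3]])
vec = rdm_to_vector(rho)
rho_back = vector_to_rdm(vec, n=2)
```

With this layout the vector length is n + 2 × n(n − 1)/2 = n², so a two-state system gives 4 entries and a 7-site system gives 7 + 2 × 21 = 49 entries per time step.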

To improve training efficiency, we utilized farthest point sampling28,58 to select a subset of training trajectories. For both the symmetric SB model (ε/Δ = 0) and FMO complex with initial excitation on site-1, 400 trajectories were chosen for training, with the remaining used for testing.
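Farthest point sampling greedily picks, at each step, the candidate that is farthest from everything already selected. The sketch below is a generic version of this procedure, assuming Euclidean distance in parameter space and an arbitrary starting index; the exact metric and starting point used in ref. 28 and 58 are not specified here, and the example points are made up.

```python
import numpy as np

def farthest_point_sampling(points, n_select, start=0):
    """Greedy selection of points that are maximally spread out."""
    points = np.asarray(points, dtype=float)
    selected = [start]
    # Distance from every point to its nearest already-selected point.
    min_dist = np.linalg.norm(points - points[start], axis=1)
    for _ in range(n_select - 1):
        nxt = int(np.argmax(min_dist))
        selected.append(nxt)
        dist_to_new = np.linalg.norm(points - points[nxt], axis=1)
        min_dist = np.minimum(min_dist, dist_to_new)
    return selected

# Toy parameter points; real use would pass rows of simulation parameters.
points = np.array([[0.0, 0.0], [0.1, 0.0], [1.0, 0.0], [0.5, 0.0]])
chosen = farthest_point_sampling(points, n_select=3)  # gives [0, 2, 3]
```

In practice the parameters would typically be scaled to comparable ranges before computing distances, so that no single parameter dominates the selection.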

In our study, we trained CNN models with identical architectures across all four scenarios. The models used for dynamics propagation yielded nearly identical validation losses, with approximately 1.2 × 10−5 in the SB case and 1.1 × 10−7 in the FMO complex. Introducing trace constraints and adding a prior do impact computational efficiency. Including a trace constraint in the loss function increases its complexity, and the addition of a prior makes the model more challenging to fit, potentially leading to longer training times.

For example, in our experiments, the unconstrained NN model for the FMO complex reached a validation loss of 1.01 × 10−7 at epoch 194. In contrast, the PINN model with the same architecture achieved a similar loss of 1.62 × 10−7 at epoch 785, and the auxiliary model in the case of PINN with U-aware HC attained a loss of 2.22 × 10−7 at epoch 1142. On our machine (NVIDIA GeForce RTX 4090 GPU), each epoch took approximately 1 second, resulting in total training times of approximately 194, 785, and 1142 seconds, respectively. While the addition of constraints and priors increases computational time, the absolute cost remains modest (under 20 minutes per model on a single GPU).

Fig. 2 demonstrates the effectiveness of the PINNs and the uncertainty-aware HC constraint in maintaining trace conservation during simulations of quantum dissipative dynamics for the SB model and the FMO complex. We revisit the same cases as presented in Fig. 1 for purely data-driven NNs. As expected, the PINNs (Fig. 2A and C) show a significant improvement in trace conservation compared to purely data-driven neural networks (Fig. 1). However, as previously discussed, PINNs rely on “soft constraints” within the loss function, which can lead to minor deviations from perfect trace conservation.


Fig. 2 Trace conservation in NN-based simulations using PINNs and the uncertainty-aware HC constraint approach. This figure replicates Fig. 1 (data-driven NN) for the SB model and FMO complex, demonstrating improved conservation with PINNs (panels A and C) and perfect trace conservation achieved by combining the U-aware HC constraint with PINNs (panels B and D). In the case of the SB model, an initial period of tmΔ = 2.0 serves as a seed for the model's predictions and results are presented for a test trajectory with characteristic frequency γ/Δ = 9.0, system-bath coupling λ/Δ = 0.6, and inverse temperature βΔ = 1.0. For the FMO complex, an initial dynamics of tm = 0.2 ps is provided as an input and the initial excitation is considered on site-1, with parameters γ = 400 cm−1, λ = 40 cm−1, and temperature T = 90 K.

Perfect trace conservation is achieved via the U-aware HC constraint, as demonstrated in Fig. 2B and D. By explicitly incorporating this constraint within the PINNs framework, we maintain perfect trace conservation throughout the simulations for both the SB model and the FMO complex. This finding underscores the benefit of enforcing strict physical constraints by design, rather than solely relying on the model's ability to learn physical principles indirectly.

Additionally, we present the corresponding population dynamics for all four cases in Fig. S2 and S3 of the ESI. To evaluate the accuracy of each model in dynamics propagation, we provide the mean absolute error (MAE) averaged over all time steps for each state (site) in Table 1. The MAE comparison shows that all models have small errors for the populations (of the order of 10−3 or below), so trace conservation did not have much impact on the quality of the dynamics in the studied cases. However, trace conservation might have a large impact in cases where ML struggles to learn and predict the dynamics with such accuracy. As described above, the additional computational cost of enforcing trace conservation is modest, so there is little justification for using non-conserving approaches, which may break down and behave even worse than in Fig. 1. In any case, trace-conserving approaches can be regarded as a good safeguard against unphysical behavior.

Table 1 MAE averaged over all time-steps (0.2–1 ps) for the diagonal elements ρ̃S,nn in the SB model and FMO complex. In the table, n denotes the state (site) number, and Avg(n) represents the average MAE across all states (sites)a

| Model | SB: 1 | SB: 2 | SB: Avg(n) | FMO: 1 | FMO: 2 | FMO: 3 | FMO: 4 | FMO: 5 | FMO: 6 | FMO: 7 | FMO: Avg(n) |
| Physics-agnostic NN | 4.37 | 4.20 | 4.29 | 2.57 | 1.39 | 7.91 | 1.99 | 2.06 | 7.62* | 9.12* | 2.51 |
| Unconstrained NN | 7.10 | 7.42 | 7.26 | 1.76 | 2.60 | 2.34 | 6.23* | 5.91* | 6.94* | 4.40* | 1.29 |
| PINN | 6.14 | 6.10 | 6.12 | 3.93 | 1.13 | 2.76 | 2.15 | 1.55 | 6.15* | 2.17 | 2.04 |
| PINN + U-aware HC | 6.08 | 6.08 | 6.08 | 1.41 | 1.53 | 5.22 | 1.26 | 1.54 | 6.73* | 2.10 | 1.96 |

a All values are in units of 10−3 except for those marked with *, which are in units of 10−4. Column labels give the system (SB model or FMO complex) followed by the state (site) number n.
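The error measure reported in Table 1 can be computed as sketched below, assuming predicted and reference populations are stored as arrays of shape (time steps, states). The arrays here are small made-up examples and do not reproduce the values in the table.

```python
import numpy as np

def mae_per_state(pred, ref):
    """Mean absolute error over time steps for each state, plus their average."""
    pred, ref = np.asarray(pred, dtype=float), np.asarray(ref, dtype=float)
    per_state = np.mean(np.abs(pred - ref), axis=0)
    return per_state, float(per_state.mean())

# Hypothetical two-state populations at two time steps.
pred = np.array([[0.50, 0.50], [0.60, 0.40]])
ref = np.array([[0.52, 0.48], [0.60, 0.41]])
per_state, avg_n = mae_per_state(pred, ref)
```

A low MAE on individual populations does not by itself imply a conserved trace, since errors on different states need not cancel; the two quantities are best reported side by side.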


IV. Concluding remarks

This work addresses the critical issue of trace conservation in NN-based simulations of open quantum systems. While NN models are adept at capturing complex dynamics, they often struggle to maintain fundamental physical principles such as trace conservation. Our investigation reveals three key findings.

First, purely data-driven NN models, including physics-agnostic and unconstrained NNs, can effectively capture correlations between state-specific populations. However, they lack explicit enforcement of physical laws, leading to potential violations of trace conservation.

Second, PINNs offer a significant improvement by incorporating physical knowledge into the loss function. This method penalizes deviations from physical constraints, enhancing the accuracy of simulations. Despite this advancement, PINNs still rely on “soft constraints,” which can result in minor violations of physical constraints like trace conservation.

Finally, the U-aware HC constraint approach addresses the limitations of PINNs by enforcing trace conservation by design rather than solely through the loss function. The U-aware HC constraint utilizes uncertainty quantification techniques to redistribute residual errors and correct potential trace violations, ensuring physically consistent simulations throughout.

It is important to note that, although we did not explicitly enforce a positivity constraint because all diagonal elements remained strictly positive in our cases, such a constraint could be incorporated if necessary.

To conclude, our findings underscore the importance of integrating well-defined physical constraints into NN models. The methods developed in this study are broadly applicable and can be adapted to enforce other essential constraints in various domains. For instance, in molecular simulations where individual atomic charges are learned, our different-prior approach for uncertainty quantification as well as an approach for redistributing residual error in atomic charges could be used as an alternative to existing, related approaches52,53 for ensuring total charge conservation. By extending these techniques, we can improve the fidelity and reliability of NN-based simulations across a wide range of scientific and engineering applications.

Data availability

The data supporting this work is available at https://github.com/Arif-PhyChem/trace_conservation.

Conflicts of interest

The authors declare no competing interests.

Acknowledgements

A. U. acknowledges funding from the National Natural Science Foundation of China (No. W2433037). P. O. D. acknowledges support from the National Natural Science Foundation of China (No. 22003051), as well as funding through the Outstanding Youth Scholars (Overseas, 2021) project and the Fundamental Research Funds for the Central Universities (No. 20720210092). This project is also supported by the Science and Technology Projects of the Innovation Laboratory for Sciences and Technologies of Energy Materials of Fujian Province (IKKEM) (No. RD2022070103).

References

  1. H. P. Breuer, E. M. Laine, J. Piilo and B. Vacchini, Colloquium: Non-Markovian dynamics in open quantum systems, Rev. Mod. Phys., 2016, 88(2), 021002.
  2. K. Khodjasteh, J. Sastrawan, D. Hayes, T. J. Green, M. J. Biercuk and L. Viola, Designing a practical high-fidelity long-time quantum memory, Nat. Commun., 2013, 4(1), 2045.
  3. P. Cui, X. Q. Li, J. Shao and Y. Yan, Quantum transport from the perspective of quantum open systems, Phys. Lett. A, 2006, 357(6), 449–453.
  4. L. Slocombe, M. Sacchi and J. Al-Khalili, An open quantum systems approach to proton tunnelling in DNA, Commun. Phys., 2022, 5(1), 109.
  5. E. Zerah Harush and Y. Dubi, Do photosynthetic complexes use quantum coherence to increase their efficiency? Probably not, Sci. Adv., 2021, 7(8), eabc4631.
  6. Y. Wang and Y. Yan, Quantum mechanics of open systems: dissipaton theories, J. Chem. Phys., 2022, 157(17), 170901.
  7. A. Jain and A. Sindhu, Pedagogical Overview of the Fewest Switches Surface Hopping Method, ACS Omega, 2022, 7(50), 45810.
  8. H. D. Meyer and W. H. Miller, A classical analog for electronic degrees of freedom in nonadiabatic collision processes, J. Chem. Phys., 1979, 70(7), 3214–3223.
  9. J. Liu, X. He and B. Wu, Unified formulation of phase space mapping approaches for nonadiabatic quantum dynamics, Acc. Chem. Res., 2021, 54(23), 4215–4228.
  10. J. E. Runeson and J. O. Richardson, Spin-mapping approach for nonadiabatic molecular dynamics, J. Chem. Phys., 2019, 151(4), 044119.
  11. Y. Tanimura and R. Kubo, Time evolution of a quantum system in contact with a nearly Gaussian-Markoffian noise bath, J. Phys. Soc. Jpn., 1989, 58(1), 101–114.
  12. D. E. Makarov and N. Makri, Path integrals for dissipative systems by tensor multiplication. Condensed phase quantum dynamics for arbitrarily long time, Chem. Phys. Lett., 1994, 221(5–6), 482–491.
  13. L. Han, A. Ullah, Y. A. Yan, X. Zheng, Y. Yan and V. Chernyak, Stochastic equation of motion approach to fermionic dissipative dynamics. I. Formalism, J. Chem. Phys., 2020, 152(20), 204105.
  14. A. Ullah, L. Han, Y. A. Yan, X. Zheng, Y. Yan and V. Chernyak, Stochastic equation of motion approach to fermionic dissipative dynamics. II. Numerical implementation, J. Chem. Phys., 2020, 152(20), 204106.
  15. Y. Su, Z. H. Chen, Y. Wang, X. Zheng, R. X. Xu and Y. Yan, Extended dissipaton equation of motion for electronic open quantum systems: application to the Kondo impurity model, J. Chem. Phys., 2023, 159, 024113.
  16. Y. Yan, M. Xu, T. Li and Q. Shi, Efficient propagation of the hierarchical equations of motion using the Tucker and hierarchical Tucker tensors, J. Chem. Phys., 2021, 154(19), 194104.
  17. L. Chen, D. I. Bennett and A. Eisfeld, Simulation of absorption spectra of molecular aggregates: a hierarchy of stochastic pure state approach, J. Chem. Phys., 2022, 156(12), 124109.
  18. H. Gong, A. Ullah, L. Ye, X. Zheng and Y. Yan, Quantum entanglement of parallel-coupled double quantum dots: a theoretical study using the hierarchical equations of motion approach, Chin. J. Chem. Phys., 2018, 31(4), 510.
  19. A. Ullah and P. O. Dral, Speeding up quantum dissipative dynamics of open systems with kernel methods, New J. Phys., 2021, 23, 113019.
  20. L. E. H. Rodríguez, A. Ullah, K. J. R. Espinosa, P. O. Dral and A. A. Kananenka, A comparative study of different machine learning methods for dissipative quantum dynamics, Mach. Learn.: Sci. Technol., 2022, 3(4), 045016.
  21. L. E. Herrera Rodríguez and A. A. Kananenka, Convolutional neural networks for long time dissipative quantum dynamics, J. Phys. Chem. Lett., 2021, 12(9), 2476–2483.
  22. L. Zhang, A. Ullah, M. Pinheiro Jr, P. O. Dral and M. Barbatti, Excited-state dynamics with machine learning, in Quantum Chemistry in the Age of Machine Learning, Elsevier, 2023, pp. 329–353.
  23. D. Wu, Z. Hu, J. Li and X. Sun, Forecasting nonadiabatic dynamics using hybrid convolutional neural network/long short-term memory network, J. Chem. Phys., 2021, 155(22), 224104.
  24. K. Lin, J. Peng, C. Xu, F. L. Gu and Z. Lan, Automatic evolution of machine-learning-based quantum dynamics with uncertainty analysis, J. Chem. Theory Comput., 2022, 18(10), 5837–5855.
  25. S. Bandyopadhyay, Z. Huang, K. Sun and Y. Zhao, Applications of neural networks to the simulation of dynamics of open quantum systems, Chem. Phys., 2018, 515, 272–278.
  26. B. Yang, B. He, J. Wan, S. Kubal and Y. Zhao, Applications of neural networks to dynamics simulation of Landau-Zener transitions, Chem. Phys., 2020, 528, 110509.
  27. D. Tang, L. Jia, L. Shen and W. H. Fang, Fewest-Switches Surface Hopping with Long Short-Term Memory Networks, J. Phys. Chem. Lett., 2022, 13(44), 10377–10387.
  28. A. Ullah and P. O. Dral, Predicting the future of excitation energy transfer in light-harvesting complex with artificial intelligence-based quantum dynamics, Nat. Commun., 2022, 13(1930), 1–8.
  29. A. Ullah and P. O. Dral, One-Shot Trajectory Learning of Open Quantum Systems Dynamics, J. Phys. Chem. Lett., 2022, 13(26), 6037–6041.
  30. F. Ge, L. Zhang, Y. F. Hou, Y. Chen, A. Ullah and P. O. Dral, Four-dimensional-spacetime atomistic artificial intelligence models, J. Phys. Chem. Lett., 2023, 14(34), 7732–7743.
  31. A. V. Akimov, Extending the time scales of nonadiabatic molecular dynamics via machine learning in the time domain, J. Phys. Chem. Lett., 2021, 12(50), 12119–12128.
  32. K. Lin, J. Peng, C. Xu, F. L. Gu and Z. Lan, Trajectory Propagation of Symmetrical Quasi-classical Dynamics with Meyer-Miller Mapping Hamiltonian Using Machine Learning, J. Phys. Chem. Lett., 2022, 13, 11678–11688.
  33. M. Raissi, P. Perdikaris and G. E. Karniadakis, Physics informed deep learning (part i): data-driven solutions of nonlinear partial differential equations, arXiv, 2017, preprint, arXiv:1711.10561, DOI:10.48550/arXiv.1711.10561.
  34. M. Raissi, P. Perdikaris and G. E. Karniadakis, Physics Informed Deep Learning (Part II): Data-driven Discovery of Nonlinear Partial Differential Equations, arXiv, 2017, preprint, arXiv:1711.10566, DOI:10.48550/arXiv.1711.10566.
  35. M. Raissi, P. Perdikaris and G. E. Karniadakis, Physics-informed neural networks: a deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations, J. Comput. Phys., 2019, 378, 686–707.
  36. G. E. Karniadakis, I. G. Kevrekidis, L. Lu, P. Perdikaris, S. Wang and L. Yang, Physics-informed machine learning, Nat. Rev. Phys., 2021, 3(6), 422–440.
  37. S. Cuomo, V. S. Di Cola, F. Giampaolo, G. Rozza, M. Raissi and F. Piccialli, Scientific machine learning through physics-informed neural networks: where we are and what's next, J. Sci. Comput., 2022, 92(3), 88.
  38. S. Cai, Z. Mao, Z. Wang, M. Yin and G. E. Karniadakis, Physics-informed neural networks (PINNs) for fluid mechanics: a review, Acta Mech. Sin., 2021, 37(12), 1727–1738.
  39. S. Cai, Z. Wang, S. Wang, P. Perdikaris and G. E. Karniadakis, Physics-informed neural networks for heat transfer problems, J. Heat Transfer, 2021, 143(6), 060801.
  40. M. Rasht-Behesht, C. Huber, K. Shukla and G. E. Karniadakis, Physics-informed neural networks (PINNs) for wave propagation and full waveform inversions, J. Geophys. Res.: Solid Earth, 2022, 127(5), e2021JB023120.
  41. Y. F. Hou, L. Zhang, Q. Zhang, F. Ge and P. O. Dral, Physics-informed active learning for accelerating quantum chemical simulations, J. Chem. Theory Comput., 2024, DOI:10.1021/acs.jctc.4c00821.
  42. J. Zhang, C. L. Benavides-Riveros and L. Chen, Artificial-intelligence-based surrogate solution of dissipative quantum dynamics: physics-informed reconstruction of the universal propagator, J. Phys. Chem. Lett., 2024, 15(13), 3603–3610.
  43. V. Martinetto, K. Shah, A. Cangi and A. Pribram-Jones, Inverting the Kohn–Sham equations with physics-informed machine learning, Mach. Learn.: Sci. Technol., 2024, 5(1), 015050.
  44. A. O. Caldeira and A. J. Leggett, Path integral approach to quantum Brownian motion, Phys. A, 1983, 121(3), 587–616.
  45. M. S. Am Busch, F. Müh, M. E. A. Madjet and T. Renger, The eighth bacteriochlorophyll completes the excitation energy funnel in the FMO protein, J. Phys. Chem. Lett., 2011, 2(2), 93–98.
  46. A. Ishizaki and G. R. Fleming, Unified treatment of quantum coherent and incoherent hopping dynamics in electronic energy transfer: reduced hierarchy equation approach, J. Chem. Phys., 2009, 130(23), 234111.
  47. A. Ullah and P. O. Dral, MLQD: a package for machine learning-based quantum dissipative dynamics, Comput. Phys. Commun., 2024, 294, 108940.
  48. A. Ullah, L. E. H. Rodríguez, P. O. Dral and A. A. Kananenka, QD3SET-1: A Database with Quantum Dissipative Dynamics Data Sets, Front. Phys., 2023, 11, 1223973.
  49. A. Norambuena, M. Mattheakis, F. J. González and R. Coto, Physics-informed neural networks for quantum control, Phys. Rev. Lett., 2024, 132(1), 010801.
  50. E. H. Müller, Exact conservation laws for neural network integrators of dynamical systems, J. Comput. Phys., 2023, 488, 112234.
  51. R. Wang and R. Yu, Physics-guided deep learning for dynamical systems: a survey, arXiv, 2021, preprint, arXiv:2107.01272, DOI:10.48550/arXiv.2107.01272.
  52. B. K. Rai and G. A. Bakken, Fast and accurate generation of ab initio quality atomic charges using nonparametric statistical regression, J. Comput. Chem., 2013, 34(19), 1661–1671.
  53. B. K. Rai and G. A. Bakken, Fast and accurate generation of ab initio quality atomic charges using nonparametric statistical regression, J. Comput. Chem., 2013, 34(19), 1661–1671.
  54. Q. Shi, L. Chen, G. Nan, R. X. Xu and Y. Yan, Efficient hierarchical Liouville space propagator to quantum dissipative dynamics, J. Chem. Phys., 2009, 130(8), 084105.
  55. Z. H. Chen, Y. Wang, X. Zheng, R. X. Xu and Y. Yan, Universal time-domain Prony fitting decomposition for optimized hierarchical quantum master equations, J. Chem. Phys., 2022, 156, 221102.
  56. M. Mohseni, P. Rebentrost, S. Lloyd and A. Aspuru-Guzik, Environment-assisted quantum walks in photosynthetic energy transfer, J. Chem. Phys., 2008, 129(17), 11B603.
  57. J. Adolphs and T. Renger, How proteins trigger excitation energy transfer in the FMO complex of green sulfur bacteria, Biophys. J., 2006, 91(8), 2778–2797.
  58. P. O. Dral, MLatom: a program package for quantum chemical research assisted by machine learning, J. Comput. Chem., 2019, 40(26), 2339–2347.

Footnote

Electronic supplementary information (ESI) available. See DOI: https://doi.org/10.1039/d4dd00153b

This journal is © The Royal Society of Chemistry 2024