Masaya
Hagai
*a,
Mahito
Sugiyama
b,
Koji
Tsuda
cde and
Takeshi
Yanai
*af
aDepartment of Chemistry, Nagoya University, Furocho, Chikusa Ward, Nagoya 464-8601, Aichi, Japan. E-mail: hagai.masaya.v9@s.mail.nagoya-u.ac.jp; yanait@gmail.com
bNational Institute of Informatics, 2-1-2 Hitotsubashi, Chiyoda-ku, Tokyo 101-8430, Japan
cGraduate School of Frontier Sciences, The University of Tokyo, 5-1-5 Kashiwa-no-ha, Kashiwa 277-8561, Chiba, Japan
dRIKEN Center for Advanced Intelligence Project, 1-4-1 Nihonbashi, Chuo-ku, Tokyo 103-0027, Japan
eResearch and Services Division of Materials Data and Integrated System, National Institute for Materials Science, Ibaraki 305-0047, Japan
fInstitute of Transformative Bio-Molecules (WPI-ITbM), Nagoya University, Furocho, Chikusa Ward, Nagoya 464-8601, Aichi, Japan
First published on 30th March 2023
Artificial neural networks (ANNs) for material modeling have received significant interest. We recently reported an adaptation of ANNs based on Boltzmann machine (BM) architectures to an ansatz of the multiconfigurational many-electron wavefunction, denoted as a neural-network quantum state (NQS), for quantum chemistry calculations [Yang et al., J. Chem. Theory Comput., 2020, 16, 3513–3529]. Here, this study presents its extended formalism to a quantum algorithm that enables the preparation of the NQS through quantum gates. The descriptors of the ANN model, which are chosen as the occupancies of electronic configurations, are quantum-mechanically represented by qubits. Our algorithm may thus bring potential advantages over classical sampling-based computation employed in previous studies. The NQS can be accurately formed using quantum-native procedures. Still, the training of the model in terms of energy minimization is efficiently performed on a classical computer; thus, our approach is a class of variational quantum eigensolvers. The BM models are related to the Gibbs distribution, and our preparation procedures exploit techniques of quantum phase estimation but with no Hamiltonian evolution. The proposed algorithm is assessed by implementing it on a quantum computer simulator. Illustrative molecular calculations at the complete-active-space configuration interaction level of theory are displayed, confirming consistency with the accuracy of our previous classical approaches.
The use of ANN architectures as models of quantum many-particle physics is a research subject that has drawn great interest. Carleo et al. proposed an intriguing ML-based approach to use a class of ANN-based generative models, the restricted Boltzmann machine (RBM), for a representation of the quantum many-body state.7–13 The RBM auto-encoder is used to parameterize the coefficients of the linear combination of many-body basis in the quantum superposition, with configuration vectors of spins (↑ or ↓) or electron occupancies (0 or 1) serving as descriptors. This wavefunction ansatz is called the neural-network quantum state (NQS). A scheme related to reinforcement learning is used to train the network parameters without prior knowledge or datasets, finding the best possible representation of the ground state as a solution of the Schrödinger equation for the given Hamiltonian. Ref. 7 demonstrated high applicability towards physical systems with quantum Ising and Heisenberg models.
This inspiring but transparent formalism by Carleo and Troyer7 to use ML technology for a wavefunction solver has led various groups to its application to first-principles electronic structure calculations for chemical and material systems.11,14–24 Xia and Kais reported the earliest study that used the RBM-based NQS for ab initio electronic structure calculations for molecules, with an additional focus on its extension to a hybrid quantum-classical algorithm14 along a line similar to this work. Our group previously presented an adaptation of the NQS as an encoder of the quantum chemical multireference wavefunction with a complete-active-space configuration interaction (CAS-CI) model.15 Our interest attaches to NQS's applicability as a solver to describe the so-called static electron (or multireference) correlation, whose quantum complexity often becomes challenging, particularly when studying multiple bond breaking, state-degeneracies, varying radical nature in reactions, etc. The CAS-CI method is a basic framework of the approach to the static correlation problem based on the CI (or linear) expansion into electronic configuration basis spanning the chemically important part of the Hilbert space.25,26 In our anstaz, the number of electron configurations considered in the CAS-CI scheme is formally written as 2k, where k is the number of spin-orbitals; in the MR electronic structure cases, these numerous configurations can individually play major roles in wavefunction construction.
In ref. 15, we further proposed using the high-order Boltzmann machine (HBM), which is hidden-node free, in place of the RBM from an alternative perspective – specifically, the second- and third-order BM models, termed BM2 and BM3, respectively. The earlier informatics studies of the HBM model27,28 indicate that the BM3 can extract higher-order features to a comparable degree to the RBM and, more importantly, yields the concave log-likelihood function where the RBM renders it non-concave. The pilot implementation of the quantum-chemical NQS was based on the BM2, BM3, and RBM architectures, demonstrating that the modeled wave functions for illustrative molecules delivered desirable convergence to the CAS-CI results. We confirmed that the native combinatorial complexity arising from evaluating the energy, gradients, and partition function could be mitigated by stochastic sampling approaches in ML using the Markov chain Monte Carlo (MCMC) technique. However, this MCMC-based integration remains the most computationally demanding, practically hindering calculations with larger active space. A promising direction to address this issue may be, as typical to the prevalent ML computation, to conduct the MCMC process on general-purpose computing on graphics processing units (GPGPU), which is considered advantageous over traditional central processing units (CPUs).16–19 With advanced implementations, ref. 16–18 studied bicyclobutane and cyclobutadiene as the largest cases, which involve six π electrons in sixπ orbitals and four π electrons in four π orbitals, respectively. The molecules studied in ref. 19 are systems with a minor static correlation effect. All of these studies are thought of as focusing on the computation of a dynamic (or weak/perturbative) correlation with high-energy virtual orbital space associated with the dynamic correlation effect; note that the complexity of perturbative electron correlation is considered not exponential but polynomial.
In this study, we explore an alternative game-changing strategy by reformulating the ML of the wave function with the NQS as a quantum algorithm that can run on a quantum computer (QC). The QC is a future device that should enable exponential speedup for certain kinds of combinatorial computations. The use of quantum computing for ML has attracted significant attention recently, and quantum algorithms oriented to general-purpose ML have been extensively studied, emerging as a subfield referred to as quantum machine learning (QML).29–36 There are numerous earlier developments of QML that underlie this work. Wiebe et al. presented a quantum process named GEQS (gradient estimation via quantum sampling), allowing for the preparation of a state in which the weights of the superposition obey the Gibbs distribution, corresponding to the probability modeled by the RBM, as a function of a configuration of visible and hidden units.29 Its procedure is based on the quantum algorithm developed by Poulin and Wocjan31 for preparing the Gibbs state of interacting quantum objects through quantum circuits with the use of Kitaev's quantum phase estimation (QPE).37,38 The algorithm in ref. 29 incorporates a technique of quantum amplitude amplification/estimation (QAA/QAE)39 into the state preparation for reducing the number of samples at a quadratic rate.
The development of QC algorithms for electronic structure calculations in quantum chemical research has been a topic of intensive research in the recent past. As reviewed in ref. 40–45, there are two major canonical frameworks of the algorithm for estimating the ground state energy on a QC. One is the pioneering approach using the iterative QPE scheme proposed by Aspuru-Guzik et al.46 It is considered susceptible to quantum noise and ill-suited for using near-term QCs or noisy intermediate-scale quantum (NISQ) devices. The second framework is the variational quantum eigensolver (VQE),47 a hybrid quantum-classical approach amenable to the NISQ devices. Its state preparation is carried out on a QC and can be potentially built from low-depth circuits but at the cost of an increasing number of measurements. Various wave function ansatzes for state preparation have been developed. Of course, they cannot be all cited, but the most extensively studied are the unitary coupled-cluster (UCC) framework47 and many others.48,49 In parallel with the intensive research effort towards NISQ-friendly algorithms, quantum algorithms for a fault-tolerant quantum computer have recently been receiving growing interest.50,51 Ref. 50 showed fault-tolerant quantum circuits for performing QPE, and ref. 51 reported a low-cost fault-tolerant approach using tensor hypercontraction Hamiltonian for quantum chemistry calculation.
In this work, we present the development of an algorithm to build a trained ANN object on a QC as a materialized representation of the theoretical molecular many-electron wave function. It is classified as a VQE type. Unlike UCC47 or others,48 the NQS ansatz does not require the prerequisite of the reference wave function nor involve any wave operators represented by excitation operators. The quantum algorithm of NQS preparation based on the BM2, BM3, and RBM energy functions is explored here as a primal objective. As mentioned earlier, the RBM-based NQS method as the VQE algorithm was similarly investigated by Xia and Kais.14 However, their algorithm does not form an NQS on a QC in complete form. Still, it only prepares an intermediate quantum state, sampled randomly by measures to evaluate the energy and gradients as expectation values classically using the sampled data. Similarly, the preparation (or reconstruction) of the RBM-based NQS for quantum computing was explored on hardware by Torlai et al.52 from a somewhat different perspective while again restricting the measurements to a finite number of samples against their formal exponential requirement.
Developing a full-fledged quantum process (or oracle) to prepare the NQS via quantum circuits should be valued for meaningfully bridging the gap between VQE and QML and was carried out in this study. The aim of this work is not to offer a substitution outperforming the existing VQE algorithms in terms of the circuit depth and the CNOT gate number, but is primarily weighted towards uncovering a direct route to connect the RBM-based NQS theory and quantum computing. Through the training of the NQS, as was performed fully classically in ref. 15, we aim to solve the equation to determine the ground-state molecular electronic structure and associated energy at the CAS-CI level of theory, a suited ab initio quantum chemistry model to handle chemical systems involving multireference electron correlation. The convergence behavior in optimizing network parameters is essentially the same as those observed in our previous study15 based on Carleo's approach. Ref. 15 demonstrated the NQS calculation of the CAS-CI(8e,8o) wavefunction.
Note that Table 1 summarizes the abbreviations used in this paper, and the additional background of this work is mentioned in Section S1.8.
ANN | Artificial Neural Network |
BM2 | Second-order Boltzmann Machine |
BM3 | Third-order Boltzmann Machine |
CAS-CI | Complete Active-Space Configuration Interaction |
CMO | Canonical Molecular Orbital |
FCI | Full Configuration Interaction |
FS | Fock Space |
HBM | High-order Boltzmann Machine |
LMO | Localized Molecular Orbital |
MCMC | Markov Chain Monte Carlo |
ML | Machine Learning |
NOON | Natural Orbital Occupation Number |
NQS | Neural-network Quantum State |
PES | Potential Energy Curve |
PN | Particle Number |
QAA | Quantum Amplitude Amplification |
QC | Quantum Computer |
QML | Quantum Machine Learning |
QPE | Quantum Phase Estimation |
QUBO | Quadratic Unconstrained Binary Optimization |
RBM | Restricted Boltzmann machine |
RDM | Reduced Density Matrix |
RHF | Restricted Hartree–Fock |
UCC | Unitary Coupled Cluster |
VQE | Variational Quantum Eigensolver |
The probability distribution is modeled using the energy function at the heart of the BM-based ANNs. It is given for BM2, BM3, and RBM as the following:
![]() | (1) |
![]() | (2) |
![]() | (3) |
Eqn (1)–(3) are functions of the bitstrings v ∈ {0,1}nv and h ∈ {0,1}nh. In this approach, the binary signal of the unit vi = 0, 1 is seamlessly homologized to the two levels of a qubit; the same applies to the binary unit hj = 0, 1. With this mapping between nodes and qubits, a superposition of {|v〉} in the NQS, written as
![]() | (4) |
As a viable form of the NQS for quantum computing, we use the following ansatz for structuring Cv,15
![]() | (5) |
![]() | (6) |
Now let a joint set of the parameters be defined as σ ≡ (θ,τ). The update of the whole parameters σ for the training of the BM models is a task that can be processed efficiently by classical computation. This is achieved by finding variationally optimal σ based on energy minimization. This optimization is related to reinforcement learning because neither reference data nor prior knowledge of the wave function is used. For the updating, we use the gradients of the energy E (= 〈Ψ|H|Ψ〉) with respect to the parameters, which are calculated to be where the Hamiltonian H is obtained from a user-specified chemical structure in the first-principles manner, and Oσ is expressed in the locally discretized form with the basis |v〉 as
(see ref. 15 for more details). Iteratively updating σ leads us to determine the optimal parameters that meet the variational condition
(Fig. 1b).
As discussed in Section S1.6, the quantity 〈H〉 (= 〈Ψ|H|Ψ〉) is an object that can be evaluated as a sum of Pauli operator terms measured with Ψ prepared with the given neural network parameters σ. Importantly, in the case of BM2, BM3, or other HBM, this simplicity can be further applied to 〈HOσ〉 and 〈Oσ〉 for gradients (eqn (S19)–(S20)†); thus, the QC efficiency can fully benefit the computation of these quantities via quantum-native processes. However, for the RBM, the gradient-related objects 〈HOσ〉 and 〈Oσ〉 cannot be simply evaluated in a similar manner because of the presence of the sigmoid function (Sig) in its gradient formulae (eqn (S21)†). With the QC simulator, we evaluate Sig in RBM's gradients classically. The state of preparation will be discussed in detail later. In the previous implementations tailored to classical computing, these expectation values were evaluated using the stochastic approach. This widely used ML technique takes a statistical average from Markov chain Monte Carlo (MCMC) sampling over the distribution generated by the ML model. This classical sampling process is entirely replaced by the quantum computing procedure in this study.
![]() | ||
Fig. 2 (a) Qubit architecture employed in this study. (b) The whole procedural steps of the state preparation of the NQS on a QC. (c) Graphical representation of a quantum circuit for preparing the Gibbs distribution state (eqn (14)). (d) Illustrative process of the bitwise operations for evaluating the Gibbs factor using the controlled Ry gate. |
Then, we apply the Hadamard gate to all the visible qubits, forming the following uniform superposition in the visible qubit space:
![]() | (7) |
As an alternative to this state, we may employ a uniform superposition of the subgroup of {|v〉} built under the constraint of the particle-number (PN) symmetry54 (see also Section 3 and S1.5† for details); indeed, it is used extensively in our test calculations. The PN-conserving preparation circuit for the case of the singlet state with four electrons in four orbitals is shown in the ESI.†
The next step is to obtain the following state by transforming eqn (7)via Kitaev's QPE procedure:37
![]() | (8) |
Ẽv(θ) ≡ E(v;θ)/D + Δ | (9) |
The QPE algorithm is a technique for finding θn of the eigenvalue e2πiθn on a QC, given the unitary operator U and the eigenvector |ψn〉 such that U|ψn〉 = e2πi
θn|ψn〉.37 The QPE-based procedure is shown in Algorithm S1† with its subroutine for U Algorithm S2.† Note that Algorithms S1 and S2 are provided in Section S1.1 and S1.2 of the ESI,† respectively. The structure of U is a key ingredient to realize the formation of eqn (8)via the QPE, and its quantum circuit, here named the energy curation gate, should be built, in this case, based on the postulation U|v〉 = e2πi
Ẽv|v〉. This U appears to behave as an evolution operator that attaches the converted BM energy function Ẽvvia the phase kickback to the basis |v〉 in |Ψ〉 non-iteratively. We underscore that it does not involve Hamiltonian evolution or Trotter steps. The quantum algorithm of applying U based on the phase shift rotation
is defined as,
![]() | (10) |
A primary part of the task can be achieved by applying a sequence of bitwise transformations on the energy register qubits. This operation aims to read out the stored Ẽv and transform it into the rescaled coefficient , by which the basis of the superposition is scalarly scaled via the qubit rotations. In this operation, which is bitwise, the Ry(Θ) gate is applied on the k-th anicilla qubit with the angle Θ = 2arccos(e−D2−(k+1)) conditioned on the k-th energy register qubit. This allows the state with the k-th ancilla qubit in the state |0〉 to be transformed as follows:
![]() | (11) |
![]() | (12) |
As written in Algorithm 1, we then apply the aforementioned QPE procedure (Algorithm S1†) on eqn (12) in reverse, allowing the energy register qubits to revert to |0〉 as follows:29
![]() | (13) |
Next, a measurement is performed on the ancilla qubits. If |0〉nreg is observed, the state is indicated to result in the formation of the Gibbs distribution state:
![]() | (14) |
Finally, we turn to the preparation of the phase segment of eqn (5) as the rest of the task. We apply the phase shift rotation (eqn (10)) on eqn (14) over all the visible qubits and furthermore apply its controlled gate over all the pairs (see also Algorithm S3 in Section S1.3 for details†), finally obtaining the target NQS:
![]() | (15) |
The QAA process performs amplification iteratively. Given that the whole process of the state preparation to form eqn (14) is written as |Ψ〉 = S|0〉, the whole process of the same S is repeatedly performed during the single QAA process. Thus, an important consequence is that the number of amplification iterations is a factor arising in the scaling of the circuit depth.
The research on quantum computing to prepare the Gibbs state for a quantum system, e−βH, is motivated by its importance in statistical simulation, ML, and various others.31,32,56 There are several proposed algorithms, which were first shown by Terhal and DiVincenzo,55 followed by Poulin and Wocjan31 revealing rigorous upper bounds, and then reported based on quantum walks and Metropolis sampling,56–59 and others.60,61 These algorithms, however, are considered to be formidable for the NISQ devices.62
The use of the variational quantum algorithm (VQA) approach has been recently highlighted for preparing a Gibbs state with high fidelity and low-depth circuits.62–72 The VQA class is a classical/quantum hybrid scheme in which a quantum circuit is optimized classically but the objects evaluated on a QC can be reduced in number.67,68,73–75
The work of Wu and Hsieh62 shows a VQA to prepare a Gibbs state ρ via purification of the thermal state preparation named thermofield double (TFD) states. The TFD state is written as using the dual Hamiltonian HA + HB = H ⊗ 1 + 1 ⊗ H and entangled Hamiltonian
where |ψ0〉 is the ground state of the HAB. This ansatz is related to the quantum approximate optimization algorithm (QAOA)73 and involves alternating time evolution of the Hamiltonian HA + HB and HAB. This hybrid approach is oriented to optimizing the parameters
by minimizing the Gibbs free energy F(ρA) = E(ρA) − kBTS(ρA) = Tr(ρAH) − β−1 Tr(ρA
ln(ρA)), which serves as a cost function, where
. Warren et al. showed an extension of this approach introducing a more viable cost function C(ρ) = −
Tr(e−βHρ)/Z + Tr(ρ2)/2 avoiding the estimation of the von Neumann entropy.63 As pointed out in ref. 63, measuring the entropy S and its gradients is a difficult task. In addition, the dynamically generated compact ansatz realized by the adaptive derivative assembled problem-tailored (ADAPT) based VQAs48,76 was used for preparing the state at a lower circuit depth. The combination of the idea of this ADAPT-based algorithm with the formation of our Gibbs distribution state (eqn (14)) is interesting; however, its direct combination seems to be difficult because there is no operator form of H (e.g., HA = ∑iZA,iZA,i+1) directly representing the BM energy function E(v;θ) (eqn (1)–(3)) in our Gibbs distribution state.
Tong et al.77 recently presented a quantum algorithm called fast inversion to solve a class of quantum linear system problems (A + B)−1|b〉 via block encoding.78 The algorithm assumes that A is usually large but ‖B‖, ‖A−1‖, ‖(A + B)−1‖ is O(1). It was shown to enable preparing the Gibbs state as a special case from e−H|b〉 with H = A + B by setting |b〉 to the maximally entangled state. The core of the block encoding lies in using a subnormalized unitary matrix UA to encode A as a submatrix,
![]() | (16) |
Another approach closely related to the Gibbs state preparation is the quantum Boltzmann machine (QBM)32,70,71,79–82 as a realization of QML towards data modeling (not molecular wave function modeling). Unlike the conventional BM approach with the classical energy function {Eθi} used in this work, the QBM uses the energy operator, namely a parameterized Hamiltonian Hθ, typically represented by an Ising network, for the neural networks. The distribution of the QBM is thus modeled by the quantum Gibbs state e−Hθ/Z, which bears a resemblance to the energy-function-based BM distribution (e.g., eqn (14) in our case). Very recently, Zoufal et al.70 and independently, Shingu et al.71 proposed a VQA approach to prepare the Gibbs state of the QBM
via the variational imaginary time simulation algorithm recently proposed by ref. 67 and 68, based on the normalized imaginary time evolution (ITE) ansatz |ψ(τ)〉 = C(τ)e−Hθτ|ψ(0)〉 where
. This hybrid VQA approach to QBM modeling including the computation of the cost function and its gradients has been proposed, revealing that it is compatible with the NISQ computers due to the feasibility of the underlying variational quantum ITE scheme. The fully visible QBM network with bipartite graphs has been so far studied,70,71,80 corresponding to quantum analogy to BM2, and ref. 70 demonstrated compatibility to its variant involving the hidden nodes. The QBM approach with the variational quantum ITE scheme appears to be extremely powerful for constructing the Gibbs state for the QBM neural network and training it on noisy intermediate-scale quantum computers. As analyzed by ref. 80, the QBM and conventional BM models have, by construction, unequal learnability but offer similar (in some cases different) capabilities of feature extraction. It would be thus interesting to explore the adaptation of this variational QBM approach to the NQS-based electronic structure computation for future work.
The bond dissociation energy curves of H2 were calculated with the bond length ranging from 0.25 to 1.95 Å. The geometries of cis- and trans-conformers of 1,3-butadiene were determined by the geometry optimization at the B3LYP/cc-pVDZ level of theory,86,87 are provided in the ESI.† All the single-point structures used in the potential energy curve (PEC) calculations for the electrocyclic reaction of the PABI are tabulated in the ESI.†
Two types of the initial states for the visible qubits in the state preparation of the NQS were used. They changed the treatment of the particle-number subspaces. The first type is the Hadamard-transformed state, as described in Section 2, involving 2nv basis states, which are complete with spanning the Fock space for the given second-quantized Hamiltonian. The use of this initial state, denoted as FS, considers all possible numbers of electrons with arbitrary spins in constructing the NQS. The second type is the particle-number (PN) state, prepared using the quantum circuit shown by Gard et al.54 It is an equally weighted superposition like the Hadamard-transformed state but only using the basis states with a desired number of electrons. The PN-conserving initial state undergoes our state preparation process, forming an NQS as a superposition of these PN-conserving basis states. Fig. S1† shows a PN circuit used as the initial state for the CAS(4e,4o) calculations. The BM2 and BM3 states prepared with the FS treatment are denoted as BM2(FS) and BM3(FS), respectively, while those with the PN-conserving basis states were denoted as BM2(PN) and BM3(PN), respectively.
The trained NQS models with BM2 and BM3 were obtained for all systems. We tested the various numbers of the energy register qubits to gauge their impact on the accuracy in the prepared state. The examined nreg was 6, 8, 10, and 12 for H2 and C4H6, and 6 for PABI. For comparison, the reference data were obtained with 50 energy register qubits, offering near double-precision accuracy. The RBM energies were calculated for H2 with nreg set to 8 and 50 and for butadiene with nreg set to 8. The numbers of hidden nodes (nh) were tested to be 2, 3, 4, and 5 for H2 and 4 for butadiene. As shown in Fig. 5a, with nreg = 6, where obtaining stable convergence was difficult, we used the lowest of the energies of the training history as the energy of the solution.
It should be noted that the noise and system errors were not considered in all the quantum-computing simulations unless otherwise stated.
Fig. 3b–e show the errors of the potential energy curves (PECs) relative to the FCI predictions as a function of tested nreg for BM2/CMO, BM2/LMO, BM3/CMO, and BM3/LMO, respectively, prepared with the FS treatment. Regardless of the model and orbital type, the total energies with nreg ≥ 10 are accurate to 10−6Eh, far exceeding the chemical accuracy (≈1 mEh). Round-off errors associated with the truncation of energy register qubits were not wholly vanishing in the obtained energies even with nreg = 10 and 12, compared to the results with nreg = 50; however, they are negligible. The round-off errors are systematically eliminated with increasing nreg, exhibiting an approximately quadratic convergence with nreg. Interestingly, the coarsest representation of the energy registers with nreg = 6 corresponding to a precision of 2−6 ≈ 0.016 can produce errors in energy falling below 1 mEh. Overall, the energies were predicted better with LMOs than with CMOs for the given number of the energy register qubits. The BM3 does not consistently outperform the BM2 across the curves in this system. This contradicts the theoretical assumption but appears to be ascribed to the exceeding complexity of the BM3 parameterization compared to the relatively simple structure of the H2 wavefunction.
In Fig. 3f–i, the errors of the PECs obtained with the models prepared as a PN state54 with BM2/CMO, BM2/LMO, BM3/CMO, and BM3/LMO, respectively, are monitored. As detailed in the Methods section, this preparation can efficiently train the BM models, focusing on the descriptors (i.e., configurations) conserving the target electron number. In this test system, the PN-conserving configurations amount to 6 = (4C2), which is much smaller than the dimension of FS, 16 = (24). This reduction should have a favorable impact on the representability of the BM models. It was indeed confirmed in the drastic rectification of the errors observed in all the predicted PECs compared to those of the FS variants (Fig. 3b–e).
The training as an FS state requires that the models predict the coefficients Cv to be exactly zero for the PN-unconserving v. This particular requirement is imposed during the training process, i.e., optimizing θ and τ; nonetheless, the optimization does not discriminate between the PN-conserving and -unconserving v. In our experience with FS-based calculations, the cost of learning for the PN-unconserving v was comparable to or even more significant than the cost of learning for the PN-conserving v. The modeling of the BM that outputs zero precisely against several different inputs is seemingly a numerically difficult task. In the PN approach, this requirement and related cost completely disappear because the values of these PN-unconserving coefficients automatically vanish regardless of the BM's parameters. Compared to the FS approach, this compactness in the PN approach indeed plays a beneficial role in showing better performance even with nreg = 6. It should be emphasized that the difficulties pointed out above in the FS treatment stem from the nature of our underlying ML model, as observed in the previous study based on the MCMC sampling, and are not fundamentally caused by our quantum algorithm of state preparation.
Fig. 3j–m show the errors of the PECs computed using the RBM model as a function of varying nh with CMO and LMO basis in addition to testing nreg = 8 and 50. The results with nreg = 50 indicated that the RBM-based ANN state with nh = 2, the smallest RBM, is capable of reproducing the FCI energies across the curve with a near machine accuracy. Our quantum algorithm's validity for preparing the RBM state was confirmed even for enlarging nh. Moreover, reducing nreg to 8 for the state preparation for the RBM still yields a reasonable accuracy in energy prediction.
We attempted to closely analyze the effect of the round-off errors in the energy register qubits on the coefficients Cv. In Fig. 3n and o, the distributions of the weights |Cv|2 determined by BM2/LMO with nreg = 6 are shown for bond lengths of 0.25 and 0.9 Å. The exact distributions taken from the corresponding FCI results are included in the graph for comparison. The BM distribution errors appear appreciable for a bond length of 0.25 Å, whereas they were negligible for 0.9 Å.
To scrutinize these errors, we focus on the weight ratio between two configurations v and v′. We found that the precision to represent the ratio |Cv|2/|Cv′|2 is in fact limited depending on nreg and D, and formally written as exp(−2nreg·D). This limited precision stems from the finite binary representation of the energy register Ẽv (eqn (9)). With a bond length of 0.25 Å, the exact ratio of the weights for v = (1100) and v′ = (1001) is observed to be 1.20 (= 0.272/0.227) from the FCI result; however, the BM2 calculation with the resulting scaling constant D = 78 can express the ratio with a precision of 3.38 (=exp(−26 × 78)), which exceeds the exact one. This poor precision underlies the errors in Fig. 3n. On the other hand, the BM calculation with the bond length of 0.9 Å resulted in a scaling constant D = 38, and thus, can express the ratio of the weights between v = (1100) and v′ = (1001) with a precision of 1.81 (= exp(−26 × 38)). This precision is comparable to the ratio of the corresponding weight for FCI results, 1.83 (= 0.324/0.177); thus, the satisfactory accuracy in Fig. 3o was delivered. The above discussion indicates that the scaling constant D plays a crucial role in determining reliability but is an uncontrollable parameter. An increase in the number of the energy register qubits is a simple way to rectify the round-off errors arising in Cv and consequently enhance the accuracy of the energy prediction.
The noise error was characterized by the error of E relative to the noise-free prediction. This error is compatible with the errors in the estimation of the gradients because they are also evaluated as expectation values. Fig. 4a–d compile the errors of 100 re-run E predicted via the NQS quantum circuit simulator considering the aforementioned noises. We observed that adding the depolarizing noise to the gates in some cases caused an overshooting of the variational E. Recompiling the data in Fig. 4e and f, we found that the absolute errors decrease with decreasing gate noise, systematically approaching the noise-free E, with either 104 and 105 shot noises. This behavior is reflected by the fact that the average of the errors of 100 re-runs yields very similar error convergence regardless of the number of shots (Fig. 4g). Thus, the average of the errors appears to be robust against the shot noise. Fig. 4g indicates that the error trend is approximately linearly proportional to that of εG. Fig. 4h plots the standard deviation of the errors of E predictions, revealing that the standard deviations behave in approximately O(N−0.48Cε0.42G). Thus, increasing NC and decreasing εG may have similar impacts on the mitigation of the standard deviation of the noise errors. Further error mitigation can be investigated for future work. From a perspective of fault-tolerant simulations for quantum chemistry calculation, which have been recently highlighted,50,51 the fault-tolerant technologies may realize the fault-tolerant simulation of the noise-susceptible ansatz.
s-trans (CMO) | s-cis (CMO) | ΔE | s-trans (LMO) | s-cis (LMO) | ΔE | |
---|---|---|---|---|---|---|
a n v = 4. | ||||||
RHF | −16.49 | −14.04 | 2.46 | |||
CAS-CI | −102.70 | −100.30 | 2.40 | −102.70 | −100.30 | 2.40 |
BM2 | ||||||
6 Qubits | −88.92 | −85.00 | 3.92 | −102.57 | −100.11 | 2.46 |
8 Qubits | −90.64 | −85.88 | 4.77 | −102.65 | −100.25 | 2.41 |
50 Qubits | −90.82 | −86.09 | 4.73 | −102.67 | −100.26 | 2.41 |
BM3 | ||||||
6 Qubits | −88.05 | −83.46 | 4.59 | −102.55 | −99.90 | 2.66 |
8 Qubits | −90.02 | −87.03 | 3.00 | −102.65 | −100.27 | 2.38 |
50 Qubits | −90.22 | −87.24 | 2.98 | −102.69 | −100.30 | 2.39 |
RBMa | ||||||
8 Qubits | −54.89 | −52.69 | 2.20 | −101.54 | −99.20 | 2.35 |
As shown in Fig. 5a, we monitored the training progress of the BM2/LMO energy for the s-trans isomer with various nreg. The results with nreg = 6 shows a largely oscillating behavior. This instability was relatively mitigated with nreg = 8. Ten or more energy register qubits were required to achieve stable training. The optimization of the BM parameters can suffer from convergence issues analogous to the Barren plateau problem in the VQE optimization. Our previous study15 mitigated this issue to a certain degree via the two-step optimization: the first hundred learning iterations only optimize the phase parameters τ with the amplitude parameters θ fixed with the initial random values, and the rest of the iterations undergo the optimization of both θ and τ. This scheme was used in the present work. The convergence should be checked using different random seeds. Ref. 89 and 90 studied Barren plateaus for the QBM.
![]() | ||
Fig. 5 (a) The progress of BM2/LMO energy as a function of training iteration for s-trans butadiene with nreg = 6, 8, 10, and 12. (b) The counts of QAA cycles as the progress of training of the BM2 for CMO and LMO basis with FS and PN treatments. The heat maps of network parameters Wij (eqn (18)) for (c) BM2/CMO and (e) BM2/LMO, and of particle correlation Iij (eqn (17)) for (d) BM2/CMO and (f) BM2/LMO, calculated for the s-trans isomer. |
Fig. 5b shows the number of QAA39 cycles (Algorithm S4) to achieve a desired, amplified state in every state preparation process for the BM2 calculation of the s-trans isomer using nreg = 8. The average number of the QAA operations per iteration varies depending on the orbital type (LMO or CMO) and the particle-conservation treatment (PN or FS). Compared to the CMOs, using LMOs resulted in fewer QAA operations, with a frequency of 1.0 and 4.5 times per iteration for the PN and FS treatment, respectively. This is related to the fact that when using the LMOs, the configuration distribution of the prepared state is widely spreading (also see Fig. S2†) to a more significant degree than in the CMO case and less dissimilar to that of the initial state that begins with a uniform distribution. The PN treatment significantly reduces the cost associated with the amplification compared to the FS treatment. Note that marginalizing the hidden layer of RBM's Cv lowers the probability of the target state by a factor of approximately ; thus, the number of the QAA cycles required is much larger, scaling up exponentially with nv, compared to the BM cases.
In Fig. 5, we attempt to show a numerical comparison between the resulting ANN parameters and the physical quantities computed from the ANN state. Fig. 5d and f show heat maps of the so-called particle correlation Iij,91
Iij: = 〈Ψ|![]() ![]() ![]() ![]() | (17) |
Wij : = {wij + (ai + wii)/6 + (aj + wjj)/6}(1 − δij) | (18) |
Fig. 6f and g show the natural orbital occupation numbers (NOONs) of the state calculated at the closed-ring and open-ring geometries, respectively. This analysis shows that the electronic character of the closed-ring structure is of a quinoidal nature, which is within the single-determinant picture using the CMO basis. At the open-ring structure, however, the open-shell biradical nature emerges, as evidenced by the half-integer NOONs of NO2 and NO3, which are approximately 1.6 and 0.4, respectively. As indicated by the distribution of the configurations (Fig. S4†), the biradical state is of multireference (or strongly corrected) electronic character. These calculations thus demonstrated that our quantum computation of the BM-based models is within reach of accurate multireference wavefunction calculations that involve a variation between quinoidal and radical nature.
QPE | UCCSD | BM2 based NQS | |
---|---|---|---|
a The same for BM3. b Extra complexity O((N3CASnreg)NQAA) is added for BM3. | |||
Ref. | Ref. 46 | Ref. 47 | This work |
Framework | QPE | VQE | VQE |
Energy | Eigenvalue | Expectation value | Expectation value |
No. of qubits | N CAS + nreg | N orb |
N
CAS + 2nreg![]() |
No. of gates | O(N5CASnregNTrotter) | O(N3orbN2elecNTrotter) |
O((N2CASnreg + n2reg)NQAA)![]() |
Opt. iterations | O(1) but long-time evolusion | O(N3orb/ε2) | O(N3CAS/ε2) |
Reference | required | required | No |
For all these algorithms, the total number of qubits required grows in proportion to the number of the orbital basis used to represent the correlated electronic configurations. In addition, the register qubits are further required for QPE and the NQS. It is widely acknowledged that realizing the QPE calculation necessitates the use of fault-tolerant and scalable QCs to handle its long sequence of quantum operations, O(N5orbNregNTrotter), where the required number of Trotter steps (NTrotter) is generally very large for achieving an acceptable accuracy. The VQE algorithm on NISQ devices can mitigate this hardware challenge; in its UCC ansatz, the number of gates is drastically reduced mainly because NTrotter is relatively fewer compared to the QPE case. Notably, the operators in the exponent of the BM/RBM models are commutable, although it is not so in QPE and UCC; thus, the NQS algorithm forgoes the Trotter decomposition. The critical factor in determining the depth of the quantum circuits in the NQS is, as written earlier, the number of QAA39 cycles (NQAA). As illustrated in Fig. 4b, NQAA largely varies depending on systems and basis types and can be small when the resulting distribution is close to an equally weighted superposition that is used as an initial state.
Another marked difference is that QPE and UCCSD require the prerequisite of the reference wave function, for which the HF state is typically chosen, while the NQS does not. This should have a characteristic effect on the depth of the quantum circuit of the state preparation. As mentioned earlier, the main aim of this work is to find a direct route or quantum-native algorithm that can suitably connect the ML-inspired NQS theory and quantum computing, but not to explore an substitution to UCC or other highly tuned-up VQE methods.
• The process to prepare the HBM-based NQS representing the CAS-CI state is fully quantum-native, forgoing the random sampling of the intermediate state performed in the approach by Xia and Kais.
• Kitaev's QPE37,38 is exploited to evaluate the BM energy functions via its phase kickback trick with a flavor similar to the GEQS algorithm29 of the general-purpose QML. Unlike the way of using the QPE in Hamiltonian evolution as done by Aspuru-Guzik et al. and others,46 our approach for preparing a single NQS uses it in a non-iterative manner with no time evolution or Trotter steps.
• The NQS described in this study is a hybrid quantum-classical algorithm and classified as a VQE type.
• The HBM energy functions with no hidden nodes, namely BM2, BM3, or even higher-order, are a suited class of underlying ANNs for an NQS prepared on a QC. With the HBM, our approach estimates analytical gradients of the energy with respect to network parameters using the prepared NQS, just as the energy is calculated from the expectation values of Pauli strings.
Meanwhile, some concerns are found in the NQS preparation algorithm as follows:
• The re-scaling parameters for the energy function have to be determined, requiring an additional quantum treatment subject to the QUBO problem53 to find the maximum and minimum of a function.
• The values of the re-scaled energy functions with a bit-vector representation are efficiently calculated and stored in a user-given number of energy register qubits, which can thus suffer round-off errors. In our test cases, the use of ten energy register qubits showed satisfactorily convergent results.
• Formulation with the RBM model for our quantum algorithm is straightforward; however, several concerns are aroused in the implementation. (i) The energy gradients cannot be analytically computed on a QC as a sum of the expectation values of Pauli strings because of the involvement of the sigmoid function, which is, in contrast, absent in the HBM. (ii) The qubits proportionally increase with an increasing number of hidden nodes. Contrarily, the HBM has no dependence in the number of qubits on its order. (iii) Our testing on the simulator revealed that marginalizing the hidden layer of RBM's Cv lessens the probability of the target state by a factor of approximately in Gibbs distribution state preparation. This means that NQAA grows rapidly with increasing nv.
In addition, the following characteristics of our approach are interesting, but their in-depth understanding should be investigated in future work:
• As analyzed in Section 4.3, the critical factor determining the depth of NQS's quantum circuits is the number of the QAA cycles to amplify the probability of the desired state. The numerical tests showed that it is apt to be relatively small if the resultant state is near maximally entangled or highly multireference.
• The NQS ansatz does not require the prerequisite of the reference wave function unlike the UCC, which usually uses the HF/CMO state as the reference. On a complementary note, it is well acknowledged that when using CMOs, the ground state of the closed-shell small molecule is usually of a single-determinant character or weakly correlated. Despite the same wavefunction, this correlation nature is drastically changed to a multi-determinant (or strongly correlated) character when using LMOs (also see Fig. S2 and S4†), as also emphasized in ref. 15. Our assessments were thus carefully performed using CMOs and LMOs from such a perspective.
We have confirmed on the simulator that the quantum algorithm (Algorithm 1) is implementable with the use of elementary quantum gates except for the process to determine D and Δ, and formally involves no approximation to the given NQS ansatz except the round-off errors due to the finite precision of the energy register. The NQS calculations simulated for illustrative molecules overall reproduced the ground-state CAS-CI energy and wave function with high accuracy when the computational conditions were given properly. The errors of the trained NQS compared to the CAS-CI wavefunction are fundamentally irrelevant to the use of quantum state preparation proposed in this work, and have the same nature of error as was observed in the fully classical NQS scheme studied in the previous work. It should also be similarly pointed out that convergence behavior in training the NQS's network parameters is the same as those observed in our previous study15 and essentially inherited from Carleo's approach. The concerns related to them are not in the scope of this research. In ref. 15, the progress of the training for the fully classical NQS based CAS-CI calculation with CAS(6e,6o) was reported along with the results with CAS(8e,8o). The fully-visible-unit nature of the HBM neural networks plays a crucial role in its well-suited accommodation to a quantum algorithm. This adaptability in the HBM should be underscored as a marked advantage over the RBM with the hidden nodes. Additionally, as demonstrated in the previous work, it is of great importance that the increase in the order of the HBM instead of the hidden nodes for the RBM can systematically improve a model with a certain numerical stability. This exceptional usability of the HBM introduced in our previous study for the NQS should benefit the general-purpose QML.
Footnote |
† Electronic supplementary information (ESI) available. See DOI: https://doi.org/10.1039/d2dd00093h |
This journal is © The Royal Society of Chemistry 2023 |