Nobuko Ohba*a,
Takuro Yokoyab,
Seiji Kajitaa and
Kensuke Takechia
aToyota Central R&D Labs., Inc., Nagakute, Aichi 480-1192, Japan. E-mail: e4606@mosk.tytlabs.co.jp
bToyota Motor Corporation Higashi-Fuji Technical Center, Susono, Shizuoka 410-1193, Japan
First published on 17th December 2019
Oxygen storage materials (OSMs), such as pyrochlore type CeO2–ZrO2 (p-CZ), are used as a catalyst support for three-way catalysts in automotive emission control systems. They have oxygen storage capacity (OSC), which is the ability to release and store oxygen reversibly by the fluctuation of cation oxidation states depending on the reducing or oxidizing atmosphere. In this study, we explore high-capacity OSMs by using materials informatics (MI) combining experiments, first-principles calculations, and machine learning (ML). To generate training data for the ML model, the OSC values of 60 metal oxides were measured from the amount of CO2 produced under alternating flow gas between oxidizing (O2) and reducing (CO) conditions at 973, 773, and 573 K. Descriptors were computed by atomic properties and first-principles calculations on each oxide. The support vector machine regression model was trained to predict the OSC at each temperature. The features describing OSC were automatically selected using grid search to achieve practical cross validation performance. The features related to the stability of the oxygen atoms in the crystal and the crystal structure itself such as cohesive energy are highly correlated with OSC. The present model predicts the OSC of 1300 existing oxides. Based on its high predictive power for OSC and synthesizability, we focused on Cu3Nb2O8. We synthesized this material and experimentally confirmed that Cu3Nb2O8 showed a higher OSC than conventional OSM p-CZ. This MI scheme can significantly accelerate the development of new OSMs.
In recent years, materials informatics (MI) has attracted attention as a method of accelerating the search for new viable materials from a wider range of materials space. MI uses materials data, and constructs a predictive model using machine learning techniques in order to discover new materials that have not yet gained attention.8,9 A representative search method in MI is virtual screening based on supervised learning, as shown in Fig. 1. This method has been used in a variety of applications, such as in the search for oxide ion conductors,10 organic EL materials,11 cathode materials for lithium-ion batteries,12 and lithium ion conductors.13,14 While there are many researches that have reported to able to accelerate material developments by MI, there are still scarce reports that achieved experimental validations. Indeed, according to a review paper,15 only 26 studies have been reported along with experimental validations. One of the difficulties is that experimental data in literature are observed in different conditions. In this study, we utilize a well-defined experiment dataset which were obtained in a unified condition by the same facility, to perform the virtual screening method. The aim of this virtual screening is to discover a novel OSM that indicates a higher OSC than the current state-of-the-art storage materials. The process of virtual screening is outlined in Fig. 1 and this paper is organized along these procedures.
MOx−δ + δ/2O2 ↔ MOx | (1) |
The enthalpy difference between the left and right states in eqn (1) corresponds to the formation energy of the oxygen defect. Therefore, the oxygen-defect formation energy is appropriate for an explanatory variable that accurately describes the characteristics of OSC. This quantity should be determined by the first-principles calculation of a large supercell made of repeatedly-connected unit cells to describe the isolated defects. This results in a high calculation cost. To accelerate the evaluation, Deml et al.19 performed the first-principles calculations on 45 oxide materials and used them as the training data for a linear regression model of the oxygen-defect formation energy, achieving an error range of 0.2 eV. Their model used three explanatory variables, which are: (i) heat of formation of the oxide, (ii) oxygen p-band center with the center of the band gap as the origin, and (iii) the average value of the electronegativity of the elements that constitute the polyhedron in the crystal structure. The features (i) and (ii) can be obtained by the first-principles calculation on a small unit cell, and (iii) can be easily obtained if the constituent elements and the crystal structure are known. Therefore, if the oxygen-defect formation energy can be predicted using (i) to (iii), the calculation cost is dramatically reduced in comparison to the direct calculation which uses the supercell.
It would be the best to be able to obtain the explanatory variables in a simple way. On the basis of the above insights, we examined the following seven explanatory variables of OSC:
E_coh: cohesive energy computed by Vienna Ab initio Simulation Package (VASP) code20 [eV]
band gap: Band gap (Eg) [eV]
p band center: oxygen p-band center [eV] (with the top of the valence band as the origin)
pband2: (p band center) − (band gap)/2 [eV] (with center of the band gap (Eg/2) as the origin)
delta chi: average value of the difference in electronegativity
weight/O: molecular weight per oxygen
average r: average distance between oxygen and the cation [Å]
Here we consider two cases of the definition of energy origins for the oxygen p-band center: the top of the valence band, and the middle of the band gap. Note that the E_coh, band features (band gap and p-band center or pband2), and delta chi correspond to the features of (i) to (iii) described above. Additionally, we added weight/O and average r as the features relating to the diversity of constituent elements and crystal structure.
To obtain these features, first-principles calculations were performed using VASP20 with PAW pseudo-potentials,21 which is based on the density functional theory. The Perdew–Burke–Ernzerhof (PBE)22 generalized-gradient-approximation functional was applied in the exchange and correlation energy terms. The cutoff energy of the plane-wave basis is 500 eV, with the k-point grid of the Brillouin zone divided into 0.01–0.02 Å−1. In the structural optimization, the atomic configurations were relaxed so that the force acting on each atom was 0.01 eV Å−1 or less. All the calculations were performed taking into consideration spin polarization.
In addition to the SVM, we investigated the prediction accuracies using other machine learning (ML) algorithms; Gaussian Process Regression (GPR),26 Kernel Ridge Regression (KRR),27 Linear Ridge Regression (LRR),28 and Neural Network (NN).29–32 These algorithms were also implemented using the machine learning library scikit-learn.25 Table S5 in ESI† shows the prediction accuracy of LOOCV for various ML algorithms at each temperature. It is found that the prediction accuracy of SVM model is the best at each temperature. The radial basis function is used as the kernel in SVM, GPR, and KRR models. There is almost no difference in the prediction results of these three models. On the other hand, the prediction accuracy of LRR was worse. The poor prediction accuracy in NN model is due to the small training dataset. In general, the NN model requires a highly diverse training dataset with sufficient representative examples for proper prediction.33 In this study, the non-linear prediction model SVM seems to be appropriate.
Pairwise scatterplots for the OSC data at 973 K and seven explanatory variables are shown in Fig. S3 of ESI.† Here we discuss the details of the selected explanatory variables. Although the oxygen defect formation energy is considered to correlate with the OSC as mentioned in previous section, it is interesting that the “E_coh” and “p band center” have correlation with not only the oxygen defect formation energy but also the measured OSC. The correlation coefficients between the common logarithm of OSC (log10(OSC[μmol-O g−1]@973 K)) and these variables (weight/O, E_coh, and pband2) were −0.19, 0.45, and −0.10, respectively. Since OSC is standardized per oxide weight, it is reasonable that the weight/O feature is selected as the explanatory variable. The E_coh corresponds to the energy difference between the crystalline state and the isolated atoms. Therefore, the higher the E_coh, the more unstable the OSM will be. This result indicates that unstable OSMs would be preferable towards improving the OSC. The feature pband2 is expressed by the oxygen p-band center and the band gap. The lower the oxygen p-band center, the more stable the oxygen states in the crystal. Since the correlation of OSC with this variable is negative, it indicates that oxygen should be stable within the crystal for higher OSC values. This result can be explained by the fact that moderate stability oxides are likely to release and store the oxygen molecules in reducing and oxidation atmospheres, respectively.
The synthesis procedure is described in ESI.† X-ray diffraction (XRD) confirmed that a single phase was obtained. Fig. 5 shows the measurement results of the OSC along with those of a current high-performance OSM, pyrochlore-type CeO2–ZrO2 (p-CZ), for comparison. The present Cu3Nb2O8 showed a higher capacity than the baseline p-CZ at all temperatures. Furthermore, at 773 K, the predicted and measured values were in good agreement. On the other hand, the difference between the predicted and measured values of p-CZ is large, likely because the training dataset at 573 K includes only one pyrochlore-type crystal structure. The regression accuracy may improve with augmentation of the training dataset. Another reason seems that the regression model cannot learn the chemical insight of the redox of cations. For example, when the oxidation state of Ce in p-CZ changes from Ce4+ to Ce3+ during oxygen release, the theoretical maximum of OSC is 1693 [μmol-O g−1], while the prediction value at 573 K is higher than the theoretical maximum. Building the model considering the physical and chemical insights is one of the challenging problems in MI.
In addition, the rate of oxygen release/storage (OSC-r) was simply estimated from the transient response curve as shown in Fig. S4 of ESI.† The OSC-r for Cu3Nb2O8 and p-CZ were 2.38 × 10−5 mol min−1 and 1.60 × 10−5 mol min−1, respectively. Surprisingly, Cu3Nb2O8 has better performance than p-CZ from the view point of rate, though the OSC-r is not predicted in our ML model directly. Regarding the OSC-r estimated from Fig. S4 of ESI,† the OSC-r of p-CZ is increasing monotonically as the temperature increases from 573 K to 973 K. On the other hand, the OSC-r of Cu3Nb2O8 at 773 K is highest among three temperatures. This performance of Cu3Nb2O8 is considered to be related to the possibility of phase separation and thermal stability. Nonetheless, the industrial use of an OSM requires other performance metrics such as thermal stability and easy synthesis by mass production, which are not considered in this virtual screening.
The selected explanatory variables and crystallographic features of Cu3Nb2O8 were compared to the training data in Fig. 6. The color of each point in Fig. 6 represents the measured value of OSC[μmol-O g−1] at 973 K. The 3-dimentional scatter plot in the features space (Fig. 6(a)) indicates that Cu3Nb2O8 is the most similar to Ca2Fe2O5 among training datasets. The second-most similar materials are γ-Fe2O3 and ZnMn2O4 which are classified in spinel-type structure. This result indicates that the Cu3Nb2O8 is similar to the spinel-type OSMs in terms of not only crystal structure (to be mentioned later) but also the material features space. In addition, we applied the Smooth Overlap of Atomic Positions (SOAP) kernel,35,36 which describes the similarities of the local structures surrounding oxygen atoms. The structural similarity between the 60 oxides belonging to the training dataset and Cu3Nb2O8 was calculated as a form of SOAP distance matrix. Parameters of the SOAP kernel, the cutoff radius and width of Gaussian distribution of atoms, were set at 5 Å and 0.5 Å, respectively. To distinguish the elemental species in the SOAP measure, we include the electronegativity in the width of Gaussian function at 1, as in the ref. 36. Fig. 6(b) shows a reduced two-dimensional map of the SOAP distance matrix projected using the metric multi-dimensional scaling method37 implemented in the scikit-learn library. The training dataset can be classified in terms of their crystal structures: e.g. fluorite (such as CeO2), delafossite, and spinel types. Cations around the oxygen ion in spinel oxides form edge and vertex sharing tetrahedral sites. The discovered Cu3Nb2O8, which is not denoted by the spinel, has a similar structure but consists of the tetrahedral and triangular sites. This similarity is detected in the SOAP distances. This result suggests that utilizing MI highlights hidden OSMs that are structurally similar to the known materials, but difficult for humans to recognize only with the names of the structure types. There may still exist prospective hidden OSMs whose crystal structures have received little attention so far.
The oxygen storage and release reaction mechanism of the proposed material Cu3Nb2O8 is currently under investigation. In particular, the oxidation state change of Cu accompanying the generation of oxygen defects is important for the industrial use. Detailed reaction mechanisms, phase separation, and thermal stability of Cu3Nb2O8 will be reported in the future.
Furthermore, experimental validation of high OSC candidates other than Cu3Nb2O8 is also underway. Augmentation the training data with these validated ones makes virtual screening process a closed-loop. This is expected to improve the accuracy of the prediction model and further advance the material discovery.
Footnote |
† Electronic supplementary information (ESI) available. See DOI: 10.1039/c9ra09886k |
This journal is © The Royal Society of Chemistry 2019 |