Machine learning for design principles for single atom catalysts towards electrochemical reactions

Mohsen Tamtaji a, Hanyu Gao a, Md Delowar Hossain a, Patrick Ryan Galligan a, Hoilun Wong a, Zhenjing Liu a, Hongwei Liu a, Yuting Cai a, William A. Goddard III b and Zhengtang Luo *a
aDepartment of Chemical and Biological Engineering, Guangdong-Hong Kong-Macao Joint Laboratory for Intelligent Micro-Nano Optoelectronic Technology, William Mong Institute of Nano Science and Technology, and Hong Kong Branch of Chinese National Engineering Research Center for Tissue Restoration and Reconstruction, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong 999077, P. R. China. E-mail: keztluo@ust.hk
bMaterials and Process Simulation Center (MSC), MC 139-74, California Institute of Technology, Pasadena, CA 91125, USA

Received 15th March 2022, Accepted 13th June 2022

First published on 14th June 2022


Abstract

Machine learning (ML) integrated with density functional theory (DFT) calculations has recently been used to accelerate the design and discovery of heterogeneous catalysts such as single atom catalysts (SACs) through the establishment of deep structure–activity relationships. This review summarizes recent progress in the ML-aided rational design of heterogeneous catalysts, with a focus on SACs, in terms of structure–activity relationships, feature importance analysis, high-throughput screening, stability, and metal–support interactions for electrochemistry. Support vector machine (SVM), random forest regression (RFR), and deep neural networks (DNN), along with atomic properties as input features, are mainly used for the design of SACs. The ML results have shown that the number of electrons in the d orbital, oxide formation enthalpy, ionization energy, Bader charge, d-band center, and enthalpy of vaporization are the most important parameters for defining the structure–activity relationships for electrochemistry. However, the black-box nature of ML techniques occasionally makes a physical interpretation of descriptors, such as the Bader charge, d-band center, and enthalpy of vaporization, non-trivial. At the current stage, ML application is limited by the lack of a large and high-quality database. Future prospects for the development of a large database and a generalized ML algorithm for SAC design are discussed to give insights for further studies in this field.


1. Introduction

Heterogeneous catalysts play important roles in the synthesis of high-value chemicals through thermal, electrochemical, and photochemical reactions. Designing improved catalysts requires a deep understanding of how composition and processing affect the properties at the interface, but progress is hindered by the complexity of experimental and theoretical investigations.1 Thus, successes have often involved time- and resource-consuming trial-and-error experimental and theoretical investigations. On the other hand, recent advances in Quantum Mechanics (QM) calculations provide accurate information about how molecules react at the interface to form various products, but QM calculations are limited in the size of the system and the time scale of the simulations. In order to discover new catalysts for specific applications, a combination of time-consuming experimental and QM studies is used to develop an atomic-level understanding of the fundamental mechanisms and to develop preparation-structure or structure–activity relationships. Accordingly, there is a huge demand for the accelerated discovery of novel catalysts with desired activities. Machine Learning (ML),2–4 as a data-intensive tool, can accelerate time-consuming experimental and QM studies to predict the catalytic activity in the vast, high-dimensional space of heterogeneous catalysis.

Fig. 1 illustrates the general workflow for the integration of QM calculations and ML for the accelerated discovery of heterogeneous and single atom catalysts (SACs). The predicted data from QM calculations and feature vectors are used to design and train ML algorithms. The trained ML algorithms will then be used for not only the prediction of the optimal activity of heterogeneous catalysts, but also for performing feature importance analysis. Subsequently, optimized catalysts will be used for the desired reaction to produce valuable chemicals and fuels.


Fig. 1 The general workflow for the integration of QM calculations and ML for the rational design of heterogeneous catalysts. The process contains several steps: data generation using QM calculations, training of ML, optimization, and feature importance analysis, and using designed catalysts to produce chemicals and fuels.

Although the ML-assisted prediction of single physical properties such as formation energies5 and band gaps6 is widely applied for the purpose of materials discovery,7–10 its application to heterogeneous catalyst design and discovery11,12 is still at an early stage.13 Here, ML, as a supportive tool, aims to guide, not replace, experiments and QM calculations in the search for ideal catalysts.14 However, the main hurdles for employing ML in heterogeneous catalyst design are the lack of a consistent database, the lack of a universal ML algorithm, and the existence of only a few descriptors as input features for ML.15

Herein, we review recent studies reporting the incorporation of ML into QM calculations (typically density functional theory (DFT) calculations) and experiments to accelerate heterogeneous catalyst design and discovery for various reactions. Recent review papers have summarized the recent studies on the application of ML for catalytic reactions,16–20 reaction prediction,21 discovery of catalysts,13,22–27 inverse design of catalysts,28 and catalysis informatics.29,30 In this review paper, we focus mainly on the different aspects of ML in experimental and theoretical studies with an emphasis on the limitations and hurdles of ML in heterogeneous catalyst design. Inspired by the application of ML in heterogeneous catalyst design, we continue with a comprehensive review on the application of ML in SAC design and discovery with an emphasis on ML algorithms, different SACs, environmental effects, stability, support–metal interaction, structure–activity relationships, and high-throughput screening. Recent findings on the input features of ML and their importance for different electrochemical reactions will be reviewed, where the isolated electrons in d orbitals have been demonstrated to play a key role in the nitrogen reduction reaction (NRR).31 Subsequently, the application of different ML algorithms in several examples including the O2 reduction reaction (ORR), O2 evolution reaction (OER), CO2 reduction reaction (CO2RR), NRR, and H2 evolution reaction (HER) will be provided to demonstrate the potential application of ML for the design and discovery of SACs for electroreduction reactions. Finally, a summary and future prospects in the area of ML-guided SAC and DAC discovery are provided and discussed.

2. Machine learning (ML) algorithms

The ML algorithms most commonly applied to establish deep structure–activity relationships are support vector machine (SVM), random forest regression (RFR), deep neural networks (DNN), the sure independence screening and sparsifying operator (SISSO), and Gaussian process regression (GPR). As shown in Fig. 2a, SVM, a binary classification and regression algorithm, classifies data points into two distinct categories by using a hyperplane.32 The SVM assigns each training data point to one of two classes and separates the classes with the hyperplane that maximizes the margin between them. The hyperplane is completely defined by the data points that lie closest to it, which are called the support vectors. SVM can also handle non-separable data through the radial basis function (RBF) kernel, which maps the data from the original space into a higher-dimensional space in which a separating hyperplane can be found:33,34
 
$$f(\mathbf{x}) = \sum_{k} \omega_k \, G(\mathbf{x} - \mathbf{x}_k) \qquad (1)$$

Fig. 2 Machine learning algorithms. (a) Schematic of the SVM algorithm. The hyperplane divides SACs into two distinct classes based on the largest distance between the data points placed between the support vectors. Class 1 and class 2 (red and blue circles) show the SACs with similar properties based on features x1 and x2. (b) Schematic of RFR algorithm. Orange and green circles represent decision nodes containing ‘if/then’ statements. The result that is predicted by the highest number of decision trees (majority voting) is given as the output of RFR algorithm. (c) Schematic of DNN. Circles represent neurons in the input, hidden, and output layers of the DNN. Neurons are interconnected using the black lines. (d) Schematic of the GPR algorithm. Predicted mean (red line) and confidence interval (light orange interval) for the GPR algorithm trained based on the input dataset (blue dots).

in which G is a radially symmetric function of its argument, G(r) = ϕ(|r|), x is the input feature vector, xk is the kth training example, and ωk is the weight assigned to that example. SVM is highly efficient in terms of memory usage; however, the boundary between categories may become obscured when there are a large number of training data points. SVM can create both linear and non-linear models, the latter being based on a kernel-based regression technique.35 When comparing SVMs and the kernel ridge regression (KRR) algorithm, no large performance differences are to be expected. SVMs usually arrive at a sparser representation, which can be an advantage; however, the performance of both methods relies on a good setting of the hyperparameters, namely C and γ for SVM and α and γ for KRR. Normally, the SVM method leads to faster predictions and consumes less memory, whereas the KRR method requires less fitting time for large datasets. Nevertheless, because of the generally low computational cost of both algorithms, these differences are rarely significant for a relatively small number of data points. Unfortunately, neither method is feasible for very large datasets, as the size of the kernel matrix grows quadratically with the number of data points.36
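The weighted kernel sum of eqn (1) and the quadratic growth of the kernel matrix can be illustrated with a minimal kernel ridge regression sketch. This is a plain-Python illustration under stated assumptions, not the implementation used in any of the cited studies: the function names (`rbf_kernel`, `fit_krr`, `predict_krr`), the toy data, and the hyperparameter values are all hypothetical, and γ and α play the roles of the KRR hyperparameters mentioned above.

```python
import math

def rbf_kernel(a, b, gamma):
    """Gaussian RBF kernel G(a - b) = exp(-gamma * |a - b|^2)."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-gamma * sq_dist)

def solve_linear(A, b):
    """Solve A w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    w = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(M[i][j] * w[j] for j in range(i + 1, n))
        w[i] = (M[i][n] - s) / M[i][i]
    return w

def fit_krr(X, y, gamma=1.0, alpha=1e-3):
    """Return the weights w_k that solve (K + alpha * I) w = y."""
    n = len(X)
    K = [[rbf_kernel(X[i], X[j], gamma) + (alpha if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    return solve_linear(K, y)

def predict_krr(x, X, weights, gamma=1.0):
    """Evaluate f(x) = sum_k w_k * G(x - x_k), the form of eqn (1)."""
    return sum(w * rbf_kernel(x, xk, gamma) for w, xk in zip(weights, X))

# Toy, made-up 1-D data (not taken from any catalyst dataset).
X_train = [[0.0], [1.0], [2.0], [3.0]]
y_train = [0.0, 1.0, 4.0, 9.0]
weights = fit_krr(X_train, y_train, gamma=1.0, alpha=1e-6)
y_fit = [predict_krr(x, X_train, weights) for x in X_train]

# The kernel matrix holds n * n entries, hence the quadratic memory growth.
n_kernel_entries = len(X_train) ** 2
```

With a very small α the fit reproduces the training targets almost exactly, while a larger α trades training accuracy for smoothness. The `n * n` kernel matrix built in `fit_krr` is the reason kernel methods become impractical for very large datasets.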

In comparison with other algorithms, random forest regression (RFR) needs fewer hyperparameters and offers higher robustness.37 In fact, as shown in Fig. 2b, the RFR algorithm acts as an aggregated decision tree algorithm that lowers the variance of the individual trees by reaching a collective decision.38 The issue with RFR is that it is not accurate for out-of-sample predictions, especially in the case of a small number of training data points.39 Furthermore, feature importance analysis can be easily performed after the training of the RFR, SVM, and KRR algorithms.40 Similar to the SVM and KRR methods, the deep neural network (DNN) algorithm has the potential to learn system nonlinearity. As shown in Fig. 2c, a DNN mimics the network of neurons in the human brain and is composed of interconnected neurons arranged in several layers. As with the hyperparameters of the SVM and KRR methods, the number of neurons and layers of a DNN should be optimized to minimize a loss function such as the root mean square error (RMSE), mean square error (MSE), or mean absolute error (MAE).41–43
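The three loss functions named above have simple closed forms. The following sketch computes them in plain Python; the function names and the numerical values are hypothetical and only serve to show how the metrics differ.

```python
import math

def mse(y_true, y_pred):
    """Mean square error: average of the squared residuals."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean square error: square root of the MSE (same units as y)."""
    return math.sqrt(mse(y_true, y_pred))

def mae(y_true, y_pred):
    """Mean absolute error: average of the absolute residuals."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

# Made-up reference and predicted values (e.g. free energies in eV).
y_true = [0.10, -0.20, 0.35, 0.00]
y_pred = [0.20, -0.10, 0.15, 0.10]

mse_value = mse(y_true, y_pred)
rmse_value = rmse(y_true, y_pred)
mae_value = mae(y_true, y_pred)
```

Because the residuals are squared before averaging, the RMSE is never smaller than the MAE and is more sensitive to a few large errors, which is worth keeping in mind when comparing accuracies reported with different metrics.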

Compared with other ML techniques, the SISSO algorithm is convenient and accurate, and the fitting formulae it generates are efficient and portable.44 As shown in Fig. 2d, GPR is a Bayesian approach that works well with a small number of input data and provides uncertainty estimates for its predictions.45

ML techniques can also be applied as text mining tools to gather the large amount of QM-calculated and experimental data already available in the literature, construct readily available databases for deep analysis, and study preparation-structure–activity relationships. ML techniques for text mining can be categorized into supervised, unsupervised, and semi-supervised techniques.46,47 Supervised and semi-supervised algorithms such as neural networks and transfer learning can be used for text classification, information extraction, and data analysis, while unsupervised algorithms such as expectation-maximization (EM) are mostly used for text clustering, summarization, and dimensionality reduction.47

3. Inspiration from heterogeneous catalyst design

Although ML techniques are widely used for the design of heterogeneous catalysts, their application to single atom catalysts (SACs) is in its infancy. Therefore, following the trends in ML-aided heterogeneous catalyst design discussed in this section, we will continue with the ML-aided design of SACs in section 4. The integration of ML with experimental and QM-predicted data is widely used, along with atomic and structural properties as the input features, to predict the properties of heterogeneous catalysts.48–52 For example, an ML algorithm was trained based on experimental data and structural properties as the input features to optimize the singlet oxygen (1O2) quantum yields of core–shell plasmonic photocatalysts applicable in organic synthesis and photodynamic therapy (PDT).53 In addition, an ML model was trained based on DFT calculation data to predict and screen the surface reactivity of bimetallic alloys using atomic properties as the input features.54 To shed light on the integration of ML with experiments and QM studies for heterogeneous catalyst design and discovery, more details are provided in the following subsections.

3.1 Integration of ML with experiments

Learning from experimental data is the earliest application of ML in heterogeneous catalyst design for electrocatalysis, photochemistry, and biocatalysis.55–62 ML models can be trained based on experimental data to optimize the performance, decrease the number of experiments, and thereby accelerate high-throughput experimentation.63,64 The input features for ML models can be synthesis and reaction operation conditions to predict the catalytic performance.65 For example, an ML algorithm was used to calculate the yields of dioctyl adipate synthesis by implementing the substrate molar ratio, enzyme amount, temperature, and reaction time as the input features.66 Adaptive learning was applied to find high-activity AA′B2O6 cubic perovskite catalysts for the OER by establishing a relationship between the electronic structure properties as the input features and the OER activity of the perovskite catalysts. It was revealed that the orbital electronic structure characteristics of the B-site ion are an important factor for the OER.51 Also, multi-output support vector regression (SVR) was applied as the ML algorithm to predict the selectivity and conversion of methane oxidation.67 Likewise, ML allows the optimization of experimental data to increase the efficiency of heterogeneous catalysts for the selective oxidation of methane.68 In addition, ML was applied to experimental data to predict the activity and selectivity of bimetallic catalysts with TM–Pt–Pt(111) and Pt–TM–Pt(111) architectures for ethanol reforming.69

One of the disadvantages of ML models trained in this way is that they are only applicable to specific systems and are not transferable from one experiment to another, owing to the lack of consistent data and the presence of hidden variables in each specific experiment.70,71

To overcome this issue, ML can be applied to the literature through data mining processes72,73 to extract and analyze previously published experimental data for future heterogeneous catalyst discovery.74–76 For example, ML was used to extract the data for the synthesis of oxide materials from 12 000 scientific articles.77 In addition, several studies have recently reported data mining from the literature for the ML-assisted design and discovery of new heterogeneous catalysts for oxidative coupling of methane.78–83 Fig. 3 summarizes the workflow of a data mining sequence from the literature. It starts with a query search to find related papers in a metadatabase, followed by downloading and classifying the papers.46,84 The classified papers can be used for text mining using several ML algorithms such as KRR, RFR, SVR, Extreme Gradient Boosting (XGB), extra trees regression (ETR), and artificial neural network (ANN) to extract the data. The extracted data can be used for regression, classification, and/or clustering purposes. For example, several ML algorithms such as XGB, RFR, and ETR were used to analyze the literature data for the oxidative coupling of methane on metal supported catalysts to discover new heterogeneous catalysts.85,86 Similarly, the statistical analysis of available data in the literature for CO oxidation, water–gas shift reaction, and oxidative coupling of methane reactions was performed using several ML algorithms such as KRR, RFR, XGB, and SVR for heterogeneous catalyst discovery. Through feature importance analysis, reaction temperature was revealed as the key parameter for the three investigated reactions.87 Very recently, suitable catalysts for environmental applications were discovered based on available data in the literature, from which binary and ternary element catalysts such as MnxCoy and ZrxMnyCrz were identified and optimized through ML for high NOx conversion. An ANN algorithm was used to predict NOx conversion efficiency as a function of temperature and the element molar ratio. The conversion reaches a maximum around 300 °C for the ternary element catalysts. Also, the loading amount of Zr was found to play an important role due to the fact that the Cr5+ species can reduce as the Zr loading amount increases, which can subsequently lower the NOx conversion efficiency.88 In addition, an ML algorithm along with 27 descriptors was applied to 2228 experimental data points obtained from the literature89 to predict the activity of heterogeneous catalysts, which revealed that temperature is the most important descriptor for the water–gas shift reaction.90
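One model-agnostic way to carry out a feature importance analysis of this kind is permutation importance: each input column is shuffled in turn, and the resulting increase in prediction error is taken as the importance of that feature. The sketch below is a plain-Python illustration under stated assumptions. The cited studies may have used other importance measures, and the `toy_model`, the feature names, and the synthetic data are hypothetical and deliberately constructed so that temperature dominates.

```python
import random

def mean_squared_error(y_true, y_pred):
    """Average squared difference between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def permutation_importance(model, X, y, n_repeats=20, seed=0):
    """Mean increase in MSE when each feature column is shuffled."""
    rng = random.Random(seed)
    baseline = mean_squared_error(y, [model(row) for row in X])
    importances = []
    for j in range(len(X[0])):
        total_increase = 0.0
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)  # break the link between feature j and y
            X_perm = [row[:j] + [column[i]] + row[j + 1:]
                      for i, row in enumerate(X)]
            permuted = mean_squared_error(y, [model(row) for row in X_perm])
            total_increase += permuted - baseline
        importances.append(total_increase / n_repeats)
    return importances

def toy_model(row):
    """Hypothetical stand-in for a trained model (not a fitted catalyst model)."""
    temperature, molar_ratio = row
    return 0.1 * temperature + 0.5 * molar_ratio

# Synthetic data: temperature in degrees C and a dimensionless molar ratio.
data_rng = random.Random(42)
X = [[data_rng.uniform(200.0, 400.0), data_rng.uniform(0.0, 1.0)]
     for _ in range(50)]
y = [toy_model(row) for row in X]

importances = permutation_importance(toy_model, X, y)
```

In this constructed example, `importances[0]` (temperature) is far larger than `importances[1]` (molar ratio) simply because the toy model was written that way. With real data, the ranking reflects what the trained model has learned, so it depends on the quality of both the model and the dataset.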


Fig. 3 The workflow for data mining from the literature. Summary of the data mining sequence from the literature using several ML algorithms such as KRR, RFR, SVR, XGB, ETR, and ANN.

Moreover, learning from a large database in nanoscience can be used for the rapid design and discovery of new heterogeneous catalysts using ML.91 However, datasets obtained from the literature are mostly incomplete and inconsistent, which limits the application of ML. In order to generate a consistent database for the training of ML algorithms, high-throughput experimentation can be performed. For example, high-throughput experimentation for oxidative coupling of methane was performed for 20 catalysts and 216 reaction conditions to produce a consistent dataset for ML to accurately predict C2 yields.92 Feature importance analysis showed that temperature, in the range of 700 to 900 °C, is the most important parameter compared with the flow rates of argon, O2, and CH4, the contact time, and the composition of the catalyst.

ML also has great potential to change the current form of conventional experiments and increase the efficiency of heterogeneous catalyst discovery through automation.93–95 In fact, ML-assisted robots can help to accelerate high-throughput experimentation without human interaction.96–99 For example, an ML-guided robot was used to carry out 688 experiments within an experimental space of ten variables, 1000 times faster than manual approaches. The ML-assisted high-throughput experimentation revealed a new photocatalyst mixture with six times more activity.100

3.2 Integration of ML with Quantum Mechanics (QM)

Learning from Quantum Mechanics (QM) is highly desired due to the existence of enormous amounts of quantitative QM-predicted data as a training dataset for ML. The trained ML model can be used for accelerated and accurate prediction of the catalytic properties and adsorption energies of reaction intermediates.101 Using the adsorption energies as the key parameter, the reaction barrier can be predicted, the reaction mechanism can be investigated, and the desired catalyst can be discovered. For example, the local similarity kernel and Bayesian linear regression were used as the ML algorithms for predicting the adsorption energies of NO, O, and N on a Rh1−xAux alloy, based on the nanoparticle composition and size.102,103 The findings were used to predict the rate of NO decomposition on RhAu nanoparticles, which indicates a maximum for catalytic activity at a particle diameter of 2.0 nm. In addition, structure–activity relationships were established for predicting CO and H adsorption energies based on structural properties using active learning across reaction intermediates.104,105 In fact, an automated screening approach through the integration and optimization of ML was presented to guide DFT calculations for predicting catalytic activity.105 The feasibility of this approach was demonstrated by screening various alloys combining 31 elements, which resulted in the identification of 131 candidate surfaces across 54 alloys for the CO2RR and 258 surfaces across 102 alloys for the HER.104,105 Likewise, active learning was used to accelerate the screening of CO adsorption energy on Cu-based components.106

The ML-predicted adsorption energies of reaction intermediates were also used for the investigation and optimization of the reaction network of the syngas reaction (CO + H2) over Rh(111) catalysts at 573 K and 1 atm. Gaussian process regression (GPR) as a ML algorithm was trained based on a few DFT calculations to predict the adsorption energies for all intermediates in the reaction network. A probable reaction network from syngas to acetaldehyde was revealed by using a simple classifier to select the potential rate-limiting steps, where only predicted potential rate-limiting steps were analyzed via further DFT calculations.107

ML was also trained based on DFT-calculated data to accelerate the prediction of the adsorption energies of H and CHx intermediates on Cu-based alloys using 12 properties as the input features. Amongst several ML algorithms, the ETR algorithm resulted in the highest accuracy. Based on feature importance analysis, the surface energy, element group, and melting point were identified to be the most important parameters for predicting adsorption energies.108 In addition, ML was applied for predicting the adsorption energies of different intermediates on metal alloys.109 ML was also used to predict the adsorption energies of H on Ni2P(0001) surfaces. From the feature engineering perspective, the Ni–Ni bond length is the key parameter for HER activity, where a higher Ni–Ni bond length leads to lower HER activity.110 Similarly, ML was used to predict the adsorption energies of CO on bimetallic alloys, where feature engineering analysis resulted in the d-band shape and sp-band filling as key parameters.111,112 Furthermore, to accurately predict the d-band as one of the most important parameters in CO adsorption, a GBR model was applied to several individual 3d, 4d, and 5d transition metal structures and their binary alloys for both the cases of metal impurities and overlayer-covered metal surfaces.113,114 Recently, ML was integrated with DFT calculations to predict the adsorption energies of various molecules on metal oxide surfaces. Feature importance analysis indicates that the highest occupied molecular orbital (HOMO) of the adsorbates and the metal oxide surface energy are the most important parameters for molecular adsorption.115 ML in combination with DFT calculations was also used for the prediction of the adsorption energies of 12 elements on 38 metal surfaces by using SVR, RFR, and multi-layer perceptron regression (MLPR).116

The integration of ML and QM can also be performed to accelerate the discovery and high-throughput screening of heterogeneous catalysts. For example, ML integrated DFT calculations were used to accelerate the discovery and high-throughput screening of 2D MXenes for the HER.117,118 SVR, GPR, RFR, and AdaBoost were used as ML algorithms to accelerate the prediction of ΔGH*, based on the distance between the nearest neighbor O atoms as well as the surface oxygen–metal bond length as the most important parameters.117 Similarly, several ML models, such as DNN, KRR, SVM, and RFR, were used to accelerate the high-throughput screening of ΔGH* by using several elemental properties as the input features. RFR led to the highest accuracy, with the lowest RMSE of 0.27 eV for the test data. Feature importance analysis shows that HER performance is highly dependent on charge and structural properties. S- and Os2B-terminated Scn+1Nn (n = 1, 2, 3) were revealed as appropriate catalysts for the HER with ΔGH* close to zero and satisfactory hydrogen coverages. It was also shown that S functional groups are of great importance in regulating the HER performance. This is because filling antibonding states with electrons weakens the adsorption of H*, which is a key step for the HER.118
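The screening criterion used above, a ΔGH* close to zero, can be expressed as a simple filter over ML-predicted values. The sketch below is illustrative only: the candidate labels and ΔGH* values are made-up placeholders, and the 0.2 eV tolerance is an assumed cutoff, not one reported in the cited studies.

```python
def screen_her_candidates(delta_g_h, tolerance=0.2):
    """Return candidates with |dG_H*| <= tolerance (eV), closest to zero first.

    delta_g_h maps a candidate label to its predicted hydrogen adsorption
    free energy in eV. The default tolerance is an assumption for illustration.
    """
    kept = {name: dg for name, dg in delta_g_h.items() if abs(dg) <= tolerance}
    return sorted(kept, key=lambda name: abs(kept[name]))

# Placeholder predictions (eV); these are not values from any cited study.
predicted = {
    "candidate_A": -0.05,
    "candidate_B": 0.45,
    "candidate_C": 0.12,
    "candidate_D": -0.30,
}

shortlist = screen_her_candidates(predicted, tolerance=0.2)
```

When the tolerance is comparable to the model error (for example, a test RMSE of 0.27 eV), a shortlist like this is best treated as a set of candidates for follow-up DFT calculations rather than as a final ranking.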

For spinel structures, the ML model was used to accurately calculate the energy difference between the centers of the oxygen p and metal d bands to identify the better spinel oxide catalysts for the OER. It was shown that a [Mn]T[Al0.5Mn1.5]O–O4 spinel catalyst has the optimal energy difference for high activity, as confirmed by experimental observations.119 ML was also applied to optimize TiO2-supported Re and zeolite catalysts for methylation of aromatic hydrocarbons.120 Similarly, ML was applied on the DFT-calculated data to predict how strain in platinum core–shell nanocatalysts can improve the ORR activity. It was revealed that the optimal strain depends on the nanoparticle size rather than the bimetallic material composition and shell thickness.121

As with experimental data, there is a large amount of QM-predicted data in the literature that can be mined for ML analysis, opening a new direction in which large databases are used for the rational design of heterogeneous catalysts and SACs.122 For example, ML was applied to literature data for CO2 hydrogenation.123 In addition, a dataset of 37 000 structures from the Catalysis-Hub database,124 containing 11 adsorbates on 2000 metal alloy surfaces, was used for training a graph neural network (GNN) to predict adsorption energy based on relaxed structures.125

ML can also be used for investigating reaction mechanisms and finding active sites for reactions. For instance, the LASSO ML algorithm was trained on DFT-calculated data for predicting the methane activation mechanism on rutile metal oxides.126 It was revealed that the energy of methane activation decreased if the reacted atoms including O, C, H, and metal atoms could be placed in the same plane. In addition, ML was combined with multi-scale simulations and QM to identify the performance of surface sites on Au nanoparticles as well as dealloyed Au surfaces for the CO2RR.127 Based on ML results, surface defects are responsible for the high performance of Au surfaces. Similarly, ML was applied to DFT-calculated data to discover active bimetallic facets for the CO2RR.128 It was revealed that most facets of nickel gallium bimetallic materials lead to similar activity on Ni surfaces.

ML integrated DFT calculations are able to predict the surface segregation energies of bimetallic catalysts through the establishment of structure–activity relationships.129 ML was used for the prediction of reaction barriers on a variety of surfaces130 and for the discovery of phase diagrams applicable in electrochemical reactions.131 In addition, symbolic regression as a ML technique in combination with QM calculations was used to accelerate the discovery of new perovskite catalysts with excellent OER activity. The ratio of octahedral factor to tolerance factor (μ/t) was revealed as a simple and important descriptor for the discovery of perovskite catalysts.132
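The μ/t descriptor can be evaluated directly from ionic radii. The sketch below assumes the standard definitions for an ABX3 perovskite, namely the Goldschmidt tolerance factor t = (rA + rX)/[√2(rB + rX)] and the octahedral factor μ = rB/rX. The radii are approximate, Shannon-type values for SrTiO3 and serve only as illustrative inputs, not as data from the cited work.

```python
import math

def tolerance_factor(r_a, r_b, r_x):
    """Goldschmidt tolerance factor t = (r_A + r_X) / (sqrt(2) * (r_B + r_X))."""
    return (r_a + r_x) / (math.sqrt(2) * (r_b + r_x))

def octahedral_factor(r_b, r_x):
    """Octahedral factor mu = r_B / r_X."""
    return r_b / r_x

# Approximate ionic radii in angstrom for Sr2+, Ti4+, and O2- (illustrative).
r_a, r_b, r_x = 1.44, 0.605, 1.40

t = tolerance_factor(r_a, r_b, r_x)
mu = octahedral_factor(r_b, r_x)
mu_over_t = mu / t
```

For these inputs, t is close to 1, as expected for a nearly ideal cubic perovskite, and μ is roughly 0.43. Because only tabulated radii are needed, the descriptor can be computed for a large number of candidate compositions at negligible cost.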

4. Single atom catalysts (SACs)

Along with the studies mentioned above on heterogeneous catalysis, single atom catalysts (SACs) have recently been applied to several photochemical and electroreduction reactions to produce a wide range of chemicals.133–135 The unique properties and high atom-utilization efficiency of SACs make them interesting and promising.136–138 With these increased applications, the rational design of SACs has come into the forefront to enable improvements in the efficiency and feasibility of optimizing the desired products.139 DFT calculations are widely used for the rational design of SACs with efficient activity, selectivity, and stability. DFT calculations, however, are time-consuming and computationally expensive140,141 because the complexity of structure–activity relationships requires performing a large number of non-trivial DFT calculations in a large parameter space, including the SAC type, environmental coordination, and reactants.142 On the other hand, ML is considered as a fast, accurate, inexpensive,143 and supportive tool144 to predict the properties of SACs towards their rational design.145–147 As shown in Fig. 3, using ML, one can apply the available datasets from QM and DFT calculations to construct readily available databases applicable in the deep analysis and establishment of preparation-structure–activity relationships. The established relationships can be used to predict the adsorption energy (Eads) or Gibbs free energy (ΔG) of various reaction intermediates adsorbed on SACs to discover more active and selective SACs. 
Once enough high quality databases are provided, a reliable ML model can be trained and constructed to address the electroreduction challenges.148,149 ML in combination with DFT calculations commences a new direction for rapid and low cost rational design of SACs predicted to have optimal electroreduction catalytic activity.150,151 For example, several studies have used ML to design single atom alloy catalysts (SAACs) with excellent stability and activity by predicting the Eads, ΔG, or aggregation energies.152–155 ML can also be used for the interpretation of characterization of SACs.156,157 For example, as shown in Fig. 4, ML techniques have been used to interpret the EXAFS spectra based on which edge sites (zigzag or armchair) are responsible for the HER activity of a cobalt SAC embedded in graphene.145 In the following subsection, the application of ML for the establishment of structure–activity relationship, feature engineering, high-throughput screening, and stability of SACs is broadly discussed. As the application of SACs in thermal and electrochemical reactions was presented in a recent review paper,158 we only focus on the progress of ML for the design of SACs and DACs especially for electrochemical reactions.
Fig. 4 ML for the interpretation of the EXAFS of Co–N doped graphene. (a) Establishment of training data using MD-EXAFS calculations for Co–4N–P, Co–2N–A, and Co–2N–Z. (b) The architecture of the DNN composed of one input layer of the EXAFS spectrum, two hidden layers, and one output layer of the proportion vector. (c) The estimation of the local structural proportion from the experimental EXAFS measurement. Reproduced with permission from ref. 145, copyright 2021, Wiley-VCH. Results show that ML is an appropriate and powerful tool for the interpretation of EXAFS.

4.1 Structure–activity relationship and feature engineering

ML is a strong tool159 to provide a fundamental understanding of structural sensitivity160,161 through establishing deep relationships between catalytic activity and structural as well as atomic properties based on mechanisms and similarities in SACs.13,32,162 ML is considered a new direction for the rational design of SACs, in which feature importance analysis for electroreduction reactions offers more insight into the origin of the activity and stability of SACs.163–165 For example, ML integrated with DFT was applied to establish a relationship between various descriptors and the hydrogen adsorption free energy (ΔGH*) for the HER by altering the size and dimensionality of a nitrogen-doped 2D-carbon substrate for 3d, 4d, and 5d transition metals (TMs) as SACs.166 The sure independence screening and sparsifying operator (SISSO) as the supervised ML algorithm was applied with 10 input features including the d-state center (εd), covalent radius (rcov), Bader charge (q), number of occupied d states (docc), Zunger radius (rd), number of valence electrons (Ne), ionization energy (IE), electronegativity (EN), and formation energy of single atom sites (Ef). Our evaluation of this work using the SVM algorithm is shown in Fig. 6a, demonstrating that the number of occupied d states (docc) and the Bader charge (q) are the most important parameters for the HER. Using the SISSO algorithm, the following general descriptor for HER activity containing four properties was obtained, in which EN is the electronegativity of the SACs:
 
[Equation image not reproduced: d2ta02039d-t2.tif] (2)

Similarly, several atomic properties were implemented as input features to establish structure–activity relationships and predict the OER overpotential of SACs on carbon substrates. The full connection neural network (FCNN) ML algorithm trained using DFT-calculated data leads to an accurate prediction of overpotentials with a relative error of 6.49% and a 130 000-fold reduction in the computational time. It was revealed that the d-electron count (de), the atomic radius of metal (AtR), and electron affinity (EA) are the most important parameters for OER overpotential. Moreover, an intrinsic descriptor (ϕ) that defines the overpotential of SACs based on their intrinsic atomic properties was proposed using ML and DFT:167

 
[Equation image not reproduced: d2ta02039d-t3.tif] (3)
where ENC, AtRC, and NC are the electronegativity of carbon, the atomic radius of carbon, and the nearest neighbor carbon atoms, respectively. ENM, IE1, and AtM are the electronegativity of metal, first ionization energy, and atomic mass, respectively.

In another study, atomic properties such as electronegativity, electron affinity, and radii of the metal atoms were considered as input features to reveal ORR activity for heterobimetallic SACs. Using RFR, the origin of the ORR activity of SACs was investigated experimentally or by establishing structure–activity relationships based on DFT-calculated data.168 Similarly, atomic properties were used to predict the catalytic activity of SACs and bi-atom catalysts for the CO2RR. Based on results from the GBR algorithm, Ag–MoPc was revealed as an excellent electrocatalyst with a limiting potential of −0.33 V.169 Subsequently, the data from the abovementioned work were used as an example to evaluate the efficiency of a DFT–ML hybrid program for catalysis programming.170

In order to observe the effect of substrates on the activity and stability of SACs, the combination of atomic and structural properties should be considered as input features for the training of ML algorithms. Therefore, several atomic as well as structural properties were used to establish structure–activity relationships for the discovery and design of bifunctional rhodium SACs on defective g-C3N4 for the OER and ORR using the GBR algorithm.171 The atomic and structural properties include the bond lengths between the TM and its coordination atoms (dTM–N1, dTM–C1, and dTM–C2), the d-band center (εd), the charge transfer of TM atoms (Qe), the electronegativity (EN), the electron affinity (EA), the first ionization energy (IE1), the radius of the TM atom (AtR), and the number of TM-d electrons (de). As shown in Fig. 5c, the GBR model predicts ΔGOH* with R2 = 0.99 and a low RMSE of 0.03 eV. However, this work included only 16 data points, which is too few to support a generalized model. Feature importance analysis revealed that the first ionization energy (IE1) and the charge transfer of transition metal atoms (Qe) are the key features (Fig. 6b). The most important descriptor, IE1, is the energy needed to remove the outermost electron from a neutral atom to form a singly charged positive ion (it generally increases from left to right in each period), and it affects the OER and ORR activities.
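With so few data points, a leave-one-out check is one common way to probe how well metrics such as R2 and RMSE hold out of sample. The following is a minimal sketch under stated assumptions: a single hypothetical descriptor and an ordinary least-squares line stand in for the GBR model of ref. 171, and all numbers are made up for illustration.

```python
import math

def rmse(y_true, y_pred):
    """Root-mean-square error, in the units of y (eV here)."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)

def r2(y_true, y_pred):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def fit_line(x, y):
    """Ordinary least-squares line y = slope * x + intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx

def leave_one_out(x, y):
    """Predict each point with a model fitted on the other n - 1 points."""
    preds = []
    for i in range(len(x)):
        slope, intercept = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        preds.append(slope * x[i] + intercept)
    return preds

# Hypothetical data: 16 descriptor values and made-up free energies (eV).
x = [5.0 + 0.25 * i for i in range(16)]
noise = [0.03, -0.02, 0.01, -0.04, 0.02, 0.00, -0.01, 0.03,
         -0.03, 0.02, -0.02, 0.01, 0.04, -0.01, 0.00, -0.03]
y = [0.4 * xi - 1.5 + e for xi, e in zip(x, noise)]

slope, intercept = fit_line(x, y)
in_sample = [slope * xi + intercept for xi in x]
held_out = leave_one_out(x, y)

print("in-sample   R2 = %.3f, RMSE = %.3f eV" % (r2(y, in_sample), rmse(y, in_sample)))
print("leave-1-out R2 = %.3f, RMSE = %.3f eV" % (r2(y, held_out), rmse(y, held_out)))
```

For a least-squares fit the leave-one-out error is never smaller than the in-sample error, so the gap between the two printed lines gives a rough sense of how optimistic the in-sample metrics are for a small dataset.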


Fig. 5 Density functional theory (DFT)-based machine learning (ML). Comparison of ML- and DFT-predicted (a) ΔGO* using the RFR algorithm, reproduced with permission from ref. 173, copyright 2019, American Chemical Society. (b) Limiting potentials using the RFR algorithm, reproduced with permission from ref. 174, copyright 2020, Royal Society of Chemistry. (c) ΔGOH* using the GBR algorithm. Reproduced with permission from ref. 171, copyright 2021, American Chemical Society. Results indicate that ML can be used for the out-of-sample (test set) predictions of activity for SACs using deep structure–activity relationships. However, the quantity of data points in the training dataset is not sufficient to give a generalized ML algorithm.

Fig. 6 Feature importance analysis. (a) The feature importance for SACs embedded in nitrogen-doped graphene indicating that the number of occupied d states (docc) and Bader charge (q) are the most important parameters for the HER. Please note that this is our evaluation of ref. 166. Reproduced with permission from ref. 166, copyright 2020, American Chemical Society. (b) The feature importance based on the GBR algorithm for rhodium SACs. Reproduced with permission from ref. 171, copyright 2021, American Chemical Society. First ionization energy (IE1) and the charge transfer of TM atoms (Qe) are the most important factors for ΔGOH*. Inset shows the structure of rhodium SACs on defective g-C3N4 for the OER and ORR. (c) The feature importance based on the RFR algorithm for SACs embedded on nitrogen-doped carbon supports. Reproduced with permission from ref. 173, copyright 2019, American Chemical Society. The oxide formation enthalpy (Hf,ox) and the adjusted electron numbers of d/p orbitals (dpe) are the most important factors for ΔGO*. Inset shows the structure of SACs embedded on nitrogen-doped carbon supports for a two-electron ORR. (d) The feature importance for dual atom catalysts (DACs) based on an RFR algorithm indicating that the average distance between metal atoms and the coordinated N atoms (M12–N), the distance between the two metal atoms (M1–M2), and the outer electron number of metal atoms (Ne,O) are the most important factors for the ORR limiting potentials. Reproduced with permission from ref. 174, copyright 2020, Royal Society of Chemistry. Inset shows the structure of DACs embedded in nitrogen-doped graphene for the ORR. Results indicate that feature engineering of SACs and DACs depends on the application and the type of substrate. Please see Table 2 for the abbreviations.

Similarly, atomic and structural properties including the number of electrons in d orbitals, the oxide formation enthalpy, the Pauling electronegativity of the metal atom, the sum of Pauling electronegativity of surrounding atoms, and the average pKa values of the surrounding atoms were used to establish structure–activity relationships. To do this, the RFR algorithm was applied based on DFT-calculated data for 104 SACs embedded in graphene including M@C3, M@C4, M@pyridine-N4, and M@pyrrole-N4. The RFR algorithm revealed that the number of electrons in d orbitals is the most important parameter for the ORR, OER, and HER. The trained RFR algorithm was employed to predict the activity of 260 graphene-based SACs (M@NxCy), through which, it was revealed that Fe@pyrrole-N1C3 and Fe@pyrrole-N2C2 were more active than Fe@pyridine-N1C3 and Fe@pyridine-N2C2.172

Comparably, 8 atomic and structural properties including the oxide formation enthalpy (Hf,ox), the number of electrons in d/p orbitals (dpe), electron affinity (EA), electronegativity (EN), number of coordinated N atoms (NN), first ionization energy (IE1) of the central atoms, the sum of the electronegativity of neighboring C and N atoms (SEN), and the distance ratio (DR) were used to establish the structure–activity relationship for a two-electron ORR using RFR. Fig. 5a shows the comparison of ML- and DFT-predicted ΔGO* for this system. Through the feature importance analysis of these 8 intrinsic features, it was revealed that the oxide formation enthalpy (Hf,ox) and the number of electrons in d/p orbitals (dpe) are the most important parameters for determining the ΔGO* of SACs (Fig. 6c).173 The feature importance analysis implies that metals like Ag, Au, and Pd with a weaker affinity for oxygen can remarkably decrease band hybridization between the oxygen and metal, leading to enhanced H2O2 selectivity.
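Feature importance can be estimated in several ways. The sketch below illustrates permutation importance, which is not necessarily the measure used in ref. 173: a feature is scored by how much the model error grows when its column is permuted across samples. The toy model, the feature names, and the data are hypothetical, and a fixed cyclic shift replaces random shuffling so that the example is deterministic.

```python
def mse(model, X, y):
    """Mean squared error of model predictions on rows X."""
    return sum((model(row) - t) ** 2 for row, t in zip(X, y)) / len(y)

def permutation_importance(model, X, y):
    """Increase in MSE after permuting each feature column in turn.

    A cyclic shift by one position is used as the permutation so the
    result is reproducible; random shuffles are the usual choice.
    """
    base = mse(model, X, y)
    importances = []
    for j in range(len(X[0])):
        col = [row[j] for row in X]
        shifted = col[1:] + col[:1]
        X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, shifted)]
        importances.append(mse(model, X_perm, y) - base)
    return importances

# Hypothetical "trained" model: depends on features 0 and 1, ignores feature 2.
def toy_model(row):
    return 0.3 * row[0] - 1.0 * row[1]

# Hypothetical rows: [d-electron count, electronegativity, irrelevant feature].
X = [[1.0, 1.5, 7.0],
     [3.0, 1.9, 2.0],
     [5.0, 1.6, 9.0],
     [7.0, 2.2, 4.0],
     [9.0, 2.0, 1.0]]
y = [toy_model(row) for row in X]  # targets the toy model reproduces exactly

imp = permutation_importance(toy_model, X, y)
for name, value in zip(["d_electrons", "electronegativity", "irrelevant"], imp):
    print("%-18s %.3f" % (name, value))
```

A feature that the model never uses receives an importance of exactly zero, which is the behaviour one would hope to see for descriptors that carry no information about the target.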

As the complexity of SAC structures increases, new and general descriptors will be needed for establishing the correct structure–activity relationships. For example, the number of isolated electrons in d-orbitals, obtained from a bidirectional activation mechanism, was suggested as a new input feature for the ML algorithm, which introduces new insights for the rational design of SACs. It was shown that this new descriptor is the most important parameter for the NRR, while the electron affinity of metal atoms was shown to be the most important parameter for the HER. ML using this new input feature was therefore used to accelerate the computational screening, design, and discovery of SACs by establishing the structure–activity relationship for 126 SACs for the NRR, validated by experimental studies and DFT calculations.31

Unlike SACs, the geometry of dual atom catalysts (DACs) is more complex and the synergetic effect between the two metal atoms plays an important role in the performance. In other words, the linear relationships for DACs are significantly weakened, demonstrating that the DACs' activity requires new descriptors to consider the effects of both metals in the structure. Therefore, in order to consider the synergetic effect of the two metals, ML integrated DFT was used to identify the structure–activity relationship of DACs embedded on nitrogen-doped graphene for the ORR. Fig. 5b shows the ML- and DFT-predicted limiting potentials using the random forest regression (RFR) model.174 Feature importance analysis indicates that the average distance between metal and N atoms (M12–N), the distance between metal atoms (M1–M2), and the outer electron number of metal atoms (Ne,O) are the most important factors for the ORR limiting potentials (Fig. 6d).

In order to shed more light on the structure–activity relationships, the effect of different intermediates should also be considered on the activity of SACs. Therefore, in addition to atomic and structural properties, the properties of intermediates were also considered as input features for training the RFR algorithm to calculate the binding energies of H*, OH*, O*, and OOH* on SACs embedded in nitrogen-doped graphene using 1700 DFT-calculated data points. Based on feature importance analysis, the type of intermediate was found to be one of the most important features.175
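One simple way to pass the type of intermediate to a regression model is one-hot encoding, in which each adsorbate gets its own 0/1 column next to the atomic and structural features. The sketch below shows this under stated assumptions: the encoding scheme and the three site features are illustrative choices and are not taken from ref. 175.

```python
# Adsorbates named in the text; the column order is an arbitrary choice.
ADSORBATES = ["H*", "OH*", "O*", "OOH*"]

def one_hot(adsorbate):
    """Return a 0/1 vector with a single 1 marking the adsorbate type."""
    if adsorbate not in ADSORBATES:
        raise ValueError("unknown adsorbate: %r" % adsorbate)
    return [1.0 if a == adsorbate else 0.0 for a in ADSORBATES]

def build_row(site_features, adsorbate):
    """Concatenate site descriptors with the encoded adsorbate type."""
    return list(site_features) + one_hot(adsorbate)

# Hypothetical site: [d-electron count, electronegativity, coordinated N atoms].
site = [7.0, 1.83, 4]

# One training row per (site, adsorbate) pair.
rows = [build_row(site, a) for a in ADSORBATES]
for a, row in zip(ADSORBATES, rows):
    print(a, row)
```

With this layout a single model can be trained on all four binding energies at once, and the importance assigned to the adsorbate columns can be compared directly with that of the atomic features.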

The input features with high feature importance can be used for descriptor-based SAC design to predict adsorption energies. For example, descriptor-based design was used to predict the adsorption energies of intermediates on SACs embedded in graphitic carbon nitride (g-C3N4), g-CN, and g-C2N. It was shown that Ni@g-CN, Cu@g-CN, and Co@C2N are excellent SACs for the CO2RR.176 It was also shown that catalytic activities are highly related to ΔGOH*, ΔGOCH*, the number of electrons in d orbitals, and the TM enthalpy of vaporization.

The descriptors can also be used for establishing volcano-shaped relationships177 from which SAC candidates for various electrocatalytic reactions can be found.178 Therefore, a new intrinsic descriptor based on the bonding, topology, and electronic structure of SACs embedded in carbon supports, shown in Fig. 7a, was defined as follows:179

 
[Equation image not reproduced: d2ta02039d-t4.tif] (4)


Fig. 7 Volcano plots. (a) Structure of SACs embedded in nitrogen-doped graphene supports for the descriptor-based SAC design. Volcano plots for (b) overpotential (η), (c) onset potential (Vonset), and (d) Faraday efficiency (FE) based on the descriptor for SACs embedded in nitrogen-doped graphene supports. The overpotential plot shows two definitive volcanoes with Ti and Co located at the summits. Also, for the onset potential and Faraday efficiency, Co is at the summit of the volcanoes, indicating better CO2RR performance. Reproduced with permission from ref. 179, copyright 2019, Wiley-VCH.

in which Ne, EN, and IR are the valence electron number, electronegativity, and ionic radius of central metals, respectively. This descriptor was used for volcano plots of overpotential, onset potential, and Faraday efficiency, as shown in Fig. 7b–d, indicating two definitive volcanoes in the plot for overpotential with Ti and Co located at the summits. Another descriptor to consider the effect of supports was also introduced as follows:139

 
[Equation image not reproduced: d2ta02039d-t5.tif] (5)

in which ENN, ENC, NN, NC, and de represent the electronegativity of N atoms, the electronegativity of C atoms, the number of nearest-neighbor N atoms, the number of nearest-neighbor C atoms, and the number of valence electrons in d orbitals, respectively, and α is the correction coefficient. These descriptors were used to predict the adsorption energies of different intermediates for the CO2RR. Moreover, these descriptors were used for volcano plots of onset potential and overpotential, with Ni and Pt located at the summits.

However, universal and appropriate descriptors are still lacking for establishing structure–activity relationships across all types of SACs, supports, and electroreduction reactions.180 Therefore, a large number of DFT calculations and ML analyses are still needed to screen different descriptors for each reaction system.181

4.2 High throughput computational screening for SACs

DFT calculations have been applied for high-throughput screening of SACs,96,182–186 where, for example, S was found to be the best dopant in graphene-based Co SACs for the HER.187 ML, however, can accelerate the screening of SACs and decrease the computational cost and time by screening for similarities in SACs and establishing deep structure–activity relationships.146,188–190 Therefore, the integration of ML algorithms and DFT calculations has been performed for the rapid and high-throughput screening of SACs.191 For example, ML combined with DFT calculations was employed to screen and design MBene-based SACs for the HER. ΔGH* values were calculated accurately via the SVM algorithm by using atomic and structural features. The Bader charge transfer of the surface metal was revealed as the most important parameter for HER activity. Stable Co2B2 and Mn/Co2B2 were also identified as efficient HER catalysts because their |ΔGH*| < 0.15 eV.192 In addition, the screening of SACs embedded on MXenes was performed using ML and DFT calculations to show the ability of ML to screen new candidates with excellent performance.193 This work showed that the HER catalytic activity depends on the synergistic effect between single metal atoms and substrates. In addition, the bag-tree algorithm as a supervised ML technique was applied for the separation of DFT-calculated data and converse prediction of HER performance.194 ML integrated DFT calculations were applied to accelerate the discovery and screening of TMs and lanthanide (Ln) metals for SACs embedded in graphdiyne, based on the adsorption energies, adsorption trend, electronic structures, reaction pathway, and active sites.
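The screening step described above can be sketched as a simple filter on ML-predicted ΔGH* values using the |ΔGH*| < 0.15 eV criterion quoted in the text. The candidate names and the predicted values below are hypothetical placeholders and do not come from ref. 192.

```python
def screen_her(predicted_dg_h, threshold=0.15):
    """Keep candidates with |dG_H*| below the threshold (eV).

    Returns (name, dG_H*) pairs sorted from closest to zero outward,
    since dG_H* near 0 eV is the usual target for HER activity.
    """
    hits = [(name, dg) for name, dg in predicted_dg_h.items()
            if abs(dg) < threshold]
    return sorted(hits, key=lambda item: abs(item[1]))

# Hypothetical ML-predicted hydrogen adsorption free energies (eV).
predicted = {
    "cand_A": -0.42,
    "cand_B": 0.08,
    "cand_C": -0.11,
    "cand_D": 0.15,   # exactly at the threshold, so it is excluded
    "cand_E": 0.31,
}

shortlist = screen_her(predicted)
print(shortlist)
```

In practice the shortlisted candidates would typically be re-checked with explicit DFT calculations, because the ML prediction error can be comparable to the width of the screening window.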

In addition to the HER, ML algorithms were employed based on DFT-calculated data for the fast screening of efficient NRR and CO2RR electrocatalysts.105 For instance, a graph-based convolutional neural network (GCNN) was applied for the accelerated screening of SACs embedded in MBenes, defect-engineered 2D materials, and 2D π-conjugated polymers for the NRR. The results show superior NRR selectivity over the HER with overpotentials of 0.44 V, 0.40 V, 0.24 V, 0.60 V, 0.17 V, 0.17 V, 0.64 V, 0.37 V and 0.58 V for TaB, NbTe2, NbB, HfTe2, MoB, MnB, HfSe2, TaSe2 and Nb, respectively.195 A deep neural network (DNN) was used for rapid and high throughput screening of efficient SACs embedded on boron-doped graphene for the NRR. The adsorption and free energies were calculated using the light gradient boosting machine (LGBM) model based on the bonding characteristics and structural properties as input features. Feature importance analysis was also performed for nitrogen fixation, revealing that the TM coordination number and the number of hydrogen atoms are the key parameters.196 Extreme gradient boosting regression (XGBR) was implemented as a supervised ML algorithm to screen ΔGCO* and ΔGH* for 1060 SACs embedded in metal–nonmetal co-doped graphene using simple features for the CO2RR.197 Based on feature importance analysis, the Pauling electronegativity (E_M), covalent radius (M_cov), and first ionization energy of metal atoms (1E_M) are the most important parameters for ΔGCO*.

4.3 Stability of SACs

The stability of SACs is a prerequisite for constructing high-activity SACs and should be assessed by studying metal–support interactions, aggregation energies, and adsorbate-induced structural changes.198–201 In other words, constructing a strong coordination environment to achieve SACs with strong metal–support interactions is highly desirable and can be achieved by increasing either the anchoring capability of supports or the number of anchor sites.202 The former can be performed by optimizing the coordination environment and the coordination atoms. The latter can be achieved by introducing intrinsic defects and by structural engineering of the support through control of its size and morphology.

In this regard, ML can be applied as a new guideline to efficiently synthesize highly-loaded-yet-stable SACs with strong metal–support interactions.36,203 For example, ML integrated DFT calculations were employed to correlate the stability of SACs embedded on oxide supports with the binding energy (Ebind) and cohesive energy of the bulk metal (Ec). Assisted by ML methods, it was found that the diffusion activation barrier (Ea) correlates with (Ebind)^2/Ec in the physical descriptor space,204 while Ebind was previously found to correlate with Ec.205

Designed SACs should be thermodynamically stable with the lowest energy state. Therefore, thermodynamic stability and optimal combination of dual atomic catalysts embedded in graphdiyne were also investigated by using d-band center modifications and formation stability. Using Gaussian process regression (GPR) as the ML algorithm with seven input features, the potential f–d orbital coupling was found to be the most important factor in tuning the d-band center with high stability.33 Based on these results, the combination of lanthanide metals and transition metals leads to appropriate stability and activity. The thermodynamic stability of SAACs was also investigated in terms of aggregation energies and adsorbate (O*)-caused changes in the structure by using ML algorithms trained with DFT-calculated data for 38 different SAACs on a Cu support. A GPR model predicted the aggregation energies and O* adsorption energies with MAEs of 0.092 and 0.091 eV, respectively. Moreover, the GPR model is extendable to other substrates, adsorbates, and larger cluster sizes to address the large number of degrees of freedom and decrease the calculation time.206

The zero-valence stability and electron transfer ability of SACs should also be investigated when assessing stability, by considering the redox process between transition metals and a graphdiyne support using ML and DFT. It was indicated that amongst transition metals, Co, Pd, and Pt show high stability of zero-valence SACs based on the difference of energy barriers between gaining and losing electrons.207 Fuzzy C-Means (FCM) as an unsupervised ML algorithm was used for the separation of DFT-calculated data. The developed ML algorithm has also been applied to create a database capable of screening out SACs embedded in graphdiyne.207 The different number and directions of electron transfer between the transition metals and graphdiyne were also analyzed, finding that the initial one-electron transfer is the most difficult one.
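To illustrate how fuzzy C-means separates data, the following minimal sketch implements the standard FCM updates for one-dimensional data: soft memberships are computed from the distances to each cluster center, and centers are then recomputed as membership-weighted means. The 1-D setting, the fuzzifier m = 2, the initial centers, and the data values are all assumptions chosen for illustration and are not taken from ref. 207.

```python
def memberships(points, centers, m=2.0, eps=1e-12):
    """Soft membership of every point in every cluster (rows sum to 1)."""
    power = 2.0 / (m - 1.0)
    U = []
    for x in points:
        # eps avoids division by zero when a point sits on a center
        d = [abs(x - c) + eps for c in centers]
        U.append([1.0 / sum((d[i] / d[j]) ** power for j in range(len(d)))
                  for i in range(len(d))])
    return U

def fuzzy_c_means_1d(points, centers, m=2.0, n_iter=50):
    """Alternate membership and center updates for a fixed number of steps."""
    centers = list(centers)
    for _ in range(n_iter):
        U = memberships(points, centers, m)
        centers = [
            sum(U[k][i] ** m * points[k] for k in range(len(points)))
            / sum(U[k][i] ** m for k in range(len(points)))
            for i in range(len(centers))
        ]
    # memberships consistent with the final centers
    return centers, memberships(points, centers, m)

# Hypothetical energy-barrier differences (eV) forming two loose groups.
points = [0.10, 0.20, 0.15, 1.90, 2.00, 2.10]
centers, U = fuzzy_c_means_1d(points, centers=[min(points), max(points)])

print("centers:", [round(c, 3) for c in centers])
for x, row in zip(points, U):
    print(x, [round(u, 3) for u in row])
```

Unlike hard clustering, each data point keeps a fractional membership in every cluster, which can be useful when the DFT-calculated values of neighbouring groups overlap.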

Very recently, the stability of the SAAC configuration was examined using an ML-based approach to investigate the tendency of the promoter atom to diffuse into the bulk material, form surface clusters, or avoid alloying with the host.208 Decision trees, neural networks (NN), and SVM with atomic properties as the input features were used to analyze DFT-calculated data. Then, a physical bond counting model was combined with a KRR algorithm to expand the domain where the model is useful.

The stability and activity of SACs embedded in NxCy (TM@NxCy) were screened and explored in terms of the structure, coordination, formation energy, structural and electrochemical stability, electronic properties, electrical conductivity, and reaction mechanism for the HER, OER, and ORR using DFT- and ML-based descriptors.209 Among various TM@NxCy SACs, TM@N2C2 shows higher electrochemical catalytic performance, tends to be more easily formed, and possesses longer durability without aggregation or dissolution. In the TM@N2C2 templates, Ni/Ru/Rh/Pt show low HER overpotentials. The ML-based descriptors indicate superior HER, OER, and ORR performances of TM@N2C2 compared to those of benchmark noble metal catalysts. It was shown for the first time that both TM and carbon atoms participate in H adsorption.

Table 1 summarizes the applied ML algorithms and their applications in SAC design through input feature engineering and feature importance analysis. The list of abbreviations for Table 1 is presented in Table 2. As shown in Table 1, SVM, KRR, RFR, and DNN are the most commonly used supervised ML algorithms for the design of SACs to describe the relationship between the input features and SAC activity. The mentioned algorithms are commonly implemented using Scikit-learn.210 Atomic properties are mainly used as the input features for the design of SACs, of which the number of electrons in the d orbital and enthalpy of vaporization are usually the most important input features for ML algorithms. However, the application of ML is limited by the lack of not only a large and high-quality database but also a generalized ML algorithm for further studies in this field.

Table 1 Summary of ML algorithms and their applications in SACs' design. List of abbreviations is presented in Table 2
# Support/substrate ML algorithms Reaction Purpose Input features Most important features Year Ref.
1 CeO2, TiO2, MgO, ZnO, SeTiO3, MoS2, and graphene LASSO, elastic net, ridge Stability Ec, Ec^−1, Ec^0.5, Ec^−0.5, Ec^2, Ec^−2, ln(Ec), Eb, Eb^−1, Eb^0.5, Eb^−0.5, Eb^2, Eb^−2, ln(Eb), Eb^2/Ec (Eb)^2/Ec 2020 204
2 Graphdiyne (bi-atom catalysts) GPR Optimal combination of metals for high stability Potential f–d orbital coupling 2021 33
3 Graphdiyne FCM Clustering the data EA, EN, Qe, εd, etc. 2019 207
4 Cu, Ru, Rh, Pd, Ag, Re, Os, Ir, Pt, and Au GKR, SVM, GPR Aggregation energy and ΔGO* AtN, Atwt, AtPN, AtGN, AtR, EN, IE, EA, B01,O*, etc. AtR, EN, and AtGN 2020 206
5 Transition metals DT, SVM, NN, hybrid KRR Stability AtN, Atwt, AtGN, AtR, rcov, PEN, IE1, Ef, de, etc. 2020 208
6 NxCy GNB, LR, KNN, radius neighbor classifier, support vector classifier, NN, DT, RFR, ETR, and GBR HER, OER, and ORR Stability and activity AtN, IE1, etc. 2021 209
7 Graphene KRR, RFR, NN, SISSO HER ΔGH* εd, rcov, q, dunocc, docc, N, rd, Ef, IE, EN docc and q 2020 166
8 Graphene (dual atom catalysts) RFR ORR UL M1–M2, M12–N, AtR, Ne,O, PEN, IE1, EA of two metals M12–N, M1–M2, and Ne,O 2020 174
9 Carbon FCNN OER η AtR, de, EN, EA, and IE1 de, AtR, and EA 2021 167
10 g-C3N4 GBR OER and ORR ΔGOH* εd, Qe, EN, EA, IE1, AtR, and de, etc. IE1 and Qe 2021 171
11 Graphene RFR HER, ORR, OER UL de, Hf,ox, PEN, the sum of PEN, etc. de 2020 172
12 2D materials LSBoost HER and N2RR ΔG EN, EA, IE, and diso,e, etc. diso,e for NRR and EA for the HER 2021 31
13 Graphene RFR and SVM ΔGH*, ΔGOH*, ΔGO*, and ΔGOOH* Adsorbate type 2020 175
14 2D materials RFR ORR ΔGO* Hf,ox, dpe, EN, EA, IE1, NN, SEN, etc. Hf,ox and dpe 2019 173
15 Graphene NN HER EXAFS spectra Experimental EXAFS spectrum 2021 145
16 g-C3N4, CN, and C2N ETR method CO2RR ΔGOH* and ΔGOCH* AtN, de, AtR, EN, Hvap, IE, and EA de and Hvap 2020 176
17 Transition metals SVM, KRR, GBR, GPR, DTR, ETR, RFR, ABR, MLPR, KNR CO2RR ΔGCO*, ΔGCHO*, ΔGCOOH*, ΔGHCOO*, and ΔGCOH* EN, Ne, and ratio of EN and Ne Ratio of EN and Ne 2020 191
18 Au(111) RFR N2RR ΔGN2* AtR, EN, EA, AtGN, de AtGN 2021 165
19 MBenes SVM HER ΔGH* Q, AtR of C, N, and B elements, molar ratio, AtR, and EA of metal q 2020 192
20 MXenes SVM, RFR, ANN, LASSO, KNN, Bayesian HER ΔGH* and E Molar volume of the surface element 2021 193
21 MBenes and 2D-materials LGBM N2RR ΔGN2* N–N bond length 2021 195
22 Graphene Extreme GBR CO2RR and HER ΔGCO* 2020 197
23 Graphdiyne Bag-tree algorithm HER ΔGH* 2020 194
24 Graphdiyne DNN and LGBM N2RR and HER ΔG EN, AtN, AtR, NN, CN, etc. CN 2020 196
25 C2N, C1N1, and C1S1 RFR ORR and OER ΔGO* AtN, AtR, Ne,O, EN, IE1, EA, SEN, Hf,ox Hf,ox and Ne,O 2021 164
26 Cu GBR, SVM, RFR CO2RR ΔGCO* Ne 2021 252


Table 2 List of abbreviations for Table 1
Abbreviation Explanation
GPR Gaussian process regression
GKR Gaussian kernel regression
GNB Gaussian naive Bayes
SVM Support vector machine
LASSO Least absolute shrinkage and selection operator
SISSO Sure independence screening and sparsifying operator
FCM Fuzzy C-means
GBR Gradient boosting regression
LGBM Light gradient boosting machine
LR Logistic regression
KRR Kernel ridge regression
RFR Random forest regression
ERT Extremely randomized trees
NN Neural network
FCNN Full connection neural network
DNN Deep neural network
ANN Artificial neural network
KNN k-nearest neighbors
LSBoost Least-squares boosting
DT Decision tree
DTR Decision tree regression
ETR Extra tree regression
ABR Adaptive boost regression
TPOT Tree-based pipeline optimization tool
MLPR Multilayer perceptron regression
KNR k-neighbor regression
SAC Single atom catalyst
SAAC Single atom alloy catalyst
Ec, Eb Cohesive energy of bulk metals, binding energy
AtN, Atwt, AtR Atomic number, atomic weight, atomic radius
AtPN, AtGN Period number, group number
EN Electronegativity
PEN Pauling electronegativity
SEN Sum of the electronegativity of coordinated atoms such as N and C
IE, IE1 Ionization energy, first ionization energy
EA Electron affinity
εd d-states' center
rcov Covalent radius
rd Zunger radius
Ne,O Outer electron number
docc,e Number of occupied d states
de The electron numbers of d orbitals
diso,e Isolated electrons in d orbitals
dpe Adjusted electron numbers of d/p orbitals
Ne Number of valence electrons
Ef Formation energy of a single atom site
Hf,ox Oxide formation enthalpy
Hvap Enthalpy of vaporization
Q, Qe Bader charge, charge transfer of metal atoms
CN Coordination number
N N Number of coordinated N atoms
M1–M2 The distance between the two metal atoms
M12–N The average distance between the two metal atoms and the coordinated N atoms
η Overpotential
ΔG Gibbs free energies
E Adsorption energies
UL Limiting potential
Vonset Onset potential


Moreover, based on Table 1, the d-band center, enthalpy of vaporization, Bader charge, ionization energy, electron affinity, covalent radius, number of electrons in the d orbital, formation energy, oxide formation enthalpy, etc. are mainly used as the key descriptors to describe the catalytic activity of SACs. Still, one of the main hurdles for employing ML in heterogeneous catalyst design is the lack of appropriate descriptors as input features for ML. An appropriate descriptor needs to simultaneously possess: (1) physical interpretability, (2) high simplicity, and (3) relatively high feature importance. To some extent, the black-box nature of ML techniques occasionally makes a physical interpretation of descriptors, such as the d-band center and enthalpy of vaporization, non-trivial. In particular, the d-band center is widely adopted as an efficient descriptor,211 typically with high feature importance to describe the reactivity of SACs. However, the d levels of atomically dispersed metal atoms on a graphene substrate may not form a band, which makes evaluating the position of the d-band center impossible. Therefore, frontier molecular orbitals and the density of states (DOS) seem more appropriate descriptors than the d-band center.212 However, obtaining the frontier molecular orbitals and DOS requires time-consuming DFT calculations, which undermines their value as input features. In fact, simple descriptors should be based on metal atom and substrate properties that are readily obtained without time-consuming DFT calculations. In contrast to the Bader charge and DOS, descriptors such as the atomic number, number of electrons in the d orbital, ionization energy, and coordination number of metal atoms satisfy the simplicity requirement.

5. Summary and future prospects

Recently, ML has gained much interest for the rational design of heterogeneous catalysts due to its potential for robust and fast prediction of catalyst properties by establishing structure–activity relationships. High throughput screening and feature importance analysis can be achieved by establishing deep structure–activity relationships. However, ML is still at an early stage for the design of heterogeneous catalysts. In this review, high throughput screening and feature importance analysis using ML are provided as the guidelines for heterogeneous catalyst screening and discovery. Although much research has been carried out on the application of ML to improve the activity and stability of heterogeneous catalysts and SACs, there are still challenges to be resolved, requiring additional studies as follows:

(1) There remains room for ML to investigate the catalytic performances and stability,213–216 and improve calculated parameters for stable SACs.217,218 In addition, the ML technique can help to investigate the hybridization of SACs,219 atomic interface effect,220 and aggregation energy.206 Moreover, SACs face challenges such as low metal loading, low selectivity and activity, and the lack of catalytic mechanisms.136 Therefore ML can help the community to understand the reaction pathways and the catalytic mechanisms221–225 to improve the selectivity and activity of highly loaded SACs on graphene supports.226–229 In addition, there is a clear need for ML to consider environmental effects, interfacial engineering, SAC coverage, and the potential for agglomeration. ML can be used for the synthesis of highly loaded SACs, multi-metal SACs, and multi-atom catalysts.230,231 In other words, since the structure–activity relationships for nanoclusters and DACs are much more complicated than those of SACs,232 it will be useful to apply ML for predicting adsorption energies for them using new descriptors to consider the synergetic effect of several metals.233

(2) ML techniques continue to improve for studying adsorption energies, overpotentials, and metal–support interactions for various SACs, but the field of predictive SAC synthesis to guide experiments is much needed. Because SACs face tedious preparation processes,8,234,235 ML can accelerate high-throughput experimentation for the synthesis and characterization of SACs.190,236–241 ML can also be applied to predict Faraday efficiency and onset potentials to help understand the volcano plots.

(3) A major hurdle for developing ML-aided heterogeneous catalyst design is the lack of sufficient and consistent datasets. Data scarcity, bias, and noise from both experiments and QM calculations must be addressed as a high priority to avoid overfitting.48 In order to solve this issue, active learning and transfer learning can be applied, which are efficient in compensating for the lack of data. In other words, a large database composed of DFT-calculated and experimental data is required to train a generalized ML algorithm for systematic and comprehensive discovery of SACs. We expect that in the near future, with a huge database and a universal ML algorithm, the applicability of theoretical calculations for electroreduction reactions using SACs will be improved greatly.242 In addition, the vast parameter space for dynamic catalysts requires applying ML to screen candidate catalysts by predicting the regions with high selectivity and operability.243 The effect of the coordination number, coordination atoms, designed bond length, and bond angle on the current density, overpotentials, and reaction mechanism should be considered through ML.244–246

(4) ML has the potential to predict the properties of SACs very quickly and accurately, but its application has been limited to specific systems using various ML algorithms. Therefore, a fair comparison to assess the strengths and best use of different ML algorithms is needed. Also, similar to ML-aided retrosynthesis and reaction planning,72,247,248 a strong need is the development of a universal (generalized) ML algorithm that changes ML from a supportive tool to a surrogate tool for SAC design. This universal ML algorithm should be extended to widespread SACs and supports for all electroreduction reactions toward efficient and cost-effective potential SACs to balance between the activity and stability.249

(5) Table 1 covers two-dimensional (2D) materials, whose structural simplicity reduces the computational cost. However, three-dimensional (3D) materials, such as oxides and nitrides,250,251 play a major role in catalysis and should be extensively investigated using existing or new ML algorithms.
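
As an illustration of the active learning strategy mentioned in point (3), the following minimal Python sketch shows a pool-based loop in which a small committee of surrogate models, fitted on bootstrap resamples of the labelled data, selects the next candidate to be labelled. Everything in it is an assumption made for illustration: the one-dimensional descriptor, the `toy_oracle` function that stands in for an expensive DFT calculation, the nearest-neighbour surrogate, and all function names are hypothetical and are not taken from the studies reviewed here.

```python
import random
import statistics


def toy_oracle(x):
    """Hypothetical stand-in for an expensive DFT calculation."""
    return (x - 0.3) ** 2 + 0.1 * x


def knn_predict(train, x, k=2):
    """Average the targets of the k labelled points nearest to x."""
    nearest = sorted(train, key=lambda pair: abs(pair[0] - x))[:k]
    return sum(y for _, y in nearest) / len(nearest)


def committee_std(train, x, rng, n_members=10):
    """Spread of predictions from surrogates fitted on bootstrap resamples."""
    preds = []
    for _ in range(n_members):
        resample = [rng.choice(train) for _ in train]
        preds.append(knn_predict(resample, x))
    return statistics.pstdev(preds)


def active_learning(pool, n_init=3, n_queries=5, seed=0):
    """Label n_init random candidates, then query the most uncertain ones."""
    rng = random.Random(seed)
    labelled_idx = rng.sample(range(len(pool)), n_init)
    train = [(pool[i], toy_oracle(pool[i])) for i in labelled_idx]
    for _ in range(n_queries):
        unlabelled = [i for i in range(len(pool)) if i not in labelled_idx]
        # Query the candidate on which the committee disagrees most.
        best = max(unlabelled, key=lambda i: committee_std(train, pool[i], rng))
        labelled_idx.append(best)
        train.append((pool[best], toy_oracle(pool[best])))
    return labelled_idx, train


# Hypothetical candidate pool: 21 descriptor values between 0 and 1.
pool = [i / 20 for i in range(21)]
labelled_idx, train = active_learning(pool)
print(len(train), "of", len(pool), "candidates labelled")
```

In this sketch only 8 of the 21 candidates are ever passed to the oracle, which is the sense in which active learning can compensate for limited data. In practice the oracle would be a DFT calculation or an experiment, and the surrogate would be one of the models discussed in this review.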

Conflicts of interest

The authors declare that there are no conflicts of interest to acknowledge for this research.

Acknowledgements

Z. L. acknowledges support from the RGC (16304421), Innovation and Technology Commission (ITC-CNERC14SC01), Guangdong Science and Technology Department (Project#: 2020A0505090003), Research Fund of Guangdong-Hong Kong-Macao Joint Laboratory for Intelligent Micro-Nano Optoelectronic Technology (No. 2020B1212030010), IER foundation (HT-JD-CXY-201907), and Shenzhen Special Fund for Central Guiding the Local Science and Technology Development (2021Szvup136). Technical assistance from the Materials Characterization and Preparation Facilities of HKUST is greatly appreciated. W. A. G. acknowledges support from the DOE Liquid Sunlight Alliance (LiSA) (DE-SC0021266) and the US National Science Foundation (NSF CBET-2005250).

References

  1. Y. Liu, O. C. Esan, Z. Pan and L. An, Machine learning for advanced energy materials, Energy and AI, 2021, 3, 100049 CrossRef.
  2. D. Lemm, G. F. von Rudorff and O. A. von Lilienfeld, Machine learning based energy-free structure predictions of molecules, transition states, and solids, Nat. Commun., 2021, 12, 1–10 CrossRef PubMed.
  3. Y. Guo, et al., Machine-Learning-Guided Discovery and Optimization of Additives in Preparing Cu Catalysts for CO2 Reduction, J. Am. Chem. Soc., 2021, 143, 5755–5762 CrossRef CAS PubMed.
  4. Z. J. Baum, et al., Artificial Intelligence in Chemistry: Current Trends and Future Directions, J. Chem. Inf. Model., 2021, 61, 3197–3212 CrossRef CAS PubMed.
  5. F. Faber, A. Lindmaa, O. Anatole Von Lilienfeld and R. Armiento, Crystal Structure Representations for Machine Learning Models of Formation Energies, Int. J. Quantum Chem., 2015, 115(16), 1094–1101 CrossRef CAS.
  6. G. Pilania, A. Mannodi-Kanakkithodi, B. P. Uberuaga, R. Ramprasad, J. E. Gubernatis and T. Lookman, Machine learning bandgaps of double perovskites, Sci. Rep., 2016, 6(1), 1–10 CrossRef PubMed.
  7. S.-D. Huang, C. Shang, X.-J. Zhang and Z.-P. Liu, Material discovery by combining stochastic surface walking global optimization with a neural network, Chem. Sci., 2017, 8(9), 6327–6337 RSC.
  8. Y. Han, et al., Machine-learning-driven synthesis of carbon dots with enhanced quantum yields, ACS Nano, 2020, 14, 14761–14768 CrossRef PubMed.
  9. H. Yin, et al., The data-intensive scientific revolution occurring where two-dimensional materials meet machine learning, Cell Rep. Phys. Sci., 2021, 2, 100482 CrossRef.
  10. Y. Liu, T. Zhao, W. Ju and S. Shi, Materials discovery and design using machine learning, J. Materiomics, 2017, 3, 159–177 CrossRef.
  11. S. Kito and T. Hattori, Neural network as a tool for catalyst development, Catal. Today, 1995, 23, 347–355 CrossRef.
  12. S. Kito, T. Hattori and Y. Murakami, Estimation of the acid strength of mixed oxides by a neural network, Ind. Eng. Chem. Res., 1992, 31, 979–981 CrossRef CAS.
  13. T. Toyao, et al., Machine Learning for Catalysis Informatics: Recent Applications and Prospects, ACS Catal., 2020, 10, 2260–2297 CrossRef CAS.
  14. C. Chen, et al., A Critical Review of Machine Learning of Energy Materials, Adv. Energy Mater., 2020, 10, 1–36 Search PubMed.
  15. L. Chanussot, A. Das, S. Goyal, T. Lavril, M. Shuaibi, M. Riviere, K. Tran, J. Heras-Domingo, C. Ho, W. Hu, A. Palizhati, A. Sriram, B. Wood, J. Yoon, D. Parikh, C. Lawrence Zitnick and Z. Ulissi, Open Catalyst 2020 (OC20) Dataset and Community Challenges, ACS Catal., 2021, 11(10), 6059–6072 CrossRef CAS.
  16. I. Funes-Ardoiz and F. Schoenebeck, Established and Emerging Computational Tools to Study Homogeneous Catalysis—From Quantum Mechanics to Machine Learning, Chem, 2020, 6, 1904–1913 CAS.
  17. Z. Li, S. Wang and H. Xin, Toward artificial intelligence in catalysis, Nat. Catal., 2018, 1, 641–642 CrossRef.
  18. P. Schlexer Lamoureux, et al., Machine Learning for Computational Heterogeneous Catalysis, ChemCatChem, 2019, 11, 3581–3601 CrossRef CAS.
  19. J. G. Freeze, H. R. Kelly and V. S. Batista, Search for Catalysts by Inverse Design: Artificial Intelligence, Mountain Climbers, and Alchemists, Chem. Rev., 2019, 119(11), 6595–6612 CrossRef CAS PubMed.
  20. J. A. Keith, et al., Combining Machine Learning and Computational Chemistry for Predictive Insights Into Chemical Systems, Chem. Rev., 2021, 121(16), 9816–9872 CrossRef CAS PubMed.
  21. J. Xu, X.-M. Cao and P. Hu, Perspective on computational reaction prediction using machine learning methods in heterogeneous catalysis, Phys. Chem. Chem. Phys., 2021, 23, 11155 RSC.
  22. W. Yang, T. T. Fidelis and W.-H. Sun, Machine Learning in Catalysis, From Proposal to Practicing, ACS Omega, 2019, 5(1), 83–88 CrossRef PubMed.
  23. M. Erdem Günay and R. Yıldırım, Recent advances in knowledge discovery for heterogeneous catalysis using machine learning, Catal. Rev., 2021, 63, 120–164 CrossRef.
  24. W. Liu, et al., Molecular Dynamics and Machine Learning in Catalysts, Catalysts, 2021, 11, 1129 CrossRef CAS.
  25. Z. Yu and W. Huang, Accelerating Optimizing the Design of Carbon-based Electrocatalyst Via Machine Learning, Electroanalysis, 2021, 34(4), 599–607 CrossRef.
  26. Y. Guan, et al., Machine Learning in Solid Heterogeneous Catalysis: Recent Developments, Challenges and Perspectives, Chem. Eng. Sci., 2021, 248, 117224 CrossRef.
  27. N. V. Orupattur, S. H. Mushrif and V. Prasad, Catalytic materials and chemistry development using a synergistic combination of machine learning and ab initio methods, Comput. Mater. Sci., 2020, 174, 109474 CrossRef CAS.
  28. J. G. Freeze, H. R. Kelly and V. S. Batista, Search for Catalysts by Inverse Design: Artificial Intelligence, Mountain Climbers, and Alchemists, Chem. Rev., 2019, 119, 6595–6612 CrossRef CAS PubMed.
  29. A. J. Medford, M. Ross Kunz, S. M. Ewing, T. Borders and R. Fushimi, Extracting Knowledge from Data through Catalysis Informatics, ACS Catal., 2018, 8(8), 7403–7429 CrossRef CAS.
  30. K. Takahashi, et al., The Rise of Catalyst Informatics: Towards Catalyst Genomics, ChemCatChem, 2019, 11(4), 1146–1152 CrossRef CAS.
  31. Z. W. Chen, et al., Machine-learning-accelerated discovery of single-atom catalysts based on bidirectional activation mechanism, Chem Catalysis, 2021, 1(1), 183–195 CrossRef.
  32. A. Soyemi and T. Szilvási, Trends in computational molecular catalyst design, Dalton Trans., 2021, 50(30), 10325–10339 RSC.
  33. M. Sun, et al., Self-Validated Machine Learning Study of Graphdiyne-Based Dual Atomic Catalyst, Adv. Energy Mater., 2021, 11(13), 2003796–2003807 CrossRef CAS.
  34. A. Haywood, et al., Kernel Methods for Predicting Yields of Chemical Reactions, J. Chem. Inf. Model., 2022, 62(9), 2077–2092 CrossRef CAS PubMed.
  35. W. S. Noble, What is a support vector machine?, Nat. Biotechnol., 2006, 24, 1565–1567 CrossRef CAS PubMed.
  36. J. Schmidt, M. R. G. Marques, S. Botti and M. A. L. Marques, Recent advances and applications of machine learning in solid-state materials science, npj Comput. Mater., 2019, 5(1), 1–31 CrossRef.
  37. V. Svetnik, et al., Random Forest: A Classification and Regression Tool for Compound Classification and QSAR Modeling, J. Chem. Inf. Comput. Sci., 2003, 43, 1947–1958 CrossRef CAS PubMed.
  38. L. Breiman, Random forests, Mach. Learn., 2001, 45, 5–32 CrossRef.
  39. D. T. Ahneman, J. G. Estrada, S. Lin, S. D. Dreher and A. G. Doyle, Predicting reaction performance in C–N cross-coupling using machine learning, Science, 2018, 360, 186–190 CrossRef CAS PubMed.
  40. C. Strobl, A.-L. Boulesteix, A. Zeileis and T. Hothorn, Bias in random forest variable importance measures: Illustrations, sources and a solution, BMC Bioinf., 2007, 8, 1–21 CrossRef PubMed.
  41. A. Malek, et al., A Data-Driven Framework for the Accelerated Discovery of CO2 Reduction Electrocatalysts, Front. Energy Res., 2021, 9, 1–15 CrossRef.
  42. S. A. Palkovits, Primer about Machine Learning in Catalysis-A Tutorial with Code, ChemCatChem, 2020, 12, 3995–4008 CrossRef CAS.
  43. B. Huang and O. Anatole Von Lilienfeld, Ab Initio Machine Learning in Chemical Compound Space, Chem. Rev., 2021, 121(16), 10001–10036 CrossRef CAS PubMed.
  44. A. Wei, H. Ye, Z. Guo and J. Xiong, SISSO-assisted prediction and design of mechanical properties of porous graphene with a uniform nanopore array, Nanoscale Adv., 2022, 4, 1455–1463 RSC.
  45. C. E. Rasmussen, Gaussian Processes in Machine Learning, in Advanced Lectures on Machine Learning. ML 2003. Lecture Notes in Computer Science, ed. O. Bousquet, U. von Luxburg and G. Rätsch, Springer, Berlin, Heidelberg, 2004, vol. 3176,  DOI:10.1007/978-3-540-28650-9_4.
  46. T. Gupta, M. Zaki and N. M. A. Krishnan, MatSciBERT: A Materials Domain Language Model for Text Mining and Information Extraction, npj Comput. Mater., 2021, 8(1), 1–11 Search PubMed.
  47. J. Žižka, F. Dařena and A. Svoboda, Text Mining with Machine Learning: Principles and Techniques, CRC Press, 2019 Search PubMed.
  48. Y. Gambo, et al., Catalyst design and tuning for oxidative dehydrogenation of propane – A review, Appl. Catal., A, 2021, 609, 117914 CrossRef CAS.
  49. J. R. Kitchin, Machine learning in catalysis, Nat. Catal., 2018, 1, 230–232 CrossRef.
  50. B. R. Goldsmith, J. Esterhuizen, J. X. Liu, C. J. Bartel and C. Sutton, Machine learning for heterogeneous catalyst design and discovery, AIChE J., 2018, 64, 2311–2323 CrossRef CAS.
  51. Z. Li, L. E. K. Achenie and H. Xin, An Adaptive Machine Learning Strategy for Accelerating Discovery of Perovskite Electrocatalysts, ACS Catal., 2020, 10(7), 4377–4384 CrossRef CAS.
  52. X.-T. Li, L. Chen, G.-F. Wei, C. Shang and Z.-P. Liu, Sharp Increase in Catalytic Selectivity in Acetylene Semihydrogenation on Pd Achieved by a Machine Learning Simulation-Guided Experiment, ACS Catal., 2020, 10(17), 9694–9705 CrossRef CAS.
  53. M. Tamtaji, et al., Singlet Oxygen Photosensitization Using Graphene-Based Structures and Immobilized Dyes: A Review, ACS Appl. Nano Mater., 2021, 4, 7563–7586 CrossRef CAS.
  54. X. Ma, Z. Li, L. E. K. Achenie and H. Xin, Machine-Learning-Augmented Chemisorption Model for CO2 Electroreduction Catalyst Screening, J. Phys. Chem. Lett., 2015, 6, 3528–3533 CrossRef CAS PubMed.
  55. M. L. Mohammed, et al., Optimisation of alkene epoxidation catalysed by polymer supported Mo(VI) complexes and application of artificial neural network for the prediction of catalytic performances, Appl. Catal., A, 2013, 466, 142–152 CrossRef CAS.
  56. M. Sasaki, H. Hamada, Y. Kintaichi and T. Ito, Application of a neural network to the analysis of catalytic reactions Analysis of NO decomposition over Cu/ZSM-5 zeolite, Appl. Catal., A, 1995, 132(2), 261–270 CrossRef CAS.
  57. B. Selvaratnam and R. T. Koodali, Machine learning in experimental materials chemistry, Catal. Today, 2020, 371, 77–84 CrossRef.
  58. Q. Tao, P. Xu, M. Li and W. Lu, Machine learning for perovskite materials design and discovery, npj Comput. Mater., 2021, 7, 1–18 CrossRef.
  59. L. Shi, D. Chang, X. Ji and W. Lu, Using Data Mining to Search for Perovskite Materials with Higher Specific Surface Area, J. Chem. Inf. Model., 2018, 58, 2420–2427 CrossRef CAS PubMed.
  60. M. Mowbray, et al., Machine learning for biochemical engineering: A review, Biochem. Eng. J., 2021, 172, 108054 CrossRef CAS.
  61. A. Coşgun, M. E. Günay and R. Yıldırım, Exploring the critical factors of algal biomass and lipid production for renewable fuel production by machine learning, Renewable Energy, 2021, 163, 1299–1317 CrossRef.
  62. N. Alper Tapan, R. Yıldırım and M. Erdem Günay, Analysis of past experimental data in literature to determine conditions for high performance in biodiesel production, Biofuels, Bioprod. Biorefin., 2016, 10, 422–434 CrossRef CAS.
  63. K. McCullough, T. Williams, K. Mingle, P. Jamshidi and J. Lauterbach, High-throughput experimentation meets artificial intelligence: A new pathway to catalyst discovery, Phys. Chem. Chem. Phys., 2020, 22, 11174–11196 RSC.
  64. C. Desgranges and J. Delhommelle, Towards a machine learned thermodynamics: Exploration of free energy landscapes in molecular fluids, biological systems and for gas storage and separation in metal-organic frameworks, Mol. Syst. Des. Eng., 2021, 6, 52–65 RSC.
  65. S. B. Torrisi, et al., Random forest machine learning models for interpretable X-ray absorption near-edge structure spectrum-property relationships, npj Comput. Mater., 2020, 6(1), 1–11 CrossRef.
  66. M. Basyaruddin Abdul Rahman, et al., Application of Artificial Neural Network for Yield Prediction of Lipase-Catalyzed Synthesis of Dioctyl Adipate, Appl. Biochem. Biotechnol., 2009, 158(3), 722–735 CrossRef PubMed.
  67. I. Miyazato, T. N. Nguyen, L. Takahashi, T. Taniike and K. Takahashi, Representing Catalytic and Processing Space in Methane Oxidation Reaction via Multioutput Machine Learning, J. Phys. Chem. Lett., 2021, 12, 814 Search PubMed.
  68. B. Kunkel, A. Kabelitz, A. G. Buzanich and S. Wohlrab, Increasing the Efficiency of Optimized V-SBA-15 Catalysts in the Selective Oxidation of Methane to Formaldehyde by Artificial Neural Network Modelling, Catalysts, 2020, 10, 1411 CrossRef CAS.
  69. N. Artrith, Z. Lin and J. G. Chen, Predicting the Activity and Selectivity of Bimetallic Metal Catalysts for Ethanol Reforming using Machine Learning, ACS Catal., 2021, 22, 41 Search PubMed.
  70. S. Ma and Z.-P. Liu, Machine Learning for Atomic Simulation and Activity Prediction in Heterogeneous Catalysis: Current Status and Future, ACS Catal., 2020, 10(22), 13213–13226 CrossRef CAS.
  71. T. Yang, et al., High-Throughput Identification of Exfoliable Two-Dimensional Materials with Active Basal Planes for Hydrogen Evolution, ACS Energy Lett., 2020, 2313 CrossRef CAS.
  72. P. Schwaller, et al., Molecular Transformer: A Model for Uncertainty-Calibrated Chemical Reaction Prediction, ACS Cent. Sci., 2019, 5, 1572–1583 CrossRef CAS PubMed.
  73. C. W. Coley, W. H. Green and K. F. Jensen, Machine Learning in Computer-Aided Synthesis Planning, Acc. Chem. Res., 2018, 51, 1281–1289 CrossRef CAS PubMed.
  74. U. Zavyalova, M. Holena, R. Schlögl and M. Baerns, Statistical Analysis of Past Catalytic Data on Oxidative Methane Coupling for New Insights into the Composition of High-Performance Catalysts, ChemCatChem, 2011, 3(12), 1935–1947 CrossRef CAS.
  75. S. Hyun Woo Kim, et al., Reaction condition optimization for non-oxidative conversion of methane using artificial intelligence, React. Chem. Eng., 2021, 6(2), 235–243 RSC.
  76. C. Wulf, et al., A Unified Research Data Infrastructure for Catalysis Research – Challenges and Concepts, ChemCatChem, 2021, 13(14), 3223–3236 CrossRef CAS.
  77. E. Kim, et al., Materials Synthesis Insights from Scientific Literature via Text Extraction and Machine Learning, Chem. Mater., 2017, 29, 31 Search PubMed.
  78. J. Ohyama, et al., Direct design of active catalysts for low temperature oxidative coupling of methane via machine learning and data mining, Catal. Sci. Technol., 2021, 11, 524 RSC.
  79. L. Takahashi, et al., Constructing catalyst knowledge networks from catalyst big data in oxidative coupling of methane for designing catalysts, Chem. Sci., 2021, 12(38), 12546–12555 RSC.
  80. K. Takahashi, L. Takahashi, T. N. Nguyen, A. Thakur and T. Taniike, Multidimensional Classification of Catalysts in Oxidative Coupling of Methane through Machine Learning and High-Throughput Data, J. Phys. Chem. Lett., 2020, 11, 6819–6826 CrossRef CAS PubMed.
  81. T. N. Nguyen, et al., Learning Catalyst Design Based on Bias-Free Data Set for Oxidative Coupling of Methane, ACS Catal., 2021, 11, 1797–1809 CrossRef CAS.
  82. S. Nakanowatari, et al., Extraction of Catalyst Design Heuristics from Random Catalyst Dataset and their Utilization in Catalyst Development for Oxidative Coupling of Methane, ChemCatChem, 2021, 13, 3262–3269 CrossRef CAS.
  83. J. Ohyama, S. Nishimura and K. Takahashi, Data Driven Determination of Reaction Conditions in Oxidative Coupling of Methane via Machine Learning, ChemCatChem, 2019, 11, 4307–4313 CrossRef CAS.
  84. W. Wang, et al., Automated pipeline for superalloy data by text mining, npj Comput. Mater., 2022, 8, 1–12 CrossRef.
  85. S. Nishimura, et al., Revisiting Machine Learning Predictions for Oxidative Coupling of Methane (OCM) based on Literature Data, ChemCatChem, 2020, 12, 5888–5892 CrossRef CAS.
  86. S. Mine, et al., Analysis of Updated Literature Data up to 2019 on the Oxidative Coupling of Methane Using an Extrapolative Machine-Learning Method to Identify Novel Catalysts, ChemCatChem, 2021, 13(16), 3636–3655 CrossRef CAS.
  87. K. Suzuki, et al., Statistical Analysis and Discovery of Heterogeneous Catalysts Based on Machine Learning from Diverse Published Data, ChemCatChem, 2019, 11, 4537–4547 CrossRef CAS.
  88. Y. Chen, R. Li, H. Suo and C. Liu, Evaluation of a Data-Driven, Machine Learning Approach for Identifying Potential Candidates for Environmental Catalysts: From Database Development to Prediction, ACS ES&T Engg, 2021, 1(8), 1246–1257 Search PubMed.
  89. Ç. Odabaşi, M. E. Günay and R. Yildirim, Knowledge extraction for water gas shift reaction over noble metal catalysts from publications in the literature between 2002 and 2012, Int. J. Hydrogen Energy, 2014, 39, 5733–5746 CrossRef.
  90. A. Smith, A. Keane, J. A. Dumesic, G. W. Huber and V. M. Zavala, A machine learning framework for the analysis and prediction of catalytic activity from experimental data, Appl. Catal., B, 2020, 263, 118257 CrossRef CAS.
  91. K. A. Brown, S. Brittman, D. Jariwala and U. Celano, Machine Learning in Nanoscience: Big Data at Small Scales, Nano Lett., 2021, 20(1), 2–10 CrossRef PubMed.
  92. T. Nhat Nguyen, et al., High-Throughput Experimentation and Catalyst Informatics for Oxidative Coupling of Methane, ACS Catal., 2021, 10(2), 921–932 CrossRef.
  93. H. Tang, A. Hosein and M. Mattioli-Belmonte, Traditional Chinese Medicine and orthopedic biomaterials: Host of opportunities from herbal extracts, Mater. Sci. Eng., C, 2021, 120, 111760 CrossRef CAS PubMed.
  94. Y. Shi, P. L. Prieto, T. Zepel, S. Grunert and J. E. Hein, Automated Experimentation Powers Data Science in Chemistry, Acc. Chem. Res., 2021, 54, 31 CrossRef PubMed.
  95. N. J. Szymanski, et al., Toward autonomous design and synthesis of novel inorganic materials, Mater. Horiz., 2021, 8, 2169–2198 RSC.
  96. Y. Xiao, C. Shen and N. Hadaeghi, Quantum Mechanical Screening of 2D MBenes for the Electroreduction of CO2 to C1 Hydrocarbon Fuels, J. Phys. Chem. Lett., 2021, 12, 6370–6382 CrossRef CAS PubMed.
  97. E. Stach, et al., Autonomous experimentation systems for materials development: A community perspective, Matter, 2021, 4, 2702–2726 CrossRef.
  98. O. Stroyuk, et al., High-Throughput Robotic Synthesis and Photoluminescence Characterization of Aqueous Multinary Copper-Silver Indium Chalcogenide Quantum Dots, Part. Part. Syst. Charact., 2021, 38(10), 2100169 CrossRef CAS.
  99. O. A. Moses, et al., Integration of data-intensive, machine learning and robotic experimental approaches for accelerated discovery of catalysts in renewable energy-related reactions, Materials Reports: Energy, 2021, 1, 100049 CrossRef.
  100. B. Burger, et al., A mobile robotic chemist, Nature, 2020, 583(7815), 237–241 CrossRef CAS PubMed.
  101. L. Ge, et al., Predicted Optimal Bifunctional Electrocatalysts for the Hydrogen Evolution Reaction and the Oxygen Evolution Reaction Using Chalcogenide Heterostructures Based on Machine Learning Analysis of in Silico Quantum Mechanics Based High Throughput Screening, J. Phys. Chem. Lett., 2020, 11, 869–876 CrossRef CAS PubMed.
  102. R. Jinnouchi, H. Hirata and R. Asahi, Extrapolating Energetics on Clusters and Single-Crystal Surfaces to Nanoparticles by Machine-Learning Scheme, J. Phys. Chem. C, 2017, 121(47), 26397–26405 CrossRef.
  103. R. Jinnouchi and R. Asahi, Predicting Catalytic Activity of Nanoparticles by a DFT-Aided Machine-Learning Algorithm, J. Phys. Chem. Lett., 2017, 8, 4279–4283 CrossRef CAS PubMed.
  104. K. Tran and Z. W. Ulissi, Active learning across intermetallics to guide discovery of electrocatalysts for CO2 reduction and H2 evolution, Nat. Catal., 2018, 1, 696–703 CrossRef CAS.
  105. Y. Huang, et al., Mechanistic Understanding and Design of Non-noble Metal-based Single-atom Catalysts Supported on Two-dimensional Materials for CO2 Electroreduction, J. Mater. Chem. A, 2021, 10(11), 5813–5834 RSC.
  106. M. Zhong, et al., Accelerated discovery of CO2 electrocatalysts using active machine learning, Nature, 2020, 581, 178–183 CrossRef CAS PubMed.
  107. Z. W. Ulissi, A. J. Medford, T. Bligaard and J. K. Nørskov, To address surface reaction network complexity using scaling relations machine learning and DFT calculations, Nat. Commun., 2017, 8(1), 1–7 CrossRef PubMed.
  108. T. Toyao, et al., Toward Effective Utilization of Methane: Machine Learning Prediction of Adsorption Energies on Metal Alloys, J. Phys. Chem. C, 2018, 122, 8315–8326 CrossRef CAS.
  109. J. A. Esterhuizen, B. R. Goldsmith and S. Linic, Theory-Guided Machine Learning Finds Geometric Structure-Property Relationships for Chemisorption on Subsurface Alloys, Chem, 2020, 6, 3100–3117 CAS.
  110. R. B. Wexler, J. M. P. Martirez and A. M. Rappe, Chemical Pressure-Driven Enhancement of the Hydrogen Evolving Activity of Ni2P from Nonmetal Surface Doping Interpreted via Machine Learning, J. Am. Chem. Soc., 2018, 140, 4678–4683 CrossRef CAS PubMed.
  111. Z. Li, X. Ma and H. Xin, Feature engineering of machine-learning chemisorption models for catalyst design, Catal. Today, 2017, 280, 232–238 CrossRef CAS.
  112. Z. Li, S. Wang, W. S. Chin, L. E. Achenie and H. Xin, High-throughput screening of bimetallic catalysts enabled by machine learning, J. Mater. Chem. A, 2017, 5, 24131–24138 RSC.
  113. I. Takigawa, K.-I. Shimizu, K. Tsuda and S. Takakusagi, Machine-learning prediction of the d-band center for metals and bimetals, RSC Adv., 2016, 6(58), 52587–52595 RSC.
  114. I. Tanaka, Nanoinformatics, Springer Nature, 2018 Search PubMed.
  115. C. Liu, et al., Frontier Molecular Orbital Based Analysis of Solid-Adsorbate Interactions over Group 13 Metal Oxide Surfaces, J. Phys. Chem. C, 2020, 124, 15355–15365 CrossRef CAS.
  116. Z. H. Liu, T. T. Shi and Z. X. Chen, Machine learning prediction of monatomic adsorption energies with non-first-principles calculated quantities, Chem. Phys. Lett., 2020, 755, 137772 CrossRef CAS.
  117. X. Wang, et al., Accelerating 2D MXene catalyst discovery for the hydrogen evolution reaction by computer-driven workflow and an ensemble learning strategy, J. Mater. Chem. A, 2020, 8, 23488–23497 RSC.
  118. J. Zheng, et al., High-Throughput Screening of Hydrogen Evolution Reaction Catalysts in MXene Materials, J. Phys. Chem. C, 2020, 124, 13695–13705 CrossRef CAS.
  119. Y. Sun, et al., Covalency competition dominates the water oxidation structure–activity relationship on spinel oxides, Nat. Catal., 2020, 3, 554–563 CrossRef CAS.
  120. K. W. Ting, et al., Catalytic Methylation of m-Xylene, Toluene, and Benzene Using CO2 and H2 over TiO2-Supported Re and Zeolite Catalysts: Machine-Learning-Assisted Catalyst Optimization, ACS Catal., 2021, 11, 5829–5838 CrossRef CAS.
  121. M. Rück, B. Garlyyev, F. Mayr, A. S. Bandarenka and A. Gagliardi, Oxygen Reduction Activities of Strained Platinum Core–Shell Electrocatalysts Predicted by Machine Learning, J. Phys. Chem. Lett., 2020, 11, 1773–1780 CrossRef PubMed.
  122. L. Ward, A. Agrawal, A. Choudhary and C. Wolverton, A general-purpose machine learning framework for predicting properties of inorganic materials, npj Comput. Mater., 2016, 2(1), 1–7 CrossRef.
  123. Q. Yang, et al., Revealing property-performance relationships for efficient CO2 hydrogenation to higher hydrocarbons over Fe-based catalysts: Statistical analysis of literature data and its experimental validation, Appl. Catal., B, 2021, 282, 119554 CrossRef CAS.
  124. O. Mamun, K. T. Winther, J. R. Boes and T. Bligaard, High-throughput calculations of catalytic properties of bimetallic alloy surfaces, Sci. Data, 2019, 6(1), 1–9 CrossRef CAS PubMed.
  125. V. Fung, J. Zhang, E. Juarez and B. G. Sumpter, Benchmarking graph neural networks for materials chemistry, npj Comput. Mater., 2021, 7(1), 1–8 CrossRef.
  126. J. Xu, X.-M. Cao and P. Hu, Improved Prediction for the Methane Activation Mechanism on Rutile Metal Oxides by a Machine Learning Model with Geometrical Descriptors, J. Phys. Chem. C, 2019, 123(47), 28802–28810 CrossRef.
  127. Y. Chen, Y. Huang, T. Cheng and W. A. Goddard, Identifying Active Sites for CO2 Reduction on Dealloyed Gold Surfaces by Combining Machine Learning with Multiscale Simulations, J. Am. Chem. Soc., 2019, 141, 11651–11657 CrossRef CAS PubMed.
  128. Z. W. Ulissi, et al., Machine-learning methods enable exhaustive searches for active Bimetallic facets and reveal active site motifs for CO2 reduction, ACS Catal., 2017, 7, 6600–6608 CrossRef CAS.
  129. D. Ologunagba and S. Kattel, Machine learning prediction of surface segregation energies on low index bimetallic surfaces, Energies, 2020, 13, 20–25 CrossRef.
  130. A. R. Singh, B. A. Rohr, J. A. Gauthier and J. K. Nørskov, Predicting Chemical Reaction Barriers with a Machine Learning Model, Catal. Lett., 2019, 149, 2347–2354 CrossRef CAS.
  131. Z. W. Ulissi, A. R. Singh, C. Tsai and J. K. Nørskov, Automated Discovery and Construction of Surface Phase Diagrams Using Machine Learning, J. Phys. Chem. Lett., 2016, 7, 3931–3935 CrossRef CAS PubMed.
  132. B. Weng, et al., Simple descriptor derived from symbolic regression accelerating the discovery of new perovskite catalysts, Nat. Commun., 2020, 11, 1–8 CrossRef PubMed.
  133. S. Ding, M. J. Hülsey, J. Pérez-Ramírez and N. Yan, Transforming Energy with Single-Atom Catalysts, Joule, 2019, 3, 2897–2929 CrossRef CAS.
  134. M. D. Hossain, Y. Huang, T. H. Yu, W. A. Goddard and Z. Luo, Reaction mechanism and kinetics for CO2 reduction on nickel single atom catalysts from quantum mechanics, Nat. Commun., 2020, 11, 1–14 CrossRef PubMed.
  135. Y. Gu, et al., Atomic-Scale Tailoring and Molecular-Level Tracking of Oxygen-Containing Tungsten Single-Atom Catalysts with Enhanced Singlet Oxygen Generation, ACS Appl. Mater. Interfaces, 2021, 13(31), 37142–37151 CrossRef CAS PubMed.
  136. C. Chen, Z. Zhang, G. Li, L. Li and Z. Lin, Recent Advances on Nanomaterials for Electrocatalytic CO2 Conversion, Energy Fuels, 2021, 35, 7485–7510 CrossRef CAS.
  137. K. Sun, et al., Electrochemical Oxygen Reduction to Hydrogen Peroxide via a Two-Electron Transfer Pathway on Carbon-Based Single-Atom Catalysts, Adv. Mater. Interfaces, 2021, 8, 1–16 Search PubMed.
  138. A. Wang, J. Li and T. Zhang, Heterogeneous single-atom catalysis, Nat. Rev. Chem., 2018, 2, 65–81 CrossRef CAS.
  139. H. Xu, D. Cheng, D. Cao and X. C. Zeng, A universal principle for a rational design of single-atom electrocatalysts, Nat. Catal., 2018, 1, 339–348 CrossRef CAS.
  140. M. D. Hossain, et al., Rational Design of Graphene-Supported Single Atom Catalysts for Hydrogen Evolution Reaction, Adv. Energy Mater., 2019, 9, 1–10 CAS.
  141. L. Li, X. Chang, X. Lin, Z. J. Zhao and J. Gong, Theoretical insights into single-atom catalysts, Chem. Soc. Rev., 2020, 49, 8156–8178 RSC.
  142. D. Liu, Q. He, S. Ding and L. Song, Structural Regulation and Support Coupling Effect of Single-Atom Catalysts for Heterogeneous Catalysis, Adv. Energy Mater., 2020, 10(32), 2001482 CrossRef CAS.
  143. M. R. Dobbelaere, P. P. Plehiers, R. Van de Vijver, C. V. Stevens and K. M. Van Geem, Machine Learning in Chemical Engineering: Strengths, Weaknesses, Opportunities, and Threats, Engineering, 2021, 7(9), 1201–1211 CrossRef CAS.
  144. K. Wang, et al., Metal-free nitrogen-doped carbon nanosheets: A catalyst for the direct synthesis of imines under mild conditions, Green Chem., 2019, 21, 2448–2461 RSC.
  145. X. Liu, et al., Identifying the Activity Origin of a Cobalt Single-Atom Catalyst for Hydrogen Evolution Using Supervised Learning, Adv. Funct. Mater., 2021, 1–9, 2100547 CrossRef.
  146. L. Wu, T. Guo and T. Li, Rational design of transition metal single-Atom electrocatalysts: A simulation-based, machine learning-Accelerated study, J. Mater. Chem. A, 2020, 8, 19290–19299 RSC.
  147. M. Wang and H. Zhu, Machine Learning for Transition-Metal-Based Hydrogen Generation Electrocatalysts, ACS Catal., 2021, 11(7), 3930–3937 CrossRef CAS.
  148. M. G. Kibria, et al., Electrochemical CO2 Reduction into Chemical Feedstocks: From Mechanistic Electrocatalysis Models to System Design, Adv. Mater., 2019, 31, 1–24 CrossRef PubMed.
  149. J. Kim, D. Kang, S. Kim and H. W. Jang, Catalyze Materials Science with Machine Learning, ACS Mater. Lett., 2021, 3(8), 1151–1171 CrossRef CAS.
  150. L. Chen, X. Xu, W. Yang and J. Jia, Recent advances in carbon-based electrocatalysts for oxygen reduction reaction, Chin. Chem. Lett., 2020, 31, 626–634 CrossRef CAS.
  151. D. Johnson, Z. Qiao and A. Djire, Progress and Challenges of Carbon Dioxide Reduction Reaction on Transition Metal Based Electrocatalysts, ACS Appl. Energy Mater., 2021, 4(9), 8661–8684 CrossRef CAS.
  152. R. T. Hannagan, G. Giannakakis, M. Flytzani-Stephanopoulos and E. C. H. Sykes, Single-Atom Alloy Catalysis, Chem. Rev., 2020, 120, 12044–12088 CrossRef CAS PubMed.
  153. S. Saxena, T. S. Khan, F. Jalid, M. Ramteke and M. A. Haider, In silico high throughput screening of bimetallic and single atom alloys using machine learning and ab initio microkinetic modelling, J. Mater. Chem. A, 2020, 8, 107–123 RSC.
  154. A. Dasgupta, Y. Gao, S. R. Broderick, E. B. Pitman and K. Rajan, Machine Learning-Aided Identification of Single Atom Alloy Catalysts, J. Phys. Chem. C, 2020, 124, 14158–14166 CrossRef CAS.
  155. R. A. Hoyt, et al., Machine Learning Prediction of H Adsorption Energies on Ag Alloys, J. Chem. Inf. Model., 2019, 59, 1357–1365 CrossRef CAS PubMed.
  156. S. Mitchell, et al., Automated Image Analysis for Single-Atom Detection in Catalytic Materials by Transmission Electron Microscopy, J. Am. Chem. Soc., 2022, 144(18), 8018–8029 CrossRef CAS PubMed.
  157. S. Xiang, et al., Solving the structure of ‘single-atom’ catalysts using machine learning-assisted XANES analysis, Phys. Chem. Chem. Phys., 2022, 24, 5116–5124 RSC.
  158. J. Zhang, et al., Single-atom catalysts for thermal- and electro-catalytic hydrogenation reactions, J. Mater. Chem. A, 2022, 10, 5743–5757 RSC.
  159. K. Jorner, A. Tomberg, C. Bauer, C. Sköld and P. O. Norrby, Organic reactivity from mechanism to machine learning, Nat. Rev. Chem., 2021, 5(4), 240–255 CrossRef CAS.
  160. D. Gao, T. Liu, G. Wang and X. Bao, Structure Sensitivity in Single-Atom Catalysis toward CO2 Electroreduction, ACS Energy Lett., 2021, 6, 713–727 CrossRef CAS.
  161. A. F. Zahrt, et al., Prediction of higher-selectivity catalysts by computer-driven workflow and machine learning, Science, 2019, 363(6424), eaau5631 CrossRef CAS PubMed.
  162. N. Zhang, et al., Single-atom site catalysts for environmental catalysis, Nano Res., 2020, 13, 3165–3182 CrossRef CAS.
  163. G. L. W. Hart, T. Mueller, C. Toher and S. Curtarolo, Machine learning for alloys, Nat. Rev. Mater., 2021, 6(8), 730–755 CrossRef.
  164. Y. Ying, K. Fan, X. Luo, J. Qiao and H. Huang, Unravelling the origin of bifunctional OER/ORR activity for single-atom catalysts supported on C2N by DFT and machine learning, J. Mater. Chem. A, 2021, 9(31), 16860–16867 RSC.
  165. G. Zheng, et al., High-Throughput Screening of a Single-Atom Alloy for Electroreduction of Dinitrogen to Ammonia, ACS Appl. Mater. Interfaces, 2021, 13, 16336–16344 CrossRef CAS PubMed.
  166. V. Fung, G. Hu, Z. Wu and D. E. Jiang, Descriptors for Hydrogen Evolution on Single Atom Catalysts in Nitrogen-Doped Graphene, J. Phys. Chem. C, 2020, 124, 19571–19578 CrossRef CAS.
  167. L. Wu, T. Guo and T. Li, Machine learning-accelerated prediction of overpotential of oxygen evolution reaction of single-atom catalysts, iScience, 2021, 24, 102398 CrossRef CAS PubMed.
  168. X. Zhu, et al., Activity Origin and Design Principles for Oxygen Reduction on Dual-Metal-Site Catalysts: A Combined Density Functional Theory and Machine Learning Study, J. Phys. Chem. Lett., 2019, 10, 7760–7766 CrossRef CAS PubMed.
  169. X. Wan, et al., Machine-Learning-Accelerated Catalytic Activity Predictions of Transition Metal Phthalocyanine Dual-Metal-Site Catalysts for CO2Reduction, J. Phys. Chem. Lett., 2021, 12, 6111–6118 CrossRef CAS PubMed.
  170. X. Wan, Z. Zhang, W. Yu and Y. Guo, A density-functional-theory-based and machine-learning-accelerated hybrid method for intricate system catalysis, Materials Reports: Energy, 2021, 1, 100046 CrossRef.
  171. H. Niu, et al., Single-Atom Rhodium on Defective g-C3N4: A Promising Bifunctional Oxygen Electrocatalyst, ACS Sustainable Chem. Eng., 2021, 9, 3590–3599 CrossRef CAS.
  172. S. Lin, H. Xu, Y. Wang, X. C. Zeng and Z. Chen, Directly predicting limiting potentials from easily obtainable physical properties of graphene-supported single-Atom electrocatalysts by machine learning, J. Mater. Chem. A, 2020, 8, 5663–5670 RSC.
  173. X. Guo, et al., Simultaneously Achieving High Activity and Selectivity toward Two-Electron O2 Electroreduction: The Power of Single-Atom Catalysts, ACS Catal., 2019, 9, 11042–11054 CrossRef CAS.
  174. C. Deng, et al., Understanding activity origin for the oxygen reduction reaction on bi-atom catalysts by DFT studies and machine-learning, J. Mater. Chem. A, 2020, 8, 24563–24571 RSC.
  175. J. Melisande Fischer, et al., Accurate prediction of binding energies for two-dimensional catalytic materials using machine learning, ChemCatChem, 2020, 12, 5109–5120 CrossRef CAS.
  176. H. Yuan, Z. Li, X. C. Zeng and J. Yang, Descriptor-Based Design Principle for Two-Dimensional Single-Atom Catalysts: Carbon Dioxide Electroreduction, J. Phys. Chem. Lett., 2020, 11, 3481–3487 CrossRef CAS PubMed.
  177. B. Meyer, B. Sawatlon, S. Heinen, O. Anatole Von Lilienfeld and C. Emence Corminboeuf, Machine learning meets volcano plots: computational discovery of cross-coupling catalysts, Chem. Sci., 2018, 9(35), 7069–7077 RSC.
  178. S. Pablo-García, R. García-Muelas, A. Sabadell-Rendón and N. López, Dimensionality reduction of complex reaction networks in heterogeneous catalysis: From linear-scaling relationships to statistical learning techniques, Wiley Interdiscip. Rev.: Comput. Mol. Sci., 2021, 11(6), e1540 Search PubMed.
  179. L. Gong, et al., Catalytic Mechanisms and Design Principles for Single-Atom Catalysts in Highly Efficient CO2 Conversion, Adv. Energy Mater., 2019, 9(44), 1902625 CrossRef CAS.
  180. Z. Yu, H. Xu and D. Cheng, Design of Single Atom Catalysts, Advances in Physics: X, 2021, 6(1), 1905545 Search PubMed.
  181. X. Guan, W. Gao and Q. Jiang, Design of bimetallic atomic catalysts for CO2 reduction based on an effective descriptor, J. Mater. Chem. A, 2021, 9, 4770–4780 RSC.
  182. C. Ling, et al., A General Two-Step Strategy–Based High-Throughput Screening of Single Atom Catalysts for Nitrogen Fixation, Small Methods, 2019, 3(9), 1800376 CrossRef.
  183. W. Song, L. Fu, C. He, K. Xie and Y. Guo, Computational screening of 3d transition metal atoms anchored on the defective graphene for efficient electrocatalytic N2 fixation, ChemPhysChem, 2021, 22(16), 1712–1721 CrossRef CAS PubMed.
  184. Y. Tang, et al., Nitrogen and boron coordinated single-atom catalysts for low-temperature CO/NO oxidations, J. Mater. Chem. A, 2021, 9, 15329–15345 RSC.
  185. F. Gao, Y. Wei, J. Du and G. Jiang, Theoretical screening of 2D materials supported transition-metal single atoms as efficient electrocatalysts for hydrogen evolution reaction, Materialia, 2021, 18, 101168 CrossRef CAS.
  186. Y. Wang, et al., High-throughput screening of carbon-supported single metal atom catalysts for oxygen reduction reaction, Nano Res., 2021, 15(2), 1054–1060 CrossRef.
  187. C. Ren, et al., Relative Efficacy of Co− X4 Embedded Graphene (X = N, S, B, and P) Electrocatalysts towards Hydrogen Evolution Reaction: Is Nitrogen Really the Best Choice?, ChemCatChem, 2020, 12, 536–543 CrossRef CAS.
  188. O. V. Prezhdo, Advancing Physical Chemistry with Machine Learning, J. Phys. Chem. Lett., 2020, 11, 9656–9658 CrossRef CAS PubMed.
  189. Z. K. Han, et al., Single-atom alloy catalysts designed by first-principles calculations and artificial intelligence, Nat. Commun., 2021, 12, 1–9 CrossRef PubMed.
  190. J. Liu, et al., Recent Progress in Non-Precious Metal Single Atomic Catalysts for Solar and Non-Solar Driven Hydrogen Evolution Reaction, Adv. Sustainable Syst., 2020, 4(11), 2000151 CrossRef CAS.
  191. Z. Yang, W. Gao and Q. Jiang, A machine learning scheme for the catalytic activity of alloys with intrinsic descriptors, J. Mater. Chem. A, 2020, 8, 17507–17515 RSC.
  192. X. Sun, et al., Machine-learning-accelerated screening of hydrogen evolution catalysts in MBenes materials, Appl. Surf. Sci., 2020, 526, 146522 CrossRef CAS.
  193. H. Liang, M. Xu and E. Asselin, A Study of Two-Dimensional Single Atom-Supported MXenes as Hydrogen Evolution Reaction Catalysts Using DFT and Machine Learning, ChemRxiv, 2021 DOI:10.26434/chemrxiv.14566656.v1.
  194. M. Sun, A. W. Dougherty, B. Huang, Y. Li and C. H. Yan, Accelerating Atomic Catalyst Discovery by Theoretical Calculations-Machine Learning Strategy, Adv. Energy Mater., 2020, 10(12), 1903949 CrossRef CAS.
  195. M. Zafari, A. S. Nissimagoudar, M. Umer, G. Lee and K. S. Kim, First principles and machine learning based superior catalytic activities and selectivities for N2 reduction in MBenes, defective 2D materials and 2D π-conjugated polymer-supported single atom catalysts, J. Mater. Chem. A, 2021, 9, 9203–9213 RSC.
  196. M. Zafari, D. Kumar, M. Umer and K. S. Kim, Machine learning-based high throughput screening for nitrogen fixation on boron-doped single atom catalysts, J. Mater. Chem. A, 2020, 8, 5209–5216 RSC.
  197. A. Chen, X. Zhang, L. Chen, S. Yao and Z. Zhou, A Machine Learning Model on Simple Features for CO2Reduction Electrocatalysts, J. Phys. Chem. C, 2020, 124, 22471–22478 CrossRef CAS.
  198. T. Yang, et al., Protecting Single Atom Catalysts with Graphene/Carbon-Nitride ‘chainmail’, J. Phys. Chem. Lett., 2019, 10, 3129–3133 CrossRef CAS PubMed.
  199. C. Rivera-Cárcamo, et al., Stabilization of Metal Single Atoms on Carbon and TiO2 Supports for CO2 Hydrogenation: The Importance of Regulating Charge Transfer, Adv. Mater. Interfaces, 2021, 8, 1–17 Search PubMed.
  200. E. J. M. Hensen, D. G. Vlachos, Y. Wang and Y. Q. Su, Finite-temperature structures of supported subnanometer catalysts inferred via statistical learning and genetic algorithm-based optimization, ACS Nano, 2020, 14, 13995–14007 CrossRef PubMed.
  201. P. Serp, Cooperativity in supported metal single atom catalysis, Nanoscale, 2021, 13, 5985–6004 RSC.
  202. H. Zhang, X. F. Lu, Z. P. Wu and X. W. D. Lou, Emerging Multifunctional Single-Atom Catalysts/Nanozymes, ACS Cent. Sci., 2020, 6, 1288–1301 CrossRef CAS PubMed.
  203. K. T. Butler, D. W. Davies, H. Cartwright, O. Isayev and A. Walsh, Machine learning for molecular and materials science, Nature, 2018, 559, 547–555 CrossRef CAS PubMed.
  204. Y. Q. Su, et al., Stability of heterogeneous single-atom catalysts: a scaling law mapping thermodynamics to kinetics, npj Comput. Mater., 2020, 6(1), 1–7 CrossRef.
  205. N. J. O'Connor, A. S. M. Jonayat, M. J. Janik and T. P. Senftle, Interaction trends between single metal atoms and oxide supports identified with density functional theory and statistical learning, Nat. Catal., 2018, 1, 531–539 CrossRef.
  206. Z. Lu, S. Yadav and C. V. Singh, Predicting aggregation energy for single atom bimetallic catalysts on clean and O* adsorbed surfaces through machine learning models, Catal. Sci. Technol., 2020, 10, 86–98 RSC.
  207. M. Sun, et al., Mapping of atomic catalyst on graphdiyne, Nano Energy, 2019, 62, 754–763 CrossRef CAS.
  208. K. K. Rao, Q. K. Do, K. Pham, D. Maiti and L. C. Grabow, Extendable Machine Learning Model for the Stability of Single Atom Alloys, Top. Catal., 2020, 63, 728–741 CrossRef CAS.
  209. M. Ha, et al., Tuning metal single atoms embedded in NxCy moieties toward high-performance electrocatalysis, Energy Environ. Sci., 2021, 14, 3455–3468 RSC.
  210. F. Pedregosa, et al., Scikit-learn: Machine learning in Python, J. Mach. Learn. Res., 2011, 12, 2825–2830 Search PubMed.
  211. G. Di Liberto, L. A. Cipriano and G. Pacchioni, Role of Dihydride and Dihydrogen Complexes in Hydrogen Evolution Reaction on Single-Atom Catalysts, J. Am. Chem. Soc., 2021, 143, 20431–20441 CrossRef CAS PubMed.
  212. G. D. Liberto, L. A. Cipriano and G. Pacchioni, Universal Principles for the Rational Design of Single Atom Electrocatalysts ? Handle with Care, ACS Catal., 2022, 12(10), 5846–5856 CrossRef.
  213. J. Zhang, H. Yang and B. Liu, Coordination Engineering of Single-Atom Catalysts for the Oxygen Reduction Reaction: A Review, Adv. Energy Mater., 2021, 11, 1–20 CAS.
  214. X. Li, et al., Microenvironment modulation of single-atom catalysts and their roles in electrochemical energy conversion, Sci. Adv., 2020, 6, 1–20 Search PubMed.
  215. Q. Wang, et al., Recent Advances in Strategies for Improving the Performance of CO 2 Reduction Reaction on Single Atom Catalysts, Small Science, 2021, 1, 2000028 CrossRef.
  216. L. Li, et al., Recent Developments of Microenvironment Engineering of Single-Atom Catalysts for Oxygen Reduction toward Desired Activity and Selectivity, Adv. Funct. Mater., 2021, 31(45), 2103857 CrossRef CAS.
  217. J. Li, et al., Highly Active and Stable Metal Single-Atom Catalysts Achieved by Strong Electronic Metal-Support Interactions, J. Am. Chem. Soc., 2019, 141, 14515–14519 CrossRef CAS PubMed.
  218. Z. Li, et al., Metal-support interactions in designing noble metal-based catalysts for electrochemical CO2 reduction: Recent advances and future perspectives, Nano Res., 2021, 14(11), 3795–3809 CrossRef CAS.
  219. S. Mitchell and J. Pérez-Ramírez, Single atom catalysis: a decade of stunning progress and the promise for a bright future, Nat. Commun., 2020, 11, 10–12 CrossRef PubMed.
  220. Z. Jiang, et al., Atomic interface effect of a single atom copper catalyst for enhanced oxygen reduction reactions, Energy Environ. Sci., 2019, 12, 3508–3514 RSC.
  221. W. Ju, et al., Unraveling Mechanistic Reaction Pathways of the Electrochemical CO2 Reduction on Fe-N-C Single-Site Catalysts, ACS Energy Lett., 2019, 4, 1663–1671 CrossRef CAS.
  222. Q. Fan, et al., Electrochemical CO2 reduction to C2+ species: Heterogeneous electrocatalysts, reaction pathways, and optimization strategies, Mater. Today Energy, 2018, 10, 280–301 CrossRef.
  223. E. J. Askins, et al., Toward a mechanistic understanding of electrocatalytic nanocarbon, Nat. Commun., 2021, 12, 1–15 CrossRef PubMed.
  224. F. Wang, W. Xie, L. Yang, D. Xie and S. Lin, Revealing the importance of kinetics in N-coordinated dual-metal sites catalyzed oxygen reduction reaction, J. Catal., 2021, 396, 215–223 CrossRef CAS.
  225. T. Cheng, H. Xiao and W. A. Goddard, Reaction Mechanisms for the Electrochemical Reduction of CO2 to CO and Formate on the Cu(100) Surface at 298 K from Quantum Mechanics Free Energy Calculations with Explicit Water, J. Am. Chem. Soc., 2016, 138, 13802–13805 CrossRef CAS PubMed.
  226. S. Back, J. Lim, N. Y. Kim, Y. H. Kim and Y. Jung, Single-atom catalysts for CO2 electroreduction with significant activity and selectivity improvements, Chem. Sci., 2017, 8, 1090–1096 RSC.
  227. T. Zheng, et al., Large-Scale and Highly Selective CO 2 Electrocatalytic Reduction on Nickel Single-Atom Catalyst, Joule, 2019, 3, 265–278 CrossRef CAS.
  228. W. Song, L. Fu, C. He and K. Xie, Carbon-Coordinated Single Cr Site for Efficient Electrocatalytic N2 Fixation, Adv. Theory Simul., 2021, 4, 2100044 CrossRef CAS.
  229. M. Li, et al., Heterogeneous Single-Atom Catalysts for Electrochemical CO2 Reduction Reaction, Adv. Mater., 2020, 32, 1–24 Search PubMed.
  230. J. Lu, et al., Scalable two-step annealing method for preparing ultra-high-density single-atom catalyst libraries, Nat. Nanotechnol., 2021, 17(2), 174–181 Search PubMed.
  231. L. Jiao, et al., Non-Bonding Interaction of Neighboring Fe and Ni Single-Atom Pairs on MOF-Derived N-Doped Carbon for Enhanced CO2 Electroreduction, J. Am. Chem. Soc., 2021, 143(46), 19417–19424 CrossRef CAS PubMed.
  232. M. A. Hunter, J. M. T. A. Fischer, Q. Yuan, M. Hankel and D. J. Searles, Evaluating the Catalytic Efficiency of Paired, Single-Atom Catalysts for the Oxygen Reduction Reaction, ACS Catal., 2019, 9, 7660–7667 CrossRef CAS.
  233. X. Guo, et al., Tackling the Activity and Selectivity Challenges of Electrocatalysts toward the Nitrogen Reduction Reaction via Atomically Dispersed Biatom Catalysts, J. Am. Chem. Soc., 2020, 142, 5709–5721 CrossRef CAS PubMed.
  234. F. Doherty, H. Wang, M. Yang and B. R. Goldsmith, Nanocluster and single-atom catalysts for thermocatalytic conversion of CO and CO2, Catal. Sci. Technol., 2020, 10, 5772–5791 RSC.
  235. T. Williams, K. McCullough and J. A. Lauterbach, Enabling Catalyst Discovery through Machine Learning and High-Throughput Experimentation, Chem. Mater., 2020, 32, 157–165 CrossRef CAS.
  236. A. Thakkar, et al., Artificial intelligence and automation in computer aided synthesis planning, React. Chem. Eng., 2021, 6, 27–51 RSC.
  237. N. S. Eyke, B. A. Koscher and K. F. Jensen, Toward Machine Learning-Enhanced High-Throughput Experimentation, Trends Chem., 2021, 3, 120–132 CrossRef CAS.
  238. X. Li, et al., Combining machine learning and high-throughput experimentation to discover photocatalytically active organic molecules, Chem. Sci., 2021, 12(32), 10742–10754 RSC.
  239. G. Lo Dico, Á. P. Nuñez, V. Carcelén and M. Haranczyk, Machine-learning-accelerated multimodal characterization and multiobjective design optimization of natural porous materials, Chem. Sci., 2021, 12, 9309–9317 RSC.
  240. K. Abbasi, et al., Dimensional Stacking for Machine Learning in ToF-SIMS Analysis of Heterostructures, Adv. Mater. Interfaces, 2021, 8(3), 2001648 CrossRef CAS.
  241. K. Higgins, et al., Exploration of Electrochemical Reactions at Organic-Inorganic Halide Perovskite Interfaces via Machine Learning in In Situ Time-of-Flight Secondary Ion Mass Spectrometry, Adv. Funct. Mater., 2020, 30(36), 2001995 CrossRef CAS.
  242. L. Liu and A. Corma, Metal Catalysts for Heterogeneous Catalysis: From Single Atoms to Nanoclusters and Nanoparticles, Chem. Rev., 2018, 118, 4981–5079 CrossRef CAS PubMed.
  243. M. Shetty, et al., The Catalytic Mechanics of Dynamic Surfaces: Stimulating Methods for Promoting Catalytic Resonance, ACS Catal., 2020, 10(21), 12666–12695 CrossRef CAS.
  244. H. Jing, et al., Electronics and coordination engineering of atomic cobalt trapped by oxygen-driven defects for efficient cathode in solar cells, Nano Energy, 2021, 89, 106365 CrossRef CAS.
  245. Y. Wang, et al., Regulating the coordination structure of metal single atoms for efficient electrocatalytic CO2 reduction, Energy Environ. Sci., 2020, 13, 4609–4624 RSC.
  246. Z. Kou, W. Zang, P. Wang, X. Li and J. Wang, Single atom catalysts: A surface heterocompound perspective, Nanoscale Horiz., 2020, 5, 757–764 RSC.
  247. Y. Shen, et al., Automation and computer-assisted planning for chemical synthesis, Nat. Rev. Methods Primers, 2021, 1(1), 1–23 CrossRef.
  248. M. H. S. Segler, M. Preuss and M. P. Waller, Planning chemical syntheses with deep neural networks and symbolic AI, Nature, 2018, 555, 604–610 CrossRef CAS.
  249. X. Huang, et al., Applying machine learning to balance performance and stability of high energy density materials, iScience, 2021, 24, 102240 CrossRef CAS.
  250. R. Lang, et al., Single-Atom Catalysts Based on the Metal-Oxide Interaction, Chem. Rev., 2020, 120, 11986–12043 CrossRef CAS.
  251. R. Li, et al., Single atoms supported on metal oxides for energy catalysis, J. Mater. Chem. A, 2022, 10, 5717–5742 RSC.
  252. D. Wang, et al., Accelerated prediction of Cu-based single-atom alloy catalysts for CO2 reduction by machine learning, Green Energy Environ., 2021 DOI:10.1016/j.gee.2021.10.003.

This journal is © The Royal Society of Chemistry 2022