Rui Ding ab, Junhong Chen *ab, Yuxin Chen *c, Jianguo Liu d, Yoshio Bando e and Xuebin Wang *f
a Pritzker School of Molecular Engineering, University of Chicago, Chicago, IL 60637, USA. E-mail: junhongchen@uchicago.edu
b Chemical Sciences and Engineering Division, Physical Sciences and Engineering Directorate, Argonne National Laboratory, Lemont, IL 60439, USA. E-mail: junhongchen@anl.gov
c Department of Computer Science, University of Chicago, Chicago, IL 60637, USA. E-mail: chenyuxin@uchicago.edu
d Institute of Energy Power Innovation, North China Electric Power University, Beijing, 102206, China
e Chemistry Department, College of Science, King Saud University, P.O. Box 2455, Riyadh 11451, Saudi Arabia
f College of Engineering and Applied Sciences, Nanjing University, Nanjing, 210093, China. E-mail: wangxb@nju.edu.cn
First published on 9th October 2024
Machine learning (ML) is rapidly emerging as a pivotal tool in the hydrogen energy industry for the creation and optimization of electrocatalysts, which enhance key electrochemical reactions like the hydrogen evolution reaction (HER), the oxygen evolution reaction (OER), the hydrogen oxidation reaction (HOR), and the oxygen reduction reaction (ORR). This comprehensive review demonstrates how cutting-edge ML techniques are being leveraged in electrocatalyst design to overcome the time-consuming limitations of traditional approaches. ML methods, using experimental data from high-throughput experiments and computational data from simulations such as density functional theory (DFT), readily identify complex correlations between electrocatalyst performance and key material descriptors. Leveraging its unparalleled speed and accuracy, ML has facilitated the discovery of novel candidates and the improvement of known products through its pattern recognition capabilities. This review aims to provide a tailored breakdown of ML applications in a format that is readily accessible to materials scientists. Hence, we comprehensively organize ML-driven research by commonly studied material types for different electrochemical reactions to illustrate how ML adeptly navigates the complex landscape of descriptors for these scenarios. We further highlight ML's critical role in the future discovery and development of electrocatalysts for hydrogen energy transformation. Potential challenges and gaps to fill within this focused domain are also discussed. As a practical guide, we hope this work will bridge the gap between communities and encourage novel paradigms in electrocatalysis research, aiming for more effective and sustainable energy solutions.
Despite advances in materials science and electrochemistry, the conventional method of developing electrocatalysts is mostly dependent on time-consuming trial-and-error processes. Such processes, either experimental synthesis and evaluation or numerical simulation, heavily rely on the subjectivity and experience of researchers, and usually produce unsatisfying outcomes. Corresponding limited design spaces have resulted in costly and sluggish improvement of catalytic performances because of the inability of the traditional research paradigm to manage complex systems with a large number of variables. Thus, more effective methods are urgently needed in this labor-intensive field to more widely explore a greater variety of potential electrocatalyst candidates and optimal combinations.
The rapid evolution of artificial intelligence (AI) and machine learning (ML) has transformed various areas of human society. ML has shown its potency across scientific domains including natural language processing (NLP),6–8 computer vision,9,10 and drug discovery11,12 by revealing patterns and relationships in data that may be difficult to detect through conventional analysis.13 In the field of hydrogen energy, ML promises to reshape the development of electrocatalysts, traditionally guided by researcher intuition and subjectivity, by complementing or enhancing traditional computational and experimental approaches. Traditional theoretical simulation methods, while often capable of predicting experimental outcomes with a high degree of fidelity, are computationally intensive and often struggle with optimization tasks in high-dimensional parameter spaces that require high-throughput calculations. For instance, using density functional theory (DFT) to screen for the structure with the lowest reaction energy barrier might require thousands to millions of calculations to traverse all possible configurations, which is often prohibitive in computational resources. In contrast, ML-driven surrogate models, capable of processing large, high-dimensional, multivariable datasets, can explore these spaces efficiently at significantly lower cost. These models accelerate brute-force searches for optimal configurations and unveil insights into catalyst behavior that traditional methods might overlook, such as correlations between material descriptors. Hence, they enhance both the speed and depth of insights in costly and time-consuming explorations like high-throughput experiments and DFT simulations.14–17 This enables the rapid discovery of new optimal candidates, the improvement of existing candidates,18,19 and the fine-tuning of catalytic performances, which are the core needs in the field.
Because the application of ML in electrocatalyst research is relatively recent, many studies predominantly use “off-the-shelf” methodologies and algorithms; a practical, material-oriented perspective is therefore essential to implement ML effectively in diverse material scenarios. Moreover, ML's ability to handle a wide range of input variables, from atomic-level descriptors to macroscale engineering factors, allows for a holistic optimization of catalytic systems. The interpretability and reliability of ML models are essential for deeper insights: they help distinguish, qualitatively and quantitatively, the most decisive descriptors in a complex system and reveal the mechanisms of catalyst behavior.20,21
This review offers comprehensive guidelines for materials scientists and engineers new to ML, specifically focusing on enhancing electrocatalyst designs for critical reactions such as HER, OER, HOR, and ORR. It categorizes ML-driven research by commonly studied material types, such as metal alloys, 2D materials, and single-atom catalysts, providing clear material-oriented insights. This organizational approach not only elucidates the connection between material systems and their broader applications beyond hydrogen energy (Fig. 1) but also enhances understanding by detailing key material insights for each type of electrocatalyst. That is, for each category, we delve into the most frequently used ML algorithms and identify the critical parameters and descriptors, whether derived from experimental data or theoretical simulations, that serve as essential inputs for modeling these materials. This analysis helps chemists, materials scientists, and engineers grasp the most influential features in predicting the performance of electrocatalysts. It is especially helpful for hands-on practice by electrocatalyst researchers who are unfamiliar with ML, because it guides them in preparing suitable datasets. Overall, this tailored breakdown makes ML applications more accessible, aiding materials scientists in understanding and applying ML techniques effectively. By summarizing descriptors and features commonly utilized in ML modeling, we also present the integration strategies with both theoretical and experimental approaches tailored to different material systems. This leads to a comprehensive understanding of how ML may be smoothly integrated at different fidelity levels into the design of electrocatalysts for hydrogen applications.
Our in-depth discussion further examines current advancements and prospective avenues for future expansion of this rapidly changing research field, and inspects the challenges ahead, such as bridging fidelity gaps and facilitating knowledge integration. In conclusion, this review not only offers a systematic exploration of ML's transformative role in advancing electrocatalyst design for hydrogen energy transformation but also serves as a practical guide tailored specifically for electrocatalyst researchers. By demystifying ML applications through a reader-accessible material-based focus, we advocate for a paradigm shift toward more integrative, data-driven research approaches in this field and beyond.
Fig. 1 Schematic of the review scope for the electrocatalytic material systems covered in this work.
ML models, once trained, are fast to execute and can conduct large-scale screenings of uncharted possibilities, making them invaluable as inexpensive surrogate tools in the material design space. However, data are needed in the first place for the targeted system. For instance, data derived from DFT calculations can be used to screen the best alloy compositions for optimal hydrogen species adsorption energy, where the configurations of nanoparticles related to composition play a pivotal role.27 Similarly, experimental synthesis and evaluation data can significantly improve metrics like the half-wave potential in ORR for carbon-based catalysts.28
The importance of input features cannot be overstated, as they directly impact the model's performance and predictive accuracy. Poorly chosen or limited features may lack the necessary information about the material system, rendering even the best models unable to learn effectively. Conversely, an excessive number of features can lead to overfitting and increased computational complexity. Most current research still relies on customized, handcrafted features based on researchers’ subjective understanding of the targeted material systems. This domain-specific expertise has not been thoroughly summarized, highlighting a significant gap that this review aims to fill for reaching a consensus. By summarizing the general features used in current publications, we lay the groundwork for understanding the most effective descriptors for various material systems, which will be discussed in detail in the subsequent sections focused on HER, OER, HOR and ORR.
Typical structural and geometrical descriptors include bond lengths, bond angles, atomic radii, and coordination numbers. Bond length, the distance between bonded atoms, impacts electronic properties and reactivity. For example, the bond lengths between transition metal (TM) atoms and adsorbates29 or neighboring TM atoms30 are used in predicting adsorption energies. Bond angles describe the angle between adjacent bonds around a central atom, for instance Fe–O–Fe31 angles, and influence the strength of the surface-catalyst interaction. Atomic radii, often described through covalent, ionic, and van der Waals radii,32,33 determine structural configurations, affecting how atoms pack and the overall material geometry. Coordination numbers, which count the nearest neighboring atoms around a catalytic site (or the second-nearest neighbors34), are key in understanding local atomic environments and correlating them with catalytic activity. In particular, coordinatively unsaturated atoms are more active and can serve as reaction sites. Researchers use these descriptors to describe the micro-environment of catalytic sites.
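As an illustration of how such a geometrical descriptor can be computed, the following minimal sketch counts neighbors within a distance cutoff for each atom of a small cluster. The function name `coordination_numbers`, the cutoff value, and the five-atom geometry are assumptions made for this example; periodic boundary conditions, which real crystal structures require, are deliberately ignored.

```python
import numpy as np

def coordination_numbers(positions, cutoff):
    """Count, for each atom, the atoms closer than `cutoff` (same length unit as positions)."""
    positions = np.asarray(positions, dtype=float)
    # Pairwise distance matrix (no periodic boundary conditions in this sketch).
    diff = positions[:, None, :] - positions[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    # An atom is not its own neighbor, so the zero diagonal is excluded.
    neighbor_mask = (dist < cutoff) & ~np.eye(len(positions), dtype=bool)
    return neighbor_mask.sum(axis=1)

# Hypothetical planar cluster: one central atom with four neighbors at distance 2.5.
cluster = [
    [0.0, 0.0, 0.0],
    [2.5, 0.0, 0.0],
    [-2.5, 0.0, 0.0],
    [0.0, 2.5, 0.0],
    [0.0, -2.5, 0.0],
]
cn = coordination_numbers(cluster, cutoff=3.0)
print(cn)  # the central atom has 4 neighbors, each outer atom has 1
```

The outer atoms come out with a low count, which is the kind of undercoordination signal the text associates with active sites.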
Besides these straightforward descriptors, there are other manual feature engineering techniques, such as direct one-hot encoding of atomic positions35 and number of certain types of element atoms reflecting the local atomic environment,36 which describe the immediate surroundings of atoms of the site. In general, comprehensive and appropriate descriptions of these structural and geometrical attributes are crucial for ML models to learn crystallographic knowledge.
Basic atomic properties, including electronegativity, ionization energy, atomic mass, group number, and period number, are widely used in ML models for elemental descriptions and are crucial in determining material characteristics.37,38 As TM elements are usually the studied target, d-orbital and valence electron characteristics hold significance, such as d-electron count, d-band center (εd), valence electron number, occupied and unoccupied d states near the Fermi level, and total d electrons. These features are critical in understanding catalytic behavior, influencing adsorption energies and reaction kinetics.38–40 Electronic properties like electron affinity, charge transfer, and density of states (DOS) at the Fermi level provide insights into electronic behavior affecting catalytic performance. Local density of states (LDOS)41 and total band filling42 describe the electronic environment at catalytic sites, while charge distribution analyses like Bader charge analysis quantify electron density distribution, offering local electronic insights. Inputs like Bader charge at catalytic sites, charge transfers, and charge state variations are commonly used in related studies for electronic environment characterization.43,44
Many studies use a combination of primary and derived features. For instance, a combination of primary atomic features (empirical radius, mass, electron affinity) and derived features (d-band center, formation energy of single-atom sites40) provides a more comprehensive representation. The former features are directly available, while the latter may require scenario-specific DFT calculations. Building on the foundation laid by geometrical and structural descriptors, these descriptors further provide detailed information about the material's electronic structure and the local chemical environments of the catalytic sites.
Descriptor generation methods generally transform atomic structures into fixed-size numerical fingerprints, capturing essential structural and chemical information.45 These descriptors are designed to be physically meaningful and invariant to rotations and translations, providing a robust representation of the atomic environment. Among the representative popular methods, smooth overlap of atomic positions46 (SOAP) captures local atomic density using Gaussian functions, many-body tensor representation47 (MBTR) considers interactions at multiple levels, and atom-centered symmetry functions48 (ACSF) encodes local atomic environments, all of which are particularly useful for modeling short-range atomic interactions such as adsorption energies and catalytic activities. These methods are recognized in the community as able to comprehensively represent both structural and chemical properties of the crystal structures.
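To make the idea of a fixed-size, rotation- and translation-invariant fingerprint concrete, the sketch below builds a Gaussian-smeared histogram of pairwise distances with NumPy. This is a deliberately simplified stand-in, not SOAP, MBTR, or ACSF; the function name `radial_fingerprint` and its parameters (`r_max`, `n_bins`, `sigma`) are assumptions chosen for illustration.

```python
import numpy as np

def radial_fingerprint(positions, r_max=6.0, n_bins=32, sigma=0.2):
    """Fixed-size fingerprint: sum of Gaussians centered on every pairwise distance."""
    positions = np.asarray(positions, dtype=float)
    i, j = np.triu_indices(len(positions), k=1)        # all unique atom pairs
    d = np.linalg.norm(positions[i] - positions[j], axis=1)
    grid = np.linspace(0.0, r_max, n_bins)             # fixed radial grid
    return np.exp(-((grid[None, :] - d[:, None]) ** 2) / (2 * sigma**2)).sum(axis=0)

rng = np.random.default_rng(0)
atoms = rng.uniform(0.0, 3.0, size=(6, 3))             # hypothetical 6-atom cluster

# Apply a random orthogonal transformation (from a QR decomposition) and a shift.
q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
moved = atoms @ q.T + np.array([1.0, -2.0, 0.5])

fp_original = radial_fingerprint(atoms)
fp_moved = radial_fingerprint(moved)
print(fp_original.shape, np.allclose(fp_original, fp_moved))
```

Because the fingerprint depends only on interatomic distances, it is unchanged by the rigid motion, and its length does not depend on the number of atoms. The established descriptors named above add angular and element-resolved information that this toy version lacks.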
Pre-built deep learning frameworks for solid systems automate feature extraction by directly accepting raw atomic structures as input. They handle both descriptor generation and supervised learning, using advanced neural network architectures to capture complex dependencies and interactions within crystal structures. Popular libraries include crystal graph convolutional neural networks49 (CGCNN), which represents crystal structures as graphs, allowing the model to learn directly from the structure without manual feature engineering. SchNet50 uses continuous filter convolutions to represent atoms and their interactions, providing a flexible and accurate representation of the atomic environment. SpinConv51 introduces spin convolutions to capture rotational invariance and angular dependencies in atomic interactions, achieving high performance on large-scale datasets. DimeNet++52 by Gasteiger et al., an advanced version of directional message passing neural network (DimeNet), excels at capturing angular dependencies in atomic interactions, crucial for modeling properties sensitive to atomic orientations. It also prioritizes computational efficiency for large datasets and complex systems. GemNet-OC,53 a further advancement also by Gasteiger et al. in graph neural networks (GNNs) tailored for materials science, enhances traditional methods by incorporating directional information about atomic interactions, making it particularly effective for properties sensitive to relative atomic orientations like bond angles and torsional interactions. Recently, the community has proposed more state-of-the-art methods: MACE,54 spherical channel network (SCN),55 equivariant spherical channel network (eSCN),56 neural equivariant interatomic potentials (NequIP),57 equiformer V158/V2,59 atomistic line graph neural network (ALIGNN),60 crystal Hamiltonian graph neural network (CHGNet),61 Matformer,62 M3GNet.63
These “off-the-shelf” deep learning libraries are recommended when feature customization needs are limited. They automate feature extraction, eliminating the need for manual feature engineering and for separate descriptor libraries. They are highly scalable for large datasets and complex systems, and their predefined architectures streamline the modeling process, making them efficient and user-friendly. In many communities, especially those focused on DFT and MD simulations, these frameworks are also referred to as ML potentials, as they serve as efficient surrogates for computationally expensive quantum mechanical calculations, enabling faster and more accurate simulations. In general, physics-informed descriptors represent the current frontier methods and are preferred when the research fidelity is based on DFT simulations.
The experimental and synthesis-based parameters covered in this review include a wide range of features. These features comprise experimental observations like Tafel plots, mole fractions of metal precursors, and primary atomic characteristics such as empirical radius, mass, electron affinity, ionization energy, and density. Additionally, empirical synthesis parameters like annealing temperature, heating rate, hold time, and similar parameters for hydrothermal processes are crucial. Other important parameters include material characterization properties like lattice constant, crystal plane spacing, and morphology-related information.
Given the diversity in material systems and synthesis methods, establishing a universally recommended way of preparing experimental datasets is challenging. For instance, synthesis steps for alloys differ from those for 2D materials like MoS2. Therefore, the preparation process must be tailored to the specific material system. However, a general approach involves systematically documenting and standardizing all relevant synthesis parameters and experimental conditions to ensure reproducibility and consistency across different studies. This comprehensive documentation enables the creation of high-fidelity datasets crucial for accurate prediction and optimization of electrocatalysts using ML.
Experimental data for electrocatalysts can be derived from electrochemical tests, characterization of properties, or extraction from scientific literature. The most frequently measured parameters are the overpotentials for OER and HER, with the overpotential at 10 mA cm−2 (η10) being a widely accepted benchmark for comparing catalytic activities. For ORR, metrics like mass activity and half-wave potential (E1/2) are preferred due to their greater reproducibility and relevance to practical performance. These electrochemical metrics are well adopted for assessing and comparing the activity of electrocatalyst products within the hydrogen electrocatalyst community. Additionally, some studies prefer current density,69 while others focus on device-level metrics like maximum power density and area-specific resistance.70 Beyond these primary electrochemical measurements, other material system-specific metrics include the morphology of polymerization products71 and electrochemical double-layer capacitance.72 Given the diversity of data, it is essential to standardize experimental conditions and reporting methods to ensure the dataset's consistency and comparability, thereby enabling meaningful predictions by ML models.
A comprehensive summary of the 151 papers covered in this review is provided in Table S1 (ESI†) through the online repository: https://github.com/ruiding-uchicago/ML-in-Hydrogen-Energy-Transformation-Electrocatalysts-Review/tree/main.
Nevertheless, researchers can still use it to categorize data based on similarities, with clustering as a popular method.75 Clustering organizes the items in a collection into groups so that items within a group are more similar to one another than to items in other groups. Specifically in hydrogen electrocatalysis research, unsupervised learning could be helpful in categorizing electrocatalysts based on their intrinsic properties or performance indicators. By revealing groupings among samples, clustering could also identify unexpected behaviors or anomalies, which are either exceptional candidates worthy of further investigation or outliers to discard.
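A minimal sketch of this idea, assuming scikit-learn is available, groups synthetic catalyst samples with k-means and flags the sample farthest from its cluster centroid as a candidate anomaly. The two descriptors (a d-band center and an overpotential) and all numerical values are invented for illustration only.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
# Synthetic samples; columns: hypothetical d-band center (eV), overpotential (mV).
group_a = rng.normal(loc=[-2.0, 300.0], scale=[0.1, 10.0], size=(20, 2))
group_b = rng.normal(loc=[-1.0, 450.0], scale=[0.1, 10.0], size=(20, 2))
X = np.vstack([group_a, group_b])

# Standardize so both descriptors contribute comparably to the distances.
X_scaled = StandardScaler().fit_transform(X)
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X_scaled)
labels = kmeans.labels_

# Distance to the assigned centroid as a simple anomaly score.
dist_to_centroid = np.linalg.norm(X_scaled - kmeans.cluster_centers_[labels], axis=1)
most_atypical = int(np.argmax(dist_to_centroid))
print(labels, most_atypical)
```

Whether the flagged sample is an exceptional candidate or a measurement outlier still has to be decided by the researcher; the clustering only points to where to look.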
Conversely, underfitting occurs when the model is too simple to capture the underlying structure of the data, resulting in poor performance on both the training and test sets. Underfitting is often due to overly conservative hyperparameters or an insufficiently complex model architecture. Addressing underfitting involves increasing the model's capacity and ensuring it has enough flexibility to learn from the data. Additionally, the number of features and their dimensions play a significant role, which will be discussed in Section 2.2.4 Quality of Features. In summary, the goal is a well-fitted ML model with good generalization, starting with sufficient unbiased data.
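The symptom described here, poor scores on both the training and the test set, can be reproduced with a small synthetic example. In the sketch below, assuming scikit-learn, a straight line is fitted to data generated from a quadratic law, and the model capacity is then increased by adding a squared term. The data and the choice of polynomial degree are illustrative assumptions.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)
x = rng.uniform(-3.0, 3.0, size=(200, 1))
y = x[:, 0] ** 2 + rng.normal(scale=0.3, size=200)   # quadratic trend plus noise
X_train, X_test, y_train, y_test = train_test_split(x, y, test_size=0.25, random_state=0)

scores = {}
for degree in (1, 2):
    # degree=1 is too simple for this data; degree=2 matches the underlying trend.
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(X_train, y_train)
    scores[degree] = (model.score(X_train, y_train), model.score(X_test, y_test))

for degree, (train_r2, test_r2) in scores.items():
    print(f"degree {degree}: train R2 = {train_r2:.2f}, test R2 = {test_r2:.2f}")
```

The linear model scores poorly on both splits, which distinguishes underfitting from overfitting, where the training score is high and only the test score suffers.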
AL represents a transformative approach in ML where the algorithm proactively queries an information source to obtain labels for new data points. This method contrasts with traditional supervised learning, which uses pre-established labeled datasets passively. Supervised learning mines datasets for patterns, whereas AL allows the model to choose data points based on uncertainty sampling, representative sampling, or potential to alter the model's understanding.76 Specifically, for electrocatalysis, AL becomes particularly valuable where data labeling is costly or time-consuming, as in experimental sample synthesis. By focusing on data points with the highest informational gain, AL accelerates model training and enhances data utilization, leading to a quicker improvement of model prediction.
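The uncertainty-sampling variant of AL can be sketched with a GP surrogate from scikit-learn: at each round, the unlabeled candidate with the largest predictive standard deviation is sent for labeling. The function `expensive_label` is a hypothetical stand-in for a costly synthesis or DFT calculation, and the kernel hyperparameters are held fixed (`optimizer=None`) only to keep this small example reproducible.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

def expensive_label(x):
    """Hypothetical stand-in for a costly measurement or DFT calculation."""
    return np.sin(3.0 * x) + 0.5 * x

pool = np.linspace(0.0, 3.0, 61).reshape(-1, 1)   # unlabeled candidate pool
labeled_idx = [0, 60]                              # start from two labeled points

for _ in range(8):                                 # query budget of 8 labels
    gp = GaussianProcessRegressor(
        kernel=RBF(length_scale=0.5), alpha=1e-6, normalize_y=True, optimizer=None
    )
    gp.fit(pool[labeled_idx], expensive_label(pool[labeled_idx]).ravel())
    _, std = gp.predict(pool, return_std=True)
    std[labeled_idx] = -np.inf                     # never re-query a labeled point
    labeled_idx.append(int(np.argmax(std)))        # query the most uncertain candidate

print(sorted(pool[labeled_idx, 0].tolist()))
```

With fixed kernel hyperparameters the queries simply spread over the pool; other query strategies mentioned above, such as representative sampling, would replace the `argmax` over `std` with a different score.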
Beyond AL applications, some works focus more directly on optimizing black-box functions (e.g., fitness functions calculated through DFT simulations), which are typical metrics discussed in 2.1.2. Bayesian optimization (BO), based on AL principles, excels in balancing the exploration of new possibilities and exploitation of known information.77 BO is more concerned with how to obtain better target values, for example overpotentials, while AL is focused on the precision of model prediction. This method employs a surrogate model, often a Gaussian process (GP), to predict the performance of various configurations within a Bayesian framework, thereby efficiently managing the trade-off between potential exploration costs and the value of targeted outcomes. Its ability to handle uncertainty makes it valuable in resource-intensive data collection or optimizing functions with costly evaluations. Specifically, the BO + GP combination could effectively guide the optimization of electrocatalysts’ synthesis recipes and conditions for experimentally measured performance. This scenario typically faces challenges from limited labeled data, high labeling costs, and a limited query budget, but the feature dimensionality is usually low.
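The BO + GP loop described above can be sketched as follows, assuming scikit-learn and SciPy. The objective `measured_overpotential` is a hypothetical black box standing in for an experiment, the single tuned variable (an annealing temperature) is an illustrative choice, and expected improvement is used as one common acquisition function; the kernel hyperparameters are fixed for reproducibility.

```python
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

def measured_overpotential(temp):
    """Hypothetical black box: overpotential (mV) vs. annealing temperature (deg C)."""
    return 300.0 + 0.002 * (temp - 620.0) ** 2

candidates = np.linspace(400.0, 900.0, 101).reshape(-1, 1)
tried = [0, 100]                                   # two initial experiments

for _ in range(10):                                # budget of 10 further experiments
    X = candidates[tried]
    y = measured_overpotential(X).ravel()
    gp = GaussianProcessRegressor(
        kernel=Matern(length_scale=100.0, nu=2.5), alpha=1e-6,
        normalize_y=True, optimizer=None,
    )
    gp.fit(X, y)
    mu, sigma = gp.predict(candidates, return_std=True)
    sigma = np.maximum(sigma, 1e-9)                # avoid division by zero
    improvement = y.min() - mu                     # we minimize the overpotential
    z = improvement / sigma
    ei = improvement * norm.cdf(z) + sigma * norm.pdf(z)
    ei[tried] = -1.0                               # exclude conditions already tested
    tried.append(int(np.argmax(ei)))

best_idx = min(tried, key=lambda i: measured_overpotential(candidates[i, 0]))
best_temp = float(candidates[best_idx, 0])
best_value = float(measured_overpotential(best_temp))
print(best_temp, best_value)
```

Unlike the AL loop, the acquisition here rewards low predicted overpotential as well as high uncertainty, so queries concentrate near promising conditions instead of covering the whole range.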
Among these, accuracy remains one of the most critical metrics, providing a straightforward measure of how well the model performs overall.
In the context of electrocatalyst ML modeling, where output targets can vary greatly in magnitude and scale, the R2 metric is particularly useful due to its adaptability and ability to provide a standardized measure of model performance across different scales and dimensions.
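The scale independence of R2 (one minus the ratio of the residual sum of squares to the total sum of squares) can be checked numerically. In the sketch below, assuming scikit-learn, the same hypothetical overpotential predictions are expressed in V and in mV: R2 is unchanged, whereas MAE scales with the unit. The numbers are invented for illustration.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, r2_score

# Hypothetical measured and predicted overpotentials in volts.
y_true_V = np.array([0.25, 0.30, 0.35, 0.40, 0.45])
y_pred_V = np.array([0.27, 0.29, 0.36, 0.38, 0.46])

# The same data expressed in millivolts.
y_true_mV, y_pred_mV = 1000.0 * y_true_V, 1000.0 * y_pred_V

r2_V = r2_score(y_true_V, y_pred_V)
r2_mV = r2_score(y_true_mV, y_pred_mV)
mae_V = mean_absolute_error(y_true_V, y_pred_V)
mae_mV = mean_absolute_error(y_true_mV, y_pred_mV)

print(f"R2:  {r2_V:.3f} (V) vs {r2_mV:.3f} (mV)")
print(f"MAE: {mae_V:.4f} V vs {mae_mV:.1f} mV")
```

This is why R2 is convenient for comparing models across targets with different units, while MAE remains useful for judging whether the error is acceptable in physical terms.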
Some researchers who prioritize intuitive and interpretable models apply symbolic regression. Symbolic regression aims to find mathematical expressions that best fit the data, uncovering underlying relationships in the form of human-readable formulas. This method combines basic mathematical operations such as addition, subtraction, multiplication, division, and exponentiation to derive simple yet powerful equations. Symbolic regression also aims to find relationships between material structural and electronic descriptors and output targets, with a focus on interpretability rather than just predictive accuracy.78 Although symbolic regression may not outperform other black-box models that will be covered later in terms of metrics such as R2 or MAE, it offers unique advantages. In such cases, researchers focus on finding formulas or combinations of certain descriptors to deepen their understanding of the material, rather than directly using the obtained formula as an accurate surrogate model to develop better electrocatalyst samples. This approach, often considered part of statistical learning, is especially beneficial when a straightforward, interpretable model is preferred over more complex, less transparent ones.
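A toy version of this search can be written as a brute-force enumeration: every ordered pair of descriptors is combined with each basic operator, a linear scale and offset are fitted, and the expression with the lowest error is kept. This is a deliberately simplified sketch and not a specific symbolic-regression package; the descriptor names and the hidden law used to generate the target are assumptions for demonstration.

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical descriptors for 50 samples.
features = {
    "d_band": rng.uniform(-3.0, -1.0, 50),
    "radius": rng.uniform(1.2, 1.6, 50),
    "electroneg": rng.uniform(1.5, 2.5, 50),
}
# Synthetic target generated from a known law, so the search can be checked.
target = features["d_band"] / features["radius"]

operators = {"+": np.add, "-": np.subtract, "*": np.multiply, "/": np.divide}

best = None
for (name_a, a), (name_b, b) in itertools.permutations(features.items(), 2):
    for symbol, op in operators.items():
        candidate = op(a, b)
        # Fit target ~ c1 * candidate + c0 by ordinary least squares.
        design = np.column_stack([candidate, np.ones_like(candidate)])
        coef, *_ = np.linalg.lstsq(design, target, rcond=None)
        mse = float(np.mean((design @ coef - target) ** 2))
        if best is None or mse < best[0]:
            best = (mse, f"{name_a} {symbol} {name_b}", coef)

best_mse, best_formula, best_coef = best
print(best_formula, best_mse)
```

Practical symbolic regression explores far larger expression spaces, including exponentiation and nested terms, and usually trades accuracy against formula complexity; the enumeration above only conveys the principle of returning a readable formula.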
Deep learning is linked to representation learning, as it uses neural networks to automate feature extraction. Deep learning can be regarded as a subset of representation learning, specifically involving neural networks with multiple layers that learn representations through hierarchical feature extraction. Representation learning also includes unsupervised techniques like principal component analysis (PCA)79 and t-distributed stochastic neighbor embedding (t-SNE).80 However, these clustering-oriented techniques are less used for feature engineering to improve model prediction accuracy in electrocatalyst development. Thus, in our context, we use deep learning and representation learning interchangeably to refer to the same learning paradigm.
Representation learning allows a model to map input features into a new latent space, capturing the data's structure. This is achieved through layers in neural networks, which progressively learn more abstract representations. As Bengio et al. articulated,23 deep learning techniques aim to learn representations of data with multiple levels of abstraction. Representation learning excels at handling extensive and learnable features, ideal for high-dimensional data. Essentially, deep learning performs representation learning multiple times across its layers, progressively refining the data representation to capture complex patterns and relationships. For example, autoencoders and their variants like convolutional autoencoders81 use deep learning to achieve this transformation.
In electrocatalysis, deep learning is invaluable for handling high-dimensional descriptors from Section 2.1.1. These descriptors detail chemical properties and structural characteristics of crystal structures. Neural networks excel over classical ML models in processing rich, complex data. Starting with basic feedforward neural networks (BFNNs), which are akin to multilayer perceptrons (MLPs), these models can manage straightforward descriptors effectively. PyTorch82 is a typical library for implementing them. As descriptor complexity increases, sophisticated architectures like convolutional neural networks (CNNs) and GNNs become necessary. CNNs excel at identifying spatial features, suitable for capturing atomic arrangements in a crystal lattice. GNNs handle graph-represented data, ideal for molecular property prediction and analyzing non-tabular relationships in crystal structures.
Frameworks like CGCNN and SchNet, introduced in Section 2.1.1.3, are prime examples of deep learning models for representation learning. They automate feature extraction from raw atomic structures. CGCNN represents crystal structures as graphs, learning features through convolutional layers for effective material property prediction. SchNet, using continuous filter convolutions, captures the interactions between atoms in a flexible and accurate manner. These advanced neural network architectures demonstrate how deep structures facilitate representation learning by efficiently handling extensive and learnable features. In general, the ability of neural networks to manage and learn from extensive and learnable features makes them the first choice for DFT surrogate modeling tasks from a theoretical front. The complexity of the data at the atomic level necessitates such techniques.
Therefore, for limited and fixed features, classical ML models are preferable. Table S1 (ESI†) shows that over 75% of works use classical ML methods, highlighting their importance in hydrogen electrocatalysts. A typical implementation library is Scikit-learn.83 Suitable classical ML methods include K-nearest neighbors (KNN), support vector machines (SVM), and GP. KNN,84 an instance-based learning method, excels in pattern recognition by leveraging local similarities, making it suitable for classifying electrocatalysts with low-dimensional features. Its adaptability and interpretability add value, especially in understanding electrocatalyst data patterns. SVM85 excels in classification by finding the optimal hyperplane for data categorization, managing both linear and nonlinear decision boundaries through kernel functions. This makes SVM useful for classifying electrocatalysts based on distinctive features. GPs are notable for their probabilistic approach to regression and classification, offering both predictions and uncertainty estimates, which are crucial for BO and AL processes.86 This makes GPs invaluable in high-throughput explorations where prediction confidence is essential for sequential decision-making, and they are typically the default algorithm in these processes. The flexibility and Bayesian nature of GPs support their application in complex, sequential tasks, and GPs are often combined with BO and AL to navigate high-dimensional spaces efficiently. These traditional algorithms balance computational efficiency and meaningful insights from sparse data, making them ideal for smaller, finite datasets and lower-dimensional features. Their advantages in efficiency and rapid deployment are essential for tasks needing quick model development under budget constraints.
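The role of kernel functions in handling nonlinear decision boundaries can be illustrated with Scikit-learn on a synthetic two-descriptor classification task. In this sketch the "active" class lies inside a circle in descriptor space, an invented rule that no straight line can separate; the model settings (`n_neighbors=5`, `C=10.0`) are arbitrary illustrative choices.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(300, 2))               # two hypothetical descriptors
y = (np.hypot(X[:, 0], X[:, 1]) < 0.6).astype(int)      # "active" inside a circle

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

models = {
    "KNN (k=5)": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
    "SVM (RBF kernel)": make_pipeline(StandardScaler(), SVC(kernel="rbf", C=10.0)),
    "SVM (linear kernel)": make_pipeline(StandardScaler(), SVC(kernel="linear", C=10.0)),
}
test_accuracy = {
    name: model.fit(X_train, y_train).score(X_test, y_test)
    for name, model in models.items()
}
for name, acc in test_accuracy.items():
    print(f"{name}: {acc:.2f}")
```

KNN and the RBF-kernel SVM both adapt to the curved boundary, while the linear-kernel SVM cannot, which mirrors the distinction between linear and nonlinear decision boundaries drawn in the text.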
Ensemble methods, leveraging the collective intelligence of multiple models, have become key tools for improving prediction robustness and accuracy. As a typical example of ensemble methods, a Random Forest model builds on Decision Trees (DTs), which serve as the foundational base learner, providing a rule-based decision-making framework87 (see “trees, forests, bagging, and boosting” from Murphy (2022)). This progression from simple DTs to more sophisticated ensemble methods like extra trees (ET) and random forests (RF) illustrates a transition from single models to robust aggregated models.88 ETs and RFs employ techniques such as bagging and feature randomization to mitigate overfitting and improve diversity, making them highly adaptable and scalable for a range of applications. Among the advanced ensemble algorithms are gradient boosting decision tree (GBDT) and corresponding derivatives—LightGBM,89 XGBoost,90 and CatBoost.91 These algorithms refine the ensemble approach by focusing on correcting errors of previous models iteratively, which, when combined with gradient optimization, allows for unparalleled accuracy in detecting complex patterns. Their efficiency, ability to handle categorical features, and scalability have made these GBDT variations highly popular in ML.
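The progression from a single DT to bagged and boosted ensembles can be compared on a standard synthetic benchmark. The sketch below uses Scikit-learn's `make_friedman1` data and its own `GradientBoostingRegressor` as a stand-in for the GBDT family; LightGBM, XGBoost, and CatBoost are separate libraries with their own interfaces and are not used here.

```python
from sklearn.datasets import make_friedman1
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Synthetic nonlinear regression problem with 10 features (5 informative).
X, y = make_friedman1(n_samples=500, n_features=10, noise=1.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

models = {
    "single decision tree": DecisionTreeRegressor(random_state=0),
    "random forest (bagging)": RandomForestRegressor(n_estimators=200, random_state=0),
    "gradient boosting": GradientBoostingRegressor(random_state=0),
}
test_r2 = {
    name: model.fit(X_train, y_train).score(X_test, y_test)
    for name, model in models.items()
}
for name, r2 in test_r2.items():
    print(f"{name}: test R2 = {r2:.2f}")
```

On this kind of noisy tabular data, an unpruned single tree fits the training noise, whereas averaging many randomized trees or adding shallow trees sequentially generalizes better.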
For electrocatalysts, ensemble methods like RF and GBDT are robust in handling intricate data landscapes. They adeptly integrate diverse descriptors—chemical, engineering, structural, and operational—to predict catalytic performance with remarkable precision. Given their adaptability to both low-dimensional inputs and datasets with dozens of features, these algorithms have become a staple in electrocatalyst ML research. Their application spans from experimental datasets, including synthesis conditions, to surrogate modeling for DFT, covering atomic descriptors and crystal configurations. Ensemble methods provide high accuracy across various fidelity levels (experimental/simulation) without the computational intensity or overfitting risks of artificial neural networks (ANNs). Their widespread use highlights their potential as a first-line approach in electrocatalysis, providing a versatile tool for innovative material discovery.
Hyperparameters are the external configurations that dictate a model's structure and learning process and must be determined before training begins. For deep learning, typical hyperparameters include learning rates, batch sizes, and the number of layers. In the case of KNN, the number of neighbors is crucial, while for SVM, the choice of kernel and regularization parameter are essential. For GP, kernel functions and their parameters are important. Ensemble methods like RF involve hyperparameters such as the number of trees and the maximum depth of each tree.
Regularization techniques92 also play a vital role in model optimization. They exist both in deep learning and boosting models. Techniques such as L1 and L2 regularizations help in reducing overfitting by penalizing large weights. Dropout,93 specific to neural networks, prevents the co-adaptation of features by randomly disabling neurons during training. Early stopping94 halts training once performance on a validation set stops improving, preventing overfitting to the training data. Batch normalization95 improves training speed and stability by normalizing layer activations and then rescaling them with learned parameters.
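The early-stopping rule described above can be sketched in a few lines. This is a minimal illustration, not the implementation of any particular library: the function name, the `patience` and `min_delta` parameters, and the loss values are all assumptions made for the example.

```python
def early_stopping_epoch(val_losses, patience=3, min_delta=0.0):
    """Return (stop_epoch, best_epoch) for a sequence of validation losses.

    Training is considered stopped once the validation loss has failed to
    improve by more than `min_delta` for `patience` consecutive epochs.
    """
    best_loss = float("inf")
    best_epoch = -1
    waited = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss - min_delta:
            # New best model: remember it and reset the patience counter.
            best_loss, best_epoch, waited = loss, epoch, 0
        else:
            waited += 1
            if waited >= patience:
                return epoch, best_epoch
    # The loss sequence ended before the patience budget was exhausted.
    return len(val_losses) - 1, best_epoch


# Made-up validation losses: improvement until epoch 3, then a plateau.
losses = [0.90, 0.70, 0.55, 0.50, 0.52, 0.51, 0.53, 0.60]
stop_epoch, best_epoch = early_stopping_epoch(losses, patience=3)
```

In practice the model weights saved at `best_epoch` would be restored, so the epochs spent waiting do not degrade the final model.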
Model optimization is usually a custom trial-and-error process dependent on the database. For the hydrogen electrocatalyst community, a deep understanding of the mathematical and computational aspects of hyperparameters and algorithm architectures is less significant. The practical approach is to first use the default hyperparameter settings provided by the ML library. If excellent predictive performance is not achieved, consider whether feature engineering and model selection are appropriate and if the data is unbiased and sufficient. If the model shows good baseline performance, then refer to the library's manual to identify hyperparameters that can be further tuned. Grid search, random search, and Bayesian optimization are all viable methods for fine-tuning.
Uncertainty quantification (UQ) is also crucial for understanding and managing the uncertainty in ML models, typically categorized into aleatoric (data-based) and epistemic (model-based) uncertainties.96,97 Techniques such as model ensembling and mean/variance estimation provide insights into prediction variability, while deep kernel learning and distance-based conformal prediction refine uncertainty estimates. Monte Carlo dropout keeps dropout active at prediction time, so that repeated stochastic forward passes yield a spread of predictions from which uncertainty is assessed, and evidential regression estimates uncertainty by predicting distribution parameters. As commonly adopted in BO, GPs naturally incorporate UQ through their inherent probabilistic framework. Greedy acquisition, epsilon-greedy, probability of improvement (PI), expected improvement (EI), Thompson sampling (TS), and upper confidence bound (UCB) are effective strategies for leveraging UQ in decision-making.98 Additionally, information entropy serves as a data-based, model-independent measure of uncertainty. For experimental hydrogen electrocatalyst development, UQ is indispensable as it guides experimental efforts by identifying the most promising candidates with the highest certainty, ultimately accelerating the discovery of efficient and stable catalysts while minimizing costly trial-and-error approaches.
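To make the acquisition strategies named above concrete, the following sketch evaluates PI, EI, and UCB from a Gaussian predictive mean and standard deviation, such as those returned by a GP. It assumes a maximization problem; the function names, the `xi` and `kappa` parameters, and the candidate values are illustrative assumptions rather than settings from any cited study.

```python
import math


def _norm_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def _norm_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def probability_of_improvement(mu, sigma, best, xi=0.0):
    """P(f > best + xi) under a Gaussian prediction N(mu, sigma^2)."""
    if sigma <= 0.0:
        return 1.0 if mu > best + xi else 0.0
    return _norm_cdf((mu - best - xi) / sigma)


def expected_improvement(mu, sigma, best, xi=0.0):
    """E[max(f - best - xi, 0)] under a Gaussian prediction N(mu, sigma^2)."""
    if sigma <= 0.0:
        return max(mu - best - xi, 0.0)
    z = (mu - best - xi) / sigma
    return (mu - best - xi) * _norm_cdf(z) + sigma * _norm_pdf(z)


def upper_confidence_bound(mu, sigma, kappa=2.0):
    """Optimistic estimate: mean plus kappa standard deviations."""
    return mu + kappa * sigma


# Hypothetical candidates: (predicted mean, predicted std) of the objective,
# here taken as the negative overpotential in mV so that larger is better.
candidates = {"A": (-60.0, 2.0), "B": (-70.0, 15.0), "C": (-58.0, 1.0)}
best_observed = -55.0  # best value measured so far (made up)

greedy_choice = max(candidates, key=lambda k: candidates[k][0])
ei_choice = max(
    candidates, key=lambda k: expected_improvement(*candidates[k], best_observed)
)
```

With these made-up numbers the greedy rule selects the candidate with the best predicted mean, whereas EI prefers the uncertain candidate, which illustrates how UQ shifts the choice toward exploration.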
Intrinsic feature importance, often calculated as the default method by DT-based models, serves as the basic interpretation. Corresponding libraries typically use methods like Gini/entropy (DT, RF and GBDT), gain (XGBoost), split (LightGBM), or permutation (CatBoost) to rank feature contributions to the output. These are straightforward approaches based on impurity reduction, prediction accuracy, or feature usage frequency. In addition to these intrinsic methods, there are several more advanced interpretation techniques to understand how input features affect output targets, providing deeper insights into the underlying mechanisms in the studied electrocatalyst system. Partial dependence plots102 (PDPs) clarify the relationship between a specific feature and the outcome by isolating its average effect while marginalizing over the other features, making it easier to visualize a feature's impact on model predictions. For a more detailed analysis, Shapley additive explanations103 (SHAP) break down predictions to quantify each feature's contribution, offering a nuanced view grounded in game theory. This method ensures equitable attribution of prediction impacts, including interactions between features. Similarly, local interpretable model-agnostic explanations104 (LIME) provide local insight by approximating how changes in input affect predictions, making complex models more interpretable on a case-by-case basis. Sensitivity analysis105 can further enrich understanding by illustrating how small changes to inputs affect the predictions. This method offers intuitive, actionable insights into the model's behavior and helps in identifying the most sensitive parameters in the electrocatalyst system.
Collectively, these interpretation techniques enable data scientists and electrocatalyst domain experts to gain a more thorough and transparent understanding of their ML models. Identification and visualization of the feature impacts could deepen the understanding of hydrogen electrocatalysts through a unique data science approach.
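The game-theoretic attribution behind SHAP can be illustrated with an exact Shapley computation on a toy model. The sketch below enumerates all feature coalitions and replaces absent features with a baseline value; this is a simplifying assumption and is not how the SHAP library itself estimates values. The model, inputs, and baseline are invented for illustration.

```python
from itertools import combinations
from math import factorial


def exact_shapley(predict, x, baseline):
    """Exact Shapley values of `predict` at `x` relative to `baseline`.

    Features outside a coalition are set to their baseline value. The cost
    grows as 2^n, so this is only practical for a handful of features.
    """
    n = len(x)

    def value(coalition):
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                # Marginal contribution of feature i to this coalition.
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_model(z):
    # Two main effects plus one interaction; the third feature is unused.
    return 2.0 * z[0] + 3.0 * z[1] + z[0] * z[1]


x = [1.0, 2.0, 5.0]
baseline = [0.0, 0.0, 0.0]
phi = exact_shapley(toy_model, x, baseline)
```

The attributions sum to the difference between the prediction at `x` and at the baseline, the interaction term is shared equally between the two interacting features, and the unused feature receives zero.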
For classical ML applications that are more frequently applied, Scikit-learn is the first choice that provides a comprehensive suite of algorithms for classification, regression, and clustering that are suitable for various electrocatalyst analysis tasks. Scikit-learn also offers basic realizations of RF and GBDT. Additionally, TPOT112 and PyCaret113 serve as auto ML tools, automating the machine learning pipeline and making them accessible for beginners. For advanced cases, LightGBM,89 XGBoost,90 and CatBoost91 have independent packages that offer highly optimized versions of GBDT, excelling in handling tabular data for predictive modeling with efficiency and scalability.
Chemical and crystallographic datasets are also crucial for data preparation. The Materials Project,114 the Atomic Simulation Environment (ASE),115 the Open Quantum Materials Database (OQMD),116 the Joint Automated Repository for Various Integrated Simulations (JARVIS),117 and Automatic FLOW for Materials Discovery (AFLOW)118 are pivotal in democratizing access to vast repositories of chemical and crystallographic data. These platforms provide pre-computed properties for thousands of materials, enabling data-driven discovery and design of new electrocatalysts. The Open Catalyst (OC) datasets, particularly OC20119 and OC22,120 are extensive collections worth highlighting, with the former containing over 250 million single-point calculations and the latter featuring 62000 DFT relaxations. These datasets cover a wide range of reactions involving various small molecules, such as CO, H2O, and O2, among others. This comprehensive scope is crucial for training ML models that can generalize across different catalytic systems, making them particularly valuable for advancing hydrogen electrocatalyst studies. Suitable libraries also exist for feature engineering, especially for theoretical studies. For generating sophisticated descriptors, DScribe45 offers a toolkit for creating a wide array of materials and molecular descriptors that are essential for ML models in materials science, such as the Coulomb matrix, SOAP, MBTR, and ACSF introduced in Section 2.1.1.3. Matminer121 is another valuable tool that facilitates the extraction and manipulation of materials data for ML applications. When it comes to neural network potentials for molecular dynamics (MD) simulations, libraries like SchNet50 and DeePMD122 provide powerful frameworks for developing and deploying accurate and efficient models. These tools allow for the simulation of atomic-scale phenomena at a level of detail and scale that is difficult to reach with first-principles MD alone, opening new avenues for understanding and optimizing electrocatalytic materials.
Together, these ML toolboxes and resources form a robust ecosystem that supports the entire lifecycle of materials discovery and development.
HER takes place at the cathode of water electrolyzers. It is the cornerstone of producing hydrogen as a clean-energy resource, driven by electrical energy input to split water. The HER mechanism unfolds through a series of electrochemical steps, each important for the overall reaction efficiency. The Volmer step, which produces an adsorbed hydrogen atom (*H) on the catalyst's active site by electrochemically reducing a proton (H+) with an electron (e−), is the first and most important stage in HER because it creates the foundation for the subsequent formation of gaseous hydrogen. After the Volmer step, HER can proceed in one of two ways. In the Heyrovsky step (electrochemical desorption), the *H combines with another proton and electron to form gaseous H2 and frees the active site. Alternatively, in the Tafel step, two *H recombine to form gaseous H2, freeing their active sites. A brief schematic of the reaction mechanism is illustrated in Fig. 3.
The catalyst's electronic properties have a significant influence on how effectively these steps are completed in an energy-favorable manner. Particularly in acidic environments, the activity of a HER catalyst is closely correlated with the Gibbs free energy change of hydrogen adsorption (ΔGH*). The optimal condition is an intermediate and balanced interaction between the hydrogen adsorbate and the catalyst's active site. Adsorption that is too weak limits the initial formation of *H, whereas adsorption that is too strong obstructs its release, so it is preferable to have neither too strong nor too weak adsorption. A value of ΔGH* near zero indicates theoretically high intrinsic catalytic performance, which is expected to be observed as a low overpotential in corresponding electrochemistry experiments.64,65 In alkaline conditions, however, HER becomes more complex due to the inclusion of water dissociation as another decisive step. This adds a layer of complexity to the reaction mechanism, making the understanding of activity descriptors in alkaline HER more challenging. Recent studies highlight the significance of the cooperative action of different active components in alkaline HER catalysts.123,124 Some components facilitate the activation of water molecules with a low energy barrier, while others optimize the desorption of hydrogen atoms. This cooperative mechanism presents both new opportunities and challenges, owing to its complexity, in developing more effective HER electrocatalysts. After decades of exploration, platinum-based noble metal catalysts are currently the most widely used in commercial applications due to their exceptional efficiency.125 However, due to the high cost and limited availability of these materials, research has mainly concentrated on two objectives: enhancing the intrinsic catalytic efficiency of HER and reducing the requirement for costly noble metals.
Research on catalysts based on nonprecious metals has also thrived to achieve this goal, including efforts on carbon materials, transition metal (TM) compounds, and other novel systems.126 Due to the nature of the potentially vast candidate design space, ML techniques could greatly aid these explorations.
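The near-zero ΔGH* criterion discussed above lends itself to a simple screening filter. The sketch below converts DFT adsorption energies (ΔEH*) to free energies with a constant correction of 0.24 eV, a value commonly used for hydrogen adsorption on metals; treating it as constant is an assumption here, since the actual zero-point and entropic terms depend on the system. The site labels, energies, and the 0.1 eV tolerance are illustrative only.

```python
# Assumed constant zero-point-energy and entropy correction (eV); the true
# correction is system dependent and should be computed for real studies.
ASSUMED_CORRECTION_EV = 0.24


def delta_g_h(delta_e_h, correction=ASSUMED_CORRECTION_EV):
    """Approximate free energy of H adsorption from the electronic energy."""
    return delta_e_h + correction


def screen_sites(delta_e_by_site, tol=0.1):
    """Keep sites with |dG_H*| < tol (eV), sorted by closeness to zero."""
    hits = {}
    for site, delta_e in delta_e_by_site.items():
        dg = delta_g_h(delta_e)
        if abs(dg) < tol:
            hits[site] = dg
    return dict(sorted(hits.items(), key=lambda kv: abs(kv[1])))


# Invented adsorption energies (eV) for four hypothetical sites.
toy_sites = {"site_A": -0.55, "site_B": -0.27, "site_C": -0.20, "site_D": 0.10}
hits = screen_sites(toy_sites)
```

Here `site_A` binds hydrogen too strongly and `site_D` too weakly, so only the two intermediate sites pass the filter.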
Two primary data sources are usually used for ML in materials science subfields such as electrocatalysis, especially when designing electrocatalysts for HER: theoretical simulations like DFT, and experimental data. For the former, first-principles descriptors such as electronic structure and crystal geometric configurations are frequently included in input features. ML models act as surrogate models for quickly predicting outcomes such as energies and forces, which might theoretically indicate catalytic activity or stability. Without such surrogates, high-throughput screening by DFT alone would typically consume significant computational resources. The ML-DFT strategy is valued from a theoretical perspective for efficiently screening candidates from vast possibilities, especially among electrocatalyst materials that have intrinsic differences like crystal structures and element types. Although simulation data such as DFT have lower fidelity than experiments, they offer several benefits as a source for ML, including greater speed, a lower barrier to automation in high-throughput dataset preparation, and the capacity to identify underlying mechanisms. One study that illustrates such advantages was reported by Gu et al.,127 who focused on jagged Pt nanowires for alkaline HER. In the study, the local environments of 3413 binding sites on jagged Pt nanowires were used to obtain input features (descriptors) for ML model training (Fig. 4a). ACSF,48 CGCNN,49 nearest atom distance-Gaussian process, and SchNet50 were applied and compared for this representation task. ACSF performed best, with the lowest MAE of 0.043 eV. With this experimentally well-validated ML model, the researchers could further correlate the activity of different sites on the nanowire with intuitive descriptors, namely coordination number and site type (top, bridge, hollow), via unsupervised learning. The results identified an autobifunctional catalysis mechanism (Fig. 4b and c) where distinct sites on the Pt nanowire surface synergistically contribute to the HER process: the stronger binding sites adsorb protons, and the weaker binding sites activate hydrogen. Such a discovery, which would originally require an immense number of simulation calculations for statistical analysis, is now enabled by the ML-DFT technique. In another study that discusses the HER mechanism, Ooka et al.'s investigation into hydrogen surface-binding energies on Pt, diverging from the convention of thermoneutrality, offers a significant shift in understanding the HER design rule.128 Their database is based on experimentally acquired electrochemical data, and they employed a novel approach by integrating regression modeling with GA to effectively capture the non-linear dynamics of the HER process, allowing for a more accurate estimation of the binding energies. Their findings highlight the importance of considering overpotentials in catalyst design and suggest that optimal catalytic efficiency may require binding energies that are not thermoneutral, especially under conditions far from equilibrium. This insight opens new pathways into the design of more efficient HER electrocatalysts.
Fig. 4 ML studies on pure Pt electrocatalysts for HER. (a) Integrated simulation process for jagged Pt nanowires. This involves a synergistic approach that uses force field analysis, DFT, ML techniques, and kinetic modeling, aiming at a comprehensive multiscale simulation of the alkaline HER on jagged Pt nanowires. Validation of the model is achieved through comparison with experimental data, with a focus on elucidating the underlying mechanism, which encompasses the Volmer, Heyrovsky, and Tafel reactions, as depicted in the lower left plot. (b) Illustration of the bifunctional mechanism, where protons adsorb at a Volmer-favorable site and migrate to a Tafel-favorable site for H2(g) formation. (c) Simplified visualization of the nanowire indicating reaction preferences at different binding sites (top, bridge, and hollow sites marked by circles, squares, and triangles, respectively). Color coding (blue for Tafel, red for Volmer reactions) reflects relative reaction rates. (a–c are reproduced from ref. 127 with permission).
Beyond deepening the understanding and design rules at the fundamental mechanism level, ML models are more widely recognized as powerful tools in screening the design parameters of HER electrocatalysts, such as the element types and corresponding composition in alloys. Li et al. investigated the (100) surfaces of binary alloy systems formed by strong- (Pd and Pt) and weak-binding (Ag, Au, and Cu) transition metals27 (Fig. 5a). To predict the DFT-calculated H binding energies (ΔEH*, which does not include thermal corrections for temperature and entropy), which serve as HER activity descriptors, they used a database with more than 450 entries and 26 manually chosen physical properties as input features, such as electronegativity, d-orbital information, and d-band center. With a simple BPNN (Fig. 5b), the researchers could identify the superiority of Pd2Au2-d/Pd0.75Au0.25 among other competitors. Similarly, Jäger et al. focused on a specific model system of 55-atom bimetallic icosahedral Pt nanoclusters composed of binary combinations of the elements Ti, Fe, Co, Ni, and Cu.41 Their strategy for input feature engineering is to combine electronic descriptors with structural descriptors, using SOAP-derived descriptors together with the local density of states. By using a kernel ridge (KR) regressor as the ML algorithm along with an additional training set supplement in the loop, an MAE of 0.1 eV could be reached with 1767 DFT calculations. As a result, the researchers revealed not only the advantage of Ni in binary Pt alloys, but also that of NiCo and NiTi in ternary Pt alloys. Li et al. also adopted the idea of iteratively generating a new training dataset by applying AL with a query strategy that measured the deviations of DFT-calculated adsorption energies129 (Fig. 5c).
By further applying the previously introduced state-of-the-art GNN frameworks DimeNet++52 and the labeled site crystal graph,130 the authors finally screened out Cu3Pt(100) as well as FeCuPt2(100) and (001) as potential candidates for replacing Pt(111). Zhang et al. uniquely focused on a Pt-modified amorphous alloy (Pt@PdNiCuP), with the adsorption sites described by features consisting of simple geometric elements.131 Nevertheless, the ML-assisted results align well with the previous experimental study132 and identify a theoretical best composition of the five elements in this complex system for further exploration. For real experimental exploration, which is more practical and valuable, the use of AL is a powerful and low-cost option. Kim et al. innovatively applied AL to both binary and ternary Pt-based systems, demonstrating its efficacy in rapidly identifying optimal multi-metallic alloy catalysts for HER with significantly reduced experimental costs.133 By iteratively updating a GP model with experimental data, their method efficiently narrows down the vast design space. The AL process started with 73 preliminary random data points and conducted two loops with 40 additional data points explored in each. The exploration studied both binary and ternary (Fig. 5d and e) alloy systems. Even with such a limited data size, it still effectively led to the discovery of a high-performing Pt0.65Ru0.30Ni0.05 catalyst with an overpotential of merely 54.2 mV, which remarkably surpasses the electrocatalytic efficiency of pure Pt. Beyond directly guiding experiments with ML, the literature contains a wealth of domain expertise that can be leveraged for ML modeling to offer a holistic view. Yang et al. effectively used a comprehensive database derived from an extensive literature review in their work.37 They employed the sure independence screening and sparsifying operator (SISSO) method, a form of supervised regression, to refine and enhance the predictive accuracy of the Nørskov model65 for HER kinetics on various metal surfaces.
Fig. 5 (a) Schematic of the random sampling method for (100) bimetallic alloys. The four-fold ensemble that offers H's particular adsorption environment is represented by red squares. (b) The BPNN model's algorithmic architecture with input features used in ref. 27 (a and b are reproduced from ref. 27 with permission). (c) Schematic representation of AL in catalyst discovery via DFT (c is reproduced from ref. 129 with permission). (d) and (e) AL results for ternary composition: with each iteration, the triangular diagrams for the (d) uncertainty and (e) overpotential of the Pt–Ru–Ni system are updated. Red dotted circles highlight shifts in predictions without additional data at specific points post-iteration (d and e are reproduced from ref. 133 with permission).
Among noble metals, Pd is also commonly studied as a promising candidate to boost HER as an electrocatalyst.134 Gao et al. investigated the amorphous alloy Pd40Ni10Cu30P20, a promising candidate for HER (Fig. 6a).135 The electrocatalytic performance of this complex system was analyzed using SOAP as the input feature generator and GP as the ML algorithm (Fig. 6b), which successfully mapped the catalytic activities of sites on the alloy surface with a small MSE of 0.018 eV². Using this ML surrogate model, the ideal atomic ratio (Pd:Cu:P:Ni = 0.51:0.33:0.09:0.07) for optimal HER activity was found via sampling 40000 active sites. Hoyt et al. performed a thorough investigation of H adsorption energies on the (211) surfaces of Ag alloys.136 They trained different ML algorithms on the dataset obtained from more than 5000 DFT calculations. The best-performing RF model, combined with standard chemical and structural descriptors as input features, showed remarkable accuracy: its median absolute test error was merely 14 meV. Beyond predicting with precision, the as-trained ML model also helps to reveal intricate electronic structure effects and counterintuitive behaviors in dopant atoms, further underscoring the potential of ML to uncover novel insights in electrocatalysts as a popular subfield in materials science.
Fig. 6 (a) Left: The atomic structure of the Pd40Ni10Cu30P20 amorphous alloy. Right: The DFT-optimized structure of the Pd40Ni10Cu30P20 amorphous alloy. (b) Algorithm framework of SOAP-ML model construction (a and b are reproduced from ref. 135 with permission). (c) Right: Depiction of the [MxAu25−x(SCH3)18 + H]q system (M = Pd, Cu, x between 0 and 1, q between −2 and 2) and Left: Its corresponding graph representation. Various metal doping and hydrogen adsorption sites are highlighted. Color coding is as follows: orange for gold, yellow for sulfur, turquoise for carbon, white for methyl hydrogen, and violet-tinted gold atoms indicating dopant location types. Three green spheres represent potential H adsorption sites (c is reproduced from ref. 137 with permission). (d) The optimal adsorption sites for H on the surface of different 55-atom Cu binary clusters with |ΔGH*| < 0.1 eV (reproduced from ref. 34 with permission).
Pihlajamäki et al. uniquely considered the possible organic ligands on metal clusters, investigating Cu- and Pd-doped 25-atom Au monolayer-protected clusters with thiolate ligands on the surface (Fig. 6c).137 The innovation of this work is that instead of directly applying a GNN, the authors employed graph-based representations of the local atomic environment of hydrogen, incorporating geometric, graph theoretical, and tabulated features, which enabled the prediction of interaction energies between hydrogen and the nanoclusters with a high degree of accuracy. Such a strategy allows relatively simple distance-based kernel models to reach a CV RMSE of below 0.1 eV. Hence, this work not only provided insights into the HER catalysis behavior of the complex nanocluster system, but also demonstrated the power of graph-based methods for feature engineering. Through similar DFT-ML strategies, recent researchers have also explored binary alloy systems such as Cu55−nMn34 clusters (M = Co, Ni, Ru, and Rh; Fig. 6d) and ternary alloy systems such as NiCoCu.36 Beyond predicting a theoretical optimum composition, ML models also allow these works to gain deeper insights into the relationship between the local microstructures of the active sites and the hydrogen adsorption behavior that determines the HER activities.
Besides exploring systems with predefined metal elements, ML models can also be extended further to screen from a vast candidate space of different combinations of metal elements. Chen et al. used the CGCNN to explore a substantial dataset of 38484 structures, leading to the identification of 43 promising alloys from an initial pool of as many as 2973 candidates138 (Fig. 7a). This approach, integrating ML potential for efficient structural description and simple physical properties, demonstrated a balance between computational efficiency and accuracy. The use of final configurations obtained via the SchNet50 calculator as input features was key in accurately predicting the hydrogen adsorption values. The framework's efficacy was further validated by the close match of computational predictions with experimental results for selected candidates like AgPd alloy, showcasing the practical potential of ML in accelerating the discovery of new electrocatalysts from the various possible combinations of the elements. Similarly, Zhang et al. explored a vast candidate space of binary alloys for HER;139 however, they chose to leverage ensemble methods and classical ML algorithms. As a result, the best performing LightGBM model, which is less computationally intensive than deep learning models, achieved a remarkable R2 score of 0.921 and an RMSE of 0.224 eV. Notably, they also employed the SHAP method post-training to extract insightful interpretations; they found an interesting descriptor: mean of group number of elements in an alloy to be the most impactful on the model's ΔGH* value prediction.
Fig. 7 (a) Schematic of the ML framework for the high-throughput screening of electrocatalysts: on the left in the “constructing adsorption database” section, the depiction includes adsorption sites for binary alloys. These are represented as ontop, bridge, and hollow sites, indicated by a black star, red “+”, and blue “×”, respectively (reproduced from ref. 138 with permission). (b) t-distributed stochastic neighbor embedding visualization of all simulated adsorption sites using DFT: the visual representation shows the adsorption energy values in eV. Stronger binding sites are superimposed over weaker ones. Notably, the clusters/materials in dark purple are labeled for their potential as promising candidates. (c) Normalized distribution of low coverage ΔEH* (electronic energy change) values from the DFT workflow: this graph presents the distribution of ΔEH* values. Dashed lines highlight the 0.1 eV range around the optimal ΔEH* value of −0.27 eV. Note that the authors of ref. 140 chose ΔEH* rather than ΔGH* for HER, hence the optimal value is not 0 eV (b and c are reproduced from ref. 140 with permission).
A broader selection of metal elements for forming alloys greatly enlarges the candidate space, and running DFT calculations on thousands of configurations to prepare an ML dataset becomes correspondingly more expensive. Hence, an approach that leverages AL for higher efficiency is needed. Tran and Ulissi reported in 2018 a pioneering work that employs a novel ML framework for integrating AL and surrogate-based optimization to streamline the discovery of electrocatalysts for CO2 reduction and HER.140 Their approach, applied to an extensive, order-of-magnitude-improved database of 1499 intermetallic crystals leading to 17507 unique surfaces and 1.6 million adsorption sites, significantly narrows down the search space while maintaining the model's evolving accuracy. This method not only reduced the computational cost but also finally led to the identification of 131 candidate surfaces for CO2 reduction and 258 surfaces for H2 evolution (Fig. 7b and c), highlighting its reliability for accelerating the exploration of efficient electrocatalysts in an immense candidate space. Note that, like some of the previously mentioned research, this work uses ΔEH*, which does not include the entropy and zero-point energy, as the HER activity metric, whereas most of the other ML-related works on HER in this section use ΔGH*, which has an optimal value of 0 eV as mentioned previously. Kayode et al. have also implemented BO in their recent study.141 The authors applied this approach to efficiently screen for high-performance single-atom alloys and bimetallic catalysts, which are crucial not only in HER but also in reactions such as alkane transformations and CO2 reduction. The BO workflow was effective even with limited initial datasets (as few as two to eight data points), and it employed simple yet insightful input features such as group and period numbers.
Notably, their approach, which requires significantly fewer DFT calculations compared to traditional methods, still successfully led to the identification of promising candidates such as Hf1Cu for alkane transformations, Y1Au, Y1Cu, and Y1Ag for CO2 reduction, and Ag–Ir binary alloy for HER. These works demonstrated the practical utility and flexibility of adaptive learning techniques like AL and BO in handling an electrocatalyst system with a vast search space.
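One generic way to choose the next calculations or experiments in an AL loop is to query the candidates on which an ensemble of models disagrees most. The sketch below illustrates that idea only; it is not the query strategy of any study cited here, and the candidate names and predicted values are invented.

```python
from statistics import mean, pstdev


def select_queries(committee_predictions, k=2):
    """Return the k candidates with the largest ensemble disagreement.

    `committee_predictions` maps a candidate name to the list of predictions
    made for it by the individual models of an ensemble.
    """
    scored = [
        (pstdev(preds), name) for name, preds in committee_predictions.items()
    ]
    # Largest spread first: these are the most informative points to label.
    scored.sort(key=lambda item: item[0], reverse=True)
    return [name for _, name in scored[:k]]


# Made-up adsorption-energy predictions (eV) from a three-model ensemble.
predictions = {
    "cand_1": [-0.10, -0.12, -0.11],  # models agree
    "cand_2": [0.30, -0.20, 0.05],    # models strongly disagree
    "cand_3": [-0.05, 0.10, 0.02],    # moderate disagreement
}

next_batch = select_queries(predictions, k=2)
ensemble_means = {name: mean(preds) for name, preds in predictions.items()}
```

After the selected candidates are labeled by DFT or experiment, they are added to the training set, the ensemble is retrained, and the selection is repeated until the budget is exhausted.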
Using nitrogen (N) as the candidate dopant, Lv et al. investigated the possibility of developing bifunctional electrocatalysts via γ-graphyne (an allotrope of carbon, distinct from graphene, with a unique lattice structure) nanoribbons for both HER and ORR.145 Among the different ML algorithms, they screened out the best performing LightGBM model and a special set of input features (Fig. 8a) such as atomic distances (d2, d3) and charges associated with the active site (Q2, Q3). With a dataset size near 300, the MAE of the overpotential was as low as 0.072 and 0.066 V for ORR and HER, respectively. They further applied SHAP for feature importance analysis, which emphasized in particular the strong impact of the chemical environment around the active sites (Fig. 8b). Going further, Kronberg et al. explored an innovative approach that embeds SHAP in a 10 × 5-fold nested CV loop, rather than using it as a typical one-time post-explanation after model training.146 By applying this method to various dataset subsets, they were able to dynamically assess the RF model's generalization performance and feature importance on a dataset with roughly 6500 DFT-calculated configurations. They also achieved strong model stability and accuracy by fine-tuning the hyperparameters inside the inner CV loops. Furthermore, the integration of SHAP into this layered CV framework allowed for a detailed, iterative examination of the feature attributions, providing important insights into the complex interplay between the structural, chemical, and electronic factors influencing the hydrogen adsorption on N-doped carbon nanotubes (Fig. 8c). Moreover, the work of Ebikade et al. takes a direct experimental approach instead of depending on theoretical simulations.147 Because experiments are more costly than simulations, the authors applied an iterative AL strategy.
Their input features constituted a nine-dimensional parameter space that takes into account structural characteristics like N species and pore volume in addition to synthesis conditions like hold time and final temperature. Despite resource limitations, this approach resulted in effective exploration and optimization in a complicated multidimensional space. The authors were able to determine the ideal final conditions with better HER performance than earlier reports,148,149 all in less than 20 experimental runs. Moreover, graphitic N content was identified as the most decisive material feature for electrochemical performance.
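The nested CV scheme mentioned above for hyperparameter tuning and SHAP analysis can be sketched as an index-splitting routine. Reading "10 × 5-fold" as 10 outer folds with 5 inner folds each is an assumption made here, as are the function names and the seeding scheme; model fitting and SHAP evaluation are deliberately left out.

```python
import random


def k_fold_indices(indices, k, seed=0):
    """Yield (train, test) index lists for a shuffled k-fold split."""
    idx = list(indices)
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for f_i, fold in enumerate(folds) if f_i != i for j in fold]
        yield train, test


def nested_cv_splits(n_samples, outer_k=10, inner_k=5, seed=0):
    """Yield (outer_train, outer_test, inner_splits) for nested CV.

    The inner splits are built from the outer training indices only, so
    hyperparameter tuning never sees the outer test fold.
    """
    outer = k_fold_indices(range(n_samples), outer_k, seed)
    for o, (outer_train, outer_test) in enumerate(outer):
        inner_splits = list(k_fold_indices(outer_train, inner_k, seed + 1 + o))
        yield outer_train, outer_test, inner_splits


splits = list(nested_cv_splits(100, outer_k=10, inner_k=5))
```

In each outer iteration, hyperparameters would be tuned on the inner splits, the refit model scored on the outer test fold, and feature attributions collected there, so that both performance and importance rankings are averaged over folds.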
Fig. 8 (a) Heat map of the Pearson correlation coefficient among the selected features for ML modeling of γ-graphyne nanoribbons, (b) measurement of the feature importance using the SHAP method (a and b are reproduced from ref. 145 with permission). (c) Left: Global SHAP importance rankings for the top 10 features in adsorption energy prediction: Bar heights represent CV averages, with error bars showing ±1 standard deviation across outer CV folds. Each bar is annotated with correlation coefficients between the SHAP values and the feature values. Right: Local SHAP value distributions for the 10 most impactful features: this is shown across all test set observations. Vertical data point dispersion indicates dense clusters of similar ϕj values, with color-coding reflective of corresponding feature values (c is reproduced from ref. 146 with permission).
TM atoms doped into a graphene matrix can serve effectively as catalytic centers for electrochemical reactions while tuning the local electronic structures of the carbon materials.150 Among the popular experimentally reported doped TM-(N)C structures, Liu et al. have made significant strides in integrating ML with theoretical methods as well as experimental validation to explore cobalt single-atom catalysts (Co SACs).151 Using supervised learning, particularly a BPNN with three hidden layers, they analyzed MD-extended X-ray absorption fine structure (EXAFS) spectra to accurately determine the local chemical environments of Co SACs. This ML approach, trained on a dataset of 1000 configurations generated from EXAFS simulations, enabled the elucidation of the atomic structure of edge-rich Co single atoms, revealing proportions of 65.49% Co-4N-plane (Co-4N-P), 13.64% Co-2N-armchair (Co-4N-A), and 20.86% Co-2N-zigzag (Co-4N-Z). Beyond the outstanding electrochemical performance for HER, the ML method leveraged in this work deepened the understanding of the HER mechanism on this electrocatalyst system. Besides Co–(N)C, a wide range of TMs are potential candidates. Fung et al. investigated the vast possibilities of 3d–5d TM atoms doped in N-doped two-dimensional (2D) graphene (Fig. 9a) and nanographenes of several sizes.152 Using descriptors such as d-band centers, formation energies, and atomic properties, they applied regression models such as KR regression and neural networks and achieved notable accuracy, with the RMSE as low as 0.15 eV. V, Rh, and Ir were identified as the top candidates that could significantly enhance HER activity, and SISSO was additionally applied to directly provide a straightforward formula. Similarly, Baghban et al. 
screened approximately the same candidate space in the same system,40 and obtained consistent results, identifying Ir, Rh, Fe, V, Sc, and Co as the most promising TM dopants. Moreover, a contribution of this work is that sensitivity analysis was applied as a post hoc method to bring deeper insights into feature importance (Fig. 9b). The number of valence electrons and the covalent radius showed a high relevance factor of 0.74, indicating their dominant impact on the adsorption energy. Recently, Zhou et al. further delved into the complex interplay between TMs and their surrounding atoms in single-atom catalysts, investigating configurations where N atoms in typical TM–N4 structures are directly substituted with C atoms to different degrees.38 They employed a novel topology-based, multi-scale convolution kernel ML algorithm and used input features like atomic group number and electron count. The strategy employs multi-scale convolution kernels of varying sizes, enabling the simultaneous extraction of both global and local information from the material's feature matrix (Fig. 9c). Notably, Zhou et al. also leveraged ML models to predict not only the typically studied ΔGH* but also the energies of H2 dissociation and water molecule adsorption. The models achieved impressive prediction accuracies (R2 scores ranging from 0.931 to 0.965), which allowed the authors to identify promising electrocatalyst materials for HER and hydrogen sensing, such as Pt and Sc atoms in specific coordination environments.
Fig. 9 (a) Three examples of optimized structures of H adsorption on the transition-metal single atom embedded on N-doped graphene (reproduced from ref. 152 with permission). (b) The relevance factor of different input variables by sensitivity analysis (reproduced from ref. 40 with permission). (c) Schematic of the topology-based, multi-scale convolution kernel ML model (reproduced from ref. 38 with permission). |
Researchers have also explored other more complex variations, like dual-TM-atom doped graphene (TM1TM2@N6)153 and TM-graphdiyne (GDY).154 As expected, ML surrogate modeling of DFT has also proven effective in these systems, successfully screening out the best candidate configurations, AuCo/NiNi@N6 and GDY-Eu/Sm, while saving immense computational costs.
Fig. 10 (a) The top view of the g-C3N4 catalyst's optimized shape. The C and N atoms are represented by the blue and gray colored balls, respectively. Different dopant locations are indicated by the dashed circles with letters: Two-fold coordinated nitrogen bonded to two C atoms (N1), triazine ring-connecting nitrogen (N2), and carbon bridging three N atoms. (b) The structure and charge density differences of: B@N1-site, Mn@N1-site, and Co@N1-site, in the order of left to right. Electron depletion and accumulation are indicated by the blue and yellow isosurfaces (0.002 e Å−3), respectively (a and b are reproduced from ref. 156 with permission). (c) 2D materials structures with TM embedded at various defect sites (reproduced from ref. 157 with permission). Color code: metal, magenta; B, light pink; N, blue; C, gray; O, red; H, cyan. (d) The feature importance of the GBDT model (reproduced from ref. 158 with permission). (e) Left: The periodic table with the elements that have been thought to have endohedral sites in C60 shaded in orange. Right: Schematics showing dopants may be positioned inside the cage in the middle or off-center (reproduced from ref. 42 with permission). |
Beginning with the basic pure 2D MoS2 clusters, Jäger et al. extensively investigated the system,165 focusing on the training set size and the structural descriptors (SOAP, MBTR, ACSF, etc.) that could better predict the potential energy surface (Fig. 11a), as reflected in ΔEH*. They employed a comprehensive dataset of approximately 10000 DFT-based single-point calculations, featuring MoS2 and AuCu nanoclusters, to train their models. The study highlighted the effectiveness of the SOAP descriptor in accurately predicting hydrogen adsorption energy, with a notable MAE of 0.13 eV for MoS2 clusters. Wei et al., in contrast, based their exploration on experiments in order to optimize the synthesis conditions of MoS2 within a BO framework.166 They employed hydrothermal synthesis techniques with parameters such as temperature, reaction time, and precursor concentrations as input features for their ML model. The ML approach, particularly using GP belief models and the upper confidence bound policy, effectively identified optimal synthesis conditions, resulting in an optimum sample with notable HER performance, as evidenced by its low overpotential at 10 mA cm−2 (η10 = 240 mV) and Tafel slope (64 mV dec−1). Patra et al. employed GA alongside MD and high-resolution transmission electron microscopy (HRTEM) to investigate the defect dynamics in 2D MoS2.167 Their approach determined that extended line defects are more stable sulfur vacancy configurations than isolated vacancies. This finding further elucidated the critical role of defects in the 2H-to-1T phase transition and demonstrated the effectiveness of ML in advancing the understanding of complex material phenomena.
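The upper confidence bound policy mentioned above for the BO-driven synthesis optimization can be sketched as follows. This is an illustration of the selection rule only, not the authors' implementation: the candidate labels, the posterior means and standard deviations (which a GP belief model would supply), and the exploration weight `kappa` are all hypothetical, and the objective is assumed to be a figure of merit to maximize.

```python
def upper_confidence_bound(mean, std, kappa=2.0):
    """UCB score: predicted value plus an exploration bonus for uncertainty."""
    return mean + kappa * std

def select_next_condition(candidates, kappa=2.0):
    """Pick the candidate with the highest UCB score.

    candidates: list of (label, posterior_mean, posterior_std) tuples.
    """
    return max(candidates,
               key=lambda c: upper_confidence_bound(c[1], c[2], kappa))

# Hypothetical synthesis conditions with made-up posterior statistics.
candidates = [
    ("180 C, 12 h", 0.60, 0.05),
    ("200 C, 18 h", 0.55, 0.20),
    ("220 C, 24 h", 0.40, 0.10),
]

best = select_next_condition(candidates)
print(best[0])
```

With `kappa=2.0` the rule prefers the second condition, whose prediction is slightly lower but far less certain. With `kappa=0` it falls back to pure exploitation of the highest predicted mean.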
Fig. 11 (a) Learning curves for different MoS2 datasets show the MAE for different training set sizes (reproduced from ref. 165 with permission). (b) Basal plane of 2H-MoS2 and its local structural deformations (insets) when Fe, Co, Ni and Cu are doped at substitutional Mo sites (reproduced from ref. 168 with permission). (c) Structure models of two example chalcogenides-supported TM single-atom catalysts: Ni@ZnS and Sn@CoS, and (d) BPNN input and output schematic (c and d are reproduced from ref. 169 with permission). |
Incorporating heteroatoms as dopants into TMCs serving as substrates can modify local electronic structures and other material characteristics, and thus merits detailed exploration. Hakala et al. delved into typical cases where single atoms of common TMs such as Fe, Co, Ni, and Cu are doped into MoS2 (Fig. 11b).168 They applied RF for both classification and regression tasks, targeting the commonly chosen ΔGH* as the output feature for assessing HER potential. The ML model revealed that the type of edge (Mo or S) and the specific dopant (Fe, Co, Ni, Cu) are the most decisive factors determining the hydrogen adsorption characteristics. Tu et al. further extended the diversity by including more TM dopants and more sulfides beyond MoS2: CdS, CoS, FeS, and ZnS.169 Unlike the previous work, in which the TM atoms directly replace Mo atoms, the TM atoms in this work are loaded on the surface (Fig. 11c). After training, a three-layer BPNN (Fig. 11d) reached a promising R2 above 0.95 and an MSE below 0.016 (eV)2 for predicting ΔGH*. With it, the authors identified Sn@CoS and Ni@ZnS as the most promising catalysts among the candidates, with theoretical ΔGH* values of only 0.04 eV and −0.05 eV, respectively.
In addition to sulfides, the chalcogen elements in TMCs can include Se or Te. Similar to the previously mentioned MoS2 structure, transition metal dichalcogenides (TMDCs) can be experimentally synthesized as monolayer 2D materials, which yields a large specific surface area and abundant active sites. When combinations of various 2D TMDCs in heterojunction structures are further considered, the potential exploration space for ML applications expands considerably. Lee et al. proposed using symbolic regression to find optimal descriptors for predicting ΔGH* on 2D TMDCs.170 Their novel genetic descriptor search method efficiently identified descriptors without intensive calculations, using a dataset of only 70 TMDCs. Like other typical ML algorithms, this approach leveraged 27 primary TMD features, including atomic radii and valence electrons, to generate descriptors that align with chemical knowledge. The model facilitated the discovery of optimal materials for catalytic performance by identifying MnS2/FeS2/TaS2 with chalcogen vacancies as the best candidates. Ran et al. also studied various 2D TMDCs (Fig. 12a) and combined black-box ML modeling with a symbolic strategy based on least-squares linear regression.171 By narrowing down from 27 features to five key features, including local electronegativity and valence electron number, they developed ML models using RF and BPNN (possibly with skip-layer connections). These models achieved a high fitting degree (up to 0.94) but were poor in explainability. Least-squares linear regression (Fig. 12b) yielded the quantitative expression ΔGH* = 0.093 − (0.195 × LEf + 0.205 × LEs) − 0.15 × Vtmx (LEf/LEs: nearest/next nearest neighbor local electronegativity; Vtmx: average valence electron number of TM-X). This formula reached an R2 of 0.74 (Fig. 12c), further indicating that ΔGH* decreases with the valence electron number and electronegativity of the local structure. 
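To make the linear descriptor above concrete, the expression can be written as a small function. This is only a sketch: the coefficients are those quoted in the text, while the function name and the input values are placeholders. How LEf, LEs, and Vtmx are scaled or normalized in ref. 171 is not stated here, so the numerical outputs should not be read as physical predictions.

```python
def delta_g_h(le_f, le_s, v_tmx):
    """Linear expression for the hydrogen adsorption free energy quoted above.

    le_f  : nearest-neighbor local electronegativity (LEf)
    le_s  : next-nearest-neighbor local electronegativity (LEs)
    v_tmx : average valence electron number of TM-X (Vtmx)
    The feature scaling is an assumption; see ref. 171 for the definitions.
    """
    return 0.093 - (0.195 * le_f + 0.205 * le_s) - 0.15 * v_tmx

# Placeholder inputs, chosen only to show the sign of each trend.
baseline = delta_g_h(0.0, 0.0, 0.0)
higher_electronegativity = delta_g_h(1.0, 1.0, 0.0)
higher_valence_count = delta_g_h(0.0, 0.0, 1.0)
print(baseline, higher_electronegativity, higher_valence_count)
```

All three coefficients enter with a negative sign, which is the stated trend: ΔGH* decreases as the local electronegativity or the valence electron number increases.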
Doping a second TM into existing TMDCs significantly expands the pool of potential catalysts for ML exploration. Lee et al. studied TM-doped MX2 systems (Fig. 12d), employing an ML approach that used 28 atomic features to predict ΔGH*.172 The tree-based regression models revealed that the most influential are (i) the number of valence electrons, (ii) the distance of the valence electrons, and (iii) the electronegativity of the TM dopant. Chen et al. additionally explored macroscopic patterns in a similar system,173 revealing that certain doping concentrations in TMDCs significantly influence the ΔGH*, indicating enhanced HER performance at specific alloying ratios (Fig. 12e). They attributed this trend to the alloying effect, which alters the electronic structure and p-band center of the adsorption sites, thereby modulating the catalytic activity for HER. Lastly, novel heterojunctions could be obtained by stacking different 2D materials like TMDCs. Additionally, the formation of interfaces can potentially optimize electrical conductivity, electronic structures, and the density of active sites.174,175 Ge et al. considered in their study the heterostructures formed by different 2D MX2 single layers,176 taking the rotation angle, bond length, layer distance, and the ratio of bandgaps of two materials into consideration. Using the simple least absolute shrinkage and selection operator (LASSO) regression method, they efficiently identified key physical descriptors affecting the adsorption performance of these heterostructures. This approach led to the discovery of MoTe2/WTe2, with a 300° rotation angle as the optimal structure, achieving remarkably low overpotentials of 0.03 V for HER and 0.17 V for OER. Pham et al. 
ambitiously broadened the investigated space beyond heterostructures formed by MX2 layers to also include MX2 paired with M′X′ (e.g., ZnO, GaN) layers.177 To describe such complex systems for ML models, they meticulously screened 46 input features derived from atomic properties and positional information, and the ML surrogate model identified MoS2/ZnO as the best candidate. This is supported by both its exceptional theoretical performance, with a ΔGH* of −0.02 eV, and its dynamical stability, with no imaginary frequencies in phonon dispersion calculations.
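The LASSO regression used above for the MX2 heterostructures selects descriptors by shrinking the coefficients of weakly relevant features to exactly zero. A minimal coordinate-descent sketch shows the mechanism. It is an assumption-laden illustration: the toy data are hypothetical standardized features, no intercept is fitted, and it is not the workflow of ref. 176.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; return 0 inside the dead zone."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0

def lasso_coordinate_descent(X, y, alpha, n_iter=100):
    """Minimize (1/(2n)) * ||y - Xw||^2 + alpha * ||w||_1 (no intercept)."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            # Correlation of feature j with the residual that excludes feature j.
            rho = 0.0
            for i in range(n):
                pred_without_j = sum(X[i][k] * w[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
            rho /= n
            z = sum(X[i][j] ** 2 for i in range(n)) / n
            w[j] = soft_threshold(rho, alpha) / z if z > 0 else 0.0
    return w

# Hypothetical standardized features: column 0 drives the target strongly,
# column 1 only weakly (y = 2.0 * x0 + 0.05 * x1).
X = [[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]]
y = [2.05, 1.95, -1.95, -2.05]

weights = lasso_coordinate_descent(X, y, alpha=0.1)
print(weights)
```

In this toy case the weak feature is removed entirely while the strong one is kept with a slightly shrunken coefficient, which is how a short list of key physical descriptors emerges from a larger feature pool.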
Fig. 12 (a) Workflow of multilevel, high-throughput calculations for seeking metallic, lowest-energy, −0.09 eV ≤ ΔGH* ≤ 0.09 eV 2D-TMD materials. (b) Illustration of the linear regression fitting process. (c) distribution of the ΔGH* versus the descriptor obtained by least-squares regression (a–c are reproduced from ref. 171 with permission). (d) Geometric structure and colored periodic table representation of TM@MX2: in this illustration, the TMs are depicted in blue, M elements (Cr, Mo, and W) in green, and X elements (S, Se, and Te) in red (reproduced from ref. 172 with permission). (e) Lowest ΔGH* values for hydrogen adsorption on a W(1−x)VxS2 system across various compositions (x): the graph displays how ΔGH* values change with different V concentrations in W(1−x)VxS2. Insets provide visual examples of adsorption configurations that result in the lowest ΔGH*. In these configurations, V, W, S, and hydrogen atoms are represented in red, grey, yellow, and pink, respectively (reproduced from ref. 173 with permission). |
Fig. 13 (a) (i) ΔGH* predictions by the regularized RF versus DFT: the black-dashed line indicates perfect correlation. (ii) Top 10 descriptors’ relative importance from the model. (iii) Descriptor definitions: The three Ni atoms closest to the first doping site are labeled α, β, and γ, based on proximity. (iv) Impact on ΔGH* by Ni–Ni bond length: the role of chemical (via nonmetal doping) and mechanical pressure (by immobilizing surface Ni atoms), identifying the optimal Ni–Ni bond length for HER as 2.97 to 3.07 Å, with adjustments for bond contraction upon H adsorption, highlighted by a green dotted line (reproduced from ref. 179 with permission). (b) Workflow of the proposed stepwise strategy for predicting adsorption energy EH on amorphous catalyst surfaces using ML: this includes two key stages – (I) calculating frozen adsorption energy, where the initial adsorption energy is estimated without considering atomic rearrangements, and (II) determining structural relaxation energy, which accounts for the energy changes resulting from structural adjustments upon adsorption (reproduced from ref. 180 with permission). (c) Schematic structure of 3 × 3 PC3 monolayer (reproduced from ref. 185 with permission). (d) Top and side views of the TM/C3B monolayer structure (reproduced from ref. 186 with permission). |
Fig. 14 (a) Optimized atomic structure of single-atom-loaded MXenes with surface termination elements and single-atom elements, excluding Cr and Mn for the single atom position and C for the surface termination position (reproduced from ref. 44 with permission). (b) Top and side views of a 3 × 3 × 1 supercell of Mo-based MXene structures. (i) M2C structure: Top, fcc, and hcp sites indicating potential O adsorption areas; (ii) Mo2CO2 with functional group O; (iii) single-atom-doped model of Mo2CO2-STM (single transition metal), where STM includes 3d, 4d, and 5d metals. S0, S1, and S2 denote three types of O equivalent positions for H adsorption. The Tc atom is excluded due to its radioactivity (reproduced from ref. 43 with permission). (c) The selected elements for MM′XT2 MXenes (M/M′ = Sc, Ti, V, Cr, Mn, Y, Zr, Nb, Mo, or W; X = B, C, or N; T = O, F, Cl, or S), leading to optimized structures of pristine and functionalized MXenes (reproduced from ref. 190 with permission). (d) Left: Side views of bare MXenes, with early transition metals (purple) and C/N (gray) depicted. Right: Color block map showing ΔGH* for bare MXenes, where gray, yellow, orange, and wine-red circles indicate ΔGH* intervals of <−1.5, −1.5 to −1.0, −1.0 to −0.5, and −0.5–0.2 eV, respectively (reproduced from ref. 191 with permission). |
Aside from single-atom TM doping, the TM element in MXenes can be further tuned in ratio and type. Wang et al. studied the 2D MXene-ordered binary alloys M2M′X2O2 and M2M′2X3O2, allowing the second TM to exist in large amounts.192 Their interdisciplinary ML approach identified 110 promising MXene catalyst candidates with superior HER activity compared to Pt, out of a pool of as many as 2520 candidates. Abraham et al. further expanded the search space for 2D MXene-based catalysts by including F, S, and Cl terminations alongside O (Fig. 14c).190 They trained a GBDT regressor with feature selection and hyperparameter optimization on 1125 systems to predict the activity of all 4500 possible MM′XT2-type MXenes. In the post-ML analysis of structural and electronic descriptors, they revealed that the number of valence electrons and the electron affinity of the terminating groups are decisive. Similarly, Zheng et al. considered M2X, M3X2, and M4X3 structures with different M and X, both with and without S as T (Fig. 14d). Notably, they also took hydrogen coverage into consideration.191 As a result, Os2B and Sc–N based S-MXenes exhibited promising catalytic activity, with ΔGH* values approaching zero over a wide range of hydrogen coverages. Additionally, their ML data mining revealed that the atomic mass and electronegativity of the T atom play crucial roles in determining catalytic performance. This finding is in good agreement with ref. 190, where T is considered as a variable, but differs from ref. 192 for the M2M′X2O2 and M2M′2X3O2 systems. As the authors of ref. 192 considered only O as the terminal element, geometrical and electronic features related to the alloying effect were found to be the most important. 
Such comparisons between different ML works on similar systems should remind readers of the multifaceted nature of materials discovery and catalyst optimization, where the importance of specific descriptors and factors can vary depending on the alloying effects, terminations, composition of the electrocatalysts, and, most importantly, the search space that was defined.
The nuanced differences in algorithm application, feature selection, and paradigm adoption reflect the unique characteristics and complexities of each material class. Hence, we have visualized the statistical data of input features, applied ML algorithms, etc., for meta insights as shown in Fig. 15 (based on Table S1, ESI†). The bar plot summaries provide a direct trend of popular choices of input features chosen and identified as decisive, and ML algorithms used and identified as the best-performing.
The fundamental mechanistic understanding of OER has evolved to recognize two primary pathways: the adsorbate evolution mechanism (AEM) and the lattice oxygen mechanism (LOM). AEM, the traditional pathway, emphasizes the sequential adsorption and desorption of intermediates on the catalyst surface, with the activity significantly influenced by the binding energies of these intermediates. Under alkaline conditions, the four steps involve four OH− ions, and the intermediate is converted from *OH to *O and finally to *OOH:5 (1) * + OH− → *OH + e−; (2) *OH + OH− → *O + H2O + e−; (3) *O + OH− → *OOH + e−; (4) *OOH + OH− → * + O2 + H2O + e−. As for AEM in acidic conditions, the four steps and the corresponding oxygen-containing intermediates are the same, with H2O and H+ taking the place of OH−: (1) * + H2O → *OH + H+ + e−; (2) *OH → *O + H+ + e−; (3) *O + H2O → *OOH + H+ + e−; (4) *OOH → * + O2 + H+ + e−. This mechanism aligns with the Sabatier principle, where optimal catalyst activity is achieved when intermediates are neither bound too strongly nor too weakly to the catalyst surface. However, the inherent scaling relationships among the adsorption energies of the intermediates limit the activity enhancement achievable through AEM. In contrast, LOM offers a paradigm shift by implicating the lattice oxygen atoms of a certain type of catalyst material (typically perovskite) in the OER process,201 bypassing the limitations imposed by scaling relationships in AEM. This mechanism suggests that oxygen evolution can proceed through the participation of lattice oxygen, leading to the formation and subsequent refill of oxygen vacancies. This insight into the active involvement of lattice oxygen has been supported by experimental evidence such as oxygen isotope labeling and advanced spectroscopic techniques,202 underscoring the dynamic nature of catalyst surfaces during OER. A brief schematic of the OER mechanism is provided in Fig. 16.
Both mechanisms are underpinned by the thermodynamics and kinetics of intermediate species formation and evolution, with the Gibbs free energy change of adsorption playing a central role in determining catalytic activity. The activity of OER catalysts is often depicted in volcano plots, illustrating the trade-off between intermediate adsorption energies that are too strong or too weak. This relationship has been instrumental in guiding the theoretical screening and rational design of new OER catalysts, leveraging descriptors such as the difference in the Gibbs free energy change of adsorption between critical intermediates. Specifically, the difference in the Gibbs free energy change of adsorption between O and OH, namely ΔGO*–ΔGOH*, was found by Nørskov et al.203 to be a concise but effective descriptor of the theoretical overpotential of OER in common AEM pathways. Meanwhile, ΔGO* was proposed by Kolpak et al.204 as the descriptor when LOM is taken as the mechanism. Nevertheless, due to the complexity of catalyst surfaces, computing and examining all four steps to find the real RDS is more comprehensive and reliable. The exploration of OER mechanisms has also highlighted the significance of the electrocatalyst's electronic structure, particularly the d-band center theory for metal-based electrocatalysts, in influencing the adsorbate binding strength and, consequently, catalytic activity.205
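Examining all four AEM steps, as recommended above, is commonly done with a simple thermodynamic analysis: the step with the largest Gibbs free energy change sets the limiting potential, and the theoretical overpotential is that value minus the 1.23 V equilibrium potential. The sketch below assumes step free energies in eV at zero applied potential. The numerical values are hypothetical and chosen only so that they sum to 4.92 eV (4 × 1.23 eV).

```python
EQUILIBRIUM_POTENTIAL_V = 1.23  # standard O2/H2O equilibrium potential

def oer_overpotential(step_free_energies_ev):
    """Return (theoretical overpotential in V, 1-based index of the limiting step).

    step_free_energies_ev: Gibbs free energy changes of the four AEM steps
    (eV per electron transferred), evaluated at zero applied potential.
    """
    if len(step_free_energies_ev) != 4:
        raise ValueError("AEM analysis expects four step free energies")
    limiting = max(step_free_energies_ev)
    step = step_free_energies_ev.index(limiting) + 1
    # One electron per step, so a value in eV maps directly to volts.
    return limiting - EQUILIBRIUM_POTENTIAL_V, step

# Hypothetical step free energies for steps (1)-(4), in eV.
steps = [1.50, 1.80, 1.20, 0.42]
eta, limiting_step = oer_overpotential(steps)
print(round(eta, 2), limiting_step)

# An ideal catalyst would split the total evenly and show zero overpotential.
ideal_eta, _ = oer_overpotential([1.23, 1.23, 1.23, 1.23])
```

In this example step (2), the *OH to *O conversion, is limiting. Its free energy change equals ΔGO* − ΔGOH*, which is why that difference serves as a compact descriptor when steps (2) or (3) dominate.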
As with HER, the operational environment, whether alkaline or acidic, plays a pivotal role in dictating the choice of materials and the mechanisms at play. Commonly used catalysts in both environments include oxides of noble TMs such as Ir and Ru. However, the differences between alkaline and acidic conditions have profound implications for the catalyst's performance and durability. In alkaline media, catalysts often exhibit lower overpotentials and enhanced stability due to the less corrosive nature of the environment, which is conducive to the use of a broader range of materials, including non-noble metals and their oxides. This versatility facilitates the development of cost-effective and efficient catalysts. Alkaline conditions also allow for the exploitation of mechanisms like LOM with greater efficacy, which is attributed to the favorable interaction between OH− ions and the catalyst surface. Conversely, acidic environments necessitate the use of more corrosion-resistant materials, typically noble metals, to withstand the harsh conditions, thereby limiting the material choices. Hence, to address the limitations of noble metal-based catalysts in both environments, research has pivoted toward developing non-noble metal catalysts, including TM oxides, hydroxides, and perovskites,206 as well as carbon-based207 and hybrid compounds.208 These efforts are driven by the dual goals of achieving high catalytic activity and stability while reducing costs. The rational design of these catalysts often involves strategies such as doping, alloying, and surface modification to optimize electronic structures, enhance active site availability, and promote favorable adsorption energetics. In light of these considerations, the intricate challenges of OER present a prime opportunity for the application of ML to unravel and optimize the multifaceted design of catalysts.
ML-assisted investigation of RuO2 has also been reported. Timmermann et al., building upon their innovative application of GAP for the surface structure determination of rutile IrO2, extended their methodology to include RuO2,213 showcasing the versatility and efficiency of their approach for discovering novel surface structures. In this advancement, they employed a data-efficient iterative training protocol for GAPs, leveraging sparse GP regression alongside simulated annealing, to explore and optimize the surface geometries of both IrO2 and RuO2. This refined ML process, enriched by a dataset that eventually encompassed an additional 143 structures beyond the initial bootstrapping set, not only reaffirmed the discovery of thermodynamically stable surface complexions on IrO2 but also unveiled similar energetically favorable complexions on RuO2. Similarly, GAP was also used in the DFT calculation part by Singh et al. in their experimental exploration of Na-substituted disordered rock salt as OER electrocatalysts.214 Feng et al. introduced CrystalGNN with a dynamic embedding layer to self-update the atomic features adaptively along with the iteration of the neural network (Fig. 17a).215 Impressively, by accurately predicting the formation energies of more than 10500 IrO2 configurations, they discovered eight previously unreported metastable phases. They also innovatively used transfer learning to enable the discovery of RuO2 and MnO2, showcasing significant improvements in prediction efficiency and accuracy for these electrocatalysts, thereby highlighting the potential of transferring the capability of as-trained ML models across different electrocatalyst systems. Doping TMs into IrO2 and RuO2 is another well-studied strategy to enhance their activity and durability.216 Researchers have already reported successful doping with TM elements including Mn,217 Ni,218 Co,219 Mo,220 Cu,221 and Pb,222 among others. Xu et al. 
focused on doped RuO2 and IrO2 electrocatalysts,223 leveraging the SISSO method for data-driven descriptor engineering to predict OER adsorption enthalpies with remarkable accuracy. Their novel approach, involving an extensive dataset of 684 DFT calculations and innovative input features, enabled the identification of promising dopants like Co and Fe that are in agreement with experimental validations.
Fig. 17 (a) Framework of CrystalGNN and workflow of the dynamic embedding layer (reproduced from ref. 215 with permission). (b) The exploration process for efficient bifunctional multimetallic alloy catalysts integrates computational and experimental strategies. It begins with a comprehensive search for potential catalysts, followed by the experimental validation of selected candidates. Throughout this process, a Pareto AL cycle is employed to refine predictions and focus on promising alloys. Data points from predictions are categorized into three types: discarded points, which are overshadowed by superior options; uncertain points, requiring further analysis to determine their value; and Pareto front points, which represent optimal candidates undominated by others, highlighting the most efficient catalysts for further development (reproduced from ref. 224 with permission). |
Researchers have also directly applied ML in experimental exploration despite the higher expense. Jiang et al. approached the design of bifunctional oxygen electrocatalysts for ORR and OER from the unique perspective of chemical bonds in composite electrocatalyst material systems.225 They used a dataset from 151 published studies to develop ML models that predict E1/2, η10, and their difference as metrics of potential catalysts based on their chemical bonds. By employing SHAP values, they identified a promising combination of C–N, C–C, Fe–N, Ru–O, and C–P bonds, demonstrating a novel and efficient strategy for electrocatalyst discovery that led to a promising RuO2@Fe–N–P–C catalyst. In their most recent study, Kim et al. further refined the application of AL to experimental electrocatalyst discovery by targeting bifunctional catalysts for both HER and OER,224 extending their elemental palette to eight: Pt, Pd, Ru, Ni, Fe, Cu, Co, and Sn (Fig. 17b). They also used the previous data133 to train the initial model. Their expanded approach efficiently pinpointed an optimal catalyst composition, Pt0.15Pd0.30Ru0.30Cu0.25, achieving a notable cell voltage of 1.56 V at 10 mA cm−2 for water splitting. This advancement was facilitated by a refined Pareto AL framework using GP regressors for multi-objective optimization. By integrating more than 110 experimental data points out of a possible 77946 over five iterations, the method exhibited remarkable efficiency in navigating the complex design space for bifunctional catalysts.
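The notion of Pareto front points "undominated by others" in the Pareto AL framework above can be illustrated with a short sketch for two objectives that are both minimized, for example HER and OER overpotentials. The candidate names and values are hypothetical. The GP regressors and the handling of uncertain points in ref. 224 are not reproduced here.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and strictly better
    in at least one (all objectives are minimized)."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))

def pareto_front(candidates):
    """Return the sorted names of candidates not dominated by any other."""
    return sorted(
        name for name, point in candidates.items()
        if not any(dominates(other, point) for other in candidates.values())
    )

# Hypothetical (HER overpotential, OER overpotential) pairs in volts.
candidates = {
    "A": (0.030, 0.30),
    "B": (0.050, 0.25),
    "C": (0.060, 0.32),  # worse than A in both objectives
    "D": (0.025, 0.40),
}

front = pareto_front(candidates)
print(front)
```

Candidate C is discarded because A outperforms it on both objectives, whereas A, B, and D each represent a different trade-off and remain on the front.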
Fig. 18 (a) The ML model's prediction on covalency competition in spinel oxides; inset compares the model predictions to DFT for Max(DT, DO), with counts on the y-axis (reproduced from ref. 226 with permission). (b) Schematic of ML trained on EXAFS and XANES data (reproduced from ref. 227 with permission). (c) Models of the crystalline–amorphous interface (close packed atoms) paired with differential charge density outcomes (atom-bonds), where yellow indicates charge accumulation and blue signifies charge depletion. The structure is obtained by high-dimensional neural network potential-boosted MD and DFT provided by the DeePMD-kit package (reproduced from ref. 228 with permission). (d) DFT calculations boosted by ML force field on 9e-HEA: Left shows the model without oxidation. Center and right depict models with pre-oxidation for *O and *OOH intermediates, respectively. Black circles highlight Ni as catalytically active sites, with red and white spheres for O and H atoms (reproduced from ref. 229 with permission). |
Following the same idea, a series of experimental data-based ML studies have taken η10 as the output fitting target to explore similar systems from binary to quinary: FeNiOxHy,230 (Ni–Fe–Co)Ox,32 (Ni–Fe–Co–Ce)Ox,231 pseudo-quaternary metal oxide combinations from six earth-abundant TM (Co, Ni, Fe, Mn, etc.) elements,232 and NiaCobFecX1−a−b−c.233 These studies leverage various ML techniques, including ANNs, SVR, and deep symbolic regression, to analyze and predict overpotential in these earth-abundant TM oxide systems. Collectively, they examined thousands of experimental data points, spanning compositions from binary to quinary systems. The research demonstrated that ML models based on experimental datasets can also uncover complex mechanistic relationships in electrocatalysis. For example, in the work of Jiang et al. on (Ni–Fe–Co)Ox ternary systems,32 the ML model indicated that the variance in the first ionization energies and outermost d-orbital electron numbers of the catalyst compositions correlates linearly with the reduction in overpotential. By achieving significant accuracy in forecasting electrocatalyst performance, these works also enabled the successful prediction of optimal catalyst compositions. In a follow-up work, Jiang et al. synthesized a novel Ni0.77Fe0.13La0.1(OH)x sample with an ultra-low η10 of only 226 mV under ML guidance.233 Wei et al., however, noticed that η10 is not the only effective descriptor.72 They used a domain knowledge database to predict the electrochemical double-layer capacitance (Cdl) of earth-abundant TM-layered double hydroxides (LDHs) in OER. By incorporating features such as chemical compositions, structural morphology, and testing conditions into their models, they identified Ce as a pivotal element in modifying the double-layer capacitance of LDHs. 
The importance of Ce in enhancing OER activity was further validated by the authors’ experiments. Timoshenko et al. proposed using EXAFS and X-ray absorption near-edge structure (XANES) spectra to predict the partial bond length distributions227 (Fig. 18b) for a deeper understanding of the structural and chemical transformations in CoxFe3−xO4 nanocatalysts during OER. By combining unsupervised and supervised ML methods, including PCA and ANN, they were able to elucidate the evolution of tetrahedrally and octahedrally coordinated species. They found that the active OER mechanism likely involves the reversible formation and oxidation of Co3+–O6 octahedral clusters, which vary with the Co-to-Fe ratio and the electrochemical conditions. In conclusion, the works described in this section have firmly demonstrated the power of ML in guiding the discovery of efficient non-noble metal OER electrocatalysts and deepening our understanding of their mechanisms, using experimental datasets that are less abundant but of higher fidelity.
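The composition-level descriptors mentioned above (variance of first ionization energies and of d-electron numbers) can be illustrated with a short sketch. The feature definitions below, a composition-weighted mean and variance, are an assumption made for illustration and are not necessarily the exact descriptors of ref. 32; the function names are hypothetical, and the elemental values are rounded tabulated numbers.

```python
# Illustrative sketch only: the weighted-variance definition is an assumption,
# not the published descriptor of ref. 32.
FIRST_IONIZATION_EV = {"Ni": 7.640, "Fe": 7.902, "Co": 7.881}  # rounded values
D_ELECTRONS = {"Ni": 8, "Fe": 6, "Co": 7}  # 3d electrons of the neutral atoms


def weighted_mean_and_variance(composition, prop):
    """Composition-weighted mean and variance of an elemental property."""
    total = sum(composition.values())
    fractions = {el: amount / total for el, amount in composition.items()}
    mean = sum(frac * prop[el] for el, frac in fractions.items())
    variance = sum(frac * (prop[el] - mean) ** 2 for el, frac in fractions.items())
    return mean, variance


def featurize(composition):
    """Build a small feature dictionary for one metal composition."""
    ie_mean, ie_var = weighted_mean_and_variance(composition, FIRST_IONIZATION_EV)
    d_mean, d_var = weighted_mean_and_variance(composition, D_ELECTRONS)
    return {"ie_mean": ie_mean, "ie_var": ie_var, "d_mean": d_mean, "d_var": d_var}


# Hypothetical example composition (metal molar fractions).
features = featurize({"Ni": 0.6, "Fe": 0.2, "Co": 0.2})
print(features)
```

Features of this kind could then be passed to any regressor with η10 as the target; a single-element composition gives zero variance by construction.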
Recently, researchers have extended their interest to multi-element alloys containing five or more metals, namely high-entropy alloys (HEAs), for OER, but such systems can hardly be investigated efficiently without the help of ML. Before experimental synthesis, Cui et al. applied ML-boosted MD and DFT via a high-dimensional neural network potential to provide guidance in the FeCoNiMoAl HEA system228 (Fig. 18c). The results revealed optimized atomic configurations and electronic structures that significantly reduce the electron transfer resistance and enhance the catalytic active sites for OER. Guided by this computational approach, the team synthesized HEA fibers that demonstrated superior OER performance, with an overpotential of 470 mV at 2 A cm−2 and remarkable stability. Moreover, Tajuddin et al. applied a similar strategy to a more ambitious nine-element HEA (9e-HEA) containing Ti, Cr, Mn, Fe, Co, Ni, Zr, Nb, and Mo,229 where ML force fields were generated from DFT-MD simulations to estimate the Gibbs free energy for both OER and HER in challenging acid electrolytes (Fig. 18d). They used a top-down approach for designing HEAs, focusing on the self-selection and self-reconstruction of elements under operational conditions, which enabled the automatic identification of both catalytically active and passivation sites on the alloy surface. Their findings revealed that certain elements like Mn and Fe are as effective as platinum for HER, and that the combination of elements in the nonary alloy achieved high catalytic activity and remarkable stability during OER.
Fig. 19 (a) The surface center-environment model for ML input feature construction includes the central surface atom (B), top surface environment (excluding B), and subsurface atoms. ML targets ΔGO*, ΔGOH*, ΔGOOH*, and ηOER, with D representing the elementary properties of the center and the surrounding atoms (reproduced from ref. 237 with permission). (b) Transfer learning pipeline to predict the property of unknown pyrochlore oxides (reproduced from ref. 239 with permission).
In addition to typical ABO3 type perovskites, other advanced oxide systems have been studied using ML. Li et al. investigated AA′B2O6-type double perovskites,240 employing an adaptive learning strategy with GP regressors. Their approach led to the discovery of several novel perovskites with promising OER activity, such as KRbCo2O6 and BaPbTi2O6, highlighting the model's effectiveness in guiding the design of next-generation electrocatalysts with a calculated overpotential of ∼0.5 V and tolerance factors greater than 0.90. Song et al. also investigated double perovskite catalysts, using a multi-task symbolic regression method to distill universal activity descriptors from diverse datasets gathered from publications.241 They applied the ML-derived 2D descriptor to predict and experimentally validate two new nickel-based perovskites, Cs0.4La0.6NiO3 and K0.5Ce0.5NiO3. Wang et al. focused on pyrochlore compounds, which are promising for acidic conditions.239 The team implemented a transfer learning strategy to navigate the vast compositional space (Fig. 19b). With a two-stage model, in which the first stage was trained on the formation energies of inorganic compounds to learn a representation of individual elements and the second stage applied this knowledge to predict the critical properties of pyrochlore oxides, the team efficiently pinpointed 61 promising candidates from an initial set of 6912. Tran et al. built the “Open Catalyst 2022” dataset,120 which includes 62331 DFT relaxations and approximately 10 million single-point calculations across various oxide materials. They utilized advanced neural network frameworks, including GemNet-OC,53 SchNet,50 DimeNet++,52 ForceNet,108 SpinConv,51 GNN, and PaiNN,242 with GemNet-OC demonstrating the best performance.
Their work highlights the effectiveness of fine-tuning pre-trained models on this large, specialized dataset, improving prediction accuracy for complex oxide surfaces and providing key insights into the stability and energy dynamics of these electrocatalysts.
Fig. 20 (a) Model structure of the optimal site on a zigzag nanoribbon (reproduced from ref. 244 with permission). (b) Schematic of graphene-supported SACs for ML models: Single vacancy (three carbon atoms), double vacancy (four nitrogen/carbon atoms), and four pyridine nitrogen configurations. TM atoms in orange, neighboring N/C in green, and other C atoms in gray (reproduced from ref. 245 with permission). (c) Atomic structures of TM dual-metal catalysts on carbon surfaces include 23 defect types across seven N-doping levels (e.g., 4C-2N for two nitrogen substitutions) and 729 compositional combinations, totaling 16767 unique DAC structures (reproduced from ref. 249 with permission). (d) Top and side views of MN4–O–MN4 show upper (orange) and lower (magenta) transition metals (M1 and M2). Red, blue, and gray depict oxygen, nitrogen, and carbon atoms, respectively (reproduced from ref. 250 with permission).
These studies collectively highlight the potential of ML to reduce the computational cost associated with DFT calculations. Commonly, the input features for ML models in these studies include atomic and electronic properties of the TMs, such as atomic mass, atomic radius, d-electron number, and electronegativity. Moreover, as we concluded in Section 3.4 for HER, the structural properties of the catalyst, including the coordination environment and bond connectivity, are also preferred and considered in carbon materials. Material insights gleaned from these studies generally underline the importance of electronic structure and atom-environment interactions in determining catalytic activity. For instance, the electron number of the d orbital, the oxide and hydride formation enthalpies, and the electronegativity values of the central TM atom and its surrounding atoms emerge as critical descriptors. These features directly relate to the catalyst's ability to facilitate electron transfer and bond formation/breaking during the OER process. Finally, across the works, elements such as Fe, Co, Ni, and non-precious metals embedded in nitrogen-doped graphene are often identified as promising candidates for efficient OER catalysts.
In addition to the regular TM–N–C system discussed above, some researchers have also considered related variants. Wu et al. explored double-atom catalysts in carbon matrices, which increases the structural complexity (Fig. 20c).249 They applied a topological information-based feature-engineering method to construct model inputs that integrate atomic properties with the structural topology of the active sites and their substrate environment. They also derived an effective intrinsic descriptor. Besides the d-band properties of the two TM atoms, the descriptor includes the number and electronegativity of nearby C and N atoms, reflecting their distinct impacts. Shan et al., by contrast, considered another possibility in which the TM–N4 active sites are bridge-bonded by an O atom (Fig. 20d).250 Their calculations pinpointed CoN4–O–RhN4 and RhN4–O–AgN4 as standout monofunctional catalysts for ORR and OER, respectively, and CoN4–O–AgN4 as an exceptionally efficient bifunctional catalyst. The electronic structure analysis revealed that the d-band centers of the active sites in the bifunctional catalysts lead to moderate adsorption of intermediates on the TM atoms, owing to the synergistic effects of the bridge-bonded O ligands. This finding aligns with the previously discussed research on single-layer TM–N–C systems.
Fig. 21 (a) Illustration of the configuration of TM/VN-CN and the considered TM atoms as SAC candidates (reproduced from ref. 39 with permission). (b) Atomic structures of prevalent carbon nitrides, their CxNy-based SACs with single TM atoms indicated by blue circles, and screened transition metals on CxNys (reproduced from ref. 252 with permission). (c) Optimized g-CN structure, selection of metal atoms (Sc to Au), and binding configurations of M2 dimers on g-CN, showing both M atoms bonded with either three or two N atoms. Additionally, calculated formation energies (Ef) and Udiss for M2/g-CN are presented (reproduced from ref. 30 with permission).
Fig. 22 (a) Schematic illustration of the side and top views of the investigated O-terminated M2M′X2O2-type doped MXenes. TM acts as the active site for the adsorption of intermediates (reproduced from ref. 253 with permission). (b) Top view and front view of the M–N4–Gr(aphene)/MXene heterojunction structure (reproduced from ref. 255 with permission). (c) The optimized structures of the 2D MnPS3 monolayer and the TM/MnPS3 catalysts (reproduced from ref. 256 with permission).
Researchers have also investigated other unique TM compound systems. Liu et al. studied 2D GaPS4 as a substrate for hosting TM single atoms on sulfur vacancies,257 namely TM@VS-GaPS4. Using a GBDT regressor, they identified the number of d electrons, bond length, and electronegativity as key descriptors in their ML data mining, and Pt@VS1-GaPS4 was identified as the most outstanding candidate in this system. Similarly, Li et al. investigated single TM atoms anchored on MnPS3 (Fig. 22c),256 identifying Rh/MnPS3 and Ni/MnPS3 as the best candidates. Unsurprisingly, the ML analysis revealed that the number of d electrons in the TM atoms influences the adsorption strength of OH* species, making it the crucial feature. In addition, the authors of ref. 186 studied OER as well as HER on a monolayer C3B substrate. Recall that their results identified Ni and Pt as the best doping candidates, and ML data mining showed that the number of d electrons is the most significant factor, ahead of the other important features: the electronegativity, atomic radius, and first ionization energy of the TM.
Although these studies investigated different substrates for hosting TM atoms, ranging from MXenes with different structures/doping strategies to GaPS4, MnPS3 and C3B, their results are similar. TM d-band-related features such as the number of d electrons should be considered the most decisive features in determining oxygen intermediate adsorption behaviors. Moreover, Ni and Pt are consistently found to be optimal candidates in these 2D systems. This consistency may also be taken as evidence of the robustness, transferability, repeatability, and reliability of ML. In contrast to the previously mentioned studies, which are highly homogeneous in terms of methodology and research system, Craig et al. investigated molecular OER catalysts,33 targeting TM catalysts coordinated with specific ligands such as porphyrins using an AL approach. This method was applied to identify catalysts capable of operating through an extra oxidation mechanism, a novel area in OER catalyst research. The balance they sought between low overpotentials and achievable proton transfer barriers depended critically on the use of GP regressors for predicting binding energies. A significant aspect of their methodology was the use of reduced autocorrelation functions to generate input features, paired with a bespoke acquisition function developed for their AL framework.
Ding et al. studied various aspects of MEA optimization in proton exchange membrane (PEM) water electrolyzers, including OER electrocatalyst design rules for MEA that best balance cost and durability.69 They compiled a database from 122 research papers, resulting in 578 entries that included detailed operating conditions, electrocatalyst compositions, and performance metrics. Their models showed strong regression performance for both MEA activity and long-term stability, especially in the high-current-density region, which is most important for electrolyzer efficiency. The best-performing model for predicting current density at 1.9 V achieved an R2 of 0.943 (Fig. 23a). Furthermore, the researchers recognized that a basic feature importance ranking was not enough to capture the nuanced interplay of factors influencing MEA performance. Because qualitative black-box interpretation matters for engineering and industry targets like MEA, they used 2D SHAP and PDP interaction plots to visualize the complex relationships between variables. They also proposed using Friedman's H-statistic262 to quantify the degree of non-linear interaction between input features and thereby identify the most impactful feature pairs (Fig. 23b). Their analysis suggested specific design choices, for example an Ir weight percentage of around 80% in anode electrocatalysts to balance durability and activity (Fig. 23c). Similarly, Günay et al. presented another comprehensive study,263 incorporating a wide array of components like porous transport layers and various electrode electrocatalysts in their analysis.
They meticulously compiled a database from 30 recent publications, culminating in 789 data points which included intricate details like electrode compositions and operational parameters. The researchers adeptly applied a combination of ML techniques, including PCA and classification and regression tree modeling, to unravel the complex interrelations in PEM electrolyzer performance. Their nuanced approach enabled them to identify key performance indicators such as the mole fractions of Ni and Co on electrode surfaces, leading to the precise prediction of electrolyzer polarization with an impressive RMSE of 0.18 A cm−2 (Fig. 23d). Moreover, the study shed light on high-performance electrocatalysts for PEM electrolyzers, affirming the superiority of proton conductor electrolyzers over their anion exchange counterparts and highlighting the potential risks associated with certain materials like unsupported/V-doped TiO2.
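The pairwise interaction analysis with Friedman's H-statistic mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of ref. 69: the two toy functions stand in for a trained regressor, the two-feature grid stands in for the MEA database, and all function names are hypothetical. The statistic compares the centered two-feature partial dependence with the sum of the two centered one-feature partial dependences.

```python
import itertools
import math


def partial_dependence(model, data, cols, values):
    """Average model output over the data with the features in `cols` fixed to `values`."""
    total = 0.0
    for row in data:
        probe = list(row)
        for col, val in zip(cols, values):
            probe[col] = val
        total += model(probe)
    return total / len(data)


def centre(values):
    """Subtract the mean so that each partial dependence function has zero mean."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def friedman_h(model, data, j, k):
    """Pairwise Friedman H statistic for features j and k, estimated on `data`."""
    pd_jk = centre([partial_dependence(model, data, (j, k), (r[j], r[k])) for r in data])
    pd_j = centre([partial_dependence(model, data, (j,), (r[j],)) for r in data])
    pd_k = centre([partial_dependence(model, data, (k,), (r[k],)) for r in data])
    numerator = sum((a - b - c) ** 2 for a, b, c in zip(pd_jk, pd_j, pd_k))
    denominator = sum(a ** 2 for a in pd_jk)
    return math.sqrt(numerator / denominator) if denominator > 0 else 0.0


# Hypothetical stand-ins for a trained model: one purely additive, one with interaction.
def additive_model(x):
    return 2.0 * x[0] + 3.0 * x[1]


def interacting_model(x):
    return x[0] * x[1]


grid = [0.0, 0.25, 0.5, 0.75, 1.0]
data = [list(point) for point in itertools.product(grid, repeat=2)]

print(friedman_h(additive_model, data, 0, 1))     # close to zero: no interaction
print(friedman_h(interacting_model, data, 0, 1))  # clearly positive: interaction present
```

A value near zero indicates that the two features act additively, whereas larger values flag feature pairs whose joint effect cannot be decomposed, which is how the most impactful interaction pairs could be ranked.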
Fig. 23 (a) Best ML algorithms' performance in predicting current density at 1.9 V, with 21 features shown by red points plotting predicted values against actual values; proximity to the Y = X reference line (blue) indicates accuracy. The gray area shows the common prediction range. Bar charts display the average and standard deviation of current densities, with MAE, MSE, and RMSE gauging prediction errors; lower values signify better model performance. (b) Second-order Friedman H-statistic matrices after weighted averaging of the ML models trained with the selected core features for the regression of current density. (c) 2D PDP interaction plots of Ir wt% and Ru wt% in different tasks for modeling current densities at different voltages. (a–c are reproduced from ref. 69 with permission). (d) Regression tree model prediction of the electrolyzer polarization curve's unseen observations (reproduced from ref. 263 with permission).
Although electrocatalysts fundamentally operate at the microscale, their real-world industrial application necessitates a transition to a more holistic approach in future research. Using actual MEA data, the research community should now focus on understanding and optimizing the interplay between the intricate microscale phenomena of electrocatalysts and the macroscale operational dynamics of electrolyzer systems. This shift is crucial for tailoring electrocatalyst designs that not only excel in theoretical and laboratory settings, but also thrive in practical commercial electrolyzers, thereby bridging the gap between experimental research and industrial application for sustainable and efficient hydrogen production.
When further breaking down the results based on material categories, notable differences can be discerned. TM oxides typically have catalytic sites on specific crystal facets exposed in a homogeneous phase with short-range periodicity, whereas carbon-based and other TM compound systems often involve atomically dispersed TM heteroatoms incorporated into the substrate material. While both categories identify the number of d electrons as the most decisive feature, TM oxides focus more on electronic properties and chemical stability, while carbon-based and other TM compound systems rely heavily on their structural configuration due to their diverse bonding environments.
Finally, when the data are broken down by material category, minor differences in algorithm preference can be observed. Although both categories favor ensemble algorithms like RF and GBDT, TM oxides uniquely favor GP. GP is the most used algorithm for TM oxide systems and is reported as the best-performing algorithm as frequently as RF. This preference might be due to the popularity of AL and BO in TM oxide studies: in the corresponding five studies, GP is adopted for its inherent flexibility and ability to provide uncertainty estimates. Additionally, symbolic regression, despite its weaker fitting ability, is uniquely favored in TM oxide studies. This trend might indicate that for TM oxides, researchers prefer interpretability over ML model capability. Because of the more complex OER mechanism on TM oxide surfaces and the limited query budget, researchers prefer to use ML to identify key design factors, for example simple closed-form descriptors such as the octahedral factor divided by the tolerance factor in perovskites, rather than to predict OER activities directly.
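As an illustration of such a closed-form descriptor, the sketch below evaluates the Goldschmidt tolerance factor t = (rA + rO)/(√2 (rB + rO)), the octahedral factor μ = rB/rO, and their ratio μ/t for an ABO3 perovskite. The function names are hypothetical, and the ionic radii used in the example (Sr2+ ≈ 1.44 Å, Ti4+ ≈ 0.605 Å, O2− ≈ 1.40 Å) are approximate Shannon values inserted here as assumptions for demonstration.

```python
import math

R_OXYGEN = 1.40  # approximate Shannon radius of O2- in angstrom (assumed value)


def tolerance_factor(r_a, r_b, r_x=R_OXYGEN):
    """Goldschmidt tolerance factor t for an ABX3 perovskite."""
    return (r_a + r_x) / (math.sqrt(2) * (r_b + r_x))


def octahedral_factor(r_b, r_x=R_OXYGEN):
    """Octahedral factor mu = r_B / r_X."""
    return r_b / r_x


def mu_over_t(r_a, r_b, r_x=R_OXYGEN):
    """Descriptor formed by dividing the octahedral factor by the tolerance factor."""
    return octahedral_factor(r_b, r_x) / tolerance_factor(r_a, r_b, r_x)


# Example: SrTiO3 with approximate ionic radii in angstrom.
r_sr, r_ti = 1.44, 0.605
print(f"t    = {tolerance_factor(r_sr, r_ti):.3f}")
print(f"mu   = {octahedral_factor(r_ti):.3f}")
print(f"mu/t = {mu_over_t(r_sr, r_ti):.3f}")
```

Because the descriptor needs only tabulated ionic radii, it can be evaluated for a new composition without any DFT calculation, which is what makes such symbolic-regression outputs attractive for screening.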
To address this, recent research has been directed toward exploring non-precious metal catalysts and innovative electrocatalyst designs that can enhance HOR activity in alkaline conditions. This includes the use of TM alloys, carbides, nitrides, and engineered nanostructures that aim to optimize the hydrogen binding energy and facilitate effective adsorption–desorption processes.267 Although such studies are fewer in number, some research still seeks to use ML to boost HOR electrocatalyst design, especially in HEA systems. Men et al., in their experimental study of the PdNiRuIrRh HEA system,268 employed ML potential-based Monte Carlo simulations (Fig. 26a) for an in-depth analysis of the alloy's catalytic properties. They constructed the deep potential for the HEA using a dual neural network setup and a training process via the deep potential generator, which iteratively refined the dataset based on DFT calculations. This method enabled the accurate prediction of high-dimensional potential energy surfaces, leading to the identification of key surface atomic distributions and coordination environments, such as the critical Pd–Pd–Ni and Pd–Pd–Pd bonding environments and Ni/Ru oxophilic sites. The study's findings revealed a significant enhancement in the HEA's HOR activity, with a mass activity of 3.25 mA μg−1, far surpassing that of conventional Pt/C catalysts. Moreover, the authors used the ML potential to simulate the particle surface dissolution process, which provided insights into the enhanced stability mechanisms of the HEA nanoparticles (Fig. 26b).
Hitt et al., by contrast, applied ML to drive their entire experimental exploration, based on the parallel fluorescent screening of a broad array of alloy electrocatalysts.269 Using a combination of experimental data comprising catalyst compositions, onset potentials, and extensive material characterization, they applied ML not only to predict new active catalysts, but also to uncover key insights such as the critical role of the average work function and metal–oxygen bond enthalpy in determining the catalytic activity (Fig. 26c). Their approach notably identified Pt6Sn4 as a highly effective alloy, surpassing traditional Pt/C in alkaline polymer membrane fuel cells with a higher power density of 132 mW cm−2 mgPt−1. In conclusion, the less studied HOR in alkaline conditions remains challenging and could benefit from ML both in boosting catalyst design in complex alloy systems and in revealing the mechanisms.
Fig. 26 (a) Optimization of the HEA nanoparticle model via ML–MC simulations. (b) Schematic of the process of dissolving surface metal atoms in the HEA nanoparticle to evaluate the stability. The energies of various nanoparticle systems are obtained based on the trained ML potential. (a and b are reproduced from ref. 268 with permission). (c) Left: Neural network-predicted onset potential for the alkaline HOR of an SnPtRh array. Right: Experimental onset potential for an SnPtRh array with slight discrepancies, notably along the Pt–Rh binary line (reproduced from ref. 269 with permission).
For ORR electrocatalysts, currently there is a heavy reliance on Pt-based noble metals in commercial applications.270 The superior catalytic performance of Pt and its alloys has set a high benchmark both in ORR efficiency and predominantly four-electron direct reaction selectivity. However, due to the high cost and scarcity of Pt, researchers have been exploring alternative strategies such as doping Pt with other elements to enhance its catalytic activity or stability, and more recently, the use of high-entropy alloys (HEAs).271 HEAs have gained significant attention in the field of ORR due to their unique properties arising from their complex compositions. Despite these benefits, there remains a strong interest in non-precious metal-based electrocatalysts, especially carbon-based catalysts such as TM–N–C.272 These catalysts, particularly those featuring single-atom sites, have shown promising ORR activity. The use of transition metals such as Fe and Co combined with nitrogen-doped carbon structures has led to the development of catalysts that offer a cost-effective alternative to Pt-based electrocatalysts.273 Given the complexity of ORR mechanisms and the diverse range of catalysts being explored, this field also presents an ideal scenario for the application of ML techniques.
Fig. 28 (a) Snapshots for O2 in bulk water, the initial state, the transitional state, and the final state. The substrate is Au (100) surface (reproduced from ref. 280 with permission). (b) Left: t-SNE plot of 1300 platinum nanoparticles showing x–y distribution based on similarity in 121 dimensions. Right: Order-labeled minimum distance plot from ILS clustering with two peaks indicating distinct clusters, color-graded from blue to red based on label iteration. (c) Left: t-SNE distribution of 1300 Pt nanoparticles, colored by ILS-assigned clusters. Right: Confusion matrix confirming perfect separability of classes, primarily influenced by processing conditions and order parameters. (d) Examples of Pt nanoparticles in the set, of comparable size, with atoms color-coded by coordination number: dark blue for 7, light blue for 8, green for 9, yellow for 10, and red for 11. (i) Class 1 Pt nanoparticle featuring abundant surface microstructures, (ii) Class 1 with numerous surface facets, (iii) Class 2 with a high density of surface microstructures, and (iv) Class 2 rich in surface facets. (b–d are reproduced from ref. 281 with permission).
Fig. 29 (a) Structure of a randomly ordered Pt–Ni 85 atom octahedron. (b) Global search results of physically niche genetic-ML in the homotopic space of a Pt(85–x)–Nix nanocluster. (a and b are reproduced from ref. 283 with permission). (c) Upper: Training (green), testing (red) nanoparticles sizes used, and ML predictions of strain, with size scale truncated at 2.86 nm. Bottom: Forecasted optimal mass activities for nanoparticles with Pt shells and fcc core metals vary by size and distribution (reproduced from ref. 285 with permission). (d) Mass activity at 0.9 V (versus RHE) before and after 30000 accelerated stress test cycles (reproduced from ref. 287 with permission).
Besides binary alloys, researchers have also investigated possibilities in ternary alloys. Chun et al. studied PtFeCu ternary alloys and trained neural network potentials to predict forces and energies in the crystal, enabling high-throughput screening of 396862 structures to pinpoint the most active and stable configurations.287 The DFT-ML emulator guided the identification of candidate compositions for experimental exploration: Pt0.82Fe0.18 (PtFe), Pt0.82Fe0.12Cu0.06 (PtFehighCulow), and Pt0.8Fe0.08Cu0.12 (PtFelowCuhigh). Moreover, the authors revealed the atomic distribution of Cu as a critical factor for enhancing activity and stability. In a half-cell test, the best Pt0.82Fe0.12Cu0.06 synthesized not only showed a three-fold higher mass activity than that of Pt/C (Fig. 29d), but also performed well in accelerated stability tests. Kang et al. also applied a similar strategy, incorporating neural network potential with Monte Carlo and MD simulations as an efficient emulator for the ternary PtNiCu system.288 They adeptly employed Gaussian descriptors for radial and angular symmetry functions (G2 and G4) as input features to predict the total energy of nanoparticles, leading to the discovery of an optimal 2.6 nm icosahedron ternary nanocatalyst, comprising 60% Pt and 40% Ni/Cu, as the best theoretical candidate with enhanced activity and stability for ORR in acidic environments. Lee et al. further broadened the choice of the doping element to include a wider range of TMs in a Pt15MmNn (m + n = 5) system, representing typical Pt3M systems.289 They employed CGCNN to efficiently predict the stability (ΔHf) and activity (ΔEO) of more than two million surface structures, using crystal structures as input to identify 29 ternary Pt alloys with enhanced ORR activity and stability under acidic conditions. 
This approach revealed that certain combinations, notably those including elements like Ir and Rh as secondary doping elements, could significantly stabilize the Pt-skin surface. Park et al. used a modified CGCNN, a slab-graph convolutional neural network,35 which significantly enhanced the prediction of adsorption energies by incorporating slab-graph constructions tailored for catalytic systems. Their interest lay in ternary core–shell structures such as X3Y@Z, which demand exploration of the vast ternary alloy space. As a result, the authors identified a Cu3Au core with a Pt shell as a superior catalyst, which showed a roughly two-fold increase in kinetic current density and a significant reduction in Pt usage in experimental validation.
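The radial symmetry function G2 used as an input feature for neural network potentials in the PtNiCu study above can be sketched in a few lines. The version below follows the standard Behler–Parrinello form, G2 = Σj exp(−η (rij − rs)²) fc(rij), with a cosine cutoff; the parameter values and the toy coordinates are assumptions chosen for illustration, and the angular function G4 is omitted for brevity.

```python
import math


def cutoff(r, r_c):
    """Cosine cutoff: decays smoothly from 1 at r = 0 to 0 at r = r_c."""
    if r >= r_c:
        return 0.0
    return 0.5 * (math.cos(math.pi * r / r_c) + 1.0)


def g2(positions, i, eta, r_s, r_c):
    """Radial symmetry function G2 of atom i (sum over all neighbors within r_c)."""
    total = 0.0
    for j, pos in enumerate(positions):
        if j == i:
            continue
        r_ij = math.dist(positions[i], pos)
        total += math.exp(-eta * (r_ij - r_s) ** 2) * cutoff(r_ij, r_c)
    return total


# Toy cluster (angstrom): two neighbors at 2.7 and one atom beyond the cutoff.
atoms = [(0.0, 0.0, 0.0), (2.7, 0.0, 0.0), (0.0, 2.7, 0.0), (10.0, 0.0, 0.0)]
value = g2(atoms, i=0, eta=0.5, r_s=2.7, r_c=6.0)
print(f"G2 of atom 0: {value:.4f}")
```

Because the function depends only on interatomic distances, it is invariant to translation and rotation of the nanoparticle, which is why a set of such values (for several η and rs) is a convenient fixed-length input for predicting total energies.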
Fig. 30 (a) Surface configurations parameterized by nearest neighbors. Left: *OH on-top binding highlighted by zones: binding site (orange), single-coordinated surface (light green), and subsurface (light gray) neighbors. Right: *O fcc hollow binding with zones: binding site (orange, 35 parameters), single-coordinated (light green/light gray), and double-coordinated (dark green/dark gray) neighbors (reproduced from ref. 290 with permission). (b) From top to bottom: the top panel is a schematic of the complex solid solutions surface populations, with red for oxygen, white for hydrogen, and varied colors for complex solid solutions. The second panel shows histograms depicting *OH (green), *O (blue), and combined (grey) binding energy distributions across a 10000-atom surface, showing optimum energies on volcano curves. The third panel shows example polarization curves for Ag4Ir16Pd30Pt14Ru36 measured against potential, with red lines at 0.82 V versus RHE. The bottom panel shows activity maps from models I, II, and III, showing current at 0.82 V versus RHE for selected compositions highlighted by a black box (reproduced from ref. 291 with permission). (c) Workflow of the BO algorithm: Terminated at N = 150 samples to assess the deviation in samples needed for optimal composition discovery. Acquisition function evaluated with n = 1000 random compositions (reproduced from ref. 292 with permission). (d) Visualization of compositional coverage for ternary and quaternary libraries across 342 measurement areas on a 100-mm diameter substrate, spaced at 4.5 mm intervals. Bottom: Demonstration of co-deposition from five sources and compositional gradients in a co-sputtered quinary materials library, with the same measurement grid (reproduced from ref. 293 with permission).
Further advancing the field, their third paper introduced BO to efficiently navigate the compositional space of the Ag–Ir–Pd–Pt–Ru and Ir–Pd–Pt–Rh–Ru systems292 (Fig. 30c). The input features were molar fraction vectors for the alloy compositions, and the optimization target was the current density of each composition, computed with the DFT-informed kinetic models proposed in the previous work.291 This approach allowed the prediction and experimental validation of optimal catalytic activities with a significantly reduced dataset of only about 50 attempts, exemplified by the discovery of optimal binary alloys such as Ag14Pd86, Ir35Pt65, and Pd65Ru35 with high ORR activity. Recently, the authors extended this methodology with a combinatorial strategy that builds on their established integration of computational predictions with high-throughput experimentation. This latest effort systematically covered the vast composition space of Ru–Rh–Pd–Ir–Pt.293 By deploying a data-guided experimentation approach, which involved permutations of deposition source arrangements, they efficiently expanded the experimentally explored composition space beyond their earlier achievements with BO and DFT surrogate models. This methodology enabled the identification of an optimal electrocatalytic composition, Ru25Rh15Pd31Ir15Pt14, demonstrating the power of combining advanced simulation with large experimental datasets (Fig. 30d). The study also revealed the critical importance of Ru and Pd content for enhancing electrocatalytic activity in HEA systems for ORR. Across these studies, the team moved from DFT-based microscale predictions to macroscale experimental validations, highlighting a transition from conventional DFT surrogate models to advanced optimization and high-throughput experimental approaches.
This progression demonstrates a strategic application of ML to bridge theoretical models with empirical evidence, effectively exploring the complex composition space of HEAs for identifying superior electrocatalysts.
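The BO workflow described above (molar fraction vectors as inputs, a current-density-like objective, and an acquisition function evaluated on random compositions) can be sketched with the standard library alone. Everything below is a simplified illustration under stated assumptions: `toy_activity` is a hypothetical stand-in for the kinetic model of ref. 291, the GP uses a fixed squared-exponential kernel with an assumed length scale, expected improvement is chosen as the acquisition function, and the loop sizes are much smaller than those reported in ref. 292.

```python
import math
import random


def rbf(a, b, length=0.3):
    """Squared-exponential kernel between two composition vectors."""
    return math.exp(-sum((x - y) ** 2 for x, y in zip(a, b)) / (2 * length ** 2))


def cholesky(m):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(m)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = m[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(s) if i == j else s / low[j][j]
    return low


def solve_lower(low, b):
    """Forward substitution: solve low @ x = b."""
    x = []
    for i, row in enumerate(low):
        x.append((b[i] - sum(row[k] * x[k] for k in range(i))) / row[i])
    return x


def solve_upper_t(low, b):
    """Back substitution: solve low.T @ x = b."""
    n = len(b)
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(low[k][i] * x[k] for k in range(i + 1, n))
        x[i] = (b[i] - s) / low[i][i]
    return x


def gp_posterior(xs, ys, candidates, noise=1e-3):
    """GP posterior mean and standard deviation at each candidate composition."""
    mean_y = sum(ys) / len(ys)
    k = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(xs)]
         for i, a in enumerate(xs)]
    low = cholesky(k)
    alpha = solve_upper_t(low, solve_lower(low, [y - mean_y for y in ys]))
    out = []
    for c in candidates:
        k_star = [rbf(c, x) for x in xs]
        mu = mean_y + sum(ks * a for ks, a in zip(k_star, alpha))
        v = solve_lower(low, k_star)
        out.append((mu, math.sqrt(max(1.0 - sum(t * t for t in v), 1e-12))))
    return out


def expected_improvement(mu, sigma, best):
    """Expected improvement over the best observed value (maximization)."""
    z = (mu - best) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2)))
    return (mu - best) * cdf + sigma * pdf


def random_composition(rng, n_elements):
    """Uniform sample on the simplex: molar fractions that sum to 1."""
    draws = [rng.expovariate(1.0) for _ in range(n_elements)]
    total = sum(draws)
    return [d / total for d in draws]


def toy_activity(x):
    """Hypothetical objective; peaks at an arbitrary five-element composition."""
    target = [0.1, 0.1, 0.4, 0.3, 0.1]
    return math.exp(-8 * sum((a - b) ** 2 for a, b in zip(x, target)))


def bayesian_optimization(objective, n_elements=5, n_init=5, n_iter=20, n_cand=200, seed=0):
    rng = random.Random(seed)
    xs = [random_composition(rng, n_elements) for _ in range(n_init)]
    ys = [objective(x) for x in xs]
    for _ in range(n_iter):
        cands = [random_composition(rng, n_elements) for _ in range(n_cand)]
        best = max(ys)
        scores = [expected_improvement(mu, sd, best)
                  for mu, sd in gp_posterior(xs, ys, cands)]
        pick = cands[scores.index(max(scores))]
        xs.append(pick)
        ys.append(objective(pick))
    i_best = ys.index(max(ys))
    return xs[i_best], ys[i_best], ys


best_x, best_y, history = bayesian_optimization(toy_activity)
print([round(f, 2) for f in best_x], round(best_y, 3))
```

Sampling candidates directly on the simplex keeps every suggestion a valid composition, and the GP uncertainty lets the acquisition function balance exploring unsampled regions against refining the current optimum, which is the reason such loops can converge in tens of evaluations.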
There are also several other notable works. Lu et al. investigated the Ir–Pd–Pt–Rh–Ru system,294 as in ref. 290, but chose neural networks for regression modeling and applied them to decouple the ligand and coordination effects in HEA catalysts. With a neural network trained on DFT-calculated adsorption energies, they achieved an MAE of 0.09 eV. Moreover, their approach allowed them to probe the mechanisms more deeply. They identified coordination number and element identity as critical factors in determining the adsorption energy, which yielded the pattern that more undercoordinated sites bind *OH more strongly and therefore show higher ORR activities. Similarly, Saidi reported a study of the Pt-free multinary PdAuAgTi alloy.295 Using ΔEOH as the ORR activity descriptor, the study identified an optimal composition range of 8–12 at% Ti, which showed enhanced ORR performance close to that of Pt. Further, the research showed that substituting Au and Ag with more cost-effective elements like Cu and Zn not only maintained but potentially improved the catalytic activity, thereby opening avenues for more economically viable catalyst options. Yuan et al. further investigated HEAs without noble metals in the CoFeNi–X (X = Mo, Mn, or Cr) system.296 Through a standard DFT-ML strategy, they found that Mo and Cr could enhance the formation of bridge and on-top binding sites, which are crucial for ORR processes. Remarkably, they demonstrated that the typical scaling relationship between ΔEOH and ΔEO remains consistent across equimolar HEAs, yet stoichiometric adjustments can disrupt this balance. Of particular note, the output fitting targets of the works in this section are all ΔEOH and ΔEO, rather than the ΔG commonly adopted in the HER and OER studies.
We speculate that the focus on adsorption energies in these studies is due to the complexity of the thermodynamics on HEA surfaces, where site variations cause different thermodynamic behaviors. Fitting adsorption energies directly simplifies the exploration of catalytic activity by prioritizing a key step in electrocatalysis and avoiding the detailed thermodynamic corrections that can usually be handled with constants in other research works.
Fig. 31 (a) Left: Zigzag graphene nanoribbons with sulfur dopants at various sites (S), and active sites (Z), with absent hydrogen marked by blue circles. Z2′ and Z3′ sites are specific to dual-atom doping. Right: Armchair graphene nanoribbons featuring substitutional (S) and active sites (A), with A2′′ and A3′′ exclusive to single-atom doping (reproduced from ref. 300 with permission). (b) Linear sweep voltammetry polarization curves of samples annealed at 950, 1050, and 1150 °C and Pt/C catalyst at 1600 rpm in 0.1 M KOH saturated with O2. (c) Corresponding Tafel plots (b and c are reproduced from ref. 302 with permission). (d) Top: Structure of M–N4C10 with 28 central metals and six environmental atoms illustrated. Bottom: Eight configurations of SACs defining the sample space, with blue-violet for M, green for N, gray for C, and pink for doped atoms (reproduced from ref. 304 with permission). (e) Geometric structure of left: bare and right: OH-modified TM1TM2–N6 structures. Gray, red, blue, and white balls represent C, O, N, and H atoms, respectively, while pink and brown balls represent TM atoms (reproduced from ref. 305 with permission).
As a consensus in the field,298,299 further doping with TM atoms increases the electrocatalytic activity, as validated by both experiments and theoretical simulations. The TM–N–C configurations have therefore attracted attention, resulting in several similar studies aimed at expediting the discovery and optimization of such SACs304,306,307 (2023; 2023; 2020). These investigations systematically explored the influence of transition metals and environmental atoms on SAC performance by applying DFT calculations alongside ML to predict catalytic activities with high precision. For instance,304 one study considered non-metal environmental atoms other than N, such as P, S, and O (Fig. 31d). The authors identified 30 high-performance catalysts from a sample space of 1344 structures by combining geometric and electronic features, achieving a high predictive accuracy (RMSE of 0.12 V). In another work,306 incorporating descriptors such as the valence electron correction and the degree of construction differences significantly improved model predictions, highlighting the importance of the local structural configuration surrounding the active centers. As for deeper insights into the structure-performance relationships, key findings across these studies underscore the pivotal role of the central metal's electronic structure, particularly the number of d-electrons, radius, and electronegativity, in determining SACs’ ORR activity. Among the various TM–N–C configurations studied, Fe–N–C and Co–N–C emerged as the most promising candidates, owing to their optimal balance of catalytic activity and stability, as revealed through importance analysis. Such results are highly consistent with the domain consensus validated by experiments.
Wang et al. further explored more possible types of N–C substrates (15 types) for hosting TM single atoms,308 leveraging ML to decouple the effects of adsorbate geometry and substrate-specific properties on the adsorption energy of O2, which is crucial for optimizing electrocatalytic activity. Their approach identified a novel, data-driven descriptor related to the geometrical configuration of the adsorbed O2, which emerged as the most significant factor influencing adsorption energy, thereby providing a quantitative basis for the design of TM–N–C SACs with tailored catalytic properties for ORR. Finally, two different groups independently examined dual-TM–N–C sites with geometric structures such as TM1TM2–N6 (Fig. 31e). Deng et al. and Zhu et al. both targeted the design and efficiency optimization of this system. Deng et al. discovered that Co2–N–C and eight other configurations exhibit superior ORR activity,309 surpassing Pt benchmarks, with Co–Ni–N–C showing a notable limiting potential of 0.88 V. Zhu et al., however, identified Cu–Fe and Ni–Cu as candidates.305 Nevertheless, both studies highlighted the pivotal role of geometric parameters: ML revealed the simple geometric distances between TM atoms and coordinated N atoms as key to enhancing ORR performance. The impact of electronic descriptors such as electron affinity and electronegativity is also consistent between the two studies.
First, for metal-free systems, Dan et al. focused on N-doped graphene,310 with their main emphasis on the electron transfer number. This value is crucial for determining whether ORR follows the two-electron or four-electron pathway, as previously introduced. By applying ML to correlate synthesis parameters and material characteristics with ORR performance, they discovered that synthesis time and N doping level are critical for optimizing the electrocatalytic efficiency of N-doped graphene materials. Jiang et al. investigated polymer hollow spheres (Fig. 32a), using ML to guide the choice of reactants such as dopamine, trioctylamine (TOA), and ammonia.71 Their method revealed that reaction time and the amounts of TOA and water were critical for the morphology of the spheres. The quantitative ML approach could successfully predict whether the product morphology would be solid or hollow, which could benefit the fine control of nano-synthesis. Xia et al. further built their ML models on 123 different metal-free carbon materials collected from 50 works,311 focusing on N content and surface area as critical descriptors for predicting the onset potential of ORR. Their application of materials informatics led to the identification of nitrogen-doped graphene nanomesh as an optimal substrate for anchoring iron phthalocyanine, culminating in the fabrication of the sample under ML guidance. This catalyst showed unprecedented electrocatalytic activity for ORR in alkaline conditions, with the most positive ORR peak at 0.87 V and an onset potential of 0.99 V, surpassing even commercial 20 wt% Pt/C.
Fig. 32 (a) Schematic diagram for guiding emulsion interfacial polymerization to prepare hollow spheres by ML (reproduced from ref. 71 with permission). (b) Adaptive learning in material design uses existing data and ML to correlate material properties with performance outcomes. By integrating uncertainty quantification and optimization, it guides the selection of new materials for testing to achieve specific targets and reduce model uncertainty. The highlighted approach prioritizes testing materials with greater predictive uncertainty, enhancing algorithmic performance and refining computational models with each iteration (reproduced from ref. 28 with permission).
As with OER, the acidic medium is more challenging for carbon-based materials. A group led by Zelenay investigated zeolitic imidazolate framework-8 (ZIF-8)-derived Fe–N–C, as such systems are regarded as some of the most promising candidates in acidic media. Their initial study312 focused on input features such as Fe precursor identity, content, and pyrolysis temperature. Through this approach, they found that GBDT and SVR models were most effective, leading to a 36% increase in measured mass activity. The importance analysis revealed the pyrolysis temperature as the most critical parameter influencing catalyst performance. Building upon this foundation, their subsequent work introduced an adaptive learning framework (Fig. 32b),28 enhancing the methodology by incorporating statistical inference and uncertainty quantification to navigate a six-dimensional search space efficiently. This approach identified four catalysts outperforming the original dataset, with the best catalyst showing a 33% improvement in ORR activity, specifically a mass activity of 16.3 mA mg−1. Ding et al. also explored the ZIF-8 system from a different angle, uncovering the often-overlooked significance of pyrolysis time, alongside pyridinic nitrogen species, as a decisive factor through ML analysis of comprehensive experimental datasets.313 Their approach, underpinned by data mining from 103 studies and a dataset of 225 entries, revealed that pyrolysis time, typically not varied in previous studies, plays a crucial role in catalytic performance. By integrating ML predictions with experimental validations, they demonstrated a volcano-like relationship between pyrolysis time and E1/2 at different pyrolysis temperatures, pinpointing an optimal pyrolysis time that led to a superior E1/2 of 0.82 V in acidic conditions for the best-performing catalyst.
Moreover, by combining characterization results with SHAP analysis, the authors revealed that the mechanism underlying this trend is the conversion of N species throughout the pyrolysis process, further demonstrating the potential of ML in electrocatalyst research.
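As a minimal sketch of the uncertainty-prioritized selection step summarized in Fig. 32b, the following Python example estimates predictive uncertainty with a bootstrap ensemble of simple linear fits and proposes the candidate whose prediction is least certain. The bootstrap ensemble is a stand-in chosen for brevity and is not the statistical inference method of ref. 28; the data values, the single input feature (pyrolysis temperature), and all function names are hypothetical.

```python
import random
import statistics


def fit_line(xs, ys):
    """Least-squares fit y = a + b*x for one feature; returns (a, b)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0.0:
        # Degenerate bootstrap sample (one repeated point): constant model.
        return my, 0.0
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - b * mx, b


def ensemble_stats(train, candidates, n_models=200, seed=0):
    """Mean and spread of bootstrap-ensemble predictions for each candidate."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        sample = [rng.choice(train) for _ in train]  # resample with replacement
        xs, ys = zip(*sample)
        models.append(fit_line(xs, ys))
    stats = []
    for x in candidates:
        preds = [a + b * x for a, b in models]
        stats.append((statistics.fmean(preds), statistics.pstdev(preds)))
    return stats


def most_uncertain(train, candidates):
    """Index of the candidate with the largest ensemble spread."""
    stats = ensemble_stats(train, candidates)
    return max(range(len(candidates)), key=lambda i: stats[i][1])


# Hypothetical data: (pyrolysis temperature in deg C, mass activity in mA/mg).
train = [(900.0, 10.1), (950.0, 11.0), (1000.0, 12.2), (1050.0, 12.0)]
candidates = [975.0, 1000.0, 1200.0]
next_index = most_uncertain(train, candidates)
print("next experiment at", candidates[next_index], "deg C")
```

In this toy setting the ensemble disagrees most for the candidate lying furthest outside the sampled temperature range, so that point is proposed next. In a full adaptive loop, the new measurement would be appended to the training data and the procedure repeated.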
Similar to the case of OER in Section 4.3, looking at ORR electrocatalysts from an MEA perspective is critical for real-world fuel cell applications. Due to the difference in reaction conditions, candidates that are validated theoretically or in experimental half-cells might not achieve the same performance in the MEA. From the chemical engineering perspective, the macro electrochemical performance does not depend solely on the intrinsic activity of electrocatalysts. The composition, preparation methods, and support type of the ORR electrocatalysts, as well as engineering parameters such as catalyst loading, solvent type, recipe, and thickness of the ion-conducting membrane, are coupled together.16 As a result, experimental and theoretical research on electrocatalysts alone often cannot achieve satisfactory results in fuel cell single cells. Noticing this point, Ding et al. leveraged ML to streamline the optimization of Pt-based MEAs for PEMFCs.315 Their comprehensive approach used a dataset constructed from 295 articles spanning 17 years, resulting in 918 entries with 66 initial features, and focused on identifying key parameters that influence MEA performance. Their feature importance analysis revealed that, compared to parameters related to nano- and micro-scale synthesis and electrocatalyst components, the engineering parameters of the MEA are more decisive for power density as the macro performance metric (Fig. 33a). In the next step of the ML workflow, they distilled 27 critical features (Fig. 33b) from the initial 66 to obtain regressors that could predict MEA power densities with less than 15% error (Fig. 33c). Moreover, the visualized DT and Apriori association rule mining found that for Pt-based catalysts in MEAs, the recommended carbon substrate mass fraction should be kept lower than 57.75 wt%.
This is not a good strategy for pursuing higher ORR activity in more idealized half-cell tests, but it is a practical approach to ensuring good macro electrochemical performance in MEAs. The authors also investigated MEAs based on non-precious metal (namely carbon-based TM–N–C) catalysts.70 First, they found consistent patterns showing that MEA engineering parameters are more decisive in the feature importance ranking (Fig. 33d). They also obtained several applicable design rules for carbon-based ORR electrocatalysts, specifically in MEAs. For example, because they increase the TM–Nx active site density, micropores are generally preferred for increasing the intrinsic ORR activity of TM–N–C-type carbon-based materials. However, through the visualized DT, the authors identified mesopores and macropores in child nodes, indicating a tradeoff between increasing intrinsic activity and ensuring enhanced mass transfer. Huo et al. further used the dataset collected in Ding et al.'s previous work70 for the carbon-based MEA system and introduced more advanced ML algorithms such as CNNs to increase the prediction accuracy for single-cell polarization curves.316 Their enhanced model can serve as an experimental surrogate model for guiding the optimization of TM–N–C carbon-based ORR electrocatalysts at a much lower cost in trial-and-error attempts.
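To illustrate how a decision tree can yield a design threshold of the kind reported above for the carbon substrate mass fraction, the following Python sketch finds the single split of a one-level regression tree (a "stump") that minimizes the squared error on either side. This is only an illustration of the splitting principle; the data are invented, the function names are hypothetical, and the sketch does not reproduce the models or the threshold of ref. 315.

```python
from statistics import fmean


def sse(values):
    """Sum of squared deviations from the mean of one leaf."""
    m = fmean(values)
    return sum((v - m) ** 2 for v in values)


def best_threshold(xs, ys):
    """Split point on x that minimizes the total within-leaf squared error."""
    pairs = sorted(zip(xs, ys))
    best_score, best_thr = float("inf"), None
    for i in range(1, len(pairs)):
        lo, hi = pairs[i - 1][0], pairs[i][0]
        if lo == hi:
            continue  # no valid split between identical feature values
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        score = sse(left) + sse(right)
        if score < best_score:
            best_score, best_thr = score, (lo + hi) / 2
    return best_thr


# Hypothetical data: carbon mass fraction (wt%) vs. peak power density (W/cm2).
carbon_wt = [20, 30, 40, 50, 60, 70, 80]
power = [0.90, 0.95, 0.92, 0.90, 0.60, 0.55, 0.50]
threshold = best_threshold(carbon_wt, power)
print("suggested upper bound on carbon fraction:", threshold, "wt%")
```

A full decision tree applies the same search recursively over all features, which is why its child nodes can be read directly as candidate design rules.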
Fig. 33 (a) Feature importance heuristic by XGBoost algorithm pre-feature selection, categorizing features into the microscopic properties of Pt-based nanocatalysts (black), preparation process parameters (blue), and single-cell device operating conditions plus MEA preparation (red). (b) Top: Feature importance after the selection. Bottom: Test set classification performance comparison before and after feature selection, illustrating the algorithm's efficiency in identifying and using key features for predictive accuracy. (c) Predictions output by the best performing ANN regressor on the test set (a–c are reproduced from ref. 315 with permission). (d) Feature importance heuristic from the XGBoost algorithm, with red features linked to PEMFC operating conditions and black features linked to non-precious metal electrocatalysts’ intrinsic properties (reproduced from ref. 70 with permission).
For carbon-based materials, typically TM and N-doped graphene, a different trend in descriptors is observed. Beyond bond length, which describes topological structure, and intrinsic physical atomic properties such as ionization energy and the number of d-electrons, studies have adopted unique features such as pyridinic nitrogen content, pyrrolic nitrogen content, and Brunauer–Emmett–Teller (BET) surface area. These features are not frequently used for carbon materials in the HER and OER sections. The reason for this difference is that, unlike HER/OER studies, which are typically based on DFT simulations, ORR studies of carbon-based electrocatalysts emphasize ML based on datasets derived from direct experimental synthesis and evaluation. This emphasis results in the use of techniques such as AL and BO, which are suitable for limited-data cases, and brings insights from a meso-macro perspective. The segmentation of nitrogen species and pore structures is dominant in the performance of carbon-based materials for ORR. Typical Pt-based metal/alloy electrocatalysts do not require a high surface area for the substrate carbon. For example, commercial Pt/C uses Vulcan XC-72 carbon black rather than BP2000. However, due to the intrinsic difference in active sites, carbon-based electrocatalysts focus on nano-engineering to increase surface area and the abundance of pyridinic (metallic)-type nitrogen species in order to enrich the density of ORR-active M–N–C sites. These species are important both for their intrinsic ORR activities and for their ability to host TM dopant atoms to form more effective M–N–C sites.317–319 The demand for achieving good ORR activity is further revealed in Fig. 34b by the emergence of synthesis parameters: reaction (hydrothermal and pyrolysis) time and pyrolysis temperature.
This trend is consistent with the focus on facet engineering in metal/alloy systems,320 while for carbon-based materials, the focus is on identifying better atomically dispersed defect doping structures or optimizing synthesis conditions to improve experimentally observed performance.
From a paradigm perspective, ML, particularly through supervised learning for surrogate model training, has advanced both computational simulations and experimental explorations, enabling the rapid discovery and optimization of novel electrocatalysts. Moreover, data mining and interpretative analysis of “black-box” models have offered deep insights into the physical and chemical attributes of these electrocatalysts, aiding in the identification of key descriptors and design parameters. The integration of ML marks a paradigm shift toward data-centric approaches in electrocatalyst design, significantly enhancing the pace of electrocatalyst discovery and the understanding of electrocatalytic processes. This shift has not only led to the prediction of catalytic performance and the discovery of novel electrocatalysts with unparalleled speed, but also highlighted the potential of ML in addressing the economic and sustainability challenges in hydrogen energy production. As we move forward, ML's ability to bridge the gap between computational predictions and experimental validations is poised to revolutionize electrocatalyst development for hydrogen energy conversion, promising more sustainable and energy-efficient solutions.
To overcome this, leveraging ML's capability as surrogate models for cross-scale and multi-fidelity simulations, for example from DFT to MD, might be a possible choice. Though it requires more dataset preparation cost, this approach enables comprehensive modeling of electrocatalytic processes, from atomic interactions to macro-scale fluid dynamics, significantly enhancing the accuracy and predictive power of simulations. This is already a hot topic in life sciences and can be applied to electrocatalysis.321 By integrating these scales, theoretical simulation-based ML can rapidly identify optimal configurations and conditions with higher fidelity to be validated by experiments.
To enable automated experimental investigation, robotic automated laboratories have been established, such as the autonomous mobile robot reported by Cooper et al.,322 which optimizes photocatalytic electrocatalysts using a Bayesian decision-making method. Brabec et al. reported a high-throughput autonomous decision and experimental platform for the rapid synthesis of ABO3-type perovskites for ML data analysis.323 Since then, similar reports of robotic autonomous electrocatalyst synthesis and evaluation platforms have gradually increased over the past two years324,325 (2022; 2023).
On the other hand, the advent of generative AI, for instance large language models (LLMs) such as ChatGPT, has made NLP work on scientific publications, which typically requires expertise in materials science and chemistry, much faster. For example, Yaghi et al. recently reported the use of ChatGPT to rapidly extract information from MOF-related publications,326 directly obtaining a large amount of tabular data from synthesis-related paragraphs to aid ML modeling and guide experiments. Recently, Dagdelen et al. leveraged LLMs such as GPT-3 and Llama-2 to perform joint named entity recognition and relation extraction tasks in materials science.327 By fine-tuning these models on annotated text passages, the study demonstrates how LLMs can extract complex, structured information about materials, such as dopants, host materials, and metal–organic frameworks, from scientific texts. The approach simplifies the creation of large, structured databases of specialized scientific knowledge, facilitating the advancement of materials discovery and design.
There are also more complex systems that combine all the above-mentioned ML models, based on different knowledge sources, together. Ceder et al. proposed A-Lab for the discovery of oxides in lithium-ion batteries.328 Their innovation lies in the push for high-throughput automated robotic experiments using a multi-decision framework. Their work employs multiple ML expert systems for different processes, enabling decisions based on DFT simulations and extensive scientific text mining to participate simultaneously in the active learning cycle of robotic synthesis. Applied to electrocatalysis, this would allow the autonomous discovery process to benefit from multiple data sources and expert system decisions drawing on domain knowledge (published literature), theoretical simulations (DFT), and local experimental data (e.g., from both manual and automated laboratories). The power of generative AI also extends to the development of more advanced and efficient generative models for theoretical molecular and material design. For instance, Daigavane et al. recently introduced Symphony,329 an E(3)-equivariant autoregressive model that uses higher-degree spherical harmonic projections to generate accurate 3D molecular geometries, outperforming existing models in capturing complex molecular symmetries. Similarly, Zeni et al. presented MatterGen,330 a diffusion-based generative model that not only produces stable, diverse inorganic materials across the periodic table but can also be fine-tuned to meet specific property constraints, such as symmetry or magnetic density. These advancements in generative AI highlight its potential to significantly surpass traditional surrogate ML models, enabling faster and more autonomous discovery of electrocatalysts by integrating diverse data sources and experts.
As discussed in this review, researchers in hydrogen electrocatalysts have experimented with manually crafted high-throughput electrolysis cells,269 co-sputtering deposition,293 and automated platforms28 for weighing, dispensing, and shaking. However, these approaches are not yet mainstream due to the lengthy synthesis routes and high costs associated with electrocatalyst evaluation. While high-throughput synthesis instruments are not widely adopted in electrocatalysis due to their high costs, more accessible alternatives can be explored. For instance, inkjet printers, popular in the sensor field and easily programmable, can be adapted for high-throughput catalyst preparation.331,332 Studies have demonstrated that integrating ML with these systems can facilitate the design of flexible electronics, and similar methodologies can be applied to hydrogen electrocatalysts. By using cost-effective, readily available devices like inkjet printers, researchers can potentially achieve rapid, data-driven discovery and optimization in electrocatalyst development.
The first viable approach is to employ DFT data for early rapid screening and subsequently use targeted experimental data to identify potential candidates, allowing researchers to achieve focused optimization of electrocatalysts. This phased method, using data of varying fidelity, mirrors the funnel approach of the engineering design process: starting broadly and then narrowing down to specific details. Researchers can borrow ideas from multi-fidelity ML, regarding DFT as cheap low-fidelity data and experimental observation as expensive high-fidelity data. By accounting for cost in the query budget (using a chosen indicator to quantify computational and experimental cost), a mature multi-fidelity active learning workflow can then be used to carry out efficient optimization.333
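The cost-aware element of this idea can be sketched as a greedy query planner that ranks possible DFT and experimental queries by expected gain per unit cost and selects them until a budget is exhausted. The sketch below is a deliberately simplified, static version: a real multi-fidelity active learning workflow would retrain the model and rescore the queries after each result. The `Query` class, the gain scores, and the cost units are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Query:
    candidate: str        # catalyst label (hypothetical)
    fidelity: str         # "dft" (cheap) or "experiment" (expensive)
    expected_gain: float  # acquisition score from a model (assumed given)
    cost: int             # cost in arbitrary budget units (assumed)


def plan_queries(queries, budget):
    """Greedily pick affordable queries with the best gain per unit cost."""
    ranked = sorted(queries, key=lambda q: q.expected_gain / q.cost, reverse=True)
    chosen, remaining = [], budget
    for q in ranked:
        if q.cost <= remaining:
            chosen.append(q)
            remaining -= q.cost
    return chosen, remaining


# Hypothetical pool: each candidate can be queried at either fidelity.
pool = [
    Query("A", "dft", 0.2, 1),
    Query("A", "experiment", 1.0, 20),
    Query("B", "dft", 0.5, 1),
    Query("B", "experiment", 1.5, 20),
    Query("C", "dft", 0.1, 1),
]
chosen, remaining = plan_queries(pool, budget=23)
for q in chosen:
    print(q.candidate, q.fidelity, q.cost)
```

With these invented numbers the planner spends the first units on inexpensive DFT queries for all three candidates and reserves the single affordable experiment for the most promising one, which reflects the funnel logic described above.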
Another approach is to use techniques such as transfer learning, which allow insights and knowledge to be shared across datasets and their corresponding systems, thereby reducing the cost of training data. For example, within the same system, data of different fidelity levels, typically from DFT simulations and actual experiments, can be leveraged to capitalize on their respective strengths within the same electrocatalyst discovery process. Drawing inspiration from the fields of NLP and computer vision, a potential strategy could involve using high-throughput DFT data to train initial ML models, followed by fine-tuning these models with a selected set of costly experimental data. This approach not only uses resources efficiently, but also helps models reflect real-world experimental conditions more accurately. Furthermore, for similar electrocatalyst systems, we should also explore the possibility of transferring knowledge between them. For instance, recent work by Ding et al., based on automated text mining, has shown that models trained on alkaline and acidic HER/OER publication data can achieve favorable results after fine-tuning on a small set of neutral HER/OER publication data.334
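The pretrain-then-fine-tune strategy can be illustrated with a deliberately small Python sketch. A one-feature linear model is first trained by gradient descent on abundant synthetic "DFT-like" data and is then fine-tuned for a few steps on three synthetic "experimental" points that share the same trend but carry a systematic offset. All data, the offset, the learning rate, and the function names are assumptions made for illustration; real applications would use richer models and noisy data.

```python
def mse(w, b, data):
    """Mean squared error of the linear model y = w*x + b."""
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)


def train(data, w=0.0, b=0.0, lr=0.5, steps=500):
    """Plain gradient descent on the mean squared error."""
    n = len(data)
    for _ in range(steps):
        gw = sum(2 * (w * x + b - y) * x for x, y in data) / n
        gb = sum(2 * (w * x + b - y) for x, y in data) / n
        w, b = w - lr * gw, b - lr * gb
    return w, b


# Synthetic low-fidelity data: many cheap points following y = 2x + 0.5.
dft_like = [(i / 20, 2.0 * (i / 20) + 0.5) for i in range(21)]
# Synthetic high-fidelity data: few costly points following y = 2x + 0.8.
experimental = [(0.2, 1.2), (0.5, 1.8), (0.8, 2.4)]

pretrained = train(dft_like)                      # learn the trend cheaply
fine_tuned = train(experimental, *pretrained, steps=20)
from_scratch = train(experimental, steps=20)      # same budget, no pretraining

print("pretrained error on experiments:  ", mse(*pretrained, experimental))
print("fine-tuned error on experiments:  ", mse(*fine_tuned, experimental))
print("from-scratch error on experiments:", mse(*from_scratch, experimental))
```

In this toy case the fine-tuned model reaches a lower error on the experimental points than either the pretrained model alone or a model trained from scratch with the same small number of updates, because the slope learned from the low-fidelity data transfers and only the offset needs correcting.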
To address these challenges, there is a need to shift from relying solely on black-box ML models to adopting more transparent white-box or grey-box models. These models not only provide greater interpretability but also enhance the reliability and acceptance of ML predictions in practical applications, reducing risks associated with model deployment and facilitating deeper insights into electrocatalytic processes. Such models integrate fundamental physical and chemical principles, or other types of domain knowledge, with data science. Moreover, enhancing interpretability is not only a matter of reducing risks but also vital for mining deeper insights into the mechanisms underlying electrocatalytic processes. Here we present some practical strategies for white-box models. Incorporating domain knowledge involves embedding physical and chemical laws directly into the ML models, which constrains predictions to be physically plausible by using known reaction mechanisms or material properties as part of the model input. Model simplification can also enhance interpretability by using simpler models such as linear regression, decision trees, or SISSO, especially when these models effectively capture the essential relationships within the data. Additionally, hybrid models that combine ML models with mechanistic models can leverage the strengths of both approaches; for example, using ML to predict parameters in a mechanistic model can provide interpretable and reliable results.
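The hybrid strategy mentioned last can be sketched as follows: a simple data-driven model predicts one physically meaningful parameter, and a mechanistic equation turns that parameter into the quantity of interest. Here a linear fit predicts the logarithm of the exchange current density from a single descriptor, and the Tafel relation, η = b·log10(j/j0), converts it into an overpotential. The choice of descriptor, the training values, the fixed Tafel slope, and the function names are all hypothetical and serve only to show the structure of such a model.

```python
import math


def fit_linear(xs, ys):
    """Least-squares line y = slope*x + intercept (the data-driven part)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    return slope, my - slope * mx


def tafel_overpotential(j, j0, tafel_slope):
    """Mechanistic part: Tafel relation eta = b * log10(j / j0), in volts."""
    return tafel_slope * math.log10(j / j0)


# Hypothetical training data: descriptor (eV) vs. log10 of j0 (A/cm2).
descriptor = [0.0, 0.1, 0.2, 0.3]
log10_j0 = [-6.0, -6.5, -7.0, -7.5]
slope, intercept = fit_linear(descriptor, log10_j0)


def predict_overpotential(x, j=1e-2, tafel_slope=0.06):
    """Predict j0 from the descriptor, then apply the Tafel equation."""
    j0 = 10 ** (slope * x + intercept)
    return tafel_overpotential(j, j0, tafel_slope)


eta = predict_overpotential(0.15)
print(f"predicted overpotential: {eta:.3f} V")
```

Because the learned quantity is the exchange current density rather than the final output, each intermediate value keeps a physical meaning and can be checked against independent measurements.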
A significant oversight in current studies is the lack of consideration for MEA requirements for electrocatalysts at the engineering level. The gap between the idealized conditions often represented in DFT simulations and the complex, real-world operational environments of electrolyzers and fuel cells is substantial. Future research must prioritize high-throughput experimental approaches that focus on device-level synthesis and testing. Such high-throughput experiments are instrumental in facilitating the practical deployment of ML-optimized electrocatalysts, making the leap from theoretical models to tangible, operational systems.
To address these challenges effectively, collaboration between academia and industry (including national labs) is essential. Industry partners provide real-world needs and support, along with practical problems that need solutions, which can guide academic research toward more applicable solutions. Specific collaborative initiatives could include joint research projects, shared datasets, and industry-sponsored research programs. Successful collaborations in related fields, such as the development of ML models for drug discovery, provide a blueprint for similar efforts in electrocatalysis. For instance, AlphaFold is a successful collaborative outcome by Google DeepMind and European Molecular Biology Laboratory. Recommendations for future collaborative efforts should focus on the transition from high-throughput materials screening to real-time applications, ensuring that ML-optimized electrocatalysts are not only developed efficiently but also deployed effectively in practical settings. The benefits of these collaborations are manifold. Accelerated innovation, resource sharing, and the practical deployment of research outcomes are just a few. By working together, academia and industry can leverage their respective strengths to overcome the economic and sustainability challenges in hydrogen energy production. This collaborative approach will enable the rapid advancement of ML applications in electrocatalysis, driving the development of more efficient and sustainable hydrogen energy conversion technologies.
Footnote
† Electronic supplementary information (ESI) available. See DOI: https://doi.org/10.1039/d4cs00844h
This journal is © The Royal Society of Chemistry 2024