Darik A. Rosser,ab Brianna R. Farrisab and Kevin C. Leonard*ab
aCenter for Environmentally Beneficial Catalysis, The University of Kansas, Lawrence, KS, USA
bDepartment of Chemical & Petroleum Engineering, The University of Kansas, Lawrence, KS, USA. E-mail: kcleonard@ku.edu; Tel: +1 785-864-1437
First published on 1st December 2023
Obtaining useful insights from machine learning models trained on experimental datasets collected across different groups to improve the sustainability of chemical processes can be challenging due to the small size and heterogeneity of such datasets. Here we show that shallow learning models such as decision trees and random forest algorithms can be an effective tool for guiding experimental research in the sustainable chemistry field. This study trained four different machine learning algorithms (linear regression, decision tree, random forest, and multilayer perceptron) using different-sized datasets containing up to 520 unique reaction conditions for the nitrogen reduction reaction (NRR) on heterogeneous electrocatalysts. Using the catalyst properties and experimental conditions as the features, we determined the ability of each model to regress the ammonia production rate and the faradaic efficiency. We observed that the shallow learning decision tree and random forest models had equal or better predictive power compared to the deep learning multilayer perceptron models and the simple linear regression models. Moreover, decision tree and random forest models enable the extraction of feature importance, which is a powerful tool in guiding experimental research. Analysis of the models showed the complex interaction between the applied potential and catalysts on the effective rate for the NRR. We also suggest some underexplored catalyst–electrolyte combinations to experimental researchers looking to improve both the rate and efficiency of the NRR.
Machine learning is an essential tool for accelerating chemical research because it deconvolutes trends in higher-dimensional spaces.1 The rise of machine learning-related publications in the fields of catalysis and sustainable chemistry indicates an eagerness to adopt new methods to predict catalyst performance.2–8 Moreover, the free availability of ready-to-use machine learning packages has made machine learning prevalent in varied fields, including medicine,9 material science,10 energy,11 food science12,13 and engineering.14,15 Despite this surge, large datasets still must be generated to train complex machine-learning algorithms. In the chemical field, this is being done by populating datasets with density functional theory calculations to augment the search for new catalysts.15–19 However, training machine learning models on experimental data is very challenging because a central repository of experimental data does not exist. Currently, researchers are dependent on the large, non-uniform body of experimental work documented in the archival literature to train machine-learning algorithms.
Despite the challenges, the application of machine learning to experimental data sets is an exciting and growing field. Recently our group explored machine learning on the electrocatalytic reduction of CO2.20 This work focused on classification algorithms to predict product selectivity and determine feature importance; however, regression of reaction efficiency and/or rate has yet to be fully explored. One reaction pathway where regression machine learning models should be explored is the electrocatalytic nitrogen reduction reaction (NRR) to ammonia. The NRR is seen as a popular route for enabling the electrification of the ammonia industry and for utilizing water-derived hydrogen instead of fossil fuel-derived hydrogen.21 This interest has created a well-developed field of NRR research and provided some data for training machine learning models.
The electrocatalytic reduction of nitrogen into ammonia is particularly challenging for several reasons. First, the thermodynamic standard reduction potential is close to that of proton reduction to hydrogen, creating significant competition between the NRR and the hydrogen evolution reaction (HER).22 Moreover, the NRR may go through either a dissociative or associative mechanism that requires at least six proton-coupled electron transfer steps, which typically keeps efficiencies low. Thus, the catalyst, electrolyte, and applied potential are all variables that can have convoluted effects on both the rate and the efficiency. Ultimately, the low faradaic efficiencies and small reaction rates typical of NRR leave researchers uncertain about what direction to take research next.21
Our objective was to determine what insights off-the-shelf machine learning algorithms trained on experimental datasets reveal about the NRR, how these tools may be used in other fields, and how the accuracy scales with data availability. We also set out both to train a highly accurate model and to describe how machine-learning algorithms compare to the most basic regression algorithm, linear regression. Specifically, we assessed how off-the-shelf shallow learning and deep learning algorithms trained on experimental data amassed from a wide range of groups and experimental conditions could predict the faradaic efficiency or the ammonia production rate when given the reactor operating conditions. Even though these are highly convoluted problems with relatively small and diverse datasets, shallow learning algorithms could achieve coefficients of determination (R2) greater than 0.9. The shallow learning decision tree and random forest models performed as well as the deep learning multi-layer perceptron models, which makes it easier for experimental researchers to apply shallow models to adjacent fields. Decision tree and random forest models also discerned more complicated patterns with respect to feature importance, a step toward improving electrocatalytic NRR research.
Model training was performed by splitting the remaining data with 80% in the training set and 20% in the testing set to produce single pass R2 scores. All MLPs were tuned and trained using 50 epochs. All models were trained using a mean squared error loss function. To further ensure the accuracy scores for the machine learning algorithms, 5-fold cross-validation was performed. The scores are calculated for each stratified testing set, and an average is taken to give the cross-validation score. Feature importance from the random forest regression models was calculated using Scikit-Learn's built-in function based on mean decrease in impurity.
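The workflow above (80/20 split for single-pass R2, 5-fold cross-validation, and impurity-based feature importance) can be sketched as follows. This is a minimal illustration using synthetic stand-in data, not the curated NRR dataset; the feature matrix, target, and hyperparameter values here are placeholders.

```python
# Minimal sketch of the training/evaluation pipeline: 80/20 split,
# 5-fold cross-validation, and mean-decrease-in-impurity importances.
# The data below is synthetic and only illustrative.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split, cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(520, 8))                         # encoded features (placeholder)
y = 0.5 * X[:, 0] + rng.normal(scale=0.1, size=520)   # stand-in target (e.g. efficiency)

# 80/20 split for the single-pass R2 score
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
model = RandomForestRegressor(n_estimators=10, max_depth=18, random_state=0)
model.fit(X_train, y_train)
print("single-pass R2:", model.score(X_test, y_test))

# 5-fold cross-validation, averaged to give the cross-validation score
cv_scores = cross_val_score(model, X, y, cv=5, scoring="r2")
print("mean CV R2:", cv_scores.mean())

# Feature importance via mean decrease in impurity (built into Scikit-Learn)
print("importances:", model.feature_importances_)
```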
The input features were one-hot encoded using Scikit-Learn's built-in encoders so that all models are trained on comparable datasets. For the linear regression, decision tree and random forest models, the label encoded results are also presented in the ESI, section S5.†
All source code for this study can be found in the ESI.†
Using our human-curated data set (Fig. 1), regression analyses were performed targeting either the faradaic efficiency or the NRR standardized rate. The efficiency decision tree had 197 nodes and a depth of 12, while the rate decision tree had 161 nodes and a depth of 12. The efficiency and rate random forest regressors both had max depths of 18 with 10 estimators for a total of 2734 and 4962 tunable parameters, respectively. The efficiency neural network used 97 nodes through 6 layers with relu activation functions and 3215 tunable parameters, while the rate neural network used 25 nodes through 3 layers with relu activation functions and 1276 tunable parameters. The final layer of both neural networks had 1 node with a linear activation function. The number of tunable parameters for each model should be compared to the size of the training set to show that it is reasonable in the context of the problem. The training set used 333 data points, and the largest model, the random forest predicting rate, used 4962 tunable parameters, approximately 15 per training data point.
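The parameter-versus-training-size check described above can be made concrete with a small helper that counts weights and biases in a dense network. The layer widths below are assumptions for illustration, not the exact architectures used in this study.

```python
# Count tunable parameters (weights + biases) in a fully connected network,
# to compare against the number of training points. Layer widths are
# illustrative assumptions, not the actual model architectures.
def mlp_param_count(layer_sizes):
    """Sum (n_in + 1) * n_out over consecutive layer pairs (+1 for the bias)."""
    return sum((n_in + 1) * n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))

# e.g. 40 encoded input features, two hidden layers of 25, one linear output
params = mlp_param_count([40, 25, 25, 1])
n_train = 333
print(params, "tunable parameters for", n_train, "training points")
print("about", round(params / n_train, 1), "parameters per training point")
```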
Additionally, the averaged single pass testing and training R2 scores were analyzed to interpret the degree of over-fitting in each model, presented in ESI, Fig. S3b.† The most over-fit model was the MLP predicting the faradaic efficiency, with a 0.145 difference between training and testing coefficients of determination, while the average difference across models was 0.107, an indication of some degree of over-fitting, especially in the decision trees and MLPs.
Visualization of the regression model is shown in Fig. 3, which depicts a plot of actual versus predicted values. This type of data visualization can be used to identify outliers and poorly predicted points in the testing set. For a perfectly accurate prediction, all points would fall on a diagonal line with a slope of 1. Points close to the diagonal line are predicted better than points far from it. Points above the diagonal correspond to over-predictions, while points below correspond to under-predictions. Outliers were determined via the z-score of the absolute difference between predicted and actual values with a threshold of three standard deviations.
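A minimal sketch of this outlier screen, assuming synthetic predicted and actual values (the arrays below are placeholders, not data from the study):

```python
# Flag outliers as points whose absolute prediction error has a z-score
# above three standard deviations. Values are synthetic placeholders.
import numpy as np

actual = np.full(20, 0.10)
actual[4] = 0.35              # one genuinely high-efficiency point
predicted = np.full(20, 0.11) # model predicts ~0.11 everywhere

err = np.abs(predicted - actual)
z = (err - err.mean()) / err.std()
outliers = np.where(z > 3)[0]
print("outlier indices:", outliers)
```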
Interestingly, the same outlier exists in all models, where the models under-predict the efficiency at roughly one third of its measured value. The outlying point had a faradaic efficiency of 0.35, while the random forest model predicted a faradaic efficiency of 0.11. The outlier is an experiment on Au single atom catalysts at −0.2 V vs. NHE in aqueous Na2SO4 electrolyte.23 In the dataset, there is an identical experiment reported with an efficiency of 0.09, except that the Au catalyst is not atomically dispersed. In this case, the only difference is the microstructure, demonstrating how strongly microstructure can affect activity. This is an example of how machine learning can identify interesting experimental outliers that might be overlooked otherwise and help guide experimental research.
Intuitively, one may expect that the catalyst would be the most important feature for driving either the NRR rate or efficiency. However, the feature importance analysis showed support was the most important general feature, demonstrating the importance of catalyst–support interactions in the NRR. In addition, the applied potential was more important than the catalyst used for both parameters. This is an important finding because it provides guidance to experimental researchers in the NRR area to perform bulk electrolysis and rate determination measurements across a wide range of potentials to maximize the rate and efficiency performance. Fig. 4 also reflects trends researchers expect from NRR. Potential, element, and electrolyte are important to the random forest, mirroring the known HER competition at more negative potentials and adsorptive competition with protons from the electrolyte.21,24 Microstructure and cell type were important for predicting rate which corresponds to the known challenge of nitrogen diffusion and turnover.21,24
Comparing label encoded and one-hot encoded feature importances shows that the catalyst feature had different fractional importance. Specific catalysts were highly important for predicting both rate and efficiency; however, random forests trained on label encoded datasets neglect the catalyst feature. Label encoding gave the catalysts arbitrarily arranged integer values, essentially erasing the physical meaning of the value the random forests would read. In contrast, one-hot encoding explicitly notes whether or not a catalyst was used in a given data point. Thus one-hot encoding may preserve physically important information that label encoding blurs. This highlights how important it is for a researcher applying machine learning models to catalysis to select physically relevant features and to understand how those features will be enumerated.
From Fig. 5, FeMo and lithium trifluoromethanesulfonate (LiOTf) are the catalyst and electrolyte with the highest interquartile ranges for predicting NRR efficiency and could make a more efficient system when combined (Fig. 5a and b). To the best of our knowledge, FeMo catalysts and LiOTf electrolytes have not been explored together. Additionally, C3N4 supports and nanofiber microstructures have the highest interquartile ranges for predicting the rate (Fig. 5c and d) and have been previously explored together.25 To the best of our knowledge, the combination of an FeMo catalyst supported on C3N4 with a nanofiber structure in LiOTf electrolyte has not been explored, and may yield both more efficient and higher-yielding catalysts.
Finally, analysis of the feature importance from one-hot encoded random forests showed that Au, CoFe, FeMo, VFe, CrN, B4C, Fe2O3 and Pt have an impact on the measured rate. The important catalysts were then highlighted on a plot of rate vs. potential as shown in Fig. 6. The clustering of important catalysts introduces new information to the researcher. For example, investigating the Au clustering at different potentials indicates that atomically dispersed gold on carbon support (Au/C) catalysts have higher NRR rates at less negative potentials compared to poly-crystalline Au catalysts. This analysis also shows that Co catalysts under-perform Au catalysts in the same potential range (−0.75 to −1.25 V vs. NHE), except for CoFe, which performs highly. These insights promote research into Au and CoFe based catalysts for electrocatalytic ammonia production.
Moreover, these shallow learning models can provide additional insights, including the most important features of the dataset, via analysis of the random forest models and the branch decision-making with decision trees. For the experimental researcher, understanding which experimental parameters have the largest effect is actually more important than the predictive power of a model that does not give feature importance. Our analysis uncovered specific combinations of applied potential in conjunction with the catalyst used to improve the rate of the NRR. Moreover, our analysis showed that the combination of an FeMo catalyst supported on C3N4 with nanofiber structures in LiOTf electrolyte has not been fully explored, which may be a good avenue for experimentalists in this area to develop.
Footnote
† Electronic supplementary information (ESI) available. See DOI: https://doi.org/10.1039/d3dd00151b
This journal is © The Royal Society of Chemistry 2024