Predictive machine learning models trained on experimental datasets for electrochemical nitrogen reduction†
Abstract
Obtaining useful insights from machine learning models trained on experimental datasets collected across different groups to improve the sustainability of chemical processes can be challenging because such datasets are small and heterogeneous. Here we show that shallow learning models such as decision trees and random forests can be effective tools for guiding experimental research in the sustainable chemistry field. We trained four different machine learning algorithms (linear regression, decision tree, random forest, and multilayer perceptron) on datasets of different sizes containing up to 520 unique reaction conditions for the nitrogen reduction reaction (NRR) on heterogeneous electrocatalysts. Using the catalyst properties and experimental conditions as features, we determined the ability of each model to regress the ammonia production rate and the faradaic efficiency. We observed that the shallow learning decision tree and random forest models had equal or better predictive power compared to the deep learning multilayer perceptron models and the simple linear regression models. Moreover, decision tree and random forest models enable the extraction of feature importance, which is a powerful tool for guiding experimental research. Analysis of the models revealed a complex interaction between the applied potential and the catalyst in determining the effective rate of the NRR. We also suggest some underexplored catalyst–electrolyte combinations to experimental researchers looking to improve both the rate and the efficiency of the NRR.
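The model comparison summarized above can be illustrated with a minimal sketch, assuming scikit-learn and NumPy are available. It is not the pipeline used in this study: the data are synthetic, the feature names (`potential`, `cat_A`, `cat_B`, `cat_C`), the target function, and all hyperparameters are hypothetical choices made only for illustration. The sketch fits the four model types to one regression target and reads the feature importances from the fitted random forest.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in data (assumption): 520 rows echoes the largest dataset
# size quoted in the abstract, but none of these values are experimental.
rng = np.random.default_rng(0)
n = 520
potential = rng.uniform(-1.0, 0.0, n)  # hypothetical applied potential, V
catalyst = rng.integers(0, 3, n)       # hypothetical catalyst class (0, 1, 2)

# Hypothetical target: each catalyst class has its own optimum potential,
# so the rate depends on a potential-catalyst interaction.
peak = np.array([-0.2, -0.5, -0.8])[catalyst]
rate = np.exp(-((potential - peak) ** 2) / 0.02) + rng.normal(0.0, 0.05, n)

# Feature matrix: potential plus a one-hot encoding of the catalyst class.
X = np.column_stack([potential, np.eye(3)[catalyst]])
feature_names = ["potential", "cat_A", "cat_B", "cat_C"]

X_train, X_test, y_train, y_test = train_test_split(
    X, rate, test_size=0.25, random_state=0
)

# The four model families named in the abstract; hyperparameters are
# illustrative defaults, not the settings used in the study.
models = {
    "linear": LinearRegression(),
    "decision_tree": DecisionTreeRegressor(random_state=0),
    "random_forest": RandomForestRegressor(n_estimators=200, random_state=0),
    "mlp": MLPRegressor(hidden_layer_sizes=(32, 32), max_iter=2000, random_state=0),
}

# Fit each model and score it on the held-out split.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = r2_score(y_test, model.predict(X_test))

# Impurity-based feature importances from the fitted random forest.
importances = dict(zip(feature_names, models["random_forest"].feature_importances_))

for name, score in scores.items():
    print(f"{name}: test R^2 = {score:.3f}")
print(importances)
```

Because the synthetic target is built around a potential-catalyst interaction, the tree-based models should fit it more easily than the linear model. That outcome reflects how the toy data were constructed and should not be read as evidence for the results reported here.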