Experimentally validated machine learning predictions of ultralow thermal conductivity for SnSe materials†
Abstract
Machine-learning (ML) models are used to predict optimal thermoelectric properties for efficient thermoelectric devices. Often, ML models rely on available databases or published sources whose data may be inconsistent. Herein, we report a boosting ML model – eXtreme gradient boosting (XGBoost) – built from our own lab-generated data with weighted element-to-chemical property features, which predicts the ultralow (<1 W m−1 K−1) total thermal conductivity (κ) of p- and n-type doped bulk SnSe materials prior to synthesis. On the validation set, using fivefold cross-validation, the model achieved a coefficient of determination (R²) of 0.94, a root-mean-square error (RMSE) in κ of 0.05 W m−1 K−1 and a mean absolute error (MAE) of 0.04 W m−1 K−1. The model accurately predicted the thermal conductivity of the system it was trained on, i.e., the Na–Ag–Sn–Se series. The κ values for Na0.033Ag0.015–0.016Sn0.963–0.961Se were predicted to be 0.54 W m−1 K−1 on average and experimentally found to be 0.55 W m−1 K−1. The model also successfully discriminated between compositions within the series at low temperatures, with Na0.033Ag0.015Sn0.961Se predicted to have κ = 0.85 W m−1 K−1 and measured to have κ = 0.80 (±0.04) W m−1 K−1, and similarly, Na0.033Ag0.016Sn0.963Se with a predicted κ = 1.06 W m−1 K−1 and a measured κ = 0.98 (±0.05) W m−1 K−1. We pushed the model to its limits by predicting the κ values of Cl-doped SnSe, although the training set contained no κ values for Cl-containing samples. The predicted and measured values and trends were in good agreement, as reflected in the RMSE and MAE achieved by the XGBoost model on this new experimental test dataset, and on average the predictions were within 9% of the experimentally determined κ values.
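The error metrics named above can be illustrated with a minimal Python sketch. It assumes the standard textbook definitions of RMSE, MAE and R², and it uses only the three predicted/measured κ pairs quoted in this abstract. The function names and the mean absolute percentage deviation helper are illustrative choices, not the authors' code. The resulting numbers describe these three pairs only and are not the cross-validation metrics reported above.

```python
import math

# (predicted, measured) kappa pairs in W m-1 K-1, copied from the abstract.
# Assumption: treating them as one small sample is for illustration only.
pairs = [
    (0.54, 0.55),  # Na-Ag-Sn-Se series average
    (0.85, 0.80),  # Na0.033Ag0.015Sn0.961Se, low temperature
    (1.06, 0.98),  # Na0.033Ag0.016Sn0.963Se, low temperature
]


def rmse(data):
    """Root-mean-square error between predicted and measured values."""
    return math.sqrt(sum((p - m) ** 2 for p, m in data) / len(data))


def mae(data):
    """Mean absolute error between predicted and measured values."""
    return sum(abs(p - m) for p, m in data) / len(data)


def r_squared(data):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_measured = sum(m for _, m in data) / len(data)
    ss_res = sum((m - p) ** 2 for p, m in data)
    ss_tot = sum((m - mean_measured) ** 2 for _, m in data)
    return 1.0 - ss_res / ss_tot


def mean_abs_pct_dev(data):
    """Mean absolute deviation of predictions, in percent of the measurement."""
    return 100.0 * sum(abs(p - m) / m for p, m in data) / len(data)


if __name__ == "__main__":
    print(f"RMSE = {rmse(pairs):.3f} W m-1 K-1")
    print(f"MAE  = {mae(pairs):.3f} W m-1 K-1")
    print(f"R^2  = {r_squared(pairs):.3f}")
    print(f"Mean absolute deviation = {mean_abs_pct_dev(pairs):.1f} %")
```

The same functions could be applied to each held-out fold of a fivefold cross-validation and averaged, which is the usual way such validation metrics are obtained.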