Understanding the importance of individual samples and their effects on materials data using explainable artificial intelligence†
Abstract
Explaining the influence of data instances (materials) on predictions, such as structure/property relationships in materials informatics, can complement structural feature importance profiling and guide data generation, cleaning, and verification. In this paper, we combine explainable artificial intelligence (XAI) and influence statistics to value the contribution of individual materials to the prediction of diffusion energy barriers in dilute solvents, the formation energy of perovskites, and the glass transition temperature of metallic glasses. In each case, we find that materials containing certain chemical elements negatively impact the performance of machine learning models and warrant removal, while others contribute differently to the prediction errors and warrant further investigation. Our general approach can be applied to any structured materials dataset to provide a similar forensic analysis.
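To make the idea of valuing individual samples concrete, the following is a minimal sketch of one simple influence measure, leave-one-out valuation on a held-out set. It is an illustration under stated assumptions, not the specific XAI or influence statistics used in this work: the linear model, the synthetic data, the deliberately corrupted label, and the helper names `fit_linear`, `mse`, and `loo_values` are all hypothetical.

```python
import numpy as np

def fit_linear(X, y):
    """Ordinary least squares with an intercept (stand-in for any ML model)."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def mse(coef, X, y):
    """Mean squared error of the fitted linear model on (X, y)."""
    A = np.column_stack([np.ones(len(X)), X])
    return float(np.mean((A @ coef - y) ** 2))

def loo_values(X_train, y_train, X_val, y_val):
    """Leave-one-out value of each training sample.

    value[i] = validation MSE without sample i - validation MSE with all samples.
    A negative value means the error drops when sample i is removed,
    i.e. the sample harms the model.
    """
    base = mse(fit_linear(X_train, y_train), X_val, y_val)
    values = np.empty(len(X_train))
    for i in range(len(X_train)):
        keep = np.arange(len(X_train)) != i
        coef = fit_linear(X_train[keep], y_train[keep])
        values[i] = mse(coef, X_val, y_val) - base
    return values

# Synthetic data (assumption): 2 features, linear target, small noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 2))
y = X @ np.array([1.5, -2.0]) + 0.1 * rng.normal(size=40)
X_train, y_train = X[:30].copy(), y[:30].copy()
X_val, y_val = X[30:], y[30:]

# Corrupt one training label to mimic a problematic material.
y_train[7] += 10.0

values = loo_values(X_train, y_train, X_val, y_val)
most_harmful = int(np.argmin(values))  # sample whose removal helps most
```

In this toy setting, the sample with the most negative value is a candidate for removal or closer inspection; samples with values near zero contribute little either way.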