A data-driven XRD analysis protocol for phase identification and phase-fraction prediction of multiphase inorganic compounds
Abstract
Deep learning (DL) models trained on synthetic XRD data have succeeded at phase identification, but they have not yet delivered a satisfactory quantitative XRD analysis, that is, an exact prediction of the constituent-phase fractions in unknown multiphase inorganic compounds. Here, we report a novel data-driven XRD analysis protocol that uses a convolutional neural network (CNN) for exact phase identification and other machine learning (ML) techniques for accurate phase-fraction prediction. The key concept behind this reliable, pragmatic protocol is to train on a large amount of cheap synthetic data and to test on a small amount of expensive real-world experimental data. The protocol was applied to the Li–La–Zr–O quaternary compositional system, which involves 218 ICSD-registered inorganic compounds, some of which are known solid electrolyte materials. On synthetic test data, the protocol achieved an accuracy of 96.47% for phase identification, and a mean squared error (MSE) of 0.0018 and an R² of 0.9685 for phase-fraction regression. On real-world experimental data, it achieved a phase-identification accuracy of 91.11% and a phase-fraction regression MSE of 0.0024 with an R² of 0.9587.
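For readers less familiar with the two regression metrics quoted above, the following is a minimal sketch of how an MSE and an R² could be computed for predicted phase fractions. It assumes the standard definitions (mean of squared residuals, and the coefficient of determination); the abstract does not state how the values were aggregated over phases or samples, so the flat lists here are an assumption. The function names and the numbers in `true_fractions` and `pred_fractions` are hypothetical and are not data from this work.

```python
# Illustrative only: standard MSE and R^2 for phase-fraction regression.
# The values below are made-up placeholders, not results from the paper.

def mse(y_true, y_pred):
    """Mean squared error between true and predicted phase fractions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical phase fractions (each between 0 and 1) for four test cases.
true_fractions = [0.20, 0.50, 0.30, 0.70]
pred_fractions = [0.25, 0.45, 0.30, 0.65]

print(f"MSE = {mse(true_fractions, pred_fractions):.6f}")
print(f"R^2 = {r2(true_fractions, pred_fractions):.6f}")
```

Because phase fractions lie between 0 and 1, an MSE on the order of 0.002 corresponds to a typical absolute error of a few percentage points in the predicted fraction.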