Machine learning powered by principal component descriptors as the key for sorted structural fit of XANES†
Abstract
Modern synchrotron radiation sources and free-electron lasers have made X-ray absorption spectroscopy (XAS) an analytical tool for the structural analysis of materials under in situ or operando conditions. A Fourier approach applied to the extended region of the XAS spectrum (EXAFS) allows estimation of the number of structural and non-structural parameters that can be refined through a fitting procedure. The near-edge region of the XAS spectrum (XANES) is also sensitive to the coordinates of all atoms in the local cluster around the absorbing atom. However, in contrast to EXAFS, existing approaches to quantitative analysis provide no estimate of the number of structural parameters that can be evaluated from a given XANES spectrum. This problem exists both for classical gradient descent approaches and for modern machine learning methods based on neural networks. We developed a new approach for a rational fit based on principal component descriptors of the spectrum. In this work, principal component analysis (PCA) is applied to a dataset of theoretical spectra calculated a priori on a grid of variable structural parameters of a molecule or cluster. Each principal component of the dataset is then related to a combined variation of several structural parameters, similar to a vibrational normal mode. Orthogonal principal components determine orthogonal deformations that can be extracted independently from the analysis of the XANES spectrum. By applying statistical criteria, the PCA-based fit of XANES determines the structural information accessible in the spectrum for a given system.
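The idea of principal component descriptors computed from a grid of theoretical spectra can be illustrated with a minimal sketch. It is not the implementation used in this work. The function `toy_spectrum`, the two parameters `p1` and `p2`, the energy axis, the noise level `noise_sigma`, and the variance-above-noise criterion are all hypothetical stand-ins, chosen only to show the mechanics of PCA on a spectral dataset.

```python
import numpy as np


def toy_spectrum(energy, p1, p2):
    """Hypothetical stand-in for a theoretical XANES calculation.

    p1 shifts the edge and the peak, p2 scales the peak intensity.
    """
    edge = 0.5 * (1.0 + np.tanh((energy - 5.0 - 0.3 * p1) / 0.5))
    peak = (1.0 + 0.5 * p2) * np.exp(-((energy - 7.0 - 0.2 * p1) ** 2) / 0.8)
    return edge + peak


# Dataset of spectra on a regular grid of two structural parameters.
energy = np.linspace(0.0, 15.0, 200)
grid = np.linspace(-1.0, 1.0, 11)
params = np.array([(a, b) for a in grid for b in grid])
spectra = np.array([toy_spectrum(energy, a, b) for a, b in params])

# PCA via SVD of the mean-centred dataset; rows of Vt are the
# orthonormal principal components in energy space.
mean_spectrum = spectra.mean(axis=0)
centered = spectra - mean_spectrum
U, s, Vt = np.linalg.svd(centered, full_matrices=False)
explained = s**2 / np.sum(s**2)

# Assumed criterion (not taken from the paper): keep a component only if
# the variance of its scores over the grid exceeds the noise variance.
noise_sigma = 0.01
component_var = s**2 / (len(spectra) - 1)
n_significant = int(np.sum(component_var > noise_sigma**2))

# Descriptors of a noisy "measured" spectrum: its projections onto the
# retained principal components.
rng = np.random.default_rng(0)
measured = toy_spectrum(energy, 0.35, -0.6) + rng.normal(0.0, noise_sigma, energy.size)
descriptors = (measured - mean_spectrum) @ Vt[:n_significant].T

print("retained components:", n_significant)
print("explained variance of the first three:", explained[:3])
print("descriptors:", descriptors)
```

Because the rows of `Vt` are orthonormal, each descriptor is obtained independently of the others. This mirrors the statement that orthogonal components correspond to deformations that can be extracted separately. The number of retained components depends directly on the assumed noise level.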