Understanding the patterns that neural networks learn from chemical spectra†
Abstract
Analysing spectra from the experimental characterization of materials is time-consuming, sensitive to distortions in the data, requires specific domain knowledge, and is prone to the biases of general heuristics when performed by humans. Recent work has shown the potential of neural networks to perform this task, assisting spectral interpretation with automated, unbiased analysis on the fly. However, the black-box nature of most neural networks poses challenges in interpreting which patterns in the data are used to make predictions. Understanding how neural networks learn is essential to assess their accuracy on unseen data, justify critical decision-making based on their predictions, and potentially unravel meaningful scientific insights. We present a 1D neural network that classifies infrared spectra of small organic molecules according to their functional groups. Our model performs within range of the state of the art while being significantly less complex than networks previously reported in the literature. A smaller network reduces the risk of overfitting and enables exploration of what the model has learned about the spectral patterns that relate to molecular structure and composition. Using a novel two-step approach to explain the network's classification process, we not only demonstrate that the model learns the characteristic group frequencies of functional groups, but also find that it uses non-intuitive patterns, such as tails and overtones, when classifying spectra.
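To make the task concrete, the sketch below shows one possible form such a model could take: a compact 1D convolutional network that maps a digitised infrared spectrum to independent per-functional-group probabilities (multi-label classification). This is an illustrative assumption, not the authors' exact architecture; the number of spectral points (1024), layer sizes, and the count of 17 functional-group labels are placeholders.

```python
# Hypothetical minimal 1D CNN for multi-label functional-group classification
# of IR spectra (illustrative; not the architecture reported in the paper).
import torch
import torch.nn as nn

class SpectraCNN(nn.Module):
    def __init__(self, n_points: int = 1024, n_groups: int = 17):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=11, padding=5),  # learn local band shapes
            nn.ReLU(),
            nn.MaxPool1d(4),
            nn.Conv1d(16, 32, kernel_size=11, padding=5),
            nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),  # pool over the wavenumber axis
        )
        self.classifier = nn.Linear(32, n_groups)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, 1, n_points) absorbance values on a common wavenumber grid
        h = self.features(x).squeeze(-1)
        return self.classifier(h)  # logits; sigmoid gives per-group probabilities

model = SpectraCNN()
spectrum = torch.rand(1, 1, 1024)        # one synthetic spectrum for illustration
probs = torch.sigmoid(model(spectrum))   # independent probability per functional group
loss_fn = nn.BCEWithLogitsLoss()         # standard multi-label training objective
```

Because several functional groups can be present in the same molecule, the output uses independent sigmoid activations with a binary cross-entropy loss rather than a softmax over mutually exclusive classes.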