Sparse modeling for small data: case studies in controlled synthesis of 2D materials
Abstract
Data-scientific approaches have permeated chemistry and materials science. In general, however, these approaches are not easily applied to small data, such as experimental data obtained in laboratories. Our group has focused on sparse modeling (SpM) for small data in materials science and chemistry. The controlled synthesis of 2D materials, including improvement of the yield and control of the size, was achieved by SpM coupled with our chemical perspectives for small data (SpM-S). In the present work, the conceptual and methodological advantages of SpM-S were studied using real experimental datasets to enable comparison with other machine learning (ML) methods, such as neural networks. The training datasets consisted of ca. 40 explanatory variables (x_n) and 50 objective variables (y) regarding the yield, size, and size distribution of exfoliated nanosheets. Compared with the other ML methods, SpM-S provided more straightforward, generalizable, and interpretable prediction models, as well as better prediction accuracy for new experiments treated as an unknown test dataset. The results indicate that machine learning coupled with our experience, intuition, and perspective can be applied to small data in a variety of fields.
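To make the idea of sparse modeling on small data concrete, the following is a minimal sketch and not the SpM-S procedure itself. The abstract does not specify the algorithm, so L1-regularized linear regression (LASSO) solved by coordinate descent is used here as one common sparse-modeling technique. The data are synthetic and only mimic the scale described above (40 explanatory variables, 50 samples). All names (`lasso_cd`, `true_coef`, the penalty `lam=0.3`) are hypothetical choices for illustration.

```python
import random

def standardize(columns):
    """Scale each column to zero mean and unit variance."""
    out = []
    for col in columns:
        n = len(col)
        mean = sum(col) / n
        sd = (sum((v - mean) ** 2 for v in col) / n) ** 0.5
        out.append([(v - mean) / sd for v in col])
    return out

def soft_threshold(z, lam):
    """Shrink z toward zero by lam; values within [-lam, lam] become exactly 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0

def lasso_cd(columns, y, lam, sweeps=200):
    """Minimize (1/2n)||y - Xb||^2 + lam*||b||_1 by coordinate descent.

    Assumes standardized columns and a centered y, so each coordinate
    update reduces to a single soft-thresholding step.
    """
    n = len(y)
    beta = [0.0] * len(columns)
    resid = list(y)  # residual y - Xb, kept up to date after every change
    for _ in range(sweeps):
        for j, col in enumerate(columns):
            # Correlation of column j with the residual, adding back its own effect
            rho = sum(c * r for c, r in zip(col, resid)) / n + beta[j]
            new = soft_threshold(rho, lam)
            if new != beta[j]:
                delta = new - beta[j]
                resid = [r - delta * c for r, c in zip(resid, col)]
                beta[j] = new
    return beta

# Synthetic stand-in data (assumption): 50 samples, 40 explanatory variables,
# of which only three actually influence the objective variable.
random.seed(0)
n_samples, n_vars = 50, 40
true_coef = {2: 3.0, 11: -2.0, 27: 1.5}  # hypothetical "real" descriptors

X_cols = standardize(
    [[random.gauss(0, 1) for _ in range(n_samples)] for _ in range(n_vars)]
)
y_raw = [
    sum(c * X_cols[j][i] for j, c in true_coef.items()) + random.gauss(0, 0.1)
    for i in range(n_samples)
]
y_mean = sum(y_raw) / n_samples
y = [v - y_mean for v in y_raw]

beta = lasso_cd(X_cols, y, lam=0.3)
selected = [j for j, b in enumerate(beta) if b != 0.0]
print("selected variables:", selected)
```

The L1 penalty drives most coefficients to exactly zero, so the fitted model depends on only a few variables and can be inspected directly. In the approach described above, this kind of variable selection is combined with the researchers' chemical perspective; the sketch covers only the purely statistical part.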