Study on the identification of soybean origins combining laser-induced breakdown spectroscopy and convolutional neural networks
Abstract
This study integrates laser-induced breakdown spectroscopy (LIBS) with convolutional neural networks (CNNs) to classify soybeans from five regions in China: Shandong, Heilongjiang, Guizhou, Zhejiang, and Neimenggu (Inner Mongolia). A dataset of 375 samples was constructed, with 75 samples from each region. High-resolution spectral data were obtained using a customized LIBS system with optimized acquisition settings. After preprocessing, which included noise reduction and feature selection, 12 key spectral features were identified. The dataset was then split into a training set (265 samples) and a test set (110 samples), with each region represented equally in both sets. A CNN model was developed to automatically extract and classify spectral features. The model achieved 99.09% accuracy on the test set, outperforming traditional machine learning models such as random forest and support vector machine (SVM). This result demonstrates the effectiveness of combining LIBS with CNNs for the rapid and accurate identification of soybean origins. The approach has promising applications in food quality control, traceability, and safety. The LIBS-CNN framework also opens new avenues for combining spectroscopic techniques with deep learning in fields such as authenticity verification, environmental monitoring, and materials science.
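The balanced train/test split and the reported test accuracy can be checked with a short sketch. It is illustrative only: the function names are hypothetical, the per-region count of 75 labels is an assumption chosen so that the subsets add up to the stated 265 + 110 samples, and reading 99.09% as 109 of 110 test samples classified correctly is an inference from the reported figure, not a number given in the abstract.

```python
import random
from collections import defaultdict

REGIONS = ["Shandong", "Heilongjiang", "Guizhou", "Zhejiang", "Neimenggu"]


def stratified_split(labels, n_test_per_class, seed=0):
    """Split sample indices so every class contributes the same number of test samples."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train_idx, test_idx = [], []
    for label in sorted(by_class):
        indices = by_class[label][:]
        rng.shuffle(indices)  # randomize which samples of this class go to the test set
        test_idx.extend(indices[:n_test_per_class])
        train_idx.extend(indices[n_test_per_class:])
    return sorted(train_idx), sorted(test_idx)


def accuracy_percent(n_correct, n_total):
    """Classification accuracy as a percentage, rounded to two decimals."""
    return round(100 * n_correct / n_total, 2)


# Assumption: 75 labels per region, so that 5 * 53 = 265 train and 5 * 22 = 110 test.
labels = [region for region in REGIONS for _ in range(75)]
train_idx, test_idx = stratified_split(labels, n_test_per_class=22)

print(len(train_idx), len(test_idx))  # 265 110
print(accuracy_percent(109, 110))     # 99.09 (assumed: one misclassified test sample)
```

Under these assumptions, each region contributes 53 training and 22 test samples, and a single misclassification in the test set corresponds to the reported 99.09%.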