Navigating predictions at nanoscale: a comprehensive study of regression models in magnetic nanoparticle synthesis†
Abstract
The applicability of magnetic nanoparticles (MNPs) depends strongly on their physical properties, especially their size. Synthesizing MNPs with a specific size is challenging because a large number of interdependent synthesis parameters control their properties. In general, synthesis control cannot be described by white-box approaches (empirical, simulation-based or physics-based). To address this, the present study applies machine learning approaches to predict the size of MNPs during their synthesis. A dataset comprising 17 synthesis parameters and the corresponding MNP sizes was analyzed. Eight regression algorithms (ridge, lasso, elastic net, decision tree, random forest, gradient boosting, support vector regression (SVR) and multilayer perceptron) were evaluated. Model performance was assessed via root mean squared error (RMSE), mean absolute error (MAE), mean absolute percentage error (MAPE) and the standard deviation of the residuals. SVR exhibited the lowest RMSE, 3.44, with a standard deviation of the residuals of 5.13, and demonstrated a favorable balance between accuracy and consistency among these methods. Qualitative factors such as adaptability to online learning and robustness against outliers were also considered. Altogether, SVR emerged as the most suitable approach for predicting MNP sizes because of its ability to learn continuously from new data and its resilience to noise, which make it well suited for real-time applications with varying data quality. In this way, a feasible optimization framework for automated and self-regulated MNP synthesis was implemented. Key challenges included the limited dataset size, potential violations of modeling assumptions, and sensitivity to hyperparameters. Strategies such as data regularization, correlation analysis, and grid search over model hyperparameters were employed to mitigate these issues.
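The four performance measures named above can be illustrated with a minimal sketch. This is not the study's evaluation code: the function name `regression_metrics`, the use of the population standard deviation for the residuals, and the example sizes are all assumptions made for illustration.

```python
import math
from statistics import pstdev


def regression_metrics(y_true, y_pred):
    """Return RMSE, MAE, MAPE (in percent) and the residual standard deviation.

    Hypothetical helper: the abstract does not state whether the population
    or the sample standard deviation was used; the population form is
    assumed here.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equally long")

    # Residual = measured size minus predicted size.
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    n = len(residuals)

    rmse = math.sqrt(sum(r * r for r in residuals) / n)
    mae = sum(abs(r) for r in residuals) / n
    # MAPE is undefined for true values of zero; particle sizes are positive.
    mape = 100.0 * sum(abs(r / t) for r, t in zip(residuals, y_true)) / n
    residual_std = pstdev(residuals)

    return {"rmse": rmse, "mae": mae, "mape": mape, "residual_std": residual_std}


if __name__ == "__main__":
    # Made-up particle sizes, for illustration only (not data from the study).
    measured = [10.0, 20.0, 40.0]
    predicted = [12.0, 18.0, 40.0]
    for name, value in regression_metrics(measured, predicted).items():
        print(f"{name}: {value:.3f}")
```

RMSE penalizes large individual errors more heavily than MAE, while the standard deviation of the residuals describes how consistently a model errs around its mean error, which is why the two are reported together when weighing accuracy against consistency.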