Machine learning-based q-RASPR predictions of detonation heat for nitrogen-containing compounds†
Abstract
The quantitative Read-Across Structure–Property Relationship (q-RASPR) is a novel property-prediction method that integrates similarity-based prediction (Read-Across, RA) with statistical modelling-based prediction (Quantitative Structure–Property Relationship, QSPR). Detonation heat is the main performance index of ammunition used in air-to-air and underwater weapons. In the present work, we applied the q-RASPR modeling approach and various Machine Learning (ML) algorithms to predict the detonation heat (an intrinsic property) of different N-containing compounds. The data set was collected from the literature, curated, and divided into training and test sets using a Euclidean distance-based algorithm. Feature selection was based on the internal validation metrics of Genetic Algorithm (GA) models. A Multiple Linear Regression (MLR) QSPR model with 6 descriptors was selected, and its features were used to calculate the similarity- and error-based RASPR descriptors. The RASPR descriptor matrix was then merged with the features of the QSPR model. A grid search was performed to select a combination of descriptors, which was then subjected to Partial Least Squares (PLS) regression to obviate inter-correlation among the descriptors. We also employed various ML algorithms, optimizing their hyperparameters by cross-validation, and compared the final test set prediction results. The PLS q-RASPR model was selected as the best model on the basis of the external validation metrics; using only 2D descriptors, it also shows better prediction quality than the previously reported model built with 3D descriptors. The developed model can be used to predict the detonation heat of nitrogen-containing compounds.
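To make the read-across step more concrete, the following is a minimal sketch of how similarity-based descriptors could be derived from the selected QSPR features and appended to them. It is an illustration under stated assumptions, not the procedure used in this work: the Gaussian-kernel similarity, the number of neighbours `k`, the kernel width `sigma`, and the names `read_across_descriptors`, `augment_features`, `ra_prediction`, `weighted_sd`, `max_similarity` and `mean_similarity` are all hypothetical choices made for the example.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two descriptor vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def read_across_descriptors(query, train_X, train_y, k=3, sigma=1.0,
                            exclude_index=None):
    """Similarity-based descriptors for one compound (illustrative only).

    Assumptions: similarity is a Gaussian kernel of the Euclidean distance,
    and only the k most similar training compounds are used.
    exclude_index lets a training compound skip itself as a neighbour.
    """
    candidates = [i for i in range(len(train_X)) if i != exclude_index]
    # Similarity lies in (0, 1]; identical descriptor vectors give 1.0.
    sims = {
        i: math.exp(-euclidean(query, train_X[i]) ** 2 / (2 * sigma ** 2))
        for i in candidates
    }
    nearest = sorted(candidates, key=lambda i: sims[i], reverse=True)[:k]
    weights = [sims[i] for i in nearest]
    values = [train_y[i] for i in nearest]
    total = sum(weights)
    # Similarity-weighted mean of the neighbours' property values.
    ra = sum(w * y for w, y in zip(weights, values)) / total
    # Similarity-weighted spread of the neighbours around that mean.
    var = sum(w * (y - ra) ** 2 for w, y in zip(weights, values)) / total
    return {
        "ra_prediction": ra,
        "weighted_sd": math.sqrt(var),
        "max_similarity": max(weights),
        "mean_similarity": total / len(weights),
    }


def augment_features(query, train_X, train_y, **kwargs):
    """Append the similarity-based descriptors to the original feature vector."""
    d = read_across_descriptors(query, train_X, train_y, **kwargs)
    return list(query) + [
        d["ra_prediction"],
        d["weighted_sd"],
        d["max_similarity"],
        d["mean_similarity"],
    ]
```

In a workflow of this kind, the augmented vectors for the training and test compounds would then be passed to the regression step (PLS or another learner). When descriptors are computed for a training compound, passing its own index as `exclude_index` prevents it from acting as its own nearest neighbour.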