Application of machine/statistical learning, artificial intelligence and statistical experimental design for the modeling and optimization of methylene blue and Cd(II) removal from a binary aqueous solution by natural walnut carbon
Abstract
Analytical chemists apply statistical methods for both the validation and prediction of proposed models. Such methods must be able to capture the typical features of a dataset, such as nonlinearities and interactions. Boosted regression trees (BRTs), an ensemble technique, differ fundamentally from conventional techniques, which aim to fit a single parsimonious model. In this work, BRT, artificial neural network (ANN) and response surface methodology (RSM) models were used to model and/or optimize the effects of stirring time (min), pH, adsorbent mass (mg) and the concentrations of methylene blue (MB) and Cd2+ ions (mg L−1), in order to develop predictive equations that simulate the efficiency of MB and Cd2+ adsorption from the experimental data set. The adsorbent, activated carbon, was synthesized from walnut wood waste, which is abundant, non-toxic, cheap and locally available. It was characterized using techniques including FT-IR, BET, SEM, point of zero charge (pHpzc) and determination of oxygen-containing functional groups. The influence of each parameter (pH, stirring time, adsorbent mass and the concentrations of MB and Cd2+ ions) on the percentage removal was assessed by sensitivity analysis (ANN), variable importance rankings (BRT) and analysis of variance (RSM). Furthermore, a central composite design (CCD), combined with a desirability function approach (DFA) as a global optimization technique, was used for the simultaneous optimization of the effective parameters. The applicability of the BRT, ANN and RSM models for describing the experimental data was examined using four statistical criteria: absolute average deviation (AAD), mean absolute error (MAE), root mean square error (RMSE) and coefficient of determination (R2). All three models gave good predictions.
The BRT model was more precise than the other two models, which suggests that BRT can be a powerful tool for modeling and optimizing the removal of MB and Cd(II). Sensitivity analysis (calculated from the neuron weights of the ANN) confirmed that the adsorbent mass and pH were the essential factors affecting the removal of MB and Cd(II), with relative importances of 28.82% and 38.34%, respectively. Good agreement (R2 > 0.960) between the predicted and experimental values was obtained. Maximum removal (R% > 99) was achieved at an initial dye concentration of 15 mg L−1, a Cd2+ concentration of 20 mg L−1, a pH of 5.2, an adsorbent mass of 0.55 g and a stirring time of 35 min.
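The four statistical criteria named above can be illustrated with a short sketch. The abstract does not give the formulas, so the definitions below are the ones commonly used in RSM/ANN comparison studies and should be read as assumptions: AAD is taken as the mean relative absolute deviation in percent, and R2 as one minus the ratio of residual to total sum of squares. The function name `fit_metrics` and its arguments are hypothetical and are not taken from the paper.

```python
import math


def fit_metrics(y_exp, y_pred):
    """Compare experimental and predicted removal percentages.

    Assumed definitions (not stated in the source):
      AAD  = (100 / n) * sum(|pred - exp| / |exp|)
      MAE  = (1 / n) * sum(|pred - exp|)
      RMSE = sqrt((1 / n) * sum((pred - exp)^2))
      R2   = 1 - SS_res / SS_tot
    """
    n = len(y_exp)
    if n == 0 or n != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    if any(e == 0 for e in y_exp):
        raise ValueError("AAD is undefined when an experimental value is zero")

    residuals = [p - e for e, p in zip(y_exp, y_pred)]

    mae = sum(abs(r) for r in residuals) / n
    ss_res = sum(r * r for r in residuals)
    rmse = math.sqrt(ss_res / n)
    aad = 100.0 * sum(abs(r) / abs(e) for r, e in zip(residuals, y_exp)) / n

    mean_exp = sum(y_exp) / n
    ss_tot = sum((e - mean_exp) ** 2 for e in y_exp)
    if ss_tot == 0:
        raise ValueError("R2 is undefined when all experimental values are equal")
    r2 = 1.0 - ss_res / ss_tot

    return {"AAD": aad, "MAE": mae, "RMSE": rmse, "R2": r2}
```

Under these assumed definitions, a model is judged better when AAD, MAE and RMSE are lower and R2 is closer to 1. Calling `fit_metrics` once per model (BRT, ANN, RSM) on the same experimental responses would give a like-for-like comparison of the kind reported here.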