Yumeng Zhang a, Min Dai b, Ke Liu c, Changsheng Peng *ab, Yufeng Du a, Quanchao Chang a, Imran Ali ad, Iffat Naz *ef and Devendra P. Saroj *f
a The Key Lab of Marine Environmental Science and Ecology, Ministry of Education, College of Environmental Science and Engineering, Ocean University of China, Qingdao 266100, China. E-mail: cspeng@ouc.edu.cn; Tel: +86 532 66782011
b School of Environmental and Chemical Engineering, Zhaoqing University, Zhaoqing 526061, China
c Department of Mathematical Sciences, Tsinghua University, Beijing 100084, China
d Department of Agricultural Engineering, Bahauddin Zakariya University, Bosan Road, Multan 60800, Pakistan
e Department of Biology, Deanship of Educational Services, Qassim University, Buraidah 51452, Kingdom of Saudi Arabia. E-mail: iffatkhattak@yahoo.com; Tel: +966533897891
f Department of Civil and Environmental Engineering, University of Surrey, Guildford, UK. E-mail: d.saroj@surrey.ac.uk; Tel: +44(0) 1483686634
First published on 24th September 2019
Graphene oxide (GO), an emerging material, performs exceptionally well in water treatment. Adsorption is influenced by multiple factors and is difficult to simulate with traditional statistical models. Artificial neural networks (ANNs) can establish highly accurate nonlinear functional relationships between multiple variables; hence, we constructed a three-layer ANN model to predict the removal of Cu(II) ions by the prepared GO. GO was prepared and characterized by FT-IR spectroscopy, SEM, and XRD. For ANN modeling, the Levenberg–Marquardt learning algorithm (LMA) was selected after comparing 13 different back-propagation (BP) learning algorithms. The network structure and parameters were optimized according to various error indicators between the predicted and experimental data. The number of hidden layer neurons was set to 12, and the optimal network learning rate was 0.08. Contour and 3-D diagrams were used to illustrate how the influencing factors interact in determining the adsorption efficiency. Based on the batch adsorption experiments combined with the ANN optimization of the influencing factors, the optimum pH, initial Cu(II) ion concentration and temperature were predicted to be 5.5, 15 mg L−1 and 318 K, respectively. Moreover, the adsorption experiments reached equilibrium at about 120 min. Combined with sensitivity analysis, the degree of influence of each factor could be ranked as: pH > initial concentration > temperature > contact time.
Therefore, to protect the biota (plants, animals and the ecological environment), many researchers focus on the treatment of metal-polluted wastewater by different methods and processes, including adsorption, chemical precipitation, and electro-chemical methods.7–13 Among all water treatment technologies, adsorption is presumed to be effective and easy to carry out.14–16 In addition, novel and efficient adsorbents have been synthesized and applied in various fields of production and life. For example, nanomaterials are used in environmental protection to control water and air pollution, and they are also used to store hydrogen energy in the new energy field. The discovery of new carbon nano-materials represented by carbon nanotubes, fullerenes and graphene has further boosted the progress of nanomaterial research.17–19 Graphene oxide (GO) is a material only one carbon atom thick, with different oxygenated functional groups on its surface and edges, making it a potentially effective adsorbent. The nature and amount of the oxygenated functional groups of GO can be altered by the oxidation method, which may directly or indirectly influence the adsorption properties.20,21
An artificial neural network (ANN) is a computational model derived from the structure and function of the biological nervous system. The capability of an ANN model to learn complex and non-linear processes enables it to accurately simulate human intuition in drawing conclusions.22 ANNs have been widely used to establish and optimize models in environmental studies such as environmental quality evaluation, analysis and prediction.23–26
The present work concentrates on the preparation of GO and its adsorption of Cu(II). We constructed an ANN model to fit the adsorption results. The influencing factors were selected as the input variables of the ANN, and the adsorption efficiency was the output variable. Compared to single-factor analysis, our neural network structure was more complex and required more parameters to learn. Therefore, the learning algorithm of the network, the structure of the hidden layer and the learning rate were optimized to improve accuracy and robustness. The trained network was used to predict the removal efficiency under various combinations of factors, and we tested its validity by comparing predicted and experimental data.
The characteristics (such as morphology, elemental composition and the presence of functional groups) of the as-prepared GO were examined using a scanning electron microscope (SEM, EM6900, KYKY Technology Co., Ltd., China), a powder X-ray diffractometer (XRD, D8 Advance, Bruker Ltd., Germany), and a Fourier transform infrared spectrometer (FTIR-8000s, Shimadzu Ltd., Japan).
(1) |
The Neural Network Toolbox of MATLAB (version 8.6.0) was used in the current research to build the ANN model. Aiming for a comprehensive evaluation of the adsorption capacity of the samples, we carried out adsorption experiments with 90 different combinations of the four factors, using the factor values as the inputs and the adsorption efficiency as the output of our ANN model. Table 1 gives the range of the input and output variables.
Variables | Range of the parameter value |
---|---|
Input parameters | |
pH | 2.0–5.5 |
Initial concentration (mg L−1) | 5–30 |
Temperature (K) | 298–318 |
Contact time (min) | 10–120 |
Output parameters | |
Adsorption efficiency (%) | 21.5–93.2 |
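Because the four inputs in Table 1 span very different numerical ranges, they are usually rescaled before being fed to a network. The sketch below maps each factor onto [-1, 1] using the Table 1 bounds. The target interval and the names `RANGES`, `normalize` and `denormalize` are assumptions made for illustration; the text does not state which scaling was applied.

```python
import numpy as np

# Lower and upper bounds from Table 1, in the order pH, C0 (mg/L), T (K), t (min).
RANGES = np.array([
    [2.0, 5.5],
    [5.0, 30.0],
    [298.0, 318.0],
    [10.0, 120.0],
])

def normalize(x):
    """Linearly map raw factor values (last axis = 4 factors) onto [-1, 1]."""
    x = np.asarray(x, dtype=float)
    lo, hi = RANGES[:, 0], RANGES[:, 1]
    return 2.0 * (x - lo) / (hi - lo) - 1.0

def denormalize(z):
    """Inverse of normalize: map values in [-1, 1] back to physical units."""
    z = np.asarray(z, dtype=float)
    lo, hi = RANGES[:, 0], RANGES[:, 1]
    return (z + 1.0) / 2.0 * (hi - lo) + lo

# Example: pH 5.5, 15 mg/L, 318 K and a contact time of 120 min.
sample = [5.5, 15.0, 318.0, 120.0]
scaled = normalize(sample)
```

With this scaling, a factor at its upper bound becomes 1 and a factor at its lower bound becomes -1, so no single input dominates merely because of its units.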
The parameters optimized in constructing the ANN model included the number of hidden layer neurons, the optimization algorithm, the learning rate and the number of iterations. After optimization, the ANN model was trained on the training set so that it could make effective predictions. Given the input and output data, the connection weights and thresholds between neurons were adjusted as variables to reduce prediction errors.30 The predicted output of the ANN model was compared with the experimental data, and the biases were modified according to the calculated error.31 Training stopped automatically when the error fell below the threshold (E(n) < ζ) or the number of iterations reached the upper limit of 100. The flowchart of the ANN training process is shown in Fig. 1.
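The two stopping conditions can be sketched as a simple loop. This is a minimal illustration rather than the toolbox's own routine: `step` is a hypothetical callable that performs one weight update and returns the current error E(n), the value of `zeta` is assumed because the threshold is not reported, and only the cap of 100 iterations comes from the text.

```python
def train_until_stopped(step, zeta=1e-3, max_iter=100):
    """Run step() until its returned error drops below zeta or max_iter is hit.

    step     -- hypothetical callable doing one weight update and returning E(n)
    zeta     -- error threshold (value assumed; not reported in the text)
    max_iter -- iteration cap, 100 as stated in the text
    """
    error = float("inf")
    for n in range(1, max_iter + 1):
        error = step()
        if error < zeta:
            return n, error, "error below threshold"
    return max_iter, error, "iteration limit reached"


def make_toy_step(start=1.0, factor=0.5):
    """Stand-in for a real update: the error simply shrinks by `factor` per call."""
    state = {"error": start}

    def step():
        state["error"] *= factor
        return state["error"]

    return step


iterations, final_error, reason = train_until_stopped(make_toy_step(), zeta=0.1)
```

Returning the stop reason alongside the iteration count makes it easy to tell runs that converged from runs that merely exhausted the iteration budget.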
Based on the experimental data and network training steps, we constructed a three-layer network model. Signals were transmitted linearly and activated by the tan-sigmoid function of the processing units in the network.32 We compared 13 BP algorithms to find the most suitable one. Then, based on the optimal BP algorithm, the optimal learning rate and hidden layer structure were evaluated. Finally, the predicted data were compared with the desired data to judge the learning results of the neural network. The configuration of the model is also shown in Fig. 1.33
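A forward pass through such a three-layer network can be sketched as follows. The tan-sigmoid hidden layer follows the description above, while the linear output layer, the random weights and all variable names are assumptions for illustration only; 10 hidden neurons are used as a placeholder size.

```python
import numpy as np

def tansig(n):
    """Tan-sigmoid transfer function; algebraically identical to tanh(n)."""
    return 2.0 / (1.0 + np.exp(-2.0 * n)) - 1.0

def forward(x, W1, b1, W2, b2):
    """Input layer -> tan-sigmoid hidden layer -> linear output layer (assumed)."""
    hidden = tansig(x @ W1 + b1)
    return hidden @ W2 + b2

rng = np.random.default_rng(0)          # random weights, for illustration only
n_inputs, n_hidden, n_outputs = 4, 10, 1
W1 = rng.normal(scale=0.5, size=(n_inputs, n_hidden))
b1 = np.zeros(n_hidden)
W2 = rng.normal(scale=0.5, size=(n_hidden, n_outputs))
b2 = np.zeros(n_outputs)

x = np.array([[1.0, -0.2, 1.0, 1.0]])   # one scaled sample (pH, C0, T, t)
y = forward(x, W1, b1, W2, b2)          # shape (1, 1): output of the untrained network
```

Training then amounts to adjusting `W1`, `b1`, `W2` and `b2` so that `y` approaches the measured adsorption efficiency.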
With the hidden layer fixed at 10 neurons, we selected the optimal training algorithm from 13 different algorithms by comparing several indicators: root mean squared error (RMSE), correlation coefficient (R2), iteration number (IN) and optimal linear equation (OLE). The RMSE and R2 are calculated using eqn (2) and (3):
(2) |
(3) |
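In their commonly used forms, which are assumed here because the symbols are not defined in the surrounding text, the two indicators for N samples with experimental values y_i^exp, predicted values y_i^pred and mean experimental value ȳ^exp read as follows. R2 is written as a coefficient of determination; a squared Pearson correlation coefficient is another common convention.

```latex
\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(y_i^{\mathrm{pred}} - y_i^{\mathrm{exp}}\right)^{2}}

R^{2} = 1 - \frac{\sum_{i=1}^{N}\left(y_i^{\mathrm{pred}} - y_i^{\mathrm{exp}}\right)^{2}}
                 {\sum_{i=1}^{N}\left(y_i^{\mathrm{exp}} - \bar{y}^{\mathrm{exp}}\right)^{2}}
```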
As shown in Table 2, the LMA, with the smallest RMSE (0.0298) and few iterations, was found to be the best of the 13 BP learning algorithms, followed by the Bayesian regularization algorithm (BRA) with an RMSE of 0.0304. However, whereas LMA needed only 18 training iterations, BRA ran for 100 iterations (the iteration limit) to complete training.38 The RMSE values of the gradient descent and conjugate gradient algorithms, such as the gradient descent, one step secant and Fletcher–Powell algorithms, were much larger than that of LMA.39 In addition to the properties of the algorithms themselves, this may be due to the combinatorial properties and intrinsic links of the data set.
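One way to see why LMA can converge in fewer iterations than plain gradient descent is to compare the textbook weight updates. The notation below is an assumption for illustration: J is the Jacobian of the network errors with respect to the weights w, e is the error vector, μ is the damping factor and η is the learning rate.

```latex
\Delta \mathbf{w}_{\mathrm{LM}} = -\left(\mathbf{J}^{\mathsf{T}}\mathbf{J} + \mu \mathbf{I}\right)^{-1}\mathbf{J}^{\mathsf{T}}\mathbf{e}
\qquad \text{versus} \qquad
\Delta \mathbf{w}_{\mathrm{GD}} = -\eta\,\mathbf{J}^{\mathsf{T}}\mathbf{e}
```

For small μ the Levenberg–Marquardt step approaches a Gauss–Newton step, which uses approximate curvature information, whereas for large μ it behaves like gradient descent with a small step size.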
Learning algorithms | Function | IN a | RMSE b | R2 c | Gradient | OLE1 d | OLE2 e |
---|---|---|---|---|---|---|---|
Levenberg–Marquardt | trainlm | 18 | 0.0298 | 0.995 | 0.835 | 0.9914 | 0.6384 |
Bayesian regularization | trainbr | 100 | 0.0304 | 0.973 | 0.417 | 0.9602 | 2.7892 |
BFGS Quasi-Newton | trainbfg | 49 | 0.0489 | 0.975 | 0.606 | 0.9746 | 1.8732 |
Resilient backpropagation | trainrp | 50 | 0.0499 | 0.943 | 0.885 | 0.9483 | 3.9410 |
Scaled conjugate gradient | trainscg | 46 | 0.0407 | 0.940 | 0.812 | 0.9361 | 5.1059 |
Conjugate gradient with Powell/Beale restarts | traincgb | 35 | 0.0493 | 0.971 | 0.395 | 0.9718 | 1.4668 |
Fletcher–Powell conjugate gradient | traincgf | 21 | 0.1672 | 0.946 | 1.430 | 0.9632 | 2.7050 |
Polak–Ribière conjugate gradient | traincgp | 18 | 0.0744 | 0.912 | 0.678 | 0.8910 | 7.7585 |
One step secant | trainoss | 17 | 0.1392 | 0.903 | 1.020 | 0.8266 | 13.7999 |
Variable learning rate gradient descent | traingdx | 21 | 0.0794 | 0.555 | 1.080 | 0.6954 | 32.1423 |
Gradient descent with momentum | traingdm | 36 | 0.2180 | 0.709 | 0.885 | 0.7289 | 21.5543 |
Gradient descent | traingd | 100 | 0.1085 | 0.825 | 1.730 | 0.8791 | 8.6023 |
Adaptive learning rate gradient descent | traingda | 91 | 0.1033 | 0.897 | 1.620 | 0.8662 | 11.2128 |
a IN, iteration number. b RMSE, root mean squared error. c R2, correlation coefficient. d OLE1, the slope of optimal linear equation. e OLE2, the intercept of optimal linear equation.
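The indicators reported in Table 2 can be computed from paired experimental and predicted values as sketched below. This is an illustration under assumptions: R2 is taken as the coefficient of determination, the OLE is taken as a least-squares line of predicted against experimental values, and the function name and sample numbers are made up rather than taken from the study.

```python
import numpy as np

def evaluate(y_exp, y_pred):
    """Return RMSE, R2 and the slope/intercept of the best-fit line (OLE1, OLE2)."""
    y_exp = np.asarray(y_exp, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    rmse = np.sqrt(np.mean((y_pred - y_exp) ** 2))
    ss_res = np.sum((y_exp - y_pred) ** 2)
    ss_tot = np.sum((y_exp - y_exp.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot
    # Least-squares line y_pred = slope * y_exp + intercept.
    slope, intercept = np.polyfit(y_exp, y_pred, 1)
    return {"RMSE": rmse, "R2": r2, "OLE1": slope, "OLE2": intercept}

# Made-up efficiencies (%) with a constant over-prediction of 2 points.
y_exp = [20.0, 40.0, 60.0, 80.0]
y_pred = [22.0, 42.0, 62.0, 82.0]
scores = evaluate(y_exp, y_pred)
```

A perfect model would give a slope of 1 and an intercept of 0, so the distance of OLE1 from 1 and of OLE2 from 0 gives a quick view of systematic bias that RMSE alone does not show.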
Because the learning process is random, we repeated training 200 times at each learning rate and recorded the proportion of runs that finished in fewer than 20 iterations and the proportion with an RMSE below 0.035. The proportion of runs with fewer than 20 iterations was generally around 80% and peaked at 87% at a learning rate of 0.08. RMSE was used as an important indicator for evaluating the learning rate because it reflects the training results.44 At a learning rate of 0.08, the proportion of runs with an RMSE below 0.035 reached a maximum of 66.5%. Therefore, the learning rate selected in this study was 0.08.
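The bookkeeping behind this comparison can be sketched as follows. The thresholds of 20 iterations and an RMSE of 0.035 come from the text; the function names, the tie-breaking rule and the toy run data are assumptions for illustration and are not results from the study.

```python
def success_ratios(runs, max_iter=20, max_rmse=0.035):
    """Return (share of runs with < max_iter iterations, share with RMSE < max_rmse).

    runs -- list of (iterations, rmse) pairs, one per repeated training run
    """
    n = len(runs)
    fast = sum(1 for iters, _ in runs if iters < max_iter) / n
    accurate = sum(1 for _, rmse in runs if rmse < max_rmse) / n
    return fast, accurate

def pick_learning_rate(results):
    """Pick the rate with the highest accurate share; ties go to the faster rate."""
    return max(results, key=lambda lr: (results[lr][1], results[lr][0]))

# Made-up runs for two candidate learning rates (not data from the study).
toy_runs = {
    0.05: [(22, 0.031), (18, 0.040), (19, 0.036), (25, 0.033)],
    0.08: [(15, 0.030), (18, 0.034), (21, 0.036), (12, 0.029)],
}
results = {lr: success_ratios(runs) for lr, runs in toy_runs.items()}
best = pick_learning_rate(results)
```

In the study each learning rate would contribute 200 such pairs instead of the four shown here.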
As shown in Fig. 6, the sensitivity to pH is much greater than that to the other factors, both in terms of the RMSE values and the ranges over which the output variable varies with the input variables. When the input pH was reduced by 15%, the RMSE reached 8.53. It can also be seen from the original experimental data that pH had the strongest effect on the results, which was consistent with the analysis in Section 3.7.46 Contact time was the least influential input variable. When the contact time was reduced by 15%, the RMSE of the output variable was only 2.51. The original experimental data confirmed that the adsorption efficiency levelled off after 90 minutes and reached equilibrium at approximately 120 minutes. The influence of temperature was slightly larger than that of time but far less than that of pH. When the temperature was reduced by 15%, the RMSE of the output variable was 3.70. For the initial concentration, the RMSE of the output variable was higher when the concentration was increased than when it was decreased by the same proportion. This was because the effects of factors other than the initial concentration tended to be stable as the values increased.47
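A one-factor-at-a-time sensitivity check of this kind can be sketched as below. The text does not spell out how the RMSE was computed, so the sketch assumes it is taken between the model outputs before and after one input column is scaled by a fixed percentage; `toy_model` is a hypothetical stand-in for the trained network and the sample matrix is made up.

```python
import numpy as np

def sensitivity_rmse(model, X, column, change):
    """RMSE between outputs before and after scaling one input by (1 + change)."""
    X = np.asarray(X, dtype=float)
    X_shifted = X.copy()
    X_shifted[:, column] *= 1.0 + change
    diff = model(X_shifted) - model(X)
    return float(np.sqrt(np.mean(diff ** 2)))

def toy_model(X):
    """Hypothetical stand-in for the trained ANN (not the fitted network)."""
    return 10.0 * X[:, 0] - 0.5 * X[:, 1]

# Columns: pH, C0 (mg/L), T (K), t (min); values are illustrative only.
X = np.array([[4.0, 10.0, 308.0, 60.0],
              [5.0, 20.0, 318.0, 120.0]])

for col, name in enumerate(["pH", "C0", "T", "t"]):
    down = sensitivity_rmse(toy_model, X, col, -0.15)
    up = sensitivity_rmse(toy_model, X, col, +0.15)
    print(f"{name}: -15% -> {down:.2f}, +15% -> {up:.2f}")
```

Running the check in both directions is what reveals an asymmetry such as the one described for the initial concentration.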
Fig. 7 Contour and three-dimensional diagrams for interactive effects of (A) pH × C0; (B) T × C0; (C) C0 × t; (D) pH × t; (E) T × t on R.
The interaction of pH and initial concentration on copper removal efficiency was studied at a fixed temperature and time (Fig. 7a). The adsorption efficiency increased strictly monotonically with pH, whereas it generally decreased with increasing initial concentration, although not strictly monotonically. It was reported that metal ions occupied the adsorption sites more quickly at lower concentrations. However, at pH values from 4.5 to 5.5, the removal efficiency increased slightly and then decreased as the concentration rose from 5 to 20 mg L−1. This may be because GO was saturated at initial concentrations above 15 mg L−1. At low pH (2.0–3.5), the influence of high concentration on adsorption efficiency was smaller than at higher pH (4.5–5.5). The pH of the solution can alter the presence and amount of Cu(II) ions, the surface electrical properties of the materials, and the interaction between the materials and the ions.48 Excess H+ adsorbs on GO at lower pH and occupies the adsorption sites of Cu(II) ions, while a higher pH enhances the attraction between copper ions and the negative charge on the GO surface.49,50 Under the conditions of fixed temperature and time, a minimum adsorption efficiency of 33.2% was obtained at pH = 2.0 and an initial concentration of 30 mg L−1; the maximum efficiency of 93.2% was obtained at pH = 5.5 and an initial concentration of 15 mg L−1.
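Surfaces such as Fig. 7a can be produced by evaluating the trained model on a grid of two factors while holding the other two fixed. The sketch below assumes fixed values of 318 K and 120 min and uses a hypothetical `toy_surface` in place of the trained network, so the numbers it yields are illustrative only.

```python
import numpy as np

def response_surface(model, ph_values, c0_values, temperature, time):
    """Evaluate model on a pH x C0 grid with temperature and time held fixed."""
    ph_grid, c0_grid = np.meshgrid(ph_values, c0_values)
    X = np.column_stack([
        ph_grid.ravel(),
        c0_grid.ravel(),
        np.full(ph_grid.size, temperature),
        np.full(ph_grid.size, time),
    ])
    R = model(X).reshape(ph_grid.shape)
    return ph_grid, c0_grid, R

def toy_surface(X):
    """Hypothetical stand-in for the trained ANN (not the fitted network)."""
    return 10.0 * X[:, 0] - 0.1 * (X[:, 1] - 15.0) ** 2

ph_values = np.linspace(2.0, 5.5, 8)     # pH range from Table 1
c0_values = np.linspace(5.0, 30.0, 6)    # initial concentration range from Table 1
ph_grid, c0_grid, R = response_surface(toy_surface, ph_values, c0_values, 318.0, 120.0)

# Locate the grid point with the highest predicted efficiency.
i, j = np.unravel_index(np.argmax(R), R.shape)
best_ph, best_c0 = ph_grid[i, j], c0_grid[i, j]
```

The arrays `ph_grid`, `c0_grid` and `R` can be passed directly to a contour or 3-D surface plotting routine.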
At fixed pH and adsorption time, the interaction of temperature and initial concentration is depicted in Fig. 7b. Increasing the temperature and decreasing the concentration both enhanced copper removal, with the Cu(II) ion concentration having the greater influence. The adsorption efficiency was about 85.4% at low concentrations (5–23 mg L−1) and low temperatures (293–303 K). When the temperature rose above 303 K, the adsorption efficiency could exceed 90%, suggesting that the adsorption process may be endothermic. In the range of 23 to 30 mg L−1, the minimum efficiency was 58.1% at low temperatures and 71.3% at high temperatures, which meant that an appropriate initial concentration was more conducive to high removal efficiency than the temperature (consistent with Section 3.6).
Fig. 7c illustrates the interaction between concentration and time at a fixed temperature of 318 K and pH of 5.5. The strong attraction between GO and Cu(II) cations led to a rapid increase in adsorption efficiency at the initial stage (10–45 min).51 After 90 minutes, the adsorption efficiency levelled off and finally reached equilibrium for all initial concentrations at approximately 120 minutes. When the concentration was 5–20 mg L−1, the efficiency was substantially over 90% after 120 min.52 A minimum efficiency of 43.2% was obtained for a contact time of 10 min and an initial concentration of 30 mg L−1. The maximum adsorption efficiency of 93.2% was obtained at an initial concentration of 15 mg L−1 and a contact time of 120 min. Based on the analysis of Fig. 7a–c, the degree of influence of the related variables is ranked as: pH > initial concentration > temperature, which is consistent with Section 3.6.
Fig. 7d shows the interactive effects of pH and contact time on adsorption efficiency. With increasing pH and adsorption time, the adsorption efficiency first increased greatly and then flattened.53 At pH = 2, the lowest and highest adsorption efficiencies were 23.3% and 74.8%, obtained at contact times of 10 and 120 min, respectively. At pH = 5.5, the lowest and highest adsorption efficiencies were 50.7% and 93.0%, obtained at contact times of 10 and 120 min, respectively. This may be because a higher pH promotes faster adsorption of Cu2+. Although the efficiency interval was different, the efficiency gap was almost identical under the same contact time.
The interaction of temperature and contact time on adsorption efficiency was also investigated (Fig. 7e). As the temperature rose from 298 to 318 K, the adsorption efficiency increased from 42.2 to 48.2% at a contact time of 10 min and from 83.9 to 93.0% at 120 min. Compared with Fig. 7d, the effect of pH was greater than that of temperature, although both factors acted in the same direction (both positive correlations).54 To compare the effects of different factors, we normalized the input data of the network. Although the adsorption efficiency varied greatly with contact time before equilibrium, it was almost unaffected by contact time once equilibrium was reached at 120 min, making contact time the least sensitive factor overall.55
In this work, an ANN model was built to study the various factors related to the adsorption process. In a three-layer ANN, signals were transmitted linearly and activated by the tan-sigmoid function of the processing units. To fit the model to the data, the structure and parameters of the neural network were optimized by comparing evaluation indices, including RMSE. The number of iterations and the RMSE of 13 different BP algorithms were compared comprehensively, and LMA was found to be the best algorithm. The optimal number of hidden layer neurons was 12, with a minimum RMSE of 0.0179. The learning rate selected in this study was 0.08, at which the proportion of runs with an RMSE below 0.035 reached a maximum of 66.5%, and training stopped after 11 iterations.
The contour and three-dimensional diagrams showed that the removal efficiency increased with increasing temperature, pH and contact time and with decreasing initial concentration. According to the diagrams and sensitivity analysis, the degree of influence of each factor on the adsorption efficiency was: pH > initial concentration > temperature > contact time. Moreover, the RMSE of the output variable was greater when the initial concentration was increased than when it was decreased by the same proportion. The explanation is that the efficiency decreased with increasing concentration and tended to be stable as the other factors increased. In brief, the ability of an ANN model to learn and summarize complex and non-linear processes can provide a new perspective on the adsorption process.
This journal is © The Royal Society of Chemistry 2019