Lina Chi*abc,
Jie Wangb,
Tianshu Chub,
Yingjia Qiana,
Zhenjiang Yua,
Deyi Wua,
Zhenjia Zhanga,
Zheng Jiangc and
James O. Leckieb
aSchool of Environmental Science and Engineering, Shanghai Jiao Tong University, Shanghai, 200240, People's Republic of China. E-mail: lnchi@sjtu.edu.cn; Tel: +86 13816632156
bThe Center for Sustainable Development and Global Competitiveness (CSDGC), Stanford University, Stanford, CA 94305, USA
cFaculty of Engineering and the Environment, University of Southampton, Southampton, SO17 1BJ, UK
First published on 11th March 2016
Mathematical models play an important role in predicting and optimizing, efficiently and economically, the performance of ultrafiltration (UF) membranes fabricated via dry/wet phase inversion. In this study, a systematic approach, namely a supervised-learning-based experimental data analytics framework, is developed to model and optimize the flux and rejection rate of poly(vinyl chloride) (PVC) and polyvinyl butyral (PVB) blend UF membranes. Four supervised learning (SL) approaches, namely the multiple additive regression tree (MART), the neural network (NN), linear regression (LR), and the support vector machine (SVM), are employed in a rigorous fashion. The dependent variables representing membrane performance are systematically analyzed with respect to the independent variables representing fabrication conditions. A comparison of the prediction indicators of the four SL methods shows the NN model to be superior to the other SL models, with training and testing R-squared values of 0.8897 and 0.6344, respectively, for the rejection rate, and 0.9175 and 0.8093, respectively, for the flux. The optimal combination of processing parameters and the most favorable flux and rejection rate for PVC/PVB ultrafiltration membranes are further predicted by the NN model and verified by experiments. We hope the approach sheds light on how to systematically analyze multi-objective optimization of fabrication conditions, based on complex experimental data, to obtain the desired ultrafiltration membrane performance.
In recent years, considerable research has been conducted in order to overcome this problem. Among all available methods, polymer blends often exhibit superior properties when compared with a standalone, individual component polymer; in addition, the polymer blend method also has the advantages of a simple procedure for preparation and easy control of physical properties for various compositional changes. There are several polymers that have been studied as functional polymer pairs of PVC, such as PMMA,1 PU,3 EVA,4 PEO,5 and PVB6 among others. In most previous studies,7,8 PVB is found to be one of the ideal polymers to blend with PVC due to its well-predicted miscible properties, chemical similarity, and less unfavorable heat while mixing. In addition, owing to the –OH bond, the PVC/PVB blend demonstrates more hydrophilicity than the original PVC membrane.6,9
The selection of membrane material is essential for developing high-performance membranes. However, given the complexity of the fabrication process, especially when the membranes are made via dry/wet phase inversion, a consistent and robust data analysis procedure for improving membrane performance is even more critical. Pure water flux (PWF) and the rejection rate of bovine serum albumin (BSA) are the most important performance indicators for UF membranes,10,11 and they depend not only upon the composition of the casting solution but also upon the technical conditions used in the fabrication process. Typical variables of importance for membrane development include the types and amounts of polymer, additive, and pore-forming agent used in the casting solution, the kind and concentration of the gelation medium, the evaporation time and temperature of the spread-casting solution, the length of the gelation period, and the temperature of the gelation bath.12 Some of these variables, such as the type of polymer, pore-forming reagent, or gelation medium, must be treated as categorical because they cannot be quantified. The remaining variables are quantitative, including the temperature of evaporation or gelation, the amount of pore-forming reagent added, and the duration of evaporation or gelation. In general, this large number of influential factors greatly lengthens the development cycle and increases research and development (R&D) costs. Therefore, it is worthwhile to investigate efficient statistical and computational methods to optimize experiment design and to minimize the number of experiments.
Traditionally, statistically based design of experiments (DOE) has been widely used to optimize parameters in membrane fabrication.13–15 However, DOE is based on the assumption that interactions between factors are not likely to be significant,16,17 which is usually not the case in practice. When the number of runs is reduced, a fractional factorial DOE becomes insufficient to evaluate the impact of some of the factors independently.16 Moreover, DOE is poorly suited to dealing with categorical factors in experiments. As a result, DOE has limitations in modeling a membrane fabrication process and in optimizing the filtration performance of the membrane.
Recently, the supervised learning (SL) approach—a powerful method in analyzing complex, but data-rich problems—has found strong application in diverse engineering fields such as control, robotics, pattern recognition, forecasting, power systems, manufacturing, optimization, and signal processing, etc.18–20 Although the idea of solving engineering problems using SL has been around for decades, it has been introduced only recently into the field of material studies.21 There are several publications discussing the application of SL to the modeling and optimization of membrane fabrication. S. S. Madaeni modeled and optimized PES- and PS-membrane fabrication using artificial neural networks,22 while Xi and Wang23 reported that the Support Vector Machine (SVM) model could be an efficient approach for optimizing fabrication conditions of homemade VC-co-VAc-OH microfiltration membranes. Yet, there are still a couple of key issues that need to be investigated. A systematic framework for using SL approaches is required to discover the relationships between membrane performance and complicated fabrication conditions.
The purpose of this research is to develop such a framework. More specifically, we need first to evaluate experimental data quality, which is important in making valid assumptions and selecting proper models for analyzing complex data. Secondly, we need to develop an approach for efficiently employing reliable analysis models, including the decision tree approach, neural network method, linear regression, and support vector machine, for thoroughly analyzing all features and all responses of the membranes, as opposed to current approaches that analyze only a single response with regard to either one feature or all of the features. Finally, we need to select the most suitable SL approach to predict the optimal combination of features for membrane fabrication.
Jw = V/(At) | (1) |
Membrane retention was tested using 100 mg L−1 BSA at a temperature of 20 °C and under an operating pressure of 0.1 MPa. The BSA concentrations of both the feed and the permeate were determined using an ultraviolet spectrophotometer (TU-1810, Beijing Purkinje General, China) at a wavelength of 280 nm. The observed rejection of the solute (BSA in phosphate buffer) for each permeate collected was calculated according to eqn (2):
R = (1 − Cp/Cf) × 100% | (2) |
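Eqn (1) and (2) can be evaluated with a few lines of code. The sketch below is illustrative only: the function names are hypothetical, and the units (V in litres, A in square metres, t in hours, concentrations in mg L−1) are assumptions chosen to give the flux in L (m2 h)−1, since the symbol definitions are not restated here.

```python
def pure_water_flux(volume_l: float, area_m2: float, time_h: float) -> float:
    """Eqn (1): Jw = V / (A t). Assumed units: L, m^2, h -> L m^-2 h^-1."""
    if area_m2 <= 0 or time_h <= 0:
        raise ValueError("membrane area and filtration time must be positive")
    return volume_l / (area_m2 * time_h)


def rejection_percent(c_permeate: float, c_feed: float) -> float:
    """Eqn (2): R = (1 - Cp/Cf) x 100%, with both concentrations in the same unit."""
    if c_feed <= 0:
        raise ValueError("feed concentration must be positive")
    return (1.0 - c_permeate / c_feed) * 100.0


# Hypothetical readings: the 100 mg/L feed matches the test condition in the text;
# the permeate concentration, volume, area, and time are made up for illustration.
example_rejection = rejection_percent(c_permeate=20.0, c_feed=100.0)
example_flux = pure_water_flux(volume_l=0.05, area_m2=0.002, time_h=0.25)
```

Keeping the two indicators as separate functions mirrors how they are later treated as two responses of the same fabrication conditions.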
PVC wt%/k = polymer wt% | (3) |
DMAc wt% + polymer wt% + additive wt% = 100% | (4) |
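As a worked check of eqn (3) and (4), the sketch below derives the full casting-solution composition from PVC wt%, the ratio k, and the additive wt%. The function name is hypothetical, and it assumes that the polymer consists only of PVC and PVB, so that PVB wt% = polymer wt% − PVC wt%; this matches the PVB values quoted with the optimized formulations later in the text.

```python
def casting_composition(pvc_wt: float, k: float, additive_wt: float) -> dict:
    """Solve eqn (3) and (4) for the remaining weight fractions (all in wt%)."""
    if not 0 < k <= 1:
        raise ValueError("k is the PVC share of the polymer and must lie in (0, 1]")
    polymer_wt = pvc_wt / k                        # eqn (3)
    pvb_wt = polymer_wt - pvc_wt                   # assumption: polymer = PVC + PVB
    dmac_wt = 100.0 - polymer_wt - additive_wt     # eqn (4)
    if dmac_wt < 0:
        raise ValueError("polymer and additive exceed 100 wt%")
    return {"polymer": polymer_wt, "PVB": pvb_wt, "DMAc": dmac_wt}


# Formulation reported for the water bath: PVC 7.5 wt%, k = 0.5, additive 1 wt%.
water_bath = casting_composition(pvc_wt=7.5, k=0.5, additive_wt=1.0)
```

For the water-bath formulation this gives a 15 wt% polymer content, 7.5 wt% PVB, and 84 wt% DMAc, in line with the reported values.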
Before the data analysis, we briefly examine the characteristics of the data by plotting the measurement points for each parameter-indicator pair in Fig. 1. For categorical processing parameters, box-plots are used instead of scatter plots. The rejection rate and the flux are clearly negatively correlated. Among the numerical parameters, PVC wt% and DMAc wt% have the strongest correlations with flux and rejection rate, respectively, while evaporation time and blade temperature show cross-like scatter patterns, indicating very weak correlations. Both categorical parameters provide considerable information for performance prediction. This is especially true for the additive type, where the indicators differ significantly between groups of additives. In general, the data contain useful information for performance prediction, but there are not enough measurements to estimate how the indicators are distributed with regard to the processing parameters. In other words, indicators predicted with SL tools will have low bias but high variance, and we need to balance the accuracy and stability of the models carefully.
Furthermore, to estimate the accuracy of each SL algorithm, we apply the Monte Carlo method by repeating the learning processes 50 times on our measurement data. During each learning process, we first randomly split the data into a training set and a testing set, with the ratio 50/18. Next, we train each SL model based on the predictors of the training set with cross-validation and make predictions of responses over the training and testing sets using the trained learning model. Finally, we estimate the accuracy of each model by R-squared over the training and testing sets, computed as:
$R^2 = 1 - \dfrac{\sum_i (y_i - \hat{y}_i)^2}{\sum_i (y_i - \bar{y})^2}$ | (5) |
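The repeated random-split evaluation described above can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes that R-squared is the usual coefficient of determination (one minus the ratio of residual to total sum of squares), that the scores are averaged over the repeats, and it uses a toy one-predictor least-squares fit (`fit_line`, a hypothetical helper) as a stand-in for any of the four SL models.

```python
import random
from statistics import mean


def r_squared(y_true, y_pred):
    """Coefficient of determination; negative when worse than predicting the mean."""
    y_bar = mean(y_true)
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    ss_tot = sum((y - y_bar) ** 2 for y in y_true)
    return 1.0 - ss_res / ss_tot


def fit_line(xs, ys):
    """Toy stand-in model: ordinary least squares on a single predictor."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
    intercept = y_bar - slope * x_bar
    return lambda x: slope * x + intercept


def monte_carlo_r2(x, y, fit, n_train=50, repeats=50, seed=0):
    """Mean training/testing R-squared over repeated random train/test splits."""
    rng = random.Random(seed)
    train_scores, test_scores = [], []
    for _ in range(repeats):
        idx = list(range(len(y)))
        rng.shuffle(idx)
        train, test = idx[:n_train], idx[n_train:]
        predict = fit([x[i] for i in train], [y[i] for i in train])
        train_scores.append(
            r_squared([y[i] for i in train], [predict(x[i]) for i in train]))
        test_scores.append(
            r_squared([y[i] for i in test], [predict(x[i]) for i in test]))
    return mean(train_scores), mean(test_scores)
```

With 68 samples and `n_train=50`, each repeat reproduces the 50/18 split; a negative testing score, as can occur with this definition, means the model predicts worse than the mean of the test responses.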
For the MART analysis, the resulting importance rankings of the predictors are shown in Fig. 4. The number of significant predictors is even smaller than in LR for each indicator. The importance order is DMAc wt% > bath type > PVC wt% for the rejection rate, and only PVC wt% determines the regression tree for the flux.
In summary, LR suggests that PVC wt% and DMAc wt% are the two most significant predictors. MART claims that the importance order of predictors is DMAc wt% > bath type > PVC wt% for rejection rate, while only PVC wt% determines the regression tree for flux. Based on the results of LR and MART, we remove the insignificant predictors (solution temperature) and then train NN and SVM with the appropriate controlling parameters determined by cross validation.
Furthermore, we select appropriate controlling parameters. Usually, one hidden layer is sufficient for a small training set. To select the optimal number of hidden units, we repeat the learning processes 50 times for each, and then select the one with a high mean and a low variance of testing R-squared values. During each process, we randomly split the data into a training set, a validation set, and a testing set, with the ratio 51/10/7, and then select the best number of epochs through cross-validation. The resulting box-plots are shown in Fig. 5. We can see the optimal number of hidden units is 9, with both the highest mean (0.8218) and the lowest variance of testing R-squared values.
Fig. 5 Box-plots of testing R-squared values over 50 training processes with different hidden layer sizes. |
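The selection rule behind Fig. 5 can be written down compactly. The text asks for a high mean and a low variance of the testing R-squared values but does not state how the two are combined, so the sketch below assumes a lexicographic rule: highest mean first, lowest variance as the tie-breaker. The function name and the scores in the example are hypothetical.

```python
from statistics import mean, pvariance


def pick_hidden_units(scores_by_size):
    """Return the hidden-layer size with the highest mean testing R-squared.

    Ties on the mean are broken in favour of the lowest variance (an assumed rule).
    """
    return max(
        scores_by_size,
        key=lambda size: (mean(scores_by_size[size]), -pvariance(scores_by_size[size])),
    )


# Made-up testing R-squared values for three candidate sizes (not the paper's data).
hypothetical_scores = {
    3: [0.50, 0.60, 0.70],
    9: [0.80, 0.82, 0.84],
    12: [0.60, 0.90, 0.70],
}
chosen_size = pick_hidden_units(hypothetical_scores)
```

In practice each list would hold the 50 testing scores obtained for that hidden-layer size.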
As for SVM, since our data set is small, we select only the six predictors found to be statistically significant in LR and MART to avoid overfitting. Furthermore, we choose the appropriate controlling parameters with five-fold cross-validation. The resulting support vectors comprise all measurements except the 43rd (for the rejection rate) and the 18th (for the flux), implying a risk of overfitting.
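Indicator | MART | NN | LR | SVM |
---|---|---|---|---|
Rm(y1), training R-squared, rejection rate | 0.2122 | 0.8897 | 0.6577 | 0.8065 |
Rm(y2), training R-squared, flux | 0.0725 | 0.9175 | 0.6887 | 0.6583 |
Rn(y1), testing R-squared, rejection rate | 0.0784 | 0.6344 | 0.3104 | 0.4344 |
Rn(y2), testing R-squared, flux | −0.0329 | 0.8093 | 0.1800 | 0.6583 |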
By combining the performance results in Table 1 with the properties of each SL model, we can reveal some interesting underlying characteristics of the data. We begin with the worst SL model, MART, which has very low R-squared values under all conditions. In other words, the piecewise constant approximation does not work on this data, partially due to the small number of controlled measurements. However, we find that both the bias and variance are lower for the rejection rate. Thus, compared to the flux, the rejection rate has relatively high-order interactions with the processing parameters. This argument can be verified with the performance of LR. Both training R-squared values are relatively high; for the flux, the value is even higher than that of SVM. Furthermore, SVM has a much higher training R-squared for the rejection rate, and much higher testing R-squared values for both rejection rate and flux, than LR. Therefore, the relationship between the flux and the processing parameters is approximately linear, but the rejection rate may have more complex and higher-order interactions with the processing parameters. In addition, the noise of the measurement data is relatively high. Finally, although the testing R-squared values of SVM are much higher than those of LR due to the noise reduction in the higher-dimensional feature space, they are still much lower than those of NN. This is consistent with SVM overfitting on small data, even when the regularization cost is set as high as 2.5.
NN outperforms all other SL models in all aspects, and if the whole data set is used for training, it reaches training R-squared values as high as 0.8992 and 0.9559 for the rejection rate and the flux, respectively. Thus, compared to the numerical approximation on categorical predictors, the correlation between the rejection rate and the flux is much more important in our predictions. To visualize the performance of NN, we plot the prediction versus the true response in Fig. 6. A prediction is perfect if its point lies on the line with intercept 0 and slope 1. Furthermore, we plot the training data points and fitting curves of SVM and NN inside the predictor subspace of PVC wt% and DMAc wt% in Fig. 7 and 8 by fixing all other predictors as additive wt% = 0%, additive type = none, evaporation time = 5 s, blade temperature = 60 °C, bath type = water, and volume concentration of solute in gelation bath = 0 mg L−1. We can see that the fitting curves of NN are smoother and fit the training data better. In summary, because our data set is very small and noisy, the complex relationship between the rejection rate and the processing parameters is hard to fit with a good trade-off between bias and variance. Fortunately, the rejection rate is correlated with the flux, which has a much simpler, approximately linear relationship with the processing parameters, so we can apply NN to fit these two indicators jointly.
Fig. 6 Prediction versus response plots for training, validation, testing, and the whole data set; target and output denote the true response and the predicted response by NN, respectively. |
Fig. 7 Training data and fitting curves of rejection rate and flux in the subspace of PVC wt% and DMAc wt% using SVM. |
Fig. 8 Training data and fitting curves of rejection rate and flux in the subspace of PVC wt% and DMAc wt% using NN. |
Fig. 9 Possible combinations of PVC wt% and DMAc wt% for specific constraints on indicators fixing all other processing parameters. |
So we can use k instead of DMAc wt%. On the other hand, although the prediction accuracy is not guaranteed over the whole predictor space, both training and testing R-squared are very high within the data set. This means that if the search points are not too far away from the measurement points, the corresponding predictions are reliable. In particular, we have the search space PVC wt% = 7.5:0.5:18 (%), k = (PVC wt%/21), 0.05:0.9, and additive wt% = 1:1:5 (%) if additive type is not none, evaporation time = 5:15:110 (s), blade temperature = 30:10:80 (°C), and bath concentration = 10:10:80 (mg L−1).
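The ranges above use start:step:stop notation, and the numerical part of the search space can be enumerated as in the sketch below. The helper names are hypothetical. The bounds of the k range are not fully legible in the text, so the k values are left as a caller-supplied argument rather than guessed; the sketch also ignores the categorical parameters and the condition that additive wt% applies only when an additive is present.

```python
from itertools import product


def colon_range(start, step, stop):
    """Inclusive start:step:stop range; the last value never exceeds stop."""
    count = int((stop - start) / step + 1e-9) + 1
    return [start + i * step for i in range(count)]


def search_grid(k_values):
    """Yield (PVC wt%, k, additive wt%, evaporation s, blade degC, bath mg/L) tuples."""
    pvc = colon_range(7.5, 0.5, 18)
    additive = colon_range(1, 1, 5)
    evaporation = colon_range(5, 15, 110)
    blade = colon_range(30, 10, 80)
    bath = colon_range(10, 10, 80)
    return product(pvc, k_values, additive, evaporation, blade, bath)


# Placeholder k values for illustration only; the actual k range is not assumed here.
n_points = sum(1 for _ in search_grid(k_values=[0.5, 0.7]))
```

Each grid point would then be passed to the trained model to obtain a predicted rejection rate and flux.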
Finally, we select the combination of processing parameters that has the maximum flux under the constraint 80% ≤ rejection rate ≤ 100%. With the water bath, we find that the optimal combination of processing parameters is PVC wt% = 7.5%, DMAc wt% = 84%, additive wt% = 1%, k = 0.5 (PVB wt% = 7.5%), additive type = PEG600, evaporation time = 5 s, and blade temperature = 30 °C, leading to a rejection rate of 80.03% and a flux of 329.88 L (m2 h)−1. Similarly, in the DMAc bath, we find that when PVC wt% = 16%, DMAc wt% = 78%, additive wt% = 2%, k = 0.8 (PVB wt% = 4%), additive type = PVP K90, evaporation time = 5 s, blade temperature = 30 °C, and bath concentration = 80 mg L−1, the rejection rate is 81.39% and the maximum flux is 271.61 L (m2 h)−1. Although our results are not guaranteed to be globally optimal, they are much more robust than the best measurement, which has a rejection rate of 82.07% and a flux of 122.70 L (m2 h)−1 (with the processing parameters PVC wt% = 12.6%, DMAc wt% = 77%, additive wt% = 5%, k = 0.7 (PVB wt% = 5.4%), additive type = PEG600, evaporation time = 10 s, blade temperature = 60 °C, bath type = DMAc, and bath concentration = 80 mg L−1). To check the accuracy of the models used to optimize membrane performance, we fabricated PVC/PVB flat sheet membranes strictly under the above optimized parameters. Fig. 10 shows the surface and cross-section morphology and the contact angle of the as-prepared membranes. With the pure water gelation bath, the rejection rate of the as-prepared membrane was 80.2% and the flux was 318.27 L (m2 h)−1, while with DMAc as the solute in the gelation bath, the as-prepared membrane had a rejection rate of 86.2% and a flux of 298.5 L (m2 h)−1. The results showed very good agreement between the model predictions and the experimental data.
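The selection rule and the comparison with experiment in this paragraph can be sketched as follows. The function names and the dictionary layout are hypothetical; the rejection rates and fluxes are the values quoted above.

```python
def best_under_constraint(candidates, r_min=80.0, r_max=100.0):
    """Return the candidate with maximum flux among those meeting the rejection bounds."""
    feasible = [c for c in candidates if r_min <= c["rejection"] <= r_max]
    return max(feasible, key=lambda c: c["flux"]) if feasible else None


def relative_deviation_percent(predicted, measured):
    """Signed deviation of the prediction from the measurement, in percent."""
    return (predicted - measured) / measured * 100.0


# Values quoted in the text: rejection rate in %, flux in L (m2 h)-1.
candidates = [
    {"label": "water bath optimum", "rejection": 80.03, "flux": 329.88},
    {"label": "DMAc bath optimum", "rejection": 81.39, "flux": 271.61},
    {"label": "best measurement", "rejection": 82.07, "flux": 122.70},
]
winner = best_under_constraint(candidates)

# Predicted versus experimentally verified flux for the two optimized formulations.
water_flux_dev = relative_deviation_percent(329.88, 318.27)
dmac_flux_dev = relative_deviation_percent(271.61, 298.5)
```

On these numbers the predicted flux deviates from the verified flux by roughly +3.6% for the water bath and −9.0% for the DMAc bath, which gives a quantitative sense of the agreement reported.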
Additionally, we glean several interesting findings from this research. One is how to find the optimal mixture of feature compounds for the fabrication processes more effectively and efficiently. Another is that among the tested SL approaches, the NN method provides the most reliable and trusted results. In the future, we will investigate how to develop a recursive and automated data-driven experimental analytics approach to design performance-specific membranes more effectively and efficiently.
Footnote |
† Electronic supplementary information (ESI) available. See DOI: 10.1039/c5ra24654g |
This journal is © The Royal Society of Chemistry 2016 |