Mood Mohan *ab, Omar Demerdash b, Blake A. Simmons ac, Jeremy C. Smith bd, Michelle K. Kidder e and Seema Singh *a
aDeconstruction Division, Joint BioEnergy Institute, 5885 Hollis Street, Emeryville, California 94608, USA. E-mail: moodm@ornl.gov; mohanchauhan08@gmail.com; Seema.Rose.Singh@gmail.com; ssingh@lbl.gov
bBiosciences Division and Center for Molecular Biophysics, Oak Ridge National Laboratory, Oak Ridge, Tennessee 37831, USA
cBiological Systems and Engineering Division, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, California 94720, USA
dDepartment of Biochemistry and Cellular and Molecular Biology, University of Tennessee, Knoxville, Tennessee 37996, USA
eManufacturing Science Division, Oak Ridge National Laboratory, Oak Ridge, Tennessee 37831-6201, USA
First published on 28th February 2023
Carbon dioxide (CO2) emissions from fossil fuel combustion are a significant source of greenhouse gas, contributing in a major way to global warming and climate change. Carbon dioxide capture and sequestration is gaining much attention as a potential method for controlling these greenhouse gas emissions. Among the environmentally friendly solvents, deep eutectic solvents (DESs) have demonstrated the potential capability for carbon capture. To establish a theoretical framework for DES activity, thermodynamics modeling and solubility predictions are significant factors to anticipate and understand the system behavior. Here, we combine the COSMO-RS model with machine learning techniques to predict the solubility of CO2 in various deep eutectic solvents. A comprehensive data set was established comprising 1973 CO2 solubility data points in 132 different DESs at a variety of temperatures, pressures, and DES molar ratios. This data set was then utilized for the further verification and development of the COSMO-RS model. The CO2 solubility (ln(xCO2)) in DESs calculated with the COSMO-RS model differs significantly from the experiment with an average absolute relative deviation (AARD) of 23.4%. A multilinear regression model was developed using the COSMO-RS predicted solubility and a temperature-pressure dependent parameter, which improved the AARD to 12%. Finally, a machine learning model using COSMO-RS-derived features was developed based on an artificial neural network algorithm. The results are in excellent agreement with the experimental CO2 solubilities, with an AARD of only 2.72%. The ML model will be a potentially useful tool for the design and selection of DESs for CO2 capture and utilization.
There are several different technologies that are being investigated for the capture of CO2, for example, pressure-swing adsorption and physical or chemical-solvent scrubbing.7,10 However, most technologies still suffer from high energy requirements, increased costs, and significant secondary pollution as a result of the complexity of the gas components.7,11 There is therefore a pressing need for the development of new capture technologies, which may include the design of new solvents and novel processes. Ionic liquids (ILs) are among the potential solvents for CO2 capture12,13 and have been extensively studied due to their unique and attractive properties.13–15 However, due to the extensive procedures and multiple steps involved in the synthesis and purification process, ILs are expensive solvents. For this reason, deep eutectic solvents (DESs) have emerged as promising alternatives to ILs in a wide variety of research areas and industries, including CO2 capture, biomass processing, nanotechnology, extraction processes, electrochemistry, catalysts, etc.16,17
DESs are unique solvents with many desirable characteristics, including low vapor pressure, high conductivity, high thermal and chemical stability, non-flammability, non-toxicity and a large chemical window.18,19 When compared to ILs, DESs offer a few primary advantages, the most notable of which is that the preparation of DESs is simple and economical, and there is no additional purification step required.18,20 The most fascinating property of DESs is their structural diversity. DESs are prepared by mixing a hydrogen bond acceptor (HBA) and hydrogen bond donor (HBD) at a specific molar ratio, and the resulting mixture turns into a liquid that is driven by strong interactions between HBA and HBD.20,21 A large number of cheap and renewable compounds can serve as the HBA (e.g., [Ch]Cl) and HBD (e.g., urea, sugars, acids, etc.), making DESs more affordable and sustainable than ILs.17
In recent years, DESs have been demonstrated as a potential solvent for CO2 absorption.5,22,23 However, to date the majority of the research into CO2 absorption using DESs has relied on experimental methods, which have only been able to address a small fraction of potential DES candidates.24,25 Because of this structural diversity, there are approximately 10^18 possible DES combinations from which a solvent with potentially improved CO2 absorption capabilities could be designed.26 The experimental screening of such a large number of combinations for their capacity to solubilize CO2 is intractable. A reliable computational model for predicting CO2 solubilities in DESs is therefore highly desirable. This would reduce both the cost and the time required to develop effective solvent systems for carbon capture and utilization.
In recent years, a variety of thermodynamic models such as NRTL (non-random two-liquid), UNIQUAC (UNIversal QUAsiChemical), and UNIFAC (UNIQUAC Functional-group Activity Coefficients)27 and equation of state methods (i.e., PC-SAFT (perturbed chain-statistical associating fluid theory),28 soft-SAFT,29 CPA (Cubic-Plus Association),30 and PR-EoS (Peng-Robinson equation of state)30,31) have been successfully implemented in DES-containing systems for the purpose of predicting gas solubility. However, these methods require experimental input data to fit molecule-specific binary interaction and mixing parameters, which limits their applicability to novel solvent systems such as ILs and DESs. Recently, Biswas (2022)32 performed molecular dynamics (MD) simulations of CO2 in ionic liquids (ILs). Also using MD, Wang et al. (2019)33 studied the interaction of phosphonium-based DESs with CO2. However, performing MD simulations for large numbers of new ionic combinations and DESs is challenging due to the difficulty in generating force field parameters. Moreover, MD, MC (Monte Carlo), and explicit quantum chemical (QC) calculations of molecular complexes that explicitly take into account DES-DES and DES-CO2 interactions require prohibitive computational resources. Fortunately, a first-principles quantum chemical-based thermodynamic model, COSMO-RS (COnductor like Screening MOdel for Real Solvents), has been extensively used for screening solvents and predicting gas solubilities with acceptable accuracy.25,26 Typically, only information on the structure of the molecule is required for the COSMO-RS calculations to predict the solubility and other thermodynamic properties. However, recent studies show that the COSMO-RS model overpredicts or underpredicts the gas solubilities in DESs. For instance, Liu et al. (2020) predicted the solubility of CO2 in 35 DESs using the COSMO-RS model and found 59–78% average absolute relative deviation (AARD) from experiment.25 A similar result was also reported by Wang et al. (2021) during their study on CO2 solubility in DESs.26 However, these studies completely ignored the conformers of HBAs and HBDs during the COSMO-RS predictions. As alternatives, molecular dynamics and Monte Carlo simulations have been demonstrated to be reliable computational techniques for predicting the thermodynamic and phase equilibria properties, including gas solubility in solvents;33,34 however, these methods are computationally expensive, making them impractical for screening gas solubility across the wide diversity of the DES solvent space.
A potentially useful approach is to develop machine learning models based on quantitative structure–property relationships (QSPR). This could provide an accurate and cost-effective tool for evaluating CO2 solubility and DES properties while also offering useful insights into the relationships between molecular-level interactions and their macroscopic properties. As a prerequisite for QSPR models, COSMO-RS-based descriptors, such as the probability distribution of a molecular surface segment having a specific charge density, i.e., the Sigma profile charge distribution area (Sσ-profile), have been demonstrated to be reliable molecular-specific input features for predicting solvent properties (e.g., for ILs and DESs). For example, recently, Abranches et al.(2022)35 developed a machine learning model for predicting density, refractive index, and aqueous solubility using the COSMO-RS-derived Sigma profile features as input. Lemaoui et al. extensively used the COSMO-RS calculated Sigma profile areas as an input parameter for developing QSPR models for predicting the thermodynamic properties (density, viscosity, surface tension, electrical conductivity, and pH) of DESs.36–38 In addition, Nordness et al. (2021) have developed a machine learning model for predicting thermophysical properties of ionic liquids using the Sigma profiles.39 Therefore, the COSMO-RS derived Sigma profile parameters might also be explored for establishing a machine learning model for CO2 solubility prediction in DESs.
Given the limitations of linear and multilinear models in describing many thermophysical properties, machine learning (ML) algorithms have become increasingly popular for developing and building more complex non-linear QSPR models for predicting physicochemical and phase equilibrium properties. Among these, as a highly effective tool for simulating a wide range of phenomena, artificial neural networks (ANNs) have emerged as a promising tool for modeling complex processes.40 Numerous studies in the literature report that ANN models have a high level of accuracy for predicting thermodynamic properties based on molecular descriptors. For example, Adeyemi et al. (2018)41 developed an ANN bagging model to predict the density and conductivity of DESs and reported an R2 of 0.999. Atashrouz et al. (2015) predicted the surface tension of ILs using the ANN model and achieved a remarkable performance with an AARD of 4.5%.42 Further, Lemaoui et al. (2022)37 reported the prediction of surface tension of DESs using an ANN model with an AARD of 1.43% and 3.04% for training and testing sets, respectively. Therefore, the performance of ANN-based models appears to be remarkable for predicting thermodynamic properties. However, the development of an ANN model for CO2 solubility prediction has not been previously described. Therefore, a systematic screening of structurally diverse DESs is highly desirable for developing a comprehensive ANN model for CO2 solubility prediction.
In the present study, an ANN-based machine learning model was developed to predict CO2 solubility in various DESs over wide ranges of temperature and pressure. It is important to mention that the present study aims to focus on the solubility of CO2 in physical-based DESs. For the physical-based DES, CO2 absorption capacity is in accordance with Henry's constant and selectivity, and directly related to the structure of HBA and HBD. According to the literature, physical-based DES does not form covalent bonds with CO2.4,33 A comprehensive survey of the published experimental results of CO2 solubility was carried out for different types of physical-based DESs at different experimental conditions. The COSMO-RS model was used to calculate the solubility of CO2 in DESs, and the results were then compared with experimental CO2 solubilities. Further, the Sigma profile descriptors of HBA and HBD of DESs were derived from the COSMO-RS calculations. Based on the literature database and COSMO-RS-derived input features of DESs, a machine learning model was developed and validated. Using the model, novel HBA and HBD combinations are proposed for improving CO2 solubility in DES.
$p_j = p_j^{0} \times x_j \times \gamma_j$ | (1) |
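Because the activity coefficient in eqn (1) generally depends on composition, solving the relation for the mole fraction at a given partial pressure is usually done iteratively. The following is a minimal sketch under stated assumptions: `gamma_of_x` is a hypothetical stand-in for a COSMO-RS activity-coefficient evaluation (not reproduced here), and the numbers are purely illustrative rather than taken from this work.

```python
def solve_mole_fraction(p_j, p0_j, gamma_of_x, x_init=0.01, tol=1e-10, max_iter=200):
    """Solve p_j = p0_j * x_j * gamma_j(x_j) for x_j by fixed-point iteration.

    gamma_of_x is a hypothetical callable standing in for a COSMO-RS
    activity-coefficient calculation; it is an assumption of this sketch.
    """
    x = x_init
    for _ in range(max_iter):
        x_new = p_j / (p0_j * gamma_of_x(x))
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    raise RuntimeError("fixed-point iteration did not converge")


# Illustrative numbers only (pressures in kPa, made-up gamma model).
x_co2 = solve_mole_fraction(p_j=200.0, p0_j=6000.0,
                            gamma_of_x=lambda x: 1.5 + 0.5 * x)
```

Fixed-point iteration is chosen here only for brevity; a bracketing root finder would be a more robust choice when the activity coefficient varies strongly with composition.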
(2) |
Fig. 3(a and b) displays the σ-profiles of HBAs and HBDs of DESs. The σ-profile distributions in the hydrogen bond donor and acceptor regions, as well as the σ-profile areas of the molecules, vary widely, revealing a unique σ-profile for each molecule.35 The σ-profiles are divided into three regions: H-bond acceptor (σ > +1 e nm−2), H-bond donor (σ < −1 e nm−2), and non-polar (−1 e nm−2 < σ < +1 e nm−2) regions. To determine the σ-profile input descriptors for the machine learning model, the σ-profiles of DES constituents were divided into 10 fractions (i.e., S1–S10) by integrating σ-profile px(σ) curves over the screening charge density, σ. As exemplified by HBA and HBD in Fig. 1a and b, the fractions of the Sσ-profiles are classified into five classes depending on the screening charge densities: (1) the strong donor region [S1 and S2], (2) the weak donor region [S3], (3) the non-polar region [S4, S5, S6, and S7], (4) the weak acceptor region [S8], and (5) the strong acceptor region [S9 and S10].
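The integration of a σ-profile into ten area descriptors can be sketched as follows. This is an illustration under assumptions, not the procedure used in this work: the σ range, the equal-width bins, and the toy flat profile are all hypothetical, and the function name `sigma_profile_areas` is introduced only for this example.

```python
def sigma_profile_areas(sigma, p_sigma, n_bins=10):
    """Integrate a sigma-profile p(sigma) over n_bins contiguous sigma windows.

    sigma and p_sigma are equal-length sequences on an ascending grid whose
    number of intervals is a multiple of n_bins (assumed here so that every
    bin edge coincides with a grid point).
    """
    n_intervals = len(sigma) - 1
    if len(sigma) != len(p_sigma) or n_intervals % n_bins != 0:
        raise ValueError("grid must have a multiple of n_bins intervals")
    per_bin = n_intervals // n_bins
    areas = []
    for b in range(n_bins):
        start = b * per_bin
        area = 0.0
        for i in range(start, start + per_bin):
            # Trapezoidal rule on one grid interval.
            area += 0.5 * (p_sigma[i] + p_sigma[i + 1]) * (sigma[i + 1] - sigma[i])
        areas.append(area)
    return areas  # [S1, ..., S10] when n_bins == 10


# Toy profile: flat p(sigma) = 2 on an assumed range of -2.5 to +2.5 e nm^-2.
grid = [-2.5 + 0.1 * k for k in range(51)]
S = sigma_profile_areas(grid, [2.0] * 51)
```

With a real profile, the ten returned areas would play the role of S1–S10; the bin boundaries should be taken from the descriptor definition actually used.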
The Sσ-profiles of the modeled DESs are defined as the molar-weighted average of the constituents, which is the standard approach used to define the DES in the literature.36,37 The equation is expressed as follows:
(3) |
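The molar-weighted averaging described above can be illustrated with a short sketch. It assumes that the weights are the mole fractions implied by the HBA:HBD molar ratio; the function name and the descriptor values are hypothetical.

```python
def des_descriptors(s_hba, s_hbd, n_hba, n_hbd):
    """Mole-fraction-weighted average of HBA and HBD S-sigma descriptors."""
    if len(s_hba) != len(s_hbd):
        raise ValueError("descriptor vectors must have the same length")
    x_hba = n_hba / (n_hba + n_hbd)
    x_hbd = 1.0 - x_hba
    return [x_hba * a + x_hbd * d for a, d in zip(s_hba, s_hbd)]


# Hypothetical descriptor values for a 1:2 HBA:HBD mixture.
s_des = des_descriptors([3.0, 0.0, 6.0], [0.0, 3.0, 6.0], n_hba=1, n_hbd=2)
```

In practice each vector would hold the ten Sσ-profile areas of one constituent, so the DES is described by ten averaged values.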
Each perceptron has an associated weight that reflects how strongly it contributes to the ANN model's output. The following is a definition of the hidden neurons that are contained within the neural network (Hn,p):31
(4) |
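To make the role of the hidden neurons concrete, the sketch below evaluates a small feed-forward network. It is not the fitted model: the tanh activation, the single hidden layer, the number of hidden neurons, and the random weights are all assumptions made for illustration.

```python
import math
import random


def ann_forward(inputs, w_hidden, b_hidden, w_out, b_out):
    """One-hidden-layer network: tanh hidden neurons, linear output."""
    hidden = [
        math.tanh(sum(w * x for w, x in zip(weights, inputs)) + bias)
        for weights, bias in zip(w_hidden, b_hidden)
    ]
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out


# 12 inputs: T, P and the ten S-sigma descriptors. The weights are random
# placeholders, not fitted parameters from the paper.
rng = random.Random(0)
n_in, n_hidden = 12, 5
w_hidden = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)]
b_hidden = [0.0] * n_hidden
w_out = [rng.uniform(-1, 1) for _ in range(n_hidden)]
features = [0.5] * n_in  # inputs assumed to be scaled already
ln_x_pred = ann_forward(features, w_hidden, b_hidden, w_out, 0.0)
```

Training would adjust the weights and biases so that the output matches the experimental ln(xCO2); that step is left to the statistical software and is not shown.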
In this work, an ANN-based machine learning model was developed using the JMP Pro statistical software (JMP SAS 14.3.0)59 by utilizing the temperature, pressure, and the 10 Sσ-profiles molecular descriptors as input features to predict the solubility of CO2 in DESs as an output variable. The predictive correlation is defined as follows:
(5) |
(6) |
(7) |
(8) |
(9) |
To run the COSMO-RS model for CO2 solubility, a large number of experimental data points were collected from the literature for CO2 solubility in 132 DESs over a wide range of temperatures (T = 293.15 K to 348.15 K), pressures (P = 26.3 kPa to 7620 kPa), and DES molar ratios (1:1 to 1:16). The same experimental conditions (T, P, DESs, and molar ratios) were used as input to calculate the solubility of CO2 using COSMO-RS. The COSMO-RS predicted and experimental CO2 solubility data are compared and summarized in Fig. 4 and Table S1.† The COSMO-RS model calculates the solubility of CO2 in DESs with an AARD of 23.4% and R2 of 0.85. Table S1† shows that the calculated solubility of CO2 increases with pressure and decreases with increasing temperature, which is in agreement with experimental observations. However, because of the relatively high AARD values, COSMO-RS agreement with experiment is only qualitative, not quantitative. For example, the experimental solubility of CO2 (ln(xCO2)) in [Ch]Cl-phenol DES at 1:2 molar ratio is −4.87 at T = 293.15 K and P = 197.2 kPa, and −4.95 at T = 303.15 K and P = 198.2 kPa. The corresponding COSMO-RS predicted ln(xCO2) values are −3.46 and −3.71, respectively, i.e., relative deviations of ∼25–29%, indicating that COSMO-RS reproduces the trend qualitatively (as T increases, ln(xCO2) decreases) but not the magnitude. It is worth noting that the AARD between experimental and COSMO-RS predictions decreases with increasing temperature. For instance, for [TBA]Cl-LA at 1:2 the AARD at 93 kPa is 11.3% at 308 K and 6.8% at 318 K. In contrast, the AARD increases with pressure (e.g., for [TBA]Cl-LA (1:2) at 308 K, the AARD rises from 11.3% at 93 kPa to 19.5% at 1992 kPa). The higher AARD at lower temperatures may arise because the COSMO-RS model underpredicts the CO2 solubility in DESs; the higher viscosity of DESs at low temperature, which limits the solubility, may also contribute.
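The deviation measures used to compare predicted and experimental solubilities can be written out as a short sketch. The formulas below are the conventional definitions and are assumed, not confirmed, to match those used in this work; in particular, R2 is taken as one minus the ratio of residual to total sum of squares. The example values are the two [Ch]Cl-phenol (1:2) points quoted above.

```python
import math


def ard_percent(exp, pred):
    """Absolute relative deviation of a single prediction, in percent."""
    return 100.0 * abs(pred - exp) / abs(exp)


def aard_percent(exp_vals, pred_vals):
    """Average of the absolute relative deviations, in percent."""
    return sum(ard_percent(e, p) for e, p in zip(exp_vals, pred_vals)) / len(exp_vals)


def mae(exp_vals, pred_vals):
    return sum(abs(p - e) for e, p in zip(exp_vals, pred_vals)) / len(exp_vals)


def rmse(exp_vals, pred_vals):
    return math.sqrt(sum((p - e) ** 2 for e, p in zip(exp_vals, pred_vals)) / len(exp_vals))


def r_squared(exp_vals, pred_vals):
    mean_exp = sum(exp_vals) / len(exp_vals)
    ss_res = sum((e - p) ** 2 for e, p in zip(exp_vals, pred_vals))
    ss_tot = sum((e - mean_exp) ** 2 for e in exp_vals)
    return 1.0 - ss_res / ss_tot


# ln(xCO2) for [Ch]Cl-phenol (1:2) at 293.15 K and 303.15 K, from the text.
ln_x_exp = [-4.87, -4.95]
ln_x_cosmo = [-3.46, -3.71]
point_ards = [ard_percent(e, p) for e, p in zip(ln_x_exp, ln_x_cosmo)]
```

Note that the deviations are computed on ln(xCO2), as in the text, so they are not directly comparable with deviations computed on the mole fraction itself.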
Fig. 4 Correlation between the COSMO-RS predicted and experimental CO2 solubility in deep eutectic solvents. |
A closer look at Fig. 4 shows that the COSMO-RS-calculated CO2 solubility values are lower than the experimental results. Interestingly, at higher temperatures, the AARD values are lower than at lower temperatures and the DESs with longer alkyl chain length HBAs (e.g., [TBA]+) or larger size (e.g., [ATPP]+ cations/salts) with phenols as HBD show AARDs less than 10%, which is in excellent agreement with experimental solubility.
We also compared our COSMO-RS-calculated results with related works in the literature. Recently, Liu et al. (2020)25 used the COSMO-RS model to calculate the solubility of CO2 in 35 DESs with 502 data points. They reported that the average ARD between experimental and COSMO-RS predictions was 59.2–78.2%, which is a much higher deviation than current study predictions. This may be due to not using the energetically optimal DESs (HBA and HBD) conformers in their COSMO-RS calculations, resulting in higher CO2 solubility deviations. However, with increasing pressure and decreasing temperature, the discrepancies in the present work become larger, and this is consistent with the observations by Liu et al. (2020)25 and Kamgar et al. (2017).64 Therefore, using optimal molecular conformers of DESs provides a significant benefit to COSMO-RS calculations, which in turn leads to better predictions of CO2 solubility.
(10) |
Here, r, T, and P are the molar ratio of DES, temperature (K), and pressure (kPa). k1–k6 are the fitting parameters. To obtain the k1–k6 parameters, the experimental results of CO2 solubilities in DESs at different molar ratios, temperatures, and pressures were used as fitting targets. In total 1973 experimental data points were included in fitting with a multilinear regression model. The values of the fitting parameters are listed in Table 1.
Table 1 Adjustable parameters |
k1 | k2 | k3 | k4 | k5 | k6 |
---|---|---|---|---|---|
332.37 | −1799.04 | 7.1 × 10^−5 | −4.13 × 10^−5 | −1.116 | 4.92 |
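The general idea of fitting adjustable parameters to experimental solubilities by linear least squares can be sketched as follows. The functional form used here (an intercept plus terms linear in the COSMO-RS prediction, T, and P) is a hypothetical stand-in and is not eqn (10); the synthetic data are generated only to show that the fit recovers known coefficients. NumPy is assumed to be available.

```python
import numpy as np


def fit_linear_correction(ln_x_cosmo, T, P, ln_x_exp):
    """Least-squares fit of ln_x_exp ~ a + b*ln_x_cosmo + c*T + d*P.

    This four-parameter form is an illustrative assumption, not the
    six-parameter model of the paper.
    """
    ln_x_cosmo = np.asarray(ln_x_cosmo, dtype=float)
    A = np.column_stack([
        np.ones_like(ln_x_cosmo),
        ln_x_cosmo,
        np.asarray(T, dtype=float),
        np.asarray(P, dtype=float),
    ])
    coeffs, *_ = np.linalg.lstsq(A, np.asarray(ln_x_exp, dtype=float), rcond=None)
    return coeffs


# Synthetic data with known coefficients (not experimental values).
rng = np.random.default_rng(0)
lc = rng.uniform(-6.0, -1.0, 50)
T = rng.uniform(293.15, 348.15, 50)
P = rng.uniform(26.3, 7620.0, 50)
true_coeffs = np.array([0.5, 0.8, -0.01, 1e-4])
y = true_coeffs[0] + true_coeffs[1] * lc + true_coeffs[2] * T + true_coeffs[3] * P
fitted = fit_linear_correction(lc, T, P, y)
```

For a model that is nonlinear in its parameters, a nonlinear least-squares routine would be needed instead.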
The CO2 solubilities obtained with the MLR model were compared with the corresponding experimental solubilities (Fig. 5). The MLR model results are much closer to the experimental CO2 solubilities than the original COSMO-RS model, with an AARD of 12%, and R2 of 0.87. Further, the results of the MLR model developed in the present study were compared with those of the MLR model of Liu et al. (2020),25 and we found that the MLR model in the present study yields lower AARD values (12%) than that of Liu et al. (2020) (59%). A higher deviation was also reported by Liu et al. (2021)63 during their study on the evaluation of MLR model proposed by Liu et al. (2020)25 in predicting the CO2 solubility in a new set of DESs and molar ratios. It is important to mention that Liu et al. (2020)25 developed a model that has certain limitations, such as not being applicable to situations with higher molar ratios of HBA to HBD (≥1:7), new DESs, and higher pressures (≥3000 kPa). Moreover, the model was developed with a smaller set of data points (502) and a smaller number of DESs (35); thus, it cannot be considered as a universal model for CO2 solubility prediction in all situations. In contrast, the MLR model of the present study was developed by considering a wider range of HBA to HBD molar ratios (1:1 to 1:16), temperatures (293.15 K to 348.15 K), and pressures (26.3 kPa to 7620 kPa) than that of the study by Liu et al. (2020), as well as a larger set of experimental data points (i.e., 1973), and a greater diversity of different DESs (132).
Fig. 5 Correlation between the COSMO-RS corrected multilinear regression model and experimental CO2 solubility in deep eutectic solvents. |
For ML, 55% (1084 data points) of the data was used for training and the remaining 45% (889 data points) of the data was used for testing. Fig. 6 illustrates the correlation of experimental and ML predicted CO2 solubilities in the training and testing sets. Fig. 6 also lists the statistical parameters for the ML model including R2, AARD, MAE, and RMSE. As depicted in the parity plot in Fig. 6, the predictions for the training and testing sets are in excellent agreement with experimental data. For the total set of data points, R2, AARD, MAE, and RMSE values are 0.99, 2.72%, 0.087, and 0.1287, respectively, which are all at a very desirable level of accuracy. Furthermore, statistical residual analysis was also performed for the ML model and confirmed the goodness-of-fit through a normal probability plot of the relative deviations, relative deviations vs. predicted values plot, and histogram of the relative deviations. Fig. S1 and S2† depict the statistical analysis plots and show that the CO2 solubility relative deviations are within 10% with an AARD of 2.72% and RMSE of 0.1287. Moreover, the distribution of the relative deviations in different ARD ranges is also shown in Fig. 7; the majority of CO2 solubility prediction data (87%) lies within 5% of AARD and 94.5% of data within 10% of AARD. Only 1.7% of the data lies beyond 15% of AARD. These results clearly demonstrate the accuracy of the developed ML model for CO2 solubility predictions. However, the ML model has certain limitations; the model predictions are more accurate for physical-based DES systems, but not reliable for chemical-based DESs.
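The distribution of deviations across ranges, as summarized for Fig. 7, amounts to counting the share of points below each threshold. A minimal sketch with made-up deviation values (not data from this work):

```python
def fraction_within(ards, threshold):
    """Share (in percent) of points whose ARD does not exceed threshold."""
    return 100.0 * sum(1 for a in ards if a <= threshold) / len(ards)


# Made-up per-point absolute relative deviations, in percent.
ards = [0.5, 1.2, 3.9, 4.8, 7.5, 9.9, 12.0, 16.3]
summary = {t: fraction_within(ards, t) for t in (5.0, 10.0, 15.0)}
beyond_15 = 100.0 - summary[15.0]
```

Applied to the per-point deviations of the full data set, the same counts would give the percentages quoted in the text.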
Fig. 6 Experimental and predicted CO2 solubility in DESs using an ANN-based machine learning model (a) training set and (b) testing set. |
Fig. 7 (a) relative deviation between the experimental and predicted CO2 solubilities in DES, and (b) the distribution of the absolute relative deviation in different deviation ranges. |
$h_i = v_i (V^{T} V)^{-1} v_i^{T}$ | (11) |
(12) |
A Williams plot illustrates a model's domain of applicability by plotting the standardized residuals (SDR) versus the leverage values (hi) of each data point. A data point lies within the applicability domain when −3 < SDR < +3 and 0 < hi < h*.67
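The leverage of eqn (11) and the applicability-domain test can be sketched with NumPy. Two points are assumptions of this sketch and not confirmed by the text: the critical leverage is taken as the commonly used h* = 3(p + 1)/n, and residuals are standardized by dividing by their sample standard deviation. The data are random placeholders.

```python
import numpy as np


def williams_flags(V, residuals, h_star=None):
    """Return leverage, standardized residuals, h*, and an in-domain mask."""
    V = np.asarray(V, dtype=float)
    residuals = np.asarray(residuals, dtype=float)
    n, p = V.shape
    core = np.linalg.inv(V.T @ V)
    # h_i = v_i (V^T V)^-1 v_i^T for every row v_i of V.
    leverage = np.einsum("ij,jk,ik->i", V, core, V)
    sdr = residuals / residuals.std(ddof=1)
    if h_star is None:
        h_star = 3.0 * (p + 1) / n  # common convention, assumed here
    inside = (np.abs(sdr) < 3.0) & (leverage < h_star)
    return leverage, sdr, h_star, inside


# Random placeholder descriptors and residuals (not data from the paper).
rng = np.random.default_rng(1)
V = rng.normal(size=(200, 3))
res = rng.normal(size=200)
leverage, sdr, h_star, inside = williams_flags(V, res)
coverage = 100.0 * inside.mean()
```

Plotting `sdr` against `leverage` with lines at ±3 and at `h_star` gives the Williams plot; `coverage` is the percentage of points inside the domain.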
Fig. 8 shows the Williams plot for each data point, where the AD boundaries consist of a critical leverage h* = 0.036 (vertical green dashed line) and the SDR limits of ±3 (two horizontal green dashed lines). The boundary lines divide the Williams plot into four major regions (I, II, III, and IV). Predictions of the chemical substances in region I are biased, which may be due to large uncertainty in the experimental data rather than wrong model predictions. The data points in region II are within the application domain of the model and these predictions are considered reliable. Interpolation among the corresponding data points can be done with reduced uncertainty. The chemical substances in region III have both high SDR (response outliers) and high leverage (>h*) values. If the data points are only slightly beyond the critical leverage h* and the SDR limits, the impact on the model is negligible. However, if the data points are far away from the critical leverage h* and the SDR limits, the outlier should be removed from the model's scope of application. Finally, the data points in region IV are both response outliers and high leverage values (i.e., >h*), indicating that the predictions have a certain deviation.
Fig. 8 Williams plot (standardized residual vs. leverage) of the total set of ML model for the CO2 solubility in DESs. |
From Fig. 8, the ANN model exhibits no structural outliers in region IV, as all the data points have leverage values lower than the critical value (hi < h*; region II). However, the predictions of CO2 solubility in a few DESs in both the training and testing sets are considered response outliers as they exhibit SDR values beyond the ±3 limits (region I), which brings down the AD coverage to 98.22%. In total, 35 data points are outside of the AD limit (regions I and IV), accounting for 1.78% of the total (1973), and the double extraterritorial region is blank (region III). The response outliers in the ANN model include [Ch]Cl-EA (1:7), [TPA]Cl-EA (1:7), [BHDE]Cl-LA (1:2), [ATPP]Br-DEG (1:4 and 1:10), [ATPP]Br-TEG (1:4), and [MTPP]Br-GLY (1:4). The response outliers beyond the SDR ± 3 boundaries may arise from large deviations in experimental measurements, and are mostly at lower temperatures and pressures (<400 kPa) in both the training and testing sets. Based on the AD analysis, it can be concluded that predictions for new DES combinations that (i) are within the model's applicability domain and (ii) contain constituents similar to the ones utilized in the training set can be considered reliable. However, predictions for new DESs that are not within the model's applicability domain should be treated with more caution. In addition, it may be worthwhile to perform experiments carefully and precisely at lower temperatures and pressures. Overall, the AD results indicate that the developed ML model possesses ample robustness and generalizability due to its large AD and structural coverage.
In addition, the covariance matrix plot between ML input features was investigated and depicted in Fig. 9. From Fig. 9, there is no significant linear connection between input features of ML except Sσ-profiles-5 (S5) and Sσ-profiles-6 (S6) of Sigma profile descriptors. The lack of linear correlation between input features indicates that the features are nonredundant and may result in a more robust ML model that more accurately predicts CO2 solubility. Fig. 9 also illustrates the correlation between ML input features and predicted CO2 solubility. The positive influence of the input features on the CO2 solubility prediction is indicated by the positive covariance matrix value, while the negative covariance matrix value indicates negative influence. It is worth mentioning that pressure, S5, S6, S2, S1, and S9 show a positive influence on the CO2 solubility predictions, implying that as the value of these parameters increases, the solubility of CO2 is seen to increase. On the other hand, the temperature has shown a negative correlation with CO2 solubility, which implies that CO2 solubility decreases with an increase in temperature; this result is in accordance with the experimental observations.
Fig. 9 Heatmap of the covariance matrix. Correlation between features of the input descriptor set and predicted CO2 solubility in DESs. |
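A pairwise linear-correlation matrix of the kind shown in the heatmap can be computed as in the sketch below. Pearson correlation coefficients are assumed here as the measure of linear association, and the small data set is made up for illustration only.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def correlation_matrix(columns):
    """Nested dict of pairwise correlations between named columns."""
    names = list(columns)
    return {r: {c: pearson(columns[r], columns[c]) for c in names} for r in names}


# Made-up values; real columns would be T, P, S1-S10 and predicted ln(xCO2).
data = {
    "T": [293.15, 303.15, 313.15, 323.15],
    "P": [100.0, 400.0, 200.0, 800.0],
    "ln_x": [-3.0, -2.2, -3.1, -2.0],
}
corr = correlation_matrix(data)
```

Entries close to +1 or −1 between two input features would flag redundancy, while the row for the predicted solubility indicates the sign of each feature's linear association with it.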
The developed ML model shows excellent performance in predicting CO2 solubility and in reproducing experimentally observed trends in the solubility that vary systematically with physicochemical characteristics of the solvent. It is also of interest to compare the model performance with that of other computational models reported in the literature. Table 2 shows the comparison of the results of the different models along with their AARD values. From Table 2, traditional thermodynamic models such as PR-EoS (Peng-Robinson Equation of State) and PC-SAFT show good performance with low AARDs. However, a caveat is that a very small set of data points and DESs was used in validating these models. Also, these models require experimental input data to fit molecule-specific binary interaction and mixing parameters, which restricts their applicability to new solvent systems such as ILs and DESs. Given the inapplicability of the traditional models to novel solvent systems (i.e., DES-CO2), machine learning and QSPR models are emerging as alternatives. Recently, Wang et al. (2021)26 proposed a QSPR model based on random forest regression for CO2 solubilities in DESs and reported an AARD of 7.76%, which is nearly three times that of the model in the present study (AARD of 2.72%). It is also important to note that a greater number of DESs and data points were used to develop our model than that of Wang et al. (2021).26
Model | No. of DESs (molar ratio HBA:HBD) | Data points | T (K) | P (kPa) | AARD (%) | Ref. |
---|---|---|---|---|---|---|
PC-SAFT | 4 (2:1 to 3:1) | 180 | 298.15–318.15 | 10–2000 | 3.97 | Zubeir et al. (2016)28 |
PR-EoS | 3 (1:2) | 57 | 309–329 | 40–160 | 0.80 | Mirza et al. (2015)71 |
COSMO-RS | 35 (1:2 to 1:6) | 502 | 293.15–333.15 | 71.5–2068 | 78.2 | Liu et al. (2020)25 |
COSMO-RS-based MLR | 35 (1:2 to 1:6) | 502 | 293.15–333.15 | 71.5–2068 | 10.8 | Liu et al. (2020)25 |
COSMO-RS | 59 (1:1.5 to 1:16) | 1011 | 293.15–343.15 | 36–12730 | 64.81 | Wang et al. (2021)26 |
QSPR (random forest regression) | 59 (1:1.5 to 1:16) | 1011 | 293.15–343.15 | 36–12730 | 7.76 | Wang et al. (2021)26 |
CPA | 13 (1:2 to 1:6) | 353 | 293.15–343.15 | 63–11820 | 7.02 | Pelaquim et al. (2022)30 |
PR-EoS | 13 (1:2 to 1:6) | 353 | 293.15–343.15 | 63–11820 | 5.50 | Pelaquim et al. (2022)30 |
COSMO-RS | 132 (1:1 to 1:16) | 1973 | 293.15–343.15 | 26.3–7620 | 23.4 | Present study |
COSMO-RS-based MLR | 132 (1:1 to 1:16) | 1973 | 293.15–343.15 | 26.3–7620 | 12 | Present study |
Machine learning (ANN) | 132 (1:1 to 1:16) | 1973 | 293.15–343.15 | 26.3–7620 | 2.72 | Present study |
The COSMO-RS model is widely used to predict the solubility of CO2 in a variety of solvent systems (molecular solvents, ionic liquids, and DESs). It is therefore instructive to compare three sets of accuracies: those reported in the literature for COSMO-RS, those of our own COSMO-RS predictions without ML, and those of our ML model built on COSMO-RS-derived features. As summarized in Table 2, the AARDs of COSMO-RS-predicted CO2 solubilities reported in the literature are in the range of 65–78.2%, while in our case the AARD is 23.4%. The lower AARD yielded by the COSMO-RS model in the present study is due to the consideration of multiple lowest energy molecular conformers of HBA and HBD, leading to more reliable predictions of CO2 solubility. Further, Liu et al. (2020)25 developed an MLR model for CO2 solubility and reported an AARD of 10.8%, which is consistent with our COSMO-RS-based MLR model predictions. Overall, the ML model is more reliable and accurate for CO2 solubility prediction than the COSMO-RS model alone; nonetheless, the COSMO-RS-derived descriptors are useful for developing ML models.
Fig. 11 SHAP feature importance for the testing data set of CO2 solubility in deep eutectic solvents. |
Based on the SHAP analysis, HBAs such as [TBA]Br, [TBP]Br, [TOA]Br, [ATPP]Br, menthol, and thymol, and HBDs such as TEG, DEG, decanoic acid (DecA), methyldiethanolamine (MDEA), ethanolamine (EA), ethylenecyanohydrin (ECH), and EG are potential candidates for high CO2 solubility due to the higher values of S4, S5, and S6 and lower values of the polar regions (S1–S3 and S8–S10). It has also been reported that DESs with longer alkyl chain lengths, or hydrophobic moieties in general, are better solvents for CO2.23,33 Bearing this in mind, novel DES combinations were chosen based on our ML predictions and the following DES combinations are proposed at different molar ratios and a wide range of pressures: menthol–decanoic acid (1:2), menthol–dodecanoic acid (1:2), [TBP]Br-TEG, [TOA]Br-TEG, [TOMA]Br-TEG, [ATPP]Br-DecA, [ATPP]Br-EA, [ATPP]Br-MDEA, and [ATPP]Br-ECH. Fig. 12 shows the calculated solubility of CO2 in the newly proposed DESs at 298.15 K and different pressures. As the pressure increases, the solubility of CO2 predicted by the ML model also increases, in accord with Henry's Law. More importantly from the perspective of solvents for CO2 capture, menthol–DecA, [TBA]Br-TEG, [TOMA]Br-TEG, and [ATPP]Br-DecA appear to be promising solvents for improving CO2 solubilities. The higher solubility in menthol- and phosphonium-based DESs is due to larger free volumes of HBA and HBD and strong interactions with CO2 through vdW interactions.23,68 Furthermore, to confirm the molar ratios of the newly proposed DES combinations, we performed COSMO-RS calculations for menthol with decanoic acid and dodecanoic acid as an example of calculating the eutectic point composition. The eutectic point compositions for [ATPP]Br-based DESs were not calculated and validated due to the lack of phase transition properties (i.e., melting point and heat of fusion values) in the literature. The detailed procedure for the calculation of the eutectic point composition is discussed in our previous study.19 Fig. S4† shows the COSMO-RS-calculated eutectic point composition of both DESs (menthol–DecA and menthol–DoDecA). Menthol forms a eutectic mixture with both decanoic acid and dodecanoic acid, and the mixture at the calculated eutectic point is in the liquid state at room temperature. Moreover, menthol–decanoic acid DES has a lower eutectic temperature (TE = 265.8 K) than menthol–dodecanoic acid (TE = 279 K), which indicates that menthol–DecA has a lower viscosity than menthol–DoDecA due to the larger liquid window.
Footnote |
† Electronic supplementary information (ESI) available: The CO2 solubility data in 132 DESs at different experimental conditions are provided in ESI along with different model predicted CO2 solubility. In addition, ML model validation and eutectic point composition of menthol/acids are also provided in the ESI along with this manuscript. See DOI: https://doi.org/10.1039/d2gc04425k |
This journal is © The Royal Society of Chemistry 2023 |