Derek P. Reynolds*, Maria Chiara Storer and Christopher A. Hunter*
Yusuf Hamied Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, UK. E-mail: herchelsmith.orgchem@ch.cam.ac.uk
First published on 16th September 2021
Surface site interaction points (SSIP) provide a quantitative description of the non-covalent interactions a molecule makes with the environment based on specific intermolecular contacts, such as H-bonds. Summation of the free energy of interaction of each SSIP across the surface of a molecule allows calculation of solvation energies and partition coefficients. A rule-based approach to the assignment of SSIPs based on chemical structure has been developed, and a combination of experimental data on the formation of 1:1 H-bonded complexes in non-polar solvents and partition of solutes between different solvents was used to parameterise the method. The resulting model is simple to implement using just a spreadsheet and accurately describes the transfer of a wide range of different solutes from water to a wide range of different organic solvents (overall rmsd is 1.4 kJ mol−1 for 1713 data points). The hydrophobic effect as well as the properties of perfluorocarbon solvents are described well by the model, and new descriptors have been determined for a range of organic solvents that were not accessible by direct investigation of H-bond formation in non-polar solvents.
Abraham developed a different approach by using experimental data on 1:1 complexation to develop summation solvation parameters to describe the total H-bond donor or acceptor capacity of a molecule.16–18 Here we show that the description of a molecule as a set of specific interaction points, which are associated with individual functional groups, can be used to sum interactions across the molecular surface and accurately predict solvation properties directly from chemical structure. The approach integrates the treatment of intermolecular interactions between solutes and phase transfer equilibria, which means that diverse experimental data can be used for parameterisation. The description of molecules as a collection of functional groups allows extrapolation to compounds for which experimental data are not available.
The generalised H-bond donor and acceptor parameters α and β can be used to describe the non-covalent interaction properties of both solvents and solutes.19,20 The free energy of formation (−ΔG°) for complexes in solution is determined by the competition between solute–solute, solute–solvent and solvent–solvent interactions, as illustrated in Fig. 1.
Given experimentally determined H-bond parameters for two solutes (α and β) and the solvent (αS and βS), it is possible to make a reliable quantitative estimate of the free energy of formation of a 1:1 complex between a H-bond donor (D) and H-bond acceptor (A) using eqn (1).
ΔG°/kJ mol−1 = −(α − αS)(β − βS) + 6 | (1) |
This surprisingly simple model is based solely on the properties of the pairwise local contacts between molecules, and it does not require any consideration of long range interactions, solvent dielectric constant, solvent structure, cavitation or entropic terms. The success of the model suggests that these terms are small relative to the free energy contributions due to local interactions, or that they cancel out, or that they are captured in some way by the constant of 6 kJ mol−1, which does not vary much between solvents.
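As a rough illustration of how eqn (1) is used, the sketch below evaluates ΔG° and the corresponding association constant for a single donor–acceptor pair. The function names are hypothetical and a temperature of 298 K is assumed; the example values are the hydroxyl solute parameters (α = 2.7, β = 5.3) and the carbon tetrachloride solvent parameters (αS = 1.40, βS = 0.60) quoted elsewhere in this paper.

```python
import math

R_KJ = 8.314e-3  # gas constant in kJ mol^-1 K^-1


def delta_g_eqn1(alpha, beta, alpha_s, beta_s, constant=6.0):
    """Free energy (kJ/mol) of 1:1 complex formation according to eqn (1)."""
    return -(alpha - alpha_s) * (beta - beta_s) + constant


def association_constant(delta_g, temperature=298.0):
    """Convert a free energy in kJ/mol into K (M^-1) via K = exp(-dG/RT)."""
    return math.exp(-delta_g / (R_KJ * temperature))


# Hydroxyl donor and acceptor sites in carbon tetrachloride (assumed pairing,
# chosen only to illustrate the arithmetic).
dg = delta_g_eqn1(alpha=2.7, beta=5.3, alpha_s=1.40, beta_s=0.60)
print(f"dG = {dg:.2f} kJ/mol, K = {association_constant(dg):.2f} M^-1")
```

When the solute and solvent parameters coincide, the polar term vanishes and only the constant of 6 kJ mol−1 remains, which is a quick sanity check on any implementation.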
The approach illustrated in Fig. 1 has been extrapolated from single point interactions to a complete description of molecular surfaces by assigning a set of surface site interaction points (SSIP) to describe all of the non-covalent interactions that a molecule can make with the surroundings. Each SSIP is assigned an interaction parameter, which is equivalent to the empirically measured α and β parameters illustrated in Fig. 1. Fig. 2 shows the result for water, which is represented by two donor SSIPs and two acceptor SSIPs. Two approaches have been developed for obtaining the SSIP description of a molecule: manual assignment based on functional groups,21 and a computational method based on footprinting of ab initio calculated molecular electrostatic potential surfaces.22,23 The molecular SSIP description can be used to estimate the free energy contribution of non-covalent interactions to the stability of a solid by assuming that SSIPs are paired in a hierarchical fashion to maximise the total interaction energy, and this approach has been applied to successfully predict cocrystal formation.24 The free energy of interaction of a molecule with a solvent can be calculated by using the SSIP descriptions of solute and solvent in the surface site interaction model for the properties of liquids at equilibrium (SSIMPLE), and this approach has been applied in the calculation of solution phase properties like partition.21,23
Fig. 2 Water is represented by four SSIPs that describe the two H-bond donor (blue) and two H-bond acceptor (red) sites.
In SSIMPLE, solvation free energies are obtained by calculating the equilibrium distribution of all pairwise SSIP contacts, also allowing for non-bonded states to account for the significant void space present in a liquid. There is an implicit assumption in eqn (1) that all four complexes shown in Fig. 1 are fully bound and that non-polar van der Waals interactions cancel out, so that the equilibrium is dominated by polar interactions. The introduction of non-bonded states in SSIMPLE therefore required an additional treatment of van der Waals interactions, and calculation of solvation energies using this approach is significantly more complicated than eqn (1) would suggest. Here, we present an alternative approach, which is operationally simpler and has the advantage that empirical elements are introduced to improve the accuracy. For teaching purposes, it is also useful to have an approach where partition can be calculated using only a pocket calculator or simple spreadsheet.
ΔG° = −αβ + αSβ + αβS − αSβS + RTln[S·S] | (2) |
In general, the effective concentration of solvent–solvent interactions is not well-defined, so reliable values of [S·S] can only be deduced from experimental data. Experimentally determined association constants for formation of 1:1 H-bonded complexes in carbon tetrachloride were used to derive eqn (1), which gave a value of 10 M for [S·S] for this solvent. In this paper, we describe an approach based on partitioning of solutes between different solvents, which allows experimental determination of [S·S] for a wide range of different solvents.
The first term in eqn (2) describes the free energy of the A·D complex, and the other terms describe the free energy changes associated with desolvation of the two solutes on complexation. If we define ΔgS as the free energy of transfer of a SSIP from an external reference state into a solvent S, then eqn (2) can be rewritten as eqn (3).
ΔG° = −αβ − ΔgS(α) − ΔgS(β) | (3) |
ΔgS(α) = −αβS − Cβ | (4) |
ΔgS(β) = −αSβ − Cα | (5) |
The two constants Cα and Cβ describe the solvent–solvent interactions that are disrupted when a solute enters the solvent. The sum of these constants should match the corresponding terms in eqn (2), which provide a composite description of the polarity and the concentration of the interactions that each solvent SSIP makes with the surrounding bulk solvent. Since these solvent–solvent terms describe self-association, in solvents where there is only one type of solvent–solvent interaction the two constants must be equal, as defined in eqn (6).
$C_\alpha = C_\beta = \tfrac{1}{2}\left(-\alpha_S\beta_S + RT\ln[\mathrm{S{\cdot}S}]\right)$ | (6) |
This reformulation of eqn (1) leads to a description of the free energy of solvation of an individual solute SSIP, and summing over a set of SSIPs that represent the surface of a molecule provides a straightforward method for calculating molecular solvation energies and hence partition. Eqn (7) shows the free energy change for transfer of a solute with Nα H-bond donor SSIPs and Nβ H-bond acceptor SSIPs from the non-solvated reference state to solvent S.
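$\Delta G^\circ_S = \sum_{i=1}^{N_\alpha} \Delta g_S(\alpha_i) + \sum_{j=1}^{N_\beta} \Delta g_S(\beta_j) - C_0$ | (7) |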
A constant, C0, is introduced in eqn (7) to take care of any additional free energy contributions. The requirement for this constant can be demonstrated by examining experimental data for the free energies of transfer of alkanes from the gas phase into different solvents. Fig. 4 compares the gas–liquid partitioning of alkanes into n-hexadecane with the corresponding values for partitioning into n-hexane. Although the slope is very close to unity (1.01), there is an offset of 1.39 kJ mol−1 in favour of transfer into n-hexane. The fact that the offset is a constant for different alkane solutes with very different numbers of interaction sites implies that this phenomenon is due to differences in the intrinsic properties of the two solvents. Similar behaviour is observed for pairwise comparisons of some other non-polar solvents (see ESI Section S1†). Values of the offset vary from solvent to solvent, so we define the constant C0 relative to a reference solvent, n-hexadecane, to describe this intrinsic difference between different solvents.
The overall free energy of partition for a molecule is obtained by summing over all solute SSIPs using eqn (7), and the free energy of transfer between two different solvents S1 and S2 can then be obtained using eqn (8).
$\Delta G^\circ(S_1 \rightarrow S_2) = \Delta G^\circ(S_2) - \Delta G^\circ(S_1)$ | (8) |
Many of the parameters required for the implementation of eqn (4)–(8) can be determined from the experimentally determined association constants for formation of 1:1 H-bonded complexes between polar solutes in non-polar solvents. However, H-bonded complexes are not sufficiently stable in polar solvents for characterisation of the solvation properties of a wider range of solvents, and interactions between non-polar solutes are too weak to lead to complexation in solution. Here we investigate the use of experimental partition data as a way of deducing parameters to describe these systems.
[SSIP] = (Nα + Nβ)[liquid] ≈ 220 M | (9) |
Solvent H-bond parameters for alkanes have been determined experimentally by applying eqn (1) to measurements of association constants for the formation of 1:1 H-bonded complexes.29 The experimental values of αS = 1.20, βS = 0.60, and RTln[S·S] = +6 kJ mol−1 can be used in eqn (6) to obtain the constants required to describe alkane solvents: Cα = Cβ = 2.64 kJ mol−1. In contrast, eqn (1) cannot be used directly to determine the parameters required to describe the properties of alkane solutes, because they are not polar enough to form stable 1:1 complexes. We therefore used experimental data on phase transfer free energies to develop a SSIP description of alkane solutes.
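The arithmetic behind these constants can be checked with a short sketch. Matching eqn (3)–(5) against eqn (2) gives Cα + Cβ = −αSβS + RTln[S·S], so when the two constants are equal each one is half of that sum. The function names below are hypothetical and 298 K is assumed wherever a concentration is converted into an RTln term.

```python
import math

R_KJ = 8.314e-3  # gas constant in kJ mol^-1 K^-1


def rt_ln_concentration(conc_molar, temperature=298.0):
    """RT ln[S.S] in kJ/mol for a given concentration of solvent-solvent contacts."""
    return R_KJ * temperature * math.log(conc_molar)


def solvent_constant(alpha_s, beta_s, rt_ln_ss):
    """C_alpha = C_beta for a solvent with one type of solvent-solvent interaction."""
    return 0.5 * (-alpha_s * beta_s + rt_ln_ss)


# Alkane solvent, using the experimental constant of +6 kJ/mol directly.
c_alkane = solvent_constant(1.20, 0.60, rt_ln_ss=6.0)
print(f"alkane: C_alpha = C_beta = {c_alkane:.2f} kJ/mol")

# A concentration of 10 M corresponds to an RT ln term close to 6 kJ/mol.
print(f"RT ln(10 M) = {rt_ln_concentration(10.0):.2f} kJ/mol")
```

The same function applied to the water solute parameters (αS = 2.80, βS = 4.50) with [S·S] = 110 M gives a value close to the −0.47 kJ mol−1 quoted for water in the text.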
Fig. 5 shows experimental data for the free energy of transfer of alkanes from water to n-hexadecane. The correlation with the number of hydrogen atoms (NH) is significantly better than the correlation with the number of carbon atoms (NC), because the hydrogen atoms are always exposed on the surface of the molecule, whereas the carbon atoms are not. For example, the quaternary carbon atom in neopentane is completely buried. The approach to construction of a SSIP description of alkane solutes is therefore based on the number of CH bonds. Application of eqn (9) to alkanes indicates that the total number of SSIPs required to describe an alkane (Nα + Nβ) is approximately twice the number of CH bonds (Fig. 6).30 Assigning two SSIPs to each CH bond results in a total SSIP concentration that varies from 208 M for n-pentane to 232 M for n-hexadecane (cf. benchmark value of 220 M for water). Fig. 7 shows the calculated MEPS of methane. There are four regions of positive potential over the hydrogen atoms on the end of each CH bond and four regions of negative potential over the carbon atoms at the back of each CH bond. We therefore assign one α and one β for every CH bond in an alkane, and assume that the SSIP parameters determined for alkane solvents can be used to describe the non-covalent interaction properties of alkane solutes, i.e. α = 1.20, β = 0.60.
Fig. 6 Comparison of the total number of SSIPs (Nα + Nβ) calculated using eqn (9) with the number of CH bonds (NCH) for 182 alkanes. The line of best fit through the origin is shown (y = 1.90x).
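The SSIP concentrations quoted above follow from eqn (9) and can be reproduced with the minimal sketch below. The densities and molar masses are approximate handbook values near room temperature and are assumptions that do not come from this paper; the function name is hypothetical.

```python
def ssip_concentration(n_ssip, density_g_per_ml, molar_mass_g_per_mol):
    """Eqn (9): total SSIP concentration (M) = number of SSIPs x molarity of the liquid."""
    molarity = 1000.0 * density_g_per_ml / molar_mass_g_per_mol
    return n_ssip * molarity


liquids = {
    # name: (number of SSIPs, density / g mL^-1, molar mass / g mol^-1)
    "water": (4, 0.997, 18.015),              # two donor and two acceptor SSIPs
    "n-pentane": (2 * 12, 0.626, 72.15),      # two SSIPs per CH bond, 12 CH bonds
    "n-hexadecane": (2 * 34, 0.773, 226.45),  # two SSIPs per CH bond, 34 CH bonds
}

for name, args in liquids.items():
    print(f"{name}: {ssip_concentration(*args):.0f} M")
```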
Aromatic acceptors | β |
---|---|
Benzene | 2.00 |
Toluene | 2.20 |
ortho-, meta- or para-xylene | 2.40 |
Mesitylene | 2.70 |
Hexamethylbenzene | 3.10 |
Water is a unique solvent in that the concentration of solvent–solvent interactions is known, because the H-bonds are almost fully bound in liquid water at 298 K.31 We can therefore set the value of [S·S] equal to 110 M in eqn (6). If we assume that the H-bond parameters that have been experimentally measured for water as a solute can be used to describe water as a solvent, i.e. αS = 2.80 and βS = 4.50, eqn (6) gives the constants required to describe water as Cα = Cβ = −0.47 kJ mol−1. However, calculation of the partition of alkanes between water and n-hexadecane using these parameters failed to reproduce the experimentally measured values (Fig. 9a). The preference for alkanes to partition into n-hexadecane is overestimated, which suggests that the solute parameters for water cannot be used to describe the properties of water as a solvent in this model. By using experimental data for the free energy of transfer of aliphatic and aromatic hydrocarbon solutes from water to n-hexadecane, it was possible to optimise the solvent parameters for water: αS = 3.80 and βS = 3.47. Fig. 9b shows that the resulting parameters provide an excellent description of partitioning of a variety of different types of hydrocarbon into water.
Fig. 10 The SSIP description of functional groups. The values shown as bold italic were optimised in order to minimise the rmsd between calculated and experimental free energies for the partition models listed in Table 2. The other values are based on previously measured experimental parameters.
For some functional groups, experimental values of α and β have previously been determined by applying eqn (1) to measurements of association constants for the formation of 1:1 H-bonded complexes, and these values were used unmodified and assigned to the relevant SSIPs. However, these experimental parameters only provide information on the most polar site present in a solute, so when more than one type of α or β is required to describe a functional group, optimisation of the parameter describing the second site was often required. For example, an alcohol is described by one α and two β SSIPs (Fig. 10), but using the experimental value of 5.30 for both β SSIPs overestimated the polarity of this functional group. An accurate description of the partition coefficients of alcohols was obtained by reducing one of the two β parameters from 5.30 to 3.98. Alkyl groups are well-described by the same α and β parameters developed for alkanes above and to simplify the assignment problem the effects of electronegative substituents (O, N or S) on these parameters were assumed to be negligible. Similarly, the parameters developed for aromatic hydrocarbons generally provide a good description of substituted aromatic rings, with the exception that the value of β for the two H-bond acceptor sites over the centre of the ring had to be varied depending on the electronic properties of the ring substituents. The same approach starting from experimental functional group H-bond parameters and atom type hybridisation models was applied to ketones, nitriles, amines, amides, thioethers, alkyl fluorides, alkyl chlorides, fluoro and chloro substituted benzenes and phenols in order to expand the range of solute functional groups (see ESI Section S3† for full details). The SSIP parameters for these functional groups were optimised using data from partition into both non-polar and polar solvents (see below).
Solvent | αS | Cα | βS | Cβ | C0 | n (solvent/water partition) | rmsd/kJ mol−1 (solvent/water partition) | n (1:1 complexation) | rmsd/kJ mol−1 (1:1 complexation) |
---|---|---|---|---|---|---|---|---|---|
Water | 3.80 | −0.76 | 3.47 | −0.76 | 0 | ||||
Hexadecane | 1.20 | 2.64 | 0.60 | 2.64 | 0 | 219 | 1.2 | ||
Benzene | 1.40 | 2.50 | 2.00 | 1.09 | 0 | 37 | 1.2 | 108 | 1.7 |
Toluene | 1.40 | 2.50 | 2.00 | 1.09 | 0 | 34 | 1.1 | 46 | 1.7 |
Hexane | 1.20 | 2.62 | 0.60 | 2.62 | 1.38 | 87 | 1.1 | 6 | 0.7 |
Cyclohexane | 1.20 | 2.61 | 0.60 | 2.61 | 1.33 | 51 | 1.2 | 109 | 1.2 |
Carbon tetrachloride | 1.40 | 2.58 | 0.60 | 2.58 | 1.34 | 86 | 1.2 | 475 | 0.8 |
Dichloromethane | 1.80 | 2.16 | 1.40 | 1.76 | 1.73 | 28 | 1.3 | 42 | 1.1 |
Chloroform | 2.10 | 1.78 | 1.30 | 2.11 | 0.60 | 71 | 1.6 | 13 | 1.2 |
1,2-Dichloroethane | 1.70 | 2.23 | 1.60 | 1.41 | 1.93 | 58 | 1.6 | 34 | 1.0 |
Chlorobenzene | 1.40 | 2.51 | 1.40 | 1.71 | 1.48 | 57 | 1.2 | 22 | 1.5 |
Perfluoroalkane | 1.2 | 2.41 | 0.60 | 2.41 | 0.81 | 27 | 1.85 | 15 | 1.7 |
Experimental data on association constants for the formation of 1:1 H-bonded complexes provide an independent validation of these solvent parameters. Combining eqn (3)–(5) gives eqn (10), which can be used to calculate values of ΔG° for formation of 1:1 complexes. For each solvent, the values of αS, βS, Cα and Cβ in Table 2 were used in conjunction with previously determined experimental values of solute α and β parameters to calculate the free energy change for complexes formed by a variety of H-bond acceptors with H-bond donors. In all cases, the rmsd between the calculated and experimental values is no greater than 1.7 kJ mol−1 (see ESI Section S4†), confirming the reliability of the solvent parameters. Fig. 11b illustrates the quality of the description of 1:1 complexation data for dichloromethane.
ΔG° = −αβ + αβS + αSβ + Cα + Cβ | (10) |
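To make the relationship between eqn (10) and eqn (1) concrete, the sketch below evaluates both for the same donor–acceptor pair using the solvent descriptors listed in Table 2. The function names are hypothetical, and the solute values (α = 2.7, β = 5.3, the hydroxyl parameters quoted in this paper) are chosen only for illustration.

```python
def delta_g_eqn10(alpha, beta, alpha_s, beta_s, c_alpha, c_beta):
    """Eqn (10): 1:1 complexation free energy (kJ/mol) from solvent descriptors."""
    return -alpha * beta + alpha * beta_s + alpha_s * beta + c_alpha + c_beta


def delta_g_eqn1(alpha, beta, alpha_s, beta_s):
    """Eqn (1): the same quantity with a fixed constant of +6 kJ/mol."""
    return -(alpha - alpha_s) * (beta - beta_s) + 6.0


# Solvent descriptors copied from Table 2.
CCL4 = dict(alpha_s=1.40, beta_s=0.60, c_alpha=2.58, c_beta=2.58)
CHCL3 = dict(alpha_s=2.10, beta_s=1.30, c_alpha=1.78, c_beta=2.11)

for name, s in [("carbon tetrachloride", CCL4), ("chloroform", CHCL3)]:
    dg10 = delta_g_eqn10(2.7, 5.3, **s)
    dg1 = delta_g_eqn1(2.7, 5.3, s["alpha_s"], s["beta_s"])
    print(f"{name}: eqn (10) = {dg10:.2f}, eqn (1) = {dg1:.2f} kJ/mol")
```

The two expressions differ only by Cα + Cβ + αSβS − 6, so they coincide for carbon tetrachloride and differ by roughly 0.6 kJ mol−1 for chloroform.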
We will describe development of the model for ethers to illustrate the approach used to parameterise polar organic solvents. Ethers are the simplest class of polar organic solvent, because they have two types of acceptor SSIP, which describe the non-polar hydrocarbon region (βS1) and the polar oxygen sites (βS2), but only one type of CH donor SSIP (αS1). Fig. 12 shows that the free energy of transfer of alkanes from the gas phase into alkyl ethers is very similar to the value for transfer into n-hexadecane. This result suggests that the hydrocarbon component of an ether has similar properties to an alkane, so the solvation of non-polar solute SSIPs can be described using the same parameters used to describe alkane solvents, i.e. αS1 = 1.2 and βS1 = 0.6. The parameter for the polar oxygen acceptor SSIP can be estimated from the solute values in Fig. 10, which gives βS2 = 5.3.
The solvation of solute H-bond acceptor SSIPs is described in a straightforward manner by eqn (5) using αS = αS1 = 1.2 for the solvent H-bond donor SSIPs. Solvation of solute H-bond donors is more complicated, because they interact with two different acceptor sites. We assume that the solvent is present in a large excess relative to the solute, so that each solute SSIP is solvated according to an effective equilibrium constant for interaction with each type of solvent SSIP. The equilibrium constants for the interaction of a solute H-bond donor SSIP α with the two different solvent H-bond acceptor SSIPs βS1 and βS2 are given by eqn (11) and (12).
(11) |
(12) |
The total free energy contribution due to solvation of the solute H-bond donor SSIP α is given by the sum of these equilibrium constants weighted by the fraction of interactions made with the relevant solvent SSIP, as shown in eqn (13).
(13) |
The constants Cβ1 and Cβ2 are determined by the polarity and concentration of solvent–solvent interactions and are likely to vary from solvent to solvent. However, when values of transfer free energies from water to different ether solvents were used to independently optimise values of Cα1, Cβ1, Cβ2 and C0 for tetrahydrofuran, diethyl ether and di-n-butyl ether, the variation in the values of the constants between solvents was rather small (see ESI Section S5†). Fig. 13 shows that using a single generic value of each constant for ethers (see Table 3) gives a good description of the experimental data for all three solvents.
Solvent class | αS1 | Cα1 | βS1 | Cβ1 | αS2 | Cα2 | βS2 | Cβ2 | C0 | n (solvent/water partition) | rmsd/kJ mol−1 (solvent/water partition) |
---|---|---|---|---|---|---|---|---|---|---|---|
Ethers (a) | 1.20 | 2.58 | 0.60 | 2.98 | — | — | 5.30 | −3.65 | 2.26 | 109 | 1.7 |
Nitriles (b) | 1.20 | −0.79 | 0.60 | 2.69 | 1.50 | 2.67 | 5.15 | −3.54 | 3.55 | 76 | 1.7 |
Ketones (c) | 1.20 | −0.78 | 0.60 | 2.82 | 1.50 | 2.71 | 5.80 | −4.09 | 1.89 | 118 | 1.4 |
Alcohols (d) | 1.20 | 2.75 | 0.60 | 2.95 | 3.50 | −6.02 | 6.90 | −6.42 | 0.18 | 604 | 1.5 |
(a) Diethyl ether, di-n-butyl ether and tetrahydrofuran. (b) Acetonitrile, propionitrile and butyronitrile. (c) Acetone, butanone and cyclohexanone. (d) Methanol, ethanol, propan-1-ol, butan-1-ol, pentan-1-ol, hexan-1-ol, heptan-1-ol, octan-1-ol, decan-1-ol, propan-2-ol, butan-2-ol, 2-methylpropan-1-ol, 2-methylpropan-2-ol and 3-methylbutan-1-ol.
For alcohols, there are two different types of acceptor SSIP, the polar oxygen and non-polar hydrocarbon, so eqn (11)–(13) can be used to describe the solvation of solute donor SSIPs. There are also two different types of donor SSIP, the polar OH and the non-polar CH groups, so a similar set of equations, eqn (14)–(16), is required to describe the equilibria involved in the solvation of solute acceptor SSIPs.
(14) |
(15) |
(16) |
The effect of alcohols on the association constants for formation of 1:1 H-bonded complexes has been investigated in alcohol-alkane mixtures.32 These experiments suggest that alcohol self-association leads to an increase in the polarity of the hydroxyl group compared with monomeric alcohols in dilute solution. The effective value of the hydroxyl α increases from 2.7 to 3.5 and β increases from 5.3 to 6.9 in the self-associated bulk liquid. These parameters were therefore used as αS2 and βS2 to describe the polar SSIPs in alcohols. The alkane solvent parameters (αS1 = 1.2 and βS1 = 0.6) were used to describe the non-polar hydrocarbon SSIPs in alcohols. Partition data were available for 14 different alcohol solvents, and these data were used to optimise a single set of constants (Cα1, Cα2, Cβ1, Cβ2 and C0 in Table 3) that describe the solvent–solvent interactions in all of these alcohols.
Experimental values of αS and βS have been determined for nitrile and ketone solvents by applying eqn (1) to measurements of association constants for the formation of 1:1 H-bonded complexes (see ESI Section S6†). These values were therefore used to describe the polar SSIPs in these solvents, αS2 and βS2, and the non-polar hydrocarbon SSIPs were described using the alkane parameters as above (αS1 = 1.2 and βS1 = 0.6). As for ethers and alcohols, a generic set of constants (Cα1, Cα2, Cβ1, Cβ2 and C0) were obtained for dialkyl ketones and for alkyl nitriles by minimising the rmsd between experimental partition data and calculated transfer free energies (Table 3).
Experimental data for partition between water and n-hexadecane was available for 219 of the solutes in the training set. The solvent parameters in Tables 2 and 3 were used with SSIP representations of these solutes to calculate transfer free energies between water and 34 different organic solvents (see ESI Section S9† for details). The results are illustrated in Fig. 14. The agreement is excellent with an overall rmsd of 1.4 kJ mol−1 for 1713 data points, and there are no major outliers.
For 189 of the solutes in the training set, data were also available for partition between wet octanol and water, and these data were used to validate the generic alcohol solvent parameters. Comparison of calculated and experimental values was satisfactory with an rmsd of 1.6 kJ mol−1. In addition, calculations were performed on a validation set of 84 solutes that were not present in any of the data sets used to parameterise the model. Again, good agreement with the experimental data was obtained with an overall rmsd of 1.7 kJ mol−1 for octanol–water partition and 1.6 kJ mol−1 for n-hexadecane–water partition. Although the original training set was limited to simple compounds, some of the validation set solutes contained more than one functional group (e.g. 1,4-dicyanobutane), and these compounds are described well. These results suggest that the model developed here is robust and has applicability beyond the systems used for parameterisation.
The performance of the model was evaluated by comparing the results with related methods. Fig. 15 compares the model with the performance of the SSIMPLE method for the same 1713 experimental data points used in Fig. 14. SSIMPLE uses ab initio calculations to obtain SSIPs for both solute and solvent and evaluates all pairwise SSIP interactions in the liquid phase, including van der Waals interactions.21 There are very few empirical parameters in SSIMPLE, and although the trend is reasonable, the performance is significantly worse than the model developed in this paper, with an overall rmsd of 3.5 kJ mol−1. For octanol–water partition, a number of methods have been developed. clogP is a structure-based method that uses empirically derived parameters for functional group fragments, so it can be generalised to a wide range of solutes but cannot be generalised to other solvents.33 Abraham's linear solvation energy relationship (LSER) is based on summation over solvent–solute interactions and can be applied to a wide range of solvents, but empirical parameters are required for each solute.18 For 189 wet octanol–water data points, the rmsd values obtained using clogP and Abraham's LSER are 0.8 kJ mol−1 and 1.1 kJ mol−1 respectively. Although the rmsd obtained using the model described here is slightly higher (1.6 kJ mol−1), the advantage is that extrapolation to a wider range of solutes and solvents requires minimal reparameterisation.
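Partition coefficients are often reported as log P rather than as free energies, so it can help to convert the rmsd values above into log units using the standard relation ΔG° = −RTln P. The sketch below is only a unit conversion under an assumed temperature of 298 K; the function name is hypothetical.

```python
import math

R_KJ = 8.314e-3  # gas constant in kJ mol^-1 K^-1


def kj_to_log_units(delta_g_kj, temperature=298.0):
    """Convert a free energy difference (kJ/mol) into the equivalent change in log10 P."""
    return delta_g_kj / (R_KJ * temperature * math.log(10.0))


# rmsd values quoted in the text for the 189 wet octanol-water data points.
for method, rmsd in [("this model", 1.6), ("clogP", 0.8), ("Abraham LSER", 1.1)]:
    print(f"{method}: {rmsd} kJ/mol = {kj_to_log_units(rmsd):.2f} log units")
```

At 298 K one log unit corresponds to about 5.7 kJ mol−1, so all three methods have errors well below half a log unit on this data set.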
Many of the required values of the α and β parameters were available from experimental measurements of association constants for 1:1 H-bonded complexes in carbon tetrachloride, and these parameters were used directly. For alkanes, which do not form H-bonds in carbon tetrachloride, experimental 1:1 complexation data in alkane solvents have been used to determine solvent H-bond parameters, and these parameters were used for alkane solutes.
For functional groups that have more than one possible H-bond acceptor site, such as alcohols, the experimental H-bond parameter was used for the first β SSIP, and the value for the second β SSIP was optimised using partition data. In these cases, the optimised value of the second SSIP was always less polar than the experimental value used for the first SSIP. These results are consistent with a range of computational and experimental studies.34–40 Analysis of the H-bond acceptor parameters for carbonyl groups that make an intramolecular H-bond confirms the accuracy of the SSIP values. Fig. 16 shows that the equilibrium constant measured for these systems is a surrogate for K2, the equilibrium constant for formation of a second intermolecular H-bond with a carbonyl group. The experimental values of this equilibrium constant correspond to β values of 3.8–4.3, which are significantly lower than the H-bond parameters measured for the first H-bond using K1 (5.5–6.4).37 These parameters agree well with the two different β values (3.8 and 5.8) used to describe the two lone pairs of a carbonyl group in Fig. 10.
However, the agreement between calculated and experimental free energies of transfer for chloroalkanes which have CH2Cl groups was not so good (rmsd = 1.6 kJ mol−1 for 10 organic solvents). We have modelled alkyl substituents in various compound classes discussed above using the same parameters as for alkanes. Polarisation of C–H bonds adjacent to heteroatoms might be expected, but reasonable results were obtained without modifying the SSIPs in most cases. As the large values of α found for dichloromethane and chloroform might suggest, this is not the case for chloroalkanes. Fig. 17 shows that increasing the value of α for the C–H groups adjacent to the chlorine from 1.2 to 1.6 significantly improved the description of these compounds (rmsd = 0.9 kJ mol−1 for 10 organic solvents). Note that one of the assumptions implicit in the approach described here is that there are no changes in the net van der Waals interaction in the equilibrium illustrated in Fig. 1, so that partition can be described purely in terms of polar interactions between SSIPs. The excellent agreement between the calculated and experimental partition data obtained for chloroalkanes suggests that this assumption holds for second row elements as well.
Each solvent SSIP is also assigned a constant (Cα or Cβ), which quantifies the interactions that are broken with the bulk solvent when the solvent SSIP solvates a solute. The values of these constants were optimised using partition data, and they capture information about both the polarity and the concentration of solvent–solvent interactions. For solvents with just two types of SSIP, the sum of the two constants can be directly related to the properties of the solvent–solvent interactions by eqn (17).
Cα + Cβ = −αSβS + RTln[S·S] | (17) |
As explained above, the effective concentration of solvent–solvent interactions [S·S] is not easy to estimate for most solvents. When eqn (1) was first proposed, the constant of +6 kJ mol−1 was obtained from measurements of formation of 1:1 H-bonded complexes in carbon tetrachloride.19 This value corresponds to [S·S] = 10 M, and we have assumed that a constant of +6 kJ mol−1 can also be used in eqn (1) to describe 1:1 complexation in other organic solvents. However, the values of Cα and Cβ obtained here from the partition data can be used in eqn (17) to obtain a direct experimental measurement of this parameter. The results shown in Table 4 confirm that the values of RTln[S·S] for non-polar solvents are approximately constant (+5.5 to +6.6 kJ mol−1). The high value for chloroform (6.6 kJ mol−1) is indicative of the greater solvating power of a more polar solvent, and the low value for perfluoroalkanes (5.5 kJ mol−1) reflects the very weak intermolecular interactions present in these solvents. The results in Table 4 show why using a value of +6 kJ mol−1 for the constant in eqn (1) provides an accurate description of the association constants for formation of 1:1 complexes in different non-polar solvents. In contrast, partition is much more sensitive to the precise values of the solvent constants, because the small differences in Table 4 are significant when they are multiplied up by the total number of SSIPs in a solute.
The behaviour of polar solvents is more complicated, because with two types of donor and two types of acceptor there are four different types of solvent–solvent interaction. However, in the case of alcohols, the solvent parameters in Table 3 show that Cα1 ≈ Cβ1 and Cα2 ≈ Cβ2. Structural studies on alcohols show clustering of the alkyl chains and separate H-bonded aggregates of the hydroxyl groups.41,42 In other words, to a first approximation, we can think of alcohols as consisting of two independent solvating domains. The solvent parameters that describe the non-polar hydrocarbon domain in alcohols are similar to those found for alkanes. Using the values of αS1, Cα1, βS1 and Cβ1 in eqn (17) gives RTln[S·S] = +6.4 kJ mol−1 for the hydrocarbon domain in alcohols, which is similar to the values found for non-polar solvents in Table 4. The polar hydroxyl domain is described by αS2, Cα2, βS2 and Cβ2, and using these values in eqn (17) gives a value of +11.7 kJ mol−1 for RTln[S·S], which is the same as the value for water in Table 4. The model that emerges from the partition data is that alcohol solvents behave as a mixture of a water-like domain and an alkane-like domain.
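The values of RTln[S·S] discussed here follow from rearranging eqn (17) and inserting the solvent descriptors from Tables 2 and 3. The sketch below repeats that arithmetic; the function names are hypothetical, and 298 K is assumed for the conversion to an effective concentration.

```python
import math

R_KJ = 8.314e-3  # gas constant in kJ mol^-1 K^-1


def rt_ln_ss(alpha_s, beta_s, c_alpha, c_beta):
    """Eqn (17) rearranged: RT ln[S.S] = C_alpha + C_beta + alpha_S * beta_S (kJ/mol)."""
    return c_alpha + c_beta + alpha_s * beta_s


def ss_concentration(rt_ln_value, temperature=298.0):
    """Effective concentration [S.S] in M implied by an RT ln[S.S] value."""
    return math.exp(rt_ln_value / (R_KJ * temperature))


# (alpha_S, beta_S, C_alpha, C_beta) copied from Tables 2 and 3.
domains = {
    "water": (3.80, 3.47, -0.76, -0.76),
    "chloroform": (2.10, 1.30, 1.78, 2.11),
    "perfluoroalkane": (1.20, 0.60, 2.41, 2.41),
    "alcohol, hydrocarbon domain": (1.20, 0.60, 2.75, 2.95),
    "alcohol, hydroxyl domain": (3.50, 6.90, -6.02, -6.42),
}

for name, params in domains.items():
    value = rt_ln_ss(*params)
    print(f"{name}: RT ln[S.S] = {value:.1f} kJ/mol, [S.S] = {ss_concentration(value):.0f} M")
```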
There is an additional solvent constant C0, which is used to describe differences between solvents that are not accounted for by the SSIP interaction model. The values of C0 listed in Tables 2 and 3 are all zero or positive and small (generally less than +2 kJ mol−1), and there are no obvious patterns. This result indicates that a simple model based on pairwise interactions between specific interaction sites on solvent and solute provides a rather general description of solvation phenomena and that there are no major additional factors that need to be considered.
Consider, for example, the transfer of the 9 different alkanes in Table 5 from n-hexadecane into water. Each of these alkanes has 16 hydrogen atoms, and each C–H group has two SSIPs with values of α = 1.2 and β = 0.6. The free energy of solvation for each of the SSIPs can be calculated using eqn (4) and (5) with the solvent constants listed in Table 2.
Alkane | Formula | ΔG° (n-hexadecane to water)/kJ mol−1 | ΔG° (n-hexadecane to gas)/kJ mol−1 | Melting point/K |
---|---|---|---|---|
n-Heptane | C7H16 | 29.3 | 18.1 | 182 |
3-Methylhexane | C7H16 | 28.7 | 17.4 | 154 |
2,2-Dimethylpentane | C7H16 | 28.0 | 16.0 | 149 |
Ethylcyclohexane | C8H16 | 31.1 | 22.1 | 162 |
Propylcyclopentane | C8H16 | 30.6 | 21.7 | 156 |
cis-1,2-Dimethylcyclohexane | C8H16 | 28.6 | 21.9 | 223 |
trans-1,4-Dimethylcyclohexane | C8H16 | 28.7 | 20.8 | 236 |
Cyclooctane | C8H16 | 28.3 | 24.7 | 288 |
Adamantane | C10H16 | 27.5 | 28.1 | 543 |
For water:
ΔgS(α)/kJ mol−1 = −αβS − Cβ = −1.20 × 3.47 + 0.76 = −3.40
ΔgS(β)/kJ mol−1 = −αSβ − Cα = −3.80 × 0.60 + 0.76 = −1.52
For n-hexadecane:
ΔgS(α)/kJ mol−1 = −αβS − Cβ = −1.20 × 0.60 − 2.64 = −3.36
ΔgS(β)/kJ mol−1 = −αSβ − Cα = −1.20 × 0.60 − 2.64 = −3.36
The difference between these solvation energies is (−3.40 − 1.52) − (−3.36 − 3.36) = 1.80 kJ mol−1, which represents the free energy of transfer of one C–H group from n-hexadecane to water. Thus the calculated value for the free energy of transfer is the same for all of the alkanes, 16 × 1.80 = 28.8 kJ mol−1, which agrees very well with the experimental values in Table 5 (29 ± 2 kJ mol−1).
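The arithmetic of this worked example can be reproduced in a few lines. The following is a minimal sketch rather than the spreadsheet implementation referred to in the ESI: the solvent parameters are read off the numbers in the equations above (not copied from Table 2), the experimental values are those listed in Table 5, and all function and variable names are hypothetical.

```python
import math

# One alkane C-H group carries two SSIPs (values quoted in the text).
ALPHA_CH = 1.20  # positive SSIP
BETA_CH = 0.60   # negative SSIP
N_CH = 16        # every alkane in Table 5 has 16 hydrogen atoms

# Solvent parameters inferred from the worked equations (constants in kJ mol^-1).
SOLVENTS = {
    "water": {"alpha_s": 3.80, "beta_s": 3.47, "c_alpha": -0.76, "c_beta": -0.76},
    "n-hexadecane": {"alpha_s": 1.20, "beta_s": 0.60, "c_alpha": 2.64, "c_beta": 2.64},
}

# Experimental n-hexadecane-to-water transfer free energies, Table 5 (kJ mol^-1).
EXPERIMENTAL = {
    "n-Heptane": 29.3,
    "3-Methylhexane": 28.7,
    "2,2-Dimethylpentane": 28.0,
    "Ethylcyclohexane": 31.1,
    "Propylcyclopentane": 30.6,
    "cis-1,2-Dimethylcyclohexane": 28.6,
    "trans-1,4-Dimethylcyclohexane": 28.7,
    "Cyclooctane": 28.3,
    "Adamantane": 27.5,
}


def ch_solvation_energy(solvent_name):
    """Solvation free energy of one C-H group (alpha SSIP plus beta SSIP)."""
    p = SOLVENTS[solvent_name]
    dg_alpha = -ALPHA_CH * p["beta_s"] - p["c_beta"]   # solute alpha with solvent beta
    dg_beta = -p["alpha_s"] * BETA_CH - p["c_alpha"]   # solvent alpha with solute beta
    return dg_alpha + dg_beta


def transfer_energy(n_ch, source, target):
    """Free energy of transferring n_ch C-H groups from source to target solvent."""
    return n_ch * (ch_solvation_energy(target) - ch_solvation_energy(source))


def rmsd(calculated, observed_values):
    """Root mean square deviation of observations from one calculated value."""
    squared = [(obs - calculated) ** 2 for obs in observed_values]
    return math.sqrt(sum(squared) / len(squared))


if __name__ == "__main__":
    total = transfer_energy(N_CH, "n-hexadecane", "water")
    print(f"per C-H group: {total / N_CH:.2f} kJ mol^-1")
    print(f"16 C-H groups: {total:.1f} kJ mol^-1")
    print(f"rmsd vs Table 5: {rmsd(total, EXPERIMENTAL.values()):.1f} kJ mol^-1")
```

Because the sketch does not round the intermediate per-site energies, the total comes out at 28.7 kJ mol−1 rather than the 28.8 kJ mol−1 obtained from 16 × 1.80; the difference is well within the scatter of the experimental values.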
Solutes are described based on the composition of functional groups, which allows straightforward rule-based translation of chemical structure into an SSIP description without the need for the ab initio calculations we have used previously. Positive SSIPs (α) are assigned to hydrogen atoms, and negative SSIPs (β) are assigned to represent lone pairs and π-electron density. The total number of SSIPs used to represent a solute is fixed such that the concentration of SSIPs in a liquid is approximately constant. The assumption is that van der Waals contributions cancel out in any association or phase transfer equilibria, so that the behaviour of the system is dominated by polar interactions between SSIPs. Equilibria are treated as a competition between pairwise interactions of solute and solvent SSIPs, which has been shown previously to provide a rather good description of experimental data on 1:1 complexation in organic solvents.
Values of the most polar SSIPs required to describe each functional group were obtained from the experimentally determined H-bond parameters α and β. The values of any additional SSIPs required to complete the description of each functional group were obtained by optimisation using experimental data on phase transfer equilibria. Similarly, the values of the SSIPs required to describe many organic solvents (αS and βS) have been determined previously from experimental data on 1:1 complexation. The SSIP descriptions of other solvents like water, where 1:1 complexes are not sufficiently stable for experimental study, were obtained by optimisation using experimental data on phase transfer equilibria. For solvents that contain both polar and non-polar functional groups like alcohols, two sets of solvent SSIPs were used to describe the equilibrium between the two different solvation modes. In addition, for each solvent SSIP, a constant term was used to describe the effects of solvent–solvent interactions. The resulting model is simple to implement using just a spreadsheet (see ESI Section S9†) and accurately describes the transfer of a wide range of different solutes from water to a wide range of different organic solvents (overall rmsd is 1.4 kJ mol−1 for 1713 data points).
The model described above is the result of a feasibility study based on solutes with one functional group and a limited set of solvents. The purpose was to determine whether the solvent competition model originally applied to H-bond complexes in non-polar solvents could be extended to describe whole molecule solvation and partition between organic solvents and water. This simple model describes the hydrophobic effect with surprising accuracy. It has also been possible to deduce new descriptors for a range of organic solvents that were not accessible by direct investigation of H-bond formation in non-polar solvents. These empirical parameters should be of value in linking the results of ab initio quantum mechanics calculations with the thermodynamic properties of molecules in solution.
Experimental H-bond parameters are available for most organic functional groups, and these parameters can be used to develop the method for application to a much wider range of solutes than those used in the parameterisation described here. In addition, the availability of experimental H-bond parameters for anionic and cationic functional groups may provide a method for the treatment of ionisable compounds. One limitation is that the experimental data on 1:1 complexation and partition is generally restricted to room temperature, which means that extrapolation to different temperatures would require a new treatment. There are some challenges that will have to be addressed in development of the model to tackle more complex polyfunctional compounds. Electronic interactions between functional groups through the bonding framework can affect the H-bond parameters, and intramolecular non-covalent interactions may change the availability of SSIPs for interaction with solvent. In addition, the current model is based purely on covalent connectivity, and a more elaborate treatment would be required to tackle the effects of conformer distribution on solvation energies.
Footnote
† Electronic supplementary information (ESI) available: Experimental data and references, substructure fragments, H-bond parameters, solvent parameters, surface area calculations, and spreadsheets illustrating the calculations. See DOI: 10.1039/d1sc03392a
This journal is © The Royal Society of Chemistry 2021