Mark D.
Driver
a,
Mark J.
Williamson
a,
Joanne L.
Cook
b and
Christopher A.
Hunter
*a
aDepartment of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, UK. E-mail: herchelsmith.orgchem@ch.cam.ac.uk; Tel: +44 (0)1223 336710
bDepartment of Chemistry, University of Sheffield, Sheffield, S3 7HF, UK
First published on 21st April 2020
Solvation has profound effects on the behaviour of supramolecular systems, but the effects can be difficult to predict even at a qualitative level. Functional group interaction profiles (FGIPs) provide a simple visual method for understanding how solvent affects the free energy contribution due to a single point interaction, such as a hydrogen bond, between two solute functional groups. A generalised theoretical approach has been developed, which allows calculation of FGIPs for any solvent or solvent mixture, and FGIPs for 300 different solvents have been produced, providing a comprehensive description of solvent effects on non-covalent chemistry. The free energy calculations have been validated using experimental measurements of association constants for hydrogen bonded complexes in multiple solvent mixtures. The calculated FGIPs provide good descriptions of the solvation of polar solutes, solvophobic interactions between non-polar solutes in polar solvents like water, and preferential solvation in solvent mixtures. Applications are explored of the use of FGIPs in drug design, for optimising receptor-ligand interactions, and in enantioselective catalysis for solvent selection to optimise selectivity.
The association of two solutes in solution can be described by the equilibrium shown in Fig. 1. The equilibrium constant depends on stability of the four complexes shown, which can be estimated using eqn (1).19
ΔG°/kJ mol−1 = −(α − αs)(β − βs) + c | (1) |
Fig. 1 The solvent competition model for the formation of a hydrogen bonded complex between two solutes. The position of equilibrium is determined by the energies of the solute–solvent interactions in the free state, and the solute–solute and solvent–solvent interactions in the bound state. D a hydrogen bond donor solute and A represents a hydrogen bond acceptor solute.19 |
The H-bond parameters used in eqn (1) have been experimentally determined for a wide range of different functional groups13 and can also be calculated from the values of the maxima and minima in ab initio molecular electrostatic potential surfaces.21 The validity of eqn (1) has been experimentally demonstrated for formation of 1:1 complexes between a wide variety of different solutes in a range of different solvents.20,22–35 The constant c in eqn (1) is related to the fact that organic solvents typically have a concentration of about 10 M, whereas the standard state for solutes is 1 M.36 The origin of the value c will be discussed in more detail below, but the focus of this paper is the development of a computational method for calculation of the first term in eqn (1) for any solvent environment. The first term in eqn (1) represents the free energy change associated with the exchange of polar interactions between solutes and solvent. Expressing this energy as ΔΔGFGI in eqn (2) provides a useful tool for predicting the free energy contribution that a specific functional group interaction makes to the stability of a supramolecular system where there are multiple non-covalent interactions.
ΔΔGFGI/kJ mol−1 = −(α − αs)(β − βs) | (2) |
We call a two-dimensional plot of ΔΔGFGI calculated as a function of the two solute H-bond parameters, α and β, a Functional Group Interaction Profile (FGIP).19 A FGIP shows the free energy contribution for all possible solute–solute interactions in a given solvent and provides a simple visual method for understanding how that solvent affects non-covalent chemistry. Fig. 2 illustrates the general result. The FGIP is divided into four different regions, which are characterised by which of the four complexes shown in Fig. 1 is the most stable.
Fig. 2 Generic Functional Group Interaction Profile (FGIP) for the free energy contribution due to the interaction of a hydrogen bond donor (α) with a hydrogen bond acceptor (β) in solvent S (ΔΔGFGI). In the two red quadrants, ΔΔGFGI is positive, and the functional group interactions are unfavorable. In the two blue quadrants, ΔΔGFGI is negative, and the functional group interactions are favorable. The solvent parameters αS and βS set the boundaries between these quadrants.19 |
When the solute–solute interaction is the most stable of the four complexes, the equilibrium in Fig. 1 lies to the right, i.e. when α > αS and β > βS, ΔΔGFGI will be negative. This regime is represented by quadrant 2 of Fig. 2.
When either of the two solute–solvent interactions are the most stable of the four species in Fig. 1, then the equilibrium lies to the left, i.e. when α < αS and β > βS, or when β < βS and α > αS, ΔΔGFGI will be positive. These two regimes are represented by quadrants 1 and 4 in Fig. 2.
If the solvent–solvent interaction is the most stable of the four species in Fig. 1, the equilibrium will lie to the right, i.e. when α < αS and β < βS, ΔΔGFGI will be negative. This regime describes solvophobic interactions and is represented by quadrant 3 in Fig. 2.
The boundaries between the four regimes in the FGIP are defined by the lines α = αS and β = βS where like for like interactions between solvent and solute are exchanged.
In practice, eqn (2) overestimates the magnitude of solvophobic interactions, and a more complicated formulation was developed to accurately describe the hydrophobic effect in water.19 However, the major limitation of eqn (2) is that the solvent is described by a single type of hydrogen bond donor and a single type of hydrogen bond acceptor. Thus eqn (2) cannot be used to construct the FGIP for solvents like alcohols that have both OH and CH donors. Another limitation is that the stability of H-bonded complexes is known to depend on the concentrations of the solvating functional groups as well as their polarity, and eqn (2) does not capture any information about solvent concentration.25 Solvent mixtures are similarly beyond the scope of eqn (2). In this paper, we develop a generalised treatment that allows calculation of FGIPs for any solvent composition and illustrate the power of the approach by providing FGIPs for about 300 different solvents and solvent mixtures.
Briefly in SSIMPLE to describe a liquid, SSIP interactions are treated in a pairwise manner, such that the association constant for interaction between the ith and jth SSIP, Kij, is given by eqn (3).
(3) |
The interaction energy is made up of a polar term and a non-polar term, EvdW, which is the energy of the van der Waals interaction between two SSIPs. For repulsive interactions (i.e. εi and εj have the same sign), it is assumed that a state can be found where the polar sites are misaligned such that only non-directional van der Waals interactions are made, and the polar interaction term, εiεj, is set to zero. The standard state used to ensure Kij is dimensionless is the maximum theoretical density of SSIPs, cmax = 300 M.36 The speciation of all SSIP contacts in the liquid phase can then be calculated.
The free energy of solvation of the SSIP that represents solute 1, ΔGS(1), can be calculated by considering the concentration of this SSIP that is not bonded to a solvent SSIP ([1nb]). ΔGS in eqn (4) is the free energy of transfer of solute 1 from a reference state, which corresponds to a dilute gas where there are no SSIP interactions.
(4) |
The first term in eqn (4) describes the interactions made by the solute SSIP with the solvent SSIPs. The second term in eqn (4), ΔGc, corrects for the increased probability of interaction between SSIPs when they are confined to a condensed phase.36 This confinement energy affects the free energy of transfer between two phases of different SSIP density, but for processes that take place within the same phase, such as solute complexation, ΔGc will be the same in the free and bound states and cancels out.
In order to use the solvation energies calculated with SSIMPLE to describe the free state in Fig. 1, the free energy of the bound state must be defined relative to the same non-bonded reference state. Therefore we require the probability that the solute SSIPs do not interact with one another in a phase that describes the bound state. We consider the bound state to be a phase where only the two solute SSIPs are present and the total SSIP concentration is the same as the bulk liquid. The total concentrations of each SSIP in the bound state, [1] and [2], are given by eqn (5) and (6).
[1] = [1nb] + 2K12[1nb][2nb] + 2K11[1nb]2 | (5) |
[2] = [2nb] + 2K12[1nb][2nb] + 2K22[2nb]2 | (6) |
The total concentrations of the two SSIPs in the bound state are the same, and because the self-interactions are both repulsive, K11 and K22 are both equal to KvdW. The non-bonded concentrations, [1nb] and [2nb], are therefore equal.
Rearrangement of eqn (5) and (6) gives eqn (7).
(7) |
Thus the free energy for transfer of a solute SSIP from the reference state to the bound state is given by ΔGB (eqn (8)). As for the solvation energy, the first term describes the SSIP interactions made in the bound state, and the second term corrects for the probability of confinement in a condensed phase.
(8) |
Thus eqn (4) can be used to calculate the free energy of a solute in the free state, and eqn (8) can be used to calculate the free energy of a solute in the bound state, both relative to the same reference state. These equations can be combined to give a calculated value of ΔΔGFGI (eqn (9)).
ΔΔGFGI = ΔGB(1) + ΔGB(2) − ΔGS(1) − ΔGS(2) | (9) |
Note that the second term in eqn (4) and (8), which is associated with the confinement energy, cancels out in this analysis, so the value ΔΔGFGI depends only on the relative probability of SSIP interactions in the free and bound states. The key feature of this treatment is that the value of ΔΔGFGI is zero when α = αS and β = βS, as demonstrated in Fig. 3 and 4. Fig. 3 shows the FGIP for a hypothetical room temperature liquid state of a nobel gas, where there are no polar interactions (εi is zero for all solvent SSIPs). The value of ΔΔGFGI at the origin is zero as required. Fig. 4 shows the FGIP for water, where polar interactions dominate. The value of ΔΔGFGI is zero at the centre of the FGIP (α = αS = 2.8, β = βS = 4.5), where the solutes and solvent have the same polarity.
Fig. 3 FGIP for the interaction of two solutes in a hypothetical room temperature liquid state of a noble gas where the concentration of solvent SSIPs is 160 M (ΔΔGFGI in kJ mol−1). |
Fig. 4 FGIP for the interaction of two solutes in water at 298 K (ΔΔGFGI in kJ mol−1). The solute–solute interactions are favourable in the blue region, and unfavourable in the red region. |
Representative hydrogen bond acceptor functional groups are illustrated along the top of Fig. 3 and 4. The hydrogen bond acceptor interaction sites are colored red and are aligned with the corresponding β values. Representative hydrogen bond donor functional groups are illustrated down the right side of Fig. 3 and 4. The hydrogen bond interaction sites are colored blue and are aligned with the corresponding α values. The interaction between any pair of functional groups can simply be read from these plots. For example, the interaction between a phosphine oxide (β ≈ 10) and a phenol (α ≈ 4) is about −37 kJ mol−1 in the hypothetical liquid state of a noble gas and about −7 kJ mol−1 in water. This interaction is in the blue region for both FGIPS, which corresponds to a favourable interaction. The interaction of an aryl CH donor (α ≈ 1) with a phosphine oxide is quite different. In water, this interaction falls in the red region, which corresponds to an unfavourable interaction, and is worth +11 kJ mol−1. In the hypothetical liquid state of a noble gas, this CH–O H-bond is in the blue region and corresponds to a favourable contribution to the free energy of binding of −7 kJ mol−1.
The SSIMPLE algorithm was used to calculate values of ΔGS for all values of solute SSIP between −10 and 5. Eqn (9) was then used to produce the FGIP for each of the 261 solvents. The complete set of FGIPs for all solvents is provided in the ESI,† but we will highlight some of the key features with selected examples.
Fig. 5 FGIP for the interaction of two solutes in ethanol at 298 K (ΔΔGFGI in kJ mol−1). The solute–solute interactions are favourable in the blue region, and unfavourable in the red region. |
Fig. 7 FGIP for the interaction of two solutes in formic acid at 298 K (ΔΔGFGI in kJ mol−1). The solute–solute interactions in the blue region, and unfavourable in the red region. |
Fig. 8 FGIP for the interaction of two solutes in heptanoic acid at 298 K (ΔΔGFGI in kJ mol−1). The solute–solute interactions in the blue region, and unfavourable in the red region. |
Fig. 10 shows FGIPs for mixtures of tetrahydrofuran (THF) and chloroform (see ESI† for more compositions). The FGIPs for the pure solvents show only two of the quadrants from Fig. 2. In THF, only quadrants 1 and 2 appear. The reason is that the most positive SSIP in THF is +0.6, so the effective value of αS is close to zero, and quadrants 3 and 4 disappear. The most negative SSIP in THF has a value of −6.3, so βS falls in the middle of the β scale, splitting the FGIP into two regions of similar area. In chloroform, only quadrants 2 and 4 appear. The reason is that the most negative SSIP in chloroform is −0.5, so the effective value of βS is close to zero, and quadrants 1 and 3 disappear. The most positive SSIP in chloroform has a value of +2.2, so αS falls in the middle of the α scale, splitting the FGIP into two regions of similar area. Thus THF is a moderately good acceptor and solvates hydrogen bond donor solutes well, and chloroform is a moderately good donor and solvates hydrogen bond acceptors well. In mixtures of THF and chloroform, preferential solvation leads to good solvation of hydrogen bond donor solutes by THF and good solvation of hydrogen bond acceptor solutes by chloroform. As a result, interactions between polar solutes are less favourable in mixtures than in either of the two pure solvents.23 The FGIPs for THF–chloroform mixtures in Fig. 10(b) and (c) are effectively a combination of the red quadrants from the two pure solvent FGIPs.
Fig. 11 Association equilibrium for 1:1 complex between tri-n-butylphosphine oxide (hydrogen acceptor A) and 4-phenyl azophenol (hydrogen bond donor D). |
The experimentally measured free energy change for formation of a 1:1 complex between a hydrogen acceptor A and a hydrogen bond donor D, ΔG°, is defined by eqn (10).
(10) |
The concentration of the A·D complex is related to the concentration of solute–solute interactions in an SSIMPLE calculation by eqn (11).
(11) |
The concentration of free hydrogen bond donor [D] is given by the concentration of solute 1 SSIP that is not bound to solute 2 SSIP in the SSIMPLE calculation (eqn (12)).
(12) |
(13) |
A similar expression can be written for [A] in terms of the concentration of solute 2 SSIP. Substitution for [A·D], [A] and [D] in eqn (10) gives eqn (14).
(14) |
The solvation energy defined in eqn (4) can also be expressed in terms of KS as follows (eqn (15)).
(15) |
Substituting into in eqn (14) yields eqn (16), which defines the free energy for the formation of a 1:1 complex in terms of the SSIP values of the two solutes (i.e. α and β), the solvation energies of the two solutes, and some constants.
ΔG° = ε1ε2 + EvdW + RTln(cmax) − ΔGS(1) − ΔGS(2) + 2ΔGc | (16) |
Fig. 12 compares the values of ΔG° calculated using eqn (16) with the corresponding experimentally measured values for the azophenol·phosphine oxide complex as a function of solvent composition for THF–chloroform mixtures. The agreement is both qualitatively and quantitatively good. The calculated line closely follows the experimental data, and the largest difference is within 2 kJ mol−1 for pure THF. This agreement with experiment suggests that SSIMPLE provides an accurate description of such complex solvent environments and that the calculated FGIPs provide a realistic description of the solvation properties of liquids.
Fig. 12 ΔG° (kJ mol−1) for formation of a 1:1 complex between tri-n-butylphosphine oxide (β = 10.7) and 4-phenyl azophenol (α = 4.1) in THF–chloroform mixtures. Experimental measurements (black circles) and calculated values using eqn (16) (black line) as function of chloroform volume fraction, ϕ. The experimental values are the average of five experiments with error bars at the 95% confidence limit. |
(17) |
Eqn (17) suggests that c is not a constant but depends on the solute SSIP values, because ε1 and ε2 appear in the equation. However in the limit of tight binding for two polar solutes, i.e. when K12 is large, eqn (17) simplifies to give a constant value for c (eqn (18)).
(18) |
The concentration inside the bracket in eqn (18) is half the total concentration of solvent SSIPs, i.e. the concentration of solvent–solvent SSIP interactions if they were fully bound. This suggests that the origin of the constant in the eqn (1) is related to the concentration of solvent–solvent interactions. Experimentally determined equilibrium constants are conventionally defined relative to the 1 M standard state, and the concentration of solvent is conventionally ignored. However as Fig. 1 illustrates, complexation in solution is a competition between solvent–solute, solute–solute and solvent–solvent interactions, so the concentration of the solvent–solvent complex shown in Fig. 1 should be considered in the expression for the equilibrium constant (eqn (19)).
(19) |
Eqn (18) and (19) both indicate that the constant c in eqn (1) accounts for the absence of the solvent concentration in the conventional definition of equilibrium constants. Eqn (18) provides an expression for c that is independent of the solute SSIP values, so it is possible to calculate values of c for different solvents. Fig. 13 shows the distribution of c values for the 261 pure solvents studied here. The value is more or less constant for all solvents at 10.5 ± 0.6 kJ mol−1.
Fig. 13 Frequency distribution for values of c calculated using eqn (18) for 261 solvents. |
This calculated value of c is some 4 kJ mol−1 higher than the experimentally determined value of 6 kJ mol−1 for carbon tetrachloride. There are some important differences between the experimentally derived eqn (1), which considers only the exchange of polar interaction sites, and the SSIMPLE calculation, which also takes into account non-polar van der Waals interactions and the concentrations of different interaction sites. This is presumably the origin of the difference between the two constants. A better comparison is therefore to look at values of ΔG° calculated using eqn (1) and (16). Fig. 14 shows a plot of ΔG° in carbon tetrachloride calculated using both methods for all solute combinations with α values between 0 and 5 and β values between 0 and 10 (in increments of 0.1). The agreement is much better than the difference in the values of c would suggest. There is generally good agreement between the absolute values of ΔG°. The largest deviations occur for the most polar and least polar interactions, where the SSIMPLE values are respectively 2 kJ mol−1 lower and higher than eqn (1).
Fig. 14 Comparison of values of ΔG° in carbon tetrachloride calculated using eqn (1) with values calculated using eqn (16) for all values of ε1 between 0 and +5 and all values of ε2 between 0 and −10 (0.1 increments). Black line is y = x. |
Fig. 15(a) shows a cartoon of an idealised binding interface between a ligand and a protein. Sites A, B and C indicate a CH–O interaction and two different amide–amide hydrogen bonding interactions. The FGIP for water shown in Fig. 15(b) can be used to evaluate the free energy contributions that each of these interactions makes to the overall stability of the complex. Interaction A falls in region 4 of the FGIP (cf.Fig. 2), so the CH–O interaction reduces the stability of the complex, due to the free energy penalty for desolvation of the ether oxygen. Interaction B falls on the borderline between regions 2 and 4, because the hydrogen bond donor parameter for the amide NH is approximately the same as the hydrogen bond donor parameter for water, i.e. α ≈ αS ≈ 3. Thus the amide–amide hydrogen bond makes no contribution to the stability of the complex, because the free energy penalty for desolvation of the polar groups is exactly matched by the new hydrogen bond made in the complex. Interaction C is not shown in Fig. 15(b), because it is the same as Interaction B and falls at the same point on the FGIP.
Fig. 15(b) also illustrates how the FGIP can be used to develop strategies for optimising these interactions in order to increase the overall binding affinity. The arrow next to Interaction A in Fig. 15(b) indicates that decreasing the hydrogen bond acceptor strength would lead to a more favourable interaction. A less polar acceptor would reduce the desolvation penalty and move Interaction A into the solvophobic zone of the FGIP (region 3 in Fig. 2). For example, replacing the alkyl ether in the ligand with an aryl ether would give rise to a favourable hydrophobic interaction, making ΔΔGFGI more negative by about 5 kJ mol−1 (reading from the contours in Fig. 15(b)), and lead to an increase in binding affinity of an order of magnitude. The arrow next to Interaction B in Fig. 15(b) indicates that increasing the hydrogen bond donor strength would lead to a more favourable interaction. For example, replacing the amide group in the ligand with a phenol would make ΔΔGFGI more favourable by about 5 kJ mol−1, leading to an increase in binding affinity of an order of magnitude. The situation is quite different for Interaction C. Interaction C falls at the same point on the FGIP as Interaction B, but in this case, it is not possible to improve the binding affinity by changing the functional group on the ligand, the amide acceptor. A horizontal dotted line is drawn at α ≈ 3, which corresponds to the hydrogen bond donor parameter for the protein amide NH. This line falls almost exactly on the borderline between regions 2 and 4 of the FGIP, showing that ΔΔGFGI would be approximately zero for all hydrogen bond acceptor partners, regardless of polarity. Of course there are many other factors that govern overall binding affinity in addition to the properties of single point receptor–ligand interactions, but Fig. 15 shows that an FGIP can suggest useful strategies for guiding drug design, because it integrates interaction strength and desolvation on a free energy scale.
Fig. 16(a) shows a cartoon of an idealised interaction between a substrate and an enantioselective catalyst. The two modes of interaction present different faces of the prochiral substrate to the catalytic site (labelled cat) and would therefore lead to different stereochemical outcomes in the reaction. In one mode, there is a CH–O interaction between the substrate and the catalyst (labelled D), and in the other, there is a π-facial hydrogen bond between the substrate and the catalyst (labelled E). The FGIPs for THF and chloroform shown in Fig. 16(b) and (c) can be used to evaluate solvent effects on the free energy contributions that each of these interactions makes to the position of equilibrium in Fig. 16(a). Interaction D is favourable in THF (ΔΔGFGI ≈ −3 kJ mol−1 reading from the contours) and unfavourable in chloroform (+5 kJ mol−1). In contrast, Interaction E is unfavourable in THF (+7 kJ mol−1) and favourable in chloroform (−3 kJ mol−1). These differences in free energy of interaction can be used to select a solvent to favour a particular stereochemical outcome of the reaction. Thus we would predict that chloroform would favour one enantiomer and that THF would favour the other. These two solvents have been chosen to illustrate the approach, but the full set of FGIPs provided in the ESI† can be scrutinised to find the optimum solvent combination to favour/disfavour particular sets of non-covalent interactions for specific applications. Alternatively, the solvent-dependence of the enantiomeric excess of a reaction could be used to infer what kind of interactions are likely to be important in determining stereoselectivity for a particular catalyst.
The non-covalent interaction properties of 261 different solvent molecules have been characterised as a set of surface site interaction points (SSIPs), which were obtained by footprinting ab initio molecular electrostatic potential surfaces calculated for the isolated molecules in the gas phase. These solvent descriptions were then used to calculate solvation free energies for all possible solute SSIP values in pure solvents and in solvent mixtures. The results allow construction of FGIPs for the pairwise interaction of any two solutes in any solvent environment. The ESI† provides a point of reference with 300 such plots for both pure solvents and solvent mixtures. The examples illustrated in the main text show that the approach provides a good description of solvophobic effects, such as the hydrophobic effect, as well as polar solvent–solute interactions and selective solvation phenomena in solvent mixtures.
The validity of the approach has been demonstrated by experimentally measuring the equilibrium constant for formation of a hydrogen bonded complex across the full composition range of mixtures of chloroform and tetrahydrofuran. The calculated and experimental values of ΔG° are within 2 kJ mol−1 for all measurements. The theoretical analysis also provides insight into the factors that govern the free energy of complexation in the liquid state. There are three contributions: the exchange of polar interactions between solvent and solutes, which can be described by the hydrogen bond parameters, α and β; the exchange of non-polar van der Waals contacts, which usually cancel out; and a constant term associated with the fact that the concentration of solvent is significantly higher than the conventional standard state of 1 M for solutes. Thus the FGIPs described here not only provide a quantitative guide to solvent effects on the free energy contributions that can be expected for specific non-covalent interactions between different functional groups, but also provide a qualitative understanding of the relative magnitudes the different factors that influence the strengths of these interactions.
Footnote |
† Electronic supplementary information (ESI) available: Automated UV-Vis titration protocol, pure solvent SSIP descriptions, and FGIP plots for 303 solvents. Please contact the authors to obtain FGIPs of solvents that are not included here. See DOI: 10.1039/d0sc01288b |
This journal is © The Royal Society of Chemistry 2020 |