Sheenam
Khuttan
ab,
Solmaz
Azimi
ab,
Joe Z.
Wu
ac,
Sebastian
Dick‡
d,
Chuanjie
Wu§
d,
Huafeng
Xu¶
d and
Emilio
Gallicchio
*abc
aDepartment of Chemistry, Brooklyn College of the City University of New York, New York, USA. E-mail: egallicchio@brooklyn.cuny.edu
bPhD Program in Biochemistry, Graduate Center of the City University of New York, USA
cPhD Program in Chemistry, Graduate Center of the City University of New York, USA
dRoivant Sciences, New York, NY, USA
First published on 7th September 2023
We apply the Alchemical Transfer Method (ATM) and a bespoke fixed partial charge force field to the SAMPL9 bCD host–guest binding free energy prediction challenge that comprises a combination of complexes formed between five phenothiazine guests and two cyclodextrin hosts. Multiple chemical forms, competing binding poses, and computational modeling challenges pose significant obstacles to obtaining reliable computational predictions for these systems. The phenothiazine guests exist in solution as racemic mixtures of enantiomers related by nitrogen inversions that bind the hosts in various binding poses, each requiring an individual free energy analysis. Due to the large size of the guests and the conformational reorganization of the hosts, which prevent a direct absolute binding free energy route, binding free energies are obtained by a series of absolute and relative binding alchemical steps for each chemical species in each binding pose. Metadynamics-accelerated conformational sampling was found to be necessary to address the poor convergence of some numerical estimates affected by conformational trapping. Despite these challenges, our blinded predictions quantitatively reproduced the experimental affinities for the β-cyclodextrin host and, to a lesser extent, those with a methylated derivative. The work illustrates the challenges of obtaining reliable free energy data in in silico drug design for even seemingly simple systems and introduces some of the technologies available to tackle them.
The validation of computational predictions with respect to experimental binding affinities has given the community an understanding of the pitfalls of the models, with indications of ways in which they can be avoided.9,16 Blinded validations, such as the Statistical Assessment of the Modeling of Proteins and Ligands (SAMPL),17 have played a particularly important role in this process.18–21 Because computational predictions are formulated without prior knowledge of experimental results, the evaluation of the models' relative performance is free of implicit biases and reflects more faithfully the expected reliability of the computational models in actual research and discovery settings.
Many SAMPL challenges include host–guest systems are considered to be straightforward, as well as more approachable, theoretically and experimentally, than macromolecular systems in terms of testing for reliability in free energy prediction tools.22,23 In this work, we report our findings in tackling the SAMPL9 bCD challenge set, which includes the binding of five phenothiazine-based drugs24 to the β-cyclodextrin host and its methylated derivative.25,26 Molecular complexes of β-cyclodextrin (bCD) are well-known and are used in a variety of biomedical and food science applications.27 They are extensively modeled28–33 and thus provide a familiar testing ground for computational models. However, as we will show, the binding equilibrium between phenothiazines and cyclodextrin hosts is far from straightforward and requires deploying the most advanced computational tools and methods in our arsenal. As also discussed in later sections, handling conformational heterogeneity in the form of chirality and multiple binding poses has been the greatest challenge in our computational protocol.
This paper is organized as follows. We first describe the above molecular systems in detail to illustrate how these exist in equilibrium as a mixture of many conformations, each with its distinct binding characteristics. We then review the Alchemical Transfer Method (ATM)33,34 and the FFEngine bespoke force field parameterization used here to estimate the binding free energies of the cyclodextrin complexes. We describe the extensive alchemical process involving absolute as well as relative binding free energy calculations to obtain the binding constants of each pose in the host–guest systems and how these constants are combined35 to yield values comparable to the experimental readouts. Reaching convergence for some complexes involving slow intramolecular degrees of freedom required advanced metadynamics-based conformational sampling strategies,36,37 which we incorporated into the alchemical binding free energy calculations. This significant intellectual and computational effort resulted in converged binding free energy estimates with a very good experimental agreement for the bCD host. The effort also illustrates the major challenges inherent in modeling complex molecular binding phenomena as well as the theories and technologies available to tackle these challenges.
The β-cyclodextrin host (Fig. 2) is a cyclic oligosaccharide of seven D-glucose monomers forming a binding cavity with a wide opening surrounded by secondary hydroxyl groups (top in Fig. 2) and a narrower opening (bottom) surrounded by primary hydroxyl groups. Hence, we will refer to the wide opening as the secondary face of bCD and the narrow opening as the primary face. The second host, heptakis-2,6-dimethyl-β-cyclodextrin (mCD, Fig. 2), is a derivative of β-cyclodextrin in which all of the primary hydroxyl groups and half of the secondary ones are methylated, affecting the size, accessibility, and hydrophobicity of the binding cavity. Although mCD does not have secondary and primary hydroxyl groups, for simplicity, we will refer to the two openings of mCD as secondary and primary faces by analogy with bCD. Being composed of chiral monomers, bCD and mCD are themselves chiral molecules with potentially different affinities for the enantiomers of optically active guests.39
The experimental binding assay reports an average over the contributions of the various chemical species of the guests. However, because interconversions between species cannot occur in molecular mechanics simulations or occur too slowly relative to the molecular dynamics timescales, to obtain a binding affinity estimate comparable with the experimental observations, it is necessary to model the binding of each relevant species individually and combine the results.41 In this work, we have considered the two conformational enantiomers for each guest (including those of PMZ and PMT with the unsubstituted phenothiazine scaffold compounds to test for convergence), plus the two chiral forms of the protonated alkyl nitrogen of TDZ and the two forms of TFP protonated at the distal and proximal alkyl nitrogens, for a total of 14 guest species. We labeled the seven species with R chirality of the phenothiazine scaffold as PMZ1R, CPZ1R, PMT1R, TDZNR1R, TDZNS1S, TFP1aR, and TFP1bR, where the first part of the label identifies the compound, followed by the net charge (+1 for all the species considered) with ‘a’ and ‘b’ label identifying the distal and proximal protonated forms of TFP respectively, plus ‘NR’ and ‘NS’ labels to distinguish the R and S chiral forms of the protonated alkyl nitrogen of TDZ. The last letter identifies the chirality of the phenothiazine scaffold so that the seven species with S chirality are named PMZ1S, CPZ1S, etc.
Fig. 4 Illustration of the classification of the four binding poses of the phenothiazine/cyclodextrin complexes based on the polar and twist angles introduced in Fig. 6. Poses are labeled as ‘ss’, ‘sp’, etc. where ‘s’ refers to the secondary face of the host and ‘p’ to the primary face of the host. The first letter of the label refers to the orientation of the alkylamine sidechain that can protrude out from the secondary face (poses ‘ss’ and ‘sp’) or from the primary face of the host (poses ‘ps’ and ‘pp’). Similarly, the second letter refers to the position of the small substituent of the phenothiazine moiety protruding out from either the primary or secondary faces of the host. |
The binding mode labels are combined with the labels discussed above that identify the chemical form of the guest to obtain the labels for each form of the guest in each binding mode. For example, the guest PMT with +1 charge with R chirality in the ‘ss’ binding mode is labeled as PMT1Rss.
For the purpose of the alchemical calculations, the binding modes are defined geometrically in terms of the polar angle θ and the azimuthal angle ψ illustrated in Fig. 6. θ is the angle between the molecular axes of the host and the guest and determines the orientation of the alkylamine sidechain relative to the host. The molecular axis of the cyclodextrin host (labeled z in Fig. 6) is oriented from the primary to the secondary faces going through the centroid of the atoms lining the faces (see Computational details). The molecular axis of the guests goes from the sulfur and nitrogen atoms of the central phenothiazine ring. The angle ψ describes the rotation around the molecular axis of the guest and determines the position of the of the phenothiazine substituent. The ‘sp’ binding mode, for example, is defined by 0 ≤ θ ≤ 90° and 90° ≤ ψ ≤ 180° (see Fig. 4 and Computational details).
Fig. 5 The structures and abbreviations of the molecular guests used as intermediate compounds in the alchemical process. |
Fig. 7 The ‘s’ and ‘p’ binding modes of the G1/β-cyclodextrin complex. The ‘s’ mode, in which the hydroxyl group points towards the secondary face of the host, is used as a starting species for the ‘ss’ and ‘sp’ binding modes of the phenothiazine/cyclodextrin complexes. The ‘p’ mode, which points towards the primary face, is the starting species for the ‘ps’ and ‘pp’ modes (Fig. 4). |
Each binding mode of the complex with G1 was then alchemically converted to an intermediary phenothiazine guest (N-methylphenothiazine, or MTZ, in Fig. 5), similar to the SAMPL9 phenothiazine derivatives with a small methyl group replacing the large alkylamine sidechains. Even though MTZ does not have conformational chirality (Fig. 3), we treated its S and R enantiomers individually to test the convergence of the RBFE estimates for each binding pose. We used atom indexes to distinguish the S and R enantiomers of these symmetric guests. Calculations were conducted to obtain the RBFEs from the G1s to the MTZRsp, MTZRss, MTZSsp, and MTZSss binding poses of the complexes of MTZ with bCD and mCD, and from G1p to the MTZRps, MTZRpp, MTZSps, and MTZSpp of the same complexes, all independently and from different starting conformations. The MTZRsp, MTZRss, MTZSsp, and MTZSss complexes are equivalent and should yield the same RBFE values within uncertainty. Similarly, the MTZRps, MTZRpp, MTZSps, and MTZSpp should yield equivalent RBFEs but distinguishable from those of the MTZRsp, MTZRss, MTZSsp, MTZSss group.
Next, RBFEs were obtained for each complex of MTZ to the corresponding complex of PMZ. For example, the MTZRsp binding pose of the MTZ complex with bCD and mCD were converted to the PMZ1Rsp binding pose of the corresponding complexes between PMZ and the hosts. Finally, the RBFEs between each pose of PMZ and the corresponding binding poses of the other guests were obtained. During this process, we monitored convergence by looking at the discrepancy between the RBFEs corresponding to the equivalent symmetric poses of the achiral PMZ and PMT guests. The overall alchemical process to obtain the ABFEs of the SAMPL9 phenothiazine guests is summarized in Fig. 8.
(1) |
(2) |
Moreover, as also shown in the Appendix, the fractional contribution of binding mode i to the overall binding constant is the population, P1(i) of mode i of the complex:35
(3) |
In this specific application, the binding modes refer to the ‘ss’, ‘sp’, etc. orientations of the R and S enantiomers of each guest. We individually obtained the binding constants Kb(i) for each binding mode. In the corresponding alchemical simulations, the orientation of the ligand in the binding site is set by restraining potentials based on the θ and ψ angles (see Fig. 6 and Computational details). These orientations are equally likely in solutions. We also assume an equal likelihood of the R and S conformational enantiomers of the guests in solution, leading to P0(i) = 1/8 for each pose of each guest. The TDZ and TFP guests have twice as many poses due to point chirality and multiple protonation states of their alkylamine sidechain that are approximately equally likely in solution based on pKa analysis with Epik.44 Hence, for simplicity, we set P0(i) = 1/16 for each state of the TDZ and TFP guests.
Uλ(x) = U0(x) + Wλ(u) | (4) |
In this work, we used the strategy above to compute the absolute binding free energy (ABFE) of the G1 guest in two different poses. The binding free energies of the phenothiazine guests are obtained by a series of relative binding free energy (RBFE) calculations starting from G1 (Fig. 8). The alchemical RBFE implementation of the ATM method34 is similar to ABFE implementation except that one ligand of the pair under investigation is transferred from the solution to the binding site while the other ligand is transferred in the opposite direction. The receptor and the two ligands are placed in a single solvated simulation box in such a way that one ligand is bound to the receptor and the other is placed in an arbitrary position in the solvent bulk. Molecular dynamics simulations are then conducted with a λ-dependent alchemical potential energy function that connects, in an alchemical sense, the state of the system where one guest is bound to the receptor and the other in solution, to the state where the positions of the two guests are reversed. The free energy change of this process yields the relative binding free energy of the two guests. To enhance convergence, the two ligands are kept in approximate alignment to prevent the one in solution to reorient away from the orientation of the bound pose. We have shown mathematically that the alignment restraints implemented in ATM do not bias the binding free energy estimates.34
In this work, we employed metadynamics conformational sampling to obtain converged RBFE estimates for the PMT guest. Well-tempered metadynamics45 is a well-established enhanced sampling technique to sample rare events during MD simulations when they are separated from other metastable states by large energy barriers. The technique uses a bias potential, Ubias, that lowers energy barriers along a slow degree of freedom. In this work, the metadynamics biasing potential is obtained along a dihedral angle, φ, of the alkylamine sidechain of PMT (see Computational details) from a simulation in a water solution, using OpenMM's well-tempered metadynamics implementation by Peter Eastman.37 The alchemical binding free energy calculation is then conducted with the biasing potential, Ubias(φ), added to the alchemical potential energy function in eqn (4). The resulting binding free energy estimate is then unbiased using a book-ending approach46 by computing the free energy differences of the system without the biasing potential from samples collected with the biasing potential at the endpoints of the alchemical path. In this work, we used a simple unidirectional exponential averaging formula
−kBTln〈exp(Ubias/kBT)〉bias |
Force field parameters were assigned as described above. In RBFE calculations, the second ligand in the pair was placed in the solvent by translating it along the displacement vector. The resulting system was then solvated using tleap51 in a TIP3P box with a 10 Å solvent buffer and sodium and chloride counterions to balance the negative and positive charges, respectively. The systems were energy minimized, thermalized, and relaxed in the NPT ensemble at 300 K and 1 atm pressure. Annealing of the systems to λ = 1/2 in the NVT ensemble followed to obtain initial structures to initiate the parallel replica exchange ATM calculations. Alchemical calculations were conducted with the OpenMM 7.7 MD engine on the CUDA platform with a 2 fs time-step, using the AToM-OpenMM software.52 Asynchronous Hamiltonian Replica Exchange53 in λ space was attempted every 20 ps and binding energy samples were collected at the same frequency. The ATM-RBFE employed 22 λ states distributed between λ = 0 to λ = 1 using the non-linear alchemical soft-plus potential and the soft-core perturbation energy with parameters umax = 200 kcal mol−1, uc = 100 kcal mol−1, and a = 1/16.13 The input files of the simulations are provided in our lab's GitHub repository (https://github.com/GallicchioLab/SAMPL9-bCD). We collected approximately 20 ns trajectories for each replica corresponding to approximately 440 ns for each of the 64 alchemical steps for each host (Fig. 8). Overall, we simulated the systems for over 6 μs. UWHAM54 was used to compute binding free energies and the corresponding uncertainties after discarding the first half of the trajectory.
To obtain the torsional flattening biasing potential, we simulated the PMT guest in solution with metadynamics over the (C–N–C–C) alkylamine sidechain torsional angle highlighted in Fig. 10. A well-tempered metadynamics bias factor of 8 was used, with a Gaussian height of 0.3 kcal mol−1 and width of 10°.45 The bias potential was collected for 20 ns, updating it every 100 ps. The resulting potential of mean force is shown in Fig. 10. The metadynamics-derived biasing potential was used in all the RBFE calculations involving the PMT guest to accelerate the sampling of the slow torsional degree of freedom in question.
Name | ΔGb (SAMPL9)abc | ΔGb (ATM)abd | ΔGb (expt)ae |
---|---|---|---|
a In kcal mol−1. b Errors are reported as twice the standard deviation. c Blinded computational predictions submitted to the SAMPL9 challenge organizers. d Revised ATM computational predictions. e SAMPL9 blinded experimental isothermal calorimetry data.25 | |||
bCD-TDZ | −4.28(90) | −4.56(47) | −5.73 |
bCD-TFP | −6.51(111) | −5.42(99) | −5.09 |
bCD-PMZ | −3.73(48) | −4.03(45) | −5.00 |
bCD-PMT | −2.53(70) | −3.00(47) | −4.50 |
bCD-CPZ | −7.28(92) | −4.64(70) | −5.45 |
mCD-TDZ | −5.16(140) | −2.96(68) | −6.50 |
mCD-TFP | −4.14(62) | −3.98(70) | −5.57 |
mCD-PMZ | −2.37(54) | −2.34(55) | −5.08 |
mCD-PMT | −1.80(99) | −1.58(60) | −5.39 |
mCD-CPZ | −5.22(90) | −5.13(88) | −5.43 |
The predictions for the bCD complexes are in reasonable agreement with the experiments. The revised predictions, in particular, are all within 1.5 kcal mol−1 of the experimental measurements and within 1 kcal mol−1 for four of the five bCD complexes. Although the range of the binding affinities is small, some trends are reproduced and the weakest binder (PMT) is correctly identified. The quality of the predictions for the mCD host is not as good, and it worsened upon revision. The experiments show that the phenothiazine guests bind slightly more strongly to mCD than bCD. However, except for CPZ, the calculations predict significantly weaker binding to mCD relative to bCD. The computed free energies of the mCD complexes are on overage over 2 kcal mol−1 less favorable than the experimental ones. The revised prediction for the mCD-TDZ complex is particularly poor and fails to identify this complex as the most stable in the set. While a detailed investigation of the sources of the poor prediction for mCD has not been carried out, our model could not have identified the best possible binding poses for this more flexible host. mCD is also more hydrophobic and the energy model may overpredict the reorganization free energy to go from the apo to the holo conformational ensemble for this host.
Reassuringly, the calculations predict that the populations of the symmetric binding modes of the complexes with the PMZ and PMT guests are more evenly distributed than for the other complexes. Lacking a substituent of the phenothiazine moiety (Fig. 1), the PMZ and PMT guests do not display conformational chirality (Fig. 3). Hence, their ‘ssS’, ‘spS’, ‘ssR’, and ‘spR’ binding modes are chemically equivalent and should have the same population. Similarly, the binding modes ‘psS’, ‘ppS’, ‘psR’, and ‘ppR’ of these guests are mutually equivalent. Still, they are distinguishable from the ‘ssS’, ‘spS’, ‘ssR’, ‘spR’ group by the position of the alkylamine sidechain (Fig. 4). We used these equivalences to assess the level of convergence of the binding free energy estimates. Although redundant for symmetric poses, we simulated each binding mode of these guests individually, starting from different initial configurations, and checked how close the pose-specific binding free energies varied within each symmetric group. For example, the computed populations of the ‘ssS’, ‘spS’, ‘ssR’, and ‘spR’ poses of the PMZ-bCD complex vary in a narrow range between 7.5 and 15.9%, indicating good convergence. However, the corresponding populations for the complex with mCD are not as consistent – the ‘ssS’ mode predicted to be significantly less populated (4%) than the other modes (20–40%) – reflecting poorer convergence.
The pose-specific binding free energy estimates probe the chiral binding specificity of the hosts. Except for the TFP guest that is predicted to bind predominantly in the R chiral form (88% population), bCD shows little chiral preference. mCD displays a slightly stronger chiral specificity, with CPZ predicted to bind predominantly in the S form and TFP in the R form.
Consistent with the significant conformational variability that we predicted, only a sparse set of NOEs of the complexes of bCD and mCD with the symmetric guests PMT and PMZ are observed.25 NOE signals for these complexes include only a few interactions between the phenothiazine moieties and protons at either face of the host, as seen in the molecular simulations. No NOE interactions are observed for the mCD-PMT complex. The high conformational variability of the complexes with the TDZ guest (Fig. 9A) is similarly confirmed by the lack of NOEs involving the alkylamine sidechain and the observed NOEs with the phenothiazine core, indicating that it binds in both orientations with the substituted end near both the primary and secondary faces of the host.
In agreement with our predictions (Fig. 9A), the observed NOEs indicate binding of the CPZ and TFP guests in a single dominant binding pose, but not always the predicted pose. The set of NOEs of the bCD-CPZ complex,25 is in agreement with the prediction that the bCD-CPZ complex has an aggregate ‘sp’ binding pose population of over 80% (Fig. 9A). However, the computational prediction that the mCD-CPZ complex exists predominantly in the ‘spS’ binding pose (Fig. 9B) does not appear to be well supported by the experimental NOEs between the secondary face of mCD and the protons of the phenothiazine moiety closest to the substituent.25 Finally, the predictions that the TFP guest binds the bCD and mCD hosts mainly in the ‘sp’ pose, in which the alkylamine sidechain is oriented toward the secondary face of the host and the trifluoro-methyl group toward the primary face, is not born out in the experimental NOEs that indicate a strong preference for ‘ps’ and ‘pp’ poses. The contrast between these structural inconsistencies and the good alignment between the computed and experimental binding free energies for these complexes (Table 1) are potentially a further indication that binding pose populations can be very sensitive to minute shifts in interatomic interaction energies.
Pose | ΔΔGb (ATM)abc | ΔΔGb (ATM + MetaD)abd |
---|---|---|
a In kcal mol−1. b Errors are reported as twice the standard deviation. c Estimates computational predictions submitted to the SAMPL9 challenge organizers. d Revised ATM estimates with metadynamics conformational sampling. | ||
bCD | ||
spS | 3.94(39) | 0.44(25) |
ssS | 2.80(39) | 0.58(25) |
spR | 0.28(34) | 1.12(24) |
ssR | 4.62(45) | 1.65(25) |
psS | 1.98(36) | 1.49(24) |
ppS | 2.03(39) | 1.10(24) |
psR | 0.77(29) | 1.46(24) |
ppR | 0.28(39) | 0.57(24) |
mCD | ||
spS | 1.93(37) | 0.96(25) |
ssS | 0.57(44) | 0.11(26) |
spR | 1.59(41) | 0.30(25) |
ssR | 0.25(42) | 1.90(25) |
psS | 1.26(39) | −0.14(24) |
ppS | 0.20(42) | 0.95(25) |
psR | 1.54(41) | −0.41(25) |
ppR | 0.10(38) | 0.10(24) |
The molecular dynamics trajectories analysis later revealed that the observed scatter of relative binding free energy estimates was due to the conformational trapping of the PMT guest in the starting conformation, which was randomly set during the system setup. Simulations with PMT trapped in some conformations overestimated the RBFE while those in the other underestimated it. We pin-pointed the conformational trapping to the branched alkylamine side chain of PMT which showed hindered rotation around one of its torsional angles (Fig. 10) caused by a large free energy barrier separating the gauche(+) and gauche(−) conformers (Fig. 10). The variations of conformers in the alchemical calculation broke the symmetry between equivalent poses and caused the scatter in the observed RBFEs.
Fig. 10 The potential of mean force (PMF) in water solution along the highlighted torsional angle, φ, of PMT1 computed by well-tempered metadynamics sampling.45 The PMF identifies two major gauche conformational states at positive and negative angles around 60° and −120° separated by a large free energy barrier at 180° of more than 7 kcal mol−1 from the global minimum at −50°. The free energy barrier is sufficiently high that interconversions between the two stable conformational states are not observed in the time-scale of the alchemical calculations without the metadynamics landscape-flattening potential. |
To correct these inconsistencies, we modified our alchemical binding protocol to include a metadynamics-derived flattening potential bias that reduced the magnitude of the free energy barrier separating the conformers of the alkylamine sidechain of PMT (see Methods and computational details). We confirmed that the biasing potential successfully induced rapid conformational transitions between these conformers, which were rarely achieved with the conventional ATM protocol. Consequently, integrating metadynamics-enhanced sampling with ATM (ATM-MetaD) indeed produced much better convergence of binding free energy estimates of symmetric poses starting from different initial conformers (Table 2). For example, the large discrepancy of RBFE estimates between the ‘spS’ and ‘spR’ binding poses was reduced to less than 1 kcal mol−1 with ATM-metaD and in closer consistency with statistical uncertainties. With only one exception, improved convergence was also achieved for the equivalent binding poses of bCD and mCD, falling within a 1 kcal mol−1 range of each other (Table 2).
The present SAMPL9 bCD challenge highlights the importance of properly treating conformational heterogeneity to obtain reliable quantitative descriptions of binding equilibria. We undertook this challenge with the mindset that host–guest systems are simpler surrogates of more challenging and conformationally diverse protein–ligand complexes and, hence, more suitable to assess computational methodologies. However, as later confirmed by the NMR NOE experimental analysis,25 we realized that the phenothiazine/cyclodextrin complexes could be far more chemically and conformationally diverse than expected. Most of the guests exist in solution as mixtures of enantiomers related by nitrogen inversion (Fig. 3) which are distinctly recognized by the chiral hosts. As a result, one enantiomer could be significantly enriched relative to the other when bound to the host. In addition, each enantiomer binds the host in four generally distinct binding orientations that differ in the placement of the alkylamine sidechain and the substituent of the phenothiazine moiety relative to the host (Fig. 4). While in the experimental setting the guests and the complexes rapidly transition from one pose to another, this level of conformational heterogeneity poses serious challenges for standard molecular dynamics conformational sampling algorithms, which are generally limited to one binding pose.
When facing these complexities, it is tempting to limit the modeling to the most important binding pose. While it is true that often the most favorable pose contributes the most to the binding affinity and that neglecting minor poses results in small errors, binding pose selection remains an unresolved issue. Clearly, identifying the major pose cannot be carried out by binding free energy analysis of each pose because that is precisely what one seeks to avoid in such a scenario. Whichever approach is adopted, it must be capable of identifying the most stable pose of each complex among many competing poses. The present results illustrate this challenge. For example, the populations derived from our free energy analysis indicate that the ‘spR’ binding pose is often one of the most populated (red in Fig. 9). However, CPZ is predicted to strongly prefer the ‘spS’ pose when binding to mCD (orange in Fig. 9B), and limiting the modeling to the ‘spR’ pose would result in a gross underestimation of the binding free energy. Similarly, the TDZ-bCD complex is predicted to exist in a variety of poses (Fig. 9A), including, for instance, the ‘psR’ pose with the alkylamine sidechain pointing towards the primary face of bCD, with the ‘spR’ pose contributing only a small fraction of the observed binding affinity. Clearly, at least in this case, limiting the modeling to one carefully selected predetermined pose would lead to significant mispredictions for individual complexes.
To obtain an estimate of the observed binding constants, in this work, we opted to compute the binding free energies of all of the relevant binding poses of the system and to integrate them using the free energy combination formula [eqn (1)]. The combination formula requires the populations of the conformational states of the guest in solution that, in this case, are easily obtained by symmetry arguments. Still, the work involved 64 individual alchemical free energy calculations (Fig. 8) and hundreds of nanoseconds of simulation on GPU devices. While we attempted to automate the process as much as possible, setup errors were made and it is likely that some yet undiscovered defects are still affecting our revised results. We assessed the convergence of the pose-specific binding free energy estimates by monitoring the consistency between the results for symmetric poses. As a result of this assessment, we realized that one guest (PMT) was affected by slow conformational reorganization that required metadynamics treatment. This best-effort attempt resulted in good quantitative predictions for the complexes with β-cyclodextrin host. However, our model failed to properly describe the binding free energies of the complexes with the methylated derivative (mCD). Force field limitations that cause excessive reorganization of the host in solution and the existence of alternative binding modes not considered in our analysis are some of the possible explanations of why our free energy predictions consistently underestimated the binding affinities of the complexes with mCD (Table 1).
(5) |
(6) |
To obtain an expression for the reminder ratio of the partition function of binding mode i in the unbound ensemble to the partition function of the complex in the bound ensemble, multiply and divide by the partition function of the system in the unbound ensemble noting that:
(7) |
(8) |
Footnotes |
† Electronic supplementary information (ESI) available: Spreadsheets SAMPL9-bCDpc-FFEngine and SAMPL9-mCDpc-FFEngine containing: (i) the absolute binding free energy of host G1 in the ‘s’ and ‘p’ binding modes, (ii) the relative binding free energies between G1 in the ‘p’ and ‘s’ poses and MTZ in the ‘ss’, ‘sp’, etc. binding modes, (iii) the relative binding free energy between MTZ and PMZ in each of the eight binding modes, (iv) the relative binding free energies between PMZ and the other guests in each of the eight binding poses, (v) the binding mode specific binding constants for each complex in each binding mode, and (vi) the calculated populations of each binding mode for each complex. See DOI: https://doi.org/10.1039/d3cp02125d |
‡ Present address: D.E. Shaw Research, New York, NY, USA. |
§ Present address: Schrödinger Inc., New York, NY, USA. |
¶ Present address: Atomapper Inc., New York, NY, USA. |
This journal is © the Owner Societies 2023 |