Cyclic peptides: backbone rigidification and capability of mimicking motifs at protein–protein interfaces

He Huang; Jovan Damjanovic; Jiayuan Miao; Yu-Shan Lin

doi:10.1039/D0CP04633G


DOI: 10.1039/D0CP04633G (Paper) Phys. Chem. Chem. Phys., 2021, 23, 607-616


He Huang‡ , Jovan Damjanovic‡ , Jiayuan Miao and Yu-Shan Lin *
Department of Chemistry, Tufts University, Medford, Massachusetts 02155, USA. E-mail: yu-shan.lin@tufts.edu

Received 2nd September 2020 , Accepted 30th November 2020

First published on 17th December 2020

Abstract

Cyclization is commonly employed in efforts to improve the target binding affinity of peptide-based probes and therapeutics. Many structural motifs have been identified at protein–protein interfaces and provide promising targets for inhibitor design using cyclic peptides. Cyclized peptides are generally assumed to be rigidified relative to their linear counterparts. This rigidification potentially pre-organizes the molecules to interact properly with their targets. However, the actual impact of cyclization on, for example, peptide configurational entropy, is currently poorly understood in terms of both its magnitude and molecular-level origins. Moreover, even with thousands of desired structural motifs at hand, it is currently not possible to a priori identify the ones that are most promising to mimic using cyclic peptides nor to select the ideal linker length. Instead, labor-intensive chemical synthesis and experimental characterization of various cyclic peptide designs are required, in hopes of finding one with improved target affinity. Herein, using molecular dynamics simulations of polyglycines, we elucidated how head-to-tail cyclization impacts peptide backbone dihedral entropy and developed a simple strategy to rapidly screen for structures that can be reliably mimicked by preorganized cyclic peptides. As expected, cyclization generally led to a reduction in backbone dihedral entropy; notably, however, this effect was minimal when the length of polyglycines was >9 residues. We also found that the reduction in backbone dihedral entropy upon cyclization of small polyglycine peptides does not result from more restricted distributions of the dihedrals; rather, it was the correlations between specific dihedrals that caused the decrease in configurational entropy in the cyclic peptides. Using our comprehensive cyclo-G_n structural ensembles, we obtained a holistic picture of what conformations are accessible to cyclic peptides. 
Using “hot loops” recently identified at protein–protein interfaces as an example, we provide clear guidelines for choosing the “easiest” hot loops for cyclic peptides to mimic and for identifying appropriate cyclic peptide lengths. In conclusion, our results provide an understanding of the thermodynamics and structures of this interesting class of molecules. This information should prove particularly useful for designing cyclic peptide inhibitors of protein–protein interactions.

Introduction

Peptides have many unique properties that make them promising candidates for the development of chemical probes and therapeutics.^1–3 For example, they can be used to inhibit protein–protein interactions, which are involved in many biological processes and diseases.^5–7 However, peptides come with many liabilities, including structural flexibility that often leads to relatively poor affinity, selectivity, and bioavailability. In an effort to address these deficiencies, peptide cyclization is now a common strategy to rigidify these molecules and thereby improve their binding affinity, selectivity, and bioavailability.^3,8–10 Notably, more than seven thousand “hot loops” were recently identified at interfaces of protein–protein interactions.^11,12 These hot loops have been proposed to be viable starting points for using cyclic peptides to inhibit protein–protein interactions.

Although the potential applications of peptide cyclization are well demonstrated, it remains very difficult to predict the conformational outcomes of cyclization and the magnitude of the associated decrease in peptide configurational entropy. NMR measurements show that head-to-tail cyclization indeed leads to rigidification of the peptide backbones, resulting in, for example, smaller amide temperature coefficients and an increase in the order parameter.¹³ However, thermodynamic measurements of binding show that addition of a conformational restraint does not always confer favorable changes in binding entropy.^14,15 It is possible, for instance, that the favorable change in peptide entropy is compensated by an unfavorable change in solvation entropy; however, it is challenging to separate these two contributions using thermodynamic measurements. Therefore, the change of peptide entropy upon cyclization remains poorly understood in terms of its magnitude, the molecular-level sources of the change, and its dependence on peptide size. Moreover, cyclization is not always beneficial and can have deleterious effects on binding affinities.^10,15–18 Because it remains difficult to rationally design linkers that can properly rigidify the peptides of interest into desired conformations, synthesis of many variants is required, in hopes of finding one with appropriate conformational properties. In sum, the impact of cyclization on peptide properties is neither well understood nor straightforward. Guidelines on appropriate linker choices and on which conformations can be reliably mimicked by cyclic peptides would be highly beneficial.

Molecular dynamics (MD) simulations are a powerful tool to study molecules in atomic-level detail. They have been used to characterize the thermodynamic properties and structural ensembles of linear and cyclic peptides. For instance, using MD simulations of linear polyglycines as model systems, Drake et al. found that the dihedral configurational entropy of such peptides increases linearly with their chain lengths.⁴ MD simulations have also been applied to estimate the conformational entropy for a series of cyclic RGD peptides with a variety of modifications.¹⁹ MD simulations can further characterize the conformational ensembles of cyclic peptides, enabling an understanding of their activities and the design of well-structured cyclic peptides.^20–23

However, while previous simulations have provided useful information and insights, they generally focused on either linear peptides or cyclic peptides. In this report, we perform a systematic study of both linear and cyclic peptides to compare their configurational entropy (Fig. 1A). Furthermore, as mentioned previously, thousands of hot loops at protein–protein interfaces have been identified as ideal starting points for cyclic peptide inhibitor design. Closely mimicking the structure of a protein–protein interface has been used to great effect in macrocycle inhibitor design, and presents a strong starting point for structure-based inhibitor development.^12,24,25 Nonetheless, it remains difficult to know a priori whether a given hot loop can be mimicked by a cyclic peptide at all, let alone what the appropriate cyclic peptide size should be. Here, we use the structural ensembles of cyclic polyglycines to provide a quick way to determine which sizes of cyclic peptides (if any) may be the most promising for mimicking the hot loop of interest. The premise is that cyclic polyglycines are the most flexible cyclic peptides and the most likely to sample the various conformations adoptable by cyclic peptides. By comparing a hot loop conformation to a cyclo-G_n structural ensemble, we can then determine whether the specific hot loop conformation is mimicable by cyclic peptides of size n (Fig. 1B). Our results provide clear guidelines on the “easiest” hot loops for cyclic peptides to mimic and suggest candidate cyclic peptide sizes to test.


	Fig. 1 (A) Linear and cyclic polyglycines with length ranging from 5 to 15 residues are used to study the size-dependent effects of cyclization on the thermodynamic and structural properties of peptides. (B) If a desired hot loop conformation is observed in a cyclo-G_n structural ensemble, it is assumed that a cyclic peptide of size n should be viable for mimicking the specific hot loop. The next step can be simulating cyclic peptides of size n with the specific hot loop sequence incorporated. The hot loop shown is ⁵²DLIYY⁵⁶ from chain A in PDB ID = 2JB0; the sidechains of the hotspot residues (D52, Y55, and Y56) are highlighted. *However, the absence of the desired structure from a cyclo-G_n structural ensemble does not imply that the structure cannot be mimicked by a cyclic peptide of size n.

Methods

A. Linear and cyclic peptides of G₅ to G₁₅

Linear and head-to-tail cyclized polyglycine peptides with length varying from 5 to 15 residues, denoted as linear- and cyclo-G_n, with 5 ≤ n ≤ 15, were built and simulated (Fig. 1A). The linear peptides were capped with an acetyl group on the N-terminus and with an N-methyl amide group on the C-terminus. The capping prevented creation of charged termini, which could have led to strong Coulombic interaction. It also ensured that linear and cyclic peptides of the same length had the same number of backbone ϕ/ψ dihedrals. For each peptide system, two sets of simulations and analyses, starting from two different initial structures, were performed in parallel to verify the convergence of the simulation results. Details on how the initial structures were generated can be found in the ESI.†

B. Molecular dynamics simulations

To characterize their thermodynamic and structural properties, all the cyclic and linear peptides were simulated using molecular dynamics (MD) simulations. All the simulations were performed using the GROMACS 4.6.7 suite²⁶ with the RSFF2 force field and TIP3P water model.^27,28 The RSFF2 force field was parameterized using a coil library and was previously shown to be able to fold well-behaved peptides²⁹ and recapitulate the crystal structures of cyclic peptides reasonably well.³⁰ Two sets of simulations starting from two different initial structures were performed for each peptide system (herein called S1 and S2, simulations 1 and 2). Each initial structure was solvated and equilibrated, and a 1.0 μs production run was then performed at 300 K and 1 bar, with trajectories output every 1.0 ps for subsequent analysis. Details on the simulation setup can be found in the ESI.†

C. Bias-exchange metadynamics simulations

Besides using the two independent simulations S1 and S2 starting from different initial structures to verify the convergence of the simulation results, bias-exchange metadynamics (BE-META) simulations were carried out for selected systems of linear-G_n (n = 5, 10, 15) and cyclo-G_n (n = 5–10, 15) to further validate the convergence of the thermodynamic analysis results. BE-META simulations were performed for 250 ns using the PLUMED 2 plugin for the GROMACS 4.6.7 suite.³¹ Each replica was biased along (ϕ_i, ψ_i) or (ψ_i, ϕ_i+1), which were previously shown to represent the essential transitional motions of cyclic peptides and greatly enhance their conformational sampling.³² Additionally, five neutral replicas with no bias were added to generate unbiased trajectories for structural analysis. Exchanges between replicas were attempted every 5 ps; Gaussian hills with a height of 0.1 kJ mol⁻¹ and a width of 0.314 rad in each dimension were added every 4 ps. The simulation trajectories were saved every 1 ps for subsequent analysis.

D. Estimation of backbone configurational entropy

For a system with N degrees of freedom, its configurational entropy S^config is:


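S^config = −k_B ∫ ρ(r₁,…, r_N) ln ρ(r₁,…, r_N) dr₁…dr_N,	(1)

where k_B is the Boltzmann constant, r_i represents degree of freedom i, and ρ(r₁,…, r_N) is the N-dimensional joint probability density function (PDF). Eqn (1) can be reformatted using the mutual information expansion (MIE) scheme:³³


S^config = Σ_{i} S₁(r_i) − Σ_{i<j} I₂(r_i,r_j) + Σ_{i<j<k} I₃(r_i,r_j,r_k) − … + (−1)^(N−1) I_N(r₁,…, r_N).	(2)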

In eqn (2), S₁(r_i) is the marginal entropy of degree of freedom r_i; I₂(r_i,r_j) is the pairwise joint mutual information measuring the correlation between two degrees of freedom r_i and r_j, which is calculated as the difference between the simple summation of S₁(r_i) and S₁(r_j) and the actual joint entropy, S₂(r_i,r_j):


I₂(r_i,r_j) = S₁(r_i) + S₁(r_j) − S₂(r_i,r_j).	(3)

Similarly, I₃ is calculated as


I₃(r_i,r_j,r_k) = S₁(r_i) + S₁(r_j) + S₁(r_k) − S₂(r_i,r_j) − S₂(r_j,r_k) − S₂(r_i,r_k) + S₃(r_i,r_j,r_k),	(4)

and so forth. Although N-dimensional PDFs could in principle be determined, it is typically not practical to achieve adequate convergence for PDFs in three or more dimensions; indeed, such PDFs were extremely difficult to converge for our systems as well.⁴ Hence, in calculations of configurational entropy, eqn (2) is usually truncated to include terms only up to I₂. When only the marginal entropies of individual degrees of freedom are included, the result is called the first-order approximation of the full configurational entropy:


S^config ≈ S₁ = Σ_{i=1}^{N} S₁(r_i).	(5)

The second order MIE approximation (S^MIE₂) incorporates the pairwise mutual information terms:


S^config ≈ S^MIE₂ = Σ_{i=1}^{N} S₁(r_i) − Σ_{i<j} I₂(r_i,r_j).	(6)

In this paper, we focused on the backbone ϕ and ψ angles and computed both the S₁ and S^MIE₂, using the latter to provide a lower bound of the backbone configurational entropy. It was assumed that only the backbone ϕ and ψ dihedral angles contribute to the difference of configurational entropy between linear and cyclic peptides, and contributions from the other degrees of freedom, including the bonds, angles, and nearly rigid ω angles could be neglected in the entropy estimations.⁴
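As a rough illustration of the first- and second-order estimates described above, the following Python sketch computes S₁ and S^MIE₂ (in units of k_B) from binned dihedral time series. It is a plain plug-in histogram estimator, not the PyPARENT code used in this work; the function names (`bin_index`, `plugin_entropy`, `s1_and_mie2`) are hypothetical, and no finite-sampling bias correction is applied.

```python
import math
from collections import Counter
from itertools import combinations


def bin_index(angle_deg, n_bins=120):
    """Map an angle in degrees onto one of n_bins equal-width periodic bins."""
    width = 360.0 / n_bins
    return int(((angle_deg + 180.0) % 360.0) // width) % n_bins


def plugin_entropy(labels):
    """Plug-in Shannon entropy (nats) of a sequence of hashable labels."""
    n = len(labels)
    return -sum(c / n * math.log(c / n) for c in Counter(labels).values())


def s1_and_mie2(dihedral_series, n_bins=120):
    """Return (S1, S_MIE2) in units of k_B.

    dihedral_series: one list of angles (degrees) per dihedral, all of
    equal length (one value per trajectory frame).
    """
    binned = [[bin_index(a, n_bins) for a in s] for s in dihedral_series]
    # ln(bin width in radians) turns a discrete entropy into a
    # differential one; it cancels in mutual information terms.
    log_width = math.log(2.0 * math.pi / n_bins)
    marginal = [plugin_entropy(b) for b in binned]
    s1 = sum(h + log_width for h in marginal)
    mi_sum = 0.0
    for i, j in combinations(range(len(binned)), 2):
        joint = plugin_entropy(list(zip(binned[i], binned[j])))
        mi_sum += marginal[i] + marginal[j] - joint  # I2 for this pair
    return s1, s1 - mi_sum
```

Multiplying the returned values by the gas constant (about 1.987 cal mol⁻¹ K⁻¹) gives molar entropies. Because linear and cyclic peptides of the same length have the same number of ϕ/ψ dihedrals, the bin-width offset cancels when their entropies are subtracted.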

To obtain an upper bound of the configurational entropy, the maximum information spanning tree (MIST) algorithm was used:³⁴


S^config ≈ S^MIST₂ = Σ_{i=1}^{N} S₁(r_i) − Σ_{(i,j)∈T} I₂(r_i,r_j),	(7)

where T denotes the set of N − 1 pairs of degrees of freedom that form a spanning tree with the maximal sum of pairwise mutual information. Instead of subtracting all the possible I₂ terms as in eqn (6), eqn (7) considers only these N − 1 pairwise terms. Therefore, in contrast to S^MIE₂, which provides a lower bound, S^MIST₂ provides an upper bound of the configurational entropy.³⁴
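Selecting the N − 1 pairwise terms with the largest total mutual information amounts to finding a maximum spanning tree over the dihedrals. The sketch below does this with Kruskal's algorithm and a small union-find structure. It is an illustrative assumption about how such a selection can be coded, not the implementation used in this work; the function names and the input layout (a dictionary keyed by index pairs) are hypothetical.

```python
def mist2_entropy(marginals, mutual_info):
    """Second-order MIST estimate.

    marginals:   list of marginal entropies S1(r_i)
    mutual_info: dict mapping (i, j) with i < j to I2(r_i, r_j)
    """
    parent = list(range(len(marginals)))

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree_mi = 0.0
    # Kruskal: take pairs from largest to smallest I2, skipping any pair
    # that would close a cycle; at most N - 1 pairs are kept.
    for (i, j), mi in sorted(mutual_info.items(),
                             key=lambda kv: kv[1], reverse=True):
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            tree_mi += mi
    return sum(marginals) - tree_mi


def mie2_entropy(marginals, mutual_info):
    """Second-order MIE estimate: subtract every pairwise I2 term."""
    return sum(marginals) - sum(mutual_info.values())
```

Since pairwise mutual information is non-negative and the tree uses only a subset of the pairs, the MIST value is never smaller than the MIE value computed from the same inputs.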

For each simulation, ϕ(C′–N–C_α–C) and ψ(N–C_α–C–N′′) backbone dihedrals were calculated and used as input for an in-house Python adaptation of the PARENT program to calculate S₁, S^MIE₂, and S^MIST₂.³⁵ 120 bins were used for each dimension to construct the probability density functions. Our modified PyPARENT code follows previous standards, which apply a bias correction for systematic errors in entropy calculations due to finite sampling.^35–37 The calculated entropies showed very little dependence on the number of bins used (Fig. S1, ESI†).
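For readers unfamiliar with how a backbone dihedral is obtained from coordinates, the following sketch computes the signed torsion angle defined by four atoms (for example C′–N–C_α–C for ϕ). It uses a standard projection-plus-atan2 formula and is not taken from the analysis code used here; the function name `dihedral_deg` is hypothetical.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral_deg(p0, p1, p2, p3):
    """Signed dihedral angle (degrees) about the p1-p2 bond."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(x / norm for x in b1)  # unit vector along the central bond
    # Components of the outer bonds perpendicular to the central bond.
    v = tuple(x - _dot(b0, b1) * y for x, y in zip(b0, b1))
    w = tuple(x - _dot(b2, b1) * y for x, y in zip(b2, b1))
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))
```

The result lies in the interval from −180° to 180°, which is why any histogram or comparison of these angles has to treat −180° and +180° as the same point.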

E. Structural comparison to hot loops

Approximately seven thousand “hot loops” have been identified at various protein–protein interactions.^11,12 These hot loops are loop-like structures predicted to have high binding affinity to their partner proteins by Rosetta-based alanine scanning.^38,39 From among all loops at protein–protein interfaces, three criteria were used to identify the “hot loops”: the average ΔΔG_res of the loop ≥0.6 Rosetta Energy Units, the number of hot spots in the loop ≥3, and the loop representing ≥50% of the total predicted interface energy. Out of the 7225 hot loops identified, 210 satisfied all three criteria and were considered the best starting points for peptide therapeutic design of protein–protein interaction inhibitors.¹² Here we estimated the likelihood that cyclic peptides of different sizes could be used to mimic each of these 210 hot loops. Three of the 210 hot loops seemed to have additional residues that were not included in the original hot loop library and thus were not used in the analysis here. Furthermore, fourteen hot loops had cis peptide bonds and were also excluded from the analysis. For the 193 all-trans hot loops, the dihedral angles ψ₁, ϕ₂, ψ₂,…,ϕ_m were computed (where m was the size of the hot loop). These dihedrals were compared to the cyclo-G_n conformations sampled in the MD simulations with cyclic degeneracy taken into consideration.
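The comparison described above can be sketched as follows. The code assumes that "cyclic degeneracy" means the hot loop may be aligned starting at any residue of the homopolymeric ring, and it uses the 25° per-dihedral cutoff stated later in the text as the default. The function names and the data layout (lists of (ϕ, ψ) tuples in degrees, with `None` for the unused ϕ₁ and ψ_m of the loop) are hypothetical.

```python
def angle_diff(a, b):
    """Smallest absolute difference between two angles in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def find_matching_offset(loop, frame, cutoff=25.0):
    """Return the first ring offset at which the loop matches, else None.

    loop:  [(phi, psi), ...] for the m loop residues; phi of the first and
           psi of the last residue are ignored (may be None).
    frame: [(phi, psi), ...] for the n residues of one cyclo-G_n frame.
    """
    m, n = len(loop), len(frame)
    if m > n:
        return None
    for offset in range(n):  # try every starting residue of the ring
        matched = True
        for r, (phi, psi) in enumerate(loop):
            f_phi, f_psi = frame[(offset + r) % n]
            if r > 0 and angle_diff(phi, f_phi) >= cutoff:
                matched = False
                break
            if r < m - 1 and angle_diff(psi, f_psi) >= cutoff:
                matched = False
                break
        if matched:
            return offset
    return None
```

A hot loop would then be flagged as mimicable by size n if any frame of the cyclo-G_n trajectory returns an offset rather than `None`.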

Results and discussion

Changes in configurational entropy upon cyclization

Peptide entropy generally decreases when a peptide binds to a target, constituting a thermodynamic cost that can dramatically reduce binding affinity. It is generally presumed that covalent cyclization of the peptide helps sidestep this cost by rigidifying the peptide's structure even in the unbound state, remedying the loss of peptide entropy upon binding and improving the peptide's binding affinity. To determine how much decrease in configurational entropy can be obtained by peptide cyclization, as well as to understand the length dependence and molecular origins of such a decrease, we performed simulations of both linear and cyclic polyglycines, estimated their configurational entropy, and investigated the sources of the entropy difference between linear and cyclic peptides.

As it is difficult to obtain probability density functions (PDFs) in ≥3 dimensions with sufficient statistics to calculate the configurational entropy (eqn (1) and (2)), we used S^MIE₂ (eqn (6); all N × (N − 1)/2 pairwise mutual information terms included) to provide a lower bound of the configurational entropy and S^MIST₂ (eqn (7); only N − 1 pairwise mutual information terms included) to provide an upper bound of the configurational entropy for each peptide. Consequently, we obtain a range of TΔS^config between cyclo-G_n and linear-G_n: the lower bound of TΔS^config was [TS^MIE₂ of cyclo-G_n − TS^MIST₂ of linear-G_n], while the upper bound of TΔS^config was [TS^MIST₂ of cyclo-G_n − TS^MIE₂ of linear-G_n] (Fig. 2A, the range between the bounds of TΔS^config was shaded in gray).
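The bounding argument is simple interval arithmetic: the most negative difference pairs the lower bound of the cyclic peptide with the upper bound of the linear one, and vice versa. A minimal sketch follows; the function name and the numbers in the usage note are hypothetical and are not values from this study.

```python
def t_delta_s_bounds(ts_mie2_cyclo, ts_mist2_cyclo,
                     ts_mie2_linear, ts_mist2_linear):
    """Bounds on T*dS(config) = TS(cyclo) - TS(linear).

    MIE2 values are treated as lower bounds and MIST2 values as upper
    bounds of each peptide's configurational entropy.
    """
    lower = ts_mie2_cyclo - ts_mist2_linear
    upper = ts_mist2_cyclo - ts_mie2_linear
    return lower, upper
```

For example, with made-up inputs of 5.0 and 6.0 kcal mol⁻¹ for the cyclic peptide and 8.0 and 8.5 kcal mol⁻¹ for the linear peptide, the function returns (−3.5, −2.0).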


	Fig. 2 (A) Change in backbone dihedral configurational entropy upon cyclization of polyglycines as a function of peptide size at 300 K. (B) Backbone dihedral configurational entropy for linear-G_n (square line symbols) and cyclo-G_n (circle line symbols) calculated using the first-order approximation (S₁, green solid lines), the second-order MIE approximation (S^MIE₂, blue dashed line), and the second-order MIST approximation (S^MIST₂, purple dotted line). S^MIE₂ and S^MIST₂ serve as the lower and upper bounds for S^config, respectively; areas between S^MIE₂ and S^MIST₂ are thus shaded to indicate where the true S^config would lie. Error bars are calculated from two simulations starting from two different initial structures and are plotted here; however, the error bars are smaller than the symbol sizes and thus not visible. Comparison of the results from the MD simulations (shown here) to the results from the BE-META simulations can be found in Fig. S2 (ESI†), showing that the two sets of simulations provided similar results. Note that S^config for linear-G_n scaled linearly with chain length. Fitting the three curves (S₁, S^MIE₂, and S^MIST₂) using a linear regression model yielded an R² coefficient ≥0.99 in all three cases. The slopes for TS₁, TS^MIE₂, and TS^MIST₂ were 1.625, 1.413, and 1.471 kcal mol⁻¹ per residue (at 300 K), respectively (Table S1, ESI†). These values were comparable with the 1.16–1.67 kcal mol⁻¹ per residue previously reported in the work of Drake et al.⁴

Similar to the previous study of linear peptides,⁴ our calculation and analysis of configurational entropy focused on the backbone ϕ and ψ dihedrals. Because linear-G_n and cyclo-G_n have different numbers of atoms, thus different numbers of degrees of freedom, it is not straightforward to compare their configurational entropy with all degrees of freedom included (they do have the same number of ϕ and ψ dihedrals though). To circumvent this issue and test the effects of including other degrees of freedom, we used the PARENT software³⁵ to calculate the configurational entropy using all internal degrees of freedom; however, the atoms in the N- and C-terminal caps of linear-G_n were excluded in the entropy calculation to obtain the same number of atoms as in cyclo-G_n. Fig. S3 (ESI†) shows that the resulting TΔS plot was qualitatively similar to that when only backbone dihedrals were included in the entropy calculation (Fig. 2A). By comparing different types of S₁ and MI terms in the PARENT output, we also found that the dihedral S₁ and the pair MI terms involving dihedrals contribute most significantly to the TΔS between linear- and cyclo-G_n. For simplicity and straightforward comparison, the following analysis and discussion of configurational entropy focused on and referred to configurational entropy of backbone dihedrals.

Fig. 2A shows cyclization indeed led to a loss in configurational entropy, although this effect was far more pronounced with shorter chain lengths. Particularly noteworthy, cyclization of peptides longer than 9 residues had minimal effects on configurational entropy. Below, we analyze the configurational entropy results in more detail to understand the origin of the differences between linear and cyclic peptides.

Cyclization does not lead to a more restricted 1D distribution of backbone dihedrals. To understand the origin of the differences in configurational entropy between linear and cyclic peptides, we first investigated whether these differences simply came from a more restricted dihedral distribution in cyclic peptides, which would manifest as a smaller S₁ (the first-order entropy estimation that only depends on the one-dimensional probability density functions). Fig. 2B shows S₁ for both linear-G_n and cyclo-G_n (green lines with solid square and open circle symbols, respectively). Surprisingly, cyclization actually led to a small increase in S₁ in general, especially noticeable for G₅ and G₆.

To understand this phenomenon, we compared the 1D probability density functions (PDFs) along ϕ and ψ dihedrals for linear and cyclic G₅, G₆, G₇, G₈, and G₁₅ (Fig. 3A). The 1D-PDFs were very similar for all the linear peptides of various sizes, and for the larger cyclic peptides such as cyclo-G₁₅. For example, their ψ PDFs all displayed a major peak at ±180° and a minor peak at 0°. However, the 1D-PDFs for cyclo-G₅, cyclo-G₆, and cyclo-G₇ deviated significantly from those of the linear peptides, in particular their ψ PDFs (Fig. 3A). The ±180° and 0° peaks were more equally populated in cyclo-G₆ and cyclo-G₇, and cyclo-G₅ showed four peaks in its ψ PDFs, resulting in a larger S₁. This observation suggests that, owing to restraints imposed by the formation of a head-to-tail cyclized backbone, not all residues in, for example, a cyclic pentapeptide are able to adopt the dihedral angles preferred under a relaxed environment; rather, their ψ angles need to follow specific patterns to satisfy the ring closure condition.


	Fig. 3 Probability density functions (PDFs) for representative G_n systems. (A) One-dimensional PDFs for linear- and cyclo-G_5–8 and G₁₅. 1D PDFs of the ϕ angles are shown on the left column and those of ψ on the right column. For G_n with an odd n, ϕ and ψ of residue (n + 1)/2 are shown; for G_n with an even n, ϕ and ψ of residue n/2 are shown. Curves for linear peptides are shown in solid lines and those for cyclic peptides in dashed lines. (B) Two-dimensional PDFs for linear-G₅ and cyclo-G_5–7,15. The first column shows 2D PDFs of (ϕ, ψ) for the same residue (Ramachandran plots); the second column shows 2D PDFs of (ϕ_i+1, ψ_i); the third column shows 2D PDFs calculated from taking the product of 1D PDF of ϕ_i and 1D PDF of ψ_i, which gives the theoretically uncorrelated 2D PDFs with mutual information = 0. For G_n with an odd n, (ϕ, ψ) of residue (n + 1)/2 and (ϕ_(n+1)/2, ψ_(n+1)/2–1) are shown; for G_n with an even n, (ϕ, ψ) of residue n/2 and (ϕ_n/2, ψ_n/2–1) are shown.

Reduction in configurational entropy upon cyclization is caused by correlations between dihedrals. In the previous section, we examined S₁ of both linear- and cyclo-G_n, and showed that the decrease in configurational entropy upon cyclization was not due to more restricted 1D dihedral distributions in cyclic peptides. This finding suggests that the decrease must stem from stronger correlations between dihedrals in cyclic peptides. Mutual information, within this context, is an indication of dihedral angle correlations; the larger the mutual information for a pair of dihedrals, the more strongly the two dihedral angles are correlated. Below, we present the results of the S^MIE₂ and S^MIST₂ calculations and show that pairwise mutual information is larger and displays unique patterns in small cyclic peptides, leading to a significant decrease in their configurational entropy.

As S₁ only considers 1D-PDFs, it overestimates the configurational entropy because it does not take into account the correlations between different dihedrals. The inclusion of mutual information terms in the entropy calculations, as in the computation of S^MIE₂ and S^MIST₂, should theoretically result in a reduction in the configurational entropy estimation. We observed that both S^MIE₂ and S^MIST₂ were indeed smaller than S₁ for all the systems (Fig. 2B; blue and purple lines vs. green). However, the decrease from S₁ to S^MIE₂ and S^MIST₂ was much more prominent in cyclic peptides, especially for small cyclic peptides. These results confirm that the reduction in configurational entropy upon cyclization (Fig. 2A) stems from higher mutual information in the cyclized peptides.

To investigate which pairwise mutual information terms contributed most significantly to the reduction in S^MIE₂ and S^MIST₂ when the peptides were cyclized, Fig. 4 reports each pairwise mutual information term for linear- and cyclo-G_5–7 (see Fig. S4, ESI,† for mutual information for linear- and cyclo-G_8–15). In the linear-G_n peptides, the largest pairwise mutual information occurred between ϕ and ψ within the same residues (highlighted in green boxes in Fig. 4A–C). In other words, the correlations between intra-residual ϕ_i and ψ_i were the most significant in linear peptides. The top row of Fig. 3B shows the Ramachandran plot, i.e., 2D-PDF of (ϕ_i, ψ_i), 2D-PDF of (ϕ_i+1, ψ_i), and a simple product of 1D-PDF of ϕ and 1D-PDF of ψ for linear-G₅. We observed that, while the 2D-PDF of (ϕ_i+1, ψ_i) resembled the uncorrelated (ϕ, ψ) plot, the 2D-PDF of (ϕ_i, ψ_i) showed considerable deviation, implying more significant mutual information between ϕ_i and ψ_i in linear-G₅. All these (ϕ_i, ψ_i) mutual information terms were used in the MIST calculations, which was consistent with the observation that S^MIE₂ and S^MIST₂ were very close in value for linear-G_n (Fig. 2B, blue and purple lines with solid square symbols).
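The comparison between a 2D-PDF and the product of its 1D-PDFs is exactly what mutual information quantifies: it is zero when the joint distribution equals the product of its marginals and grows as the two deviate. The sketch below computes it from a 2D histogram of counts and converts the result to cal mol⁻¹ K⁻¹. The function names are hypothetical, and no bias correction is included.

```python
import math

R_CAL = 1.987204  # gas constant in cal mol^-1 K^-1


def mutual_information(joint_counts):
    """Mutual information (nats) from a 2D histogram given as rows of counts."""
    total = sum(sum(row) for row in joint_counts)
    p_row = [sum(row) / total for row in joint_counts]
    p_col = [sum(col) / total for col in zip(*joint_counts)]
    mi = 0.0
    for i, row in enumerate(joint_counts):
        for j, count in enumerate(row):
            if count > 0:
                p = count / total
                # p_row[i] * p_col[j] is the uncorrelated reference PDF.
                mi += p * math.log(p / (p_row[i] * p_col[j]))
    return mi


def mutual_information_cal(joint_counts):
    """Same quantity expressed in cal mol^-1 K^-1."""
    return R_CAL * mutual_information(joint_counts)
```

A uniform 2 × 2 histogram gives zero, whereas a perfectly correlated one such as `[[5, 0], [0, 5]]` gives ln 2 in nats, or about 1.38 cal mol⁻¹ K⁻¹.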


	Fig. 4 Mutual information decomposition matrix for (A) linear-G₅, (B) linear-G₆, (C) linear-G₇, (D) cyclo-G₅, (E) cyclo-G₆, and (F) cyclo-G₇. For each matrix, the lower triangle shows the pairwise mutual information terms between all pairs of degrees of freedom, while the upper triangle contains the terms included in the MIST calculation.

Pairwise mutual information is large in small cyclic peptides and follows specific patterns. Unlike linear-G_n, S^MIE₂ and S^MIST₂ of cyclo-G_n were noticeably different, particularly for cyclo-G₅ and cyclo-G₆ (Fig. 2B). Fig. 4 shows that the pairwise mutual information terms in cyclo-G_5–7 were much larger than those in linear-G_5–7, especially for cyclo-G₅ and cyclo-G₆. Furthermore, while in linear-G_5–7 the (ϕ_i, ψ_i) mutual information terms were the most significant, in cyclo-G_5–7 the largest pairwise mutual information occurred between ψ_i and ϕ_i+1 (highlighted in red boxes in Fig. 4). In agreement with this observation, Fig. 3B shows that the 2D-PDF of (ϕ_i+1, ψ_i) of cyclo-G_5–7 deviated very significantly from the uncorrelated scenarios and from the 2D-PDFs of linear and larger cyclic peptides.

In fact, the pairwise mutual information terms in cyclo-G₅, cyclo-G₆, and cyclo-G₇ each displayed unique patterns. Cyclo-G₅ had large mutual information between (ψ_i, ϕ_i+1), and also between (ϕ_i, ψ_i) and (ψ_i, ψ_i+1) (highlighted in red, green, and orange boxes, respectively, in Fig. 4D). Cyclo-G₆ displayed the most significant mutual information between (ψ_i, ϕ_i+1), (ϕ_i, ψ_i) and (ψ_i, ψ_i+3) (highlighted in red, green, and purple boxes, respectively, in Fig. 4E). The largest mutual information terms in cyclo-G₇ were between (ψ_i, ϕ_i+1) and (ϕ_i, ψ_i) (highlighted in red and green boxes, respectively, in Fig. 4F). As the size of the cyclic peptides increased, the mutual information patterns in cyclo-G_n gradually approached those in their linear counterparts, particularly for n ≥ 10, where the mutual information contributions of (ψ_i, ϕ_i+1) were reduced to being insignificant (<0.2 cal mol⁻¹ K⁻¹, equivalent to 0.06 kcal mol⁻¹ at 300 K) (Fig. S4, ESI†).

We noticed in Fig. 2B that S^MIE₂ was much lower for cyclo-G₆ than for cyclo-G₅. This phenomenon is likely caused by cyclo-G₆ typically adopting rather stable conformations of two β turns opposing each other with two intramolecular hydrogen bonds formed in the middle (Fig. 1A). Although the mutual information between (ψ_i, ϕ_i+1) (highlighted in red boxes) and between (ψ_i, ψ_i+1) (highlighted in yellow boxes) was larger in cyclo-G₅ (2.38 and 0.97 cal mol⁻¹ K⁻¹) than in cyclo-G₆ (1.71 and 0.76 cal mol⁻¹ K⁻¹), all the other mutual information terms were larger in cyclo-G₆ than in cyclo-G₅. In particular, the (ψ_i, ψ_i+3) terms (highlighted in purple boxes) were prominent in cyclo-G₆ (1.05 cal mol⁻¹ K⁻¹), resulting from the β turn + β turn conformations the cyclic peptide preferred.

To test how the results on the effect of cyclization of polyglycine can be extrapolated to other polypeptides, we performed BE-META simulations of linear and cyclic A_n and calculated TΔS^config between linear and cyclic polyalanines. The results (Fig. S5, ESI†) were qualitatively similar to those of linear and cyclic G_n (Fig. 2A), showing a significant decrease in entropy for small peptides, especially sizes 5 and 6. We note that linear polyalanines displayed some helical propensity.²⁷ For mimicking hot loops at protein–protein interfaces, the peptide sequences of interest likely do not form (extended) helices but rather random coils, making (the cyclization of) G_n a better model for cyclization of hot loop sequences.

Overall, our results suggest that the reduction in S^config upon cyclization can be as large as 2–3 kcal mol⁻¹ in small peptides (of lengths 5 and 6 with the most conservative estimation, Fig. 2A and Fig. S5, ESI†), although the actual number will depend on the specific sequence of interest. Such a reduction can, in principle, improve K_D by a factor of 50. However, this observation does not mean that cyclization with the shortest possible linker should always be preferred, nor that cyclization of large peptides would have no impact on binding affinity. For example, rigidifying a short peptide into a conformation that is not compatible with the desired structure would be fruitless, even though the cyclization indeed reduces the peptide entropy (see the next section for results and discussion on the capability of cyclic peptides of various sizes to mimic different hot loops). For peptides of larger sizes, although a linear peptide and a cyclic peptide may have similar peptide configurational entropy, they are still likely to adopt different structures. For binding affinity, attention should also be paid to how much of the whole peptide structural ensemble is capable of mimicking the target conformation of the segment of interest. Furthermore, even if cyclization properly pre-organizes the segment of interest and leads to a favorable change in S^config of the unbound peptide, the impact on binding affinity may be compensated by an unfavorable change in TΔS^solvation, in ΔH, or even in S^config of the complex.
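The link between an entropy change and a change in K_D can be made explicit. Assuming, as an idealization, that the entire reduction in configurational entropy of the unbound peptide carries over to the binding free energy (ΔΔG = −T|ΔS^config|) with all other contributions unchanged, the fold change in K_D is exp(T|ΔS^config|/RT). The helper names below are hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal mol^-1 K^-1


def kd_fold_change(t_delta_s_kcal, temperature=300.0):
    """Fold improvement in K_D for an entropy gain T*|dS| in kcal/mol."""
    return math.exp(t_delta_s_kcal / (R_KCAL * temperature))


def entropy_gain_for_fold(fold, temperature=300.0):
    """T*|dS| in kcal/mol needed for a given fold improvement in K_D."""
    return R_KCAL * temperature * math.log(fold)
```

Under this assumption, gains of 2 and 3 kcal mol⁻¹ at 300 K correspond to roughly 30-fold and 150-fold changes in K_D, and a 50-fold change corresponds to about 2.3 kcal mol⁻¹, which falls within the 2–3 kcal mol⁻¹ range quoted above.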

Ability of cyclic peptides to mimic hot loops

Structures of most cyclic peptides can be found in the cyclo-G_n structural ensembles. In this work, we use cyclic polyglycines as a pre-screening tool to determine whether a given conformation is likely to be mimicked by a cyclic peptide. In order to confirm that cyclo-G_n are representative model systems, we set out to verify that they can access conformations assumed by cyclic peptides of diverse sequences. To that end, we compared the backbone dihedral angles of experimentally characterized cyclic peptides to the conformations sampled in our MD simulations of cyclic polyglycines of equal length. Matching conformations were defined as having a deviation of no more than 25° in any backbone dihedral. We identified matching conformations for all 42 cyclic peptides of lengths 5–8 with structures experimentally determined using X-ray crystallography, obtained from the Cambridge Structural Database (CSD). No matching conformations were found for the two 12-mer cyclic peptides, cyclo-(VPG)₄ and cyclo-(APGVGV)₂. It is important to note that both of these peptides assumed elongated, symmetric conformations with prolines facilitating the formation of β-turns, and the intramolecular hydrogen bond pattern stabilizing the structure. A detailed breakdown of the comparison is shown in Table S2 (ESI†).

Further, we compared the cyclo-G_n conformational ensembles to NMR structures of cyclic peptides reported by Hosseinzadeh et al.⁴⁰ Four out of nine cyclic peptides had matching conformations, and eight out of nine were found to have conformations with dihedral deviations of no more than 40°. The motifs less likely to be sampled in cyclo-G_n simulations again turned out to be structures in longer (n > 8) cyclic peptides and driven by the presence of proline residues. While the Ramachandran region typical of proline is a subset of that of glycine, the resulting conformational constraints are likely to favor otherwise rarely sampled conformations. Detailed results are shown in Table S3 (ESI†).

Based on these results, it is clear that, while there are structures that can be mimicked by (typically proline-containing) cyclic peptides but are not sampled by cyclo-G_n, cyclic polyglycines do indeed present a robust way to check whether a conformation is accessible to a head-to-tail cyclized peptide. In other words, a negative result at this stage does not entirely rule out a cyclic peptide design target, as a structure inaccessible to a polyglycine backbone could still be achieved via additional conformational restraints, e.g. using proline residues or residues capable of forming stabilizing side chain–backbone hydrogen bonds. However, we have shown above that cyclo-G_n are capable of sampling conformations assumed by a highly varied set of sequences; there is therefore considerable utility in identifying the more “obvious” candidates using this approach. A positive result here suggests that optimizing a cyclic peptide sequence of a given length to mimic the desired target is a worthwhile task.

Using the cyclo-G_n structural ensembles to determine whether a hot loop can be mimicked by cyclic peptides of size n. To estimate whether cyclic peptides of different sizes could be used to mimic a given hot loop, we compared the dihedral angles of hot loops to the conformations sampled in the two sets of 1.0 μs MD simulations (two million frames total) of cyclo-G_n, and reported whether each hot loop could be mimicked by any of these cyclo-G_n conformations (Fig. 5). Here, a cyclo-G_n conformation that could mimic a hot loop was defined as having a deviation <25° to every dihedral of the hot loop. Since only two sets of 1.0 μs MD simulations of cyclo-G_n were used for this analysis, the absence of any conformation similar to a hot loop does not necessarily mean that it is impossible to mimic this hot loop using cyclic peptides of size n. In addition, as mentioned above, a structure inaccessible to a polyglycine backbone could still be accomplished via additional conformational restraints imposed by unique sequences. However, this analysis does suggest that if a conformation similar to a hot loop was indeed sampled by cyclo-G_n in these two sets of 1.0 μs MD simulations, then it should be promising to use cyclic peptides of size n to mimic this hot loop. Thus, this approach quickly identifies not just which hot loops are most likely to be successfully mimicked by a cyclic peptide, but also identifies particular cyclic peptide lengths to be tested.
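When the hot loop is shorter than the ring, the comparison has to be made against every contiguous stretch of the cyclic backbone, including stretches that wrap around the ring closure. The sketch below illustrates one way to do this. It assumes (φ, ψ) pairs in degrees for every loop residue, a strict <25° cutoff as stated above, and a hypothetical array layout; how the terminal dihedrals of a hot loop were treated in the actual analysis is not specified here.

```python
import numpy as np


def loop_mimic_mask(ring_ensemble, loop_dihedrals, cutoff=25.0):
    """Flag frames of a cyclo-G_n ensemble that contain the hot-loop conformation.

    ring_ensemble:  (n_frames, n, 2) phi/psi angles of the cyclic peptide
    loop_dihedrals: (m, 2) phi/psi angles of the hot loop, with m <= n
    A frame matches if some window of m consecutive ring residues has every
    dihedral within `cutoff` degrees (strictly) of the loop.
    """
    ring = np.asarray(ring_ensemble, float)
    loop = np.asarray(loop_dihedrals, float)
    n, m = ring.shape[1], loop.shape[0]
    if m > n:
        raise ValueError("hot loop is longer than the cyclic peptide")
    mask = np.zeros(ring.shape[0], dtype=bool)
    for start in range(n):
        window = (start + np.arange(m)) % n  # wraps around the ring closure
        diff = (ring[:, window, :] - loop + 180.0) % 360.0 - 180.0
        mask |= (np.abs(diff) < cutoff).all(axis=(1, 2))
    return mask
```

A hot loop would then be reported as mimicable by size n if `loop_mimic_mask(...).any()` is true for the corresponding ensemble.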


	Fig. 5 MD simulations of cyclo-G_n were used to gauge the conformational space accessible to cyclic peptides of different sizes and evaluate whether the cyclic peptide conformational space overlaps with the structures of hot loops observed at protein–protein interfaces. The three hot loops shown on the left of this figure are ⁵²DLIYY⁵⁶ from chain A in PDB ID = 2JB0 (top), ⁵⁵⁶PSSPYSA⁵⁶² from chain B in PDB ID = 1BXK (middle), and ⁹³YNHNGWEC¹⁰⁰ from chain C in PDB ID = 1JU5 (bottom); the sidechains of the hotspot residues are highlighted.

For 120 out of the 193 all-trans hot loops, we found at least one mimicking conformation in the cyclo-G₅ to cyclo-G₁₅ simulations. These 193 hot loops range from five to eight residues in length: there are 9, 42, 38, and 104 hot loops of lengths five, six, seven, and eight, respectively. We further separated these hot loops based on their lengths, and calculated the percentage of each length category whose conformations were observed in the simulations of cyclo-G_n of equal or greater length (Fig. 6). For instance, for 8 of the 9 five-residue-long hot loops, we found in the simulations of cyclo-G₁₀ at least one conformation with a deviation <25° to every hot loop dihedral, resulting in a percentage of 88.9%, as shown in Fig. 6. These results suggest that even among hot loops of the same length, there are differences in the cyclic peptide size needed to mimic a given loop. Table S4 (ESI†) provides detailed information on the 193 hot loops and the minimal cyclic peptide size where a conformation similar to each hot loop was observed.
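The bookkeeping behind such percentages can be sketched as below. The data structure (a mapping from a loop identifier to its length and to the set of ring sizes in which at least one matching frame was seen) and the toy entries are hypothetical; only the 8-out-of-9 arithmetic is taken from the text.

```python
from collections import defaultdict


def percent_mimicked(hits_by_loop):
    """Percentage of hot loops of each length matched in each cyclo-G_n ensemble.

    hits_by_loop: {loop_id: (loop_length, set_of_ring_sizes_with_a_match)}
    Returns {(loop_length, ring_size): percentage}.
    """
    totals = defaultdict(int)
    counts = defaultdict(int)
    for loop_length, ring_sizes in hits_by_loop.values():
        totals[loop_length] += 1
        for n in ring_sizes:
            counts[(loop_length, n)] += 1
    return {key: 100.0 * c / totals[key[0]] for key, c in counts.items()}


if __name__ == "__main__":
    # Toy input: nine five-residue loops, eight of which are matched in cyclo-G_10.
    toy = {f"loop_{i}": (5, {10}) for i in range(8)}
    toy["loop_8"] = (5, set())
    print(round(percent_mimicked(toy)[(5, 10)], 1))
```

Note that the percentages are tallied per ring size rather than cumulatively, since a loop matched in one ensemble is not necessarily matched in a larger one.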


	Fig. 6 Percentages of the hot loops that can be mimicked by cyclic peptides with various sizes.

To provide more detailed information on each hot loop, we also calculated how many conformations out of the two million frames in each cyclo-G_n structural ensemble were indeed similar to each of the hot loops. The fractions (in ppm) for the 9 five-residue-long hot loops are reported in Fig. 7. Conformations of some hot loops (for example, LDLGV, RDLIG, and DLIYY) were readily sampled by cyclo-G_n, suggesting that they are more compatible with the cyclic peptide structural ensembles; on the other hand, conformations of some hot loops (such as PHRLL and GQWNK) were barely or never observed in our cyclo-G_n data sets. Results similar to Fig. 7, but for all of the 193 hot loops, can be found in Fig. S6 (ESI†), which can be particularly helpful for choosing hot loops that are likely the easiest to mimic using cyclic peptides.
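For reference, the conversion from frame counts to ppm and a simple ranking of ring sizes can be written as follows. The counts in the usage example are invented for illustration and the function names are hypothetical. With two million frames per ensemble, a single matching frame corresponds to 0.5 ppm, which sets the smallest non-zero fraction that can be reported.

```python
def fraction_ppm(n_matching, n_frames):
    """Fraction of frames matching a hot loop, in parts per million."""
    if n_frames <= 0:
        raise ValueError("n_frames must be positive")
    return 1e6 * n_matching / n_frames


def rank_ring_sizes(matches_by_size, n_frames):
    """Sort ring sizes by decreasing ppm fraction; ties keep the smaller ring first.

    matches_by_size: {ring_size: number_of_matching_frames}
    """
    ranked = [(n, fraction_ppm(c, n_frames)) for n, c in matches_by_size.items()]
    return sorted(ranked, key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    toy_counts = {9: 400, 10: 1200, 14: 1500}  # hypothetical frame counts
    for size, ppm in rank_ring_sizes(toy_counts, 2_000_000):
        print(f"cyclo-G_{size}: {ppm:.1f} ppm")
```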


	Fig. 7 Fraction (in ppm) of each cyclo-G_n simulation whose conformations had a deviation <25° to every dihedral of the 9 five-residue-long hot loops.

Overall, our analysis suggests that cyclic peptides could be used to effectively mimic some but not all of the hot loops previously identified at protein–protein interfaces. The conformations of certain hot loops appear to be less compatible with the conformations adopted by cyclic peptides. In those cases, using cyclization to rigidify the hot loops would likely be counterproductive, as it may actually prevent the peptide from adopting the conformation of interest. The phenomenon that the resulting cyclic peptide conformations may be incompatible with the desired structure could explain at least in part why cyclization can have deleterious effects on binding affinity.^10,15–18

For hot loops that are mimicable by cyclic peptides, our results provide helpful information for prioritizing which cyclic peptide sizes to test. For example, using cyclic decapeptides may be promising for mimicking the RDLIG hot loop shown in Fig. 7. However, a balance also needs to be struck with how much entropy reduction one can obtain from cyclization, which favors smaller cyclic peptide sizes, as discussed in the previous section. For instance, although cyclo-G₁₄ has the highest fraction of conformations that are similar to the LDLGV hot loop (Fig. 7), one should still consider even cyclic nonapeptides, as the fraction value for cyclo-G₉ is still notable and the corresponding macrocycle is much smaller in size. Furthermore, because the purpose of the analysis here is to provide a general guideline for cyclic peptide designs by using cyclo-G_n to map out the maximum accessible conformational space, none of these results are tailored or optimized for any specific hot loop sequence. Additional simulations of, for example, cyclo-(LDLGVG₄), cyclo-(LDLGVG₅),…, and cyclo-(LDLGVG₉) will be helpful in further narrowing down the candidate cyclic peptide sizes to use for mimicking the LDLGV hot loop. In this example, because the desired conformation for the LDLGV region is the same, a higher fraction in an all-glycine system could suggest that cyclic peptides of that size are better able to mimic the hot loop conformation and offer a better starting point for sequence optimization. It is plausible that, owing to the different constraints imposed by rings of different sizes, the different potential interactions they introduce, and the varying difficulty of optimizing the linker sequences to stabilize the target cyclic peptide structures, the optimal linker length may ultimately change. In that case, this screening method should be thought of as identifying the hot loop targets most compatible with conformations accessible to head-to-tail cyclic peptides and the most likely optimal linker lengths, all of which can be simulated to obtain a more complete picture.

Additionally, the information obtained by this approach can be used for other structure-based design strategies. For instance, structures sampled by cyclo-G_n can be utilized as a starting point for docking simulations with side chains mutated to the sequence of interest, to determine whether those conformations, along with corresponding side chain orientations, would lead to favorable binding to the target surface.

Lastly, NMR studies show that most cyclic peptides adopt multiple conformations in solution, existing as structural ensembles.^41–51 Currently, robust experimental methods for structurally characterizing each conformation in an ensemble do not exist.⁵² Such information, however, can be obtained from MD simulations, making them indispensable in demystifying the impact of cyclization on peptide structure and binding affinity.

Conclusions

Using MD simulations of model polyglycine peptides, we examined the differences in configurational entropy between linear and cyclic peptides. We found that the decrease in configurational entropy observed in cyclic polyglycines resulted from stronger correlation between dihedral angles. For small cyclic polyglycines, the pairwise dihedral correlation displayed specific patterns. Our results also show that cyclization can be a promising strategy to reduce peptide configurational entropy, especially for peptides shorter than 10 residues. However, attention must be paid to whether the conformations of the resulting cyclic peptides are compatible with the desired structure. Using cyclo-G_n structural ensembles, we identified hot loops whose conformations should be mimicable by cyclic peptides. This analysis provides quick guidelines on potential cyclic peptide sizes for mimicking various hot loops. The approach of using cyclo-G_n as a surrogate for the accessible conformational space can be applied to gauge whether cyclic peptides of any given size n are capable of mimicking target structures besides the hot loops tested here. More focused simulations of cyclic peptides incorporating the specific (hot loop) sequences of interest could then be performed to help determine whether a particular cyclic peptide length or design should be pursued.

Conflicts of interest

There are no conflicts to declare.

Acknowledgements

We acknowledge the support of the Knez Family Faculty Investment Fund and the National Institute of General Medical Sciences of the National Institutes of Health under award number R01GM124160 (PI: Y.-S. L.). We also acknowledge support from Tufts Technology Services and computing resources from the Tufts Research Cluster. Initial structures of the simulations were built using UCSF Chimera, developed by the Resource for Biocomputing, Visualization, and Informatics at the University of California, San Francisco, with support from NIH P41-GM103311. We thank Prof. Joshua Kritzer and Stephanie Kearing for their helpful discussions and comments on the manuscript.

References

1. M. R. Naylor, A. T. Bockus, M. J. Blanco and R. S. Lokey, Curr. Opin. Chem. Biol., 2017, 38, 141.
2. D. S. Nielsen, N. E. Shepherd, W. Xu, A. J. Lucke, M. J. Stoermer and D. P. Fairlie, Chem. Rev., 2017, 117, 8094.
3. X. Jing and K. Jin, Med. Res. Rev., 2020, 40, 753.
4. J. A. Drake and B. M. Pettitt, Biophys. J., 2018, 114, 2799.
5. S. Rubin and N. Qvit, Crit. Rev. Eukaryotic Gene Expression, 2016, 26, 199.
6. A. Zorzi, K. Deyle and C. Heinis, Curr. Opin. Chem. Biol., 2017, 38, 24.
7. Z. Qian, P. G. Dougherty and D. Pei, Curr. Opin. Chem. Biol., 2017, 38, 80.
8. O. Demmer, A. O. Frank and H. Kessler, in Peptide and Protein Design for Biopharmaceutical Applications, ed. K. J. Jensen, John Wiley & Sons, Ltd, Chichester, UK, 2009, pp. 133–176.
9. C. J. White and A. K. Yudin, Nat. Chem., 2011, 3, 509.
10. B. Claro, M. Bastos and R. Garcia-Fandino, Pept. Appl. Biomed., Biotechnol. Bioeng., 2018, p. 87.
11. J. Gavenonis, B. A. Sheneman, T. R. Siegert, M. R. Eshelman and J. A. Kritzer, Nat. Chem. Biol., 2014, 10, 716.
12. T. R. Siegert, M. J. Bird, K. M. Makwana and J. A. Kritzer, J. Am. Chem. Soc., 2016, 138, 12876.
13. C. K. Wang, J. E. Swedberg, S. E. Northfield and D. J. Craik, J. Phys. Chem. B, 2015, 119, 15821.
14. D. G. Udugamasooriya and M. R. Spaller, Biopolymers, 2008, 89, 653.
15. J. E. Delorbe, J. H. Clements, B. B. Whiddon and S. F. Martin, ACS Med. Chem. Lett., 2010, 1, 448.
16. K. Burgess, D. Lim and S. A. Mousa, J. Med. Chem., 1996, 39, 4520.
17. W. R. Baumbach, M. H. P. Tikva, A. Carrick, B. Bingham, D. Carmignac, I. C. A. F. Robinson, R. Houghten, C. M. Eppler, L. A. Price and J. R. Zysk, Mol. Pharmacol., 1998, 54, 864.
18. A. Roxin and G. Zheng, Future Med. Chem., 2012, 4, 1601.
19. A. E. Wakefield, W. M. Wuest and V. A. Voelz, J. Chem. Inf. Model., 2015, 55, 806.
20. D. P. Slough, S. M. McHugh, A. E. Cummings, P. Dai, B. L. Pentelute, J. A. Kritzer and Y. S. Lin, J. Phys. Chem. B, 2018, 122, 3908.
21. C. Paissoni, M. Ghitti, L. Belvisi, A. Spitaleri and G. Musco, Chem. – Eur. J., 2015, 21, 14165.
22. A. S. Kamenik, U. Lessel, J. E. Fuchs, T. Fox and K. R. Liedl, J. Chem. Inf. Model., 2018, 58, 982.
23. A. E. Cummings, J. Miao, D. P. Slough, S. M. McHugh, J. A. Kritzer and Y. S. Lin, Biophys. J., 2019, 116, 433.
24. B. Laudet, C. Barette, V. Dulery, O. Renaudet, P. Dumy, A. Metz, R. Prudent, A. Deshiere, O. Dideberg, O. Filhol and C. Cochet, Biochem. J., 2007, 408, 363.
25. N. London, B. Raveh, D. Movshovitz-Attias and O. Schueler-Furman, Proteins, 2010, 78, 3140.
26. B. Hess, C. Kutzner, D. van der Spoel and E. Lindahl, J. Chem. Theory Comput., 2008, 4, 435.
27. C. Y. Zhou, F. Jiang and Y. D. Wu, J. Phys. Chem. B, 2015, 119, 1035.
28. W. L. Jorgensen, J. Chandrasekhar, J. D. Madura, R. W. Impey and M. L. Klein, J. Chem. Phys., 1983, 79, 926.
29. C.-Y. Zhou, F. Jiang and Y.-D. Wu, J. Chem. Theory Comput., 2015, 11, 5473.
30. H. Geng, F. Jiang and Y. D. Wu, J. Phys. Chem. Lett., 2016, 7, 1805.
31. G. A. Tribello, M. Bonomi, D. Branduardi, C. Camilloni and G. Bussi, Comput. Phys. Commun., 2014, 185, 604.
32. S. M. McHugh, J. R. Rogers, H. Yu and Y.-S. Lin, J. Chem. Theory Comput., 2016, 12, 2480.
33. B. J. Killian, J. Y. Kravitz and M. K. Gilson, J. Chem. Phys., 2007, 127, 024107.
34. B. M. King and B. Tidor, Bioinformatics, 2009, 25, 1165.
35. M. Fleck, A. A. Polyansky and B. Zagrovic, J. Chem. Theory Comput., 2016, 12, 2055.
36. J. Numata and E.-W. Knapp, J. Chem. Theory Comput., 2012, 8, 1235.
37. H. Herzel, O. Schmitt and W. Ebeling, Chaos, Solitons Fractals, 1994, 4, 97.
38. T. Kortemme and D. Baker, Proc. Natl. Acad. Sci. U. S. A., 2002, 99, 14116.
39. T. Kortemme, D. E. Kim and D. Baker, Sci. STKE, 2004, 219, pl2.
40. P. Hosseinzadeh, G. Bhardwaj, V. K. Mulligan, M. D. Shortridge, T. W. Craven, F. Pardo-Avila, S. A. Rettie, D. E. Kim, D.-A. Silva, Y. M. Ibrahim, I. K. Webb, J. R. Cort, J. N. Adkins, G. Varani and D. Baker, Science, 2017, 358, 1461.
41. K. D. Kopple, A. Go, R. J. Logan, Jr. and J. Savrda, J. Am. Chem. Soc., 1972, 94, 973.
42. A. E. Tonelli and A. I. Brewster, J. Am. Chem. Soc., 1972, 94, 2851.
43. K. D. Kopple, A. Go and T. J. Schamper, J. Am. Chem. Soc., 1978, 100, 4289.
44. E. R. Blout, Biopolymers, 1981, 20, 1901.
45. C.-H. Yang, J. N. Brown and K. D. Kopple, J. Am. Chem. Soc., 1981, 103, 1715.
46. K. I. Varughese, G. Kartha and K. D. Kopple, J. Am. Chem. Soc., 1981, 103, 3310.
47. K. D. Kopple, Y.-S. Wang, A. G. Cheng and K. K. Bhandary, J. Am. Chem. Soc., 1988, 110, 4168.
48. S. J. Stradley, J. Rizo, M. D. Bruch, A. N. Stroup and L. M. Gierasch, Biopolymers, 1990, 29, 263.
49. K. D. Kopple, J. W. Bean, K. K. Bhandary, J. Briand, C. A. D'Ambrosio and C. E. Peishoff, Biopolymers, 1993, 33, 1093.
50. D. G. Alberg and S. L. Schreiber, Science, 1993, 262, 248.
51. G. R. Marshall, D. D. Beusen and G. V. Nikiforovich, in Peptides: Synthesis, Structures, and Applications, ed. B. Gutte, 1995, p. 193.
52. D. H. Brookes and T. Head-Gordon, J. Am. Chem. Soc., 2016, 138, 4530.

Footnotes

† Electronic supplementary information (ESI) available: Details on system setup and MD simulations; dependence of the calculated backbone dihedral entropy on the number of bins used (Fig. S1); tabulated results of backbone dihedral entropy calculation for linear- and cyclo-G_5–15 (Table S1); comparison of backbone dihedral entropy estimation from MD and BE-META simulations (Fig. S2); change in configurational entropy upon cyclization of polyglycines calculated by the PARENT software (Fig. S3); results of mutual information calculation for linear- and cyclo-G_8–15 (Fig. S4); change in backbone dihedral entropy upon cyclization of polyalanines (Fig. S5); comparison of the cyclo-G_n structural ensembles and the structures of a number of experimentally characterized cyclic peptides (Tables S2 and S3); hot loop sequences and the minimal size of cyclic peptides needed to mimic each hot loop (Table S4); fraction of the cyclo-G_n structural ensembles mimicking each hot loop (Table S5). See DOI: 10.1039/d0cp04633g

‡ These authors contributed equally to this work.
