Gökçe Geylan *ab, Jon Paul Janet a, Alessandro Tibo a, Jiazhen He a, Atanas Patronov c, Mikhail Kabeshov a, Werngard Czechtizky d, Florian David b, Ola Engkvist ae and Leonardo De Maria d
aMolecular AI, Discovery Sciences, BioPharmaceuticals R&D, AstraZeneca, Gothenburg, Sweden. E-mail: gokcegeylan96@gmail.com
bDivision of Systems and Synthetic Biology, Department of Life Sciences, Chalmers University of Technology, Gothenburg, Sweden
cQuantitative Biology, Discovery Sciences, BioPharmaceuticals R&D, AstraZeneca, Gothenburg, Sweden
dMedicinal Chemistry, Research and Early Development, Respiratory & Immunology, BioPharmaceuticals R&D, AstraZeneca, Gothenburg, Sweden
eDepartment of Computer Science and Engineering, Chalmers University of Technology, University of Gothenburg, Gothenburg, Sweden
First published on 16th April 2025
Peptides play a crucial role in drug design and discovery whether as a therapeutic modality or a delivery agent. Non-natural amino acids (NNAAs) have been used to enhance the peptide properties such as binding affinity, plasma stability and permeability. Incorporating novel NNAAs facilitates the design of more effective peptides with improved properties. The generative models used in the field have focused on navigating the peptide sequence space. The sequence space is formed by combinations of a predefined set of amino acids. However, there is still a need for a tool to explore the peptide landscape beyond this enumerated space to unlock and effectively incorporate the de novo design of new amino acids. To thoroughly explore the theoretical chemical space of peptides, we present PepINVENT, a novel generative AI-based tool as an extension to the small molecule molecular design platform, REINVENT. PepINVENT navigates the vast space of natural and non-natural amino acids to propose valid, novel, and diverse peptide designs. The generative model can serve as a central tool for peptide-related tasks, as it was not trained on peptides with specific properties or topologies. The prior was trained to understand the granularity of peptides and to design amino acids for filling the masked positions within a peptide. PepINVENT coupled with reinforcement learning enables the goal-oriented design of peptides using its chemistry-informed generative capabilities. This study demonstrates PepINVENT's ability to explore the peptide space with unique and novel designs and its capacity for property optimization in the context of therapeutically relevant peptides. Our tool can be employed for multi-parameter learning objectives, peptidomimetics, lead optimization, and a variety of other tasks within the peptide domain.
In many peptide drug projects, a peptide hit is identified through large library screenings navigating the enumerated space.6 The integration of non-natural amino acids (NNAAs), amino acids not encoded by DNA, offers a compelling opportunity to improve the physicochemical and pharmacokinetic profile of peptides in hit-to-lead development. This includes enhancing metabolic stability, binding affinity, or cell permeability.5,6 The incorporation of NNAAs enables researchers to access an even broader and more diverse chemical space. Considering only the α-amino acid space, each side chain is chosen from a space similar to that of small molecules. Exploring this uncharted space has transformed peptide therapeutics, allowing further refinement of drug designs for better target specificity and both established and novel biological activities.5
Conventional methods, such as display technologies, peptidomimetics, and structure-based computational studies, have been instrumental to the progress of peptide therapeutics.1 While these methods have played a crucial role in peptide design, they are often limited to the natural amino acids.1 Even though simple modifications, such as stereochemical modifications, are included in this library, the reach of their design space still falls short of the potential scale offered by NNAAs. Creating such a virtual space presents a significant challenge, and its exploration is always constrained by the capabilities of the design-make-test-analyze cycle. Generative models have been employed to navigate the chemical space efficiently and thereby accelerate the drug discovery and development process. Generative capabilities allow de novo design or molecule optimization with desired properties.7 In recent years, there have been many generative modelling studies to design peptides with various optimization tasks such as antimicrobial activity, cell penetration, anticancer properties and immunogenicity.4 These studies differ by the characteristics of peptides, the representations and the model architectures they explore; however, they share a common goal of designing a peptide sequence from a set number of amino acids, typically the 20 natural amino acids.8–12 Grisoni et al. used a long short-term memory (LSTM) model, trained on cationic amphipathic peptides, and fine-tuned on known anticancer peptide sequences.11 The model was later used to design membranolytic anticancer peptides composed of natural amino acids. The novel peptide sequences were later validated experimentally for anticancer activity.11 In other applications, NNAAs were introduced into the building block library to expand the generative model's enumerated space. One example of this was Schissel et al. 
introducing a peptide generator to design peptides for antisense delivery within an enumerated library.12 Their approach incorporates three unnatural amino acids into the enumerated space of both the generative and the predictive model. A generator–predictor–optimizer loop operates in this expanded repertoire to design peptides with enhanced antisense delivery and lower arginine content in a diversity-conscious manner. The learning loop, mimicking a directed evolution scenario with the genetic algorithm, was shown to propose peptide designs with the desired properties.12 These generative model applications enable better access to the chemical space and can provide a greater diversity of designs compared to conventional methods. However, the research within the enumerated set of building blocks restricts the peptides to a sequence-level design. Despite the demonstrated uptake of generative models in navigating the peptide sequence space, there remains a need for a design tool that efficiently optimizes peptides within the fully enabled chemical space.
To address the need for flexible generation of natural amino acids and NNAAs, we introduce PepINVENT tailored for de novo peptide design. PepINVENT stems from the REINVENT framework.13,14 In the small molecule realm, the state-of-the-art REINVENT framework utilizes reinforcement learning (RL) with a generative model trained on the chemical language, Simplified Molecular Input Line Entry System (SMILES),15 to design de novo molecules through a multi-parameter optimization (MPO) scenario.13 Analogous to REINVENT, PepINVENT is an open-source framework consisting of a chemistry-aware pretrained generative model coupled with RL. The framework facilitates the generation of novel NNAAs and diverse peptide topologies to design novel peptides. Inspired by the translation process of proteins and peptides in ribosomes, PepINVENT learns the peptide space on a per amino acid basis and preserves the intricate granularity of the peptide structure. As the generative model proposes amino acids, reinforcement learning guides the overall peptide design using a goal-oriented approach. We demonstrate the potential of PepINVENT to accelerate the peptide drug discovery and development pipeline by extending the design capabilities to novel NNAAs. The tool is suitable for de novo design, peptidomimetics, lead optimization and/or peptide property optimization tasks. In this work, we illustrate the utility and effectiveness of PepINVENT through a series of experiments, showcasing: (i) its navigation within the peptidic chemical space, (ii) its capability for the flexible generation of diverse peptide topologies, and (iii) how it can be used to perform MPO for peptide property optimization, with the example of enhancing the permeability and solubility for cyclo REV binding protein.
CHUCKLES19,20 is a representation method that encodes amino acids at the atomic level with SMILES.15 This representation follows its own standardized SMILES pattern at the monomer level. The pattern starts with the amino group in the backbone followed by the α-carbon, the sidechain, and the remaining backbone. This standardized format of N-to-C denotes the carboxyl group as carbonyl when used in a peptide sequence. Therefore, a sequential concatenation of the CHUCKLES strings of amino acids yields a valid SMILES pattern for the peptide, enabling syntactically correct peptide representation. Our building block library was translated to the CHUCKLES pattern after removing the charges from the amino acids.
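As a minimal sketch of this assembly rule (the monomer strings below are illustrative glycine and L-alanine CHUCKLES, not entries from the paper's building-block library), N-to-C concatenation plus completion of the terminal carboxylic acid can be expressed as:

```python
# Illustrative CHUCKLES-style monomers: N-to-C order, carboxyl written as carbonyl.
GLY = "NCC(=O)"            # glycine
L_ALA = "N[C@@H](C)C(=O)"  # L-alanine

def chuckles_to_peptide(monomers):
    # Sequential concatenation yields the peptide SMILES; appending "O"
    # completes the C-terminal carboxylic acid.
    return "".join(monomers) + "O"

print(chuckles_to_peptide([GLY, L_ALA]))  # NCC(=O)N[C@@H](C)C(=O)O
```

The concatenated string is itself a syntactically valid SMILES, which is what makes the representation suitable for a chemical language model.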
The generation of semi-synthetic peptide data encapsulated a decision scheme for peptide length, topology, NNAA content, and common mutations, i.e. stereoisomerism information and backbone N-methylations. The data scheme begins with the selection of a peptide topology among the options of linear or variations of cyclic, including head-to-tail, sidechain-to-tail or disulfide bridging.
Downstream decisions are made for a predefined number of samples for the query topology. Initially, the number of amino acids in the peptide, or the peptide length, was determined by selecting a length between 6 and 18 from a normal distribution (Fig. 1). Subsequently, the fraction of NNAAs was determined through random sampling from a left-skewed normal distribution, covering the range of [0, 0.3] (ESI Fig. 1†). Although the range was selected arbitrarily, it reflects the recognition that generating a high fraction of NNAAs would significantly impact the synthetic feasibility of the peptides. The skewed distribution thus constrained peptide sequences to contain at most, and typically close to, 30% NNAAs. Therefore, the semi-synthetic data ensured that the generative model encountered diverse building blocks while continuing to learn within the traditional chemical space with the natural amino acids. The chosen fraction was utilized to define the number of natural amino acids and NNAAs needed for the selected size. The determined numbers of amino acids were sampled without replacement from their respective sets.
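The length and fraction sampling above can be sketched as follows; the mean and spread of the clipped normal and the beta-based left-skewed draw are assumptions standing in for the distributions shown in Fig. 1 and ESI Fig. 1,† which the text does not parametrize:

```python
import random

def sample_peptide_spec(rng=random):
    # Peptide length: normal draw clipped to the reported range [6, 18]
    # (mean and sigma are assumptions; the paper only states the range).
    length = min(18, max(6, round(rng.gauss(12, 3))))
    # NNAA fraction: left-skewed draw on [0, 0.3]; a scaled beta variate is
    # one way to place most of the mass near the upper end of the window.
    frac_nnaa = 0.3 * rng.betavariate(5, 2)
    n_nnaa = round(frac_nnaa * length)
    return length, n_nnaa, length - n_nnaa  # total, NNAA count, natural count
```

The two counts are then used to sample building blocks without replacement from the non-natural and natural sets, respectively.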
Sidechain-to-tail cyclic peptides were constructed through a different amino acid scheme. The amino acids contributing to this cyclization were determined by selecting an amino acid containing a primary amine in its sidechain for the cyclization start and randomly selecting an amino acid for the cyclization termination. The amino acid for starting cyclization was placed at a random position in the given length, fulfilling the condition of forming a cycle with at least 5 amino acids. Similarly, disulfide bridging was achieved by selecting two amino acids containing a sulfhydryl in their sidechains. In both cases, the remaining positions were filled by sampling the natural and non-natural sets according to the selected fractions. The chosen amino acids, except for those involved in cyclization, were randomly shuffled to mix the natural and non-natural building blocks.
Amino acids selected for a peptide sequence were preprocessed by a series of modifications, starting with stereochemical mutations. The scheme follows a similar trend to the amino acid selection. Initially, a fraction of amino acids containing stereochemical mutations was determined through sampling from a left-skewed distribution, in the range of [0, 0.25] (ESI Fig. 1†). The chosen fraction determined the number of amino acids in the peptide to be modified. A random sampling of the amino acids according to this fraction determined the specific amino acids to be modified. To achieve the stereoisomeric modification, a string manipulation of the stereochemical information was implemented. The backbone N-methylation was incorporated into a subset of amino acids by replicating the selection process used for stereochemical modification. N-methylation was introduced into the selected amino acids by adding “(C)” after the starting character of the CHUCKLES pattern, representing the nitrogen atom.
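The string-level N-methylation edit described above can be sketched as follows (the example monomer is an illustrative L-alanine CHUCKLES string, not taken from the paper's library):

```python
def n_methylate(chuckles):
    # Backbone N-methylation: insert "(C)" directly after the starting
    # character of the CHUCKLES pattern, which represents the backbone nitrogen.
    if not chuckles.startswith("N"):
        raise ValueError("expected CHUCKLES to start at the backbone nitrogen")
    return "N(C)" + chuckles[1:]

print(n_methylate("N[C@@H](C)C(=O)"))  # N(C)[C@@H](C)C(=O)
```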
Finally, we conducted a preprocessing step to achieve the selected topology. If the topology was linear, the amino acids were concatenated to form the SMILES string with the carboxylic acid of the last amino acid completed (Fig. 2). In the case of cyclization, the atoms of the amino acids involved in the cyclization were modified to denote the beginning and the end of the ring structure. The amino acids selected to carry the topological information of the peptide or to receive N-methylation were checked to ensure that they did not contain a secondary amine group in the backbone (i.e. proline), as required for standard cyclization. The distributions for the modification decisions are shown in ESI Fig. 1.† The generated data contained similar distributions of varying size, sequence, NNAA content and modifications to amino acids for each topology.
Fig. 2 CHUCKLES representation for (A) an individual amino acid, tyrosine (Y), and (B) a tripeptide, CYP (Cys-Tyr-Pro).
A total of 1 million unique peptides were generated in this scheme, comprising 40% linear sequences and an equal distribution, 20% each, of the remaining topology categories. The peptides, with varying topology, size, sequence, NNAA content and modifications to amino acids, were split into 90% training, 5% validation and 5% test sets with stratification to preserve both the peptide length and topology distributions. To evaluate the performance of the generative model, two test subsets were prepared from the test set. The first subset consisted of 100 masked peptides from each topological class, totaling 400 masked peptides. The peptides from each of these classes were selected in a stratified manner based on the peptide length. This subset was used to assess the generative model performance. The second subset was used to assess whether the model understands the topological context of the peptides. It was created from the test data by taking 10 peptides from each topological category, totaling 40. For the selected peptides with cyclized topologies, one of the amino acids carrying the topological information was left unmasked, while the second was included among the masked positions.
The pairs were generated by masking around 30% of the amino acids in the peptide. Although the masking percentage was arbitrary, it was inspired by the BART model and Chemformer, a BART-based model pretrained for cheminformatics tasks such as retrosynthesis prediction and molecular optimization.21,22 The masking applied in these models was shown to be robust for natural language processing and in cheminformatics tasks, respectively. Similarly, PepINVENT was trained on a text infilling task by masking spans of tokens corresponding to masking individual amino acids from the chemical representation. The selection of amino acids to mask involved determining the fractions of natural amino acids and NNAAs. The fraction for natural amino acids was randomly sampled between 0 and 0.5 from a left-skewed distribution, with a mean of around 0.3 (ESI Fig. 1†). Therefore, the amino acid selection was biased towards more NNAAs overall to prevent overfitting on the natural amino acid patterns. Natural amino acids and NNAAs were randomly masked according to their respective assigned fractions. The pretraining objective was defined as proposing a set of amino acids to fill the masked positions of the input peptide. When the number of generated amino acids equaled the number of masked positions, the generated amino acids, the target, were mapped to the source peptide, resulting in the generated peptide. The generated peptides must have syntactically correct chemical representation as the model learns the chemistry of individual amino acids and peptides.
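A minimal sketch of the source–target pair construction, simplified to a single mask fraction instead of the paper's separate natural/NNAA fractions, and using a hypothetical "?" mask token:

```python
import random

MASK = "?"  # hypothetical mask token; the actual vocabulary token may differ

def make_pair(monomers, frac=0.3, rng=random):
    # Mask ~frac of the amino acid positions; the masked monomers (in order)
    # form the target the model must generate to in-fill the source.
    k = max(1, round(frac * len(monomers)))
    masked = sorted(rng.sample(range(len(monomers)), k))
    source = [MASK if i in masked else m for i, m in enumerate(monomers)]
    target = [monomers[i] for i in masked]
    return source, target
```

Mapping the generated target back onto the masked positions of the source reconstructs a complete peptide, which is the pretraining objective described above.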
fθ: χ × χ → [0, 1]^|V|
The source and target strings were tokenized with a SMILES-based tokenizer and were input to the encoder and decoder during training, respectively. We denote the vocabulary with V, i.e. the set of all possible tokens.25 fθ assigns the probability of the tokens by which the elements of chemical space χ are represented.25 From now on we assume the source and target strings x, y ∈ χ. The following negative log-likelihood loss was used for training the model, where yt is the t-th token of the target and y<t the tokens preceding it:

L(x, y) = −∑t log fθ(yt | y<t, x)
The model was trained for 24 epochs on an NVIDIA V100 with 32 GB. During an epoch, all the source–target pairs in the training set are included once with a batch size of 16 and batches are shuffled at each epoch. The model was trained following the same strategy and using the same hyperparameters as the original REINVENT transformer model,23 including the Adam optimizer with a learning rate of 0.0001 with 4000 warm-up steps.
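A sketch of the warm-up schedule, under the assumption of a linear ramp to the base learning rate followed by a constant rate (the exact schedule follows the REINVENT transformer implementation and may differ):

```python
def lr_at(step, base_lr=1e-4, warmup=4000):
    # Linear warm-up: scale the base learning rate by step/warmup until the
    # warm-up window ends, then hold it constant.
    return base_lr * min(1.0, step / warmup)
```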
Once trained, the model can be used to generate peptides conditioned on proposing amino acids to fill the masked positions of a source peptide by predicting one token at a time. Initially, the decoder processes the start token along with the encoder outputs to sample the next token from the probability distribution over all the tokens in the vocabulary. The generation process iteratively continues by producing the next token from the encoder outputs and all the previous generated tokens until the end token is found or a predefined maximum sequence length, 500, is reached. To allow for the sampling of multiple generated peptides, multinomial sampling or beam search is used.
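The token-by-token generation loop can be sketched abstractly; here `next_token_probs` is a hypothetical stand-in for the decoder conditioned on the encoder outputs and the previously generated tokens:

```python
import random

def sample_sequence(next_token_probs, start="^", end="$", max_len=500, rng=random):
    # Autoregressive multinomial sampling: draw one token at a time from the
    # model's next-token distribution until the end token or max_len is reached.
    tokens = [start]
    while len(tokens) < max_len:
        probs = next_token_probs(tokens)  # dict: token -> probability
        nxt = rng.choices(list(probs), weights=list(probs.values()))[0]
        if nxt == end:
            break
        tokens.append(nxt)
    return tokens[1:]  # drop the start token
```

Beam search replaces the multinomial draw with a deterministic search over the k most probable partial sequences, which is why it was used as the second sampling strategy.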
The model was trained with 900 K masked peptides and their filler amino acid pairs. After training, the model can generate the exact number of amino acids required to fill the masked positions in the input peptide. As the peptides were represented with chemical language for strings of amino acids, the model learns the overall peptide language as a composition of individual amino acids. The chemical language also enables generating novel amino acids and simple modifications such as backbone N-methylation and stereochemical mutations.
(1) Validity: a generated peptide with a syntactically accurate SMILES that follows the chemical rules such as valency and chemical bonding was categorized as valid, with validity assessed using RDKit.26
(2) Uniqueness: it is defined at multiple levels:
(a) Peptide-level uniqueness: the number of unique SMILES strings after the separators are removed and the generated peptide is canonicalized with chirality. As the model generates amino acids to complete an input peptide, the generation of two peptides from the same input might contain the same amino acids in different orders. This makes the two peptides unique but results in a duplicated set of amino acids.
(b) Amino acid-level uniqueness: this was evaluated at three levels to capture the non-canonical, stereochemical and canonical variability of the generated building blocks, respectively:
(i) String-level uniqueness refers to the number of amino acid strings generated being unique by comparing them character by character.
(ii) Isomeric SMILES-level uniqueness, similarly to the peptide-level uniqueness, is the number of unique amino acids after the SMILES strings are canonicalized while retaining the chirality.
(iii) Canonical SMILES-level uniqueness counts the unique amino acids after canonicalization with the stereochemical information stripped. This offers a standardized representation where uniqueness is ensured by a distinct molecular structure.
(3) Novelty: the novelty was calculated by profiling the unique generated amino acids as natural, non-natural and novel. In this case, non-natural refers to the NNAAs utilized to create the semi-synthetic peptide data for model training, whereas novel refers to the NNAAs that are generated by the model that do not exist in the training set.
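The three amino acid-level uniqueness counts (2b) can be sketched abstractly; the two canonicalisation callables are placeholders for e.g. RDKit's isomeric and non-isomeric canonical SMILES writers:

```python
def uniqueness_levels(amino_acids, to_isomeric, to_canonical):
    # Three nested levels: raw string identity, canonical identity with
    # stereochemistry retained, and canonical identity with it stripped.
    return {
        "string": len(set(amino_acids)),
        "isomeric": len({to_isomeric(a) for a in amino_acids}),
        "canonical": len({to_canonical(a) for a in amino_acids}),
    }
```

For example, D- and L-alanine are two distinct amino acids at the string and isomeric levels but collapse to one at the canonical level, which is how stereochemical diversity is separated from structural diversity.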
We also visualized the chemical space of the amino acids to analyze the diversity of the novel amino acids relative to the natural amino acids and NNAAs. The diversity was illustrated by a t-distributed stochastic neighbor embedding (tSNE). The 1024 bit Morgan fingerprints27 computed with RDKit26 were projected to 2-dimensional space with tSNE.28 All the amino acids profiled during the novelty analysis were colored according to their labels.
In the next experiment, we tested whether the model learned the topological information of the peptides. For example, a macrocyclic peptide includes two amino acids defining the start and end points of the cyclization, thereby establishing the topological arrangement. This experiment aimed to assess whether the model generates amino acids considering the context of the entire peptide. The second test set, comprising 40 peptides, was used, with 1000 filler amino acid sets sampled for each peptide. If one of the generated amino acids did not complete the topological arrangement, the resulting peptide was considered an invalid molecule. Therefore, we evaluated the validity per topology for the test peptides.
The experiments in this section were designed to demonstrate the capabilities of the generative model for peptide property optimization by guiding the generation process through RL. PepINVENT offers peptide-based scoring components and scores the generated peptides after the filler amino acids are mapped to the masked positions of the input peptide. When multiple scoring components are selected, scores from the components are aggregated by either a weighted average or a geometric mean to compute the final score for each peptide in the learning step. As multinomial sampling was employed, the RL experiments were conducted in triplicate to avoid any potential bias. In the RL experiments on topology, a diversity filter with a penalty was used to prevent the repetitive generation of the same molecule.29 In the experiment on generating soluble and permeable macrocyclic peptides with RL, the diversity filter was selected to penalize the peptides with the same Murcko type scaffold.24 In each step of the RL loop, 32 peptides were generated.
The first experiment was to optimize the peptide to a specific topology by constraining the size of the maximum ring. The topological constraint experiments were assessed over 100 steps of the RL loop and the average score over the batch was reported across the learning steps. The second experiment showcased a practical example where a peptide is designed to be soluble, permeable, and cyclic. In this experiment, a custom alerts component was used to penalize the generation of undesirable patterns. The configuration files used to run the RL experiments can be found in the PepINVENT repository.
To maximize the ring size, a sigmoid score transformation was applied within the window between the macrocycle condition, 12, and an arbitrarily high number, 60 (Fig. 4A). For sampling head-to-tail or sidechain-to-tail peptides, the upper bound of the score window was reduced to match the typical number of ring atoms in head-to-tail peptides (Fig. 4B). The double sigmoid ensures equal scoring for macrocyclic peptides, while heavily penalizing those outside the window. Lastly, linear peptides were generated by transforming the scores with a reverse sigmoid within a window of 0 to 60, minimizing the ring size (Fig. 4C).
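The three score transformations can be sketched as follows; the steepness `k` and the exact parametrisation are assumptions, as REINVENT-style transforms differ in detail:

```python
import math

def rising_sigmoid(x, centre, k=0.5):
    # Score approaches 1 above `centre` (used to maximize the ring size).
    return 1.0 / (1.0 + math.exp(-k * (x - centre)))

def reverse_sigmoid(x, centre, k=0.5):
    # Score approaches 1 below `centre` (used to minimize the ring size).
    return 1.0 - rising_sigmoid(x, centre, k)

def double_sigmoid(x, low, high, k=0.5):
    # Near-1 inside the [low, high] window, heavily penalized outside it.
    return rising_sigmoid(x, low, k) * reverse_sigmoid(x, high, k)
```

With `low=12` and `high=60`, a 36-atom macrocycle scores close to 1, while a 4-atom ring or a 70-atom ring is pushed toward 0.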
To demonstrate the structural flexibility of generation, a 9-mer peptide was generated with positions 1, 2, 4, and 9 masked and the remaining amino acids set to alanine to facilitate the visual distinction of the generated amino acids. Moreover, the input peptide had no prior topological information, enabling the generation of any topology. The input was constructed accordingly.
Later, peptide optimization through reinforcement learning was explored. RL-based experiments were aimed at demonstrating the flexibility in steering the generation to a specific peptide topology compared to sampling diverse topologies. Lastly, we showcase a practical application in MPO settings to optimize for permeable and soluble macrocyclic peptides.
| Metric | Sampling method | Total | Linear | Head-to-tail | Disulfide bridge | Sidechain-to-tail |
|---|---|---|---|---|---|---|
| Peptide validity (%) | Beam search | 99 (±7) | 100 (±0) | 100 (±0) | 100 (±2) | 98 (±14) |
| | Multinomial | 98 (±2) | 98 (±1) | 98 (±1) | 99 (±1) | 97 (±4) |
| Peptide uniqueness (%) | Beam search | 100 (±0) | 100 (±0) | 100 (±0) | 100 (±0) | 100 (±0) |
| | Multinomial | 98 (±8) | 100 (±0) | 99 (±7) | 94 (±14) | 99 (±2) |
Next, we explored the uniqueness of the generated peptides for both sampling methods. The beam search is deterministic; therefore, it generates unique strings. However, this does not guarantee that the SMILES representation translates to a unique molecule. Our generative model almost always generates chemically unique peptides, >99% (Table 1). A similar profile was observed with multinomial sampling with higher fluctuations in non-linear peptide topologies as these were reported with higher standard deviations (Table 1). The peptides with disulfide bridges were harder to diversify with multinomial sampling. This stemmed from the training set containing a limited number of amino acids with sulfur in their sidechain. Therefore, the specific topological constraint of having a disulfide substructure in the peptide molecule was harder to learn compared to other topologies. After establishing that our generative model was producing valid and unique molecules across various peptide topologies, the next step was to characterize the building blocks proposed by the model.
We defined the last step of the uniqueness analysis as the categorization of the type of amino acid generated as detailed in the Methods. In this step, we investigated how the unique amino acids were distributed to the groups of natural, non-natural from the training set and novel (Fig. 6). When a masked peptide was queried, the entire set of 20 natural amino acids was generally proposed during multinomial sampling. This demonstrates that the generative model considers proposing natural amino acids and does not only explore the non-natural space. The NNAAs from the training set were also proposed as the learned building blocks. These NNAAs were more frequently proposed compared to the novel ones. Moreover, in the canonical SMILES-level uniqueness, there was an increase in the average of the non-natural amino acids and a decrease in the novel amino acids. This highlights once again the contribution of stereochemical modifications to the diversity. The novel amino acids, at the canonical SMILES level, were generated with a significant number of options, averaging around 200 and offering as many as 1200 amino acids for a single peptide query.
The drastic difference in the number of amino acids between the two sampling methods arises from how the model learned the amino acid patterns. When a set of amino acids is generated in different orders, it can result in distinct peptides, even though the constituent amino acids are shuffled. In addition, the training set contains more natural amino acids than its non-natural counterpart and some sidechain fragments are frequent among non-natural amino acids. The CHUCKLES pattern for these amino acids and substructures is learned by the model. Considering these two points, the beam search sampling may result in oversampling of natural amino acids and frequent sidechain patterns, the most probable patterns in the training set, compared to multinomial (ESI Fig. 3†). However, beam search maintains the peptide's uniqueness through positional rearrangements of the amino acids within the peptide. This shows that the model tends to prioritize suggesting natural amino acids initially before venturing into the space of NNAAs. Moreover, shuffling the order of the amino acids illustrates the model's approach in addressing the assigned task by generating a variety of amino acid combinations in a combinatorial fashion. While this may be the case for beam search, the probabilistic nature of multinomial explores the chemical space more freely while preserving the understanding of peptides as a combination of amino acids. Hence, the peptide-level diversity expands into a high-dimensional space that is incomparably broader than the conventional sequence space, requiring strategic navigation.
Lastly, the diversity of the amino acids proposed by the model was analyzed by visualizing the chemical space. As the NNAAs in our training set were already shown to cover a large chemical space,16 the diversity analysis also described the chemical space that was presented to the model with the training set. The extracted amino acids were canonicalized with isomeric information and covered all the natural amino acids and 10,000 NNAAs in the training set. Moreover, 91,826 novel amino acids were generated to propose amino acids for peptides in the test set. The dimensionality reduction plot was plotted with the canonicalized isomeric SMILES since one of the features that the generative model offers is introducing stereochemical modifications. The visualization showed similar coverage for the novel amino acids and the NNAAs, indicating that the novel amino acids were indeed proposed from the learned space (Fig. 7). This was also observed when comparing these novel amino acids and the amino acids from the training set based on the distributions of features related to molecular complexity such as the synthetic accessibility score,34 natural product-likeness score,35 number of heavy atoms, and maximal graph length (ESI Fig. 4†).
| Topology | Validity (%) |
|---|---|
| All topologies | 98.3 (±6.9) |
| Linear | 99.9 (±0.5) |
| Head-to-tail | 96.1 (±12.2) |
| Disulfide bridge | 97.9 (±6.5) |
| Sidechain-to-tail | 99.4 (±1.8) |
In the next RL case, we biased the scoring so that the best scores span the window from the macrocycle condition to the head-to-tail cyclization condition. The learning was limited to these topologies as the average ring size increased until it reached the pre-set threshold of the score transformation (Fig. 8B). Moreover, the broader score range of the steps compared to the disulfide case demonstrates the fluctuations of generating both topologies (ESI Fig. 6†). Lastly, we flipped our initial objective to favor smaller ring sizes, in other words, linear peptide generation in the RL run. Once again, RL was able to steer the generation to linear peptides as the macrocycles were penalized with lower scores (Fig. 8C). The generation of linear peptides also did not affect sampling heterocycles in the sidechains (ESI Fig. 6†). The preservation of such substructures ensures the diversity of sidechains while conforming to the desired topology. In all the RL cases, the validities of the batch of peptides in the exploration and exploitation stages were generally above 90% and 95%, respectively (Fig. 8D–F). Typically, at approximately 40 learning steps, the objective is reached, marking the transition from exploration of the peptidic chemical space to exploitation of the targeted space with the desired characteristics. By achieving the objective in under 50 steps, PepINVENT demonstrated high flexibility in transitioning between diverse topologies.
Inspired by this study, we demonstrate how RBP can be modified to improve permeability and solubility. The amino acids shown not to have a major impact on the bioactivity in the study of Wu et al.38 were masked in RBP. These included the alanines that were previously shown to tolerate modification without a significant change in the bioactivity and the glycines that were incorporated solely for the cyclization. Over the learning steps, the RBP was modified with PepINVENT in a multi-parameter optimization (MPO) scenario to propose new RBP designs by generating sets of amino acids. In the MPO, the RBP was optimized for cell permeability by scoring the designs with a peptide permeability model while maintaining cyclic structures through the topological constraint scoring component. Undesirable substructures were penalized by the custom alert scoring component, and we tracked the learning progress of the individual components as well as the final score. The learning process over a 1000-step RL run was tracked mainly through the four scoring components (Fig. 9). Maximizing the ring size constrained the topology to macrocyclic peptides during the generation (Fig. 9). The custom alerts kept the generation consistent with the chemical relevance of the sidechains (Fig. 9). While these components defined the targeted peptidic space, the components scoring the permeability and the solubility specified the property optimization objectives.
Fig. 9 Progress of the scoring components of the MPO task during RL runs to design soluble and permeable cyclic Rev-binding peptides. The MPO entailed (A) the topological constraint aiming to generate macrocyclic peptides, (B) solubility measured with the CAMSOL-PTM intrinsic solubility predictor,31 (C) permeability assessed through a classifier for passive permeability of peptides and (D) custom alerts penalizing undesirable substructures generally associated with toxicity. During the runs, (E) the lipophilicity, (F) the validity, and (G) the average of the aggregated scores for the batches at each step were tracked for the generated peptides. The plots illustrate the average of triplicate runs and the individual runs are reported in ESI Fig. 7.†
The intricate interplay between solubility and permeability has been explored in many studies.39 While there are no explicit design guidelines for peptides, it is understood that cell-permeable peptides must demonstrate a certain level of solubility in order to unlock the degree of conformational flexibility necessary for passive permeability.40 These dynamic conformational shifts enable peptides to adapt to both aqueous media and the lipophilic environment of the cell membrane.41 Designing permeable and soluble peptides therefore requires complex design strategies. In this application example, we demonstrate PepINVENT's ability to navigate the chemical space and propose design ideas balancing solubility and permeability. According to the learning progress of the RL runs with RBP, the solubility component was learned first, within the first 100 steps, followed by improvement of permeability within the soluble peptide space (Fig. 9B and C). This was also observed as a steep increase followed by a steady profile in the aggregated score, computed as the geometric mean of the four scoring components (Fig. 9G). Over the learning steps, the proposed peptides showed high solubility and permeability, although permeability was harder to optimize. The aggregated MPO score highlights the highest-scoring peptides through the learning process (Fig. 10A). A peak in permeability scores around step 200 indicates that the exploration of the chemical space reached a design space of permeable peptides, which was then exploited. The drop after this peak demonstrates the penalty from the diversity filter steering the model to explore more diverse peptide designs (Fig. 9C). Lipophilicity was also reported as the Wildman–Crippen logP value during the runs to visualize the shifts between solubility and this fragment-based lipophilicity estimate (Fig. 9E).42 Heterocycle incorporation into the backbone, a common peptide modification for permeability, was one of the preferred design ideas in the MPO (Fig. 10B).39 PepINVENT effectively sampled new macrocyclic peptides that are soluble and permeable while preserving the previously demonstrated high validity of the generative model.
The scarcity of publicly available peptide data, together with known NNAAs covering only a small fraction of the theoretical amino acid space, poses a challenge for obtaining large datasets. Because of these data limitations, generative models in the field are typically customized for topology- or property-specific tasks.8–12 Our generative model was trained on semi-synthetic peptide data composed of natural amino acids and readily synthesizable NNAAs from a large virtual library.16 The semi-synthetic data were not generated based on any property-related patterns or distinct topologies. The conditional aspect of the generator is defined solely by its ability to generate a number of amino acids equal to the number of masked positions, yielding valid peptides. Therefore, PepINVENT could serve as a central tool for optimizing various peptide-related objectives and topologies.
Peptides were represented with the CHUCKLES pattern in our model. CHUCKLES enabled the peptides to be encoded, preserving their chemical context while maintaining the sequence order of amino acids. This approach facilitated the translation from the sequence to the molecule, allowing for both assembly of peptides and extraction of amino acids from them in a standardized format, capturing the N-to-C directionality. The generative model trained with this representation was shown to produce valid and unique peptides with diverse amino acids while preserving the granular structure of peptides. The model comprehends the complex nature of a peptide as a sequence of building blocks that make up a beyond-the-rule-of-5 modality. The generative model goes beyond the building block library in the training set to propose novel NNAAs featuring unique sidechains and various stereochemical and backbone modifications.
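The sequence-to-molecule translation described above can be illustrated with a toy sketch: each residue is stored as a SMILES fragment written in N-to-C order, so concatenating fragments in sequence order and capping the C-terminus yields the full peptide SMILES. The fragment dictionary below is illustrative only and is not the exact encoding used in PepINVENT:

```python
# Residue SMILES fragments written N-to-C: the backbone nitrogen comes
# first and the acyl carbon last, so plain concatenation forms the
# peptide (amide) bonds between consecutive residues.
RESIDUES = {
    "G": "NCC(=O)",                  # glycine
    "A": "N[C@@H](C)C(=O)",          # L-alanine
    "F": "N[C@@H](Cc1ccccc1)C(=O)",  # L-phenylalanine
}

def assemble_peptide(sequence):
    """Concatenate residue fragments in N-to-C order and cap with -OH."""
    return "".join(RESIDUES[aa] for aa in sequence) + "O"

smiles = assemble_peptide("GAF")
# "NCC(=O)N[C@@H](C)C(=O)N[C@@H](Cc1ccccc1)C(=O)O"
```

The same fragment-wise view also works in reverse: splitting the peptide SMILES at the backbone amide pattern recovers the individual amino acids, which is what preserves the granular, residue-level structure during generation.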
We demonstrated the capabilities of the PepINVENT framework, encompassing the generative model and RL, through two experiments. The first experiment utilized a straightforward physicochemical descriptor: the size of the largest ring in the peptide. Various score transformation functions were employed to define different objectives for optimizing the largest ring size with RL. The experiments showed that a peptide without specific structural information can be constrained to a desired topology, with specific cyclizations or as a linear peptide, in fewer than 50 learning steps. The ease of guiding the generation towards distinct topologies highlights PepINVENT's flexibility in proposing diverse peptide designs. Since this experiment uses a single scoring component, the ring size, to generate a particular topology, it is worth highlighting that the provided solution is not the only possible one. Other scoring components could achieve a similar outcome. For example, a substructure match for the disulfide bond could generate disulfide-bridged peptides. Another example would be setting a substructure-matching objective on a scaffold that defines part of or the entire peptide backbone. Additionally, more complex scoring components, such as secondary structure predictors, could be integrated to steer the generation process toward a desired topology.
In a peptide-based drug discovery project focused on enhancing peptide affinity, a motif in the original sequence that confers a certain characteristic, such as permeability, may need to be preserved. Lead optimization in such a case could be carried out on the amino acids that do not participate in this motif. Conversely, the amino acids enabling the peptide's binding to a target protein can be characterized as the pharmacophore.39 The residue-based pharmacophore is preserved while the remaining amino acids are explored for new peptide designs that improve peptidic properties such as solubility or oral bioavailability. PepINVENT can thus selectively optimize the peptide with respect to the project goals. The latter case was employed to showcase a practical application of PepINVENT in redesigning a hit peptide when the pharmacophore is defined. The amino acids that are not part of the pharmacophore, or that had been modified in other studies without compromising activity, were selected for modification. The proposed peptide designs were optimized to exhibit permeability and solubility while being constrained to cyclic designs. The RL-steered generation led to novel peptides with the desired properties. This experiment showed how PepINVENT navigates the peptide landscape, identifying limited regions within the chemical space that balance the trade-off among multiple properties. PepINVENT could be utilized for multi-parameter optimization combining physicochemical properties, property predictors and other scoring methodologies. Throughout the experiments, the agent consistently avoided invalid amino acid patterns while navigating the peptide landscape, resulting in over 90% validity of the proposed peptides. Proposing novel peptide designs with both natural and non-natural amino acids could accelerate peptide-based drug discovery and development projects.
The tool can be used for peptide property enhancement, peptidomimetics, lead optimization and many other peptide-related tasks, whether the peptides serve as drug molecules or as conjugates for delivery. In future studies, we aim to demonstrate the impact of the PepINVENT framework in a multi-parameter optimization setting for a real-life application and to extend the available scoring components, including the synthetic feasibility of the generated peptides.
In this study, we also demonstrated PepINVENT's effectiveness in navigating the peptide landscape by showing its adaptability to specific topological constraints and its capacity to accommodate multiple topologies. Additionally, PepINVENT was employed in a multi-parameter optimization scenario to design soluble and permeable macrocyclic peptides. Reinforcement learning steered the generation to a design space containing peptides that satisfied all the specified components. Thus, the framework demonstrates its capability to propose tailored peptides and could facilitate peptide optimization in real-life applications. PepINVENT is presented as an open-source framework and as an extension of the de novo small molecule design tool REINVENT. It addresses the challenge of proposing peptide designs with novel NNAAs while improving targeted peptidic properties defined through MPO, and could thereby facilitate peptide-based drug discovery and development. Future studies will focus on demonstrating the practical utilization of the framework in peptide-based drug optimization settings.
Footnote
† Electronic supplementary information (ESI) available. See DOI: https://doi.org/10.1039/d4sc07642g
This journal is © The Royal Society of Chemistry 2025