Alireza Ghafarollahi a and Markus J. Buehler *ab
a Laboratory for Atomistic and Molecular Mechanics (LAMM), Massachusetts Institute of Technology, 77 Massachusetts Ave., Cambridge, MA 02139, USA. E-mail: mbuehler@MIT.EDU
b Center for Computational Science and Engineering, Schwarzman College of Computing, Massachusetts Institute of Technology, 77 Massachusetts Ave., Cambridge, MA 02139, USA
First published on 17th May 2024
Designing de novo proteins beyond those found in nature holds significant promise for advancements in both scientific and engineering applications. Current methodologies for protein design often rely on AI-based models, such as surrogate models that address end-to-end problems by linking protein structure to material properties or vice versa. However, these models frequently focus on specific material objectives or structural properties, which limits their flexibility when out-of-domain knowledge must be incorporated into the design process or when comprehensive data analysis is required. In this study, we introduce ProtAgents, a platform for de novo protein design based on Large Language Models (LLMs), where multiple AI agents with distinct capabilities collaboratively address complex tasks within a dynamic environment. The versatility in agent development allows for expertise in diverse domains, including knowledge retrieval, protein structure analysis, physics-based simulations, and results analysis. The dynamic collaboration between agents, empowered by LLMs, provides a versatile approach to tackling protein design and analysis problems, as demonstrated through diverse examples in this study. The problems of interest encompass designing new proteins, analyzing protein structures, and obtaining new first-principles data (natural vibrational frequencies) via physics simulations. The concerted effort of the system allows for powerful automated and synergistic design of de novo proteins with targeted mechanical properties. The flexibility in designing the agents, on the one hand, and their capacity for autonomous collaboration through the dynamic LLM-based multi-agent environment, on the other hand, unlock the great potential of LLMs in addressing multi-objective materials problems and open up new avenues for autonomous materials discovery and design.
Over the past years, data-driven and machine learning methods have emerged as powerful tools in the field of de novo protein design, offering valuable insights and accelerating the discovery of novel proteins with desired properties.2–15 These methods have opened new avenues for predicting the structure, properties, and functions of proteins based solely on their underlying AA sequence. For instance, the development of the deep learning (DL)-based AlphaFold 2 marked a significant breakthrough in 3D protein structure prediction, with a level of accuracy that in some cases rivaled expensive and time-consuming experimental techniques.16 Moreover, deep learning-based models have been developed to explore structure–property relationships in the analysis and design of proteins. These models encompass a broad spectrum of structural and mechanical properties, serving either as constraints or target values. For example, various DL models have been developed to predict the secondary structure of proteins from their primary sequences. Prediction of the mechanical properties of spider silk protein sequences has also been enabled by DL models.17–22 Moreover, DL-based models such as graph neural networks23 and transformer-based language models24 show enhanced accuracy in predicting the protein natural frequencies compared to physics-based all-atom molecular simulations. The development of such DL models significantly reduces the cost of screening the vast sequence space to target proteins with improved or optimized mechanical performance.
A frontier that still exists, however, is how we can create intelligent tools that can solve complex tasks and draw upon a diverse set of knowledge, tools and abilities. Another critical issue is that the combination of purely data-driven tools with physics-based modeling is important for accurate predictions. Moreover, such tools should ideally also be able to retrieve knowledge from, for instance, the literature or the internet. All these aspects must be combined in a nonlinear manner, where multiple dependent steps in the iteration towards an answer are necessary to ultimately provide the solution to a task. As we will discuss in this study, such an integration of tools, methods, logic, reasoning and iterative solution can be implemented through the deployment of a multi-agent system driven by sophisticated Large Language Models (LLMs).
LLMs25,26 have represented a paradigm shift in modeling problems across a spectrum of scientific and engineering domains.8,27–41 Such models, built upon the attention mechanism and transformer architectures,42 have recently emerged as powerful tools in the field of materials science and related areas, contributing to various aspects ranging from knowledge retrieval to modeling, design, and analysis. For example, models such as ChatGPT and the underlying GPT-4 architecture,43 part of the Generative Pretrained Transformer (GPT) class, demonstrate exceptional proficiency in mastering human language, coding,44 logic and reasoning.45 Recent studies highlight their ability to proficiently program numerical algorithms and troubleshoot code errors across several programming languages such as Python, MATLAB, Julia, C, and C++.46 The GPT class of LLMs has also represented a new paradigm in simulating and predicting materials behavior under different conditions,28 a field of materials science often reserved for conventional deep learning frameworks47 such as Convolutional Neural Networks,48,49 Generative Adversarial Networks,50–52 Recurrent Neural Networks22,54,55 (ref. 20, 53 and 54), and Graph Neural Networks.23,55–58 Moreover, owing to their proficiency in processing and comprehending vast amounts of different types of multimodal data, LLMs show promising capabilities in materials analysis and prediction applications, including key knowledge retrieval,35 general language tasks, hypothesis generation,29 and structure-to-property mapping.28,59
At the same time, LLMs are typically not best equipped to solve specific physics-based forward and inverse design tasks, and are often focused on leveraging their conversational capabilities. Here, LLMs have been instrumental in powering conversable AI agents, facilitating the transition from AI–human conversations to AI–AI or AI–tools interactions for increased autonomy.31,35,60–62 This capability represents a significant advancement, enabling intelligent mediation, fostering interdisciplinary collaboration, and driving innovation across disparate domains, including materials analysis, design, and manufacturing. The overall process can be regarded as adopting a problem-solving strategy dictated and directed by an AI system composed of different agents. The entire process can thereby be automated by AI with little or no human intervention. Depending on the complexity of the problem, and following the idea of division of labor, the agents can break the overall task into subtasks, for which different agents or tools are used consecutively to iteratively solve the problem until all subtasks have been accomplished and the solution has been achieved. There is no intrinsic limitation on the type of tools that can be defined, making the multi-agent model a versatile approach to addressing problems across scales and disciplines. The tools could range from a simple linear mathematical function to sophisticated deep neural network architectures. The use of LLM-based multi-agent strategies for discovering new materials and automating scientific research has been examined in previous studies, with applications spanning mechanics,31 chemistry,63,64 and materials science.35 The comprehensive analysis in ref. 65 specifically addresses the implementation of multi-agent strategies in enhancing biomedical research and scientific discovery, underscoring their potential to transform the field.
In this paper, we propose a multi-agent strategy for protein design problems by introducing ProtAgents, a multi-agent modeling framework that solves protein-related analysis and design problems by leveraging customized functions across domains and disciplines. The core concept underpinning the multi-agent system is a state-of-the-art LLM combined with a series of other tools. The LLM backbone demonstrates exceptional abilities in analysis, rational thinking, and strategic planning, which are essential for complex problem-solving. Building on these capabilities, the proposed model aims to reduce the need for human intervention and intelligence at different stages of protein design. The agent model consists of a suite of AI and physics-based components such as:
• Physics simulators: obtain new physical data from simulations, specifically normal modes and vibrational properties by solving partial differential equations (PDEs).
• Generative AI model: conditional/unconditional de novo protein design, based on a denoising diffusion model.
• Fine-tuned transformer model: predict mechanical properties of proteins from their sequence.
• Retrieval agent: retrieve new data from a knowledge database of scientific literature.
The main contributions of our work are summarized as follows.
• We propose ProtAgents, a pioneering multi-agent modeling framework that combines state-of-the-art LLMs with diverse tools to tackle protein design and analysis problems.
• Our model harnesses the collective capabilities of agents with specialized expertise that interact autonomously and nonlinearly to solve protein-related tasks.
• Equipped with various tools and functions, the model demonstrates an advanced ability to integrate new physical data from different disciplines, surpassing conventional deep learning models in versatility and problem-solving capacity in protein science.
• Our model significantly reduces the need for human intervention throughout the different stages of the problem-solving process.
• ProtAgents operates on textual input, thereby enabling non-expert researchers to effectively address and analyze challenges within the realm of protein design.
The versatility of the approach in solving complex tasks is demonstrated through a series of experiments in the context of protein design, modeling, and data analysis.
The plan of this paper is as follows. In Section 2, we present an overview of the multi-agent framework developed to tackle multi-objective complex tasks. Subsequently, we delve into a series of experiments where each task is initially introduced, followed by a detailed examination of various aspects throughout the problem-solving process by the multi-agent teamwork. A comprehensive discussion regarding the multi-agent framework and future prospects is provided in Section 3.
• “User”: human that poses the question.
• “Planner”: develops a plan to solve the task. Also suggests the functions to be executed.
• “Assistant”: has access to all the customized functions, methods, and APIs and executes them to find or compute the relevant data necessary to solve the task.
• “Critic”: provides feedback on the plan developed by the “planner”, analyzes the results, handles possible mistakes, and provides the output to the user.
The agents are organized into a team structure, overseen by a manager who coordinates overall communication among the agents. Table 1 lists the full profile for the agents recruited in our multi-agent framework. Moreover, a generic structure showing the dynamic collaboration between the team of agents proposed in the current study is depicted in Fig. 2. Further details can be found in the Materials and methods section 4.
Agent# | Agent role | Agent profile |
---|---|---|
1 | user_proxy | user_proxy. Plan execution needs to be approved by user_proxy |
2 | Planner | Planner. You develop a plan. Begin by explaining the plan. Revise the plan based on feedback from the critic and user_proxy, until user_proxy approval. The plan may involve calling custom function for retrieving knowledge, designing proteins, and computing and analyzing protein properties. You include the function names in the plan and the necessary parameters. If the plan involves retrieving knowledge, retain all the key points of the query asked by the user for the input message |
3 | Assistant | Assistant. You have access to all the custom functions. You focus on executing the functions suggested by the planner or the critic. You also have the ability to prepare the required input parameters for the functions |
4 | Critic | user_proxy. You double-check the plan, especially the functions and function parameters. Check whether the plan included all the necessary parameters for the suggested function. You provide feedback |
5 | Group chat manager | You repeat the following steps: dynamically selecting a speaker, collecting responses, and broadcasting the message to the group |
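The manager's cycle described in Table 1 (select a speaker, collect the response, broadcast the message) can be illustrated with a minimal sketch. This is not the framework's implementation: the `Agent` class, the stub responses, and the fixed hand-off rule `NEXT_SPEAKER` are all assumptions made for illustration, whereas in the framework the agents are LLM-driven and the speaker is selected dynamically.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Message = Dict[str, str]


@dataclass
class Agent:
    name: str
    respond: Callable[[List[Message]], str]  # stand-in for an LLM call
    inbox: List[Message] = field(default_factory=list)


def run_group_chat(agents, select_speaker, task, max_rounds=3):
    """Repeat: select a speaker, collect its response, broadcast it."""
    history: List[Message] = []

    def broadcast(message):
        # Every agent receives every message, so all share the same context.
        history.append(message)
        for agent in agents.values():
            agent.inbox.append(message)

    broadcast({"sender": "user_proxy", "content": task})
    for _ in range(max_rounds):
        speaker = agents[select_speaker(history)]
        reply = speaker.respond(history)
        broadcast({"sender": speaker.name, "content": reply})
    return history


# Hypothetical fixed hand-off rule; a real manager would choose dynamically.
NEXT_SPEAKER = {"user_proxy": "planner", "planner": "critic",
                "critic": "assistant", "assistant": "critic"}


def rule_based_selector(history):
    return NEXT_SPEAKER[history[-1]["sender"]]


def make_stub(name):
    # Stub agent that only reports how many messages it has seen.
    return Agent(name, lambda history: f"{name} saw {len(history)} message(s)")


agents = {name: make_stub(name) for name in ("planner", "critic", "assistant")}
history = run_group_chat(agents, rule_based_selector, "Analyze protein 1wit")
senders = [message["sender"] for message in history]
```

Because every message is broadcast, each agent's inbox mirrors the full conversation, which is what lets the critic comment on the planner's plan and on the assistant's results.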
It is noteworthy that critical issues in the realm of protein design surpass the capabilities of mere Python code writing and execution. Instead, addressing these challenges necessitates the utilization of external tools specifically tailored for protein design and analysis, and the writing, adaptation, correction and execution of code depends nonlinearly on the progression of the solution strategy that is developed by the system.
The tools are incorporated into the model via the assistant agent, which is responsible for executing them. To assess the performance of the multi-agent framework in handling complex interdisciplinary tasks, we have defined a rich library of functions, each with a specialized capability for solving protein problems. Each function has a distinct profile that describes its role and requires one or more entities as inputs, each of which is also profiled to specify its identity and type, such as string or integer. This helps the agents understand not only which function to choose but also how to provide the input parameters in the correct format. The functions provide the ability to, for instance, retrieve knowledge, perform protein folding, analyze the secondary structure, and predict some parameters through a pre-trained autoregressive language model. Additionally, a function can carry out simulations to compute the protein natural frequencies, thus allowing the model to integrate new physics-based data. A full list of the functions implemented in the current study is provided in Table S1 in the ESI.† It is worth mentioning that all the tools implemented in our multi-agent system are fixed, predefined functions, and the agents have not been given the ability to modify them.
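The idea of a function profile with typed inputs can be sketched as a small registry. The registry layout, the `register` and `call_tool` helpers, and the `count_residues` function are hypothetical names introduced for illustration only; the functions actually used in this study are those listed in Table S1.

```python
from typing import Any, Callable, Dict

REGISTRY: Dict[str, Dict[str, Any]] = {}


def register(name: str, description: str, parameters: Dict[str, Dict[str, Any]]):
    """Attach a profile (role plus typed inputs) to a fixed, predefined function."""
    def decorator(func: Callable) -> Callable:
        REGISTRY[name] = {
            "description": description,
            "parameters": parameters,
            "func": func,
        }
        return func
    return decorator


@register(
    name="count_residues",  # hypothetical tool, not one of the paper's functions
    description="Return the number of amino acids in a sequence.",
    parameters={
        "sequence": {"type": str, "description": "One-letter amino acid sequence"},
    },
)
def count_residues(sequence: str) -> int:
    return len(sequence)


def call_tool(name: str, arguments: Dict[str, Any]) -> Any:
    """Check the arguments against the profile before executing the function."""
    profile = REGISTRY[name]
    expected = profile["parameters"]
    missing = set(expected) - set(arguments)
    if missing:
        raise ValueError(f"Missing parameters for {name}: {sorted(missing)}")
    unexpected = set(arguments) - set(expected)
    if unexpected:
        raise ValueError(f"Unexpected parameters for {name}: {sorted(unexpected)}")
    for key, spec in expected.items():
        if not isinstance(arguments[key], spec["type"]):
            raise TypeError(f"{key} must be of type {spec['type'].__name__}")
    return profile["func"](**arguments)


n_residues = call_tool("count_residues", {"sequence": "MKV"})
```

Checking arguments against the profile before execution gives the agents an explicit error message to react to when a parameter is missing or has the wrong type.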
Given the complexities residing in protein design problems, the primary contribution of our multi-agent framework lies in assessing whether the team of agents can discern the requisite tools for a given query and in evaluating the framework's capability to initiate the execution of these tools, along with providing the necessary inputs. The designated tasks are intentionally designed to be sufficiently complex, involving multiple subtasks where, in some cases, the execution of each depends on the successful completion of the preceding ones. This design showcases the model's capacity for the automated handling of intricate tasks, eliminating or substantially reducing the need for human intervention. Although the multi-agent framework allows for human intervention at different stages, we skip it in order to further examine the team's capability in handling different possible situations, for instance in case of a failure.
The planner then correctly suggests the function “retrieve_content” to be executed with the argument “examples of protein names whose mechanical properties have been studied through experiments”. Upon execution of the function, the assistant provides us with a list of protein names. Upon inspection, we find that the agent has successfully identified experimentally studied proteins, despite an abundance of information on proteins studied theoretically, for instance, through coarse-grained simulations. Since we are interested in the PDB ids, we continue the chat with the follow-up question “Can you provide me with the PDB ids for these proteins?” when “user_proxy” is asked to provide feedback to the chat manager. Again, the planner suggests the “retrieve_content” function with the following message.
The “Assistant” agent then calls the function and gives the following output:
Upon careful examination of the results, we observe that, although all the PDB ids exist in the source database, they do not match the protein names except in a few cases (1ubq, 1ten). Note, however, that the error is caused by the poor performance of the “retrieve_content” function, which implements Llama index, and the team of agents cannot mitigate it because they have no access to the knowledge database. In fact, the entire retrieval augmented generation process is performed solely by Llama index, and the agents merely contribute to this process by providing the query, calling the “retrieve_content” function, and returning the results. As such, we continue to test the capability of the agent team on more challenging queries centered around computational tasks and physics-based simulations by assigning the following task in the next round of conversation.
The above is a complex multi-step analysis and computation task that encompasses aspects such as secondary structure analysis, natural frequency calculations, and structure classification. Additionally, the task is subject to an initial condition that must be satisfied before proceeding through the next sequence of steps, adding an extra layer of complexity. In response, the planner comes up with a detailed plan that consists of all the actions that need to be taken to complete the task. Moreover, the plan mentions all the corresponding functions that need to be executed to accomplish the task. More importantly, the “planner” correctly recognizes the need to fetch the protein structures first, before starting to analyze the secondary structure, although this was not explicitly mentioned in the task query.
The teamwork proceeds with follow-up feedback from the “critic” agent on all the plan steps and functions, which is concluded by the following statement.
Therefore, the positive feedback from the “critic” further supports the good performance of the planner in addressing all the critical steps required to accomplish the tasks.
The “assistant” agent then follows the plan by calling and executing the corresponding functions, starting with the AA length calculation, until all the steps have been undertaken. The results show that all the inputs to the functions are properly identified and provided and that the functions are executed without any error. The conditional statement included in the task is also correctly satisfied for each protein; that is, the computations are conducted only if the sequence length is less than 128 and are omitted otherwise. For instance, for the protein with PDB id “1hz6”, the AA length is returned as 216 by the “assistant”, which is then followed by the following message from the “critic”:
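The length condition can be written as a simple gate, sketched below under stated assumptions. The helper name `analyze_if_short`, the placeholder analysis callable, and the poly-alanine strings are hypothetical; only the threshold of 128 and the two lengths (216 for “1hz6”, 93 for “1wit”) come from the text and Table 2.

```python
def analyze_if_short(pdb_id, sequence, analysis, max_length=128):
    """Run the analysis only when the sequence is shorter than max_length."""
    length = len(sequence)
    if length >= max_length:
        # The condition is not met, so the remaining computations are omitted.
        return {"pdb_id": pdb_id, "length": length, "status": "skipped"}
    return {
        "pdb_id": pdb_id,
        "length": length,
        "status": "analyzed",
        "results": analysis(sequence),
    }


def placeholder_analysis(sequence):
    # Stand-in for secondary structure, frequency, and classification steps.
    return {"n_residues": len(sequence)}


# Placeholder sequences with the reported lengths, not the real proteins.
long_case = analyze_if_short("1hz6", "A" * 216, placeholder_analysis)
short_case = analyze_if_short("1wit", "A" * 93, placeholder_analysis)
```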
After completion of all the tasks, the assistant returns a summary of all the results for each protein as representatively shown below for PDB id “1wit”:
The results suggest that the framework effectively retains all outputs, demonstrating its strong memory even in the face of diverse and extended results. As the last round of conversation, we ask the team to save all the results, which allows us to load them at a later time for other purposes:
In response, the planner suggests calling the Python function “save_to_csv_file”. The main task here is to generate the dictionary of results in JSON with the appropriate structure, as instructed by the user. However, we see that when the “assistant” agent generates the JSON data and passes it to the function, the following error occurs:
Without any human intervention, the agent team is able to resolve the issue by mutual correction. In particular, the “critic” identifies the cause of the error by writing:
Guided by the feedback from the critic, the “assistant” then reconstructs the JSON file from the output results and is able to successfully execute the function and thus save the results in a csv file as shown in Table 2. The complete group chat records can be found in Table S2 of the ESI.†
Protein ID# | Amino acid length | Secondary structure | First 13 frequencies | CATH classification |
---|---|---|---|---|
1wit | 93 | [‘H’: 0.0, ‘B’: 3.23, ‘E’: 51.61, ‘G’: 3.23, ‘I’: 0.0, ‘T’: 13.98, ‘S’: 5.38, ‘P’: 0.0, ‘—‘: 22.58] | [4.3755, 5.0866, 5.5052, 6.7967, 7.908, 8.1947, 9.0166, 9.8528, 11.0632, 11.3968, 11.7355, 12.1279, 12.3498] | 2.60.40.10 |
1ubq | 76 | [‘H’: 15.79, ‘B’: 2.63, ‘E’: 31.58, ‘G’: 7.89, ‘I’: 0.0, ‘T’: 15.79, ‘S’: 5.26, ‘P’: 5.26, ‘—‘: 15.79] | [0.7722, 1.0376, 1.5225, 1.6534, 2.5441, 2.9513, 3.2873, 3.7214, 4.1792, 4.3437, 4.3908, 4.6551, 5.1631] | 3.10.20.90 |
1nct | 106 | [‘H’: 0.0, ‘B’: 4.08, ‘E’: 35.71, ‘G’: 0.0, ‘I’: 0.0, ‘T’: 2.04, ‘S’: 21.43, ‘P’: 0.0, ‘—‘: 36.73] | [3.6644, 4.425, 6.5351, 6.7432, 7.1409, 7.1986, 9.0207, 9.2223, 10.3163, 10.7313, 11.5299, 11.6373, 12.5606] | 2.60.40.10 |
1tit | 98 | [‘H’: 0.0, ‘B’: 1.12, ‘E’: 35.96, ‘G’: 0.0, ‘I’: 0.0, ‘T’: 6.74, ‘S’: 17.98, ‘P’: 0.0, ‘—‘: 38.2] | [5.5288, 5.9092, 8.2775, 8.6267, 9.3391, 9.8783, 10.1607, 11.451, 11.5896, 11.7052, 12.1498, 12.6082, 13.8622] | 2.60.40.10 |
1qjo | 80 | [‘H’: 0.0, ‘B’: 2.5, ‘E’: 40.0, ‘G’: 0.0, ‘I’: 0.0, ‘T’: 8.75, ‘S’: 13.75, ‘P’: 0.0, ‘—‘: 35.0] | [3.8578, 4.4398, 5.4886, 5.7815, 6.6332, 6.9269, 7.2329, 7.6453, 8.2545, 8.3076, 8.6118, 8.7135, 8.8546] | 2.40.50.100 |
2ptl | 78 | [‘H’: 15.38, ‘B’: 1.28, ‘E’: 30.77, ‘G’: 0.0, ‘I’: 0.0, ‘T’: 7.69, ‘S’: 19.23, ‘P’: 0.0, ‘—‘: 25.64] | [0.0386, 0.1161, 0.2502, 0.5921, 1.1515, 1.5257, 2.0924, 2.6793, 3.4292, 3.9289, 4.2172, 4.6878, 4.8022] | 3.10.20.10 |
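The step of turning a JSON dictionary of results into a CSV file can be sketched as follows. This is an assumption-laden illustration, not the “save_to_csv_file” function used in the study, whose signature is not given here: the helper name `results_json_to_csv`, the nested layout (protein id mapped to a dictionary of fields), and the column label "Protein ID" are all hypothetical. The two records reuse values from Table 2.

```python
import csv
import io
import json


def results_json_to_csv(json_text, stream):
    """Validate a JSON dictionary of per-protein results and write it as CSV."""
    try:
        records = json.loads(json_text)
    except json.JSONDecodeError as exc:
        # A clear message gives the agents something concrete to correct.
        raise ValueError(f"Results are not valid JSON: {exc}") from exc
    if not isinstance(records, dict) or not records:
        raise ValueError("Expected a non-empty JSON object keyed by protein id")

    columns = []
    for fields in records.values():
        if not isinstance(fields, dict):
            raise ValueError("Each protein must map to an object of fields")
        for key in fields:
            if key not in columns:
                columns.append(key)

    writer = csv.DictWriter(
        stream, fieldnames=["Protein ID"] + columns, lineterminator="\n"
    )
    writer.writeheader()
    for protein_id, fields in records.items():
        writer.writerow({"Protein ID": protein_id, **fields})
    return len(records)


payload = json.dumps({
    "1wit": {"Amino acid length": 93, "CATH classification": "2.60.40.10"},
    "1ubq": {"Amino acid length": 76, "CATH classification": "3.10.20.90"},
})
buffer = io.StringIO()
n_rows = results_json_to_csv(payload, buffer)
csv_text = buffer.getvalue()
```

Validating the structure before writing means a malformed dictionary fails with a descriptive error rather than producing a partially written file.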
The main outcomes of this experiment are as follows.
• This experiment exemplifies how a multi-agent strategy enables us to go beyond the knowledge of pre-trained LLMs by retrieving new knowledge from external sources.
• This experiment shows an instructive collaboration between the AI–AI agents in problem-solving as well as error handling.
• Multi-agent systems enable us to provide feedback at various stages and pose follow-up questions throughout the process.
• The experiment highlights the failure of the model in extracting correct knowledge from external sources.
As discussed above, the last point directly stems from the failure of the function responsible for knowledge retrieval. To circumvent this, two main strategies can be adopted: (a) we can guide the agents to provide feedback on the reliability of the generated content, although this may be limited by the constraints of the pre-trained language models' knowledge; (b) we can implement more advanced Retrieval-Augmented Generation (RAG) models. Since knowledge retrieval is not the main focus of this paper, we will leave this aspect for future work.
In this experiment, we formulate a complex multi-step task with the objective of comparing the two models based on various structural and physical features derived from the folded structures obtained through Chroma and OmegaFold2. We pose the following task through the “user_proxy” agent:
The “planner” then suggests the following plan.
At first glance, the plan seems to cover all the details necessary to accomplish the tasks included in the problem statement. However, the “critic” agent who is responsible for giving feedback about the plan spots a minuscule error in the saving part of the plan as follows:
The correction made by the “critic” concerning the sequence length underscores its notable proficiency in comprehending how diverse functions and parameters influence various aspects within the realm of protein design.
The “user_proxy” agent is then asked to confirm the plan. The “assistant” then takes the stage and starts following the plan by calling and executing the functions until all the steps have been undertaken. An overview of the work performed by the “assistant” is depicted in Fig. 3. At the end of the computations, the results are formatted into a JSON dictionary to be fed into the “save_to_csv_file” function. However, an error related to the JSON dictionary format occurs when executing the function as follows:
Fig. 3 Overview of the multi-agent work to solve the complex task posed in Experiment II, Section 2.2. First, the multi-agent team uses Chroma to generate de novo protein sequences and then computes natural frequencies and secondary structure content for the generated structures. Next, from the de novo AA sequences, the model finds the 3D folded structures using OmegaFold and computes the frequencies and secondary structure content for these protein structures. Finally, the results are saved in a csv file as shown in Table 3. The numbers represent the sequence in which the functions are executed within the workflow.
The “critic” then steps in by making the following comment and suggesting a plan to fix the error as follows:
The critic makes the necessary corrections and suggests the corrected JSON dictionary for the “assistant” to execute the “save_to_csv_file” function. This time, the function is successfully executed and the results are saved into a csv file as shown in Table 3. Finally, the “critic” gives an evaluation of the whole process:
Protein number# | Amino acid sequence | Secondary structure (pre-fold) | Frequencies (pre-fold) | Secondary structure (post-fold) | Frequencies (post-fold) |
---|---|---|---|---|---|
1 | MIIINIKTENGLSITYNSDEKKLELKYTPVKSPEDFKFPEDAKATISEVEYKGKKVIKIDAKLYVSPDLSKAKLTIEVNADISQEEADKIIDEFIKLLESLGNIKLKVTKDGNKYTIEVE | ‘H’: 13.33, ‘B’: 0.0, ‘E’: 46.66, ‘G’: 0.0, ‘I’: 0.0, ‘T’: 14.16, ‘S’: 7.5, ‘P’: 0.0, ‘—‘: 18.33 | [2.0337, 2.8678, 3.3843, 3.6263, 3.9904, 4.5381, 4.8373, 4.8956, 5.1492, 5.4416] | ‘H’: 15.83, ‘B’: 0.0, ‘E’: 46.66, ‘G’: 2.5, ‘I’: 0.0, ‘T’: 14.16, ‘S’: 4.16, ‘P’: 0.0, ‘—‘: 16.66 | [1.8739, 2.1563, 2.7611, 3.1086, 3.8712, 4.0481, 4.3759, 4.6717, 4.8183, 4.9126] |
2 | GSPLPRPPLSPEEQEALRKKAQEKYNEFVSKIKELLRRAADRVRRGEPVELIEKTIKIGDYEYKIVATSPEEAKELENLIKEMIDLGFKPSKEFSDKLVEAARLIREGRVDEALRLLDEM | ‘H’: 61.66, ‘B’: 0.0, ‘E’: 11.66, ‘G’: 0.0, ‘I’: 0.0, ‘T’: 7.5, ‘S’: 3.33, ‘P’: 3.33, ‘—‘: 12.5 | [0.0207, 0.1058, 0.1782, 0.4189, 0.49, 0.9015, 1.1832, 1.8257, 2.1212, 2.8726] | ‘H’: 62.5, ‘B’: 0.0, ‘E’: 11.66, ‘G’: 0.0, ‘I’: 0.0, ‘T’: 6.66, ‘S’: 1.66, ‘P’: 4.16, ‘—‘: 13.33 | [0.0444, 0.1641, 0.3379, 0.5724, 0.765, 0.9568, 1.4306, 1.5344, 1.6834, 1.8099] |
3 | APLDPDDLSAQLRAAIDELVRLGYEEEVSKPEFIEALRLYALDLGLKEVVLRRVTPAPASQPGVYTVEDVTVDLEALRKQELSPEEQARLEKIRAKYDEMLADPEFQALLDEVLARARAA | ‘H’: 57.50, ‘B’: 0.0, ‘E’: 13.33, ‘G’: 0.0, ‘I’: 4.16, ‘T’: 8.33, ‘S’: 3.33, ‘P’: 6.66, ‘—‘: 6.66 | [0.7546, 1.0836, 1.5026, 1.8874, 2.0844, 2.3192, 2.7975, 3.0199, 3.0669, 3.1382] | ‘H’: 61.66, ‘B’: 0.0, ‘E’: 15.0, ‘G’: 0.0, ‘I’: 0.0, ‘T’: 8.33, ‘S’: 3.33, ‘P’: 1.66, ‘—‘: 10.0 | [0.5256, 1.0278, 1.1566, 1.2877, 1.5521, 1.9111, 2.1887, 2.4664, 2.734, 2.8731] |
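One simple way to compare the pre-fold and post-fold frequencies in Table 3 is the relative shift of each mode. The sketch below is an illustration added here, not an analysis performed by the agents; the helper `relative_shift` is hypothetical, and only the first three frequencies of each protein are copied from the table to keep the example short.

```python
def relative_shift(pre, post):
    """Relative change (post - pre) / pre for each pair of frequencies."""
    if len(pre) != len(post):
        raise ValueError("Frequency lists must have the same length")
    return [(b - a) / a for a, b in zip(pre, post)]


# First three frequencies per protein, copied from Table 3 (pre-fold, post-fold).
table3 = {
    1: ([2.0337, 2.8678, 3.3843], [1.8739, 2.1563, 2.7611]),
    2: ([0.0207, 0.1058, 0.1782], [0.0444, 0.1641, 0.3379]),
    3: ([0.7546, 1.0836, 1.5026], [0.5256, 1.0278, 1.1566]),
}

shifts = {protein: relative_shift(pre, post) for protein, (pre, post) in table3.items()}

for protein, values in shifts.items():
    formatted = ", ".join(f"{100 * value:+.1f}%" for value in values)
    print(f"Protein {protein}: {formatted}")
```

For these three modes, the shifts are negative for proteins 1 and 3 and positive for protein 2, so the two structure sources do not differ in a single consistent direction.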
The main findings of this experiment are summarized as follows:
• This experiment showcases a good example of multi-agent power in developing workflows to autonomously solve complex tasks in the context of de novo protein design and analysis.
• This experiment demonstrates how new physics can be retrieved from physics simulators and integrated into the model.
• The experiment shows the great capability of the “critic” agent in providing valuable feedback to other working agents at different stages of the problem-solving endeavor, further assisting the team of agents in handling possible errors without the need for human involvement.
The plots of the generated results in this experiment including the 3D folded structures are shown in Fig. 4. The full conversations can be found in Table S3 in the ESI.†
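The secondary structure entries in the tables are percentages per structure class. Assuming the analysis function assigns one class label per residue, the content can be computed as sketched below. The helper name and the example label string are hypothetical, the class labels follow the keys that appear in the tables, and a plain hyphen is used here for the unassigned class.

```python
from collections import Counter

# Class labels as they appear in the tables; "-" stands for no assigned class.
STRUCTURE_CLASSES = ["H", "B", "E", "G", "I", "T", "S", "P", "-"]


def secondary_structure_content(labels):
    """Percentage of residues in each class, given one label per residue."""
    if not labels:
        raise ValueError("The label string is empty")
    counts = Counter(labels)
    unknown = set(counts) - set(STRUCTURE_CLASSES)
    if unknown:
        raise ValueError(f"Unknown class labels: {sorted(unknown)}")
    total = len(labels)
    return {
        code: round(100.0 * counts[code] / total, 2) for code in STRUCTURE_CLASSES
    }


# Hypothetical ten-residue assignment, for illustration only.
content = secondary_structure_content("HHHHEE--TT")
```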
In this example, we task the multi-agent team with generating proteins based on their fractional content of the secondary structure and subsequently performing computational and structural analysis tasks. Specifically, in addition to secondary structure analysis and natural frequency calculations, as covered in previous examples, we instruct the team to compute the maximum unfolding force (the maximum force in the unfolding force–separation curve) and the unfolding energy (the area under the unfolding force–separation curve) for each generated protein. To accomplish the latter, we have equipped the multi-agent team with a custom function that utilizes a trained autoregressive transformer generative AI model, ForceGPT. In addition to the maximum unfolding force and energy, the trained generative model is able to predict the entire unfolding force–separation curve based solely on the protein amino acid sequence. Furthermore, the model has the capability to perform inverse design tasks by generating protein AA sequences that yield a desired unfolding behavior. Detailed information about the training of the model can be found in Materials and methods section 4. The task given is:
Note that, as before, we do not specify any particular function or offer hints for selecting the appropriate function to accomplish the tasks. Instead, we empower the agents to formulate a plan, wherein they decide which functions to select and determine the input parameters. The planner outlines the following plan for the given task:
It can be seen that the planner demonstrates good performance in breaking the task into sub-tasks to be accomplished step by step. Moreover, it has identified and suggested the correct functions and corresponding input parameters for each sub-task. The plan is further supported by the “critic” who provides positive feedback as follows:
The multi-agent team then proceeds to execute the different steps outlined in the plan by calling and executing the functions. Specifically, the function ‘design_protein_from_CATH’ is executed with the appropriate ‘CATH_ANNOTATION’ for a specific protein structure design, as outlined in the plan. Following the generation of all proteins, the executions are followed by structural analysis and force and energy computations. It is noteworthy that the model exhibits good performance in retaining and recalling the sequences of the generated proteins, which are essential for the force and energy calculations. Finally, the team successfully completes the task by computing the first 10 frequencies for each protein. An overview of the computations performed by the team of agents for this experiment is shown in Fig. 5.
Fig. 5 Overview of the multi-agent work to solve the complex task posed in Experiment III, Section 2.3. First, the multi-agent team uses Chroma to generate de novo protein sequences and structures conditioned on the input CATH class. Then, using the generated protein structures, the natural frequencies and secondary structure content are computed. Next, the force (maximum force along the unfolding force-extension curve) and energy (the area under the force-extension curve) are computed from the de novo AA sequences using ProteinForceGPT. Finally, the results are saved in a csv file as shown in Table 4. The numbers represent the sequence in which the functions are executed within the workflow.
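The two scalar quantities defined above, the maximum force and the area under the force-extension curve, can be obtained from a sampled curve as sketched below. This is only an illustration of the definitions: in this study both values are predicted by the generative model from the sequence, and the helper name and the four-point curve are hypothetical with arbitrary units.

```python
def unfolding_metrics(separation, force):
    """Return (maximum force, area under the curve by the trapezoidal rule)."""
    if len(separation) != len(force):
        raise ValueError("separation and force must have the same length")
    if len(force) < 2:
        raise ValueError("At least two points are needed to integrate")
    max_force = max(force)
    energy = sum(
        0.5 * (force[i] + force[i + 1]) * (separation[i + 1] - separation[i])
        for i in range(len(force) - 1)
    )
    return max_force, energy


# Hypothetical force-extension curve in arbitrary units.
separation = [0.0, 1.0, 2.0, 3.0]
force = [0.0, 2.0, 1.0, 0.0]
max_force, energy = unfolding_metrics(separation, force)
```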
Given the complexity of the problem, which involves numerous computational tasks, a considerable number of results have been generated in the first round of the conversation. In the next round, to evaluate the team's ability to retain and recall the results, we present the following task:
In this task, we not only request the team to save the data but also require them to adhere to a customized format when storing the results. The model is proficient in creating a JSON dictionary that satisfies the specified format and saving the results to a CSV file, as illustrated in Table 4.
Protein name | AA sequence | Secondary structure | Unfolding energy | Max force | First 10 frequencies |
---|---|---|---|---|---|
mainly_alpha_protein_1 | SMKKIEDYIREKLKALGLSDEEIEERVKQLMEGIKNPKKFEKELQKRNDRESLLIFKEAYALYEASKDKEKGKKLINKVQSERDKWETEQAEAARAAAAA | ‘H’: 89.0, ‘B’: 0.0, ‘E’: 0.0, ‘G’: 0.0, ‘I’: 0.0, ‘T’: 4.0, ‘S’: 1.0, ‘P’: 0.0, ‘-’: 6.0 | 0.381 | 0.444 | [0.2329, 0.4901, 0.9331, 1.3741, 1.7347, 2.1598, 2.3686, 2.6359, 2.8555, 3.0364] |
mainly_alpha_protein_2 | MSKKEIEELKKKLDEIVETLKEYARQGDDACKKAADLIEEVKKALEEGNPEKYSQLKKKLTDAINKAIEEYRKRFEAEGKPEEAQKVIDKLKKILDEITN | ‘H’: 89.0, ‘B’: 0.0, ‘E’: 0.0, ‘G’: 0.0, ‘I’: 0.0, ‘T’: 5.0, ‘S’: 0.0, ‘P’: 0.0, ‘-’: 6.0 | 0.376 | 0.536 | [1.6126, 2.0783, 2.3073, 2.4565, 3.399, 3.475, 4.1377, 4.7104, 4.8864, 5.2187] |
mainly_beta_protein_1 | TTVTVTPPVADADGNEHSTVTAYGNKVTITITCPSNCTVTETVDGVAKTLGTVSGNQTITETRTIAPDEVVTRTYTCTPNASATSSKTQTVTIKGSQPAP | ‘H’: 0.0, ‘B’: 0.0, ‘E’: 64.0, ‘G’: 0.0, ‘I’: 0.0, ‘T’: 10.0, ‘S’: 6.0, ‘P’: 0.0, ‘-’: 20.0 | 0.462 | 0.533 | [1.2806, 1.5057, 1.9846, 2.1025, 2.4723, 2.702, 2.9931, 3.1498, 3.4432, 4.1685] |
mainly_beta_protein_2 | SLKAKNLEEMIKEAEKLGYSRDEVEKIINEIRDKFKKLGVKISEKTLAYIAYLRLLGVKIDWDKIKKVKKATPADFRVSEEDLKKPEIQKILEKIKKEIN | ‘H’: 58.00, ‘B’: 0.0, ‘E’: 8.0, ‘G’: 6.0, ‘I’: 0.0, ‘T’: 8.0, ‘S’: 4.0, ‘P’: 3.0, ‘-’: 13.0 | 0.371 | 0.548 | [2.8864, 4.3752, 4.5928, 4.8295, 5.0854, 5.5618, 5.8646, 6.007, 6.3847, 7.1246] |
alpha_beta_protein_1 | APTVKTFEDTINGQKVTVTVTASPGGKITIKTSPGYGDEVAKAFIEELKKQNVLESYKVESAPGKETTISDVKVKSGATVTFYVINNGKKGKEYSVTVDA | ‘H’: 15.0, ‘B’: 0.0, ‘E’: 59.0, ‘G’: 3.0, ‘I’: 0.0, ‘T’: 12.0, ‘S’: 1.0, ‘P’: 0.0, ‘-’: 10.0 | 0.424 | 0.535 | [2.4383, 2.5651, 3.3175, 3.8231, 3.9673, 4.2655, 4.6393, 5.1509, 5.6023, 5.9555] |
alpha_beta_protein_2 | MELKVTEKKGKGDYKVKVIELNTPDKRYIIIESDASRESLIKAAEALLQGKEVEPTPVNEKNVVLFEDEDVKTSIERSKKLFKSDNPEENIKKALEYLLK | ‘H’: 35.0, ‘B’: 0.0, ‘E’: 29.00, ‘G’: 0.0, ‘I’: 0.0, ‘T’: 3.0, ‘S’: 12.0, ‘P’: 3.0, ‘-’: 18.0 | 0.376 | 0.543 | [2.8756, 3.8895, 4.0594, 4.2831, 4.5542, 5.171, 5.3661, 5.4312, 6.1964, 6.3066] |
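As an illustration of how a JSON-style dictionary of results can be written to a CSV file with the columns of Table 4, a minimal sketch follows. The implementation of the ‘save_to_csv_file’ function used by the agents is not shown in this section, so the helper `write_results_csv` and the layout of the `results` dictionary are assumptions; the single record reuses the mainly_alpha_protein_2 entry of Table 4.

```python
import csv
import io
import json

COLUMNS = ["Protein name", "AA sequence", "Secondary structure",
           "Unfolding energy", "Max force", "First 10 frequencies"]


def write_results_csv(results, handle):
    """Write one CSV row per protein (hypothetical helper, not the paper's code)."""
    writer = csv.DictWriter(handle, fieldnames=COLUMNS)
    writer.writeheader()
    for name, record in results.items():
        row = {"Protein name": name}
        for column in COLUMNS[1:]:
            value = record[column]
            # Nested values (dict, list) are stored as JSON text in one cell.
            row[column] = json.dumps(value) if isinstance(value, (dict, list)) else value
        writer.writerow(row)


results = {
    "mainly_alpha_protein_2": {
        "AA sequence": ("MSKKEIEELKKKLDEIVETLKEYARQGDDACKKAADLIEEVKKALEEGNPEKYSQLKKK"
                        "LTDAINKAIEEYRKRFEAEGKPEEAQKVIDKLKKILDEITN"),
        "Secondary structure": {"H": 89.0, "B": 0.0, "E": 0.0, "G": 0.0, "I": 0.0,
                                "T": 5.0, "S": 0.0, "P": 0.0, "-": 6.0},
        "Unfolding energy": 0.376,
        "Max force": 0.536,
        "First 10 frequencies": [1.6126, 2.0783, 2.3073, 2.4565, 3.399,
                                 3.475, 4.1377, 4.7104, 4.8864, 5.2187],
    }
}

buffer = io.StringIO()  # in-memory stand-in for a file opened with newline=""
write_results_csv(results, buffer)
rows = list(csv.DictReader(io.StringIO(buffer.getvalue())))
```

Storing the nested entries as JSON text keeps each protein on a single row while allowing the dictionary and the list to be recovered later with `json.loads`.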
The plots of the obtained results are shown in Fig. 6. The results indicate that Chroma performed poorly in creating the β-rich protein named mainly_beta_protein_2, whose structure is instead dominated by α-helix (58% ‘H’ versus 8% ‘E’ in Table 4). To test the capability of the multi-agent model in analyzing the results, in the last round of the conversation we ask the model to assess Chroma's performance in generating proteins conditioned on the secondary structure by posing the following question:
The “critic” agent conducts a thorough evaluation of Chroma's performance in generating proteins with targeted secondary structure content. Through a detailed analysis of each CATH structure, it reveals the inherent strengths and weaknesses in Chroma's capabilities. Specifically, addressing the limitations of Chroma's performance, the critic's evaluation provides the following observations for the mainly beta proteins:
The main findings of this experiment are as follows.
• This experiment showcases another successful application of multi-agent collaboration in designing proteins that can possess targeted secondary structure.
• This experiment underscores the power of the multi-agent system in retrieving new physics from pre-trained sophisticated transformer-based models.
• This illustration not only highlights the multi-agent model's proficiency in performing computational tasks but also underscores its intelligence in handling intricate data analyses, an aspect traditionally reserved for humans.
• The experiment highlights a potential problem with multi-agent systems: the generation of undesired content, in this case, proteins that do not possess the intended secondary structures.
In the next experiment, we propose a strategy to effectively circumvent the issue of undesired data generation.
The full conversations for this experiment can be found in Table S4 of the ESI.†
Before proceeding to the results, we should mention that we have slightly modified the profile of the planner to mitigate the approximations involved in generating CATH-conditioned proteins as observed in Experiment III. In more detail, we have added the following instruction to the planner's profile to ensure the generated proteins possess the targeted secondary structures:
If the plan involves using “design_protein_from_CATH” function, if the secondary structure does not meet expectation, you should re-design.
We pose the following task via the user_proxy agent.
The planner then proposes the following plan and asks the Critic to approve the plan.
The above plan includes key features, indicating that the system has grasped two main aspects of the solution to the problem: (a) without explicit instructions, the planner decides to design three proteins with varying secondary structures using the ‘design_protein_from_CATH’ function; (b) unlike in previous experiments, the planner now takes an additional step to ensure that the secondary structures of the proteins meet the expectations.
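The additional verification step can be sketched as a simple check-and-redesign loop. In ProtAgents this behaviour emerges from the planner's profile and the conversation rather than from hard-coded logic, so the following is only an illustration under stated assumptions: the grouping of DSSP codes per target class, the 50% threshold, and the names `needs_redesign` and `design_with_check` are all hypothetical. The two secondary-structure profiles are those reported in Table 4 for the mainly-beta proteins, replayed here as if they were successive design attempts.

```python
# DSSP codes counted toward each target class (assumption for this sketch).
TARGET_CODES = {"mainly_alpha": ("H", "G", "I"), "mainly_beta": ("E", "B")}


def target_content(ss_percent, target):
    """Percentage of residues whose DSSP code belongs to the target class."""
    return sum(ss_percent.get(code, 0.0) for code in TARGET_CODES[target])


def needs_redesign(ss_percent, target, threshold=50.0):
    """True if the target content is below the (assumed) threshold in percent."""
    return target_content(ss_percent, target) < threshold


def design_with_check(design_fn, analyze_fn, target, max_attempts=3):
    """Call design_fn until the analyzed structure meets the target class."""
    for attempt in range(1, max_attempts + 1):
        protein = design_fn(target)
        ss_percent = analyze_fn(protein)
        if not needs_redesign(ss_percent, target):
            return protein, attempt
    return None, max_attempts


beta_2 = {"H": 58.0, "B": 0.0, "E": 8.0, "G": 6.0, "I": 0.0,
          "T": 8.0, "S": 4.0, "P": 3.0, "-": 13.0}
beta_1 = {"H": 0.0, "B": 0.0, "E": 64.0, "G": 0.0, "I": 0.0,
          "T": 10.0, "S": 6.0, "P": 0.0, "-": 20.0}

# Stand-ins for the design and analysis tools: replay the two Table 4 profiles.
attempts = iter([("mainly_beta_protein_2", beta_2), ("mainly_beta_protein_1", beta_1)])
accepted, n_attempts = design_with_check(
    design_fn=lambda target: next(attempts),
    analyze_fn=lambda protein: protein[1],
    target="mainly_beta",
)
```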
Upon receiving approval from the Critic, implementation of the plan commences, with the functions executed sequentially. Initially, two proteins with CATH_ANNOTATION values of 1 and 2 are generated and their secondary structures are analyzed, revealing a high percentage of alpha-helix and beta-sheet, respectively. However, an unexpected memory-related error occurs while generating the third protein, which has a CATH_ANNOTATION of 3 (mixed protein):
In response, the Planner suggests retrying the function. However, the memory issue persists, and the Planner returns the following:
Since the problem persists, the Planner smartly revises the initial plan and proposes an alternative that omits the generation of the mixed protein, effectively bypassing the memory issue.
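The recovery strategy the Planner arrives at, retrying once and then omitting the failing item, corresponds to a common retry-then-skip pattern. The sketch below hard-codes that pattern purely for illustration; in the experiment the decision is made by the LLM-based agent during the conversation. The function `run_plan`, the use of Python's `MemoryError` as the failure signal, and the simulated `flaky_design` tool are assumptions.

```python
def run_plan(cath_annotations, design_fn, max_retries=1):
    """Design one protein per CATH annotation, skipping persistent failures."""
    completed, skipped = {}, []
    for annotation in cath_annotations:
        for _ in range(max_retries + 1):
            try:
                completed[annotation] = design_fn(annotation)
                break
            except MemoryError:
                continue  # retry, as the Planner first suggests
        else:
            # All attempts failed: drop this item and carry on with the rest.
            skipped.append(annotation)
    return completed, skipped


def flaky_design(cath_annotation):
    """Simulated design tool that always runs out of memory for class 3."""
    if cath_annotation == 3:
        raise MemoryError("simulated out-of-memory failure")
    return f"protein_for_CATH_{cath_annotation}"


completed, skipped = run_plan([1, 2, 3], flaky_design)
```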
The plan implementation proceeds by calling the “calculate_force_energy_from_seq” function with the sequences of the previously designed proteins as input. The function returns the force and energy values, which are then saved into a CSV file by the “save_to_csv_file” function, which executes successfully. Lastly, the “Planner” concludes the process by summarizing the results, reporting the main findings, and noting that the mixed-structure protein could not be designed because of technical issues.
The main conclusions of this experiment are summarized as follows:
• The experiment highlights the capability of multi-agent modeling to create workflows that develop and execute research in protein science, potentially offering new scientific revelations.
• By issuing suitable directions, the model can be prompted to deliver more trustworthy outputs. This strategy is effective not only in decreasing the uncertainty in generative designs, as demonstrated here, but also in sidestepping errors due to hallucinations, a frequent issue in large language models.73,74
• LLM-based agents excel in responding to unexpected circumstances and devising alternative solutions, thus avoiding unforeseen errors.
To achieve this goal, we constructed a group of agents, each assigned a unique profile through an initial prompt, that interact dynamically in a group chat, making decisions and taking actions based on their observations. Each agent's profile outlines its attributes, roles, and functionalities within the system and describes the communication protocols for exchanging information with other agents. Our team of agents includes a user_proxy to pose the query, a planner to formulate a plan, a function-backed assistant to execute the functions, and a critic to evaluate the outcome and critique the performance. We also use a chat manager to lead the group chat by dynamically choosing the working agent based on the current outcome and the agents' roles. Through a series of experiments, we showed that the agents not only fulfill the roles assigned to them but also collaborate autonomously through discussion powered by the general-purpose LLM. For example, the agent playing the role of planner successfully identified all the tasks in the query and suggested a detailed plan, including the necessary functions to accomplish them. Furthermore, the agent assigned the critic role is able to give constructive feedback on the plan or, in case of failure, provide suggestions to correct errors that may emerge. Our experiments have showcased the great potential of the multi-agent modeling framework in tackling complex tasks as well as in integrating AI agents into physics-based modeling.
It is worth mentioning that there are similarities between the LLM-based multi-agent systems employed here and non-LLM-based autonomous multi-agent systems.75–78 These agents operate with significant autonomy, drawing on previous experiences and data-driven insights to navigate complex, high-dimensional decision spaces. Whether orchestrating materials discovery workflows or engaging in multi-agent interactions, these systems exemplify how intelligent agents can reduce the need for direct human oversight and enhance efficiency in diverse fields such as materials science and virtual simulations. Furthermore, an intriguing aspect of multi-agent modeling is its compatibility with the concept of federated learning,79,80 forming a robust framework for managing distributed data and learning tasks. This integration enables the use of distinct agents embedded in different systems, which may be located in physically distinct locales and possess varying levels of data access. This setup not only enhances data privacy and security but also improves the system's adaptability and responsiveness to changing environmental conditions.
Multi-agent modeling is a powerful technique that offers enhanced problem-solving capacity, as shown here in various computational experiments in the realm of protein design, physics modeling, and analysis. Given a complex query comprising multi-objective tasks, the model uses the idea of division of labor: it develops a strategy to break the task into sub-tasks and then recruits a set of agents to solve them autonomously. Tool-backed agents have the capacity to execute tools via function execution. We equipped an agent with a rich library of tools that span a broad spectrum of functionalities, including de novo protein design, protein folding, and protein secondary structure analysis, among others. Because there is no intrinsic limitation on customizing the functions, knowledge from different disciplines can be integrated into the model and analysis, for instance through knowledge retrieval systems or physical data obtained via simulations. Here, we utilized coarse-grained simulations to obtain the natural frequencies of proteins, but the model offers high flexibility in defining functions that focus on other types of simulation (e.g. an expert in performing Density Functional Theory, Molecular Dynamics, or even physics-inspired neural network solvers37,81,82). The multi-agent framework can also accelerate the discovery of de novo proteins with targeted mechanical properties by embracing the power of robust end-to-end deep models solving forward and inverse protein design problems.17,24,59,83–86
The development of such models, which connect structural protein features, such as secondary structure, to a material property, such as toughness or strength, has gained a lot of attention recently. Here, we used a pre-trained autoregressive transformer model to predict the maximum force and energy of protein unfolding, but other end-to-end models could also be utilized. In the context of inverse protein design problems, a team of two agents, one expert in the forward task and the other in the inverse task, can collaborate to perform a cycle check that verifies that the de novo proteins meet the specified property criteria. Along the same lines, one could benefit from multi-agent collaboration in evaluating the accuracy of generative models in the conditional design of proteins or in comparing the created 3D structures with those of state-of-the-art folding tools.16,87,88 For example, through an automated process of protein generation and structure analysis, our ProtAgents framework revealed the shortcomings of Chroma in designing β-sheet-rich proteins. In another example, the folded 3D structures of Chroma were compared with those obtained by OmegaFold2. All these examples demonstrate the capacity of the multi-agent framework in a wide range of applications in the context of protein design and analysis. Lastly, the model enables integrating various information across scales, whether new protein sequences or physics simulation output in the form of rich data structures, for inclusion in easily readable file formats (like JSON) to be used by other agents or to be stored for future analysis.
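The cycle check between an inverse-design agent and a forward-prediction agent can be sketched as follows. This is a conceptual illustration only: `cycle_check`, the relative tolerance, and the toy models (an inverse model that replays a fixed list of candidates and a forward model that returns the fraction of alanine residues) are assumptions and do not correspond to any model used in this work.

```python
def cycle_check(target, inverse_model, forward_model, rel_tol=0.05, max_rounds=10):
    """Propose sequences with the inverse model until the forward model's
    prediction matches the target within rel_tol; return None on failure."""
    for round_index in range(1, max_rounds + 1):
        candidate = inverse_model(target)     # inverse task: property -> sequence
        predicted = forward_model(candidate)  # forward task: sequence -> property
        if abs(predicted - target) <= rel_tol * abs(target):
            return candidate, predicted, round_index
    return None


# Toy stand-ins for the two expert agents (not real models).
_candidates = iter(["GGGG", "AAGG", "AAAG"])


def toy_inverse(target):
    return next(_candidates)


def toy_forward(sequence):
    return sequence.count("A") / len(sequence)


result = cycle_check(0.75, toy_inverse, toy_forward)
```

Only candidates confirmed by the forward model are accepted, so a failure of either model shows up as a rejected round rather than as an unverified design.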
Designing de novo proteins that meet special objectives in terms of mechanical or structural properties presents unique challenges calling for new strategies. The prevailing strategies often rely on developing data-driven end-to-end deep learning models to find the complex mapping from protein constitutive structure to property or vice versa. However, these models often focus on specific properties, limiting their functionality in multi-objective design purposes where several criteria need to be met. To overcome these challenges and propel the field forward, future research endeavors could revolve around the development of an integrated system of agents designed to automate the entire lifecycle of training deep neural networks for protein design. Each agent within this system could be assigned specific responsibilities, such as data generation through simulations, data curation for ensuring quality and relevance, and the execution of the code required for model training. Additionally, a critic agent could monitor and critique the training process, making decisions like early stopping or tuning hyperparameters to enhance the model's accuracy. This collaborative and automated approach would not only streamline the design process but also contribute to achieving higher or desired levels of accuracy in the generated models. Furthermore, this agent-based strategy can extend to on-the-fly active learning, where agents dynamically adapt the model based on real-time feedback, improving its performance iteratively.
Multi-agent modeling has the potential to transform the landscape of de novo protein design, enhancing efficiency, adaptability, and the capacity to meet diverse and complex design goals, thereby establishing a new paradigm in materials design workflows. To fully realize the transformative potential of multi-agent modeling in de novo protein design, it is important to address the gap in evaluation methodologies that currently exists in the field. A crucial element in the development of AI models involves evaluating their performance, typically done using well-established benchmarks. However, current benchmarks for assessing AI models do not yet incorporate multi-agent strategies and often rely on simplistic single-shot or multi-shot responses. Thus, developing a comprehensive benchmark specifically for evaluating multi-agent strategies in protein tasks presents an intriguing avenue for future research. The development of such benchmarks would greatly enhance the ability to evaluate the success and applicability of LLM-based multi-agent systems in the field of protein science.
In our multi-agent system, the human user_proxy agent is constructed using the UserProxyAgent class from Autogen; the Assistant, Planner, and Critic agents are created via the AssistantAgent class from Autogen; and the group chat manager is created using the GroupChatManager class. Each agent is assigned a role through a profile description listed in Table 1, which is passed as the system_message at creation.
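A minimal sketch of how such a team might be assembled is given below, assuming the pyautogen 0.2-style API (`UserProxyAgent`, `AssistantAgent`, `GroupChat`, `GroupChatManager`). The profile strings are placeholders standing in for the texts of Table 1, and the values chosen for `human_input_mode`, `code_execution_config`, and `max_round` are assumptions rather than the settings used in this work. Tool registration is omitted.

```python
# Placeholder profiles: the actual texts are those listed in Table 1.
PROFILES = {
    "user_proxy": "<user_proxy profile from Table 1>",
    "planner": "<planner profile from Table 1>",
    "assistant": "<assistant profile from Table 1>",
    "critic": "<critic profile from Table 1>",
}


def build_team(llm_config, max_round=50):
    """Create the four agents and the group chat manager (illustrative only)."""
    import autogen  # imported here so the sketch loads without the dependency

    user_proxy = autogen.UserProxyAgent(
        name="user_proxy",
        system_message=PROFILES["user_proxy"],
        human_input_mode="ALWAYS",    # assumption: a human poses the queries
        code_execution_config=False,  # assumption: no free-form code execution
    )
    # The planner, assistant, and critic differ only in their profiles.
    experts = [
        autogen.AssistantAgent(
            name=role,
            system_message=PROFILES[role],
            llm_config=llm_config,
        )
        for role in ("planner", "assistant", "critic")
    ]
    group_chat = autogen.GroupChat(
        agents=[user_proxy, *experts],
        messages=[],
        max_round=max_round,
    )
    manager = autogen.GroupChatManager(groupchat=group_chat, llm_config=llm_config)
    return user_proxy, manager
```

With a valid `llm_config`, a conversation would then be started with `user_proxy.initiate_chat(manager, message=...)`, after which the manager selects the next speaker at each round.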
Pre-training was conducted on a dataset of ∼800000 amino acid sequences, using next-token prediction in a “Sequence” task (https://huggingface.co/datasets/lamm-mit/GPTProteinPretrained):
The ProteinForceGPT model was then fine-tuned bidirectionally, to predict mechanical properties of proteins from their sequence, as well as sequence candidates that meet a required force-extension behavior and various other properties. Fine-tuning is conducted using a dataset derived from molecular dynamics (MD) simulations.91 Sample tasks for the model include:
Sample results from validation of the model are shown in Fig. S2.† Only forward predictions are used in the agent model reported here.
Footnote
† Electronic supplementary information (ESI) available: The full records of different conversation experiments along with additional materials are provided as supplementary materials. See DOI: https://doi.org/10.1039/d4dd00013g
This journal is © The Royal Society of Chemistry 2024