Baishakhi
Tikader
a,
Samir K.
Maji
*b and
Sandip
Kar
*a
aDepartment of Chemistry, IIT Bombay, Powai, Mumbai – 400076, India. E-mail: sandipkar@iitb.ac.in
bDepartment of Biosciences and Bioengineering, IIT Bombay, Powai, Mumbai – 400076, India
First published on 3rd September 2021
Amyloid formation is a generic property of many protein/polypeptide chains. A broad spectrum of proteins, despite having diversity in the inherent precursor sequence and heterogeneity present in the mechanism of aggregation produces a common cross β-spine structure that is often associated with several human diseases. However, a general modeling framework to interpret amyloid formation remains elusive. Herein, we propose a data-driven mathematical modeling approach that elucidates the most probable interaction network for the aggregation of a group of proteins (α-synuclein, Aβ42, Myb, and TTR proteins) by considering an ensemble set of network models, which include most of the mechanistic complexities and heterogeneities related to amyloidogenesis. The best-fitting model efficiently quantifies various timescales involved in the process of amyloidogenesis and explains the mechanistic basis of the monomer concentration dependency of amyloid-forming kinetics. Moreover, the present model reconciles several mutant studies and inhibitor experiments for the respective proteins, making experimentally feasible non-intuitive predictions, and provides further insights about how to fine-tune the various microscopic events related to amyloid formation kinetics. This might have an application to formulate better therapeutic measures in the future to counter unwanted amyloidogenesis. Importantly, the theoretical method used here is quite general and can be extended for any amyloid-forming protein.
To analyse the aggregation mechanism successfully, one needs to identify all possible conformational states adopted by the polypeptide chain and the corresponding kinetic rates of the transitions among such states that eventually govern the temporal dynamics of the overall transformation.10 Experimentally, the characterizations of these transient oligomers are essential since some of these conformations are often found to be highly toxic, insoluble, and causative agent for deadly diseases.24,25 Thus, precise knowledge about the distribution of the population of these oligomeric species and identifying their interconversion rates during the aggregation process can be a crucial step forward to understand the pathogenesis of the protein deposition diseases holistically. In this regard, the mathematical modelling approach has emerged as an interesting tool to shed light on the different microscopic events happening at different time scales and dynamic features that are significant and insightful during the accumulation of amyloid.26–33
Previously, several experimental and theoretical studies have been put forward to disentangle various aspects of amyloid, but none of these studies have focused extensively on the internal kinetic progression of various oligomeric states, including structural reorganization-transition rate.34–36 Moreover, connecting these kinetic rates with each particular reaction event and an extension of this idea as a generic approach for a diverse range of proteins remains elusive in the literature. Additionally, in most of these models, the number of probable oligomers and the highest degree of the polymeric structure of any particular state, which is best compatible with the experimental data have mostly been assumed.28,37 An ODE-based mathematical model proposed by Pallitto & Murphy was extremely successful in delineating the different aspects of aggregation kinetics of Aβ42 protein along with estimating information about the filament/fibril length, which quantitatively corroborated with the experimental findings.30 However, the applicability of the proposed model was never examined for other proteins. Thus, it highlights the need for constructing a general modelling framework to elucidate the detailed mechanism of aggregation for various amyloid-forming proteins, which forms aggregates in diverse time scales.
Experimentally, it is a difficult task to track the entire kinetics of all molecular species that emerge during the process of fibrillation.24,38 Although the filamentous growth process varies from protein to protein, we must emphasize that most of these proteins pass through discrete elementary steps and ultimately reproduce a common structure by forming a linear, extended, unbranched filaments with a common cross-β backbone.39–41 Therefore, it can be speculated that a broad spectrum of proteins, despite having diversity in the interconversion mechanism may follow a common framework to produce amyloid fibril. Thus, it is imperative to develop a generic modelling strategy to understand the aggregation mechanism for a diverse range of proteins in a holistic manner.
Fig. 1 Principal steps of the protein aggregation that are involved in the proposed ensemble models for amyloid formation. (a) Schematic representation of various microscopic molecular mechanisms that are principally involved in different phases of the aggregation process of any amyloid-forming protein. (b) The proposed core interaction network consists of a different probable configurational variation of the intermediate species, which can be depleted in the amyloid formation process. The heterogeneity in different probable configurational states has been summarized here. Each distinct set of particular structures has been characterized by giving a different model name (Fig. S2a–f†). The various molecular entities involved in the generic model and all the molecular events associated with amyloid formation are described in (Table 1) for two different kinds of amyloid-forming proteins (for more details see ESI Section-3†). |
For simplicity, in our model, we mostly consider the primary nucleation event in the lag phase and secondary nucleation in the growth phase following many previous studies, though recent studies have suggested that the lag phase is also composed of a significant contribution from the secondary nucleation process.43,46 At the beginning of the aggregation process, we have assumed that the protein in solution exists in its native structural state.47,48 The initial state of (M1) represents unstructured or structured monomer, which is the aggregation competent unit in our model. The first important key mechanism in the fibrillation process is the formation of the nucleus, where the monomer (M1) can self-assemble to form a dimer (M2), this dimer can further assemble with the monomer to form a trimer (M3) and the process can extend up to tetramer or pentamer (Mx, x = 1 to 5) in a highly reversible manner (Table 1).
Here, it is important to mention that in our model, the dimer/trimer/tetramer means the formation of any differently sized higher-order oligomer from monomer in the process of aggregation. We further assumed that depending on the nature of the aggregating protein, one of these oligomeric forms (i.e., either trimer, tetramer, or pentamer) can transform into a critical assembled state,49,50 either directly or facilitated by higher-order structure (Table 1). It has been shown that the self-association process of monomeric protein not only promotes aggregation to form oligomers but also induce the assembly-dependent structural transition (either unfolding for natively structured protein or partially folding for natively unstructured protein) to form partially folded intermediates [(Iy, y = 1 to 4)], for example, in the case of α-synuclein (α-Syn) protein (Table 1).51,52
However, there are instances where such partially folded intermediates were not observed or identified during the experiment, as a result, suitable experimental data for such intermediate are not available. Under such circumstances, we defined these species as the minimum fibrillar unit (as we have not provided kinetic data for partially folded intermediates) for proteins such as Aβ(1–42) protein (Aβ42), cMyb TAD (transactivation domain) peptide (Myb), and transthyretin (TTR) protein (case-2, Table 1).53–55 Thus, we have classified the amyloid-forming proteins either belonging to (case-1) or (case-2) based on the available experimental data to study the aggregation kinetics for the respective proteins (for details see Table S1†). However, it is important to emphasize that the model structurally remains the same (Fig. 1b) but the interpretation of the species involved in the aggregation process gets modified depending on the kind of amyloid-forming proteins (Table 1). The partially folded intermediate themselves can further aggregate individually with the monomers (M1), or they self-assemble to give rise to higher-order partially folded intermediate (case-1, Table 1). In the same way, we can hypothesize that the minimum fibrillar units can produce higher-order fibrillar units for the case-2 (Table 1) proteins, respectively.
Furthermore, we have inferred that higher-order partially folded intermediates can conformationally transit to the minimum fibrillar unit to further elongate into a mature fibril (case-1, Table 1). Additionally, for (case-2) proteins, we hypothesize that the minimal fibrillar units eventually convert to higher-ordered fibrillar units to form mature fibrils (Table 1). Finally, the mature fibrillar state [(Bz, z = 1 to 3)] can be formed by following two different paths, either via consecutive association with the respective minimum fibrillar units or through the secondary nucleation process56–58 by sequentially binding with the monomers of the aggregating proteins (Table 1). Thus, in our proposed generic model (Fig. 1), we have introduced most of the mechanistic details of the aggregation process that can account for various aspects of amyloidogenesis. Herein, we have considered an ensemble of models (Fig. S2a–f†) based on the different probable degrees of aggregation of the monomeric units (M3, M4, M5) and elongation units (I3, I4). For example, in the case of Model-1A, we have considered that the monomeric unit can aggregate up to a trimeric unit (M3) and the highest order elongation unit can form (I3). Whereas, in Model-2F, we assumed that the monomeric unit could extend till pentamer (M5) and the elongation unit could form a tetrameric unit (I4) as the highest structural degree. In this way, all the ensemble models are generated systematically (for more details see Fig. S2a–f, ESI Sections-2 and 3†).
To identify the optimum interaction network for a representative protein, each distinct network (Fig. S2a–f†) has been translated into sets of ordinary differential equations (ODE) (ESI Section-4†) based on simple mass-action kinetics. A generalised common framework (Table 2) of the ODEs has been summarised by considering the ODEs of all the proposed model variants (see ESI Section-4†) to succinctly depict our proposed approach. Each model variable and kinetic parameters involved in the generic model are described in Tables S2 and S3, ESI Section-1,† respectively. We have given a detailed account of the interactions in ESI Section-3,† where we have even described how different aggregated forms of the proteins interact with the thioflavin T (ThT) as designated in Fig. S3.†
However, it is extremely important to get a reasonable estimate of the kinetic parameters involved in the proposed model to elucidate and investigate the inherent mechanism behind this heterogeneous aggregation process. Thus, each specific model has been optimized with the available experimental data set for a particular protein. The statistical parameter (χ2 and AIC (Akaike information criteria) value) obtained after data fitting for each model variant have been analysed and compared to distinguish the best-fitted model (Tables S5–S8†). We have used sophisticated parameter optimization software “Potterswheel”59 to fit and optimize our models with relevant experimental data using stringent statistical criteria (see Method section). To optimize our models, we have selected a series of experimental inputs for a group of amyloidogenic proteins/peptides (α-Syn, Aβ42, Myb, and TTR protein). The experimental inputs varied for different proteins, such as for α-Syn protein, we have focused on the normalized kinetic data of intermediate helix-rich oligomers combined with time-dependent thioflavin-T data of total aggregates reported by Ghosh et al.51 For Aβ42 and TTR protein, we have utilized only ThT fluorescence kinetic assay time-resolved data of amyloid formation.54,60 In the case of Myb protein, information about the amount of the initial and final secondary structures (α-helix and β-sheet) during the amyloid formation process, and the time profile of ThT fluorescence data of the overall kinetics as reported by the Gadhave et al.55 have been used. Selecting simultaneous experimental inputs for different proteins and attaining the desire interaction networks established the fact that our proposed canonical models can fit a diverse set of kinetic experimental data. Thus, our methodology to calibrate the proposed model with experimental data is quite advanced and it determines the kinetic parameters efficiently. The link for the core model network structure is available in sbml format in GitHub repository for the (https://github.com/baichandra05/Genericmodel_protein_aggregation) users to customize and fit any type of kinetic data set according to their preferences. A flowchart of the algorithm (Scheme 1) has also been included and detailed schematics have been provided in the ESI (Fig. S17†) to make the approach easy and understandable for the general readers. Thus, we have provided a general modelling framework to obtain a comprehensive understanding of the kinetics of amyloid formation that corroborates with various experimental observations for a diverse range of amyloid-forming proteins to dissect the underlying dynamical nature of amyloidogenesis.
We further evaluated the ability of these aggregation ensemble models to improve the identifiability of the estimated parameters. To test this, we have plotted the maximum and minimum variability (bounded area shown as a spring green cloud in Fig. 2) at each time point from the best 2% fitted model trajectories and analysed the parameter identifiability by visualizing the corresponding box-plot (Fig. S5–S7†). In the case of α-Syn protein, though simultaneously two diverse kinds of data sets (the kinetics of amyloidogenesis by performing experiments with thioflavin T, and the kinetic evolution of the total partially folded intermediate (Itotal) during the amyloid formation) have been introduced to calibrate the model, still, the bounded area from the best-fitted trajectories remains well constrained along with all the experimental data (Fig. 2a and b) by reproducing a set of identifiable parameters (Fig. S5†). This observation significantly reveals that any particular parameter set belonging to the best 2% fits for Model-1F can capture the experimental data adequately. Hence, any kind of predictions made by these best-fitted kinetic rate constants (Table S4†) using Model-1 will be highly robust.
Fig. 2 Self-aggregation kinetics of different amyloid-forming proteins emerge from the global fitting of experimental data using the best-optimized model variants. The bounded area (the maximum and minimum variability corresponding to each time point) of best 2% fits of the total ThT bounded aggregates (a) and % of total partially folded intermediate aggregates (b) obtained for α-Syn protein (300 μM) by fitting both the respective dataset simultaneously.51 The bounded area of 2% best-fits the total ThT bounded aggregates of (c) Aβ42 protein (10 μM), (d) Myb protein (20 μM), and (e) TTR protein (64 μM).53–55 In each case, the thick solid line represents the average of best 2% fitted trajectories taken out of ∼3000 fit sequences by fitting the experimental data with the best-fitted model. The black lines indicate the experimental data sets with error bars (the error bars designate standard deviation calculated with a standard error model in build within “Potterswheel” software).59 |
For the Aβ42 protein, as only one experimental data input has been utilized during the global fitting procedure, the bounded area from the 2% best-fitted trajectories is comparatively more constrained (Fig. 2c), which essentially makes most of the parameters highly identifiable (Fig. S6†). However, for Myb protein, the presence of some amount of secondary structure at the beginning of aggregation has been incorporated during the fitting. This created a greater deviation in the best 2% fitted trajectories (Fig. 2d) for some initial time points of the ThT experimental data leading to a greater number of non-identifiable kinetic rate constants (Fig. S7†). This indicates that the best-fitted parameter set for Myb protein will have lesser predictive power. As we are claiming that our approach can be globally applicable to a diverse set of proteins, next we chose a protein, TTR, which aggregates at a faster time scale and does not have any distinct lag phase.54 We found that Model-1D (Fig. S4e†) reproduces a set of highly robust and identifiable parameters (Fig. S8†), which is supported by the smaller variability among the 2% best-fitted trajectories (Fig. 2e).
Importantly, we have obtained additional insights about the most probable underlying network structure (Table 3) that eventually leads to such kind of kinetics. It suggests that for α-Syn protein, a pentameric random coil state (M5), forth-ordered partially folded intermediate (I4), and the higher-order fibrillar unit (B3) will be necessary to attain such kinetics of aggregation. The primary nucleation step involves a pentameric random coil state (M5) for Aβ42 and Myb proteins as well but preferred the (I3) state as the lower order fibrillar unit to aggregate. However, for TTR protein, the global fitting method automatically chooses the lowest possible configuration (M3) during the primary nucleation process, probably aligned with faster kinetics in the absence of the lag phase.
In summary, we have established that the particular model variants present in Table 3 are the optimal models for the respective proteins, which are consistent with the corresponding aggregation kinetics data. We found that the selected best model could not be further enhanced or reduced, but all the involved interactions are required to explain the complementary experimental data set. Thus, our study indicates that fitting the experimental data for any specific protein with one particular network model will not be sufficient to determine the best-fitted global model variant, as it ignores the configurational heterogeneity. Instead, fitting the experimental data to a set of aggregation ensemble models is extremely helpful to distinguish the appropriate aggregation network model and it improves the identifiability of the corresponding estimated kinetic parameters. Analysing the identifiability of the kinetic parameters for respective optimal models for any particular protein was necessary to decipher the important features and mechanistic insights of amyloid growth kinetics.
For example, in the case of α-Syn, we have obtained a constrained set of best-fitted parameters (Table S4 and Fig. S5†) for the corresponding model (Table 3) for 300 μM protein concentration. Our model predicts that the decrease or increase of monomeric concentration of α-Syn will significantly modify the lag time threshold by altering the slope of the sigmoidal growth kinetics (Fig. 3a). However, for α-Syn, the model reconciles the difference between the time courses of fibrillation in the high, medium, and low concentration regimes following experimental data.51 The model simulation performed at a very low total protein concentration reestablishes the fact that a minimum concentration is required to maintain the nature of the amyloid formation kinetics (Fig. S9†). In the case of Aβ42 protein, our model perceives the similar observation that altering the initial monomeric protein concentration will change the lag time as well as the slope of the sigmoidal curve (Fig. 3b). However, for Myb protein, the model simulation performed under varied total protein concentrations does not help much to predict the dynamics directly, since we considered the presence of secondary structure content only at the beginning of the aggregation process according to the experimental data.55 This observation suggests that there is no perceptible change in the slope of the growth kinetics, however, the associated lag time changes moderately (Fig. 3c).
Fig. 3 Model reconciles the concentration dependency of amyloid formation dynamics for different proteins. The best-fitted model (Table 3) predicted time courses of total ThT bound aggregates are shown for (a) α-syn, (b) Aβ42, (c) Myb, and (d) TTR proteins by varying the total protein concentrations, and keeping all other parameters same as given in (Table S4†). In each case, the green solid line indicates the best-fitted trajectory and the black dots represent the experimental data points with error bars. |
For TTR protein, though there is no existence of lag time, still, we performed the variation in the total monomeric concentration and found that the bounded ThT response was considerably higher or lower depending on the increase or decrease in the total TTR concentration (Fig. 3d). All these predictions from the model simulations substantiate the experimental observation to a greater extent.51,53–55 Overall, it can be concluded that the simulation analysis based on the best-fitted parameters disentangle the monomeric concentration dependence of the overall assembly process for the corresponding protein. Importantly, our model generates the qualitative dynamics of the aggregation kinetics at different concentration regimes even though we have fitted the most favourable model for a fixed total protein concentration in each case. At this point, we can speculate that the best-fitted models could capture the critical time scales associated with the aggregation kinetics efficiently.
Fig. 4 Different time scales segregate the kinetics of amyloid formation of α-Syn protein. Time scales for the different conditions have been calculated by simulating Model-1F by varying total protein concentration Ptotal keeping the parameters fixed provided in Table S4.† (a) Schematic of the estimation of the duration of the lag phase (τlag) and the duration of the time taken to reach half of the saturation level of the bounded ThT response (τ50) at any given total protein concentration. (b) A plot of the variation of (τlag) as a function of total protein concentration calculated from the model predicted simulated time-courses. (c) The model-predicted variation of (τ50) as a function of total protein concentration showed similar behaviour as observed with (τlag). A correlation plot (inset) of (τlag) vs. (τ50) shows a higher degree of linear correlation up to a certain total protein concentration. (d) A plot of duration of partially folded intermediate (τDI) as a function of the total protein concentration obtained by simulating the Model-1F reproduces similar observation (τlag) as and (τ50). (e) The scatter plot represents the correlation of (τDI) with (τ50) and (τlag) (inset) under various total protein concentrations. (f) A plot of θ [in %, mathematically expressed as where (τtotal = τlag + τDI) against total protein, concentration illustrates the effect of intermediate in shaping up the amyloid-forming kinetics for α-Syn. |
Intriguingly, it has been speculated experimentally that intermediate aggregates (defined as a partially folded intermediate in our model) may be a remarkable kinetic controller (say for α-Syn protein) of the fibrillation process.51 Herein, we want to interrogate whether or not these intermediates play an important role in influencing the kinetics of amyloidogenesis during the transitions from either lag phase to the elongation phase or elongation phase to the plateau phase. To inspect this, we define and quantify (τDI) (duration of partially folded intermediate, Fig. S10†) from our model predicted time profiles for different total protein levels. Our analysis reveals that (τDI) varies with the total protein concentration (Fig. 4d) in a similar fashion like (τlag) and (τ50). Moreover, there is a higher degree of correlation existing between (τDI) with (τ50) (Fig. 4e) and (τlag) (Fig. 4e, inset), where the respective profiles eventually saturate, once the total protein concentration crosses a certain threshold level.
To decipher the role of these time scales (τlag, τ50) and (τDI) precisely, we further defined a relative time scale, θ (Fig. 4f) in the context of α-Syn, which delineates the influence of partially folded intermediate aggregates in the transition from elongation to the plateau phase progressively with the variation in the total protein concentration. In the lower concentration regime (<400 μM), the abrupt decrease in θ (Fig. 4f) with a total protein concentration indicates the comparative steeper decrease in (τDI) over (τlag and τ50) and thereafter creating a smaller difference between (τtotal − τ50). This observation suggests that under low to moderate total protein concentrations, partially folded intermediate aggregates do influence the amyloid-forming kinetics in a significant manner. However, once the (τDI) value gets saturated beyond a threshold level of total protein concentration (Fig. 4d), the sharper decrease in (τ50) compared to (τlag) governs the relative increase in θ dynamics (Fig. 4f). This depicts that at a higher concentration regime, both primary nucleation events and partially folded aggregated intermediate synergistically control the temporal transition of aggregates.
Additionally, we have observed the dynamical evolution of different states of the α-Syn protein, which helps to dissect the aggregation kinetics from a microscopic viewpoint. Our analysis demonstrates that for α-Syn protein, the time-scale of the transformation of the unstructured monomeric unit to mature stable fibrils depends on temporal timing equilibrium existing between the conversions of random coil (RC) to intermediate (I) to mature fibril (B), which is protein concentration-dependent (Fig. S11†). At low concentrations (up to 250 μM), monomeric state decay very steadily and the relative accumulation of the intermediate is slow resulting in very slow kinetics. Whereas for higher total protein concentration, the random coil structure converts rapidly and the appearance and disappearance of intermediate aggregates are relatively fast, which leads to rapid advancement in the kinetic procession (Fig. S11†). Thus, the diverse nature of variation of all these time scales (Fig. 4) highlights the distinct effect of the evolution of different states (RC, I and B) that govern the dynamics of aggregation under different concentration domains.
Moreover, (τlag) and (τ50) associated with Aβ42 protein aggregation demonstrate a similar trend (Fig. S12†) with increasing initial monomeric protein concentration, as observed in the case of α-Syn. Interestingly, for Aβ42 protein, the dynamics of aggregation are found to be governed predominantly by the monomer-mediated fibril generation process (Fig. S13†) for any given concentration of the protein. The Myb protein kinetics remains almost independent of the initial monomer protein concentration and TTR protein does not have any lag phase, so the analysis based on these time scales is not applicable in the case of these two proteins. In conclusion, the time scale analysis reveals that (τlag) and (τ50) are significant parameters to characterize the kinetics of polymerization for an aggregating protein. However, for proteins, which have intermediate structures, two additional parameters (in the form of (τDI) and θ) described in this work and the analysis of timing equilibria between different states, provide a basis to dissect the kinetics of amyloidogenesis in a better way.
Fig. 5 Analysis of the effect of seeding on the aggregation kinetics for different proteins. (a) Enhanced rate of the formation of the amyloid fibril in the presence of initial seed added in the form of basic intermediate like aggregates (I1 = 1.0, I2 = 0.5, I3 = 0.5, I4 = 0.5) in comparison to the control simulation performed for the WT (300 μM) α-Syn without any added seed. (b) Model simulation with only isolated partially folded intermediate predicts that the kinetics of amyloid fibril formation of these aggregates will have no lag time in comparison with monomeric protein (WT). The numerical simulation is performed (qualitatively similar to the experiment carried out to isolate the basic intermediate by Ghosh et al.51) by taking the initial conditions as (I4 = 10, AC = 10, M5 = 10), which set up the total protein concentration as [(10 × 20) + (10 × 5) + (10 × 5) = 300 μM]. (c) The model-predicted dynamics of Aβ42 protein for WT (10 μM) and in the presence of seed (I1 = 0.1). (d) Comparison of the model-predicted trajectory of Myb protein for WT (20 μM) and seeded dynamics (I1 = 0.2). Here, the numerical simulation has been performed with the parameter set provided in (Table S4†) for the respective protein using the best-optimized model (Table 3). |
Intriguingly, our model predicts that even for Aβ42 protein, the addition of the preformed fibril can evade the slow rate-determining nucleation step during aggregation in comparison to the WT situation (Fig. 5c). Similarly, for Myb protein, adding the seed material at the beginning of the aggregation process speeds up the growth process (Fig. 5d). According to our numerical simulation, even the seeding operation performed for TTR protein remarkably changes the kinetic profile of aggregation (Fig. S14†), which is commensurate with the experimental observation.
To begin with, this seems extremely difficult to achieve in our modelling framework, as our kinetic model does not contain any amino acid sequence information for a protein. Generally, the propensity of different amino acids to form different structures, such as partially folded intermediate or β-sheet rich fibril, under a given condition varies significantly. Now, if we know the relative tendencies to form different states (random coil, intermediate, or fibril) by the mutated sequences, it is possible to adjust the corresponding kinetic rate constant and get qualitative features of the kinetics of the mutant protein compared to the WT protein. To establish this hypothesis, we have chosen some mutants for α-Syn protein, which show faster kinetics than WT [A53T (fast mutant-1), A53V (fast mutant-2)] and some mutants known to behave oppositely by elongating the amyloid formation kinetics [A53K (slow artificial mutant-1), A53E (slow mutant-2), A30P (slow mutant-3)] compared to the WT α-Syn protein.63,64
The mutants such as A53V and A53T show a greater aggregation tendency to form fibril-like aggregates but have a reduced propensity to form the helical intermediate due to structural changes induced by the respective point mutations.63 On the contrary, the other set of mutants (A53K, A53E, and A30P) demonstrates a reduced propensity to form intermediate as well as fibril-like aggregates.64 In our modelling framework, we deal with such kinds of structural information's from a dynamical perspective by relatively altering the corresponding reaction rates associated with such different individual reaction mechanisms compared to the WT protein.
By comparing the relative tendencies of the particular mutant and fine-tuning the specific reaction rate constants, the model reproduces the kinetic behaviour in the case of both faster (Fig. 6a) and slower mutants (Fig. 6b), which qualitatively agrees with the experimental observations.63,64 This investigation summarizes an important inspection that if the relative inclination of any particular amino acid at different stages of fibrillization is known a priori, then it is possible to predict the aggregation propensity qualitatively for the respective mutant from our proposed model. Thus, our study can be very helpful in the context of monitoring the kinetics of mutants from a modelling perspective, as, experimentally following the dynamics of the mutant protein is a challenging and time-consuming task.
Fig. 6 The simulated dynamics of different mutants obtained for α-Syn protein. (a) The model (Model-1F) predicted relative time courses for the fast mutants [A53T (fast mutant-1, where Ks = 1.5, Ke = 0.6) or A53V (fast mutant-2, where Ks = 2.0, Ke = 0.6)] of α-Syn protein concerning the WT. (b) The model predicted relative trajectories of the slow mutants [A53K (slow mutant-1, where Ks = 0.9, Kc = 150), A53E (slow mutant-2, where Ks = 0.75, Kc = 130), and A30P (slow mutant-3, where Ks = 0.62, Kc = 110)] of α-Syn protein in comparison to the WT. For both (a) and (b), all other parameters are the same as provided in Table S4.† |
Fig. 7 The model reproduces different dynamical features of amyloid formation kinetics. (a) The model (where, Kx1 = 0.1, Ks1 = 0.1) qualitatively mimics the dynamics of α-Syn aggregation in the presence of an inhibitor such as squalamine70 in a concentration-dependent manner (see ESI Section-4†). (b) The model simulated trajectory of Aβ42 protein by increasing the primary nucleation (PN) rate by (Kx) 2 times [1], 10 times [2]. (c) Alteration in the amyloid formation kinetics of α-Syn protein by simulating the model with increasing the primary nucleation (PN) rate (Kx) by 5 times [1] and 10 times [2]. (d) The model-predicted dynamics of α-Syn protein by enhancing the effect of positive feedback (SF) mediated through the intermediate structure by increasing the rate (Kac) by 2 times [1] and 10 times [2]. (e) Acceleration of the dynamics of Aβ42 protein when the best-fitted model simulated with increasing rate (Kac, 1000 times, [1]). (f) Increasing the monomer mediated growth process (PS) of intermediate oligomers of α-Syn leads to elevated ThT response. Here, (PS) is stimulated by increasing the association rate of the elongation process (Ki), 50 times [1] in comparison to the WT case. In each case, the solid line represents the fitted trajectory and the dotted line represents the model-predicted dynamics, the simulation has been performed by keeping all other parameters the same as given in Table S4.† |
Finally, it can be concluded that by investigating a series of dynamical features of amyloidogenesis, we have established the best-possible interaction network for aggregation mechanism for a range of proteins by applying our generic modelling strategy. Therefore, this model may be used as a basis for making new predictions, which can be further verified experimentally. Our study reveals that facilitating the primary nucleation process can essentially lead to two distinct scenarios for α-Syn and Aβ42 proteins. For Aβ42 protein, a little increment in the rate of primary nucleation stimulates the aggregation process in a faster way (Fig. 7b) by reducing the lag time. In contrast, our model predicts that accelerating the primary nucleation process will slow down the aggregation kinetics in the case of α-Syn protein (Fig. 7c), as it causes rapid monomer depletion, which disrupts the monomer dependent elongation and secondary nucleation process. This implies that for α-Syn (case-1) protein, the monomer-dependent elongation and secondary nucleation processes predominantly drive the fibrillation mechanism. The rapid accumulation of the nucleus caused by the enhancement of the primary nucleation rate fails to compensate for the depletion of the monomer. Whereas for Aβ42 (case-2) protein, primary nucleation event majorly contributes to shaping the kinetics of aggregation. However, these are our model predictions, which need to be validated via experiments.
Importantly, our model analysis unravelled that the positive feedback facilitated through the higher-ordered structure (highest-ordered intermediate (case-1) structure or the highest degree of lower-order fibril (case-2)) in our proposed model either effectively (for α-Syn protein (Fig. 7d) or moderately (for Aβ42 protein (Fig. 7e) influences the kinetics of aggregation. Increasing the effect of the positive feedback accelerates the kinetics for both α-Syn and Aβ42 proteins, suggesting that positive feedback via intermediate structures has a critical contribution to the fibril growth process. This further explains why the monomer-dependent elongation and secondary nucleation processes appreciably affect the α-Syn protein aggregation. The higher degree of positive feedback regulation further supports the total monomer concentration-dependent aggregation pattern of α-Syn protein, as observed in Fig. 4.
We have considered that the aggregation process is reversible process, where equilibria exist between different states of the protein (random coil, intermediate, or fibril), and the proportion of various aggregated states changes with the advancement of the reaction. Intriguingly, our proposed model indicates that if the equilibrium is somehow shifted towards the intermediate structures (partially folded intermediate for α-Syn (case-1) protein or lower-order fibrillar structure for Aβ42 (case-2) protein) by fine-tuning the corresponding reaction rates, then one can obtain an even higher bounded ThT response (Fig. 7f and S16†). Though this prediction seems quite counter-intuitive, however, it can be rationalized, as this process may promote the formation of active seeds (intermediate (case-1) or lower order fibril (case-2)) for these two proteins. This predicted kinetics made from our modelling setup can be verified experimentally if the partially folded intermediate associated with the α-Syn or lower-order fibril for Aβ42 protein can be stabilized during the fibrillation process.73 However, this observation is only our model speculation, which demands further experimental verification.
Indeed, we observe, for α-Syn protein that if the transition to critical assembled state (SP) and conversion rate to an intermediate structure (TE) are inhibited together, then it lengthens (τlag) and (τ50) effectively than inhibiting any other microscopic processes. Thus, our model essentially provides a qualitative rationale about how to design a drug for preventing amyloid formation in the case of α-Syn protein and predicts that the drug will be highly efficient if it can affect early microscopic events (SP, TE) individually or collectively. Intriguingly, a similar analysis for Aβ42 protein unravels that fibril growth is mostly contributed by primary nucleation (PN) and monomer-mediated elongation process (PS), and inhibiting these two processes together will affect the lag-time (τlag) and the (τ50) to almost the same extent (Fig. 8c and d). It depicts that the monomer should be targeted to prevent the aggregation process in the case of Aβ42 protein. We displayed that this investigation with the help of our modelling framework would provide mechanistic insights into how to reduce aggregation by substantially affecting (τlag) and (τ50). The present systematic analysis unveils the relative contributions of various microscopic events that are controlling (τlag) and (τ50) in the case of α-Syn and Aβ42 protein (Fig. 8), which can be extended for different proteins. Moreover, it figures out the important microscopic processes that can be targeted therapeutically by synthesizing novel drugs to prevent amyloidogenesis.
Our model provides kinetics interpretations, which help to qualitatively describe the concentration dependency of the aggregation kinetics (Fig. 3) and seeding experiments (Fig. 5). The significant role of the intermediates has been established by quantifying the duration of intermediates and correlating that with τlag and τ50 at different total protein concentrations (Fig. 4). The best-fitting model obtained for the WT proteins was able to explain the kinetics of mutant proteins in a qualitative manner, which adds a new dimension to the kinetic modelling approach (Fig. 6). In the case of most of the previous theoretical models (e.g. Pallitto & Murphy),30 very few approaches have been taken where the different states introduced in the model have been characterized by various microscopic kinetic parameters. In the present work, one of the primary goals is to analyse and dissect the kinetics of aggregation describing each emerging state and connecting these states to distinct microscopic rates. Thus, the model provides the emerging kinetic information, which helps to estimate the principal microscopic state affected by several inhibitors (Fig. 7). These intriguing investigations lead to several novel predictions (Fig. 7) and help to sort out important microscopic steps (Fig. 8) that have to be targeted for design drugs.
However, further modifications in the current method can make it even more useful and robust. For example, the aggregation phenomena can happen irreversibly, but all the reaction steps in the proposed model have been considered reversible. Moreover, protein aggregation can happen through a more complex set of molecular events than considered in our case (Fig. 1). The variation in model configurations for protein aggregation can vary up to any extent. Considering bigger models will lead to more kinetic parameters. Therefore, we need to develop a better computational modelling setup as well as accurate experimental data to make the model parameters indefinable. Additionally, the proposed generic modelling approach has not been explored to study molecular crowding effects. However, our model can be extrapolated by fine-tuning the rates associated with the primary nucleation and related to other molecular events to accommodate the effect of crowding environment. Currently, the entire method and model have been built assuming that protein aggregates in a homogenous medium. Our investigation can be further modified to include the effect of a heterogeneous environment on aggregation. In summary, it can be concluded that our modelling approach effectively decodes various mechanistic details that are crucially governing the amyloidogenesis for various proteins. We believe that our computational method and the results described in this work will provide important insights to develop precise therapeutic strategies by targeting different microscopic events during the fibrillation process and will find wide applicability.
AIC = (−2 × LL) + (2 × p) |
Footnote |
† Electronic supplementary information (ESI) available. See DOI: 10.1039/d1sc03190b |
This journal is © The Royal Society of Chemistry 2021 |