Amresh Prakash,*a Vijay Kumar,*b Naveen Kumar Meena a and Andrew M. Lynn a
a School of Computational & Integrative Sciences, Jawaharlal Nehru University, New Delhi-110067, India. E-mail: amreshprakash@jnu.ac.in
b Centre for Interdisciplinary Research in Basic Sciences, Jamia Millia Islamia, Jamia Nagar, New Delhi-110025, India. E-mail: vijay9595@st.jmi.ac.in
First published on 30th May 2018
The N-terminal domain of the RNA binding protein TDP-43 (NTD) is essential to both physiology and proteinopathy; however, its folding/unfolding mechanism remains an open question. In this study, we have investigated the biophysical behavior of intermediate ensembles employing all-atom molecular dynamics simulations in 8 M urea, accelerated with high temperatures to reach unfolded states within a limited computation time. The cumulative results of the 2.75 μs simulations show that unfolding of the NTD at 350 K evolves through different stable and meta-stable intermediate states. The free-energy landscape reveals two meta-stable intermediates (IN and IU) stabilized by non-native interactions, which are largely hydrophilic and highly energetically frustrated. A single buried tryptophan residue, W80, undergoes solvent exposure to different extents during unfolding; this suggests a structurally heterogeneous population of intermediate ensembles. Furthermore, the structural properties of the IN state resemble those of the molten globule (MG) state, with most of the secondary structures intact. The unfolding of the NTD is initiated by the loss of β-strands, and the unfolded (U) states exhibit a population of non-native α-helices. These non-native unfolded intermediate ensembles may mediate protein oligomerization, leading to the formation of pathological, irreversible aggregates that are characteristic of disease pathogenesis.
After being ignored for almost six years since the discovery of TDP-43, the NTD has drawn significant attention recently; in the past few years, many studies have revealed that the NTD plays a role in the physiological functions of TDP-43 (ref. 14–18) as well as in its pathological aggregation and neurotoxicity.17,19–21 The solution and crystal structures of the NTD have been elucidated by NMR and X-ray crystallography. In 2014, Qin et al.14 reported that the NTD (residues 1–102) exists in equilibrium between well-folded and unfolded structures, and that the folded structure adopts a ubiquitin-like fold that binds single-stranded DNA. The first solution structure of the stably folded NTD (residues 1–77) was solved by Mompean et al. in 2016.22 The structure of the NTD is a canonical β-barrel structure consisting of six β-strands and a short α-helix with a topology of β1β2 αβ3β6β4β5 (PDB ID: 2N4P). The same research group later published the solution structure of a longer NTD (residues 1–102; PDB ID: 5MRG) and described the atomistic model structure for a dimeric NTD observed in vitro.23 Very recently, Afroz et al. solved the crystal structure of the NTD for the first time and reported that NTD-mediated TDP-43 oligomerization results in the physiological oligomers of TDP-43.24
The NTD is highly conserved among eukaryotes25 and is stable and well-folded at pH 2.0 to 8.6 and temperatures of 5 °C to 40 °C, including physiological conditions.22,23,26 Many recent structural and functional studies of the NTD were performed on His-tagged constructs.14,18,22–24 These studies clearly showed that the NTD is stably folded and functional. Recently, Tsoi et al.26 also showed that NTD1–80 without His-tags adopts a similar fold to Mompean's His-tagged NTD1–77.22
Mompean et al.22 reported that the conformational stability of the NTD is highest at acidic pH and decreases with increasing pH. However, the protein becomes unfolded only at pH > 10.0 because of mutual repulsion of anionic residues (∼12 in number). Recently, Tsoi et al.26 showed that at low pH, the NTD becomes destabilized, which leads to its monomeric form. They reported the presence of multiple native thermodynamic states (i.e. N1 and N2) at physiological conditions and suggested that these two native states correspond to pH-dependent protonation–deprotonation of the single His 62. Several studies have reported an interesting dichotomy in TDP-43, where the NTD regulates normal TDP-43 structure and function16,24,27 while also participating in driving TDP-43 aggregation in TDP-43 proteinopathies.15–17,19 TDP-43 forms physiological and reversible oligomers mediated by the NTD, and these physiological oligomers may serve as precursors for pathological and irreversible protein inclusions. Thus, the physiological oligomers of TDP-43 form when the NTD is stable and well folded, whereas unfolding of the NTD results in the formation of pathological inclusions.14,28 In this regard, biophysical studies of the folding dynamics and structural stability of the NTD will help to clarify the biological roles played by TDP-43 under physiological and pathological conditions.
High-temperature unfolding simulations are a useful approach for studying protein unfolding on computationally accessible time-scales.29 By starting from a relevant conformational state, these high-temperature MD simulations allow easier access to relevant intermediate states and can demonstrate that both native and non-native interactions contribute to stability. A large number of high-temperature MD simulations have demonstrated that the pathways of protein folding and unfolding largely obey the principle of microscopic reversibility.30–32 Also, the unfolding pathways for the same protein do not depend on the force-fields utilized in MD studies.33 Indeed, the unfolding pathway has been demonstrated to be temperature-independent.30,34 Therefore, high-temperature MD simulations have been employed to investigate the unfolding pathways or stabilities of several proteins.35–39
To increase the rate of the unfolding process, both experimental and computational methods utilize high temperature, high pressure, low pH, or chemical denaturants (guanidinium chloride or urea). In the present work, we have employed all-atom MD simulations to obtain detailed insights into the unfolding and thermostability of the NTD in 8 M urea at 300 K, 350 K, 400 K, 450 K and 500 K. The results of the cumulative microsecond simulations indicate (i) the presence of stable intermediates (IN and IU) and the transition state (TS) along the unfolding pathway; (ii) the formation of a molten globule (MG) during an early unfolding intermediate, IN; (iii) conformationally heterogeneous IU stabilized by non-native contacts and hydrogen bonds; (iv) initiation of unfolding through disruption of β3; and (v) the presence of residual structures in the unfolded state. This study may provide further insights into the unfolding of the N-terminal domain of TDP-43, which contributes to protein aggregation and is thus implicated in pathological roles in neurodegenerative diseases.
(1) |
Here, we described the free energy map as a function of two order parameters, that is, the root mean square deviation (RMSD) and fraction of native contacts (Nc). We also constructed an additional free energy contour map as a function of the radius of gyration (Rg) and Nc to validate the analysis.
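As an illustration of how such a free energy map can be built from two order parameters, the following minimal sketch applies the commonly used Boltzmann inversion, ΔG = −RT ln(P/Pmax), to a two-dimensional histogram. This is an assumption about the general technique, not a reproduction of eqn (1); the function name, bin widths and toy data are hypothetical.

```python
import math
from collections import Counter

K_B = 0.0083144626  # molar gas constant in kJ mol^-1 K^-1


def free_energy_landscape(x, y, temperature, x_width, y_width):
    """Boltzmann-invert a 2D histogram of two order parameters.

    x, y             : per-frame values of the two order parameters (e.g. RMSD, Nc)
    temperature      : simulation temperature in K
    x_width, y_width : histogram bin widths (hypothetical choices, not from the paper)

    Returns {(ix, iy): dG} in kJ/mol for populated bins only; dG = 0 marks
    the most populated bin.
    """
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("x and y must be non-empty and of equal length")
    counts = Counter(
        (math.floor(xi / x_width), math.floor(yi / y_width))
        for xi, yi in zip(x, y)
    )
    n_max = max(counts.values())
    kt = K_B * temperature
    return {bin_: -kt * math.log(n / n_max) for bin_, n in counts.items()}


# Toy data (assumed): three frames in a native-like bin, one in an unfolded-like bin.
rmsd = [0.12, 0.13, 0.14, 1.52]
nc = [0.92, 0.93, 0.94, 0.32]
fel = free_energy_landscape(rmsd, nc, temperature=350.0, x_width=0.1, y_width=0.05)
```

Empty bins are simply omitted rather than assigned an infinite free energy, which keeps the result easy to plot as a contour map.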
We analyzed the frustration in NTD during unfolding, and the minimally and highly frustrated contacts are depicted on the lowest-energy structure from the Boltzmann-reweighted structures from the MD simulations.
Control MD simulations were carried out in water and in 8 M urea at 300 K for 500 ns each (Fig. S1†). The results clearly showed that the NTD in both water and 8 M urea at room temperature was stable and close to its native structure without any unfolding propensity. In water, the Cα root mean-square-deviation (RMSD) did not deviate greatly and stabilized around a value of <1.0 nm; meanwhile, the radius of gyration (Rg), fraction of native contacts (Nc) and solvent accessible surface area (SASA) of the NTD also remained stable during the simulations. However, in 8 M urea, slight increases in RMSD, Rg and SASA were observed along with a slight decrease in Nc. Thus, to accelerate the unfolding kinetics, the simulations in the presence of 8 M urea were augmented with higher temperatures;30,39,52 the details of the conformational changes are summarized in ESI Table 1.†
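For readers unfamiliar with these order parameters, the sketch below shows one simple way to compute Rg and Nc from a set of Cα coordinates. It is a minimal illustration under stated assumptions: the Rg is unweighted (MD packages usually mass-weight it), and the 0.6 nm distance cutoff and minimum sequence separation of three residues are hypothetical choices, since the contact definition used in this work is not given in this section.

```python
import math


def radius_of_gyration(coords):
    """Unweighted radius of gyration of (x, y, z) points, in the input units."""
    n = len(coords)
    center = [sum(c[i] for c in coords) / n for i in range(3)]
    sq = sum(sum((c[i] - center[i]) ** 2 for i in range(3)) for c in coords)
    return math.sqrt(sq / n)


def native_contact_pairs(coords, cutoff=0.6, min_separation=3):
    """Residue index pairs (i, j) closer than `cutoff` in the reference structure.

    `cutoff` (nm) and `min_separation` are assumed values for illustration.
    """
    pairs = set()
    for i in range(len(coords)):
        for j in range(i + min_separation, len(coords)):
            if math.dist(coords[i], coords[j]) < cutoff:
                pairs.add((i, j))
    return pairs


def fraction_native_contacts(coords, native_pairs, cutoff=0.6):
    """Fraction of the reference contacts still formed in `coords`."""
    if not native_pairs:
        raise ValueError("native_pairs must not be empty")
    kept = sum(1 for i, j in native_pairs if math.dist(coords[i], coords[j]) < cutoff)
    return kept / len(native_pairs)


# Toy hairpin (assumed coordinates in nm) and a fully extended chain.
hairpin = [(0.0, 0.0, 0.0), (0.38, 0.0, 0.0), (0.76, 0.0, 0.0),
           (0.76, 0.4, 0.0), (0.38, 0.4, 0.0), (0.0, 0.4, 0.0)]
extended = [(0.38 * i, 0.0, 0.0) for i in range(6)]
native = native_contact_pairs(hairpin)
```

In this toy case the extended chain loses every reference contact and has a larger Rg than the hairpin, which is the qualitative signature of unfolding discussed in the text.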
As can be seen from the results, the NTD structure deviates greatly with increasing temperature, indicating progressive unfolding of the NTD. The shift in the probability distribution of RMSD toward higher values (Fig. 1a) and the shift toward lower values in Nc (Fig. 1b) clearly indicate loss of the native structure. At 350 K, the RMSD and Nc distributions reveal a distinct bimodal nature; this indicates the presence of two major states, native and non-native. The distribution at higher temperatures gradually shifts to a non-native state. Interestingly, the distributions also demonstrate the appearance of an intermediate state around an RMSD of ∼0.8 to 1.0 nm and an Nc of ∼50%, which is largely populated at 350 K; this suggests that the transition occurs at this temperature. Thus, at 350 K, Nc displays higher fluctuations during the simulation, indicating the presence of several intermediate states between the folded and unfolded states (ESI, Fig. S2b†).
In contrast, Rg remains relatively unchanged or slightly decreases with increasing temperature (Fig. 1c and S2c†), indicating that the protein assumes a compact molten globule structure upon the initial melting of the native structure. The transition from a compact molten globule-like state to a fully unfolded state (with a large Rg) occurs at much higher temperatures (450 to 500 K).
Fig. 2a shows the evolution of the side chain SASA of W80 along with the Cα RMSD during a 500 ns trajectory at 350 K. In the native state, W80 is buried, with a side chain SASA of ∼70 nm2. At 350 K, W80 undergoes different extents of solvent exposure at different stages of unfolding, consistent with an experimentally observed decrease in fluorescence.22,26 The first transition of solvent exposure (SASA ∼ 80 to 95 nm2) is evident for a short time scale of 50 ns (from ∼80 to 130 ns) in a native-like ensemble with RMSD < 1.2 nm. This ensemble is referred to as IN hereafter. The side chain SASA of W80 increases as IN unfolds further on a longer time scale of 100 ns (∼200 to 300 ns). This partially unfolded intermediate ensemble (referred to as IU) is populated with an RMSD of ∼1.5 nm. The IU ensemble further unfolds to the unfolded state, U, with RMSD > 1.7 nm. This unfolded state displays the transition of W80 between partially buried and solvent-exposed states. The fluctuation of SASA in this state is larger than in both IN and IU, suggesting that the environment around W80 is more malleable in the U state. Previous fluorescence experiments22 showed that solvent exposure of W80 is accompanied by a relatively small red shift in the λmax of W80 (from 328 to 350 nm); however, the reason for this shift is not clear. Here, we show that W80 in the U ensemble is fully solvent exposed in only a fraction of U molecules explored during the last ∼100 ns of the simulation (400 to 500 ns).
IU is a more structurally loose ensemble than IN and displays a loss of most of the Nc and secondary structures. Moreover, IU is much longer-lived than IN (in all temperature simulations), suggesting that most of the decrease in fluorescence intensity observed throughout the experiment is contributed by the IU ensemble. Further, W80 was also found to be transiently and infrequently buried in the IU and U ensembles, suggesting that the conformational ensembles are structurally heterogeneous and are populated at different stages of unfolding.
In addition to W80, a tyrosine residue (Y55) also shows transient fluctuations between partially buried and fully solvent-exposed states (ESI, Fig. S7†), suggesting structural heterogeneity within the IU ensemble.
The evolution of Rg (Fig. 2b) further suggests that the IN ensemble in which W80 is exposed is highly compact in nature, with Rg ∼ 1.30 nm, and remains native-like; this is also suggested by the RMSD (<1.0 nm). In contrast, the IU ensemble shifts slightly towards the unfolded side, with Rg > 1.70 nm and RMSD > 1.50 nm. We observed transient drops in the Rg and RMSD values, which suggests non-specific collapse of the structure. The evolution of Nc (Fig. 2c) further confirms that IN has a native-like structure (Nc ∼ 0.6), whereas the IU ensemble is more unfolded (Nc ∼ 0.35).
Interestingly, a comparison of the evolutions of the W80 side chain SASA and RMSD in higher temperature simulations (ESI, Fig. S3–S6†) displays a similar manner of unfolding, i.e. a compact molten globule native-like IN structure early in the unfolding process followed by a long-lived partially unfolded IU conformation and, eventually, the fully unfolded U conformation.
The results of the simulations presented in Fig. 2 and S3–S6† show that solvent exposure of W80 can occur on different time scales and is dependent on the temperature, suggesting that the reaction is non-cooperative and is not all-or-none. However, it remains to be determined whether this non-cooperativity is due to the presence of urea or increased temperature.
The evolution of the secondary structure was calculated using the DSSP program to obtain a detailed understanding of the structural changes during the unfolding process. Fig. 2d shows that the secondary structure of the NTD at 350 K remains fairly stable up to ∼100 ns, although the number of native contacts decreases to 60%. The unfolding process initiates with the disruption of β3 and β4 along with the loss of turns and bridges between these two strands. As unfolding continues, β6 is lost completely, followed by massive loss of the α-helix at the midpoint of the simulation. However, the N-terminal strands β1 and β2 remain mostly intact during the simulation.
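The kind of bookkeeping behind such a secondary-structure timeline can be sketched as follows. The example assumes that each trajectory frame has already been reduced to a string of standard DSSP one-letter codes (E for extended strand, H for α-helix), one character per residue; how those strings are extracted from the DSSP output is not shown, and the function names, the 50% threshold and the toy strings are hypothetical.

```python
def secondary_structure_fractions(dssp_frames, codes=("E", "H")):
    """Per-frame fraction of residues assigned to each DSSP code."""
    fractions = []
    for frame in dssp_frames:
        n = len(frame)
        fractions.append({code: frame.count(code) / n for code in codes})
    return fractions


def first_frame_lost(dssp_frames, residues, code="E", threshold=0.5):
    """Index of the first frame in which fewer than `threshold` of the given
    residues (0-based indices) carry `code`; None if that never happens."""
    residues = list(residues)
    for index, frame in enumerate(dssp_frames):
        kept = sum(1 for r in residues if frame[r] == code)
        if kept / len(residues) < threshold:
            return index
    return None


# Toy 8-residue "protein" (assumed data): a strand pair followed by a helix.
frames = ["EEEEHHHH", "EE--HHHH", "----HH--"]
fractions = secondary_structure_fractions(frames)
```

Applied to real per-frame assignments, `first_frame_lost` gives a crude ordering of which element melts first, which is the type of statement made above for β3, β4, β6 and the α-helix.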
The early unfolding steps at higher temperatures (450 K to 500 K) involve the sequential loss of β-strands at the C-terminus, followed by loss of the N-terminal strands and the α-helix (Fig. S3–S6†). Formation of a non-native helix was also evident during the simulation. Thus, the β-sheets were lost more quickly than the α-helix in urea at higher temperatures. These results are in good agreement with previously reported data.35,44
The contact map analysis implies that the loss of tertiary contacts during the unfolding pathway is gradual; however, the unfolding is initiated with weakening of the long-range contacts between β3–β6 and the short-range contacts between β4–β5 at 100 ns in the simulation (Fig. 3a). The contacts between β3 (52–56) and β-bridge residues (64–65) are also lost. Weakening of loop contacts (32–38) is observed in the IN ensemble and becomes more prominent in IU. Moreover, gains of some non-native contacts between the α-helix and β-hairpin residues (i.e. 40–70; 41–71), between the loop and β5 (i.e. 38–72), and between the β-bridge and β4 (i.e. 64–68) are observed in the IN ensemble. However, long-range contacts between β1–β6 and between the α-helix and β5, as well as short-range contacts involving the N-terminal strands β1 and β2 and the β-hairpin, remain intact (Fig. 3b); this indicates the preservation of this N-terminal core in the IN state.
During the next stage of unfolding, complete loss of long-range contacts between strands β3–β6 and short-range contacts between β4–β5 is clearly observed in IU. Furthermore, the contacts involving the α-helix and β5 are also disrupted completely. The contacts at both termini of the α-helix are weakened. Loss of many non-native contacts present in the IN ensemble is observed in IU, along with the formation of some new non-native contacts between the turn and β3 residues (i.e. 48–52 and 48–54). However, the long-range contacts between β1–β6, short-range contacts between β1–β2, and α-helical contacts remain intact (Fig. 3c), indicating that these N-terminal contacts are preserved in both the IN and IU ensembles.
The disruption of long range contacts between β1–β6 along with the appearance of new non-native contacts is observed in the transition state (∼300 to 400 ns) between the IU and U states. Interestingly, many short-range non-native contacts appeared between residues of β3 (52), β4 (67–68), bridge (37–38), loop (32–34), turn (47–50), and β5 (72), indicating the formation of a non-native unfolded transition state ensemble (Fig. 3d).
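The distinction between retained native, lost native and newly formed non-native contacts used in this analysis reduces to simple set operations on residue pairs. The following minimal sketch illustrates this; the segment names and residue ranges in the toy example are hypothetical placeholders, not the secondary-structure boundaries of the NTD.

```python
def classify_contacts(native, current):
    """Split the contacts of a frame relative to the native contact set."""
    return {
        "retained": native & current,    # native contacts still formed
        "lost": native - current,        # native contacts that have broken
        "non_native": current - native,  # contacts absent in the native state
    }


def segment_of(residue, segments):
    """Name of the segment containing `residue`, or 'other' if none does."""
    for name, (start, end) in segments.items():
        if start <= residue <= end:
            return name
    return "other"


def count_by_segment_pair(pairs, segments):
    """Count contacts per pair of segments, e.g. ('A', 'B') -> 3."""
    counts = {}
    for i, j in pairs:
        key = tuple(sorted((segment_of(i, segments), segment_of(j, segments))))
        counts[key] = counts.get(key, 0) + 1
    return counts


# Toy example with hypothetical segments "A" and "B".
segments = {"A": (1, 5), "B": (10, 15)}
native = {(2, 12), (3, 11), (1, 4)}
current = {(2, 12), (4, 20)}
result = classify_contacts(native, current)
lost_by_segment = count_by_segment_pair(result["lost"], segments)
```

Grouping the lost and non-native sets by segment pair is what allows statements such as "contacts between two strands are lost while contacts within the N-terminal core persist" to be read off a contact map.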
The IU ensemble further unfolds with predominant loss of the α-helix. However, the contacts involving β1, β2 and the β-hairpin remain intact, indicating the presence of a residual structure even in the unfolded conformation with Nc < 20% (Fig. 3e). The existence of a residual tertiary structure in the unfolded state may be due to the change in the environment of the aromatic residues (Trp, Tyr, and Phe) as the protein unfolding commences (see Fig. 2a and S7†). Also, a transient drop in Rg for the U ensemble is consistently observed, suggesting the presence of a collapsed structure in the U ensemble (Fig. 2b). The presence of residual structure in denatured states has been previously reported for many proteins.52–55
Furthermore, snapshots of the unfolding events at 350 K are shown in Fig. 4. The structural snapshot at 50 ns corresponds to a native (N) structure, where all the native tertiary contacts and secondary structures remain intact. Moving from the N state to IN at 100 ns, we can clearly observe disruption of the β3 and β4 strands; however, the N-terminal strands β1 and β2 and the α-helix remain intact. The conformation representing the IU ensemble (200 to 300 ns) is largely unfolded, with melting of the α-helix along with loss of β5 and β6. The structural snapshot of the transition structure (400 ns) shows further loss in the single α helix, along with the presence of non-native helical structures. The unfolded ensemble at 500 ns is characterized by the presence of largely unfolded and disordered structures. The unfolded state still contains the residual structure with intact β1 and β2, and the α helix is almost completely melted. Snapshots depicting the complete unfolding pathway of the NTD at a higher temperature (400 K) are shown in the ESI, Fig. S8.†
Fig. 4 Snapshots corresponding to the N, IN, IU, TS and U conformational states observed during the unfolding pathway of the NTD at 350 K.
The protein then visits a much broader energy basin via a transition state; within this broader phase region, the protein explores two distinguishable minima separated by a very small energy barrier of <4.0 kJ mol−1. The first minimum (referred to as IU) is achieved from 200 to 300 ns of the simulation, with an RMSD of ∼1.50 nm and Nc of ∼0.35. IU is slightly swollen compared to the N state, with Rg ∼ 1.7 nm. The second minimum (referred to as U) displays a largely populated phase space with an RMSD of ∼1.50 to 1.75 nm and the presence of ∼20% of native contacts.
Thus, the IN ensemble is a short-lived intermediate ensemble that is observed for 50 ns (80 to 130 ns) and is structurally compact (Rg ∼ 1.3 nm) and native-like (Nc ∼ 0.6) with most of the secondary structures intact but with a significant loss of tertiary contacts. These structural features are characteristic of the molten globule state. However, the IU ensemble is a looser structure (Nc ∼ 0.35; Rg ∼ 1.8 nm), with significant loss of secondary and tertiary structures. FEL at higher temperatures (400 K to 500 K) exhibited broad prominent minima with RMSD > 1.5 nm and Nc ∼ 0.1, representing the unfolded (U) state of the protein (Fig. S9†). For simplicity, we have shown two FEL contour maps in each row. The first column depicts the FEL calculated as a function of the Nc and RMSD pair, whereas the second column shows the FEL against Nc and Rg. The overlap between these two plots clearly suggests excellent agreement for the reported configurations.
The FEL involving the number of intraprotein hydrogen bonds and the number of native contacts at 350 K is shown in Fig. 6. Remarkably, the number of intraprotein hydrogen bonds decreases from ∼40 to ∼20 along the unfolding pathway as the number of native contacts decreases from ∼70% to ∼20%. Thus, the significant decrease in non-native hydrogen bonds suggests the absence of misfolded states during the unfolding of NTD. However, at higher temperatures (400 K to 500 K), intraprotein hydrogen bonds are retained as the protein becomes completely unfolded, indicating the persistence of non-native hydrogen bonds (Fig. S10†).
Fig. 6 FEL of the number of intraprotein hydrogen bonds versus the number of native contacts at 350 K.
Fig. 7 Pairwise distance distribution of representative hydrophobic contacts during the simulation at 350 K.
As can be seen from the results, the contacts in N are essentially hydrophobic in nature, whereas the contacts in IU are predominantly hydrophilic. The number of hydrophilic contacts in the IU ensemble is significantly higher than in the N and IN states, and its hydrophobic contacts are also more numerous than in the IN ensemble. In contrast, IN exhibits equal numbers of hydrophilic and hydrophobic interactions. This result suggests that IU is stabilized by a greater number of non-native hydrophilic–hydrophilic and hydrophobic–hydrophobic interactions. We therefore suggest that the stability of the intermediates causes the NTD to be highly stable both thermodynamically and kinetically.
We employed the ‘Frustratometer’ web server to compute local frustration in the native structure and the intermediates. According to the principle of minimal frustration, contacts in the native state should be minimally frustrated, i.e. energetically favorable. It has been previously shown that a large fraction of the native contacts in globular proteins are minimally frustrated (∼40%), while ∼10% of the contacts are highly frustrated in the native state.63,67 These highly frustrated contacts map to functional sites and are thought to be evolutionarily conserved.68
We calculated the configurational frustration index of the NTD, as shown in Fig. 8a. The NTD has fewer minimally frustrated contacts than a typical protein: among all the native contacts, 26% are minimally frustrated (Fig. 8a, green line), compared with 37% in a typical protein, as shown by Ferreiro et al.63 However, the minimally frustrated contacts form a cluster around the protein core, as is observed in most proteins. The protein core consists of four β-strands (β1 to β4) and an α-helix. Importantly, the highly frustrated contacts comprise 5% of the total contacts, which is less than what is expected in a globular protein (10%) (Fig. 8a, red line). A single large highly frustrated contact cluster is found in the β-hairpin region (residues 55–65) between β3 and β4. This region is unique in the NTD, and the nearby β5 strand is linked to the rest of the protein through H-bonds and hydrophobic interactions. Taken together, the NTD has fewer highly frustrated contacts and fewer minimally frustrated contacts, suggesting a different energy landscape compared to typical proteins.
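Percentages of this kind follow from classifying each contact by its frustration index. The sketch below assumes the thresholds conventionally used with this type of analysis (index > 0.78 for minimally frustrated, index < −1 for highly frustrated, neutral otherwise); these cutoffs are not stated in this section and are therefore an assumption, and the index values in the example are invented for illustration.

```python
def frustration_fractions(indices, minimal_cutoff=0.78, high_cutoff=-1.0):
    """Fractions of minimally, highly and neutrally frustrated contacts.

    indices        : per-contact frustration index values
    minimal_cutoff : contacts above this value count as minimally frustrated (assumed)
    high_cutoff    : contacts below this value count as highly frustrated (assumed)
    """
    n = len(indices)
    if n == 0:
        raise ValueError("indices must not be empty")
    minimal = sum(1 for f in indices if f > minimal_cutoff)
    high = sum(1 for f in indices if f < high_cutoff)
    return {
        "minimal": minimal / n,
        "high": high / n,
        "neutral": (n - minimal - high) / n,
    }


# Invented index values for ten contacts.
toy_indices = [1.2, 0.9, 0.1, -0.3, -1.5, 0.0, 0.5, -0.9, 2.0, -2.0]
toy_fractions = frustration_fractions(toy_indices)
```

Running the same function on the contact lists of N, IN, IU and U would give directly comparable fractions for each state.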
Fig. 8 Residual frustration analysis in the selected states of the NTD. (a) Local frustration is depicted on the native NTD structure (PDB ID: 2N4P). The large cluster of minimally frustrated interactions (green) defines the core of the protein, whereas some highly frustrated interactions (red) occur on the surface of the protein. Local frustration in the (b) IN, (c) IU, and (d) U states depicting the large changes in frustration profiles. (e) Energetically favorable residue pair contacts of the NTD. (f–h) Contact maps of the corresponding transitions of the NTD compared with the native state, along with their frustration profiles. The distinct non-native clusters in the intermediate states are circled in black. Within the β-hairpin region (residues 55–62), many frustrated contacts remain unformed in the intermediate states (orange box).
The FEL of the NTD in 8 M urea at the transition temperature (350 K) indicates the presence of highly populated long-lived intermediates, IN and IU, during the 500 ns simulation run (Fig. 5a and b). The intermediate, IN, with Rg ∼ 1.3 nm is unusually compact, and the structure generally contains a three-stranded β-sheet core (β1, β2, and β6) (Fig. 8b). Meanwhile, the second intermediate ensemble, IU, contains a two-stranded β-sheet core (β1 and β2) (Fig. 8c). The IU ensemble displays a higher number of highly frustrated contacts (10%, red line) compared to IN (5%) and fewer minimally frustrated contacts (24%, green line) than IN (28%). The unfolded ensemble, U, displays more highly frustrated contacts (Fig. 8d).
To establish how the contact maps of the unfolding intermediates display residual frustration, we plotted the contact maps according to their frustration indices; these are shown in Fig. 8e–h. In the first intermediate structure, IN, residues of the α-helix (40–43) form many non-native contacts with β2 (28–29), β3 (55) and β-hairpin residues (57–59) (Fig. 8f, circled black dots) compared to the N structure (Fig. 8e). In addition, Val69 of β4 interacts non-natively with the residues of the loop region between β5 and β6 (80–84) and with Tyr85 and Val86 of β6. The residual local frustration reveals that these non-native contacts are minimally frustrated (Fig. 8b) and can stabilize the intermediate state through energetically favorable contacts that are not observed in the native state.
Similarly, the IU ensemble also displays the formation of non-native contacts between β1 (17–19) and Val69 (β4), Ile72 (β5) and Val87 (β6) (Fig. 8g, circled black dots). Also, β6 (87–88) interacts non-natively with the loop region (32–35) between β2 and the α-helix. These non-native contacts are minimally frustrated, imparting stability to the IU state. In addition to favorable interactions, IU also exhibits highly frustrated non-native contacts involving the loop region (32–39) and β-hairpin (55–65) (Fig. 8g).
Thus, whereas the core of the structure is stabilized by both native and non-native favorable interactions, the β-hairpin structure surrounding β3 and β4 does not show any particular native or non-native contacts in either IN or IU (Fig. 8f and g, orange box). Moreover, in this region, favorable interactions are absent and some highly frustrated interactions are found. We therefore expect that both the presence of local frustration and the rugged landscape of this β-hairpin region play important roles in the stabilization of the unfolding intermediates of the NTD. Indeed, this region behaves very similarly in the unfolded ensemble, as can be seen in Fig. 8h.
Footnote
† Electronic supplementary information (ESI) available. See DOI: 10.1039/c8ra03368d
This journal is © The Royal Society of Chemistry 2018 |