Dustin P. Patterson a, Ankur M. Desai ab, Mark M. Banaszak Holl ab and E. Neil G. Marsh *abc
aDepartment of Chemistry, University of Michigan, Ann Arbor, MI 48109, USA. E-mail: nmarsh@umich.edu
bMichigan Nanotechnology Institute for Medicine and Biological Sciences, University of Michigan, Ann Arbor, MI 48109, USA
cDepartment of Biological Chemistry, University of Michigan, Ann Arbor, MI 48109, USA
First published on 20th September 2011
We evaluate a strategy for assembling proteins into large cage-like structures, based on the symmetry associated with the native protein's quaternary structure. Using a trimeric protein, KDPG aldolase, as a building block, two fusion proteins were designed that could assemble together upon mixing. The fusion proteins, designated A-(+) and A-(−), comprise the aldolase domain, a short, flexible spacer sequence, and a sequence designed to form a heterodimeric antiparallel coiled-coil between A-(+) and A-(−). The flexible spacer is included to minimize constraints on the ability of the fusion proteins to assemble into larger structures. On incubating together, A-(+) and A-(−) assembled into a mixture of complexes that were analyzed by size exclusion chromatography coupled to multi-angle laser light scattering, analytical ultracentrifugation, transmission electron microscopy and atomic force microscopy. Our analysis indicates that, despite the inherent flexibility of the assembly strategy, the proteins assemble into a limited number of globular structures. Dimeric and tetrameric complexes of A-(+) and A-(−) predominate, with some evidence for the formation of larger assemblies; e.g. octameric A-(+) : A-(−) complexes.
Fig. 1 A strategy for self-assembly of protein cages based on the symmetry imparted by the quaternary structure of the protein. In this case the assembly of a cubic cage based on a tetrameric protein is illustrated.
The assembly of individual protein subunits into higher order (quaternary) structures is essential for the biological function of many proteins. The design of proteins that self-assemble into well-defined, higher-order structures is an important goal that has applications in synthetic biology and in material science.1–9 In both natural and man-made protein assemblies, new properties emerge that are not manifested in the individual protein building blocks but arise from the assembly as a whole.
In Nature, complex biological structures are often assembled from repeating units of only one or two protein building blocks and their assembly is strongly guided by symmetry. Multimeric protein assemblies may be broadly classified as either extended (filamentous) or closed (cage-like) structures: examples of the former include collagen, fibrin, actin and tubulin,10–13 whereas the latter group is represented by icosahedral virus capsid proteins,14–15 ferritins,16 the pyruvate dehydrogenase core,17 the GroEL/GroES chaperonin complex18–19 and the proteasome complex.20 In each case, the quaternary structure is integral to the biological function of the protein: thus the fibers formed by collagen reflect that protein's structural role in connective tissue; the assembly of proteins into a capsid is required to package and protect the virus's genetic material; and the hollow barrel-like structure adopted by the proteasome subunits sequesters the protease active sites from the cellular milieu and prevents degradation of undamaged proteins.
The remarkable diversity of structural and functional properties exhibited by proteins suggests that assembling them into new quaternary structures would be a promising avenue for the construction of novel, responsive biomaterials.21–22 For example, the conditions for assembly could be made dependent on general properties that influence protein:protein interactions such as pH, ionic strength and redox potential, initiated by binding metal ions or other ligands, or by a specific enzyme-catalyzed modification of the protein.
There are numerous examples in which proteins and peptides have been assembled through various strategies into extended structures as fibers, gels or two-dimensional networks.4,23–28 Such structures lend themselves to the design of materials with physical properties that are responsive to stimuli such as those mentioned above. However, we have focused on the de novo assembly of cage-like structures, for which there are far fewer examples.29–30 These could be useful for the encapsulation, delivery and release of therapeutic agents, or the construction of multi-enzyme complexes.
Fig. 2 Cage structures that could be formed by the assembly of KDPG aldolase trimers (red) equipped with complementary coiled-coil linker domains (blue and yellow).
As linkers we designed two complementary sequences intended to adopt an antiparallel heterodimeric coiled-coil structure upon dimerization (Fig. 3). The coiled-coil motif is one of the best understood protein–protein interactions.33–34 Coiled-coils can be designed to be parallel or antiparallel, and either homodimeric or heterodimeric; furthermore, the strength of the interaction between the coils can be modulated by varying the length of the peptides.35 Because they are short, they can easily be grafted onto the N- or C-termini of other proteins and do not generally interfere with folding or activity.
Fig. 3 Design of hetero-dimeric antiparallel coiled-coil linker domains. Each domain was designed to encompass six heptad repeats with an antiparallel orientation being enforced by a “knobs-into-holes” Ile-Ala packing in the hydrophobic core and electrostatic interactions between the ‘e’-‘e′’ and ‘g’-‘g′’ interfaces.
The coiled-coil linker domains were based on the antiparallel designs developed by Oakley and Hodges.36–38 We considered an antiparallel orientation to be desirable because this should minimize steric interactions between the larger aldolase trimers. By using a heterodimeric design we aimed to exert some control over the assembly process because no higher order structures should form until proteins equipped with complementary coiled-coil sequences are mixed. As shown in Fig. 3, the coiled-coil linkers were designed to comprise 6 heptad repeats. The hetero-dimeric interaction was established by complementary electrostatic interactions between residues at the interfacial ‘e’ and ‘g’ positions. The antiparallel orientation was enforced by incorporating complementary “knobs-into-holes” packing of Ala and Ile residues at the hydrophobic ‘a’ and ‘d’ positions that could only occur in the antiparallel orientation.
Each coiled-coil-forming sequence was introduced at the C-terminus of KDPG aldolase followed by a His6 tag sequence to facilitate purification. Since one helical sequence is predominantly positively charged and the other negatively charged, we refer to these proteins as A-(+) and A-(−). Both proteins were expressed in E. coli as soluble proteins, although at lower levels than the wild-type enzyme. The purified proteins could be concentrated to ≥10 mg/ml without precipitation occurring and were stable for several days at 4 °C.
To confirm that the helical linkers did not result in misfolding of the protein, the activity of the fusion proteins was measured. kcat for the unmodified KDPG aldolase was found to be 2.6 ± 0.3 s−1, whereas kcat for A-(+) and A-(−) were 2.3 ± 0.2 s−1 and 2.2 ± 0.2 s−1 respectively. A 1:1 mixture of A-(+) and A-(−) exhibited similar activity, kcat = 2.3 ± 0.2 s−1, indicating that the addition of the helical sequence does not significantly perturb the structure of the aldolase domain.
We have also characterized the properties of the individual helical linkers (for details see supporting information), which could be expressed independently in E. coli. The isolated Helix-(+) was, as expected, unstructured in solution, as judged by its C.D. spectrum. Surprisingly, isolated Helix-(−) exhibited a C.D. spectrum characteristic of a highly α-helical peptide and appears to be a dimer as determined by sedimentation equilibrium ultracentrifugation, which suggests that it forms a homo-dimeric coiled-coil. Although the isolated Helix-(−) was not intended to be dimeric, when attached to the aldolase subunit we saw no evidence for self-dimerization of A-(−), as described below. Therefore, in the context of our design strategy, Helix-(−) behaved as intended.
Fig. 4 Analysis of A-(+) : A-(−) complex formation by SEC-MALLS. Detection is by refractive index (thin trace) and by MALLS (thick line). In all cases some tailing of proteins is observed. The number average molecular weight, Mn, of the eluted protein, calculated from MALLS, is shown plotted as a function of elution volume. A-(+) and A-(−) elute as monodisperse species as evidenced by the close correspondence of the MALLS and RI traces. The 1:1 mixture of A-(+) and A-(−) (bottom trace) clearly shows the presence of more than one species. Mn at the leading and lagging edges of the peak (marked by arrows) are consistent with the formation of an A-(+)2A-(−)2 tetramer and an A-(+)A-(−) dimer respectively.
The number average molar masses, Mn, for A-(+) and A-(−) (calculated from the MALLS data) were 93.9 kDa and 102 kDa respectively. These values agree well with the calculated molecular weights of ∼90 kDa for the two fusion proteins. The rms radii of A-(+) and A-(−), determined by light scattering, are 7.2 nm and 6.8 nm respectively. These radii are larger than that of the un-modified aldolase (rms radius 5.5 nm) and reflect the expected increase in size due to the addition of the linker sequences to the fusion proteins.
Having established that the fusion proteins were well behaved, we investigated the ability of A-(+) and A-(−) to assemble into larger structures. 1:1 mixtures of A-(+) and A-(−) were incubated together for 2 h at 4 °C and their oligomerization state analyzed by size exclusion chromatography with MALLS/RI detection. The chromatogram for the mixture (Fig. 4) shows that the sample contains several species with decreased retention times, indicating formation of higher order oligomeric structures by A-(+) and A-(−).
A powerful advantage of MALLS/RI detection is that it allows the number-averaged molecular weight of eluting species to be analyzed as a function of peak cross-section and thus information on the species present can be obtained even if they are not cleanly resolved by the column. The molecular weight distribution plot of the chromatogram corresponding to the 2 h incubation indicates at least two major species are present. At the leading edge of the peak the eluting protein species is characterized by Mn ∼360 kDa and an rms radius of 12.3 ± 0.3 nm, whereas at the lagging edge Mn is ∼180 kDa and the rms radius is 7 ± 1 nm. These molecular weights would correspond to the formation of an A-(+)2A-(−)2 hetero-tetramer and an A-(+)A-(−) hetero-dimer in the mixture. (To simplify nomenclature we use A-(+) and A-(−) to refer to the trimeric proteins; thus an A-(+)A-(−) hetero-dimer is understood to comprise a dimer formed between one A-(+) trimer and one A-(−) trimer.) The decrease in Mn in between the two plateau regions is best accounted for by the overlap of two peaks comprising the 360 kDa and 180 kDa species.
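The assignment of oligomeric state from the MALLS masses is simple arithmetic, and can be sketched in Python as follows. The function name and the use of the ∼90 kDa calculated trimer mass as the unit mass are illustrative choices made here, not part of the original analysis.

```python
# Illustrative sketch: estimate how many aldolase trimers make up a complex
# from its number-average molar mass (Mn) measured by MALLS.
# Assumption: every building block (an A-(+) or A-(-) trimer) has a mass of
# ~90 kDa, the calculated value quoted in the text.

TRIMER_MASS_KDA = 90.0


def trimers_per_complex(mn_kda: float, unit_kda: float = TRIMER_MASS_KDA) -> int:
    """Return the nearest whole number of trimer building blocks."""
    if mn_kda <= 0 or unit_kda <= 0:
        raise ValueError("masses must be positive")
    return max(1, round(mn_kda / unit_kda))


if __name__ == "__main__":
    for label, mn in [("leading edge", 360.0), ("lagging edge", 180.0)]:
        print(f"{label}: Mn ~{mn:.0f} kDa -> {trimers_per_complex(mn)} trimers")
```

With the plateau values reported here, 360 kDa and 180 kDa correspond to four and two trimers respectively, while the measured masses of the isolated fusion proteins (93.9 kDa and 102 kDa) both round to a single trimer.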
The sedimentation traces were subjected to enhanced van Holde-Weischet analysis39 to examine the distribution of sedimentation coefficients in each sample (Fig. 5). The unmodified aldolase trimer is extremely homogeneous and is characterized by s20w = 4.6 S. The introduction of the coiled-coil domains introduces slight heterogeneity into A-(+) and A-(−), and the median s20w values for A-(+) and A-(−) increase slightly to 5.2 S and 5.8 S respectively. The observed heterogeneity may result from the inherent mobility of the linker sequences, although some self-association of the A-(+) and A-(−) trimers is also a possibility.
Fig. 5 Van Holde-Weischet [G(s)] plots of sedimentation coefficient distributions for sedimenting proteins calculated from sedimentation velocity ultracentrifugation data for the parent KDPG-aldolase (red circles); A-(+) (purple diamonds); A-(−) (green squares) and the 1:1 mixture of A-(+) : A-(−) (blue triangles).
As expected, the 1:1 mixture of A-(+) and A-(−) is more heterogeneous and the van Holde-Weischet plot is characterized by a wider range of sedimentation coefficients that are larger than those for either individual protein (Fig. 5). The median s20w = 10 S for the mixture is close to that expected for an A-(+)A-(−) hetero-dimer. There is also a significant concentration of species with higher sedimentation coefficients ranging up to ∼25 S, indicating that higher order oligomers are formed. Assuming that these species are roughly spherical, the sedimentation coefficient distribution spans the range expected for an A-(+)2A-(−)2 hetero-tetramer. At the high end, the plot also indicates that even larger protein complexes, possibly as large as octamers, are present. Overall, the ultracentrifugation data are consistent with the results obtained by SEC-MALLS.
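The size ranges quoted above can be rationalized with a rough scaling argument. For compact, roughly spherical particles of equal density and partial specific volume, the sedimentation coefficient scales approximately as M^(2/3), because the buoyant mass grows as M while the frictional coefficient grows as M^(1/3). The Python sketch below applies that scaling. The reference value of 5.5 S (midway between the 5.2 S and 5.8 S medians measured for the isolated fusion proteins) and the function name are assumptions made here for illustration; open, cage-like complexes are not solid spheres, so the numbers are indicative only.

```python
# Rough estimate of sedimentation coefficients for complexes of n trimers,
# assuming compact spheres of constant density so that s scales as M**(2/3).
# Assumption: a single fusion-protein trimer sediments at ~5.5 S, the mean
# of the 5.2 S and 5.8 S medians reported for A-(+) and A-(-).

S_TRIMER = 5.5  # Svedberg units, illustrative reference value


def estimated_s(n_trimers: int, s_unit: float = S_TRIMER) -> float:
    """Estimate s (in S) for a complex built from n_trimers building blocks."""
    if n_trimers < 1:
        raise ValueError("n_trimers must be >= 1")
    return s_unit * n_trimers ** (2.0 / 3.0)


if __name__ == "__main__":
    for n in (1, 2, 4, 8):
        print(f"{n} trimer(s): ~{estimated_s(n):.1f} S")
```

Under these assumptions the estimates are roughly 9 S, 14 S and 22 S for complexes of two, four and eight trimers, which is broadly in line with the median of 10 S and the upper limit of ∼25 S observed for the mixture.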
Fig. 6 TEM images of A-(+) : A-(−) complexes after size exclusion chromatography. Left: representative field of view for protein complexes eluting with apparent Mr ∼10⁶–5 × 10⁵ Da. Right: representative field of view for proteins eluting with apparent Mr ∼5 × 10⁵–10⁵ Da. Representative structures corresponding to individual A-(+) or A-(−) trimers are circled; structures with morphologies consistent with “collapsed” dimeric, tetrameric and octameric complexes are indicated by rectangles, triangles and squares. Samples were prepared by negative staining with uranyl formate.
The individual subunits of the A-(+) and A-(−) trimers were resolved in the TEM image. The trimers are predominantly associated into discrete complexes with diameters of ∼10–20 nm. There was no evidence in any images for extended or fibrous structures. A range of particle sizes is evident, which is consistent with the results from size exclusion chromatography and analytical centrifugation. The process of depositing proteins on the grid and fixing the sample collapsed the open, cage-like structures that the A-(+) : A-(−) complexes are intended to adopt, so the 3-dimensional structures of the complexes cannot be inferred from the image. However, in a significant number of cases the number of A-(+)/A-(−) building blocks associated with the complex can be discerned; in particular, examples of complexes with two, four or higher numbers of building blocks are evident.
Fig. 7 Atomic force microscopy of proteins. Left: 2-D plots of surface height; right: 3-D reconstruction of the same data. Top image of A-(−) adsorbed on mica; middle image of A-(−) adsorbed on mica; bottom image of a 1:1 mixture of A-(−) and A-(+) adsorbed on mica. The scale bar in the AFM images represents 2 μm for all images. Differences in the x–y dimensions of particles are tip-induced artifacts.
It is unclear why AFM did not reveal more convincing evidence for the assembly of A-(+) and A-(−), since the AUC, TEM and MALLS data presented above clearly support the formation of multimeric structures. Our failure to observe larger protein complexes may reflect the method used to prepare samples for AFM, which involves using dilute solutions and extensive washing of the mica sheets. We observed that the unmodified aldolase protein adhered poorly to the mica surface, making it hard to image, whereas the modified A-(+) and A-(−) proteins adhere well and were readily imaged. This suggests that the individual coiled-coil sequences interact strongly with the surface, which is presumably not possible when they are mediating complex formation. Thus the complexes may adhere only weakly, and so are either removed during dilution and washing, or dissociate to individual A-(+) and A-(−) components.
Therefore, in our studies we wanted to test a fundamentally different approach to protein assembly that could be applied to a wide range of proteins; in the present case any protein that possesses a homo-trimeric quaternary structure. By introducing some flexibility between the aldolase building block and the coiled-coil linker domain we aimed to allow the components sufficient freedom to assemble into structures that are compatible with the symmetry inherent to the quaternary structure of the protein. Thus, rather than forcing a specific structure on the protein, our study asks the question: given the ability to assemble into higher-order structures, what structure(s) will form?
An important design criterion for creating protein assemblies is that the strategy should be generally applicable to many proteins. The use of coiled coils as a minimal structural unit to link larger proteins together appears to fulfil this criterion well. The coiled-coil interaction is robust and easily modulated through electrostatic interactions; moreover, because the coiled coils are genetically encoded, protein complexes could, if desired, be assembled in vivo. The aldolase fusion proteins retained the same activity as the unmodified enzyme and the isolated proteins remained as trimers, as judged by SEC and AUC. Furthermore, using a heterodimeric coiled-coil pair affords a measure of control over protein assembly because neither A-(+) nor A-(−) forms higher-order oligomers until they are mixed together.
There are many structures that could, in principle, form by oligomerization of the triangular building block represented by the KDPG aldolase trimer. A back-to-back dimer of A-(+) and A-(−) represents the simplest complex and tetramers, octamers and 20-mers (representing tetrahedral, octahedral and icosahedral complexes respectively) are the most highly symmetrical structures that could form. However, any closed structure comprising an equal number of A-(+) and A-(−) trimers would satisfy the valency rules imposed by the need for each positive helix to associate with a negative helix. Extended 1-D and 2-D networks of aldolase trimers could also potentially form. We conjectured that although many structures may, in principle, be compatible with our simple design rules, in practice relatively few will be stable, and that symmetrical structures that can form “closed shell” complexes would be favored.
We used several different experimental techniques to characterize the assemblies produced by mixing the A-(+) and A-(−) proteins. The different techniques require different methods of sample preparation, which may influence the distribution of structures formed; however, SEC-MALLS, TEM and AUC each provided data that were consistent with A-(+) and A-(−) forming relatively few structures, rather than a mixture of all conceivable structures. The exception was AFM, which provided less conclusive evidence for the formation of protein complexes; as noted above, this may be due to the sample adsorption characteristics.
The most prevalent structure, for which there is good evidence from SEC-MALLS and AUC, appears to be a dimer of trimers that would form by the back-to-back association of one A-(+) and one A-(−) trimer. This is clearly the simplest structure that could form, and indeed it would be surprising if it were not observed. More interesting is that more complex structures are formed in significant amounts. In particular, the SEC-MALLS and AUC data are consistent with the assembly of an A-(+)2A-(−)2 tetrahedral cage. AUC and TEM also point to the formation of some larger species as minor components. Formation of these larger structures would be expected to be entropically unfavorable with respect to forming a simple A-(+)A-(−) dimer. However, if forming a dimer imposes an unfavorable conformation on the protein that is relieved upon opening up of the structure to form tetramers or octamers, this would favor formation of these larger complexes.
We note that a tetrahedrally symmetrical A-(+)2A-(−)2 tetramer cannot form if the A-(+) and A-(−) trimers remain associated as homo-trimers because the correct coiled-coil interactions cannot form. However, if the individual A-(+) and A-(−) subunits equilibrate between trimers, so that hetero-trimers (comprising one A-(+) and two A-(−) subunits, or vice versa) are formed, then a tetrahedron can readily be constructed (Fig. 2). The “shuffling” of subunits between trimers, although hard to detect, may reasonably be expected to occur since they are non-covalently associated, and the monomeric form of the enzyme has been produced by a simple point mutation of the trimer interface.31
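The geometric argument can be checked with a small graph calculation. If each trimer occupies one triangular face of a polyhedron and each shared edge represents one coiled coil, then homo-trimers require that every (+) face touches only (−) faces, that is, the face-adjacency graph must be two-colourable. The Python sketch below tests this for a tetrahedron and an octahedron. The face labelling and function names are illustrative constructions introduced here, not part of the original work, and the model ignores linker flexibility.

```python
from collections import deque
from itertools import product


def tetrahedron_faces():
    """Face-adjacency graph of a tetrahedron: every pair of faces shares an edge."""
    faces = range(4)
    return {f: [g for g in faces if g != f] for f in faces}


def octahedron_faces():
    """Face-adjacency graph of an octahedron.

    Faces are labelled by the octant they point into, (+-1, +-1, +-1);
    two faces share an edge when their labels differ in exactly one sign.
    """
    faces = list(product((1, -1), repeat=3))
    return {
        f: [g for g in faces if sum(a != b for a, b in zip(f, g)) == 1]
        for f in faces
    }


def two_colourable(graph):
    """Breadth-first search that tries to give adjacent faces opposite charges."""
    colour = {}
    for start in graph:
        if start in colour:
            continue
        colour[start] = "+"
        queue = deque([start])
        while queue:
            face = queue.popleft()
            for neighbour in graph[face]:
                if neighbour not in colour:
                    colour[neighbour] = "-" if colour[face] == "+" else "+"
                    queue.append(neighbour)
                elif colour[neighbour] == colour[face]:
                    return False  # two like-charged trimers would share an edge
    return True


if __name__ == "__main__":
    print("tetrahedron from homo-trimers:", two_colourable(tetrahedron_faces()))
    print("octahedron from homo-trimers:", two_colourable(octahedron_faces()))
```

In this idealized model the tetrahedron fails the test, consistent with the statement that a tetrahedral A-(+)2A-(−)2 cage requires hetero-trimers, whereas the octahedron passes with four (+) and four (−) faces.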
The extent to which the mixture of complexes formed is governed by kinetic or thermodynamic factors is currently unclear. Varying the speed of mixing did not seem to significantly alter the distribution of complexes formed, which might suggest that the distribution represents a thermodynamic mixture. However, ultracentrifugation experiments, as described above, found no change in the sedimentation profile of the complexes over a 5-fold range of concentrations. This result suggests the complexes are not in dynamic equilibrium and may be kinetically trapped. Attempts to thermally unfold and re-anneal the mixture of complexes were unsuccessful as aggregation and precipitation occurred on cooling (D.P.P. unpublished observations).
Ideally, any strategy for engineering self-assembling protein cages would aim for the formation of a single, well-defined structure. Although we have not achieved this aim here, we note that these studies represent only the first iteration of the design. We believe there is considerable scope for optimizing the system to adopt one of the limited number of structures that appear to be energetically favorable. Optimization of the coiled-coil interaction by, for example, altering the strength, length or orientation (parallel vs. antiparallel) of the coiled coils, and/or adjusting the assembly conditions, could be used to favor the formation of one complex over other complexes of similar kinetic or thermodynamic stabilities, resulting in a unique design solution.
Cell pellets were resuspended in ice-cold Lysis buffer (50 mM Tris-HCl, 150 mM sodium chloride, 10% glycerol, pH 8.0) containing 1 mM β-mercaptoethanol and 0.1 mM PMSF, and lysed by sonication on ice. Cell debris was removed by centrifugation at 24,000 g for 15 min at 4 °C. The supernatant was decanted into another centrifuge tube and heated at 80 °C in a water bath for 30 min to produce a white cloudy precipitate. Precipitated material was removed by centrifugation at 24,000 g for 15 min at 4 °C. The supernatant was loaded onto a 5 mL column of Ni-NTA superflow resin equilibrated in buffer A (50 mM Tris-HCl, 300 mM NaCl, 2.5 mM imidazole, 1 mM β-mercaptoethanol, 10% glycerol, pH 8.0) at a flow rate of 0.5 mL/min using a Biologic HR FPLC (Biorad, CA) and the column washed with 30 mL of buffer A. Non-specifically bound proteins were eluted with a step gradient (20 mL at flow rate of 2 mL/min for each step) of increasing concentrations (7.5 mM, 27.5 mM and 52.5 mM) of imidazole in buffer A. Finally the His6-tagged fusion proteins were eluted with buffer A containing 500 mM imidazole. This yielded protein that was greater than 95% pure as judged by SDS-PAGE. Purified protein solutions were dialyzed twice overnight against 1.5 L storage buffer (50 mM Tris-HCl, 150 mM NaCl, 0.5 mM EDTA, 10% glycerol, pH 8.0) and stored at −20 °C. Protein concentrations were determined by UV absorption measured at 280 nm assuming a molar extinction coefficient ε280 = 12,900 M−1 cm−1.
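The concentration determination follows the Beer-Lambert law, A = ε·c·l. A minimal Python sketch of the calculation is given below. The extinction coefficient is the value quoted above; the 1 cm path length, the optional dilution factor and the function name are assumptions made for illustration.

```python
# Beer-Lambert estimate of protein concentration from absorbance at 280 nm.
# EPSILON_280 is the value quoted in the text; the default 1 cm path length
# and the dilution factor argument are assumptions made for this example.

EPSILON_280 = 12_900.0  # M^-1 cm^-1


def molar_concentration(a280: float, path_cm: float = 1.0,
                        dilution: float = 1.0,
                        epsilon: float = EPSILON_280) -> float:
    """Return the concentration in mol/L of the undiluted sample."""
    if a280 < 0 or path_cm <= 0 or dilution <= 0 or epsilon <= 0:
        raise ValueError("absorbance must be >= 0; other inputs must be > 0")
    return a280 * dilution / (epsilon * path_cm)


if __name__ == "__main__":
    a280 = 0.645  # hypothetical reading, not a value from the paper
    conc_um = molar_concentration(a280) * 1e6
    print(f"A280 = {a280} corresponds to about {conc_um:.0f} uM")
```

For example, a hypothetical reading of A280 = 0.645 in a 1 cm cuvette corresponds to a concentration of about 50 μM in terms of the chromophore unit to which ε280 refers.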
The fusion protein of aldolase with Helix-(+), A-(+), was found to bind nucleic acids tightly and so the purification procedure was modified slightly. To remove nucleic acids the protein was bound to a Ni-NTA column as described above and then washed with 30 mL of denaturing buffer (50 mM Tris, 6 M guanidinium-HCl, 300 mM NaCl, 5 mM imidazole, 1 mM β-mercaptoethanol, 10% glycerol, pH 8.0). The protein was refolded on the column using a decreasing linear gradient of denaturing buffer with buffer A (60 mL total volume, flow rate 0.5 mL/min) before being eluted with 500 mM imidazole.
GPC experiments were performed on an Alliance Waters 2695 separation module equipped with a 2487 dual wavelength UV absorbance detector (Waters Corporation), a Wyatt HELEOS Multi Angle Laser Light Scattering (MALLS) detector, and an Optilab rEX differential refractometer (Wyatt Technology Corporation). Three columns connected in series were employed to separate samples: TosoHaas TSK-Gel G 2000 PW 05761 (300 mm × 7.5 mm), G 3000 PW 05762 (300 mm × 7.5 mm), and G 4000 PW (300 mm × 7.5 mm). The column temperature was maintained at 25 ± 0.1 °C with a Waters temperature control module. The columns were equilibrated in PBS pH 7.4, and samples eluted at 1 mL/min. The number average molecular weight, Mn, was calculated from the MALLS and differential refractive index data using Astra 5.3.14 software (Wyatt Technology Corporation).
Imaging was performed under tapping mode on a Nanoscope IIIa Multimode AFM (Digital Instruments, Veeco Metrology, Santa Barbara, CA) equipped with an “E” scanner. Veeco DNP-10 Non-Condensed Silicon Nitride tips (force constant = 0.06 N/m) were used for imaging under HEPES Buffer (10 mM, pH 7.0) filtered with a 0.2 μm filter. Typical scan sizes were 5 μm × 5 μm or 2 μm × 2 μm and were taken at 0.5–1 Hz scanning speeds. Images were processed and height distribution determined using the program Gwyddion.
Footnote
† Electronic Supplementary Information (ESI) available. See DOI: 10.1039/c1ra00282a/
This journal is © The Royal Society of Chemistry 2011