Matteo T. Degiacomi and Justin L. P. Benesch
Department of Chemistry, Physical & Theoretical Chemistry Laboratory, South Parks Road, Oxford, OX1 3QZ, UK. E-mail: matteo.degiacomi@chem.ox.ac.uk; justin.benesch@chem.ox.ac.uk
First published on 23rd November 2015
We present EM∩IM, software that allows the calculation of collision cross-sections from electron density maps obtained, for example, by transmission electron microscopy. This allows structures other than those described by atomic coordinates to be assessed against ion mobility mass spectrometry data, and provides a new means for contouring and validating electron density maps. EM∩IM thereby facilitates the use of data obtained in the gas phase within structural biology studies employing diverse experimental methodologies.
A number of different algorithms,9–14 tailored to specific applications, have been written to calculate the CCS of a given three-dimensional structure, allowing IM measurements to be related to structures derived from X-ray crystallography, NMR spectroscopy, or atomic modelling.15–17 These algorithms are, however, limited to taking as input a coordinate file (e.g. pdb or xyz format) that specifies the position in space of each constituent atom, meaning that detailed comparisons with structures displayed as volumes (e.g. density maps obtained by means of transmission electron microscopy, EM18,19) have not been possible. Here we present EM∩IM,‡ a computational tool that allows the display and interrogation of EM maps from the standpoint of IM-MS data. This lets the user relate data from the two experimental techniques directly, both by calculating a CCS from an electron density map and by exploiting IM data to augment the interpretation of EM data.
An electron density map is typically a three-dimensional grid, with each voxel having a certain density value. In general, such a map is displayed as a volume demarcated by an isodensity surface, which is generated by specifying a contour-level, the lower electron density threshold (ρ*) for a voxel to be considered occupied. The more stringent this threshold is, the fewer voxels match the electron density criterion, and the smaller the resultant volume. Furthermore, as the electron density is typically anisotropic,18 changing the threshold can result in different shapes. Yet, despite its importance, defining the appropriate threshold is difficult, and particularly so for low resolution maps (>10 Å), where secondary structure elements are not readily identifiable.20
Our fundamental premise in designing EM∩IM was to allow calculation of mass and CCS, two physical quantities obtained in an IM-MS experiment, from an electron density map. The former is achieved simply based on the number of voxels exceeding a given electron density threshold and the voxel volume, converted into a mass using a protein density (typically 0.84 Da Å−3 (ref. 21)). To determine the CCS of an electron density map, EM∩IM converts it into a coordinate file in which hard-sphere pseudo-atoms are centred in voxels if they satisfy the given electron density threshold criterion. This approach returns a bead model similar to those generated by SEDI, an algorithm designed to generate high-resolution isodensity surfaces for small molecules.22 This coordinate file is then used to calculate a CCS using IMPACT,11 called directly from within the program, and adjusted using an empirical scaling factor to facilitate comparisons with experimental data.23
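The two conversions described above can be sketched as follows. This is a minimal illustration using NumPy, not the EM∩IM implementation: the function names, the toy map, the use of cubic voxels of edge `voxel_size` (in Å), and the choice of "greater than or equal to" for the threshold test are all assumptions made for the example. Only the protein density of 0.84 Da Å−3 is taken from the text. The CCS calculation itself (performed by IMPACT) and the empirical scaling factor are not reproduced here.

```python
import numpy as np

PROTEIN_DENSITY = 0.84  # Da per cubic angstrom, value quoted in the text


def mass_from_map(density, threshold, voxel_size):
    """Estimate a mass (Da) from the voxels at or above the threshold."""
    n_occupied = int(np.count_nonzero(density >= threshold))
    return n_occupied * voxel_size ** 3 * PROTEIN_DENSITY


def beads_from_map(density, threshold, voxel_size, origin=(0.0, 0.0, 0.0)):
    """Return an (N, 3) array of pseudo-atom centres, one per occupied voxel."""
    indices = np.argwhere(density >= threshold)  # integer (i, j, k) indices
    # Place each bead at the centre of its voxel (assumed convention).
    return (indices + 0.5) * voxel_size + np.asarray(origin)


# Toy map: a 4x4x4 grid with a 2x2x2 block of high density in the middle.
toy = np.zeros((4, 4, 4))
toy[1:3, 1:3, 1:3] = 1.0

mass = mass_from_map(toy, threshold=0.5, voxel_size=2.0)
beads = beads_from_map(toy, threshold=0.5, voxel_size=2.0)
```

The resulting `beads` array could be written out as a coordinate file and passed to any CCS calculator that accepts hard-sphere models.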
Our approach therefore provides the framework for displaying an EM map in a way that is consistent with mass and CCS data. To realise this, upon loading a map, EM∩IM performs mass and CCS calculations at a wide range of thresholds. The user is thereby able to retrieve the map that best matches experimental mass and/or CCS, or to explore the electron density as a function of IM-MS observables. EM∩IM incorporates a graphical user interface that allows the visualisation of the electron density and display of appropriate graphs, all of which can be exported in a variety of file formats.‡
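A threshold scan of this kind might look like the sketch below, which covers only the mass observable. The function names, the evenly spaced threshold grid, and the toy map are assumptions made for illustration; a CCS scan would follow the same pattern, with an external CCS calculator replacing the voxel count.

```python
import numpy as np


def scan_thresholds(density, voxel_size, n_steps=50, protein_density=0.84):
    """Return (thresholds, masses) over an evenly spaced range of thresholds."""
    thresholds = np.linspace(density.min(), density.max(), n_steps)
    voxel_mass = voxel_size ** 3 * protein_density  # Da per occupied voxel
    masses = np.array([np.count_nonzero(density >= t) * voxel_mass
                       for t in thresholds])
    return thresholds, masses


def best_threshold(thresholds, values, target):
    """Pick the scanned threshold whose computed value is closest to target."""
    return thresholds[int(np.argmin(np.abs(values - target)))]


# Toy map whose 27 voxels hold the density values 0, 1, ..., 26.
toy = np.arange(27, dtype=float).reshape(3, 3, 3)
thresholds, masses = scan_thresholds(toy, voxel_size=1.0, n_steps=27)

# Ask for the threshold that leaves ten voxels' worth of mass.
rho_star = best_threshold(thresholds, masses, target=10 * 0.84)
```

Because the mass falls monotonically as the threshold rises, a nearest-value lookup is sufficient here; a finer grid or interpolation would be needed for a precise match.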
We used EM∩IM to determine the two thresholds that reproduce, respectively, the mass (801 kDa, solid blue line) and the CCS (245 nm2, red) of GroEL (Fig. 1A, lower panel). These ρ* values are similar, indicating that, given the mass, the CCS can in this case be estimated from the electron density to good accuracy (245 nm2, dashed blue line). Comparing the electron densities returned by filtering according to either mass or CCS with the GroEL crystal structure (PDB: 1SS8) reveals excellent correspondence in both cases (Fig. 1A, inset). A similar analysis for β-galactosidase returns two thresholds based, respectively, on the known mass (465 kDa, solid blue line) and CCS (159 nm2, red) (Fig. 1B, lower panel). In this case, the ρ* values are very different: the electron density contoured according to mass corresponds to a very inaccurate CCS (743 nm2, dashed blue line) and gives a very poor fit to the crystal structure (PDB: 3IAP) (Fig. 1B, inset, left). Conversely, the density that is contoured according to CCS is in excellent agreement with the crystal structure (Fig. 1B, inset, right). It appears therefore that CCS is a more reliable means for obtaining a good electron density threshold than mass. This is likely due to mass being particularly prone to inaccuracies caused by regions of the protein not being well represented in the electron density, whereas the CCS is directly dependent on the demarcation of the molecular surface, rendering it extremely sensitive to noise in the electron density that appears outside the perimeter of the protein.
To capitalise on this sensitivity, we generated β-galactosidase maps at resolutions varying from 3 Å to 20 Å. For all resolutions, the CCS decreases rapidly as the threshold is increased, before reaching a plateau where it remains relatively constant, and then decreasing rapidly again (Fig. 2). The higher the resolution, the more “step-like” this trend appears, such that at 3 Å the CCS is largely invariant for the majority of the thresholds examined. By fitting a sigmoid function to the data we were able to determine the points of inflection (i.e. where the slope is least negative) for each resolution (Fig. 2, white circles). In all cases, these points of inflection occur at CCS values within 10% of each other and of that calculated from the crystal structure (dashed line). Notably, the plots obtained for the different resolutions intersect with each other within a very narrow range, with the average intersection point (white square) occurring within 3% of the crystal structure CCS. Conversely, plots of mass versus threshold do not display similar features that might signpost the correct mass (Fig. S1†). These observations indicate that the CCS is an effective parameter for edge-detection within molecular volumes, and reveal potential routes for the coarse estimation of CCS from an EM map (and concomitantly the determination of an appropriate threshold): either by determining the point of inflection within the trend of CCS as a function of threshold, or by calculating points of intersection between plots obtained for down-sampled density maps.
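The idea of locating the plateau can be illustrated with a short sketch. Note that it does not reproduce the sigmoid fit used in the text: as a simpler stand-in, it takes a finite-difference derivative of the CCS-versus-threshold curve and returns the point where the slope is least negative. The function name and the synthetic cubic curve are assumptions made for the example, and real curves would probably need smoothing before differentiation.

```python
import numpy as np


def plateau_point(thresholds, ccs):
    """Return the (threshold, CCS) pair where dCCS/dthreshold is least negative."""
    slope = np.gradient(ccs, thresholds)  # finite-difference derivative
    i = int(np.argmax(slope))             # least negative slope
    return thresholds[i], ccs[i]


# Synthetic curve: steep drop, plateau around t = 0.5, then steep drop again.
t = np.linspace(0.0, 1.0, 101)
ccs = 150.0 - 400.0 * (t - 0.5) ** 3

t_star, ccs_star = plateau_point(t, ccs)
```

On this synthetic curve the function recovers the centre of the plateau; for noisy data, fitting a smooth function as described in the text is likely to be more robust than direct differentiation.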
Fig. 3 Improving the prediction of CCS from mass. (A) Plot of CCSX-RAY against the CCS estimated from protein mass via the mass-derived threshold (Fig. 1), for 35 synthetic electron densities generated for a range of proteins of different sizes. The trend is linear, but with significant deviation from a 1:1 correspondence. (B) Examination of the data reveals that, at high resolutions, the mass-derived threshold is smaller than the CCS-derived threshold, with the opposite holding true at low resolution. Fitting their ratio with a sigmoid function allows for a correction of the mass-derived threshold, and a resolution-calibrated CCS estimation, CCSEM. (C) When comparing the uncorrected, mass-based estimate to CCSX-RAY, an average error (dashed line) of 8.2% is obtained (A). The same comparison for the corrected prediction, CCSEM, returns a much reduced average error of 1.2%, with all maps having errors <5%. Comparing CCSEM to CCSIM shows that the experimental CCS of GroEL is poorly predicted, reflecting the collapsed gas-phase conformation relative to the solution structure.25 Without these known outliers (*), the average error is 4.5%. This demonstrates that a calibrated use of mass is an effective means for extracting a CCS from an EM density, and that this CCSEM is accurate enough to identify conformations differing between solution and gas-phase measurements.
To examine the relationship between the mass-derived and CCS-derived thresholds in more detail, we computed their ratio for each of the 35 maps, and plotted it as a function of the electron density resolution. A clear trend is observed (Fig. 3B): at high resolutions (≲5 Å), we find that the mass-derived threshold is typically smaller than the CCS-derived threshold (i.e. their ratio is below 1), whereas the opposite is true at lower resolutions (≳5 Å). This means that the CCS obtained at the mass-derived threshold will be an overestimate of CCSX-RAY in the case of high resolution EM data, and an underestimate for low resolution EM data. To compensate for this phenomenon, we fitted the relationship between this ratio and the resolution (Fig. 3B) to provide a means to rescale the mass-derived threshold, and thereby obtain an improved estimate of CCS, CCSEM. Comparison of CCSEM with CCSX-RAY reveals a reduction in error to an average of 1.2% (Fig. 3C and S2B†). This error is less than the experimental uncertainties typical for CCS measurements,11,24 indicating that using a calibrated mass-defined threshold can lead to an acceptable CCS estimation as the basis for comparison between IM-MS and EM data. The scaling function, by virtue of being derived from a wide range of masses and resolutions, is general in its utility; however, the user could input alternatives derived from an appropriate calibration set into EM∩IM to enable even lower error within a targeted window.
To test the selectivity of this approach, we compared CCSEM to published values obtained from IM-MS experiments,24 CCSIM (Fig. 3C). The average error is 4.5%, not including five outlying data points, all of which correspond to GroEL. This is in line with CCSIM and CCSX-RAY for this protein being known to differ, with the gas-phase conformation of GroEL being partially collapsed relative to that in solution.25 Our results demonstrate therefore that CCSEM is an informative measure, allowing the use of experimental CCS measurements to distinguish conformations different from those represented in a given EM density.
Fig. 4 Application of CCSEM to assessing experimental electron densities. (A) Examining the relationship between CCSEM and both CCSX-RAY and CCSIM reveals very large errors. The same comparisons after applying a de-noising filter implemented in EM∩IM result in vastly reduced errors, reflecting the selectivity observed in the synthetic data (Fig. 3C). (B) Comparison of CCSEM and CCSX-RAY for five correct (top) and five incorrect (bottom) GroEL initial models generated using various EM single-particle analysis algorithms.29 All of the correct reconstructions gave low errors (blue, percentage difference indicated), whereas three of the incorrect reconstructions gave large errors (red). This demonstrates that CCS measurements could be an effective means for validating or rejecting 3D models generated during EM data analysis.
We hypothesised that these errors arise from the presence of noise, a common feature of experimental density maps (e.g. Fig. 1B),18 not present in the synthetic maps considered above (Fig. 2 and 3). To address this challenge, we implemented a de-noising filter in EM∩IM, based on a DBSCAN clustering algorithm.26,27 The filter acts to identify contiguous regions in the bead model obtained at the given threshold, with those regions containing fewer than 1% of the total beads being discarded (Fig. S3†). When we computed CCSEM on these de-noised maps we obtained excellent results: all predictions were within 7% of CCSX-RAY (Fig. 4A). When comparing to CCSIM, errors <8% were obtained for both β-galactosidase maps, while the GroEL maps yielded errors >12%. This mirrors the selectivity observed for the synthetic data (Fig. 3C), consistent with the CCSIM of GroEL being incompatible with the conformation in solution.25
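The principle of the filter can be sketched in a few lines. The example below is a simplified stand-in rather than the DBSCAN-based filter in EM∩IM: it groups occupied voxels into contiguous regions with a breadth-first flood fill over 26-connected neighbours, then discards regions holding fewer than 1% of all beads. The function name, the connectivity choice, and the toy data are assumptions made for illustration.

```python
from collections import deque
from itertools import product

# The 26 neighbour offsets (faces, edges and corners) around a voxel.
NEIGHBOURS = [d for d in product((-1, 0, 1), repeat=3) if d != (0, 0, 0)]


def denoise(voxels, min_fraction=0.01):
    """Drop contiguous groups of occupied voxels smaller than min_fraction.

    voxels: iterable of integer (i, j, k) indices of occupied voxels.
    Returns the set of voxel indices that survive the filter.
    """
    remaining = set(voxels)
    total = len(remaining)
    kept = set()
    while remaining:
        seed = remaining.pop()
        cluster, queue = {seed}, deque([seed])
        while queue:  # breadth-first flood fill from the seed voxel
            i, j, k = queue.popleft()
            for di, dj, dk in NEIGHBOURS:
                nb = (i + di, j + dj, k + dk)
                if nb in remaining:
                    remaining.remove(nb)
                    cluster.add(nb)
                    queue.append(nb)
        if len(cluster) >= min_fraction * total:
            kept |= cluster
    return kept


# Toy example: a 6x6x6 block (216 voxels) plus one isolated noise voxel.
block = set(product(range(6), repeat=3))
noisy = block | {(20, 20, 20)}
clean = denoise(noisy)
```

In the toy example the isolated voxel makes up less than 1% of the 217 beads and is removed, while the block is kept intact. A density-based clusterer such as DBSCAN generalises this to beads that do not lie on a regular grid.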
Given the accuracy of our approach, we considered whether IM-MS data could in principle be useful for validating structural models obtained from EM data, an area of outstanding interest in the field.28 This challenge applies not only to the final reconstructions, but also to the initial models, which are generated early in the refinement process and can bias the resulting data processing.29 We analysed a set of ten alternative GroEL initial models, five of which are correct reconstructions, and five incorrect.29 We computed the CCSEM of each model, and compared them to the CCS determined from the GroEL crystal structure (Fig. 4B). For each of the correct reconstructions, the discrepancy in CCS was ≲2.5% (upper panel). Conversely, three of the five incorrect models had an error ≳10%, identifying them as poor representations of GroEL (lower panel). This test demonstrates therefore that CCS measurements could constitute an independent means to filter alternative reconstructions generated from EM data.
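A screening step of this kind reduces to a percentage-difference test, sketched below. The function names, the 5% cutoff, and the CCS values in the example are hypothetical and chosen purely for illustration; they are not data from this study, and an appropriate cutoff would need to be set with reference to the accuracy of the CCS estimate.

```python
def percent_difference(ccs_model, ccs_reference):
    """Absolute difference between two CCS values, as a percentage of the reference."""
    return 100.0 * abs(ccs_model - ccs_reference) / ccs_reference


def screen_models(ccs_by_model, ccs_reference, cutoff=5.0):
    """Map each model name to True if its CCS lies within cutoff percent of the reference."""
    return {name: percent_difference(ccs, ccs_reference) <= cutoff
            for name, ccs in ccs_by_model.items()}


# Hypothetical CCS values (nm^2) for two candidate reconstructions.
candidates = {"model_a": 203.0, "model_b": 224.0}
accepted = screen_models(candidates, ccs_reference=200.0)
```

Here `model_a` differs from the reference by 1.5% and is retained, while `model_b` differs by 12% and is flagged for rejection.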
Our work has highlighted how IM-MS and EM, though differing in the physical interactions between probe and molecule, are conceptually complementary techniques,4 a synergy that perhaps stems from both CCSs and EM reconstructions, broadly speaking, arising from the combination of orientationally averaged two-dimensional projections.30 We anticipate therefore that EM∩IM, and the approaches it enables, will be a useful addition to the growing list of hybrid methodologies that enable structural biology studies to capitalise on the benefits brought by employing multiple techniques.31,32
Footnotes
† Electronic supplementary information (ESI) available. See DOI: 10.1039/c5an01636c
‡ EM∩IM is written in the Python programming language, and can be run using a graphical user interface (GUI) or from the command line, in Windows, Linux/Unix, and Mac OS X, all available for download at http://EMnIM.chem.ox.ac.uk/, together with documentation for usage and installation.
This journal is © The Royal Society of Chemistry 2016