Cristina Izquierdo-Lozano, Niels van Noort, Stijn van Veen, Marrit M. E. Tholen, Francesca Grisoni* and Lorenzo Albertazzi*
Department of Biomedical Engineering, Institute for Complex Molecular Systems (ICMS), Eindhoven University of Technology, 5612AZ Eindhoven, The Netherlands. E-mail: f.grisoni@tue.nl; l.albertazzi@tue.nl
First published on 16th October 2024
Super-resolution microscopy and Single-Molecule Localization Microscopy (SMLM) are powerful tools to characterize synthetic nanomaterials used for many applications such as drug delivery. In the last decade, imaging techniques like STORM, PALM, and PAINT have been used to study nanoparticle size, structure, and composition. While imaging has progressed significantly, image analysis has often not advanced accordingly and many studies remain limited to qualitative and semi-quantitative analyses. Therefore, it is imperative to have a robust and accurate method to analyze SMLM images of nanoparticles and extract quantitative features from them. Here, we introduce nanoFeatures, a cross-platform Matlab-based app for the automatic and quantitative analysis of super-resolution images. nanoFeatures makes use of clustering algorithms to identify nanoparticles from the raw data (localization list) and extract quantitative information about size, shape, and molecular abundance at the single-particle and single-molecule levels. Moreover, it applies a series of quality controls, increasing data quality and avoiding artifacts. nanoFeatures, thanks to its intuitive interface, is also accessible to non-experts and will facilitate analysis of super-resolution microscopy for materials scientists and nanotechnologists. This easy access to expansive feature characterization at the single-particle level will bring us one step closer to understanding the relationship between nanostructure features and their efficiency (https://github.com/n4nlab/nanoFeatures).
Despite these advantages, there has been limited success in translating nanoparticles to clinical applications. Since 1995, when Doxil became the first FDA-approved liposome-based nano-drug for cancer treatment,5 only 31 formulations have been clinically approved, including the emergency approval for the COVID-19 vaccine.6 The limited number of approved nanoformulations shows that many challenges remain when translating the promising in vitro results of nanocarriers to clinical application.7,8 Considering the vast number of nanoparticle formulations reported in the literature, this evidence highlights the urgent need for promoting the clinical approval of nanocarriers.
A current challenge is the lack of standardization in characterization methods, which results in a low degree of reliability and reproducibility in the nanomedicine literature.9 Properly characterizing nanoparticles is a fundamental step in their potential application, as their physical and chemical properties can differ vastly at the nanoscale. Morphological features, like size or shape, heavily influence the nanoparticle properties, leading to unforeseen behavior or even toxicity in the case of incorrect characterization.2,10 Currently available characterization techniques typically can only assess one property at a time, requiring the integration of multiple techniques for comprehensive characterization.2,11 Moreover, bulk measurement methods tend to average values and correspondingly mask the inherent heterogeneity of nanoparticles, which determines the effective distribution of nanoparticles to their site of action, and even their potential toxicity in biological media.10
Recently, super-resolution microscopy12 and, in particular, Single-Molecule Localization Microscopy (SMLM)13 have emerged as powerful techniques to improve nanoparticle characterization at the single-particle level. SMLM breaks the diffraction limit by computationally localizing individual fluorescent events separated in time and reconstructing super-resolved images based on these high-precision localizations. Therefore, SMLM images are created by stacking all the computed localization coordinates found in each frame of the acquisition movie, which allows researchers to visualize and analyze nanoparticles at the single-particle and single-molecule level with nanometer precision.13 This technique has already shown its versatility in various research fields from cell biology14,15 to material characterization,16,17 making it a powerful tool to address the challenges described above. For example, SMLM has been used to study and quantify protein corona formation on nanoparticles,18,19 which revealed their heterogeneous nature. Furthermore, by combining SMLM with Transmission Electron Microscopy (TEM), multidimensional information to describe this heterogeneity could be obtained.20 Moreover, SMLM has also allowed for the study of cell–nanoparticle interactions.21,22 Currently, research interest is shifting towards multiplexed images to correlate different modalities, which generate bigger datasets and complex data structures.23–27
A crucial aspect of SMLM is data analysis, as it is the key to going from qualitative images to quantitative data. SMLM data analysis is based on: (1) processing the acquisition movie to obtain localizations, through the fitting of the point-spread function (PSF),28 (2) processing the localizations, for example, aligning or merging localizations,29 and finally (3) obtaining interpretable data30 that we can use to measure size and morphology, thanks to the high spatial resolution of SMLM (20–50 nm).13 Modular platforms like the Super-resolution Microscopy Analysis Platform (SMAP) allow the user to integrate many of the SMLM image analysis steps into a single software package.31 However, most of the available software resources are dedicated to biological imaging, while materials and nanostructures lack dedicated tools. A brief overview of several SMLM data analysis software packages can be found in ESI Table 3.† However, most of these are meant for general image analysis, rather than being specialized characterization tools that run on already processed localization data in batches. Therefore, these software solutions could complement the application presented in this work.
Here, we introduce nanoFeatures, an automated standalone MATLAB-based application dedicated to analyzing nanoparticles from SMLM datasets. The nanoFeatures app can process SMLM images, locate the nanoparticles, and automatically compute and display their features (Fig. 1). After simply uploading the raw datasets and setting the right parameters, the user will receive a list of the key features for each individual nanoparticle that can later be used in further analysis. By characterizing nanoparticles in a systematic and reproducible way, we aim to open the door for data-driven research, such as machine learning for property prediction or data mining. In what follows, we showcase an example application, whereby we analyse two different nanoparticle datasets: (1) dual color nanoparticles imaged with DNA-PAINT32 and (2) triple color nanoparticles imaged with exchange PAINT.23
The nanoFeatures general workflow is as follows (Fig. 2a):
(1) User input: several inputs are required from the user, which will determine the following steps of the analysis. The steps are as follows:
(a) File input. The user inputs the file, as specified in the User input subsection, containing the list of localization coordinates (ESI Fig. 1†). Multiple color channels can be input as a single file or as multiple files, corresponding to either simultaneous multi-color imaging (e.g. 2- or 3-color STORM) or sequential multiplexing (e.g. exchange-PAINT). To increase the throughput, users can select multiple files to be processed under the same settings in batch analysis.
(b) Input type selection. After inputting the file, the user must select the input type (ONI, NIKON, or ThunderSTORM), based on the microscope or software used to obtain the images.
(c) Workflow and parameter specification. Within the Graphical User Interface (GUI), users can specify the desired workflow for data processing. For example, when analyzing samples acquired sequentially, the user should activate the channel alignment option. Furthermore, users can define the parameters for filtering and feature extraction, and have the option to include the quantitative PAINT (qPAINT) analysis.33
(2) Read and pre-process file(s): once the input, workflow, and parameters have been defined, the data are pre-processed into a single list of localizations, containing all color channels.
(3) Channel alignment (optional) (Fig. 2b): if specified, the drift between different color channels is corrected before clustering.
(4) Clustering (Fig. 2c). The processed localizations are then used as the input of a clustering algorithm, which will identify potential nanoparticles, using the vicinity information.
(5) Quality filters (Fig. 2d): a series of quality filters are applied to exclude clusters of localizations not identified as single nanoparticles.
(6) qPAINT (optional) (Fig. 2e): quantifies the target ligands count based on the spatiotemporal response within each cluster.
(7) Generate output: finally, nanoFeatures will generate an output CSV file containing the features corresponding to each detected nanoparticle in the original SMLM image. This file, together with the plots prompted during the execution, will be saved in a “results” folder in the user's Matlab path.
In addition, SMLM images often have high background noise, leading to prolonged computing time for clustering analysis and complicating the automated identification of nanoparticles. For this reason, we advise pre-processing the image before using nanoFeatures, for example, by applying a density filter, as seen in ESI Fig. 2.† In this specific case study, we used the thunderSTORM plug-in for ImageJ,34 with the settings outlined in the Density filter subsection in Materials and methods. Moreover, a description of the datasets used for this case study can be found in the Datasets subsection in Materials and methods.
The following sections provide a detailed description of each of these steps.
(1) NIKON:35 TXT file format. The app reads the file as a table, identifying headers named “Channel_Name”, “X”, “Y” and “Frame”.
(2) ONI:36 comma-separated values (CSV) file format. The app reads the file as a matrix, considering the 1st column as the channel name, 2nd column as the frame number, and the 3rd and 4th columns as X and Y coordinates, respectively.
(3) ThunderSTORM:34 CSV file format. The app reads the file as a table, identifying headers named “Channel”, “Frame”, “x [nm]” and “y [nm]”.
Extension to further formats is envisioned in the future to follow the evolution of the different formats used in the community.
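The format descriptions above can be illustrated with a minimal Python sketch (nanoFeatures itself is a MATLAB app, so this is not its reader). It assumes a comma-separated ThunderSTORM export with exactly the headers listed above, and an ONI export read purely by column position. The function names, the record layout, and the sample rows are hypothetical.

```python
import csv
import io

def read_thunderstorm(handle):
    """Read ThunderSTORM-style rows (by header name) into a list of dicts."""
    localizations = []
    for row in csv.DictReader(handle):
        localizations.append({
            "channel": row["Channel"],
            "frame": int(float(row["Frame"])),
            "x": float(row["x [nm]"]),
            "y": float(row["y [nm]"]),
        })
    return localizations

def read_oni(handle):
    """Read ONI-style rows by position: channel, frame, x, y."""
    localizations = []
    for row in csv.reader(handle):
        if len(row) < 4:
            continue  # skip empty or truncated rows
        try:
            frame = int(float(row[1]))
        except ValueError:
            continue  # assumption: a non-numeric frame means a header row
        localizations.append({
            "channel": row[0],
            "frame": frame,
            "x": float(row[2]),
            "y": float(row[3]),
        })
    return localizations

# Made-up rows, for illustration only.
sample = io.StringIO(
    "Channel,Frame,x [nm],y [nm]\n"
    "1,1,1020.5,2210.0\n"
    "2,3,1018.2,2207.9\n"
)
locs = read_thunderstorm(sample)
```

Both readers return the same record layout, which mirrors the idea of pre-processing every input type into a single list of localizations containing all color channels.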
As an example, we showcase the workflow and parameter specification for the analysis of the file “210714_COOH_200_loc2_merge.csv”, which can be found in our online repository under the files “Datasets/dualColor/COOH_200nm_100dist_50neighbors”. The workflow and parameter selection can be seen in Fig. 3a–c and the results preview given by nanoFeatures within the GUI is seen in Fig. 3d.
For this reason, upon clicking the “Run” button and only if the “channel alignment” checkbox is selected, nanoFeatures prompts the user to input files containing the fiducial localizations for each respective color channel.
To correct the temporal drift, the fiducial localizations are first clustered to obtain their centroids. Then, for each fiducial, nanoFeatures finds its nearest neighbors across the n channels within a limited distance, to avoid fiducial mismatching. Finally, the average drift between matched fiducials in sequential files is calculated, after which the coordinates are corrected and the files are merged.
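The fiducial-based correction can be sketched in Python for two channels. This is a simplified illustration, not the nanoFeatures code: the fiducial centroids are assumed to be already computed, matching is a plain nearest-neighbor search with a distance cap, and all names and coordinates are hypothetical.

```python
import math

def match_fiducials(reference, moving, max_distance):
    """Pair each reference centroid with its nearest moving centroid.

    Pairs farther apart than max_distance are dropped to avoid mismatches.
    """
    pairs = []
    for ref in reference:
        nearest = min(moving, key=lambda m: math.dist(ref, m))
        if math.dist(ref, nearest) <= max_distance:
            pairs.append((ref, nearest))
    return pairs

def mean_drift(pairs):
    """Average (dx, dy) offset of the moving channel relative to the reference."""
    if not pairs:
        raise ValueError("no fiducial pairs within the distance limit")
    dx = sum(m[0] - r[0] for r, m in pairs) / len(pairs)
    dy = sum(m[1] - r[1] for r, m in pairs) / len(pairs)
    return dx, dy

def correct(localizations, drift):
    """Shift localizations of the moving channel back onto the reference."""
    return [(x - drift[0], y - drift[1]) for x, y in localizations]

# Made-up fiducial centroids (nm); the third moving point has no partner.
reference = [(100.0, 100.0), (500.0, 300.0)]
moving = [(110.0, 95.0), (510.0, 295.0), (2000.0, 2000.0)]

pairs = match_fiducials(reference, moving, max_distance=50.0)
drift = mean_drift(pairs)
corrected = correct([(210.0, 195.0)], drift)
```

Here the second channel is offset by (10, -5) nm, so a localization at (210, 195) is moved to (200, 200). The distance cap is what keeps the stray fiducial at (2000, 2000) from being matched to anything.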
To reduce computation time, the image is divided into nine sections according to the localization coordinates. These sections are then processed in parallel by the DBSCAN algorithm,37 using min(nCores − 1, 9) computing threads, as described in Fig. 2c. This parallelization significantly reduces the execution time and prevents the app from crashing due to the massive data structures required to link all localizations within a background-dense image (ESI Fig. 3†). Note that at least one core remains available for other computer processes.
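A minimal Python sketch of the tiling step follows. It assumes a regular 3 × 3 grid over the bounding box of the localizations, which is one plausible way to obtain nine sections, and it caps the worker count at the smaller of nine and the core count minus one. The function names are hypothetical, and how clusters crossing a section border are treated is not covered.

```python
import os

def split_into_tiles(points, grid=3):
    """Assign each (x, y) point to one cell of a grid x grid layout."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    x_min, y_min = min(xs), min(ys)
    width = (max(xs) - x_min) or 1.0   # avoid division by zero
    height = (max(ys) - y_min) or 1.0
    tiles = [[] for _ in range(grid * grid)]
    for x, y in points:
        # Points on the upper edge are clamped into the last row/column.
        col = min(int((x - x_min) / width * grid), grid - 1)
        row = min(int((y - y_min) / height * grid), grid - 1)
        tiles[row * grid + col].append((x, y))
    return tiles

def worker_count(n_tiles=9):
    """Use all cores but one, and never more workers than tiles."""
    n_cores = os.cpu_count() or 1
    return max(1, min(n_cores - 1, n_tiles))

# Made-up coordinates (nm).
points = [(0, 0), (50, 50), (100, 100), (99, 1)]
tiles = split_into_tiles(points)
workers = worker_count()
```

Each tile can then be handed to a separate worker, for example through `concurrent.futures.ProcessPoolExecutor(max_workers=workers)`.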
DBSCAN requires two parameters: (1) the radius within which adjacent localizations are considered neighbors, and (2) the minimum number of points in this neighborhood required to form a cluster. Both are set by the user and require optimization for each experiment and sample. To help with this, the user can select the Silhouette checkbox in nanoFeatures to plot the Silhouette coefficient38 for each cluster, as in ESI Fig. 4f.† However, this analysis is computationally expensive, so it is not recommended for large files or batch analysis.
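To make the role of the two parameters concrete, a compact pure-Python DBSCAN is sketched below. It is a textbook brute-force version for illustration only, not the implementation used by the app. It assumes that the minimum-neighbor count includes the point itself, a convention that varies between implementations.

```python
import math

NOISE = -1

def region_query(points, i, eps):
    """Indices of all points within eps of points[i], including i itself."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or NOISE."""
    labels = [None] * len(points)
    cluster_id = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        cluster_id += 1
        labels[i] = cluster_id
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster_id  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster_id
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is a core point: keep expanding
    return labels

# Made-up localizations (nm): two tight groups and one isolated point.
points = [
    (0, 0), (10, 0), (0, 10), (10, 10),
    (500, 500), (505, 500), (500, 505),
    (1000, 0),
]
labels = dbscan(points, eps=20, min_pts=3)
```

With a radius of 20 nm and a minimum of 3 points, the two groups receive labels 0 and 1 and the isolated localization is marked as noise. Raising the minimum to 4 would dissolve the second group, which illustrates why both parameters need tuning per sample.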
Then, nanoFeatures plots the complete reconstructed image (all nine sections), with the clusters identified by DBSCAN depicted in different colors, as shown in Fig. 4a.
To do this, each filter (1) fits an ellipse on the clusters to obtain the aspect ratio, then (2) sets a random starting radius and adjusts it until a user-defined percentage of the cluster localizations fits in it, and finally, (3) ensures that the distance between each cluster centroid is at least the user-defined distance.
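The three filters can be approximated in a short Python sketch, under stated simplifications. The aspect ratio is taken from the covariance of the points rather than from an explicit ellipse fit. The radius is computed directly as a distance percentile instead of being adjusted iteratively from a random start. The spacing filter is assumed to drop every centroid that has a neighbor closer than the threshold. All function names are hypothetical.

```python
import math

def centroid(points):
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

def aspect_ratio(points):
    """Major/minor axis ratio from the 2x2 covariance matrix (always >= 1)."""
    n = len(points)
    cx, cy = centroid(points)
    sxx = sum((p[0] - cx) ** 2 for p in points) / n
    syy = sum((p[1] - cy) ** 2 for p in points) / n
    sxy = sum((p[0] - cx) * (p[1] - cy) for p in points) / n
    # Closed-form eigenvalues of [[sxx, sxy], [sxy, syy]].
    half_trace = (sxx + syy) / 2
    spread = math.sqrt(((sxx - syy) / 2) ** 2 + sxy ** 2)
    major, minor = half_trace + spread, half_trace - spread
    if minor <= 0:
        return math.inf  # all points lie on a line
    return math.sqrt(major / minor)

def enclosing_radius(points, fraction):
    """Smallest radius around the centroid that holds `fraction` of the points."""
    c = centroid(points)
    distances = sorted(math.dist(c, p) for p in points)
    k = max(1, math.ceil(fraction * len(points)))
    return distances[k - 1]

def well_separated(centroids, min_distance):
    """Keep centroids that have no other centroid closer than min_distance."""
    return [
        c for i, c in enumerate(centroids)
        if all(math.dist(c, d) >= min_distance
               for j, d in enumerate(centroids) if j != i)
    ]

# Made-up clusters (nm).
round_cluster = [(1, 1), (1, -1), (-1, 1), (-1, -1)]
long_cluster = [(2, 1), (2, -1), (-2, 1), (-2, -1)]
ring = [(1, 0), (-1, 0), (0, 1), (0, -1), (3, 0), (-3, 0), (0, 3), (0, -3)]

ratio_round = aspect_ratio(round_cluster)
ratio_long = aspect_ratio(long_cluster)
radius_half = enclosing_radius(ring, 0.5)
kept = well_separated([(0, 0), (50, 0), (1000, 1000)], min_distance=100)
```

A cluster would pass when its aspect ratio and radius fall within the user-defined limits and its centroid survives the spacing check. The thresholds themselves are user inputs and are not fixed here.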
As a result of the quality filters, some clusters identified by DBSCAN are discarded, and only potential nanoparticles remain, marked by a black circle and an ID number in Fig. 4a. Additionally, nanoFeatures plots the selected nanoparticles from all nine sections, color-coded based on the localization channel. This way, users can identify co-localizing ligand populations on a nanoparticle (Fig. 4b).
At this point, within the GUI “Graphs” tab, nanoFeatures generates histograms providing an overview of nanoparticles’ characterization (Fig. 3d). For a more comprehensive analysis, users can plot the features from the generated CSV file, as showcased in Fig. 4c–f. For instance, these plots show features from a sample of 200 nm spherical nanoparticles.
Fig. 4c shows a wide distribution of nanoparticle diameters, peaking around 200 nm. Similarly, Fig. 4d shows the aspect ratio for the same sample, with its distribution mostly encompassed between 1 and 1.5, where 1 is a perfect sphere. These results align with the heterogeneous nature of nanoparticles, as described in the literature.10
First, the qPAINT filter generates a binary time trace for each identified cluster: bright (1), for each frame in which the fluorophores are active, or dark (0) when they are inactive.
Next, the user introduces the minimum number of frames that a cluster needs to be on (not dark), for the localizations to be merged into a single event. This number, the “frames threshold to merge”, needs to be optimized per sample, and it prevents false dark times.44 The duration of each dark time is then calculated by taking the derivative of the nanoparticle time traces and finding the difference between consecutive negative and positive changes. Clusters formed by non-specific localizations can be filtered by checking the corresponding checkbox within the GUI. Clusters will then be removed if their bright times do not comprise at least 50% of the total imaging time between their first and last binding events.
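The time-trace step can be illustrated with a small Python sketch. It builds the binary trace, reads dark times from the discrete derivative of the trace, and computes the bright fraction between the first and last binding events. The frames threshold to merge is deliberately not implemented, since its exact rule would have to be assumed. Frames are taken to be numbered from 1, and the function names are hypothetical.

```python
def binary_trace(bright_frames, n_frames):
    """1 for every frame with at least one localization, else 0."""
    on = set(bright_frames)
    return [1 if frame in on else 0 for frame in range(1, n_frames + 1)]

def dark_times(trace):
    """Lengths, in frames, of the dark gaps between bright events.

    A gap opens at a negative step (1 -> 0) and closes at the next
    positive step (0 -> 1). Leading and trailing dark stretches are ignored.
    """
    steps = [b - a for a, b in zip(trace, trace[1:])]
    gaps = []
    start = None
    for i, step in enumerate(steps):
        if step == -1:
            start = i
        elif step == 1 and start is not None:
            gaps.append(i - start)
            start = None
    return gaps

def bright_fraction(trace):
    """Share of bright frames between the first and last bright frame."""
    on = [i for i, value in enumerate(trace) if value == 1]
    if not on:
        return 0.0
    return len(on) / (on[-1] - on[0] + 1)

# Made-up cluster: localizations in frames 2, 3, 7, 9 and 10 of 11.
trace = binary_trace([2, 3, 7, 9, 10], n_frames=11)
gaps = dark_times(trace)
fraction = bright_fraction(trace)
```

For this trace the dark times are 3 and 1 frames. Multiplying by the exposure time converts them to milliseconds. The bright fraction of 5/9 would pass a 50% threshold.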
Furthermore, nanoFeatures constructs a cumulative distribution function (CDF) of the dark times for each remaining cluster, which is fitted with eqn (1), to obtain its mean dark time (τd*), where P represents the probability of a binding event at time t after a previous binding event (ESI Fig. 5†). Moreover, clusters are filtered on their CDF shape. When 90% of dark times are smaller than 10% of the longest dark time, the cluster is discarded.
$P(t) = 1 - e^{-t/\tau_d^{*}}$ | (1) |
$\text{binding sites} = \dfrac{1}{k_{\mathrm{on}} \, C_i \, \tau_d^{*}}$ | (2) |
Then, τd*, in milliseconds, is used in eqn (2) to determine the number of binding sites per cluster (Fig. 4f). The association constant (kon) for each docking-imager pair and the imager concentration (Ci) are experiment-dependent.
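The CDF-based steps can be sketched in Python as follows. The fit of eqn (1) is a plain least-squares fit solved by a golden-section search over log τ, which is an illustrative choice and not necessarily the fitting routine used by nanoFeatures. The empirical CDF uses midpoint plotting positions, also an assumption. The conversion to binding sites assumes the standard qPAINT relation, in which the number of sites is 1/(kon · Ci · τd*), with τd* converted from milliseconds to seconds. The numerical values are made up.

```python
import math

def skewed_dark_times(dark_times_ms):
    """True when at least 90% of dark times are below 10% of the longest one."""
    longest = max(dark_times_ms)
    short = sum(1 for t in dark_times_ms if t < 0.1 * longest)
    return short >= 0.9 * len(dark_times_ms)

def fit_mean_dark_time(dark_times_ms, iterations=200):
    """Least-squares fit of P(t) = 1 - exp(-t / tau) to the empirical CDF.

    Dark times must be positive. Returns tau in the same unit as the input.
    """
    ts = sorted(dark_times_ms)
    n = len(ts)
    ps = [(i + 0.5) / n for i in range(n)]  # midpoint plotting positions

    def sse(log_tau):
        tau = math.exp(log_tau)
        return sum((1.0 - math.exp(-t / tau) - p) ** 2 for t, p in zip(ts, ps))

    # Golden-section search over log(tau) in a wide bracket around the data.
    lo = math.log(ts[0] / 100.0)
    hi = math.log(ts[-1] * 100.0)
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    for _ in range(iterations):
        a = hi - inv_phi * (hi - lo)
        b = lo + inv_phi * (hi - lo)
        if sse(a) < sse(b):
            hi = b
        else:
            lo = a
    return math.exp((lo + hi) / 2.0)

def binding_sites(tau_d_ms, k_on, imager_conc):
    """Assumed qPAINT relation; k_on in 1/(M s), imager_conc in M."""
    tau_d_s = tau_d_ms / 1000.0
    return 1.0 / (k_on * imager_conc * tau_d_s)

# Synthetic dark times drawn as exact quantiles of an exponential (tau = 200 ms).
true_tau = 200.0
synthetic = [-true_tau * math.log(1 - (i + 0.5) / 50) for i in range(50)]
tau_fit = fit_mean_dark_time(synthetic)

# Made-up experiment: k_on = 1e6 1/(M s), 5 nM imager, tau_d* = 20 s.
sites = binding_sites(20000.0, k_on=1e6, imager_conc=5e-9)
```

On the synthetic data the fit recovers a mean dark time of about 200 ms, and the made-up experiment yields about 10 binding sites. In practice a dedicated curve-fitting routine would also report the goodness of fit.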
Lastly, nanoFeatures will create a folder named “results” on the user's current Matlab path. This folder contains, for each file analyzed, most of the figures plotted by Matlab (ESI Fig. 4†) and a CSV file containing the number of binding sites, statistics of the bright and dark times, and R-squared for the exponential fit. A detailed list of all the features exported by nanoFeatures can be found in ESI, Table 2.†
In addition, nanoFeatures is an open-source platform, meaning that users can download and edit the source code of the app to add personalized functions and filters and even contribute to the public GitHub repository. Moreover, nanoFeatures is used daily within the research group; therefore, it is constantly updated (e.g., bug fixes, new metrics, more plots, improved execution), and more functionalities will be added over time.
Finally, despite being a Matlab-based application, the standalone version can be installed via MATLAB Runtime (without the need for a license), on multiple platforms like Linux, macOS, and Windows.
In this way, the many files generated from diverse experiments can be directly used in further analyses, minimizing or eliminating the need for extensive data pre-processing. For instance, employing machine learning to analyze various nanoparticle samples could provide valuable insights into the interrelationships of these features, thus aiding in the future design of nanoparticle formulations.
Fig. 2b and e are based on the exchange PAINT files in the “tripleColor” folder. Fig. 4a and b are based on the file “210714_COOH_200_loc2_merge.csv” from Marrit Tholen's dataset and Fig. 4c–f are based on the combination of all files from sample “COOH_200nm”. These files can be found in the “dualColor” folder.
File name | Radius (nm) | Minimum neighbors |
---|---|---|
210714_COOH_200_loc1_merge | 100 | 80 |
210616_COOH_300_loc1_merge | 25 | 10 |
210616_COOH_300_loc2_merge | 50 | 10 |
210616_COOH_300_loc3_merge | 50 | 5 |
210616_COOH_300_loc3_merge | 50 | 10 |
210616_COOH_300_loc4_merge | 50 | 5 |
210616_COOH_300_loc5_merge | 50 | 5 |
210630_COOH_300_loc1_merge | 100 | 50 |
220103_COOH_300_pG_loc2_merge | 100 | 150 |
220103_COOH_300_pG_loc4_merge | 100 | 150 |
220103_COOH_300_pG_loc5_merge | 100 | 150 |
210812_NH2_75pG25pM_loc1_merge | 75 | 80 |
210812_NH2_75pG25pM_loc2_merge | 75 | 80 |
Footnote
† Electronic supplementary information (ESI) available. See DOI: https://doi.org/10.1039/d4nr02573c
This journal is © The Royal Society of Chemistry 2024