Lukas Krasny and Paul H. Huang*
Division of Molecular Pathology, The Institute of Cancer Research, 237 Fulham Road, London, SW3 6JB, UK. E-mail: paul.huang@icr.ac.uk
First published on 9th October 2020
Data-independent acquisition mass spectrometry (DIA-MS) is a next generation proteomic methodology that generates permanent digital proteome maps offering highly reproducible retrospective analysis of cellular and tissue specimens. The adoption of this technology has ushered in a new wave of oncology studies across a wide range of applications including its use in molecular classification, oncogenic pathway analysis, drug and biomarker discovery and unravelling mechanisms of therapy response and resistance. In this review, we provide an overview of the experimental workflows commonly used in DIA-MS, including its current strengths and limitations versus conventional data-dependent acquisition mass spectrometry (DDA-MS). We further summarise a number of key studies to illustrate the power of this technology when applied to different facets of oncology. Finally, we offer a perspective on the latest innovations in DIA-MS technology and machine learning-based algorithms necessary for driving the development of high-throughput, in-depth and reproducible proteomic assays that are compatible with clinical diagnostic workflows, which will ultimately enable the delivery of precision cancer medicine to achieve optimal patient outcomes.
In contrast to the cancer genome, there is a significant gap in our knowledge of the cancer proteome. Proteins, as downstream effector molecules of the genetic code, reflect the phenotypic consequence of the cancer genome and allow one to link the relatively static genetic information with the dynamic proteomic landscape within the cell. Furthermore, given that the majority of druggable targets in tumour cells are proteins, a global overview of the cancer proteome may reveal new options for drug discovery and development. Recognising this gap, there has been significant investment in recent years in the large-scale characterisation of the tumour proteome led largely by the Clinical Proteomic Tumor Analysis Consortium (CPTAC) of the National Cancer Institute.6 These studies have provided publicly available proteogenomic datasets for several cancer types such as breast cancer, ovarian cancer and colon cancer with ongoing studies in other cancer types.7–9
Since the discovery of soft ionization techniques such as matrix-assisted laser desorption/ionization (MALDI) and electrospray ionization (ESI), mass spectrometry (MS) has become an unrivalled analytical tool for the identification, characterization and quantification of proteins and their post-translational modifications. In particular, the combination of liquid chromatography coupled to tandem mass spectrometry (LC-MS/MS) has provided a sensitive high-throughput platform enabling analysis of several thousand proteins from an individual sample. In oncology, proteomic analysis by LC-MS/MS has been widely used in multiple applications such as biomarker discovery, drug screens and personalized medicine. Most of these applications use conventional data-dependent acquisition (DDA) or targeted methods such as single or multiple reaction monitoring (SRM/MRM) which have been comprehensively reviewed elsewhere.10–12 In this review, we focus on the use of data-independent acquisition (DIA) (also known as sequential window acquisition of all theoretical mass spectra (SWATH-MS))13 and provide an overview of specific applications in cancer proteomics to inform molecular classification, biomarker discovery and the identification of new drug targets. This review will focus on DIA-MS applications in tissue and cell line analysis, and readers who are interested in the use of this technology in liquid biopsies and plasma proteomics are referred to these excellent reviews on the topic.14–16 We further present the latest innovations in DIA-MS that will push the boundaries of this technology and accelerate its implementation in precision cancer medicine.
Typical sample processing workflows for label-free DDA-MS analysis (Fig. 2A) often include the steps of protein extraction, digestion, data acquisition and data processing (indicated by solid arrows in Fig. 2A). To increase the depth of proteomic analysis, off-line fractionation such as SDS-PAGE or liquid chromatography is often used. However, such pre-fractionation steps increase the total sample amount required for the experiment. In DIA-MS, the sample processing and data acquisition steps are identical to single-shot DDA-MS (Fig. 2B). However, because all precursor ions in a survey scan are fragmented (Fig. 1), post-acquisition in silico data processing steps are needed to deconvolute the resulting complex fragment ion spectra, which involves interrogating the MS data with spectral libraries (Fig. 2B). A spectral library is a database which contains mass spectrometric and chromatographic parameters such as precursor and fragment m/z value, fragment type, charge and elution time for each individual peptide in the analysed sample.13,20 These study-specific spectral libraries are conventionally generated by extensive DDA-based proteomic characterization of the same samples prior to analysis by DIA-MS (Fig. 2B).21–23 However, study-specific libraries can vary between laboratories due to the lack of consistency in DDA experiments and spectral library generation. This can result in wide variations in the number and type of proteins identified and quantified between studies. Because an extensive number of DDA-MS experiments is required to generate study-specific spectral libraries, there are also cost and time implications to consider which may decrease the attractiveness of DIA-MS. More recently, the generation of comprehensive spectral libraries as a community resource has been employed as an alternative solution.
To date, comprehensive reference libraries have been generated for a number of organisms including human,24 mouse,25,26 zebrafish,27 fruit fly,28 yeast,29 and various bacteria.30–32 Most of these libraries are publicly available in repositories such as SWATHAtlas.org for community use. These comprehensive reference libraries remove the need to generate study-specific libraries for each DIA-MS experiment, thus increasing inter-laboratory reproducibility while economising sample requirements and MS instrument time. This high inter-laboratory reproducibility was demonstrated by Collins et al. who undertook a multi-laboratory assessment of HEK293 cell lysates in 11 laboratories across the world and showed a very high median inter-laboratory Pearson correlation coefficient of 0.94 in the quantification of 4,077 proteins.33
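To make the notion of a spectral library concrete, the following minimal Python sketch models a library entry with the parameters listed above (precursor and fragment m/z, fragment type, charge and elution time) and selects the peptides whose precursors fall inside a given DIA isolation window. The class names, field names, peptide and protein identifiers and all numerical values are illustrative assumptions, not the schema of any particular library format or software tool.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Fragment:
    mz: float        # fragment ion m/z
    ion_type: str    # fragment type, e.g. "y" or "b"
    ordinal: int     # position in the ion series
    charge: int


@dataclass
class LibraryEntry:
    peptide: str
    protein: str
    precursor_mz: float
    precursor_charge: int
    retention_time: float  # elution time (minutes or normalised units)
    fragments: List[Fragment] = field(default_factory=list)


def entries_in_window(library, window_start, window_end):
    """Return entries whose precursor m/z lies in [window_start, window_end)."""
    return [e for e in library if window_start <= e.precursor_mz < window_end]


# Hypothetical two-entry library; values are placeholders, not measured data.
library = [
    LibraryEntry("PEPTIDEA", "PROT1", 512.3, 2, 23.5,
                 [Fragment(601.3, "y", 5, 1), Fragment(714.4, "y", 6, 1)]),
    LibraryEntry("PEPTIDEB", "PROT2", 630.8, 2, 41.0,
                 [Fragment(488.2, "b", 4, 1), Fragment(805.4, "y", 7, 1)]),
]

hits = entries_in_window(library, 500.0, 525.0)
print([e.peptide for e in hits])
```

In practice, each fragment spectrum acquired for an isolation window would be scored only against the library entries returned for that window, which is what makes the deconvolution of the mixed spectra tractable.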
Both methods typically quantify a similar number of proteins (∼3000–5000) in a single-shot analysis.17 Published reports indicate that the limit of detection (LOD) of DIA-MS is ∼100 amol and that its dynamic quantification range spans 4–5 orders of magnitude13,33 (Fig. 3). A comparison of DDA-MS and DIA-MS performed by Gillet et al. showed that DDA-MS failed to identify reference peptides spiked into a yeast lysate background at 2–10 fold higher concentration than the LOD of DIA-MS.13 Furthermore, an up to 10-fold gain in the sensitivity of DIA-MS was reported when compared to label-free workflows based on extraction of precursor ion traces from MS1 scans.13,33 These analyses suggest that the sensitivity of DIA-MS is superior to DDA-MS although a direct head-to-head comparison of the sensitivity of these two methodologies has yet to be performed.
The nature of the LC-MS/MS analysis is based on the destructive sampling of the analyte eluted from the LC column into the MS instrument. Therefore, once the sample has been injected into the LC-MS/MS system and the data acquired, it cannot be regenerated. Given the stochastic nature of DDA-MS and the missing values resulting from this technique, it is challenging to undertake comprehensive retrospective analysis of the acquired mass spectra. Retrospective signal extraction from DDA-MS data is therefore only available for precursor ions with acquired fragmentation spectra. In contrast, DIA-MS fragments all detected precursor ions in a sample which opens new possibilities for retrospective analysis. The acquired digitized proteome files can be reprocessed with different spectral libraries and provide reliable quantitative information for new sets of queries including post-translational modifications.36,37 As a result, DIA-MS proteomic data can become an invaluable repository for the community for subsequent analyses without the need of additional data acquisition.
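The retrospective use of a stored DIA map can be sketched as follows: because fragment spectra were recorded for every isolation window, a peptide that was not considered at acquisition time can still be queried later by extracting its fragment ion traces from the archived scans. The scan layout, the m/z tolerance and all numbers below are hypothetical, and real tools add retention-time alignment, scoring and false discovery rate control that are omitted here.

```python
from typing import Dict, List, Tuple

# One archived MS2 scan: (retention time, window start, window end, peaks),
# where peaks is a list of (m/z, intensity) pairs. Toy values only.
Scan = Tuple[float, float, float, List[Tuple[float, float]]]

archived_scans: List[Scan] = [
    (10.0, 500.0, 525.0, [(601.30, 120.0), (714.40, 80.0)]),
    (10.1, 500.0, 525.0, [(601.31, 300.0), (714.39, 210.0), (450.20, 40.0)]),
    (10.1, 525.0, 550.0, [(601.30, 999.0)]),  # different window: ignored
    (10.2, 500.0, 525.0, [(601.29, 150.0)]),
]


def extract_fragment_traces(scans, precursor_mz, fragment_mzs, tol=0.02):
    """Build one intensity trace per fragment from the scans whose isolation
    window contains the precursor; tol is an absolute m/z tolerance."""
    traces: Dict[float, List[Tuple[float, float]]] = {mz: [] for mz in fragment_mzs}
    for rt, w_start, w_end, peaks in scans:
        if not (w_start <= precursor_mz < w_end):
            continue
        for target in fragment_mzs:
            intensity = sum(i for mz, i in peaks if abs(mz - target) <= tol)
            traces[target].append((rt, intensity))
    return traces


# A "new" query posed after acquisition, e.g. taken from an updated library.
traces = extract_fragment_traces(archived_scans, 512.3, [601.30, 714.40])
for mz, trace in traces.items():
    print(mz, trace)
```

The same archived scans can be passed to the function again with a different precursor and fragment list, which is the sense in which no additional data acquisition is needed.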
One major limitation of DIA-MS is the need to generate spectral libraries for data processing (Fig. 3). In situations where a comprehensive reference spectral library is not available for use or if the study involves analysis of a sub-proteome (e.g. specific subcellular compartments or post-translational modifications) that is underrepresented in reference spectral libraries, there will be a need to generate study-specific libraries. As discussed above, building a new study-specific spectral library for DIA-MS involves significantly higher starting sample amounts, instrument time and costs. This barrier may have important implications particularly where sample availability is limiting such as in the case of tissue biopsies or in rare diseases.
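Whether a reference library is adequate for a given sub-proteome can be checked before committing scarce samples. The sketch below computes the fraction of proteins of interest that are represented by at least a minimum number of peptides in a library; the protein identifiers, peptide lists and the two-peptide threshold are illustrative assumptions.

```python
def library_coverage(library_peptides, proteins_of_interest, min_peptides=2):
    """library_peptides maps protein ID -> set of library peptide sequences.
    Returns (coverage fraction, sorted list of under-represented proteins)."""
    missing = sorted(
        p for p in proteins_of_interest
        if len(library_peptides.get(p, ())) < min_peptides
    )
    covered = len(proteins_of_interest) - len(missing)
    return covered / len(proteins_of_interest), missing


# Hypothetical reference library and target list.
reference = {
    "PROT1": {"PEPTIDEA", "PEPTIDEC", "PEPTIDED"},
    "PROT2": {"PEPTIDEB"},
    "PROT3": {"PEPTIDEE", "PEPTIDEF"},
}
targets = ["PROT1", "PROT2", "PROT3", "PROT4"]

fraction, under_represented = library_coverage(reference, targets)
print(fraction, under_represented)
```

A low coverage fraction for the proteins of interest would indicate that a study-specific library, with its additional sample and instrument time requirements, is likely to be needed.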
Cancer type | Study | Ref. | Study design | Number and type of samples | Proteome coverage | Key findings |
---|---|---|---|---|---|---|
Breast cancer | Bouchal et al. 2019 | 22 | Quantitative profiling of global proteome in biopsy samples from 4 breast cancer subtypes | 96 fresh-frozen needle biopsies from 4 breast cancer subtypes: Luminal A (n = 48), Luminal B (n = 24), Her2-enriched (n = 8), triple-negative (n = 16) | 2842 proteins | • NF-κB pathway upregulated in luminal subtypes, VEGF pathway upregulated in Her2+ subtypes. • Decision tree classifier developed based on expression of ERBB2, INPP4B and CDK1 with correct identification rate of 84% when applied on the original dataset. |
Prostate cancer | Liu et al. 2014 | 45 | Quantitative profiling of N-glycoproteins in tissue samples from prostate cancer patients | 75 fresh-frozen tissue specimens; normal tissue (n = 10), non-aggressive (n = 24), aggressive (n = 16) and metastatic (n = 25) prostate cancer | 897 N-glycoproteins | • NAAA and PTK7 identified as potential markers for stratification of high- and low-risk prostate cancer. |
Prostate cancer | Keam et al. 2018 | 57 | Quantitative profiling of global proteome in tumour and matched adjacent tissue samples pre- and post-radiotherapy | Fresh-frozen (n = 4) and FFPE (n = 16) biopsies taken pre- and post-radiotherapy from 8 patients | 4665 proteins in fresh-frozen samples; 3974 proteins in FFPE samples | • Wound healing, extracellular remodelling and acute inflammatory response pathways were enriched in the samples after radiation therapy. |
Prostate cancer | Nguyen et al. 2018 | 58 | Quantitative proteomic profiling of prostate cancer patient-derived explants treated with HSP90 inhibitors | 46 patient-derived explant tumours; discovery study (n = 16), validation (n = 30) | 4095 proteins in discovery cohort; 5450 proteins in validation cohort | • mRNA translation, ribosome function and RNA metabolism pathways were found downregulated and TCA metabolism upregulated after treatment with HSP90 inhibitors. • 9 proteins are universally decreased after inhibition of HSP90. • TFRC and TIMP1 identified as candidate drug response markers for treatment of prostate cancer by AUY922. |
Prostate cancer | Latonen et al. 2019 | 66 | Multi-omic analysis of fresh-frozen tissue samples by genomics, transcriptomics and proteomics | 38 fresh-frozen tissue specimens; BPH (n = 10), treatment-naïve PC (n = 17) and CRPC (n = 11) | 3394 proteins | • A panel of 95 miRNAs identified as an important mechanism of gene expression regulation in prostate cancer. • Decreased expression of miR-22 and miR-205 related to upregulation of MDH2 in CRPC compared to PC. |
Kidney cancer | Guo et al. 2015 | 41 | Quantitative profiling of global proteome in 9 tumour and matched tissue biopsies | Fresh-frozen tumour and matched adjacent tissue biopsy specimens from 9 patients with ccRCC (n = 6), pRCC (n = 2) and chRCC (n = 1) | 2375 proteins | • Proof-of-principle study demonstrating utility of DIA-MS for molecular characterization and biomarker identification in cancer research. • A set of 21 known diagnostic markers of kidney cancer identified in the dataset including AMACR, VIM and GSTA1. |
Lymphoma | Schwarzfischer et al. 2017 | 69 | Metabolomic analysis of cell lysates and tissue samples by GC-MS, LC-MS and NMR spectroscopy combined with quantitative analysis of global proteome by DIA-MS | 24 lymphoma cell lines (BL: n = 6, DLBCL: n = 18), fresh-frozen (n = 11) and FFPE (n = 13) tissue specimens | 3041 proteins in cell lines; 2938 proteins in fresh-frozen tissues; 1442 proteins in FFPE tissues | • Higher intra- and extracellular level of pyruvic acid in DLBCL compared to BL. • Upregulation of proteins involved in non-oxidative phosphorylation and one-carbon metabolism in BL identified as a result of metabolic reprogramming. |
Liver cancer | Gao et al. 2017 | 42 | Quantitative profiling of global proteome in 14 pairs of tumour and non-tumour tissue samples by DIA-MS | 28 fresh-frozen specimens; tumour (n = 14) and adjacent normal tissue (n = 14) | 4216 proteins | • Significant upregulation of spliceosome pathway and downregulation of 37 metabolic pathways in HCC compared to adjacent normal tissue. • Expression of 9 proteins validated by IHC on separate cohort of 6 pairs of samples. |
Liver cancer | Zhu et al. 2019 | 43 | Quantitative profiling of global proteome in 19 pairs of tumour and non-tumour tissue samples by DIA-MS | 38 fresh-frozen specimens; tumour (n = 19) and adjacent normal tissue (n = 19) | 2579 proteins | • MCM7, proteins from HSP family and mitochondrial ribosomal proteins found upregulated in HCC samples compared to adjacent normal tissues. • Upregulation of MCM7 validated by IHC on separate cohort. |
Other | Guo et al. 2019 | 60 | Global proteomic profiling of the NCI-60 cancer cell lines | 60 cell lines included in NCI-60 panel | 3171 proteins | • Drug response prediction based on DIA-MS data outperforms prediction based on DDA-MS data. • DIA-MS data can be integrated with mutational and transcriptomic data to obtain optimal predictive power for drug response simulations. |
Other | Mehnert et al. 2020 | 70 | Multi-layered proteomic analysis of Dyrk2 mutant cell lines | 6 HEK293 mutant cell lines; HEK293 wild type | 5138 proteins; 2888 phospho-peptides | • Individual mutations of Dyrk2 cause mutation-specific reorganization of the protein–protein interaction network and changes in phosphoproteomic profile. • Subset of the mutations modulate Cancer Driver Proteins suggesting that these mutations are associated with cancer progression. |
The first reported application of DIA-MS in cancer proteomics was published by Guo et al., who analysed biopsy samples obtained from kidney cancer patients.41 In this pioneering work, the authors presented a novel approach of combining pressure cycling technology (PCT) for sample preparation with DIA-MS data acquisition as a rapid proteomic pipeline for the analysis of human tissue specimens. Given that DIA-MS generates profiles comprising all fragment ions in a sample, this methodology results in a permanent digital proteome map for each individual patient which can be routinely interrogated for the identification and quantification of proteins of interest. In this proof-of-principle experiment, the authors analysed tumour and matched adjacent tissue samples from 9 patients across three subtypes of renal cell carcinoma (RCC): clear cell RCC (ccRCC), papillary RCC (pRCC) and chromophobe RCC (chRCC).41 Overall, 2375 proteins were quantified by PCT-DIA-MS across all 18 samples, including 21 proteins such as alpha-methylacyl-CoA racemase (AMACR), vimentin (VIM) and glutathione S-transferase A1 (GSTA1) which are currently used as diagnostic or prognostic biomarkers in kidney cancer. Unsupervised clustering of the whole proteomic dataset clearly separated pRCC from ccRCC, suggesting that proteomic profiling is an effective means for molecular classification of this disease. In particular, the authors showed by MS that AMACR, an established diagnostic biomarker used in immunohistochemistry for distinguishing pRCC and ccRCC,46 was 13 times higher in pRCC samples than in ccRCC samples, validating the methodology.
Conversely, VIM and GSTA1 were significantly increased in ccRCC, which is in accordance with previously published literature.41,46 The comparison of the ccRCC tumours versus adjacent non-tumour regions identified 296 upregulated and 317 downregulated proteins in the tumour tissue including protein kinases, transcription factors and other proteins involved in biological processes such as apoptosis, immune response or signalling. Taken together, this work showed for the first time that DIA-MS can be applied to the analysis of human tissue biopsies in order to generate digital proteome maps that are useful for molecular classification and identification of tumour-relevant biomarkers.
Breast cancer can be molecularly classified into five intrinsic subtypes (luminal A, luminal B consisting of Luminal B and Luminal B-like, Her2-enriched, normal-like and triple-negative).38,47 There have been several published MS-based studies focused on profiling the proteomic landscape of these molecular subtypes using conventional DDA approaches.7,48–50 DIA-MS has only recently been employed by Bouchal et al. to profile 96 breast cancer needle biopsies across four of the breast cancer subtypes (48 × Luminal A, 24 × Luminal B comprising 16 × Luminal B and 8 × Luminal B-like, 8 × Her2-enriched, 16 × triple-negative).22 In total, 2842 proteins were quantified across all samples and analysis of these data led to the identification of biological pathways which are enriched in each individual subtype. For instance, the authors showed that the nuclear factor kappa-B (NF-κB) pathway was upregulated in the luminal subtypes while an enrichment of vascular endothelial growth factor (VEGF) pathway components was found in Her2-positive subtypes (Luminal B-like, Her2-enriched). Subsequent statistical analysis of the subtype-specific proteomic maps resulted in the construction of a decision tree for subtype classification based on the expression levels of three proteins: receptor tyrosine-protein kinase erbB-2 (ERBB2, also known as Her2), inositol polyphosphate 4-phosphatase (INPP4B) and cyclin-dependent kinase 1 (CDK1). This decision tree correctly classified 84% of the samples from the original cohort of 96 samples into the appropriate molecular subtype. As an orthogonal validation, the authors extended the protein-based decision tree to evaluate the gene expression levels of ERBB2, INPP4B and CDK1 in published microarray and RNA-seq datasets from 883 and 1078 breast cancer patients respectively, which confirmed the association of expression levels of these three genes with individual breast cancer subtypes.
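The idea of a three-protein decision tree can be illustrated with a toy example. The split order, thresholds, branch labels and abundance values below are entirely hypothetical and are not those reported by Bouchal et al.; the sketch only shows how such a tree maps protein levels to a subtype and how a correct identification rate on the original cohort is computed.

```python
def classify(sample, erbb2_cut=2.0, inpp4b_cut=1.0, cdk1_cut=1.5):
    """Assign a subtype from relative abundances of three proteins.
    All cut-offs and branch labels are placeholders for illustration."""
    if sample["ERBB2"] > erbb2_cut:
        return "Her2-enriched"
    if sample["INPP4B"] < inpp4b_cut:
        return "triple-negative"
    return "Luminal B" if sample["CDK1"] > cdk1_cut else "Luminal A"


# Invented cohort of (protein abundances, pathology-assigned subtype).
cohort = [
    ({"ERBB2": 3.1, "INPP4B": 1.2, "CDK1": 1.0}, "Her2-enriched"),
    ({"ERBB2": 0.8, "INPP4B": 0.4, "CDK1": 2.0}, "triple-negative"),
    ({"ERBB2": 0.9, "INPP4B": 1.8, "CDK1": 0.7}, "Luminal A"),
    ({"ERBB2": 1.0, "INPP4B": 1.5, "CDK1": 1.9}, "Luminal B"),
    ({"ERBB2": 1.1, "INPP4B": 1.6, "CDK1": 1.2}, "Luminal B"),  # tree says Luminal A
]

correct = sum(classify(sample) == label for sample, label in cohort)
rate = correct / len(cohort)
print(f"correct identification rate: {rate:.0%}")
```

Because the rate is computed on the same samples that were used to derive the tree, it is an optimistic estimate, which is why evaluation in independent datasets matters.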
Hepatocellular carcinoma (HCC) represents ∼90% of all liver cancers and, due to its asymptomatic manifestation in the early stages, patients often present with advanced disease.51,52 The availability of curative therapy consisting of liver resection and transplantation for patients with early stage HCC increases the importance of identifying biomarkers for early detection.42,52 DIA-MS has been used in a small number of studies to characterise the biology of this disease and identify new protein-based diagnostic biomarkers of HCC.42–44 For instance, Gao et al. performed a comparative proteomic analysis on 14 matched pairs of HCC tumour and adjacent non-tumour tissue resections.42 In total, the authors quantified 4216 proteins and identified 191 upregulated and 147 downregulated proteins in tumour compared to adjacent normal tissue. Gene ontology and KEGG pathway enrichment analysis revealed a significant upregulation of the spliceosome pathway in HCC as well as a downregulation of 37 metabolic pathways including the metabolism of glycine, serine and sarcosine, metabolism of retinol and biosynthesis of antibiotics.42 Based on these observations, the authors selected 9 proteins for further validation by immunoblotting in an independent set of 6 matched HCC pairs, which showed expression level changes consistent with the DIA-MS data. In another study, Zhu et al. analysed 19 matched pairs of HCC and adjacent tissue samples and quantified 2579 proteins by DIA-MS, of which 541 were differentially expressed between HCC and adjacent tissue.43 A number of proteins from the heat-shock protein (HSP) family as well as mitochondrial ribosomal proteins were found to be upregulated in tumour samples compared to the adjacent tissue.
The authors focused on the DNA replication licensing factor MCM7 (MCM7), which was found by DIA-MS to be upregulated in tumour specimens, and further validated this observation by IHC in an additional series of three tumour and adjacent matched tissue specimens. The authors also separated HCC samples into two groups based on serum levels of alpha-fetoprotein (AFP), an FDA-approved serum marker used to indicate risk for liver cancer and for early detection of HCC. A comparison of adjacent normal tissue and tumour regions in HCC cases with high levels of serum AFP (>20 ng ml−1) identified 419 upregulated and 192 downregulated proteins in the tumour specimens. Conversely, no significantly altered proteins were found in the cases with low serum AFP when tumour specimens were compared to adjacent normal tissue. While hypothesis-generating in nature, these studies suggest that complex metabolic reprogramming may play a role in HCC and that there are protein alterations specific to high-risk (high serum AFP) HCC that could potentially be developed as early detection biomarkers. These findings open new opportunities in drug development for therapy and biomarker validation in this difficult-to-treat disease.
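Comparisons of tumour against matched adjacent tissue, as in the studies above, are paired designs. A minimal sketch of the per-protein calculation is shown below, assuming log2-transformed intensities and a paired t-statistic; the cited studies may have used different statistics and thresholds, and the values are invented.

```python
import math
from statistics import mean, stdev


def paired_log2_change(tumour, adjacent):
    """tumour/adjacent: log2 intensities for the same patients, same order.
    Returns (mean log2 fold change, paired t-statistic)."""
    diffs = [t - a for t, a in zip(tumour, adjacent)]
    mean_diff = mean(diffs)
    t_stat = mean_diff / (stdev(diffs) / math.sqrt(len(diffs)))
    return mean_diff, t_stat


# Invented log2 intensities for one protein in four matched pairs.
tumour = [12.0, 13.0, 12.5, 13.5]
adjacent = [11.0, 11.5, 11.5, 12.0]

log2_fc, t_stat = paired_log2_change(tumour, adjacent)
print(round(log2_fc, 2), round(t_stat, 2))
```

In a real analysis this calculation is repeated for every quantified protein, so the resulting p-values need to be corrected for multiple testing before proteins are called up- or downregulated.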
One interesting area where DIA-MS has shown some success in biomarker discovery is in glycoproteomic analysis of tissue specimens. The glycoproteome comprises all N- and O-glycosylated proteins present in tissue and is thought to be more amenable to biomarker discovery because these proteins are accessible as cell surface or secreted proteins.53 In one example, Liu et al. characterised the N-glycoproteome in prostate cancer by utilising a combination of solid phase deglycosylation of peptides and DIA-MS.45 To achieve this, they developed a novel spectral library optimised for the human N-glycoproteome generated from multiple DDA-MS sources. In this study, the authors analysed 75 tissue specimens including 10 normal prostate samples, 40 prostate cancer samples and 25 metastatic prostate cancer samples. The aim of the study was to identify protein biomarkers associated with aggressive prostate cancer. Based on the histopathological grading of the tumours (using Gleason score), the authors further divided the prostate cancer specimens into two groups, namely non-aggressive (NAG, Gleason score = 6) and aggressive (AG, Gleason score = 7–9) prostate cancer. Overall, 2188 N-glycosites were identified across all 4 pooled sample groups (normal, NAG, AG and metastatic), enabling quantification of 897 distinct N-glycoproteins. Fifty glycoproteins were found to be significantly altered between NAG and AG, including N-acylethanolamine-hydrolyzing acid amidase (NAAA) and protein tyrosine kinase 7 (PTK7), which were significantly decreased and increased in AG, respectively.45 These proteins were further evaluated by IHC analysis in tissue microarrays (TMA) on an expanded cohort of 56 prostate cancer cases, which showed that a combined panel of these two proteins was able to discriminate between AG and NAG. These data suggest that the NAAA and PTK7 glycoproteins may be candidate markers for stratification of low-risk versus high-risk prostate cancer.
However, given the relatively small single-centre cohort used in this study, studies in larger multi-centre independent cohorts are required to establish their clinical utility as robust biomarkers.
These exemplar studies demonstrate the utility of DIA-MS in the acquisition of biologically relevant protein profiles from small starting sample amounts such as biopsies. These profiles not only aid in the classification of tumour samples into molecular and histological subtypes, they also shed light on the specific biological pathways that operate within individual cancer types, which may ultimately be useful for downstream functional investigation, drug discovery and biomarker development.
The NCI-60 panel comprises 60 cancer cell lines from nine distinct tissue types. This panel is a preclinical workhorse for the cancer community and has been subjected to in-depth molecular (genomic and transcriptional) and pharmacological (over 100,000 chemical compounds) profiling. Guo et al. employed DIA-MS to analyse the proteomic landscape of the NCI-60 panel and identified 3171 proteins across all cell lines.60 The authors then used univariate and multivariate regression analysis to evaluate drug response predictions of 224 pharmacological compounds either based on the DIA-MS data alone or integrated with genomic and transcriptional features. Interrogating existing data available in CellMiner, they showed that the proteomic data contributed a higher percentage of drug response prediction features (12%) than those derived from DNA mutations (2%) and RNA transcripts (6%). They further showed that the response to 49 screened drugs was best predicted by DIA-MS data while response to 83 compounds had optimal predictive power when combining DIA-MS data with transcript and mutational data. Notably, the authors found that the protein expression levels of multiple ATP-binding cassette family transporters were strongly associated with response to cancer drugs across several classes, including alkylating agents, histone deacetylase inhibitors and kinase inhibitors. This result underscores the importance of this family of transporters as a putative mechanism of drug response and their use as candidate biomarkers for optimisation of cancer therapy. The authors further demonstrated that the predictive power of the regression models based on DIA-MS data was generally higher compared to the models using DDA data61 due to the better quantitative accuracy and data consistency of the DIA-MS dataset.
This study highlights the important role that DIA-MS can play in the burgeoning field of pharmacoproteomics, where protein level measurements not only enable deep insights into mechanisms of drug action but may also lead to predictive biomarkers of therapy response.
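The univariate form of such a drug response model can be sketched as an ordinary least-squares fit of a drug sensitivity measure on the abundance of a single protein across cell lines. The variable names and numbers below are invented for illustration and do not come from the NCI-60 dataset; the multivariate and integrative models used in the study are not reproduced here.

```python
from statistics import mean


def fit_univariate(x, y):
    """Least-squares fit y = slope * x + intercept; also returns R^2."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return slope, intercept, 1.0 - ss_res / ss_tot


# Invented data: log2 abundance of a transporter protein and a drug
# sensitivity score in five cell lines.
protein_abundance = [1.0, 2.0, 3.0, 4.0, 5.0]
drug_sensitivity = [9.8, 8.1, 6.2, 3.9, 2.0]

slope, intercept, r2 = fit_univariate(protein_abundance, drug_sensitivity)
print(round(slope, 2), round(intercept, 2), round(r2, 3))
```

Fitting one such model per protein and per compound, and comparing the variance explained against models built on transcript or mutation features, is one simple way to ask which data layer carries the most predictive signal.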
Commercial immortalised cell lines such as those in the NCI-60 panel have been subjected to decades of cell culture and thus may not retain many of the molecular features present in the tumours from which they were originally derived. In recent years, there has been a push towards the development of patient-derived models for preclinical cancer research. These models encompass patient-derived xenografts, organoids or tumour explants and are thought to better recapitulate the human disease.62,63 DIA-MS has been used to profile such models in order to identify clinically relevant mechanisms of drug response. One example is the study undertaken by Nguyen et al., who employed prostate cancer patient-derived explants obtained from men undergoing radical prostatectomy to study tumour-specific response to treatment with the heat shock protein 90 (HSP90) inhibitors 17-AAG and AUY922.58 The use of fresh tumour specimens from different patients was important for modelling the heterogeneity inherent in prostate cancer and for highlighting any conserved mechanisms of treatment response found across all patients. Proteomic analysis identified a consistent downregulation of 44 proteins involved in pathways associated with mRNA translation, ribosome function and RNA metabolism. Conversely, 54 proteins were found to be increased with drug treatment, with an enrichment of tricarboxylic acid metabolism components. Remarkably, despite the heterogeneity amongst the 46 cases examined, the authors identified 9 proteins that were universally downregulated by AUY922 treatment, including two proteins from the HIF-1 pathway, transferrin receptor protein 1 (TFRC) and metalloproteinase inhibitor 1 (TIMP1), which could serve as candidate markers of drug response.
This study provides proof-of-principle evidence for the use of DIA-MS profiling in patient-derived models and brings the field one step closer to implementing this next generation proteomic strategy in precision cancer medicine.
Another interesting area of research is the design of window of opportunity studies to better understand mechanisms of therapy response and resistance.64 Such studies involve the sampling of tumour tissue prior to and after the treatment of interest for thorough pharmacodynamic assessment. In addition to chemotherapy and surgery, radiotherapy is the mainstay local treatment in a wide array of different cancer types including prostate cancer. To investigate the major cellular pathways that are regulated following the use of radiotherapy, Keam et al. performed DIA-MS based proteomic profiling of matched tissue biopsies collected at pre-treatment and 14 days post brachytherapy from 8 prostate cancer patients.57 The authors found that out of >5000 proteins identified, 24 and 3 proteins were consistently up- and down-regulated, respectively, post radiation in all patients. The authors also identified a number of upregulated pathways in the post-radiation samples including wound healing, extracellular matrix remodelling and acute inflammatory response. These biological processes are consistent with tissue deposition and remodelling associated with radiation response. One of the limitations of this study is that it is descriptive in nature and lacks clinical response and patient outcome data, which restricts the ability to define proteins associated with brachytherapy response. Nonetheless, the identification of a number of candidate proteins which are universally regulated as a result of radiotherapy provides a useful resource for future studies elucidating their mechanistic role in radiotherapy response and resistance.
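Identifying proteins that change in the same direction in every patient can be sketched as a simple filter over paired pre- and post-treatment measurements. The data layout, the log2 fold-change cut-off and the values are assumptions for illustration; the statistical criteria used by Keam et al. are not reproduced here.

```python
def consistently_regulated(pre, post, min_abs_log2fc=1.0):
    """pre/post: dict protein -> list of log2 intensities, one per patient,
    in the same patient order. Returns (up, down) lists of proteins whose
    change meets the cut-off, in the same direction, in every patient."""
    up, down = [], []
    for protein in pre:
        changes = [b - a for a, b in zip(pre[protein], post[protein])]
        if all(c >= min_abs_log2fc for c in changes):
            up.append(protein)
        elif all(c <= -min_abs_log2fc for c in changes):
            down.append(protein)
    return up, down


# Invented measurements for three proteins in three patients.
pre = {
    "PROT_A": [10.0, 10.5, 9.8],
    "PROT_B": [12.0, 12.2, 11.9],
    "PROT_C": [8.0, 8.1, 8.3],
}
post = {
    "PROT_A": [11.5, 11.6, 11.0],   # up in every patient
    "PROT_B": [10.5, 11.0, 10.2],   # down in every patient
    "PROT_C": [9.5, 8.0, 8.2],      # inconsistent
}

up, down = consistently_regulated(pre, post)
print(up, down)
```

Requiring agreement across all patients is a deliberately strict criterion, which is consistent with only a small number of proteins passing it out of several thousand quantified.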
Collectively, the aforementioned examples demonstrate that DIA-MS is a useful tool for the investigation of how therapeutic interventions impact the proteomic landscape in cell lines, patient-derived models and human tissue and thus refines our current understanding of treatment responses at the molecular level. Such correlative studies can aid in revealing putative mechanisms of drug resistance and identify novel response markers to both chemotherapy and radiotherapy for subsequent functional and clinical evaluation.
Castration resistant prostate cancer (CRPC) is a chemoresistant form of prostate cancer that is unresponsive to androgen-deprivation therapy.65 Currently there are no alternative treatment options available for CRPC patients.66,67 To study the genomic, transcriptomic and proteomic changes during different stages of prostate cancer disease progression, Latonen et al. undertook an integrative multi-omic study of 11 tumour specimens from CRPC patients and compared them to profiles obtained from 17 untreated prostate cancer (PC) and 10 benign prostate hyperplasia (BPH) tissue specimens.66 Using DIA-MS, the authors quantified 3394 proteins across all samples and identified 382 and 728 differentially expressed proteins between CRPC and PC samples and PC and BPH samples, respectively. A comparison of the acquired proteomic dataset with the copy number and transcriptomic data obtained from the same specimens revealed a poor correlation between genomic, transcriptomic and proteomic measures. The authors hypothesized that this discrepancy may be due to alterations in the levels of cellular microRNA (miRNA) which can either directly lead to the degradation of mRNA targets or block the protein translation process by binding to mRNA and forming mRNA/miRNA complexes. Such complexes may alter levels of the expressed protein without affecting the overall mRNA levels of the coding gene.68 To test this hypothesis, the authors undertook miRNA sequencing and identified 95 differentially expressed miRNAs between PC and CRPC samples and these miRNAs have the potential to target almost 500 genes. From this list of potential gene targets, only 24% were differentially expressed between PC and CRPC at the mRNA level, while 45% were differentially expressed at the protein level supporting the concept that miRNAs may decrease protein levels but not the corresponding mRNA levels of the same gene target. 
To validate this, the authors focused on miR-22 and miR-493, which were differentially expressed between PC and CRPC, and transfected them into PC-3 prostate cancer cells. The mRNA levels of the miRNA targets Endonuclease domain containing 1 (ENDOD1) and Golgi membrane protein 1 (GOLM1) were significantly decreased in the transfected cells, while the miRNA targets KH-type splicing regulatory protein (KHRSP1) and dynamin 1-like protein (DNML1) showed no change at the mRNA level but displayed decreased protein expression levels. In a second example, the authors identified two miRNAs (miR-22 and miR-205) with the potential to target malate dehydrogenase (MDH2). DIA-MS and RT-qPCR analysis of PC-3 cells transfected with these miRNAs revealed a decrease in MDH2 protein levels but no change in MDH2 mRNA levels. This comprehensive study demonstrates the capability of DIA-MS to reveal novel insights into the regulation of gene expression in therapy resistant prostate cancer when integrated as part of a multi-omic investigation.
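The comparison of mRNA-level and protein-level changes among predicted miRNA targets described above amounts to a simple tally over a gene list. The following Python snippet is a minimal sketch of that bookkeeping and is not the pipeline used by Latonen et al.; the `TargetGene` record, the `summarise_targets` helper and the placeholder gene names are all hypothetical, and the toy input does not reproduce any data from the study.

```python
from dataclasses import dataclass
from typing import Dict, List, Union


@dataclass(frozen=True)
class TargetGene:
    """Hypothetical record for one predicted miRNA target gene."""
    symbol: str
    mrna_changed: bool     # differentially expressed at the mRNA level
    protein_changed: bool  # differentially expressed at the protein level


def summarise_targets(
    targets: List[TargetGene],
) -> Dict[str, Union[float, List[str]]]:
    """Tally how many predicted targets change at each molecular level.

    Genes that change only at the protein level are the ones consistent
    with translational repression without mRNA degradation.
    """
    if not targets:
        raise ValueError("no target genes supplied")
    n = len(targets)
    n_mrna = sum(t.mrna_changed for t in targets)
    n_protein = sum(t.protein_changed for t in targets)
    protein_only = [
        t.symbol for t in targets if t.protein_changed and not t.mrna_changed
    ]
    return {
        "fraction_mrna": n_mrna / n,
        "fraction_protein": n_protein / n,
        "protein_only": protein_only,
    }


# Toy input with placeholder gene names (assumption, not study data).
toy_targets = [
    TargetGene("GENE_A", mrna_changed=True, protein_changed=True),
    TargetGene("GENE_B", mrna_changed=False, protein_changed=True),
    TargetGene("GENE_C", mrna_changed=False, protein_changed=True),
    TargetGene("GENE_D", mrna_changed=False, protein_changed=False),
]

summary = summarise_targets(toy_targets)
print(summary)
```

In this toy example a larger fraction of targets changes at the protein level than at the mRNA level, and the `protein_only` list isolates the genes that would be prioritised for follow-up of the kind described for MDH2.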
In another example, Schwarzfischer et al. performed an integrative metabolomic and proteomic analysis of two forms of high-grade non-Hodgkin lymphomas, Burkitt's lymphoma (BL) and diffuse large B-cell lymphoma (DLBCL).69 Metabolomic analysis of 24 lymphoma cell lines (6 BL and 18 DLBCL) identified increased intracellular levels of pyruvic acid in DLBCL compared to BL as well as higher secretion of pyruvate by DLBCL cell lines. Higher levels of pyruvate were also detected in 6 DLBCL cryopreserved tumour tissue samples when compared to 5 BL tumours. Pyruvate is a key intermediate of energy metabolism and a central intersection for a number of vital metabolic pathways. To test whether the difference in pyruvate levels observed in the metabolomic studies is reflected by alterations in proteins involved in specific metabolic pathways, the authors performed proteomic analysis of 11 lymphoma cell lines (5 × BL and 6 × DLBCL), 11 fresh-frozen and 13 formalin-fixed paraffin-embedded (FFPE) tissue samples. DIA-MS analysis of the lymphoma cell lines revealed a downregulation of proteins involved in pyruvate metabolism, glycolysis and oxidative phosphorylation pathways in BL compared to DLBCL. For instance, key glycolytic enzymes such as hexokinase (HXK1) and phosphoglycerate kinase (PGK1) were significantly downregulated in BL. In contrast, an upregulation of lactate dehydrogenase (LDH1), phosphoglycerate dehydrogenase (PHGDH) and phosphoserine aminotransferase (PSAT1) in BL suggests that the metabolism of glucose using non-oxidative phosphorylation and the one carbon metabolic pathway may be the predominant processes operating in this disease. The differences in expression levels of the key enzymes described above in BL and DLBCL were further confirmed by proteomic analysis of the fresh-frozen and FFPE tissue samples.
This study underscores the important complementary role that DIA-MS has in the interpretation of metabolomics data and highlights the power of this integrative approach in revealing new insights into the complex metabolic reprogramming underlying the development of non-Hodgkin lymphoma.
Recent studies that integrate orthogonal MS strategies to sample different facets of tumour biology have also been promising. For instance, Mehnert et al. developed a multi-layered proteomic approach to study the effects of different mutations of Dual specificity tyrosine-phosphorylation-regulated kinase 2 (Dyrk2) on protein topology, protein–protein interactions (PPI) and global proteomic and phosphoproteomic profiles.70 Through interactions with the EDVP E3 ubiquitin ligase complex, Dyrk2 plays a key role in the cell cycle and apoptosis and has been identified as both a putative tumour suppressor and oncogene.71,72 Based on published data, the authors generated a series of cancer-associated Dyrk2 mutants which were expressed in HEK293 cells. Analysis of the PPI networks by affinity purification-mass spectrometry (AP-MS) identified mutation-specific reorganization of the Dyrk2 PPI network in truncated and catalytically inactive mutants of this protein. MS-based quantitative crosslinking analysis revealed topological changes in the Dyrk2 structure as well as a decrease in Dyrk2 phosphorylation status, particularly in the truncated and catalytically inactive mutants. To explore the broader effects of Dyrk2 mutations on the proteome, the authors employed DIA-MS for proteomic and phosphoproteomic analysis of the HEK293 mutant cell lines. When combined with the PPI AP-MS data, this workflow showed that a subset of Dyrk2 mutants modulated multiple proteins annotated as Cancer Driver Proteins in the Cancer Gene Census catalogue, suggesting that these Dyrk2 cancer-associated mutations have the potential to contribute to cancer progression. This study highlights the power of combining orthogonal MS-based strategies with DIA-MS to deliver multi-scale molecular information to dissect the functional roles of oncogenes and tumour suppressors.
These examples provide proof-of-principle that DIA-MS can be an integral part of proteogenomic or metaboproteomic analysis of tissue samples and cell lines. We anticipate that the use of such comprehensive integrative studies will continue to grow and that they will ultimately become a routine part of the cancer research toolkit.
Another limitation of DIA-MS is that the complexity of the mass spectra arising from this methodology is compounded when a short chromatographic separation is applied in order to increase sample throughput. This increased complexity is due to the lower number of data points collected during acquisition in combination with a very high number of co-eluting peptides. The resulting complex spectra pose significant challenges for deconvolution with conventional data processing platforms. To address this challenge, machine learning algorithms have been exploited to distinguish real signals from interfering background.19,84 A very recent innovation in this area is the development of the DIA-NN algorithm, which uses deep neural networks to improve proteome coverage in DIA-MS data analysis.84 Demichev et al. compared the performance of DIA-NN to conventional platforms such as Spectronaut, Skyline and OpenSWATH. In a 30 minute DIA-MS experiment, DIA-NN identified more precursors than Spectronaut and Skyline at the same false discovery rate (FDR) threshold, while OpenSWATH failed to process the data. Moreover, DIA-NN identified more precursors in a 30 minute experiment than Skyline and OpenSWATH did in a 60 minute experiment using the same FDR threshold. Such novel approaches could enable a step-change in the translation of DIA-MS into the clinical setting where fast and reliable analysis may be necessary for applications in personalised cancer medicine.
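The link between short chromatographic separations and the lower number of data points noted above can be illustrated with simple arithmetic. The Python sketch below assumes that one DIA cycle consists of an MS1 scan followed by one MS2 scan per isolation window, and that chromatographic peaks become narrower as the gradient is shortened. The function names and all numerical values (window count, accumulation time, peak widths) are hypothetical and are not taken from any of the studies discussed here.

```python
def cycle_time_s(n_windows: int,
                 accumulation_s_per_window: float,
                 ms1_scan_s: float = 0.0) -> float:
    """Time to step once through every isolation window (one DIA cycle).

    Assumption: cycle = one MS1 scan + one MS2 scan per window.
    """
    if n_windows <= 0 or accumulation_s_per_window <= 0:
        raise ValueError("window count and accumulation time must be positive")
    return ms1_scan_s + n_windows * accumulation_s_per_window


def points_per_peak(peak_width_s: float, cycle_s: float) -> float:
    """Average number of times a peptide is sampled across its elution peak."""
    if peak_width_s <= 0 or cycle_s <= 0:
        raise ValueError("peak width and cycle time must be positive")
    return peak_width_s / cycle_s


# Hypothetical acquisition scheme: 40 windows, 50 ms each, 250 ms MS1 scan.
cycle = cycle_time_s(n_windows=40, accumulation_s_per_window=0.05,
                     ms1_scan_s=0.25)

# Hypothetical peak widths for a long and a short gradient.
long_gradient_points = points_per_peak(peak_width_s=30.0, cycle_s=cycle)
short_gradient_points = points_per_peak(peak_width_s=9.0, cycle_s=cycle)

print(f"cycle time: {cycle:.2f} s")
print(f"long gradient:  {long_gradient_points:.1f} points per peak")
print(f"short gradient: {short_gradient_points:.1f} points per peak")
```

Under these assumed settings the same acquisition scheme samples each peak far less often on the short gradient, while more peptides co-elute within each cycle. This is the situation in which signal deconvolution becomes difficult for conventional processing platforms.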
This journal is © The Royal Society of Chemistry 2021