Suprama Datta a, Erik C. Hett b, Kalpit A. Vora c, Daria J. Hazuda bc, Rob C. Oslund *b, Olugbeminiyi O. Fadeyi *b and Andrew Emili *a
a Center for Network Systems Biology, Department of Biochemistry, Boston University School of Medicine, Boston, MA, USA. E-mail: aemili@bu.edu
b Exploratory Science Center, Merck & Co., Inc., Cambridge, Massachusetts, USA. E-mail: rob.oslund@merck.com; olugbeminiyi.fadeyi@merck.com
c Infectious Diseases and Vaccine Research, Merck & Co., Inc., West Point, Pennsylvania, USA
First published on 23rd December 2020
Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is responsible for the current coronavirus disease 2019 (COVID-19) pandemic that has led to a global economic disruption and collapse. With several ongoing efforts to develop vaccines and treatments for COVID-19, understanding the molecular interaction between the coronavirus, host cells, and the immune system is critical for effective therapeutic interventions. Greater insight into these mechanisms will require the contribution and combination of multiple scientific disciplines including the techniques and strategies that have been successfully deployed by chemical biology to tease apart complex biological pathways. We highlight in this review well-established strategies and methods to study coronavirus–host biophysical interactions and discuss the impact chemical biology will have on understanding these interactions at the molecular level.
SARS-CoV-2 originated in the city of Wuhan, China, after the summer of 2019, rapidly spread across the globe, and was eventually declared a pandemic by the World Health Organization (WHO) in early 2020. The virus has subsequently shown some genetic changes; in particular, the D614G mutation in the spike protein has been associated with higher infectivity but probably lower mortality.5 Mutations at position 614 are unlikely to affect antibodies that bind to the neutralizing epitopes. The immunological sequelae during and after infection remain incompletely understood.6 The ensuing immune response can cause a variety of outcomes. These range from initially severe disease in some patients, especially those of older age or with comorbidities, exemplified by acute respiratory distress syndrome (ARDS), to recovery and resolution in most others, with the potential for establishing immunity to subsequent exposure.
The portal of entry for SARS-CoV-2 is through the nose, mouth, and eyes, establishing an early infection in the upper respiratory tract (Fig. 1). In the majority of individuals, virus replication in the upper respiratory tract does not progress to severe disease and is cleared quickly.7 It is postulated that a robust innate immune response and/or trained immunity, and likely pre-existing cellular immunity (T-cells generated by prior exposures to common cold coronaviruses), could account for the limited infection in the upper respiratory tract.8 If the virus is not cleared from the upper respiratory tract, it travels to the lower respiratory tract and establishes infection in lung airway or bronchiole cells, leading to moderate or severe COVID-19 symptoms (Fig. 1). It is also hypothesized that the ensuing immune response in the lung to clear the infection causes pathogenic inflammation and, in some instances, manifests as ARDS.9
Immunity to several respiratory viral pathogens in the upper respiratory tract, including respiratory syncytial virus (RSV), human metapneumovirus (hMPV), parainfluenza virus type 3 (PIV3) and rhinoviruses, is considered short-lived and partial.7 Likewise, protective immunity to coronavirus infection may also be short-lived, as has been observed with common cold coronaviruses and the severe acute respiratory syndrome (SARS) and Middle East respiratory syndrome (MERS) viruses.10 Protection from severe, lower respiratory tract disease may be more robust and durable but is unlikely to be complete, especially in at-risk populations. Any form of prior specific or non-specific immune education by way of vaccination or exposure may slow down the march of the virus to the lower respiratory tract. Therefore, vaccination against SARS-CoV-2 is highly desirable to avoid serious disease.11 Most prophylactic vaccines block viral entry by eliciting antibodies to the viral surface glycoproteins responsible for viral entry into the host cells, and hence the spike protein of SARS-CoV-2 has become an attractive vaccine candidate antigen.12 In addition to eliciting neutralizing antibodies, the spike protein has several epitopes that can elicit CD4 T-cell responses and some CD8 T-cell responses.13 Additional viral proteins like nucleocapsid and proteases are more conserved than the spike protein and therefore are candidate antigens to elicit additional T-cell responses or boost pre-existing T-cell responses.14
The goal of the SARS-CoV-2 vaccine is to raise robust spike specific humoral responses that prevent viral infection. Addition of CD4 T-cell responses can also help the process, but on their own may not afford protection. Requirement of CD8 T-cell responses is controversial as the potential for collateral damage to the lung by inducing cytotoxicity, re-modeling and fibrosis while clearing infected cells is a concern. Additionally, any vaccine developed will have to deal with the perceived risk of enhanced disease by vaccinations that raise suboptimal antibody and/or Th2 T-cell responses. Therefore, understanding the molecular interaction between the virus, host cells and the immune system is of paramount importance for developing prophylactic and therapeutic interventions.
With a history deeply rooted in understanding how biomolecules engage within the complexity of biological systems, chemical biology is well-suited for interrogating the dynamic repertoire of SARS-CoV-2–host molecular interactions. The successful integration of chemical biology within the drug discovery pipeline through the development and application of chemical probe technologies enables the elucidation of ligand–target pairs and enhances exploration of biological pathway interactions through targeted protein modulation.15,16 The potential utility of chemical biology-based approaches for understanding viral pathobiology has been further enhanced through the development of a portfolio of protein activity-based probes and labeling methods that allow for high-resolution investigation of cellular enzyme function and proximal protein communities in cellular environments, respectively.17 Mass spectrometry-based proteomic endeavors to identify viral–host protein engagement within cells have also been improved through chemical biology-based affinity tags, enrichment strategies, and selective labeling methods.18
The successful demonstration of these chemical biology-based strategies and technologies has propelled expansion into other functional characterization initiatives such as the elucidation of metabolite–protein interactions,19 understanding protein community environments within different cellular regions,20 and detailing mechanisms behind cell–cell engagement.21–24 The field of chemical biology is ideally suited to address these and other fundamental questions that are central to unlocking a more detailed understanding of coronavirus–host interactions at all phases of the viral replication life cycle. In this review, we highlight previous efforts to study coronavirus–host molecular interactions and discuss the impact chemical biology will have on understanding these interactions at the molecular level.
Central metabolic pathways, including glycolysis, the tricarboxylic acid (TCA) cycle and lipid metabolism, are important targets as energy resources for viral replication (Fig. 2).28 Mapping the host–virus interaction interface impacting metabolic systems is crucial to (1) understanding the cellular pathways co-opted to support viral replication and infection, (2) assigning function to uncharacterized viral components (guilt-by-association), and (3) revealing potentially druggable targets to counter infection. Chemical probes targeting key host pathways co-opted during viral replication, or other potential ligand binders involved in the antiviral immune response, could leverage metabolic responses and pathways that comprise actionable proteins (e.g. enzymes and transcription factors) and small-molecule metabolites (glutamine, citrate, palmitate, etc.) that natively engage with bioactive compounds. The dynamic interplay between host proteins and metabolites is therefore key to the metabolic reprogramming that occurs upon infection. For example, pattern recognition receptors such as Toll-like receptors (TLRs), retinoic-acid-inducible protein I (RIG-I), RIG-I-like receptors (RLRs) and cytosolic sensors (cGAS) sense pathogen-associated molecular patterns, such as small molecules, and dimerize with their cognate adaptor molecules to activate IKK and TBK-1/IKK.29 These kinases in turn activate the transcription factors IRF3 and NF-κB to promote the production of cytokines.
Energy precursors like citrates, succinates and other glycolysis and TCA cycle intermediates also modulate production of inflammatory cytokines during the host immune response.30 Lipids and fatty acids play active roles in protein modifications and the formation of the viral envelope.31 Amino acids like glutamine can serve as an alternative carbon source for virus-infected cells.32 Endogenous nucleotides like cGMP and cAMP act as second messengers that signal interferon gene expression.33 Specialized organelles such as mitochondria and lysosomes serve as important subcellular hubs that integrate diverse players at the interface of immune response and metabolism.34,35 Conversely, the metabolic-reprogramming activity of the cytokine response can originate from immune signaling components whereby metabolites, such as succinate and citrate, can directly or indirectly regulate the canonical NF-κB pathway (Fig. 2). Studies suggest that itaconate, a small molecule metabolite structurally similar to malate, is transported across the mitochondrial inner membrane by the TCA cycle intermediate carriers (e.g. citrate) and activates the anti-inflammatory transcription factor nuclear factor erythroid 2-related factor 2 (Nrf2) following lipopolysaccharide (LPS) stimulation.36 A recent study by Olagnier et al. reported the antiviral effects of chemically synthesized, cell-permeable derivatives of itaconate and fumarate, 4-octyl itaconate (4-OI) and dimethyl fumarate (DMF) respectively, in SARS-CoV-2 infection;37 these will be discussed further in the strategy section of this review. Metabolic sensing enzymes with antiviral effects (e.g., mammalian target of rapamycin (mTOR)38 and adenosine monophosphate-activated protein kinase (AMPK)39) play major regulatory roles in both metabolism and immune responses with the onset of proinflammatory signals.
With the havoc created by the COVID-19 pandemic coupled with the recent history of other zoonotic coronavirus infections, MERS and SARS-CoV, it has become imperative to understand the pathophysiology of host–SARS-CoV-2 and coronavirus infections more generally with the goal of uncovering the mechanisms associated with host cell invasion, replication and persistence. Although most of the structural and non-structural coronaviral proteins have been assigned viral replication and cell/immune modulatory functions, respectively, the functions of the accessory proteins (7b, 8a, 8b and 9b) and truncated/untruncated sub-genomic mRNAs are still unknown (Fig. 2). We also have limited knowledge of the immune players involved in SARS-CoV-2 infection. In a conventional viral infection model, innate immune metabolic players like nuclear hormone receptors (e.g. PPARs) and TCA cycle metabolites (e.g. succinate, citrate) activate the NF-κB signaling cascade, which releases proinflammatory cytokines (IL-1, IL-6, TNF-α) and interferons (IFNs) (Fig. 2).40,41 These proinflammatory cytokines in turn activate the mTOR signaling pathway of adjacent host cells (Fig. 2).38,42 IFNs are critical to both innate and adaptive immunity, and function as the primary activators of macrophages, in addition to stimulating natural killer cells and neutrophils.43 IFNs also take part in activating the JAK-STAT signaling pathway upon binding to their receptors on the adjacent host cell membrane.43 While all these aspects remain to be worked out for SARS-CoV-2 infection, their elucidation will be possible by employing suitable strategies to map these virus–host interactions. The following section describes some of the well-established methods and strategies to address coronavirus–host interactions.
Notably, recent IFA studies report tracking membranous structures colocalizing with coronaviral replication/transcription complexes. Müller et al. studied the influence of cellular lipid metabolism on human coronavirus replication in culture using both confocal and transmission electron microscopy.50 Specifically, the group was interested in understanding the colocalization of viral-mediated dsRNA production and lysophospholipids generated by cytosolic phospholipase A2α (cPLA2α) activity. They infected Huh-7 cells with HCoV-229E or MERS-CoV, and stained these cells with antibodies against dsRNA and the coronavirus nucleocapsid (N) and NSP8 proteins. To monitor lysophospholipid generation and localization, they used a fluorogenic cPLA2α active probe. In this cell system, viral replication/transcription complexes (RTC) were observed to co-localize with lysophospholipids. Notably, when cells were treated with the cPLA2α inhibitor pyrrolidine-2 to disrupt lysophospholipid production, a corresponding decrease in viral RNA and protein accumulation was detected in coronavirus-infected cells.
Another study by Poppe et al. investigated the influence of HCoV-229E infection on cellular NF-κB signaling and the cellular transcriptome landscape.51 The authors used IFA to track coronavirus N protein expression to monitor viral infection and its spread in an A549 lung-derived epithelial cell line. Other microscopic methods have involved the detection of viral RNA or DNA by fluorescence in situ hybridization (FISH) whereby a fluorescent probe is used to target viral sequences.52
A recent imaging technique for visualizing SARS-CoV-2 infected nasopharyngeal epithelial cells was developed by Rut et al. using activity-based probes (ABPs) for the SARS-CoV-2 main protease (Mpro) that were developed from a hybrid combinatorial substrate library.53 An ABP from this library screen was able to detect SARS-CoV-2 Mpro at 5 nM enzyme concentration when incubated in the presence of 2.5 μM probe for 5 min. This study also resulted in the development of a potent SARS-CoV-2 Mpro inhibitor with half-maximal effective concentration (EC50) of 3.7 μM in a human cell line-based viral infection assay.
Classically, one of the most commonly used reporter genes has been lacZ (encoding beta-galactosidase), which can either be placed under the control of a viral promoter specific to the virus being studied or under a strong universal promoter such as HCMV immediate early enhancer element or the SV40 promoter.55 The choice of a promoter is dependent on the intended application for the recombinant virus. Although lacZ is an effective reporter, the infected cells/tissues require downstream processing/staining to enable the detection of beta-galactosidase, and this can add an additional 6–18 h to the protocol for detection of virus. An additional concern is the large size of the lacZ gene, which might be an important consideration for viruses with limited genome oversizing potential because of packaging constraints. This could potentially result in second site mutations (e.g. gene deletion) unrelated to the reporter gene insertion locus, which would also alter the phenotype of the virus.
Fluorescent proteins have become extremely popular for incorporation into recombinant viruses based on their sensitive detection in live cells in real time via fluorescence microscopy. The most common fluorescent reporter is green fluorescent protein (GFP) from the jellyfish (Aequorea victoria) and engineered variants that produce high levels of fluorescent signal and consequently allow easy detection and real-time tracking. For example, Sims and co-workers generated recombinant SARS-CoV constructs by deletion of open reading frame 7a/7b (ORF7a/7b) and insertion of GFP to study the infectivity of SARS-CoV in human angiotensin 1 converting enzyme 2 (hACE2) mediated invasion of an in vitro model of human ciliated airway epithelia (HAE) derived from nasal and tracheobronchial airway regions.56
An additional advantage of fluorescent proteins is that their small size enables viral protein tagging and construction of viruses expressing fluorescently tagged structural proteins. One disadvantage of using GFP as a reporter gene in the context of an animal model is that some tissues have an endogenous autofluorescence that makes the detection of the reporter difficult relative to the background. The liver in particular can be very problematic due to its high autofluorescence background. Antibodies to GFP and other fluorescent proteins partly overcome this problem via the use of immunohistochemical staining of tissue.
A potential alternative for in vivo detection of viruses is the use of bioluminescence imaging (BLI). This particular technique is extremely sensitive and enables the detection of virus in real time in the same animal over days to weeks post-infection. BLI requires the introduction of the firefly (Photinus pyralis) luciferase reporter gene into the viral genome of interest. After interaction with their substrate, firefly luciferase (FLuc) enzymes emit photons; the specific wavelength (400–615 nm) is dependent upon the enzyme. A fairly recent introduction, nanoLuciferase (nLuc), a 19 kDa enzyme derived from deep sea shrimp and much smaller than the 62 kDa FLuc, now provides a popular reporter assay system that produces a luminescent signal >100-fold brighter than FLuc. The key feature that makes nLuc an ideal reporter is that its luminescent signal exhibits a long lifetime, resulting in glow kinetics rather than the rapid flash signal typically produced by other small, bright luciferases such as Gaussia luciferase. nLuc produces blue light with maximum emission at 460 nm, is thermally stable, is active over a broad pH range, and requires no post-translational modifications or maturation after translation to make an active enzyme. Agostini et al. used nLuc fusions to study the susceptibility of β-coronaviruses such as mouse hepatitis virus, MERS-CoV, and SARS-CoV to remdesivir in cell-based assays and observed RNA polymerase-mediated inhibition of viral replication.57
While most recent efforts have pivoted towards more targeted proteomic approaches to study viral–host protein interactions in a cellular context, as described further below, structure-guided computational approaches aim to predict the conserved set of putative interactions of virus and host (Fig. 4). This has deepened understanding of virus-specific cellular targets that lead to rewiring of host pathways and is illuminating promising new aspects of disease intervention. These computational inferences involve data mining from literature-curated binary PPIs from public databases, e.g. Biomolecular Interaction Network Database (BIND),60 the Database of Interacting Proteins (DIP),61 Human Protein Reference Database (HPRD),62 IntAct,63 the Molecular INTeraction database (MINT),64 Virus–Host Network (VirHost-Net),65 Biological General Repository for Interaction Datasets (BioGRID),66 InnateDB,67 and the Pathogen–Host Interaction Search Tool (PHISTO).68 These PPI datasets can be scanned for conserved Pfam-A domains present in both virus- and host-encoded proteins, which are mapped to form putative domain–domain interaction networks (DDIs) that can be assessed for their topological parameters using tools such as the R package igraph or Cytoscape.
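The topological assessment mentioned above can be illustrated with a minimal sketch. It uses plain Python rather than igraph or Cytoscape, and the domain identifiers and edges are hypothetical placeholders, not curated interactions. It computes two simple parameters: node degree (to flag hub domains) and network density.

```python
from collections import defaultdict

# Hypothetical domain-domain interaction (DDI) edges: (viral domain, host domain).
# The identifiers are placeholders for illustration only.
ddi_edges = [
    ("viral_dom_A", "host_dom_1"),
    ("viral_dom_A", "host_dom_2"),
    ("viral_dom_B", "host_dom_2"),
    ("viral_dom_B", "host_dom_3"),
    ("viral_dom_C", "host_dom_2"),
]


def build_adjacency(edges):
    """Build an undirected adjacency map {node: set(neighbours)}."""
    adjacency = defaultdict(set)
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)
    return dict(adjacency)


def degree_table(adjacency):
    """Return {node: degree}, the simplest topological parameter."""
    return {node: len(neighbours) for node, neighbours in adjacency.items()}


def network_density(adjacency):
    """Fraction of all possible undirected edges that are present."""
    n = len(adjacency)
    m = sum(len(neighbours) for neighbours in adjacency.values()) // 2
    return 0.0 if n < 2 else 2 * m / (n * (n - 1))


adjacency = build_adjacency(ddi_edges)
degrees = degree_table(adjacency)
hub = max(degrees, key=degrees.get)  # most highly connected domain
density = network_density(adjacency)
print(hub, degrees[hub], round(density, 3))
```

In this toy network the host domain contacted by all three viral domains emerges as the hub, which is the kind of node a real analysis would prioritize for follow-up.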
To analyze the functional impact of predicted targeting of host protein domains by viruses, enrichment analysis of GO functional annotation terms can be carried out using R and the Bioconductor topGO package. Finally, the DDIs, along with their assigned functionality, can be assessed for disease relevance from public databases (e.g. Online Mendelian Inheritance in Man (OMIM), Human Gene Mutation Database (HGMD Public)69 and Catalogue of Somatic Mutations in Cancer (COSMIC)70) to make a comprehensive list of disease-associated viral host targets. For example, Ostaszewski et al. built a comprehensive COVID-19 disease map as a standardized knowledge repository of all putative SARS-CoV-2 virus–host interactions.71 This project was an open collaboration between scientists across the globe curating SARS-CoV-2 infection related information in one platform to support the research community in understanding SARS-CoV-2 pathophysiology and aid the development of efficient diagnostics and therapies. This platform has enabled the visual exploration and computational analyses of molecular processes involved in SARS-CoV-2 entry, replication, and host–pathogen interactions, as well as elucidated immune response, host cell recovery, and repair mechanisms.
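As a rough illustration of the GO enrichment step described above, the sketch below computes a one-sided hypergeometric (Fisher-type) p-value for the over-representation of one annotation term among predicted host targets. It is written in plain Python rather than R/topGO, and the gene names, set sizes and the function name `hypergeom_enrichment_p` are assumptions made for the example.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, sample, overlap):
    """P(X >= overlap) when drawing `sample` genes from `population`,
    of which `annotated` carry the GO term (one-sided test)."""
    upper = min(annotated, sample)
    total = comb(population, sample)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical toy data: 20 background genes, 5 annotated with the term,
# 4 predicted viral targets, 2 of which carry the annotation.
background = {f"gene{i}" for i in range(1, 21)}
go_term_genes = {"gene1", "gene2", "gene3", "gene4", "gene5"}
predicted_targets = {"gene1", "gene2", "gene10", "gene11"}

overlap = len(go_term_genes & predicted_targets)
p_value = hypergeom_enrichment_p(
    population=len(background),
    annotated=len(go_term_genes),
    sample=len(predicted_targets),
    overlap=overlap,
)
print(overlap, round(p_value, 4))
```

A real analysis would test many terms at once and would therefore also need a multiple-testing correction, which this sketch omits.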
The yeast two-hybrid assay (Y2H), devised by Fields and Song for mapping binary protein–protein interactions (PPI),72 has been used in multiple high-throughput viral–host PPI screens for unbiased viral genome-wide interactome exploration or for focusing on a subset of viral proteins.73 These studies are based on an initial construction of a viral ORFeome, comprising cloned ORFs encoding distinct viral proteins fused to a DNA binding domain (Fig. 5A). Interactions with human proteins expressed as activation domain fusions can reconstitute a functional transcription factor, driving expression of a reporter gene(s). This genetic assay is prone to false positives (and negatives), and so putative interactions discovered using Y2H screens must be confirmed by a secondary, biochemical method to generate high confidence data.
Using the Y2H assay, Xiao et al. reported the interaction of the N-terminal region of SARS-CoV Spike (S) protein with eIF3f, a subunit of the eukaryotic initiation factor 3 (eIF3), which they subsequently confirmed by co-immunoprecipitation (co-IP) and IFA.74 Another systems biology study employing a proteome-wide Y2H screen to identify immunophilins as interaction partners of the SARS-CoV non-structural protein 1 (Nsp1) was performed by Pfefferle et al.75 Immunophilins act as receptors of immunosuppressive drugs such as cyclosporin A (CsA) and tacrolimus and thus this study suggests non-immunosuppressive derivatives of CsA may serve as potential broad-range inhibitors of SARS-CoV infection.
Another powerful tool to investigate the biological function of specific proteins either in vitro or in vivo is the RNA interference (RNAi) platform. The introduction of small interfering RNAs (siRNA), 20–25 nucleotide short double-stranded RNAs that are specific to target mRNA sequences, into cells allows for sequence-specific degradation of the mRNA. This method is a relatively fast, simple, and robust approach to specifically downregulate protein expression and study the resulting cellular function. Expression knockdown has been successfully used in virology to study the role of specific host factors in the infectious life cycle and replication of viruses. Millet et al. used this platform to successfully validate the functional impact of ezrin, an actin binding protein, that they identified by Y2H screens as an interactor with the SARS-CoV spike (S) protein.76
Recently, clustered regularly interspaced short palindromic repeats (CRISPR/Cas) technology has greatly expanded the ability to genetically probe virus–host functional dependencies by means of initial screening and loss/gain-of-function based confirmation of targets (Fig. 5B).77 By exploiting the characteristics of virus–host interactions and the basic rules of nucleic acid cleavage or, most recently, gene regulation via the CRISPR–Cas system, the technology can be used to target both the virus genome and host factors to define their roles in virus infection or replication. Variants include CRISPR interference (CRISPRi), which inhibits expression of target genes, and CRISPR activation (CRISPRa), which increases target gene expression, so that contrasting impacts on viral proliferation can be tested.
In a notable recent study, Abbott et al. reported a CRISPR–Cas13-based antiviral strategy, PAC-MAN (prophylactic antiviral CRISPR in human cells), for viral inhibition that can effectively degrade SARS-CoV-2 RNA sequences in human lung epithelial cells.78 They designed and screened multiple single guide RNAs (sgRNAs) targeting conserved viral genomic regions to identify functional sgRNAs potently targeting SARS-CoV-2 (Fig. 5C). Another study by Broughton et al. developed a CRISPR–Cas12-based lateral flow assay for the rapid detection of SARS-CoV-2 from respiratory swab RNA extracts (Fig. 5C).79 Daly et al. reported the interaction of the host cell receptor Neuropilin-1 (NRP-1) with the CendR motif of SARS-CoV-2 spike (S), a polybasic Arg-Arg-Ala-Arg C-terminal sequence generated on the S1 subunit upon cleavage by the host protease furin. NRP-1 is a cell-surface receptor that plays an essential role in angiogenesis, regulation of vascular permeability, and the development of the nervous system. This group probed the functional relevance of this interaction through CRISPR/Cas-mediated knockout of NRP-1.80
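The idea of designing guides against conserved viral genomic regions, as in the PAC-MAN study above, can be sketched in a few lines. This is an illustrative toy, not the authors' pipeline: the sequences are invented, the window length is an assumed parameter, and real guide design would also score off-targets and RNA accessibility. The sketch scans pre-aligned RNA sequences for windows that are identical in every sequence and returns the reverse complement of each window as a candidate spacer.

```python
def conserved_windows(aligned_sequences, window=22):
    """Return (start, subsequence) for windows identical across all
    pre-aligned sequences; windows with gaps or ambiguous bases are skipped."""
    if not aligned_sequences:
        return []
    length = min(len(seq) for seq in aligned_sequences)
    reference = aligned_sequences[0]
    hits = []
    for start in range(length - window + 1):
        candidate = reference[start:start + window]
        if "-" in candidate or "N" in candidate:
            continue
        if all(seq[start:start + window] == candidate
               for seq in aligned_sequences[1:]):
            hits.append((start, candidate))
    return hits


def reverse_complement_rna(sequence):
    """Reverse complement of an RNA string (the spacer pairs with the target)."""
    pairs = {"A": "U", "U": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(sequence))


# Hypothetical aligned fragments from three viral isolates; the second
# isolate carries a single substitution at position 14.
isolates = [
    "AUGGCUACGUUAGCCGAUAC",
    "AUGGCUACGUUAGCAGAUAC",
    "AUGGCUACGUUAGCCGAUAC",
]

hits = conserved_windows(isolates, window=8)  # short window for the toy data
spacers = [reverse_complement_rna(target) for _, target in hits]
print(len(hits), hits[0], spacers[0])
```

Only windows that avoid the variable position survive the filter, which mirrors the rationale for targeting conserved regions: a guide placed there is less likely to be escaped by circulating variants.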
The receptor recognition mechanism of SARS-CoV spike (S) protein towards hACE2, which defines host cell infectivity, pathogenesis and host range, was assessed using SPR by Wu et al. (2012).81 They measured the binding affinities between the receptor binding domain (RBD) of S and hACE2 variants by immobilizing serial dilutions of the proteins on a sensor chip through covalent coupling via amine groups. An alternate biophysical approach was likewise employed to explore the RBD–hACE2 interface by Shang et al. (2020).82 They generated a crystal structure of SARS-CoV-2 RBD complexed with hACE2 and compared it to the interface for SARS-CoV. This study found the ACE2-binding ridge in SARS-CoV-2 RBD assumes a more compact conformation compared to SARS-CoV RBD along with several residue changes that stabilize binding pockets at the RBD–hACE2 interface, thereby increasing the binding affinity for ACE2.
Fig. 6 Mass spectrometry-based techniques used to identify virus–host interaction networks. (A) Overview of proteomic tools utilized in the study of host–pathogen interactions and their integration with omics approaches. (B) Quantitative multi-omic analysis workflow to define the host response to an infection. The resulting datasets are mapped to known metabolic pathways to measure the up- or downregulation upon infection and integrated with phenotype data to construct correlation networks. (C) Schematic of immunoaffinity or epitope tag-based affinity purification coupled to mass spectrometry (IP-MS/AP-MS) workflow. Components include immunoaffinity purification of protein complexes, enzymatic digestion of proteins, nano-liquid chromatography coupled to mass spectrometry (nLC-MS/MS), and bioinformatic analysis to identify proteins. Label-free protein quantification may be performed by MS/MS spectral counting or precursor ion intensity (MS1) integration. (D) Global proteomic approaches can be used to study alterations in protein abundance throughout infection. These changes can be quantified at the MS level, by pulse labeling via Stable Isotope Labeling in Cell Culture (SILAC) and/or isobaric tagging (such as tandem mass tagging, TMT) of samples after proteolysis and comparing the ion intensities to define proteome alterations at multiple time points of infection. In this workflow, cells are harvested at different infection times after pulse labeling and digested peptides from each sample are labeled with isobaric tags consisting of unique reporter masses. The samples are mixed together for MS analysis, and peptide quantification is assessed at the MS/MS level using the reporter ion intensities. Peptide quantitative values derived from sequences assigned to the same protein are used to calculate the overall relative protein abundance. (E) Workflow showing differential analysis of global proteomic and metabolomic profiles of COVID-19 patient cohorts vs. healthy individuals via an untargeted LC-MS/MS platform. (A–D) adapted from ref. 95.
The method that has seen the widest implementation in host–pathogen interaction studies is immunoaffinity or epitope tag-based affinity purification coupled to mass spectrometry (Fig. 6C). In immunoprecipitation mass spectrometry (IP-MS), a target of interest is isolated using an antibody raised against the endogenous protein whereas in the case of affinity purification coupled to mass spectrometry (AP-MS) the target protein of interest is co-expressed with an epitope tag which is isolated using antibodies raised against the epitope tag. This immunoprecipitated or affinity-purified protein of interest, along with co-isolated interacting proteins, is then identified by MS. IP-MS and AP-MS studies can be performed from both the pathogen and host perspective. For example, selective enrichment of a viral protein can facilitate understanding of the host factors it interacts with to promote replication or suppress host defense pathways. Alternatively, enrichment of host cellular protein can be performed to identify interactions with surrounding protein partners during viral infection to characterize possible changes in the hijacked host protein function(s). Both IP-MS and AP-MS studies that capture differential interactions are often performed in conjunction with fluorescent tagging and microscopy. These studies provide complementary spatial–temporal information about host–pathogen interactions to follow through the temporal cascade of cellular events that occur during a pathogen infection. One advantage of IP-MS over AP-MS is that experiments can be performed in a native cellular context to enable unbiased detection of PPIs whereas AP-MS requires overexpression of the protein of interest to capture interacting proteins and can result in background binding artifacts. Furthermore, in both methods, weakly bound protein interactors or localized, but non-interacting, by-stander proteins cannot be captured.
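To make the background-binding problem concrete, the following sketch filters AP-MS preys by comparing spectral counts from bait pulldowns with those from a control pulldown. It is a deliberately simplified stand-in for dedicated scoring tools; the protein names, counts, pseudocount and fold-change threshold are all hypothetical choices for illustration.

```python
from statistics import mean

# Hypothetical spectral counts per prey protein across three replicate
# pulldowns of a tagged viral bait and of an empty-vector control.
bait_counts = {
    "PREY_A": [12, 15, 10],
    "PREY_B": [3, 0, 2],
    "PREY_C": [40, 38, 45],
}
control_counts = {
    "PREY_A": [0, 1, 0],
    "PREY_B": [2, 3, 1],
    "PREY_C": [35, 42, 39],
}


def fold_enrichment(bait, control, pseudocount=1.0):
    """Ratio of mean counts, with a pseudocount to avoid division by zero."""
    return (mean(bait) + pseudocount) / (mean(control) + pseudocount)


def candidate_interactors(bait_counts, control_counts, min_fold=4.0):
    """Keep preys seen in every bait replicate and enriched over control."""
    hits = {}
    for prey, bait in bait_counts.items():
        control = control_counts.get(prey, [0] * len(bait))
        reproducible = all(count > 0 for count in bait)
        fold = fold_enrichment(bait, control)
        if reproducible and fold >= min_fold:
            hits[prey] = round(fold, 2)
    return hits


hits = candidate_interactors(bait_counts, control_counts)
print(hits)
```

In this toy data set the abundant prey that binds equally well in the control is rejected as background, while the reproducibly enriched prey is retained as a candidate interactor.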
A recent example by Gordon et al. used AP-MS to investigate the SARS-CoV-2 virus–host protein interactions. The authors devised a viral–host SARS-CoV-2 protein interaction map by cloning, tagging, and expressing 26 of the 29 proteins encoded by the SARS-CoV-2 genome in HEK293T cell lines and identified the human proteins that physically associate with each viral protein bait.83 They identified 332 putative viral–host PPIs encompassing 66 druggable human proteins targeted by 69 small molecule compounds (including 29 drugs approved by the US Food and Drug Administration, 12 in clinical trials, and 28 in preclinical studies), and established several antiviral candidates potentially suitable for repurposing as antiviral therapeutics. This group also did a comparative coronavirus – human PPI study to understand the conservation of target proteins and cellular processes between SARS-CoV-2, SARS-CoV, and MERS-CoV.84
Global proteomics approaches that quantify changes in protein abundance and post-translational modifications (PTMs) (e.g. phosphorylation) are powerful tools to elucidate mechanisms of viral pathogenesis by providing a snapshot of how cellular pathways are rewired upon infection. Importantly, the functional outcomes of many protein and PTM event changes are well annotated, especially for kinases as drug targets where phosphorylation directly regulates their activity. These approaches employ mass-spectrometry analysis coupled with bioinformatics-based tools to quantitatively assess changes to protein and/or PTM levels. Bojkova et al. performed a global quantitative proteomic analysis of SARS-CoV-2 infected Caco-2 cell lines and revealed that the virus rewires host cellular pathways such as translation, splicing, carbon metabolism, protein homeostasis (proteostasis) and nucleic acid metabolism.85 The authors used a previously developed method known as multiplexed enhanced protein dynamics (mePROD) proteomics that combines both Stable Isotope Labeling in Cell Culture (SILAC) and Tandem Mass Tags (TMT) labeling methodologies to enhance quantification of protein level changes with high temporal resolution (Fig. 6D).86 A quantitative profiling of the global phosphorylation and protein abundance landscape of SARS-CoV-2 infection was also reported by Bouhaddou et al., who mapped phosphorylation changes to disrupted kinases and pathways and used this information to identify potential anti-viral small molecules.87 Collectively, the combination of these quantitative methods with other chemogenomic screening efforts88 has the potential to rapidly prioritize proteins, PTMs, and associated drugs and compounds for treating SARS-CoV-2 infection.
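The roll-up from isobaric reporter-ion intensities to relative protein abundance across infection time points can be sketched as follows. This is a minimal illustration under stated assumptions, not the mePROD workflow: the peptide table, channel names and intensities are invented, the mock channel is assumed to be the reference, and steps such as channel loading normalization are omitted.

```python
from collections import defaultdict
from math import log2
from statistics import median

# Hypothetical peptide-level reporter-ion intensities, one value per
# isobaric channel (mock-infected, 6 h and 24 h post-infection).
channels = ["mock", "6h", "24h"]
peptides = [
    ("PROT1", "PEPTIDEA", [100.0, 200.0, 400.0]),
    ("PROT1", "PEPTIDEB", [50.0, 100.0, 200.0]),
    ("PROT2", "PEPTIDEC", [300.0, 300.0, 150.0]),
]


def protein_log2_ratios(peptides, reference_index=0):
    """Per protein, the median over peptides of log2(channel / reference)."""
    per_protein = defaultdict(list)
    for protein, _sequence, intensities in peptides:
        if any(value <= 0 for value in intensities):
            continue  # skip peptides with missing reporter signal
        reference = intensities[reference_index]
        per_protein[protein].append(
            [log2(value / reference) for value in intensities]
        )
    return {
        protein: [median(column) for column in zip(*rows)]
        for protein, rows in per_protein.items()
    }


ratios = protein_log2_ratios(peptides)
for protein, values in ratios.items():
    print(protein, dict(zip(channels, values)))
```

Taking the median of peptide ratios, rather than summing raw intensities, is one common design choice because it limits the influence of a single outlier peptide on the protein-level estimate.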
Although SARS-CoV-2 infects pneumocytes, leading to acute lung injury and impaired gas exchange, the molecular mechanisms driving infection and pathology remain unclear. To address this gap, our team performed a quantitative phosphoproteomic survey of pluripotent stem cell-derived human alveolar epithelial type 2 cells cultured at an air–liquid interface during infection with SARS-CoV-2 (Hekman et al., in press). The resulting time-course profiles revealed rapid remodeling of diverse host cell systems, including signal transduction machinery, RNA processing, translation, metabolism, nuclear integrity, protein trafficking, and cytoskeletal-microtubule organization, leading to cell cycle arrest, genotoxic stress, and innate immunity. Comparison with other proteomic studies of undifferentiated transformed cell lines highlighted convergent and divergent responses, reflected by differential sensitivity to antiviral compounds, providing a rich perspective for targeting respiratory processes hijacked by SARS-CoV-2 as potential therapeutic avenues.
Global proteomic and metabolomic profiling of COVID-19 patient sera is another approach adopted to address the differential expression of proteins and metabolites upon infection when compared to healthy cohorts (Fig. 6E). Shen et al. found 105 differentially expressed proteins in the sera of COVID-19 patients.89 Correlating expression profiles with clinical disease severity showed that 93 proteins were specifically modulated in severe patients. In their network enrichment analyses, activation of the complement system, macrophage function, and platelet degranulation were the major pathways associated with 50 of these 93 proteins. In addition, 373 metabolites were significantly changed in COVID-19 patients, and 204 metabolites correlated with disease severity. The same three pathways were also the major ones associated with the key dysregulated proteins and metabolites in the pathway and network enrichment analyses of this study.
Our group has been building a ligand discovery pipeline employing the AP-MS approach for screening host endogenous metabolite ligands of diverse protein baits, including SARS-CoV-2 proteases (Fig. 7A). We believe that this study will contribute to a basic understanding of novel aspects of the chemical biology of SARS-CoV-2–host interactions and may help explain why innate cellular and immunological responses to this virus are so varied.
Another chemical biology approach to measure targeted small molecule–protein engagement is the cellular thermal shift assay (CETSA).90 This assay involves treating cells with one or more compounds of interest and then heating them to denature and precipitate proteins, on the premise that ligand binding stabilizes proteins and hence preserves their solubility. After aggregated protein is removed from the soluble fraction, a quantitative proteomics workflow is applied to measure ligand-induced changes in protein solubility (Fig. 7B). These changes are used to infer the target of a compound of interest, such as a bioactive small molecule from a cell-based antiviral phenotypic screen or a drug lead, and thereby to deduce or confirm its mechanism of action.
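The readout of such an experiment can be illustrated with a minimal sketch that estimates an apparent melting temperature (the temperature at which half of the protein remains soluble) for vehicle- and compound-treated samples and reports the shift between them. The temperatures and soluble fractions below are invented for illustration, and linear interpolation stands in for the sigmoidal curve fitting that is typically used in practice. The function name `melting_temperature` is a hypothetical helper.

```python
def melting_temperature(temps, soluble_fraction, midpoint=0.5):
    """Estimate the temperature at which the soluble fraction falls to `midpoint`.

    Uses linear interpolation between the two neighbouring measurements
    (an assumption of this sketch; real workflows usually fit a sigmoid).
    """
    points = list(zip(temps, soluble_fraction))
    for (t_lo, f_lo), (t_hi, f_hi) in zip(points, points[1:]):
        if f_lo >= midpoint >= f_hi and f_lo != f_hi:
            return t_lo + (f_lo - midpoint) * (t_hi - t_lo) / (f_lo - f_hi)
    raise ValueError("melting curve does not cross the midpoint")


# Hypothetical soluble fractions for one protein, normalized to the
# lowest temperature; not measured data.
temps = [37, 41, 45, 49, 53, 57, 61]
vehicle = [1.00, 0.90, 0.70, 0.30, 0.10, 0.05, 0.00]
treated = [1.00, 0.95, 0.90, 0.70, 0.30, 0.10, 0.00]

if __name__ == "__main__":
    tm_vehicle = melting_temperature(temps, vehicle)
    tm_treated = melting_temperature(temps, treated)
    # A positive shift is consistent with ligand-induced stabilization.
    print("apparent Tm shift:", round(tm_treated - tm_vehicle, 2))
```

In a proteome-wide experiment the same calculation would be repeated for every quantified protein, and proteins with reproducible shifts would be prioritized as candidate targets.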
A recent addition to the affinity purification-based interactomics toolkit is the family of proximity-based labeling techniques, wherein either an engineered enzyme or, in a recent advance, a photocatalyst is used to chemically label cellular protein(s) in close proximity to a protein of interest. Established methods are centered on BioID, in which a promiscuous biotin ligase is fused to a target protein, leading to the biotinylation of cellular factors bound to, or in close proximity to, the bait.91 The biotinylated proteins are then selectively recovered by binding to an affinity capture matrix (e.g. streptavidin). Related approaches based on a similar engineered enzyme principle, i.e. ascorbate peroxidase (APEX, APEX2) and horseradish peroxidase (HRP), have been described,92 as has another peroxidase-based assay known as selective proteomic proximity labeling assay using tyramide (SPPLAT)93 for mapping PPIs. Whereas AP-MS based approaches tend to be biased towards identifying stable protein complexes, proximity labeling methods can map cellular networks consisting of both stable and transient PPIs. Proximity labeling systems are finding increased utility for profiling entire cellular compartments and for monitoring cell–cell contacts that mediate cell engagement.
To understand coronavirus–host protein interactions, V'kovski et al. used BioID-based proximity labeling to map the microenvironment of the replication and transcription complex of murine coronavirus (MHV).94 By fusing the biotin ligase enzyme to MHV-nsp2 and infecting L929 murine fibroblasts with the resulting virus, this group identified the close association of replicase gene products nsp2–10, RNA-dependent RNA polymerase (nsp12), the NTPase/helicase (nsp13), the 5′-cap methyltransferases (nsp14, nsp16), the proof-reading exonuclease (nsp14), and the nucleocapsid protein. They proposed that nsp2–16 and the nucleocapsid protein collectively constitute a functional coronavirus replication and transcription complex in infected cells. In addition to these enriched viral gene products, >500 host proteins were also enriched in this labeling experiment, suggesting their proximity to the MHV replication/transcription complex.
Recently, we developed a microenvironment-mapping platform (μMap) to identify protein interacting partners through photocatalyst-mediated carbene generation, which selectively tags neighboring proteins on cell membranes for identification.24 This technology was used to identify proximal protein environments of the immune cell surface proteins CD45, CD47, CD29, and PD-L1 in a visible-light-dependent manner. Furthermore, synaptic labeling between interacting cells within a co-culture environment was achieved with the μMap technology. Successful application of this technology in mammalian cell systems highlights the potential for utilizing photocatalyst-based strategies to map viral–host protein interaction environments relevant to coronaviral infection. This could involve attaching a photocatalyst to SARS-CoV-2 spike, protease, or replicase proteins to identify their respective interacting host proteins on the cell surface or within cytosolic environments (Fig. 8). A more detailed understanding of these dynamic interactions may provide alternative therapeutic targets for inhibiting SARS-CoV-2 infection and disease spread.
HPRD – https://www.hprd.org/
IntAct – https://www.ebi.ac.uk/intact/
MINT – https://mint.bio.uniroma2.it
VirHost-Net – https://virhostnet.prabi.fr
BioGrid – https://thebiogrid.org
InnateDB – https://www.innatedb.com
PHISTO – https://phisto.org
OMIM – https://www.ncbi.nlm.nih.gov/omim
HGMD – http://www.hgmd.cf.ac.uk/ac/index.php
COSMIC – https://cancer.sanger.ac.uk/cosmic
This journal is © The Royal Society of Chemistry 2021