Network analysis of SARS-CoV-2

The COVID-19 pandemic, caused by the novel coronavirus SARS-CoV-2, poses an immediate threat to humanity. An urgent challenge is to understand the underlying biology of how SARS-CoV-2 acts when it infects the body.

To combat the virus with effective treatments, we must first learn the defining and driving features of SARS-CoV-2 infection. A recent study by Blanco-Melo et al. (Cell, 2020) found that SARS-CoV-2 infection is characterized by unusually low innate antiviral defenses and a high pro-inflammatory response. Although the study bridges an important gap in our understanding of the virus’ pathogenicity, additional insight can be gained from a more in-depth identification and classification of the molecular systems affected by the viral infection. Furthermore, investigating the impact at the systems level allows drug target candidates to be evaluated in the context of the biological networks they affect.

Here, we sought to re-analyze the Blanco-Melo et al. data with the aim of gaining new therapeutic insights and targets for COVID-19, by identifying ‘secondary cellular responses’ outside of the primary immune response. We use a network approach based on physical protein-protein interactions (inBio Map™) to find biological systems regulated as a whole in response to SARS-CoV-2 infection. We show that by going beyond a limited list of individually significantly regulated genes, we can uncover new insights and drug target candidates for COVID-19.

Highlights from this work
  1. We use gene expression data from SARS-CoV-2-infected human cells to perform network-level analysis of the host transcriptional response.
  2. We find biological networks that, as a whole, are more strongly regulated than expected from the transcriptome background, even if individual members of the network are not strongly regulated.
  3. We point to molecular mechanisms affected by the virus infection that are not immediately obvious from investigating individually significantly regulated genes.
  4. We identify drug target candidates in the networks (6 targets with approved drugs and 29 with drugs in clinical trials). Our network-based approach identifies more than twice as many drug target candidates as the Blanco-Melo et al. list of differentially expressed genes.

COVID-19 regulated networks

We identified 13 networks significantly regulated during SARS-CoV-2 infection. In general, most networks were up-regulated in response to SARS-CoV-2 infection. In the table below the networks are numbered by their ranking of significance. The networks are made freely available via our inBio Discover™ platform.

Network        Size  Main biology
NW1            46    Cell cycle, protein degradation
NW2            50    DNA repair, DNA replication
NW3            41    DNA repair (DS break)
NW4            24    Cell cycle (chromatid segregation)
NW5            48    DNA repair
NW6            18    Cell cycle (M phase)
NW7            36    Transcription (RNA polymerase 1)
NW8            35    Vesicle transport
NW9            33    DNA repair (DS break)
NW10           17    Cell cycle (M phase)
NW11           39    Protein localization (less certain)
NW12           37    tRNA, RNA processing
NW13           36    Vesicle coating, vesicle localization
NW1-13 merged  351   Networks 1-13 merged

Each network was inspected in detail, and the dominant molecular biology in each network was assessed via Gene Ontology overrepresentation analysis using inBio Discover™. The biology represented in the networks agrees well with how SARS-CoV-2 and other coronaviruses are believed to affect infected cells: disruption of cell cycle regulation (NW1, NW4, NW6, NW10), including mechanisms linked to DNA repair (NW2, NW3, NW5, NW9); the formation of new virus particles via vesicle formation in the Golgi apparatus (NW8, NW13); and effects on the host’s RNA processing systems (NW7, NW12). (We also repeated the analysis for the top 25 networks, and the same classes of biology were represented in this extended set as well, indicating further robustness of the findings.)
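To illustrate what an overrepresentation test computes, here is a minimal Python sketch of a one-sided hypergeometric test. It is not the inBio Discover™ implementation, which is not described here. The function name `hypergeom_enrichment_p` and all counts except the NW1 size of 46 are assumptions chosen for illustration.

```python
from math import comb


def hypergeom_enrichment_p(population: int, annotated: int, sample: int, hits: int) -> float:
    """One-sided hypergeometric tail probability P(X >= hits).

    population: number of background genes
    annotated:  background genes carrying the GO term
    sample:     genes in the network being tested
    hits:       network genes carrying the GO term
    """
    max_hits = min(annotated, sample)
    total = comb(population, sample)
    # math.comb returns 0 when k > n, so impossible draws contribute nothing.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, max_hits + 1)
    )
    return tail / total


# Hypothetical counts: a 46-gene network (the size of NW1), an assumed
# background of 20,000 genes, 500 of which carry the term, and 12 hits.
p = hypergeom_enrichment_p(population=20_000, annotated=500, sample=46, hits=12)
print(f"enrichment p-value: {p:.3g}")
```

A small p-value means the term occurs among the network genes more often than expected by chance. In practice the p-values would also need correction for the number of GO terms tested.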

We further annotated the networks with information about direct virus-to-host protein-protein interactions as well as known drug targets. For each main category of biological function, we have highlighted an example below, with an interesting pattern of virus-host interactions and/or drug targets.

We merged networks 1 through 13 into a combined network of 351 genes. We compared our network to the list of 120 differentially expressed genes (DEGs, FDR < 5%) reported by Blanco-Melo et al. Only 3 genes overlap between the two gene sets, demonstrating that individual DEGs and biological networks regulated as a whole are two complementary approaches to studying gene regulation. Our networks cover a more expansive set of biological systems and include more than twice as many drug target candidates as the DEG gene set. In summary, network analysis is a powerful and complementary approach to derive new and deeper biological insights from transcriptomic data.
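The comparison between the merged network and the DEG list comes down to a set overlap. The sketch below shows one way to compute it in Python. The helper `compare_gene_sets` and the placeholder gene identifiers are hypothetical, and the real gene lists are not reproduced here.

```python
def compare_gene_sets(network_genes, deg_genes):
    """Summarize the overlap between a network gene set and a DEG list."""
    network, degs = set(network_genes), set(deg_genes)
    shared = network & degs
    union = network | degs
    return {
        "shared": sorted(shared),
        "network_only": len(network - degs),
        "deg_only": len(degs - network),
        # Jaccard index: shared genes divided by all distinct genes.
        "jaccard": len(shared) / len(union) if union else 0.0,
    }


# Placeholder identifiers, not real results.
toy_network = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]
toy_degs = ["GENE_C", "GENE_D", "GENE_E"]
summary = compare_gene_sets(toy_network, toy_degs)
print(summary)
```

With the counts reported above (351 network genes, 120 DEGs, 3 shared), the Jaccard index is 3 / 468, or roughly 0.006. This makes the point numerically that the two gene sets are nearly disjoint.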

Highlighted Networks

Here, we provide biological characterization for four highlighted networks.

Network 13: Vesicle coating, vesicle localization

This network is closely linked to the formation of viral particles in the Golgi apparatus. Not only is the system significantly regulated as a whole at the transcriptional level (the individual genes/proteins are only weakly regulated), but it also contains key virus-host interactions (see table below) plausibly involved in hijacking the Golgi as a virus particle factory. Furthermore, one of the drug targets in the network (CTSC) currently has an associated drug in phase 2 clinical trials specifically for the treatment of COVID-19 (ref.), which further strengthens the link from this biological system to COVID-19.

Drug targets (pre-clinical or clinical trials)
  • CTSC
  • CSNK1D
  • CAPN1

Virus-host interactions
  • GOLGA2 interacts with nsp13
  • GORASP1 interacts with nsp13
  • RAB1A interacts with nsp7

Network 7: Transcription

This network is linked to the viral process of interfering with host gene expression. It is already known that other coronaviruses, including SARS-CoV, interfere with the host’s transcriptional system to enhance virus transcription/replication via interaction with the helicase DDX1 (ref.). DDX1 is part of a complex containing DDX1, DHX36, TICAM1 and DDX21, the last of which is found in this network. Furthermore, DDX21 has been shown to be the target of a virus-host interaction (see table below).

Drug targets (pre-clinical or clinical trials)
  • CDK7
  • KAT2B
  • SIRT1
  • TP53

Virus-host interactions
  • DDX21 interacts with nucleocapsid (Protein-N)

Network 1: Cell cycle / protein degradation

This network captures part of the response to cell cycle deregulation by the virus. This effect is well known from other viruses, including SARS-CoV (where it is mediated via an interaction with nsp3b), as well as from SARS-CoV-2 (where it is mediated via an interaction involving protein 7a). Several points in the cell cycle are commonly targeted by viruses: the G1/S checkpoint, the G2/M checkpoint, mitotic exit and the G0/G1 checkpoint. SARS-CoV-2 is currently thought to affect the host cell cycle by dysregulating the G0/G1 transition, potentially via interference with the cyclin D3/pRb pathway. Any dysregulation of the cell cycle will naturally affect multiple cell cycle-related systems in a differential gene expression study, which is what we observe (see network overview list).

Drug targets (pre-clinical or clinical trials)
  • EED
  • LRRK2
  • USP14
  • UCHL5
  • TNK2
  • TP53

Virus-host interactions
  • None listed

Network 3: DNA repair (DS break)

The final network we showcase in full detail is linked to DNA repair. Because DNA repair is so closely linked to the S phase of the cell cycle, it is difficult to completely separate regulation of these two biological systems in a differential gene expression study. However, DNA repair (including DS breaks) has previously been linked to SARS-CoV, and it plausibly also plays a role in SARS-CoV-2 infection. Our analysis identified multiple networks related to aspects of DNA repair (and, by proxy, the cell cycle S phase); see the network overview list.

Drug targets (pre-clinical or clinical trials)
  • ATM
  • CHEK2
  • CRBN
  • IFNAR1
  • NLRP3
  • UBC

Virus-host interactions
  • NSD2 interacts with nsp8

Network 1-13: Merged network

We merged networks 1-13 into a single network. This combined network contains several drug targets (6 targets with approved drugs and 29 with drugs in clinical trials) as well as 5 virus-host interactions. When overlaid with functional pathway annotations (Gene Ontology and Reactome), we see the four overall biological functions as sub-networks within the merged network. This is a strong example of how proteins that share biological functions cluster together in biological networks, and how network biology can thus strengthen interpretation and insights from transcriptional data.

Drug targets
  • 6 targets with approved drugs
  • 29 with drugs in clinical trials

Virus-host interactions
  • DDX21 interacts with nucleocapsid (Protein-N)
  • NSD2 interacts with nsp8
  • GOLGA2 interacts with nsp13
  • GORASP1 interacts with nsp13
  • RAB1A interacts with nsp7

Limitations

We elected to focus this study on networks of limited size in order to get a clearer separation of overlapping biological systems. Focusing on smaller networks also allowed us to characterize the significantly regulated networks in more depth. The downside of restricting the network size is that we do not cover the full spectrum of biological networks implicated in the response to SARS-CoV-2.

Summary

We used protein-protein interaction networks (inBio Map™) to perform a system-level evaluation of genome-wide transcriptomics data to uncover new insights and targets for COVID-19. Our work implicates vesicle transport, cell cycle, DNA repair and transcriptional biology (see the figure below) as part of the host’s response to SARS-CoV-2 infection. Our networks cover a more expansive set of biological systems and include more than twice as many drug target candidates as the DEGs reported by Blanco-Melo et al. In summary, network analysis is a powerful approach to derive new, deeper and actionable biological insights from transcriptomic data.

SARS-CoV-2 life cycle. The COVID-19 regulated networks identified in this work represent biological processes that match the known life cycle of SARS-CoV-2 and other coronaviruses (boxes marked with red text). Created with BioRender.com.

Methods and network analysis

Differential expression

We obtained differential expression summary statistics of SARS-CoV-2 host infection from Blanco-Melo et al. (now published in Cell, 2020). Briefly, these data were generated as follows: human lung alveolar cells (A549) were infected with SARS-CoV-2 and RNA-seq libraries were prepared and sequenced, and differential expression analysis against control (mock infected A549 cells) was performed using DESeq2. The authors reported 120 differentially expressed genes (FDR < 5%) of which more than 80% were up-regulated.
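As a rough illustration of how a DEG list and its up-regulated fraction can be derived from DESeq2-style summary statistics, consider the Python sketch below. The column names `log2FoldChange` and `padj` follow DESeq2's standard results table. Whether the supplementary table uses exactly these names is an assumption, and the rows are made-up placeholders.

```python
def summarize_degs(rows, fdr=0.05):
    """Return (DEG rows, fraction of DEGs that are up-regulated).

    rows: dicts with 'gene', 'log2FoldChange' and 'padj'.
    padj may be None, mirroring the NA values DESeq2 can report.
    """
    degs = [r for r in rows if r["padj"] is not None and r["padj"] < fdr]
    n_up = sum(1 for r in degs if r["log2FoldChange"] > 0)
    frac_up = n_up / len(degs) if degs else 0.0
    return degs, frac_up


# Placeholder rows with invented values, only to show the call.
toy_rows = [
    {"gene": "GENE_A", "log2FoldChange": 2.1, "padj": 0.001},
    {"gene": "GENE_B", "log2FoldChange": -1.4, "padj": 0.03},
    {"gene": "GENE_C", "log2FoldChange": 0.2, "padj": 0.60},
    {"gene": "GENE_D", "log2FoldChange": 1.0, "padj": None},
]
degs, frac_up = summarize_degs(toy_rows)
print(len(degs), frac_up)
```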

Network analysis

We used the SystemSignificance tool (part of our inBio Command™) to identify significantly regulated COVID-19 networks. SystemSignificance uses omics data (in this case transcriptomics) and permutation analysis to assess the evidence of transcriptional regulation for neighborhoods of proteins. A key strength of this approach is that, by pooling information across neighborhoods of proteins, we can increase the robustness of the analysis and amplify the biological signal from the differential expression analysis. The output of SystemSignificance is a p-value measuring the significance of (transcriptional) regulation for a given protein’s neighborhood.

SystemSignificance computes the significance of protein neighborhoods by permutation analysis.
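The internals of SystemSignificance are not described here, so the following Python sketch only illustrates the general idea of a permutation test on a protein neighborhood. It assumes a per-gene regulation score (for example an absolute test statistic), uses the mean score as the neighborhood statistic, and draws random gene sets of the same size as the null. The function name, the statistic and the null model are all assumptions.

```python
import random
from statistics import mean


def neighborhood_p_value(scores, neighborhood, n_permutations=10_000, seed=0):
    """Empirical p-value for the mean regulation score of a neighborhood.

    scores:       dict mapping gene -> regulation score (assumed input)
    neighborhood: genes in the protein neighborhood being tested
    """
    rng = random.Random(seed)
    members = [g for g in neighborhood if g in scores]
    if not members:
        raise ValueError("no neighborhood gene has a score")
    observed = mean(scores[g] for g in members)
    background = list(scores.values())
    exceed = 0
    for _ in range(n_permutations):
        # Null model: a random gene set of the same size as the neighborhood.
        null_stat = mean(rng.sample(background, len(members)))
        if null_stat >= observed:
            exceed += 1
    # Add-one correction so the estimate is never exactly zero.
    return (exceed + 1) / (n_permutations + 1)
```

Pooling scores over a neighborhood in this way is what lets weakly regulated individual genes add up to a significant signal. A real implementation would likely also account for network topology, for example by matching node degree in the null, which this sketch ignores.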

Network filtering

The resulting networks were filtered based on size and significance:

  • We elected to focus this study on networks of limited size (<= 50 proteins) in order to get a clearer separation of overlapping biological systems.
  • We kept networks that remained significant after applying Bonferroni correction for multiple hypothesis testing. (We note that Bonferroni is a highly stringent way of correcting for multiple hypothesis testing. More lenient correction approaches are also applicable.)

In total we retained 13 networks.
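The two filtering criteria above can be sketched in a few lines of Python. The significance level `alpha=0.05`, the number of tests `n_tests` and the toy records are assumptions, since the text does not state the values actually used.

```python
def filter_networks(networks, n_tests, alpha=0.05, max_size=50):
    """Keep networks that are small enough and pass Bonferroni correction.

    networks: dicts with 'name', 'size' and 'p_value'
    n_tests:  total number of neighborhoods tested (assumed known)
    """
    threshold = alpha / n_tests  # Bonferroni-adjusted significance threshold
    kept = [
        n for n in networks
        if n["size"] <= max_size and n["p_value"] <= threshold
    ]
    # Rank by significance, most significant first.
    return sorted(kept, key=lambda n: n["p_value"])


# Placeholder records, not real results.
toy_networks = [
    {"name": "net_a", "size": 46, "p_value": 1e-7},
    {"name": "net_b", "size": 120, "p_value": 1e-9},  # too large
    {"name": "net_c", "size": 30, "p_value": 1e-3},   # fails Bonferroni
    {"name": "net_d", "size": 18, "p_value": 1e-8},
]
kept = filter_networks(toy_networks, n_tests=10_000)
print([n["name"] for n in kept])
```

Swapping the Bonferroni threshold for a false discovery rate procedure such as Benjamini-Hochberg would be one of the more lenient alternatives mentioned above.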

Biological annotation

We annotated the networks with drug target data and virus-host protein-protein interactions (PPIs). We obtained drug targets and drug data from the Therapeutic Target Database. For virus-host PPIs, we downloaded the BioGRID database COVID-19 interaction release. The release contains 353 human-SARS-CoV-2 interactions from 14 studies, including the landmark study by Gordon et al. (2020, Nature). See the section below for an overview of all data resources used.
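The annotation step amounts to looking up each network gene in the two resources. A minimal Python sketch follows, assuming the drug targets and virus-host pairs have already been parsed into in-memory collections. The parsing of the Therapeutic Target Database and BioGRID files is not shown, and `annotate_network` is a hypothetical helper. The example uses a handful of genes and interactions named in this write-up.

```python
def annotate_network(network_genes, drug_targets, virus_host_pairs):
    """List the drug targets and virus-host interactions found in a network.

    drug_targets:     iterable of human gene symbols with known drugs
    virus_host_pairs: iterable of (viral_protein, human_gene) tuples
    """
    genes = set(network_genes)
    return {
        "drug_targets": sorted(genes & set(drug_targets)),
        "virus_host": sorted((v, h) for v, h in virus_host_pairs if h in genes),
    }


# Subset of Network 13 genes and annotations mentioned in the text.
network_13_subset = ["CTSC", "CSNK1D", "CAPN1", "GOLGA2", "GORASP1"]
drug_targets = {"CTSC", "CSNK1D", "CAPN1", "TP53"}
virus_host_pairs = [("nsp13", "GOLGA2"), ("nsp13", "GORASP1"), ("nsp8", "NSD2")]

annotation = annotate_network(network_13_subset, drug_targets, virus_host_pairs)
print(annotation)
```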

References and datasets

Resource                                Source                                          Version
SARS-CoV-2 host infection RNA-seq data  Blanco-Melo et al. (bioRxiv, 2020),             Pre-print posted March 24, 2020
                                        now published in Cell (2020);
                                        Supplementary Table 2 (full dataset available)
Human PPI data                          inBio Map™                                      Last updated March 4th, 2020
Drug data                               Therapeutic Target Database                     Last updated July 14th, 2019
Virus PPI data                          BioGRID                                         Build 3.5.185 (last updated April 30th, 2020)

Contact and supporting information

Get in touch with us to learn more about these results and algorithms, and how Intomics can help you advance your biomedical or COVID-19 research.


We offer access to individual COVID-19 network files (.xgmml format) upon request.

Terms of use: these results and networks are made freely available to enable COVID-19 research in the biomedical and public health research communities. The networks are made available via inBio Discover™, which is a free-to-use service for searching and exploring inBio Map™ PPI networks. Access to advanced features and download of the full underlying PPI database require a commercial license.