A Network Flow Approach to Predict Protein Targets and Flavonoid Backbones to Treat Respiratory Syncytial Virus Infection

Background. Respiratory syncytial virus (RSV) infection is the major cause of disease of the lower respiratory tract in infants and young children. Attempts to develop effective vaccines or pharmacological treatments that inhibit RSV infection without undesired effects on human health have been unsuccessful. However, RSV infection has been reported to be affected by flavonoids. The mechanisms underlying viral inhibition induced by these compounds are largely unknown, making the development of new drugs difficult. Methods. To understand the mechanisms by which flavonoids inhibit RSV infection, a systems pharmacology-based study was performed using microarray data from primary cultures of human bronchial cells infected by RSV, together with compound-proteomic interaction data available for Homo sapiens. Results. After an initial evaluation of 26 flavonoids, 5 compounds (resveratrol, quercetin, myricetin, apigenin, and tricetin) were identified through topological analysis of a major chemical-protein (CP) and protein-protein interaction (PPI) network. In a nonclustered form, these flavonoids directly regulate the activity of two protein bottlenecks involved in inflammation and apoptosis. Conclusions. Our findings may help uncover mechanisms of action of early RSV infection and provide chemical backbones and their protein targets in the difficult quest to develop new effective drugs.


Introduction
Respiratory syncytial virus (RSV) is a major cause of lower respiratory tract infection with a high level of mortality in children around the world [1][2][3]. It is estimated that all children have been infected by RSV by two years of age and that more than half of them are reinfected [4]. Moreover, RSV pathogenesis is notably associated with increased airway resistance, characterized by wheezing and diagnosed as bronchiolitis [2].
In the 1960s, a vaccine trial produced unexpected and tragic results [5]. Effective preventive treatment for RSV infection therefore remains unavailable, since there is no vaccine against the virus. However, several prototypes are under study [6][7][8][9]. Prophylactic therapy with palivizumab, a humanized monoclonal antibody, has been shown to reduce the number of RSV hospitalizations in preterm infants [10], but the treatment has a very high cost and is administered only to children with risk factors for RSV bronchiolitis [11]. Another treatment option against RSV infection is ribavirin, a nucleoside analog that introduces mutations into the viral RNA genome during replication and was previously used routinely for infants hospitalized with RSV. However, it has been associated with undesired side effects and was not considered an effective treatment [12,13].
The absence of a vaccine for RSV-induced bronchiolitis and the small number of antiviral agents against RSV constitute very important problems in pediatric medicine. Thus, the development of novel anti-RSV drugs that can be administered orally or parenterally to children is extremely necessary.
A great variety of viruses have been reported to be inhibited by natural compounds, such as flavonoids [14][15][16]; however, the molecular mechanisms underlying such effects are largely unclear, which makes it difficult to develop new drugs.
In a search to provide new insights for RSV treatments and to understand the multiple signaling pathways affected by RSV infection, an integrative model based on systems pharmacology predictions was used. Moreover, this methodology allows the effect of flavonoid (FLA) compounds against RSV infection to be understood by integrating chemical-protein (CP) and protein-protein interaction (PPI) networks.

Gene Expression Data from Primary Human Bronchial Epithelial (PHBE) Cells Infected by RSV.
The microarray data GSE12144 were downloaded from the Gene Expression Omnibus (GEO) database [http://www.ncbi.nlm.nih.gov/geo/]. Subsequently, a linear model was applied to normalize these data, using the Limma package from R/Bioconductor to guarantee maximal statistical stringency [17]. Additionally, a contrast analysis was applied, and differentially expressed genes (PHBE mock versus PHBE RSV 24 h) were identified by Rank Product with a cutoff value of ≤ 0.05 [18].
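As an illustration of the Rank Product idea used for the differential expression step, the following minimal Python sketch computes the rank product statistic as the geometric mean of a gene's ranks across replicates. It is not the R/Bioconductor implementation used in the study: the function name `rank_product`, the input layout, and the toy values are assumptions, and the permutation step that turns the statistic into a significance value is omitted.

```python
import math

def rank_product(fold_changes):
    """Rank product per gene (hypothetical helper, not the RankProd package).

    fold_changes: dict mapping gene -> list of log fold changes,
                  one value per replicate comparison.
    Returns a dict gene -> geometric mean of the gene's ranks; small values
    indicate genes that are consistently near the top (upregulated).
    """
    genes = list(fold_changes)
    n_reps = len(next(iter(fold_changes.values())))
    log_sum = {g: 0.0 for g in genes}
    for rep in range(n_reps):
        # Rank 1 goes to the largest fold change in this replicate.
        ordered = sorted(genes, key=lambda g: fold_changes[g][rep], reverse=True)
        for rank, gene in enumerate(ordered, start=1):
            log_sum[gene] += math.log(rank)
    # Geometric mean of the ranks, accumulated in log space.
    return {g: math.exp(log_sum[g] / n_reps) for g in genes}

# Toy, made-up values purely for illustration (not data from GSE12144).
toy = {"geneA": [2.0, 1.8], "geneB": [0.1, 0.3], "geneC": [-1.0, -0.9]}
scores = rank_product(toy)
```

In practice, the statistic would be compared against rank products obtained from permuted data to estimate significance at the chosen cutoff.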
In order to obtain drug-like compounds, a database-dependent model was applied to calculate the drug-likeness of all compounds similar to resveratrol or quercetin through the Tanimoto coefficient (Tc) [37]:

$$T_c(a, b) = \frac{a \cdot b}{\lVert a \rVert^{2} + \lVert b \rVert^{2} - a \cdot b}, \tag{1}$$

where "a" is the vector of molecular properties of each compound and "b" represents the average molecular properties of all compounds in the DrugBank database [38]. Nonconnected nodes were excluded from the networks.
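The Tanimoto coefficient for real-valued property vectors is commonly written as the dot product of the two vectors divided by the sum of their squared norms minus that dot product. The sketch below implements this standard form in plain Python; the function name `tanimoto` and the example vectors are assumptions for illustration, and the descriptors actually used in the study are not reproduced here.

```python
def tanimoto(a, b):
    """Tanimoto coefficient between two equal-length numeric vectors.

    a: descriptor vector of one compound (hypothetical input).
    b: reference vector, e.g. average descriptors of a compound database.
    """
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    denom = sum(x * x for x in a) + sum(y * y for y in b) - dot
    if denom == 0:
        # Both vectors are all zeros; similarity is undefined, return 0.0.
        return 0.0
    return dot / denom

# Illustrative binary fingerprints (made up, not real compound data).
similarity = tanimoto([1, 1, 0], [1, 0, 0])
```

Identical vectors give a coefficient of 1, and vectors with no shared non-zero components give 0, so the value can be compared directly against a similarity cutoff.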

Modular Analysis of CPI-PPI Network.
ClusterONE was used to discover densely connected and possibly overlapping regions within the Cytoscape network [39]. Dense regions corresponded to protein or compound-protein complexes or parts of them. ClusterONE identifies subnetworks by "growing" dense regions out of small seeds, guided by a quality function. The quality of a group was evaluated as the number of internal edges divided by the number of edges involving nodes of the group.
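The quality function described above can be sketched in a few lines of Python. This is only an illustration of the ratio stated in the text (internal edges over all edges touching the group); it is not the ClusterONE implementation, it ignores edge weights and any penalty term, and the function name `cluster_quality` and the toy edge list are assumptions.

```python
def cluster_quality(edges, group):
    """Internal edges divided by all edges that involve a node of the group.

    edges: iterable of (u, v) pairs of an undirected network.
    group: iterable of node names forming the candidate cluster.
    """
    group = set(group)
    internal = 0   # both endpoints inside the group
    boundary = 0   # exactly one endpoint inside the group
    for u, v in edges:
        in_u, in_v = u in group, v in group
        if in_u and in_v:
            internal += 1
        elif in_u or in_v:
            boundary += 1
    touching = internal + boundary
    return internal / touching if touching else 0.0

# Hypothetical toy network: a triangle A-B-C with a tail C-D-E.
toy_edges = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D"), ("D", "E")]
quality = cluster_quality(toy_edges, {"A", "B", "C"})
```

A greedy procedure would add or remove one node at a time from a seed and keep the change whenever this ratio increases.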

Gene Ontology Analysis.
Gene ontology (GO) analysis was performed with the Biological Network Gene Ontology (BiNGO) software 2.44 [http://chianti.ucsd.edu/cyto_web/plugins/index.php] [40]. The degree of functional enrichment for a given category was assessed (P value ≤ 0.05) by hypergeometric distribution [41], and multiple test correction was applied using the false discovery rate (FDR) algorithm [42] implemented in BiNGO. Overrepresented biological process categories were obtained after FDR correction, with a significance level of 0.05.
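To make the enrichment test concrete, the following Python sketch computes an upper-tail hypergeometric P value for one category and then adjusts a list of P values, assuming the Benjamini-Hochberg procedure as the FDR correction. It is a standalone illustration rather than BiNGO's code; the function names and argument names are assumptions.

```python
from math import comb

def hypergeom_pvalue(k, n, K, N):
    """P(X >= k): chance of seeing at least k annotated genes in a sample.

    k: annotated genes in the cluster, n: cluster size,
    K: annotated genes in the reference set, N: reference set size.
    """
    upper = min(n, K)
    total = comb(N, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total

def benjamini_hochberg(pvalues):
    """Return FDR-adjusted P values in the same order as the input."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted
```

A category would then be reported as overrepresented when its adjusted P value is at or below the 0.05 significance level.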

Centralities Parameters and Topological Analysis.
Major network centralities (closeness, betweenness, and node degree) of the CP-PPI networks were analyzed using the Cytoscape plugin CentiScape 2.8.2 [43]. Closeness centrality was used to evaluate the shortest paths between a given node (protein or chemical compound) and all other nodes [43]:

$$\mathrm{Clo}(v) = \frac{1}{\sum_{w \in V} \mathrm{dist}(v, w)}, \tag{2}$$

where the closeness value (Clo(v)) was calculated by computing the shortest path between the node v and all other nodes found within a network. The average closeness score ($\overline{\mathrm{Clo}}$) was calculated as the sum of the individual closeness scores ($\mathrm{Clo}_i$) divided by the total number of nodes analyzed (N(V)):

$$\overline{\mathrm{Clo}} = \frac{\sum_{i} \mathrm{Clo}_i}{N(V)}. \tag{3}$$

The betweenness parameter was also taken into account in the analysis. This parameter is a measure equal to the number of shortest paths between a pair of nodes that pass through a different node [43,44]:

$$\mathrm{Bet}(v) = \sum_{s \neq v \neq t \in V} \frac{\sigma_{st}(v)}{\sigma_{st}}, \tag{4}$$

where $\sigma_{st}$ is the total number of shortest paths from node s to node t and $\sigma_{st}(v)$ is the number of those paths that pass through the node v.
The average betweenness score ($\overline{\mathrm{Bet}}$) of the network was calculated using (5), where the sum of the individual betweenness scores ($\mathrm{Bet}_i$) is divided by the total number of nodes analyzed (N(V)):

$$\overline{\mathrm{Bet}} = \frac{\sum_{i} \mathrm{Bet}_i}{N(V)}. \tag{5}$$

The average betweenness score of the CP-PPI network was used to identify the nodes that control the flow of information in the network. These nodes are called bottlenecks (B) and have a higher probability of connecting different modules or biological processes.
Finally, the node degree was calculated. This parameter indicates the number of adjacent nodes ($k_v$) that are connected to a specific node (v), according to

$$\mathrm{Deg}(v) = k_v. \tag{6}$$

The average node degree of a network ($\overline{\mathrm{Deg}}$) is given by (7), where the sum of the individual node degree scores ($\mathrm{Deg}_i$) is divided by the total number of nodes (N(V)) present in the network:

$$\overline{\mathrm{Deg}} = \frac{\sum_{i} \mathrm{Deg}_i}{N(V)}. \tag{7}$$

Nodes with a high node degree score compared to the average are called hubs (H) and are responsible for a central regulatory role in the cell.
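The three centralities described in this section can be sketched for a small unweighted, undirected network in plain Python. The code below is an illustrative stand-in for the CentiScape calculations, not the plugin itself: closeness is taken as the reciprocal of the summed shortest-path distances, betweenness uses Brandes' accumulation and counts each unordered node pair once (a convention assumed here), and the function names and adjacency-dictionary input are assumptions. The network is assumed to be connected.

```python
from collections import deque

def shortest_paths(adj, source):
    """Breadth-first search recording distances, path counts and predecessors."""
    dist = {source: 0}
    sigma = {v: 0 for v in adj}      # number of shortest paths from source
    sigma[source] = 1
    preds = {v: [] for v in adj}
    order = []                       # nodes in order of non-decreasing distance
    queue = deque([source])
    while queue:
        v = queue.popleft()
        order.append(v)
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
            if dist[w] == dist[v] + 1:
                sigma[w] += sigma[v]
                preds[w].append(v)
    return dist, sigma, preds, order

def centralities(adj):
    """Return (degree, closeness, betweenness) dicts for adj: node -> set of neighbours."""
    degree = {v: len(adj[v]) for v in adj}
    closeness = {}
    betweenness = {v: 0.0 for v in adj}
    for s in adj:
        dist, sigma, preds, order = shortest_paths(adj, s)
        total = sum(dist.values())
        closeness[s] = 1.0 / total if total else 0.0
        # Brandes' back-propagation of pair dependencies.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                betweenness[w] += delta[w]
    # Every unordered pair was visited from both ends, so halve the totals.
    betweenness = {v: b / 2.0 for v, b in betweenness.items()}
    return degree, closeness, betweenness

def average(scores):
    """Network-wide mean of a centrality, used as a threshold."""
    return sum(scores.values()) / len(scores)
```

For a three-node path A-B-C, passed as `{"A": {"B"}, "B": {"A", "C"}, "C": {"B"}}`, the middle node B lies on the only shortest path between A and C and is the only node with non-zero betweenness.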
In this work, H-B (hub-bottleneck) nodes may correspond to central proteins or FLA compounds that are highly connected to several complexes, while nodes that belong to the NH-B (non-hub-bottleneck) group correspond to proteins or FLA compounds that have fewer direct connections but still lie on many shortest paths. In order to classify nodes as H-B or NH-B, the mean values of the betweenness and degree parameters were used as thresholds.
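The classification by mean thresholds can be sketched as follows. The function name `classify_nodes`, the node names, and the scores are hypothetical, and treating "above the mean" as a strict inequality is an assumption, since the text does not state how ties are handled.

```python
def classify_nodes(degree, betweenness):
    """Label each node H/NH (hub or not) and B/NB (bottleneck or not).

    degree, betweenness: dicts mapping the same node names to scores.
    A node is a hub when its degree exceeds the network mean and a
    bottleneck when its betweenness exceeds the network mean.
    """
    deg_threshold = sum(degree.values()) / len(degree)
    bet_threshold = sum(betweenness.values()) / len(betweenness)
    labels = {}
    for node in degree:
        hub = "H" if degree[node] > deg_threshold else "NH"
        bottleneck = "B" if betweenness[node] > bet_threshold else "NB"
        labels[node] = f"{hub}-{bottleneck}"
    return labels

# Hypothetical scores for three proteins (P1-P3) and one compound (C1).
toy_degree = {"P1": 4, "P2": 1, "P3": 1, "C1": 2}
toy_betweenness = {"P1": 5.0, "P2": 0.0, "P3": 3.0, "C1": 0.0}
labels = classify_nodes(toy_degree, toy_betweenness)
```

With these toy values both thresholds equal 2, so P1 is labelled H-B, P3 is NH-B, and P2 and C1 are NH-NB.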

Molecular Parameters for the Development of a Potential Drug.
All compounds that were chemically verified in the ZINC database [45,46] were analyzed taking into account Lipinski's rule of five (xLogP, molecular weight, and number of hydrogen bond acceptors and donors). Toxicity risks (mutagenic, tumorigenic, irritant, and reproductive effects) were also examined with the Osiris Property Explorer [http://www.organic-chemistry.org/prog/peo].
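Lipinski's rule of five can be checked with a few comparisons. The sketch below uses the standard thresholds (molecular weight at most 500 Da, logP at most 5, at most 5 hydrogen bond donors, and at most 10 hydrogen bond acceptors); the function name `lipinski_violations` and the example property values are assumptions and are not taken from Table 2 or from the ZINC database.

```python
def lipinski_violations(mol_weight, xlogp, h_donors, h_acceptors):
    """Return the list of rule-of-five criteria that a compound violates."""
    checks = {
        "molecular weight > 500 Da": mol_weight > 500,
        "xLogP > 5": xlogp > 5,
        "H-bond donors > 5": h_donors > 5,
        "H-bond acceptors > 10": h_acceptors > 10,
    }
    return [name for name, violated in checks.items() if violated]

# Made-up property values for a hypothetical compound.
violations = lipinski_violations(mol_weight=300.0, xlogp=1.5,
                                 h_donors=5, h_acceptors=7)
drug_like = len(violations) <= 1  # common reading: at most one violation
```

Toxicity risk prediction, as performed with the Osiris Property Explorer, is a separate step and is not covered by this check.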
A diagram of the methodological steps used in this work is shown in Figure 1.

Results and Discussion
Studies of FLA effects on viruses have been performed only in vitro and in vivo, not in silico using high-throughput (omic) approaches and network analysis based on interactome data. This may be due to the structure of flavonoids, which generally consist of two aromatic rings, each containing at least one hydroxyl group, connected through a three-carbon "bridge" that later becomes part of a heterocyclic ring [47]. These chemical properties allow increased permeability across the cellular membrane and interaction with multiple intracellular targets [48,49]. As such, these compounds possess a broad spectrum of biological activities [50,51], leading to the overrepresentation of many biological pathways, which may not necessarily be linked to antiviral potential. In this sense, systems pharmacology or chemobiology strategies could be employed to define specific targets of flavonoids.

Figure 1: Experimental approach employed to define potential treatments against RSV infection. The interactome data were obtained from microarray data derived from human bronchial cells infected with RSV. Differential gene expression was considered as the initial input for network prospection. Additionally, the natural flavonoid compounds obtained according to Tanimoto similarity were added to the initial input in the STITCH software. The CP-PPI network generated was viewed in Cytoscape and analyzed by ClusterONE in order to identify the major associated clusters. Biological processes found within clusters were retrieved with the BiNGO plugin. Moreover, the CentiScape plugin was used to find bottleneck and hub proteins/compounds. Finally, data interpretation was performed based on the ZINC database and Osiris Property Explorer.

Topological Design and Analysis of a Major CP-PPI Network of PHBE Cells Infected by RSV.
To focus on the anti-RSV effects of flavonoids, we developed an interactome network considering 285 genes differentially expressed during RSV infection of PHBE cells and 26 flavonoid compounds (Table 1) as the initial input to the STITCH software. As a result of this approach, a major CP-PPI network composed of 57 nodes and 92 edges, and including five compound targets with putative antiviral activity, was obtained (Figure 2). It is important to note that minor networks without CPI were also detected but were not considered for subsequent analysis (Supplementary [...]).

In this sense, the global organization of clustering in the major network suitable for flavonoid modulation was analyzed. ClusterONE identified four interconnected clusters (Figure 2). Subnetworks of these clusters were created, representing four discrete biological processes, as identified by gene ontology (GO) analysis (Supplementary Table 1): (1) cell cycle phase (corrected P value: 2.33 × 10⁻⁶); (2) ubiquitin-dependent protein catabolic process (corrected P value: 1.61 × 10⁻⁵); (3) nucleic acid metabolic process (corrected P value: 4.68 × 10⁻⁴); and (4) RNA splicing (corrected P value: 1.65 × 10⁻⁶). RSV-host studies have identified these processes as occurring upon infection [52][53][54]. However, all flavonoids and their targets are unclustered in the major CP-PPI network. This indicates that compound-target regulation is independent of cluster network organization during early RSV infection.

An alternative strategy to understand RSV modulation by flavonoids is to predict the best-ranked compound targets (those with a high impact on the network) through network connectivity analysis. In this sense, centrality properties were evaluated; however, the 11 H-B nodes identified in the CP-PPI network were represented only by proteins (Figure 3) [...] major network. All flavonoid compounds are H-NB and NH-NB nodes, but they directly modulate two H-B proteins (PIM1 and BCL2).

PIM1 and BCL2, as FLA Targets against RSV Infection.
PIM1 is a proto-oncogene that encodes a serine/threonine kinase [55]. This kinase controls cell survival, proliferation, differentiation, and apoptosis [56]. In the context of respiratory diseases, a recent study suggests that PIM1 has a role in the induction of allergic airway responses [57]. Accordingly, PIM1 inhibition reduces the development of the full spectrum of allergen-induced lung inflammatory responses, at least partially by limiting the expansion and actions of CD4+ and CD8+ effector T cells [57]. A similar function for PIM1 has been described in acute RSV infections [58]: PIM1 inhibition attenuates the enhanced airway hyperresponsiveness and the activation of the inflammatory cascade induced by RSV reinfection. In our analyses, PIM1 was upregulated in comparison with the noninfected control (log FC = 0.026) and interacted with three flavonoids (tricetin, myricetin, and quercetin). These compounds are cell-permeable and directly inhibit PIM1 kinase activity [59]. In this sense, these flavonoids are potential inhibitors of RSV-caused inflammation in a target-specific manner, through as yet unknown mechanisms. It is important to note that the anti-RSV activity of myricetin and tricetin has not been tested experimentally and should be further investigated.
On the other hand, our data suggest that BCL2 is regulated by flavonoids. BCL2 is a regulator of programmed cell death (apoptosis), in part by modulating the release of proapoptotic molecules from mitochondria. For viruses in general (including RSV), apoptotic death of infected cells is a mechanism for reducing virus replication. After 24 h of infection by RSV, several proapoptotic factors of the BCL2 family and caspases 3, 6, 7, 8, 9, and 10 are induced in different epithelial cell lines (primary small airway cells, primary tracheal-bronchial cells, A549, and HEp-2, but not PHBE) [60]. At the same time, RSV also mediates induction of antiapoptotic factors of the BCL2 family [60], which might account for the delayed induction of apoptosis of RSV-infected cells. This indicates the importance of a complex struggle between apoptotic (host) and antiapoptotic (virus) pathways [60].
In our study, BCL2 was downregulated in infected PHBE cells in comparison with noninfected controls (log FC = −0.008). We hypothesized that the differential expression of this gene may be caused by overexpression of PIM1. In hematopoietic cells, PIM1 kinase acts as a survival factor in cooperation with the regulation of BCL2 [61]. This mechanism should be investigated in RSV-infected PHBE cells.
Furthermore, resveratrol and apigenin control the activity of BCL2 in inducing apoptosis in cancer cells [62,63], but the effect of these flavonoids has not been explored in PHBE cells or in in vivo models for RSV infection. However, these compounds are described as inhibitors of RSV replication in vitro (see Table 1).

In Silico Analysis of FLA Effects on Human Health.
We also predicted potential undesired effects on human health of each of the FLA compounds based on their chemical structures (for more details, see Section 2.7 of Materials and Methods). Our analysis suggests that tricetin may pose a low risk to human health considering the four main parameters of the analysis (mutagenic, tumorigenic, irritant, and reproductive effects), as shown in Table 2. The other four flavonoids (resveratrol, quercetin, apigenin, and myricetin) may require chemical modification to reduce their impact on human health but provide versatile chemical backbones for drug development. Biotransformation of flavonoids into drugs is the usual approach in the development of anticancer targets [64,65] but could also be applied in the search for new therapies against RSV.

Table 2 notes: * All parameters related to Lipinski's rule of five were obtained from the ZINC database. ** All toxicity risks were predicted by Osiris Property Explorer.

Conclusions
Our CPI-PPI network model identified five target flavonoid compounds: resveratrol, quercetin, tricetin, apigenin, and myricetin. These compounds are suggested as potential candidates for the development of novel drugs against early severe RSV infection. Despite these potentially interesting associations, the findings rely mainly on statistical analysis. Thus, further experimental testing of these predictions will be required to support the in silico data.