Computation-Based Discovery of Potential Targets for Rheumatoid Arthritis and Related Molecular Screening and Mechanism Analysis of Traditional Chinese Medicine

This study is aimed at screening potential therapeutic ingredients in traditional Chinese medicine (TCM) and identifying the key rheumatoid arthritis (RA) targets using computational simulations. Data for TCM-active ingredients with clear pharmacological effects were collected. Absorption, distribution, metabolism, excretion, and toxicity were evaluated. Potential RA targets were identified using the Gene Expression Omnibus (GEO) database, protein–protein interaction network, and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses and potential TCM ingredients using AutoDock Vina. To examine the mechanisms underlying small molecules, target prediction, Gene Ontology, KEGG, and network modeling analyses were conducted; the effects were verified in rat synovial cells using cell proliferation assay. The activities of tumor necrosis factor TNF-α and IL-1β and alterations in cellular target protein levels were detected by ELISA and Western blotting, respectively. In total, data for 432 TCM active ingredients with clear pharmacological effects were obtained. Five critical RA-related genes were identified; CCL5 and CXCL10 were selected for molecular docking. Target prediction and network-based proximity analysis showed that dioscin could modulate 22 known RA clinical targets. Dioscin, asiaticoside, and ginsenoside Re could effectively inhibit in vitro cell proliferation and secretion of TNF-α and IL-1β in RA rat synovial cells. Using bioinformatics and computer-aided drug design, the potential small anti-RA molecules and their mechanisms of action were comprehensively identified. Dioscin could significantly inhibit proliferation and induce apoptosis in RA rat synovial cells by reducing TNF-α and IL-1β secretion and inhibiting abnormal CCL5, CXCL10, CXCR2, and IL2 expression.


Introduction
Rheumatoid arthritis (RA) is a chronic, systemic, inflammatory, and progressive autoimmune disease that affects synovial joints and other organ systems [1]. To date, the underlying disease mechanism of RA remains unclear, but it is generally believed to be initiated by infection and inflammatory mediators. Furthermore, recent studies have shown that the pathogenesis of RA is related to the changes and effects of genetic, bacterial, and viral factors; T and B lymphocytes; cytokines; and other immune cells [2,3]. The incidence of RA increases with age. Approximately 0.3% to 1.0% of the population worldwide are affected by RA each year. Furthermore, RA is more common in adults aged 35-50 years, and the incidence in women is approximately three times that in men [4,5].
Currently, the available treatment of RA includes nonsteroidal anti-inflammatory drugs, glucocorticoid drugs, and biological macromolecular therapy; however, they are costly and often exert toxicity and cause side effects [6,7]. Therefore, further improvement of the efficacy and safety of anti-RA drugs and reduction of the costs is necessary.
Traditional Chinese medicine (TCM) has a long history of RA treatment based on rich experiences in clinical applications. RA, a common type of arthritis, belongs to "Bi syndrome" category in Chinese medicine [8]. Currently, Chinese medicines and compound prescriptions against RA have shown anti-inflammatory, analgesic, immunomod-ulatory, multilevel, and multistep therapeutic effects and present advantages of high safety, few adverse reactions, and cost effectiveness. Therefore, they have potential application for the treatment of RA [9].
Although several TCM compounds have been reported to exert good curative effects and minor side effects, the discovery of additional potential anti-RA ingredients from a large number of common small TCM molecules with clear pharmacological effects has important practical significance. In this view, we integrated bioinformatics, computer-aided drug design, and in vitro cellular experiments, in combination with existing literature analysis, to explore important targets for the treatment of RA and further screen out the active ingredients of TCM with potential therapeutic effects.
In vitro experiments were performed to confirm these findings, with the aim of providing powerful methods and

Data Collection. The active ingredients of common
TCMs that have been proven to have clear pharmacological effects were collected from published literature in Chinese [10,11]. The PubChem database (https://pubchem.ncbi .nlm.nih.gov/) was used to confirm the chemoinformatics data for each ingredient, including molecular name CAS, PubChem CID, molecular formula, canonical SMILES, and SDF files. A more authoritative and reliable database of active ingredients in commonly used Chinese medicines has been constructed.

Gene Expression Omnibus (GEO) Differential Gene
Analysis for RA. In the GEO database (https://www.ncbi .nlm.nih.gov/geo/), "rheumatoid arthritis" was queried to download the gene expression profile chip data related to RA. Perl language (version 5.32.0) was used to perform preprocessing of the original data, such as standardization, correction, and gene name annotation. The R language-based limma package [13] (version 3.44.0) was used for differentially expressed gene (DEG) analysis, and the upregulated and downregulated DEGs in each set of chip data were screened out when jlog 2 FCj > 1:0 and adj:p:value < 0:05.

Construction of Disease-Associated Protein-Protein
Interaction (PPI) Network and Discovery of Key Targets. All selected genes were imported into the STRING database (version 11.0, https://string-db.org/) to obtain the PPIs of the DEGs. The parameters were set as follows: organism, Homo sapiens; combined score threshold, 0.7. In addition, Gephi software (Version: 0.9.2, https://gephi.org/) was used to visualize the PPI network, and cytoHubba in Cytoscape   Disease Markers (version 3.7.1, https://cytoscape.org/) [14], in which the algorithm selects the maximal clique centrality (MCC), was used to screen out the key targets in the PPI network.

Gene Ontology (GO) Function Annotation and Kyoto
Encyclopedia of Genes and Genomes (KEGG) Pathway Enrichment Analysis. The GO database constructed by the GO Consortium in 2000 contains information for the biological process (BP), molecular function (MF), and cellular component (CC) of genes [15]. A biological process or pathway involves a group of genes functioning together. Enrichment analysis is used to identify classes of overrepresented genes or proteins, and a few of these may have an association with disease phenotypes. GO analysis uses statistical approaches to identify significantly enriched or depleted groups of genes. In this study, clusterProfiler (version 3.14.3) [16] was used to perform GO function annotation and KEGG path-way enrichment analysis on all differential gene sets in RA, which were screened according to p value < 0.05 and q value < 0.05.

Screening of Small Molecules of TCM Based on Key
Targets and Molecular Docking. The key protein targets in the PPI network were identified, and the targets in the reported RA-related pathways were obtained from the KEGG pathway enrichment analysis as described in Sections 2.3 and 2.4, respectively. The common targets from the two above-mentioned gene sets were considered the potential key targets of RA.
The virtual screening process of small molecules of anti-RA TCM was based on the molecular docking method, which involved the following steps: (1) The PDB file for the potential key target proteins was downloaded from the PDB database (http://www

Disease Markers
.rcsb.org/). PyMOL (version 1.7.0, https://pymol.org/ ) was used to evaluate key target proteins to remove water molecules, impurities due to the presence of ions, and perform other changes. AutoDockTools (version 1.5.6, http://autodock.scripps.edu/wiki/ AutoDockTools) was used to hydrogenate and charge the processed protein-ligand; the processed PDB files were converted and saved in the PDBQT file format (2) The information of small molecules was downloaded from PubChem in two SDF format files, 2D and 3D. OpenBabel (version 2.4.0, http://openbabel.org/) was used to convert the 2D SDF file into a 3D structure file. The original 3D structure was directly converted into a PDB file, and all small molecules were processed with minimum energy; finally, the PDB file was converted into a PDBQT file  To examine the underlying mechanisms of small molecules of anti-RA TCM more comprehensively, we further used the target prediction method to discover other potential targets and combined the findings with those of the abovementioned bioinformatics analysis and molecular docking screening. Corresponding canonical SMILES of the obtained small molecules were imported into online prediction systems, such as HitPick (http://mips.helmholtz-muenchen.de/proj/ hitpick) [17], SEA (http://sea.bkslab.org/) [18], SwissTarget-Prediction (http://www.swisstargetprediction.ch/) [19], and STITCH (version 5.0, http://stitch.embl.de/), to predict their potential targets. In particular, the screening threshold precision for HitPick was set at 50%, the threshold MaxTc for SEA was set at 0.7, and the threshold probability for Swis-sTargetPrediction was set at 0.15. Molecular docking verification, GO function annotation, and KEGG pathway enrichment analyses were performed on the predicted potential targets.

Analysis of the Regulation of Known RA Treatment
Targets by Small Molecules of TCM Based on Network Proximity. Next, we determined whether the small molecules of TCM screened have direct or indirect regulatory effects on the clinically known RA treatment targets. Given that some  Disease Markers of these molecules may not directly act on certain RA targets but regulate the disease via the intervention of its neighboring targets, we further analyzed the small molecules of TCM using the calculation method of network proximity, based on the human PPI background network [20], using the following formula [21]: where A is the target set of small molecules of TCM, kAk is the number of the targets of A, B is the clinically known set of RA treatment targets, kBk is the number of the targets of B, dða, bÞ is the shortest path distance between two nodes in the human PPI background network, hd AA i represents the average distance between the target points of the component action, hd BB i represents the average distance between the key genes, and hd AB i represents the average distance between the small Chinese medicine molecular target set and the clini-cally known RA treatment target set on the background network.
Generally, when S AB < 0, the small molecule target A of TCM and the clinically known RA treatment target set B are close in network topology, indicating that the small molecule of TCM can interfere with the target set A and regulate the clinically known RA treatment target set B. When S AB ≥ 0, A and B are separated in the network topology, indicating that the small molecule of TCM has no significant regulatory intervention on B. Therefore, we can calculate the value of S AB to further judge whether the small molecules of TCM can interfere with the known clinical disease target set by regulating some other target proteins to further improve the treatment of the disease.
For identifying the clinically known RA treatment targets, we queried across two databases: Therapeutic Target Database (TTD, http://db.idrblab.net/ttd/) and DrugBank (https://www.drugbank.ca/). Thereafter, the intersection of the RA targets collected using the two databases was used to construct the final known RA clinical treatment target set.
In this study, we used R language (version 3.6.2) and igraph (version 1.2.5) to complete the aforementioned programming, calculations, and analyses.

Verification of the Targets through In Vitro
Cell Experiments 2.8.1. Materials, Reagents, and Instruments. Materials: rat synovial cells were induced using type II collagen and isolated in our laboratory.

Detection of the Effects of Small Molecules of TCM on
Target Proteins through Western Blotting. The sample proteins collected as described in Section 2.8.3 were boiled at 100°C for 5 min. Thereafter, equal amounts of protein from each sample were loaded on a 10% SDS-PAGE gel for electrophoresis. After transferring the proteins on the gel to a cellulose acetate membrane with transfer buffer, the blots were blocked with 5% skim milk powder for 1 h. The blots were then incubated with different concentrations of primary antibody overnight at 4°C, followed by TBS-T washes and incubation with horseradish peroxidase-labeled secondary antibody. ECL was used to develop chemiluminescence, and the images were captured.

Data Collection for Active
Ingredients of TCM and ADMET Analysis. After collection of published literature data and confirmation from the PubChem database, chemoinformatic data of 432 effective small molecules of TCM and their corresponding 3D structure SDF files were obtained (Supporting material 1). For small molecules without 3D structure files, first, the 2D structure files were downloaded, and then, OpenBabel was used to convert them into a 3D structure file. The MMFF94 force field was applied for energy optimization.
The canonical SMILES of 432 small molecules of TCMs were input into the ACD/Labs software and SwissADME for ADMET prediction evaluation. The results are shown in Figure 2. The evaluation results showed that among the 432 small molecules collected, 275 showed good Lipinski properties, 70 had moderate (moderate), and 87 exhibited low scores. The average, minimum, and maximum molecular weights of all small molecules were 395.05, 103.12, and 1468.52, respectively. In total, 160 small molecules were BBB permeant, 278 had good water solubility, and 164 were of poor solubility. A total of 24 small molecules showed high metabolic stability; however, a few showed temporary uncertainty, and 20 small molecules showed potential metabolic instability. Most small molecules had a good bioavailability score; only a few scored less than 0.55. Genotoxicity prediction (through Ames test) showed that 39 small molecules had potential mutagenicity, and 142 were clearly nontoxic; however, mutagenicity of 297 molecules could not be determined. The intestinal absorption of each small molecule was high. The lipophilicity (log P) evaluation showed that 330 small molecules exhibited optimal lipophilicity (optimal). The average value for the synthetic accessibility of small molecules was 4.69, suggesting that most compounds were easy to synthesize in the laboratory. The prediction of cardiac inhibitory toxicity (hERG) suggested that 19 small molecules had cardiotoxic properties (Supporting material 2).

RA Chip Data Collection and Differential Gene Analysis.
GSE55235, GSE55457, and GSE77298 were obtained from the GEO query to obtain the expression profile data related After differential gene analysis, 1071 DEGs were screened in GSE55235-599 were upregulated and 472 were downregulated; 312 DEGs were screened in GSE55457-189 were upregulated and 123 were downregulated; and 432 DEGs were screened in GSE77298-237 were upregulated and 195 were downregulated. The volcano diagram of GEO differential gene analysis of each group is shown in Figures 3(a)-3(c). In this study, we included the DEGs that concurrently appeared in any two chip expression datasets into the RA gene set. Therefore, a total of 267 RA-related DEGs were obtained, as shown in Figure 3(d).
We further examined the intersection shown in Figure 5(c) and found that there were five targets associated with both the key proteins in the PPI network and the genes in the RA-related pathways. More interestingly, the target proteins CCL5, CXCL10, CXCL11, CXCL6, and CXCL9 appeared unbiasedly in module 1 of the PPI network. However, three of them do not exist in the PDB database without an appropriate PDB structure file; thus, we only used two proteins, CCL5 (CC motif chemokine 5, PDB ID: 5L2U) and CXCL10 (CXC motif chemokine 10, PDB ID: 1O7Y), with complete PDB information files for follow-up research.
3.5. Molecular Docking Analysis. AutoDock Vina was used to perform bulk molecular docking on the two protein targets CCL5 and CXCL10, and the affinity values of 432 small molecules of TCM to act on the targets were obtained, except for the very few unsuccessful dockings. The distribution of the molecular docking scores of all small molecules of TCM on CCL5 and CXCL10 is shown in Figure 6.
In this study, we only listed the top 10 small molecules of TCM with 15 molecular docking scores, as shown in Table 2. Table 2 demonstrates that ginsenoside Re, asiaticoside, 14 Disease Markers ergotamine, neferine, polyphyllin II, dioscin, raddeanin A, berbamine, ginsenoside Rg1, and tubeimoside I, in turn, had better scoring results when docking with CCL5 than with CXCL10, while alpha-crocin, polyphyllin II, dioscin, digoxin, ergotamine, saikosaponin A, raddeanin A, polyphyllin VI, asiaticoside, and jujuboside A had higher scores for CXCL10 affinity than for CCL5. Among the small molecules, two saponins, polyphyllin II and dioscin, had affinities of less than -9.0. Based on molecular docking conformation analysis of these two small molecules, we hypothesized that a combination of dioscin with CCL5 and CXCL10 was most stable. Therefore, we selected dioscin for subsequent in-depth analysis, and the binding conformations of CCL5 and CXCL10 are shown in Figures 7(a) and 7(b).

Potential Target Prediction and Molecular Docking of Dioscin.
To comprehensively analyze the potential anti-RA mechanism of dioscin, we used HitPick, SEA, SwissTarget-Prediction, and STITCH to predict and screen the targets of dioscin. As a result, we found that dioscin had two potential targets, CXCR2 and IL2. In addition to the results of molecular docking screening presented in Section 3.5, four potential targets, CCL5, CXCL10, CXCR2, and IL2, were proposed for dioscin interaction.
Next, we verified the molecular docking of dioscin for CXCR2 (PDB ID: 6LFO) and IL2 (PDB ID: 4NEM). The results showed that the docking score of dioscin and CXCR2 was -9.1, which was identical to the docking score of dioscin and IL2 (-9.1). The molecular binding conformation also indicated that dioscin could form stable interactions with these two protein targets, as shown in Figures 7(c) and 7(d).

Mechanistic
Analysis of the Anti-RA Effects of Dioscin. Furthermore, GO function annotation and KEGG pathway enrichment analysis revealed that the four targets CXCR2, CCL5, CXCL10, and IL2 were mainly involved in GO functions, such as cellular calcium homeostasis (GO:0006874), calcium homeostasis (GO:0055074), cellular divalent inorganic cation homeostasis (GO:0072503), kinase regulator activity (GO:0019207), cytokine activity (GO:0005125), Gprotein-coupled receptor binding (GO:0001664), and external side of the plasma membrane (GO:0009897). The KEGG pathway was mainly enriched in 10 pathways. The interaction pathways between viral proteins and cytokines and cytokine receptors (hsa04061), the interaction pathway between cytokines and cytokine receptors (hsa04060), chemokine signaling pathway (hsa04062), cytoplasmic DNA sensing pathway (hsa04623), epithelial cell signaling pathway in Helicobacter pylori infection (hsa05120), Chagas disease pathway (hsa05142), Toll-like receptor signaling pathway (hsa04620), TNF signaling pathway (hsa04668), influenza A (hsa05164), and human cytomegalovirus infection pathway (hsa05163) are shown in Figure 8. Using the algorithm, we obtained the proximity values of dioscin to the 23 clinically known RA treatment targets on the human PPI network (S AB = −1:16 < 0), where hd AB i = 2:85, hd AA i = 4:62, hd BB i = 3:40. Therefore, we speculated that dioscin could indirectly regulate and intervene in other 22 known RA targets by directly acting on the four targets, i.e., CCL5, CXCL10, CXCR2, and IL2, in treating RA. Based on further analysis of the human PPI network, we found that CCL5, CXCL10, CXCR2, and IL2 interacted with 54 proteins in the human PPI network to regulate other therapeutic targets of RA, as shown in Figure 9.
Furthermore, we searched "rheumatoid arthritis" on the TCMSP website (https://tcmspw.com/) and obtained 42 small molecules of TCM with anti-RA effects and corresponding therapeutic targets (Supporting material 3). Accordingly, the proximity S AB of these 42 small Chinese medicine molecules on the human PPI network to 23 RA treatment targets was calculated. The results showed that the S AB values of these 42 small molecules were all less than 0, which confirmed that dioscin had potential anti-RA effects.  Cell proliferation experiments indicated that these eight small molecules of TCM had no significant effect on normal rat synovial cells, while the response of CIA rat synovial cells was significantly enhanced by dioscin, asiaticoside, and ginsenoside Re, with IC 50 values of 230:1 ± 2:1 μg/ml, 299:9 ± 2:8 μg/ml, and 270:2 ± 2:5 μg/ml, respectively.
Compared with the control group, the CIA rat synovial cells produced significantly increased levels of TNF-α and IL-1β. However, the TNF-α and IL-1β levels in the culture supernatant of the dioscin, asiaticoside, and ginsenoside Re groups were significantly reduced, which suggested that these TCMs could significantly inhibit the hypersecretion of TNF-α and IL-1β in CIA rat synovial cells (Figure 10(a)).
We further examined protein expression using Western blotting. Compared with the control and RA groups, the dioscin and asiaticoside groups showed significantly reduced expression of CCL5, while dioscin and ginsenoside Re groups showed significantly reduced expression of CXCL10. Similarly, CXCR2 expression was significantly inhibited in the dioscin, asiaticoside, and ginsenoside Re groups. Dioscin also had a certain inhibitory effect on IL2 expression. All the above-mentioned small molecules of TCM had similar effects to tripterygium glycosides, as shown in Figure 10

Discussion
In this study, we first collected 432 effective common small molecules of TCM and screened out 267 RA-relevant genes from 3 sets of RA gene expression profile data. The PPI network based on differential genes enclosed important nodes. Using biological function enrichment analysis, we identified five critical genes related to RA. Combined with literature analysis and existing protein databases, two chemokines (CCL5 and CXCL10) were selected for follow-up research. Target prediction and in vitro experiment results collectively suggested that dioscin had a direct regulatory effect on CXCR2, CCL5, CXCL10, and IL2.
Reportedly, chemokines play an important role in inflammatory cell infiltration in the joints of patients with RA [22], and chemokine blockers are considered potential drugs for the treatment of RA [23]. Previous bioinformatics analyses have shown that the expression of CCL5 is closely related to RA [24]. CCL5 is produced by circulating T cells and plays an active role in the chemotactic activity of T cells in RA [25]. The expression of CXCL10 is increased in the RA serum and synovium. It not only plays an important role in the homing of chemotactic leukocytes in RA inflammation but can also destroy the bone tissue by activating nuclear factor-κB ligands [26]. Studies have also found that the expression of CXCL10 in the peripheral blood and synovial fluid of RA patients is increased [27], and an increasing number of studies are now using CXCL10 as a novel target for RA treatment [28]. Similarly, the chemokine CXCR2 is known to be related to RA. CXCL1 promotes the expression of IL6 in synovial fibroblasts of osteoarthritis and RA patients via the CXCR2, c-Raf, MAPK, and AP-1 pathways [29]. Therapeutic blockade of CXCR2 can rapidly clear inflammation in arthritis and atopic dermatitis models [30]. Studies have shown that IL2 is a pleiotropic cytokine that can promote inflammation and maintain immune tolerance. Consequently, it is now an emerging therapeutic target for many anti-RA drugs [31].
Dioscin is found in medicinal plants such as Dioscoreanigra, D. zingiberensis, and D. fuzhouensis. Dioscin exerts many therapeutic effects on various diseases [32,33], such as regulating the inhibitory effect of miR-125a-5p on STAT3 signaling and reducing glucose and lipid metabolism disorders in Type 2 diabetes (T2DM) suggesting that it exhibits anti-T2DM activity [34]. In an animal model of hyperuricemia, the effects of dioscin on the reduction of serum uric acid levels and the enhancement of uric acid excretion have been reported [35]. Moreover, dioscin inhibits the M2 polarization of macrophages in the JNK and STAT3 pathways, thereby triggering antitumor immunity [36]. Other studies have further shown that dioscin has similar pharmacokinetic characteristics to dexamethasone [37] and exerts antiarthritis effects by inhibiting the immune response of Th17 cells [38]. In addition, dioscin inhibits osteoclast differentiation and bone resorption by downregulating the Akt signaling cascade [39]. In our study, dioscin-targeted enrichment analysis of the KEGG pathway also showed that the resultant Toll-like receptor signaling pathway (hsa04620) and TNF signaling pathway (hsa04668) were the main signaling pathways related to RA. Therefore, accumulating evidence supports that, as a natural small molecule, dioscin has multiple therapeutic effects on RA and is worthy of indepth study.
In this study, we used an algorithm based on network proximity to innovatively detect the regulation and intervention ability of dioscin on the known clinically therapeutic targets of RA. The results clearly showed that dioscin, by acting directly on four targets, indirectly intervenes and affects 54 related proteins and eventually regulates the main therapeutic targets of RA. We found that, in the network of dioscin-regulated RA targets, UBC (degree = 18), ARRB1 (degree = 14), APC (degree = 8), VCAN (degree = 7), GRB2 (degree = 6), IL2RB (degree = 6), FGR (degree = 6), and GNA15 (degree = 5) had higher degree values. The discovery of these target proteins in the network provides new insights for further related research and could serve as an important reference for the development of other RA drugs.

Conclusion
In summary, this study comprehensively used bioinformatics, molecular docking-based virtual screening, network modeling, and other computational methods; collected 432 commonly used TCM active ingredients; and analyzed the