Analyzing the Therapeutic Mechanism of Mongolian Medicine Zhonglun-5 in Rheumatoid Arthritis Using a Bagging Algorithm with Serum Metabonomics

Rheumatoid arthritis (RA) is a complex autoimmune disorder. Zhonglun-5 (ZL), a traditional Mongolian medicine, exhibits an excellent clinical effect on RA; however, its molecular mechanism remains unclear. In this study, rat serum metabolomic analysis was performed to identify potential biomarkers for RA and to investigate the treatment mechanism of ZL. A Dionex Ultimate 3000 ultrahigh-performance liquid chromatography system coupled with a Q-Exactive Focus Orbitrap mass spectrometer was used for metabonomics analysis. A bootstrap aggregation (bagging) classification algorithm was applied to process data from the control (CG), model (MG), and ZL administration groups. With 18 training samples and 6 testing samples, the classification accuracy was 100.00% (6/6) in the decision tree model and 83.33% (5/6) in the K-nearest neighbor (KNN) model. Using volcano plot analysis, 24 biomarkers were identified between CG and MG, including those related to glycosphingolipid biosynthesis, arachidonic acid, fatty acids, amino acids, bile acids, vitamins, and sphingolipids. A combined diagram of the heatmap and drug-biomarker network of the potential biomarkers was constructed. After ZL administration, the levels of these biomarkers returned to normal, indicating that ZL had a therapeutic effect in rats with RA. This study established a solid theoretical foundation to promote further research on the clinical applicability of ZL.


Introduction
The etiology of rheumatoid arthritis (RA), an autoimmune disease, remains unknown. RA is mainly characterized by inflammatory synovitis and extra-articular organ involvement. In the worst case, the joints become deformed, leading to the loss of function. RA pathogenesis is related to heredity, infection, and sex hormone levels [1,2]. Inflammatory cell infiltration, pannus formation, and cartilage destruction occur in the diseased joint, which can cause great pain in patients. At present, the main therapeutic drugs for RA are allopathic medicines; many of these drugs have been reported to induce toxicity and side effects that reduce patients' quality of life. Patient immunity may be significantly impaired after taking certain medicines, increasing the risk of bacterial infection and cancer. Therefore, their clinical application is limited.
Traditional Mongolian medicine includes a complete theoretical system for treating disease by aiming to comprehensively regulate body function. Zhonglun-5 (ZL), a traditional Mongolian medicine, is composed of Sophora flavescens Ait (SFA), Gardenia jasminoides Ellis, Fructus toosendan, Terminalia chebularensis, and Lomatogonium rotatum. As a traditional treatment for hot yellow water disease, swimming pain syndrome, gout, and other diseases, it has been reported to clear heat, cool the blood, relax the tendons, and relieve pain. Clinically, ZL has exhibited significantly beneficial effects on RA [3]. As the primary active component of ZL, SFA plays a vital role in inhibiting RA by regulating the Th1/Th2 cytokine response via attenuated NF-κB signaling, thereby inhibiting disease progression [4]. Oxymatrine is a monomeric alkaloid extracted from SFA that can regulate the imbalance between regulatory T cells and T helper 17 (Th17) cells in RA rats, yielding a good protective effect [5]. Although SFA itself has a strong therapeutic effect, ZL may exhibit a unique molecular mechanism against RA.
At present, metabonomics data are primarily classified using principal component analysis approaches, and a large amount of the original data is often lost during data processing. If the cumulative contribution rate of the first several principal components is low, the model is unqualified, and the obtained classification accuracy is very low. The bootstrap aggregation (bagging) algorithm is an ensemble learning algorithm used in the field of machine learning. As a representative algorithm in the data-mining field, bagging can be combined with other classification and regression algorithms to improve accuracy and stability and to avoid overfitting [6]. Compared to traditional omics data classification methods, this algorithm does not discard any original data. In addition, the bagging algorithm is advantageous for solving omics data classification problems associated with small sample sizes, high dimensions, and sparseness [7]. For example, owing to the differences in functional brain organization between individuals, improving the reproducibility of neuroimaging measurements is a major obstacle to the development of human neuroscience. Recent research shows that bagging can improve the reproducibility of functional parcellation without long scans. Across two large datasets, bagging improved the reproducibility of cortical and subcortical functional parcellation under a range of parameter settings compared to the standard clustering framework [8]. Furthermore, in research addressing the diagnosis of coronavirus disease 2019 (COVID-19), Zhang et al. proposed a bagging dynamic deep learning network (B-DDLN) to diagnose COVID-19 through intelligent recognition of chest symptoms in X-ray images. The results showed that the accuracy of B-DDLN was 98.8889%, and B-DDLN had the best diagnostic performance among the existing diagnostic methods evaluated on open image sets [9].
Applying the bagging algorithm to metabonomics data can not only significantly improve classification accuracy but also expand the range of existing classification methods. In this study, after data processing, disease-related biomarkers were screened, and the regulatory effects of ZL on markers related to metabolic pathways were investigated. This study revealed the molecular mechanisms underlying the role of ZL in RA treatment.

Adjuvant-Induced Arthritis Model Design and Treatment.
This study was approved by the Ethics Committee of the Affiliated Hospital of Inner Mongolia Minzu University (NMMZDX2020[K]0034). Adjuvant arthritis is an inflammatory reaction model mediated by cellular immunity. The antigen enters the body and sensitizes T cells. When they contact the antigen again, the sensitized T cells differentiate and proliferate, release various lymphokines, and directly kill target cells, causing inflammation. Polyarthritis is the defining feature of this model, which makes it close to clinical human rheumatoid arthritis. To develop an animal model of RA treatment with ZL, 24 male Wistar rats (200 ± 10 g) were purchased from Shenyang Aikesaisi Biotechnology Co., Ltd. (Shenyang, China). All animals were acclimatized to the laboratory for one week before the experiment. The rats were divided into three groups: control, model, and ZL administration groups (CG, MG, and ZL, respectively), with eight rats in each group. On day 1, MG and ZL rats were intradermally injected with 0.1 mL complete Freund's adjuvant (CFA) in the right posterior toe, and CG rats were injected with 0.1 mL saline. After 7 days, MG and ZL rats were injected with 0.1 mL CFA again. From day 14, the rats in the ZL group were administered Zhonglun-5 at a dose of 0.86 g/kg/day for 28 consecutive days, and on day 42, all rats were euthanized. Blood was collected from the hepatic portal vein and centrifuged at 3500 rpm for 10 min at 4°C. The supernatants were immediately frozen and stored at −20°C. The articular cartilage was fixed in 10% formaldehyde for paraffin embedding [10].

Biochemical and Histological Analysis.
The treated articular cartilage was stained with hematoxylin and eosin. The sections of articular cartilage were observed under a microscope. A Multiskan FC Microplate Reader (Fisher Scientific, USA) was used to measure changes in SOD, MDA, TNF-α, and IL-1β levels in the different groups.

Serum Sample Preparation.
The serum samples were thawed before analysis, and 100 μL aliquots were added to 400 μL acetonitrile, followed by vortexing for 30 s and centrifugation at 12000 rpm for 10 min at 4°C. The supernatant was subsequently filtered through a 0.22 μm filter membrane. Optimized mass spectrum conditions were as follows: the flow rates of the sheath and auxiliary gas were 45 and 10 bar, respectively. The spray voltage was 3.5 kV, and the capillary and auxiliary gas heater temperatures were 320 and 350°C, respectively.

UPLC-MS
The MS data were collected in switching mode between positive and negative spectra. The mass scan range was 100-1000 Da. The resolution of the full MS scan was 70000. In the dd-MS2 discovery mode, the resolution was 17500. The MS2 collision energy was 35 eV.

Data Analysis.
Every day, eight pooled quality control samples were used to test the stability of the instrument. Peak detection, alignment, and normalization were performed using Compound Discoverer software (CD, version 2.0). MATLAB 2012 software was used to process the data based on the bagging algorithm. CD was used to generate the volcano plot, while the Statistical Package for the Social Sciences (SPSS, version 20.0) was applied for the independent-samples t-test. Potential biomarkers were screened according to fold change (>2-fold change in content) and P value (P < 0.05). HemI software was used to show intergroup changes in biomarkers [11]. The Human Metabolome Database was used to identify potential biomarkers. A combined diagram of the heatmap and drug-biomarker network of potential biomarkers was constructed using Cytoscape 3.2.1.
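The screening rule above (fold change greater than 2 together with a significant independent-samples t-test) can be illustrated with a minimal sketch. This is not the CD or SPSS procedure used in the study: the function name `screen_feature`, the peak areas, and the use of a tabulated critical value in place of an exact P value are all assumptions made for illustration. The critical value 2.145 applies only to a two-sided 5% test with 14 degrees of freedom (two groups of eight).

```python
import math
import statistics

# Two-sided 5% critical value of Student's t for 14 degrees of freedom
# (8 + 8 - 2). Used here instead of an exact P value; an assumption of
# this sketch, not the SPSS output.
T_CRIT_DF14 = 2.145


def screen_feature(control, model, min_fold=2.0, t_crit=T_CRIT_DF14):
    """Return (log2 fold change, t statistic, passes both thresholds)."""
    n1, n2 = len(control), len(model)
    m1, m2 = statistics.mean(control), statistics.mean(model)
    log2_fc = math.log2(m2 / m1)
    # Pooled-variance (Student) t statistic for two independent samples.
    pooled = ((n1 - 1) * statistics.variance(control)
              + (n2 - 1) * statistics.variance(model)) / (n1 + n2 - 2)
    t_stat = (m2 - m1) / math.sqrt(pooled * (1 / n1 + 1 / n2))
    # A feature is kept if it changes more than min_fold in either
    # direction and the group means differ significantly.
    hit = abs(log2_fc) > math.log2(min_fold) and abs(t_stat) > t_crit
    return log2_fc, t_stat, hit


# Hypothetical peak areas for one feature, eight rats per group.
cg = [100, 110, 95, 105, 98, 102, 107, 99]
mg = [230, 250, 240, 260, 225, 245, 255, 235]
unchanged = [101, 108, 97, 104, 99, 103, 106, 100]

print(screen_feature(cg, mg))         # large change: flagged as a candidate
print(screen_feature(cg, unchanged))  # no real change: not flagged
```

On a volcano plot, the first two returned values correspond to the horizontal axis (log2 fold change) and to the quantity from which the vertical axis (significance) is derived.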

Application of Bagging Algorithm for Classification.
In this study, metabonomic data on ZL for the treatment of RA were classified. The dataset consists of three types of samples (CG, MG, and ZL administration groups, n = 8 for each group, 24 samples in total), and each sample has 2286 dimensions.
The bagging algorithm was used to resample the metabonomics data; that is, k new metabonomics datasets were drawn from the original dataset through bootstrap sampling to train the classification models. Because sampling is performed with replacement, these new datasets may contain repeated samples. The trained classifiers are then used to classify metabonomics samples, and their outputs are combined by majority voting or output averaging. The category with the highest result is the final label. This method can reduce the overfitting problem associated with a single classification model, improve the learning effect, and generate accurate predictions.
To increase the diversity of the models being combined, the data used to train each model were randomly drawn from the training set. For the bagging algorithm, the number of samples drawn equals the size of the training set. Thus, each sampling set contains the same number of samples as the training set, but its content differs. For example, if we randomly sample k times from a training set containing n samples, the k sample sets differ from one another owing to randomness. This method usually considers homogeneous weak learners, trains them independently in parallel, and combines them according to a deterministic averaging process. The bagging classification model is illustrated in Figure 1.
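The resampling and voting steps described above can be sketched in a few lines. This is a minimal illustration and not the MATLAB implementation used in the study: the function names, the nearest-centroid base learner, and the two-dimensional toy data are all hypothetical stand-ins chosen to keep the example short.

```python
import random
from collections import Counter


def bootstrap_sample(samples, labels, rng):
    """Draw len(samples) items with replacement (one bootstrap round)."""
    idx = [rng.randrange(len(samples)) for _ in samples]
    return [samples[i] for i in idx], [labels[i] for i in idx]


def majority_vote(predictions):
    """Return the label predicted by the largest number of base learners."""
    return Counter(predictions).most_common(1)[0][0]


def fit_nearest_centroid(xs, ys):
    """Hypothetical weak learner: assign a query to the closest class mean."""
    groups = {}
    for x, y in zip(xs, ys):
        groups.setdefault(y, []).append(x)
    centroids = {y: [sum(col) / len(col) for col in zip(*rows)]
                 for y, rows in groups.items()}

    def predict(query):
        return min(centroids, key=lambda y: sum(
            (a - b) ** 2 for a, b in zip(query, centroids[y])))
    return predict


def bagging_predict(train_x, train_y, query, fit, k=25, seed=0):
    """Train k base learners on k bootstrap samples and combine their votes."""
    rng = random.Random(seed)
    votes = []
    for _ in range(k):
        bx, by = bootstrap_sample(train_x, train_y, rng)
        model = fit(bx, by)  # fit returns a callable: model(query) -> label
        votes.append(model(query))
    return majority_vote(votes)


# Toy two-feature data standing in for the 2286-dimensional samples.
train_x = [[0, 0], [1, 0], [0, 1], [10, 10], [11, 10], [10, 11],
           [0, 10], [1, 10], [0, 11]]
train_y = ["CG"] * 3 + ["MG"] * 3 + ["ZL"] * 3
print(bagging_predict(train_x, train_y, [10.5, 10.2], fit_nearest_centroid))
```

Any classifier exposing the same `fit` interface, such as a decision tree or KNN model, could be substituted for the nearest-centroid learner.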
Suppose that an ensemble of k classification models is built and that the error of the i-th model on each sample is $\epsilon_i$. Assume that the errors follow a zero-mean multivariate normal distribution with variance $\mathbb{E}[\epsilon_i^2] = v$ and covariance $\mathbb{E}[\epsilon_i \epsilon_j] = c$. The error of the averaged ensemble prediction is $\frac{1}{k}\sum_i \epsilon_i$, and the expectation of its square is as follows:

$$\mathbb{E}\left[\left(\frac{1}{k}\sum_i \epsilon_i\right)^2\right] = \frac{1}{k^2}\,\mathbb{E}\left[\sum_i\left(\epsilon_i^2 + \sum_{j \neq i}\epsilon_i \epsilon_j\right)\right] = \frac{1}{k}v + \frac{k-1}{k}c. \tag{1}$$

When the errors are perfectly correlated (c = v), the mean squared error reduces to v; thus, the ensemble offers no improvement. If the errors are completely uncorrelated (c = 0), the expected squared error of the ensemble is only v/k, which means that it decreases in inverse proportion to the ensemble size. In other words, the ensemble should perform at least as well as any of its individual members, and significantly better than them if the errors of the individual models are independent.
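The closed form in equation (1) can be checked numerically. The following sketch is an illustration only: it assumes 0 ≤ c ≤ v and builds correlated errors from one shared Gaussian component plus an independent component per model, which is one convenient construction among many. The function names are hypothetical.

```python
import math
import random


def expected_ensemble_mse(k, v, c):
    """Closed form from equation (1): v/k + (k - 1)c/k."""
    return v / k + (k - 1) * c / k


def simulated_ensemble_mse(k, v, c, trials=20000, seed=0):
    """Monte Carlo estimate with error variance v and pairwise covariance c.

    Each error is shared + own, where shared has variance c and own has
    variance v - c, so Var = v and Cov = c (requires 0 <= c <= v).
    """
    rng = random.Random(seed)
    shared_sd, own_sd = math.sqrt(c), math.sqrt(v - c)
    total = 0.0
    for _ in range(trials):
        shared = rng.gauss(0.0, 1.0) * shared_sd  # component common to all models
        errors = [shared + rng.gauss(0.0, 1.0) * own_sd for _ in range(k)]
        total += (sum(errors) / k) ** 2
    return total / trials


for c in (0.0, 0.3, 1.0):
    print(c, expected_ensemble_mse(10, 1.0, c), simulated_ensemble_mse(10, 1.0, c))
```

With k = 10 and v = 1, the closed form gives 0.1 for uncorrelated errors, 0.37 for c = 0.3, and 1 for perfectly correlated errors; the simulated values should approximate these up to sampling noise.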
In this experiment, n training samples were randomly drawn with replacement from the original sample set, and this was repeated for k rounds to obtain k training sets that were drawn independently of each other and could contain repeated elements. For the k training sets, k decision trees and k K-nearest neighbor (KNN) models were trained, and the final classification results were obtained by majority voting. KNN is a supervised classification algorithm. If most of the K samples most similar to a given sample in the feature space belong to a certain category, then the sample is also assigned to this category [12,13]. In each experiment, samples were assigned to the training and test sets. The number of samples drawn in each round, n, was the same as the number of samples in the training set, and the remaining samples formed the test set used to verify the classification accuracy. The classification experiment was repeated with different proportions of training and test data.
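The KNN rule and the hold-out evaluation described above can be sketched as follows. This is a hedged illustration, not the study's MATLAB code: the function names, the choice of three neighbors, the Euclidean distance, and the synthetic well-separated data are assumptions, and the sketch evaluates a single KNN model rather than a bagged ensemble.

```python
import math
import random
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Label query by majority vote among its k nearest training samples."""
    nearest = sorted(range(len(train_x)),
                     key=lambda i: math.dist(train_x[i], query))[:k]
    return Counter(train_y[i] for i in nearest).most_common(1)[0][0]


def split_and_score(xs, ys, n_train, seed=0, k=3):
    """Hold out len(xs) - n_train samples; return the fraction classified correctly."""
    rng = random.Random(seed)
    order = list(range(len(xs)))
    rng.shuffle(order)
    train, test = order[:n_train], order[n_train:]
    tx, ty = [xs[i] for i in train], [ys[i] for i in train]
    correct = sum(knn_predict(tx, ty, xs[i], k) == ys[i] for i in test)
    return correct / len(test)


# Synthetic stand-in for the 24 serum samples: three well-separated groups
# of eight points each (the real data have 2286 features per sample).
centers = {"CG": (0.0, 0.0), "MG": (10.0, 0.0), "ZL": (0.0, 10.0)}
xs, ys = [], []
for label, (cx, cy) in centers.items():
    for j in range(8):
        xs.append([cx + 0.1 * j, cy + 0.1 * j])
        ys.append(label)

print(split_and_score(xs, ys, n_train=18))  # 18 training, 6 testing samples
```

Because the synthetic groups are far apart, every held-out sample is classified correctly here; real metabonomics data would not separate this cleanly, and the accuracy would depend on the split.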

Results and Discussion
Biochemistry and Histology.
The serum biochemical parameters are shown in Figure 2. As an antioxidant metalloenzyme in organisms, SOD promotes the conversion of superoxide anion radicals into oxygen and hydrogen peroxide, maintaining the balance between oxidation and antioxidation. Thus, it is crucial in many disease processes. MDA content is a parameter that reflects the potential antioxidant capacity in vivo. An increase in MDA content indicates an increased degree of tissue peroxidation damage. As shown in Figure 2(a), the SOD content decreased and the MDA content increased in MG compared to CG, indicating that serious peroxidation occurred in RA rats. After ZL administration, the contents of SOD and MDA normalized, indicating that ZL exhibited antioxidant activity.
TNF-α and IL-1β are closely associated with RA development. They induce excessive secretion of matrix metalloproteinases and can cause severe damage to the joints. As shown in Figure 2(b), the levels of TNF-α and IL-1β increased significantly in the MG compared to those in the CG. After ZL administration, the levels of TNF-α and IL-1β decreased compared to those in MG, indicating that ZL exhibits anti-inflammatory activity.
The histopathology of each group is shown in Figure 3. RA resulted in numerous panni (yellow stripes in Figure 3(b)) in the MG. Synovial hyperplasia, neovascularization, and leukocyte extravasation transform the normal acellular synovium into an invasive pannus. An imbalance in the microvascular structure leads to an insufficient synovial oxygen supply. With an increase in the metabolic turnover of the expanding synovial pannus, a hypoxic microenvironment is formed in the synovium. Therefore, abnormal cell metabolism and mitochondrial dysfunction occur, which promotes the development of RA [14]. After ZL administration (Figure 3(c)), the pannus was significantly reduced, indicating that ZL had a positive effect on RA.

Data Classification Results.
The serum total ion chromatograms of CG, MG, and ZL are shown in Figure 4. Small differences between the groups were observed using the bagging algorithm.
The data classification results are presented in Table 1. In general, the more training samples there are, the more reliable the model and the higher the grouping accuracy. The results of the decision tree were better than or equal to those of KNN. With 18 training samples and 6 testing samples, the classification accuracy was 100.00% (6/6) in the decision tree model and 83.33% (5/6) in the KNN model. With 15 training samples and 9 testing samples, the classification accuracy was 88.89% (8/9) in the decision tree model and 77.78% (7/9) in the KNN model. With 12 training samples and 12 testing samples, the classification accuracy was 75.00% (9/12) in both the decision tree and KNN models. Finally, with 9 training samples and 15 testing samples, the classification accuracy was 73.33% (11/15) in the decision tree model and 66.67% (10/15) in the KNN model. The high classification accuracy laid a good foundation for screening biomarkers and inferring metabolic pathways.
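The percentages reported above follow directly from the counts of correctly classified test samples. The short sketch below recomputes them; the dictionary layout and the function name `accuracy_percent` are illustrative choices, while the counts are those stated in the text.

```python
# (training samples, testing samples): (decision tree correct, KNN correct)
results = {
    (18, 6): (6, 5),
    (15, 9): (8, 7),
    (12, 12): (9, 9),
    (9, 15): (11, 10),
}


def accuracy_percent(correct, total):
    """Classification accuracy as a percentage, rounded to two decimals."""
    return round(100 * correct / total, 2)


for (n_train, n_test), (tree_ok, knn_ok) in results.items():
    print(n_train, n_test,
          accuracy_percent(tree_ok, n_test),
          accuracy_percent(knn_ok, n_test))
```

With only 6 test samples, each sample accounts for about 16.67 percentage points, so the gap between 100.00% and 83.33% corresponds to a single misclassified sample.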

Identification of Potential Biomarkers.
A volcano plot was used to detect potential biomarkers (Figure 5). Between CG and MG, 24 metabolites were screened that were involved in arachidonic acid metabolism, glycosphingolipid biosynthesis, fatty acid metabolism, amino acid metabolism, bile acid metabolism, vitamin metabolism, and sphingolipid metabolism. After ZL treatment, all metabolites returned to normal levels, indicating that ZL may have a therapeutic effect on RA by affecting various metabolic pathways. The combined diagram of the heatmap and drug-biomarker network of the potential biomarkers that can be regulated by ZL is shown in Figure 6. The biomarker information is summarized in Table 2.

Biological Relevance.
Prostaglandins (PGs) are lipid mediators produced by the enzymatic metabolism of arachidonic acid, a 20-carbon polyunsaturated fatty acid, and are abundant in bodily fluids. PGs combine with specific receptors and mediate multiple cellular activities, such as cell proliferation, differentiation, and apoptosis. In addition, PGs participate in the pathological processes of inflammation, cancer, and various cardiovascular diseases [15]. PGs are lipid-signaling factors released during the early stages of RA. They maintain immune system inflammation by regulating the differentiation and maturation of immune cells and cytokine production. PGs are conducive to leukocyte infiltration, synovial hyperplasia, and angiogenesis in synovitis and participate in cartilage degradation and bone resorption. PGs are also important mediators of joint pain regulation and can protect joints during the late stage of RA inflammation [16,17]. NF-κB is the main switch of proinflammatory genes, which can activate arachidonic acid pathway enzymes and lead to inflammation [18]. After ZL administration, the levels of prostaglandin G1 and 10-hydroperoxy-H4-neuroprostane returned to normal, indicating that ZL had a regulatory effect on arachidonic acid metabolism.
Gangliosides are important glycosphingolipids that are abundant in nerve endings and assist in transmitting nerve impulses. In patients with RA, ganglioside levels in the synovium are significantly decreased compared to healthy individuals. In the RA mouse model, ganglioside deficiency exacerbated inflammatory arthritis. In addition, the destruction of gangliosides can induce T cell activation in vivo and promote excessive production of RA-related cytokines. These findings suggest that gangliosides play key roles in RA pathogenesis and progression [19]. After ZL administration, ganglioside content normalized.
Fatty acid metabolism is closely associated with RA [20]. Eicosapentaenoic acid (EPA), an ω-3 fatty acid (ω-3 FA), is mainly found in fish oil. EPA has been found to inhibit inflammation via several mechanisms, including reducing the expression of adhesion molecules and T cell response activity, inhibiting the production of prostaglandins and leukotrienes from arachidonic acid, and inhibiting the production of inflammatory cytokines. The anti-inflammatory mechanisms of ω-3 FAs include changing the composition of phospholipid fatty acids in cell membranes, disrupting lipid rafts, and inhibiting the activation of proinflammatory transcription factors, thereby reducing the expression of inflammatory genes [21]. α-Linolenic acid inhibits arachidonic acid metabolism, resulting in the inhibition of the production of proinflammatory n-6 eicosanoids and a decrease in vascular permeability [22]. After ZL administration, the content of various fatty acids normalized.

Evidence-Based Complementary and Alternative Medicine
RA can lead to an imbalance in amino acid and bile acid metabolism. Amino acid metabolism is a key regulator of the immune response and can provide new drug targets for safer and more effective RA treatment [23]. Glycine is a potential immunomodulator that prevents reactive arthritis by increasing the influx of chloride ions through glycine-gated chloride channels and slowing the release of cytokines by macrophages [24]. Bile acid is an important component of bile that plays an important role in fat metabolism. MCP-induced protein (MCPIP) is a new zinc finger protein that participates in inflammatory angiogenesis. Tauroursodeoxycholic acid can block endoplasmic reticulum (ER) stress and inhibit inflammatory angiogenesis induced by MCPIP [25]. After ZL administration, the levels of various amino acids and bile acids returned to normal, indicating that ZL can regulate amino acid and bile acid metabolism.
RA affects vitamin E metabolism. As a hydrolytic product of vitamin E, tocopherol can be further metabolized to either 13′-carboxy-γ-tocopherol or α-tocotrienol. As a fat-soluble antioxidant, tocopherol captures free radicals produced by lipid oxidation in cell membranes and exhibits antioxidant effects [26]. Oxygen free radicals are considered mediators of tissue damage in RA patients. When the α-tocopherol content in the blood is low, oxygen free radicals cannot be removed efficiently, and the risk of RA increases [27]. After ZL administration, 13′-carboxy-γ-tocopherol and α-tocotrienol levels normalized.
RA can also lead to phospholipid and sphingolipid metabolism disorders. Phosphatidylcholine (PC) is an oily substance present in animal tissues and egg yolk. PCs mainly include phosphoric acid, choline, fatty acids, glycerol, glycolipids, triglycerides, and phospholipids. PC is an important component of cell membranes, alveolar surfactants, lipoproteins, and bile as well as a source of lipid messengers such as lysophosphatidylcholine, phosphatidic acid, diglyceride, lysophosphatidic acid, and arachidonic acid [28]. PC and its derived metabolites have shown anti-inflammatory activity under various stress conditions. Experimental studies reported that rats fed PC exhibited reductions in arthritis-induced hypersensitivity, the frequency of leukocyte-endothelial interactions, and the range of functional capillary density. PC also reduces tissue damage by lowering the expression of nitric oxide synthase [29]. Ceramides are sphingolipids composed of sphingosine long-chain bases and fatty acids. Ceramides can regulate cell differentiation, proliferation, apoptosis, aging, and other life activities [30]. In RA, the balance between cell proliferation and apoptosis is impaired, which leads to excessive growth of synovial cells in rheumatoid joints and aggravates the destruction of joints. Ceramides mediate multiple cellular functions as lipid messengers. After ceramide pretreatment, the cell cycle progression of synovial cells was completely inhibited, and the symptoms of RA were relieved [31]. Phytosphingosine is a ceramide precursor. As a lipid component of the skin, it exerts a natural repairing effect on barrier function. Studies have shown that phytosphingosine derivatives can effectively inhibit the inflammatory response [32]. After ZL administration, PC(22:5(7Z,10Z,13Z,16Z,19Z)/16:0), Cer(d18:0/16:0), and phytosphingosine contents normalized, indicating that ZL can regulate phospholipid and sphingolipid metabolism.

Conclusions
In summary, ZL alleviated RA in a rat model. The bagging algorithm was applied to process the omics data, and with 18 training samples and 6 testing samples the classification accuracy reached 100.00% with the decision tree model and 83.33% with the KNN model. A total of 24 biomarkers related to RA were identified, involving multiple metabolic pathways such as glycosphingolipid biosynthesis and the metabolism of arachidonic acid, fatty acids, amino acids, bile acids, vitamins, and sphingolipids. After ZL administration, the levels of these biomarkers returned to normal. In future studies, we will examine the effects of ZL and its components on metabolic pathways in an RA rat model to clarify the compatibility mechanism of ZL.

Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest
The authors declare that they have no conflicts of interest.