Exploratory Compatibility Regularity of Traditional Chinese Medicine on Osteoarthritis Treatment: A Data Mining and Random Walk-Based Identification

Osteoarthritis (OA) is a complex degenerative disease and a growing public health problem worldwide. Based on an in-house database of clinical records from 13,083 OA patients, the Traditional Chinese Medicine (TCM) herbs were divided into four categories according to their curative properties. Because conventional data mining approaches to TCM compatibility, such as descriptive statistics and frequency analysis, yield results that lack depth and do not expose internal relationships, we used several multidimensional complex network methods that can effectively uncover compatibility rules, including similarity measures, graphical visualization of network diagrams, random walks, and propensity score methods. We summarize the couplet medicines commonly used for the treatment of OA. The similarity measure was used to investigate the commonly used drugs, and association rule analysis was used to identify the compatibility between components. Propensity score analysis showed that, compared with a single drug, the drug combination improved ESR, CRP, C3, C4, IgG, and IgA more effectively. Finally, a random walk model was constructed to assess drug efficacy: while revealing the compatibility among different TCM components, it was used to analyze their therapeutic efficacy against OA. We obtained four groups of drug combination clusters by similarity measure and 11 pairs of highly connected drugs by association rules, which are the cardinal drug combinations in prescriptions for the treatment of OA. We also found that different traditional drug pairs were associated with different laboratory indexes and that drug combinations could better optimize laboratory indexes. This study shows that TCM constituents complement one another and that the therapeutic effects of different combinations of these constituents differ considerably.


Introduction
As a chronic progressive joint disease, osteoarthritis harms the physical and mental health of middle-aged and elderly patients [1]. Pathologically, OA is characterized by simultaneous catabolic and anabolic processes that cause changes in all joint tissues. Chondrogenic degeneration, ectopic bone formation, subchondral osteosclerosis, ligament and meniscus injuries, and synovial injury in the articular cavity are the main disease markers [2]. Most Western medical treatments for OA have their own shortcomings [3]. Guided by a holistic view, TCM can prevent and treat osteoarthritis at multiple levels, through multiple targets, and by multiple routes. Moreover, traditional Chinese medicine has shown advantages in reducing the side effects of drugs, reversing drug resistance, and improving the quality of life and survival rate of patients [4].
A Chinese medicine prescription is the treatment of a given disease by a clinician who, guided by the theory of Chinese medicine, combines medicines so as to maximize their strengths and avoid their weaknesses, adjust their bias, reduce their toxicity, enhance or change their original effect, and eliminate or alleviate their harm to the human body. The pharmacological and pharmacodynamic relationship between herbs is regarded as the compatibility of Chinese medicine treatment. The law of compatibility of TCM is a key issue in the study of Chinese medicine prescriptions, and the core aim of prescription research is to uncover this law and clarify its scientific meaning. As an effective technique for exploring traditional Chinese medicine, data mining has developed a set of standardized research models and systems for mining the compatibility law of TCM, which can better reveal the compatibility relationships between drugs [5].
This research is designed to explore the compatibility regularity and clinical therapeutic role of TCM in the treatment of osteoarthritis. We use a variety of data mining tools, such as cluster analysis, association rules, graphical visualization of network graphs, propensity score methods, and random walk models, to analyze clinical OA data. The data mining methods can analyze the frequency of herbs, the rules of formulation, and the changes in formulation patterns obtained from the knowledge graph.

Materials.
Data were collected from inpatients with OA treated between July 2009 and March 2021 in the Department of Rheumatology and Immunology of the First Affiliated Hospital of Anhui University of Chinese Medicine. The dataset includes the use of Chinese herbal medicine, the Huangqin Qingre Chubi capsule prescription preparation, Furong ointment, and disease-associated laboratory indices such as the inflammatory markers CRP and ESR and the immune indexes IgA, IgM, IgG, C3, and C4. The Ethics Committee of the First Affiliated Hospital of Anhui University of Chinese Medicine approved the study protocol. A total of 13,776 patients with OA were examined, of whom 13,083 were treated with Chinese herbal medicine. These patients were divided into a control group (Chinese herbal medicine alone) and an experimental group (Chinese herbal medicine plus the Huangqin Qingre Chubi capsule/Furong ointment prescription preparation). The control group had 7,119 cases, and the experimental group had 5,964 cases.

Similarity Measure.
We used each Chinese herbal medicine as a variable, coded as 1 if used and 0 if not used. Systematic (hierarchical) cluster analysis was performed in SPSS 22.0 (IBM Corp., Armonk, NY, USA) to study the compatibility of Chinese herbal medicines. We treat each herb as a point and calculate the distance between the points. Herbs that are close to each other fall into one category, and those that are far apart fall into different categories. The similarity of Chinese herbs was calculated using the mean Euclidean measure [6]. Let $x_i$ and $y_i$ ($i = 1, 2, \ldots, n$) be the points of signal x and signal y in the metric space; the Euclidean distance between x and y can be defined as follows [7]:
\[ d(x, y) = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2} \]
In the above equation, $x_i$ and $y_i$ are the ith sampling data points of signal x and signal y, respectively, and n is the total number of sampling points. The equation shows that the higher the similarity between signal x and signal y, the smaller the Euclidean distance d(x, y); conversely, the lower the similarity, the larger the distance.
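As a rough illustration of the similarity measure described above, the following Python sketch computes Euclidean distances between 0/1 herb-usage vectors and groups herbs by agglomerative clustering. It is not the SPSS procedure used in the study: the average-linkage rule, the cutoff value, and the toy prescription data are all assumptions made for illustration.

```python
import math
from itertools import combinations


def euclidean(x, y):
    """Euclidean distance between two equal-length numeric vectors."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def average_linkage(vectors, cutoff):
    """Agglomerative clustering with average linkage (an assumed linkage rule).

    Repeatedly merges the two closest clusters and stops when the smallest
    average inter-cluster distance exceeds `cutoff`.
    """
    clusters = [[name] for name in vectors]

    def dist(c1, c2):
        total = sum(euclidean(vectors[a], vectors[b]) for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        (i, j), d = min(
            (((i, j), dist(clusters[i], clusters[j]))
             for i, j in combinations(range(len(clusters)), 2)),
            key=lambda item: item[1],
        )
        if d > cutoff:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i always holds, so index i stays valid
    return [sorted(c) for c in clusters]


# Hypothetical toy data: each herb is a 0/1 vector over six prescriptions
# (1 = herb used in that prescription, 0 = not used).
usage = {
    "Poria":              [1, 1, 1, 0, 1, 1],
    "Radix Glycyrrhizae": [1, 1, 1, 0, 1, 0],
    "Flos Carthami":      [0, 0, 1, 1, 0, 1],
    "Semen Persicae":     [0, 0, 1, 1, 0, 0],
}
clusters = average_linkage(usage, cutoff=1.5)
print(clusters)
```

With these toy vectors, herbs that are prescribed together in most cases end up close to each other and are merged first, which mirrors the idea that a small d(x, y) indicates high similarity.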

Association Rules
(1) Apriori Algorithm. The use of a Chinese herbal medicine or indicator was coded as 1, and no use was coded as 0. The interdependency of Chinese herbal medicines was identified with the Apriori module in SPSS Clementine v.11.1 (IBM Corp., Armonk, NY, USA). We set the minimum support to 60%, the minimum confidence to 80%, and the lift (improvement) to >1. At present, association rule mining is mainly based on the Apriori algorithm, whose key step is to find all frequent itemsets in the transaction database [8]. An association rule is a pattern of the form X ⟶ Y, meaning that the presence of itemset X in a transaction is associated with the presence of Y [9]. The support, confidence, and lift of an association rule between X and Y are, respectively, as follows [10]:
\[ \mathrm{Support}(X \rightarrow Y) = P(X \cup Y), \qquad \mathrm{Confidence}(X \rightarrow Y) = P(Y \mid X) = \frac{P(X \cup Y)}{P(X)}, \qquad \mathrm{Lift}(X \rightarrow Y) = \frac{P(Y \mid X)}{P(Y)} \]
Here $P(X \cup Y)$ denotes the proportion of transactions that contain every item in both X and Y.

(2) Graphical Visualization of Network Diagram. We used the "network" node of SPSS Clementine v.11.1 (IBM Corp., Armonk, NY, USA) to analyze the Chinese medicines. The threshold was set to absolute, with strong links drawn thicker, and the maximum number of displayed links was 80. The upper limit for weak links was 35, the lower limit for strong links was 100, and the link size varied continuously. Thick, thin, and dashed lines indicate the strength of the links between drugs in the resulting overall network diagram.
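To make the support, confidence, and lift of an association rule concrete, here is a minimal Python sketch that evaluates a single rule X ⟶ Y on a list of prescriptions. It only computes the three measures for a given rule and does not implement the Apriori frequent-itemset search of SPSS Clementine; the function names and the toy prescriptions are hypothetical.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Return (support, confidence, lift) of the rule antecedent -> consequent.

    transactions: list of sets of items (here: herbs in one prescription).
    """
    x, y = frozenset(antecedent), frozenset(consequent)
    n = len(transactions)
    n_x = sum(1 for t in transactions if x <= t)         # contain all of X
    n_y = sum(1 for t in transactions if y <= t)         # contain all of Y
    n_xy = sum(1 for t in transactions if (x | y) <= t)  # contain X and Y
    support = n_xy / n
    confidence = n_xy / n_x if n_x else 0.0
    lift = confidence / (n_y / n) if n_y else 0.0
    return support, confidence, lift


def passes(metrics, min_support=0.60, min_confidence=0.80, min_lift=1.0):
    """Apply thresholds like those stated in the text (support, confidence, lift > 1)."""
    support, confidence, lift = metrics
    return support >= min_support and confidence >= min_confidence and lift > min_lift


# Hypothetical toy prescriptions, not taken from the study database.
prescriptions = [
    {"Poria", "Radix Glycyrrhizae", "Flos Carthami"},
    {"Poria", "Radix Glycyrrhizae"},
    {"Poria", "Radix Glycyrrhizae", "Semen Persicae"},
    {"Flos Carthami", "Semen Persicae"},
    {"Poria", "Flos Carthami"},
]
support, confidence, lift = rule_metrics(
    prescriptions, {"Radix Glycyrrhizae"}, {"Poria"}
)
print(support, confidence, lift, passes((support, confidence, lift)))
```

In this toy example Radix Glycyrrhizae appears in three of five prescriptions and always together with Poria, so the rule has support 0.6 and confidence 1.0; because Poria itself appears in four of five prescriptions, the lift is 1.0 / 0.8 = 1.25.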

Evidence-Based Complementary and Alternative Medicine

Propensity Score Methods.
Propensity score approaches, widely used to control for confounding in observational studies with dichotomous treatments, mimic the intended role of randomization by balancing measured baseline covariates across treatment groups [11]. Let Z denote the treatment allocation, Y the continuous outcome, and $X = (X_1, \ldots, X_p)$ the baseline covariates; the propensity score is defined as [12]
\[ e(X) = P(Z = 1 \mid X). \]
As the treatment indicator Z is binary, the propensity score is modeled by a logistic regression parametrized by $\alpha = (\alpha_0, \alpha_1, \ldots, \alpha_p)$:
\[ e(X; \alpha) = \frac{\exp\left(\alpha_0 + \sum_{j=1}^{p} \alpha_j X_j\right)}{1 + \exp\left(\alpha_0 + \sum_{j=1}^{p} \alpha_j X_j\right)}. \]
For every participant indexed by subscript i, the probability of being in the treatment arm, given the baseline features, can be estimated from the fitted propensity score model as $\hat{e}_i = e(X_i; \hat{\alpha})$, and the probability of being in the control arm as $1 - \hat{e}_i$. The schematic diagram of propensity matching is shown in Figure 1.

Random Walking.
The random walk model for evaluating the laboratory indices was implemented with the ORACLE 10g tool.
Random walk was first put forward by Pearson in 1905 [13]. At each step i, the walker moves one unit length either upward (u(i) = +1) or downward (u(i) = −1). The random walk quantifies the correlation in the sequence by calculating the "net displacement" y(l) of the walker after l steps, which is the sum of the unit steps u(i) [14]:
\[ y(l) = \sum_{i=1}^{l} u(i). \]
The root mean square fluctuation F(l) about the average of the displacement is an important statistical quantity. Its square is defined as the difference between the average of the square and the square of the average of a quantity Δy(l):
\[ F^2(l) = \overline{[\Delta y(l)]^2} - \left[\overline{\Delta y(l)}\right]^2, \qquad \Delta y(l) = y(l_0 + l) - y(l_0), \]
where the bars denote averages over all starting positions $l_0$. The calculation can be pictured as moving a set of calipers with a fixed gauge l along the sequence: the starting point is moved sequentially from $l_0 = 1$ to $l_0 = 2$ and so on, the quantity Δy(l) and its square are calculated for each $l_0$, and all calculated quantities are averaged to obtain F(l).
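The net displacement and the root mean square fluctuation described above can be sketched in a few lines of Python. This is only an illustration of the definitions, not the ORACLE implementation used in the study; the step sequence is hypothetical, and how a change in a laboratory index is coded as +1 or −1 is left as an assumption.

```python
import math


def net_displacement(steps):
    """Cumulative sums y(l) = u(1) + ... + u(l) for l = 1..len(steps)."""
    y, total = [], 0
    for u in steps:
        total += u
        y.append(total)
    return y


def rms_fluctuation(steps, l):
    """Root mean square fluctuation F(l) of the walk defined by `steps`.

    Delta y(l) = y(l0 + l) - y(l0) is computed for every start l0 = 1..N-l,
    and F(l)^2 = mean(Delta y^2) - mean(Delta y)^2.
    """
    if not 1 <= l < len(steps):
        raise ValueError("l must satisfy 1 <= l < number of steps")
    y = net_displacement(steps)  # y[k] holds y(k + 1)
    deltas = [y[k + l] - y[k] for k in range(len(y) - l)]
    mean = sum(deltas) / len(deltas)
    mean_sq = sum(d * d for d in deltas) / len(deltas)
    return math.sqrt(max(mean_sq - mean ** 2, 0.0))  # guard against rounding


# Hypothetical step sequence: +1 = index moved upward, -1 = moved downward.
steps = [+1, +1, -1, +1, +1, -1, +1, +1]
print(net_displacement(steps))
print(rms_fluctuation(steps, 2))
```

Evaluating `rms_fluctuation` for several window lengths l shows how the fluctuation grows with l, which is the behaviour the model uses to judge whether the sequence carries long-term correlation.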

Statistical Processing.
All data were analyzed with SPSS v. 22.0 (IBM Corp., Armonk, NY, USA). A nonparametric test for two related samples was used to compare the control and experimental groups before and after treatment. Differences in characteristics between groups were analyzed using the Mann-Whitney rank-sum test or the χ² test. Differences were considered statistically significant at P < 0.05.

Similarity Measure of Chinese Herbal Medicine in the Treatment of Osteoarthritis.
The systematic clustering method was used for cluster analysis of 20 traditional Chinese medicines for treating osteoarthritis. At a Euclidean distance of 15, four groups of drug combination clusters were obtained (the other 2 drugs formed invalid clusters) (Figure 2).

Analysis of Association Rules of Traditional Chinese Medicine or Immune-Inflammatory Indices in the Treatment of Osteoarthritis.
The purpose of combining two TCM herbs is to increase efficacy and reduce toxicity. The compatibility mode of the prescriptions was therefore tested further. We analyzed the degree of correlation among the core Chinese medicines, with the minimum support set to 70%, the minimum confidence set to 80%, lift > 1, and P ≤ 0.001. In association rule analysis, the higher the lift (gain), the stronger the correlation. We defined each traditional Chinese medicine as an item. Finally, we obtained 11 pairs of highly connected drugs, which are the cardinal drug combinations in prescriptions for the treatment of OA. The lift of all drug combinations was greater than 1.0, indicating that these drug combinations are positively associated (Table 2). In addition, an association network diagram of high-frequency drugs clearly and intuitively reflects the degree of association between drugs. Herba Taraxaci Mongolici, Herba Hedyotis, Flos Carthami, Caulis Spatholobi, Semen Persicae, Semen Coicis, Rhizoma Dioscoreae Oppositae, Radix Salviae Miltiorrhizae, Radix Achyranthis Bidentatae, Radix Glycyrrhizae, and Poria are strongly connected in terms of node degree. They often appear in pairs and form the core drug combinations (Figure 3). In parallel, with the minimum support set to 60% and the minimum confidence set to 80%, the Apriori module was used to rank each item by its highest confidence level. We conducted correlation analysis on drugs and immune-inflammatory indices to examine the relationship between the optimization of immune-inflammatory indices and the compatibility of Chinese herbs. We found that Herba Taraxaci Mongolici, Flos Carthami, and Semen Coicis are related to inflammatory indices such as ESR and CRP. Semen Persicae, Poria, Radix Glycyrrhizae, and Radix Salviae Miltiorrhizae are related to immune indices, for instance, IgM, IgA, and IgG. Radix Glycyrrhizae and Pericarpium Citri Reticulatae are associated with C4 and C3. The lift of all of the above correlation results is > 1 (Table 3).

Improvement of Immune-Inflammatory Indices.
First, we performed one-to-one propensity score matching on the clinical background of the two groups to minimize the imbalance in measured baseline covariates. The groups were matched on age, sex, BMI, length of stay (LOS), and underlying diseases (coronary heart disease, hypertension, cerebral infarction, diabetes, chronic gastritis, anemia, osteoporosis, and fatty liver) (Table 4). One-to-one nearest neighbour caliper matching was used to match patients on the propensity score, using a caliper equal to 0.2 of the SD of the logit of the propensity score. Age, sex, BMI, LOS, and underlying diseases (coronary heart disease, hypertension, cerebral infarction, diabetes, and osteoporosis) differed significantly between the experimental group and the control group before matching but were comparable between the matched experimental group and the matched control group after propensity score matching. The immune-inflammatory indices of the matched experimental group and the matched control group were also compared (Table 5).
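The one-to-one nearest neighbour caliper matching described above can be illustrated with the following Python sketch. It assumes that propensity scores have already been estimated, matches greedily without replacement in the order the treated patients are listed, and takes the caliper as 0.2 of the sample SD of the logit over all patients pooled; these details, the function names, and the example scores are assumptions, since the text does not specify them.

```python
import math
import statistics


def logit(p):
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1 - p))


def caliper_match(treated, control, caliper_sd=0.2):
    """Greedy 1:1 nearest-neighbour matching on the logit of the propensity score.

    treated, control: dicts mapping patient id -> propensity score.
    A pair is kept only if the logit difference is within
    caliper_sd * SD(logit), with the SD pooled over all patients (assumed).
    """
    lt = {k: logit(p) for k, p in treated.items()}
    lc = {k: logit(p) for k, p in control.items()}
    caliper = caliper_sd * statistics.stdev(list(lt.values()) + list(lc.values()))
    available = dict(lc)  # controls not yet matched
    pairs = []
    for t_id, t_val in lt.items():
        if not available:
            break
        c_id = min(available, key=lambda c: abs(available[c] - t_val))
        if abs(available[c_id] - t_val) <= caliper:
            pairs.append((t_id, c_id))
            del available[c_id]  # matching without replacement
    return pairs


# Hypothetical propensity scores for three treated and three control patients.
treated = {"T1": 0.60, "T2": 0.30, "T3": 0.95}
control = {"C1": 0.58, "C2": 0.32, "C3": 0.10}
pairs = caliper_match(treated, control)
print(pairs)
```

In this example T3 has no control within the caliper and is left unmatched, which is how caliper matching trades sample size for better covariate balance.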

Evaluation of Immune-Inflammatory Indices by Random Walking Model.
The control group had a total of 6253 comprehensive evaluation records for ESR. The clinical significance is that, for every one-point increase in the integrated index of the patient, 9.26 steps need to be walked, or, for every step forward, the comprehensive improvement rate is 22.60%.
There were a total of 6883 comprehensive evaluation records for ESR in the experimental group. The clinical significance is that, for every one-point increase in the integrated index of the patient, 5.11 steps need to be walked, or, for every step forward, the integrated improvement rate is 36.20%. There were 6,933 and 7,603 integrated evaluation records for CRP in the control and experimental groups, respectively. The patient improvement indexes in the two groups were 0.276 and 0.426, respectively. With regard to clinical significance, for every one-point increase in the integrated index, the patients had to walk 7.28 and 4.18 steps, respectively. There were 4676 integrated evaluation records for IgA in the control group. The clinical significance is that every time the patient's comprehensive index improves by one point, they need to walk 17.33 steps, or, for every step forward, the [...]. There were 4670 integrated evaluation records for C3 in the control group. The clinical significance is that every time the comprehensive index of the patient improves by one point, they need to walk 15.17 steps, or, for every step forward, the comprehensive improvement rate is 16.30%. There were 4796 comprehensive evaluation records for C3 in the treatment group. The clinical significance is that every time the comprehensive index of the patient improves by one point, they need to walk 9.07 steps, or, for every step forward, the comprehensive improvement rate is 24.40%. There were 4,670 and 4,796 integrated evaluation records for C4 in the two groups, respectively. The patient improvement indexes in the two groups were 0.270 and 0.353, respectively. With regard to clinical significance, for every one-point increase in the integrated index, the patients had to walk 9.12 and 6.28 steps, respectively (Table 6 and Figure 4).

Discussion and Conclusion
Clustering is an unsupervised learning method that separates data into groups (called clusters) on the basis of their similar attributes. The Apriori algorithm is a typical frequent-itemset algorithm for Boolean association rules [15]. The long-term correlation and the mathematical probability theory underlying the random walk model resemble the development of human diseases. Each symptom of a disease is affected by many factors, such as the patient's self-perception, the body's ability to resist disease, and treatment measures. Whether the random walk model shows long-term correlation, which directly indicates whether the index system is effective, is of great significance for establishing a comprehensive index system that is widely recognized and effective in Chinese medicine clinics.
OA is regarded as a kind of "arthralgia syndrome" in traditional Chinese medicine theory [16]. Traditional Chinese medicine treatment can effectively improve pain, dysfunction, and other symptoms and reduce the recurrence rate [17]. Herbal therapy is based on syndrome differentiation in order to satisfy the personalized needs of different patients. Rather than using a single herb, physicians favor herbal formulas, that is, complex mixtures of numerous herbs with abundant therapeutic compounds, to maximize the curative effects and minimize the toxicity or adverse effects through the interplay of different herbs [18]. As a painful obstruction, arthralgia syndrome signifies that the limbs or the joints are subject to pain and malfunction.
Evil influences such as wind, cold, damp, and heat invade the body, blocking the meridians and consuming the Primordial Qi and blood, which makes the joints painful, swollen, stiff and, in severe cases, deformed. Optimal management of OA calls for individualized methods based on the various constitutions of the human body. The individually integrated TCM approach in this research was developed from clinical experience at the First Affiliated Hospital [19]. The approach comprises a fundamental treatment of orally administered Chinese herbal drugs and an herbal patch, chosen according to the severity of the patient's symptoms.
Here, cluster and association rule analyses, graphical visualization of network diagrams, propensity score methods, and random walk model data mining were applied to identify the Chinese herbal medicines used for OA treatment, their compatible combinations, and their therapeutic efficacy against OA.
In the 13,083 Chinese herbal prescriptions, 426 types of herbs were applied to OA treatment. In light of the nature, flavor, meridian tropism, and main efficacy of the medicines, it was sensible to divide them into four categories in line with the principles of TCM, as shown in Table 1.
Herbs that invigorate the spleen to eliminate dampness were used most often, with a frequency of 47,424. The most commonly used single-flavor medicines are Poria, Flos Carthami, Radix Glycyrrhizae, Radix Salviae Miltiorrhizae, Pericarpium Citri Reticulatae, and so forth. Classified by the five flavors, sweet and bitter are the two most frequently used. Within the four classes of Chinese herbal medicine, the spleen meridian was used 44,470 times, sweet taste was applied 59,458 times, and bitter taste was applied 69,650 times. According to traditional Chinese medicine theory, sweet taste replenishes energy and tonifies the spleen, while bitter taste dispels dampness. Both observations support the view that OA pathogenesis originates mainly from spleen deficiency and dampness. The development of muscle and its function are closely related to the transportation function of the spleen. According to modern pharmacological studies, spleen-invigorating Chinese medicine can increase immune capacity, improve learning and memory function, and regulate endocrine function in affected patients [20]. Through data mining technology, the high-frequency herbs and the compatibility of Chinese herbs for the treatment of osteoarthritis are summarized, and the drug characteristics and prescription rules for osteoarthritis are revealed scientifically and objectively. We conclude that the regularity of traditional Chinese medicine in the treatment of osteoarthritis consists of invigorating the spleen to remove dampness, promoting blood circulation and removing blood stasis, expelling wind and removing dampness, and clearing away heat and toxic materials, which can serve as a reference in the clinical treatment of osteoarthritis.
Through cluster analysis, we found that the high-frequency Chinese medicine combinations commonly used in this disease can be divided into four categories (Figure 2). The first group of Chinese medicines invigorates the spleen, dispels dampness, and promotes blood circulation. The second group expels wind and dredges collaterals. The third group clears away heat and detoxifies. The fourth group clears away heat, detoxifies, and relieves collaterals.
According to the principles set for drug association analysis, a total of 11 rules involving couplet medicines were obtained. Some of these association rules correspond to couplet medicines commonly combined by ancient and modern physicians. For instance, the spleen-invigorating couplet medicines are Pericarpium Citri Reticulatae-Poria and Radix Glycyrrhizae-Poria, while the blood-activating couplet medicines are Radix Salviae Miltiorrhizae-Flos Carthami and Flos Carthami-Semen Persicae. The association rules show that treatments for invigorating the spleen, dispelling dampness, promoting blood circulation, and clearing away heat and detoxifying are used concomitantly. Synergistic herbal groups can be derived from association rule mining; nevertheless, the associated mechanisms require further analysis (Table 2 and Figure 3). At the same time, there is a strong correlation between the compatibility of traditional herbal medicines and the optimization of immune-inflammatory indices (Table 3). Herba Taraxaci Mongolici, Flos Carthami, and Semen Coicis are related to inflammatory indices such as ESR and CRP. Recent research demonstrated that taraxasterol, one of the chief active components isolated from Herba Taraxaci Mongolici, has an anti-inflammatory effect in vitro [21]. The effects of Flos Carthami [22] and Semen Coicis [23] are perhaps connected to their antioxidant and anti-inflammatory properties. Semen Persicae, Poria, Radix Glycyrrhizae, and Radix Salviae Miltiorrhizae are related to immune indices, for instance, IgM, IgA, and IgG. Radix Glycyrrhizae and Pericarpium Citri Reticulatae are associated with C4 and C3. Poria [24] is commonly used as a tonic and antiaging traditional Chinese medicine and is traditionally combined with other TCM herbs to enhance immunity. Radix Glycyrrhizae polysaccharide, one of the chief bioactive components of Radix Glycyrrhizae, has been reported to participate in the regulation of immunity and phagocytosis and to have anticomplement activity [25]. Volatile oils and flavonoids are considered the major components of Pericarpium Citri Reticulatae; they act alone or jointly to combat inflammatory responses and lipid peroxidation, thereby attenuating the immunological reaction [26]. In short, the therapeutic effects of different combinations of couplet medicines differ considerably, and the composition of Chinese formulas from various herbs is not random.
In this research, data mining technology was used to determine the rules for applying Chinese herbal medicine in OA treatment in our hospital, in order to test the efficacy of Chinese herbal medicine in OA treatment. Huangqin Qingre Chubi capsule (Anhui medicine No. ZL201110095718.X), a convenient prescription hospital preparation of Anhui Traditional Chinese Medicine Hospital, consists of Radix Scutellariae Baicalensis, Radix et Rhizoma Clematidis Chinensis, Semen Coicis, and Semen Persicae. It invigorates the spleen, removes dampness, clears heat, and relieves arthralgia and has a long history in the clinical treatment of OA. Furong ointment, another prescription preparation of our hospital, clears heat, detoxifies and reduces swelling, cools the blood, and relieves pain. The changes in the values of immune-inflammatory indices were analyzed to evaluate efficacy. Before that, we used propensity score matching to match the baseline characteristics and immune-inflammatory indices of the two groups to reduce research bias. Propensity score matching resulted in 4438 pairs of patients from the experimental group and the control group.
Unlike before propensity score matching, there were no significant differences between the two groups in age, sex, BMI, LOS, and underlying diseases (coronary heart disease, hypertension, cerebral infarction, diabetes, and osteoporosis) (Table 4) (P > 0.05). However, an unavoidable imbalance persisted in chronic gastritis, anemia, and fatty liver. After propensity score matching, the immune-inflammatory indices were balanced between the two groups, with no significant difference (P > 0.05) (Table 4). According to the distribution rules, there were 7,119 patients in the control group and 5,964 patients in the experimental group. The statistical results show that, compared with a single drug, the drug combination improved ESR, C3, IgA, CRP, IgG, and C4 more effectively (Table 5). Therefore, Chinese herbal decoctions integrated with prescription preparations showed strong therapeutic efficacy. A random walk model was applied to assess the immune-inflammatory indices of both groups of OA patients. The evaluation concerns the long-term relationship between the changes in ESR, IgA, IgG, C3, CRP, and C4 and the intervention measures the patient receives; that is, the treatment a patient receives affects the changes in that patient's immune-inflammatory indices. The treatment group's ESR, IgA, CRP, C3, C4, and IgG are better than the control group's in terms of the maximum random fluctuation, the positive growth rate of walking, the rate of increase of the comprehensive evaluation index, the comprehensive improvement rate, the number of records of the comprehensive evaluation index, and the expected improvement value. The improvement in IgM, however, is not obvious.
There is a long-term correlation between the comprehensive evaluation indexes and the intervention measures in both groups of patients, and the improvement of the indexes in the experimental group is better than that in the control group (Table 6 and Figure 4). Chinese medicine treats OA by classifying its causes, prescribing according to symptoms, and applying appropriate methods of treatment [27]. Under a diagnosis and treatment model based on individualized and integrated diagnosis, TCM medical records provide good evidence for TCM evidence-based practice, reflecting the comprehensive application of TCM principles, methods, formulas, and drugs. They are not only a true record of medical activities but also a reflection of the physician's clinical experience and thought process. Data mining, also known as knowledge discovery, is the task of discovering laws or patterns hidden in a large amount of data, which can bring improvement and further development to TCM academic technology [28]. However, Chinese medicine studies the human body at the level of the whole and of function, whereas data mining based on data induction and machine learning is limited to exploring surface statistical laws and does not address the internal mechanism of the system. Even when the integrity of the data is guaranteed, therapeutic efficacy is difficult to verify fully because of the complexity of inpatient medical records, which contain incomplete and unquantifiable information.
In conclusion, we built a random walk model based on laboratory indicators and combined it with cluster analysis, association rules, graphical visualization of network diagrams, and propensity score methods. We found that Chinese medicine compatibility can improve the immune-inflammatory indices of patients suffering from OA. Different groups of these drugs produce different effects in OA treatment, which provides a reference for the clinical treatment of OA.

Data Availability
The datasets generated for this study are available upon request from the corresponding authors.

Conflicts of Interest
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as potential conflicts of interest.