The Identification of Immune-Related Biomarkers for Osteoarthritis Immunotherapy Based on Single-Cell RNA Sequencing Analysis

Osteoarthritis (OA) is a chronic musculoskeletal disease affecting approximately 500 million people worldwide. Globally, OA is one of the most common and leading causes of disability. Several genetic factors are involved in OA, including inherited genes, genetic susceptibility, and genetic predisposition. As the pathogenesis of OA is unknown, there are almost no effective treatments available to prevent the onset or progression of the disease. In recent years, many researchers focused on bioinformatics analysis to explore new biomarkers for the diagnosis, treatment, and prognosis of human diseases. In this work, we obtain the traditional RNA sequencing data of OA patients from the GEO database. By performing the differentially expressed analysis, we successfully obtain the genes that are closely associated with the OA. In addition, the Venn diagram was applied to evaluate the genes that are involved in OA and immune-related genes. The protein-protein interaction analysis was further conducted to explore the hub genes. The single-cell RNA sequencing analysis was used to evaluate the expression distribution of the MMP, VEGFA, SPI1, and IRF8 in synovial tissues of patients with osteoarthritis. Finally, the GSVA enrichment analysis discovered the potential pathways involved in OA patients. Our analysis provides a new direction for the exploration of the process of OA patients. In addition, VEGFA may be considered a promising biomarker in OA.


Introduction
Te cartilage in synovial joints is a transparent tissue that covers the bony surface, making it easier for the joint to slide and friction to be reduced [1]. Synovial fuid and subchondral bone nourish this tissue, and its physical properties include resistance to pressure and compressive forces [2]. Te only type of cell present in cartilage is the chondrocyte, which accounts for 1-5% of the total cartilage mass [3]. Te chondrocytes are grounded in collagen and proteoglycans, which are considered an amorphous extracellular matrix. During compression, cartilage is protected by proteoglycans, which provide tension to the tissue [4]. Among all joint diseases, osteoarthritis (OA) is one of the most common and leading causes of disability in the world. Approximately 18% of women and 9.6% of men over the age of 60 have symptoms of OA, and a quarter is unable to carry out daily activities because of it [5]. Te number of people sufering from OA is expected to rise to 130 million by 2050, representing a major social issue [6]. OA is primarily caused by genetic factors such as inherited genes, genetic susceptibility, and genetic predisposition [7]. Tere are almost no treatments available to prevent the onset or progression of OA because no clear pathogenesis has been identifed. As opposed to earlier paradigms, OA is now recognized as a lowgrade infammatory disease that afects the entire joint, such as progressive destruction of articular cartilage, synovitis, subchondral bone remodeling, osteophyte formation, and meniscal and ligament changes [8]. It is becoming increasingly evident that infammation (especially synovitis) plays an important role in OA as well as mechanical load.
Te search for biomarkers for OA prevention, diagnosis, and disease progression has been a major focus of many researchers in recent years. It has been shown that the combined detection of serum chondroitin sulfate epitope 846 and cartilage oligomeric matrix protein is an efective method for diagnosing and monitoring the progression of osteoarthritis [9]. Te collagen II C-terminal peptide concentration was higher in synovial fuid in patients with early OA compared to healthy controls [10]. In urine samples from patients with knee OA, metabolite levels may be capable of predicting the progression of the disease. In addition, C-reactive protein is associated with knee OA occurrence and progression [11]. A patient's susceptibility to OA progression can be distinguished by acids such as glycolic acid, hippuric acid, and fenugreek. Due to the small sample sizes of the above studies, these results are limited [12]. At the present time, there are very few biomarkers available for clinical use. Terefore, it is urgent to explore more efective biomarkers to better prevent, diagnose, and treat OA.
In recent years, with the quick development of bioinformatics analysis, in silico analysis has been widely taken into consideration for the analysis of the etiology, prevention, and prognostic prediction [13][14][15]. In this work, the aim is to explore the genes that play a key role in the occurrence of OA. In addition, the potential enrichment analysis was also used to explore the enriched pathways that are closely associated with OA. Further, the immune cell infltration analysis was performed to further explore the correlation between OA and immune-related cells.

Te Downloaded Dataset of OA Patients.
Based on the GEO database, we downloaded the gene expression profle of GSE98918. A total of 24 synovial tissue samples are included in GSE98918, including 12 samples from normal joints and 12 samples from joints with OA. Data analysis was performed using R and Bioconductor software packages. Te "sva" package is used to eliminate batch efects and normalize data.

Te Screening of the Diferentially Expressed Genes (DEGs)
between Normal Synovial Tissue and Synovial Tissue Samples with OA. Trough comparison of expression values in synovial tissues from normal joints and DEGs from OA joints, the LIMMA package in Bioconductor was used to identify DEGs and OA joints. P value <0.05 and |log 2 FC| > 1 were used as selection criteria. Te pheatmap package is used to draw heatmaps of DEGs in R software.

Te Exploration of the Potential Pathways Tat Are Closely
Associated with the Diferentially Expressed Genes. Functional enrichment was used to further confrm the potential functions of the potential targets. GO is widely used for annotating genes with functions, particularly molecular functions (MFs), biological pathways (BPs), and cellular components (CCs). In addition, analyzing gene function and related high-level genome function information using KEGG enrichment analysis is practical and useful. Analysis of the GO function of potential mRNAs and enrichment of KEGG pathways were performed using the ClusterProfler package in R to better understand the oncogenic functions of target genes.

Protein-Protein Interaction Network Based on the Key
Genes. In order to explore the potential correlation between the proteins encoded by key genes, we then constructed the PPI network. Using STRING, a gene PPI network was analyzed interactively. PPI networks were also analyzed and visualized using Cytoscape 3.8.2 when interactions with composite ratings exceeded 0.4.

Immune Cell
Infltration. CIBERSORT (https:// cibersort.stanford.edu/) is an online analysis tool for estimating the abundance of many immune cell subtypes in mixed cell populations using gene expression data.

Te Single-Cell RNA Sequencing Analysis in the Synovial
Tissues of Patients with Osteoarthritis. In the single-cell RNA sequencing analysis, all the data were obtained from the GEO database (https://www.ncbi.nlm.nih.gov/geo). Te Seurat package is used to generate objects and flter out cells with poor quality. Additionally, standard data preprocessing procedures are performed, resulting in the determination of gene number, cell number, and mitochondrial content. We flter the samples based on genes and cells containing fewer than 200 genes. Cells with at least three genes detected were retained, while cells with fewer than 500 or more than 7500 genes detected were discarded. Cells with high mitochondrial content (>20%) were also removed. Te UMI counts were scaled with scale factor � 10,000. After logtransforming the data, we used the ScaleData function in Seurat to normalize each cell. As a result, we frst identifed some hypervariable genes in all samples, also known as magnet genes (genes that exhibit obvious diferential expression between groups) which were used for grouping and cell identifcation. Te marker gene of each cluster identifes each cluster. Generally, the marker gene is expressed characteristically in this type of cell and requires at least two other characteristic genes to describe it as a whole. Renaming of cell populations is carried out after all populations have been identifed based on the annotation results.

Gene Set Variation Analysis (GSVA) of the Hub Genes.
In this study, gene set enrichment was evaluated using GSVA, an unsupervised, nonparametric method. As a result of scoring the genes of interest and determining the biological function of the sample, changes at the gene level were transformed into changes at the pathway level in this study. Based on the molecular signatures database (version 7.0), gene sets were retrieved. In order to evaluate potential biological function changes, various samples were evaluated using the GSVA algorithm [13].

A Total of 220 Genes Were Considered the DEGs and Show
Some Key Pathways. In the gene expression profle of GSE98918, a total of 220 genes were considered as the DEGs, which include 127 upregulated genes and 93 downregulated genes (Figures 1(a) and 1(b)). Subsequently, we also performed the GO enrichment analysis based on the DEGs. Te results revealed that some pathways are closely associated with the upregulated genes, including extracellular matrix organization, extracellular structure organization, collagen catabolic process, regulation of stem cell proliferation, and positive regulation of morphogenesis of an epithelium. However, the downregulated genes are highly correlated with the pathways, such as humoral immune response, complemental and coagulation cascades, regulation of infammatory response, and Staphylococcus aureus infection. In addition, the KEGG enrichment analysis demonstrated that the upregulated genes are closely associated with the synaptic vesicle cycle, relaxin signaling pathway, pyrimidine metabolism, protein digestion, and absorption. However, the downregulated genes involved in the DEGs are closely associated with the pathways, such as viral myocarditis, T1 and T2 cell diferentiation, transcription misregulation in cancer, and the IL-17 signaling pathway (Figures 1(c)-1(e)).

Exploration of the Diferent Immune-Related Cells Tat
Are Closely Associated with the DEGs. In order to explore the genes that are closely associated with the immune-related indexes, we downloaded the immune-related genes from the former studies. We fnally obtained a total of 3178 immunerelated genes. In addition, 50 of them are also involved in the diferentially expressed genes in the OA cohort (Figure 2(a)). Subsequently, in order to further explore the hub immunerelated genes that may play a key role in the OA cohort, we then constructed the PPI network. Te results demonstrated that many genes show more than 20 interactive counts with other genes, such as MMP9 (36 interactive counts), VEGFA (32 interactive counts), SPI1 (28 interactive counts), IRF8 (20 interactive counts), and CAMP (20 interactive counts). Terefore, these fve genes were considered as the hub immune-related genes involved in the OA cohort ( Figure 2(b)).

Te Single-Cell RNA Sequencing Analysis Revealed Different Cell Types Involved in the Synovial Tissues of Patients with Osteoarthritis.
Te single-cell RNA sequencing data of GSE176308 with the platform of GPL18573 were applied for further analysis, which consist of 3 samples of synovial tissues of the patients with osteoarthritis. Firstly, we performed the quality control of cells. Te standard of RNA counts is set between 500 and 7500. In addition, cells with high mitochondrial content (>20%) were also removed ( Figure 3(a)). Subsequently, a subset of features of hypervariable genes was used to perform batch removal analysis on the fltered data (Figures 3(b)-3(d)). Hypervariable genes are the genes that show the largest diference in expression from cell to cell (Figures 3(e) and 3(f )). We then transform highly variable genes into highly variable gene groups by using principal component analysis (Figures 4(a)-4(c)). After performing the cell annotation, we fnally obtained a total of 6 types of cells in the synovial tissues of patients with osteoarthritis, which include progenitor cell, stem cell, osteocyte, osteoblast, osteoclast, and mesenchymal stem cell (Figure 4(d)). In addition, the proportional histogram demonstrated that most enriched cell type is the stem cell. In addition, the second enriched cell type is the progenitor cell. Furthermore, osteocytes, osteoblasts, and osteoclasts also account for the majority of the cells in the synovial tissues of patients with osteoarthritis.

Te Expression Level of MMP9, VEGFA, SPI1, and IRF8 in
Single-Cell RNA Sequencing Dataset. Based on the previous study, we then evaluate the expression distribution of the MMP, VEGFA, SPI1, and IRF8 in synovial tissues of patients with osteoarthritis. Te results demonstrated that MMP9 is not especially expressed in the cells of synovial tissues ( Figure 5(a)). For IRF8, the results demonstrated that the osteoclast is the most enriched cell in the synovial tissues ( Figure 5(b)). In terms of SPI1, the most enriched cells in synovial tissues are also osteoclasts ( Figure 5(c)). Finally, we evaluate the expression of VEGFA in synovial tissues. Te single-cell RNA sequencing analysis demonstrated that progenitor cells, stem cells, osteocytes, osteoblasts, osteoclasts, and mesenchymal stem cells are closely associated with the VEGFA (Figure 5(d)).

Exploration of the Potential Pathways Tat Are Closely
Associated with MMP9, VEGFA, SPI1, and IRF8. Subsequently, in order to further explore the potential pathways that are closely associated with the four key immune-related genes that are also highly correlated with the OA, we then performed GSVA to explore the potential pathways. Te high expression level of IRF8 is associated with collagen trimer, cell cycle, smoothened signaling pathway, midbody, programmed cell death, and signaling receptor binding, while the downregulation of IRF8 is associated with the nuclear body, beta-catenin binding, neurogenesis, histone methyltransferase binding, central nervous system development, and protein dimerization activity (Figure 6(a)). In addition, the high expression level of MMP9 is associated with signaling receptor binding, lipid binding, collagen trimer, transition metal ion binding, calcium ion binding, and midbody. Te low expression level of MMP9 is associated with beta-catenin binding, structural molecule activity, nuclear body, protein dimerization activity, and cytoskeleton organization (Figure 6(b)). In terms of SPI1, the high expression is correlated with negative regulation of gene expression, cell cycle, BORC complex, and smoothened signaling pathway, while the low expression is correlated with the phosphatase binding, transcription regulator activity, chromatin binding, misfolded protein binding, synapse, and structural molecule activity (Figure 6(c)). Finally, the protein dimerization activity, transcription regulator activity, structural molecule activity, nuclear body, and toxic substance binding are associated LGMN ANTXR1  IGFBP7  TREM2   18000  20000  22000  24000  harmony_1  harmony_2   CCL3  ADIRF  MS4A7  HTRA3  CD9  SPI1  OTOA  AHNAK2  LST1  CLU  C1QC EMP3 HLA-DRA SERPINF1

Discussion
Over 27 million Americans are estimated to have osteoarthritis (OA), also known as degenerative joint disease. Degenerative diseases can afect any joint [16]. Articular cartilage and surrounding tissues are primarily afected by OA. OA can be classifed as either primary or secondary. In primary OA, the disease is idiopathic, usually afecting multiple joints at the same time and usually afecting those who are relatively elderly. In most cases, secondary osteoarthritis is a single-joint condition caused by an articular surface disorder, such as trauma [17]. In recent years, although many researchers have put a lot of efort to explore the potential mechanisms of OA, little progress has been achieved. Terefore, it is urgent to explore the potential biomarkers for better diagnosis, diagnosis, and prognosis of OA. In addition, with the development of bioinformatics analysis, many studies focused on the use of bioinformatics analysis in various diseases. In this work, by combining the traditional RNA sequencing analysis and the single-cell RNA sequencing analysis, we aim to explore the genes that play a key role in OA. Firstly, we performed the diferentially expressed analysis to explore the genes that play a key role in the OA patients. In addition, based on the DEGs, we also performed pathway enrichment analysis. Te results demonstrated that humoral immune response, complemental and coagulation cascades, and regulation of infammatory response are the most enriched pathways. Te presence of OA is associated with low-grade infammation. OA patients frequently experience synovial infammation due to macrophages, T cells, B cells, and other immune cell infltration, which plays an important role in its pathogenesis [18]. As a result, it is extremely important to coordinate the local infammation microenvironment with the regeneration microenvironment during OA treatment [19]. It has been shown that GRB10 and E2F3 can be used as diagnostic markers of osteoarthritis, and they are important in the occurrence and development of this condition [20].  Genetics Research Subsequently, to further explore the immune-related genes that play a key role in the OA, we performed the PPI network analysis. Te results revealed that some genes showed many interactive counts with other genes, which were regarded as the hub genes, such as MMP9, VEGFA, SPI1, IRF8, and CAMP. Te single-cell RNA sequencing analysis was also applied to evaluate the specifc expression of hub genes in specifc cells. Te results demonstrated that osteoclast is the most enriched cell in the synovial tissues of IRF8. It has been found that the degeneration of chondrocytes and loss of subchondral bone can be prevented and treated by inhibiting the degeneration of chondrocytes. Other studies demonstrated that osteoclasts play a protective role in the prevention of OA by attenuating the loss of preexisting cartilage through osteoclast-mediated bone loss. Terefore, the osteoclast plays a key role in the process of OA. In addition, many studies focused on the role of VEGFA in OA. Te process of angiogenesis, or the formation of new blood vessels, contributes signifcantly to the pathogenesis of a wide range of human diseases. As a heparin-binding growth factor, vascular endothelial growth factor A (VEGFA) is among the known angiogenic factors [21]. Angiogenesis, migration, proliferation, and the formation of oviducts are all mediated by VEGFA, a tyrosine kinase glycoprotein [22]. Furthermore, it is involved in skeletal development, osteoblasts, and osteoclasts, which are involved in endochondral bone formation by coupling angiogenesis with hypertrophic cartilage remodeling [23]. Infammation and angiogenesis are important pathological changes of KOA cartilage; degeneration of the cartilage is Genetics Research time-dependent. According to former study, the expression of the VEGFA/VEGFR2 pathway was mainly afected by KOA levels [24]. In this work, the single-cell RNA sequencing analysis demonstrated that progenitor cell, stem cell, osteocyte, osteoblast, osteoclast, and mesenchymal stem cell are closely associated with the VEGFA. Also, we also explore the potential pathways that are closely associated with VEGFA. Te protein dimerization activity, transcription regulator activity, structural molecule activity, nuclear body, and toxic substance binding are associated with the high expression level of VEGFA.
In spite of the fact that secondary analysis of online data can provide many exciting results [25,26], as far as our understanding of underlying disease mechanisms goes, this progress has been less dramatic than expected, especially when comparing genotypes and phenotypes [27]. In spite of the progressive increase in the number of key genes available through network-available online databases, signifcant limitations remain in the translation of these valuable data into disease-focused explanations [28]. It may be necessary to combine traditional analytical approaches with strategies aimed at interrogating complex biological systems in order to achieve this goal [29].
In conclusion, VEGFA is considered as the key immunerelated gene that may play a key role in patients with OA. In addition, the single-cell RNA sequencing analysis revealed that osteoclasts also play a key role in the process of OA.

Data Availability
Te data used to support the fndings of this study are included within the article.