The Underlying Molecular Basis and Mechanisms of Venous Thrombosis in Patients with Osteomyelitis: A Data-Driven Analysis

Objective Osteomyelitis (OM) is one of the most risky and challenging diseases. Emerging evidence indicates OM is a risk factor for increasing incidence of venous thromboembolism (VTE) development. However, the mechanisms have not been intensively investigated. Methods The OM-related dataset GSE30119 and VTE-related datasets GSE19151 and GSE48000 were downloaded from the Gene Expression Omnibus (GEO) database and analyzed to identify the differentially expressed genes (DEGs) (OMGs1 and VTEGs1, respectively). Functional enrichment analyses of Gene Ontology (GO) terms were performed. VTEGs2 and OMGs2 sharing the common GO biological process (GO-BP) ontology between OMGs1 and VTEGs1 were detected. The TRRUST database was used to identify the upstream transcription factors (TFs) that regulate VTEGs2 and OMGs2. The protein-protein interaction (PPI) network between VTEGs2 and OMGs2 was constructed using the Search Tool for the Retrieval of Interacting Genes (STRING) database and then visualized in Cytoscape. Topological properties of the PPI network were calculated by NetworkAnalyzer. The Molecular Complex Detection (MCODE) plugin was utilized to perform module analysis and choose the hub modules of the PPI network. Results A total of 587 OMGs1 and 382 VTEGs1 were identified from the related dataset, respectively. GO-BP terms of OMGs1 and shared DGEs1 were mainly enriched in the neutrophil-related immune response process, and the shared GO-BP terms of OMGs1 and VTEGs1 seemed to be focused on cell activation, immune, defense, and inflammatory response to stress or biotic stimulus. 230 VTEGs2, 333 OMGs2, and 13 shared DEGs2 were detected. 3 TF-target gene pairs (SP1-LSP1, SPI1-FCGR1A, and STAT1-FCGR1A) were identified. The PPI network contained 1611 interactions among 467 nodes. The top 10 hub proteins were TP53, IL4, MPO, ELANE, FOS, CD86, HP, SOCS3, ICAM1, and SNRPG. Several core nodes (such as MPO, ELANE, and CAMP) were essential components of the neutrophil extracellular traps (NETs) network. Conclusion This is the first data-mining study to explore shared signatures between OM and VTE by the integrated bioinformatic approach, which can help uncover potential biomarkers and therapeutic targets of OM-related VTE.


Introduction
Venous thromboembolism (VTE), including deep venous thrombosis (DVT) and pulmonary embolism (PE), is a common and severe complication of infectious disease.
is complication should always be considered in patients who present with a musculoskeletal infection, especially osteomyelitis (OM) in younger ones [1,2]. In recent years, the prevalence of DVT among acute haematogenous osteomyelitis (AHO) cases has been reported to be 6-10% [3,4]. In Taiwan, the risk of developing DVT was 2.49-fold in patients with chronic OM compared with the comparison cohort [5].
Previous studies have evaluated the clinical characteristics of OM among children with and without DVT [3,4]. Staphylococcus aureus (S. aureus) is the predominant causative agent for OM [6], and clinical outcomes were worse in patients caused by methicillin-resistant S. aureus (MRSA) than in those affected by methicillin-susceptible S. aureus (MSSA) [7]. In patients with AHO, Virchow's triad of hypercoagulability, venous stasis, and injury to the vessel wall also applies and is thought to trigger thrombosis. Animal studies have revealed the pathophysiological roles of various influencing factors, including leukocytes, platelets, and neutrophil extracellular traps (NETs), on thrombosis [8]. Besides, there is increasing evidence for an association of Panton-Valentine leukocidin (PVL)-expressing S. aureus strains with AHO severity [9]. e possible mechanistic links between DVT and OM may be platelets activated by PVLdamaged neutrophils via neutrophil secretion products [9]. However, the mechanisms by which the transcriptomic response of OM contributes to thrombosis development remain incompletely understood, although there is increasing evidence suggesting an extensive "cross-talk" between the inflammatory and coagulation cascades [10], and these signatures may be useful in the diagnosis of venous thrombosis.
Transcriptional profiles are used increasingly to investigate the severity, subtype, and pathogenesis of a disease, which have implications for diagnosis and therapeutic development [11][12][13]. erefore, blood transcriptional profiles and host response signatures could serve as biomarkers of clinical changes in subjects at risk for or diagnosed with venous thrombosis in OM. To further explore the underlying pathophysiology, we examined the gene expression alterations involved in OM and venous thrombosis. In this study, we downloaded the gene expression profiles for OM and VTE from the Gene Expression Omnibus (GEO) database. We performed a gene expression analysis using the GEO2R web tool to identify the differentially expressed genes (DEGs) of OM and VTE with their respective controls and subsequently developed Gene Ontology and pathway enrichment analysis for screening of DEGs with the g:Profiler toolset. Finally, an integration of the DEGs protein-protein interaction (PPI) network was constructed and the module analysis was performed. e identified hub genes might play important roles in the process of venous thrombosis which developed from OM. Overall, our findings will hopefully deepen the understanding of the relationships between OM and VTE. A flowchart summarizing this study is shown in Figure 1.

Data Acquisition and
Processing. Gene expression datasets related to OM and VTE were obtained from the GEO database of NCBI (https://www.ncbi.nlm.nih.gov/geo/) [14]. e keywords "osteomyelitis" and "venous thromboembolism" with "Homo sapiens" or "human" were employed to mine the dataset. Finally, 3 datasets were selected from published studies and downloaded from the GEO database, including GSE30119 for OM and GSE19151 and GSE48000 for VTE. e related information of these datasets is given in Table 1. ese blood samples were further divided into different groups according to the diseases source. DEGs were screened using transcription profile data of whole blood samples. All data were normalized and log (base 2) transformed.

Identification of DEGs.
According to the recurrence or degree of risk, VTE patients in the two datasets GSE19151 and GSE48000 were grouped and compared with the healthy controls, respectively. e patients with S. aureus OM in the dataset GSE30119 were compared with healthy controls. GEO2R web tool (https://www.ncbi.nlm.nih.gov/geo/geo2r/) [18] using the GEOquery and limma R packages from the Bioconductor project was utilized to compare the differences in gene expression between two or more groups of samples in a GEO dataset (Table 1) and identify DEGs associated with the diverse experimental conditions. e adjusted P values were used to decrease the false positive rate using the Benjamini and Hochberg false discovery rate method by default. e threshold value for identifying DEGs was set as adjusted P value < 0.05 and |log2(fold change)| ≥ 1.
e results of the pairwise comparisons were summarized for subsequent analysis.

Venn Diagram Analysis of DEGs.
We used an online integrative tool Venny (http://bioinfogp.cnb.csic.es/tools/ venny/index.html) to analyze the similarities and differences of DEGs in the three datasets, GSE30119, GSE19151, and GSE48000. e identified DEGs of the two datasets associated with VTE were merged, whereas DEGs which were observed in opposite expression directions were discarded. Afterward, two specific clusters of DEGs were defined as OM-related DEGs (OMGs1, obtained from GSE30119) and VTE-related DEGs (VTEGs1, obtained from GSE19151 and GSE48000), respectively. Additionally, the intersections of OMGs1 and VTEGs1 were defined as shared DGEs1.

Functional and Pathway Enrichment Analyses of DEGs.
To discern the implication of specific clusters and shared DEGs on OM and VTE, we used the g:Profiler (https://biit. cs.ut.ee/gprofiler/) [19] toolset to perform Gene Oncology (GO) terms and pathway enrichment analysis. e shared biological process ontology (GO-BP) was identified based on overlapping GO term IDs between functional enrichment analysis results of OMGs1 and VTEGs1. e DEGs enclosed by all GO-BP terms shared between OMGs1 and VTEGs1 were defined as DEGs2 (VTEGs2 and OMGs2, respectively) and used to predict the significant pathways and ensuing crosstalk between these pathways. Additionally, the intersections of OMGs2 and VTEGs2, suggested the potential mechanism of patients with OM in whom VTE developed, were defined as shared DGEs2. e Benjamini and Hochberg false discovery rate method was used to correct the P value. An adjusted P value < 0.05 was considered to have statistical significance and to achieve significant enrichment.

Identification of Transcription Factors (TFs) at Are
Significantly Associated with the DEGs. To link gene expression signatures to upstream cell signaling networks, we used Transcriptional Regulatory Relationships Unraveled by Sentence-based Text mining (TRRUST) (https://www. grnpedia.org/trrust/) [20] to identify the upstream TFs that regulate VTEGs2 and OMGs2. e shared TF-target gene pairs of VTEGs2 and OMGs2 may be involved in a common pathway.

PPI Network and Module Analysis.
e Search Tool for the Retrieval of Interacting Genes Database (STRING) (https://www.string-db.org/) [21] was used to expose the protein-protein interaction (PPI) information among VTEGs2 and OMGs2 at the protein level. Afterward, the network was recreated and visualized using Cytoscape (http://www.cytoscape.org/) [22], based on the coexpression graph of the PPI network. Topological properties (such as degree of distribution, betweenness centrality, and closeness centrality) of the constructed PPI network were calculated

Identification of DEGs.
Comparisons between differentially gene expression between two or more groups of samples from 1 OM dataset (GSE30119) and 2 VTE datasets (GSE19151 and GSE48000) were performed using the GEO2R web tool, respectively, and mRNAs with an adjusted P value < 0.05 and |log2(fold change)| ≥ 1 were identified as DEGs. Two specific clusters of DEGs are shown in Figure 2. A total of 587 OMGs1 were identified in GSE30119. A total of 382 VTEGs1 were identified in VTErelated datasets, including 186 in GSE19151 and 206 in GSE48000. Furthermore, 27 shared DGEs1 that overlap OMGs1 and VTEGs1 were screened out, with 2 genes (LSP1 and PADI4) found to be observed in opposite expression directions. After applying functional annotation with the Gene Ontology database to find terms associated with the development of VTE in OM patients, 230 VTEGs2 and 333 OMGs2 sharing the common biological process ontology were detected and used for further analysis (

Functional Annotation of DEGs.
To gain more biological insight, we performed GO enrichment analysis using the g:Profiler toolset. As shown in Figure 3, the activity of neutrophils has attracted widespread attention among biological processes. With respect to OMGs1, the following GO-BP terms were significantly enriched: neutrophil activation, neutrophil degranulation, neutrophil-mediated immune response, and defense response to the bacterium (Figure 3(a)). Highly similar results of enrichment analysis were also presented in shared DGEs1 (Figure 3(c)).
ere suggested a significant correlation with the biological process of neutrophils and the process of VTE which developed in OM patients. In addition, VTEGs1 were mainly enriched in the viral gene expression, nuclear-transcribed mRNA catabolic process, protein targeting to membrane, and protein localization to the endoplasmic reticulum ( Figure 3(b)). e shared GO-BP terms of OMGs1 and VTEGs1 seemed to be focused on cell activation, immune defense, and inflammatory response to stress or biotic stimuli (Figure 3(d)). ese results showed that DEGs were mainly enriched in host immune defense processes such as neutrophil activation and degranulation to resist bacterial invasion or other stress.

Discussion
As a fatal disease caused by serious musculoskeletal infection, VTE needs to arouse the attention of clinical staff. It has been reported that children with OM may have increased susceptibility to VTE in previous epidemiologic studies [3][4][5]. Unfortunately, the transcriptional signatures between VTE and OM have not been intensively investigated. e main objective of this study was to focus on revealing the underlying pathophysiology association between VTE and OM.
In the present study, we integrated 2 expression profiles from VTE patients and 1 expression profile from OM patients to identify genes that may play a crucial role in the onset and development of VTE in OM patients. First, 382 VTEGs1 and 587 OMGs1 were identified in VTE and OM patients, respectively. Following these, DEGs1 were evaluated by functional enrichment analysis to get insight into the biological significance in the pathogenesis. It is reported that blood coagulation was overexpressed in patients with osteoarticular infections [15,24]. Here, our research showed    showing an important part in the multitude of biological processes involved in diseases [20]. In this study, based on the Venn diagram analysis method, 13 shared upstream TFs of VTEGs2 and OMGs2 were screened out. It is suggested that 3 TF-target gene pairs, consisting of SP1-LSP1, SPI1-FCGR1A, and STAT1-FCGR1A, may reveal the potential pathogenesis link between VTE and OM. Finally, in the PPI network analysis, TP53, IL4, MPO, ELANE, FOS, CD86, HP, SOCS3, ICAM1, and SNRPG were the top 10 hub proteins with the highest connectivity within the network. Significant modules were further detected in the PPI network, and some specific nodes with the highest connectivity (i.e., ELANE, HP, SOCS3, and SNRPG) were also identified in the top 3 modules. Particularly, 4 shared DGEs2 (SLPI, TCN1, CREG1, and FCGR1A) were found in the 16 modules.
Additionally, other hub proteins may also play a crucial role in the pathogenesis of VTE in patients with OM. TP53, the top hub gene of our study, is a critical tumor suppressor and a key regulator in numerous cellular functions, which maintains critical functions in immunity, inflammation, and tissue repair [42]. Intriguingly, TP53 may participate in the host-virus interactions that could characterize shared biological mechanisms between acute respiratory distress syndrome (ARDS) and VTE in severe COVID-19 patients [43]. A similar host-bacterial interaction may also be reflected between OM and VTE. IL4 is directly involved in the bone desorption and osteoclast activity regulation that occur in OM.
e association between IL4 gene polymorphisms (-1098-G/T and -590-C/T) and chronic OM was previously observed [44]. However, IL4 was not significantly elevated in a mouse model of fracture fixation with S. aureus OM [45]. e possible links between inflammation-related genetic variants (including IL4) and VTE established the fundamental role of genetic background in predisposition to VTE and several inflammation-related conditions [46]. HP, one of the acutephase reactants, can stimulate inflammatory bone loss and its phenotype 2-2 has been reported to be a risk factor for spontaneous VTE [47]. SOCS3 was reported to be a key player in bone-associated inflammatory responses, which acted as the cytokine-inducible negative regulator of cytokine signaling [48]. e FOS proteins have been implicated as regulators of cell proliferation, differentiation, and transformation. e c-Fos gene has been put forward as a new factor in the progression and severity of atherosclerosis [49]. CD86 plays an important role in immune responses as a costimulatory molecule on antigen presenting cells. e gene polymorphisms of CD86 may be related to the risk of sepsis with contradictory results in different studies [50,51]. Unfortunately, there is no relevant research report on FOS or CD86 in the pathogenesis of OM or VTE.
is study has several limitations. Although followed the same analysis strategy and threshold for DEGs screening, the number of differential genes shared by the GSE19151 and GSE48000 datasets obtained in the end was small due to the inconsistency of the inclusion criteria and sequencing platform. e DEGs from these two datasets were included in the subsequent analysis, which can provide us with more possibilities, but they were not differentially expressed at the same time. More external datasets were needed to verify the results of the study. In addition, further experimental investigations are warranted to decipher the roles of these hub genes in the development of VTE in OM patients.

Conclusions
In this study, we reported for the first time the identification of the DEGs and activated signaling pathways between OM Genetics Research 9 and VTE by the integrated bioinformatic approach. Our results finally identified a new set of novel biomarkers and important molecular targets, including 3 TF-target gene pairs (SP1-LSP1, SPI1-FCGR1A, and STAT1-FCGR1A), 3 structural proteins of NETs (MPO, ELANE, and CAMP), and 6 other hub proteins (TP53, IL4, HP, SOCS3, FOS, and CD86), which might play essential biological roles during the progression of VTE in OM patients. In addition, the enriched neutrophils-related pathways (neutrophil activation, degranulation, and immune response to bacteria) could advance our understanding of the development of OM towards VTE and indicate new avenues to develop therapies for VTE. However, more research, especially experimental research, will be indispensable for the further clinical application of these biomarkers.

Data Availability
e data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest
e authors declare that they have no conflicts of interest.