Selecting Hub Genes and Predicting Target Genes of microRNAs in Tuberculosis via the Bioinformatics Analysis

Tuberculosis (TB) is the world's most prevalently infectious disease. Molecular mechanisms behind tuberculosis remain unknown. microRNA (miRNA) is involved in a wide variety of diseases. To validate the significant genes and miRNAs in the current sample, two messenger RNA (mRNA) expression profile datasets and three miRNA expression profile datasets were downloaded from the Gene Expression Omnibus (GEO) database. The differentially expressed (DE) genes (DEGs) and miRNAs (DE miRNAs) between healthy and TB patients were filtered out. Enrichment analysis was executed, and a protein-protein interaction (PPI) network was developed to understand the enrich pathways and hub genes of TB. Additionally, the target genes of miRNA were predicted and overlapping target genes were identified. We studied a total of 181 DEGs (135 downregulated and 46 upregulated genes) and two DE miRNAs (2 downregulated miRNAs) from two gene profile datasets and three miRNA profile datasets, respectively. 10 hub genes were defined based on high degree of connectivity. A PPI network's top module was constructed. The 23 DEGs identified have a significant relationship with miRNAs. 25 critically significant Gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways were discovered. The detailed study revealed that, in tuberculosis, the DE miRNA and DEGs form an interaction network. The identification of novel target genes and main pathways would aid with our understanding of miRNA's function in tuberculosis progression.


Introduction
Tuberculosis (TB) is one of the most common infectious diseases, caused by the pathogen Mycobacterium tuberculosis (MTB). It was linked to a high rate of infection and a longterm disease course [1]. TB is one of the top 10 causes of death according to the World Health Organization (WHO)'s global TB report for 2020. Each year, about ten million people were infected with tuberculosis. TB was identified as a repetitive immune reaction, and a subset of patients with tuberculosis will grow into active tuberculosis. It has a close connection with poverty. Around 90% of tuberculosis patients stayed in underdeveloped countries [2].
ere are many diagnostic tools available today, including the tuberculin skin examination (TST), the interferon gamma release assay (IGRA), and imaging procedures. ey all lack precision and are more expensive and highly technical [3,4]. microRNAs (miRNAs) are noncoding RNA molecules that have 22 to 23 nucleotides [5]. ey regulate gene expression at the posttranscriptional stage by facilitating mRNA degradation and preventing mRNA translation [6]. When a patient is infected with tuberculosis, major miRNAs are released into the bloodstream. MiRNAs have shown to play an important role in a variety of pathological and physiological mechanisms in tuberculosis [7]. Microarray analysis, miRNA, and gene have been extensively utilized in the quest for biomarkers and therapeutic targets in recent years. e discovery of novel DE miRNAs and DEGs provides useful and accurate perspectives for future study [8].

Microarray Data Screening.
e bioinformatics analysis was carried out in accordance with the protocol depicted in Figure 1. e term 'Tuberculosis" or "microRNA and Tuberculosis" was searched in the Gene Expression Omnibus database (GEO). ere are some database criteria that are helpful with more analysis: I) clinical study pairs must have both healthy and tuberculosis patients, and II) mRNA and miRNA data from plasma (Affymetrix miRNA 4.0) were retrieved. As a result, two gene expression profile datasets, GSE62147 and GSE34608-GPL6480, were used in the primal mRNA datasets. GSE62147 had 14 TB-positive donors and 14 healthy donors. GSE34608-GPL6480 had eight TB donors and eighteen healthy donors. Six donors with tuberculosis and three healthy donors contained GSE29190. GSE49951 had 71 TB-positive donors and 71 healthy donors.

Selecting DEGs and DE miRNAs and Constructing a Volcanic Map.
To diagnose DEGs and DE miRNAs in TB patients, the GEO2R online research method (https://www. ncbi.nlm.nih.gov/geo/geo2r/) was used. Exact cutoff values (P < 0.05 and |log fold change (FC)|≥1) were established. e two gene and three miRNA datasets were submitted to VENN's online tool (http://bioinformatics.psb.ugent.be/ webtools/Venn/). e DEGs and DE miRNAs that overlap were filtered out. e volcanic chart of overlapping DEGs was constructed using the online method of bioinformatics (http://www.bioinformatics.com.cn/).

GO and KEGG Pathway analysis.
Gene Ontology (Go) was applied to perform the category biological process (BP), molecular function (MF), and cellular component (CC) enrichment analysis [15]. e KEGG was a set of databases that provides information regarding biological mechanisms, cellular processes, chemical substances, and diseases [16]. e R software package clusterProfiler was used in functional gene annotations and KEGG enrichment analysis [17]. A cutoff criterion (P < 0.05 and FDR<0.05) was set. Also, the top 10 pathways with the maximum number of genes for the corresponding term were chosen. Significant items of GO and KEGG were submitted to R software package ggplot2 to visualize and merge enriching analysis. e enrichment dot bubble method was used to build the bubble plot.

Construction of Network and Identification of Top
Modules and Hub Genes. Permanent DEGs were uploaded to the STRING online tool (https://string-db.org/) [18]. A cutoff point (interaction score >0.4) was established. e PPI network was constructed using the CytoScape program [19].
e MCODE plug-in was used to determine the PPI network's top module. Strict cutoff conditions (degree cutoff � 2, node score cutoff � 0.2, k-core � 2, and maxdepth � 100) were established. Meanwhile, the top 10 hub genes were identified using the CytoHubba plug-in.

Screening of DEGs and DE miRNAs.
Under the precise cutoff criteria employed (P < 0.05 and |log fold change (FC)|≥1), there were 614 and 4211 DEGs extracted from GSE6214 and GSE34608-GPL6480, respectively. Following that, DE miRNAs were extracted from GSE34608-GPL7731, GSE29190, and GSE49951, respectively, yielding 174, 36, and 23 DE miRNAs. Two gene expression profile datasets yielded a total of 181 overlapping DGEs (135 downregulated genes and 46 upregulated genes) (Figure 2(b) and Table 1). Two overlapping DE miRNAs (two downregulated and zero upregulated) were derived from three miRNA expression profile datasets (Figure 2(c) and Table 2). e volcano plot of three gene expression profile datasets revealed a substantial difference between regular and tuberculosis patients (Figure 2(a)).

Hub Gene Identification Based on the DEG PPI Network and Module Analysis.
e STRING online tool was updated with 181 common DEGs (135 downregulated and 46 upregulated). Significant DEGs were identified in 135 of 181 DEGs. Significant DEGs were visualized in detail using the      (Figure 4(b)). e top significant module was defined from the PPI network using the MCODE plug-in based on its degree value. e top module featured fifteen nodes and forty-five corners (Figure 4(c)).

Construction of the DEGs-DE miRNA Network.
Because of the shortcomings of each dataset, hsa-mir-7 and hsa-mir-451 were each submitted to four accurate datasets for target gene prediction. Using the VENN online research website, miRNA target genes were integrated with DEGs.

Discussion
In recent years, a number of studies have been carried out to reveal the potential mechanisms of tuberculosis. e prevalence of tuberculosis has been steadily rising. Traditional studies have two flaws: a single genetic case and a limited cohort [23]. Bioinformatics analysis of gene expression profile datasets of TB patients is now being used to screen more reliable data. A total of 181 common DEGs (135 downregulated genes and 46 upregulated genes) were extracted from two gene expression profile datasets in this study.
e hsa-mir-7 and hsa-mir-451, two DE miRNAs, were extracted from four gene expression profile datasets. Both hsa-mir-7 and hsa-mir-451 were reduced in expression. e 181 DEGs underwent GO and KEGG enrichment analysis, allowing them to be classified into BP, CC, MF, and KEGG groups. Finally, a DEGs-DE miRNA PPI network was built. e PPI network was used to screen the top module and ten hub genes.
To determinate the underlying molecular mechanisms in the TB process, the most enriched BP, CC, and MF pathways were combined with downregulated and upregulated DEGs for comprehensive analysis, separately. GO analysis presented that upregulated DEGs were mainly related with sarcoplasm and downregulated DEGs were mainly related with neutrophil, antibacterial activities, primary lysosome, and exogenous protein binding. e comprehensive analysis   of GO and hub genes demonstrated that hub genes mainly associated with neutrophil and primary lysosome signal pathways. KEGG analysis presented that upregulated DEGs were mostly involved in hepatocellular carcinoma, Kaposi sarcoma-associated herpesvirus infection, the phosphatidylinositol signaling system, circadian entrainment, and the apelin signaling pathway. Most of the abovementioned KEGG pathways play an important role in the immune response and apoptosis of Mycobacterium tuberculosis.
Kaposi sarcoma-associated herpesvirus is well known to be involved in antiapoptosis, enhancement of cytokine production, and cell proliferation [24]. e previous study reported that MTB enhances bacterial virulence by avoiding host cell death [25]. Phosphatidylinositol is a lipid anchor, which as virulence factor and modulate host immune response in MTB [26]. Also, the apelin decreases mitochondrial apoptosis, mitochondrial ROS-triggered oxidative damage, and NF-κB activation to inhibit acute lung injury (ALI) and acute respiratory distress syndrome (ARDS) [27]. Circadian entrainment was also involved in innate immune response and photoperiod significant impairment and enhancing immune function [28].
According to a recent study, hsa-mir-7 impaired NF-κB and AKT transcriptional activity. e NF-κB and AKT pathways are important inflammatory-associated pathways and inhibited host cell autophagy in tuberculosis [4,29,30]. e hsa-mir-7 was thought to be involved in innate immune responses in a previous study [31]. Meanwhile, hsa-mir-451 protects against cell death caused by ischemia/reperfusion injury, cancer, and myocardial I/R injury [32][33][34]. CAMP, CEACAM8, CD59, CLEC5A, STOM, MS4A3, GPR84, LAMTOR3, CEACAM1, and DEFA4 are among the ten hub   Genetics Research genes discovered. ey act as TB immune mediators and, additionally, as an active participant in the control of the inflammatory response [35][36][37][38][39][40]. Meanwhile, macrophages are innate immune cells that serve as a guardian in the immune system's protection and response to microbial infection [41]. GPR84 is a member of the family of metabolic G protein-coupled receptors [42]. GPR84 activation enhances bacterial adhesion and phagocytosis in macrophages. According to a recent study, elevated glucose levels promote GPR84 expression [43]. STOM is a member of the family of integral membrane proteins [44]. As previously mentioned, STOM's primary role is to regulate glucose transporter type 1 operation. Furthermore, elevated glucose levels augment macrophage anti-inflammatory function [45]. GPR84 and STOM are closely associated with the adaptive immune response, as determined by experimentation. CAMP is an antimicrobial peptide that is synthesized at the C-terminus of proteins [46]. It eradicated MTB and slowed the progression of tuberculosis. CAMP is a vitamin D-related gene that would be triggered in response to vitamin D. It is essential for macrophages infected with MTB to benefit from the antimicrobial reaction induced by vitamin D. CAMP activation enhances the effectiveness of immune responses in tuberculosis [47]. CEACAM1 and CEACAM8 are members of the class of heavily glycosylated carcinoembryonic antigens [48,49]. e primary role of CEACM1 is antiapoptotic. It is involved in granulocyte survival [50]. It is classified as an immunoregulatory checkpoint regulator. Previous research demonstrated that inadequate CEACAM1 resulted in inflammatory exacerbation [51]. CEACM8 is a responsive granulocyte biomarker. It performs two functions: it recognizes neutrophils and degrades extracellular matrix, thus stimulating the immune response [52]. MTB has been shown to control the immune system by controlling the macrophage cell cycle. It prevents macrophages from entering the interphase and gap phase 1 phases [53]. MS4A3 is a member of the family of membrane-spanning 4A genes [54]. MS4A3 is intimately linked to the cell cycle. As a consequence, the activation of the kinase-associated phosphatase (KAP) leads to cell cycle arrest. CLEC5A is a member of the superfamily C-type lectin/C-type lectin-like domain (CTL/CTLD) [54]. CLEC5A controls cell development by inducing apoptosis and arresting the cell cycle. Additionally, it plays a critical function in the production of inflammation and serves as a critical gene for the clinical management of pulmonary inflammation [55]. DEFA4 is a part of the lipocalin family, which is involved in the transportation of vitamins, lipids, and steroid hormones   Genetics Research [56]. It inhibits bacterial growth by attaching to pathogenic bacterial siderophores. It plays a significant role in immune response and protection. In recent years, LAMTOR3 and CD59 have been extensively used to inhibit the development of a variety of cancers [57,58]. CD59's primary role is to regulate immune cell activation throughout the tumor microenvironment [59]. Additionally, the top module was defined through the PPI network of DEGs. e top module included a large number of genes associated with macrophages and the innate immune response. Previously published studies established that macrophages act as a host cell for MTB and are active in the innate immune response. MTB destroyed the macrophage's main immune systems, antigen introduction and intracellular killing [41]. e important DEGs were established using integrated bioinformation analysis. GPR84, STOM, CAMP, CEACM8A, MS4A3, LAMTOR3, DEFA4, CLEC5A, and CD59 were the ten hub genes that were filtered out. Two DE miRNAs, hsa-mir-7 and hsa-mir-451, were discovered. Some important pathways, including neutrophil, antibacterial activities, primary lysosome, exogenous protein binding, the apelin signaling pathway, Kaposi sarcoma-associated herpesvirus infection, and the phosphatidylinositol signaling system, are involved in TB. DEGs and DE miRNAs may be used as sensitive biomarkers in the clinical diagnosis of tuberculosis. e network of DE miRNA-DEGs revealed the molecular mechanism causing tuberculosis. More accurate clinical samples and experiments are needed to validate the cause of tuberculosis. In the future, bioinformatics research could be able to identify novel genes and pathways for use in genomic therapy for tuberculosis.

Data Availability
ere are no files that need to be deposited in public repositories among the data in the manuscript. e manuscript contains all necessary information about the data collected.

Conflicts of Interest
e authors declare no conflicts of interest.