Integrated Bioinformatics Analysis to Screen Hub Gene Signatures for Fetal Growth Restriction

Background. Fetal growth restriction (FGR) is the impairment of the biological growth potential of the fetus and often leads to adverse pregnancy outcomes. The molecular mechanisms for the development of FGR, however, are still unclear. The purpose of this study is to identify critical genes associated with FGR through an integrated bioinformatics approach and explore the potential pathogenesis of FGR. Methods. We downloaded FGR-related gene microarray data and used weighted gene co-expression network analysis (WGCNA), differentially expressed gene (DEG) analysis, and protein-protein interaction (PPI) networks to screen hub genes. The GSE24129 gene set was used for validation of critical gene expression levels and diagnostic capabilities. Results. A weighted gene co-expression network was constructed, and 5000 genes were divided into 12 modules. Of these modules, the blue module showed the closest relationship with FGR. Taking the intersection of the DEGs and genes in the blue module as pivotal genes, 277 genes were identified, and 20 crucial genes were screened from the PPI network. The GSE24129 gene set verified the expression of 20 genes, and CXCL9, CXCR3, and ITGAX genes were identified as actual pivotal genes. The expression levels of CXCL9, CXCR3, and ITGAX were increased in both the training and validation sets, and ROC curve validation revealed that these three pivotal genes had a significant diagnostic ability for FGR. Single-gene GSEA results showed that all three core genes activated “hematopoietic cell lineage” and “cell adhesion molecules” and inhibited the “cGMP-PKG signaling pathway” in the development of FGR. CXCL9, CXCR3, and ITGAX may therefore be closely associated with the development of FGR and may serve as potential biomarkers for the diagnosis and treatment of FGR.


Introduction
Fetal growth restriction (FGR), also known as intrauterine growth restriction (IUGR), means that the fetus cannot reach its biological growth potential and is a common complication of pregnancy [1]. The term is usually used to describe fetuses whose estimated fetal weight or abdominal circumference is less than the 10th percentile for gestational age [2]. FGR is a major cause of fetal, perinatal, and neonatal morbidity and mortality. Infants with FGR are prone to long-term health problems such as poor physical growth, metabolic syndrome, cardiovascular disease, neurodevelopmental disorders, and endocrine abnormalities [3].
The pathogenesis of FGR is related to maternal, fetal, placental, and genetic factors, among which placental insufficiency is the leading cause [4]. The placenta is a vital tissue that connects the mother to the fetus. If placental blood perfusion is insufficient, the fetus suffers from chronic hypoxia and a decreased growth rate [5]. Compared to normal controls, pregnancies with FGR (with or without preeclampsia) had smaller placental volumes and greater resistance to uterine blood flow [6]. Many studies have shown that insufficient chorionic trophoblast infiltration, defective maternal uterine artery remodeling, and placental inflammation are associated with inadequate placental perfusion [7][8][9][10].
Although there are many studies on the pathogenesis of FGR, its specific pathological mechanisms are still not fully elucidated. With the rapid development of microarray and high-throughput sequencing technologies, bioinformatics is now used to study the pathogenesis of FGR. In this research, we used WGCNA to explore the characteristics of the placental gene network associated with FGR and to identify novel biomarkers of FGR pathogenesis.

GEO Dataset Download and Processing.
The analysis workflow is shown in Figure 1. Data were collected from the Gene Expression Omnibus (GEO) database (https://www.ncbi.nlm.nih.gov/geo/). We used the keywords "fetal growth restriction" or "intrauterine growth restriction" to search GEO for FGR or IUGR gene expression profiles. The screening criteria for this study were as follows. (1) The gene expression profiles must include a case group of patients with FGR or IUGR and a control group of normal pregnant women. (2) The tissue used for profiling should be placenta. (3) For the WGCNA to be accurate, there should be at least 15 samples. (4) Datasets should contain either raw or processed data, and these data should be microarray data. Finally, we selected GSE147776 and GSE24129 for further analysis, with GSE147776 as the discovery cohort and GSE24129 as the validation cohort. After downloading the normalized data, we filtered the data to remove probes without corresponding annotations and took the maximum value for duplicate probes.
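The probe-filtering step can be illustrated with a minimal Python sketch. It assumes that "maximum value for duplicate probes" means the per-sample maximum across probes mapping to the same gene; the text does not specify this, and another common reading is to keep the probe with the highest mean. The function name `collapse_probes` and the toy probe identifiers are hypothetical.

```python
def collapse_probes(expr, probe_to_gene):
    """Collapse probe-level values to gene-level values.

    expr: dict mapping probe ID -> list of expression values (one per sample).
    probe_to_gene: dict mapping probe ID -> gene symbol ('' or missing if unannotated).
    Assumption: duplicates are merged by taking the per-sample maximum.
    """
    by_gene = {}
    for probe, values in expr.items():
        gene = probe_to_gene.get(probe)
        if not gene:
            continue  # drop probes without an annotation
        if gene in by_gene:
            by_gene[gene] = [max(a, b) for a, b in zip(by_gene[gene], values)]
        else:
            by_gene[gene] = list(values)
    return by_gene


# Toy data: two probes for one gene, one unannotated probe.
toy_expr = {"p1": [1.0, 5.0], "p2": [3.0, 2.0], "p3": [4.0, 4.0], "p4": [9.0, 9.0]}
toy_annotation = {"p1": "geneA", "p2": "geneA", "p3": "geneB", "p4": ""}
collapsed = collapse_probes(toy_expr, toy_annotation)
```

In this toy example, `geneA` receives the larger of its two probe values in each sample, and the unannotated probe `p4` is discarded.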

WGCNA.
We used RStudio 4.1.3 software to process all data, and co-expression networks were constructed using the WGCNA package [11]. Based on GSE147776, we selected the 5000 genes with the highest median absolute deviation (MAD) values for the WGCNA. To exclude outlier samples, the samples were clustered by hierarchical clustering analysis. To approximate a scale-free topology, the scale-free fit index (R²) threshold was set to 0.85, which gave a soft-thresholding power of 12; the minimum module size was set to 50. We used a cut height of 0.25 to merge potentially similar modules. The expression of each module was summarized by its module eigengene (ME), and the relationships between MEs and clinical features were analyzed. Finally, we selected the module with the highest correlation with clinical features and used the genes of this module for further analysis.
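Two of the quantities in this step, the MAD used to rank genes and the soft-thresholded adjacency, can be sketched in plain Python. This is an illustration only and not the WGCNA package itself. It assumes an unsigned network, in which the adjacency is the absolute Pearson correlation raised to the soft-thresholding power; the network type is not stated in the text. All function names are hypothetical.

```python
import statistics


def mad(values):
    """Median absolute deviation (unscaled) of a list of numbers."""
    med = statistics.median(values)
    return statistics.median([abs(v - med) for v in values])


def top_mad_genes(expr, n):
    """Return the n gene names with the largest MAD.

    expr: dict mapping gene -> list of expression values (one per sample).
    """
    return sorted(expr, key=lambda g: mad(expr[g]), reverse=True)[:n]


def pearson(x, y):
    """Pearson correlation; assumes neither vector is constant."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def adjacency(x, y, power=12):
    """Unsigned soft-thresholded adjacency: |cor(x, y)| ** power (assumed form)."""
    return abs(pearson(x, y)) ** power


toy = {"a": [1, 1, 1, 1], "b": [1, 2, 3, 4], "c": [0, 10, 20, 30]}
ranked = top_mad_genes(toy, 2)
strong = adjacency(toy["b"], toy["c"])
```

Raising correlations to a high power suppresses weak links much more than strong ones: with a power of 12, a correlation of 0.5 maps to roughly 0.00024, while a correlation of 0.9 maps to about 0.28.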

DEG Analysis.
DEGs between the FGR and control groups were screened with the "limma" package [12]. The cutoff values for differential genes were |log2 (fold change)| > 1.5 and P value <0.05. Using a Venn diagram program, the genes shared by the WGCNA blue module and the DEGs were screened and visualized. These overlapping genes were identified as core genes.
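The thresholding and intersection logic can be written out as a short Python sketch. It assumes a table of per-gene log2 fold changes and P values has already been produced (in the study, by limma); the function name `select_degs` and the toy genes are hypothetical and carry no biological meaning.

```python
def select_degs(stats, lfc_cutoff=1.5, p_cutoff=0.05):
    """Split genes into up- and downregulated DEG sets.

    stats: dict mapping gene -> (log2 fold change, P value).
    A gene is a DEG when |log2FC| > lfc_cutoff and P < p_cutoff.
    """
    up = {g for g, (lfc, p) in stats.items() if p < p_cutoff and lfc > lfc_cutoff}
    down = {g for g, (lfc, p) in stats.items() if p < p_cutoff and lfc < -lfc_cutoff}
    return up, down


toy_stats = {
    "gene1": (2.1, 0.001),   # upregulated DEG
    "gene2": (-1.8, 0.010),  # downregulated DEG
    "gene3": (1.2, 0.001),   # fold change too small
    "gene4": (3.0, 0.200),   # not significant
}
toy_module = {"gene1", "gene3", "gene5"}  # stand-in for the key module's genes

up, down = select_degs(toy_stats)
overlap = (up | down) & toy_module  # the Venn-diagram intersection
```

Here only `gene1` passes both DEG thresholds and also belongs to the module, so it is the single overlapping gene.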

Functional Enrichment Analysis of Hub Genes.
Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses were performed for overlapping genes using the "clusterProfiler" R package [13]. An adjusted P value <0.05 was considered significant.

PPI Network Construction and Hub Gene Identification.
To construct a gene interaction network, the 277 hub genes were mapped to the STRING database (https://string-db.org/). Then, we used the CytoHubba plugin for Cytoscape (https://www.cytoscape.org/, version 3.9.1) to build and visualize the protein interactions, from which we selected the genes with the highest degree of connectivity as the central genes.

Hub Gene Expression Validation and Efficacy Evaluation.
Hub genes were validated in the GSE24129 dataset downloaded from the GEO database. The expression of core genes in FGR and normal control placental tissues was analyzed using the "ggplot2" package. Genes with statistically significant differences were used for further ROC curve analysis. ROC curves were plotted, and the areas under the curves (AUCs) were calculated using the "pROC" package to assess the ability of the selected genes to discriminate between FGR and control groups [14].
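To make the meaning of the AUC concrete, the following Python sketch computes it directly from its rank-based definition: the probability that a randomly chosen FGR sample has higher expression than a randomly chosen control, with ties counted as one half. This is not the pROC implementation, and the input values are toy numbers, not study data.

```python
def auc(case_values, control_values):
    """AUC as the fraction of (case, control) pairs in which the case value is
    higher, counting ties as 0.5 (equivalent to the Mann-Whitney U statistic
    divided by the number of pairs)."""
    wins = 0.0
    for c in case_values:
        for k in control_values:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(case_values) * len(control_values))


perfect = auc([2, 3], [0, 1])        # cases always higher
chance = auc([1, 2], [1, 2])         # identical distributions
partial = auc([3, 4, 5], [1, 2, 3])  # one tie between groups
```

An AUC of 1.0 means the gene separates the two groups completely, and 0.5 means it performs no better than chance. Values below 0.5 indicate that expression is lower in cases than in controls.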

Gene Set Enrichment Analysis.
Gene set enrichment analysis (GSEA) was performed separately for each hub gene to further explore the potential molecular functions of these genes in FGR. In the GSE147776 dataset, we divided the samples into high- and low-expression groups according to the median expression of each pivotal gene in FGR and performed GSEA using the R package "clusterProfiler", with a P value <0.05 as the cutoff criterion.
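The median split that defines the two comparison groups can be sketched as follows. The handling of samples exactly at the median is an assumption (here they are assigned to the low group), since the text does not specify it; the function name and sample labels are hypothetical.

```python
import statistics


def median_split(expression):
    """Split samples into high and low groups by the median of one gene.

    expression: dict mapping sample ID -> expression value of the gene.
    Assumption: samples equal to the median go to the low group.
    """
    cutoff = statistics.median(expression.values())
    high = sorted(s for s, v in expression.items() if v > cutoff)
    low = sorted(s for s, v in expression.items() if v <= cutoff)
    return high, low


toy_gene = {"s1": 1.0, "s2": 2.0, "s3": 3.0, "s4": 4.0}
high_group, low_group = median_split(toy_gene)
```

The resulting group labels would then serve as the phenotype for ranking all genes before the enrichment test is run.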

Information of Datasets.
In accordance with the established search criteria, we found two datasets, GSE147776 and GSE24129. The specific information of the two datasets is shown in Table 1, and the clinical information on maternal and neonatal characteristics [15,16] is presented in Table 2.

Weighted Co-Expression Network Construction and Key Module Identification.
To find the gene sets most strongly associated with the FGR trait, we used the WGCNA package to construct a gene co-expression network. We first examined genes and samples, then performed cluster analysis on the samples to exclude outliers, and finally retained all 15 clinical samples from the GSE147776 dataset for analysis (see Figure 2(a)). In this dataset, with the scale-free topology fit index (R²) threshold set at 0.85, the soft-thresholding power was 12, ensuring that the network approached a scale-free topology (see Figure 2(b)). Twelve co-expression modules were constructed by WGCNA (see Figure 2(c)). These modules were divided into 2 clusters (see Figure 2(d)). We drew a heat map of module-trait relationships to assess the correlation of all modules with FGR and found that the blue module had the highest positive correlation with FGR, so we selected this module for further analysis (see Figure 3).

DEGs and Hub Gene Identification.
In total, 437 DEGs were identified in GSE147776, including 325 upregulated genes and 112 downregulated genes. The volcano plot of the DEGs is shown in Figure 4(a). We identified 277 candidate genes from the intersection of the DEGs and the WGCNA blue module genes in the Venn diagram (see Figure 4(b)). The heatmap of the extracted hub genes is displayed in Figure 4(c).

GO and KEGG Analyses.
The "clusterProfiler" package was used for GO function enrichment analysis to investigate the biological characteristics of the 277 hub genes. In the biological process category, the hub genes were mainly enriched in the regulation of T cell activation, T cell differentiation, lymphocyte differentiation, and positive regulation of cell-cell adhesion (see Figure 5(a)). In the cellular component (CC) category, they were mainly enriched in the external side of plasma membrane, collagen-containing extracellular matrix, immunological synapse, and specific granule lumen (see Figure 5(c)). In the molecular function category, the hub genes were mainly enriched in receptor ligand activity, signaling receptor activator activity, cytokine activity, and G protein-coupled receptor binding (see Figure 5(e)). In addition, KEGG enrichment analysis revealed the following pathways: cytokine-cytokine receptor interaction, hematopoietic cell lineage, graft-versus-host disease, and viral protein interaction with cytokine and cytokine receptor (see Figure 5(b)).

PPI Network Construction and Core Gene Analysis.
For further study, we constructed a PPI network among the 277 candidate genes in the STRING database and visualized it using Cytoscape software. Potential key genes were identified by the CytoHubba plugin (see Figure 5(d)). The top 20 genes ranked by CytoHubba were collected as pivotal genes. The heatmap of the 20 hub genes is shown in Figure 5(f).

Core Gene Validation and Validity Assessment.
The extracted core genes were verified in the GSE24129 dataset, which revealed that the expression of CXCL9, CXCR3, and ITGAX was significantly increased in placental tissue from FGR patients (see Figure 6). The expression of these genes was consistent with that observed in GSE147776. Additionally, ROC curves were plotted and AUCs were measured to assess how well each gene distinguished FGR from the control group; in dataset GSE147776, the AUC of CXCL9 was greater than 0.78, and the AUCs of CXCR3 and ITGAX were both greater than 0.85, while in GSE24129, the AUCs of all three pivotal genes were above 0.8 (see Figure 7).

Gene Set Enrichment Analysis.
To analyze the potential molecular mechanisms of the core genes CXCL9, CXCR3, and ITGAX in FGR, we used single-gene GSEA to analyze the KEGG pathway. We found that "hematopoietic cell lineage" and "cell adhesion molecules" were activated in the high-expression groups of each of CXCL9, CXCR3, and ITGAX, while "cGMP-PKG signaling pathway" was inhibited (see Figure 8), suggesting that these pathways may be closely related to the development of FGR.

Discussion
FGR is a significant cause of stillbirth, neonatal mortality, and short- and long-term morbidity [1]. To date, there are no good treatment options for FGR except for iatrogenic preterm birth [17]. The most common factor for FGR is placental dysfunction; accordingly, the samples selected for this study were all placental tissues, and samples with concurrent preeclampsia were excluded.
WGCNA can be used to efficiently integrate gene expression and trait data, explore the characteristics of gene networks, and identify regulatory pathways and potential biomarkers associated with complex diseases [11]. In the present study, based on WGCNA, the blue module (780 genes) was identified as associated with FGR, and 437 DEGs were identified by differential expression analysis. Interestingly, the 277 genes shared by the two sets were enriched in immune cell activation, differentiation, and regulation of cell adhesion, suggesting that the placenta exhibits inflammatory and immune abnormalities. Studies have found that placental inflammation is associated with intrauterine growth restriction [10,18,19], which is in agreement with our results. Then, we identified three key genes (CXCL9, CXCR3, and ITGAX) as critical for FGR by multiple bioinformatics analyses and validated in an additional independent dataset that all three genes were highly expressed in the FGR group and had a diagnostic ability for FGR.
CXCL9 is a chemokine, and CXCR3 is a chemokine receptor. The CXCL9 gene is located on chromosome 4 in humans, and its expression is induced by IFN-γ [20]. CXCR3 is a transmembrane G protein-coupled receptor, whose gene is located on chromosome Xq13 [21]. CXCR3 is the receptor for CXCL9 and also for CXCL10 and CXCL11 [22]. CXCR3 interacts with its ligands to disrupt fetal-maternal immune tolerance, triggering a range of chronic inflammatory lesions in the placenta that lead to intrauterine growth restriction, fetal death, spontaneous abortion, premature rupture of membranes, and preterm delivery [23][24][25]. Malaria infection during pregnancy leads to severe maternal anemia and low infant birth weight, and multivariate analysis of known predictors of birth weight suggests that elevated placental CXCL9 levels are an important cause of fetal growth restriction [26]. This is similar to the results of our study, where we found that the expression of CXCR3 and CXCL9 was elevated in the FGR group, suggesting that they are important factors in the development of FGR.
Integrin alpha X (ITGAX) is a member of the integrin family, which usually acts as a receptor for the extracellular matrix. ITGAX is closely associated with tumor development: it promotes c-Myc-mediated VEGF-A transcription by activating the PI3K/Akt pathway and binding to VEGFR2 on the cell membrane, enhancing angiogenesis during ovarian cancer growth [27]. A study exploring key genes in unexplained recurrent spontaneous abortion by targeted RNA sequencing and clinical analysis identified ITGAX as one of the immune-related genes involved in T cell activation and proliferation and cytokine receptor interactions [28]. However, there are no studies on the relationship between ITGAX and FGR. Our results suggest that ITGAX expression is elevated in FGR placental tissue and that ITGAX is involved in the development of FGR, adding a new perspective to the study of the mechanisms of FGR.
Finally, we also investigated the biological functions of CXCL9, CXCR3, and ITGAX. GSEA revealed that CXCL9, CXCR3, and ITGAX could activate "hematopoietic cell lineage" and "cell adhesion molecules." Studies have shown that cell adhesion molecules are involved in the proliferation, fusion, migration, and invasion of trophoblasts during placenta formation [29], and dysregulated expression of these molecules can lead to a pathological placenta, which can cause various obstetric complications such as intrauterine growth restriction [30,31]; the exact mechanism needs further research. CXCL9, CXCR3, and ITGAX also inhibit the "cGMP-PKG signaling pathway," which regulates the umbilical cord circulation. The NO-induced umbilical vein relaxation observed in growth-restricted female neonates is associated with an imbalance in the NO/cGMP pathway [32].
The current study has some limitations. We explored the pivotal genes associated with FGR and their biological functions in the GSE147776 dataset and validated the pivotal genes in the GSE24129 dataset, but these findings still need to be validated in placental tissue by qRT-PCR, and the regulatory mechanism of the hub genes in fetal intrauterine growth restriction needs to be further investigated.

Conclusions
In this study, we used WGCNA to screen the core module and identify key genes, providing new ideas for the pathogenesis of FGR and potential diagnostic and therapeutic targets. We will subsequently validate the findings of this study in vivo and in vitro and elucidate the specific mechanisms of the core genes in FGR.

Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest
The authors have no conflicts of interest to declare.