Analysis of Long Noncoding RNAs-Related Regulatory Mechanisms in Duchenne Muscular Dystrophy Using a Disease-Related lncRNA-mRNA Pathway Network

Objective This study aimed to investigate the molecular regulatory mechanisms underpinning Duchenne muscular dystrophy (DMD). Methods Using microarray data, differentially expressed long noncoding RNAs (DELs) and DMD-related differentially expressed mRNAs (DEMs) were screened based on the comparative toxicogenomics database, using a cutoff of |log2 fold change| > 1 and false discovery rate (FDR) < 0.05. Then, protein-protein interaction (PPI), coexpression network of lncRNA-mRNA, and DMD-related lncRNA-mRNA pathway networks were constructed, and functional analyses of the genes in the network were performed. Finally, the proportions of immune cells infiltrating the muscle tissues in DMD were analyzed, and the correlation between the immune cells and expression of the DELs/DEMs was studied. Results A total of 46 DELs and 313 DMD-related DEMs were identified. The PPI network revealed STAT1, VEGFA, and CCL2 to be the top three hub genes. The DMD-related lncRNA-mRNA pathway network comprising two pathways, nine DELs, and nine DMD-related DEMs showed that PYCARD, RIPK2, and CASP1 were significantly enriched in the NOD-like receptor signaling pathway, whereas MAP2K2, LUM, RPS6, PDCD4, TWIST1, and HIF1A were significantly enriched with proteoglycans in cancers. The nine DELs in this network were DBET, MBNL1-AS1, MIR29B2CHG, CCDC18-AS1, FAM111A-DT, GAS5, LINC01290, ATP2B1-AS1, and PSMB8-AS1. Conclusion The nine DMD-related DEMs and DELs identified in this study may play important roles in the occurrence and progression of DMD through the two pathways of the NOD-like receptor signaling pathway and proteoglycans in cancers.


Introduction
Duchenne muscular dystrophy (DMD), a severe and progressive X-linked disease that is characterized by muscular dystrophy, is caused by mutations in the dystrophin gene and has a global prevalence of approximately 0.005% of male births [1]. Defciency of the dystrophin protein in muscles results in the progressive loss of muscle tissue and impaired muscle function, rendering the muscles more vulnerable to mechanical damage [2]. Complications, including skeletal muscle wasting, cardiomyopathy, and respiratory insufciency, are often observed, with cardiopulmonary failure being the most common cause of DMD-related mortality [3]. Multidisciplinary symptomatic treatment is used in the clinical management of DMD [4]. Gene-based therapies and myogenic cell transplantation are the current therapeutic approaches for DMD [5].
Long noncoding RNAs (lncRNAs) are noncoding transcripts longer than 200 nucleotides that are not translated into proteins [6]. LncRNAs are important modulators of messenger RNA (mRNA) and modulate mRNA expression at the posttranscriptional level by functioning as competing endogenous RNAs (ceRNAs) that competitively bind to microRNAs (miRNAs) [7]. LncRNAs regulate dystrophin expression and thus represent a potential therapeutic strategy to alleviate dystrophin protein defciency [8]. LncRNA H19 reportedly ameliorates muscular dystrophy by suppressing the degradation of the dystrophin protein [9]. In a recent study, bioinformatics analyses were used to comprehensively examine DMD-related diferentially expressed lncRNAs (DELs) by constructing a ceRNA network of lncRNA-miRNA-mRNA using gene expression omnibus (GEO) datasets [10]. However, DMD-related lncRNAs and their regulatory molecular mechanisms involving lncRNAs, mRNAs, and pathways have not been fully investigated.
In the present study, gene expression profles of DMD samples were analyzed using a series of bioinformatics tools to identify diferentially expressed mRNAs (DEMs) and DELs in DMD. Next, protein-protein interaction (PPI) networks of the DMD-related DEMs and DMD-related lncRNA-mRNA pathway were constructed. Additionally, alterations in diferent types of immune cells and the correlation between immune cells and crucial factors in DMD were also analyzed. Finally, the expression of the important factors (lncRNAs and mRNAs) identifed in the study was verifed in the training and validation sets. Tus, the study identifed several potential biomarkers and therapeutic targets for DMD and has signifcant implications for the development of novel treatment strategies based on them.

Data Source and Preprocessing.
Te gene expression profles from the GSE38417 and GSE6011 datasets were downloaded from the NCBI Gene Expression Omnibus (GEO, https://www.ncbi.nlm.nih.gov/geo/). Te GSE38417 dataset, which was used as a training set, included muscle tissue samples from 16 patients with DMD and six normal control individuals (Afymetrix Human Genome U133 Plus 2.0 Array platform). However, the GSE6011 dataset, which served as a validation set, contained 36 muscle tissue samples (22 patients with DMD and 14 normal control individuals) and was analyzed using the GPL96 detection platform. Ten, the mRNA and lncRNA expression was reannotated according to the annotation fles from the Ensembl genome browser database (https://asia.ensembl.org/index.html).

Screening DEMs and DELs in DMD.
Using the limma package of the R software (version 3.6.1, https:// bioconductor.org/packages/release/bioc/html/limma.html), the raw expression datasets were subjected to log 2 transformation and then normalized using the "between array normalization" function. Subsequently, DEMs and DELs were screened with a strict cutof of false discovery rate (FDR) < 0.05 and |log 2 fold change (FC)| > 1. Ten, bidirectional hierarchical clustering analysis based on the centered Pearson correlation was performed on the identifed DEMs and DELs using the pheatmap package (https:// cran.r-project.org/web/packages/pheatmap/index.html) in R 3.6.1.
Using the keyword "Duchenne muscular dystrophy," DMD-related genes were searched in the comparative toxicogenomics database (CTD) [11] (https://ctdbase.org/), with the inference score threshold set at >3. Following comparison with the identifed DEMs, the overlapping DMD-related DEMs were selected for gene ontology (GO) terms [12] and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway [13] enrichment analyses using the online tool Database for Annotation, Visualization, and Integrated Discovery (DAVID, version 6.8, https://david.ncifcrf.gov/). An FDR of <0.05 was set as the threshold for a signifcant diference.

Constructing a PPI Network.
Te overlapping DMDrelated DEMs identifed using the screening were subjected to searches for interactions of the proteins they encoded using the search tool for the retrieval of interacting genes [14] (STRING, version 11.0, https://string-db.org/), and the PPI pairs with a combined score >0.6 were retained to construct a PPI network [15]. Cytoscape (version 3.9.0) was used to visualize the PPI network. Subsequently, the topological features of the network, including the average shortest path length, degree centrality (DC), closeness centrality (CC), and betweenness centrality (BC) [15], were calculated to determine the hub genes in the PPI network using the CentiScaPe plugin (version 2.2) of Cytoscape software (https://apps.cytoscape.org/apps/centiscape). Degree refers to the number of interactions (edges) at a node (protein), and node genes with the highest degrees were defned as hub genes.
Te molecular complex detection (MCODE) version 1.4.2; https://apps.cytoscape.org/apps/mcode) plugin in the Cytoscape software was used to mine the functionally related and highly interconnected modules using the following parameters: degree cutof � 2, node score cutof � 0.2, and Kcore � 2. Ten, the functional analysis of the genes in these modules was performed using the BINGO plugin (version 2.44; https://apps.cytoscape.org/apps/bingo) in the Cytoscape software with FDR <0.05.

Constructing a Coexpression Network of lncRNA-mRNA.
Correlations between the DELs and overlapping DMDrelated DEMs were analyzed by calculating Pearson's correlation coefcient (PCC) using the cor.test (https://stat. ethz.ch/R-manual/R-devel/library/stats/html/cor.test.html) using the R software. Ten, the lncRNA-mRNA pairs with P < 0.05 and PCC > 0.9 were chosen to construct a lncRNA-mRNA coexpression network, and the network was visualized using Cytoscape software. Next, the genes in the coexpression network were subjected to GO term and KEGG pathway enrichment analyses using the DAVID online tool (version 6.8) with FDR < 0.05.

Constructing a DMD-Related lncRNA-mRNA Pathway
Network. Te DMD-related KEGG pathways in the CTD database were searched using the keyword "duchenne muscular dystrophy." Te DMD-related KEGG pathways thus identifed were compared with the KEGG pathways that were signifcantly enriched by the genes in the coexpression network, and the overlapping KEGG pathways and overlapping KEGG pathway-related DEMs and DELs were retained to construct a DMD-related lncRNA mRNA pathway network. Genetics Research

Correlation between the Crucial Factors and DMD-
Related Immune Cell Types. Te infltration of immune cells in muscle tissues is an important feature of DMD [16]. Te proportion of diferent types of immune cells infltrating the muscle tissue of patients with DMD and normal control individuals was calculated using CIBERSORT [17] (https:// cibersort.stanford.edu/index.php), and the diferences between the two sets of samples were compared using t-tests. Finally, the correlation between the signifcantly diferent immune cell types and the expression of the factors (mRNAs and lncRNAs) in the DMD-related lncRNA mRNA pathway network was analyzed by calculating the PCC.

Analysis of the PPI Network Based on the DMD-Related
DEMs. Te 313 DMD-related DEMs were subjected to a search for interactions between the proteins encoded by these DMD-related DEMs using the STRING database, and 695 interaction pairs with a combined score >0.6 were identifed and used to build a PPI network, which contained 216 nodes and 695 edges (Figure 3(a)). After calculating the centrality parameters of each node in the network, the top 20 genes were identifed (listed in Table 1), and the top three hub genes were STAT1 (degree � 41), VEGFA (degree � 32), and CCL2 (degree � 28).

Analysis of the Coexpression Network of lncRNA mRNA.
To investigate the relationship between the identifed DELs and overlapping DMD-related DEMs in the pathogenesis of DMD, we calculated the PPCs of the DELs and overlapping DMD-related DEMs. Based on the thresholds of P < 0.05 and PPC > 0.9, 308 lncRNA-mRNA pairs were obtained, and a coexpression network of lncRNA-mRNA was constructed (Figure 4(a)). Next, the genes in this coexpression network were subjected to functional analysis. With the FDR set at <0.05, 21 GO terms of biological process and nine KEGG signaling pathways were found to be signifcantly enriched and included "positive regulation of ERK1 and ERK2," "T cell receptor signaling pathway," "infammatory response," "response to hypoxia," "NOD-like receptor signaling pathway," "PI3K-Akt signaling pathway," "HIF-1 signaling pathway," and "NF-kappa B signaling pathway" (Figure 4(b)).

Analysis of the DMD-Related lncRNA-mRNA Pathway Network Composed of Nine mRNAs, Nine lncRNAs, and Two
Pathways. We searched the CTD database using the keyword "Duchenne muscular dystrophy" and identifed 45 KEGG signaling pathways to be closely associated with DMD. Following comparison with KEGG pathways signifcantly enriched by the genes in the coexpression network, two overlapping KEGG pathways were obtained, including the NOD-like receptor signaling pathway and proteoglycans in cancers. Subsequently, the interaction between DEMs in the overlapping KEGG signaling pathways and the coexpression relationship with lncRNAs were integrated, and a DMD-related lncRNA-mRNA pathway network was proposed with two KEGG pathways, nine DMD-related DEMs (LUM, HIF1A, PYCARD, RIPK2, CASP1, PDCD4, RPS6, MAP2K2, and TWIST1), and nine DELs (ATP2B1-AS1, GAS5, MBNL1-AS1, MIR29B2CHG, PSMB8-AS1, FAM111A-DT, CCDC18-AS1, DBET, and LINC01290) ( Figure 5). From the lncRNA-mRNA pathway network, PYCARD, RIPK2, and CASP1 were found to be signifcantly enriched in the NOD-like receptor signaling pathway, whereas MAP2K2, LUM, RPS6, PDCD4, TWIST1, and HIF1A were signifcantly enriched in proteoglycans in cancers.     (P � 0.0007) were signifcantly increased (Figure 6(a)) in the DMD samples compared to those in the normal control samples. Furthermore, analysis of the correlation between the fve signifcantly diferent immune cell types, nine DMD-related DEMs, and nine DELs in the DMDrelated lncRNA-mRNA pathway network revealed that activated NK cells, neutrophils, and Tregs were positively correlated with the expression of four DELs (CCDC18-AS1, DBET, MBNL1-AS1, and MIR29B2CHG) and one DMD-related DEM (MAP2K2), whereas the cells were negatively correlated with the expression of the other fve DELs and eight DMD-related DEMs. However, resting myeloid dendritic cells and M2 macrophages showed an opposite trend (Figure 6(b)).

Validation of the Identifed Crucial Factors in the GSE38417 and GSE6011 Datasets.
Te expression of the crucial factors in the lncRNA-mRNA pathway network was analyzed in the training (GSE38417) and validation (GSE6011) sets. In GSE38417, the expression of lncRNA ATP2B1-AS1, FAM111A-DT, GAS5, LINC01290, and PSMB8-AS1 signifcantly increased (P < 0.05), whereas that of CCDC18-AS1, DBET, MBNL1-AS1, and MIR29B2CHG was signifcantly decreased (P < 0.05, Figure 7 Figure 7(b)). Because GSE6011 is based on the GPL96 annotation platform, few lncRNAs could be annotated on this platform. Terefore, we only verifed the expression of nine DMD-related DEMs in the validation set (GSE6011). We found that the trend of expression of these DMD-related DEMs (except for CASP1, RIPK2, and RPS6) in GSE6011 was consistent with that in GSE38417 (Figure 7(c)).

Discussion
DMD is a fatal X-linked genetic disorder characterized by progressive muscular wasting resulting from dystrophin protein defciency [18]. Te involvement of lncRNAs in the pathophysiology of DMD is being increasingly investigated because of their pivotal roles in the regulation of dystrophin protein expression. In the present study, we identifed 46 DELs and 313 DMD-related DEMs in DMD. A PPI network was constructed based on these DMD-related DEMs, and STAT1, VEGFA, and CCL2 were identifed as the top three hub genes. Based on the DMD-related DEMs and DELs, a coexpression network of lncRNAs and mRNAs was generated, and the factors in this coexpression network were found to be signifcantly enriched in 21 GO terms of biological processes and nine KEGG signaling pathways. According to the CTD database, NOD-like receptor signaling and proteoglycans in cancer pathways were found to be closely related to DMD. Te DMD-related lncRNA-mRNA pathway network was found to consist of two pathways, nine DMD-related DEMs and nine DELs. Furthermore, fve immune cell types were found to be significantly diferent between the DMD and normal control samples. Tus, this study provides important insights into the underlying mechanisms of lncRNAs in the pathogenesis of DMD and provides potential targets that could aid in the rational design of lncRNA-based therapies for DMD. Lack of dystrophin results in chronic infammation, which is closely associated with muscle degeneration in the pathogenesis of DMD [19,20]. In our proposed DMDrelated lncRNA-mRNA pathway networks, PYCARD, RIPK2, and CASP1 were signifcantly enriched in the NODlike receptor signaling pathway, whereas MAP2K2, LUM, RPS6, PDCD4, TWIST1, and HIF1A were signifcantly enriched in proteoglycans in cancers. Te NOD-like receptor (NLR) assembles a protein complex called the NLR family pyrin domain-containing 3 (NLRP3) infammasome in response to certain infectious and sterile stimuli [21]. NLPR3 infammasomes are upregulated as a result of dystrophin defciency and play a key pathogenic role in DMD [22]. Te NLR signaling pathway has a critical role in infammation [23]. PYCARD encodes the infammasome adaptor apoptosis-associated speck-like protein containing a C-terminal caspase recruitment domain (ASC), a component of the NLRP3 infammasomes, which is involved in downstream signaling of the infammasome pathway [24].   Receptor-interacting serine/threonine protein kinase 2 (RIPK2) regulates NLRP3 and infammation [25], and the NOD/RIPK2 infammatory signaling pathway has been found to confer susceptibility to osteoarthritis [26]. CASP1, encoding caspase 1, is a key component of NLRP3 infammasomes [27] and is positively related to active ulcerative colitis [28]. Our results suggest that PYCARD, RIPK2, and CASP1 may be involved in regulating DMDrelated infammation via the NLR signaling pathway.
Te dystrophin protein is a component of the dystrophin-associated glycoprotein complex that connects actin flaments, intermediate flaments, and microtubules to transmembrane protein complexes to stabilize muscle cells [29]. Proteoglycans, a class of highly glycosylated proteins that are expressed in almost all tissues, are involved in tissue homeostasis and remodeling of the stromal microenvironment during physiological and pathological processes, such as tissue regeneration, angiogenesis, and cancer [30]. MAP2K2 encodes mitogen-activated protein kinase 2, an important component of the MAPK pathway. Te MAPK pathway is an important regulator of myofber death [31]. MAPK/ERK signaling plays a critical role in various biological activities by phosphorylating various substrates in the cytoplasm and nucleus and has been reported to play a central role in organ regeneration [32]. Ribosomal protein S6 (RPS6) controls mRNA translation, and phospho-RPS6 has been identifed as a surrogate marker of the activated PI3K/AKT/mTORC1 pathway found in many cancer types [33]. Programmed cell death 4 (PDCD4) mediates infammation and regulates the MAPK pathway [34]. Wang et al. [35] demonstrated that the PDCD4/HO-1 pathway is involved in oxidative stress and infammation in atherosclerosis. Lumican (LUM), a member of the small leucinerich proteoglycan family, is present in muscle tissues and plays dual roles as an oncogene and tumor suppressor [36]. TWIST1 has been reported to regulate infammatory genes in skeletal muscle [37] and mediate epithelial-tomesenchymal transition and cell migration [38]. Te activation of HIF1α (encoded by HIF1A) is an important cause of vascular dysfunction and impaired angiogenesis in DMD [39]. Additionally, interaction between HIF1A and TWIST was observed in the DMD-related lncRNA-mRNA pathway network. HIF1α silencing reportedly leads to downregulation of TWIST1 [40]. Taken together, these results suggest that MAP2K2, LUM, RPS6, PDCD4, TWIST1, and HIF1A may play important roles in DMD pathophysiology by regulating the MAPK pathway and proteoglycans in cancer.
In DMD, the immune system is activated and several types of immune cells, such as CD4+ and CD8+ T cells, Tregs, and NK cells, invade the skeletal muscles [16]. A recent study showed that lncRNA CCDC18-AS1 is positively correlated with activated myeloid dendritic cells and neutrophils [52], in concordance with our results that showed that the expression of lncRNA CCDC18-AS1 was positively correlated with the proportion of activated NK cells, neutrophils, and Tregs but negatively correlated with resting myeloid dendritic cells and M2 macrophages. Our study also showed that the proportions of Tregs, activated NK cells, neutrophils, M2 macrophages, and resting myeloid dendritic cells were signifcantly diferent between patients with DMD and normal control individuals.
However, this study has some limitations. First, the number of samples in the GSE38417 dataset was small, and additional validation cohorts need to be analyzed in future studies. Second, as this study involved secondary analysis of microarray data, further experimental studies are needed to verify the results of our study. Additionally, the specifc mechanisms of the identifed DMD-related DEMs and DELs need to be further explored in vitro and in vivo.

Conclusions
By performing a series of bioinformatics analyses on microarray data, a DMD-related lncRNA-mRNA pathway network was constructed. Nine DELs (DBET, MBNL1-AS1, MIR29B2CHG, CCDC18-AS1, FAM111A-DT, GAS5, LINC01290, ATP2B1-AS1, and PSMB8-AS1), nine DMDrelated DEMs (PYCARD, RIPK2, CASP1, MAP2K2, LUM, RPS6, PDCD4, TWIST1, and HIF1A), and two KEGG pathways (NLR signaling pathway and proteoglycans in cancer) were identifed to play a crucial role in the pathogenesis of DMD. Our fndings provide novel insights into the regulatory relationships between lncRNAs, mRNAs, and pathways in the pathogenesis of DMD. Te DMD-related DEMs and DELs identifed in this study need to be further investigated for their clinical use as potential biomarkers and therapeutic targets for DMD.

Data Availability
Te datasets used and analyzed during the current study are available from the corresponding author upon request.