A Novel Method for Predicting Disease-Associated LncRNA-MiRNA Pairs Based on the Higher-Order Orthogonal Iteration

Many studies have shown that complex human diseases are associated not only with microRNAs (miRNAs) but also with long noncoding RNAs (lncRNAs). However, most existing studies focus on predicting disease-related miRNAs or lncRNAs individually, and to our knowledge, few have examined the impact of miRNA-lncRNA pairs on diseases, although a growing body of work shows that both lncRNAs and miRNAs play important roles in cell proliferation and differentiation. The identification of disease-related genes provides great insight into the underlying pathogenesis of diseases at a system level. In this study, a novel model called PADLMHOOI is proposed to predict potential associations between diseases and lncRNA-miRNA pairs based on the higher-order orthogonal iteration. To evaluate its prediction performance, global and local leave-one-out cross-validation (LOOCV) were implemented, and simulation results demonstrated that PADLMHOOI achieves AUCs of 0.9545 and 0.8874 in global and local LOOCV, respectively. Moreover, case studies further demonstrated the effectiveness of PADLMHOOI in inferring unknown disease-related lncRNA-miRNA pairs.


Introduction
Noncoding RNAs can be roughly divided by size into small and long noncoding RNAs. Generally, small RNAs include tRNAs, miRNAs, piRNAs, and snoRNAs [1][2][3][4]. miRNAs are widely present in the cytoplasm of eukaryotic cells and are approximately 18-22 nucleotides in length; they can bind to the 3′-untranslated region (3′-UTR) of mRNA to inhibit its translation or to degrade it, thereby affecting the expression of related genes [5][6][7]. miRNAs play important roles in a series of life activities such as cell differentiation [8], growth and development [9], and apoptosis [10]. Compared with small ncRNAs, lncRNAs have a longer chain of more than 200 nucleotides and a specific, complex secondary structure, and they can provide multiple sites for protein binding [11]. In addition, both lncRNAs and miRNAs are key members of noncoding RNAs and play important roles in coding and regulation of many complex human diseases [12][13][14][15][16].
Up to now, there have been many studies on relationships between diseases and miRNAs, for example, the methods proposed by Xing Chen et al. [17][18][19][20] and Zou et al. [21][22][23][24]. For the prediction of potential associations between lncRNAs and diseases, Yu et al. [25] and Xing et al. [26] proposed two computational models called NBCLDA and LRLSLDA, respectively. Moreover, studies have also shown that there exist relationships between lncRNAs and miRNAs. For example, Gernapudi et al. demonstrated that miR-140 can induce the expression of lncRNA NEAT1 [27]. Dey et al. showed that the silencing of lncRNA H19 and knockout of the H19 gene in myoblasts significantly decreased skeletal muscle differentiation [28]. Yilong et al. discovered that, after low XIST expression in gliomas, XIST could regulate miR-152 glioma stem cells to inhibit cell proliferation, migration, and invasion [29]. Xinyu et al. demonstrated that lncRNA MALAT1 could achieve posttranscriptional regulation of esophageal squamous cell carcinoma cells through miR-101 and miR-217 [30]. Er-bao et al. proposed that lncRNA ANRIL interacted with miR-99a/miR-449a to regulate cell proliferation during gastric cancer formation [31]. You et al. found that the expression of miR-449a and the expression of lncRNA NEAT1 in lung cancer cell line L9981 inhibited each other: when miR-449a was overexpressed, NEAT1 expression was decreased, cell proliferation was inhibited, and apoptosis was increased, and vice versa [32]. Emmrich et al. found that the expression of lncRNA MONC and MIR100HG was closely related to the miRNA groups of miR-99a∼125b-2 and miR-100∼125b-1. After silencing of lncRNA MONC and MIR100HG, acute megakaryocytes in the early stage of the disease, the tumor cells of leukemia patients, were severely inhibited [33]. Amy et al. found that lncRNA Ang362 was the host transcript of miR-221 and miR-222, and their interactions regulated Ang II and induced proliferation of vascular smooth muscle cells [34]. Miaojun et al. found that the interactions between lncRNA H19 and miRNA-675 play an important role in the metastasis of prostate cancer [35]. The exploration of these relationships is conducive to the construction of gene regulatory networks and the identification of the mechanisms of complex human diseases [36][37][38].
From the above description, it is easy to see that a growing number of studies show that lncRNA-miRNA interactions are involved in the development of complex diseases. However, to the best of our knowledge, apart from the PADLMP model proposed by Zhou et al. [39], few models have been proposed for large-scale prediction of potential associations between diseases and lncRNA-miRNA interactions. Hence, inspired by state-of-the-art methods [40][41][42][43][44], which show that miRNA-miRNA pairs can work cooperatively to regulate a single gene or gene clusters involved in similar processes [45], and based on the reasonable assumption that functionally similar lncRNA-miRNA pairs tend to be associated with similar diseases, in this paper, a new prediction model called PADLMHOOI is proposed to infer potential associations between diseases and lncRNA-miRNA pairs. As illustrated in Figure 1, PADLMHOOI consists of the following four major steps:
Step 1 (Data Integration and Network Construction). In this step, we first downloaded known disease-lncRNA associations from three databases, LncRNADisease [46], MNDR [47,48], and Lnc2Cancer [49], and constructed a bipartite disease-lncRNA network from these datasets. Next, we downloaded known disease-miRNA associations from three databases, miR2Disease [50], HMDD [51], and miRCancer [52], and constructed a bipartite disease-miRNA network. Moreover, we downloaded the 2015 and 2017 versions of known lncRNA-miRNA associations from the starBase v2.0 database [53] (http://starbase.sysu.edu.cn/) on Feb 2, 2017, and constructed a bipartite lncRNA-miRNA network. Finally, based on these three bipartite networks, we constructed an integrated tripartite disease-lncRNA-miRNA network, which can be denoted as a tensor T.
Step 2 (Similarity Calculation). In this step, we first integrate the disease semantic similarity and the Gaussian Interaction Profile Kernel similarity to measure the similarity of diseases. Next, we integrate the lncRNA functional similarity and the miRNA functional similarity in three different ways to measure the functional similarity of lncRNA-miRNA pairs.
Step 3 (Weighted K-Nearest Neighbor Profile). Some diseases may be unrelated to all lncRNA-miRNA pairs, which may lead to unsatisfactory prediction results when PADLMHOOI is used to infer potential associations between diseases and lncRNA-miRNA pairs. Hence, in this step, we introduce the weighted K-nearest neighbor profile (WKNNP) to add more interaction information between diseases, lncRNAs, and miRNAs and thereby improve the prediction performance of PADLMHOOI.
Step 4 (Tensor Decomposition). In this step, we perform tensor decomposition on the newly constructed disease-lncRNA-miRNA tensor T. Since the results of tensor decomposition include a core tensor and three matrices, we define the final predicted association tensor as the modal product between the core tensor and these three matrices. Thereafter, for each disease, we sort the scores of the lncRNA-miRNA pairs in the final predicted association tensor in descending order; the higher a score ranks, the more likely it is that a potential association exists between the disease and the lncRNA-miRNA pair.

Construction of the Bipartite Network of Disease-lncRNA.
In order to construct the bipartite network of disease-lncRNA, known associations between diseases and lncRNAs were first downloaded from three databases, LncRNADisease, MNDR, and Lnc2Cancer, and then, after feature processing (including feature cleaning and data imbalance processing), 2048 different disease-lncRNA associations were finally obtained (Supplementary Table 1). Thereafter, based on these 2048 known disease-lncRNA associations, we can construct a disease-lncRNA bipartite network $G_1 = (V_1, E_1)$ according to the following steps:
Step 1. Let $V_{l1} = \{l_i \mid i \in [1, n_{l1}]\}$ be the set of all different lncRNAs in these 2048 known disease-lncRNA associations and $V_{d1} = \{d_i \mid i \in [1, n_{d1}]\}$ be the set of all different diseases in these 2048 known disease-lncRNA associations, then we define $V_1 = V_{l1} \cup V_{d1}$ as the vertex set of $G_1$.
Step 2. $\forall l_i \in V_{l1}, d_j \in V_{d1}$, if $(l_i, d_j)$ belongs to these 2048 known disease-lncRNA associations, then we define that there is an edge between $l_i$ and $d_j$ in $G_1$; thereafter, we can obtain the edge set $E_1$ of $G_1$.

[Figure 1: Flowchart of PADLMHOOI. The final panel, "Tensor decomposition", reads: Step 1. Input $T$, $R_1$, $R_2$, and $R_3$ and initialize $Z_1$, $Z_2$, $Z_3$, $G$, and the convergence threshold $\varepsilon$. Step 2. Repeat: for $n = 1$ to $3$, update $Z_n$ according to (31); update $G$ according to (35); until convergence.]
Computational and Mathematical Methods in Medicine

Construction of the Bipartite Network of Disease-miRNA.
In order to construct the bipartite network of disease-miRNA, known disease-miRNA associations were first downloaded from three databases, miR2Disease, HMDD, and miRCancer, and then, after the newly acquired miRNAs and diseases were mapped to miRBase v21 [54] and the Disease Ontology (DO) [55], respectively, 4041 different disease-miRNA associations were finally obtained (Supplementary Table 2). Hence, based on these 4041 known disease-miRNA associations, we can construct a disease-miRNA bipartite network $G_2 = (V_2, E_2)$ according to the following steps:
Step 1. Let $V_{m1} = \{m_i \mid i \in [1, n_{m1}]\}$ be the set of all different miRNAs in these 4041 known disease-miRNA associations and $V_{d2} = \{d_i \mid i \in [1, n_{d2}]\}$ be the set of all different diseases in these 4041 known disease-miRNA associations, then we define $V_2 = V_{m1} \cup V_{d2}$ as the vertex set of $G_2$.
Step 2. $\forall m_i \in V_{m1}, d_j \in V_{d2}$, if $(m_i, d_j)$ belongs to these 4041 known disease-miRNA associations, then we define that there is an edge between $m_i$ and $d_j$ in $G_2$; thereafter, we can obtain the edge set $E_2$ of $G_2$.

Construction of the Bipartite Network of lncRNA-miRNA.
In order to construct the bipartite network of lncRNA-miRNA, two different versions (2015 and 2017) of the lncRNA-miRNA dataset were first downloaded from the starBase v2.0 database, and then, after feature processing (including feature cleaning and data imbalance processing), 20324 different lncRNA-miRNA interactions were finally obtained (Supplementary Table 3). Thereafter, based on these 20324 known lncRNA-miRNA associations, we can construct a lncRNA-miRNA bipartite network $G_3 = (V_3, E_3)$ according to the following steps:
Step 1. Let $V_{l2} = \{l_i \mid i \in [1, n_{l2}]\}$ denote the set of all different lncRNAs in these 20324 known lncRNA-miRNA associations and $V_{m2} = \{m_i \mid i \in [1, n_{m2}]\}$ denote the set of all different miRNAs in these 20324 known lncRNA-miRNA associations, then we define $V_3 = V_{l2} \cup V_{m2}$ as the vertex set of $G_3$.
Step 2. $\forall l_i \in V_{l2}, m_j \in V_{m2}$, if $(l_i, m_j)$ belongs to these 20324 known lncRNA-miRNA associations, then we define that there is an edge between $l_i$ and $m_j$ in $G_3$; thereafter, we can obtain the edge set $E_3$ of $G_3$.

Construction of the Tripartite Network of Disease-lncRNA-miRNA.
Based on the networks $G_1$, $G_2$, and $G_3$ obtained above, we can construct a tripartite network $G_4 = (V_4, E_4)$ according to the following steps:
Step 1. [...]
Step 2. While $V_d$ is not empty, repeat: for a disease $d_i \in V_d$, if there exist $l_j \in V_l$ and $m_k \in V_m$ that satisfy the following three conditions simultaneously: $(l_j, d_i) \in E_1$, $(m_k, d_i) \in E_2$, and $(l_j, m_k) \in E_3$, then $(d_i, l_j)$, $(d_i, m_k)$, and $(l_j, m_k)$ will be added into $E_4$ firstly, and then $d_i$ will be added into $V_d'$ and removed from $V_d$. Finally, $l_j$ and $m_k$ will be added into $V_4$ if they are not already in $V_4$. Else, $d_i$ will be removed from $V_d$.
Step 3. [...]
According to the above steps, a tripartite disease-lncRNA-miRNA association network can finally be obtained. In this tripartite network, there are three kinds of nodes: disease nodes, lncRNA nodes, and miRNA nodes. The numbers of disease nodes, lncRNA nodes, and miRNA nodes are 68, 44, and 211, respectively, and the number of associations between diseases and lncRNA-miRNA pairs is 3,047.
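The construction above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function names `build_triples` and `node_sets`, and the edge orientation (disease, lncRNA), (disease, miRNA), (lncRNA, miRNA), are assumptions made for this example. A disease is linked to a lncRNA-miRNA pair only when all three pairwise edges are present in the bipartite networks.

```python
from itertools import product


def build_triples(dis_lnc, dis_mir, lnc_mir):
    """Collect (disease, lncRNA, miRNA) triples supported by all three networks.

    Assumed edge orientation (hypothetical, for illustration only):
    dis_lnc holds (disease, lncRNA), dis_mir holds (disease, miRNA),
    lnc_mir holds (lncRNA, miRNA).
    """
    lnc_mir = set(lnc_mir)
    lnc_of, mir_of = {}, {}
    for d, l in dis_lnc:
        lnc_of.setdefault(d, set()).add(l)
    for d, m in dis_mir:
        mir_of.setdefault(d, set()).add(m)

    triples = set()
    # Only diseases present in both disease networks can form a triple.
    for d in lnc_of.keys() & mir_of.keys():
        for l, m in product(lnc_of[d], mir_of[d]):
            if (l, m) in lnc_mir:
                triples.add((d, l, m))
    return triples


def node_sets(triples):
    """Disease, lncRNA, and miRNA node sets of the resulting tripartite network."""
    diseases = {d for d, _, _ in triples}
    lncrnas = {l for _, l, _ in triples}
    mirnas = {m for _, _, m in triples}
    return diseases, lncrnas, mirnas
```

Diseases that never complete a triple simply do not appear in the output, which mirrors their removal from the candidate set in Step 2.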

Construction of the Disease-lncRNA-miRNA Tensor.
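Based on the newly constructed tripartite network, for any given disease node $d_i$, lncRNA node $l_j$, and miRNA node $m_k$ in $G_4$, we can define a tensor $T$ as follows:
$$T(i, j, k) = \begin{cases} 1, & \text{if } d_i \text{ is associated with the pair } (l_j, m_k) \text{ in } G_4, \\ 0, & \text{otherwise.} \end{cases} \tag{1}$$
To calculate the disease semantic similarity, each disease $d$ can be described by a directed acyclic graph $\mathrm{DAG}(d) = (d, T(d), E(d))$, where $T(d)$ denotes the set of nodes containing the node $d$ itself and its ancestors and $E(d)$ denotes the set of edges of the respective direct links from parent to child nodes [56]. Thereafter, based on the directed acyclic graph $\mathrm{DAG}(d)$, the semantic contribution of an ancestor node $d_s$ to the disease $d$ can be calculated as follows:
$$D_d(d_s) = \begin{cases} 1, & d_s = d, \\ \max\{\Delta \cdot D_d(d') \mid d' \in \text{children of } d_s\}, & d_s \neq d, \end{cases} \tag{2}$$
where $\Delta$ is the semantic contribution decay factor with a value between 0 and 1. According to the experimental results of previous state-of-the-art methods [57,58], the most appropriate value for $\Delta$ is 0.5. Hence, based on the assumption that two diseases with more common ancestor nodes in their DAGs have higher semantic similarity, the semantic similarity between two diseases $d_i$ and $d_j$ can be defined as follows:
$$\mathrm{DisSemSim}(d_i, d_j) = \frac{\sum_{t \in T(d_i) \cap T(d_j)} \left(D_{d_i}(t) + D_{d_j}(t)\right)}{\sum_{t \in T(d_i)} D_{d_i}(t) + \sum_{t \in T(d_j)} D_{d_j}(t)} \tag{3}$$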
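The semantic contribution and semantic similarity described above can be sketched as follows. This sketch assumes the commonly used DAG-based formulation in which a disease contributes 1 to itself, each ancestor receives the largest decayed contribution over all paths leading up to it, and the similarity is the shared contribution divided by the total contribution. The `parents` mapping (term to list of parent terms) and both function names are hypothetical inputs chosen for this example.

```python
def semantic_contributions(disease, parents, delta=0.5):
    """Semantic contribution of every node in DAG(disease) to `disease`.

    `parents` maps each term to its direct parent terms (hypothetical input).
    Each ancestor keeps the largest decayed value over all upward paths.
    """
    contrib = {disease: 1.0}
    frontier = [disease]
    while frontier:
        node = frontier.pop()
        for parent in parents.get(node, ()):
            value = delta * contrib[node]
            if value > contrib.get(parent, 0.0):
                contrib[parent] = value
                frontier.append(parent)  # re-propagate the improved value
    return contrib


def dis_sem_sim(d_i, d_j, parents, delta=0.5):
    """Semantic similarity of two diseases from their shared DAG nodes."""
    c_i = semantic_contributions(d_i, parents, delta)
    c_j = semantic_contributions(d_j, parents, delta)
    shared = c_i.keys() & c_j.keys()
    numerator = sum(c_i[t] + c_j[t] for t in shared)
    denominator = sum(c_i.values()) + sum(c_j.values())
    return numerator / denominator
```

For two sibling diseases that share a single parent and use the decay factor 0.5, each DAG contributes 1.5 in total and the shared parent contributes 0.5 on each side, which gives a similarity of 1/3.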

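Calculation of the Gaussian Interaction Profile Kernel Similarity for Diseases (GIPSim).
Based on the hypothesis that functionally similar genes are often associated with similar diseases, in this section, we adopt the Gaussian Interaction Profile Kernel to calculate the similarity of diseases according to the following steps. Firstly, based on the networks $G_1$ and $G_2$ constructed above, for any given lncRNA $l_i$ and disease $d_j$, we define that
$$Y_1(i, j) = \begin{cases} 1, & \text{if } l_i \text{ is associated with } d_j \text{ in } G_1, \\ 0, & \text{otherwise.} \end{cases} \tag{4}$$
Next, for any given miRNA $m_i$ and disease $d_j$, we define that
$$Y_2(i, j) = \begin{cases} 1, & \text{if } m_i \text{ is associated with } d_j \text{ in } G_2, \\ 0, & \text{otherwise.} \end{cases} \tag{5}$$
Hence, let $IP_l(d_i)$ denote the $i$th column of the matrix $Y_1$, then we can calculate the Gaussian Kernel Similarity between diseases $d_i$ and $d_j$ based on their interaction profiles as follows:
$$\mathrm{GIP}_l(d_i, d_j) = \exp\left(-\gamma_{d1} \left\| IP_l(d_i) - IP_l(d_j) \right\|^2\right), \qquad \gamma_{d1} = 1 \Big/ \left(\frac{1}{n_{d1}} \sum_{k=1}^{n_{d1}} \left\| IP_l(d_k) \right\|^2\right), \tag{6}$$
where the parameter $n_{d1}$ denotes the number of different diseases in $G_1$.
In a similar way, let $IP_m(d_i)$ denote the $i$th column of the matrix $Y_2$, then we can calculate the Gaussian Kernel Similarity between diseases $d_i$ and $d_j$ based on their interaction profiles as follows:
$$\mathrm{GIP}_m(d_i, d_j) = \exp\left(-\gamma_{d2} \left\| IP_m(d_i) - IP_m(d_j) \right\|^2\right), \qquad \gamma_{d2} = 1 \Big/ \left(\frac{1}{n_{d2}} \sum_{k=1}^{n_{d2}} \left\| IP_m(d_k) \right\|^2\right).$$
Here, the parameter $n_{d2}$ denotes the number of different diseases in $G_2$. Thereafter, based on the above formulas, we can calculate the Gaussian Interaction Profile Kernel Similarity between diseases $d_i$ and $d_j$ as follows: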
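A Gaussian interaction profile kernel of this kind can be sketched with NumPy as below. The sketch assumes the widely used form in which the kernel bandwidth is normalized by the mean squared norm of the interaction profiles; the function name `gip_kernel` is hypothetical. Each row of `profiles` is one entity's binary interaction profile, so disease similarities are obtained by passing the transpose of a matrix whose columns are disease profiles.

```python
import numpy as np


def gip_kernel(profiles):
    """Gaussian interaction profile kernel between the rows of `profiles`.

    Assumes the bandwidth is 1 / (mean squared norm of the profiles) and
    that at least one profile is non-zero.
    """
    profiles = np.asarray(profiles, dtype=float)
    sq_norms = (profiles ** 2).sum(axis=1)
    gamma = 1.0 / sq_norms.mean()
    # Squared Euclidean distance between every pair of profiles.
    sq_dist = sq_norms[:, None] + sq_norms[None, :] - 2.0 * profiles @ profiles.T
    return np.exp(-gamma * np.maximum(sq_dist, 0.0))


# Rows are lncRNAs, columns are diseases (toy data for illustration).
Y1 = np.array([[1, 0],
               [1, 1],
               [0, 1]])
disease_kernel = gip_kernel(Y1.T)  # columns of Y1 are the disease profiles
lncrna_kernel = gip_kernel(Y1)     # rows of Y1 are the lncRNA profiles
```

The same function serves lncRNAs and miRNAs, because only the orientation of the association matrix changes.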

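Calculation of the lncRNA Functional Similarity (lncfunSim).
For any two given lncRNAs $l_i$ and $l_j$, let $DT_1 = \{dt_{11}, dt_{12}, \ldots, dt_{1m}\}$ be all the diseases related to $l_i$ in $G_1$ and $DT_2 = \{dt_{21}, dt_{22}, \ldots, dt_{2n}\}$ be all the diseases related to $l_j$ in $G_1$, then we can define the functional similarity between $l_i$ and $l_j$ as follows:
$$\mathrm{lncfunSim}(l_i, l_j) = \frac{\sum_{k=1}^{m} \mathrm{SemSims}(dt_{1k}, DT_2) + \sum_{l=1}^{n} \mathrm{SemSims}(dt_{2l}, DT_1)}{m + n},$$
where
$$\mathrm{SemSims}(dt_{1k}, DT_2) = \max_{1 \le l \le n} \mathrm{DisSemSim}(dt_{1k}, dt_{2l}), \qquad \mathrm{SemSims}(dt_{2l}, DT_1) = \max_{1 \le k \le m} \mathrm{DisSemSim}(dt_{2l}, dt_{1k}).$$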
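The best-match averaging behind this functional similarity can be illustrated as follows. The sketch assumes the usual formulation: each disease of one lncRNA is matched to its most similar disease of the other lncRNA, and the summed best matches are divided by the total number of diseases. The function name `lnc_fun_sim` and the callable `dis_sim` are hypothetical; any disease similarity function can be passed in.

```python
def lnc_fun_sim(diseases_i, diseases_j, dis_sim):
    """Functional similarity of two lncRNAs from their disease sets.

    `dis_sim(a, b)` is any disease similarity function (hypothetical input).
    Returns 0.0 when either lncRNA has no known disease.
    """
    if not diseases_i or not diseases_j:
        return 0.0
    # Best match of every disease of l_i within the diseases of l_j ...
    best_i = sum(max(dis_sim(a, b) for b in diseases_j) for a in diseases_i)
    # ... and of every disease of l_j within the diseases of l_i.
    best_j = sum(max(dis_sim(b, a) for a in diseases_i) for b in diseases_j)
    return (best_i + best_j) / (len(diseases_i) + len(diseases_j))
```

With a toy similarity that returns 1 for identical diseases and 0.5 otherwise, a lncRNA linked to one disease and a lncRNA linked to that disease plus one other score (1 + 1 + 0.5) / 3.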

Calculation of the Gaussian Interaction Profile Kernel Similarity for lncRNAs ($\mathrm{GIP}_{lncSim}$).
For any two given lncRNAs $l_i$ and $l_j$, similar to the definition of formula (6), let $IP(l_i)$ and $IP(l_j)$ denote the $i$th and the $j$th row of the matrix $Y_1$, respectively, then we can calculate the Gaussian Kernel Similarity between lncRNAs $l_i$ and $l_j$ based on their interaction profiles as follows:
$$\mathrm{GIP}_{lncSim}(l_i, l_j) = \exp\left(-\gamma_{l1} \left\| IP(l_i) - IP(l_j) \right\|^2\right), \qquad \gamma_{l1} = 1 \Big/ \left(\frac{1}{n_{l1}} \sum_{k=1}^{n_{l1}} \left\| IP(l_k) \right\|^2\right),$$
where $n_{l1}$ denotes the number of different lncRNAs in $G_1$.
Hence, based on the formulas given above, we can finally define the similarity measurement between lncRNAs $l_i$ and $l_j$ as follows:

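Calculation of the Gaussian Interaction Profile Kernel Similarity for miRNAs ($\mathrm{GIP}_{miRSim}$).
For any two given miRNAs $m_i$ and $m_j$, in a similar way, let $IP(m_i)$ and $IP(m_j)$ represent the $i$th and $j$th row of the matrix $Y_2$, respectively, then we can calculate the Gaussian Kernel Similarity between miRNAs $m_i$ and $m_j$ based on their interaction profiles as follows:
$$\mathrm{GIP}_{miRSim}(m_i, m_j) = \exp\left(-\gamma_{m1} \left\| IP(m_i) - IP(m_j) \right\|^2\right), \qquad \gamma_{m1} = 1 \Big/ \left(\frac{1}{n_{m1}} \sum_{k=1}^{n_{m1}} \left\| IP(m_k) \right\|^2\right),$$
where $n_{m1}$ denotes the number of different miRNAs in $G_2$.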
Hence, based on the formulas presented above, we can finally define the similarity measurement between miRNAs $m_i$ and $m_j$ as follows:
Then, it is obvious that the values in these three kinds of interaction profiles of any novel diseases, lncRNAs, or miRNAs are all zeros, which may lead to unsatisfactory prediction performance when inferring potential associations between diseases and lncRNA-miRNA pairs. Hence, in this section, we perform a procedure for the construction of new interaction profiles to address this problem. In this procedure, for each disease $d_i$, its $K$ nearest known diseases (each with at least one experimentally verified association) and the corresponding $K$ interaction profiles are utilized to obtain the following interaction profile:
$$T_D(d_i) = \frac{1}{Q_d} \sum_{t=1}^{K} w_t \, T(d_t), \tag{17}$$
where $T(d_t)$ denotes the interaction profile of $d_t$ in $T$, $d_1, d_2, \ldots, d_K$ are the diseases sorted in descending order of their similarity to $d_i$, and $w_t$ is the weight coefficient with $w_t = \alpha^{t-1} \cdot \mathrm{disSim}(d_t, d_i)$, which means that a higher weight is assigned if $d_t$ is more similar to $d_i$. The parameter $\alpha$ is a decay term with values between 0 and 1. The parameter $Q_d$ is a normalization term, with $Q_d = \sum_{t=1}^{K} \mathrm{disSim}(d_t, d_i)$.
In the same manner, the new interaction profile for each $l_k$ can be determined as follows:
$$T_L(l_k) = \frac{1}{Q_l} \sum_{t=1}^{K} w_t \, T(l_t), \tag{18}$$
where $l_1, l_2, \ldots, l_K$ are the lncRNAs sorted in descending order of their similarity to $l_k$, and $w_t$ is the weight coefficient with $w_t = \alpha^{t-1} \cdot \mathrm{lncSim}(l_t, l_k)$, which means that a higher weight is assigned if $l_t$ is more similar to $l_k$. The parameter $Q_l$ is a normalization term, with $Q_l = \sum_{t=1}^{K} \mathrm{lncSim}(l_t, l_k)$.
Similarly, the new interaction profile for each $m_p$ can be determined as follows:
$$T_M(m_p) = \frac{1}{Q_m} \sum_{t=1}^{K} w_t \, T(m_t), \tag{19}$$
where $m_1, m_2, \ldots, m_K$ are the miRNAs sorted in descending order of their similarity to $m_p$, and $w_t$ is the weight coefficient with $w_t = \alpha^{t-1} \cdot \mathrm{miRSim}(m_t, m_p)$, which means that a higher weight is assigned if $m_t$ is more similar to $m_p$. The parameter $Q_m$ is a normalization term, with $Q_m = \sum_{t=1}^{K} \mathrm{miRSim}(m_t, m_p)$.
Thereafter, after combining the above three kinds of tensors $T_D$, $T_L$, and $T_M$ obtained from different data spaces and replacing $T(i, j, k) = 0$ with an associated likelihood score, we can update the original adjacency tensor $T$ as follows:
$$T(i, j, k) \leftarrow \begin{cases} T_{DLM}(i, j, k), & \text{if } T(i, j, k) = 0, \\ T(i, j, k), & \text{otherwise,} \end{cases} \tag{20}$$
where $T_{DLM} = (a_1 T_D + a_2 T_L + a_3 T_M) \big/ \sum_{i=1}^{3} a_i$.
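The weighted K-nearest neighbor profile step can be sketched with NumPy as follows. This is an illustrative reading of the procedure, not the authors' code: the function names, the exclusion of an entity from its own neighbor list, and the restriction of neighbors to entities with at least one known association are assumptions of this sketch. Zero entries of the tensor are replaced by the combined likelihood score, as described in the text.

```python
import numpy as np


def wknn_profiles(T, sim, mode, K=3, alpha=0.1):
    """Weighted K-nearest-neighbour profiles along one mode of tensor T."""
    X = np.moveaxis(np.asarray(T, dtype=float), mode, 0)  # entities on axis 0
    sim = np.asarray(sim, dtype=float)
    known = X.reshape(X.shape[0], -1).any(axis=1)  # has >= 1 association
    out = np.zeros_like(X)
    for i in range(X.shape[0]):
        ranked = np.argsort(-sim[i], kind="stable")
        order = [j for j in ranked if j != i and known[j]][:K]
        if not order:
            continue
        s = sim[i, order]
        q = s.sum()  # normalisation term Q
        if q == 0:
            continue
        w = alpha ** np.arange(len(order)) * s  # w_t = alpha^(t-1) * sim
        out[i] = np.tensordot(w, X[order], axes=1) / q
    return np.moveaxis(out, 0, mode)


def complete_tensor(T, sims, K=3, alpha=0.1, a=(1.0, 1.0, 1.0)):
    """Fill the zero entries of T with the combined neighbour-based score."""
    T = np.asarray(T, dtype=float)
    parts = [wknn_profiles(T, sims[m], m, K, alpha) for m in range(3)]
    T_dlm = sum(w * p for w, p in zip(a, parts)) / sum(a)
    return np.where(T == 0, T_dlm, T)
```

Here `sims` is a list of three similarity matrices in the order disease, lncRNA, miRNA, matching the three modes of the tensor.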

PADLMHOOI.
Inspired by the successful application of tensor decomposition in the field of link prediction and of nonnegative matrix factorization methods in inferring disease-miRNA associations, in this section, we propose a novel model called PADLMHOOI to predict new associations between diseases and miRNA-lncRNA pairs. A tensor is a multidimensional array. Currently, the most commonly used tensor decomposition techniques include Tucker decomposition [59], HOSVD [60], and HOOI [61]. In this section, we perform Tucker decomposition on the tensor $T$ constructed above. Assuming $T \in \mathbb{R}^{n_1 \times n_2 \times n_3}$, the Tucker decomposition aims at finding $Z_\alpha$ ($\alpha \in \{1, 2, 3\}$) and a core tensor $G \in \mathbb{R}^{R_1 \times R_2 \times R_3}$ that solve the following optimization problem:
$$\min_{G, Z_1, Z_2, Z_3} \left\| T - G \times_1 Z_1 \times_2 Z_2 \times_3 Z_3 \right\|_F^2 \tag{21}$$
Hence, based on formula (21), we can further transform this equation to the following simple form:
$$\min_{G, Z_1, Z_2, Z_3} \left\| T - ⟦G; Z_1, Z_2, Z_3⟧ \right\|_F^2 \tag{22}$$
Here, $Z_1 \in \mathbb{R}^{n_1 \times R_1}$, $Z_2 \in \mathbb{R}^{n_2 \times R_2}$, and $Z_3 \in \mathbb{R}^{n_3 \times R_3}$ are the factor matrices, which are usually orthogonal and can be considered as the main components of each mode. $R_1$, $R_2$, and $R_3$ are the numbers of columns ($\max(R_1, R_2, R_3) \ll \min(n_1, n_2, n_3)$) in the factor matrices $Z_1$, $Z_2$, and $Z_3$, respectively. The notation $\times_n$ denotes the $n$-mode product; $⟦G; Z_1, Z_2, Z_3⟧$ is the shorthand introduced by Kolda and Gibson [62] (Supplementary File A).
Based on equation (22), the above optimization problem can be solved according to the following steps. Considering that the derivations for $Z_1$, $Z_2$, and $Z_3$ are similar, we only derive the iterative formula of $Z_1$ as an example. Firstly, as illustrated in formula (23), the objective function given in formula (22) can be rewritten in matrix form by unfolding $T$ along the first dimension: where $T_{(1)} \in \mathbb{R}^{n_1 \times (n_2 n_3)}$ is the unfolding of $T$ along the first dimension (Supplementary File A). Assuming that the optimal solution $Z_1$ satisfies all the constraints in equation (22), we have where $\otimes$ denotes the Kronecker product, and moreover, we have Hence, formula (24) can be regarded as a nonnegative matrix factorization (NMF) form [63]. Then, we can finally obtain the solution of $Z_1$ by updating NMF as follows:
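As an illustration of the kind of update referred to here, the following sketch assumes the standard mode-1 unfolding identity of the Tucker model together with the multiplicative NMF update rule. The symbol $S$ and the element-wise operators $\circ$ (product) and $\oslash$ (division) are notation introduced for this sketch only and are not taken from the original derivation.

```latex
\begin{aligned}
T_{(1)} &\approx Z_1 S, \qquad S = G_{(1)} \left(Z_3 \otimes Z_2\right)^{\mathsf{T}}, \\
Z_1 &\leftarrow Z_1 \circ \left(T_{(1)} S^{\mathsf{T}}\right) \oslash \left(Z_1 S S^{\mathsf{T}}\right).
\end{aligned}
```

Under the same assumptions, the updates for the other two factor matrices would follow by unfolding $T$ and $G$ along the second and third dimensions.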
Hence, we can finally obtain the factor matrices $Z_2$ and $Z_3$ in a similar way. Thereafter, while fixing the factor matrices $Z_1$, $Z_2$, and $Z_3$, the objective function in formula (22) can be converted to the following form:
$$\min_{G} \left\| \mathrm{vec}(T) - (Z_3 \otimes Z_2 \otimes Z_1)\, \mathrm{vec}(G) \right\|_2^2 \tag{27}$$
where $\mathrm{vec}(\cdot)$ denotes the vectorization of the tensor. Moreover, based on formula (27), the following linear equation can be obtained:
$$\mathrm{vec}(T) = (Z_3 \otimes Z_2 \otimes Z_1)\, \mathrm{vec}(G) \tag{28}$$
Let $Q = Z_3 \otimes Z_2 \otimes Z_1$, then formula (28) can also be regarded as an NMF, and thereafter, the core tensor in formula (28) can be obtained as follows [63]:
Based on the above formulas, the pseudocode of our prediction model PADLMHOOI based on tensor decomposition can be described as follows:
Step 1. Input $T$, $R_1$, $R_2$, and $R_3$; initialize $Z_1$, $Z_2$, $Z_3$, $G$, and the convergence threshold $\varepsilon$.
Step 2. Repeat: for $i = 1$ to $3$, update $Z_i$; then update $G$; until convergence.
According to the above steps, we can obtain the final predicted disease-lncRNA-miRNA association tensor $T^* = G \times_1 Z_1 \times_2 Z_2 \times_3 Z_3$. After prioritizing the disease-related lncRNA-miRNA pairs based on the entries of the tensor $T^*$, the top-ranked lncRNA-miRNA pairs can be regarded as more likely to be related to the corresponding disease.
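For orientation, the following NumPy sketch implements the textbook higher-order orthogonal iteration with orthonormal factors obtained by SVD and an HOSVD initialization. It is a generic illustration of a Tucker decomposition followed by ranking, not the update rules derived above; the function names `unfold`, `mode_dot`, `hooi`, and `top_pairs` and the stopping rule on the core-tensor norm are assumptions of this sketch.

```python
import numpy as np


def unfold(X, mode):
    """Matricize X along `mode` (column order is irrelevant for the SVD)."""
    return np.moveaxis(X, mode, 0).reshape(X.shape[mode], -1)


def mode_dot(X, M, mode):
    """n-mode product of X with M, where M has shape (J, X.shape[mode])."""
    return np.moveaxis(np.tensordot(M, X, axes=(1, mode)), 0, mode)


def hooi(T, ranks, n_iter=50, tol=1e-6):
    """Textbook HOOI: returns core G, factor list Z, and reconstruction."""
    T = np.asarray(T, dtype=float)
    # HOSVD initialisation: leading left singular vectors of each unfolding.
    Z = [np.linalg.svd(unfold(T, n), full_matrices=False)[0][:, :r]
         for n, r in enumerate(ranks)]
    prev = np.inf
    for _ in range(n_iter):
        for n in range(T.ndim):
            Y = T
            for m in range(T.ndim):
                if m != n:
                    Y = mode_dot(Y, Z[m].T, m)  # project the other modes
            U = np.linalg.svd(unfold(Y, n), full_matrices=False)[0]
            Z[n] = U[:, :ranks[n]]
        G = T
        for m in range(T.ndim):
            G = mode_dot(G, Z[m].T, m)
        fit = np.linalg.norm(G)
        if abs(fit - prev) < tol:
            break
        prev = fit
    T_hat = G
    for m in range(T.ndim):
        T_hat = mode_dot(T_hat, Z[m], m)
    return G, Z, T_hat


def top_pairs(T_hat, disease, k=5):
    """Indices (lncRNA, miRNA) of the k highest scores for one disease."""
    scores = T_hat[disease]
    flat = np.argsort(-scores, axis=None, kind="stable")[:k]
    return [tuple(int(v) for v in np.unravel_index(f, scores.shape))
            for f in flat]
```

Calling `hooi(T, (5, 5, 5))` on a disease × lncRNA × miRNA array and then `top_pairs(T_hat, i)` yields the highest-scoring candidate pairs for disease `i`; the ranks must not exceed the corresponding tensor dimensions.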

Leave-One-Out Cross-Validation (LOOCV).
In order to estimate the prediction performance of our newly proposed model, global leave-one-out cross-validation (LOOCV), 2-fold cross-validation (2-fold CV), and 10-fold cross-validation (10-fold CV) were implemented on PADLMHOOI. In K-fold cross-validation, the initial sample is divided into K subsample sets; a single subsample set is retained as validation data, while the other K − 1 subsample sets are used to train the model. The cross-validation is performed K times so that each subsample set is used for validation exactly once, and the K results are averaged to obtain a single estimate. Moreover, in order to reduce the performance deviation caused by random sample partitioning, we repeated the random partitioning 100 times and then obtained the ROC curve and the AUC value in the same way as in LOOCV. As shown in Table 1, PADLMHOOI achieves AUCs of 0.9545, 0.9730 ± 0.0119, and 0.9626 ± 0.0150 in the frameworks of global LOOCV, 2-fold CV, and 10-fold CV, respectively. Additionally, in order to further estimate the prediction performance of PADLMHOOI, we implemented it under the framework of local LOOCV, and the simulation results for 50 predicted related diseases are given in Supplementary Table 4.
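A global LOOCV loop of this kind can be sketched as follows. The sketch is a simplified, rank-based estimate and is not claimed to reproduce the exact ROC procedure used in the experiments: each known association is hidden in turn, the model is re-run, and the hidden entry is compared with all unknown candidates. The function name `global_loocv_auc` and the `predict` callable (which maps a training tensor to a score tensor of the same shape) are assumptions of this example.

```python
import numpy as np


def global_loocv_auc(T, predict):
    """Rank-based AUC estimate under global leave-one-out cross-validation.

    `predict(train_tensor)` must return a score tensor of the same shape
    (hypothetical interface). T is assumed to contain both 0 and 1 entries.
    """
    T = np.asarray(T, dtype=float)
    candidates = T == 0  # unknown associations serve as negatives
    fold_scores = []
    for idx in map(tuple, np.argwhere(T == 1)):
        held_out = T.copy()
        held_out[idx] = 0.0  # hide one known association
        S = np.asarray(predict(held_out), dtype=float)
        neg = S[candidates]
        better = (S[idx] > neg).mean()
        ties = (S[idx] == neg).mean()
        fold_scores.append(better + 0.5 * ties)
    return float(np.mean(fold_scores))
```

In local LOOCV the comparison would be restricted to the candidates of the disease under investigation, which amounts to masking `candidates` to a single disease slice.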

Performance Comparison with Other Methods.
To the best of our knowledge, PADLMP [39] is so far the only model that has been proposed for predicting potential associations between diseases and lncRNA-miRNA pairs; in it, three kinds of nodes (disease nodes, lncRNA nodes, and miRNA nodes) are considered simultaneously to construct a triple network. The major difference between PADLMP and our model PADLMHOOI is that PADLMP is based on link prediction, whereas PADLMHOOI is based on tensor decomposition. Therefore, in order to compare PADLMP with PADLMHOOI, we implemented LOOCV to verify the prediction performance of these two models based on the 3047 known disease-lncRNA-miRNA associations obtained above. In the first experiment, we set the parameters in PADLMP to their best values; specifically, the step size K is set to 2 and the attenuation coefficient c is set to 0.01. Meanwhile, for convenience, we set the parameters in PADLMHOOI as follows: the parameters $a_1$, $a_2$, and $a_3$ in formula (20) are all set to 1, the parameters $R_1$, $R_2$, and $R_3$ in formula (21) are all set to 5, and the parameters K and α in formulas (17)-(19) are set to 3 and 0.1, respectively. As illustrated in Figure 2, PADLMHOOI and PADLMP achieve AUCs of 0.9545 and 0.9318, respectively, which demonstrates that the prediction performance of PADLMHOOI is superior to that of PADLMP.
Since some of these databases have been updated, in order to further demonstrate the advancement of PADLMHOOI, we collected the latest disease-lncRNA associations from the databases lnc2cancer v2.0, lncRNADisease 2.0 [64], and MNDR v2.0 [48], the latest disease-miRNA associations from the database HMDD v3.0, and the latest lncRNA-miRNA associations from the database RAID v2.0 [65]. Thereafter, we reconstructed the triple network based on these newly collected datasets. In the newly constructed triple network, the numbers of disease nodes, lncRNA nodes, and miRNA nodes are 42, 234, and 251, respectively; the number of known associations between diseases and lncRNA-miRNA pairs is 3,768; the number of known associations between diseases and lncRNAs is 733; and the number of known associations between diseases and miRNAs is 674. Then, based on the new triple network, we compared our model PADLMHOOI with PADLMP once more. In this second experiment, we set the parameters K and α to 10 and 0.5, respectively, in PADLMHOOI and kept the other parameters unchanged from the first experiment. As illustrated in Figure 3, simulation results show that PADLMHOOI and PADLMP achieve AUCs of 0.9026 and 0.9013, respectively, which indicates that PADLMHOOI performs slightly better than PADLMP on this dataset.
Additionally, it is worth noting that our model can also infer potential disease-lncRNA associations and disease-miRNA associations while predicting potential associations between diseases and lncRNA-miRNA pairs. Hence, it is also reasonable to compare PADLMHOOI with prediction models for inferring potential disease-lncRNA or disease-miRNA associations. Therefore, in this section, we compare PADLMHOOI with some state-of-the-art computational prediction models, namely LRLSLDA [26], NBCLDA [25], WBSMDA [66], and RLSMDA [67]. Among them, LRLSLDA is a semisupervised learning-based prediction model for inferring potential lncRNA-disease associations; NBCLDA is a probabilistic model for predicting potential associations between diseases and lncRNAs; WBSMDA is a prediction model for predicting potential associations between diseases and miRNAs; and RLSMDA is a prediction model for predicting disease-related miRNAs based on the framework of regularized least squares. In the comparison with LRLSLDA, known disease-lncRNA associations were obtained from the triple disease-lncRNA-miRNA network, while the parameters of LRLSLDA were set to the values given in the literature. Moreover, NBCLDA involves four kinds of nodes (diseases, lncRNAs, miRNAs, and genes), whereas our model PADLMHOOI involves only three (diseases, lncRNAs, and miRNAs). Hence, for the sake of fairness, we only compared PADLMHOOI with the submethod NBCLDA-GN1-SD. As illustrated in Figure 4, simulation results show that PADLMHOOI, NBCLDA-G1-SD, and LRLSLDA achieve AUCs of 0.9568, 0.7928, and 0.5924, respectively, which demonstrates that PADLMHOOI clearly outperforms both NBCLDA-G1-SD and LRLSLDA. In addition, in the comparison with WBSMDA and RLSMDA, 674 known disease-miRNA associations were obtained from the triple disease-lncRNA-miRNA network, while the parameters of both WBSMDA and RLSMDA were set to the values given in the literature. As illustrated in Figure 5, simulation results show that PADLMHOOI, WBSMDA, and RLSMDA achieve AUCs of 0.9157, 0.8544, and 0.8991, respectively, which demonstrates that PADLMHOOI outperforms both WBSMDA and RLSMDA as well.
[Figure 3 note: Compared with Figure 2, the AUCs of our model decline in Figure 3 because the values of the parameters K and α are different. In Figure 2, K = 3 and α = 0.1, while in Figure 3, K = 10 and α = 0.5.]

Recall Ratio Analysis.
In this section, in order to further evaluate the prediction performance of PADLMHOOI, we compared the recall of PADLMHOOI with that of other state-of-the-art models. A higher recall ratio over all selected diseases in a top-k ranking list means that more positive testing samples (real disease-related lncRNA-miRNA pairs) have been identified successfully. Figure 6 illustrates the recall rate of all selected diseases in different top-k ranking lists. Moreover, we list the recall rate of some given diseases associated with at least 80 verified lncRNA-miRNA associations in Supplementary Table 5.
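The recall ratio in a top-k ranking list can be computed as in the short sketch below. The function name `recall_at_k` and the representation of pairs as tuples are choices made for this example; `ranked_pairs` is assumed to be sorted from the highest to the lowest predicted score.

```python
def recall_at_k(ranked_pairs, true_pairs, k):
    """Fraction of verified pairs that appear among the top-k predictions."""
    true_pairs = set(true_pairs)
    if not true_pairs:
        return 0.0
    hits = sum(1 for pair in ranked_pairs[:k] if pair in true_pairs)
    return hits / len(true_pairs)
```

Averaging this value over all selected diseases for several values of k gives a recall curve of the kind described above.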

Case Studies.
In this section, case studies of breast neoplasms, colon neoplasms, and prostate neoplasms were conducted to further verify the capability of PADLMHOOI to detect novel associations between diseases and lncRNA-miRNA pairs. Among these, breast cancer is the second leading cause of female cancer death and comprises 22% of all cancers in women [68,69]. The related literature has suggested that lncRNAs and miRNAs play an important role in the formation of many diseases, and the formation of breast cancer may be more relevant to them [70,71]. Predicting breast cancer-associated lncRNA-miRNA pairs and identifying lncRNAs and miRNAs as biomarkers may make a significant contribution to better diagnosis and treatment of breast cancer [71]. In Supplementary Table 6, the columns lncRID and miRID denote the lncRNA ID and miRNA ID, respectively. Evi1 and Evi2 denote the authoritative databases or published literature containing verified disease-lncRNA or disease-miRNA associations, respectively. "#" and "*" stand for the databases lncRNADisease and MNDR v2.0, respectively, which consist of known disease-lncRNA associations or contain published literature to support the association between predicted lncRNAs and breast cancer. "!," "&," and "+" stand for the databases HMDD, miR2Disease, and miRCancer, respectively, which consist of known disease-miRNA associations or contain published literature to support the association between predicted miRNAs and breast cancer. In particular, "Nan" indicates that there is no database or published literature to support the predicted results. From Supplementary Table 6, it is easy to see that all candidate disease-lncRNA associations have been verified in the databases lncRNADisease and MNDR v2.0 or in published papers contained in these databases. In addition, 42 out of 50 candidate disease-miRNA associations have been reported by HMDD, miR2Disease, and miRCancer or by published papers contained in these databases. Moreover, we discovered that the novel miRNAs with miRID 35, 51, 73, 164, and 186 are related to some important factors affecting the development of breast neoplasms. Hence, we infer that these lncRNA-miRNA pairs may be associated with breast cancer.
In addition, colonic tumors are a type of malignancy that is common in the rectum and sigmoid borders [72]. Early colon cancer is difficult to detect because of its insignificant symptoms [73]. Unfortunately, the related literature reports that its incidence has been on the rise in recent years [74]. Therefore, predicting potential miRNAs and lncRNAs associated with colon tumors is of great significance for the diagnosis of early colon cancer. In Supplementary Table 7, we have listed the top 30 candidate lncRNA-miRNA pairs predicted to be associated with colon tumors. Moreover, all of these candidate lncRNAs and most of these candidate miRNAs have been verified by the lncRNADisease database and MNDR v2.0, respectively.
Moreover, prostate neoplasm is one of the most common cancers in white and African-American men, and it is reported that about one in six white men and one in five African-American men develop prostate cancer in their lifetime. Recent research has shown that prostate neoplasm is caused by the malignancy of prostate epithelial cells [75] and that its formation involves many factors such as age, family history, and race [76]. In particular, some miRNAs such as hsa-let-7a-5p and lncRNAs such as XIST have been found to be involved in the formation of prostate neoplasms. Hence, it is of interest to infer potential miRNAs and lncRNAs associated with prostate neoplasms. In Supplementary Table 8, we have listed the top 30 prostate neoplasm-related candidate lncRNA-miRNA pairs. Moreover, all of these candidate lncRNAs and most of these candidate miRNAs have been verified by lncRNADisease and MNDR v2.0, respectively.
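The candidate lists used in these case studies can be obtained by ranking, for a fixed disease, all lncRNA-miRNA pairs by their predicted scores. The following is a minimal sketch of such a ranking and not the authors' implementation. It assumes that the predicted scores are stored in a three-order array indexed as (lncRNA, miRNA, disease) and that known associations are excluded from the ranking; the function name `top_k_pairs` and both of these choices are hypothetical.

```python
import numpy as np

def top_k_pairs(scores, known, disease, k=30):
    """Return the k highest-scoring novel (lncRNA, miRNA) pairs for one disease.

    scores : float array of shape (n_lncRNA, n_miRNA, n_disease) (assumed layout)
    known  : binary array of the same shape; 1 marks a known association
    """
    # astype returns a copy, so the caller's score tensor is left unchanged.
    slice_scores = scores[:, :, disease].astype(float)
    # Mask known triples so that only novel candidates are ranked (assumption).
    slice_scores[known[:, :, disease] == 1] = -np.inf
    # Sort the flattened slice in descending order and keep the first k entries.
    flat_order = np.argsort(slice_scores, axis=None)[::-1][:k]
    lnc_idx, mi_idx = np.unravel_index(flat_order, slice_scores.shape)
    # Drop masked entries in case k exceeds the number of novel pairs.
    return [(int(i), int(j), float(slice_scores[i, j]))
            for i, j in zip(lnc_idx, mi_idx)
            if np.isfinite(slice_scores[i, j])]
```

Each returned tuple holds the lncRNA index, the miRNA index, and the predicted score, so the indices can then be mapped to the lncRID and miRID columns of the supplementary tables and checked against the databases.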

Parameter Sensitivity Analysis.
Considering that some key parameters, such as K and α, may significantly affect the performance of our prediction model PADLMHOOI, in this section we further estimate the effects of these parameters on its prediction performance. First, we varied K from 1 to 10 during simulation, and Table 2 illustrates the impact of K on the performance of PADLMHOOI. From Table 2, it can be seen that PADLMHOOI achieves the maximum AUC value of 0.9708 when K = 8. As for the parameter α, considering the time cost, we set K = 3 and varied α from 0.1 to 0.9 during simulation, and Table 3 illustrates the impact of α on the performance of PADLMHOOI. From Table 3, it can be seen that PADLMHOOI achieves the maximum AUC value of 0.9591 when α = 0.7.
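The procedure above is a one-at-a-time sweep, in which one parameter is varied while the other is held fixed. A minimal sketch is given below under stated assumptions: `evaluate_auc(K, alpha)` is a hypothetical callable that runs the cross validation and returns an AUC, and `fixed_alpha` (the value of α used while K is varied) is an assumed argument, since that value is not stated here.

```python
def one_at_a_time_sweep(evaluate_auc, k_values, alpha_values, fixed_alpha, fixed_k):
    """Vary K with alpha fixed, then vary alpha with K fixed.

    evaluate_auc(K, alpha) is a hypothetical callable returning an AUC value;
    it stands in for a full cross-validation run of the prediction model.
    """
    # AUC for each K, with alpha held at fixed_alpha (assumed value).
    k_results = {k: evaluate_auc(k, fixed_alpha) for k in k_values}
    # AUC for each alpha, with K held at fixed_k.
    alpha_results = {a: evaluate_auc(fixed_k, a) for a in alpha_values}
    best_k = max(k_results, key=k_results.get)
    best_alpha = max(alpha_results, key=alpha_results.get)
    return best_k, best_alpha, k_results, alpha_results

# Parameter grids matching the ranges described above.
K_VALUES = list(range(1, 11))                 # K = 1, ..., 10
ALPHA_VALUES = [i / 10 for i in range(1, 10)]  # alpha = 0.1, ..., 0.9
```

A call such as `one_at_a_time_sweep(evaluate_auc, K_VALUES, ALPHA_VALUES, fixed_alpha, fixed_k=3)` would reproduce the structure of Tables 2 and 3. Because the two parameters are not varied jointly, such a sweep does not guarantee that the best combination of K and α is found.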

Discussion and Conclusion
Research on the prediction of potential associations between lncRNA-miRNA pairs and diseases not only is helpful in understanding disease mechanisms at the lncRNA and miRNA levels but also plays an important role in the detection of disease biomarkers, diagnosis, prognosis, and prevention. However, to our knowledge, although many studies have demonstrated that lncRNA-miRNA interactions are associated with the development of complex diseases, few models have so far been proposed for large-scale prediction of potential associations between diseases and lncRNA-miRNA pairs. Since traditional biological experiments are quite expensive and time-consuming, in this paper, based on the existing disease-miRNA associations, disease-lncRNA associations, lncRNA-miRNA interactions, and the assumption that genes with similar functions are often associated with similar diseases, we first constructed a three-order tensor T by adopting the method of WKNNP, and then, based on the method of tensor factorization, we proposed a prediction model called PADLMHOOI to infer potential relations between diseases and lncRNA-miRNA pairs. Thereafter, simulation results under the frameworks of global and local LOOCV, 2-fold CV, and 10-fold CV all confirmed the superiority of PADLMHOOI. Moreover, case studies of breast neoplasms, colon neoplasms, and prostate neoplasms further demonstrated that PADLMHOOI is an effective method for predicting potential disease-associated lncRNA-miRNA pairs. Certainly, there are still some limitations in PADLMHOOI. For example, although a large number of datasets have been integrated in PADLMHOOI, the amount of available data is still not enough, and the prediction performance of PADLMHOOI is expected to improve if more datasets can be collected. In addition, in this paper, we only predicted the association between a disease and a single lncRNA-miRNA pair. 
In the future, we will further modify PADLMHOOI to predict potential associations between diseases and multiple lncRNA-miRNA pairs.
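For readers unfamiliar with the tensor factorization step, the following is a minimal NumPy sketch of the textbook higher-order orthogonal iteration (HOOI) for a Tucker decomposition. It is an illustration under stated assumptions and not necessarily the exact variant used in PADLMHOOI: the function names, the HOSVD initialisation, the fixed iteration count `n_iter`, and the target `ranks` are all hypothetical choices.

```python
import numpy as np

def unfold(tensor, mode):
    """Mode-n unfolding: move axis `mode` to the front and flatten the rest."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def mode_dot(tensor, matrix, mode):
    """Mode-n product: multiply `tensor` by `matrix` along axis `mode`."""
    moved = np.moveaxis(tensor, mode, -1)          # contracted axis goes last
    return np.moveaxis(moved @ matrix.T, -1, mode)

def leading_vectors(matrix, r):
    """First r left singular vectors of `matrix` (orthonormal columns)."""
    return np.linalg.svd(matrix, full_matrices=False)[0][:, :r]

def hooi(tensor, ranks, n_iter=20):
    """Textbook HOOI: returns a core tensor and one orthonormal factor per mode."""
    # HOSVD initialisation from the unfoldings of the input tensor.
    factors = [leading_vectors(unfold(tensor, m), r) for m, r in enumerate(ranks)]
    for _ in range(n_iter):
        for m, r in enumerate(ranks):
            # Project the tensor onto every factor subspace except mode m.
            projected = tensor
            for other, U in enumerate(factors):
                if other != m:
                    projected = mode_dot(projected, U.T, other)
            # Update factor m from the projected tensor's mode-m unfolding.
            factors[m] = leading_vectors(unfold(projected, m), r)
    # Core tensor: the input projected onto all factor subspaces.
    core = tensor
    for m, U in enumerate(factors):
        core = mode_dot(core, U.T, m)
    return core, factors

def reconstruct(core, factors):
    """Low-rank approximation obtained from the core tensor and the factors."""
    out = core
    for m, U in enumerate(factors):
        out = mode_dot(out, U, m)
    return out
```

In a setting like the one described here, the entries of `reconstruct(core, factors)` would play the role of predicted association scores for (lncRNA, miRNA, disease) triples. Each rank in this sketch must not exceed the size of its mode or the product of the other two ranks.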

Abbreviations
PADLMHOOI: Prediction of potential associations between diseases and lncRNA-miRNA pairs based on the higher-order orthogonal iteration.

Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest
The authors declare no conflicts of interest.
Supplementary 7. Table 6: the candidate lncRNA-miRNA pairs associated with breast cancer. In addition, the LncRNADisease and MNDR v2.0 databases have confirmed that these lncRNAs or miRNAs are associated with breast cancer.
Supplementary 8. Table 7: the candidate lncRNA-miRNA pairs associated with colon cancer. In addition, the LncRNADisease and MNDR v2.0 databases have confirmed that these lncRNAs or miRNAs are associated with colon cancer.
Supplementary 9. Table 8: the candidate lncRNA-miRNA pairs associated with prostate cancer. In addition, the LncRNADisease and MNDR v2.0 databases have confirmed that these lncRNAs or miRNAs are associated with prostate cancer.