A Cluster and Process Collaboration-Aware Method to Achieve Service Substitution in Cloud Service Processes

Some cloud services may become invalid because they are located in a dynamically changing network environment. Service substitution is necessary when a cloud service cannot be used. Existing work mainly focuses on service function and quality in service substitution. To select a more suitable substitutive service, process collaboration similarity also needs to be considered. This paper proposes a cluster and process collaboration-aware method to achieve service substitution. To compute the process collaboration similarity, we use logic Petri nets to model service processes. All the service processes are transformed into path strings. Service vectors for cloud services are generated by Word2Vec from these path strings. The process collaboration similarity of two cloud services is obtained by computing the cosine value of their service vectors. Meanwhile, similar cloud services are grouped into a service cluster. By calculating function similarity and quality matching, a candidate set for service substitution is generated. The service in the candidate set with the highest process collaboration similarity to the invalid one is chosen as the substitutive one. Simulation experiments show that the proposed method is less time-consuming than traditional methods in finding a substitutive service. Meanwhile, the substitutive service has a high cooccurrence rate with the neighboring services of the invalid cloud service. Thus, the proposed method is efficient and integrates process collaboration well in service substitution.


Introduction
With the promotion of cloud computing applications, a variety of cloud services with different functions are quickly registered in various cloud computing platforms [1]. Users can easily search for and lease the cloud services they need on these platforms. For example, "casicloud.com" is a cloud manufacturing platform providing manufacturing services. By the end of August 2019, the website hosted nearly 950,000 cloud services, and more than 8800T of industrial data had been handled by these services [2]. A new business application can be easily built by invoking these cloud services. To effectively address complicated service requests, we can assemble a group of cloud services into a composed service process with a specific business flow [3].
Service invocation is the most popular way to integrate existing cloud services in network-based software systems [4]. It can greatly reduce the time cost of constructing a new business system. To integrate a cloud service, an appropriate service which can properly respond to the service request needs to be selected. Since there are a large number of cloud services in cloud platforms, service discovery is a time-consuming process. Current service discovery methods face a large searching space, and their search processes are tedious and inefficient [5][6][7].
In the complicated and precarious network environment, some cloud services may become invalid while they are being invoked [8]. A substitutive service needs to be found once any of the component services becomes unavailable in a service-oriented business system [9]. Many service replacement methods are inefficient. The main reason for the low efficiency is that service substitution faces a large searching space. The existing work mainly focuses on service function and quality in service substitution. These methods can find a service to replace the invalid one, and the substitutive service is equivalent in service function and quality. However, it may not be able to cooperate with other services as well as the invalid one did. The main reason is that process collaboration is not considered in service substitution [10].
Aiming at finding a substitutive service quickly and more reasonably, we propose a cluster and process collaboration-aware method to achieve service substitution. We cluster cloud services with the same or similar functions as a group, named a service cluster. The clustering mechanism reduces the service searching space and thereby improves the efficiency of service discovery and substitution. We also take process collaboration of the component services into consideration. The candidate service with the highest process collaboration similarity to the invalid one is recommended for service substitution. The main contributions of this paper are as follows:
(1) We introduce a clustering mechanism to reduce the service searching space. The efficiency of service discovery and substitution is markedly increased.
(2) A method to evaluate process collaboration similarity is proposed. Service processes are transformed into path strings. We train service vectors for cloud services by Word2Vec based on these path strings. Then, process collaboration similarity is obtained by computing the cosine value of service vectors.
(3) Service function, quality, and process collaboration are comprehensively considered to achieve service substitution. In our simulation experiments, the proposed cluster and process collaboration-aware method outperforms the existing methods it is compared with in service substitution.

The rest of this paper is organized as follows: Section 2 introduces the related work about service substitution; the concept of service cluster and service response schema based on service clusters is presented in Section 3; how to substitute cloud service based on the cluster and process collaboration-aware method is proposed in Section 4; Section 5 presents simulation experiments; and Section 6 concludes this work.

Related Work
Finding an appropriate service for the invalid one is a key task in service substitution. Thus, the existing service discovery methods offer an important reference for research on service substitution. Cheng presents a diversified keyword search approach on service connection graphs. This method can satisfy the various possible requirements underlying a given keyword query [11]. Zhang defines a service composition context model based on three types of parameter correlations between service input and output parameters. The similarity between any two services is measured using the PersonalRank and SimRank++ algorithms on the composition context model [12].
Chen proposes a new measure of semantic similarity integrating multiple conceptual relationships for web service discovery. The new measure enables more accurate service-request comparison by treating different conceptual relationships in ontologies, such as is-a, has-a, and antonymy, differently [13]. A comprehensive ontology has been developed to provide a standardized semantic specification of cloud services based on their functional and nonfunctional features in [7]. The authors present an intelligent cloud service discovery framework based on these ontology concepts to identify cloud services. The average expected error in identifying a service is 11% with the proposed framework, compared to 31% with the cloud service discovery solution. The hierarchical Dirichlet processes (HDP) model and the personalized PageRank algorithm are used to build a two-stage model for cloud service recommendation by integrating the information of service descriptive texts and service tags [14]. Nabli proposes a self-adaptive semantic focused crawler based on latent Dirichlet allocation (LDA) for efficient cloud service discovery [15]. A method to learn features from service descriptions by using variational autoencoders is proposed by Lizarralde. It achieves significant gains compared to both word embeddings and classic latent feature modelling techniques [16]. The above methods are the latest service discovery methods from the last three years. We can see that service context, comprehensive ontologies about cloud services, and vector-based service similarity calculation receive the most attention in service discovery.
In the domain of service substitution, researchers have also presented some effective methods. For example, Gong employs a cloud model to compute the QoS uncertainty to determine dynamic substitute targets. By targeting substitutions, the reconfigured web service will better satisfy users' requirements [17]. Three rules are provided to establish the compatibility and substitution of service operation interfaces [18]. The experiments show that service substitute identification based on the proposed framework achieves a best precision of 85%. By recording execution context data and mining the execution context conditions, an execution context-aware approach for web service substitution is proposed in [19].
Santhanam uses preference networks to represent and reason about preferences over nonfunctional properties in service substitution [20]. The proposed method is independent of the specific formalism used to represent functional requirements of a composite service as well as the specific algorithm used to assemble the composite service. By computing the similarity degree between interface data and analyzing critical paths, Gao presents a method to check the data consistency for the dynamic replacement of a service process [21]. This method provides fundamental theoretical guidance to enhance the credibility of the service process in the modern service industry. In recent research work, Sara et al. presented a similarity network for the substitution of web service operations [22]. The nodes represent the operations of the web services. A link joins two similar operations according to some relationships defined between them. According to the authors, the constituted network responds to substitution requests better and much more easily than existing works.

Scientific Programming

The aforementioned methods must go through a large number of cloud services to find the substitutive one in service substitution. Most of the previously mentioned methods are time-consuming. In [23], Wu proposes deploying a web service cluster to perform service substitution. A service cluster contains a logic service and a set of concrete services, and these concrete services have functional equivalence or compatibility. Du converts a service cluster into a service cluster net unit, which is used to analyze whether the services in the cluster can satisfy some service requests [24]. However, service clusters in their methods are restricted to services with the same interfaces. They can reduce the searching space, but the flexibility of substitution still needs improvement.
The existing work mainly focuses on service function and quality in service substitution. Response time and substitution recall rate are the main evaluation indexes. Some researchers also give theoretical analysis to prove the feasibility and effectiveness of their approach. Few studies have taken process collaboration into consideration. The substitutive service will cooperate better with other services once process collaboration relations are added to service substitution. In this paper, we introduce process collaboration similarity into service substitution and investigate the service cooccurrence rate to show the benefits of our method.

Similarity Computation of Process Collaboration
A service process is composed of several cloud services. These cloud services cooperate to accomplish service requests from tenants. We can obtain the collaboration similarity from the existing service processes.
There are two factors that determine the collaboration similarity of cloud services: one is the cooccurrence rate, and the other is the process distance. Two cloud services have a higher collaboration similarity if they appear together more often than other services. Moreover, two services also have a higher collaboration similarity if the process distance between them is smaller.

Path Strings of Service Processes.
In this study, we first convert service processes into path strings. Then, we train service vectors for all the cloud services in these path strings by Word2Vec. Finally, we compute the collaboration similarity based on these service vectors. To obtain path strings, we use service nets to formally model cloud service processes. In service nets, a logic Petri net is employed to describe the business flow. Now, we give the definition of a logic Petri net.
... 1} is a marking function; ∀p ∈ P, M(p) denotes the token count in p
(6) Transition firing rules: (a) ∀t ∈ T_D, the firing rules of t are the same as in ...
(1) LPN is the process model of a service process, where T_D denotes the component services. P = P_c ∪ P_d, where P_d is a set of data places interacting with the external services and P_c is a set of control places representing the states of the service process
(2) i is the initial place and o is the terminal place of a service process, with ...
To get the preset and postset of x, we introduce two operations, π and τ, to compute •x and x•. In this study, π ...
... f_n} is a group of service nets, and t_o and t_i are the logic transitions to link F, i.e., ∀f_j: ...
The paradigm of logic expressions is defined as follows: ...

A service net for online shopping is provided in Figure 1. The service process described by this service net starts by inquiring about some merchandise. If the query fails, a service is invoked to show failure information. If the online seller can provide the merchandise, another service process, which purchases the merchandise, is triggered. Online trade supports both payment before receipt and receipt before payment, so two subprocesses are presented concurrently. One is "reserve-defray-delivery," and the other is "reserve-delivery-defray." However, the logic expression labelled on transition t_o′ is (p_5 ∧ ¬p_6) ∨ (¬p_5 ∧ p_6), which means that only one of the places p_5 and p_6 can be assigned a token. Thus, only one of the two subprocesses can be performed. Similarly, t_i′ is labelled with the logic expression p_11 ∨ p_12, which means that p_13 can get a token once p_11 or p_12 has obtained a token. How to construct service nets for service processes can be found in our previous work [26].
To compute service vectors for the component services in the service processes, we convert the paths of service nets into strings, named path strings. A large number of path strings can be acquired from the existing service nets. The symbols in these path strings form a corpus. Then, Word2Vec is employed to train service vectors for the component services on the path strings mapped from the service nets. Finally, we obtain the collaboration similarity of two cloud services by computing the cosine similarity of their service vectors. An introduction to Word2Vec can be found in [27]. The following section presents how to generate path strings and gives an algorithm to compute a service vector for each cloud service.
There are four types of basic process structures in the service nets: sequence, choice, parallel, and loop. The concept of a "one-fold service process" is proposed in our previous work [25]. In a one-fold service process, logic expressions labelled on logic transitions in the choice and parallel structures must strictly follow Definition 4. Meanwhile, nesting structures must not appear in a one-fold structure. Four types of basic one-fold structures are illustrated in Figure 2. A merge-reduce method is introduced to generate a path string in this study. In the merging phase, all basic one-fold structures are mapped into path strings. The one-fold structures (a), (b), (c), and (d) can be merged as the path strings "...". There are very few one-fold structures in the service nets in practice. If a transition t_i is replaced by a service process sp_i, we should first merge all the transitions in sp_i and then replace t_i by the path string generated from sp_i. In the reducing phase, we link all the path strings obtained from the four types of process structures to generate a path string for a service net. The path string of the service net in Figure 1 is generated as "t_1 t_o t_2 || t_o′ t_3 t_4 t_5 || t_3 t_5 t_4 || t_i′ || t_i" by this method. Here, service names have been mapped into symbols as {t_1: query, t_2: query fail, t_3: reserve, t_4: defray, t_5: delivery}. Details about how to obtain the path string for a service net are presented in Algorithm 1.
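As a rough illustration of how a path string can be turned into corpus entries, the following sketch splits the Figure 1 path string into symbol sequences. Treating each "||"-delimited segment as one token sequence is our assumption for illustration only, and the function name `path_string_to_sentences` is hypothetical; the symbols are written as plain tokens (t1, to, to', ...).

```python
from typing import List


def path_string_to_sentences(path_string: str, separator: str = "||") -> List[List[str]]:
    """Split a path string into token lists, one per segment.

    Assumption: segments are delimited by `separator`, and symbols inside a
    segment are separated by whitespace.
    """
    sentences = []
    for segment in path_string.split(separator):
        tokens = segment.split()
        if tokens:  # skip empty segments
            sentences.append(tokens)
    return sentences


# Path string of the service net in Figure 1, written with plain tokens.
figure1_path = "t1 to t2 || to' t3 t4 t5 || t3 t5 t4 || ti' || ti"
sentences = path_string_to_sentences(figure1_path)
```

Token lists of this kind are the usual input shape for word-embedding trainers, which only look at which symbols appear near each other.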

Computation of Process Collaboration Similarity.
Given a group of cloud services, process collaboration similarity is used to evaluate to what extent two cloud services can cooperate with the same other services. Normally, a high process collaboration similarity between two cloud services means that they share more partner cloud services in service processes. Since service vectors for cloud services can be trained from the path strings, the process collaboration similarity of two cloud services can be obtained by computing the cosine similarity of their service vectors.
Assume there are two resource pools: the cloud service clusters pool (CSCP) and service net pool (SNP). All the service clusters and cloud services are organized in CSCP. Meanwhile, the existing cloud service processes have been transformed into service nets and stored in SNP. Algorithm 2 provides a method to generate word vectors and service vectors for cloud services and service clusters in CSCP.
In Algorithm 2, we first construct two corpora, CP_1 and CP_2. CP_1 consists of the description sentences of all the cloud services. All the path strings of service nets are gathered in CP_2 (lines 1 to 10). CP_1 is used to train the word vectors that relate a cloud service to a service cluster when finding the candidate service set (see lines 11 to 13). In line 14, CP_2 is used to train service vectors for cloud services. Since CP_2 consists of the path strings, the service vectors trained on CP_2 can be adopted to calculate the collaboration similarity.
Definition 5 (process collaboration similarity)
Assume S is a set of cloud services. Let PS be the set of path strings of all the services in S. For two services S_i and S_j in S, P_i and P_j are the service vectors of S_i and S_j, trained on the corpus PS. The collaboration similarity of S_i and S_j is defined as the cosine of their service vectors, $\cos(P_i, P_j) = \frac{P_i \cdot P_j}{\lVert P_i \rVert \, \lVert P_j \rVert}$.

Notice that we omit the semantics of the symbols in path strings, and only the positional adjacency of different symbols is considered when training the vectors. Thus, we use the serial numbers of cloud services to generate path strings in practice.
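The cosine computation in Definition 5 can be sketched in a few lines. The snippet below is a minimal illustration that assumes the service vectors have already been trained; the three-dimensional toy vectors and the service identifiers are made up, and returning 0.0 for a zero vector is a convention chosen here.

```python
import math
from typing import Sequence


def cosine_similarity(p_i: Sequence[float], p_j: Sequence[float]) -> float:
    """Cosine of the angle between two service vectors of equal dimension."""
    if len(p_i) != len(p_j):
        raise ValueError("service vectors must have the same dimension")
    dot = sum(a * b for a, b in zip(p_i, p_j))
    norm_i = math.sqrt(sum(a * a for a in p_i))
    norm_j = math.sqrt(sum(b * b for b in p_j))
    if norm_i == 0.0 or norm_j == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_i * norm_j)


# Toy vectors keyed by hypothetical service serial numbers; real service
# vectors would come from Word2Vec training on the path strings.
service_vectors = {
    "17": [1.0, 2.0, 0.0],
    "42": [2.0, 4.0, 0.0],
    "99": [0.0, 0.0, 3.0],
}
```

Services 17 and 42 point in the same direction and therefore have a similarity close to 1, while 17 and 99 are orthogonal and have similarity 0.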

Service Substitution Based on Clustering and Process Collaboration-Aware Method
In this section, we first introduce the concept of service cluster, present the service response schema based on service clusters, and then propose the cluster and process collaboration-aware method to achieve service substitution.

Service Response Schema Based on Service Clusters.
Algorithm 1: PathString_Generate
Input: service net SN;
Output: path string of SN;
...
obtain a place p in τ(t_n) and build service net SubSN_j with SubSN_j.i = p;
(11) sp_j = PathString_Generate(SubSN_j);
(12) if (I(t_n) = O_∨) PS = PS + sp_j + ||;
(13) if (I(t_n) = O_∧) PS = PS + sp_j + ⊗;
(14) End for
(15) if (t_c ∈ SN.T_s ∧ t_n ∈ SN.T_O ∧ j == m) PS = PS + t_n;
(16) t_c = t_n; t_n = t(τ(t_c)); }
(17) return (PS);

Some similar definitions to describe a group of web services have been put forward in the existing research, such as service pool [28], service class [29], and service cluster [26,30]. Cloud services in the above concepts are required to have the same input and output parameters. Thus, they have little flexibility in service substitution because they can only achieve service migration between services with the same interfaces.
In this paper, we do not require all the cloud services in a service cluster to have the same interfaces. Cloud service and service cluster are formally defined as follows.

Definition 6 (cloud service)
A cloud service is a 6-tuple Cls = (N, D, I, O, Q, L), where
(1) N is the serial number of the cloud service in the cloud service platform
(2) D is a function description of the cloud service
(3) I and O are the sets of input and output parameters, respectively
(4) Q is a set of quality parameters
(5) L is the URI of the cloud service
The function description of a cloud service is defined as D = <O_p, T_h, F_t>. Here, O_p, T_h, and F_t are the operation, theme, and function text of a cloud service, respectively. For example, a weather forecast service is set as D = <query, weather, "the service can provide the weather forecast, users present the city and date, and then, the service can return temperature, humidity, ultraviolet intensity, and wind speed.">.
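To make the 6-tuple concrete, the following sketch encodes Definition 6 as Python data classes and instantiates the weather forecast example. The class and field names are ours, and the serial number, quality parameter, and URI values are placeholders invented for illustration.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple


@dataclass(frozen=True)
class FunctionDescription:
    operation: str      # O_p
    theme: str          # T_h
    function_text: str  # F_t


@dataclass(frozen=True)
class CloudService:
    number: int                       # N: serial number in the platform
    description: FunctionDescription  # D
    inputs: FrozenSet[str]            # I
    outputs: FrozenSet[str]           # O
    quality: Tuple[Tuple[str, str, float, str], ...]  # Q: tuples (n, c, v, u)
    uri: str                          # L


weather = CloudService(
    number=1,  # placeholder serial number
    description=FunctionDescription(
        operation="query",
        theme="weather",
        function_text="the service can provide the weather forecast",
    ),
    inputs=frozenset({"city", "date"}),
    outputs=frozenset({"temperature", "humidity", "ultraviolet intensity", "wind speed"}),
    quality=(("response time", "<=", 2.0, "s"),),  # placeholder quality parameter
    uri="https://example.org/weather",             # placeholder URI
)
```

Using frozen sets for I and O keeps the later subset checks on interfaces a one-line operation.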
As is well known, service quality is an important factor in evaluating a cloud service. Many quality attributes are common to cloud services, such as response time, cost, and reliability. Besides, there may be other quality attributes related to the practical application domain of cloud services. For example, the manufacturing cycle and the level of after-sales service are of greater concern to tenants in cloud manufacturing.
Here, all these attributes are defined as quality parameters. We formally define them as Q = {q_i}, q_i = (n, c, v, u), where n is the name of the quality parameter, c is a comparison operator, v is the value of the parameter, and u is the unit of the quality parameter. If a cloud manufacturing service is assigned q = (manufacturing cycle, <=, 5, day), it means that the manufacturing cycle is no longer than five days.

Figure 3 shows the architecture of the service response schema based on service clusters. Cloud services published by service providers are stored in the physical resource layer. A service cluster is a mapping collection of these services, and all the service clusters constitute the virtual resource layer [23]. The tenant request is modelled and submitted in the business model layer. It can be answered in two ways: by a single service or by a service composition. To respond to a tenant request when a service is invalid, we can find another service in its corresponding service cluster to substitute the invalid one. In the majority of cases, the searching space is the size of the corresponding service cluster; thus, the efficiency of service substitution can be greatly improved.
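The quality parameter tuple q = (n, c, v, u) introduced in this section can be checked mechanically. The sketch below is one possible encoding; the set of supported comparison operators and the helper name `satisfies` are assumptions made for illustration.

```python
import operator
from typing import Callable, Dict, NamedTuple


class QualityParameter(NamedTuple):
    name: str        # n
    comparator: str  # c
    value: float     # v
    unit: str        # u


# Assumed set of comparison operators; the text only shows "<=".
_COMPARATORS: Dict[str, Callable[[float, float], bool]] = {
    "<": operator.lt,
    "<=": operator.le,
    "==": operator.eq,
    ">=": operator.ge,
    ">": operator.gt,
}


def satisfies(param: QualityParameter, observed: float) -> bool:
    """Return True if `observed` (expressed in param.unit) meets the constraint."""
    return _COMPARATORS[param.comparator](observed, param.value)


# The manufacturing cycle example: no longer than five days.
cycle = QualityParameter("manufacturing cycle", "<=", 5, "day")
```

For instance, `satisfies(cycle, 4)` holds, whereas an observed cycle of six days violates the constraint.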

Service Substitution.
A cluster and process collaboration-aware method is proposed to achieve service substitution in this paper. The method can be divided into two steps:
(1) We find a candidate service set for substitution based on service clusters. All these candidate services can replace the invalid one in terms of service function and quality.
(2) We compute the vector similarity between the candidate services and the invalid one so as to obtain the collaboration intensity. By comprehensive consideration of function, quality, and collaboration, the cloud service with the highest similarity to the invalid one is selected to perform service substitution.

The functional similarity of S_1 and S_2 is defined as FuncSim(S_1, S_2). FuncSim(S_1, S_2) = (1/2) * ((W_Op1 ... Two cloud services S_1 and S_2 are called functionally equivalent if FuncSim(S_1, S_2) ≥ δ. Here, δ is a threshold value. Meanwhile, the functional equivalence of two cloud services is denoted by S_1 ↔ S_2.
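Because the FuncSim formula is only partly legible, the following sketch should be read as one plausible instantiation rather than the formula itself. It assumes that FuncSim is the mean of two cosine similarities, one between the word vectors of the two operation terms and one between the word vectors of the two theme terms. The toy word vectors and the function names are hypothetical; δ = 0.85 is the threshold selected in the experiments.

```python
import math
from typing import Dict, Sequence


def _cosine(u: Sequence[float], v: Sequence[float]) -> float:
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def func_sim(op1: str, th1: str, op2: str, th2: str,
             word_vectors: Dict[str, Sequence[float]]) -> float:
    """ASSUMPTION: mean of the operation-term and theme-term cosine similarities."""
    op_sim = _cosine(word_vectors[op1], word_vectors[op2])
    th_sim = _cosine(word_vectors[th1], word_vectors[th2])
    return 0.5 * (op_sim + th_sim)


def functionally_equivalent(similarity: float, delta: float = 0.85) -> bool:
    """S_1 <-> S_2 holds when FuncSim(S_1, S_2) >= delta."""
    return similarity >= delta


# Toy 2-dimensional word vectors; real ones would be trained by Word2Vec.
toy_vectors = {
    "query": [1.0, 0.0],
    "search": [0.8, 0.6],
    "weather": [0.0, 1.0],
    "forecast": [0.0, 2.0],
}
sim = func_sim("query", "weather", "search", "forecast", toy_vectors)
```

With these toy vectors the operation terms have similarity 0.8 and the theme terms 1.0, so the sketch yields about 0.9, which passes the 0.85 threshold.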

Definition 9 (parameter compatibility)
Px and Py are two parameters. Parameter compatibility of Px and Py is the replaceable degree of Px and Py, denoted as PC (Px, Py).
Parameter compatibility is used to evaluate whether two groups of parameters can replace each other. It is divided into three levels in this study. To differentiate the levels, we introduce three functions, Num(P), type(P), and value(P), to represent the amount, type, and value of parameter P, respectively. The partition rules of parameter compatibility are formally described as follows: ...

Definition 10 (quality score)
S = {cls_1, cls_2, ..., cls_m} is the set of component cloud services in a service cluster. Assume that each service in S has n quality parameters, i.e., <q_i1, q_i2, ..., q_in> are the quality parameters of cls_i. The quality score of cls_i is defined as Qscore(cls_i): ...

Quality parameters of a cloud service can be divided into two types: positive parameters and negative parameters. A positive parameter indicates higher quality when it is assigned a bigger value. On the contrary, a negative parameter indicates lower quality when it is assigned a bigger value. Formula (2) is adopted to scale positive parameters, while formula (3) is utilized to scale negative parameters. The quality score can be computed by formula (1) after all quality parameters have been normalized.

In Algorithm 3, CSCP is a cloud service cluster pool. All the cloud services and service clusters are stored in CSCP. In line 1, we initialize two empty sets, CS_R and cs_r. All the cloud service clusters which can provide a similar function are enrolled into CS_R. The candidate service set for substitution is represented as cs_r. By traversing CSCP, we obtain every cloud service cluster in line 2. The functional similarity between each service cluster and the invalid service Se is computed, and the service cluster is added to CS_R if the function similarity reaches the threshold δ.
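Formulas (1) to (3) are not legible in this text, so the sketch below only illustrates the idea of scaling positive and negative parameters before scoring. It assumes min-max scaling within the cluster and an unweighted mean as the quality score; both choices, as well as the function names and the sample values, are our assumptions.

```python
from typing import List, Sequence


def _min_max(values: Sequence[float], positive: bool) -> List[float]:
    """Scale one quality parameter to [0, 1] across the services of a cluster."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [1.0 for _ in values]  # all services tie on this parameter
    if positive:
        return [(v - lo) / (hi - lo) for v in values]  # bigger is better
    return [(hi - v) / (hi - lo) for v in values]      # smaller is better


def quality_scores(matrix: Sequence[Sequence[float]],
                   positive: Sequence[bool]) -> List[float]:
    """matrix[i][k] is the value of quality parameter k for service cls_i.

    ASSUMPTION: the score is the unweighted mean of the scaled parameters.
    """
    columns = list(zip(*matrix))
    scaled = [_min_max(col, flag) for col, flag in zip(columns, positive)]
    n = len(scaled)
    return [sum(col[i] for col in scaled) / n for i in range(len(matrix))]


# Three services; parameter 0 is reliability in percent (positive),
# parameter 1 is response time in seconds (negative).
values = [[90.0, 2.0], [80.0, 4.0], [100.0, 3.0]]
scores = quality_scores(values, positive=[True, False])
```

In this toy cluster the first and third services tie at 0.75, while the second service, which is worst on both parameters, scores 0.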
For each component cloud service in CS_R.S, we apply interface and quality matching in lines 6 and 7. At the interface level, cs_1 can replace cs_2 if the input parameters of cs_1 are a subset of the input parameters of cs_2, while the output parameters of cs_2 are a subset of the output parameters of cs_1. Meanwhile, the quality parameters of cs_1 should cover a wider value range than those of cs_2. For cs ∈ CS_R.S and Se, this can be formally described as cs.I ∝ Se.I ∧ Se.O ∝ cs.O ∧ cs.Q ≥ Se.Q. Finally, the candidate service set for the substitution of Se is obtained in line 8 as the set cs_r.
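The interface part of this condition reduces to two subset tests. The sketch below covers only cs.I ∝ Se.I and Se.O ∝ cs.O, reading ∝ as "is a subset of" as described in the text; the quality comparison cs.Q ≥ Se.Q is left out because its exact form is not specified here. The function and parameter names are hypothetical.

```python
from typing import AbstractSet


def interface_compatible(cs_inputs: AbstractSet[str], cs_outputs: AbstractSet[str],
                         se_inputs: AbstractSet[str], se_outputs: AbstractSet[str]) -> bool:
    """True if candidate cs can stand in for the invalid service Se at interface level.

    cs must need no input that Se did not receive, and must produce every
    output that Se produced.
    """
    return cs_inputs <= se_inputs and se_outputs <= cs_outputs


# Toy interfaces for an invalid service Se and a candidate cs.
se_in, se_out = {"city", "date"}, {"temperature"}
cs_in, cs_out = {"city"}, {"temperature", "humidity"}
```

Here the candidate needs fewer inputs and returns more outputs than the invalid service, so it is compatible, while the reverse substitution is not.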
From line 10 to line 12, we give a comprehensive scoring method to rank the candidates by service quality and collaboration similarity. Here, the weights α and β can be set according to the tenants' preferences. Both α and β are set to 0.5 in this paper. The top-rated cloud service is returned to substitute the invalid service Se in lines 13 and 14.
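The ranking step can be sketched as a weighted sum. Since the scoring lines of Algorithm 3 are not legible in this text, the combination α · Qscore + β · collaboration similarity used below is an assumption, as are the function name and the candidate values.

```python
from typing import Dict, List, Tuple


def rank_candidates(candidates: Dict[str, Tuple[float, float]],
                    alpha: float = 0.5, beta: float = 0.5) -> List[Tuple[str, float]]:
    """Rank candidates by a weighted score, best first.

    candidates maps a service id to (quality score, collaboration similarity
    to the invalid service). ASSUMPTION: score = alpha * quality + beta * similarity.
    """
    scored = [(sid, alpha * q + beta * c) for sid, (q, c) in candidates.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical candidate set cs_r with made-up scores.
cs_r = {"a": (0.9, 0.2), "b": (0.6, 0.8), "c": (0.7, 0.5)}
ranking = rank_candidates(cs_r)
substitute = ranking[0][0]
```

With equal weights, candidate "b" wins even though "a" has the best quality score, because "b" collaborates far better with the neighbours of the invalid service.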
Compared with traditional service discovery or substitution, our method needs to maintain service clusters in addition to the services themselves. The number of service clusters directly affects resource consumption. To verify how the granularity of service clusters affects service lookup time, we have grouped the 5000 cloud services into 50, 100, 200, 400, 600, 800, and 1000 service clusters, respectively. We find that service discovery is highly efficient when the number of service clusters is about 20%-40% of the total number of services.
In previous work, we have discussed the impact of service cluster granularity on service discovery from three aspects: quantity, structure, and quality [31]. However, we cannot give a specific granularity value at which service discovery is most efficient, because we cannot determine in advance the number of services in each service cluster. From the experiments, we conclude that the best granularity is reached when the number of service clusters is about 20%-40% of the total number of services. Thus, we estimate that introducing service clusters increases resource consumption by 20%-40% in our method. In addition, we need to add a 200-dimensional vector for each service and its functional description to calculate the functional similarity. Of course, similar resource consumption also exists in other vector-based service similarity calculation work [14][15][16].

Simulation Experiments
Simulation experiments are conducted to show the efficiency of the proposed method. The hardware is as follows: the CPU is an i5-8500 at 3.0 GHz with six cores, the memory is 16 GB, and the graphics card is a GTX1060 with 6 GB. The simulation program is written in Java.
"Casicloud.com" is a well-known industrial Internet platform in China. A large number of cloud manufacturing services are registered on this platform. We crawl 3780 cloud services from "casicloud.com." These cloud services belong to the same manufacturing domain. Four hundred cloud services are randomly selected, and we manually build two to five similar services for each selected cloud service. The total number of cloud services in the simulation experiments is 5000.
We first present an experiment to obtain a reasonable threshold value for Definition 8. To obtain the threshold value δ, the function texts of all the cloud services are collected to form a corpus. Then, Word2Vec is used to train the word vectors for the terms in the operation and theme. The value of δ is set to 0.7, 0.75, 0.8, 0.85, 0.9, and 0.95, respectively. For each value, we randomly select a cloud service as the invalid service. By computing function similarity and interface matching, we find a substitutive service for it among the 5000 cloud services. The accuracy and recall rate of substitution for the different threshold values are shown in Figure 4. By analyzing the trend of the two curves, we select 0.85 as the value of δ. The following experiments are conducted with δ set to 0.85.
Our method has two advantages. One is that the efficiency of service discovery in service substitution is improved by introducing service clusters and vector-based similarity calculation, and the other is that the recommended substitutive service has a high collaboration similarity to the invalid one.
To verify the performance of the proposed method, we compare it with the methods of Santhanam et al. [20], Sara et al. [22], Wu et al. [23], and Du et al. [24]. Five rounds of experiments are performed in this study. Each round consists of ten tests, and the average of their results is taken as the final simulation result. According to their application areas, the 5000 cloud services are manually divided into five parts. Different parts are selected to conduct the experiments in turn. The number of cloud services, service clusters, and service nets in each round is shown in Table 1.
We make a rule that the component services in a service net can only be selected from different service clusters. That is, once we have chosen a cloud service from a service cluster, we cannot choose another cloud service from the same cluster to compose the service net. The number of cloud services in a service net is restricted to the interval from 8 to 20.
Algorithm execution time and service cooccurrence rate are compared between the above methods. As shown in Figure 5, our proposed algorithm has the shortest execution time in all rounds of experiments. As the number of cloud services increases, the advantage of our algorithm in execution efficiency becomes more obvious. The result also shows that the execution time of the clustering-based methods (our method, Du's method, and Wu's method) is lower than that of the nonclustering methods (Santhanam's method and Sara's method). Thus, we conclude that clustering-based methods can improve the efficiency of service replacement.
To show that the substitutive services found by our method are more reasonable than those of other methods in terms of process collaboration, we design another experiment to investigate service cooccurrence in service substitution. Let OccuNum(S_i) be the number of times service S_i appears in all the service nets. The service cooccurrence of S_i and S_j is defined as ServiceCo_Occu(S_i, S_j) = OccuNum(S_i ∩ S_j)/(OccuNum(S_i) + OccuNum(S_j)).
Algorithm 3
Input: the cloud service cluster pool CSCP; the invalid cloud service Se;
Output: the substitutive cloud service St for Se.
(1) CS_R = Ø; cs_r = Ø;
(2) for each Sec ∈ CSCP
(3) compute FuncSim(Sec, Se)
(4) if (Sec ↔ Se) CS_R = CS_R ∪ {Sec};
(5) end for
(6) for each cs ∈ CS_R.S
(7) if (cs.I ∝ Se.I ∧ Se.O ∝ cs.O ∧ cs.Q ≥ Se.Q)
(8) cs_r = cs_r ∪ {cs};
(9) End for
(10) For each cloud service S in cs_r
...

Service cooccurrence can be used to judge whether two cloud services have a close collaborative relationship. If a substitutive service has a high service cooccurrence with the precursor and successor of the invalid service, we consider it a good collaboration-aware service substitution. Assume S_i, Se, and S_j are three cloud services. Let S_i and S_j be the precursor and successor of Se, respectively. If Se is not working and St is the substitutive service of Se, service cooccurrence in service substitution is defined as SubSerCo_Occu(St, Se) = (ServiceCo_Occu(S_i, St) + ServiceCo_Occu(S_j, St))/2. From Figure 6, we can see that our method performs remarkably well in service cooccurrence. Numerically, its service cooccurrence rate is about 2 to 4 times higher than that of the other methods. Thus, the proposed method integrates service collaboration well in service substitution.
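The two cooccurrence measures can be computed directly from a collection of service nets. In the sketch below, each service net is represented simply as the set of services it contains, OccuNum(S_i) is read as the number of service nets containing S_i, and OccuNum(S_i ∩ S_j) as the number of nets containing both; these readings, the function names, and the toy nets are our assumptions.

```python
from typing import AbstractSet, Sequence


def occu_num(nets: Sequence[AbstractSet[str]], *services: str) -> int:
    """Number of service nets that contain all of the given services."""
    return sum(1 for net in nets if all(s in net for s in services))


def service_co_occu(nets: Sequence[AbstractSet[str]], s_i: str, s_j: str) -> float:
    """ServiceCo_Occu(S_i, S_j) = OccuNum(S_i and S_j) / (OccuNum(S_i) + OccuNum(S_j))."""
    denominator = occu_num(nets, s_i) + occu_num(nets, s_j)
    if denominator == 0:
        return 0.0  # neither service appears in any net
    return occu_num(nets, s_i, s_j) / denominator


def sub_ser_co_occu(nets: Sequence[AbstractSet[str]],
                    precursor: str, successor: str, substitute: str) -> float:
    """Mean cooccurrence of the substitute St with the precursor and successor of Se."""
    return (service_co_occu(nets, precursor, substitute)
            + service_co_occu(nets, successor, substitute)) / 2


# Four toy service nets; "a" and "b" play the precursor and successor of the
# invalid service, and "t" is the substitutive service.
nets = [{"a", "t", "b"}, {"a", "t"}, {"b", "c"}, {"t", "b", "c"}]
rate = sub_ser_co_occu(nets, "a", "b", "t")
```

Note that, with the sum in the denominator, ServiceCo_Occu is at most 0.5, which is reached when two services always appear together.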
Experiments on the efficiency of service discovery are also conducted. We compare our method with three recently proposed methods (those of Cheng et al. [11], Zhang et al. [12], and Nabli et al. [15]). We investigate two factors: service discovery time and top-k accuracy. Service discovery time reflects search efficiency, while top-k accuracy reflects discovery accuracy. Figure 7 shows the service discovery time in different rounds of experiments. Nabli's method is the most efficient of all the methods. Our method ranks second, with a discovery time close to that of Nabli's method. Nabli's method is a vector-based service discovery method in which all service vectors must be trained in advance. The existing service vectors are then used directly to compute similarity, which makes it highly efficient.
Although our method also uses service clusters and vector-based similarity calculation, its service search speed is slightly lower than that of Nabli's method. The main reason is that we additionally perform interface matching in service discovery.
In the top-k accuracy experiment, we revised the data set to guarantee that it contains several groups of services that can be used to evaluate discovery accuracy. All services in a group can respond to the same discovery requirement, and each group contains no fewer than k services. In the top-k experiment, we measure the proportion of appropriate services among the first k services found by each method. As shown in Figure 8, our method has the highest accuracy in the top-k service discovery experiments, whereas Nabli's method has the lowest accuracy of all the methods. This is because Nabli's method computes similarity based on the LDA topic model, whose accuracy fluctuates greatly with the service descriptive information.
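The top-k measure described above, the proportion of appropriate services among the first k results, can be sketched as follows. This is only an illustration under stated assumptions: the ranked result list and the set of appropriate services are supplied by the caller, the function name `top_k_accuracy` is hypothetical, and dividing by k even when fewer than k results are returned is a choice made here, not something specified in the text.

```python
from typing import Hashable, Sequence, Set


def top_k_accuracy(ranked_results: Sequence[Hashable],
                   appropriate: Set[Hashable], k: int) -> float:
    """Proportion of appropriate services among the first k discovered services."""
    if k <= 0:
        raise ValueError("k must be a positive integer")
    top_k = ranked_results[:k]
    hits = sum(1 for service in top_k if service in appropriate)
    # Assumption: missing results (fewer than k returned) count as misses.
    return hits / k


# Toy example: a group of five appropriate services, as the experiment requires
# each group to contain at least k services.
group = {"a", "b", "c", "d", "e"}
ranking = ["a", "b", "x", "c", "y"]
print(top_k_accuracy(ranking, group, 5))  # 3 hits out of 5
```

Because every service group is guaranteed to contain at least k members, a perfect method can reach an accuracy of 1.0 for each discovery requirement.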

Conclusions
To efficiently and reasonably find a substitutive cloud service for an invalid one, this work proposes a method to achieve service substitution. The search space for finding the substitutive service is greatly reduced by introducing service clusters. To get the substitutive service, we first obtain a service candidate set by computing function similarity and matching service quality parameters. Service collaboration is mined from the existing service processes. By jointly considering function, quality, and process collaboration, we propose an algorithm to achieve service substitution.
A novel aspect of this work is that we obtain the similarity of service function and of process collaboration by computing the cosine value of the corresponding word/service vectors. How to construct vectors that represent the features of function similarity and collaboration intensity is discussed in detail in Section 4. Results of simulation experiments show that the proposed method significantly outperforms the state-of-the-art methods, especially for substitution among a large number of cloud services. In future work, we will focus on how to divide process collaboration into different dimensions. A more reasonable way to measure process collaboration will be presented so as to better realize service substitution.

Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest
The authors declare that they have no conflicts of interest.