Quantum-Based Feature Selection for Multi-classification Problem in Complex Systems with Edge Computing

Complex systems with edge computing require a huge amount of multi-feature data to extract appropriate insights for their decision making, so it is important to find a feasible feature selection method that improves computational efficiency and reduces resource consumption. In this paper, a quantum-based feature selection algorithm for the multi-classification problem, namely, QReliefF, is proposed, which can effectively reduce the complexity of the algorithm and improve its computational efficiency. First, all features of each sample are encoded into a quantum state by performing the operations CMP and R_y, and then amplitude estimation is applied to calculate the similarity between any two quantum states (i.e., two samples). According to the similarities, the Grover-Long method is utilized to find the k nearest neighbor samples, and then the weight vector is updated. After a certain number of iterations of the above process, the desired features can be selected according to the final weight vector and the threshold τ. Compared with the classical ReliefF algorithm, our algorithm reduces the complexity of similarity calculation from O(MN) to O(M), the complexity of finding the nearest neighbor from O(M) to O(sqrt(M)), and the resource consumption from O(MN) to O(MlogN). Meanwhile, compared with the quantum Relief algorithm, our algorithm is superior in finding the nearest neighbor, reducing the complexity from O(M) to O(sqrt(M)). Finally, in order to verify the feasibility of our algorithm, a simulation experiment based on Rigetti with a simple example is performed.


Introduction
Complex systems [1] are nonlinear systems composed of agents that can act on local environmental information, and they require big data to extract appropriate insights for their decision making. In cloud computing [2][3][4][5], the data transmission delay between the data sources and the cloud centers is problematic for many complex systems where responses are usually required to be time critical or real time. Instead, a recently emerging computation paradigm, edge computing [6][7][8][9], is promising to cater for these requirements, as edge computing resources are deployed near the data sources, which supports time-critical or real-time data processing and analysis. However, the computing and storage resources of most intelligent terminals are very limited, which places higher requirements on the computing performance of algorithms, especially machine learning algorithms, in complex systems with edge computing.
Machine learning [10,11] continuously improves its performance through "experience," where experience generally originates from massive data. At present, many machine learning algorithms based on massive data [12][13][14] have been proposed. In practical scenarios, the amount of data available for training is getting larger and larger, while the characteristics of the data are becoming more and more abundant. Data with redundant or unrelated features will cause the "curse of dimensionality" [15], which greatly increases the computational complexity of the algorithm. One possible solution is dimension reduction [16], and the other is feature selection [17].
The Relief algorithm [18] is a well-known feature selection algorithm for the two-classification problem. It is widely used because of its excellent classification effect. However, the limitation of this algorithm is that it can only handle binary classification, and its efficiency is greatly affected when the data size and feature size increase. To extend the application of the algorithm, Kononenko [19] proposed a new feature selection algorithm for the multiclassification problem, namely, the ReliefF algorithm. It has the advantages of a simple principle, convenient implementation, and good results and has been widely applied in various fields [20][21][22].
On the other hand, since Benioff [23] and Feynman [24] explored the theoretical possibilities of quantum computing, some excellent results have been proposed one after another. For instance, Shor's algorithm [25] solves the problem of integer factorization in polynomial time. Grover's algorithm [26] provides a quadratic speedup for the problem of searching an unstructured database. These excellent results have prompted people to think about how to apply this computing power to machine learning algorithms. Thus, a new research hotspot, quantum machine learning [27][28][29][30][31][32], has gradually formed. Although quantum technology provides a certain improvement in storage and computing power, the "curse of dimensionality" problem still exists in quantum machine learning.
Therefore, the quantum-based dimensionality reduction method still has important research value. In 2018, Liu et al. [33] proposed a quantum Relief algorithm (namely, the QRelief algorithm) for the two-classification problem, which reduces the complexity of similarity calculation from O(MN) to O(M).
In the application scenario of edge computing, there are various multiclassification problems based on distributed, massive, and large-feature data. The objective of this study is to design a feasible feature selection method which can effectively get rid of redundant or unrelated features in machine learning, reduce the computation load of intelligent terminals, and thus meet the requirement of real-time data processing and analysis in edge computing. In this paper, we introduce some quantum technologies (such as the CMP operation, amplitude estimation, and the Grover-Long method) and propose a quantum-based feature selection algorithm, namely, the QReliefF algorithm, for the multiclassification problem.
The main contributions of our work are as follows:
(1) A quantum method is proposed to solve the problem of feature selection for the multiclassification problem in complex systems with edge computing. The proposed method fully demonstrates the quantum parallel processing capabilities that classical computing cannot match and significantly reduces the computational complexity of the algorithm.
(2) The problem of finding nearest neighbor samples is first transformed into the similarity calculation of two quantum states (i.e., calculating their inner product) in quantum computing, and the Grover-Long method is utilized to speed up the search for the targets.
(3) A simulation experiment based on Rigetti is performed to verify the feasibility of our algorithm.
The outline of this paper is as follows. The classical ReliefF algorithm is briefly reviewed in Section 2, and the proposed quantum ReliefF algorithm is presented in detail in Section 3. Then, we illustrate the process of the algorithm with a simple example in Section 4 and perform the simulation experiment on Rigetti in Section 5. Subsequently, the efficiency of the algorithm is analyzed in Section 6, and a brief conclusion and discussion are given in the last section.

Review of ReliefF Algorithm
The ReliefF algorithm [19] is a feature selection algorithm which is used to handle the multiclassification problem. Before introducing our proposed quantum algorithm, let us review the detailed process of the algorithm.
Without loss of generality, suppose there are M samples with N features, and they can be divided into P classes:

C_p = {v_q = (v_q1, v_q2, ..., v_qN)^T | v_q ∈ R^N, q = 1, 2, ..., M_p}, p ∈ {1, 2, ..., P},

where v_q is the q-th N-feature sample that belongs to Class C_p. The weight vector of the N features, WT = (wt_1, wt_2, ..., wt_N)^T, is initialized to all zeros, the upper limit of iteration is T, and the relevance threshold (which differentiates the relevant and irrelevant features) is τ (0 ≤ τ ≤ 1). The main steps of the ReliefF algorithm are as follows (its pseudocode can be seen in Algorithm 1). At each iteration, ReliefF randomly selects a sample u and then searches for the k nearest neighbor samples by cosine distance in each class. The nearest same-class samples are called H_j, and the nearest different-class samples are called M_j(C_p), where j ∈ {1, 2, ..., k}. The weight vector is then updated by a formula in which p(C_p) represents the probability of randomly extracting samples of Class C_p and diff(i, u, v) measures the difference between samples u and v on the i-th feature. After iterating T times, the final weight vector is obtained.
Through the relevance threshold τ, we can retain relevant features and discard irrelevant features.
The ReliefF algorithm is an extension of the Relief algorithm from the two-classification problem to the multiclassification scenario. However, as the number of categories, samples, and features increases, the ReliefF algorithm also faces the "curse of dimensionality," and its speed drops sharply. So, how to improve the efficiency of the ReliefF algorithm becomes an urgent problem to be solved.
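The loop of Algorithm 1 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `relieff`, the range-normalised `diff`, and the division by k follow the standard ReliefF formulation of Kononenko [19], neighbors are ranked by cosine distance as described above, and every row of `X` is assumed to be non-zero.

```python
import numpy as np

def relieff(X, y, k=1, n_iter=4, seed=0):
    """Hypothetical ReliefF sketch; returns the accumulated weight vector WT."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    n_samples, n_features = X.shape
    classes, counts = np.unique(y, return_counts=True)
    prior = dict(zip(classes, counts / n_samples))      # p(C_p)
    span = X.max(axis=0) - X.min(axis=0)                # feature ranges for diff()
    span[span == 0] = 1.0
    unit = X / np.linalg.norm(X, axis=1, keepdims=True)
    wt = np.zeros(n_features)
    for _ in range(n_iter):
        r = rng.integers(n_samples)                     # pick a sample u at random
        cos_dist = 1.0 - unit @ unit[r]                 # cosine distance to u
        for c in classes:
            idx = np.flatnonzero(y == c)
            idx = idx[idx != r]                         # never pick u itself
            nearest = idx[np.argsort(cos_dist[idx])[:k]]
            contrib = (np.abs(X[nearest] - X[r]) / span).sum(axis=0) / k
            if c == y[r]:
                wt -= contrib                           # nearest hits H_j
            else:
                wt += prior[c] / (1.0 - prior[y[r]]) * contrib  # misses M_j(C_p)
    return wt
```

On a toy data set in which the first feature tracks the class and the second does not, the first weight comes out positive and the second non-positive, which is the behaviour the threshold τ relies on.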

The Proposed QReliefF Algorithm
In order to implement the feature selection for the multiclassification problem in complex systems with edge computing, a feasible quantum ReliefF algorithm is introduced in this section. Suppose the sample sets C_p = {v_q = (v_q1, v_q2, ..., v_qN)^T | v_q ∈ R^N, q = 1, 2, ..., M_p} (p represents the category of classification, p ∈ {1, 2, ..., P}), the weight vector WT, the upper limit T, and the relevance threshold τ are the same as those of the classical ReliefF algorithm defined in Section 2. Different from the classical algorithm, all the features of each sample are represented as a quantum superposition state, and thus the problem of finding nearest neighbor samples is transformed into the similarity calculation of two quantum states (i.e., calculating their inner product). The similarity between any two samples can then be calculated in parallel by means of quantum mechanics. Algorithm 2 describes the process of our algorithm in detail, and the specific steps are as follows.

State Preparation.
In order to store classical information in quantum states, we need to normalize the sample sets (equations (4) and (5)). After normalization, v_qi is a real number, and v_qi ∈ (0, 1). Then, we prepare the initial quantum states |ϕ_p〉_q, where |ϕ_p〉_q corresponds to the quantum state of the q-th sample that belongs to Class C_p and v_qi represents the i-th feature value of the q-th sample. Assume our initial state is |q〉|0〉^⊗n|1〉|0〉 (n = log_2(N)); the construction scheme of the quantum state |ϕ_p〉_q includes the following steps. First, we perform Hadamard and CMP operations on |0〉^⊗n and get a new state; its circuit diagram is shown in Figure 1. The function of the CMP operation is to cut off the basis states larger than N, and its circuit diagram is shown in Figure 2, where |i〉 and |N〉 each represent a single qubit. The implementation of the CMP operation needs to repeat such a circuit n times. After measurement, if the lowest register is |1〉, it means that i > N.
Next, we perform the unitary rotation operation R_y (equation (9)) on the last qubit to obtain our target quantum state |ϕ_p〉_q.
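As a rough numerical illustration of this step, the sketch below builds the amplitudes of a state of the form (1/√N) Σ_i |i〉(√(1 − v_i²)|0〉 + v_i|1〉). That form is an assumption modelled on rotation-based encodings, not a transcription of the paper's target state, and the helper names `ry` and `encode_sample` are hypothetical. The sample index register |q〉, the auxiliary |1〉 qubit, and the CMP truncation are all omitted, and the sample is simply scaled to unit length.

```python
import numpy as np

def ry(theta):
    """Matrix of the single-qubit rotation R_y(theta)."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -s], [s, c]])

def encode_sample(v):
    """Amplitudes of (1/sqrt(N)) * sum_i |i>(sqrt(1 - v_i^2)|0> + v_i|1>) (assumed form)."""
    v = np.asarray(v, dtype=float)
    v = v / np.linalg.norm(v)              # unit-length normalisation (assumed)
    n_features = v.size
    state = np.zeros(2 * n_features)
    for i, v_i in enumerate(v):
        theta = 2 * np.arcsin(v_i)         # R_y(theta)|0> = cos(theta/2)|0> + sin(theta/2)|1>
        qubit = ry(theta) @ np.array([1.0, 0.0])
        state[2 * i: 2 * i + 2] = qubit / np.sqrt(n_features)
    return state

state = encode_sample([3.0, 4.0])
print(np.linalg.norm(state))               # the encoded state has unit norm
```

The flat index 2i + b pairs the feature index i with the value b of the rotated qubit, so each feature value v_i ends up as the amplitude of |i〉|1〉, scaled by 1/√N.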

Similarity Calculation.
After the state preparation, the information of the samples is encoded into the quantum superposition states {|ϕ_p〉_q}. In this paper, we use the cosine distance to define the similarity between the random sample u and another sample (e.g., v_q), as given in equation (11). Referring to equations (4) and (5), |u|^2 and |v_q|^2 are 1, so equation (11) can be simplified. First, |ϕ〉 (i.e., the sample u) is randomly selected from {|ϕ_p〉_q}; it is the l-th sample in Class C_p. Then, a swap operation is performed on |ϕ〉 to get |φ〉. Next, a swap test (its circuit is given in Figure 3) is performed on (|φ〉, |ϕ_p〉_q), which yields equation (15). From equation (15), we obtain the probability of the measurement result being |1〉 (equation (16)).

Algorithm 1: ReliefF algorithm.
(1) Init WT = (0, ..., 0)^T
(2) for t = 1 to T do
(3)   Pick a sample u randomly
(4)   Find k nearest neighbor samples H_j from the same class of sample u
(5)   for C_p ≠ class(u) do
(6)     Find k nearest neighbor samples M_j(C_p) from the different Class C_p
(7)   end
(8)   for i = 1 to N do
(9)     Update wt_i by the weight-update formula
(10)  end
(11) end
(12) Select the most relevant features according to WT and τ

Algorithm 2 (QReliefF), recoverable steps: the similarities are obtained through the swap test, the inner product, and the amplitude estimation operations; (8) the k nearest samples in each class are obtained by the Grover-Long method; (9) for i = 1 to N do; (10) update WT; (16) the i-th feature is relevant; (17) else; (18) the i-th feature is not relevant.

Complexity
In addition, the inner product between |φ〉 and |ϕ_p〉_q (i.e., the prepared state) can be calculated (equation (17)). Combining equation (16) with equation (17), we can get the similarity s(u, v_q) between samples u and v_q. Since N is a constant value and 〈u|v_q〉 is the angle cosine between the random sample u and another sample v_q (e.g., in Class C_p), the smaller s(u, v_q) is, the smaller the cosine distance is, which indicates that these two samples are more similar.
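The measurement statistics of a swap test can be illustrated with a short numerical sketch. It uses the textbook relation P(|1〉) = 1/2 − |〈a|b〉|²/2 for two normalised states a and b, which is an assumption about the general form rather than the paper's exact expression in equation (16). The function names and the `shots` and `seed` parameters are hypothetical.

```python
import numpy as np

def swap_test_p1(a, b):
    """Ideal probability of reading |1> on the ancilla of a swap test."""
    overlap = np.vdot(a, b)                  # <a|b>
    return 0.5 - 0.5 * abs(overlap) ** 2

def estimate_p1(a, b, shots=1000, seed=0):
    """Estimate P(|1>) from a finite number of simulated measurements."""
    rng = np.random.default_rng(seed)
    return rng.binomial(shots, swap_test_p1(a, b)) / shots

u = np.array([1.0, 0.0])
near = np.array([1.0, 1.0]) / np.sqrt(2)     # 45 degrees away from u
far = np.array([0.0, 1.0])                   # orthogonal to u
print(swap_test_p1(u, near), swap_test_p1(u, far))
```

A smaller probability of |1〉 corresponds to a larger overlap, which matches the statement above that a smaller s(u, v_q) indicates more similar samples.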
Then, we can rewrite equation (15) accordingly.

Finding the Nearest Neighbor Samples.
First, the quantum amplitude estimation method [34] is applied to store the similarity of the samples u and v_q in the last qubit, where p ∈ {1, 2, ..., P}; its quantum circuit diagram is given in Figure 4.
In the Grover-Long method [35], one iteration can be divided into four operations, i.e., G = −W I_0 W^{−1} O, and its quantum circuit is shown in Figure 5. O is an oracle operation which performs a phase inversion on the targets; it is a diagonal matrix in which v is the position of e^{iϕ}. The position v of e^{iϕ} is divided into two cases. If v is odd, the u_1(ϕ) operation (equation (22)) will be applied to the lowest qubit; if v is even, the X, u_1(ϕ), and X operations will be applied to the lowest qubit.
Besides, I_0 is a conditional phase shift operation which performs a phase inversion on |0〉, where sin η = √(M/N) and J represents the number of iterations.
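The iteration G = −W I_0 W^{−1} O can be simulated on a state vector to see how the phase ϕ removes the failure probability of the ordinary Grover search. The sketch below is an illustration under assumptions: the phase-matching condition ϕ = 2 arcsin(sin(π/(4J + 6))/sin η) and the count of J + 1 iterations are taken from Long's original formulation [35], W is modelled only through the uniform superposition W|0〉, and the function name `grover_long` is hypothetical. At least one target is assumed.

```python
import numpy as np

def grover_long(n_items, targets):
    """Simulate the Grover-Long search; returns (probabilities, iterations)."""
    targets = sorted(set(targets))
    eta = np.arcsin(np.sqrt(len(targets) / n_items))    # sin(eta) = sqrt(M/N)
    # Smallest J for which sin(pi / (4J + 6)) <= sin(eta).
    j = max(0, int(np.ceil((np.pi / eta - 6.0) / 4.0)))
    ratio = np.sin(np.pi / (4 * j + 6)) / np.sin(eta)
    phi = 2.0 * np.arcsin(min(1.0, ratio))              # matched phase
    phase = np.exp(1j * phi)

    s = np.full(n_items, 1.0 / np.sqrt(n_items), dtype=complex)  # W|0>
    state = s.copy()
    for _ in range(j + 1):
        state[targets] *= phase                          # oracle O on the targets
        # -W I_0 W^{-1} with I_0 = I + (e^{i phi} - 1)|0><0|
        state = -(state + (phase - 1.0) * np.vdot(s, state) * s)
    return np.abs(state) ** 2, j + 1

probs, iterations = grover_long(8, [3])
print(iterations, probs[3])
```

For one target among eight items, the ordinary Grover search with a phase of π cannot reach certainty, whereas the matched phase concentrates essentially all of the probability on the target after two iterations.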
Having obtained the state |β〉_p (see equation (20)) through the amplitude estimation, we introduce a quantum minimum search algorithm [37], and its quantum circuit is shown in Figure 6. Suppose the set K = {K_1, K_2, ..., K_k} represents the k nearest neighbor samples; we should prepare √k auxiliary qubits. In Figure 6, W_s denotes an operator of the circuit, and u_1(ϕ) is the operator defined in equation (22). Let d_0 = s(u, v_1) be the initial threshold used for marking. We repeat the above steps several times until all samples in Class C_p are compared. Finally, the indices of all k nearest neighbor samples in Class C_p can be obtained according to the similarity.

Updating Weight Vector.
After the above steps, we obtain the nearest neighbor samples (i.e., H_j and M_j(C_p)) of the random sample u. Then, we update the weight vector according to the weight-update formula (equation (24)), where i ∈ [1, N].
Feature Selection.
After iterating the above steps (i.e., similarity calculation, finding the nearest neighbor samples, and updating the weight vector) T times, we jump out of the algorithm's loop. Then, we get a final weight vector WT, and the average weight vector is WT/T. Then, we make feature selection based on the final WT and the threshold τ. Here, τ can be chosen to retain relevant features and discard irrelevant features [38]; that is to say, those features whose weight is greater than τ will be selected, and those whose weight is less than τ will be discarded. Here, the value of τ is determined with regard to the user's requirements and the
First, the four initial quantum states are prepared. Next, we take |ψ〉_S0 as an example, and then the H^⊗3 operation is applied on the third and fourth qubits.

Figure 4: Quantum circuit of the amplitude estimation operation; G^J represents J iterations of the Grover-Long method [35] and F_M^{−1} represents the inverse Fourier transform [36].
Figure 5: Quantum circuit with one iteration in the Grover-Long method [35]; q[0] denotes the highest qubit and q[n − 1] denotes the lowest qubit.

Then, we perform the R_y rotation (see equation (9)) on the last qubit. The other quantum states are prepared in the same way. Second, we randomly select a sample (assume |ϕ〉_S0 is u) and perform the similarity calculation with the other samples (i.e., |ϕ〉_S1, |ϕ〉_S2, |ϕ〉_S3, |ϕ〉_S4, |ϕ〉_S5). Next, we take |ϕ〉_S0 and |ϕ〉_S1 as an example and perform a swap operation between the last two qubits of |ϕ〉_S0 to get |φ〉. After that, the swap test operation is applied on (|φ〉, |ϕ〉_S1) to obtain a quantum state that encodes the similarity in its amplitude. Then, through the amplitude estimation, we can obtain the corresponding quantum states. Next, we perform an oracle operation on the quantum states obtained in the above steps to obtain the k nearest neighbor samples.
Each row represents all the feature values of a certain sample, while each column denotes a certain feature value of all the samples.

Simulation Experiment
Quantum Cloud Services (QCS™) is Rigetti's quantum-first cloud computing platform. At the end of 2017, a 19-qubit processor named "Acorn" was launched, which can be used in QCS through a quantum programming toolkit named Forest [39]. The chip of "Acorn" is made of 20 superconducting qubits, but for some technical reasons, qubit 3 is offline and cannot interact with its neighbors, so it is treated as a 19-qubit device whose coupling map is shown in Figure 8.
In order to obtain the result and also verify our algorithm, we choose Rigetti to perform the quantum processing. However, since the Rigetti platform limits the length of the entire circuit and noise has a great influence on the preparation of quantum states [40], we only show one of the ideal experiment circuits of similarity calculation in the QReliefF algorithm running on the Rigetti platform. We successfully stored the characteristic information of the sample in the amplitude of the quantum state and then extracted the amplitude information into the quantum state through the phase estimation algorithm. Figure 9 gives the schematic diagram of our experimental circuit. The corresponding code of the circuit in Rigetti is shown in Algorithm 3. After running Algorithm 3 eight times, the result can be seen in Figure 10. We get |1〉 with an average probability of 0.435125. According to equation (32), we can get √s(u, v_q) ≈ √0.435, i.e., s(u, v_q) ≈ 0.435, and this amplitude information is then extracted into the quantum state through the phase estimation algorithm.
After all the steps have been performed, we obtain the quantum states S_1 (H), S_2 (M(B)), and S_5 (M(C)) of the nearest neighbor samples of the quantum state S_0 (u) in each class, which can be seen in Figure 11. Then, the weight vector is updated according to equation (24), and the result of WT after the first iteration is listed in the second row of Table 2. The algorithm iterates T times (in our example, T = 4) as in the above steps and obtains all the WT results shown in Table 2. After the T-th iteration, WT = [4, 4, 4, −2, 0, −2], and then the average weight vector is [1, 1, 1, −1/2, 0, −1/2]. In this paper, the value of τ in the example is assumed to be 0.5 according to the updated result of WT in Table 2. Since the threshold τ = 0.5, the selected features are F_0, F_1, and F_2, i.e., the first, second, and third features. The result of the quantum feature selection is consistent with that of the classical ReliefF algorithm, as verified in Python.
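The final selection step of this example is easy to check numerically. The short sketch below averages the final weight vector over T and keeps the features whose average weight exceeds τ; the helper name `select_features` is hypothetical, while the numbers are those of the example above.

```python
import numpy as np

def select_features(wt, n_iter, tau):
    """Average the accumulated weights over n_iter and threshold them with tau."""
    avg = np.asarray(wt, dtype=float) / n_iter
    selected = np.flatnonzero(avg > tau).tolist()   # indices of retained features
    return avg, selected

avg, selected = select_features([4, 4, 4, -2, 0, -2], n_iter=4, tau=0.5)
print(avg)        # average weight vector
print(selected)   # [0, 1, 2], i.e., F_0, F_1, and F_2
```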
In the final weight comparison, considering the large amount of data in the complex system and the corresponding feature values, the amount of calculation required for the comparison after the final result is obtained is also large. In order to meet the requirements of big data and result accuracy, we adopted an optimized quantum maximum and minimum value search algorithm [37] when comparing weights in the last step, which helps us quickly and accurately select the desired features and thus better solve the multiclassification problem in complex systems.
When the ratio of the number of solutions M to the size of the search space N can be estimated exactly, this algorithm can raise the success probability to nearly 100%. Furthermore, it shows an advantage in complexity for large databases and in the operation complexity of constructing oracles.

Efficiency Analysis.
In order to evaluate the efficiency of the QReliefF algorithm, three algorithms (i.e., the classical Relief, classical ReliefF, and quantum Relief algorithms) are compared with our algorithm on three indicators: complexity of similarity calculation (CSC), complexity of finding the nearest neighbor (CFNN), and resource consumption (RC).

Conclusion and Discussion
With the rapid development of edge computing technology and quantum machine learning algorithms, researchers have begun to pay attention to the combination and application of these two fields. In this paper, we use quantum technology to solve the feature selection problem for multiclassification in complex systems with edge computing and propose a quantum ReliefF algorithm. Compared to the classic ReliefF algorithm, our algorithm reduces the complexity of similarity calculation from O(MN) to O(M) and the complexity of finding the nearest neighbor from O(M) to O(√M). In addition, from the perspective of resource consumption, our algorithm consumes O(MlogN) qubits, while the classic ReliefF algorithm consumes O(MN) bits. Our algorithm is therefore superior in terms of computational complexity and resource consumption.
It should be noted that our work aims to improve the algorithm efficiency, while the privacy protection of sensitive data is not taken into account. At present, data security has become a focus of attention in the field of artificial intelligence, and some solutions for data privacy protection in complex systems with edge computing have been proposed [41][42][43][44]. In our future work, we will study how to improve the efficiency of quantum machine learning algorithms while ensuring the privacy protection of sensitive data, as in [45][46][47][48].

Figure 1: Quantum circuit of getting (1/√N) Σ_{i=0}^{N−1} |i〉; H is the Hadamard operation and ○ represents the control qubit conditional on being set to zero.

Figure 2: Quantum circuit of the CMP operation.

Figure 3: Quantum circuit of swap test operation; the symbol of two crosses connected by a line represents the swap operation.

Figure 6: Quantum circuit of finding k nearest neighbor samples.

Figure 7: The simple example with six samples in classes A, B, and C.

Figure 9: One of the ideal experiment circuits of similarity calculation in the QReliefF algorithm running on the Rigetti platform. q[0] − q[7] represents the randomly selected quantum state |ϕ〉_S0, q[9] − q[16] represents |ϕ〉_S1, and q[17] is the resultant qubit. X is the NOT operation, and Ry is the R_y operation, which can be expressed as a matrix in equation (9).

Figure 11: Finding the nearest neighbor samples (S_1, S_2, and S_5) of the sample S_0.
Algorithm 2 (QReliefF), recoverable steps: (1) Init WT = (0, ..., 0)^T; (2) normalize the sample sets C_p; select a state |ϕ〉 from {|ϕ_p〉_q} to find the k nearest neighbor samples from Class C_p with the time complexity of O(√M).

Table 1: The feature values of four samples.
Figure 8: The coupling map of Rigetti's 19-qubit processor "Acorn." Lines indicate the two-qubit connections ruled by a controlled-Z operation.

In the classic Relief algorithm, it takes O(N) time to calculate the distance between the randomly selected sample and any other sample, and then it finds the nearest neighbors among the M samples. This process needs to iterate T times, so CSC is O(TMN). Since T is a constant, CSC in the classic Relief algorithm is O(MN). As there are M samples in total, each with N features, CFNN is O(M) and RC in the classic Relief algorithm is O(MN) bits. The classical ReliefF algorithm is similar to the classical Relief algorithm. Since it finds k nearest neighbors at a time, the time complexity is O(kTMN). Then, we can simplify CSC to O(MN) because k and T are constants. Besides, CFNN for finding k nearest neighbors is O(M). In terms of resource consumption, there are M samples, and each sample has N features, so the resource consumption of the classic ReliefF algorithm is O(MN) bits. In the QRelief and QReliefF algorithms, the quantum property is used to reduce the distance calculation from O(N) to O(1), so their CSCs are both O(TM). Since T is constant, their CSCs can be simplified to O(M). CFNN of QRelief is O(kM), which can be simplified to O(M) as k is constant, while CFNN of QReliefF is O(√M). For multifeature big data in complex systems with edge computing, there is M ≫ N, so M ≫ N/(N − log N), and then RC of QRelief and QReliefF can be simplified into O(MlogN). For convenience, we list the efficiency comparison of the classic Relief algorithm, the ReliefF algorithm, the quantum Relief algorithm, and our algorithm in terms of CSC, CFNN, and RC in Table 3. Our algorithm is superior to the classical algorithms (i.e., Relief and ReliefF) in terms of CSC, CFNN, and RC and better than the quantum algorithm (i.e., QRelief) in terms of CFNN.
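To give a feel for the orders of magnitude in this comparison, the sketch below evaluates the leading terms stated in the efficiency analysis for concrete M and N. It is only an illustration: constant factors (including k and T) are dropped, the logarithm is taken to base 2 as an assumption, and the function name `cost_orders` is hypothetical.

```python
import math

def cost_orders(m, n):
    """Leading-order CSC, CFNN, and RC terms for m samples with n features."""
    classical = {"CSC": m * n, "CFNN": m, "RC": m * n}
    return {
        "Relief": dict(classical),
        "ReliefF": dict(classical),
        "QRelief": {"CSC": m, "CFNN": m, "RC": m * math.log2(n)},
        "QReliefF": {"CSC": m, "CFNN": math.sqrt(m), "RC": m * math.log2(n)},
    }

for name, cost in cost_orders(10**6, 1024).items():
    print(name, cost)
```

With one million samples and 1024 features, the leading term of the nearest-neighbor search drops from 10^6 to 10^3 and that of the resource consumption from about 10^9 bits to 10^7 qubits.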

Table 2: The updated result of WT.