A Novel Efficient Feature Dimensionality Reduction Method and Its Application in Engineering

In the engineering field, excessive data dimensions affect the efficiency of machine learning and analysis of the relationships between data or features. To render feature dimensionality reduction more effective and faster, this paper proposes a new feature dimensionality reduction approach combining a sampling survey method with a heuristic intelligent optimization algorithm. Drawing on feature selection, this method builds a feature-scoring system and a reduced-dimension length-scoring system based on the sampling survey method. According to feature scores and reduced-dimension lengths, the method selects a number of features and reduced-dimension lengths that are ranked in the front with high scores. This feature dimensionality reduction method allows for in-depth optimal selection of features and reduced-dimension lengths with high scores using an improved heuristic intelligent optimization algorithm. To verify the effectiveness of the dimensionality reduction method, this paper applies it to road roughness time-domain estimation based on vehicle dynamic response and gene-selection research in bioengineering. Results in the first case show that the proposed method can improve the accuracy of road roughness time-domain estimation to above 0.99 and reduce measured data of the vehicle dynamic response, reducing the experimental workload significantly. Results in the second case show that the method can select a set of genes quickly and effectively with high disease recognition accuracy.


Introduction
The curse of dimensionality is a common problem in engineering. Ineffective and unreasonable feature dimensionality reduction will compromise machine-learning efficiency, pattern recognition accuracy, and data-mining efficiency while increasing the workload of measured data experiments to some extent. A set of high-dimension features possesses the following problems: too many features but few samples, too many features with few or no relations to the mining task, and excessive redundancy among features [1][2][3][4].
Feature dimensionality reduction methods can be classified into two types: feature selection and feature projection. Feature projection, also called feature reparameterization, transforms the original feature space into one with a smaller dimensionality and more independent dimensions [5][6][7]. Feature projection generally uses principal component analysis (PCA), kernel principal component analysis, or other improved principal component analysis methods for linear or nonlinear transformations. Although these methods can realize feature dimensionality reduction to some extent, they require extensive calculations and have high computational complexity. These dimensionality reduction methods also cannot reduce the experimental data that must be measured and are inconvenient when identifying features heavily correlated to mining tasks. By contrast, feature selection obtains low-dimension features for machine learning or data mining more effectively and carries low computational complexity [8][9][10]. However, existing feature selection methods are still time-intensive with respect to calculations, and some data dimensionality reduction problems in engineering remain difficult to solve using these methods.
The road roughness time-domain estimation of vehicle dynamic response is taken as an example in this paper, as a case of feature dimensionality reduction in multivariate time series. Acquiring road roughness information based on vehicle dynamic response is an economical and practical method, but most researchers have only improved the precision of road roughness estimation by using different neural networks or enhanced estimation methods [11][12][13][14]; they have thus far failed to evaluate the types and quantities of dynamic responses to be assessed and the positions of points to be measured in the vehicle. Measurement points can change positions freely in the vehicle, and each position can offer three dynamic responses (vertical vibration acceleration, speed, and displacement); as such, the example has theoretically infinite feature dimensions. In this case, to improve the accuracy and speed of road roughness estimation based on neural networks, key features must be selected. Additionally, the subsequent experiment needs to test and measure only the selected features, which guides the experiment and reduces the workload substantially.
Similarly, in gene-selection research on cancer, the mining of gene expression profile data can identify cancer types. The gene expression profile has many data dimensions but small samples. For instance, the leukemia dataset has 7129 gene dimensions but only 72 samples for analysis, which increases the challenges of feature dimensionality reduction. Various methods exist for gene selection, but they are computationally costly and complex [15][16][17][18]. Many researchers have introduced information to facilitate gene selection, but these methods cannot properly reflect the length of a set of gene features, namely, the number of dimensions after dimensionality reduction. Overall, engineering projects in road roughness time-domain estimation based on vehicle dynamic response and biological gene selection suffer from the curse of dimensionality with an extensive number of features. In addition to their computational cost and complexity, common dimensionality reduction methods (e.g., PCA and cluster analysis) fail to effectively reduce dimensionality (i.e., experimental data) and thus cannot improve estimation accuracy. If researchers use feature selection, uncertainty persists regarding the number of features selected such that the length of the final feature set is unclear. Several variables must be measured to determine which are uncertain.
To solve the problem of excessive dimensions, this paper proposes a new feature selection method based on a sampling survey method and heuristic intelligent optimization algorithm. The general process is as follows: first, we build a feature-scoring system and a reduced-dimension length-scoring system based on the sampling survey method; second, we sort features and reduced-dimension lengths according to their scores and select the features and reduced-dimension lengths that are ranked in the front; third, to apply selection optimization to features and reduced-dimension lengths simultaneously, we redefine the meaning of the information carried by individuals in the heuristic intelligent optimization algorithm population and improve the algorithm accordingly. To better explain and verify the effect of the dimensionality reduction method, this paper conducts feature dimensionality reduction using road roughness time-domain estimation of vehicle dynamic response and gene-selection research in bioengineering. Results show that the new feature dimensionality reduction method proposed can select features quickly and effectively: the accuracy of road roughness time-domain estimation was generally higher than 0.99, the amount of vehicle dynamic response measurement data was effectively reduced, and a reasonable number of genes was selected.

New Feature Dimensionality Reduction Method Based on Sampling Survey Method and Improved Heuristic Intelligent Optimization Algorithm
2.1. Scoring System Based on Sampling Survey Method. The sampling survey method takes part of the population as samples for analysis [19]. In a typical dimensionality problem with a vast number of features, the number of features that will remain after dimensionality reduction is unknown in advance; therefore, using enumeration to analyze all samples requires an intensive workload and may fail. By employing the sampling survey method with a certain sample scale, this paper analyzes the features and their number after dimensionality reduction. All samples must have the same chance of being selected. To obtain useful information for feature dimensionality reduction, we propose a feature-scoring system and a scoring system for reduced-dimension length.
The feature-scoring system operates as follows:
Step 1 Draw $n_1$ sets of features, each with any number of dimensions, taken randomly from the features with equal probability. Calculate the fitness of each sample based on classification recognition accuracy, data-mining efficiency, or neural network estimation accuracy.
Step 2 Set a threshold value $p$ and select all samples with fitness larger than $p$; these samples comprise the set of valid feature scores.
Step 3 Score each feature based on the set of valid feature scores. The score of each feature is composed of two parts, given by Formulas (1) and (2). In Formula (1), $F_{\text{feature score 1}}$ is the first part of the score (Score 1) of a feature, $N_1$ is the number of sets of valid feature scores, $R_i$ is the fitness ranking of the $i$th set of valid feature scores, $F_{\text{feature score 1 min}}$ is the minimum of all Score 1s, and $F_{\text{feature score 1 max}}$ is the maximum of all Score 1s.
In Formula (2), $F_{\text{feature score 2}}$ is the second part of the score (Score 2) of a feature; $f_{\text{score2}\,i}$ is a piecewise function such that $f_{\text{score2}\,i} = 1$ if the $i$th set of valid feature scores contains the feature and $f_{\text{score2}\,i} = 0$ otherwise; $F_{\text{feature score 2 min}}$ is the minimum of all Score 2s; and $F_{\text{feature score 2 max}}$ is the maximum of all Score 2s.

Complexity
The total score of a feature is given by Formula (3), where $F_{\text{feature score}}$ is the total feature score.
According to Formulas (1) and (2), $F_{\text{feature score 1}}$ and $F_{\text{feature score 2}}$ each lie in the range $[0, 1]$. $F_{\text{feature score 1}}$ reflects the degree of contribution of a feature to fitness. $F_{\text{feature score 2}}$ reflects the frequency of a feature in the set of valid feature scores. A feature with a higher $F_{\text{feature score 1}}$ or $F_{\text{feature score 2}}$ can realize a better fitness value.
Step 4 Rank features according to their total scores and choose the first $n_2$ features for the set of valid features.
The scoring system for reduced-dimension length is as follows:
Steps 1 and 2 These steps are the same as in the feature-scoring system.
Step 3 Score each reduced-dimension length according to the set of valid feature scores. The score of each dimension length is given by Formula (4), where $F_{\text{dimension score}}$ is the total reduced-dimension length score, $N_2$ is the number of sets with the same dimension length in the set of valid feature scores, and $f_{\text{fitness}}$ is the fitness of each sample.
Step 4 Rank each dimension length according to $F_{\text{dimension score}}$, select the top-ranked lengths whose cumulative score exceeds 80% of the total score of all dimension lengths, and take these as the reasonable dimension lengths after dimensionality reduction.
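The two scoring systems above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the paper's implementation: Formulas (1)-(4) are not reproduced here, so Score 2 is assumed to be the min-max normalized count of valid sets containing a feature, and the length score is assumed to be the sum of the fitness values of valid sets sharing a length. All function names are hypothetical, and the fitness values are supplied by the caller.

```python
import random
from collections import defaultdict


def sample_feature_sets(n_features, n_samples, rng):
    """Step 1: draw non-empty feature subsets uniformly at random."""
    samples = []
    while len(samples) < n_samples:
        # Each feature is included with probability 0.5, so every subset is equally likely.
        subset = frozenset(j for j in range(n_features) if rng.random() < 0.5)
        if subset:  # discard the empty set and redraw
            samples.append(subset)
    return samples


def valid_samples(samples, fitness, p):
    """Step 2: keep the samples whose fitness exceeds the threshold p."""
    return [s for s in samples if fitness[s] > p]


def min_max(values):
    """Min-max normalize a list of numbers to [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def frequency_scores(valid_sets, n_features):
    """Score 2 (assumed form): normalized count of valid sets containing each feature."""
    counts = [sum(1 for s in valid_sets if j in s) for j in range(n_features)]
    return min_max(counts)


def length_scores(valid_sets, fitness):
    """Length score (assumed form): summed fitness of valid sets sharing a length."""
    scores = defaultdict(float)
    for s in valid_sets:
        scores[len(s)] += fitness[s]
    return dict(scores)


def select_lengths(scores, share=0.8):
    """Step 4: keep top-ranked lengths until their cumulative score share reaches `share`."""
    total = sum(scores.values())
    kept, running = [], 0.0
    for length, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
        kept.append(length)
        running += score
        if running / total >= share:
            break
    return kept
```

In use, `fitness` would map each sampled subset to, for example, the estimation accuracy of a model trained on that subset; the 80% share follows Step 4 of the length-scoring system.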
To verify that a scoring system based on the sampling survey method can be used to effectively select features with the greatest influence on results, this paper uses a calculation example for simple mathematical analysis: the classical problem of Hald cement. Hald cement contains four chemical components: 3CaO•Al2O3 ($x_1$), 3CaO•SiO2 ($x_2$), 4CaO•Al2O3•Fe2O3 ($x_3$), and 2CaO•SiO2 ($x_4$). Table 1 lists data on the relations between the heat $y$ (calories) released by each gram of cement and the content of these four components.
In statistics, variance reflects the degree to which data points differ from one another. Borrowing the idea of calculating the degree of contribution in PCA, this paper takes the proportion of the variance of one variable in the total variance of all variables to represent the degree of influence of that variable:
$$d_i = \frac{\mathrm{Var}(x_i)}{\sum_{j} \mathrm{Var}(x_j)}, \quad (5)$$
where $d_i$ is the degree of influence of the $i$th feature, $\mathrm{Var}(x_i)$ is the variance of the $i$th feature, and $\sum_{j} \mathrm{Var}(x_j)$ is the total variance of all features. Formula (5) can be used to calculate the degrees of influence of the four components of Hald cement ($x_1$, $x_2$, $x_3$, and $x_4$), which are 0.0579, 0.4050, 0.0686, and 0.4686, respectively. $x_2$ and $x_4$ have high degrees of influence, $x_3$ has less influence, and $x_1$ has the least influence. To further reflect the degree of influence of these variables on heat $y$, we employed the radial basis function neural network (RBFNN) and used subsets of the four components of Hald cement as input features to estimate $y$, the heat released by each gram of cement. If a feature is not representative, then the estimation accuracy of the RBFNN will be low. Therefore, we can rank the four features in terms of their degree of influence according to the estimation accuracy after RBFNN machine learning. Taking a single feature as input into the neural network and the coefficient of determination as the evaluation index for the closeness of the predicted value of heat $y$ to the actual value, we obtained coefficients of determination for $x_1$, $x_2$, $x_3$, and $x_4$ of 0.7765, 0.9580, 0.8454, and 0.9495, respectively. The results are similar to those of the variance analysis; therefore, we can conclude that $x_2$ and $x_4$ exert great influences on heat $y$, $x_3$ has less influence, and $x_1$ has the least influence.
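The variance-proportion calculation can be checked with a short script. The sketch below assumes that Table 1 holds the commonly published Hald cement data (13 observations); the values are entered from that standard dataset rather than copied from this paper, and the function name `influence_degrees` is hypothetical.

```python
from statistics import variance

# Component contents from the commonly published Hald cement dataset
# (assumed to match the paper's Table 1, which is not reproduced here).
HALD = {
    "x1": [7, 1, 11, 11, 7, 11, 3, 1, 2, 21, 1, 11, 10],
    "x2": [26, 29, 56, 31, 52, 55, 71, 31, 54, 47, 40, 66, 68],
    "x3": [6, 15, 8, 8, 6, 9, 17, 22, 18, 4, 23, 9, 8],
    "x4": [60, 52, 20, 47, 33, 22, 6, 44, 22, 26, 34, 12, 12],
}


def influence_degrees(columns):
    """Share of each variable's variance in the total variance of all variables."""
    variances = {name: variance(values) for name, values in columns.items()}
    total = sum(variances.values())
    return {name: v / total for name, v in variances.items()}


degrees = influence_degrees(HALD)
for name, d in degrees.items():
    print(f"{name}: {d:.4f}")
# x1: 0.0579
# x2: 0.4050
# x3: 0.0686
# x4: 0.4686
```

The ratios do not depend on whether the sample or population variance is used, because the normalizing constant cancels.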
According to Step 1 of the scoring system based on the sampling survey method proposed in this paper, we must also calculate the fitness of each sample, taking the accuracy of classification recognition, data-mining efficiency, or neural network estimation accuracy as the fitness, to analyze the four features. To better compare their degrees of influence, we used the output approximation degree of the RBFNN as the fitness when applying the scoring system; Table 2 lists the results.
When ranking the four features by degree of influence in descending order, we obtained $x_2$, $x_4$, $x_3$, and $x_1$. The results for dimension length show that when the dimensionality is 2, we can estimate heat $y$ using the RBFNN with zero error. In this case, the two features input into the neural network must contain either $x_2$ or $x_4$. The above analysis confirms that the proposed scoring system based on the sampling survey method can be used to evaluate the degree of feature importance and to select an ideal feature-set length.

Improving the Heuristic Intelligent Optimization Algorithm to Enhance Algorithm Execution Efficiency. This paper uses an artificial fish swarm algorithm (AFSA) and particle swarm optimization (PSO) to explain and verify the new feature dimensionality reduction method [20][21][22][23][24]. To improve the rate of convergence and precision of the algorithm, we refine AFSA and PSO.
AFSA is improved as follows:
(1) Add a memory function to all AFSA behaviors and store the historical optimal positions and corresponding optimal values of individuals and of the entire fish swarm.
(2) Introduce the speed and position calculation formulas of PSO into swarm foraging behavior to update and calculate the movement speed and position of the next-generation fish swarm.
(3) Following the idea of mutation in a genetic algorithm, if the global optimum remains the same over multiple records, disturb multiple fish in the swarm with a certain probability.
The PSO is improved as follows:
(1) To avoid the prematurity phenomenon in PSO, introduce the probability judgment Metropolis rule of the simulated annealing (SA) algorithm (Formula (6)), where $P$ is the probability of accepting a poor solution, $\Delta E$ is the difference in fitness between two generations, and $T$ is the temperature variable in SA, which declines as the number of iterations increases.
(2) Change the update criterion of temperature variable $T$ to Formula (7), where $T_i$ is the value of $T$ in the $i$th iteration, $T_0$ is the initial value of $T$, $T_{\text{end}}$ is the final value of $T$, and $N$ is the total number of iterations.
(3) Taking the nonuniform mutation idea of a genetic algorithm as reference, disturb learning parameter $C$ of flight speed in PSO to intensify the local search ability. Formula (8) calculates learning parameter $C$, where $C_{i+1}$ is the value of $C$ in the $(i+1)$th iteration, $C_i$ is the value of $C$ in the $i$th iteration, and $a_1$ is a constant, with $a_1 = 1$ in this paper.
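The Metropolis rule mentioned in PSO improvement (1) can be illustrated with a short sketch. It uses the textbook form of the rule for a minimization problem, in which a worse candidate is accepted with probability exp(-ΔE/T); the sign convention and exact expression of Formula (6) are assumptions here, and the function names are hypothetical.

```python
import math
import random


def acceptance_probability(delta_e, temperature):
    """Textbook Metropolis probability for minimization (assumed form of Formula (6))."""
    if delta_e <= 0:  # the candidate is no worse: always accept
        return 1.0
    return math.exp(-delta_e / temperature)


def metropolis_accept(delta_e, temperature, rng=random):
    """Decide whether to accept a candidate whose fitness changed by delta_e."""
    return rng.random() < acceptance_probability(delta_e, temperature)


# A worse solution (delta_e = 1) is accepted often at high T and rarely at low T.
for t in (10.0, 1.0, 0.1):
    print(t, acceptance_probability(1.0, t))
```

Because $T$ declines with the iteration count, the swarm can escape local optima early on and behaves almost greedily near the end of the run.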
To theoretically confirm the optimum-searching performance of the improved heuristic intelligent optimization algorithm, this paper analyzes the convergence of the improved PSO and improved AFSA. Solis and Wets [25] studied stochastic optimization algorithms and provided convergence criteria for stochastic optimization. For a stochastic optimization algorithm to converge to the global optimal solution, the following two conditions must be satisfied:
(1) Condition H1. In this condition, $A$ is the feasible solution space, $D$ is the stochastic optimization algorithm, $x_k$ is the result of the $k$th iteration, $x_{k+1} = D(x_k, \xi)$, and $\xi$ is the solution searched by the algorithm in this iteration.
Condition H1 ensures that the fitness $f(x_k)$ in each iteration of the stochastic optimization algorithm is nonincreasing. A stochastic optimization algorithm with convergence ability produces a sequence $\{f(x_k)\}_{k=1}^{\infty}$ that converges to a certain value with a probability of 1. A stochastic optimization algorithm satisfying condition H1 has convergence ability, although the final convergence value is not always a globally optimal solution; it may be only locally optimal.
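For reference, condition H1 is commonly written in the Solis and Wets framework as follows for a minimization problem with objective $f$. This is a restatement in the notation above and is assumed, not confirmed, to agree with the formula typeset in this paper.

```latex
f\bigl(D(x, \xi)\bigr) \le f(x),
\qquad \text{and if } \xi \in A, \text{ then }
f\bigl(D(x, \xi)\bigr) \le f(\xi).
```

Read this way, the condition says that an iteration never returns a point worse than the current one or than the newly searched point, which is why the sequence of fitness values cannot increase.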
(2) Condition H2. For any Borel subset $B$ of $A$ with measure greater than 0, $\prod_{k=0}^{\infty} \bigl(1 - \mu_k(B)\bigr) = 0$, where $\mu_k(B)$ is the probability measure of the algorithm's $k$th iteration result in set $B$.
Condition H2 indicates that if a stochastic optimization algorithm can search and find the globally optimal solution, the algorithm goes through the globally optimal solution at least once as the number of iterations approaches infinity.
In the PSO, each particle updates its speed and position based on Formulas (9) and (10), where $v_{k+1}$ is the speed of a particle in the $(k+1)$th iteration, $c_0$ is the inertia weight factor, $c_1$ and $c_2$ are learning factors, $p_{\text{best}}$ is the historical best position of an individual, $g_{\text{best}}$ is the historical best position of the swarm, and $x_{k+1}$ is the position of a particle in the $(k+1)$th iteration.
The PSO stores the historical optimal position of the swarm, $g_{\text{best}}$, thus ensuring that the optimization result obtained in each step of the algorithm is nonincreasing; hence, the PSO satisfies condition H1.
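The role of $g_{\text{best}}$ in satisfying condition H1 can be seen in a small sketch of a one-dimensional PSO. It assumes the canonical update rule with uniform random factors, which may differ in detail from Formulas (9) and (10); the parameter values, the clamping of positions to the search interval, and the function name `pso_minimize` are illustrative choices.

```python
import random


def pso_minimize(f, lo, hi, n_particles=20, n_iter=30, c0=0.7, c1=1.5, c2=1.5, seed=0):
    """Canonical 1D PSO (assumed form); returns the best position and the
    best fitness recorded after each iteration."""
    rng = random.Random(seed)
    xs = [rng.uniform(lo, hi) for _ in range(n_particles)]
    vs = [0.0] * n_particles
    p_best = xs[:]                # best position seen by each particle
    g_best = min(xs, key=f)       # best position seen by the whole swarm
    history = [f(g_best)]
    for _ in range(n_iter):
        for i in range(n_particles):
            r1, r2 = rng.random(), rng.random()
            vs[i] = (c0 * vs[i]
                     + c1 * r1 * (p_best[i] - xs[i])
                     + c2 * r2 * (g_best - xs[i]))
            xs[i] = min(hi, max(lo, xs[i] + vs[i]))  # keep the particle in [lo, hi]
            if f(xs[i]) < f(p_best[i]):
                p_best[i] = xs[i]
                if f(xs[i]) < f(g_best):
                    g_best = xs[i]
        history.append(f(g_best))
    return g_best, history


best, history = pso_minimize(lambda x: (x - 0.5) ** 2, -1.0, 2.0)
print(best, history[-1])
```

Since `g_best` is replaced only by a strictly better position, the recorded best fitness can never increase from one iteration to the next.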
Then, we judge the convergence of particle speed and position according to convergence in probability. With an inertia weight $c_0$ and the proposed learning factor calculation formula, when the number of iterations approaches infinity, we have $\lim_{k \to \infty} P(|v_k - 0| < \varepsilon) = 1$, where $\varepsilon$ is an arbitrarily small positive value. Suppose $x^*$ is a point of convergence in the PSO; then, we have $\lim_{k \to \infty} P(|x_k - x^*| < \varepsilon) = 1$. If the number of iterations is sufficiently large, then the particle's position and update speed will converge to $x^*$ and 0, respectively, with a probability of 1.
Additionally, we analyze and confirm the convergence of PSO using the difference equation [26]. Although $v_k$ and $x_k$ are multidimensional variables, the dimensions are mutually independent; therefore, we simplify the algorithm analysis to the 1D case. To simplify the calculation, we suppose that $p_{\text{best}}$, the position of the optimal solution of a particle, and $g_{\text{best}}$, the position of the optimal solution of the whole swarm, are constant.
Then, according to Formulas (9) and (10), we obtain Formulas (11) and (12). Substituting Formulas (9)-(11) into Formula (12) yields Formula (13), a second-order constant-coefficient nonhomogeneous difference equation, which this paper solves using the characteristic equation method. The characteristic equation of Formula (13) is a quadratic equation in one unknown, so its solution can be divided into three cases according to the discriminant: $\Delta = 0$, $\Delta > 0$, and $\Delta < 0$. In the case $\Delta = 0$, the solution is expressed in terms of $x_0$, the initial value of the algorithm iteration, and $v_0$, the initial speed of the iteration. The case $\Delta > 0$ gives a solution of a second form, and the solution in the case $\Delta < 0$ is the same as the solution in the case of $\Delta > 0$.
We propose an improved PSO that modifies learning factors $c_1$ and $c_2$. As the number of iterations increases, $c_1$ and $c_2$ approach 0. Therefore, the improved PSO satisfies the above convergence conditions provided that the inertia factor $c_0$ has a proper value.
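The difference-equation argument can be checked numerically. The sketch below assumes that Formulas (9) and (10) reduce, in the 1D case with constant $p_{\text{best}}$ and $g_{\text{best}}$, to $v_{k+1} = c_0 v_k + c_1 (p_{\text{best}} - x_k) + c_2 (g_{\text{best}} - x_k)$ and $x_{k+1} = x_k + v_{k+1}$, with any random factors absorbed into $c_1$ and $c_2$. Under that assumption, eliminating $v$ gives $x_{k+2} - (1 + c_0 - c_1 - c_2) x_{k+1} + c_0 x_k = c_1 p_{\text{best}} + c_2 g_{\text{best}}$, whose characteristic equation is $\lambda^2 - (1 + c_0 - c_1 - c_2)\lambda + c_0 = 0$. The function names are hypothetical.

```python
import cmath


def characteristic_roots(c0, c1, c2):
    """Roots of lambda^2 - (1 + c0 - c1 - c2) * lambda + c0 = 0."""
    b = -(1 + c0 - c1 - c2)
    root = cmath.sqrt(b * b - 4 * c0)  # complex sqrt also covers a negative discriminant
    return (-b + root) / 2, (-b - root) / 2


def fixed_point(c1, c2, p_best, g_best):
    """Equilibrium of the recurrence: the weighted mean of p_best and g_best."""
    return (c1 * p_best + c2 * g_best) / (c1 + c2)


def iterate(c0, c1, c2, p_best, g_best, x0, v0, n_steps):
    """Run the assumed deterministic PSO update for n_steps."""
    x, v = x0, v0
    for _ in range(n_steps):
        v = c0 * v + c1 * (p_best - x) + c2 * (g_best - x)
        x = x + v
    return x, v


roots = characteristic_roots(0.5, 0.5, 0.5)
x, v = iterate(0.5, 0.5, 0.5, p_best=1.0, g_best=3.0, x0=-4.0, v0=2.0, n_steps=200)
print([abs(r) for r in roots], x, v)
```

When both roots lie strictly inside the unit circle, the position tends to the fixed point and the speed tends to 0, which is the behavior the convergence analysis describes.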
We prove the convergence of the proposed improved PSO using the three methods above. However, if the algorithm is required to converge to the global optimal solution, it must satisfy condition H2. In this case, the algorithm goes through the global optimal solution at least once when its number of iterations approaches infinity; that is, the particle in the PSO should demonstrate ergodicity. Standard PSO cannot guarantee identification of the global optimal value, resulting in the aforementioned prematurity phenomenon. We introduce SA into the evolutionary criteria of PSO because SA has been theoretically proven to converge to the global optimal solution with a probability of 1 under certain conditions [27].
Next, this paper modifies the learning factor so that the algorithm has a fast flying speed in early iterations and thus covers a large scope of the search space when looking for the global optimal solution, while the search scope declines as the search proceeds.
According to the derivation of reference [28], the standard AFSA is convergent. The proposed improvements introduce the speed and position calculation formulas of PSO into the foraging behaviors of the fish swarm. This way, in addition to clustering and tailgating behaviors, the entire fish swarm can share information about the historical optimal position of the swarm and the optimal positions of individuals during foraging behavior. Moreover, with the first improvement to AFSA (i.e., storing the historical optimal positions of individuals and the whole fish swarm and corresponding optimal values) and the convergence analysis of PSO, we can ensure that the improved AFSA satisfies the convergence judging condition H1 of the stochastic optimization algorithm.
Because we use the speed and position update formulas of PSO in the fish swarm's foraging behaviors, and the fish swarm may also exhibit clustering and tailgating behaviors, all of which move in the direction of higher fitness, we suppose that $x^*$ is the convergence point of AFSA. Then, we have $\lim_{k \to \infty} P(|x_k - x^*| < \varepsilon) = 1$, where $\varepsilon$ is an arbitrarily small positive value; therefore, the AFSA possesses convergence.
Similarly, to satisfy condition H2, the position of the fish swarm should have ergodicity in AFSA. Like PSO, the standard AFSA also suffers from the prematurity phenomenon. If a super individual appears in the early iterations of the algorithm, then the whole fish swarm will converge quickly to that value. Therefore, we propose a third AFSA improvement: if the historical optimal solution does not change over multiple iterations, then the algorithm randomly changes some individuals in the fish swarm. When the number of iterations approaches infinity, the fish swarm position will take all possible values in the definition domain with a theoretical probability of 1. Essentially, the improved AFSA is globally convergent with a certain probability.
To verify the computational complexity of the improved PSO and improved AFSA, we choose a test function from [29], where $x$ is an independent variable within the domain $[-1, 2]$.
For analysis, we used a Lenovo computer with an Intel(R) Core(TM) i5-4590 processor, CPU at 3.30 GHz, and 8 GB memory.The PSO and AFSA used 20 particles over 30 iterations. Figure 1 presents the algorithm test results.
The standard AFSA and standard PSO were trapped in the area of the local optimal value in early iterations and slowly identified the global optimal value as the number of iterations increased, whereas the improved AFSA and improved PSO quickly identified the area of global optimal value in early iterations and finally converged to the global optimal value.The improved PSO and improved AFSA spent only 0.0904 s and 0.0719 s, respectively, converging to the global optimal value, whereas the standard PSO and standard AFSA took 0.0990 s and 0.5321 s, respectively.

Improving the Heuristic Intelligent Optimization Algorithm Based on the Need for Feature Dimensionality Reduction. To apply the heuristic intelligent optimization algorithm to optimal feature selection, we improved the algorithm, taking classification recognition accuracy, data-mining efficiency, or neural network estimation precision as the target value of algorithm optimization. We redefined the individuals of each population in the intelligent optimization algorithm (i.e., redefined information carried by either the fish swarm in AFSA or each particle in PSO). By imitating the transcription and translation process of genetic information in biology, information carried by individuals is defined as shown in Figure 2.
Here, $s_1$ is the first information of an individual, within a value range of $[0, 1]$, and $M$ is the total number of elements in the valid feature set. Table 3 lists reasonable dimension lengths after dimensionality reduction.
Table 3: Selected intervals of reasonable dimension length after dimensionality reduction.
In this case, $L_1$, $L_2$, and $L_n$ represent the first, second, and $n$th reasonable dimension lengths after dimensionality reduction (i.e., the number of reasonable features after dimensionality reduction), respectively. The interval width $c_1 - b_1$ is equal to the proportion of the total score of $L_1$ out of the total score of all reasonable dimension lengths. Similarly, $c_n - b_n$ is equal to the proportion of the total score of $L_n$ out of the total score of all reasonable dimension lengths. The following is the calculation formula of $c_i - b_i$:
$$c_i - b_i = \frac{F_{\text{dimension score}}(L_i)}{\sum_{j=1}^{n} F_{\text{dimension score}}(L_j)},$$
where $c_i$ and $b_i$ are the upper and lower limits, respectively, of the interval for the $i$th reasonable dimension length $L_i$; $F_{\text{dimension score}}(L_i)$ is the total score of $L_i$; and $n$ is the total number of reasonable dimension lengths.
If the value of $s_1$ is within the interval of the $i$th line in Table 3, then the corresponding reasonable dimension length $L_i$ is selected. Moreover, the heuristic intelligent algorithm will extract information from $s_2$ to $s_{L_i + 1}$ according to the value of $L_i$, and the information will be used to calculate the target value of the algorithm.
According to continuous iterative calculation of the heuristic intelligent optimization algorithm, the information carried by each individual will evolve in the direction of optimal target value to obtain the most reasonable dimension length after dimensionality reduction and feature selection.
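The decoding of an individual described above can be sketched as follows. The sketch assumes that the intervals tile $[0, 1]$ contiguously starting from 0, in the order in which the reasonable lengths are listed, with each width equal to the score share of its length; the layout of the actual Table 3 may differ, and the function names are hypothetical.

```python
from itertools import accumulate


def length_intervals(length_scores):
    """Build (L_i, b_i, c_i) triples with c_i - b_i equal to the score share of L_i."""
    total = sum(length_scores.values())
    lengths = list(length_scores)
    uppers = list(accumulate(length_scores[L] / total for L in lengths))
    lowers = [0.0] + uppers[:-1]
    return list(zip(lengths, lowers, uppers))


def decode(individual, intervals):
    """Use s1 to pick a dimension length L_i, then return s2 .. s_{L_i + 1}."""
    s1 = individual[0]
    for length, lower, upper in intervals:
        if lower <= s1 < upper:
            return individual[1:1 + length]
    # s1 at (or rounding past) the top of the range falls in the last interval
    return individual[1:1 + intervals[-1][0]]


intervals = length_intervals({4: 6.0, 5: 3.0, 3: 1.0})
print(decode([0.7, 3, 8, 1, 12, 5, 9], intervals))
```

With scores of 6, 3, and 1, the lengths 4, 5, and 3 receive interval widths of 0.6, 0.3, and 0.1, so $s_1 = 0.7$ selects length 5 and the first five feature entries after $s_1$ are used to compute the target value.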

Road Roughness Time-Domain Estimation Based on Proposed Dimensionality Reduction Method
Researchers generally determine road roughness by contact or noncontact measurement; however, this approach returns measurement results slowly and requires expensive equipment. A method combining dynamic response and a neural network can estimate road roughness quickly. Yet little research exists on the types and quantity of dynamic responses and the points to be measured in the vehicle. Because measurement points can change positions freely in the vehicle, and each position can offer three dynamic responses (i.e., vertical vibration acceleration, speed, and displacement), this example theoretically has infinite feature dimensions.

Building a Vehicle Vibration Model with Freely Changing Measurement Points in the Vehicle. Assume the vehicle is symmetric along the longitudinal axis, and the left and right wheels are under the same road conditions while driving. Given these assumptions, we simplify the vehicle model into a half-vehicle model and the vehicle body into a rigid rod. The vibration model in this paper considers only the following degrees of freedom: the vehicle body's vertical vibration, pitch-angle vibration, vertical vibration of the seat system, vertical vibration of the front wheel, and vertical vibration of the rear wheel (Figure 3). We take the points in the vehicle whose distance to the center of mass is $e$ as the dynamic response measurement points and build the vibration equation based on Lagrange's second equation.
In Figure 3, $Z_1$, $Z_2$, $Z_3$, $Z_c$, and $\alpha$ denote the vertical vibration displacements (m) of the front wheel, rear wheel, seat system, and center of mass of the vehicle body and the pitch-angle displacement of the vehicle body (rad), respectively; $m_1$, $m_2$, $m_3$, $m_c$, and $J$ are the masses of the front wheel, rear wheel, seat system, and vehicle body (kg) and the rotational inertia of the vehicle body about the horizontal axis through its center of mass (kg·m²), respectively; $a$, $b$, $l$, and $e$ represent the distances from the front axle, rear axle, seat system, and measurement point to the center of mass (m), respectively; $C_1$, $C_2$, $C_3$, $C_f$, and $C_r$ are the damping of the front suspension, rear suspension, seat system, front wheel, and rear wheel (N·s/m), respectively; $K_1$, $K_2$, $K_3$, $K_f$, and $K_r$ are the stiffness of the front suspension, rear suspension, seat system, front wheel, and rear wheel (kN/m), respectively; and $Q_f$ and $Q_r$ are the vertical displacements of road roughness at the front wheel and rear wheel (m), respectively.
The following is Lagrange's second equation:
$$\frac{d}{dt}\left(\frac{\partial L}{\partial \dot{q}_i}\right) - \frac{\partial L}{\partial q_i} + \frac{\partial U}{\partial \dot{q}_i} = Q_i, \quad (18)$$
where $q_i$ denotes the generalized coordinates; $L = T - V$ represents the difference between the kinetic energy $T$ and the potential energy $V$; $U$ is the dissipated energy, referring to the energy consumed by the damping elements; and $Q_i$ is the generalized force.
Because the pitch-angle vibration is small, the paper takes $\sin \alpha \approx \alpha$. We denote the vertical vibration displacement of the measurement point by $Z_D$ and the distances from the measurement point to the front axle and the rear axle by $l_1$ and $l_2$, respectively.

The kinetic energy, potential energy, and dissipated energy of the system are given by Equations (19), (20), and (21), respectively. Substituting Equations (19)-(21) into Equation (18), we obtain the vehicle vibration differential equation in the example. Figure 4 illustrates the simulation model built in Matlab/Simulink.
According to the vibration simulation model, we can obtain dynamic response data conveniently.The vibration model uses the filtered white noise method to produce the road roughness signal, meeting experiment requirements, and considers a level-B road as the road to be estimated.

Research on Feature Dimensionality Reduction in Road Roughness Estimation Using the Proposed Method. This example uses the improved AFSA for optimal feature selection. We number the vertical acceleration, vertical speed, and vertical displacement at the measurement point, the center of the front wheel, the center of the rear wheel, and the center of the driver's seat as shown in Table 4.
Numbers 2, 7, and 12 are the pitch angular acceleration, pitch angular speed, and pitch angle at the measurement point, respectively.
The example uses the improved AFSA and assumes the information carried by the $i$th fish to be $X_i = (x_{i1}, x_{i2}, \ldots, x_{i17})$, where $x_{i1}$ is the value of the first information, within the range of $[0, 1]$; $x_{i2}, \ldots, x_{i16}$ are positive integers within $[1, 15]$, the values of which represent the corresponding dynamic response numbers; and $x_{i17}$ is the distance from the measurement point to the center of mass, $e$, with $e \in [-1.25, 1.51]$ in the example.
In the iteration process, the improved AFSA finds the corresponding reasonable dimension length $L_i$ from Table 3 according to the value of $x_{i1}$ and selects the first $L_i$ elements from $x_{i2}, \ldots, x_{i16}$ accordingly.
The example uses the RBFNN for time-domain estimation of road roughness. We used the improved AFSA and conducted 12 total optimization experiments; results appear in Table 5.
In Table 5, $R^2$ is the coefficient of determination. The feature dimensionality reduction method proposed can choose features with great influence on the precision of road roughness estimation. The sets of features selected in Experiments 3 and 4 were applied in Experiments 7 and 9; thus, we used the features selected in Experiments 3 and 4 for time-domain estimation of road roughness. Figures 5 and 6 display the estimation results.
The actual value was close to the estimated value.The coefficient of determination exceeded 0.99 in both tests, indicating that the proposed feature dimensionality reduction method can select a feature set with high contributions to road roughness using only four sets of features; thus, we can accurately estimate road roughness using only four dynamic response values.To further confirm this result, we used the random forest method to estimate road roughness based on the results of Experiments 3 and 4. Estimation results are presented in Figures 7 and 8.
The road roughness estimation accuracies obtained with the random forest method and the feature sets from Experiments 3 and 4 were 0.9489 and 0.9708, respectively. Although the estimation accuracy of the random forest method was slightly lower than that of the RBFNN, it was still quite high, suggesting that the set of dynamic response features selected with the proposed method also works well with other estimation methods. When the proposed feature selection method was combined with a prediction method, only a few dynamic response parameters were needed to estimate road roughness accurately.

Gene-Selection Research in Bioengineering Using the Proposed Dimensionality Reduction Method
The experimental subjects in the example consisted of four open microarray datasets on leukemia, the colon, SRBCT, and brain cancer. Leukemia [30] contained 72 samples, each with 7129 genes; 25 samples were acute myeloid leukemia (AML) and 47 were acute lymphoblastic leukemia. The colon dataset [31] contained 62 samples (40 of tumor tissue and 22 of normal tissue), each with 2000 genes. SRBCT [32] included 83 samples, each with 2308 genes: 29 samples of the Ewing family of tumors, 18 samples of neuroblastoma, 25 samples of rhabdomyosarcoma, and 11 samples of Burkitt lymphomas.
Brain cancer [33] contained 60 samples, each with 7219 genes: 46 samples from patients with classic brain cancer and 14 samples with desmoplastic brain cancer. Although brain cancer had only two classes, the dataset has proven difficult to classify in current research, with classification accuracy lower than 85% in much of the literature. Lung cancer [34] included 203 samples, each with 3312 genes. Samples were classified into five subtypes: 139 adenocarcinoma, 20 pulmonary carcinoids, 21 squamous-cell lung carcinomas, six small-cell lung carcinomas, and 17

Because each type of tumor disease is related to many genes, we first roughly extracted genes using the information index to classification (IIC) [34]. This example uses the improved PSO and assumes information carried by the $i$th particle to be $Y_i$.

To demonstrate the advantage of the proposed method in sample classification accuracy, we chose five methods with high gene-selection accuracy to compare with the proposed method: BPSO-ELM (where BPSO denotes binary particle swarm optimization), K-means-PSO-ELM, K-means-BPSO-ELM, BPSO-GCSI-ELM, and K-means-GCSI-MBPSO-ELM. Results of 100 independent repeated experiments were taken from [16]; Table 6 shows the comparison results.
BPSO is a PSO algorithm that solves discrete optimization problems using binary-encoded particles [35]. BPSO-ELM optimally selects disease-related genes using BPSO and evaluates the classification results of the genes selected each time using ELM [36]. K-means-PSO-ELM and K-means-BPSO-ELM each first complete a clustering analysis on all disease-related genes using K-means, followed by optimal selection with standard PSO and BPSO, respectively, combined with the ELM classifier [37]. GCSI refers to gene-to-class sensitivity information; the BPSO-GCSI-ELM method generates an initial particle swarm using the GCSI of each gene and uses it to influence the updating and evolutionary process of particles [38]. Combining the above methods, K-means-GCSI-MBPSO-ELM first completes K-means clustering analysis on all disease-related genes and then introduces the GCSI into the modified BPSO to optimally select genes in combination with the ELM classifier [39].
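To make the binary encoding concrete, the sketch below shows one update step of the commonly cited sigmoid-based BPSO formulation, in which each bit marks whether a gene is selected. It is an assumption that the compared methods follow this formulation; the inertia weight `w`, the coefficients `c1` and `c2`, the velocity limit `v_max`, and the function name `bpso_step` are illustrative choices, not values from [35] or from this paper.

```python
import math
import random
from typing import List, Sequence, Tuple


def bpso_step(
    position: Sequence[int],
    velocity: Sequence[float],
    personal_best: Sequence[int],
    global_best: Sequence[int],
    rng: random.Random,
    w: float = 0.7,
    c1: float = 1.5,
    c2: float = 1.5,
    v_max: float = 4.0,
) -> Tuple[List[int], List[float]]:
    """One sigmoid-based binary PSO update for a single particle."""
    new_position: List[int] = []
    new_velocity: List[float] = []
    for x, v, p, g in zip(position, velocity, personal_best, global_best):
        # Real-valued velocity update, as in continuous PSO.
        v = w * v + c1 * rng.random() * (p - x) + c2 * rng.random() * (g - x)
        # Clamp the velocity so the sigmoid does not saturate at 0 or 1.
        v = max(-v_max, min(v_max, v))
        # The sigmoid of the velocity is the probability that the bit becomes 1.
        prob_one = 1.0 / (1.0 + math.exp(-v))
        new_velocity.append(v)
        new_position.append(1 if rng.random() < prob_one else 0)
    return new_position, new_velocity


# Hypothetical 4-gene example: bit j == 1 means gene j is selected.
rng = random.Random(0)
pos, vel = bpso_step([0, 1, 0, 1], [0.0] * 4, [1, 1, 0, 0], [1, 0, 0, 1], rng)
print(pos, vel)
```

In a gene-selection setting, the fitness of each bit string would be the accuracy of a classifier trained on the genes whose bits are 1.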
Because other gene extraction methods already perform well on leukemia and SRBCT, this paper does not compare the results for these two types of diseases. Tables 6 and 7 indicate that we can select genes with greater influences on tumor subtype classification using the proposed feature dimensionality reduction method. Gene sets selected with the proposed method exhibited high classification accuracy, substantially higher than that of the other methods. The standard deviation of classification accuracy of the proposed method was smaller than that of the other methods, except for brain cancer. For the four diseases, the average accuracy of the proposed method was higher than the highest average accuracy of the other methods by 1.25%, 3.84%, 1.57%, and 6.63%, respectively (from left to right in Table 6). For brain cancer and lymphoma, the average classification accuracy improved to approximately 93%; for colon and lung cancer, the average classification accuracy reached approximately 99%. The classification accuracy was fairly high, greatly improving the practicability of gene-based disease detection.
To compare the ELM classifier with other classifiers based on disease classification results, we selected the random forest classifier and the support vector machine (SVM) classifier to identify tumor diseases based on the gene sets in Table 7. We used the average and standard deviation of 5-fold cross-validation (CV) accuracy from 100 tests to evaluate the classification results of the different classifiers. Table 8 shows the results of the random forest classifier and SVM classifier.
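The evaluation protocol described above (mean and standard deviation of 5-fold CV accuracy over repeated tests) can be sketched as follows. Several points are assumptions for illustration: folds are drawn without stratification, accuracy is pooled over the five folds of each test, and a toy nearest-centroid classifier stands in for the ELM, random forest, and SVM classifiers used in the paper. All names and data are hypothetical.

```python
import random
import statistics
from typing import Callable, List, Sequence, Tuple

Rows = List[List[float]]
Predictor = Callable[[Rows, List[int], Rows], List[int]]


def nearest_centroid(train_x: Rows, train_y: List[int], test_x: Rows) -> List[int]:
    """Toy stand-in classifier: assign each test row to the closest class mean."""
    sums, counts = {}, {}
    for row, label in zip(train_x, train_y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for j, value in enumerate(row):
            acc[j] += value
        counts[label] = counts.get(label, 0) + 1
    centroids = {lab: [s / counts[lab] for s in acc] for lab, acc in sums.items()}

    def dist2(a: Sequence[float], b: Sequence[float]) -> float:
        return sum((u - v) ** 2 for u, v in zip(a, b))

    return [min(centroids, key=lambda lab: dist2(row, centroids[lab])) for row in test_x]


def k_fold_accuracy(x: Rows, y: List[int], predict: Predictor, k: int,
                    rng: random.Random) -> float:
    """Accuracy of one k-fold CV run, pooled over all held-out samples."""
    indices = list(range(len(x)))
    rng.shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k disjoint, near-equal folds
    correct = 0
    for fold in folds:
        held_out = set(fold)
        train = [i for i in indices if i not in held_out]
        preds = predict([x[i] for i in train], [y[i] for i in train],
                        [x[i] for i in fold])
        correct += sum(p == y[i] for p, i in zip(preds, fold))
    return correct / len(x)


def repeated_cv(x: Rows, y: List[int], predict: Predictor, k: int = 5,
                repeats: int = 100, seed: int = 0) -> Tuple[float, float]:
    """Mean and sample standard deviation of k-fold CV accuracy over `repeats` runs."""
    rng = random.Random(seed)
    scores = [k_fold_accuracy(x, y, predict, k, rng) for _ in range(repeats)]
    return statistics.mean(scores), statistics.stdev(scores)


# Two well-separated synthetic classes (not microarray data).
features = [[0.1 * i, 0.0] for i in range(10)] + [[5.0 + 0.1 * i, 5.0] for i in range(10)]
labels = [0] * 10 + [1] * 10
print(repeated_cv(features, labels, nearest_centroid))
```

Reshuffling the folds in every repeat is what produces a spread of accuracies, so the standard deviation reflects sensitivity to the data split rather than to the classifier alone.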
In terms of tumor disease recognition accuracy based on the selected gene sets using the proposed method, the random forest classifier and SVM classifier were generally poorer than the ELM classifier.For lung cancer, however, the SVM classifier's average accuracy was slightly higher than that of the ELM classifier and its standard deviation was smaller.

Conclusions
(1) This paper proposes a new feature dimensionality reduction method based on a sampling survey method and a heuristic intelligent optimization algorithm. First, we build a feature-scoring system and a reduced-dimension length-scoring system based on the sampling survey method and select the top-ranked features and reasonable dimension lengths. Then, we select features using the improved heuristic intelligent optimization algorithm. Our method has only two steps, with simple operation and low computational complexity.
(2) We successfully apply the proposed feature dimensionality reduction method to road roughness time-domain estimation research. Results show that as few as four dynamic responses are needed to estimate road roughness accurately. The experimental results of 12 feature dimensionality reductions reveal the mean accuracy of road roughness time-domain estimation to be 0.9897.
(3) We successfully apply the proposed feature dimensionality reduction method to gene-selection research in bioengineering.Results show that the gene sets selected with the proposed method have high classification accuracy with mean classification accuracy substantially higher than that of other gene-selection methods.

Figure 3: Vehicle vibration model with five degrees of freedom.

Figure 5: Road roughness estimation results with features in Experiment 3.

Figure 7: Road roughness estimation results based on the random forest method and features in Experiment 3.

Figure 8: Road roughness estimation results based on the random forest method and features in Experiment 4.

Table 1: Data on relations between content x and heat y of four components of Hald cement.

Table 2: Scores of four components of Hald cement.

Table 4: Numbers of dynamic responses.

Table 5: Optimization selection results of dynamic response features.

Table 6: Comparisons between proposed feature dimensionality reduction method and six other methods.

Table 7: Gene sets selected using the proposed feature dimensionality reduction method with classification accuracy.