A New Method for Spectral Wavelength Selection Based on Multiple Linear Regression Combined with Ant Colony Optimization and Genetic Algorithm

Wavelength selection is one of the key steps in quantitative spectral analysis: it reduces computation time while also improving the prediction accuracy of the model. In this paper, we propose a wavelength selection algorithm based on ant colony optimization (ACO), in which the absolute value of the regression coefficient of the multiple linear regression (MLR) model is used as the basis for evaluating the importance of wavelengths, and the absolute values of the regression coefficients after full-wavelength MLR modeling are used as the initial pheromone values of the ant colony optimization (MLR-ACO). In each iteration, the absolute value of the regression coefficient corresponding to each wavelength of the individual with the highest fitness value is used as the basis for the pheromone update. The crossover operator is then introduced into MLR-ACO (MLR-ACO-GA), and the individuals with the top 100 fitness values in MLR-ACO are used as the initial population of the genetic algorithm (GA). The selection frequency of each wavelength is calculated among the MLR-ACO individuals whose fitness exceeds the threshold. A number of coarse interval points are generated according to the selection frequency, and a coarse crossover operation is performed at the coarse interval points. Fine crossover points are randomly generated within each coarse interval, and fine crossover operations are performed within the coarse interval to exploit, as far as possible, the potential of combining excellent MLR-ACO individuals with each other. MLR-ACO addresses the scarcity of initial pheromone in traditional ACO, and MLR-ACO-GA can, to a certain extent, keep MLR-ACO from falling into a local optimum and is more flexible in the number of wavelengths selected, which gives full play to the advantages of MLR-ACO.


Introduction
Spectroscopy is widely used in the fields of agriculture [1,2], medicine [3,4], environment [5,6], and food detection [7,8] because it is fast, low-cost, and nonpolluting. With the advancement of modern spectroscopic instruments, the obtained spectral data contain tens to thousands of wavelengths and can reflect the subtle spectral differences of different constituents in the measured substances. However, the obtained data contain a large number of uncorrelated or redundant features with high collinearity, and these features usually reduce the prediction accuracy of the model and worsen the experimental results [9,10]. To solve this problem, many wavelength selection methods have been proposed, and many papers have demonstrated experimentally or theoretically that wavelength selection can lead to better prediction performance and significant savings in computation time. Wavelength selection is therefore an essential step before performing quantitative analysis [11][12][13][14].
In recent years, more and more swarm intelligence optimization methods have been proposed and widely used in wavelength selection. In addition to those mentioned earlier, there are gray wolf optimization (GWO) [27], monarch butterfly optimization (MBO) [28], the slime mould algorithm (SMA) [29], hunger games search (HGS) [30], Harris hawks optimization (HHO) [31], the artificial algae algorithm (AAA) [32], etc. All of these methods have achieved good results in the selection of feature wavelengths. Among these optimization algorithms, ACO has been widely studied because of its positive feedback mechanism, fast convergence speed, and high accuracy [18]. ACO is highly efficient in solving complicated problems, but the traditional ACO also has many defects, such as a lack of initial pheromone and a tendency to fall into locally optimal solutions.
To solve problems such as the lack of an initial pheromone, Tong proposed using the variable importance in projection (VIP) coefficients under the full-wavelength partial least squares regression (PLSR) model as the initial pheromone of the ACO algorithm, giving the PLS-VIP-ACO wavelength selection method [33]. Based on Tong's research, Xiaoming et al. proposed an elite ACO based on the validity of variables, combining forward selection to pick feature wavelengths with an elite ant colony search [34]. Liu et al. proposed an improved adaptive-update ACO to improve the convergence and global search capability of the traditional ACO [19]. However, it is worth noting that in these studies, the predictive performance of the PLSR model is used as the criterion for evaluating the selected subset of variables. In each ACO iteration, the number of latent variables in PLSR must be set manually or restricted to a certain range. On the one hand, this cannot make the PLSR model optimal; on the other hand, finding the latent variables takes a lot of time, and the additional calculation of the VIP coefficients increases the complexity of the algorithm. To address these problems, this paper combines the multiple linear regression (MLR) method with the ACO: an MLR model is established on the full-wavelength data, and the absolute value of its regression coefficients is used both as the criterion for evaluating the importance of each wavelength and as the initial pheromone value of the ACO, which solves the lack of an initial pheromone. To mitigate the tendency of the traditional ACO to fall into a local optimum, the crossover operator of the genetic algorithm is introduced into MLR-ACO. The ten-fold root mean square error of cross-validation (RMSECV) of the MLR model on the full-wavelength data is calculated and used as the threshold value.
We collect the ACO individuals whose fitness exceeds the threshold value and calculate how frequently each wavelength is selected among them. Several coarse interval points are generated according to the characteristics of the selection frequencies, a coarse crossover operation is performed at the coarse interval points, fine crossover points are randomly generated within each coarse interval, and a fine crossover operation is performed at the fine crossover points. The coarse crossover discovers the benefit of combining coarse intervals taken from different excellent wavelength combinations, and the fine crossover explores whether recombination within the same interval can produce better subsets, further exploiting the advantages of MLR-ACO.

Multiple Linear Regression Model.
MLR is a common calibration method in quantitative spectroscopy, which focuses on the correlation between an attribute of interest and each wavelength [35][36][37]. The basic form is

$$y = w_1 x_1 + w_2 x_2 + \cdots + w_n x_n + e. \quad (1)$$

It is generally written in vector form as

$$Y = XW + e, \quad (2)$$

where $y$ denotes the attribute value of interest, $x_m$ denotes the reflectance at the $m$-th wavelength, $w_m$ denotes the corresponding regression coefficient, $e$ is the error, which follows a normal distribution with a mean of zero, and $n$ denotes the number of wavelengths. The regression coefficient vector $W = (w_1, w_2, \ldots, w_n)$ is estimated using the least squares method, and its estimate is denoted $W^* = (w_1^*, w_2^*, \ldots, w_n^*)$:

$$W^* = (X^T X)^{-1} X^T Y. \quad (3)$$

Using the regression vector $W^*$ to predict $Y$, the predicted value $Y^*$ is calculated as

$$Y^* = X W^*. \quad (4)$$

From equation (4), $X$ is fixed and the value of $Y^*$ is determined by $W^*$. The larger the absolute value of $w_i^*$, the greater its influence on $y$, so $|w_i^*|$ reflects the contribution of wavelength $i$ to $y$: the larger $|w_i^*|$ is, the more important the $i$-th wavelength is. Therefore, the absolute values of the regression coefficients obtained by building an MLR model on the full-wavelength data are used as the initial pheromone values of the ACO algorithm.
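The initialization described above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function name `initial_pheromone` and the toy data are hypothetical, no intercept is fitted (matching the form of the model given in the text), and NumPy's `lstsq` is used in place of the explicit normal-equation inverse because it also copes with collinear wavelengths.

```python
import numpy as np

def initial_pheromone(X, y):
    """Return |w*| from a full-wavelength least-squares fit (no intercept)."""
    # lstsq minimizes ||X w - y||; for full-rank X this equals (X^T X)^-1 X^T y.
    w_star, *_ = np.linalg.lstsq(X, y, rcond=None)
    return np.abs(w_star)

# Toy data (assumption): 50 samples, 5 "wavelengths"; only columns 1 and 3 matter.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 5))
y = 2.0 * X[:, 1] - 3.0 * X[:, 3] + 0.01 * rng.normal(size=50)

tau0 = initial_pheromone(X, y)  # one non-negative pheromone value per wavelength
```

On this toy data the two informative columns receive by far the largest initial pheromone, which is the behaviour the method relies on.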

MLR-ACO.
Inspired by the traditional ACO, MLR-ACO combines the regression coefficients of the MLR algorithm with the ACO. The main steps are as follows: (10) C: the contribution matrix, which stores the combinations of feature wavelengths selected by ants whose fitness value is greater than the threshold value.

Ant Chooses the Path.
Each ant randomly selects a wavelength as the start point of its path and stores it in the HAS matrix. The wavelength is then removed from the HAVE matrix, and its pheromone value is removed from the IP matrix. The roulette algorithm is used to select the next feature wavelength until the number of selected feature wavelengths reaches V-MAX. The probability of each wavelength being selected is

$$P_i = \frac{\tau_i^{\mathrm{have}}}{\sum_{j \in \mathrm{HAVE}} \tau_j^{\mathrm{have}}}, \quad (5)$$

where $P_i$ is the probability that wavelength $i$ in the HAVE matrix is selected and $\tau_i^{\mathrm{have}}$ is the pheromone concentration at wavelength $i$ in the HAVE matrix.
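The path construction described above can be sketched as follows. This is an illustrative sketch under assumptions: the function name `choose_path` and the example pheromone values are hypothetical, the lists `have` and `has` stand in for the HAVE and HAS matrices, and pheromone values are assumed to be strictly positive so that the roulette weights are valid.

```python
import random

def choose_path(pheromone, v_max, rng=random):
    """Pick v_max distinct wavelength indices: random start, then roulette."""
    have = list(range(len(pheromone)))           # wavelengths still available
    has = [have.pop(rng.randrange(len(have)))]   # random start point
    while len(has) < v_max and have:
        # Selection probability is tau_i divided by the sum of tau over HAVE.
        weights = [pheromone[i] for i in have]
        pick = rng.choices(range(len(have)), weights=weights, k=1)[0]
        has.append(have.pop(pick))               # move from HAVE to HAS
    return has

# Hypothetical pheromone values for five wavelengths; seeded for repeatability.
path = choose_path([0.1, 0.5, 0.2, 0.9, 0.3], v_max=3, rng=random.Random(1))
```

Because a chosen wavelength is popped from `have`, no wavelength can appear twice in one ant's path.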

Calculation of the Fitness Value.
An MLR model is built on the feature wavelengths selected by each ant, and the RMSECV of that MLR model is used as the basis for calculating the fitness value of the ACO. The fitness value (F) is calculated as shown in equation (6). The ants whose fitness value is greater than the threshold value are added to the contribution matrix.
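A fitness evaluation of this kind might look like the sketch below. Two points are assumptions rather than statements of the paper's implementation: the fitness is taken to be the reciprocal of the ten-fold RMSECV (which is consistent with the threshold being described later as the reciprocal of the full-wavelength RMSECV), and the folds are drawn by a seeded random permutation. The names `rmsecv` and `fitness` and the toy data are hypothetical.

```python
import numpy as np

def rmsecv(X, y, k=10, seed=0):
    """k-fold cross-validated RMSE of a least-squares MLR model."""
    n = len(y)
    idx = np.random.default_rng(seed).permutation(n)
    sq_err = 0.0
    for fold in np.array_split(idx, k):
        train = np.setdiff1d(idx, fold)          # all samples not in this fold
        w, *_ = np.linalg.lstsq(X[train], y[train], rcond=None)
        sq_err += np.sum((X[fold] @ w - y[fold]) ** 2)
    return float(np.sqrt(sq_err / n))

def fitness(X, y, selected):
    """Fitness of a wavelength subset, assumed here to be 1 / RMSECV."""
    return 1.0 / rmsecv(X[:, selected], y)

# Toy data (assumption): 60 samples, 6 wavelengths; columns 0 and 2 carry signal.
rng = np.random.default_rng(1)
X = rng.normal(size=(60, 6))
y = X[:, 0] + 2.0 * X[:, 2] + 0.05 * rng.normal(size=60)

threshold = fitness(X, y, list(range(6)))  # full-wavelength model
good = fitness(X, y, [0, 2])               # informative subset
bad = fitness(X, y, [1, 3])                # uninformative subset
```

An ant would enter the contribution matrix only when its fitness exceeds `threshold`.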

Pheromone Update.
The pheromone is updated in a way that mimics the foraging behavior of ants in nature. When all ants finish an iteration, the ant with the highest fitness value in the current generation is selected, and the absolute values of the regression coefficients of its wavelengths after MLR modeling are used as the basis for the pheromone update. The pheromone of those wavelengths is strengthened according to the pheromone update formula, while the pheromone of the wavelengths that are not selected slowly decreases because of pheromone volatilization. The pheromone update rule, equation (7), is piecewise: one case applies to the wavelengths selected by the best ant, and the other applies to all others.
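The update step described above can be illustrated with a short sketch. The exact form used here is an assumption chosen to match the description: every trail evaporates by a factor of (1 - ρ), and the wavelengths chosen by the best ant additionally receive Q times the absolute value of their regression coefficient. The function name `update_pheromone` and the example numbers are hypothetical.

```python
def update_pheromone(tau, best_path, best_abs_coef, rho=0.3, q=1.0):
    """Evaporate every trail, then reinforce the best ant's wavelengths."""
    new_tau = [(1.0 - rho) * t for t in tau]      # volatilization for all
    for wl, coef in zip(best_path, best_abs_coef):
        new_tau[wl] += q * coef                   # deposit on selected ones only
    return new_tau

# Four wavelengths with equal pheromone; the best ant chose wavelengths 1 and 3.
tau = update_pheromone([1.0, 1.0, 1.0, 1.0],
                       best_path=[1, 3],
                       best_abs_coef=[0.5, 2.0])
```

With ρ = 0.3 and Q = 1, the unselected wavelengths drop to 0.7 while the selected ones rise to about 1.2 and 2.7, so repeated iterations concentrate pheromone on wavelengths with large coefficients.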
We repeat steps 2-4 until the set maximum number of ant colony iterations is reached.
The optimal wavelength combination is then selected: the feature wavelength combination chosen by the ant with the highest fitness value among all individuals is the final selection.

Introduction of Crossover Operator in MLR-ACO.
In GA, the crossover operator can produce better offspring that incorporate the characteristics of both parents. The most common crossover method is single-point crossover, which generates a random crossover point and swaps the feature wavelengths before and after the point between the two parents to generate two new combinations of feature wavelengths. The combination of feature wavelengths with the higher fitness value is kept for comparison with the parents. Since the crossover points are randomly generated, the quality of the generated children is not very stable. To address this problem, an improved crossover algorithm tailored to spectral features is proposed in this paper. From the study of the optical parameters of milk by Jun [38], it is known that the absorption coefficients of different components of the same measured sample vary at different wavelengths and are strongly influenced by the content of each component. The absorption coefficients of different components also interact with each other, and it is difficult to identify the wavelength interval corresponding to the component of interest directly from the raw spectrum. When calculating the selection frequency of each wavelength in the MLR-ACO contribution matrix, we found that the selection frequencies also show peaks and valleys within certain wavelength intervals. The selection frequency of a wavelength indicates the importance of that wavelength for the property of interest. For this reason, we divide the full wavelength range into coarse intervals at the troughs of the selection frequency map. The valley points of the selection frequencies are used as the coarse crossover points of the crossover operator in the genetic algorithm.
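The derivation of coarse crossover points from the contribution matrix can be sketched as below. The helper names and the valley rule are assumptions made for illustration: a valley is taken to be an interior index whose frequency is strictly lower than its left neighbour and not higher than its right neighbour, so a flat-bottomed trough contributes its first point only.

```python
def selection_frequency(contribution, n_wavelengths):
    """Fraction of stored wavelength subsets that contain each wavelength."""
    counts = [0] * n_wavelengths
    for subset in contribution:
        for wl in subset:
            counts[wl] += 1
    return [c / len(contribution) for c in counts]

def valley_points(freq):
    """Interior local minima of the frequency curve (coarse crossover points)."""
    return [i for i in range(1, len(freq) - 1)
            if freq[i] < freq[i - 1] and freq[i] <= freq[i + 1]]

# Hypothetical contribution matrix: four good ants over seven wavelengths.
contribution = [[0, 1, 4], [0, 4, 5], [1, 4, 5], [0, 1, 5]]
freq = selection_frequency(contribution, n_wavelengths=7)
coarse_points = valley_points(freq)
```

Here wavelengths 0, 1, 4, and 5 are each chosen by three of the four ants, wavelengths 2 and 3 are never chosen, and the single valley at index 2 splits the range into two coarse intervals.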
The best 100 individuals generated in the MLR-ACO iterations are used as the initial population of the genetic algorithm, and the coarse crossover operation is performed at the coarse crossover points. Individuals are combined as widely as possible to discover the best individuals produced by combining different wavelength intervals. To discover the benefit of recombination within the same wavelength interval, a fine crossover point is randomly generated within each coarse interval, and the two parents are recombined within that interval. In total, four offspring are generated, and after comparing them with their two parents, the two wavelength combinations with the highest fitness values are kept. After 30 iterations, the wavelength combination selected by the individual with the highest fitness value is the final selection. The flow chart of MLR-ACO-GA is shown in Figure 1.
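One possible reading of the coarse and fine crossover is sketched below. Several details are assumptions made for illustration: individuals are encoded as 0/1 masks over the wavelengths, the coarse crossover swaps each whole interval with probability 0.5, the fine crossover swaps the part of every interval that follows a random cut, and each operator yields two of the four offspring. All function names are hypothetical.

```python
import random

def intervals(points, n):
    """Split 0..n into half-open intervals at the coarse crossover points."""
    edges = [0] + sorted(p for p in set(points) if 0 < p < n) + [n]
    return list(zip(edges[:-1], edges[1:]))

def coarse_crossover(a, b, points, p_coarse=0.5, rng=random):
    """Swap whole coarse intervals between the two parents."""
    c, d = a[:], b[:]
    for lo, hi in intervals(points, len(a)):
        if rng.random() < p_coarse:
            c[lo:hi], d[lo:hi] = b[lo:hi], a[lo:hi]
    return c, d

def fine_crossover(a, b, points, rng=random):
    """Inside every coarse interval, swap the part after a random cut."""
    c, d = a[:], b[:]
    for lo, hi in intervals(points, len(a)):
        if hi - lo < 2:
            continue                      # interval too short to cut
        cut = rng.randrange(lo + 1, hi)
        c[cut:hi], d[cut:hi] = b[cut:hi], a[cut:hi]
    return c, d

def next_pair(a, b, points, fitness, rng=random):
    """Keep the two fittest among the parents and their four offspring."""
    family = [a, b,
              *coarse_crossover(a, b, points, rng=rng),
              *fine_crossover(a, b, points, rng=rng)]
    return sorted(family, key=fitness, reverse=True)[:2]

# Two hypothetical parents over eight wavelengths, one coarse point at index 4.
parent_a = [1, 1, 1, 1, 0, 0, 0, 0]
parent_b = [0, 0, 0, 0, 1, 1, 1, 1]
survivors = next_pair(parent_a, parent_b, [4], fitness=sum, rng=random.Random(0))
```

Because the parents stay in the comparison, the best survivor is never worse than the better parent. The `fitness=sum` used here is only a stand-in for the RMSECV-based fitness.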

Wheat Protein Dataset.
The dataset is from the International Diffuse Reflectance Conference (IDRC) and can be downloaded at https://www.cnirs.org/content.aspx?page_id=22&club_id=409746&module_id=239453. It contains spectral data of 248 wheat samples measured by three spectrometers. The wavelength range is 850-1050 nm, the interval is 2 nm, and there are 100 bands in total. The dataset also provides the protein content of each wheat sample, which varies from 7.97% to 18.69% with an average of 13.64%. The spectral data and protein content values measured by Instrument C were used in this study.

Cereal Cheese Protein Dataset.
The dataset can be downloaded from https://eigenvector.com/resources/data-sets/#grain-sec. In this dataset, the U.S. Department of Energy uses a mixture of three substances to predict the content of casein, glucose, lactic acid, and water in the mixture. The casein content varies from 0% to 88.83%, with an average of 29.61%. The casein values and the measured spectral data are used in this study.

Corn Protein Dataset.
The dataset can be downloaded from http://software.eigenvector.com/Data/Corn/index.html. It consists of 80 corn samples measured by three different near-infrared spectrometers: m5, mp5, and mp6. The wavelength range is 1100-2498 nm, and the interval is 2 nm. The dataset also provides the moisture, oil, protein, and starch content of each corn sample. The spectral data and protein values measured by the m5 near-infrared spectrometer were used in this study. The protein content varies from 7.65% to 9.71%, with an average of 8.66%.

Equipment and Software.
We use a general-purpose computer with an Intel(R) Core(TM) i5-6500 CPU @ 3.20 GHz, 8 GB of memory, and the Windows 10 operating system. All calculations are implemented in Python 3.7.

Results and Discussion
Three publicly available datasets (wheat protein, grain casein, and corn protein) were used to evaluate MLR-ACO and MLR-ACO-GA, which were compared with five established feature selection algorithms: CARS, SPA, ACO, GA, and DE. SPA is based on vector projection analysis: it compares the magnitude of projection vectors between different wavelengths to find the combination of feature variables with the lowest information redundancy in the spectral data and selects the optimal combination of feature variables by means of the calibration model. SPA can minimize the collinearity between variables and greatly reduce the number of wavelengths needed for modeling. CARS imitates the principle of survival of the fittest in Darwinian evolutionary theory, combined with the regression coefficients of the PLS model. In each iteration, samples are drawn by Monte Carlo sampling, and variables with small absolute regression coefficients are forcibly removed by the exponential decay function (EDF). The adaptive weighted sampling method is then used to further filter the feature wavelengths, and the set with large regression coefficient weights is retained to build the PLSR model and calculate the RMSECV of this feature wavelength combination. After several iterations, the feature wavelength combination with the lowest RMSECV value is selected as the optimal subset. ACO screens the feature wavelengths by simulating the foraging behavior of ants. It uses the RMSECV of the calibration model to judge the quality of a combination of feature wavelengths and updates the pheromones of the corresponding wavelengths according to the RMSECV after each iteration. In this paper, only the ant with the highest fitness value in each generation is selected to update the pheromone matrix. The experiments show that updating the pheromone matrix with the ant with the highest fitness value is much more desirable than updating it with all ants.
By imitating natural selection (survival of the fittest), GA iterates repeatedly through three operators (selection, crossover, and mutation) and finally selects the combination of feature wavelengths with the highest fitness value as the optimal feature wavelength combination. The DE algorithm is very similar to the genetic algorithm in that it also includes mutation, crossover, and selection operations, but the specific definitions of these operations differ from those of the genetic algorithm. In this paper, the DE algorithm uses floating-point vector coding to generate the initial population, while the GA uses binary coding. SPA, ACO, GA, and DE all use MLR regression models as calibration models and RMSECV as the evaluation criterion for a subset of variables. CARS, however, is designed to improve the prediction accuracy of the PLSR model, so both MLR and PLSR models are built for the feature wavelengths screened by CARS and compared, by RMSECV, with the MLR and PLSR models built for the feature wavelengths screened by MLR-ACO and MLR-ACO-GA. The parameters of each algorithm were set according to their respective recommendations, 30 tests were performed on each dataset, and the RMSECV was recorded.

Parameter Configuration.
In MLR-ACO, five parameters affect the performance of the algorithm. Before the method is applied to a different dataset, the parameters should first be optimized. The number of iterations N was set to 50, 100, 150, and 200 in turn: if N is too small, the algorithm does not converge, and if N is too large, the computation time increases. Experiments showed that the MLR-ACO algorithm converged when N was set to 50 on the wheat protein and grain casein datasets. On the corn protein dataset, the MLR-ACO algorithm converged only when N was set to 100. This is because the corn dataset has more wavelengths than the other two datasets and requires more iterations. The larger the number of ants M, the higher the accuracy of the algorithm, but also the higher the time complexity. The number of ants was finally set to 80 for all three datasets. The pheromone volatility factor ρ was set to 0.3, 0.5, and 0.7 in turn. If ρ is too small, the ants may lose the global search ability, and if ρ is too large, the convergence speed is affected. Experiments showed that satisfactory results could be achieved with ρ equal to 0.3 or 0.5. On the wheat protein and grain casein datasets, the results were slightly better with ρ = 0.3, and on the corn protein dataset, the results were better with ρ = 0.5. Q is the pheromone significance factor, and Q is set to 1 for all three datasets. V-MAX is one of the most important parameters in the MLR-ACO algorithm. If V-MAX is set too large, some uninformative variables cannot be eliminated, which reduces the computational efficiency. If V-MAX is set too small, some important variables may be excluded, and the accuracy of the prediction model is reduced.
On the wheat protein and grain casein datasets, V-MAX was first set to 10, 20, 30, 40, 50, and 60 in turn, and after the best value on this grid was determined, the final V-MAX value was determined in steps of 5 around it. On the corn protein dataset, V-MAX was first set to 20, 40, 60, 80, and 100 in turn, and after the best value on this grid was determined, the final V-MAX value was again determined in steps of 5 around it. Figure 2 shows the box plots of the RMSECV for 30 experiments with different V-MAX values on the three datasets. The optimal values of V-MAX for the three datasets are 15, 40, and 65, respectively. When V-MAX is set too small, some important wavelengths cannot be selected, which reduces the prediction performance. When V-MAX is set too large, the model accuracy is reduced instead, which is consistent with Occam's razor: better prediction performance can be achieved by using fewer wavelengths [39].
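The two-stage search for V-MAX can be sketched as follows. This is only an interpretation of the procedure: the function name `tune_v_max` is hypothetical, the fine stage is assumed to test one step of 5 on either side of the best coarse value, and `score` stands for any callable that returns the mean RMSECV of repeated MLR-ACO runs at a given V-MAX.

```python
def tune_v_max(score, coarse_grid, fine_step=5):
    """Coarse grid first, then one fine step on either side of its best value."""
    best = min(coarse_grid, key=score)    # lower RMSECV is better
    candidates = [v for v in (best - fine_step, best, best + fine_step) if v > 0]
    return min(candidates, key=score)

# Stand-in score with its minimum at 15 (assumption, replaces real MLR-ACO runs).
toy_score = lambda v: (v - 15) ** 2
v_max = tune_v_max(toy_score, [10, 20, 30, 40, 50, 60])
```

With the stand-in score, the coarse stage settles on 10 and the fine stage then moves to 15. In practice each call to `score` is expensive, which is why only a handful of values is evaluated.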
For the GA stage, the population size is set to 100, taken from the ants with the top 100 fitness values after the MLR-ACO iterations are completed. The number of iterations is set to 30 because the initial population is already excellent and a large number of iterations is not needed. The coarse crossover points were calculated from the individuals with fitness values greater than T produced by the 30 MLR-ACO iterations, with the threshold T equal to the reciprocal of the RMSECV of the full-wavelength MLR model. In addition, the coarse crossover probability and fine crossover probability were set to 0.5 and 1, respectively. The selection frequencies of the wavelengths in the contribution matrix of the three datasets and the coarse crossover point settings are shown in Figure 3. The wheat protein dataset generated 20 coarse crossover points, the cereal casein dataset generated 17 coarse crossover points, and the corn protein dataset generated 42 coarse crossover points.
Journal of Spectroscopy
The spectra of the wheat protein data and the frequencies with which the seven methods selected each wavelength on the wheat protein dataset are shown in Figure 4. As Figure 4 shows, the relationship between the spectral absorption bands and the selection frequencies of the variables is not directly evident from the spectra, which again confirms the conclusion in Section 2.3. The absorption bands selected by MLR-ACO and MLR-ACO-GA are roughly the same, mainly covering the complex regions of characteristic protein absorption such as the stretching or bending vibrations of the C-H, N-H, and O-H bonds, their interactions, and the influence of the external environment. The absorption bands selected by MLR-ACO and MLR-ACO-GA are mainly concentrated near 900 nm, 925 nm, and 950 nm. A few wavelengths are also selected near 860 nm, 1000 nm, and 1025 nm.
Among them, 900 nm corresponds to the quadruple-frequency absorption band of C-H, and 950 nm corresponds to the triple-frequency absorption band of the O-H bond. The other selected wavelengths are difficult to match accurately with a particular chemical bond. However, the experimental results show that these wavelengths play an important role in the modeling. It is worth noting that the frequently selected wavelengths of MLR-ACO and MLR-ACO-GA basically match those selected by the other five algorithms, but MLR-ACO and MLR-ACO-GA discard more uninformative variables.

Grain Protein Dataset.
For the grain protein data, SPA, CARS, ACO, GA, and DE were compared with MLR-ACO and MLR-ACO-GA. Each algorithm except SPA was run 30 times and its RMSECV was recorded, and the results are listed together with those of the full-wavelength model in Table 2. Table 2 shows that all seven wavelength selection algorithms outperformed the full-wavelength model: the mean RMSECV values of SPA, CARS, ACO, GA, DE, MLR-ACO, and MLR-ACO-GA were reduced by 37.28%, 48.61%, 54.04%, 49.75%, 52.51%, 57.22%, and 57.46%, respectively. This shows that feature wavelength selection is very important before building quantitative calibration models. Among these seven algorithms, MLR-ACO-GA has the best prediction performance and MLR-ACO is second, but they do not require the fewest feature wavelengths. This is because, according to MLR-ACO, taking only about 30 wavelengths does not capture all the informative variables needed to make the model optimal. Figure 2(b) shows that when the V-MAX parameter is set to 20, the prediction performance of CARS, SPA, and ACO can already be matched, but for MLR-ACO a V-MAX of 20 is not the optimal parameter value. Of course, V-MAX can be set to 20 if time complexity is the main concern.
The spectra of the cereal casein data and the frequencies of the variables selected by the seven different methods in the experimental tests on the cereal casein dataset are shown in Figure 5. It can be seen from the figure that the high-frequency wavelengths selected by the six algorithms are the same, mainly around 1152 nm, 1248 nm, 1500 nm, 1752 nm, and 2028 nm. Among the three algorithms with better prediction performance, CARS, MLR-ACO, and MLR-ACO-GA, the selection frequencies of the eight bands at 1152 nm, 1164 nm, 1200 nm, 1224 nm, 1248 nm, 1752 nm, 1776 nm, and 2028 nm are more than 70%. Compared with the CARS algorithm, MLR-ACO, MLR-ACO-GA, ACO, and GA additionally select wavelengths in the range of 1344-1392 nm and some other wavelengths. Among them, the spectral regions near 1152 nm, 1500 nm, 1752 nm, and 2028 nm correspond to the C-H bond triple-frequency absorption band, the N-H bond double-frequency absorption band, the C-H double-frequency absorption band, and the O-H combined-frequency absorption band, respectively. The other chosen wavelengths are difficult to match exactly to a particular chemical bond, but the experimental results show that these wavelengths play an important role in modeling. The wavelengths selected by the swarm intelligence algorithms are more dispersed because the total number of wavelengths is only 117. The swarm intelligence algorithms have a good global search capability and can exploit the advantages of combining different wavelengths as much as possible.

Corn Protein Dataset.
For the corn protein data, SPA, CARS, ACO, GA, and DE were compared with MLR-ACO and MLR-ACO-GA. Each algorithm except SPA was run 30 times and its RMSECV was recorded, and the results are presented together with those of the full-wavelength model in Table 3. All the algorithms except SPA outperformed the full-wavelength prediction. The mean RMSECV values of CARS, ACO, GA, DE, MLR-ACO, and MLR-ACO-GA were reduced by 40.15%, 62.45%, 41.80%, 55.08%, 91.91%, and 92.52%, respectively, compared to the full-wavelength RMSECV. There are 700 wavelengths in the corn protein dataset, which is about 7 times the number of wavelengths of the cereal casein and wheat protein datasets, and the advantages of MLR-ACO and MLR-ACO-GA over the other algorithms become more prominent as the number of wavelengths increases. SPA has a high RMSECV although it selected only two bands, and it is clear that SPA is not applicable to the corn protein dataset. ACO, GA, and DE have good prediction results, but the number of selected wavelengths is much higher than for the other methods. MLR-ACO and MLR-ACO-GA perform best in terms of prediction accuracy. They select slightly more wavelengths than CARS, but the model accuracy is higher and the stability is also the best. It is worth noting that, in Figure 2(c), when the V-MAX parameter in MLR-ACO is set to 20, the average value of RMSECV is 0.0181, which is lower than that of SPA, CARS, ACO, GA, and DE. In practical applications, the value of the V-MAX parameter can be set according to both time and accuracy requirements. The spectra of the corn protein data and the frequencies of the wavelengths selected by the seven different methods in the experiments on the corn protein dataset are shown in Figure 6. The wavelengths selected by the six algorithms are mainly around 1750 nm, 1776 nm, 2168 nm, and 2374 nm.
Among them, 1750 nm corresponds to the C-H bond double-frequency absorption band, 2168 nm corresponds to the N-H bond frequency absorption band, and 2374 nm corresponds to the C-H bond frequency absorption band. As can be seen from the figure, the wavelengths selected by the CARS algorithm are relatively concentrated, while those selected by the ACO, GA, and DE algorithms are relatively scattered. For ACO, this is due to the lack of an initial pheromone. For GA and DE, this is because the initial population is randomly generated and the final result is directly related to the initial population. The improved MLR-ACO and MLR-ACO-GA algorithms make up for these defects of GA and ACO. On the other hand, the bionic algorithms have better global search capability and can exploit the advantages of combining different bands as much as possible. It can be observed in the frequency diagram that each algorithm produces peaks and valleys in selection frequency within certain wavelength intervals, and the positions of the peaks and valleys of these six algorithms are the same.

Comparison of the Results of the PLS Correction Model Established by CARS, MLR-ACO, and MLR-ACO-GA.
The results of the PLS calibration models built on the feature variables selected in 30 tests of the MLR-ACO, MLR-ACO-GA, and CARS algorithms are shown in Table 4. On the wheat protein dataset, the PLS calibration models for the variables selected by CARS, MLR-ACO, and MLR-ACO-GA were more accurate than the full-wavelength model: the mean RMSECV was reduced by 3% for CARS, 15% for MLR-ACO, and 10% for MLR-ACO-GA compared to the full wavelength. MLR-ACO gave the best predictions and was the most stable. On the cereal casein dataset, the PLSR calibration models for the variables selected by CARS, MLR-ACO, and MLR-ACO-GA were also more accurate than the full-wavelength model, with a 25.56% reduction in the mean RMSECV for CARS and reductions of 37.32% and 35.17% for MLR-ACO and MLR-ACO-GA, respectively. Both MLR-ACO and MLR-ACO-GA performed better than CARS. On the corn protein dataset, the PLS calibration models for the variables selected by CARS, MLR-ACO, and MLR-ACO-GA were again more accurate than the full-wavelength model, with a 79.20% reduction in the mean RMSECV for CARS, and reductions of 78.27% and 78.42% for MLR-ACO and MLR-ACO-GA, respectively. CARS performs the best and is the most stable among the three algorithms, but the minimum RMSECV values of MLR-ACO and MLR-ACO-GA are better than that of CARS. Evidently, the combinations of feature wavelengths selected by MLR-ACO and MLR-ACO-GA can also achieve good results in the PLSR model.

Conclusion
In this paper, we propose an improved algorithm based on the ant colony algorithm, combining ACO with MLR and adding the crossover operator of the genetic algorithm to MLR-ACO to form MLR-ACO-GA. MLR-ACO makes up for the defects of the original ant colony algorithm, and MLR-ACO-GA further exploits the advantages of the MLR-ACO algorithm. Compared with other methods, these two algorithms are highly accurate and require fewer wavelengths, but they require more time to complete the iterations. Our future work will try to solve this problem and apply the algorithms in practice. It is worth noting that these two algorithms can be applied to the selection of feature wavelengths for NIR spectral data and can also be extended to feature selection for other data requiring quantitative analysis.

Data Availability
The data used in this study are public datasets; the sources are detailed in the text.