Application of Artificial Neural Networks for the Automatic Spectral Classification

Classification in astrophysics is a fundamental process, especially when it is necessary to understand several aspects of the evolution and distribution of objects. In an astronomical image, we need to discern between stars and galaxies and to determine the morphological type of each galaxy. The spectral classification of stars provides important information about stellar physical parameters such as temperature and allows us to determine their distances; with this information, it is possible to evaluate other parameters such as physical size and the real 3D distribution of each type of object. In this work, we present the application of two Artificial Intelligence (AI) techniques for the automatic spectral classification of stellar spectra obtained from the first data release of LAMOST and also from the more recent release (DR5). Two types of Artificial Neural Networks were selected: a feedforward neural network trained according to the Levenberg–Marquardt Optimization Algorithm (LMA) and a Generalized Regression Neural Network (GRNN). During the study, we used four datasets: the first was obtained from the LAMOST first data release and consisted of 50,731 spectra with signal-to-noise ratio above 20; the second was obtained from the Indo-US spectral database (1273 spectra); the third (the STELIB spectral database) was used as an independent test dataset; and the fourth was obtained from LAMOST DR5 and consisted of 17,990 stellar spectra, also with signal-to-noise ratio above 20. The results in the first part of the work, in which the self-consistency of the DR1 data was tested, revealed some problems in the spectral classification available in LAMOST DR1. In order to accomplish a better classification, we applied a two-step process: first, the LAMOST and STELIB datasets were classified by the two AI techniques trained with the entire Indo-US dataset.
The resulting classification allows us to discriminate at least three groups: the first group contained O and B type stars, the second contained A, F, and G type stars, and the third contained K and M type stars. The second step consisted of a refinement of the classification, this time selecting the most relevant indices for each group. We compared the accuracy reached by the two techniques when they were trained and tested using LAMOST spectra and their published classification against the classifications obtained with the ANNs trained with the Indo-US dataset and applied to the STELIB and LAMOST spectra. Finally, in the first part, we compared the LAMOST DR1 classification with the classification obtained by applying the GRNN and LMA networks trained with the Indo-US dataset. In the second part of the paper, we analyze a set of 17,990 stellar spectra from LAMOST DR5, and the very significant improvement in the spectral classification available in the DR5 database was verified. For this, we trained the ANNs using the k-fold cross-validation technique with k = 5.


Introduction
Nowadays, the huge quantity of astronomical data coming from different survey projects makes the traditional stellar classification process unsuitable; furthermore, most of the spectra collected by these surveys present a very low signal-to-noise ratio (S/N). The Large Sky Area Multiobject Fibre Spectroscopic Telescope (LAMOST) is a project planned to conduct a 5-year spectroscopic survey [1,2]; at present, LAMOST is the telescope with the highest rate of spectral data acquisition. After the first two years, 2,204,696 spectra had been obtained, of which 1,944,329 come from stars, and were published online in the first data release catalog (DR1) [3]. Up to now, with its fifth data release (DR5) (for data release notes on LAMOST DR5, see the link in the "Data Availability" section), LAMOST has released a total of 9,026,365 spectra. In addition to LAMOST, many other projects are collecting vast amounts of stellar spectra; this highlights the importance of the stellar classification process, which allows a better understanding of stellar physics and stellar evolution. The automatic spectral classification of big astronomical databases is a challenge that has been addressed with computational intelligence techniques; among them, Artificial Neural Networks (ANNs) have proven their effectiveness and accuracy [4][5][6][7][8][9][10]. Results in the studies mentioned above showed that, with the Artificial Neural Network approach, it is possible to classify spectra with S/N as low as 20 with classification errors lower than 2 spectral subtypes; therefore, with the advent of massive astronomical databases, automatic methods are clearly necessary to extract and analyze the spectral information in a fast and accurate way, even when the data are strongly affected by noise.
ANNs also have a wide range of applications in astrophysics; some examples are the automatic determination of stellar physical parameters [11,12], distinguishing stars from galaxies [13,14], and galaxy morphological classification [15]. Some of the most powerful advantages of ANNs are their capacity to handle large volumes of information, their resistance to noise and to missing data, and their capacity to reach a classification accuracy similar to that of an expert [9]. In combination with Artificial Neural Networks and other classifiers, dimensionality reduction methods such as Principal Component Analysis (PCA), Isomap, and index measurement have been applied in order to reduce the size of the input vector and extract the main features for classification [8,10,16]; in these works, the authors reported that dimensionality reduction methods are efficient at extracting and retaining the most useful information in the spectra and thus can be used in the classification process.
In addition, powerful methods have also been developed that allow the determination of atmospheric physical parameters from stellar spectra. Two examples of such algorithms are the matrix inversion for spectral synthesis algorithm (MATISSE) [17] and the University of Lyon Spectroscopic analysis Software (ULySS) [18].
MATISSE is an automated procedure to estimate atmospheric parameters (Teff, log g, metallicity, and chemical abundances) from the analysis of stellar spectra. This algorithm was developed for the estimation of stellar parameters from the Gaia RVS spectra and was applied to the determination of stellar parameters in the Fiber-fed Extended Range Optical Spectrograph (FEROS) archived spectra [19]. The synthetic spectral grid used for the analysis by MATISSE includes only F, G, K, and M type stars. The algorithm shows great performance when applied to spectra with high SNR; however, when applied to spectra with low SNR, the errors increase rapidly, as shown in Figures 2 and 3 of Recio-Blanco et al. [17]. ULySS is a "full spectrum fitting package" [18] developed for analyzing stellar populations in galaxies that can also be used to determine stellar atmospheric parameters. It is based on a library of synthetic or observed spectra, but the effect of noise on the analyzed spectra is not evaluated. The detailed analysis carried out by ULySS is time-consuming, making its application to large data collections difficult.
In this paper, we used ANNs with index measurement as the dimensionality reduction method, since the selected lines are directly related to the physical properties of the stars and have been proven to be highly sensitive to spectral type ([10] and references therein). We used and compared the behavior of two types of neural networks: a feedforward neural network trained according to the Levenberg-Marquardt Algorithm (LMA) [20] and a Generalized Regression Neural Network (GRNN) [21].
In every normalized spectrum, 36 spectral indices were measured; hence, the first dataset (LAMOST) is represented by a 50,731 × 36 matrix, the second (Indo-US) by a 1130 × 36 matrix, the third (STELIB) by a 188 × 36 matrix, and finally the LAMOST DR5 dataset by a 17,990 × 36 matrix.
As a first test, we trained, validated, and tested the ANNs with LAMOST spectra; then the ANNs were trained with the Indo-US dataset, and we compared their classification performance. As a second test, we used the ANNs trained with the Indo-US dataset to predict over the LAMOST dataset, and the resulting classifications were compared with the LAMOST assigned spectral types. These "new" classifications were assigned to the LAMOST spectra, and we trained new ANNs with these relabeled LAMOST spectra and compared the classifications. The LAMOST DR5 spectra were used to test the new classification available in the fifth data release. As with the DR1 data, we trained, validated, and tested the ANNs with LAMOST DR5 spectra; we also used the ANNs trained with the Indo-US database to classify the DR5 spectra and compared this classification with the DR5 "subclass." The compilation details of each spectral database are explained in Section 2, and the ANNs used for the spectral classification are described in Section 3. In Section 4, we present the results of the classification procedures, and the conclusions of this work are presented in Section 5.

Data Acquisition and Index Measurement
The first dataset was obtained from the DR1 general catalog of LAMOST; these spectra cover the range 3690-9100 Å with a resolution of R∼1800. From the 1,944,329 stellar spectra, we selected 50,731 spectra with a signal-to-noise ratio above 20. The spectral type distribution for this dataset is shown in Figure 1; for a better data representation, a base-10 log scale is used for the vertical axis. We observed many gaps in the distribution of spectral types, which is quite peculiar. The second dataset was obtained from the Indo-US Library of Coudé Feed Stellar Spectra [22], a spectral library with 1273 stellar spectra with a spectral resolution of R∼2500, coverage from 3460 to 9464 Å, and a signal-to-noise ratio around 100. The 1273 stars cover all the spectral types, but nearly 30% of these stellar spectra do not have the full spectral coverage. From this library, we used only 1130 spectra; 143 were left aside because gaps in the spectrum affected many indices. A third dataset was used to test and compare the ANNs independently of the first two databases; it contained 188 spectra from the STELIB dataset [23], a homogeneous spectroscopic stellar library with spectra in the 3200-9500 Å spectral range and a resolution of R∼2000. From LAMOST DR5, we downloaded 18,000 spectra with signal-to-noise ratio above 20; we did not use 10 of them due to problems in the FITS files that prevented a correct display of the wavelength axis.
Although spectral types were originally designated as classes, it is more appropriate to treat them as continuous values because they reflect an underlying continuous temperature scale; therefore, the spectral types of these objects were coded in numerical form: a B0 star corresponds to the code 2000, an A0 star to 3000, and every subtype corresponds to a step of one hundred; for instance, a B5 star was coded as 2500, whereas a K6 star was coded as 6600.
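The coding scheme above can be sketched in a few lines; the helper names are illustrative (not from the paper), and the mapping simply assigns O = 1000 through M = 7000, adding 100 per subtype:

```python
# Illustrative encoding of spectral types as the continuous numeric codes
# described in the text: B0 -> 2000, A0 -> 3000, each subtype adds 100.
SPECTRAL_CLASSES = "OBAFGKM"

def encode_spectral_type(sp_type: str) -> int:
    """Map a spectral type string like 'B5' or 'K6' to its numeric code."""
    letter, subtype = sp_type[0].upper(), int(sp_type[1:])
    return (SPECTRAL_CLASSES.index(letter) + 1) * 1000 + subtype * 100

def decode_spectral_code(code: int) -> str:
    """Inverse mapping, rounding to the nearest subtype: 2500 -> 'B5'."""
    letter = SPECTRAL_CLASSES[code // 1000 - 1]
    return f"{letter}{(code % 1000) // 100}"
```

With this convention the ANN output is a single continuous number, so regression networks such as the GRNN and LMA networks can be applied directly.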
We used the normalized spectra from LAMOST DR1 and normalized each spectrum of the Indo-US, STELIB, and LAMOST DR5 libraries in order to reduce and make uniform the value range of the input data. To construct the final databases, we measured the 36 indices defined in Table 2 of [10] on each spectrum of the four datasets. The normalization and index measurement were performed with the IRAF program; the task used was BPLOT from the NOAO suite, which allowed us to measure the pseudoequivalent width of the 36 indices in batch mode for the 50,731 spectra. The index measurement was carried out by direct integration, marking two continuum points around the line to be measured. The same spectral intervals were measured in all databases. Although the LAMOST and Indo-US spectra cover a wider spectral region, the set of indices defined and measured in both datasets lies within the spectral region between 3900 and 6800 Å, where the indices were defined (in a future work, we will analyze the behavior of other indices defined in the near-IR region).
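As an illustration of the direct-integration measurement described above, the following NumPy sketch computes a pseudoequivalent width between two continuum points. The window intervals are placeholders, not the specific index definitions of Table 2 in [10], and the actual measurements in this work were made with IRAF's BPLOT, not with this code:

```python
import numpy as np

def pseudo_equivalent_width(wave, flux, blue_win, red_win, line_win):
    """pEW by direct integration between two pseudocontinuum points.

    blue_win, red_win, line_win are (lambda_min, lambda_max) tuples in Å.
    The pseudocontinuum is a straight line through the mean flux of the
    blue and red windows; the window choices are illustrative assumptions.
    """
    def window_mean(win):
        m = (wave >= win[0]) & (wave <= win[1])
        return wave[m].mean(), flux[m].mean()

    (wb, fb), (wr, fr) = window_mean(blue_win), window_mean(red_win)
    m = (wave >= line_win[0]) & (wave <= line_win[1])
    w, f = wave[m], flux[m]
    f_cont = fb + (fr - fb) * (w - wb) / (wr - wb)  # linear pseudocontinuum
    depth = 1.0 - f / f_cont                        # positive for absorption
    # trapezoidal integration of the line depth over the line window
    return float(np.sum(0.5 * (depth[1:] + depth[:-1]) * np.diff(w)))
```

For a Gaussian absorption line of depth d and width sigma on a flat continuum, this integral recovers the analytic equivalent width d·sigma·sqrt(2π).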

Artificial Neural Networks
Nowadays Artificial Neural Networks have a wide field of applications in pattern recognition, prediction, and classification problems.
They are one of the most robust and powerful techniques; they have the ability to detect and replicate complex nonlinear relationships in the data and the capability to model any dataset. Artificial Neural Networks are mathematical tools originally inspired by the neural structure of the brain. They consist of a series of nodes organized in layers: an input layer, one or more internal (hidden) layers, and an output layer, interconnected with each other (in analogy with real neurons); each connection is weighted, and the weights are randomly initialized between 0 and 1.
Supervised ANNs create their models through a supervised learning process, using each sample presented in the training dataset. During training, the neurons' weights are adjusted to reach a specific target from a given input, until the predicted output of the ANN matches the real target within a predefined tolerance or until the maximum number of iterations is reached. Each neuron computes a weighted sum of its inputs and is activated depending on a certain threshold. Once the ANN has been trained, it can be used to make predictions over a new set of data, yielding highly accurate results.
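The weighted-sum-plus-activation computation of a single neuron can be written in a few lines; a logistic sigmoid is used here as an illustrative smooth threshold (the activation choice is ours for the example, not specified by the paper):

```python
import numpy as np

def neuron_output(inputs, weights, bias):
    """One neuron: weighted sum of its inputs plus a bias, passed through
    an activation function (logistic sigmoid as an illustrative choice)."""
    z = np.dot(weights, inputs) + bias
    return 1.0 / (1.0 + np.exp(-z))
```

A layer is simply this computation applied with a weight matrix instead of a vector, and a network chains layers together.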

Levenberg-Marquardt Algorithm.
The Levenberg-Marquardt Algorithm (LMA) is used to solve nonlinear least squares problems; like other algorithms, it is an iterative procedure that locates the minimum of a function expressed as a sum of squares of nonlinear functions. When implemented in ANNs, the LMA is one of the most efficient algorithms, but only when the size of the feedforward neural network is moderate. In the present experiment, we used the Matlab function "trainlm", which trains a supervised feedforward neural network according to the Levenberg-Marquardt Algorithm [20]. We created and tested different LMA feedforward neural network architectures: two LMA neural networks with one hidden layer, with 30 and 50 neurons, respectively, and one more with two hidden layers of 30 neurons each. The arguments passed to the ANN functions were a learning rate of 0.05 and a maximum of 500 iterations. Although the iteration limit was set to 500, the neural network converged in most cases in under 50 iterations.
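The training itself was done with Matlab's trainlm. Purely as an illustration of the Levenberg-Marquardt idea (damped Gauss-Newton steps on the sum-of-squares error of a small feedforward network), a self-contained NumPy sketch on toy data might look like this; the architecture, data, and damping schedule are invented for the example and are not the authors' configuration:

```python
import numpy as np

rng = np.random.default_rng(0)

# toy regression data standing in for the 36-index input vectors
X = rng.normal(size=(120, 4))
y = np.sin(X @ np.array([1.0, -0.5, 0.3, 0.8]))

n_in, n_hid = 4, 6
n_par = n_in * n_hid + n_hid + n_hid + 1  # W1, b1, W2, b2 flattened

def forward(p):
    i = n_in * n_hid
    W1 = p[:i].reshape(n_in, n_hid)
    b1 = p[i:i + n_hid]
    W2 = p[i + n_hid:i + 2 * n_hid]
    b2 = p[-1]
    return np.tanh(X @ W1 + b1) @ W2 + b2

def residuals(p):
    return forward(p) - y

def numeric_jacobian(p, eps=1e-6):
    # central finite differences; fine for a network this small
    J = np.empty((len(y), len(p)))
    for j in range(len(p)):
        dp = np.zeros_like(p); dp[j] = eps
        J[:, j] = (residuals(p + dp) - residuals(p - dp)) / (2 * eps)
    return J

# Levenberg-Marquardt: solve (J^T J + lambda I) step = J^T r each iteration,
# shrinking the damping lambda on success and growing it on failure
p = rng.normal(scale=0.1, size=n_par)
lam = 1e-3
for _ in range(200):
    r = residuals(p)
    J = numeric_jacobian(p)
    A = J.T @ J + lam * np.eye(n_par)
    step = np.linalg.solve(A, J.T @ r)
    p_new = p - step
    if np.sum(residuals(p_new) ** 2) < np.sum(r ** 2):
        p, lam = p_new, lam * 0.5   # accept step, trust the quadratic model more
    else:
        lam *= 2.0                  # reject step, behave more like gradient descent

rmse = np.sqrt(np.mean(residuals(p) ** 2))
```

The interpolation between Gauss-Newton (small lambda) and gradient descent (large lambda) is what makes LM fast yet stable for moderately sized networks, as noted above.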

Generalized Regression Neural Networks.
The Generalized Regression Neural Network (GRNN) can be used to solve any function approximation problem; it is similar in form to the probabilistic neural network. The GRNN structure is very simple; one of its characteristics is that it does not require an iterative training process, unlike algorithms such as the LMA or conjugate gradient. Therefore, one of its advantages is that it can be trained very fast and is easily tuned. The main function of the GRNN is to estimate the values of continuous dependent variables through nonparametric estimators of probability density functions [21]. The probability function that the GRNN uses is the Normal Distribution, where each training sample X_i is used as the mean of a Normal Distribution. The GRNN has four layers: an input layer, a radial basis layer called the pattern layer, a special linear layer called the summation layer, and finally an output layer. The input layer distributes the input variables to the pattern layer; the pattern layer has a fixed number of neurons, equal to the number of observations in the training set. Therefore, in the pattern layer, each neuron is a training pattern; its output represents a measure of the distance between the training sample and the point of prediction X, and all the neurons in the pattern layer are linked to the summation layer. In the summation layer, there are two types of summation neurons: an S-summation neuron and a D-summation neuron. The S-summation neuron calculates the sum of the weighted outputs from the pattern layer, and the D-summation neuron determines the sum of the unweighted outputs from the pattern neurons. The output layer divides the output of each S-summation neuron by the output of each D-summation neuron to yield the desired prediction.
The only free parameter of the GRNN to be optimized is the sigma factor, a very important parameter of this network: if sigma is large, the estimated density is forced to be smooth, so a training sample can represent evaluation points over a wide range of X; for a small sigma value, the representation is limited to a narrow range of X. In the present work, we tested the performance of the GRNN with different sigma values in the classification process.
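A minimal GRNN of the kind described above reduces to kernel regression with one Gaussian centered on each training pattern (the S- and D-summations are the weighted and unweighted kernel sums). The following sketch is our own simplification, not the code used in the paper, and makes the role of sigma explicit:

```python
import numpy as np

class GRNN:
    """Minimal Generalized Regression Neural Network in the spirit of [21]:
    pattern layer = one Gaussian kernel per training sample, summation
    layer = weighted sum S and unweighted sum D, output = S / D."""

    def __init__(self, sigma=1.0):
        self.sigma = sigma

    def fit(self, X, y):
        # "training" is just storing the patterns; no iterative process
        self.X = np.asarray(X, dtype=float)
        self.y = np.asarray(y, dtype=float)
        return self

    def predict(self, Xq):
        Xq = np.atleast_2d(np.asarray(Xq, dtype=float))
        # pattern layer: squared Euclidean distance to every training sample
        d2 = ((Xq[:, None, :] - self.X[None, :, :]) ** 2).sum(axis=-1)
        w = np.exp(-d2 / (2.0 * self.sigma ** 2))
        # summation layer: S (weighted) over D (unweighted), then output
        return (w @ self.y) / w.sum(axis=1)
```

With a small sigma the prediction follows the nearest training points closely; with a large sigma it smooths over many patterns, exactly the trade-off discussed above.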

Classification Procedure and Results
In order to assess the ANNs' performance on the spectral classification process, we used a cross-validation technique, which allows us to better evaluate the algorithm's predictive performance on a new dataset. One of the most common validation techniques is k-fold cross-validation, which randomly partitions the original data into k equal-size subsets. During the process, k − 1 subsets are put together to form the training and validation sets, and the remaining subset is retained to test the model accuracy on unseen data. The cross-validation process is then repeated k times, such that each subset is used once as validation data. Finally, the performances measured by the k-fold cross-validation can be averaged or combined to produce a single estimate.
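The k-fold procedure can be sketched as follows; a plain least-squares model stands in for the ANN, and the data, sizes, and feature count are synthetic placeholders:

```python
import numpy as np

def kfold_indices(n, k, rng):
    """Randomly partition n sample indices into k near-equal folds."""
    return np.array_split(rng.permutation(n), k)

rng = np.random.default_rng(0)
n, k = 100, 5
X = rng.normal(size=(n, 6))                       # stand-in feature vectors
y = X @ np.arange(1.0, 7.0) + rng.normal(scale=0.1, size=n)

folds = kfold_indices(n, k, rng)
rms_per_fold = []
for i in range(k):
    test_idx = folds[i]                           # held-out fold
    train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
    # simple linear least-squares fit standing in for ANN training
    coef, *_ = np.linalg.lstsq(X[train_idx], y[train_idx], rcond=None)
    pred = X[test_idx] @ coef
    rms_per_fold.append(np.sqrt(np.mean((pred - y[test_idx]) ** 2)))

rms = float(np.mean(rms_per_fold))   # single averaged performance estimate
```

Each sample is held out exactly once, so the averaged RMS is an honest estimate of performance on unseen data.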
As a first test, we applied the GRNN and LMA neural network techniques to the dataset obtained from the LAMOST DR1 database in order to analyze the self-consistency of the classification presented in the LAMOST first data release. The LAMOST DR1 dataset was randomly divided into 5 equal subsets, 4 of size 10,146 × 36 and one of size 10,147 × 36. Following the k-fold cross-validation technique, the ANNs were trained and tested 4 times, and the resulting RMS errors were averaged for the training, validation, and test sets. Table 1 shows these values for the GRNN neural networks created with different sigma values. Table 2 presents the classification RMS results for different LMA architectures; as we can see, the RMS values are around 2.5 spectral subtypes for both GRNN and LMA techniques in the training, validation, and test datasets. Figure 2 presents the predictions for one round of targets relative to outputs of the GRNN neural network on the training and validation sets, whereas Figure 3 presents the predictions of the LMA ANN on the training and validation sets. As shown in these figures, the dispersion of the results is very large, with errors as high as 10 spectral subtypes or even more in some cases. The considerable dispersion observed in Figures 2 and 3 shows that there must be some inconsistencies in the spectral classification in DR1. The atypical distribution of the spectral types observed in Figure 1 confirms this observation. The pipeline used for the spectral analysis in LAMOST DR1 [24] is based on a set of templates. The pipeline fits each spectrum with a linear combination of eigenspectra and low-order polynomials. Consequently, the resulting spectral classification depends strongly on the quality of such templates. The original templates collect spectra from different sources (SDSS [25] and MILES [26]), so their spectral characteristics (resolution, spectral coverage, and instrumentation) differ from those of LAMOST.
Reference [27] constructed a new template library using a set of selected spectra from LAMOST DR1 itself; these templates have many advantages over the original ones: they have the same dispersion and the same spectral coverage and were obtained with the same instrumentation.
In order to test the improvement in spectral classification achieved in the latest data releases of LAMOST, we analyzed a set of 17,990 spectra with SNR > 20 from DR5. These spectra cover the range 3690-9100 Å with a resolution of R∼1800; the spectral coverage goes from B4 to M7, and again, for the purposes of this work, we coded the class types in numerical form. We normalized the spectra using the BPLOT task from IRAF. We measured the same 36 indices as pseudoequivalent widths, fitting a local continuum using the blue and red intervals defined in [10].
In the classification process, we applied the GRNN and LMA neural networks to this dataset obtained from DR5 of LAMOST. As in our previous experiments, we selected a sigma value of 1 for the GRNN, and the LMA architecture was 36 × 30 × 1. First, the dataset was divided randomly into 5 equal subsets of size 3598 × 36. The neural networks were trained and tested 4 times; then the results were averaged for the training, validation, and test sets. Results are shown in Table 3. Figure 4 presents the predictions for one round of targets relative to outputs of the GRNN neural network on the training and test sets, whereas Figure 5 presents the predictions of the LMA ANN on the training and test sets.
From these results, the great improvement in the spectral classification obtained for LAMOST DR5 and its self-consistency are clear.
On the other hand, in order to analyze the unexpected behavior detected in the LAMOST DR1 classification and to improve the classification, we carried out a new classification of the LAMOST DR1 database using GRNN and LMA neural networks trained with the Indo-US dataset. With this aim, the spectral resolution of the Indo-US library was degraded to the LAMOST DR1 resolution. The indices measured on this dataset were used to train the GRNN and LMA neural networks in order to compare the classification accuracy obtained with them.
From the Indo-US database, we set aside the spectra with gaps over wavelength intervals that include more than two indices; in this way, we selected 1130 spectra. This dataset was also divided randomly into 5 subsets, each of size 226 × 36.
As shown in Table 4, the GRNN was capable of reaching an averaged test RMS error of around 1.5 spectral subtypes; in the case of the LMA, the test RMS error was around 2 spectral subtypes. Figure 6 displays the training and validation performance of the GRNN neural network, whereas Figure 7 likewise shows the training and validation performance of the LMA when the Indo-US dataset is used. Table 4 also shows, in the fourth column, the averaged RMS errors obtained in the prediction over the LAMOST DR1 dataset; finally, in the fifth column, the standard deviation over the 5 predictions is presented. It is noticeable that when we used the ANNs (trained with the Indo-US spectra) to predict the LAMOST DR1 spectra, the RMS error increased to almost 6 spectral subtypes for both the GRNN and the LMA. Figure 8 shows the predicted classes for LAMOST DR1 spectra given by the LMA technique trained with the Indo-US dataset.
The STELIB dataset was used to assess the accuracy of the neural networks trained with the Indo-US library. Again, the most accurate results were obtained with the GRNN networks when the classification was carried out in two steps, as shown in the third column of Table 5.
For a more detailed analysis of the LAMOST DR1 spectra classification, we separated the predictions by spectral type and found that the early types (O and early Bs) and the M types had the biggest classification errors. The errors in the 4 M type stars included in the dataset were especially noticeable, so we decided to review those spectra. We found that the normalized spectra in these cases had some problems. An example is shown in Figure 9; Figure 9(a) shows the M spectrum without normalization, Figure 9(b) shows the LAMOST normalization found in the FITS file, and Figure 9(c) shows our proposed normalization for the same spectrum. In those cases, the molecular bands characteristic of this type of stellar spectra affected the fit to the continuum and thus the normalization. We proposed a different normalization and measured the indices again over these newly normalized spectra. The classification of the two ANNs for the M type spectra was greatly improved with these new indices.
From the previous tests, we selected the best parameters for the GRNN and LMA neural networks; with those characteristics, they were trained and tested under the k-fold cross-validation method. In each round of the process, we measured the RMS error in the training and validation sets; at the same time, the GRNN and LMA were used to yield a new LAMOST stellar classification; as mentioned before, this classification was made in two steps. During the rounds of the first classification, each ANN generated 5 different predictions for the LAMOST spectra. We averaged the 5 predictions given by the GRNN, and the resulting classes were assigned to a new fourth dataset. The same process was repeated with the classification made by the LMA, which was assigned to a fifth dataset. After the first step, the resulting labels for each dataset were divided into three groups: the first contained the O and B spectral types, the second contained the A, F, and G spectral types, and the third contained the K and M types; each group was then classified again, this time with a preselected number of spectral indices. Table 6 shows the cases of spectral types for which the classification accuracy was improved by using the two-step classification.
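The three-group routing used in the second step can be expressed directly on the numeric codes (1000 per class, with B0 = 2000 and A0 = 3000); this helper is illustrative, not the authors' code:

```python
def assign_group(code: int) -> str:
    """Route a first-step numeric prediction to one of the three groups
    used in the refined second step (numeric coding: O=1000 ... M=7000)."""
    if code < 3000:       # O and B types (codes 1000-2999)
        return "OB"
    elif code < 6000:     # A, F, and G types (codes 3000-5999)
        return "AFG"
    return "KM"           # K and M types (codes 6000 and above)
```

After routing, each group is reclassified with only the indices most sensitive for that temperature range.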
Finally, the two new datasets of LAMOST DR1 spectra, which include the spectral classification proposed by the ANNs trained with the Indo-US database, were used to accomplish the final tests. The distribution of the new LAMOST DR1 spectral types yielded by the GRNN and LMA neural networks is shown in Figures 10 and 11, respectively; a base-10 log scale is used for the vertical axis.
First, we selected the dataset containing the LAMOST DR1 spectra and their new classes given by the GRNN. The dataset was divided into 5 equal subsets used to train and test the GRNN and LMA neural networks; again, under the cross-validation technique, the training, validation, and test set errors were calculated and averaged. Table 6 presents the averaged RMS errors on the training, validation, and test sets using the LAMOST DR1 spectra and the new classification produced by the GRNN model. The same process was done with the second dataset, which contained the LAMOST spectra and their new classification yielded by the LMA neural network. The RMS error for the training, validation, and test sets in each round was calculated and averaged. Table 7 shows the averaged RMS errors obtained by the GRNN and LMA when the LAMOST DR1 spectra and the new classification given by the LMA model are used. It is noticeable from these tables that the RMS error on the test set is reduced to almost 1 spectral subtype for both techniques (Table 8). Figure 12 presents the training and test set predictions yielded by the GRNN using the LAMOST DR1 spectra with their new labels proposed by the GRNN technique trained with the Indo-US dataset, whereas Figure 13 exhibits the LMA predictions for the training and test sets using the same LAMOST GRNN classification. Figure 14 shows the training and test predictions of the GRNN using the LAMOST DR1 spectra with their new labels given by the LMA technique trained with the Indo-US dataset. Figure 15

Conclusions
In this paper, we have presented the application of two ANN techniques, a Generalized Regression Neural Network and a feedforward neural network, which were used to analyze the LAMOST stellar classification. From the first results, it is evident that both techniques were unable to accurately predict the LAMOST stellar spectra when trained and tested with the current LAMOST classes. This makes evident some inconsistencies in the classification that already appear in the LAMOST DR1 catalog. The averaged RMS error on the training and test sets over the 5 rounds was around 2.5 spectral subtypes for both neural networks.
In the subsequent data releases of LAMOST, the templates and pipelines used for automated spectral classification were improved. To test the new spectral classification, we applied the same procedure (index measurement and ANN training, validation, and subsequent classification) to analyze a set of 17,990 spectra from LAMOST DR5. The results confirm the improvement in the spectral classification.
In parallel with this evaluation of LAMOST DR5, and in order to test a new LAMOST spectral classification for DR1, the GRNN and LMA neural networks were trained with a consistent set of spectra created from the Indo-US spectral library and tested with the STELIB dataset. The most accurate ANN classifications of these databases were obtained when we classified the databases in two steps, as done in [10]. We chose the GRNN networks trained in this way to predict the classification of the LAMOST dataset. When we compared these predictions with the current LAMOST classes, we found a discrepancy: for both ANNs, there was an RMS error of around 6 spectral subtypes.
The new LAMOST predictions were analyzed separately by spectral type. We found the biggest differences with the original LAMOST classification in the early types (O and early Bs) and in the K and M types. The problem for M type stars was detected in the normalization of the spectra. We corrected the normalized spectra, and the indices measured on them produced spectral type predictions close to those reported by LAMOST.
We conclude that, since the index determination is highly sensitive to the normalization, this process must be carried out with extreme care, so we propose to design an automatic reviewer that permits us to detect this type of problem in the normalization and index measurement. We are working on this problem, and the results will be published in the near future. The problems in classifying early and late type stellar spectra support the convenience of dividing the spectra into at least three groups, as proposed by [10]. The spectral classification presented in this work allows us to separate the LAMOST spectral database into these three groups and then classify each group with the most adequate indices. We found that this process reduces the RMS errors obtained with the GRNN and LMA neural networks by at least 30%, especially for the K and M type spectra.

Data Availability
All databases used in this work are public and can be accessed through the following links: LAMOST Data Release