To model the output laser power of a copper bromide laser emitting at 510.6 and 578.2 nm, we applied two regression techniques: multiple linear regression and multivariate adaptive regression splines. The models were constructed from PCA factors of historical data, and first- and second-order interactions between predictors were taken into account. The models are easy to interpret and predict well, as their validation results show. A comparison of the derived models shows that those based on multivariate adaptive regression splines have an advantage over the others. The results clarify the relationships between laser generation and the observed laser input variables and help determine their influence on laser generation, with a view to improving the experimental setup and laser production technology. They can be useful for evaluating known experiments as well as for predicting future ones. The modeling methodology is also applicable to a wide range of similar laser devices, namely metal vapor lasers and gas lasers.

The object of research of this paper is a low-temperature copper bromide (CuBr) vapor laser emitting at wavelengths of 510.6 nm and 578.2 nm. This type of laser is one of the most promising in the group of metal vapor lasers. It is characterized as the most efficient laser in the visible spectrum, which allows for many practical applications [

Another, fundamentally different approach to metal vapor laser research is the use of accumulated experimental data to construct statistical models and to design experiments. This is a fairly new approach in the field of lasers. In principle, the physical processes and phenomena connected with lasers are considered deterministic. In practice, actual experimental measurements do not provide readings for all related physical and technical parameters and phenomena; they do not take into account specific internal and external conditions; furthermore, the measurements themselves are subject to error. Consequently, experimental data contain a number of random components, which makes them suitable for analysis (processing) using statistical methods [

In [

The aim of this paper is to construct and compare several types of parametric and nonparametric regression models for estimating and predicting the output laser power (laser generation) of CuBr laser devices. This is achieved using the PCA factors of the data and the following statistical methods: multiple linear regression (MLR) and multivariate adaptive regression splines (MARS).

Modeling was based on experimental data obtained at the Laboratory of Metal Vapor Lasers at the Georgi Nadjakov Institute of Solid State Physics, Bulgarian Academy of Sciences. For this purpose we used the statistical package SPSS, Mathematica, and the MARS predictive software [

This study includes experimental data for various CuBr lasers, published in [

The independent input laser variables included in the analysis are as follows.

The response variable is laser generation,

It should be added that laser generation is also affected by other quantities such as pulse repetition frequency, neon gas pressure, capacity of the capacitor bank, and temperature of the CuBr reservoirs. Their values for the lasers being studied have been experimentally optimized and are statistically nonsignificant (see also [

Both for all the data and for the sample, obtaining regression models based on the input variables is impeded by their multicollinearity. For this reason, the first step is a multiple factor analysis that yields mutually orthogonal factor variables describing the data cloud. Using the SPSS software on our data sample, we obtained a Kaiser-Meyer-Olkin measure of sampling adequacy KMO = 0.660 and a significance level of 0.000 for Bartlett’s test of sphericity. The respective measures of sampling adequacy (MSA) are also of significance for each variable. This indicates that factor analysis of the sample is adequate and can be carried out. The factors have been extracted using PCA. Usually the number of factors is chosen equal to the number of eigenvalues of the correlation matrix that are greater than 1. However, as shown in [

Table

Rotated component matrix. Factor loadings below 0.5 have been omitted.

Component | |||

Variable | 1 | 2 | 3 |

0.913 | |||

0.887 | |||

0.807 | |||

0.769 | |||

0.929 |

The factor scores, which are used in all methods of this study, have also been calculated at this stage.
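
The eigenvalue rule mentioned above (retain as many factors as there are correlation-matrix eigenvalues greater than 1) can be sketched in a few lines. This is only an illustration on synthetic data, not the laser data set: the six simulated variables, the random seed, and the helper names `correlation_matrix` and `jacobi_eigenvalues` are assumptions made for this example, and the sketch omits the rotation, KMO, and Bartlett steps that the study carried out in SPSS.

```python
import math
import random


def correlation_matrix(rows):
    """Pearson correlation matrix of a list of observations (one row each)."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    sds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1))
           for j in range(p)]
    z = [[(r[j] - means[j]) / sds[j] for j in range(p)] for r in rows]
    return [[sum(zr[i] * zr[j] for zr in z) / (n - 1) for j in range(p)]
            for i in range(p)]


def jacobi_eigenvalues(matrix, max_sweeps=100, tol=1e-18):
    """Eigenvalues of a symmetric matrix via cyclic Jacobi rotations."""
    a = [row[:] for row in matrix]
    n = len(a)
    for _ in range(max_sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                # Rotation angle that zeroes the (p, q) entry.
                theta = 0.5 * math.atan2(2.0 * a[p][q], a[q][q] - a[p][p])
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # update columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(n):  # update rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
    return sorted((a[i][i] for i in range(n)), reverse=True)


# Synthetic stand-in data: six correlated variables driven by two latent
# factors plus noise (NOT the CuBr laser measurements).
rng = random.Random(0)
data = []
for _ in range(100):
    f1, f2 = rng.gauss(0, 1), rng.gauss(0, 1)
    data.append([f1 + 0.3 * rng.gauss(0, 1),
                 f1 + 0.3 * rng.gauss(0, 1),
                 f1 + 0.3 * rng.gauss(0, 1),
                 f2 + 0.3 * rng.gauss(0, 1),
                 f2 + 0.3 * rng.gauss(0, 1),
                 rng.gauss(0, 1)])

eigenvalues = jacobi_eigenvalues(correlation_matrix(data))
n_factors = sum(1 for ev in eigenvalues if ev > 1.0)  # eigenvalue-greater-than-1 rule
print("eigenvalues:", [round(ev, 3) for ev in eigenvalues])
print("factors retained by the rule:", n_factors)
```

Because the eigenvalues of a correlation matrix sum to the number of variables, an eigenvalue above 1 marks a component that explains more variance than a single standardized variable would.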

Resulting factors

(a)–(c) Relationship between

(a)–(c) Relationship between

Based on these graphical relationships, in an exploratory manner, we will later on construct regression models using three groups of variables as predictors: first group

The corresponding models will be denoted as zeroth-, first-, and second-order models, respectively.

The results from the modeling are presented in this and the following sections. For parametric methods it is assumed that the data and population distributions are nearly normal. All calculations and analyses have been carried out at a significance level of 0.05. The comparison between models has been conducted via commonly used indices, such as multiple correlation coefficient

With the help of the three orthogonal PCA factors

The conducted ANOVA produced the statistics given in Table

Results from constructed parametric and nonparametric regression models for estimation of output laser power

Model | | | MARS GCV | Std. Err. of the Estimate | Number of predictors |
---|---|---|---|---|---|
MLR 0th order, no interactions | 0.946 | 0.944 | — | 7.92540 | 3 |
MLR 1st order interactions | 0.950 | 0.948 | — | 7.63382 | 4 |
MLR 2nd order interactions | 0.967 | 0.965 | — | 6.27075 | 7 |
MARS 0th order, no interactions | 0.965 | 0.963 | 0.952 | 6.50470 | 3 |
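
Fit indices of the kind reported for the MLR models can be computed from a least-squares fit as sketched below. This is a minimal illustration, not the SPSS procedure used in the study: the three synthetic "factor" predictors, the coefficients that generate the response, and the function names are assumptions for the example. The standard error of the estimate is taken as sqrt(SSE / (n - p - 1)), with p the number of predictors.

```python
import math
import random


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_mlr(x_rows, y):
    """Least-squares coefficients [b0, b1, ..., bp] via the normal equations."""
    design = [[1.0] + list(r) for r in x_rows]
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)]
           for i in range(k)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict(coefs, x_rows):
    """Predicted responses for rows of predictor values."""
    return [coefs[0] + sum(b * v for b, v in zip(coefs[1:], r))
            for r in x_rows]


def fit_statistics(y, y_hat, n_predictors):
    """Return (R^2, adjusted R^2, standard error of the estimate)."""
    n = len(y)
    mean_y = sum(y) / n
    sse = sum((a - b) ** 2 for a, b in zip(y, y_hat))
    sst = sum((a - mean_y) ** 2 for a in y)
    r2 = 1.0 - sse / sst
    dof = n - n_predictors - 1
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / dof
    return r2, adj_r2, math.sqrt(sse / dof)


# Synthetic stand-in data: three uncorrelated predictors and a noisy response.
rng = random.Random(1)
factors = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(60)]
power = [50 + 20 * f[0] + 8 * f[1] - 5 * f[2] + rng.gauss(0, 3)
         for f in factors]

coefs = fit_mlr(factors, power)
r2, adj_r2, see = fit_statistics(power, predict(coefs, factors), 3)
print("coefficients:", [round(c, 3) for c in coefs])
print("R2 = %.3f, adjusted R2 = %.3f, std. error = %.3f" % (r2, adj_r2, see))
```

The normal equations are adequate here because the predictors are few and nearly uncorrelated, which is the situation the orthogonal PCA factors are meant to create.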

The basic statistics of the constructed models are presented in Table

Figure

Values of the experimental

The histogram of the model residuals showed that the residuals of the MLR model (

Quantile versus quantile scatterplot of the regression standardized residuals of the model (

Thus, the diagnostics show that the model (

Using the nine predictors (

For the second-order model the obtained equations are, respectively,

The basic statistics of these models are given in Table

The MARS method is a relatively new but adaptable instrument for the construction of nonparametric regression models. It was developed by Friedman in [

An important advantage of MARS over the parametric approach is that it describes local changes in the data behavior. Moreover, nonlinear relationships are captured through local interactions between the generated basis functions in the respective subregions. In principle, the number of possible interactions can grow sharply when a large number (several thousand) of basis functions (BF) and a large number of subregions are involved. However, this is not the case with our data.
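
The local, piecewise-linear behavior of MARS basis functions can be made concrete with a small sketch. The knots, coefficients, and the function names `hinge_pos`, `hinge_neg`, `interaction_bf`, and `mars_predict` are hypothetical and chosen only for illustration; they are not the basis functions of the models in this study.

```python
def hinge_pos(x, knot):
    """Right hinge max(0, x - knot): nonzero only to the right of the knot."""
    return max(0.0, x - knot)


def hinge_neg(x, knot):
    """Mirrored hinge max(0, knot - x): nonzero only to the left of the knot."""
    return max(0.0, knot - x)


def interaction_bf(x1, x2):
    """Hypothetical interaction term: a product of two hinges.

    It is nonzero only in the subregion x1 > 0.5 and x2 < 0.0.
    """
    return hinge_pos(x1, 0.5) * hinge_neg(x2, 0.0)


def mars_predict(x, intercept, terms):
    """Evaluate an additive MARS-type model in one predictor.

    terms is a list of (coefficient, basis_function) pairs.
    """
    return intercept + sum(coef * bf(x) for coef, bf in terms)


# Hypothetical model: slope -1 contribution left of the knot, slope 2 right of it.
terms = [(2.0, lambda x: hinge_pos(x, 0.5)),
         (-1.0, lambda x: hinge_neg(x, 0.5))]

for x in (0.0, 0.5, 1.0):
    print(x, mars_predict(x, 10.0, terms))
print(interaction_bf(1.5, -2.0))
```

Each hinge is zero on one side of its knot, so a term only changes the prediction inside its own subregion; products of hinges restrict an interaction to the intersection of two such subregions.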

Another important advantage of MARS is that, being a nonparametric technique, it does not require the data to be normally distributed, which makes it applicable to a much broader range of problems. Furthermore, MARS can be applied to both large and small data samples, and its basis functions make the resulting models easy to interpret and subsequently use [

Within this study, only the best MARS models are calculated, for the same cases as the MLR models obtained in Section

The first model (MARS-0th order) with three predictors

Their graphs are shown in Figures

(a)–(c) Graphs of basis functions of MARS model (

The estimated values of laser generation are calculated using the formula:

The model (

The relative factor variable importance for the model (

Relative Variable Importance for the MARS model with 3 PCA factors, no interactions.

Variable | Importance | -GCV |
---|---|---|
 | 100.00000 | 1099.82605 |
 | 32.78836 | 167.41476 |
 | 16.64488 | 84.04294 |
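
The generalized cross-validation (GCV) criterion behind such importance scores can be sketched as follows. The sketch makes several assumptions that the source does not state: the effective number of parameters is taken in Friedman's form (M + 1) + d·M for M non-constant basis functions with penalty d = 3, and the relative importance is taken as the square root of the GCV increase caused by removing a variable, scaled so that the largest value is 100. The full-model GCV of 55.10 used below is not reported in the source; it is a value back-solved from the table purely for illustration.

```python
import math


def gcv(y, y_hat, n_basis, penalty=3.0):
    """GCV = MSE / (1 - C/N)^2 with C = (n_basis + 1) + penalty * n_basis.

    n_basis counts the non-constant basis functions; the penalty value is
    an assumption, not a setting taken from the study.
    """
    n = len(y)
    mse = sum((a - b) ** 2 for a, b in zip(y, y_hat)) / n
    effective_params = (n_basis + 1) + penalty * n_basis
    if effective_params >= n:
        raise ValueError("too many effective parameters for the sample size")
    return mse / (1.0 - effective_params / n) ** 2


def relative_importance(gcv_without, gcv_full):
    """Scale the GCV increase from dropping each variable to the range 0..100."""
    increases = [max(0.0, g - gcv_full) for g in gcv_without]
    largest = max(increases)
    return [100.0 * math.sqrt(inc / largest) for inc in increases]


# Values from the last column of the table above (GCV with the variable removed).
minus_gcv = [1099.82605, 167.41476, 84.04294]
# ASSUMPTION: 55.10 is a back-solved, illustrative full-model GCV.
importance = relative_importance(minus_gcv, 55.10)
print([round(v, 2) for v in importance])
```

Under these assumptions the computed scores come out close to the Importance column of the table, which suggests, but does not prove, that the software scales importance in this way.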

With the help of MARS model (

The second MARS model which we will describe in more detail is the one which accounts for possible first order interactions. The resulting best model includes the following ten basis functions and a constant:

We can see that basis functions in the best model include only five predictors:

The respective equation that can be used to calculate the estimates of

The model (

Graphs of contribution distribution of factor variables in MARS model (

In order to obtain a reliable estimate of the prediction power of each model, the following cross-validation technique is used. The initial data sample was split randomly into one training and one evaluation data set, containing approximately 70% and 30% of the total cases, respectively. The training data sets were used to generate the models, which were then tested with the independent evaluation data sets.
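
The hold-out scheme described above can be sketched as follows. This is an illustration under simplifying assumptions: a single synthetic predictor and a straight-line fit stand in for the actual factor-based models, and the seeds, sample size, and function names are chosen only for the example.

```python
import random


def train_test_split(rows, train_fraction=0.7, seed=0):
    """Shuffle a copy of the rows and split it into training and evaluation parts."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = round(train_fraction * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


def fit_line(pairs):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    b = sxy / sxx
    return mean_y - b * mean_x, b


def r_squared(pairs, a, b):
    """Coefficient of determination of the line (a, b) on the given pairs."""
    mean_y = sum(y for _, y in pairs) / len(pairs)
    sse = sum((y - (a + b * x)) ** 2 for x, y in pairs)
    sst = sum((y - mean_y) ** 2 for _, y in pairs)
    return 1.0 - sse / sst


# Synthetic stand-in sample of 100 (x, y) cases.
rng = random.Random(42)
sample = []
for _ in range(100):
    x = rng.uniform(-2.0, 2.0)
    sample.append((x, 30.0 + 12.0 * x + rng.gauss(0, 2)))

train, evaluation = train_test_split(sample, train_fraction=0.7, seed=0)
a, b = fit_line(train)                 # the model sees only the training set
r2_train = r_squared(train, a, b)
r2_eval = r_squared(evaluation, a, b)  # judged on unseen cases
print("R2 training: %.3f, R2 evaluation: %.3f" % (r2_train, r2_eval))
```

A model that generalizes well shows similar fit indices on both parts; a large drop on the evaluation set would indicate overfitting to the training cases.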

The following model is obtained using the MLR method with three predictors on the training data set (70% of the sample):

Using this model we have calculated the

Results of the cross validation of the models utilizing independent data sets (70% training set and 30% evaluation set).

Model | Training set 70% | Training set 70% | MARS GCV | Evaluation set 30% | Evaluation set 30% | Number of predictors |
---|---|---|---|---|---|---|
MLR 0th-order, no interactions | 0.945 | 0.942 | — | 0.948 | 0.947 | 3 |
MLR 1st-order interactions | 0.948 | 0.945 | — | 0.955 | 0.954 | 6 |
MLR 2nd-order interactions | 0.968 | 0.964 | — | 0.967 | 0.965 | 7 |
MARS 0th-order, no interactions | 0.963 | 0.960 | 0.952 | 0.966 | 0.965 | 3 |

For the MARS models the same validation technique is applied, using the same two independent data sets as for MLR, containing 70% and 30% of the data sample, respectively. The results from the cross-validation of all MARS models are given in Table

The initial data set includes six independent input laser variables, some of which exhibit high multicollinearity. The problem of predictor multicollinearity is commonly encountered not only in engineering data but also in ecology, medicine, and many other fields. To solve this problem, we first apply multiple factor analysis based on PCA, which yields three mutually orthogonal factor variables. These factors are then used to construct regression models. In essence this is the well-known projection method also called Principal Component Regression. We have to note that this technique usually limits the accuracy of the models because the factors do not carry all the information contained in the sample. For this reason, all constructed models are, to an extent, “rough” [

If we compare the obtained results from Table

As a whole, results obtained by modeling laser output power

Let us now consider the physics interpretation of the models. All models include the three basic factor variables (

The comparison between parametric and nonparametric methods for modeling the output laser power of a CuBr vapor laser shows that nonparametric models generally have slightly better characteristics. The constructed MARS models allowed for a more adequate description of the data in question while overcoming the problems with multicollinearity, local nonlinearities, and first- and second-order interactions between predictors. Although regression does not establish causation, the models can be of great use for evaluating known experiments as well as for predicting and guiding future experiments. The presented methods are also applicable to experimental data of similar laser devices in the group of metal vapor and gas lasers.

This study was conducted with the financial support of the Scientific National Fund of Bulgarian Ministry of Education, Youth and Science, project no. VU-MI-205/2006, and the Scientific Fund of Plovdiv University “Paisii Hilendarski”-NPD, projects IS-M4 and RS2009-M-13.