A Method for Solving LiDAR Waveform Decomposition Parameters Based on a Variable Projection Algorithm

Light detection and ranging (LiDAR) is commonly used to create high-resolution maps; however, the efficiency and convergence of parameter estimation are difficult. To address this issue, we evaluated the structural characteristics of received LiDAR signals by decomposing them into Gaussian functions and applied the variable projection algorithm of the separable nonlinear least-squares problem to the process of waveform fitting. First, using a variable projection algorithm, we separated the linear (amplitude) and nonlinear (center position and width) parameters in the Gaussian function model; the linear parameters are expressed with nonlinear parameters by the function.-ereafter, the optimal estimation of the characteristic parameters of the Gaussian function components was transformed into a least-squares problem only comprising nonlinear parameters. Finally, the Levenberg–Marquardt algorithm was used to solve these nonlinear parameters, whereas the linear parameters were calculated simultaneously in each iteration, and the estimation results satisfying the nonlinear least-square criterion were obtained. Five groups of waveform decomposition simulation data and ICESat/GLAS satellite LiDARwaveform data were used for the parameter estimation experiments. During the experiments, for the same accuracy, the separable nonlinear least-squares optimization method required fewer iterations and lesser calculation time than the traditional method of not separating parameters; the maximum number of iterations was reached before the traditional method converged to the optimal estimate. -e method of separating variables only required 14 iterations to obtain the optimal estimate, reducing the computational time from 1128 s to 130 s. -erefore, the application of the separable nonlinear least-squares problem can improve the calculation efficiency and convergence speed of the parameter solution process. It can also provide a new method for parameter estimation in the Gaussian model for LiDAR waveform decomposition.


Introduction
Light detection and ranging (LiDAR) is an active remote sensing measurement system that integrates a laser altimeter, a global navigation satellite system (GNSS), and an inertial measurement unit (IMU) [1]. Currently, LiDAR is widely employed for distance measurements [2], vegetation inversion [3,4], urban modelling [5], marine mapping [6], and other such applications. In such applications, the received waveform is mathematically described as the convolution of the emitted laser pulse and ground response function [7]. e geometry, physical characteristics, and vertical distribution of the objects irradiated by the laser pulses can be obtained by analyzing the properties of the received waveform, such as its amplitude and pulse width.
Waveform decomposition is an important step when processing full-waveform LiDAR data. e reflected pulse corresponding to each object is extracted from the received signal, using waveform modelling. Using LiDAR fullwaveform data is a prerequisite for terrain and building reconstruction [8]. As LiDAR data are dependent on several factors, such as the background light and electrical noise from the avalanche photodiode, the selection of an accurate decomposition model and an efficient and fast algorithm is important to ensure accurate and precise results. For a majority of the received signals, amplitude modulation approximates a Gaussian distribution; therefore, waveform signal decomposition based on Gaussian fitting is the most popular method used for LiDAR full-waveform decomposition [9,10]. Hofton et al. [11] screened the waveform data according to the importance of the Gaussian component and then solved the optimal value of each waveform component parameter using the nonlinear optimization Levenberg-Marquardt (LM) algorithm. Wagner et al. [12] analyzed the theoretical basis of simulating a LiDAR waveform using a Gaussian mixture model and obtained the pulse amplitude, half-width, and distance via Gaussian decomposition.
ese factors were included in the calibration equation, and the data were calibrated using an external reference to estimate the target backscattering cross section. Lai et al. [13] decomposed the waveform components from the original waveform data, layer by layer, until the maximum peak value of the remaining waveform was less than a certain threshold. ereafter, the initial parameters were optimized using the limited-memory Broyden-Fletcher-Goldfarb-Shanno algorithm, and the optimal solution of the waveform component was finally obtained. Lu et al. [14] improved the second-order reciprocal zero-point method to detect the initial position of waveform components and then iteratively optimized the initial value using the LM algorithm to achieve least-squares fitting of the waveforms.
In general, the Gaussian decomposition of waveforms consists of two steps: (1) determining the number of Gaussian components and their initial parameters and (2) optimizing and postprocessing the characteristic parameters. A majority of parameter estimation methods regard all Gaussian parameters as nonlinear variables [15] and then employ the nonlinear optimization method to solve the parameters. Liu et al. [16] combined this method with sequential quadratic programming (SQP), developing a gradient-based optimization algorithm; it determined the optimal time delays and system parameters in a novel dynamic optimization problem for nonlinear multistage systems. To address the randomness of setting the final iteration condition artificially in the process of obtaining the optimal valuation, Zhai et al. [17] proposed a novel characteristic value correction iteration method, and L curve was drawn to express the correlation of solutions and residuals and the iteration time corresponding to the maximum curvature point is chosen as the iteration termination condition to obtain the "optimal" intermediate estimates. To enhance the estimation accuracy, Ding et al. [18,19] presented a filtering-based gradient iterative algorithm and a filteringbased least-squares iterative algorithm, which yielded an improved convergence speed. However, a majority of the classic LiDAR waveform decomposition methods focus on determining the initial parameters and the iterative algorithm of parameter optimization. ese methods neglect the feature that the full-waveform model is a linear combination of multiple Gaussian components. Moreover, the traditional nonlinear optimization method relies on the initial value of all parameters involved in the calculation, which requires a high demand on the parameter optimization algorithm. If some of these initial parameters are not selected appropriately, the waveform parameters may not converge to the optimal estimation value, thereby worsening the waveform processing effect.
Golub and Pereyra proposed an explicit variable projection (VP) method for solving the separable nonlinear least-squares problem in 1973 [20]. Compared with the corresponding algorithm, which directly solves the nonlinear minimum resultant problem, it featured three advantages: fewer iteration steps, fewer initial point guesses, and potentially reduced ill-conditioned degree when the original problem is ill conditioned. Based on this method, the improvement and application [21][22][23] of VP were conducted. Kaufman [24] proposed a modified VP algorithm based on trapezoidal decomposition to simplify the calculation, which reduced its computational complexity and improved computational efficiency. Subsequently, Ruano et al. [25] proposed an improved VP algorithm based on orthogonal triangle (QR) decomposition for the sparse nonlinear function matrix, which also improved operation efficiency. According to different calculation methods of the Jacobian matrix of variable projection algorithm, Gan et al. [26] summed up four separation algorithms to solve the problem of separable nonlinear least squares and proved that it is usually very beneficial to separate the variables into a linear and a nonlinear part. Furthermore, for cases where the number of observations was significantly higher than the number of linear parameters, Gan and Li [27] proposed a VP algorithm based on Gram-Schmidt matrix decomposition, which reduces the amount of calculation required. In addition, Chen et al. [28] proposed a modified Gram-Schmidt method-based variable projection algorithm for separable nonlinear models and verified its effectiveness experimentally. Wang et al. [29] proposed a variable projection algorithm based on singular value decomposition (SVD) to solve the separable nonlinear least-squares problem. SVD was adopted to simplify the VP algorithm and improve the stability of the calculation. Subsequently, after the elimination of the linear parameters by the improved VP algorithm, the separable least-squares problem was transformed into an optimization problem only containing nonlinear parameters.
In this study, we applied the variable projection algorithm based on the SVD of the separable nonlinear leastsquares problem for the waveform fitting of received LiDAR signals. Linear parameters were expressed with nonlinear parameters by the function, and the problem was then transformed into a least-squares problem only containing nonlinear parameters. Finally, the improved LM algorithm [30,31] was employed to solve the nonlinear parameters, whereas the linear parameters were solved using the variable projection algorithm for nonlinear functions. e remainder of this paper is organized as follows. In Section 2, the method for determining the number of components and the initial values of the Gaussian decomposition parameters for LiDAR waveforms is presented, along with a brief description of the traditional method for estimating waveform parameters. Section 3 presents the proposed method for separating linear and nonlinear parameters in the waveform model based on a variable projection algorithm; the optimization algorithm and the processing of waveform parameters after separation are also discussed in this section. Subsequently, the methods with and without separation of the linear and nonlinear parameters, i.e., the proposed and existing methods, 2 Complexity respectively, are compared using numerical experiments, as described in Section 4. Finally, the conclusions and implications of the study are discussed in Section 5.

Gaussian Decomposition of LiDAR Signals
A Gaussian mixture model with N Gaussian functions was used to describe the received LiDAR waveform according to the radar range formula. After fitting and decomposing, the received pulses related to single objects are separated and extracted. Finally, the attribute information of each signal received from the ground is obtained.
Assuming that a received waveform has m sampling points and is composed of N waveform components, the waveform based on the Gaussian function model is expressed as follows: where a s , μ s , and σ s are the pulse amplitude, center position, and half-width of the s th waveform component, respectively.
Here, noise denotes the noise in the original waveform. e aim of full-waveform Gaussian decomposition is to optimize 3N parameters of Gaussian components, where the number of linear parameters (a s ) is N and the number of nonlinear parameters (μ s and σ s ) is 2N.

Determining the Number of Gaussian Waveform Components and Parameter Initialization.
To determine the components of the Gaussian waveform, the amplitude threshold is set and the first derivative of the function is calculated. When the maximum value is greater than the amplitude threshold, it is determined as the peak value. ereafter, the second derivative of the function is calculated, and the inflection point is obtained. e time difference between two adjacent inflection points should be greater than the half-width of the transmitted pulse. Finally, the peak and inflection points are matched, where the position of the peak should lie between the two inflection points.
After determining the number of Gaussian function components, the feature parameters are initialized, where t 2s− 1 and t 2s are set as the sampling times of adjacent inflection points. Subsequently, the center position μ s and the half-width σ s of the s th Gaussian component are calculated using equations (2) and (3), respectively. e pulse amplitude a s refers to the amplitude corresponding to the center position:

LM Optimization without Waveform Parameter
Separation. After determining the number of components and initial parameters of the Gaussian function, the most commonly used method regards both linear and nonlinear parameters as nonlinear parameters and subsequently solves them using the LM algorithm, for the nonlinear leastsquares optimization [20]. e objective function is constructed by minimizing the sum of the squares of errors between the fitted and measured waveforms after establishing the Gaussian function model. According to the principle of nonlinear least squares, the solutions of linear parameters a � (a 1 , a 2 , · · · , a N ) Τ and nonlinear parameters b � (μ 1 , μ 2 , · · · , μ N , σ 1 , σ 2 , · · · , σ N ) Τ minimize the following functions: is the s th nonlinear function component, r is the residual vector, and ‖·‖ is the Euclidean norm. e parameters to be determined are e function model is expressed as e objective function is established under the leastsquares principle: e Jacobian matrix J(c) consisting of the first-order partial derivatives of the residual vector r is us, the Hessian matrix of r is approximated as erefore, the LM optimization algorithm is used to iterate the parameters: is the iteration direction, and α (k) is the iteration length. To ensure that the objective function, i.e., Equation (6), is always in a descending state, α (k) is calculated using the Armijo criterion [32] of imprecise search. e term λ is adjusted using a strategy similar to the radius of the trust region [33]. First, a quadratic function is defined at the current iteration point to represent the quadratic approximation of the objective function: ereafter, the ratio of the actual decrease in v(d) to the expected decrease in the objective function F is calculated based on the iteration direction: For an initial value of λ, d (k) is calculated at each step in the LM algorithm. Subsequently, λ (k) is adjusted according to the value of η (k) . Finally, a line search is performed to complete the iteration. When η k is close to one, the fitting between the quadratic function v(d) and the objective function is good at point x k . e parameter λ should be smaller when the LM algorithm is used to solve the nonlinear least-squares problem. When η k is close to zero, the fitting between the quadratic function v(d) and the objective function is poor at point x k . e parameter λ should be larger when LM is used to solve the nonlinear least-squares problem. When η k is neither close to zero nor one, λ is suitable and does not require adjustment. e critical values are typically 0.25 and 0.75, and the rules of adjusting λ are as follows:

Gaussian Model Based on a Variable Projection Algorithm.
Considering the structural characteristics of the functional model for the received LiDAR signal, we derived a new equation as a linear combination of exp(− (x − μ s ) 2 /2σ 2 s ) with respect to a s . Equation (4) is written in the matrix form as where y is the vector of the original signal. e column of Φ(b) corresponding to the nonlinear function is ϕ k (b; x) for parameter b. For the given nonlinear parameters b, the linear parameter a can be estimated by solving the following linear least-squares problem: where Φ + (b) is the pseudoinverse of Φ(b). Substituting (20) in (19) yields where P Φ(b) is the orthogonal projector on the linear space spanned by the columns of Φ(b) and P ⊥ Φ(b) is the projector on the orthogonal complement of vector y in the column space. To simplify the calculation, the matrix composed of nonlinear functions is decomposed using SVD: where U is an m × m orthogonal matrix, S is an m × p diagonal matrix, and V is a p × p orthogonal matrix. We obtained Φ(b) as us, If the rank of the matrix Φ(b) is r, the first r elements on the diagonal of S are nonzero, i.e., r ≤ p. erefore, U can be divided into an m × r matrix U 1 and an m × (m − r) matrix U 2 and V can be divided into a p × r matrix V 1 and a p × (p − r) matrix V 2 : 4 Complexity erefore, the minimization problem of the objective function (15) is simplified as

Optimal Estimation Method for Waveform Gaussian
Parameters. To fit the separated LiDAR signal with the Gaussian decomposition model, we combined the differential equation of the pseudoinverse matrix and the variable projection algorithm. e Jacobian matrix [20] of the separated objective function is e objective function only containing nonlinear parameters is optimized using the LM algorithm. To ensure global optimization, once the nonlinear parameters are calculated for a given iteration, the linear parameters are calculated using equation (15) until the overall gradient is less than a threshold value. For the convenience of expression, in the following text, the traditional LM method without the separation of parameters is referred to as LM unSep . Moreover, the separable nonlinear least-squares solution method (i.e., LM method with the SVD variable projection algorithm for parameter separation) is referred to as LM VP . e optimization process for the waveform decomposition parameters based on the LM VP method is summarized in Algorithm 1.   Table 1.

Numerical Experiment
Due to the influence of noise, the actual waveform distribution was not completely Gaussian. Based on the simulation data, random noise obeying a standard deviation of 0.5 was added to obtain the final five groups of discrete received signals. First, according to the parameter initialization method, the initial values were selected. ereafter, the waveform parameters were iteratively calculated using both methods for a comparison.

Comparisons of Solution Results in Simulation
Experiment. For the five groups of data, the results of the optimization methods with and without parameter separation are listed in Table 2.
From Table 2, it is evident that the results for both methods were identical, except in the case of group 4. For a clearer visualization of the decomposition and fitting effect, the original observation value, decomposition waveform, and fitting waveform obtained using both methods for the five groups of signals are depicted in Figures 1 and 2. As the fitting waveform was identical to the decomposition waveform obtained using the LM unsep method and the LM VP method, except in the case of group 4, groups 1, 2, 3, and 5 were represented as a graph, as depicted in Figure 1. Figures 2(a) and 2(b) present the results for group 4 obtained using the LM unsep method and the LM VP method, respectively.
As shown in Figure 1, the decomposition waveforms calculated via the two methods are identical. In Figure 2, the decomposition waveforms and the fitting waveforms obtained for group 4 are inconsistent. After comparing Figures 2(a) and 2(b) and the parameter results in Table 2, we observe that the optimization method and the decomposition waveform based on variable projection separation fit the received signal more closely than those of the traditional optimization method. Moreover, when the iterations reach the maximum limit, the traditional parameter optimization method did not converge on the optimal value.

Analysis of Simulation Example Results.
To analyze the effect of the parameter optimization method based on variable projection separation, we compared it with the
Step 3: compute the iteration length α (k) using the Armijo criterion and the iteration direction d (k) .
Step 5: compute a (k) according to the variable projection algorithm.
where y i is the original received signal and y i ′ is the fitted received signal.
where y i is the average of y i .
where y i ′ is the mean of y i ′ . RMSE values closer to 0 indicate a better fit, whereas R 2 and C values closer to 1 indicate a better correlation between the fitting result and the original signal.

Complexity
where M_diff is the maximum absolute value of the local difference between the received signal and the fitted signal.
Subsequently, the evaluation coefficient results were compared, as shown in Table 3. Table 3 indicates that the calculation for group 4 data using the LM unSep method required the highest number of iterations (100), and its results differed significantly from the optimal estimate (as shown in Figure 2). However, the proposed improved LM VP method yielded the optimal estimate after just 13 iterations. Similarly, for the other groups of data, the improved method required fewer iterations than the LM unSep method, while yielding a significantly improved computational efficiency. e RMSE, R 2 , C, and M_diff values were identical for both methods and all data, except in the case of group 4, wherein both methods obtained the optimal estimation.

Convergence Analysis in Simulation Experiment.
To further compare the results of the LM unSep and LM VP methods, the parameter iteration process for the five groups of data is depicted in Figure 3. Figure 3(a) presents a comparison between the iteration processes of the linear parameter a s for the five groups using the two different methods. Figures 3(b) and 3(c) present the same comparison for the nonlinear parameters μ s and σ s , respectively. Both the methods converged to the optimal value; however, the LM VP method required fewer iterations, while achieving faster convergence.

Example 2: Decomposition of Real Full-Waveform LiDAR
Data. In the real experiment, we used the ICESat/GLAS satellite LiDAR waveform data for the terrain of Germany in 2003 (Sources: https://nsidc.org/data/), with a sampling interval of 1 ns and a sampling number of 544.

Comparisons of Solution Results in a Real Experiment.
For the LM unSep method, we conducted two experiments that were restricted by different maximum number of iterations, i.e., 100 and 200 iterations. e results of the parameters were calculated using the LM unSep and LM VP methods (Table 4).
In the case of the LM unSep method, both experiments reached the maximum number of iterations; the results for 200 iterations was always closer to the LM VP method than those for 100 iterations. When the LM unSep method was allowed to run 200 iterations, it approached the results from the LM VP method; however, it never met them. erefore, to obtain an optimal estimation value from the LM unSep method, more than 200 iterations and additional time is required. Figure 4 presents a clearer comparison between the decomposition and fitting effect for the two models, original observation value, decomposition waveform, and fitted waveform of the received LiDAR signals. Figure 4(a) presents the LM unSep graph for the 200-iteration run, and Figure 4(b) depicts the LM VP graph.
As in the comparison of the parameters, the fitting and decomposition waveforms for the LM unSep and LM VP methods yield similar results; however, the LM VP method always yields the best fit.

Analysis of the Real Experiment Example Results.
We analyzed the parameters in the LM unSep and LM VP method calculations to determine the evaluation coefficients (Table 5).
e two experiments using the LM unSep method failed to achieve the optimal result with the maximum number of iterations; however, the LM VP method achieved the optimal result within 14 iterations, i.e., a reduction of 14 times considering the 200-iteration case. e RMSE from the LM VP method was the lowest, whereas the RMSE from the LM unSep method using 200 instead of 100 iterations was closer to that of the LM VP method. Considering the goodness-of-fit and correlation coefficient, the LM VP method yielded values closer to 1 than the traditional method, indicating better fitting. Similarly, M_diff was lower for the improved method. In conclusion, in the real numerical experiment, the LM VP method achieved higher accuracy and better fitting than the LM unSep method.

Convergence Analysis in Real Experiment.
e parameter iteration process depicted in Figure 5 provides a more clear comparison between the results of both models in the real experiments.

Complexity
Although the LM unSep and LM VP methods converge to a similar value, the LM VP method requires fewer iterations and achieves faster convergence. For the LM unSep method, there are only minor changes in the optimal value during the later iteration steps; however, for the LM VP method, the optimal value converges in 14 iterations, achieving a higher convergence speed. erefore, in summary, the simulation and real experiments prove that the LM VP method yields better results than the LM unSep method and more efficient calculations.

Conclusions
is study focused on the special case of LiDAR signals that can be decomposed into a linear combination of multiple Gaussian function components. e linear and nonlinear parameters are separated using the SVD variable projection algorithm, and the original problem is transformed into a least-squares problem comprising only nonlinear parameters. Compared with the traditional method of parameter estimation without separation, the proposed method increased the probability of global convergence by reducing the dimension of the parameters to be estimated, especially in terms of reducing the number of iterations and computational time. In addition, the new method yielded better waveform fitting, which increased the accuracy of extracting the ground feature information from the received signal. e findings of this study are expected to expand the application of separable nonlinear least-squares methods, thereby providing a reliable solution for parameter estimation when processing LiDAR signal data.

Data Availability
e data used to support the findings of this study are available from the corresponding author upon request.  the Graduates Innovation Fund of Shandong University of Science and Technology (grant no. SDKDYC190207).