Estimation of Oceanic Eddy Viscosity Profile and Wind Stress Drag Coefficient Using Adjoint Method

Adjointmethod is used to assimilate pseudoobservations to simultaneously estimate theOEVP and theWSDC in an oceanic Ekman layer model. Five groups of experiments are designed to investigate the influences that the optimization algorithms, step-length, inverse integral time of the adjoint model, prescribed vertical distribution of eddy viscosity, and regularization parameter exert on the inversion results. Experimental results show that the best estimation results are obtained with the GD algorithm; the best estimation results are obtained when the step-length is equal to 1 in Group 2; in Group 3, 8 days of inverse integral time yields the best estimation results, and good assimilation efficiency is achieved by increasing iteration steps when the inverse integral time is reduced; in Group 4, the OEVP can be estimated for some specific distributions; however, when the VEVCs increase along with the depth at the bottom of water, the estimation results are relatively poor. For this problem, we use extrapolation method to deal with the VEVCs in layers in which the estimation results are poor; the regularization method with appropriate regularization parameter can indeed improve the experiment result to some extent. In all experiments in Groups 2-3, the WSDCs are inverted successfully within 100 iterations.


Introduction
The development of ocean observation technology provides a wealth of fruits, especially the satellite data as well as acoustic doppler current profilers (ADCP) data.It is a mass of observations that make the application of data assimilation techniques to oceanography possible.In recent years, researchers have developed some data assimilation methods, such as successive corrections [1], optimal interpolation [2,3], Kalman filtering [4,5], and variational assimilation method (most notably adjoint method) [6].
During the past few decades, many researches have showed that the adjoint method occupies an increasingly important position in data assimilation [7].The adjoint method is an advanced method with a combination of variation principle and optimal control theory [8].The method has an advantage of directly assimilating various observations distributed in time and space into numerical models while maintaining dynamical and physical consistency with the model.The adjoint method can be used to solve numerous different types of practical problems, such as estimating model parameters, inverting initial conditions, and researching steady flows and circulation problems [9,10] and surface heat flux problems [11].So far, the adjoint method has been widely used for parameter estimation and initial conditions inversion.Yu and O'Brien [12] estimated the wind stress drag profile and the oceanic eddy viscosity profile (OEVP) based on a modified Ekman layer model.Lardner et al. [13] estimated the bottom drag coefficients for a two-dimensional numerical tidal model of the Arabian Gulf.Chen et al. [14] constructed an adjoint model for simulation of internal tides and simply tested with a series of ideal experiments in which several prescribed spatial distributions of OBCs were successfully inverted by assimilating the model-generated pseudoobservations.Later on, studied the estimation of open boundary conditions and the bottom friction coefficients based on the adjoint model.Cao et al. [15] optimized the open boundary conditions in a 3D internal tidal around Hawaii.Seiler [16] estimated open boundary conditions with the adjoint method based on a quasi-geostrophic ocean model for a midlatitude jet.Yu and O'Brien [17] researched the initial conditions adjustment and parameter estimation simultaneously with the adjoint method.Peng et al. [18,19] corrected the errors in the initial conditions in storm surge simulation using the adjoint method.Predecessors have done a vast amount of work to research the OEVP.Lentz [20] studied the sensitivity of the inner-shelf circulation to the form of the eddy viscosity profile.Pohlmann [21] calculated the annual cycle of the vertical eddy viscosity in the North Sea with a three-dimensional baroclinic shelf sea circulation model.Tian et al. [22] studied the vertical distribution of diapycnal diffusivity in the South China Sea, the Luzon Strait, and the North Pacific based on the observations.In this paper, the adjoint method in conjunction with an oceanic Ekman layer model is used to obtain the optimal estimates of the vertical distribution of eddy viscosity and the surface wind drag coefficients from pseudoobservations of wind velocities and current velocity profiles within the water columns.At present, the error of ADCP survey technique could be controlled within 5 meters.Therefore, the Ekman layer (the thickness is 100 meters in this paper) could be divided into 20 layers evenly.
To solve the problem above, good many feasible largescale optimization algorithms can be found in [23].However, the studies discussing the performances of various optimization algorithms in the meteorological and oceanographic application are still relatively rare.The gradient descent (GD) algorithm is a fundamental and probably the earliest algorithm for nonlinear optimization so that it attracts many researchers' attention since it was proposed.It has the negative gradient direction as the search direction and is one of the simplest minimization algorithms that require calculation of derivatives.The GD algorithm is particularly useful when the dimension of the problem is very large.The limited-memory BFGS (L-BFGS) algorithm is an adaptation of the standard BFGS algorithm to large-scale optimization problems.This algorithm has been proved to be globally convergent for convex objective functions and provides a fast rate of linear convergence as well as requiring minimal storage.The conjugate gradient (CG) algorithm is an algorithm that establishes a link between conjugate direction algorithm and steepest descent algorithm, in order to obtain a good algorithm which is effective and has good convergence.
The organization of this paper is as follows.The numerical model is given briefly in Section 2. In Section 3, five groups of numerical experiments and results analysis are performed.To be specific, in the first group, three optimization algorithms are used to investigate which is more effective; in the rest of the groups, the influences that step-length, inverse integral time of the adjoint model, prescribed vertical distribution of eddy viscosity, and regularization parameter exert on the inversion results are investigated, respectively.A summary of the results and conclusions are given in Section 4.

Numerical Model
A horizontally unbounded ocean surface layer with depth  0 is considered.The -axis points vertically upwards, and the sea surface is located at  = 0.The Coriolis parameter, , is taken to be a constant.The oceanic Ekman layer model is governed by the following equations: The corresponding boundary conditions are The initial conditions are where () (positive to the east) and V() (positive to the north) are horizontal velocity components in ocean, while   and V  are the wind velocity components.Assume that the wind force is only along the -direction, and so V  is 0.   is the density of water, and   is the density of air.() is the vertical eddy viscosity coefficients (VEVCs) while   is the wind stress drag coefficient (WSDC), which are the parameters to be estimated.The VEVC is a function of depth.The numerical model is formulated using a difference discretization on a grid with spatial increment Δ.Model parameters are set as follows:  = 10 −4 s −1 ,   = 1.2 kgm −3 ,   = 1.025 × 10 3 kgm −3 , Δ = 5 m, Δ = 0.5 h,  0 = 100 m,   = 10 sin(2 * / 0 ), and  0 = 10 h.A cost function which quantifies the discrepancy between the model results and the observations can be constructed as where the caret denotes the observation values.  is the weighting matrix and theoretically should be the inverse of observation error covariance matrix [24].However, determining the "exact" form is far from easy [25].The general practice is to assume that each observation is not relevant and is equally important.So the covariance matrices can be set to unit matrices.Therefore,   is simplified to be unit matrix [26].Equations (1a), (1b), (1c), and (1d) can be enforced by introducing a set of undetermined Lagrange multipliers, which are essentially the adjoint variables.This leads to the formation of the Lagrange function, which is given by where  and  are Lagrange multipliers for the -component and V-component, respectively.
The aim is that the cost function is minimized subject to the initial boundary value problem (1a), (1b), (1c), and (1d).For this purpose, the adjoint equations derived from the Lagrange multiplier method are given as follows: The corresponding boundary conditions are The corresponding initial conditions are where  ( = 10 days) is the total integral time of this oceanic Ekman layer model.The currents are simulated by integrating positive (i.e., oceanic Ekman layer) model, while the control parameters are optimized by integrating inverse (i.e., adjoint) model.In general, the inverse integral time of inverse model is equal to the integral time of positive model.The gradient of the Lagrange function  with respect to  and   yields As shown in Section 1, there are many feasible largescale optimization algorithms.Take the GD algorithm [27] for instance, the control parameters can be optimized using the following formulas: where  is the cost function which quantifies the discrepancy between the model results and the observation values and (/) 0 ((/) 0 = (/)/‖/‖) denotes the normalized gradient with respect to , and where  1 and  2 are step-lengths which are used to adjust the parameters preferably.

Numerical Experiments and Results Analysis
In the first place, the performances of GD, CG, and L-BFGS are compared via a group of ideal experiments in Section 3.1.
In the CG algorithm, the coefficient of search direction employs FR formula, which is showed by Fletcher and Reeves [28].The FR formula is given as follows: where   is the coefficient of search direction of th-iteration and   is the gradient of the cost functions with respect to control parameters of th-iteration.
As we all know, step-length affects the result involved with numerical optimization.If VEVC is estimated successfully when the inverse integral time of the adjoint model is reduced, the workload of calculation will be saved greatly.The actual oceanic eddy viscosity profile cannot be observed and it is unknown.If different prescribed distributions of VEVCs are estimated successfully, it can be inferred that the model based on the adjoint method has a good ability to estimate the VEVCs.Based on the above considerations, the following four groups of experiments are carried out in order to investigate the influences that the three optimization algorithms, step-length, inverse integral time of the adjoint model, and prescribed distribution of the VEVCs exert on the inversion results, respectively.In all four groups, the given value of the WSDC is set to be 0.0012.Therefore, the effect of  2 is not investigated in this paper, and the related steplength,  2 , is set to be a constant.The initial values of VEVC and WSDC are 0.008 and 0.0007, respectively.In the first two groups, the prescribed OEVP refers to Yu and O'Brien [12].

Group 1.
The performances of GD, CG, and L-BFGS algorithms are compared via a series of ideal experiments in this group.Because of simplicities of the WSDCs in this model, the experiments are performed with only VEVCs briefly.The inverse integral time of the adjoint model is 10 days.In the process of parameter optimization, we tried different iteration numbers.And the best result in experiments with different iteration numbers is adopted as the result of the algorithm.Clearly, the step-length is set to 1 as a matter of priority [23].
The prescribed and inverted VEVCs and variation curves of cost functions are shown in Figure 1.The root mean square errors (RMSEs) between VEVCs before and after assimilation are listed in Table 1.  Figure 1(a) shows a good agreement between model results and given values of the VEVCs with both GD and CG algorithms, while the inversion result is going too far with L-BFGS algorithm.The cost functions obtained with three algorithms decline differently, and the cost function with CG algorithm declines fastest while the one with L-BFGS algorithm is poor.Table 1 shows that the RMSEs in VEVCs after assimilation with GD and CG algorithms are reduced, and the RMSEs with GD algorithm are the smallest while the one with L-BFGS algorithm is increased rather than decreased compared with the value before assimilation.If the scaling of prescribed values of VEVCs is enlarged by 10 times (in fact, these values are not reasonable physically), the result is moderately good with L-BFGS algorithm; see Figure 2.
Based on the results above, we choose the GD algorithm to estimate the VEVCs and WSDC in the following groups of experiments.

Group 2.
The purpose of Group 2 is to investigate the influence that the step-length exerts on the inversion results.The inverse integral time of the adjoint model is 10 days.In the process of parameter optimization, we tried 1000 iterations.Clearly, the step-length is set to 1 as a matter of priority [23].In consideration of the gradient of the cost function with respect to the VEVCs reduced by its norm in (5),  1 is determined with the following four strategies:  where  represents the control parameter (i.e., VEVCs).The subscript 1 means the first iteration, and  means the th iteration.
The RMSEs between VEVCs before and after assimilation are listed in Table 2.The prescribed and inverted VEVCs are shown in Figure 3.The variation curves of cost functions and their norms are shown in Figure 4, and only the cost functions and the norms of the first 200 iterations are plotted.The time series of the modeled and "observed" current fields with strategy 1 are shown in Figure 5.The prescribed and inverted values of WSDC with all 4 strategies are shown in Figure 6.
Figure 3 shows a good agreement between model results and given values of the VEVCs with four strategies.The cost functions obtained with all four strategies decline remarkably after 200 iterations in Figure 4, and they are all reduced by 4 orders of magnitude after 1000 iterations.The  2 norms of the gradients of the cost functions with respect to VEVCs decline remarkably with four strategies.Table 2 shows that the RMSEs in VEVCs after assimilation are reduced by 2 orders of magnitude compared with the values before assimilation.The results obtained with strategy 2 are not very satisfactory in spite of its steady decline in the cost function.The value obtained with strategy 1 is 1.68 × 10 −4 and is the minimal among those for all strategies.The simulated velocities are highly consistent with the pseudoobservations at the depths of 25 m and 75 m. Figure 5 shows the modeled (black dotted) and observation values (black solid) current velocities and their differences (blue solid) at the depths of 25 m and 75 m after 1000 iterations in Group 2 with strategy 1.The difference is very small and its maximum value is 10 −4 in magnitude.At other depths of water, the simulated velocities are also highly consistent with the observation values.The WSDCs are able to be successfully inverted after 95, 51, 37, and 47 iterations with four strategies, respectively.
The results show that the oceanic Ekman layer model with adjoint method could estimate OEVP and WSDC successfully; the step-length has an effect on inversion results: the best assimilation efficiency can be obtained with strategy 1; that is, the step-length which is equal to 1 obtains the best results in Group 2.

Group 3.
In this section, the influence that the inverse integral time of the adjoint model exerts on the inversion is investigated.Strategy 1 described in Group 2 is used for the step-length.The iteration number is set to be 500 mainly based on the following considerations: (1) the misfit between observation values and the simulated results is very small and approximately constant when it declines to a certain value after 500 iterations; (2) the cost functions are almost no longer falling after 500 iterations.A series of inverse integral times are tested, that is, 10 days, 9 days, 8 days, 7 days, 6 days, 5 days, 4 days, 3 days, 2 days, and 1 day.
Figure 7 shows the prescribed and inverted VEVCs, which indicates that the OEVP are inverted successfully with 10 different inverse integral times.
The RMSE in VEVCs before assimilation is 1.57 × 10 −2 , and the RMSEs after assimilation are shown in Figure 8.The RMSEs in VEVCs are reduced by 2 orders after 500 iterations, and the minimal value is 2.42 × 10 −4 with 8 days of inverse integral time.Even if the iterations are increased to 1000, the minimal RMSE is also obtained with 8 days of inverse integral time.When the inverse integral time is reduced to 1 day, the inversion results after 500 iterations are not satisfying.However, if the iteration steps are increased to 2000, we still get a sound result.For the other experiments, we can get preferable results by increasing iteration steps appropriately.That is, a good assimilation efficiency is achieved by increasing iteration steps if the inverse integral time is short.We can also see that the result obtained with 10 days of inverse integral time is not the best one, since the first few days is an unstable process.In fact, the cost functions for all experiments can decline remarkably by more than 3 orders of magnitude after 500 iterations (because of the similarity with the cost functions in Figure 4 and it is not shown here).The WSDCs are also able to be successfully inverted within 100 iterations (because of the similarity with the results in Figure 6 and it is not shown here).
The result indicates that the inversion effects can be associated with the inverse integral time of the adjoint model, and the best assimilation efficiency is obtained with 8 days of inverse integral time.

Group 4.
In order to demonstrate the capability of the oceanic Ekman layer model with adjoint method in  estimating the OEVP, different distributions of the VEVCs are designed in this section.Strategy 1 described in Group 2 is used to determine the step-length, and the inverse integral time of the adjoint model is 8 days.One conclusion drawn from experiments of Group 3 is that preferable results can be obtained by increasing iteration steps appropriately.Therefore, in this group, different iteration steps are adopted according to the distributions of the OEVCs.Six different distributions are designed as follows:     We obtained reasonable inversion results after 1000 iterations for distribution 1.For some complicated distributions, increasing iteration steps appropriately can also yield good inversion result.Taking distributions 3, 5, and 6, for examples, the iteration steps need to reach 20000.For other complicated distributions, such as distributions 2 and 4, however, the inversion results are poor at the bottom of water even though the iteration number is very large.That is, when the VEVCs increase along with the depth at the bottom of water, the estimation results are relatively poor.The RMSEs between VEVCs before and after assimilation are listed in Table 3, which shows that the RMSEs in VEVCs after assimilation are reduced by 2 orders of magnitude compared with the values before assimilation except distributions 2 and 4. The prescribed and inverted VEVCs are shown in Figure 9, which shows a good agreement between the prescribed and inverted VEVCs except distributions 2 and 4. Actually, after assimilation, the cost functions for all six distributions decline remarkably by more than 3 orders of magnitude (because of the similarity with the cost functions in Figure 4 and it is not shown here).Still, the WSDCs are able to be successfully inverted within 100 iterations (because of the similarity with the results in Figure 6 and it is not shown here).
The estimation results are relatively poor when the VEVC increases along with the depth at the bottom of water (i.e., distributions 2 and 4).We design new strategy to solve the problem.Because there are only a few layers at the bottom of water (called the bottom layers in this paper) where the results are poor, the extrapolation method can be used to calculate the VEVCs.In this paper, linear extrapolation is applied because of its simplicity.More specifically, the VEVCs in bottom layers are calculated by linear extrapolation The results indicate that the prescribed distributions at the bottom of water are inverted successfully with extrapolation method.In practical application, we can learn from the new strategy to deal with the VEVCs in a few layers at the bottom of water.From the experimental results in this group, it is inferred that the oceanic Ekman layer model with adjoint method is likely to have a good capability of estimating the spatially varying VEVCs.This brings hope for us to apply the model to a real case.

Group 5.
One classical approach that is often used to solve the ill-posed inverse problems is regularization method.Tikhonov regularization which is pioneered by Tikhonov [29] is one of the most commonly used ones.The Tikhonov regularization employs a specific class of so-called "stabilizing functionals" to restrict admissible solutions to spaces of smooth functions [30].In this section, the Tikhonov regularization is applied to this adjoint assimilation model to investigate the influence that the regularization parameter exerts on the inversion results.
The smoothing functional is constructed as follows: where  is the cost function defined by formula (2),  sta = (/2)‖ − p‖ 2 is the Tikhonov stabilizer,  ( > 0) is a regularization parameter, and p and  are prior and optimized model control variables (the VEVCs in this paper), respectively.Therefore, the gradient of the smoothing functional is where  is the gradient in formula (5).
In this section, the regularization parameters are set to 10 2 , 10, 10 0 (i.e., 1.0), and 0, respectively.It should be noted that there is no Tikhonov stabilizer when  = 0, and this strategy is just strategy 1 in Section 3.2.In the process of parameter optimization, we tried 1000 iterations.The prescribed and inverted VEVCs and variation curves of cost functions are shown in Figure 11.The RMSEs and correlation coefficient between the inverted and prescribed VEVCs are shown in Figure 12.
As we can see in Figure 11(a), the inverted VEVCs can fit the prescribed ones very well.The cost functions obtained with different regularization parameters can decline remarkably by more than 3 orders of magnitude after 200 iterations.As we can see in Figure 12, the results with too large or too small values of regularization parameter are relatively poor; in other words, the regularization method with appropriate regularization parameter can indeed improve the experiment result to some extent compared with the results before using the regularization method.Although the improvement is not very significant in this group, we believe that the regularization method could be a useful approach to find a meaningful solution in the view of some ill-posed inverse problems.

Conclusions
According to the results of ideal experiments, we can draw the following conclusions: using the adjoint method to assimilate ocean observation values into an oceanic Ekman layer model, the distribution of the VEVCs and the WSDC could be inverted at the same time.
In Group 1, the best estimation results are obtained with the GD algorithm, and even the CG and L-BFGS get a better convergence rate.The GD algorithm, which is controllable and easy to implement, should be regarded seriously as a choice.In Group 2, four strategies of step-length are tested.For all strategies, the inversion results of the VEVCs are satisfactory, and it is proved that when the step-length is equal to 1, it improves the assimilation efficiency.In Group 3, the inverse integral time of the adjoint model is tested.The results indicate that the inversion effects are associated with the inverse integral time and 8 days of inverse integral time leads to the best results in experiments of Group 4; the good assimilation efficiency is achieved by increasing iteration steps if the inverse integral time is short.In Group 4, the inversion of six distributions of the VEVCs is examined.The results indicate that some complicated distribution can be inverted   by increasing the assimilation steps.We also conclude that, at the bottom of water, if the VEVC decreases along with the depth, the inversion results are satisfactory, while if the VEVC increases along with the depth, the results are relatively poor.To solve this problem, we try to use extrapolation method to calculate the VEVCs in layers where the inversion results are unsatisfactory.The experimental results indicate that the attempt is successful.In a practical application, we can learn from the attempt to deal with VEVCs in a few layers at the bottom of water.The regularization method with appropriate regularization parameter can indeed improve the experiment result to some extent in this inverse problem.The WSDCs in all experiments are able to be inverted successfully within 100 iterations.
The factors tested in this paper are closely combined with inversion efficiency.The work inspires us with an effective way to determine the OEVP and the WSDC in the numerical simulation of Ekman wind-driven currents.

Figure 1 :
Figure 1: (a) The prescribed and inverted VEVCs and (b) variation curves of cost functions in Group 1.

Figure 5 :
Figure 5: The modeled (black dotted) and "observed" (black solid) current velocities and their differences (blue solid) at (a) 25 m and (b) 75 m after 1000 iterations in Group 2 with strategy 1.

Figure 6 :
Figure 6: The prescribed and inverted values of WSDC in Group 2 with 4 strategies.

Figure 9 :
Figure 9: The prescribed and inverted VEVCs in Group 4. (a)-(f) denote the prescribed (black line) and inverted (red line) VEVCs for six distributions, respectively.

Figure 12 :
Figure 12: (a) RMSEs and (b) correlation coefficients between the inverted and prescribed VEVCs in Group 5.

Table 1 :
The RMSEs between VEVCs before and after assimilation in Group 1.

Table 2 :
The RMSEs between VEVCs before and after assimilation in Group 2.

Table 4 :
The RMSEs between OEVCs before and after extrapolation.

Table 4 .
The prescribed and inverted VEVCs are shown in Figure10.The RMSEs in VEVCs after extrapolation are reduced by 1 order of magnitude compared with the values before extrapolation.The inverted distributions have good agreements with the prescribed ones.