Sensitivity Analysis of Unsteady Flow Fields and Impact of Measurement Strategy

Difficulty of data assimilation arises from a large difference between the sizes of a state vector to be determined, that is, the number of spatiotemporal mesh points of a discretized numerical model and a measurement vector, that is, the amount of measurement data. Flow variables on a large number of mesh points are hardly defined by spatiotemporally limited measurements, which poses an underdetermined problem. In this study we conduct the sensitivity analysis of twoand three-dimensional vortical flow fields within a framework of data assimilation. The impact of measurement strategy, which is evaluated by the sensitivity of the 4D-Var cost function with respect to measurements, is investigated to effectively determine a flow field by limited measurements. The assimilation experiment shows that the error defined by the difference between the reference and assimilated flow fields is reduced by using the sensitivity information to locate the limited number of measurement points. To conduct data assimilation for a long time period, the 4D-Var data assimilation and the sensitivity analysis are repeated with a short assimilation window.


Introduction
The use of measurement data to improve a numerical prediction is known as a data assimilation method in meteorological and oceanographic communities [1].Recent data assimilation methods are based on the optimal control theory.As a consequence there are two major approaches: sequential and variational methods.The application of these methods to large scale problems in meteorology made the development of data assimilation methods somewhat independent of optimal control studies; that is, the effort is put into the reduction of numerical costs of those methods and the handing of the large degree of freedom of forecast models.One example is the invention of ensemble Kalman filter, which approximately represents system error covariance by an ensemble of a limited number of model runs.By this the cost of matrix operations in the Kalman filter algorithm can be drastically reduced.On the other hand, the numerical cost of variational methods such as the four-dimensional variational (4D-Var) method is usually smaller than the ensemble Kalman filter; therefore, the implementation of the 4D-Var data assimilation into the operational weather forecast was earlier than that of the ensemble Kalman filter.However, the cost for maintaining adjoint codes used for the 4D-Var data assimilation and the need for massive parallel computation, which is driven by the architecture of recent supercomputers, would accelerate the use of ensemble Kalman filters in meteorological community.Because of its rational approach to estimate a true state based on both observation and model prediction, the application of data assimilation methods is not limited to the area of meteorological and oceanographic studies.For example, a pioneering "hybrid wind tunnel" concept was proposed by Hayase et al. [2], in which measurements in a wind tunnel are used to improve the numerical simulation with coarse mesh resolution.The experiment was focusing on a von Karman vortex street in low Reynolds number flows.
The present authors have been studying the applicability of data assimilation methods in aeronautical researches.Numerical simulations of atmospheric turbulences such as clear air turbulence and aircraft wake turbulence were performed by the 4D-Var method coupled with aeronautical computational fluid dynamics (CFD) codes [3,4].Recent attempt is the mitigation of the uncertainty of a Reynoldsaveraged Navier-Stokes (RANS) turbulence model by the use of the ensemble Kalman filter, where parameters of the Spalart-Allmaras turbulence model are optimized based on pressure measurements around the RAE2822 airfoil [5].On the other hand, a classical nudging technique is used to initialize realistic aircraft wake in a computational domain to simulate aircraft wake vortices, although a high-fidelity RANS flow field is nudged instead of measurement data [6].To alleviate the large numerical cost of data assimilation methods compared to a single model run, a reduced-order model (ROM) prediction of terrain-induced turbulence has also been conducted in combination with the particle filter [7].Based on experiences gained from those applications our interest is to have a more general guidance to apply data assimilation methods to fluid dynamic problems with limited measurements.
Predictability in systems with the large degree of freedom is a topic of concern in numerical weather prediction (NWP) in conjunction with data assimilation [1].Here, the predictability is defined by the growth rate of the distance of initially close trajectories in a phase space.From the work of Lorenz, it is widely known that the predictability of a certain type of dynamical system, for example, the Lorenz model, is limited due to deterministic chaos [8,9].This suggests a framework to consider the predictability of an actual dynamical system, such as a system of Navier-Stokes equations.Here the predictability of a system could be rephrased as the ease of data assimilation.To investigate the predictability of a system, the maximum Lyapunov exponent has been used to determine the global predictability.For an actual dynamical system, however, the local Lyapunov exponent, such as the finite size Lyapunov exponent (FSLE), is important to represent the predictability of the system [10].Carrassi et al. coupled a breeding technique with a data assimilation procedure based on the concept of assimilation in unstable subspace [11].They showed the stability of the data assimilation system based on the Lyapunov exponent and employed adaptive (targeted) observation to stabilize the system.
The idea of targeted observation has been studied in meteorological community to improve the NWP.The targeted observation has a large potential to use recently available measurements from aircraft (the Aircraft Meteorological DAta Relay, AMDAR) and unmanned aerial vehicles (UAVs) which can be considered as "flying sensors." One of the major approach of the sensitivity analysis aiming for targeted observations is an adjoint-based method because the major operational weather centers employ the adjoint-based 4D-Var data assimilation system [12].Daescu provided a framework of the forecast sensitivity with respect to observations in the context of the 4D-Var data assimilation [13].Hossen et al. studied the effect of random perturbations on the forecast error [14].The adjoint equation approach has a variety of applications including estimation problems [15][16][17] as well as data assimilation problems [3,4,18].
The present study is an attempt to investigate the impact of measurement strategy in a data assimilation system by using a sensitivity analysis method.We refer to the preceding work of the sensitivity analysis conducted by Daescu and Navon [12].In this paper we consider an idealized situation in numerical experiments, that is, a system of a counterrotating vortex pair where self-induced advection velocity realizes a transient flow field.The impact of the number of measurement points on the assimilated flow field is firstly investigated, and the possibility of adaptive/targeted measurement is explored in the numerical experiments where locations of the measurement points are optimized to effectively reduce the error against a reference flow field.The idea of adaptive/targeted observation in meteorology is interpreted in fluid dynamic problems where unsteady flows of much smaller scales are of interest.The rest of this paper is organized as follows.Section 2 describes the numerical methods for flow simulation and data assimilation including the sensitivity analysis.The computational setting of twoand three-dimensional flow fields defined by a vortex pair is presented in Section 3. In Section 4, we discuss the results of the sensitivity analysis and the impact of measurement strategy in the two-and three-dimensional flow fields.Section 5 concludes our study.

Governing Equations and Numerical Methods.
For flow simulation we employ incompressible Navier-Stokes equations: where   and   represent the velocity components in three spatial directions (,  = 1, 2, or 3) and the pressure deviation from the reference state  =  0 +   , respectively.  = (  /  +   /  )/2 denotes the strain rate tensor.The summation convention is used for the velocity components   .In this study, a typical value of air density is employed:  0 = 1.2 kg/m 3 .The kinematic viscosity in (1) is defined by the sum of molecular viscosity and eddy viscosity obtained by a subgrid-scale model.The equations are discretized by the fully conservative fourth-order central difference scheme [19].Time integration is performed by the third-order low storage Runge-Kutta scheme [20].The Lagrangian dynamic model is used as a subgrid scale model for large-eddy simulation (LES) [21], which avoids excessive eddy viscosity in vortex core region [22].The adjoint code is derived first by linearizing the above equations and then by rewriting it from backward by replacing inputs and outputs of each code line.The latter operation corresponds to a transpose of the matrix composed of coefficients of the linearized Navier-Stokes equations.Both Navier-Stokes and the adjoint code are parallelized by a domain decomposition approach using the message passing interface (MPI) library.

Sensitivity Analysis of an Unsteady Flow Field.
The objective of the 4D-Var data assimilation is to obtain an initial flow state which reproduces corresponding measurements during a certain time period (assimilation window) [1]. Figure 1 shows a schematic of the 4D-Var data assimilation.The vertical and horizontal axes show a flow state in a phase space and time, respectively.A solid black line shows a trajectory of a true flow state.Broken lines show trajectories of the simulated flow state starting from different initial states.The 4D-Var data assimilation searches the initial flow state of the true trajectory by evaluating the difference between those trajectories and measurements within an assimilation window.
The difference between measurements (usually measurements have spatiotemporally less information compared to a prediction model) and corresponding numerical results evaluated by conducting the numerical simulation over a period of time is represented as a cost function with respect to an initial flow state Q 0 : Here   is a measurement operator which converts the dimension of computational flow variables into that of measurement data to evaluate these differences, while a vector y  represents measurements.Subscript  shows a time step of the flow computation and  is the total number of time steps.Equation ( 3) is a function of Q 0 ; that is, the data assimilation process is formulated as a minimization problem of (Q 0 ) with a control variable Q 0 .In a standard formulation of a cost function for the 4D-Var data assimilation, a background error term is taken into account.Here we do not consider the background error term because we attempt to retrieve a flow field based on available measurements; therefore, a background state just plays a role of initial conditions for the 4D-Var iteration.Note that we follow the notation of quantities used in meteorological community [23], except for a state vector Q  , because a variable   is used as a spatial coordinate in (1).The 4D-Var method handles measurement errors through a measurement error covariance matrix   , where its elements are the covariance between measurements.
In this study   is set to the identity matrix.Basically the covariance matrix defines the relative importance of measurements.We do not consider the relative importance of each measurement in this study; instead we attempt to locate measurement points effectively during the 4D-Var iteration.
To obtain the gradient of (Q 0 ) which is used for the minimization of (Q 0 ), a Lagrange function is introduced using a Lagrange multiplier vector   .The procedure to obtain the gradient is finally written as follows [3]: Equation ( 4) shows that the gradient of (Q 0 ) is obtained by the inverse time integration of   using the adjoint operator M   with a force term: After obtaining the gradient, the minimization of (Q 0 ) is conducted by the quasi-Newton method modifying the initial flow variable Q 0 .The Hessian matrix is approximated using the limitedmemory Broyden-Fletcher-Goldfarb-Shanno (BFGS) method [24,25].In this method, memory requirements are reduced because the approximated Hessian matrix is not stored explicitly.
In this study we investigate the impact of measurements on a retrieved flow field within a framework of the 4D-Var data assimilation.We refer to the preceding work of Daescu [13] and Hossen et al. [14] regarding the derivation of the sensitivity of the 4D-Var cost function to the observation (observation sensitivity).We start with rewriting (4) as where a linearized operator M 0, advances a flow state from the time  0 to   .By differentiating (5) with respect to the observation vector y  , we obtain On the other hand, the gradient of the 4D-Var cost function (3) could become ∇ Q 0 (Q 0 , y  ) = C at a certain step during the 4D-Var iteration, where C denotes a nonzero constant vector.
Referring to the derivation in Daescu [13] we consider the differentiation of the implicit function, which is a function of an initial flow state Q 0 and an observation vector y  , ∇ Q 0 (Q 0 , y  ) − C = 0 as follows: Note that Daescu [13] considers the implicit function which is a function of an analysis state Q a 0 and an observation vector y  ; that is, ∇ Q 0 (Q a 0 , y  ) = 0, where the first-order optimality condition is satisfied.On the other hand, we apply the implicit function differentiation to the gradient of the 4D-Var cost function during the minimization process, where the 4D-Var cost function does not satisfy the optimality condition.Numerical experiments are conducted in this paper to investigate the applicability of this treatment in a data assimilation system based on Navier-Stokes equations.
The rigorous analysis of the applicability is beyond the scope of the present study.Using ( 6) and ( 7), as well as a chain rule we obtain the sensitivity of the 4D-Var cost function with respect to the observation as where Hessian matrix, and we use the approximated Hessian matrix obtained from the limitedmemory BFGS.As described in Daescu and Navon [12], the use of the approximated Hessian matrix may introduce errors in the observation sensitivity evaluation that are difficult to quantify.A mapping of the observation sensitivity onto the mesh points of the LES is conducted as follows: In the following the observation sensitivity is rephrased as the measurement sensitivity, which is more appropriate to describe quantities acquired in laboratory experiments.The development of linear and adjoint codes can be done by step-by-step processes in the following way.First the derived linear code is checked by where the left-hand side decreases with the order of .Here the small perturbation Q to the reference state Q is scaled by .It is also possible to check the angle, ), which decreases with  2 .The adjoint code is a transpose of the linearized code and is derived line by line without composing an explicit matrix, where the following relation holds in the order of 10 −14 with the FORTRAN double precision real.The above validation procedures can be conducted in small program modules such as convective and diffusion terms of Navier-Stokes equations as well as a whole code including all terms and a time integration part.Finally, the gradient computation is validated by , where again the left-hand side decreases with .Table 1 shows the validation of the adjoint code based on small state perturbations.On the other  hand, Table 2 shows the strong scaling of the parallel gradient computation where the number of total mesh points is fixed while increasing processor numbers.kinematic viscosity is set to ] = 1.5 × 10 −5 m 2 /s such that the vortex circulation based Reynolds number amounts to Re = Γ/] ≈ 10 7 .Periodic boundary condition is used for all spatial boundaries in the two-and three-dimensional cases considered in this study.

Procedure of Numerical Experiments.
In the twodimensional case, a numerical experiment is conducted first by generating a reference flow field starting from the conditions mentioned above.From the reference flow field, we acquire pseudomeasurements based on the following strategies; that is, velocity components of all mesh points are extracted as measurements (referred to as 2D-F), velocity components on every second mesh point in both and directions are used (2D-H), and those on every fourth mesh point are used as measurements (2D-Q).In comparison with 2D-F, the number of measurements is one fourth in 2D-H and one sixteenth in 2D-Q.Then the 4D-Var cycle (forward time integration for the evaluation of the cost function, backward time integration of the adjoint code, Hessian matrix computation with limited-memory BFGS, and linear search) is started from an arbitrary flow field, where we consider a weaker vortex pair with larger vortex separation compared to the reference flow field (Γ 0 = 200 m 2 /s,   = 6 m and  0 = 60m).Targeted measurement process starts after a few iterations of the 4D-Var cycle.Having an approximated Hessian matrix and the gradient from the adjoint code, the measurement sensitivity can be computed by using (8).Here we only consider the measurement sensitivity at the beginning of time integration.Using the measurement sensitivity mapped onto the numerical mesh by (9), measurement points are redistributed based on the magnitude of the measurement sensitivity, where the total number of the measurement points is kept constant, that is, one fourth of the LES mesh in 2D-HA and one sixteenth of the LES mesh in 2D-QA.The time integration of the Navier-Stokes equations is conducted until one tenth of vortex reference time  0 = (2 2 0 )/Γ 0 = 33.5 s, that is, 3.35 s.The vortex pair descends a distance of one tenth of vortex separation  0 during this period [22].Another procedure of the numerical experiment is considered in two-and three-dimensional vortex evolutions for a long time period of one vortex reference time  0 , where the 4D-Var iteration and the sensitivity analysis are repeated with a short assimilation window.We consider the cases which use velocity components of the all mesh points as measurements (referred to as 2DL-F and 3DL-F for two-and threedimensional cases, resp.), velocity components on every second mesh point in each spatial direction as measurements (2DL-H, 3DL-H), and the case in which every second mesh point in each spatial direction is used as measurements along with the redistribution based on the measurement sensitivity (2DL-HA, 3DL-HA).The vortex system of the reference case is characterized by vortex circulation Γ 0 = 300 m 2 /s, vortex core radius of   = 6 m, and vortex separation  0 = 24 m.To invoke three-dimensional instability of a vortex pair such as the Crow instability, a sinusoidal perturbation with the amplitude of 7.2 m is applied along vortex centerlines as in Figure 3(a).On the other hand, the data assimilation starts from the conditions of Γ 0 = 300 m 2 /s,   = 6m, and  0 = 30m, where the sinusoidal perturbation is not applied as shown in Figure 3(b).The absence of the sinusoidal perturbation critically affects the evolution of a vortex pair in three dimensions; that is, the Crow instability leading to vortex reconnection does not appear without the source of instability such as the sinusoidal perturbation [22].

Results and Discussion
4.1.Two-Dimensional Case. Figure 4 shows the history of the cost function defined by (3) for three measurement strategies, while the global error evaluated by the quantities on all mesh points is also presented.Here the horizontal axis shows the number of the 4D-Var iteration.Figure 4 indicates that the decrease of the cost function in the 4D-Var data assimilation  is not always connected with the convergence of the retrieved flow field to the reference flow field.In 2D-F the value of the cost function and the global error are identical because three velocity components on all mesh points in the domain are used as measurements.The value of the cost function is proportional to the number of measurement points and time steps; therefore, the drastic reduction is realized in 2D-F compared to 2D-Q.On the other hand, the decrease of the global error becomes slow as the number of measurements is reduced.This confirms that there is a limitation for defining a flow field by using a limited number of measurements, that is, a number smaller than a degree of freedom of a numerical model.Note that the behavior of the cost function and the global error might vary depending on the data assimilation methods used and parameters included in the optimization algorithms; however, the tendency shown here may be unchanged in the case of other data assimilation methods.
Figure 5 shows a similar plot as Figure 4, but with adaptive measurement strategy, that is, 2D-HA and 2D-QA.Here the redistribution of measurement points is conducted at every 5th iteration of the 4D-Var assimilation cycle.Until the 5th iteration the values of the cost function and the global error are the same as those of Figure 4.At the 6th iteration the value of the cost function rapidly increases because the measurement points are redistributed to the regions where the global error is relatively large.The cost function again decreases after a few iterations.During the reduction of the cost function, the global error values are also decreased in 2D-HA and 2D-QA, and the values become smaller than those in Figure 4.This implies that the global error can be effectively reduced by locating measurement points in regions where the relative error is large.And it is realized by evaluating the measurement sensitivity during data assimilation.Even with the adaptive measurement strategy, the global error of 2D-HA and 2D-QA is larger than that of 2D-F where measurements are given at all mesh points.Figure 6 shows the distribution of measurement points in 2D-HA, where the total number of measurement points is one fourth of that in 2D-F.Figure 6(a) shows the initial distribution, and Figure 6(b) shows the distribution after the first redistribution of measurement points conducted at 6th iteration of the 4D-Var cycle.These measurement points are colored by the magnitude of the measurement sensitivity at those locations, where reddish color indicates larger sensitivity.It is confirmed that the measurement points are clustered near vortices.Since the measurement points are redistributed using the mesh points of the numerical simulation, the measurement points do not come closer than the distance of mesh spacing.The resolution of measurements finer than that of numerical simulation may not improve the retrieved flow field, although accurate estimates of the observational errors, including the error correlation structures, become available in the context of data assimilation.Figure 6(c) shows the distribution of measurement points at 16th iteration, where the distribution is similar to that of Figure 6(b).In the same way, Figure 7 shows the measurement points of 2D-QA where the total number of measurements is one sixteenth of that in 2D-F.As in the 2D-HA the measurement points are clustered near vortices after the first adaptation.

Two-Dimensional
Case for a Long Time Period.In this section, we again consider a two-dimensional case, but for a longer time evolution considering multiple assimilation windows during that period.Figure 8 shows the history of the cost function for several data assimilation strategies, where the measurements produced from the reference flow field are provided for data assimilation in 2DL-F, while every second Two-dimensional evolution of a vortex pair is well evaluated by the evolution of vortex positions because a vortex pair descends due to self-induced velocity; therefore, the velocity distribution needs to be correctly reproduced to obtain correct vortex positions in time.Figure 9 shows the vertical vortex positions from the reference case, from an initial flow field before data assimilation and from the reproduced flow field by data assimilation.Here the results of 2DL-H and 2DL-HA are shown.Before the data assimilation, the vortex descent is reduced due to the weaker vortex circulation compared to the reference case.For 2DL-H and 2DL-HA, a descending distance is comparable to that of the reference case.The difference between 2DL-H and 2DL-HA is not significant.
Figure 10 shows the distribution of measurement points during the time integration in 2DL-HA.Those points are colored by the magnitude of the measurement sensitivity, where reddish color indicates the larger sensitivity.Initially the measurement points distribute near a vortex pair as shown in Figure 10(a).Later the measurement points spread to wider area and the magnitude of measurement sensitivity becomes small.

Three-Dimensional
Case for a Long Time Period.In this section, we consider the three-dimensional evolution of a vortex pair.As in the previous section, we use an iterative procedure for a long time evolution of one vortex reference time.Figure 11 shows the history of the cost function for several data assimilation strategies, where the measurements   produced from the reference case are provided for data assimilation in 3D-F, while every second mesh point in each spatial direction has measurements in 3D-H; that is, the total number of measurement points is reduced to one eighth of the 3D-F.On the other hand, 3D-HA is the case where measurements points are redistributed based on the measurement sensitivity.The horizontal axis shows the number of iterations in the same way as in Figure 8.As in the previous two-dimensional case, 3D-HA case effectively reduces the global error compared to 3D-H.  Figure 12 shows the evolution of a vortex pair, where the sinusoidal disturbance applied to the reference case substantially affects the evolution of a vortex pair such as vortex reconnection.The evolution of the reference case is shown in Figures 12(a) to 12(c), while that of the reproduced flow field is shown in Figures 12(d) to 12(f).The initial state of the reproduced vortex pair in Figure 12(d) is disturbed near the domain edge, that is, the position of vortex reconnection; however, it disappears as the data assimilation process proceeds.Note that the vortex pair shown in Figure 3(b) keeps descending without any three-dimensional vortex deformations if we do not apply the 4D-Var data assimilation.Therefore, the flow field at  = 20 appears very different with or without assimilating measurements.
Figure 13 shows the evolution of measurement points colored by the measurement sensitivity.For visualization purpose, a measurement point which has measurement sensitivity smaller than a certain threshold is not shown.As in Figure 13(a), the measurement points are focused near a vortex pair, especially near the position of vortex reconnection.In the later time in Figure 13(c), the measurement points cluster near the position of vortex reconnection.It is confirmed that the error of the reproduced flow field against the reference flow field remains near the position of vortex reconnection, and the measurement points are effectively distributed near the location with the higher measurement sensitivity.

Conclusions
In this study we conducted the sensitivity analysis of a vortical flow field within a framework of the 4D-Var data assimilation.The idea of adaptive/targeted observation in meteorology which aims to effectively determine a flow state by limited measurements was employed in fluid dynamic problems where unsteady vortical flows of much smaller scales are of interest.We firstly investigated a two-dimensional flow field defined by a pair of vortices which descends due to self-induced advection velocity.The amount of measurement points affected the convergence of the cost function as well as that of the global error against the reference flow field.The measurement strategy based on the measurement sensitivity effectively redistributed measurement points near vortices.This resulted in the further reduction of the global error.
Then, the same configuration with a longer time period was investigated by repeating 4D-Var data assimilation with a short assimilation window, where the impact of adaptive measurement was also confirmed.In the three-dimensional configuration of a vortex pair, the reproduction of the flow field becomes difficult because there exist three-dimensional mechanisms leading to flow instabilities such as the Crow instability.The redistribution of measurement points based on the measurement sensitivity successfully reduced the global error against the reference flow field.During the data assimilation iterations, the measurement points were focused near the position of the vortex reconnection which helps to reproduce the onset of the vortex reconnection.

Figure 1 :
Figure 1: Schematic of data assimilation based on the 4D-Var data assimilation.

Figure 2 :
Figure 2: Computational domain for the two-dimensional case, where mesh lines and initial vortex positions of the reference flow field are shown.

3. 1 .Figure 3 :
Figure 3: Computational domain for the three-dimensional case, where the vortex position is shown by the isosurface of vorticity magnitude.(a) A reference flow field with a sinusoidal perturbation of the vortex positions along vortex centerline and (b) an initial flow field before data assimilation.

Figure 4 :
Figure 4: Cost function and global error with different number of measurement points.

Figure 5 :
Figure 5: Cost function and global error with adaptive measurement based on the measurement sensitivity.

Figure 6 :
Figure 6: Distribution of measurement points with adaptive measurement at (a) 1st, (b) 6th, and (c) 16th 4D-Var iteration in 2D-HA, where the measurement points are colored by the magnitude of the measurement sensitivity at those locations.

Figure 7 :
Figure 7: Distribution of measurement points with adaptive measurement at (a) 1st, (b) 6th, and (c) 16th 4D-Var iteration in 2D-QA, where the measurement points are colored by the magnitude of the measurement sensitivity at those locations.

Figure 8 :Figure 9 :Figure 10 :
Figure 8: Cost function and global error with and without adaptive measurement in the two-dimensional cases for a long time period.

Figure 11 :Figure 12 :
Figure 11: Cost function and global error with and without adaptive measurement in the three-dimensional cases.

Figure 13 :
Figure 13: Distribution of measurement points with adaptive measurement at (a)  = 0 s, (b)  = 10 s, and (c)  = 20 s in the case of 3DL-HA, where the measurement points are colored by the magnitude of the measurement sensitivity at those locations.

Table 1 :
The validation of the adjoint code based on small state perturbations.

Table 2 :
Strong scaling of the gradient computation.