Evaluation of the Impact of Argo Data on Ocean Reanalysis in the Pacific Region

Observing System Simulation Experiments (OSSEs) have been conducted to evaluate the effect of Argo data assimilation on ocean reanalysis in the Pacific region. The “truth” is obtained from a 5-year model integration from 2003 to 2007 based on the MIT general circulation model driven by actual time-varying atmospheric forcing. The “observations” are projections of the truth onto the observational network, which includes ocean station data, CTD, various BTs, and Argo, with white noise added to simulate observational errors. The data assimilation method employed is a sequential three-dimensional variational (3D-Var) scheme within a multigrid framework. Results show that the interannual variability of the temperature, salinity, and current fields can be reconstructed fairly well. The spread of temperature anomalies in the tropical Pacific region is also reproduced accurately when Argo data is assimilated, which may provide a reliable initial field for forecasting subsurface temperature and currents in the tropical Pacific region. The adjustment of salinity using the T-S relationship is vital in the tropical Pacific region. However, this adjustment has almost no effect in the northwest Pacific if Argo data is included in the reanalysis.


Introduction
An ocean reanalysis system for the global ocean has recently been established by the National Marine Data and Information Service (NMDIS) of China in order to understand monthly, annual, and interannual changes in sea surface height (SSH) and in three-dimensional (3D) temperature, salinity, and currents. MITgcm (MIT general circulation model) serves as the ocean dynamical model in the reanalysis system [1]; it is a state-of-the-art ocean model that is also employed in the Estimating the Circulation and Climate of the Ocean (ECCO) reanalysis project. The ocean data assimilation scheme is a sequential 3D variational (3D-Var) analysis scheme designed to assimilate temperature and salinity using a multigrid framework [2]. This scheme operates in 3D space and retrieves resolvable information from longer to shorter wavelengths for a given observation network, yielding a multiscale analysis. The historical observational data assimilated in the reanalysis system include temperature and salinity profiles from ocean stations, conductivity-temperature-depth (CTD) casts, various bathythermographs (BTs), and Argo floats, as well as sea surface height anomaly (SSHA) from altimetry and sea surface temperature (SST) from satellite remote sensing. A second purpose of the global ocean reanalysis is to provide better real-time (daily or hourly resolution) lateral boundary conditions for the ocean dynamical model used in the China Ocean ReAnalysis (CORA; [3]) developed by the NMDIS, which has produced reanalysis products of SSH, 3D temperature, salinity, and currents from 1986 to 2008 for the China coastal waters and adjacent seas (http://www.cora.net.cn).
The 21st century Argo (Array for Real-time Geostrophic Oceanography) observing network is very important for global ocean climate studies. In particular, the salinity observations provided by the Argo network give significantly more information than the 20th century XBT observing network. Cooper [4] pointed out that assimilating temperature alone degrades the density field, which can result in an analysis of the current field that is worse than with no data assimilation at all. The increasing number of salinity observations from Argo is essential for improving the structure of the density field during data assimilation. To evaluate the impact of Argo on ocean data assimilation, many studies have been carried out by various institutes (e.g., [5-8]). However, it remains unclear what concrete effect Argo data assimilation has on ocean reanalysis in the Pacific region, especially in the subsurface layers of the tropical Pacific and the northwest Pacific. In addition, it is necessary to understand the role of the T-S relationship in bivariate data assimilation now that the amount of salinity data has increased dramatically thanks to Argo.
The Observing System Simulation Experiment (OSSE) is a useful approach for evaluating the impact of an ocean observing system [9]. Within the OSSE framework, simulated rather than real observations serve as the input to a specified data assimilation system [10]. In this study, simulated observational values are drawn from a "truth" model. In addition, time series of the "truth" values of the state variables, such as temperature, salinity, and currents, are available at every model grid point from the "truth" model integration. Here we evaluate the effect of Argo data assimilation on ocean reanalysis in the Pacific region using the reanalysis system described above, focusing on the tropical Pacific and the northwest Pacific. This serves as an essential step toward a deeper understanding of the effect of Argo data assimilation on the World Ocean. This study is organized as follows. Sections 2 and 3 briefly describe the numerical model and the ocean data assimilation scheme, respectively. Section 4 gives the design of the sensitivity experiments. The impact of Argo on the ocean reanalysis in the Pacific region and the conclusions are presented in Sections 5 and 6, respectively.

Numerical Model
MITgcm was developed by Marshall et al. [11]. As described in the MITgcm manual, a single hydrodynamical kernel is used to drive both atmospheric and oceanic models. The model has a nonhydrostatic capability and can be used to study both small-scale and large-scale processes. Finite volume techniques are employed, yielding an intuitive discretization and supporting the treatment of irregular geometries using orthogonal curvilinear grids and shaved cells. In addition to these features, MITgcm is designed to perform efficiently on a wide variety of computational platforms (http://Mitgcm.org).
The model domain in this study covers 74.25°S-84.75°N, 0.25°E-359.75°E. The KPP [12,13] vertical mixing scheme is adopted. The horizontal C-grid has 1/2° × 1/2° resolution, telescoping to 1/4° meridional spacing near the equator, giving 720 × 348 horizontal grid points. The standard z-level vertical grid is used, with 35 vertical levels. ETOPO5 bottom topography [14] is used in the model, and the minimum and maximum water depths are 5 m and 5000 m, respectively. The time step is 600 s. The atmospheric forcing is from the National Centers for Environmental Prediction (NCEP) reanalysis and includes daily wind speed at 10 m, net heat flux, and net freshwater flux. Wind speed is converted to wind stress using the formula of Yelland and Taylor (1994). The surface temperature and salinity are relaxed to monthly climatologies with a relaxation time scale of 100 days.
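The surface relaxation mentioned above can be illustrated with a short sketch. It assumes a simple Newtonian restoring term, (climatology - value) / tau, advanced with a forward Euler step at the model time step. The function names and sample values are hypothetical, and this is not MITgcm code.

```python
"""Sketch of surface restoring toward climatology (not MITgcm code).

Assumes a Newtonian relaxation term (climatology - value) / tau,
integrated with a forward Euler step. All names are illustrative.
"""

SECONDS_PER_DAY = 86400.0
TIME_STEP_S = 600.0                          # model time step from the text
RELAX_TIMESCALE_S = 100.0 * SECONDS_PER_DAY  # 100-day relaxation time scale


def relaxation_tendency(value, climatology, tau_s=RELAX_TIMESCALE_S):
    """Return the restoring tendency (units of `value` per second)."""
    return (climatology - value) / tau_s


def relax_one_step(value, climatology, dt_s=TIME_STEP_S, tau_s=RELAX_TIMESCALE_S):
    """Advance a surface value by one time step of pure relaxation."""
    return value + dt_s * relaxation_tendency(value, climatology, tau_s)


if __name__ == "__main__":
    sst, sst_clim = 28.0, 27.0  # degrees C, made-up values
    print(relax_one_step(sst, sst_clim))
```

With a 100-day time scale and a 600 s step, each step removes only a fraction 600 / 8,640,000 of the departure from climatology, so the restoring acts as a weak constraint on the surface fields.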

Data Assimilation Scheme
The multigrid 3D-Var data assimilation scheme developed by Li et al. [2] is used in the reanalysis system. The scheme retrieves resolvable information in 3D space from longer to shorter wavelengths for a given observation network and yields a multiscale analysis. The multigrid technique is introduced into the 3D-Var data assimilation to obtain longwave information from the observations over data-sparse regions and shortwave information over data-dense regions. The cost function at the nth grid level can be written as

$$J^{(n)} = \frac{1}{2} X^{(n)T} X^{(n)} + \frac{1}{2} \left( H^{(n)} X^{(n)} - Y^{(n)} \right)^{T} O^{(n)-1} \left( H^{(n)} X^{(n)} - Y^{(n)} \right), \quad n = 1, 2, \ldots, N, \tag{1}$$

where X is the correction of the state variable relative to the background, Y is the difference between the available observations and the background field interpolated to the observation locations, O is the observation error covariance matrix, and H is the interpolation operator from the model space to the observation space. The superscripts T and (n) denote the transpose and the nth grid level, respectively, and N denotes the final level. The background error covariance matrix does not appear in (1) because it is represented implicitly by the grid levels. Compared with the traditional 3D-Var scheme, the multigrid 3D-Var scheme gives higher forecast accuracy and lower root-mean-square errors. More details can be found in Li et al. [15].

Figure 1 shows the flowchart of the temperature and salinity data assimilation scheme. Firstly, the T-S relationship is calculated from the simulated temperature and salinity fields by polynomial fitting. Secondly, observed temperature data is assimilated into the numerical model using the multigrid 3D-Var data assimilation scheme. Thirdly, the background field of salinity is adjusted according to the assimilated temperature field using the derived T-S relationship.
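The first and third steps of the flowchart can be illustrated with a minimal sketch. It assumes a single polynomial S = p(T) fitted with NumPy, and it assumes that the adjusted salinity is simply this polynomial evaluated at the analysed temperature. The polynomial degree, the function names, and the sample profile are hypothetical, and the scheme in Figure 1 may differ in detail (for example, by fitting separately for each region or depth range).

```python
import numpy as np


def fit_ts_relationship(temperature, salinity, degree=3):
    """Least-squares polynomial S = p(T); coefficients, highest power first."""
    return np.polyfit(temperature, salinity, degree)


def adjust_salinity(ts_coeffs, analysed_temperature):
    """Evaluate the fitted T-S polynomial at the analysed temperature."""
    return np.polyval(ts_coeffs, analysed_temperature)


if __name__ == "__main__":
    # Made-up model profile: temperature (deg C) and salinity (psu).
    t_model = np.array([28.0, 25.0, 20.0, 15.0, 10.0, 6.0, 4.0])
    s_model = np.array([34.2, 34.6, 35.0, 34.8, 34.5, 34.4, 34.5])
    coeffs = fit_ts_relationship(t_model, s_model, degree=3)

    # Hypothetical temperature after assimilation (background plus increment).
    t_analysis = t_model + np.array([0.5, 0.8, -0.3, 0.0, 0.1, 0.0, 0.0])
    print(adjust_salinity(coeffs, t_analysis))
```

Because the polynomial is fitted once from the model fields, the sketch inherits the assumption stated in the text that the T-S relationship itself does not change when temperature is assimilated.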
Here we have assumed that the T-S relationship remains unchanged after the temperature is assimilated. Finally, the available salinity observations are assimilated into the model. Following Troccoli et al. [16], a latitudinal filter has been applied to the salinity and temperature increments so that the whole salinity increment is applied only within 30° of the equator. Outside this region, the weight given to the salinity analysis diminishes linearly to zero at 60°N and 60°S. This is done to avoid implementing the salinity correction scheme in areas where stratification is weak.

The "observed" data used in the reanalysis sensitivity experiments are constructed by projecting the truth onto a real observational network (limited to the top 1000 m in this study). Data types in the real observational network include XBT, CTD, DRB, OSD, UOR, MRB, and Argo from 2003 to 2007, and the positions of the observational profiles come from the World Ocean Database (WOD2009) and the China Argo Real-time Data Center (http://agro.org.cn), respectively. The projection from the model space onto the observational space uses bilinear interpolation in the horizontal direction and Akima interpolation in the vertical direction. Gaussian white noise with a mean of 0.0°C (0.0 psu) and a standard deviation of 0.2°C (0.05 psu) is added to each temperature (salinity) "observation" to simulate random error. For simplicity, "observations" carrying the temporal and spatial information of XBT, CTD, DRB, OSD, UOR, and MRB are called "conventional observations," and those carrying Argo temporal and spatial information are called "Argo observations." It should be noted that CTD and Argo profiles have both temperature and salinity observations, while the other profiles may have only temperature observations. Distributions of temperature and salinity of conventional observations and Argo observations in 2006 are shown in Figures 2 and 3, respectively. The Argo observations are much denser than the conventional observations in the model domain, especially in the south Pacific, where conventional observations are scarce. The coverage of conventional salinity is very limited compared with conventional temperature, especially south of 50°S, where conventional salinity is almost absent. In contrast, the numbers of temperature and salinity data from Argo are essentially the same in the Pacific region.
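The latitudinal filter described earlier in this section amounts to a piecewise-linear weight. A minimal sketch follows, assuming the weight simply multiplies the salinity increment; the function names are hypothetical.

```python
def latitude_weight(lat_deg, full_lat=30.0, zero_lat=60.0):
    """Weight of the salinity increment as a function of latitude.

    Equal to 1 within `full_lat` degrees of the equator, decreasing
    linearly to 0 at `zero_lat` degrees north or south, and 0 poleward.
    """
    abs_lat = abs(lat_deg)
    if abs_lat <= full_lat:
        return 1.0
    if abs_lat >= zero_lat:
        return 0.0
    return (zero_lat - abs_lat) / (zero_lat - full_lat)


def filtered_increment(increment, lat_deg):
    """Apply the latitudinal weight to a salinity increment (psu)."""
    return latitude_weight(lat_deg) * increment


if __name__ == "__main__":
    for lat in (0.0, 30.0, 45.0, 60.0, -75.0):
        print(lat, latitude_weight(lat))
```

Under this weighting, an increment at 45°N or 45°S is applied at half strength, and no salinity correction is applied poleward of 60°.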

Experiment Setup.
Five experiments are presented in Table 1. All of these experiments employ the same model setup described in Section 2 and the data assimilation scheme described in Section 3. EXP 1 is the control run with no "observations" assimilated. Its initial condition is the climatological January temperature/salinity field derived from SODA (Simple Ocean Data Assimilation) [17,18], and its forcing is the climatological monthly wind and net heat flux derived from NCEP. EXP 1 is spun up for

Impact of Argo Data on the Ocean Reanalysis in the Pacific Region
... and 1000 m. The RMS errors of subsurface salinity in EXP 3 are clearly lower than those in EXP 5. In particular, the RMS errors of salinity in EXP 5 are larger than those in EXP 1 throughout, which indicates that the salinity analysis in the subsurface ocean is degraded if only the conventional data is assimilated into the numerical model and the T-S relationship is ignored.
Figure 5 presents the RMS errors of temperature and salinity in the five experiments for the top 1000 m in the tropical Pacific region (5°S-5°N). The RMS errors of salinity in EXP 2 are lower than those in EXP 4. The RMS errors of salinity in EXP 4 fluctuate sharply, with a maximum of 0.16 psu on the 700th day. The large improvement in the salinity analysis yields a better analysis of the density field, which in turn makes the RMS errors of temperature in EXP 2 slightly lower than those in EXP 4. The RMS errors of salinity in EXP 2 are also much lower than those in the other three experiments (EXP 1, EXP 3, and EXP 5). This shows that the T-S relationship is necessary in the tropical Pacific region even if Argo data is assimilated. However, Argo data is also indispensable in the tropical Pacific region. Without Argo data, as in EXP 3 and EXP 5, the RMS errors of salinity become worse than those in EXP 1.
The results indicate that the number of conventional salinity observations is too small to improve the salinity analysis in the tropical Pacific region if the oceanic initial fields and atmospheric forcing are inaccurate.
Figure 6 is the same as Figure 5, except for the northwest Pacific region (120°E-150°E, 10°S-52°N). The T-S relationship is important when only the conventional data is assimilated in this region (compare the results of EXP 3 and EXP 5). However, the effect of the T-S relationship is not obvious if Argo data is assimilated (compare the results of EXP 2 and EXP 4), which indicates that the T-S relationship matters less for the salinity analysis in the northwest Pacific than in the tropical region.
The accuracy of the temperature and salinity analyses can be improved greatly if Argo data is assimilated into the ocean model over the whole Pacific domain. Assimilation errors are reduced by 28% for temperature and 37% for salinity. However, in the northwest Pacific region, where the temporal and spatial coverage of Argo data is not dominant compared with the conventional data, the improvement in the temperature and salinity analyses is smaller than in the tropical Pacific region. Assimilation errors are reduced by 11% for temperature and 16% for salinity in the northwest Pacific region.
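The error statistics quoted in this section can be computed as in the following sketch. It assumes that the RMS error is taken over all grid points of a region against the "truth" and that the percentage reduction is measured relative to a reference experiment. The text does not state which experiment serves as the reference, and the function names and sample numbers are hypothetical.

```python
import math


def rms_error(analysis, truth):
    """Root-mean-square difference between an analysis and the truth."""
    if len(analysis) != len(truth) or not analysis:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((a - t) ** 2 for a, t in zip(analysis, truth))
    return math.sqrt(squared / len(analysis))


def percent_reduction(rms_reference, rms_experiment):
    """Error reduction of an experiment relative to a reference, in percent."""
    return 100.0 * (rms_reference - rms_experiment) / rms_reference


if __name__ == "__main__":
    truth = [20.0, 15.0, 10.0, 5.0]          # made-up temperatures (deg C)
    without_argo = [21.0, 16.0, 11.0, 6.0]   # made-up analysis values
    with_argo = [20.5, 15.5, 10.5, 5.5]      # made-up analysis values
    ref = rms_error(without_argo, truth)
    exp = rms_error(with_argo, truth)
    print(percent_reduction(ref, exp))
```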
Comparing Figure 4 with Figures 5 and 6, it can be noted that the RMS errors over the whole Pacific region shrink only gradually when Argo observations are assimilated (EXP 2 and EXP 4). Figure 3 shows that Argo observations are sparse in the subpolar and polar regions, especially in the ACC region of the Southern Ocean, where almost no Argo observations are found. Therefore, the RMS errors in the data-sparse regions decrease gradually through the model dynamical constraint rather than through the direct observational constraint. In contrast, the numbers of observations in both the tropical Pacific and the northwest Pacific are sufficient to constrain the dynamical model, so the RMS errors there can be reduced rapidly.
The "true" velocity field can be used as an independent variable for verifying the effect of Argo data. Figures 7(a) and 7(b) present the RMS errors of the U (eastward) and V (northward) components in the five experiments for the top 1000 m, respectively. The RMS errors are not reduced appreciably after Argo data is assimilated (compare EXP 2 with EXP 3), for either the U or the V component. The results of EXP 5 are the worst of the five experiments, which suggests that the density field deteriorates owing to insufficient observations and the neglect of the T-S relationship. Figures 7(c) and 7(d) show the RMS errors of the U and V components in the five experiments between 100 m and 1000 m, respectively. The analysis of the U and V components in both EXP 2 and EXP 4 is improved below 100 m when Argo temperature and salinity are assimilated, where the effect of the atmosphere on the ocean state is smaller than near the ocean surface. In contrast, the analyses in EXP 3 and EXP 5 are much worse than that in EXP 1. The conventional data is very unevenly distributed in the subsurface, and Argo data is able to remedy this disadvantage by adjusting the density field, which in turn improves the accuracy of the current analysis.
Figure 8 shows time series of temperature anomalies of the "truth," EXP 1, EXP 2, and EXP 3 in the top 500 m in the Nino 3.4 region. The results of EXP 1 (Figure 8(b)) display a strong annual cycle, which is induced by the periodic climatological wind and net heat flux forcing. The phase shifts and intensity of the temperature anomalies in EXP 2 (Figure 8(c)) coincide with those of the "truth." The variability below 300 m in EXP 3 (Figure 8(d)) is inconsistent with that of the "truth," which indicates that the number of conventional temperature observations is insufficient to improve the accuracy of the subsurface temperature analysis in the tropics. The insufficient observations are unable to correct the errors induced by the initial fields or atmospheric forcing. In contrast, the assimilation of Argo data can improve the accuracy of the temperature analysis in the subsurface.

Figures 9 to 13 show the distributions of 5-year averaged RMS errors of temperature and salinity in the five experiments in the Pacific region, respectively. Large RMS errors of temperature in the control run (Figure 9) lie in the northwest Pacific region and the south Pacific region (south of 60°S), and large RMS errors of salinity lie in the northwest Pacific region, the east subtropical Pacific region, and the south Pacific region (south of 60°S). After all data is assimilated (Figures 10 and 12), the analyses of temperature and salinity are both improved greatly in the whole north Pacific region. There is also a visible improvement in the tropical Pacific region and the south Pacific region compared with the control run. However, the improvement in temperature is not distinct in the south Pacific region when Argo data is ignored, as in EXP 3 and EXP 5 (Figures 11 and 13). The improvements in salinity in EXP 3 and EXP 5 are both small compared with EXP 1, except in the northwest Pacific region. Furthermore, the salinity analysis in EXP 5 becomes very poor in the tropical Pacific region when the T-S relationship is not considered, which can ruin the structure of the density field and result in improper dynamic fields in this region.

Figures 14 and 15 show time series of temperature anomalies of the "truth," EXP 1, EXP 2, and EXP 3 at 50 m and 500 m in the tropical Pacific, respectively. The results of both EXP 2 (Figure 14(c)) and EXP 3 (Figure 14(d)) reproduce the spread of the temperature anomaly accurately at 50 m (compare them with Figure 14(a)).
However, the results of EXP 2 (Figure 15(c)) are superior to those of EXP 3 (Figure 15(d)) at 500 m.This also confirms that Argo data is quite effective for improving the analysis of temperature in the subsurface in the tropical Pacific region.
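The text does not say how the temperature anomalies in Figures 8, 14, and 15 are defined. One common choice, shown here purely as an assumption, is to subtract the mean seasonal cycle of each series from its monthly values. The function name and sample values are hypothetical.

```python
def monthly_anomalies(monthly_values):
    """Remove the mean seasonal cycle from a monthly series.

    The series is assumed to start in January and to span whole years.
    """
    if not monthly_values or len(monthly_values) % 12 != 0:
        raise ValueError("series must cover a whole number of years")
    n_years = len(monthly_values) // 12
    # Mean of each calendar month over all years.
    climatology = [sum(monthly_values[m::12]) / n_years for m in range(12)]
    return [v - climatology[i % 12] for i, v in enumerate(monthly_values)]


if __name__ == "__main__":
    seasonal = [float(m) for m in range(12)]       # made-up seasonal cycle
    series = seasonal + [v + 1.0 for v in seasonal]  # second year 1 degree warmer
    print(monthly_anomalies(series))
```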

Conclusions
The results of five experiments within the OSSE framework confirm the key role of Argo data in improving the reanalysis fields of subsurface temperature and salinity in the Pacific. Furthermore, the reanalysis fields of currents can be improved by assimilating Argo data. The spread of temperature anomalies in the tropical Pacific region is reproduced accurately when Argo data is assimilated, which may provide a reliable initial field for forecasting subsurface temperature and currents in the tropical Pacific region. In the northwest Pacific region, the T-S relationship is useful for limiting the deterioration of the salinity field when only the conventional data is assimilated. In contrast, when Argo data is included in that region, the adjustment of salinity has almost no effect. In the tropical Pacific region, however, the T-S relationship remains essential for adjusting the background field of salinity even if Argo data is included in addition to the conventional data. At the same time, Argo data plays a key role in further correcting the density field in the tropical subsurface. In addition, the results also indicate that the analyses of hydrographic and dynamic fields, before the Argo project was fully

Figure 1: Flowchart of the temperature and salinity data assimilation.

4.1. Construction of the "Truth" and "Observed" Data. Velocity, temperature, and salinity in January 2002, derived from a fully coupled data assimilation system of the Geophysical Fluid Dynamics Laboratory (GFDL) developed by Zhang et al. [8], serve as the initial fields for model integration. The model is spun up for 10 years using looped daily wind stress and net heat flux for 2002 derived from NCEP. Wind stress and net heat flux from 2003 to 2007 are then used to drive the model for 5 years. The resulting simulation is used as the "truth" against which the reanalysis results of the sensitivity experiments are compared to evaluate the impact of Argo data assimilation.
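The "observed" data are sampled from this truth as described in Section 3. The horizontal part of that sampling can be sketched as below, assuming a regular unit-spaced grid and Python's standard random number generator. The vertical Akima interpolation is omitted, and the function names, grid, and seed are hypothetical.

```python
import random

TEMP_NOISE_STD = 0.2   # deg C, standard deviation given in the text
SALT_NOISE_STD = 0.05  # psu, standard deviation given in the text


def bilinear(field, x, y):
    """Bilinear interpolation on a unit-spaced grid.

    `field[j][i]` holds the value at grid point (x=i, y=j).
    """
    n_rows, n_cols = len(field), len(field[0])
    if not (0.0 <= x <= n_cols - 1 and 0.0 <= y <= n_rows - 1):
        raise ValueError("point lies outside the grid")
    i0 = min(int(x), n_cols - 2)  # lower-left corner of the enclosing cell
    j0 = min(int(y), n_rows - 2)
    fx, fy = x - i0, y - j0
    bottom = (1.0 - fx) * field[j0][i0] + fx * field[j0][i0 + 1]
    top = (1.0 - fx) * field[j0 + 1][i0] + fx * field[j0 + 1][i0 + 1]
    return (1.0 - fy) * bottom + fy * top


def simulate_observation(truth_field, x, y, noise_std, rng):
    """Sample the truth at (x, y) and add zero-mean Gaussian noise."""
    return bilinear(truth_field, x, y) + rng.gauss(0.0, noise_std)


if __name__ == "__main__":
    truth_temperature = [[20.0, 21.0], [22.0, 23.0]]  # made-up 2 x 2 field
    rng = random.Random(0)                             # fixed seed for repeatability
    print(simulate_observation(truth_temperature, 0.5, 0.5, TEMP_NOISE_STD, rng))
```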
20 years to provide the initial fields for these five experiments. Starting from these initial fields, the model runs for another five years in each experiment using climatological monthly wind and net heat flux derived from NCEP. During this five-year period, EXP 2 assimilates "conventional observations" and "Argo observations," while EXP 3 assimilates only "conventional observations." In EXP 2 and EXP 3, the T-S relationship is used to adjust the background fields of salinity after temperature is assimilated into the numerical model. EXP 4 and EXP 5 are the same as EXP 2 and EXP 3, respectively, except that the T-S relationship is ignored.
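The experiment design described above can be summarised compactly as a small data structure. This sketch only restates the text; the class, field, and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Experiment:
    name: str
    assimilate_conventional: bool
    assimilate_argo: bool
    use_ts_relationship: bool


EXPERIMENTS = (
    # Control run: nothing is assimilated, so the T-S adjustment does not apply.
    Experiment("EXP 1", False, False, False),
    Experiment("EXP 2", True, True, True),
    Experiment("EXP 3", True, False, True),
    Experiment("EXP 4", True, True, False),
    Experiment("EXP 5", True, False, False),
)


def experiments_with_argo():
    """Names of the experiments that assimilate "Argo observations"."""
    return [e.name for e in EXPERIMENTS if e.assimilate_argo]


if __name__ == "__main__":
    for exp in EXPERIMENTS:
        print(exp)
```

Comparing EXP 2 with EXP 3 (or EXP 4 with EXP 5) isolates the effect of Argo data, while comparing EXP 2 with EXP 4 (or EXP 3 with EXP 5) isolates the effect of the T-S relationship.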

Figure 4: Distributions of temperature and salinity RMS errors in the top 1000 m (a and b), in the top 100 m (c and d), and between 100 m and 1000 m (e and f), respectively. Red, black, blue, pink, and green curves are the results of EXP 1 to EXP 5, respectively.

Figure 5: Distributions of temperature (a) and salinity (b) RMS errors in the top 1000 m of the tropical Pacific region.

Figure 7: Distributions of RMS errors of the U and V components in the top 1000 m (a and b) and between 100 m and 1000 m (c and d), respectively.

Figure 8: Time series of temperature anomalies of the "true" field (a), EXP 1 (b), EXP 2 (c), and EXP 3 (d) in the top 500 m at Nino 3.4 region.