Two nonlinear regression methods, Bayesian neural network (BNN) and support vector regression (SVR), and linear regression (LR), were used to forecast the tropical Pacific sea surface temperature (SST) anomalies at lead times ranging from 3 to 15 months, using sea level pressure (SLP) and SST as predictors. Datasets for 1950–2005 and 1980–2005 were studied, with the latter period having the warm water volume (WWV) above the

The El Niño-Southern Oscillation (ENSO) phenomenon, the strongest climatic fluctuation on time scales ranging from a few months to several years, is characterized by interannual variations of the tropical Pacific sea surface temperatures (SST), producing warm (El Niño) and cold (La Niña) episodes [

Since the early 1980s, much effort has been devoted to forecasting the tropical Pacific SST anomalies. ENSO forecast models can be categorized into three types: dynamical models, statistical models, and hybrid (statistical-dynamical) models [

Most of the statistical models used for ENSO forecast have been linear models, for example, multivariate linear regression (LR) and canonical correlation analysis (CCA) [

To better model the nonlinear features of ENSO, neural network (NN) models, originally from the field of computational intelligence, have been used for nonlinear regression [

NN methods, generally regarded as forming the first wave of breakthroughs in machine learning, became popular in the late 1980s, whereas kernel methods (e.g., support vector regression, SVR) arrived in a second wave in the second half of the 1990s [

The warm water volume (WWV) above the

In this paper, we test whether the forecasting of tropical Pacific SST anomalies by Wu et al. [

The warm water volume (WWV) index is defined as the volume integral of water above

Monthly SST data at

Two sets of SST and SLP data were used, one covering the period 1950–2005 and the other 1980–2005, because the WWV data were unavailable prior to

Removal of the climatological seasonal cycle and smoothing by a 3-month running mean were performed on the data to obtain the anomalies. Principal component analysis (PCA) was performed on the SST anomalies, with the five leading principal components (PC) retained. The first five PCs account for

The first five principal components of SST (multiplied by

The first five spatial modes (EOFs) for the SST anomaly field (1950–2005) with positive contours indicated by solid curves, negative contours by dashed curves, and zero contours thickened. Contours are multiplied by a factor of

(Panels: EOF 1, EOF 2, EOF 3, EOF 4, EOF 5.)
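The preprocessing described above (removal of the climatological seasonal cycle, a 3-month running mean, and PCA with five modes retained) can be sketched as follows. This is a minimal illustration on synthetic data, not the authors' code: the function names, the unweighted SVD-based PCA, and the assumption that the record starts in January are all hypothetical choices made for the example.

```python
import numpy as np

def monthly_anomalies(field):
    """Remove the climatological seasonal cycle from a (time, space) monthly field.

    Assumes the record starts in January, so row t is calendar month t % 12.
    """
    field = np.asarray(field, dtype=float)
    anomalies = np.empty_like(field)
    for month in range(12):
        # Subtract the long-term mean of each calendar month at every grid point.
        anomalies[month::12] = field[month::12] - field[month::12].mean(axis=0)
    return anomalies

def running_mean_3(anomalies):
    """Centered 3-month running mean along time; the two end months are dropped."""
    return (anomalies[:-2] + anomalies[1:-1] + anomalies[2:]) / 3.0

def leading_pcs(anomalies, n_modes=5):
    """Return (pcs, eofs, variance_fraction) for the leading modes via SVD."""
    centered = anomalies - anomalies.mean(axis=0)
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    pcs = u[:, :n_modes] * s[:n_modes]   # time series, shape (time, n_modes)
    eofs = vt[:n_modes]                  # spatial patterns, shape (n_modes, space)
    variance_fraction = s[:n_modes] ** 2 / np.sum(s ** 2)
    return pcs, eofs, variance_fraction

# Synthetic stand-in for gridded monthly SST: 20 years, 30 grid points.
rng = np.random.default_rng(0)
months = np.arange(240)
seasonal = 2.0 * np.sin(2 * np.pi * months / 12)[:, None]
sst = 25.0 + seasonal + rng.normal(size=(240, 30))

anom = running_mean_3(monthly_anomalies(sst))
pcs, eofs, var_frac = leading_pcs(anom)
```

Here `var_frac` holds the fraction of total variance carried by each retained mode. A real analysis would usually also apply latitude-dependent area weights before the SVD, which this sketch omits.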

Our choice of the number of PCs to use for predictors and for predictands was based on Wu et al. [

The predictors used are the first

Following Hsieh [

For the NN model in (

Finally, the following procedure was used to optimally select the number of hidden neurons (

With the value of

Next, instead of adaptive basis functions

The high-dimensionality problem has finally been solved with the kernel trick, that is, although

As mentioned earlier, SVR improves on NN methods by avoiding multiple local minima. It also uses a more robust error norm, the

The SVR problem is solved by minimizing the following function:

The SVR model has three hyperparameters
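To illustrate why the SVR error norm is considered robust, the sketch below compares the ε-insensitive error, which is the standard norm in SVR and is assumed here to be the one referred to above, with the squared error used in least-squares fitting. The function names and sample residuals are hypothetical.

```python
def epsilon_insensitive(residual, epsilon):
    """Zero inside the +/- epsilon tube, growing linearly outside it."""
    return max(0.0, abs(residual) - epsilon)

def squared_error(residual):
    """Quadratic penalty, which grows rapidly for large residuals."""
    return residual ** 2

residuals = [0.05, 0.5, 5.0]   # the last value mimics an outlier
eps = 0.1

robust = [epsilon_insensitive(r, eps) for r in residuals]
quadratic = [squared_error(r) for r in residuals]
```

The outlier contributes far less under the ε-insensitive norm than under the squared error, so a single extreme value pulls the fitted function less strongly.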

Individual forecasts of the first five PCs of SST were conducted using LR, BNN, and SVR. Forecasts were made at lead times of

For the 1980–2005 record, from the training and validation procedure for BNN, out of a total of 250 cases (as there were 5 predictand PCs, 5 lead times and 10 validation segments, giving

The correlation skills evaluated over testing data for the SST PCs for the 1980–2005 record are shown in Figure

Forecast correlation skills for the

(Panels: PC1, PC2, PC3, PC4, PC5.)
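The two skill measures used in this paper, the correlation and the root mean squared error (RMSE) between predicted and observed series, can be computed as in the following sketch. The function names and sample values are hypothetical, and the Pearson correlation is assumed to be the correlation score meant here.

```python
import math

def correlation(pred, obs):
    """Pearson correlation between a predicted and an observed series."""
    n = len(obs)
    mean_p = sum(pred) / n
    mean_o = sum(obs) / n
    cov = sum((p - mean_p) * (o - mean_o) for p, o in zip(pred, obs))
    var_p = sum((p - mean_p) ** 2 for p in pred)
    var_o = sum((o - mean_o) ** 2 for o in obs)
    return cov / math.sqrt(var_p * var_o)

def rmse(pred, obs):
    """Root mean squared error between a predicted and an observed series."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))

# Hypothetical observed and predicted values of one SST PC over test data.
observed = [0.5, -0.2, 1.1, -0.9, 0.3]
predicted = [0.4, 0.0, 0.8, -0.6, 0.1]

skill_corr = correlation(predicted, observed)
skill_rmse = rmse(predicted, observed)
```

The correlation measures how well the phase of the anomalies is captured, while the RMSE also penalizes errors in amplitude, so the two scores are complementary.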

The nonlinearity of ENSO variability is manifested through PC

To find out if the SST response to WWV could be governed by nonlinear dynamics, the performance of the models was tested using the WWV index as the sole predictor. The correlation scores for PC

Cross-validated correlation skills for PCs

(Panels: PC1, PC2.)

SST anomaly fields were reconstructed by summing the five predicted PCs multiplied by their corresponding EOF spatial patterns. SST anomalies averaged over the Niño

Correlation scores before and after WWV was added to the predictor set for the Niño

(Four panels, one per Niño region.)

RMSE scores before and after WWV was added to the predictor set for the Niño

(Panels: Niño 4, Niño 3.4, Niño 3, and a fourth Niño region.)
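The reconstruction of SST anomaly fields from the predicted PCs and their EOF spatial patterns, followed by averaging over a Niño region, can be sketched as follows. The grid, the random stand-in data, and the function names are hypothetical. The box used is the conventional Niño 3.4 definition (5°S–5°N, 170°W–120°W), and the average is unweighted for simplicity.

```python
import numpy as np

def reconstruct_field(pcs, eofs):
    """Sum over modes of PC time series times EOF pattern, giving (time, space)."""
    return np.asarray(pcs) @ np.asarray(eofs)

def region_average(field, lat, lon, lat_bounds, lon_bounds):
    """Unweighted mean over the grid points inside a latitude/longitude box."""
    mask = ((lat >= lat_bounds[0]) & (lat <= lat_bounds[1]) &
            (lon >= lon_bounds[0]) & (lon <= lon_bounds[1]))
    return field[:, mask].mean(axis=1)

# Hypothetical coarse tropical Pacific grid, flattened to one spatial axis.
lon_grid, lat_grid = np.meshgrid(np.arange(150, 280, 10), np.arange(-10, 11, 5))
lat = lat_grid.ravel()
lon = lon_grid.ravel()

# Random stand-ins for five predicted PCs (24 months) and five EOF patterns.
rng = np.random.default_rng(1)
pcs = rng.normal(size=(24, 5))
eofs = rng.normal(size=(5, lat.size))

field = reconstruct_field(pcs, eofs)

# Niño 3.4 box: 5S-5N, 170W-120W (190E-240E).
nino34 = region_average(field, lat, lon, (-5, 5), (190, 240))
```

Because both steps are linear, the regional index can equivalently be obtained by averaging each EOF over the box first and then combining the results with the predicted PCs.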

Contour maps of the correlation scores between the predicted and observed anomalies for the nonlinear models are provided in Figures

BNN forecast correlation performance and its difference from that of LR at lead times of 3–12 months for the period 1980–2005

(Panels: BNN skill and its difference from LR at leads of 3, 6, 9, and 12 months.)

SVR forecast correlation performance and its difference from LR for the period 1980–2005

(Panels: SVR skill and its difference from LR at leads of 3, 6, 9, and 12 months.)

Although adding the WWV index as a predictor slightly enhanced the forecast skills, it reduced the difference in prediction skill between the nonlinear and linear models in some cases. For instance, the slight prediction advantage at lead times of 9–12 months of BNN relative to LR was reduced when WWV was added, as can be seen by comparing Figures

BNN forecast performance and its difference from LR for the period 1980–2005

(Panels: BNN(+WWV) skill and its difference from LR at leads of 3, 6, 9, and 12 months.)

SVR forecast performance and its difference from LR for the period 1980–2005

(Panels: SVR(+WWV) skill and its difference from LR at leads of 3, 6, 9, and 12 months.)

The analysis was repeated for the longer record of 1950–2005. Compared with the results for the 1980–2005 period, the results from 1950–2005 showed an even smaller difference between the nonlinear and linear methods. This is not surprising since the ENSO episodes during 1950–1979 were weaker and less nonlinear than those from 1980 onward [

The forecast results of the

Although SVR has two structural advantages over NN models, namely, (a) no multiple minima as there is no nonlinear optimization involved, and (b) an error norm robust to outliers in the data, it did not give overall better forecasts than BNN. Presumably (a) the use of an ensemble average in BNN to deal with multiple solutions from multiple minima was quite adequate, and (b) the data set did not contain drastic outliers to utilize the advantage of the robust SVR model.

Addition of WWV as an extra predictor generally increased the forecast skills slightly; however, the influence of WWV on SST anomalies along the central-eastern tropical Pacific region appears to be linear as the difference between nonlinear and linear models often diminished when WWV was included. As the WWV embodies the large-scale low-frequency dynamics [

The authors were supported by the Natural Sciences and Engineering Research Council of Canada and the Canadian Foundation for Climate and Atmospheric Sciences. The authors are grateful to Dr. Aiming Wu for his helpful comments. The WWV index of Meinen and McPhaden [