The online gradient method has been widely used in training neural networks. In this paper we consider an online split-complex gradient algorithm for complex-valued neural networks, with an adaptive learning rate chosen during the training procedure. Under certain conditions, by first establishing the monotonicity of the error function, we prove that the gradient of the error function tends to zero and that the weight sequence tends to a fixed point. A numerical example is given to support the theoretical findings.

In recent years, neural networks have been widely used because of their outstanding capability of approximating nonlinear models. As an important search method in optimization theory, the gradient algorithm has been applied in various engineering fields, such as adaptive control and recursive parameter estimation [

The parameters of conventional neural networks are usually real numbers, suited to processing real-valued signals [

Convergence is of primary importance for a training algorithm to be successfully used. There have been extensive research results concerning the convergence of gradient algorithms for real-valued neural networks (see, e.g., [

The remainder of this paper is organized as follows. The CVNN model and the OSCG algorithm are described in the next section. Section

It has been shown that a two-layered CVNN can solve many problems that cannot be solved by real-valued neural networks with fewer than three layers [

For the convenience of using the OSCG algorithm to train the network, we consider the following popular real-imaginary-type activation function [
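As a concrete illustration (a minimal sketch, not the paper's exact definition, which sits in the truncated formula above), a real-imaginary-type activation applies a real activation function to the real and imaginary parts of its complex argument separately; with tanh as the real activation this reads:

```python
import numpy as np

def split_tanh(z):
    """Real-imaginary-type activation: apply a real activation
    function (here tanh) to the real and imaginary parts of the
    complex argument separately."""
    z = np.asarray(z)
    return np.tanh(z.real) + 1j * np.tanh(z.imag)
```

Since tanh and its first two derivatives are bounded, this choice also satisfies the kind of boundedness conditions imposed on the activation function in the convergence analysis below.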

Let the network be supplied with a given set of training examples

The neural network training problem is to look for the optimal choice
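The optimal weights are typically defined as minimizers of a quadratic error over the training set. A minimal sketch of such an error function, with the real and imaginary parts of the output error separated as in the split-complex formulation (the names `outputs` and `targets` are illustrative, not the paper's notation):

```python
import numpy as np

def split_error(outputs, targets):
    """E = 1/2 * sum_j ((Re d_j - Re o_j)^2 + (Im d_j - Im o_j)^2),
    i.e. the usual quadratic error 1/2 * sum_j |d_j - o_j|^2 with the
    real and imaginary parts treated as separate real channels."""
    diff = np.asarray(targets) - np.asarray(outputs)
    return 0.5 * float(np.sum(diff.real ** 2 + diff.imag ** 2))
```

Splitting the error into the two real channels is what allows real-valued gradient arguments to be carried over to the complex-valued network.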

For the convergence analysis of the OSCG algorithm, similar to the batch version of the split-complex gradient algorithm [

There exists a constant

The set

In this section, we will give several lemmas and the main convergence theorems. The proofs of those results are postponed to the next section.

In order to derive the convergence theorem, we need to estimate the values of the error function (

Suppose Assumption

The second lemma gives the estimations on some terms of (

Suppose Assumptions

From Lemmas

Suppose Assumptions

With the above Lemmas

Let

To give the convergence theorem, we also need the following estimation.

Let

The following lemma gives an estimate of a series, which is essential for the proof of the convergence theorem.

Suppose that a series

The following lemma will be used to prove the convergence of the weight sequence.

Suppose that the function

Now we are ready to give the main convergence theorem.

Let

Using Taylor's formula, we have
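The expansion invoked here is the standard second-order Taylor formula with Lagrange remainder; in generic form (the exact arguments appear in the truncated equation, so the symbols below are placeholders):

```latex
g(t) = g(t_0) + g'(t_0)\,(t - t_0) + \frac{1}{2}\, g''(s)\,(t - t_0)^2,
\qquad s \text{ between } t_0 \text{ and } t.
```

The assumed boundedness of the second derivative of the activation function is what keeps the remainder term under control in such estimates.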

From (

Recalling Lemmas

In virtue of (

From Lemma

This lemma is the same as Lemma

This result is almost the same as Theorem

Using (

Next we begin to prove (

Furthermore, from Assumption (

In this section we illustrate the convergence behavior of the OSCG algorithm with a simple numerical example. The well-known XOR problem is a benchmark in the neural network literature. As in [

This example uses a network with two input nodes (including a bias node) and one output node. The transfer function is tansig (the hyperbolic tangent sigmoid)
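A toy sketch of the online split-complex gradient update for such a single-output network follows. The complex encoding of the XOR patterns and all variable names are illustrative assumptions (the paper's own encoding is in the truncated citation), and a constant learning rate is used here instead of the paper's adaptive one:

```python
import numpy as np

def split_tanh(z):
    # tansig applied separately to the real and imaginary parts
    return np.tanh(z.real) + 1j * np.tanh(z.imag)

def error(w, z, d):
    # instantaneous error 1/2*|d - o|^2 for one training pattern
    o = split_tanh(np.dot(w, z))
    return 0.5 * ((d.real - o.real) ** 2 + (d.imag - o.imag) ** 2)

def oscg_step(w, z, d, eta):
    """One online update: gradient of the instantaneous error taken
    with respect to the real and imaginary parts of w separately."""
    o = split_tanh(np.dot(w, z))
    eR, eI = d.real - o.real, d.imag - o.imag
    gR, gI = 1.0 - o.real ** 2, 1.0 - o.imag ** 2  # tanh' = 1 - tanh^2
    delta = eR * gR + 1j * eI * gI
    return w + eta * delta * np.conj(z)

# Illustrative XOR encoding: inputs x1 + i*x2, plus a bias input of 1.
patterns = [(np.array([0 + 0j, 1]), 0 + 0j),
            (np.array([0 + 1j, 1]), 1 + 0j),
            (np.array([1 + 0j, 1]), 1 + 0j),
            (np.array([1 + 1j, 1]), 0 + 0j)]

w = np.array([0.1 + 0.2j, -0.3 + 0.05j])
for _ in range(100):              # online sweeps over the patterns
    for z, d in patterns:
        w = oscg_step(w, z, d, eta=0.1)
```

The update `eta * delta * conj(z)` is exactly the negative gradient of the instantaneous error with respect to the real and imaginary parts of each weight, which can be verified against a finite-difference approximation.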

Convergence behavior of OSCG algorithm for solving XOR problem (sum of gradient norms

In this paper we investigate some convergence properties of an OSCG training algorithm for two-layered CVNN, using an adaptive learning rate in the algorithm. Under the condition that the activation function and its derivatives up to second order are bounded, it is proved that the error function is monotonically decreasing during the training process. With this result, we further prove that the gradient of the error function tends to zero and the weight sequence tends to a fixed point. A numerical example is given to support our theoretical analysis. We mention that these results are interestingly similar to the convergence results of the batch split-complex gradient training algorithm for CVNN given in [

The authors wish to thank the associate editor and the anonymous reviewers for their helpful comments and valuable suggestions regarding this paper. This work is supported by the National Natural Science Foundation of China (70971014).