We investigate the conditions under which nonnegative matrix factorization (NMF) is unique and introduce several theorems that can determine whether a given decomposition is unique. The theorems are illustrated by several examples showing their use and their limitations. We show that corruption of a unique NMF matrix by additive noise leads to a noisy estimation of the noise-free unique solution. Finally, we use a stochastic view of NMF to analyze which characterizations of the underlying model result in an NMF with small estimation errors.

Large quantities of positive
data occur in research areas such as music analysis, text analysis, image
analysis, and probability theory. Before deductive science is applied to large
quantities of data, it is often appropriate to reduce data by preprocessing,
for example, by matrix rank reduction or by feature extraction. Principal
component analysis is an example of such preprocessing. When the original data
is nonnegative, it is often desirable to preserve this property in the
preprocessing. For example, elements in a power spectrogram, probabilities, and
pixel intensities should still be nonnegative after the processing to be
meaningful. This has led to the construction of algorithms for rank reduction
of matrices and feature extraction generating nonnegative output. Many of the
algorithms are related to the nonnegative matrix factorization (NMF) algorithm
proposed by Lee and Seung [

An interesting question is whether the NMF of a
particular matrix is unique. The importance of this question depends on the
particular application of NMF. There can be two different viewpoints when using
a model like NMF—either one can believe that the model describes nature and
that the variables

The first articles on the subject were two correspondences between Berman and Thomas. In [

The first article investigating the uniqueness of NMF
is Donoho and Stodden [

Simultaneously with the development of NMF, Plumbley
[

The result in [

In this paper, we investigate the circumstances under
which NMF of an observed nonnegative matrix is unique. We present novel
necessary and sufficient conditions for the uniqueness. Several examples
illustrating these conditions and their interpretations are given.
Additionally, we show that NMF is robust to additive noise. More specifically,
we show that it is possible to obtain accurate estimates of

This paper is structured as follows. Section

We will here
introduce convex duality that will be the framework of the paper, but first we
shall define the notation to be used. Nonnegative real numbers are denoted as

In the paper, we make a geometric interpretation of
the NMF similar to that used in both [

The

In some literature, the positive span is called the conical hull.
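
For readers who know this object under the name conical hull, the standard definition can be written as follows. The symbols $b_1,\dots,b_d$ for the generating vectors and $a_j$ for the coefficients are notation assumed here for illustration and may differ from the notation used elsewhere in the paper.

```latex
\operatorname{span}^{+}(b_1,\dots,b_d)
  = \Bigl\{\, v = \sum_{j=1}^{d} a_j\, b_j \;:\; a_j \ge 0 \text{ for all } j \,\Bigr\}
```

In words, a vector lies in the positive span exactly when it is a nonnegative combination of the generators. In particular, every column of a product of two nonnegative matrices lies in the positive span of the columns of the left factor.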

A set

The

The following lemma is easy to prove and will be used
subsequently. For a more general introduction to convex duality, see [

(a) If

(b) If

(c) If

(d) If

In this section, our definition of unique NMF and some
general conditions for unique NMF are given. As a
starting point, let us assume that both

Let

The
following is an example of an

The inverse of a nonnegative matrix is nonnegative if and only if the matrix is a scaled permutation.
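
The statement can be checked numerically on a small case. The sketch below is only an illustration: the matrices `W`, `H`, `Q_perm`, and `Q_dense` and the helper `is_nonnegative` are hypothetical and not taken from the paper. It compares a scaled permutation with a dense nonnegative invertible matrix and shows which of the two yields a second nonnegative factorization of the same product.

```python
import numpy as np

# Hypothetical nonnegative factors, chosen only for illustration.
W = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
H = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, 3.0]])
R = W @ H

# A scaled permutation: exactly one positive entry in each row and column.
Q_perm = np.array([[0.0, 2.0],
                   [0.5, 0.0]])
# A nonnegative invertible matrix that is not a scaled permutation.
Q_dense = np.array([[2.0, 1.0],
                    [1.0, 2.0]])


def is_nonnegative(A, tol=1e-12):
    """Return True if every entry of A is nonnegative up to a tolerance."""
    return bool(np.all(A >= -tol))


for name, Q in [("scaled permutation", Q_perm), ("dense", Q_dense)]:
    Q_inv = np.linalg.inv(Q)
    W_new, H_new = W @ Q, Q_inv @ H
    # Both pairs reproduce R; only nonnegativity of the factors differs.
    assert np.allclose(W_new @ H_new, R)
    print(name,
          "| inverse nonnegative:", is_nonnegative(Q_inv),
          "| new factors nonnegative:",
          is_nonnegative(W_new) and is_nonnegative(H_new))
```

For the scaled permutation, the transformed factors are just rescaled and reordered versions of the originals, which is the ambiguity that a uniqueness definition has to allow. For the dense matrix, the inverse contains negative entries, and in this particular example the transformed right factor becomes negative as well.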

Lemma

A
matrix

The scaling and permutation ambiguity in the
uniqueness definition is a well-known ambiguity that occurs in many blind
source separation problems. With this definition of unique NMF, it is possible
to make the following two characterizations of the unique NMF.

If

The proof follows the analysis of
the

The NMF is unique if and only if
there is only one

The proof follows directly from the definitions.

If

The following definition will be shown to be a
necessary condition for both the set of row vectors in

A set

In the case of closed sets, the boundary close
condition is that

The set of row vectors in

If the
set of row vectors in

That not
only the row vectors of

The following is an example where

Let

A three-dimensional space is scaled such that the
vectors are in the hyperplane:

In three dimensions, as in Example

A set of vectors

If

In this section, a condition for unique

This is
an investigation of uniqueness of

The
figure shows data constructed as in Example

In the example above,

Next, we investigate how to formulate an asymmetric uniqueness constraint.

A set of vectors in

Note that in the definition for sufficiently spread
set the

The dual space of a sufficiently spread set is the positive orthant.

A sufficiently spread set is
nonnegative and the positive orthant is therefore part of the dual set for any
sufficiently spread set. Let

In the case of finite sets, the sufficiently spread
condition is the same as the requirement for a scaled version of all the
standard basis vectors to be part of the sufficiently spread set. It is easy to
verify that a sufficiently spread set is also strongly boundary close and that
the
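
The finite-set form of the sufficiently spread condition is simple to test in code. The following is a minimal sketch under stated assumptions: the function name `contains_scaled_basis`, the row-wise storage of the vectors, and the numerical tolerance `tol` are all hypothetical choices made for illustration, not part of the paper.

```python
import numpy as np


def contains_scaled_basis(vectors, tol=1e-12):
    """Check the finite-set condition: for every coordinate j, some vector
    in the set is a positive multiple of the j-th standard basis vector.

    `vectors` is a 2-D array-like with one vector per row.
    """
    V = np.asarray(vectors, dtype=float)
    if np.any(V < -tol):
        return False  # the set must be nonnegative
    d = V.shape[1]
    found = np.zeros(d, dtype=bool)
    for v in V:
        support = np.flatnonzero(v > tol)
        # Exactly one positive entry means v is a scaled basis vector.
        if support.size == 1:
            found[support[0]] = True
    return bool(found.all())


# Contains scaled copies of e_1, e_2, and e_3 plus one extra vector.
spread = [[2.0, 0.0, 0.0],
          [0.0, 0.5, 0.0],
          [0.0, 0.0, 3.0],
          [1.0, 1.0, 0.0]]
# Nonnegative, but no vector is supported on a single coordinate.
not_spread = [[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0],
              [1.0, 0.0, 1.0]]

print(contains_scaled_basis(spread))      # True
print(contains_scaled_basis(not_spread))  # False
```

The check only covers finite sets; for infinite sets the condition in the definition has to be verified directly.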

If a pair

Lemma

Theorem

In the
previous sections, we have analyzed situations with a unique solution. In this
section, it is shown that in some situations the nonuniqueness can be seen as
estimation noise on

Let

The proof is given in the appendix. The theorem states
that if the observation is corrupted by additive noise, then it will result in
noisy estimation of

This example investigates the
connection between the additive noise in

The three basis
pictures: (a) a dog, (b) a man, and (c) the sun from Example

In the example, two different noise matrices,

The estimation error of the factorization

The graph shows the
connection between the norm of the additive error

Let

This
follows directly from Theorem

Let

Data constructed as in Example

In this
section, the row vectors of

Let the
row vectors of

If the data is scaled,

Let all the elements in

The
figure shows data constructed as in Example

The approach in this paper is to investigate when
nonnegativity leads to uniqueness in connection with NMF,

As shown in Figure

The sufficiently spread condition defined in Section

We have investigated the uniqueness of NMF from three different viewpoints as follows:

uniqueness in noise-free situations;

the estimation error of the underlying model when noise is added to a matrix with a unique NMF; and

the random processes that lead to matrices where the underlying model can be estimated with small errors.

The theorem
states that

the vectors

Let

Select

This research was supported by the Intelligent Sound project, Danish
Technical Research Council Grant no. 26-02-0092. The work of M. G. Christensen is supported by the Parametric Audio
Processing project, Danish Research Council for Technology and Production
Sciences Grant no. 274-06-0521.
Part of this work was previously presented at a conference [