This paper applies a novel sparse approximation method to the nonrigid structure from motion problem in trajectory space. Instead of generating a truncated traditional trajectory basis, the method uses an atom dictionary containing a set of overcomplete bases to estimate the real shape of the deformable object. It nevertheless runs reliably and can reach an optimal result. Moreover, the size of the predefined trajectory basis need not be considered; that is, there is no need to truncate the trajectory basis. The method is easy to implement, and the only difficulty that needs to be solved is an

Nonrigid structure from motion (NRSfM) refers to the process of recovering 3D coordinates of a nonrigid object from its 2D projections. This technology plays an important role in computer vision applications [

Moreover, experiments have shown that the efficiency of the trajectory basis method depends on two factors: the type of trajectory basis and the number of bases. For the choice of trajectory basis, the discrete cosine transform (DCT) for Markov was found to be more suitable as a general basis [

An advantage of the sparse coding approach is that it is not restricted to a single trajectory basis function; the dictionary may be generated from two or more incoherent basis functions. This is useful for recovering trajectory curves that consist of several different types of basis functions. Furthermore, since the goal of sparse approximation is to represent trajectory sequences as a sparse combination of all atoms, there is no need to predefine the number of trajectory bases.
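As a minimal sketch of such a dictionary, the following Python code (NumPy assumed) concatenates an orthonormal DCT-II basis with a Dirac (identity) basis and measures how incoherent the two are. The function names `dct_basis`, `union_dictionary`, and `mutual_coherence`, as well as the frame count of 64, are illustrative assumptions and are not taken from this paper.

```python
import numpy as np


def dct_basis(n_frames):
    """Orthonormal DCT-II basis; column k is the k-th cosine atom."""
    t = np.arange(n_frames)[:, None]   # frame index (rows)
    k = np.arange(n_frames)[None, :]   # frequency index (columns)
    basis = np.sqrt(2.0 / n_frames) * np.cos(np.pi * (2 * t + 1) * k / (2 * n_frames))
    basis[:, 0] /= np.sqrt(2.0)        # rescale the constant atom to unit norm
    return basis


def union_dictionary(n_frames):
    """Overcomplete dictionary: DCT atoms followed by Dirac (identity) atoms."""
    return np.hstack([dct_basis(n_frames), np.eye(n_frames)])


def mutual_coherence(dictionary):
    """Largest absolute inner product between two distinct unit-norm atoms."""
    gram = np.abs(dictionary.T @ dictionary)
    np.fill_diagonal(gram, 0.0)
    return gram.max()


if __name__ == "__main__":
    dictionary = union_dictionary(64)  # hypothetical sequence length
    print(dictionary.shape, mutual_coherence(dictionary))
```

With F frames the dictionary has 2F atoms, and the coherence between the DCT and Dirac parts is bounded by sqrt(2/F), which is the sense in which the two bases are incoherent.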

The measured projective trajectories are contained in a

In this formula,

Matrix

It is well known that most of the energy of natural signals is concentrated in the low-frequency range, and the discrete cosine transform (DCT) has a strong advantage in energy compaction [

In Akhter’s method [

Traditional trajectory basis approaches to reconstruction mainly rely on a particular kind of orthogonal basis, such as the Fourier basis, various DCT bases, or other orthogonal wavelet bases. In these approaches, all the trajectory curves of the feature points are represented as a linear combination of one type of waveform. The difficulty is that, if the number of trajectory bases is too large, solving the system of equations becomes an NP-hard problem. This paper therefore introduces a new method that uses sparse approximation, instead of the traditional trajectory basis method, to represent the trajectory curves. Sparse approximation provides a class of algorithms that learn basis functions capturing higher-level features in the input data [

In practice, wavelets perform poorly on high-frequency sinusoids and, conversely, sinusoids perform poorly on impulsive events [

A dictionary is a concatenation of several orthonormal bases. The representation of a signal in a dictionary is therefore not unique, and many possible combinations arise. The goal in every case, however, is to obtain a highly sparse decomposition that contains very few nonzero terms. This leads to an optimization problem:

then a unique K-sparse solution of the vector

The

With this background in place, the application of the sparse approximation algorithm to the NRSfM problem is presented next.

The goal of the NRSfM is to estimate the camera motion matrix

One can compute the rank-3K decomposition of a measurement matrix via singular value decomposition (SVD) [

The coefficients matrix

To simplify the expression, it can be represented by the following optimization problem:

An iterative feature-sign search algorithm, which can solve for

The training data used in this paper come from the CMU Motion Capture dataset, which covers a variety of human actions. Random synthetic data are used only for algorithm validation, where the sparse approximation method performs clearly better than the others; those results are therefore omitted, and this paper reports results on real sequences only. The real scenes tested mainly include the commonly used sequences “Yoga” (41/307), “drinking” (41/1102), “pickup” (41/357), “shark” (91/240), “Stretch” (41/740), and “walking” (55/260), where

First, an experiment is run on the “Yoga” sequence using a union atom dictionary of DCT and Dirac functions. Once the 3D coordinates of the deformable object have been recovered, the corresponding coefficients of the atom dictionary follow naturally. The experimental result is shown in Figure

Coefficients distribution from the 1st to the 250th atoms on the “Yoga” sequence (better viewed in color).

The “atom” axis gives the index of each basis in the dictionary generated by the trajectory basis functions; that is, each atom corresponds to one column of the dictionary matrix. The “coefficients” axis gives the coefficient associated with each atom. The graph shows that most of the atom coefficients are zero, which supports the assumption that the coefficient matrix is sparse and also confirms the feasibility of the sparse approximation method.
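The sparsity pattern described here can be reproduced on a toy trajectory. The sketch below is illustrative only: it uses iterative soft thresholding (ISTA) as a simple stand-in for the feature-sign search algorithm, a synthetic one-dimensional trajectory instead of the “Yoga” data, and a hypothetical penalty value `lam`. All function and variable names are assumptions.

```python
import numpy as np


def dct_basis(n_frames):
    """Orthonormal DCT-II basis; column k is the k-th cosine atom."""
    t = np.arange(n_frames)[:, None]
    k = np.arange(n_frames)[None, :]
    basis = np.sqrt(2.0 / n_frames) * np.cos(np.pi * (2 * t + 1) * k / (2 * n_frames))
    basis[:, 0] /= np.sqrt(2.0)
    return basis


def soft_threshold(values, tau):
    """Shrink every entry toward zero by tau; entries below tau become exactly 0."""
    return np.sign(values) * np.maximum(np.abs(values) - tau, 0.0)


def ista(dictionary, signal, lam, n_iter=1000):
    """Minimise 0.5 * ||signal - D c||^2 + lam * ||c||_1 by ISTA."""
    step = 1.0 / np.linalg.norm(dictionary, 2) ** 2  # 1 / Lipschitz constant
    coeffs = np.zeros(dictionary.shape[1])
    for _ in range(n_iter):
        gradient = dictionary.T @ (dictionary @ coeffs - signal)
        coeffs = soft_threshold(coeffs - step * gradient, lam * step)
    return coeffs


n_frames = 64                                         # hypothetical length
dictionary = np.hstack([dct_basis(n_frames), np.eye(n_frames)])
# Synthetic trajectory: one cosine atom plus one impulsive frame.
trajectory = 3.0 * dictionary[:, 2] + 2.0 * dictionary[:, n_frames + 20]
coeffs = ista(dictionary, trajectory, lam=0.1)
active = np.flatnonzero(coeffs)                       # indices of nonzero atoms
print(len(active), "of", coeffs.size, "atoms are active")
```

Plotting `coeffs` against the atom index gives a picture of the same kind as the coefficient distribution figure: a few isolated spikes on an otherwise zero axis.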

DCT has been shown to outperform the other bases in the trajectory basis model, and it is the most commonly used. This paper compares the sparse method against trajectory basis methods that use DCT bases of different sizes. The atom dictionary used in the sparse approximation method is again a union of the DCT basis and the Dirac basis. This subsection presents the performance of these methods on the “Yoga” motion sequence and plots the shape reconstruction errors in one diagram, as shown in Figure

Shape estimation error on the “Yoga” motion sequence. From left to right, the columns show the results for DCT basis sizes of 3, 6, 9, 12, and 13, respectively, and the rightmost column shows the result of the sparse method (better viewed in color).

Among the DCT basis models, a basis size of 12 gives the smallest shape estimation error. The sparse coding method, however, is clearly better than the trajectory basis method at every DCT basis size tested. The reason is straightforward: the atom dictionary contains all predefined trajectory bases, including those discarded by the traditional trajectory basis model. The sparse method merely sets the coefficients of unused trajectory bases to zero for a given trajectory curve, and those bases remain available to other trajectory curves. In the trajectory basis model, by contrast, once the bases have been truncated to a fixed size, every trajectory curve can be represented only by the truncated bases.
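This difference can be illustrated numerically. The sketch below assumes a synthetic trajectory made of two low-frequency DCT atoms plus one impulsive frame, and compares a least-squares fit on the first 12 DCT atoms with a greedy orthogonal matching pursuit (OMP) over the DCT and Dirac union under the same budget of 12 atoms. OMP is a stand-in chosen for brevity, not the solver used in this paper, and all names and values are assumptions.

```python
import numpy as np


def dct_basis(n_frames):
    """Orthonormal DCT-II basis; column k is the k-th cosine atom."""
    t = np.arange(n_frames)[:, None]
    k = np.arange(n_frames)[None, :]
    basis = np.sqrt(2.0 / n_frames) * np.cos(np.pi * (2 * t + 1) * k / (2 * n_frames))
    basis[:, 0] /= np.sqrt(2.0)
    return basis


def truncated_fit(basis, signal, size):
    """Least-squares fit that may use only the first `size` atoms of `basis`."""
    kept = basis[:, :size]
    coeffs, *_ = np.linalg.lstsq(kept, signal, rcond=None)
    return kept @ coeffs


def omp_fit(dictionary, signal, max_atoms, tol=1e-10):
    """Greedy OMP: add the atom most correlated with the residual, then refit."""
    selected = []
    approx = np.zeros_like(signal)
    for _ in range(max_atoms):
        residual = signal - approx
        if np.linalg.norm(residual) < tol:
            break                          # signal already explained
        scores = np.abs(dictionary.T @ residual)
        if selected:
            scores[selected] = 0.0         # never pick the same atom twice
        selected.append(int(np.argmax(scores)))
        atoms = dictionary[:, selected]
        coeffs, *_ = np.linalg.lstsq(atoms, signal, rcond=None)
        approx = atoms @ coeffs
    return approx, selected


n_frames = 64                              # hypothetical sequence length
dct = dct_basis(n_frames)
dictionary = np.hstack([dct, np.eye(n_frames)])
trajectory = 3.0 * dct[:, 1] + 1.5 * dct[:, 4]
trajectory[30] += 2.0                      # impulsive event in one frame

err_trunc = np.linalg.norm(trajectory - truncated_fit(dct, trajectory, 12))
sparse_approx, selected = omp_fit(dictionary, trajectory, 12)
err_sparse = np.linalg.norm(trajectory - sparse_approx)
print("truncated DCT error:", err_trunc, "sparse error:", err_sparse)
```

The truncated basis cannot represent the impulse because the Dirac atom is not among its 12 columns, whereas the greedy search over the union is free to select it and leaves all other atoms at zero.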

This subsection gives a statistical comparison between the well-sized DCT trajectory basis method [

Estimation errors of each frame on the “Yoga” sequence. The top curve shows the result of the DCT method with 12 DCT bases, and the other curve shows the result of the sparse coding method (better viewed in color).

From Figure

To verify the effectiveness of the sparse coding method, the experiment is repeated on different scenes, using the real sequences “Yoga,” “walking,” “pickup,” “shark,” and “drinking.” The best DCT trajectory basis size for each scene is taken from previous experiments, which have been shown to be effective. This experiment applies the best DCT basis model to every scene and compares its results with those of the sparse approximation method. From previous work, the best DCT basis size is 12 for “Yoga,” 8 for “walking,” 12 for “pickup,” 2 for “shark,” and 10 for “drinking.” The penalty coefficient of the sparse approximation method is 0.1. The experimental result is shown in Figure

Shape estimation error on different scenes (better viewed in color).

From Figure

Many experiments were carried out to verify the effectiveness of the sparse approximation method, and representative results are shown here. This subsection gives the shape reconstruction results on the “Yoga,” “shark,” and “Stretch” sequences using the DCT method and the sparse approximation method, respectively. The truncated size of the DCT basis in the “Yoga” experiment is 12, and the penalty coefficient of the sparse method is 0.1. The experimental results are shown in Figure

3D reconstruction results on the “Yoga” sequence with the two methods. The red dots denote the ground truth coordinates and the blue circles denote the estimated coordinates. The top row shows the reconstruction with sparse approximation and the bottom row shows the reconstruction with the DCT trajectory basis method. The results of both methods are shown with the

3D reconstruction results on the “shark” sequence with the two methods. The red dots denote the ground truth coordinates and the blue circles denote the estimated coordinates. The top row shows the reconstruction with sparse approximation and the bottom row shows the reconstruction with the DCT trajectory basis method. The results of both methods are shown with the

To verify the effectiveness of the sparse approximation method, this subsection also gives the experimental result on the “Stretch” sequence; Figure

3D reconstruction results on the “Stretch” sequence with the two methods. The red dots denote the ground truth coordinates and the blue circles denote the estimated coordinates. The top row shows the reconstruction with sparse approximation and the bottom row shows the reconstruction with the DCT trajectory basis method. The results of both methods are shown with the

From Figures

This paper introduces a novel sparse approximation method for the NRSfM problem. The method is easy to understand and is guaranteed to reach an optimal solution. With the sparse approximation method, the size of the truncated DCT basis need not be considered. In this paper, only the union of the DCT and Dirac functions is applied to the reconstruction of a moving 3D object. It is expected that the camera rotation matrix can be estimated more accurately.

All experiments in this paper use orthographic camera models. Thanks to recent progress in sparse signal coding, the proposed solution can be readily applied to the NRSfM problem. This paper offers only one way to address the choice of trajectory basis size; other ideas may also apply. In addition, estimating the rotation matrix accurately will help to obtain a satisfactory result.

The authors declare that there is no conflict of interests regarding the publication of this paper.

This work is supported by the National Natural Science Foundation of China (61272311) and is also supported in part by the Natural Science Foundation of Zhejiang Province (LY13F020042, LY14F010022, and Z15F020017) and 521 Project of Zhejiang Sci-Tech University.