Improved Error Reduction and Hybrid Input Output Algorithms for Phase Retrieval by including a Sparse Dictionary Learning-Based Inpainting Method

,e phase retrieval (PR), reconstructing an object from its Fourier magnitudes, is equivalent to a nonlinear inverse problem. In this paper, we proposed a two-step algorithm that traditional ER/HIO iteration plays as the coarse feature reconstruction, whereas the KSVD-based inpainting technique deals with the fine feature set accordingly. Since the KSVD allows the content of oversampled dictionary with sparse representation to adaptively fit a given set of object examples, as long as the ER/HIO algorithms provide decent object estimation at early stage, the pixels violating the object constraint can be restored with superior image quality. ,e numerical analyses demonstrated the effectiveness of ER+KSVD and HIO+KSVD through multiple independent initial Fourier phases. With its versatility and simplicity, the proposed method can be generalized to be implemented with more PR state-of-the-arts.


Introduction
In most optical imaging applications, the globally spectral phase encodes critical structural information about the object. However, the optical detection devices merely record the magnitude squared of the field from a diffraction pattern and cause to loss the phase information. In order to recover an object field from its diffraction intensity, phases in the spectral domain need to be estimated. is is a basic definition about the phase retrieval (PR) problem. Such problem has been an active research topic in numerous engineering and scientific applications [1][2][3][4][5][6][7][8]. Generally, PR is a nonlinear inverse ill-posed problem; therefore, prior information of the underlying object is necessary to enable its recovery. e existing PR algorithms can be roughly categorized into the following classes: general iteration [9][10][11][12][13][14][15][16][17][18], sparse-based and compressed sensing methods [19][20][21][22][23], and convex and nonconvex optimization methods [24][25][26][27], respectively. e classical PR algorithm subject to the general iteration was proposed by Gerchberg and Saxton [9], Yang and Gu [10], and Fienup [11,12], respectively. e core idea is to start with an initial guess about the Fourier phase and then project the estimate onto the object domain (subject to spatial constraint) and the Fourier domain (subject to Fourier constraint) alternatively until the convergence criterion is reached. e Fourier constraint guarantees that the spectrum of the reconstructed object has the same Fourier magnitude as the measurement. Meanwhile, finitely supported, real-valued, and nonnegative priors of the object are usually imposed as the object constraints. e most popular and simplest PR iterative algorithms for a single measurement problem are the error-reduction (ER) and the hybrid input-output (HIO) algorithms, respectively. To successfully retrieve the Fourier phase, the oversampling condition, i.e., finer than the Nyquist interval, should be fulfilled in the Fourier dimension in both ER and HIO [1,17]. Various extended versions, such as relaxed averaged alternating reflection algorithm [18] and difference map algorithm [13], were subsequently developed to improve either the numerical robustness or estimation performance. Recently, inspired by the compressed sensing [28], the sparsity priors were incorporated into the object constraint. Experimental results demonstrated that the sparsity-based iterative methods have superior performance over the traditional ones with no sparsity priors [20,21]. Since the sparse prior requires an overcomplete representation, the dictionary learning for the PR algorithm was proposed based on adaptive sparse representation and self-learning [22]. To reduce the learning time, the transfer orthogonal sparsifying transform learning method was developed to avoid inefficient updates at the early iterations [23]. e last point of view about PR algorithms is to formulate the PR as a nonconvex problem. e gradient descent method is employed to solve such formulated problem [24]. ese schemes can guarantee the convergence to a stationary solution. On the other hand, nonconvex problems can be transformed into convex relaxations and be solved with convex optimization methods [25][26][27]. However, the mathematical theory is not yet developed to guarantee convergence for the nonconvex constraints. Interested readers can refer to [29] for an extensive bibliography and complete discussion about the PR problem.
In order to exhibit the effectiveness of sparse representation, in this paper, we proposed a novel algorithm associated with image inpainting technique as the extensive versions of two classical iterative PR algorithms (ER and HIO), respectively. e activation of object constraints in ER is periodically imported into the ER and HIO algorithms so that pixels of the estimated object violating the object constraints will be assigned zero values. Zero values in an image can be taken as the missing pixels. In order to recover missing pixels, the image inpainting technique was developed. Here, the K-singular value decomposition (KSVD) method [30], a dictionary learning algorithm for image inpainting, is adopted to restore these missing pixels. As a consequence, the ER and HIO algorithms combining with the KSVD algorithm (ER + KSVD and HIO + KSVD) were proposed. Since all of the ER, HIO, and KSVD algorithms had been proved to converge [11,30], the proposed methods can guarantee the convergence evidently. e cost function of relative squared error in numerical simulation validates that our method is able to avoid stagnation, get lower error values, and outperform the ER and HIO benchmark. e rest of this paper is organized as follows. In Section 2, related work and the KSVD method are briefly reviewed. Our main framework is also provided. In Section 3, the performance of the proposed methods is compared with classical PR methods, whereas further discussion is also addressed. Finally, conclusions and future works are summarized in Section 4.

Prior Works and Our Main Framework
For the purpose of completeness of this paper, we firstly defined the PR problem from a single intensity measurement and two classical algorithms, ER and HIO, respectively.
Subsequently, the sparse representation model for restoring missing pixels though the KSVD method was introduced. Finally, performance and evaluation through the proposed algorithms ER + KSVD and HIO + KSVD were discussed accordingly.
For the optical detection devices (e.g., charge-coupled device cameras), only the intensity distribution I(u) � |F(u)| 2 is available. e measured Fourier magnitude |F(u)| can always be obtained through the square root of the intensity and related to its counterpart in the spatial domain f(x) by the Fourier transform: where x, u, and F represent a two-dimensional (2D) spatial coordinate, a 2D frequency coordinate, and the Fourier transform operator, respectively. Basically, PR problems focus on how to find the phase information ϕ(u) in the Fourier domain from the Fourier measurement |F(u)|. Once the 2D Fourier phase ϕ(u) is recovered, accompanying the measurement |F(u)|, we can apply the inverse Fourier transform F − 1 : to reconstruct the original object accordingly.

ER and HIO Algorithms.
e ER and HIO algorithms involve alternate Fourier transform between the spatial and Fourier domains; both fields are subject to either prior constraints (e.g., real and nonnegative) or the measured data, as depicted in Figure 1.
Given a measured Fourier magnitude |F(u)| and a set of object constraints, the procedure of the ER algorithm consists of four steps as follows: where k is the k th iteration and c is a set of pixels at which g k ′ (x) satisfies the object constraints. In this article, we assumed the object constraint as a positive real-valued object g k ′ (x) within a finite supported region, i.e., the estimate g k ′ (x) outbound the known diameter of the object will be 2 International Journal of Optics assigned zero. In Figure 1 and equations (3a)-(3d), (g k ′ /g k+1 ) and (G k /G k ′ ) represent the first/final approximation in one iteration cycle to the original object f and Fourier spectrum F, respectively. ϕ k is the estimate of the Fourier phase ϕ of the original object. e four steps complete one iteration cycle and continue until reaching either the maximum number of iterations or the cost function below the threshold value.
Based on the ER algorithm which discards all the violating pixel values, Fienup proposed the HIO algorithm with a memory mechanism to speed up the convergence. e iterative process of HIO at k th iteration is the same as the ER algorithm except for the third step in equation (3c): where β is a constant feedback rate and usually selected to be less than 1, making the estimated object smoothly approaches toward the desired solution. An important feature of HIO lies in that its ability to converge to a global optimum instead of a local optimum. However, being vulnerable to noise, the image quality of recovery through the HIO algorithm is sometimes not satisfactory.
Since the Fourier magnitude is available in measurement, the error is accessible in the Fourier domain, whereas the convergence of both ER and HIO algorithms can be monitored by the relative squared error (RSE): which is used to evaluate how well the estimated Fourier magnitude |G k (u)| matching the given Fourier magnitude |F(u)|. According to Fienup's work [11], the error by which the estimated Fourier magnitude violates the given Fourier magnitude is monotonically decreasing. erefore, the ER and HIO algorithms are convergent.
In Figure 1, the activations for forcing the estimated object to conform the object constraints in ER and HIO algorithms are denoted by OC ER and OC HIO , respectively. In this paper, we regarded the pixel values violating the object constraints in equation (3c) as the missing pixels. Rather than totally discard (ER) or modification (HIO), we try to extract the useful image features through the sparse representation from the overcomplete dictionary. Such scheme has been well adopted in various image inpainting techniques.

KSVD Method for Image Restoration.
e image inpainting technique was developed to restore missing or damaged areas in an image. Among various image inpainting methods [30,31], here we selected the KSVD method to restore the missing pixels which were those violating the object constraints in the PR problem. KSVD, generalizing the Kmeans clustering process, is able to represent signals as linear combinations of a few atoms from a dictionary matrix. e atom in compressive sensing is usually represented as a column vector of the dictionary matrix. A proper dictionary matrix can be obtained from updating dictionary atoms at each iteration to better fit the data, whereas the initial dictionary matrix usually utilizes the 2D discrete cosine transform. To derive combinations of a few atoms, sparse representations via pursuit algorithm are introduced. erefore, the KSVD algorithm alternates between the sparse coding and dictionary update stages. e properties of the updated dictionary atoms set the limits on the sparse coefficient vector. e outline of the whole KSVD process is shown in Figure 2.
For the sake of paper completeness and to clarify the significance of KSVD for sparse representation, here we briefly review its math framework. For further details, interested readers can refer to the literature [30]. Using an overcomplete dictionary matrix D ∈ R M×K (M < K) that contains K prototype atoms for columns d j K j�1 , a signal y ∈ R M can be represented as a sparse linear combination of these atoms in the latent space. e representation of y is approximated as y ≈ Da.
e vector a ∈ R K contains the representation coefficients of the signal y. Since M is smaller than K, there are infinite solutions for this representation problem. With appropriate regularization [30], the imposition of the fewest number of nonzero coefficients set on the solution is likely to have a satisfactory representation.
When an image contains some missing pixels, the image can be firstly divided into N patches of the same size and make up an example set Y � [y 1 , y 2 , . . . , y N ], as shown in Figure 3. e extracted patches can be overlapping (or nonoverlapping) and converted to the column vectors. Each small patch y i , with dimension R M , can be approximately represented as the linear combinations of a few atoms from a dictionary matrix D ∈ R M×K . e representation error per y i is defined as e 2 i � ‖y i − Da i ‖ 2 2 , where a ∈ R K is the coefficient vector of linear combinations. en, the mean square error (MSE) of the overall representation is given by  Figure 1: Diagram of the iterative PR algorithm, where the parameters K MAX and k represent the maximum number of PR iterations and the k th iteration, respectively. e blue block indicates that the ER and HIO algorithms use different activations to force the estimated object to satisfy object constraints.
where A � [a 1 , a 2 , . . . , a N ] and F is the Frobenius norm. To search for the best possible dictionary for the sparse representation of the example set Y, the inpainting problem is therefore formulated as follows: where ‖ ‖ 0 denotes the number of nonzero elements in a vector. As the regularization, the coefficient vector a i has a maximum number of nonzero entries T 0 .
To solve the objective function expressed in equation (7), the KSVD algorithm iteratively alternates between the sparse coding and dictionary update stages, as shown in Figure 2.
For the sparse coding stage, the dictionary matrix D is fixed and the pursuit algorithms are employed to adequately compute the representation vector a i for each example y i [32,33]. Subsequently, the process of updating the dictionary matrix can be done by using the SVD. Since the dictionary matrix consists of K columns, KSVD updates the learning dictionary by computing SVD K times, each determining one column of the matrix. Carrying out a series of steps has capability to make MSE monotonically decreasing; therefore, the convergence of KSVD can be guaranteed [30]. Here, we exhibited an example of recovering missing pixels in an image by using the KSVD method. With 25% missing pixels and PSNR � 13.77 dB shown in Figure 4(b), we successfully filled the zero flawless and improve the image quality to PSNR � 31.56 dB. e KSVD method can effectively recover missing pixels as well as retrieve the latent features from a noisy image with sparse representation.
Since the missing pixels can be associated with those pixels violating the object constraints in ER, we applied the KSVD method upon ER/HIO iterations and examined its effectiveness in the PR problem.

Motivation and Our
Methods. Typically, a reconstruction with the least RSE is not always representative in terms of image. In this paper, in order to ensure that the ER and HIO algorithms will not stick in the minimum error, the image inpainting technique is periodically applied to perturb the convergence behavior of these two PR algorithms. Introducing the image inpainting technique can provide the PR algorithm with an ability to escape the local optimums. In addition, the recovered missing pixels using the image inpainting technique in the current iteration can render a finer representing feature set of the object for the next PR iteration. e core ideas of connecting the missing pixels with the ER and HIO algorithms, referred to ER + KSVD and HIO + KSVD, are, respectively, discussed as follows.
For the ER algorithm, the pixels in which the first estimated object g k ′ violates the object constraints in ER criterion will be set to zeros, as expressed in equation (3c).
ese zero values in the estimated object were viewed as the missing pixels, marked red in Figure 5(a). en we carried out the KSVD method to restore these missing pixels. e impact of the image inpainting method on the ER algorithm is exhibited in Figure 5. e object was not only inpainted with higher PSNR but also provide finer features for the next PR iteration. e procedure of the ER algorithm combining the KSVD method (ER + KSVD) is illustrated in Figure 6. Here, K MAX is the maximum number of iterations for the ER algorithm, k is the k th iteration, and K cycle indicates that KSVD is executed every K cycle iterations for K MAX iterations. e ER + KSVD algorithm turns to the red path every K cycle iterations for inpainting pixels in the object g ER , which is obtained by passing the estimated object g k ′ through the activation OC ER and hence has missing pixels. Taking the object g ER as an example set g ER ≡ Y � [y 1 , y 2 , . . . , y N ], the small patch y i can be approximately represented as linear combinations of a few atoms, y i ≈ Da i . Assembling all small patches y i corresponding to their sparse coefficient vectors a i , for i � 1, . . . , N, the missing pixels in g ER can be retrieved by the multiplication DA. Coefficients of sparse representations are summarized in the matrix A, and the dictionary matrix D is learned from the given example set Y at each iteration. erefore, a well feature set of the object, represented as g k � DA, is then transformed back to the Fourier domain for the next PR iteration. It is noted that only the pixels within the known support were recovered by KSVD. Zero pixels were preserved outside the support. e other iteration steps were kept the same as the ER algorithm, and we repeated equations (3a)-(3d) and then periodically carried out the aforementioned KSVD until reaching a satisfactory minimal RSE or the given maximal number of PR iterations. e mechanism of passing through KSVD can be viewed as an additional perturbation of the magnitude in the spatial domain, and its mathematical expression is as follows: where Δg k (x) is a recovered missing pixel as well as denoised pixel by means of the KSVD. e proposed method with spatial magnitude perturbation could be effective to make the PR algorithm avoid converging to a local optimum. Similarly, we employed the HIO + KSVD algorithm to examine its effectiveness in the PR problem. e activation OC ER was introduced in every K cycle iterations, and then, the violating pixels were equivalently defined as the missing pixels before executing the KSVD algorithm, as illustrated in Figure 6.

International Journal of Optics
Again, we connected the proposed KSVD scheme to the traditional PR iteration of equations (3a), (3b), (5), and (3d) until reaching a satisfactory minimal RSE or the given maximal number of PR iterations. e only difference between the ER + KSVD and HIO + KSVD algorithms is the activation (OC ER /OC HIO ) in the object constraints during the PR iteration. Since the KSVD allows the content of oversampled dictionary matrix D with sparse representation to adaptively fit a given set of object examples, the proposed method can be generalized to be implemented with many PR state-of-the-arts.

Numerical Simulations
In this section, we compared the performance of the five PR algorithms: ER, HIO, ER + HIO, ER + KSVD, and  Figure 6: Illustration of the process to perform the ER + KSVD and HIO + KSVD by using different activations (OC ER /OC HIO ) in the object constraints, as depicted in the blue block. e yellow block diagram is the core of our proposed method which combines KSVD with the classical PR algorithms to render finer features of the object for the next PR iteration. e parameters K MAX , k, and K cycle are represented by the maximum number of PR iterations, k th iteration, and iteration cycle for executing KSVD, respectively. Here, the same initial Fourier phases are given and the activation OC ER on the object constraints is selected during PR iteration to fairly examine the performance improvement when applying the image inpainting technique. e estimated object g k from our proposed method (ER + KSVD) shows that its peak signal-to-noise ratio (PSNR) is higher than using the classical ER method.
HIO + KSVD, and demonstrated the effectiveness of image inpainting technique during the PR iteration. All the cases discussed here are a single measurement problem, i.e., only the Fourier magnitude and real, nonnegative, and finitely supported object constraints are available. A well-known objective image quality metric, peak signal-to-noise ratio (PSNR), was utilized to evaluate the performance of PR algorithms between the reconstructed image and ground truth.
In order to retrieve phase information in the Fourier domain, the oversampling condition should be fulfilled [1,15,17], as shown in Figure 7(b). e sampling rate, defined as σ � total pixel number pixel number inside the support is to be at least the Nyquist rate. Figure 7(a) shows that the original object image size was zero-padded (σ � 2) to avoid aliasing effects. Under this condition, the support of the object image must be well known to reconstruct the object from the oversampled diffraction pattern in the Fourier domain. In order to quickly examine the proposed method, here we predefined locations of the object image's boundary edges to get the tight support region, as indicated by the red rectangle in Figure 7(a). One can also use the autocorrelation of the object to get a support region. is finite support is finally used in the object constraint to force the estimated object to zero when it locates outside the support region during the PR iteration. Given the oversampled diffraction pattern (σ ≥ 2) in the Fourier domain and the object constraints (real, nonnegative, and finitely supported) in the spatial domain, the simulation results of four test cases through the five PR algorithms are shown in Figure 8. e maximal number of PR iterations (K MAX � 800) was set as a stopping criterion for the five PR algorithms, and KSVD is executed every 160 iterations (K cycle � 160). In the ER + HIO algorithm, ER and HIO alternatively execute 200 iterations in each cycle until the maximum number of PR iterations is reached. Note that the five PR algorithms were given the same initial Fourier phase at the baseline. With a decent initial phase, the classical ER or HIO algorithm can provide well appearance of the object at early stage, and the proposed methods (ER + KSVD and HIO + KSVD) subject to this reconstructed object (after 160 iterations in ER or HIO) can further improve image quality with additional KSVD scheme. Nonetheless, the results with ER + KSVD are likely to fail, as shown in the third and fourth rows in Figure 8. No matter how we increased the iteration number, the result with ER + KSVD is still poor. e reason lies in that the ER + KSVD highly depends on the ER at earlier stages. If ER could not provide finer features of the object before executing KSVD, KSVD dictionary cannot self-learn from such poor data sample. In order to overcome this issue, we can impose another initial random guess. Likewise, the performance of HIO + KSVD depends on the HIO, which provides the prominent feature set of the object at earlier stages. e measured diffraction pattern can be set as the ground truth, and we examined the RSE evolution during the PR iteration, as shown in Figure 9. Under the test objects "NCTU", RSEs with KSVD-based have periodical spikes, whereas the perturbation enables the algorithm to jump out of the stagnant local minima. e results agree with the aforementioned statement that the KSVD can further improve the image quality with moderate initial conditions. To further manifest the advantage of the KSVD-based scheme, we applied another metric, structural similarity index (SSIM), to examine the reconstructed image quality. As shown in Figure 9(b), SSIM computes the structural similarity index for the estimated Fourier magnitude |G k (u)| using the measured Fourier magnitude |F(u)| as the reference image. Since KSVD was developed to attain the image features through a self-learning scheme, it is no surprise that SSIM is significantly increased as KSVD was executed at each iteration cycle.
Another advantage of the proposed KSVD-based scheme lies in its robustness against the noise. When noise was added to contaminate the diffraction intensity, the noise redistributed in the spatial domain would lead to reconstruction failure [17]. To demonstrate the KSVD-based method is superior to the classical PR algorithms under the noise case, we refer to the noise level defined in Rodriguez's work: where |F noise-free (u)| represents the noise-free Fourier magnitudes and |F noise (u)| the Fourier magnitudes with noise [15]. In this study, Poisson noise is added to the diffraction intensity with about R noise � 15%. Various reconstruction results obtained from the noisy diffraction patterns are shown in Figure 10. All the five PR algorithms were given the same initial Fourier phase for fair comparison. e maximal number of PR iterations and the iteration cycle for executing KSVD are set to be K MAX � 1500 and K cycle � 100, respectively. In the ER + HIO algorithm, ER and HIO alternatively execute 150 iterations in each cycle until the maximum number of PR iterations is reached. According to PSNR of the reconstructed images, the additional KSVD function effectively improved the classical ER and HIO methods. At this moment, we were wondering which value of the iteration cycle K cycle is the best to periodically apply KSVD in our proposed methods. Given the same initial Fourier phases, reconstruction results by using the ER + KSVD and HIO + KSVD algorithms with different iteration cycles (from 16 to 160) are shown in Figure 11.
For the ER + KSVD method, no significant improvement was observed under different iteration cycles. On the other hand, it is obvious that the best image quality is obtained when K cycle � 16 in the HIO + KSVD method. e reason lies in that HIO has higher reconstruction performance than ER; moreover, the estimated objects violating object constraints should be fixed by KSVD immediately when they were recognized at the early stage. In this work, the intrinsic International Journal of Optics object was restored after several times of the classical PR iteration rather than alternating the PR and KSVD algorithms (K cycle � 1) [20].
e reason for such two-step procedure lies in that the KSVD is time-consuming, and we suggest to allow HIO/ER iteration as the coarse feature reconstruction, whereas KSVD deals with the fine feature set accordingly. Figure 12 shows the final RSEs (corresponding to reconstructed objects in Figure 11) under different iteration cycles. In each case, the ER + KSVD and HIO + KSVD    algorithms reach the maximum number of PR iterations (K MAX � 800). Note that since the initial Fourier phase is randomly generated, the reconstruction results of the portrait from the ER + KSVD and HIO + KSVD algorithms using the same iteration cycle (K cycle � 160) in Figures 8 and 11 have different reconstructed image quality. Besides, executing the KSVD algorithm is timeconsuming, so we compare the computation time of the proposed methods for different iteration cycles to trade off the speed and reconstructed image quality. As shown in Figure 12(a), the ER + KSVD method using different iteration cycles unfortunately has no explicit varying trend of RSE (monotonically increasing or decreasing) to determine the best iteration cycle K cycle . is result makes sense because the reconstructed image quality from ER + KSVD has no significant improvement. In contrast,  Figure 11: Comparison of reconstructed objects with different iteration cycles (K cycle � 16, ·32, ·64, ·96, ·128, ·and · 160). e total number of PR iterations was set to K MAX � 800. Here, the objects were reconstructed from the noise-free diffraction intensity and all of the cases were under the same initial Fourier phases.  Figure 12: Based on the reconstruction results in Figure 11, the performance of the proposed methods with different iteration cycles K cycle is illustrated. Computation time is normalized to the time it takes to use K cycle � 64. (a) ER + KSVD and (b) HIO + KSVD.
the HIO + KSVD method has a performance plateau which is reached at about K cycle � 70.

Conclusion
In this paper, we presented a new framework with ER + KSVD and HIO + KSVD to improve the classical ER and HIO algorithms for a single measurement PR problem. e core concept lies in that the estimated object is periodically processed through the activation OC ER , whereas the pixels violating object constraint were defined as the missing pixels. ese missing pixels were then inpainted through the KSVD method, which is a sparse dictionary learning-based approach for image inpainting. e proposed two-step procedure lies in that the KSVD is time-consuming, and we suggest to allow HIO/ER iteration as the coarse feature reconstruction, whereas KSVD deals with the fine feature set accordingly. Since the KSVD allows the content of oversampled dictionary matrix with sparse representation to adaptively fit a given set of object examples, as long as the ER/HIO algorithms provide decent object estimation at early stage, the recovered missing pixels through KSVD can be restored with finer features of the object information. In this paper, we examined the effectiveness of the proposed scheme with more than 20 different and independent initial Fourier phases. e numerical analyses exhibited superior PR capability over the traditional ER or HIO algorithm alone. ere were two possibilities for future work: one is to apply the state-of-the-art image inpainting techniques upon the PR process and extract more object features. e other one is with complex-valued objects, which would change the object constraint from this work (real, nonnegative, finitely supported) to a more complete discussion.

Data Availability
e data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest
e authors declare that they have no conflicts of interest.