Reduced-Dimensional Capture of High-Dynamic Range Images with Compressive Sensing

The range of light illumination in real scenes is very large, and ordinary cameras can record only a small part of it, far less than the range of illumination the human eye can perceive. High-dynamic range (HDR) imaging technology, which has emerged in recent years, can record a wider range of illumination than the human eye perceives. However, the current mainstream HDR imaging technique captures multiple low-dynamic range (LDR) images of the same scene with different exposures and then merges them into one HDR image, which greatly increases the amount of data captured. The advent of the single-pixel camera (a compressive imaging system) has demonstrated the feasibility of acquiring and restoring image data with compressive sensing. Therefore, this paper proposes a method for reduced-dimensional capture of high-dynamic range images with compressive sensing, which includes algorithms for the front end (capturing) and the back end (processing). At the front end, a K-SVD dictionary is used to compressively sense the input multiple-exposure image sequence, thereby reducing the amount of data transmitted to the back end. At the back end, the Orthogonal Matching Pursuit (OMP) algorithm is used to reconstruct the input multiple-exposure image sequence, and a proposed low-rank PatchMatch algorithm merges the reconstructed sequence into an HDR image. Simulation results show that, while reducing the complexity of the front-end equipment and the amount of communication data between the front end and the back end, the overall system achieves a good balance between the amount of computation and the quality of the resulting HDR image.


Introduction
With the development of the mobile Internet and Internet of Things (IoT) technology, devices with cameras are becoming more common, such as smartphones, network surveillance cameras, laptop computers, autonomous vehicles, and traffic monitoring cameras; indeed, a camera is now an essential feature of smartphones and laptops. However, common cameras on the market can only capture low-dynamic range (LDR) images, i.e., they can record only a small part of the range of illuminance in a real scene. The dynamic range of a real scene perceptible to the human eye is as high as 10^8:1, but the dynamic range of the LDR images captured by these cameras is only 2^8:1 or 2^16:1, so LDR images cannot faithfully represent the real scene. To solve this problem, high-dynamic range (HDR) imaging has been proposed; it can capture a wider range of illumination than the human eye perceives. There are two ways to obtain HDR images: hardware and software. The hardware approach directly captures HDR images by increasing the dynamic range of the sensor, but the gain is very limited and the hardware is expensive [1]. Therefore, the software approach is currently the main method, i.e., fusing multiple-exposure LDR images (hereinafter called an image sequence or sequence) to obtain HDR images. Fusion methods can be further divided into two categories: one recovers the Camera Response Function (CRF) and then reconstructs the HDR radiance map [2]; the other directly fuses the multiple-exposure sequence at the pixel level. Both categories must consider all pixels of the multiple-exposure image sequence, increasing computational complexity and storage requirements. During transmission and storage, the images are then compressed and transformed to remove redundancy and extract the required information.
This approach of compressing only after full sampling results in sampling redundancy, excessive storage space, and increased transmission cost.
Compressive sensing (CS, also called compressed sensing) [3-5] can solve this problem by compressing the signal while sampling it. CS breaks through the limitation of the traditional Shannon sampling theorem and can reconstruct incomplete signals with high probability at a rate far below the Nyquist sampling rate. Rice University developed a single-pixel camera based on compressive sensing theory [6]: by replacing the CCD or CMOS sensor with a digital micromirror device (DMD) and a single photodetector, it needs to sample the image fewer times than the number of pixels. Its appearance confirms the feasibility of applying compressive sensing to imaging systems. Therefore, this paper proposes a method for reduced-dimensional capture of high-dynamic range images with compressive sensing, which includes algorithms for the front end (capturing) and the back end (processing). At the front end, a K-SVD dictionary is used to compressively sense the input multiple-exposure image sequence, thereby reducing the amount of data transmitted to the back end. At the back end, the Orthogonal Matching Pursuit (OMP) algorithm is used to reconstruct the input multiple-exposure image sequence, and a proposed low-rank PatchMatch algorithm merges the reconstructed sequence into an HDR image. Simulation results show that the overall system achieves a good balance between the amount of computation and the quality of the HDR image while reducing the complexity of the front-end device and the amount of communication data between the front end and the back end. The proposed method thus uses compressive sensing to reduce the computational complexity and data volume of HDR capture front-end devices, lowering the cost of the entire system.

Reduced-Dimensional Capture and Reconstruction of Multiple-Exposure Image Sequences.
Since compressive-sensing cameras are not yet common, we assume that the front end uses an ordinary camera to capture a series of LDR images with different exposures. This assumption makes the system not only easier to implement but also easier to compare with other methods. Every image in the sequence is resampled using compressive sensing, which comprises sparse representation of signals, design of the measurement matrix, and design of the signal reconstruction algorithm [7].

Sparse Representation of Images with Overcomplete Dictionaries.
Signals are generally not sparse in practical applications, but when a suitable basis is used to represent them, they become sparse or compressible [8], i.e., the number of nonzero elements is small, which helps reduce the sampling rate. Sparse representation has matured in many fields, such as compression, regularization in inverse problems, and feature extraction [9]. Sparsity is the premise of compressive sensing: the signal must be sparse either in itself or after some transformation, for example, after a Fourier or discrete wavelet transform [7]. That is, a nonsparse signal is represented by a linear combination of a few atoms from a fixed dictionary (such as a DCT, wavelet, Haar, or Gabor dictionary). A fixed dictionary has a simple structure and simple computation, but it applies only to a limited range of signals, and its sparse representation is not guaranteed to be optimal, i.e., the sparsity of the representation cannot be guaranteed. To best suit a set of given signals, we can instead train an overcomplete dictionary on those signals. The K-SVD method [9] iterates between sparse coding and dictionary update to optimize the sparse representation of the signals over a given training set. K-SVD can be regarded as a generalized form of K-means clustering; the essential difference is that each signal may use a different set of atoms.
Blocking each image in the multiple-exposure image sequence further reduces storage; the block size is usually 8 × 8, 16 × 16, 32 × 32, and so on [10]. Each of the M images of size r × c is divided into b × b blocks (if the image size is not an integer multiple of the block size, the image is padded with zeros), and each block is stacked into a column vector y_i of length n = b². N = r × c × M / b² is the number of vectors generated from the image sequence, and Y ∈ R^{n×N} is the matrix of column vectors y_i. The dictionary D^(J) ∈ R^{n×K} is made up of atom vectors d_k, where K is the total number of atoms in D^(J) and the superscript (J) is the iteration number. X ∈ R^{K×N} is the sparse representation of Y under dictionary D and is made up of row vectors x_T^i, i = 1, 2, ..., K, where the subscript T indicates a row vector and a superscript T denotes matrix transpose. Equation (1) is the objective function of K-SVD, where x_i is the ith column of X and T_0 is the predetermined number of nonzero elements in each x_i:

min_{D,X} ‖Y − DX‖_F²  subject to  ‖x_i‖_0 ≤ T_0 for all i.  (1)

Each K-SVD iteration alternates two stages: sparse coding, which solves for X with a pursuit algorithm given the current dictionary, and dictionary update, which refines one atom d_k at a time. For atom d_k, define the group of indices of signals y_i that use d_k, and compute the representation error matrix E_k, i.e., the error for all input signals when the kth atom is removed. Restricting E_k to the signals that use d_k gives E_k^R; the details of E_k^R and its SVD decomposition can be found in [9].
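The two K-SVD stages described above can be sketched in a few lines. The following is a minimal illustration, not the paper's MATLAB implementation: sparse coding uses a simple greedy pursuit, the dictionary update replaces each used atom with the leading singular vectors of the restricted error matrix, and the sizes and toy data are our own assumptions for the demo.

```python
import numpy as np

def greedy_code(D, y, T0):
    """Simple greedy pursuit: pick T0 atoms of D that best explain y."""
    r, idx = y.copy(), []
    x = np.zeros(D.shape[1])
    for _ in range(T0):
        idx.append(int(np.argmax(np.abs(D.T @ r))))
        coef, *_ = np.linalg.lstsq(D[:, idx], y, rcond=None)
        r = y - D[:, idx] @ coef
    x[idx] = coef
    return x

def ksvd_iteration(Y, D, T0):
    """One K-SVD iteration: sparse coding stage, then per-atom update."""
    X = np.column_stack([greedy_code(D, y, T0) for y in Y.T])
    for k in range(D.shape[1]):
        omega = np.nonzero(X[k, :])[0]                 # signals using atom k
        if omega.size == 0:
            continue                                   # unused atom: skip
        E_k = Y - D @ X + np.outer(D[:, k], X[k, :])   # error with atom k removed
        E_k_R = E_k[:, omega]                          # restricted error matrix
        U, s, Vt = np.linalg.svd(E_k_R, full_matrices=False)
        D[:, k] = U[:, 0]                              # new atom: top left singular vector
        X[k, omega] = s[0] * Vt[0, :]                  # new coefficients for that atom
    return D, X

# Toy run on random "blocks": n = 64 (8x8 blocks), K = 128 atoms, T0 = 5.
rng = np.random.default_rng(0)
Y = rng.standard_normal((64, 200))
D = rng.standard_normal((64, 128))
D /= np.linalg.norm(D, axis=0)
for _ in range(3):
    D, X = ksvd_iteration(Y, D, T0=5)
err = np.linalg.norm(Y - D @ X)
```

Note how the update stage keeps every column of X at most T_0-sparse: only the coefficients of signals already using atom k are rewritten, which is what distinguishes K-SVD from a plain least-squares dictionary refit.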

Measurement Matrix Design.
After the signal is sparsely represented, a suitable measurement matrix Φ ∈ R^{M×n} (M < n) is needed to compressively sense the signal. The design principle is that the sensing matrix Θ = ΦD should satisfy the Restricted Isometry Property (RIP) [11, 12] to ensure a one-to-one mapping from the original space to the sparse space. The compressive sensing of signal y_i is shown in (2), where z_i ∈ R^{M×1} is the compressed sample of y_i:

z_i = Φ y_i = Φ D x_i = Θ x_i.  (2)

When Φ is a Gaussian random matrix, the sensing matrix Θ satisfies the RIP with high probability [13]. The advantage of a Gaussian measurement matrix is that it is incoherent with almost any sparse basis, so it requires very few measurements. Therefore, we use a Gaussian random matrix as the measurement matrix.
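Concretely, the measurement step is a single matrix multiply per block vector. The sketch below uses illustrative sizes (not the paper's settings) and a random stand-in for the learned dictionary; the 1/sqrt(M) scaling of Φ is a common convention, not prescribed by the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
n, K, M = 64, 128, 32                   # block length, atoms, measurements (M < n)

# Stand-in for a learned K-SVD dictionary with unit-norm atoms.
D = rng.standard_normal((n, K))
D /= np.linalg.norm(D, axis=0)

# Gaussian random measurement matrix.
Phi = rng.standard_normal((M, n)) / np.sqrt(M)
Theta = Phi @ D                         # sensing matrix Theta = Phi D

# A synthetic block vector that is exactly 5-sparse under D.
x = np.zeros(K)
x[rng.choice(K, 5, replace=False)] = rng.standard_normal(5)
y = D @ x                               # length-n signal
z = Phi @ y                             # compressed sample: M values instead of n
```

The front end only transmits z (here 32 values instead of 64 per block), which is the dimension reduction the method relies on.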

Reconstructing Image Sequence.
The reconstruction method is the core step of compressive sensing, and its quality determines the quality of the reconstructed image. Compressive sensing reconstruction methods fall into three main categories [14]. The first is greedy algorithms, such as orthogonal matching pursuit (OMP) [15], stagewise orthogonal matching pursuit (StOMP) [16], and regularized orthogonal matching pursuit (ROMP) [17], which approximate the signal by solving for a locally optimal solution in each iteration. The second is convex optimization algorithms, such as basis pursuit (BP) [18], the interior point method [19], the gradient projection method [20], and the iterative thresholding algorithm [21]; convex optimization achieves better reconstruction from a small number of samples but has higher computational complexity. The third is combinatorial optimization algorithms, which use group testing to reconstruct the signal; reconstruction is fast, but the scope of application is limited, as with HHS Pursuit [22]. In this paper, we use the OMP algorithm to reconstruct the image sequence: its performance is stable and its reconstruction accuracy is high, which ensures that the original signal is accurately recovered at a relatively low sampling rate.
Given the sensing matrix Θ = ΦD = (θ_1, θ_2, ..., θ_K) and the compressed sample z_i of signal y_i, the OMP algorithm estimates the sparse representation x_i of y_i. Then the signal y_i can be recovered by (3):

y_i = D x_i.  (3)

The idea behind OMP is to pick columns in a greedy fashion: at each iteration t, the column θ_t of Θ that is most strongly correlated with the current residual is chosen [15]. Figure 3 is the flowchart of the OMP algorithm. The input is the sensing matrix Θ and one of the compressed signals z = z_i, i = 1, 2, ..., N, from (2). After running OMP N times, we obtain the matrix X (the sparse representation of Y); Y can then be calculated column by column using (3), and the image sequence reconstructed from Y.
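A minimal Python rendering of the flowchart's steps might look like the following (illustrative only; the paper's simulations are in MATLAB, and the sizes and data here are our own assumptions). The property demonstrated is that the residual never grows as atoms are added, since each least-squares refit is over a strictly larger support.

```python
import numpy as np

def omp(Theta, z, T0, tol=1e-10):
    """OMP: at each step pick the atom most correlated with the residual,
    re-fit the coefficients by least squares, and update the residual."""
    K = Theta.shape[1]
    r, support = z.copy(), []
    x = np.zeros(K)
    for _ in range(T0):
        lam = int(np.argmax(np.abs(Theta.T @ r)))       # lambda_t = argmax_j |<r, theta_j>|
        support.append(lam)
        coef, *_ = np.linalg.lstsq(Theta[:, support], z, rcond=None)
        r = z - Theta[:, support] @ coef                # new residual
        if np.linalg.norm(r) < tol:
            break
    x[support] = coef
    return x

# Demo: compress an exactly sparse block, then reconstruct it.
rng = np.random.default_rng(2)
n, K, M, T0 = 64, 128, 48, 4
D = rng.standard_normal((n, K)); D /= np.linalg.norm(D, axis=0)
Phi = rng.standard_normal((M, n)) / np.sqrt(M)
Theta = Phi @ D

x_true = np.zeros(K)
x_true[rng.choice(K, T0, replace=False)] = rng.standard_normal(T0)
y = D @ x_true
z = Phi @ y

x_hat = omp(Theta, z, T0)
y_hat = D @ x_hat                 # recovery y = D x, as in (3)
```

In this noiseless, exactly sparse setting OMP typically identifies the true support and y_hat matches y to near machine precision, though that success is probabilistic in the measurement matrix rather than guaranteed.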

Low-Rank PatchMatch Algorithm.
During the capture of a multiple-exposure image sequence, camera shake or unpredictable moving objects in the scene are inevitable, causing artifacts or noise in the final fused HDR image. Currently, block-matching fusion is the main approach to eliminating these artifacts. Its essence is to find a mapping between two different images A and B (with block sets {PA} and {PB}, respectively): by computing correlations, find the nearest-neighbor field (NNF) of B so that the error between similar image blocks in the two images is minimized. By finding, for each block in {PB}, the closest block in {PA}, the artifacts in the fused image are reduced.
If block matching is performed by full search, the complexity is as high as O(mM²), where m and M are the size of the image and the size of the block, respectively. To reduce this, Connelly Barnes et al. [23, 24] proposed the fast PatchMatch algorithm with randomized nearest-neighbor search, reducing the complexity to O(mM log M). Its main steps are initialization, propagation, and random search. Owing to its efficiency and performance, PatchMatch has had a profound impact on image stitching, image completion, and image reshuffling.
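The three steps can be sketched compactly for grayscale arrays with a squared-difference patch cost. This is a didactic simplification of the algorithm in [23], not the authors' code; all names, sizes, and the shifted-copy demo are our own assumptions.

```python
import numpy as np

def patch_cost(A, B, ax, ay, bx, by, p):
    """Sum of squared differences between p x p patches at (ax, ay) and (bx, by)."""
    return float(np.sum((A[ay:ay + p, ax:ax + p] - B[by:by + p, bx:bx + p]) ** 2))

def patchmatch(A, B, p=5, iters=4, seed=0):
    """Approximate nearest-neighbor field from patches of A to patches of B."""
    rng = np.random.default_rng(seed)
    H, W = A.shape[0] - p + 1, A.shape[1] - p + 1
    Hb, Wb = B.shape[0] - p + 1, B.shape[1] - p + 1
    # 1) Random initialization of the NNF (stores (bx, by) per patch of A).
    nnf = np.stack([rng.integers(0, Wb, (H, W)), rng.integers(0, Hb, (H, W))], axis=-1)
    cost = np.array([[patch_cost(A, B, x, y, nnf[y, x, 0], nnf[y, x, 1], p)
                      for x in range(W)] for y in range(H)], dtype=float)
    for it in range(iters):
        step = 1 if it % 2 == 0 else -1          # alternate scan direction
        ys = range(H) if step == 1 else range(H - 1, -1, -1)
        for y in ys:
            xs = range(W) if step == 1 else range(W - 1, -1, -1)
            for x in xs:
                # 2) Propagation: shift an already-visited neighbor's match.
                for ny, nx in ((y, x - step), (y - step, x)):
                    if 0 <= ny < H and 0 <= nx < W:
                        bx = int(np.clip(nnf[ny, nx, 0] + (x - nx), 0, Wb - 1))
                        by = int(np.clip(nnf[ny, nx, 1] + (y - ny), 0, Hb - 1))
                        c = patch_cost(A, B, x, y, bx, by, p)
                        if c < cost[y, x]:
                            nnf[y, x], cost[y, x] = (bx, by), c
                # 3) Random search in an exponentially shrinking window.
                w = max(Wb, Hb)
                while w >= 1:
                    bx = int(np.clip(nnf[y, x, 0] + rng.integers(-w, w + 1), 0, Wb - 1))
                    by = int(np.clip(nnf[y, x, 1] + rng.integers(-w, w + 1), 0, Hb - 1))
                    c = patch_cost(A, B, x, y, bx, by, p)
                    if c < cost[y, x]:
                        nnf[y, x], cost[y, x] = (bx, by), c
                    w //= 2
    return nnf, cost

# Demo: B is a shifted copy of A, so good matches exist for most patches.
rng = np.random.default_rng(1)
A = rng.random((24, 24))
B = np.roll(A, (2, 3), axis=(0, 1))
nnf, cost = patchmatch(A, B)
```

Every update only replaces a match when the cost strictly decreases, which is why the per-patch costs are monotonically non-increasing over iterations; the speedup comes from propagation exploiting the coherence of natural images.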
In practice, there are generally more than two multiple-exposure LDR images of the same scene to fuse into an HDR image, so Pradeep Sen et al. [25] proposed multisource bidirectional similarity (MBDS), shown in (4), where S denotes the source images, T is the target image, and N is the number of source images. P and Q are patches in the sources and in T, respectively. ω_k(P) weights the source patches in the completeness term according to how well exposed they are: a well-exposed patch receives a large weight, and vice versa. d(·) is a distance metric, usually the l_2 norm, and |T| is the total number of patches in the target image. The formula combines the completeness of the mapping from the sources to T with the coherence of the mapping from T back to the sources. Because MBDS selects well-exposed blocks from the image sequence to fill the registered image, it achieves better registration results.

From the perspective of low-rank matrix recovery, combined with the idea of MBDS, this paper proposes an improved algorithm for removing artifacts from HDR images. The objective function is shown in (5). The input sequence has N images I_i, i = 1, ..., N, and I_ref is the reference image selected from the sequence. L_i, i = 1, ..., N, is the result of aligning I_i to the reference image I_ref; that is, the content of L_i is aligned with the reference image while the exposure parameters remain those of I_i. Function g_i(I_j) maps exposure parameter i to exposure parameter j, function h(·) maps the grayscale domain of LDR to the radiance domain of HDR, and function vec(·) turns a two-dimensional image into a column vector. The MBDS subproblem is solved to obtain L_i; more details can be found in [25]. In addition, the added low-rank constraint ensures that the aligned images have sufficiently low rank, i.e., that they remain linearly correlated in brightness.
The solution divides the problem into two local optimization subproblems: MBDS and low-rank matrix recovery. The MBDS subproblem is solved iteratively at multiple resolution scales to find its optimum. The low-rank matrix finally obtained is the target HDR image, with high dynamic range and brightness linear in the scene radiance. The process is shown in Figure 4.
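The low-rank recovery subproblem is typically attacked with singular value thresholding (SVT), the proximal operator of the nuclear norm. The snippet below is a generic sketch of that building block, not the paper's exact solver; the data, sizes, and threshold are our own assumptions for the demo.

```python
import numpy as np

def svt(Mtx, tau):
    """Singular value thresholding: shrink every singular value by tau.
    This minimizes tau * ||L||_* + 0.5 * ||L - Mtx||_F^2 over L."""
    U, s, Vt = np.linalg.svd(Mtx, full_matrices=False)
    s = np.maximum(s - tau, 0.0)
    return U @ (s[:, None] * Vt)

# Demo: a rank-2 matrix corrupted by sparse noise; one SVT pass pulls
# the estimate back toward a low-rank matrix.
rng = np.random.default_rng(3)
L0 = rng.standard_normal((60, 2)) @ rng.standard_normal((2, 40))  # rank 2
mask = rng.random((60, 40)) < 0.1                                 # sparse support
S = np.where(mask, rng.uniform(-5, 5, (60, 40)), 0.0)             # sparse noise
X = L0 + S
tau = 5.0
L_hat = svt(X, tau)
```

In an alternating scheme like the one in Figure 4, a step of this kind would be interleaved with the MBDS alignment step until both stabilize.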

Results and Discussion
This section analyzes the convergence of the low-rank PatchMatch algorithm, simulates the multiple-exposure image compressive sensing and reconstruction algorithm and the anti-artifact fusion algorithm, and evaluates the algorithms by subjective and objective criteria.

Convergence of the Low-Rank PatchMatch Algorithm.
Randomly generate data matrices with rank r and size 1000 × 500, and add sparse noise with noise ratio p. Convergence is validated with two sets of experiments. In the first set, the matrix rank r is fixed to 1, and convergence is observed under different noise ratios p. In the second set, the noise ratio p is fixed (to 0.2 in the experiment), and convergence is observed under different ranks r. The results are shown in Figure 5; in both cases, the low-rank PatchMatch algorithm converges within 5 iterations.
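Synthetic test data of this kind can be generated as follows (smaller sizes here for speed; the experiment in the text uses 1000 × 500, and the noise amplitude is our own assumption):

```python
import numpy as np

def make_test_matrix(m, n, r, p, seed=0):
    """Rank-r data matrix plus sparse noise on roughly a fraction p of entries."""
    rng = np.random.default_rng(seed)
    L = rng.standard_normal((m, r)) @ rng.standard_normal((r, n))   # exact rank r
    mask = rng.random((m, n)) < p                                   # noise support
    S = np.where(mask, rng.uniform(-10, 10, (m, n)), 0.0)           # sparse noise
    return L, S, mask

# First experimental setting: rank fixed to 1, noise ratio p = 0.2.
L, S, mask = make_test_matrix(200, 100, r=1, p=0.2)
X = L + S   # observed matrix handed to the low-rank recovery step
```

Convergence is then judged by how quickly the recovered low-rank component stops changing (or its error against L stops decreasing) across iterations.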

Image Evaluation Criteria.
In this paper, we use the mean squared error (MSE), peak signal-to-noise ratio (PSNR), information entropy, average gradient, and running time to objectively evaluate image quality and the algorithms. For two m × n images f and g, the MSE is

MSE = (1 / (m n)) Σ_{i=1}^{m} Σ_{j=1}^{n} (f(i, j) − g(i, j))² = (1 / (m n)) ‖f − g‖₂².

The larger the MSE, the more the reconstructed image deviates from the original. PSNR is a commonly used criterion for evaluating reconstructed image quality. Based on the MSE, it is defined as

PSNR = 10 × log₁₀ (MAX_I² / MSE),

where MAX_I is the maximum pixel value of the image, for example, 255 for an 8-bit grayscale image. The larger the PSNR, the lower the distortion of the image. Information entropy represents the average information content of an image, i.e., the average information that remains after redundancy is removed.
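The two pixel-based criteria are straightforward to compute; the helper names below are our own, and the worked example uses a tiny 4 × 4 image so the numbers can be checked by hand.

```python
import numpy as np

def mse(f, g):
    """Mean squared error between two equal-size grayscale images."""
    f = f.astype(np.float64); g = g.astype(np.float64)
    return float(np.mean((f - g) ** 2))

def psnr(f, g, max_i=255.0):
    """PSNR = 10 log10(MAX_I^2 / MSE); larger means less distortion."""
    m = mse(f, g)
    return float("inf") if m == 0 else 10.0 * np.log10(max_i ** 2 / m)

f = np.full((4, 4), 100, dtype=np.uint8)
g = f.copy(); g[0, 0] = 110            # one pixel off by 10
# MSE = 100 / 16 = 6.25; PSNR = 10 log10(255^2 / 6.25), roughly 40.2 dB
```

Casting to float before subtracting matters: differencing uint8 arrays directly would wrap around for negative differences and silently corrupt the MSE.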
For the fusion of multiple-exposure images of complex scenes with moving objects, the evaluation of deghosting needs further study. At present, there is no mature objective criterion for evaluating the deghosting of HDR images, so this paper uses the deghosting evaluation method proposed by Karaduzovic-Hadziabdic et al. [26], with test image sets drawn from complex real scenes.

Simulation of Compressive Sensing and Reconstruction for Multiple-Exposure Images.
The simulation platform is MATLAB 2015b on a machine with 32 GB of memory and an Intel Core i5-6600K processor (3.5 GHz base frequency). Airplane and Lena, with an image size of 512 × 512, were selected for simulation, and the results were compared with the BP, OMP, and StOMP algorithms at low (R = 0.3), medium (R = 0.5), and high (R = 0.7) sampling rates. The simulation results are shown in Table 1.
Among the three major categories of compressive sensing reconstruction algorithms, convex optimization performs best but also has the highest complexity and the longest reconstruction time. As a representative convex optimization algorithm, BP reconstructs better than the greedy and combinatorial optimization algorithms. At sampling rates of 0.3 and 0.5, our algorithm outperforms the BP algorithm. At a sampling rate of 0.7, although the PSNR of our algorithm is slightly lower than BP's, it is higher than that of the other algorithms. In addition, the reconstruction time of our algorithm is shorter than that of BP except at the low sampling rate (R = 0.3).
Figure 6 shows the simulation results at a sampling rate of 0.5. Subjectively, for the letter area on the fuselage and wings in the Airplane image, both the BP algorithm and ours recover clear letters, while the letters recovered by the OMP and StOMP algorithms are blurred. The images recovered by BP, OMP, and StOMP all show obvious noise, with the particle noise of StOMP being the most obvious; our algorithm essentially restores the image information correctly. For the Lena image, the differences are subjectively less obvious than for the Airplane image, but the images recovered by BP, OMP, and StOMP again show varying degrees of noise, with StOMP's particle noise the most obvious.
Among these algorithms, StOMP has the shortest reconstruction time but also the worst reconstruction quality, with very obvious particle noise; because of this noise, its average gradient is higher than that of the other algorithms. The average gradient of our algorithm is similar to that of OMP and better than that of BP. In terms of MSE, the BP algorithm is the smallest and ours is second; information entropy behaves similarly to MSE.
The multiple-exposure image sequences were compressively sampled, reconstructed, and then fused into HDR images. The results are compared with robust principal component analysis (RPCA) [27], the partial sum of singular values (PSSV) method [28, 29], the CRF-based method of [2], the MBDS algorithm of Pradeep Sen et al. [25] (referred to as SEN), the brightness and texture consistency deghosting method of Jun Hu et al. [30] (referred to as HU), and the low-rank-recovery-based deghosting fusion of Tae-Hyun Oh et al. [31]. Figure 7 shows the Arch sequence, which contains moving people; direct fusion produces artifacts. Reference [2] and RPCA cannot suppress the artifacts, while our algorithm and the PSSV algorithm both suppress them well and have better subjective visual quality. Figure 8 shows the results for the Puppet sequence. Our algorithm adds low-rank constraints to minimize the impact of misaligned regions and keep the resulting image as linear as possible, and the results show that it performs better. Figure 9 shows the results for the Sculpture Garden sequence, where many pedestrians make artifact removal difficult. Among the fusion results, [31] is the worst, there are obvious artifacts in [6], and the SEN method shows blurring; both HU and our algorithm suppress artifacts well, but due to image blocking, a block effect exists in the fusion result. Figure 10 shows local details of the fusion results in Figure 9; because the result of [31] is the worst, its detail is not enlarged. The method of [6] is less effective at removing artifacts, showing obvious silhouette crossing, and the SEN method shows obvious blurring at pedestrian edges. HU and our algorithm achieve better results, but our algorithm has noise caused by block effects.

Conclusions
Aiming at the problems of traditional cameras, namely redundant sampling, large storage consumption, and the inability to fully record scene radiance due to the limited dynamic range of the sensor, this article uses a K-SVD dictionary to compressively sense LDR images of different exposures of the same scene. The LDR images are then reconstructed and fused with the low-rank PatchMatch algorithm to obtain an HDR image. The simulation results show that the method can effectively reduce the sampling rate and remove the artifacts and blurring caused by camera shake and object motion in the scene, providing a compressive-sensing route to obtaining HDR images.
However, with the introduction of block compressive sensing, the size of the image block becomes a factor that cannot be ignored. Simulation results show that when the block is small, the block effect is more obvious and edge details are distorted, while increasing the block size increases storage space and computational complexity. In addition, adding compressive sensing and dictionary learning before fusion increases computation time, trading time for reductions in complexity and sampling rate. Therefore, the next step is to perform pixel-level fusion in the compressive sensing domain of the HDR image to further reduce the time required and improve the quality of the fused image.

Data Availability
No data were used to support this study.

Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.