Weighted Least Squares Based Detail Enhanced Exposure Fusion

Many recent computational photography techniques play a significant role in overcoming the limitation of standard digital cameras in handling the wide dynamic range of real-world scenes, which contain both brightly and poorly illuminated areas. In many of these techniques, it is desirable to fuse details from images captured at different exposure settings while avoiding visual artifacts. In this paper we propose a novel exposure fusion technique in which the Weighted Least Squares (WLS) optimization framework is utilized for weight map refinement. Computationally simple texture features (i.e., a detail layer extracted with the help of an edge-preserving filter) and a color saturation measure are used to quickly generate weight maps that control the contribution from each image of an input multiexposure set. Instead of employing intermediate High Dynamic Range (HDR) reconstruction and tone mapping steps, a well-exposed fused image is generated directly for display on conventional devices. A further advantage of the present technique is that it is well suited for multifocus image fusion. Simulation results are compared with a number of existing single-resolution and multiresolution techniques to show the benefits of the proposed scheme for a variety of cases.


Introduction
In recent years several new techniques have been developed that are capable of providing a precise representation of the complete information in the shadows and highlights of real-world natural scenes. The direct 8-bit gray and 24-bit RGB representation of visual data captured by a standard digital camera at a single exposure setting often causes loss of information, because the dynamic range of most scenes is beyond what such cameras can capture. Such a representation is referred to as a low dynamic range (LDR) image. Digital cameras offer the aperture setting, exposure time, and ISO value to regulate the amount of light captured by the sensor. It is therefore important to determine an exposure setting that controls the charge capacity of the Charge Coupled Device (CCD). In modern digital cameras, Auto Exposure Bracketing (AEB) allows us to take all the images without touching the camera between exposures, provided the camera is on a tripod and a cable release is used. Handling the camera between exposures increases the chance of misalignment, resulting in an image that is not sharp or that exhibits ghosting. Most scenes can be perfectly captured with nine exposures [1], and many are within reach of a camera that allows 5-7 exposures to be bracketed. When the scene's dynamic range exceeds the dynamic range (DR) of the camera, it is the exposure setting that determines which part of the scene will be optimally exposed in the photographed image. The DR of a digital camera is typically defined as the charge capacity divided by the noise [1, 2]. At a single exposure setting, either the detail in poorly illuminated areas (i.e., shadows) is visible with a long exposure or that in brightly illuminated areas (i.e., highlights) with a short exposure (see Figure 1). Thus, an image captured by a standard digital camera at a single exposure setting from a scene containing both highlights and shadows is partially over- or underexposed. As a result, there will always be a need to capture the detail of the entire scene with a sufficient number and range of exposures. The process of collecting the complete luminance variations of a scene in rapid succession at different exposure settings is known as exposure bracketing.
In principle, there are two major approaches to handling the incapability of existing image capturing devices. The first approach is HDR [3-7] reconstruction from multiexposure images, which recovers a full dynamic range of up to 8 orders of magnitude and later tone maps these images to adjust their tonal range, to some extent, for depiction on typical display devices. HDR imaging is called a scene-referred representation because it represents the original captured scene values as closely as possible. Such a representation is sometimes referred to as an extrasensory data representation. One important application of HDR capture for security is recording video at the entrances of buildings [1]. Conventional cameras are not able to faithfully capture the interior and exterior of a building simultaneously, while an HDR camera, which is based on a two-phase workflow, can simultaneously record indoor as well as outdoor activities. Other important applications of HDR representation are satellite, scientific, and medical imagery, in which data is analyzed and visualized to record more than what is visible to the naked eye. On the other hand, because of their limited contrast ratio, standard displays (LCD, CRT) and printers are unable to reproduce the full dynamic range captured by HDR devices. In such cases, HDR data needs to be remapped [7] to a lower precision for display on conventional devices. Tone mapping algorithms can be either spatially variant or spatially invariant. In particular, spatially variant methods (also called local operators) [7-10] exploit the local adaptation properties of the human visual system (HVS), while spatially invariant methods (also called global operators) [11-13] exploit its global adaptation.
Higher bit depths are usually not used because display devices would not be able to reproduce such images at levels that are practical for human viewing [1]. Although for some real-world scenes a low bit depth is sufficient to capture all detail, there are countless situations that are not accommodated by it. Recently, Sunnybrook Technologies, BrightSide, and Dolby prototypes of HDR display devices have been proposed [1, 14, 15] that can display HDR data directly; until such devices are commonplace, however, conventional displays and printers will introduce inconsistencies responsible for loss of detail in the output. To avoid these inconsistencies, we must use tone mapping operators [7-13] to prepare HDR imagery for display on LDR devices. Alternatively, we may directly generate an 8-bit low dynamic range (LDR) image that looks like a tone-mapped image [1].
The second approach is to combine the multiexposure images directly into a single 8-bit LDR image that does not contain underexposed and overexposed regions [18, 26]. It thus provides a convenient and consistent way of preserving details in both brightly and poorly illuminated areas by skipping the construction of an HDR image and the use of tone mapping operators [7-13]. This notion of combining multiple exposures without the typical HDR and tone mapping steps is known as "exposure fusion," as shown in Figure 1. The fundamental goal of exposure fusion is to improve the chance of creating a realistic scene without an HDRI representation and a tone mapping step. The underlying idea of various exposure fusion approaches [18, 26] is the utilization of different local measures for generating weight maps that preserve the details present in the different exposures. The current manuscript belongs to the second approach. The block diagrammatic representation of the present detail-enhanced framework is shown in Figure 1. We have used an edge-preserving filter based on partial differential equations (PDE) [27] for a two-scale decomposition that separates sharp details and fine details across the input images with different exposure levels. The current state-of-the-art method for automatic exposure fusion exploits the capability of an edge-preserving filter [27, 28] to generate a weight function that guides the fusion of different exposures based on two-scale decomposition. We propose a WLS filter [20] optimization framework and a sigmoid function for weight map refinement of the base layers and detail layers, respectively. Farbman et al. 
[20] have utilized the WLS filter to construct multiscale edge-preserving decompositions for multiscale tone and detail manipulation. To achieve optimal contrast in the fused image, the current paper develops an appropriate mask based on weak textures and a color saturation measure to composite multiexposure images. The method is applicable to the fusion of a broad range of textured images. See Figure 2 for an example of our exposure fusion results for a typical scene containing an artificial light source (i.e., highlights), shadows, reflections, indoor details, and outdoor details.
Texture features [29] refer to the characterization of regions in an image by their spatial arrangement of color or intensities. Image textures are one cue that can be used to help in the classification of images [30]. Weak edges or texture information are ideal indicators for detecting over- (or under-) exposed regions in an image [17]. Raman and Chaudhuri [17] employ a Bilateral Filter (BLT) for compositing multiexposure images, in which weak edges were considered to design the weight map. Interestingly, an analysis of weak textures thus serves as a proxy for perceived contrast. We take advantage of this observation and design an appropriate matting function based on anisotropic diffusion for exposure fusion.
To analyze an image texture, there are primarily two approaches: the structural approach and the statistical approach. The structural approach uses a set of primitive texture elements in some regular or repeated pattern to characterize spatial relationships, while the statistical approach defines an image texture as a quantitative measure of the arrangement of intensities in a region. In general, the latter approach is easier to compute and more widely used in computer graphics applications, since natural textures are made of patterns of irregular subelements. It has been noticed that simple averaging to fuse details from multiexposure image data yields low contrast in the fused image, especially in brightly and poorly illuminated areas. In the present approach, texture details decide the contribution of the corresponding pixel from the different exposures in the fused image. Rich texture details mean a maximum contribution, indicating that an image block has higher weight during the fusion process. Such a metric is used to quantify the perceived local contrast of an image under different exposure settings and allows discarding underexposed and overexposed pixels. Therefore, to handle underexposed and overexposed regions, we propose a texture feature analysis based on Anisotropic Diffusion (ANI) [27, 28] that is used to design the weighting function, as shown in Figure 1. Our goal is to exploit the edge-preserving property of ANI to produce a well-exposed image from input images captured under different exposure settings. A detailed description of the ANI based two-layer decomposition and weight map computation is given in later sections. Our main contributions in this paper are highlighted as follows.
(1) A two-scale decomposition based on anisotropic diffusion is proposed for fast exposure fusion, which does not require optimizing the number of scales as required in traditional multiscale techniques.
(2) A novel weight construction approach is proposed that combines texture features and a saturation measure for guiding the image fusion process. For weight map construction, we seek to utilize the strength of texture details under the change of exposure setting that takes place between an underexposed and an overexposed image. WLS filtering is proposed for weight refinement. Furthermore, a fast sigmoid function based weight map generation for the detail layers is proposed that reduces the computational complexity of the algorithm.
(3) An important contribution of this paper is its practical advantages, including ease of implementation, quality of compositing, and the provision of detail layer enhancement without introducing artifacts.
The remainder of this paper is structured as follows. Section 2 reviews the available literature. Section 3 describes the separation of large-scale variations and smaller-scale details (i.e., texture details) based on ANI, the use of smaller-scale details and a saturation measure for weight map generation, and the WLS and sigmoid based weight map refinement that produces a single well-exposed image using a simple weighted average approach. Section 4 discusses the utility of the proposed approach for multifocus image fusion and the comparison with popular single-resolution exposure fusion, multiresolution exposure fusion, and popular tone mapping operators. Section 5 summarizes the paper with future directions and conclusions.

HDR Imaging.
There is a tremendous need to record a much wider gamut than standard 24-bit RGB. The practice of assembling an HDR image from multiple exposure images recovers the true radiance values present in real-world scenes [2]. The camera response function recovered from differently exposed images is used to create an HDR image whose pixel values are equivalent to the true radiance values of a scene. Radiance maps are stored in a file format that can encode the recovered HDR data without losing information. The "floating point TIFF" format can encode a dynamic range of up to 79 orders of magnitude and has better precision than the Radiance format. Reinhard et al. [1] have provided a description and evaluation of the formats available for storing true radiance values. The success of HDR image capture has shown that it is possible to produce an image that exhibits details in both poorly and brightly illuminated areas. Moreover, HDR formats have since found widespread application in computer graphics and HDR photography.
Prototypes of HDR display devices provide direct HDR display capability by means of a projector or Light Emitting Diode (LED) array that lights the Liquid Crystal Display (LCD) from behind with a spatially varying light pattern [14, 15]. Unfortunately, conventional display devices (i.e., CRTs and flat panel displays) have dynamic ranges spanning only a few orders of magnitude, much lower than those of real-world scenes, often less than 100 : 1. In order to display HDR images on monitors or print them on paper [31], we must remap the dynamic range of the HDR images to reproduce low dynamic range (LDR) images suitable for the human visual system (HVS). In the literature, several tone mapping methods for converting real-world luminances to display luminances have been developed, fulfilling the fast-growing demand to display HDR images on LDR display devices. Most tone-reproduction algorithms make use of photoreceptor adaptation [32] to achieve visually plausible results. Local operators [7-10] involve spatial manipulation of local neighboring pixel values, based on the observation that the HVS is only sensitive to relative local contrast. Global operators [11-13] do not involve spatial processing; tone mapping is achieved by applying a spatially invariant operator that treats every pixel independently. Both types of techniques have their own advantages and disadvantages in terms of computational cost, ease of implementation, halo effects (artifacts), spatial sharpness, and practical application. Reinhard et al. [1] give a detailed review of various tone mapping operators.
A simple S-shaped curve (sigmoid function) has been utilized as a tone mapping function [33]. The middle portion of such a sigmoid is nearly linear and thus resembles logarithmic behavior. Moreover, sigmoid functions have two asymptotes: one for very small values and one for large values. Fattal et al. [9] introduced a gradient based approach to preserve details from an HDR image. To simulate the adaptation behavior of the human visual system, they modify gradients at various scales. A reduced, low dynamic range image is then obtained by solving a Poisson equation on the modified gradient field. The algorithm uses the local intensity range to reduce the dynamic range in the transform domain while preserving local changes of small magnitude. The method is almost free of artifacts and does not require any manual parameter tweaking.
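To make the shape of such an S-shaped tone curve concrete, the sketch below maps normalized luminance through a sigmoid and rescales the result to span [0, 1]; the steepness and midpoint values are illustrative assumptions, not parameters taken from [33].

```python
import numpy as np

def sigmoid_tone_map(lum, a=6.0, midpoint=0.5):
    """Map scene luminance in [0, 1] through an S-shaped curve.

    The middle of the curve is nearly linear (log-like response),
    while the two asymptotes compress shadows and highlights.
    a        : steepness of the sigmoid (illustrative value)
    midpoint : luminance mapped to mid-gray (illustrative value)
    """
    y = 1.0 / (1.0 + np.exp(-a * (lum - midpoint)))
    # Rescale so the curve exactly spans [0, 1] for display:
    # lo and hi are the raw sigmoid values at lum = 0 and lum = 1.
    lo = 1.0 / (1.0 + np.exp(a * midpoint))
    hi = 1.0 / (1.0 + np.exp(-a * (1.0 - midpoint)))
    return (y - lo) / (hi - lo)
```

Rescaling by the curve's own endpoint values guarantees that pure black and pure white map exactly to 0 and 1 on the display.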
Recently, dynamic range compression based on two-scale decomposition has been proposed [10]. The base layer is obtained using a nonlinear BLT filter [17] and the detail layer is computed by taking the difference between the input image and the base layer. Only the contrast of the base layer is reduced, thereby preserving fine details.

Exposure Fusion.
In recent years, various fusion algorithms have been developed to combine substantial information from multiple input images into a single composite image. The principal motivations for image fusion are to extend the depth of field, extend spatial and temporal coverage, increase reliability, extend the dynamic range of the fused image, and obtain a compact representation of information. An imaging sensor records the time- and space-varying light intensity reflected and emitted from objects in a three-dimensional observed physical scene. However, image fusion faces a fundamental difficulty in preventing artifacts and preserving local contrast when fusing characteristics recorded from the incident radiation, such as exposure value, focus, modality, and environmental conditions. The automated extraction of all the meaningful details from the input images into a final fused image is the main motive of image fusion. To facilitate image fusion, it may be necessary to align input images of the same scene captured at different times, with different sensors, with different exposure (EV) settings (called bracketing), or from different viewpoints using local and global registration methods [34, 35]. Normally it is assumed that the input images are captured with the help of a tripod; hence, in general, we expect point-by-point correspondence between the different input exposures of a scene. From a technical standpoint, the fused image should reveal all details present in the scene without introducing any artifacts or inconsistencies that would distract the human observer or subsequent image processing stages. Ogden et al. 
[36] have proposed a pyramid solution for image fusion. The pyramid becomes a multiresolution sketch pad used to fill in local spatial information at increasingly fine detail (as an artist does when painting). The Laplacian pyramid contains several spatial frequency bands, each depicting certain edge information [37]. An image gradient orientation coherence model based fusion has been proposed for blending flash and ambient images [38, 39]. This model seeks to utilize the properties of image gradients that remain invariant under the change of lighting that takes place between a flash and an ambient image. Region segmentation and spatial frequency have been utilized for multifocus image fusion [40]. A fast multifocus algorithm has recently been developed [23] that utilizes weighted nonnegative matrix factorization and focal point analysis to preserve feature information in the fused image.
Raman and Chaudhuri [17] have utilized an edge-preserving filter (i.e., the bilateral filter) for the fusion of multiexposure images, in which an appropriate matte is generated based on local texture details for an automatic compositing process. Goshtasby [41] proposed an exposure fusion method based on weights determined by blending functions, with an information metric used to design the blending functions. Smaller weights were assigned to image blocks carrying less information, while higher weights were assigned to the best-exposed image blocks; an image block was considered best-exposed within an area if it carried more information about the area than any other image block. To maximize the information content in the fused image, a gradient-ascent algorithm was used to determine the optimal block size and the widths of the blending functions. The block size automatically varies with image content as the type of image changes. Szeliski [42] has used a multidimensional histogram as a postprocessing operator to achieve optimal contrast enhancement in the fused image; simple averaging was performed to smoothly combine the pixels into a fused image. This method was based on the observation that if the average intensity of the image is maintained during the averaging operation using histogram equalization, then a new image can be created with increased contrast.
Mertens et al. [18] have used a multiresolution approach [38] for the fusion of multiexposure image series. The technique was designed to create a well-exposed image without extending the dynamic range or tone mapping the final image. This approach blends multiple exposures in a Laplacian pyramid code based on quality metrics such as saturation and contrast. A related part of the technique is the stitching of flash and no-flash images, which seems suitable for detail enhancement in the fused image. The performance of this multiscale technique depends on the number of decomposition levels, that is, the pyramid height; however, the approach is computationally expensive. Recently, various fast and effective weighted average based exposure fusion approaches have been proposed. Among these, the guided filtering based two-scale decomposition fusion approach [43], the median filter and recursive filtering based fusion approach [44], and global optimization using generalized random walks [21] produce fusion results with better quality. These methods utilize different image features for weight calculation, and the refined weights are used to control the contribution of pixels from the input exposures. Instead, we use anisotropic diffusion, which is effective both for two-scale decomposition and for weight map generation based on image features such as weak textures. The major advantage of our technique is that it is based on a single-resolution weighted average approach. Generally speaking, due to its computational simplicity the present approach can be used in various consumer cameras entering the commercial market. Moreover, we have observed that the present approach can also be applied to multifocus image fusion and produces much better results than existing multifocus and multiexposure image fusion methods.

WLS Based Exposure Fusion
3.1. Overview. A new type of exposure fusion technique is developed to overcome the limitation of conventional digital cameras in handling the luminance variation of an entire scene. The primary focus of this paper is the development of a fast and robust exposure fusion approach based on local texture features computed from an edge-preserving filter. Unlike most previous multiexposure fusion methods, we build on ANI, a nonlinear filter introduced by Perona and Malik [27] in 1990 that has the ability to preserve large discontinuities (edges). It is driven by the magnitude of the gradient of the image intensity and controls the diffusion strength in the image to prevent blurring across edges. As such, the implemented algorithm (see Figure 4) consists of four steps.
(1) The first step in our algorithm is a two-scale decomposition based on ANI, which separates coarser details (base layer) and finer details (detail layer) in each input exposure.
(2) Weak texture details (i.e., the detail layer computed from ANI) and a saturation measure are utilized to generate a weight mask for controlling the contribution of pixels from the base layers separated across all the multiple exposures.
(3) WLS and sigmoid function based weight map refinement is performed for the coarser details and finer details computed in the first step, respectively.
(4) Weighted average based blending of the coarser details and finer details is performed to form a composite seamless image without blurring or loss of detail near large discontinuities.

Extraction of Coarser Details and Finer Details.
Edge-preserving filters have received considerable attention in computational photography over the last decade. BLT [45] and ANI [27] are the most popular edge-preserving operators. Standard BLT uses the distances between neighboring pixels in both space and range. Its space-varying weighting function is computed in a space of higher dimensionality than the signal being filtered; as a result, such filters have high computational cost [46]. ANI is an excellent tool for smoothing the fine details of an image while preserving the coarser details (i.e., edges). It is modeled using partial differential equations (PDEs) and is based on a nonlinear iterative process. The diffusion equation in two dimensions is defined as follows:

$$\frac{\partial I(x, y, t)}{\partial t} = \operatorname{div}\left(c\left(\|\nabla I\|\right) \nabla I\right), \tag{1}$$

where the operator $\nabla$ calculates the image gradient of an input image $I$, $\|\nabla I\|$ represents the magnitude of the gradient of the image intensity, $c(\|\nabla I\|)$ is a spatially varying nonlinear conduction coefficient that smooths fine details while avoiding blurring of coarser details, $(x, y)$ specifies the spatial position, and $t$ is the iteration parameter.
The diffusion strength in the image is determined by the conduction coefficient, which is influenced by the gradient of the image intensity. The principles of the conduction coefficient are (i) smoothing the fine textures and (ii) preserving the coarser details in the image data. This type of nonlinear diffusion is achieved by taking the image structure into account. By contrast, a fixed conduction coefficient (i.e., $c(\cdot) = 1$) yields isotropic linear diffusion, which has a constant response for both fine textures and strong edges. Therefore, to achieve nonlinear diffusion the conduction coefficient is chosen to satisfy $c(s) \to 0$ as $s \to \infty$, so that the diffusion process is "stopped" across region boundaries (i.e., edges) at locations of high gradient.
The diffusion function $c(\cdot)$ used in our approach can be defined as follows:

$$c\left(\|\nabla I\|\right) = \exp\left(-\left(\frac{\|\nabla I\|}{\kappa}\right)^{2}\right), \tag{2}$$

where $\kappa$ is a scale parameter determined as an empirical constant; the appropriate choice of the scale parameter may differ for a particular application [27]. In our algorithm the value $\kappa = 1/7$ was fixed for all cases, determined empirically to yield an optimally diffused image for fine-detail extraction.
Anisotropic diffusion [27] for a discrete signal is computed as follows:

$$I_{s}^{t+1} = I_{s}^{t} + \frac{\lambda}{\left|\eta_{s}\right|} \sum_{p \in \eta_{s}} c\left(\nabla I_{s, p}^{t}\right) \nabla I_{s, p}^{t}, \tag{3}$$

where $I_{s}^{t}$ is a discrete version of the input signal, $s$ determines the sample position in the discrete signal, and $t$ denotes the iteration. The constant $\lambda$ is a scalar that determines the rate of diffusion, $\eta_{s}$ represents the spatial neighborhood of the current sample position $s$, and $|\eta_{s}|$ is the number of neighbors.
For a discrete image, the North ($c_{N}$), South ($c_{S}$), East ($c_{E}$), and West ($c_{W}$) spatial locations are considered for the computation of the conduction coefficients. In our case, a local window of size $3 \times 3$ around pixel position $s$ in the input image $I$ is chosen, which intuitively appears most suitable for computing the conduction coefficients at low computational cost, although other window sizes are possible as well. After computing all the conduction coefficients for pixel position $s$ in the discrete image, the diffused image is obtained as follows:

$$I_{s}^{t+1} = I_{s}^{t} + \lambda\left[c_{N} \cdot \nabla_{N} I + c_{S} \cdot \nabla_{S} I + c_{E} \cdot \nabla_{E} I + c_{W} \cdot \nabla_{W} I\right]_{s}^{t}, \tag{4}$$

where $\nabla_{N}$, $\nabla_{S}$, $\nabla_{E}$, and $\nabla_{W}$ indicate the difference with the North, South, East, and West neighbor of pixel position $s$, respectively.
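The discrete four-neighbor update above can be sketched as follows; the exponential conduction function and the parameter defaults are illustrative assumptions rather than the exact values used in the paper.

```python
import numpy as np

def anisotropic_diffusion(img, iterations=10, kappa=0.1, lam=0.25):
    """Discrete Perona-Malik diffusion with 4-connected neighbours.

    img   : 2-D float array (one channel, values in [0, 1])
    kappa : gradient scale of the conduction function (illustrative)
    lam   : diffusion rate, <= 0.25 for stability with 4 neighbours
    """
    u = img.astype(np.float64).copy()
    for _ in range(iterations):
        # Differences to North, South, East, West neighbours
        # (replicated borders give zero flux at the image boundary).
        p = np.pad(u, 1, mode='edge')
        dN = p[:-2, 1:-1] - u
        dS = p[2:, 1:-1] - u
        dE = p[1:-1, 2:] - u
        dW = p[1:-1, :-2] - u
        # Conduction coefficients: near 0 across strong edges,
        # close to 1 in smooth regions.
        c = lambda d: np.exp(-(d / kappa) ** 2)
        u = u + lam * (c(dN) * dN + c(dS) * dS + c(dE) * dE + c(dW) * dW)
    return u
```

Because the conduction coefficient collapses at large gradients, a sharp step in the input survives many iterations while low-amplitude texture around it is smoothed away, which is exactly the behavior the base-layer extraction relies on.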
Let $I_{n}$ be the $n$th source image to be operated on by the ANI filter. In order to separate coarser and finer details, we first decompose the source images into two-scale representations using anisotropic diffusion. The base layer (i.e., the diffused image defined in (4)) of each source image is obtained as

$$\mathrm{BL}_{n} = \mathrm{ANI}_{\lambda, \kappa, t}\left(I_{n}\right). \tag{5}$$

Once the base layer is obtained for the $n$th input image, the detail layer $\mathrm{DL}_{n}$ can be directly calculated by subtracting the base layer from the source image:

$$\mathrm{DL}_{n} = I_{n} - \mathrm{BL}_{n}. \tag{6}$$

3.3. Weight Estimation. The motivation behind the weight map computation is to obtain a nonlinear adaptive function for controlling the contribution of pixels from the base layers and detail layers computed across all input exposures. Interestingly, the detail layer $\mathrm{DL}_{n}$ computed for the $n$th source image in (6) yields an analysis of weak textures that serves as an indicator of contrast variation in the image. We adopt this metric to quantify the perceived local contrast of an image under different exposure settings, which allows discarding underexposed and overexposed pixels.
Furthermore, in order to achieve optimal contrast and color detail in the fused image, we additionally incorporate a color saturation measure $\mathrm{SAT}_{n}$ into our weighting function. In practice, for the $n$th RGB source image, $\mathrm{SAT}_{n}$ is computed as the standard deviation of the $R_{n}$, $G_{n}$, and $B_{n}$ channels at each pixel:

$$\mathrm{SAT}_{n} = \sqrt{\frac{\left(R_{n} - \mu_{n}\right)^{2} + \left(G_{n} - \mu_{n}\right)^{2} + \left(B_{n} - \mu_{n}\right)^{2}}{3}}, \tag{7}$$

where $\mu_{n} = \frac{1}{3}\left(R_{n} + G_{n} + B_{n}\right)$.
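The base/detail split and the per-pixel RGB standard deviation described above can be sketched as below. To keep the sketch self-contained, a plain Gaussian stands in for the edge-preserving ANI filter, and the function names are ours, not the paper's.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def two_scale_decompose(img, smooth=lambda x: gaussian_filter(x, sigma=2.0)):
    """Split a single-channel image into base (coarse) + detail (fine).

    `smooth` stands in for the edge-preserving ANI filter of the paper;
    a Gaussian is used here only to keep the sketch self-contained.
    """
    base = smooth(img)
    detail = img - base          # detail layer = image - base layer
    return base, detail

def saturation_measure(rgb):
    """Per-pixel colour saturation: std-dev of the R, G, B channels."""
    mu = rgb.mean(axis=2, keepdims=True)        # mean of R, G, B
    return np.sqrt(((rgb - mu) ** 2).mean(axis=2))
```

By construction the two layers always sum back to the input exactly, so no information is lost by the decomposition; only the weighting of the layers changes during fusion.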

As shown in Figure 1, in order to remove the influence of underexposed and overexposed pixels and produce a well-exposed image, the two image features, that is, $\mathrm{DL}_{n}$ and $\mathrm{SAT}_{n}$, are combined by multiplication to estimate the combined feature $\mathrm{FR}_{n}$:

$$\mathrm{FR}_{n} = \mathrm{DL}_{n} \times \mathrm{SAT}_{n}. \tag{8}$$

Then $\mathrm{FR}_{n}$ is convolved with a symmetric Gaussian low-pass kernel $G$ of kernel size $5 \times 5$ and standard deviation $\sigma = 5$ to construct the saliency map $\mathrm{SM}_{n}$:

$$\mathrm{SM}_{n} = \mathrm{FR}_{n} * G. \tag{9}$$

Next, the saliency maps are compared to determine the weight maps as follows:

$$\mathrm{WM}_{n}(p) = \begin{cases} 1, & \text{if } \mathrm{SM}_{n}(p) = \max\left(\mathrm{SM}_{1}(p), \ldots, \mathrm{SM}_{N}(p)\right), \\ 0, & \text{otherwise}, \end{cases} \tag{10}$$

where $N$ is the number of source images and $\mathrm{SM}_{n}(p)$ is the saliency value of pixel $p$ in the $n$th image.
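The weight-map construction described above — multiply the detail and saturation features, smooth with a small Gaussian, and keep the per-pixel winner — can be sketched as follows. The use of `scipy.ndimage.gaussian_filter` and the absolute value taken on the detail layer are our implementation choices, not prescribed by the paper.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def build_weight_maps(detail_layers, saturations, sigma=5.0):
    """detail_layers, saturations: lists of 2-D arrays, one per exposure.

    Returns binary weight maps: 1 where exposure n has the largest
    smoothed saliency at that pixel, 0 elsewhere.
    """
    # Combined feature = |detail| * saturation, smoothed into a saliency map.
    saliency = [gaussian_filter(np.abs(d) * s, sigma=sigma)
                for d, s in zip(detail_layers, saturations)]
    stack = np.stack(saliency)              # shape (N, H, W)
    winner = np.argmax(stack, axis=0)       # best exposure index per pixel
    return [(winner == n).astype(np.float64) for n in range(len(saliency))]
```

The resulting maps partition the image: exactly one exposure wins each pixel, which is why the subsequent WLS refinement step is needed to soften these hard, noisy boundaries.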

WLS Based Weight Refinement and Weighted Fusion of Coarser Details and Finer Details.
In this section, we propose a WLS optimization framework [20] and sigmoid function [28] based weight map refinement approach to obtain noiseless and smooth weight maps. First, WLS filtering is performed on each weight map $\mathrm{WM}_{n}$, with the corresponding source image serving as the guide image for the affinity matrix [20]. The motivation behind weight map refinement is as follows. The fusion rules (weight maps) computed in (10) are hard (their values change abruptly), noisy, and not aligned with object boundaries. The weight maps need to be as smooth as possible, since rapid changes in the weight maps will introduce seams and artifacts in the fused image. The WLS [20] based edge-preserving operator may be viewed as a compromise between two possibly contradictory goals: given an input image $v$, we seek a new image $u$ which, on the one hand, is as close as possible to $v$ and, at the same time, is as smooth as possible everywhere except across significant gradients in $v$.

WLS Optimization Framework.
To achieve these objectives we seek to minimize the following quadratic functional:

$$\sum_{p}\left(\left(u_{p} - v_{p}\right)^{2} + \lambda\left(a_{x, p}(v)\left(\frac{\partial u}{\partial x}\right)_{p}^{2} + a_{y, p}(v)\left(\frac{\partial u}{\partial y}\right)_{p}^{2}\right)\right), \tag{11}$$

where the subscript $p$ denotes the spatial location of a pixel. The goal of the data term $(u_{p} - v_{p})^{2}$ is to minimize the distance between $u$ and $v$, while the second (regularization) term strives to achieve smoothness by minimizing the partial derivatives of $u$. The smoothness requirement is enforced in a spatially varying manner via the smoothness weights $a_{x, p}$ and $a_{y, p}$, which depend on $v$. Finally, $\lambda$ is responsible for the balance between the two terms; increasing the value of $\lambda$ results in progressively smoother images $u$.
Using matrix notation we may rewrite (11) in the following quadratic form:

$$\left(u - v\right)^{T}\left(u - v\right) + \lambda\left(u^{T} D_{x}^{T} A_{x} D_{x} u + u^{T} D_{y}^{T} A_{y} D_{y} u\right). \tag{12}$$

Here $A_{x}$ and $A_{y}$ are diagonal matrices containing the smoothness weights $a_{x}(v)$ and $a_{y}(v)$, respectively, and the matrices $D_{x}$ and $D_{y}$ are discrete differentiation operators.
The vector $u$ that minimizes (12) is uniquely defined as the solution of the linear system

$$\left(I + \lambda L_{v}\right) u = v, \tag{13}$$

where $I$ is the identity matrix and $L_{v} = D_{x}^{T} A_{x} D_{x} + D_{y}^{T} A_{y} D_{y}$. Modulo the difference in notation, this is exactly the linear system used in [47], where it was primarily used to derive piecewise smooth adjustment maps from a sparse set of constraints.
In the present approach, $D_{x}$ and $D_{y}$ are forward difference operators, and hence $D_{x}^{T}$ and $D_{y}^{T}$ are backward difference operators, which means that $L_{v}$ is a five-point spatially inhomogeneous Laplacian matrix. As for the smoothness weights, we define them in the same manner as in [47]:

$$a_{x, p}(v) = \left(\left|\frac{\partial \ell}{\partial x}(p)\right|^{\alpha} + \varepsilon\right)^{-1}, \qquad a_{y, p}(v) = \left(\left|\frac{\partial \ell}{\partial y}(p)\right|^{\alpha} + \varepsilon\right)^{-1}, \tag{14}$$

where $\ell$ is the log-luminance channel of the input image $v$, the exponent $\alpha$ (typically between 1.2 and 2.0) determines the sensitivity to the gradients of $v$, and $\varepsilon$ is a small constant (typically 0.0001) that prevents division by zero in areas where $v$ is constant. Equation (13) tells us that $u$ is obtained from $v$ by applying a nonlinear operator $F_{\lambda}$ which depends on $v$:

$$u = F_{\lambda}(v) = \left(I + \lambda L_{v}\right)^{-1} v. \tag{15}$$

In the present approach, $\mathrm{WLS}_{\lambda, \alpha, \varepsilon}(v)$ denotes the WLS filtering operation, where $\lambda$, $\alpha$, and $\varepsilon$ are the parameters that decide the degree of smoothness, the sensitivity to gradients, and the small stabilizing constant of the WLS filter, respectively. In our case, $\mathrm{WM}_{n}$ computed in (10) serves as the input image to the WLS filter (i.e., $v = \mathrm{WM}_{n}$). More specifically, the smoothed version of the weight map $\mathrm{WM}_{n}$ serves as the refined weight map $\overline{W}_{\mathrm{BL}_{n}}$ for the $n$th base layer:

$$\overline{W}_{\mathrm{BL}_{n}} = \mathrm{WLS}_{\lambda, \alpha, \varepsilon}\left(\mathrm{WM}_{n}\right). \tag{16}$$

Once the resulting weight maps for the base layers are obtained, sharp, edge-aligned weights are computed using a 1-D sigmoid function for fusing the detail layers. As shown in Figure 1, the spatially smoothed weight maps of the base layers are utilized to compute the sharp weight masks of the detail layers, which preserve texture details in the fused image. Therefore, unlike [43], the proposed solution is a computationally simple approach to estimating the best possible weight maps for detail layer fusion.
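In practice the WLS filter reduces to assembling the sparse five-point Laplacian and solving one linear system per image. Below is a sketch following Farbman et al.'s construction; the parameter defaults are typical values from that line of work, not necessarily those used in this paper.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import spsolve

def wls_filter(v, lam=1.0, alpha=1.2, eps=1e-4):
    """Edge-preserving smoothing: solve (I + lam * L_v) u = v.

    v : 2-D array with positive values; its log defines the
        spatially varying smoothness weights a_x, a_y.
    """
    h, w = v.shape
    n = h * w
    ell = np.log(np.clip(v, eps, None))          # log-luminance channel
    # Smoothness weights on forward differences (zero at the far border
    # so no flux couples across the image edge).
    ax = np.zeros_like(v)
    ax[:, :-1] = 1.0 / (np.abs(np.diff(ell, axis=1)) ** alpha + eps)
    ay = np.zeros_like(v)
    ay[:-1, :] = 1.0 / (np.abs(np.diff(ell, axis=0)) ** alpha + eps)
    ax, ay = ax.ravel(), ay.ravel()
    # Row-major layout: x-neighbour at offset +1, y-neighbour at offset +w.
    # Diagonal of I + lam * L_v (zero border weights make np.roll safe):
    main = 1.0 + lam * (ax + ay + np.roll(ax, 1) + np.roll(ay, w))
    A = sp.diags(
        [main, -lam * ax[:n - 1], -lam * ax[:n - 1],
         -lam * ay[:n - w], -lam * ay[:n - w]],
        [0, 1, -1, w, -w], format='csc')
    return spsolve(A, v.ravel()).reshape(h, w)
```

Because the Laplacian has zero row sums, a constant input passes through unchanged, and as `lam` approaches zero the filter returns its input; increasing `lam` gives progressively smoother results, exactly as the functional predicts.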
Let $W^{DL}_n$ denote the refined weight map for the $n$th detail layer and let $\mathrm{SIG}_{a,x,T}(W^{BL}_n)$ be the 1-D sigmoid function [28] applied to $W^{BL}_n$, where $a \in \mathbb{R}$, $x \in \mathbb{R}$, and $T$ are the weight parameter, the independent variable, and the threshold that further controls the degree of sharpness, respectively. Then $W^{DL}_n$ is computed as

$$W^{DL}_n = \mathrm{SIG}_{a,x,T}\left(W^{BL}_n\right).$$

In theory, the 1-D sigmoid is computed as

$$\mathrm{SIG}_{a,x,T} = \frac{1}{1 + e^{-a(x - T)}},$$

where $x \in \mathbb{R}$ is the independent variable, $a \in \mathbb{R}$ is a weight parameter of the sigmoid function, and $T$ is a fixed threshold that further controls the sharpness of the sigmoid function.
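The sigmoid weighting can be sketched as below; the defaults a = 30 and T = 1/7 are our reading of the experimental settings and are illustrative only.

```python
import numpy as np

def sigmoid_mask(x, a=30.0, T=1.0 / 7.0):
    """1-D sigmoid SIG_{a,x,T} = 1 / (1 + exp(-a * (x - T))) applied
    elementwise; a sets the sharpness and T the threshold."""
    return 1.0 / (1.0 + np.exp(-a * (np.asarray(x, dtype=float) - T)))
```

Applied to a smoothed base-layer weight map, this maps weights below the threshold toward 0 and weights above it toward 1, producing the sharp, edge-aligned detail-layer mask.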
Once the resulting weight maps $W^{BL}_n$ and $W^{DL}_n$ are obtained, the pixelwise weighted composition of the base layers (i.e., the fused base layer $\mathrm{BL}_F$) and of the detail layers (i.e., the fused detail layer $\mathrm{DL}_F$), and the resulting fused image $F$, can be directly calculated as follows:

$$\mathrm{BL}_F = \sum_{n=1}^{N} W^{BL}_n \, \mathrm{BL}_n, \qquad \mathrm{DL}_F = \sum_{n=1}^{N} W^{DL}_n \, \mathrm{DL}_n, \qquad F = \mathrm{BL}_F + \mathrm{DL}_F.$$
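The weighted composition can be sketched for single-channel layer stacks as follows; the per-pixel normalisation of the weights is an assumption commonly made so that the weights sum to one at every pixel.

```python
import numpy as np

def fuse_layers(base_layers, detail_layers, w_base, w_detail):
    """Pixelwise weighted composition: BL_F = sum_n W_n^BL * BL_n,
    DL_F = sum_n W_n^DL * DL_n, and F = BL_F + DL_F.
    All inputs are stacks of shape (N, H, W)."""
    wb = np.asarray(w_base, dtype=float)
    wd = np.asarray(w_detail, dtype=float)
    wb = wb / wb.sum(axis=0)                 # normalise over the N inputs
    wd = wd / wd.sum(axis=0)
    bl_f = (wb * np.asarray(base_layers, dtype=float)).sum(axis=0)
    dl_f = (wd * np.asarray(detail_layers, dtype=float)).sum(axis=0)
    return bl_f + dl_f
```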

Experimental Results and Analysis
In order to evaluate the performance and effectiveness of the proposed image fusion approach, we compare it with different exposure fusion, tone mapping, and multifocus image fusion methods. Two objective evaluation metrics (i.e., the quality score $Q^{AB/F}$ [48] and visual information fidelity for fusion, VIFF [49]) were employed to assess the fusion quality and to analyze the effect of the free parameters used in the approach. All experimental results were generated by a MATLAB implementation. Furthermore, to measure distortion in the fused image and strengthen the evaluation beyond $Q^{AB/F}$ and VIFF, we incorporate the Dynamic Range Independent Visible Difference Predictor (DRIVDP) [22]. The DRIVDP metric is sensitive to three types of structural change (i.e., loss of visible contrast, amplification of invisible contrast, and reversal of visible contrast) between images under a specific viewing condition.

Comparison with Other Exposure Fusion, Multifocus
Image Fusion and Tone Mapping Methods.Figures 2, 3, and 4 depict examples of fused images obtained from source multiexposure images. It can be seen that the proposed approach enhances texture details while preventing halos near strong edges. To check the effectiveness and robustness of the present approach, the algorithm was tested on a variety of multiexposure image series. The proposed approach is computationally simple, and the results are comparable to several exposure fusion and tone mapping techniques. As shown in Figures 2(a)-2(c), the details from all of the source images are well combined and reveal fine textures while preserving local contrast and natural color. In Figures 3(a)-3(d) we compare our results to recently proposed approaches. Figure 3(d) depicts the results of the optimization framework [16], and Figure 3(e) shows the matte-based fusion results using an edge-preserving filter, namely the bilateral filter (BLT) [17]. It can be observed that the other fusion methods preserve overall image details but fail to reconstruct texture and color details in the brightly illuminated areas. The result of Mertens et al. [18] (see Figure 3(f)) appears blurred and loses texture details, while in our result (see Figure 3(c)) fine textures that are barely visible in Figure 3(f) are emphasized. This is a consequence of the Gaussian kernel used for pyramid generation: the Laplacian pyramid removes pixel-to-pixel correlations by subtracting a low-pass filtered copy of the image from the image itself, which reduces texture and edge details in the fused image. The results in Figures 3(d)-3(f) lose visibility in brightly illuminated areas: details are lost in the tree leaves, and the texture on the wall is washed out. Although the results of Raman and Chaudhuri [17] (see Figure 3(e)) exhibit better color details in the tree leaves, they appear slightly blurry. In our result (Figure 3(c)) details are preserved in the brightly illuminated areas, and at the same time fine details (tree leaves, wall texture, and lizard) are well preserved.
To further compare our results visually with Mertens et al. [18], iCAM06 [19], WLS [20], and GRW [21], Figures 4(a)-4(e) depict experimental results for the National Cathedral sequence (1024 × 768 × 3). The proposed fusion result shown in Figure 4(a) illustrates the ability to enhance fine texture details as well as to produce good color information with natural contrast, which can bring an increased illusion of depth to image textures. The enhanced texture details in the fused image keep everything sharp and yield an accurate exposure that is free of halo artifacts. Although the tone mapped results of iCAM06 [19] and WLS [20] are comparable, they do not preserve the contrast of the input LDR image series. Figures 4(b) and 4(e) show the results of the pyramid approach [18] and the GRW optimization framework [21], respectively, which preserve global contrast but lose color information. The GRW [21] based exposure fusion shown in Figure 4(e) depicts less texture and color detail in brightly illuminated regions (i.e., the lamp and the window glass). Note that Figure 4(a) retains colors, sharp edges, and details while also maintaining an overall reduction of high-frequency artifacts near strong edges. The results in Figures 4(b)-4(d) were generated by the programs provided by their respective authors. The HDR images for iCAM06 [19] and WLS [20] were generated using HDR reconstruction [2]. The results of GRW [21] shown in Figure 4(e) are taken from its paper. In order to give a relatively fair comparison in our experiments, we used the default sets of parameters for the tone mapping [19, 20] and exposure fusion [18] methods.
Figure 5 shows the distortion maps computed with the DRIVDP metric proposed by Aydin et al. [22]. This quality assessment metric detects loss of visible contrast (green) and amplification of invisible contrast (blue); its main advantage is that it yields meaningful results even if the input images have different dynamic ranges. We use DRIVDP-based quality assessment to compare the proposed method with one exposure fusion method [18] and two tone mapping methods [19, 20]. We assume that the LDR images are shown on a typical LCD display with maximum luminance 100 cd/m² and gamma 2.2. We also assume that, for all the LDR images, the viewing distance is 0.5 metres, the number of pixels per visual degree is 30, and the peak contrast is 0.0025; the significance of these parameter choices is discussed in [22]. Figures 5(a)-5(v) show a side-by-side comparison of the loss of visible contrast (green) and amplification of invisible contrast (blue) of the proposed results with the other methods. To compute the visible contrast loss illustrated in Figures 5(d), 5(i), 5(n), and 5(s), respectively, for the fused images in Figures 5(c), 5(h), 5(m), and 5(r), the underexposed image (i.e., Figure 5(a)) is used as the reference image. Similarly, to compute the visible contrast loss illustrated in Figures 5(e), 5(j), 5(o), and 5(t), respectively, for the fused images in Figures 5(c), 5(h), 5(m), and 5(r), the overexposed image (i.e., Figure 5(b)) is used as the reference image. We ran the invisible contrast amplification metric on the fused images using a procedure similar to that used for the visible contrast loss metric. The two source images with good exposures, respectively, for the brightly illuminated region (i.e., the window) and the poorly illuminated region (i.e., the wall) are given in Figures 5(a) and 5(b). The distortion maps for the proposed method, iCAM06 [19], Mertens [18], and WLS [20] are given in Figures 5(c)-5(g), Figures 5(h)-5(l), Figures 5(m)-5(q), and Figures 5(r)-5(v), respectively,
along with the fused images. In a distortion map, green, blue, and gray pixels indicate contrast loss, amplification, and no distortion, respectively. It can be noticed that the proposed results are more effective in preserving local contrast and color information than the other methods; visible contrast loss and distortion are the least for the proposed approach. Moreover, to compare the performance of the proposed approach, iCAM06, Mertens et al., and WLS, we employed four fusion quality metrics, that is, $Q^{AB/F}$ [48], VIFF [49], Mutual Information ($Q_{MI}$) [50], and Spatial Frequency ($Q_{SF}$) [51].

$Q^{AB/F}$ [48] evaluates the amount of edge information transferred from the input images to the fused image. For two input images $A$ and $B$ and a resulting fused image $F$ (i.e., the image $F$ computed in (21)), the Sobel edge operator is applied to yield the edge strength $g(i,j)$ and orientation $\alpha(i,j)$ for each pixel:

$$g_A(i,j) = \sqrt{s_x^A(i,j)^2 + s_y^A(i,j)^2}, \qquad \alpha_A(i,j) = \tan^{-1}\!\frac{s_y^A(i,j)}{s_x^A(i,j)},$$

where $s_x^A(i,j)$ and $s_y^A(i,j)$ are the responses of the horizontal and vertical Sobel templates centered on pixel $(i,j)$ and convolved with the corresponding pixels of image $A$. The relative strength and orientation values of an input image $A$ with respect to $F$ are formed from these quantities, and the edge information preservation values $Q^{AF}(i,j)$ and $Q^{BF}(i,j)$ are computed as defined in [48]. Finally, $Q^{AB/F}$ is the weighted sum of the preservation values normalized by the total edge strength; for the ideal fusion, $Q^{AB/F} = 1$.

Figure 6 shows multifocus image fusion results: (a) and (c) demonstrate the effect of selective focus to capture more details from a particular part of a scene, while (b) and (d) are images generated by the proposed approach. Note how in the fused images all objects appear in focus, and the color and texture details present in the foreground and background of the original inputs are enhanced (input sequence is courtesy of Adu and Wang [23]).

VIFF [49] first decomposes the source and fused images into blocks. It then utilizes the models in visual information fidelity (VIF) (i.e., the Gaussian Scale Mixture (GSM) model, the distortion model, and the human visual system (HVS) model) to capture visual information from the two source-fused pairs. With the help of an effective visual information index, VIFF measures the effective visual information of the fusion in all blocks of each subband. Finally, the assessment result is calculated by integrating the information over the subbands with weighting coefficients $p_k$:

$$\mathrm{VIFF} = \sum_{k} p_k \, \mathrm{VIFF}_k,$$

where $\mathrm{VIFF}_k$ is the fidelity measured in subband $k$. According to VIF theory, a higher VIF indicates a higher-quality test image; therefore, as VIFF increases, the quality of the fused image improves.
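The Sobel strength/orientation step used by the $Q^{AB/F}$ metric can be sketched as follows (an illustrative sketch with edge padding, not the reference implementation of [48]):

```python
import numpy as np

def sobel_strength_orientation(img):
    """Per-pixel edge strength g = hypot(sx, sy) and orientation
    alpha = atan2(sy, sx) from the two 3x3 Sobel templates."""
    g = np.asarray(img, dtype=float)
    kx = np.array([[-1., 0., 1.], [-2., 0., 2.], [-1., 0., 1.]])
    ky = kx.T                                  # vertical template
    pad = np.pad(g, 1, mode='edge')
    sx = np.zeros_like(g)
    sy = np.zeros_like(g)
    for i in range(g.shape[0]):
        for j in range(g.shape[1]):
            win = pad[i:i + 3, j:j + 3]
            sx[i, j] = (win * kx).sum()
            sy[i, j] = (win * ky).sum()
    return np.hypot(sx, sy), np.arctan2(sy, sx)
```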
The quality metric $Q_{MI}$ measures how well the original information from the source images is preserved in the fused image:

$$Q_{MI} = 2 \left( \frac{\mathrm{MI}(A,F)}{H(A) + H(F)} + \frac{\mathrm{MI}(B,F)}{H(B) + H(F)} \right),$$

where $H(A)$, $H(B)$, and $H(F)$ are the marginal entropies of $A$, $B$, and $F$, and $\mathrm{MI}(A,F)$ is the mutual information between the source image $A$ and the fused image $F$:

$$\mathrm{MI}(A,F) = H(A) + H(F) - H(A,F),$$

where $H(A,F)$ is the joint entropy between $A$ and $F$; $\mathrm{MI}(B,F)$ is defined analogously. The fourth criterion is $Q_{SF}$. Spatial frequency, which originates from the human visual system, indicates the overall activity level in an image and has led to an effective objective quality index for image fusion [51]. The total spatial frequency of the fused image is computed from the row (RF) and column (CF) frequencies of the image block:

$$\mathrm{RF} = \sqrt{\frac{1}{MN} \sum_{i=1}^{M} \sum_{j=2}^{N} \left[ F(i,j) - F(i,j-1) \right]^2}, \qquad \mathrm{CF} = \sqrt{\frac{1}{MN} \sum_{i=2}^{M} \sum_{j=1}^{N} \left[ F(i,j) - F(i-1,j) \right]^2},$$

and $Q_{SF}$ is defined as

$$Q_{SF} = \sqrt{\mathrm{RF}^2 + \mathrm{CF}^2},$$

where $F(i,j)$ is the gray value of the pixel at position $(i,j)$ of the fused image. The quantitative performance analysis using the aforesaid evaluation indices is given in the caption of Figure 5. The present approach outperformed the other methods: the proposed method preserves more useful information than the iCAM06, Mertens et al., and WLS fusion methods. In particular, the evaluation results in Figure 5 demonstrate that $Q^{AB/F}$, VIFF, $Q_{MI}$, and $Q_{SF}$ are in agreement with the DRIVDP-based evaluation.
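The $Q_{SF}$ and mutual-information computations above can be sketched as follows; the histogram bin count is an assumption, and the RF/CF normalisation by the full image size follows the formula given in the text.

```python
import numpy as np

def spatial_frequency(img):
    """SF = sqrt(RF^2 + CF^2) with RF, CF the row/column
    first-difference energies normalised by M*N."""
    g = np.asarray(img, dtype=float)
    m, n = g.shape
    rf = np.sqrt(np.sum(np.diff(g, axis=1) ** 2) / (m * n))
    cf = np.sqrt(np.sum(np.diff(g, axis=0) ** 2) / (m * n))
    return float(np.hypot(rf, cf))

def mutual_information(a, b, bins=16):
    """MI(A, B) = H(A) + H(B) - H(A, B), estimated from a joint histogram."""
    h, _, _ = np.histogram2d(np.ravel(a), np.ravel(b), bins=bins)
    p = h / h.sum()
    px, py = p.sum(axis=1), p.sum(axis=0)
    nz = p > 0
    return float(np.sum(p[nz] * np.log2(p[nz] / np.outer(px, py)[nz])))
```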
Furthermore, to check the applicability of the proposed approach to other image fusion applications, we present experimental results for multifocus image fusion. Figures 6, 7, and 8 demonstrate that the proposed method is also suitable for multifocus image fusion, yielding rich contrast and texture details. One of the key characteristics of the present approach for multifocus image fusion is illustrated in Figure 6. Comparisons of Adu [23], DWT [24], Tian et al. [25], and our approach for multifocus image fusion are given in Figures 8(a)-8(d). The result in Figure 8(b) is taken from the original paper [23]. The results of DWT [24] shown in Figure 8(c) were generated with the MATLAB Wavelet Toolbox; for the DWT-based method, the low-pass subband coefficients are merged by the averaging scheme and the high-pass subband coefficients by the choose-max selection scheme, using a five-level decomposition with db3 wavelets. The results of Tian et al. [25] shown in Figure 8(d) were generated from the MATLAB code provided by the authors. Note that our method (see Figure 8(a)) yields enhanced texture and edge features. We can preserve and enhance fine details separately because our approach excludes fine textures from the base layers.
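The averaging/choose-max merging rules used for the DWT baseline can be illustrated on a simpler two-scale (box-blur) decomposition; this is an analogy, not an actual wavelet transform, and both function names are ours.

```python
import numpy as np

def box_blur(img, k=3):
    """Naive k x k box blur used as a crude low-pass decomposition."""
    g = np.asarray(img, dtype=float)
    p = k // 2
    gp = np.pad(g, p, mode='edge')
    out = np.empty_like(g)
    for i in range(g.shape[0]):
        for j in range(g.shape[1]):
            out[i, j] = gp[i:i + k, j:j + k].mean()
    return out

def fuse_multifocus(a, b, k=3):
    """Average the low-pass parts; choose-max (by magnitude) the high-pass."""
    la, lb = box_blur(a, k), box_blur(b, k)
    ha = np.asarray(a, dtype=float) - la
    hb = np.asarray(b, dtype=float) - lb
    low = 0.5 * (la + lb)
    high = np.where(np.abs(ha) >= np.abs(hb), ha, hb)
    return low + high
```

The choose-max rule keeps, at each pixel, the high-frequency response of whichever input is more sharply focused there, which is the intuition behind the DWT high-pass merging scheme described above.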

Analysis of Free Parameters and Fusion Performance
Metrics.The proposed method has eight free parameters. In all experiments we fix them to the default values 5, 30, 1/7, 3, 0.001, 0.1, 1.2, and 0.0001, respectively: the number of iterations is set to 5, since a small number of iterations reduces computational time and the fusion performance is not affected for 5 iterations or fewer; the selection criteria for the parameters set to 30, 1/7, and 0.001 are given in [27, 28]; and the remaining values correspond to the detail-enhancement parameter of (18) and the WLS parameters λ, α, and ε. In the present approach, the fusion performance depends mainly on two free parameters: the WLS smoothness weight λ and the detail-enhancement parameter. To analyze their effect on $Q^{AB/F}$ [48] and VIFF [49], four plots (see Figures 9(a)-9(d)) are presented for the input image sequences Cathedral (1024 × 768 × 3), Bellavita (800 × 535 × 3), and Book (569 × 758 × 3). Detailed descriptions of $Q^{AB/F}$ and VIFF are given in the previous subsection.
To analyze the influence of λ and the detail-enhancement parameter on $Q^{AB/F}$ and VIFF, the other parameters are set to 5, 30, 1/7, 0.001, 2, and 0.0001. As shown in Figures 9(a) and 9(b), fusion performance degrades when either parameter is too large or too small; it should be noticed in Figure 9 that $Q^{AB/F}$ and VIFF decrease at both extremes. A visual inspection of the effect of the detail-enhancement parameter on the Cathedral sequence is given in Figures 10(a)-10(c): as the parameter increases, strong edges and textures are enhanced, leading to detail-preserving fusion results. To obtain optimal detail enhancement at low computational cost, we conclude that the best results are obtained with the default values 5, 30, 1/7, 3, 0.001, 0.1, 1.2, and 0.0001, which yield reasonably good results and satisfactory subjective performance in all cases.
To further analyze the errors introduced by the detail-enhancement free parameter, four fundamental error performance metrics are adopted: Root Mean Squared Error (RMSE), Normalized Absolute Error (NAE), Laplacian Mean Squared Error (LMSE), and Peak Signal to Noise Ratio (PSNR). RMSE measures the difference between the resulting image and the reference image. The error at each pixel is calculated as the Euclidean distance between a pixel in a resulting image (obtained with the free parameter greater than 1) and the corresponding pixel in the reference image (obtained with the free parameter equal to 1), and the total error is the square root of the Mean Square Error, that is, $\sqrt{(1/P)\sum_{p} e_p^2}$, where $P$ is the number of pixels. NAE measures how far the resulting fused image is from the reference fused image, with zero being a perfect fit; a large NAE indicates poor image quality [51]. NAE is computed as

$$\mathrm{NAE} = \frac{\sum_{i,j} \left| F(i,j) - F_{\mathrm{ref}}(i,j) \right|}{\sum_{i,j} \left| F_{\mathrm{ref}}(i,j) \right|}.$$

For the Cathedral, Bellavita, and Book image sequences, the errors increase dramatically when the free parameter in (18) becomes too large but increase only slowly when it is at most 6. The graph in Figure 11(d) illustrates what happens when PSNR is used as the distortion measure: PSNR decreases gradually as the free parameter increases, showing that the proposed approach performs consistently across different values of the detail-enhancement parameter.
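The RMSE, NAE, and PSNR measures above can be sketched as follows (function names are ours; PSNR assumes an 8-bit peak of 255):

```python
import numpy as np

def rmse(f, ref):
    """Root Mean Squared Error between fused and reference images."""
    e = np.asarray(f, dtype=float) - np.asarray(ref, dtype=float)
    return float(np.sqrt(np.mean(e ** 2)))

def nae(f, ref):
    """Normalized Absolute Error; 0 indicates a perfect fit."""
    f = np.asarray(f, dtype=float)
    ref = np.asarray(ref, dtype=float)
    return float(np.abs(f - ref).sum() / np.abs(ref).sum())

def psnr(f, ref, peak=255.0):
    """Peak Signal to Noise Ratio in dB."""
    m = np.mean((np.asarray(f, dtype=float) - np.asarray(ref, dtype=float)) ** 2)
    return float('inf') if m == 0 else float(10.0 * np.log10(peak ** 2 / m))
```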
Another interesting experiment is an interactive tool for manipulating the detail and contrast in multifocus image fusion. In Figure 12 we demonstrate that a highly detail-enhanced fused image can be generated from a multifocus image series before objectionable artifacts appear. We found that the present approach is very effective for boosting the amount of local contrast and fine detail. The effective manipulation range is very wide and varies with the texture details present in the input image series: it typically takes a rather extreme manipulation to cause artifacts to appear near strong edges.

Conclusion and Future Scope
Our proposed technique constructs a detail-enhanced fused image from a set of multiexposure images using the WLS optimization framework. When compared with existing techniques that use multiresolution or single resolution analysis for exposure fusion, the present method performs better in terms of enhancement of texture details in the fused image. Our research was motivated by the edge-preserving property of anisotropic diffusion, which has a nonlinear response to fine textures and coarser details. A two-layer decomposition based on anisotropic diffusion is used to extract fine textures for detail enhancement. Furthermore, it is interesting to note that our approach can also be applied to the multifocus image fusion problem. More importantly, the information in the resulting fused image can be controlled with the help of the proposed free parameters. Future work involves improving the method to adaptively choose the parameters of the WLS filter and investigating its use for other kinds of image fusion applications.

Figure 1 :
Figure 1: Illustration of the proposed framework, consisting of three principal blocks. (a) Base layer and detail layer extraction: the input images are decomposed into two scales. (b) Weight map construction and refinement. (c) Weighted fusion of base layers and detail layers: the base layers and detail layers across the input image series are fused using a simple weighted average approach.

Figure 3 :
Figure 3: Lizard: comparison with popular exposure fusion approaches. (a), (b) Source images; (c) results of our new exposure fusion method. Tree leaves and wall texture appear overexposed and blurry in (d) Kotwal and Chaudhuri [16], (e) Raman and Chaudhuri [17], and (f) Mertens et al. [18]. The proposed results preserve texture and strong edge features simultaneously; moreover, the fine textures are accurately enhanced. Input image sequence is courtesy of Erik Reinhard, University of Bristol.

Figure 4 :
Figure 4: National Cathedral: comparison with popular exposure fusion and tone mapping methods.(a) Results of our new exposure fusion method, (b) Mertens et al.[18], (c) iCAM06[19], (d) WLS[20], and (e) GRW[21].Note that our method yields enhanced texture and edge features with better color appearance.Input image sequence is courtesy of Max Lyons.

Figure 5 :
Figure 5: Comparison of proposed results with iCAM06, EF, and WLS on the National Cathedral sequence using DRIVDP [22]. (a), (b) The two source images give good exposures for the paintings on the window glass and lamp and for the wall, respectively. In a distortion map, green, blue, and gray pixels indicate visible contrast loss, amplification, and no distortion, respectively. The proposed results are more effective in preserving local details and colors than the others. For the paintings on the window glass and lamp and for the wall, the proposed results (see (c)-(g)) depict the least distortion, followed by iCAM06 (see (h)-(l)), Mertens et al. EF (see (m)-(q)), and WLS (see (r)-(v)).

Figure 7 :
Figure 7: Multifocus image fusion results: (a), (b), (c), and (d) demonstrate the effect of selective focus to capture more details from a particular part of a scene, and (e), (f), (g), and (h) are images generated by the proposed approach. These results demonstrate that the present approach handles typical foreground and background situations present in the scene (input sequence is courtesy of Slavica Savic).

Figure 8 :
Figure 8: Comparison of Clock image results with recently proposed conventional multifocus image fusion methods. (a) Proposed results, (b) Adu et al. [23], (c) DWT [24], and (d) Tian et al. [25]. The proposed approach lets the viewer observe enhanced texture and edge features simultaneously without visible artifacts. Input image sequence is courtesy of Adu and Wang.

Figure 9 :
Figure 9: Sensitivity analysis of lambda (λ) and the detail-related free parameter, which, respectively, control the smoothness of the weight maps computed for the base layers and the detail layers. (a) Effect of λ on $Q^{AB/F}$, (b) effect of λ on VIFF, (c) effect of the free parameter on $Q^{AB/F}$, and (d) effect of the free parameter on VIFF.

Figure 10 :
Figure 10: Visual inspection of fine detail enhancement in typical lighting situations. The free parameter in (18) controls detail enhancement and sharpening. We have found that varying it between 3 and 6 is sufficient for fine detail extraction and gives better results in various typical situations. Choosing a higher value enhances the texture details accurately, and even a much higher value does not introduce artifacts near object boundaries. (a) Parameter value 1, (b) value 6, and (c) value 12.

Figure 11 :
Figure 11: Sensitivity analysis of the errors introduced by the free parameter in (18). The errors increase dramatically when the parameter becomes too large but increase slowly when it is at most 6. It is observed in (b) and (c) that the error increases with the parameter, though the deviation range is small. (a) Effect on RMSE, (b) effect on LMSE, (c) effect on NAE, and (d) effect on PSNR.