Infrared and Visible Image Fusion with Hybrid Image Filtering

Image fusion is an important technique aiming to generate a composite image from multiple images of the same scene. Infrared and visible images can provide the same scene information from different aspects, which is useful for target recognition. However, existing fusion methods cannot preserve the thermal radiation and appearance information simultaneously. Thus, we propose an infrared and visible image fusion method based on hybrid image filtering. We represent the fusion problem with a divide-and-conquer strategy. A Gaussian filter is used to decompose the source images into base layers and detail layers. An improved co-occurrence filter fuses the detail layers to preserve the thermal radiation of the source images. A guided filter fuses the base layers to retain the background appearance information of the source images. Superposition of the fused base layer and fused detail layer generates the final fused image. Subjective visual and objective quantitative evaluations against other fusion algorithms demonstrate the better performance of the proposed method.


Introduction
Image fusion is an important technique of image enhancement that extracts different salient feature information from numerous images into one fully enhanced image, increasing the amount of information and the utilization of the image. In recent years, image fusion technology has been applied in several areas such as multifocus, medical, remote sensing, infrared, and visible images [1], especially in the merging of infrared and visible images. The infrared image, generated according to the principle of thermal imaging, has high contrast and mainly provides the salient target information of the fused image, while the visible image mainly includes accurate background information. The salient target in the infrared image is important for target recognition, while the background texture in the visible image is the key to environmental analysis and detail judgment. Infrared and visible image fusion provides more comprehensive information, which has important practical significance in military and civilian fields [2].
At present, researchers have proposed a large number of approaches for infrared and visible image fusion [3]. Spatial and transform domain-based approaches are two popular families in this area. The spatial domain-based approaches mainly generate weights according to the characteristics of the original spatial information of the pixels or regions in the source image. These methods are usually simple and fast, but edge blur easily occurs. The transform domain-based methods mainly include two parts: image decomposition and fusion rules. Multiscale decomposition tools decompose the source images into different scale spaces to obtain layers with different feature information. Fusion rules depending on the information characteristics of each layer guide the fusion of the different layers.
Then, the fusion result is obtained by the inverse operation of multiscale decomposition [2]. The transform domain-based methods have become a research hotspot for their good adaptability to the human visual system. Traditional multiscale transform methods have achieved good fusion results, such as the pyramid [4], discrete wavelet [5], contourlet, non-subsampled contourlet [6], and non-subsampled shearlet transforms [7]. However, the large number of transform coefficients complicates parameter optimization and compromises the fusion performance.
In recent years, deep learning has been utilized in the modeling of complicated relationships between data and the extraction of distinctive features [2,3]. The methods based on deep learning, such as convolutional neural networks [8-10], adversarial networks [11-14], and dictionary learning-based sparse representation [15], have achieved better fusion performance. Ma et al. proposed a new end-to-end model, termed the dual-discriminator conditional generative adversarial network, for fusing infrared and visible images of different resolutions [16]. Chen et al. proposed a target-enhanced multiscale transform decomposition model for infrared and visible image fusion to simultaneously enhance the thermal target in infrared images and preserve the texture details in visible images [17]. Xu et al. presented a new unsupervised and unified densely connected network for infrared and visible image fusion [18]. Zhang et al. proposed a fast unified image fusion network for infrared and visible images based on proportional maintenance of gradient and intensity [19].
These methods are good at feature extraction and data reproduction. But the difficulties of ideal image selection and training, learning parameter settings, and the required domain knowledge may compromise the fusion quality [8]. Despite the flexibility and robustness of conventional infrared and visible image fusion approaches, improvements can still be attained in this area. The current study concentrates on these improvements.
As is known, conventional transform domain-based methods suffer from the complexity of parameter optimization and the cost of coefficient processing. Thus, edge-preserving filters, which offer spatial consistency and edge retention, are introduced into image fusion. Edge-preserving filtering is an effective tool for image fusion, which enhances the edge information of the image and reduces artifacts around the edges. Local filter-, global optimization-, and hybrid filter-based techniques are the three main methods in this area [2].
The bilateral filter [20], cross bilateral filter [21], iterative guided filtering [22], guided filtering (GFF) [23], gradient domain-guided filtering (GDGF) [24], and co-occurrence filter (CoF) [25] methods are the popular local filter-based methods. The visual quality of the fused images can be significantly enhanced through these filtering methods. The mean filter removes noise by averaging the values of the neighboring pixels, but this averaging may reduce visual quality. The bilateral filter smooths images and maintains their details, but it can produce gradient reversal artifacts. To overcome these deficiencies, the guided filtering approach presented in [23] can be effectively employed. However, the regions near edges cannot be accurately described by the local linear model in the guided filtering approach, which may cause halos in the fused images that degrade fusion quality. The gradient domain-guided filtering method significantly improves visual quality, but computing its three visual features is time consuming. The CoF method can extract the detail information of the source images and enhance the visual quality of the fused image, but the iterative co-occurrence filtering on base layers for sharpening boundaries between textures costs much time.
The weighted least squares filtering approach [26,27], L1 fidelity with L0 gradient [28], anisotropic diffusion [29], and fourth-order partial differential equation [30] are some well-known global optimization-based methods. Image smoothing is achieved in these methods by solving an optimization problem with various fidelity or regularization terms. Although superior results and considerable enhancement of fusion quality are attained through these approaches, the iteration rules they perform are time consuming. Besides, several parameters such as regularization factors, scale levels, and synthetic weights must be adjusted in these methods, which can be considered another deficiency. Moreover, fusing layers through the L1 fidelity with L0 gradient method [27] by combining several levels with various weights may cause blocking artifacts during the fusion procedure. The algorithms based on hybrid filters smooth the source images with two or more filters, such as the Gaussian filter, bilateral filter [31], and rolling guidance filter [22]. These methods suppress halos and retain details well but lack robust adaptability.
As mentioned above, GFF obtains good results with high computational efficiency but cannot represent the image well near some edges. The local linear model used in the guided image filter improves computational efficiency and gives the guided image filter an advantage in representing the background information of the source image. CoF smooths within textures while preserving the edges between them, which enables the extraction of the texture structure of the source images. Inspired by the advantages of GFF and CoF, a fusion method for infrared and visible images is presented using the guided filter and co-occurrence filter. The advantage of the guided filter in background information extraction and that of the co-occurrence filter in edge structure information extraction are combined to improve the fusion performance for infrared and visible images. The contributions of this study can be summarized in four aspects: (1) a novel infrared and visible image fusion approach using the guided filter and co-occurrence filter is proposed. (2) Guided filtering of the base layers and co-occurrence filtering of the detail layers enhance the fusion quality of the source images. (3) The base and detail layers are fused with saliency maps refined by the guided filter and co-occurrence filter, respectively. (4) The range filter of the normalized co-occurrence matrix is removed to improve the filtering speed.
The remaining parts of the current work are organized as follows. Section 2 provides a detailed description of the guided filter and the fast co-occurrence filter. Section 3 describes the proposed approach. Section 4 presents the experimental results and related discussion. Finally, Section 5 is devoted to conclusions and future work.

Guided Filter.

The output image Y(i) of the guided filter is a local linear transformation of the guided image I(i):

$Y(i) = a_p I(i) + b_p, \quad \forall i \in \omega_p,$

where i denotes the position in the image, ω_p is a local rectangular window of (2r + 1) × (2r + 1) centered at pixel p, r is an input parameter of the GF, and a_p and b_p are obtained by minimizing the difference between the output image Y(i) and the input image X(i):

$E(a_p, b_p) = \sum_{i \in \omega_p} \left[ \left( a_p I(i) + b_p - X(i) \right)^2 + \lambda a_p^2 \right],$

where λ is a regularization parameter given by the user, and the optimal values are obtained by linear ridge regression:

$a_p = \frac{(1/|\omega|) \sum_{i \in \omega_p} I(i) X(i) - \mu_p \bar{X}_p}{\delta_p + \lambda}, \qquad b_p = \bar{X}_p - a_p \mu_p,$

where μ_p and δ_p represent the mean and variance of the guided image I in ω_p, respectively, |ω| is the total number of pixels in ω_p, and X̄_p is the mean of the input image X in ω_p. The filtered output image is

$Y(i) = \bar{a}_i I(i) + \bar{b}_i,$

where ā_i and b̄_i are the averages of a_p and b_p over the different windows covering pixel i. Large windows and regularization parameters can be used to fuse smooth regions containing background information, and the fusion is fast [32].
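As an illustration, the guided filter above can be sketched with box filters in NumPy/SciPy. This is a minimal sketch under our own variable names, not the implementation used in this paper:

```python
import numpy as np
from scipy.ndimage import uniform_filter

def guided_filter(I, X, r, lam):
    """Minimal guided-filter sketch: Y = a*I + b per local window.

    I: guidance image, X: input image (float arrays, e.g. in [0, 1]),
    r: window radius (window size (2r+1)^2), lam: regularization lambda.
    """
    size = 2 * r + 1
    mean = lambda img: uniform_filter(img, size=size)
    mu = mean(I)                      # mean of guidance in each window
    var = mean(I * I) - mu * mu       # variance of guidance
    mean_X = mean(X)
    cov = mean(I * X) - mu * mean_X   # covariance of guidance and input
    a = cov / (var + lam)             # ridge-regression coefficients
    b = mean_X - a * mu
    # average coefficients over all windows covering each pixel
    return mean(a) * I + mean(b)
```

With a large radius and regularization, the output approaches a smoothed version of the input, which matches its use on the base layers below.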

Improved Co-Occurrence Filter.

The co-occurrence filter replaces the value-domain (range) Gaussian filter of the BF with a normalized co-occurrence matrix [33]. Pixel values with high co-occurrence are assigned larger weights and are smoothed, while pixel values with low co-occurrence receive smaller weights and are not smoothed. The CoF is defined as follows:

$Y_p = \frac{\sum_q w(p, q) X_q}{\sum_q w(p, q)},$

where Y_p and X_p are the pixel values of the output and input images, respectively, p and q denote pixel indices, and w(p, q) is the weight contributed by pixel q to the output pixel p.
In a Gaussian filter, w(p, q) can be described as follows:

$w(p, q) = G_{\delta_s}(p, q) = \exp\left( -\frac{d(p, q)^2}{2 \delta_s^2} \right),$

where G_{δ_s}(p, q) denotes a Gaussian filter, d(p, q) denotes the Euclidean distance between pixels p and q, and δ_s is the spatial parameter.
In the BF, w(p, q) is defined as follows:

$w(p, q) = G_{\delta_s}(p, q) \cdot G_{\delta_r}(X_p, X_q),$

where δ_r is the range parameter. CoF replaces the range filter in BF with the normalized co-occurrence matrix to extend BF to handle texture boundaries. The formula is described as follows:

$w(p, q) = G_{\delta_s}(p, q) \cdot M(X_p, X_q), \qquad M(a, b) = \frac{C(a, b)}{h(a) h(b)},$

where M is a 256 × 256 matrix and C(a, b) denotes the co-occurrence count of the values a and b. h(a) and h(b) represent the corresponding frequencies of a and b in the image:

$C(a, b) = \sum_p \sum_q \exp\left( -\frac{d(p, q)^2}{2 \delta^2} \right) [X_p = a][X_q = b],$

where δ is the Gaussian filter parameter (√15 by default) and [·] is a logical operation that equals 1 if the item inside is true and 0 otherwise.
CoF gathers co-occurrence data from the image and filters out noise while maintaining sharp boundaries between different textures. However, the co-occurrence values are calculated over local windows combined with the Gaussian filter, which slows the filtering. The time complexity of the co-occurrence matrix in the original co-occurrence filter is O(n × r²), where n denotes the number of pixels and r is the size of the local window.
To speed up the filtering procedure, we remove the range filter of the normalized co-occurrence matrix and globally count the co-occurrences of the image. The C(a, b) of the improved co-occurrence filter (ICoF) is calculated as follows:

$C(a, b) = \sum_p \sum_{q \in N(p)} [X_p = a][X_q = b],$

where N(p) denotes the pixels within the statistical interval of p. The interval of the statistical pixel pairs is determined experimentally to be 6. The simplified co-occurrence matrix has time complexity O(n), which improves the filtering speed. The comparison of the filtering results of CoF and ICoF is shown in Figure 1.
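A hedged sketch of such a global counting step is given below. The exact offset set and normalization used by ICoF are not spelled out above, so the horizontal/vertical pair enumeration and the CoF-style frequency normalization are assumptions of this sketch:

```python
import numpy as np

def global_cooccurrence(img, levels=256, interval=6):
    """Global co-occurrence counting sketch (no per-window Gaussian
    weighting), giving O(n) cost per offset.

    img: integer-valued image with values in [0, levels).
    interval: maximum pixel-pair offset (6 in the paper's experiments).
    Pair enumeration and normalization are our assumptions.
    """
    C = np.zeros((levels, levels), dtype=np.float64)
    for d in range(1, interval + 1):
        a, b = img[:, :-d].ravel(), img[:, d:].ravel()   # horizontal pairs
        np.add.at(C, (a, b), 1.0)
        a, b = img[:-d, :].ravel(), img[d:, :].ravel()   # vertical pairs
        np.add.at(C, (a, b), 1.0)
    C = C + C.T                                          # symmetric counts
    h = np.bincount(img.ravel(), minlength=levels).astype(np.float64)
    # CoF-style normalization: M(a, b) = C(a, b) / (h(a) h(b))
    denom = np.outer(h, h)
    return np.divide(C, denom, out=np.zeros_like(C), where=denom > 0)
```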
Due to the removal of the range filter, ICoF produces spots in textured regions, but it still maintains edges. To further verify the edge retention capability of ICoF, a normalized image is input into the two filters, and the results are presented in Figure 2.

Mathematical Problems in Engineering
As shown in Figure 2, CoF smooths the target person's head, leaves, grass edges, etc., while ICoF retains the edges more strictly. For the experimental comparison of the filtering speeds of CoF and ICoF, the time consumption of ten groups of experiments is averaged to obtain the final time consumption. The experimental image size is 270 × 360, and the results are given in Table 1. It is evident from the table that the filtering time of ICoF is about 50% less than that of CoF.
As mentioned above, the guided filter is good at background information extraction, and ICoF is good at filtering out noise while maintaining sharp boundaries between different textures. Thus, we make the most of their advantages to construct a novel fusion method that improves the fusion performance for infrared and visible images.

Proposed Fusion Method
We propose a hybrid filtering-based fusion approach for infrared and visible images. The proposed fusion framework is described in Figure 3. We use the Gaussian filter to perform a two-scale decomposition of the source images. The detail and base layers are fused by weighted averaging with the improved co-occurrence filter and the guided filter, respectively, and the fusion result is finally obtained by superposition.

Two-Scale Decomposition.
The mean filter and the Gaussian filter are usually used for two-scale decomposition. The Gaussian decomposition is sharper, and the edge information obtained by Gaussian filtering is more significant. Thus, the Gaussian filter is applied in our fusion framework to obtain a base layer containing a large amount of background information. Assume that I_i represents the registered input image, where i = 1, 2. The base layer B_i of the two source images is calculated as follows:

$B_i = G * I_i,$

where G is the Gaussian filter with standard deviation 5 and an 11 × 11 filter window. The detail layer D_i including the edge structure is obtained by subtracting the base layer B_i containing the background information from the source image I_i:

$D_i = I_i - B_i.$
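The two-scale decomposition is straightforward to sketch. Here the 11 × 11 window is approximated via the truncate argument of SciPy's Gaussian filter (radius 5 at σ = 5); the function name is ours:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def two_scale_decompose(I, sigma=5.0, truncate=1.0):
    """Split a source image into base and detail layers with a Gaussian
    filter. truncate=1.0 at sigma=5 yields a radius-5 (11x11) kernel,
    matching the window size used in the paper.
    """
    base = gaussian_filter(I.astype(np.float64), sigma=sigma,
                           truncate=truncate)   # B_i = G * I_i
    detail = I - base                           # D_i = I_i - B_i
    return base, detail
```

By construction, summing the two layers recovers the source image exactly, which is what makes the final superposition step valid.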

Fusion of Base Layers.
It is easy to see that the background information of the base layer is smooth. The Laplacian filter is applied to extract the saliency of the layer to obtain a contrast map. Then, a Gaussian low-pass filter is applied to remove high-frequency noise and produce the final saliency map BS_i. The multiple filtering is performed as follows:

$BS_i = G * \left| \mathrm{Lap} * B_i \right|,$

where Lap represents the Laplacian filter with operator [0 1 0; 1 −4 1; 0 1 0], and G denotes the Gaussian filter with an 11 × 11 window and standard deviation 5. The initial binary weight maps BP_i^k of the base layers are obtained by comparing the saliency maps of the base layers; pixels with high saliency receive high weight:

$BP_i^k = \begin{cases} 1, & BS_i^k = \max\left(BS_1^k, BS_2^k\right), \\ 0, & \text{otherwise}, \end{cases}$

where k represents a pixel, and the binary weight map is obtained by pixelwise comparison. The binary weight maps BP_i^k applied directly to base layer fusion may produce artifacts and blurring because of spatial inconsistency. As is known, the guided filter preserves the edges between textures well while smoothing the surrounding pixel values. Weight maps optimized with the guided filter preserve more texture information while maintaining spatial consistency. The optimized weight maps are defined as follows:

$W_i^B = \mathrm{GF}\left(BP_i, B_i\right),$

where W_i^B represents the final base layer weight map, the input of the guided filter is BP_i, and the guidance image is the original base layer B_i. The base layer weight maps are merged with the base layers to obtain the fused base layer F_B:

$F_B = \sum_{i=1}^{2} W_i^B B_i.$
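The base-layer fusion rule above can be sketched as follows. For brevity, the guided-filter refinement of the binary map is replaced here by a Gaussian smoothing stand-in, so this is an approximation of the rule rather than the exact pipeline, and all names are ours:

```python
import numpy as np
from scipy.ndimage import convolve, gaussian_filter

# Laplacian operator from the paper: [0 1 0; 1 -4 1; 0 1 0]
LAP = np.array([[0, 1, 0], [1, -4, 1], [0, 1, 0]], dtype=np.float64)

def fuse_base_layers(B1, B2, sigma=5.0):
    """Sketch of base-layer fusion: Laplacian contrast, Gaussian smoothing
    into saliency maps BS_i, a pixelwise binary weight map, and a smoothed
    final weight. Gaussian smoothing stands in for the guided-filter
    refinement used in the paper.
    """
    BS = [gaussian_filter(np.abs(convolve(B, LAP)), sigma) for B in (B1, B2)]
    BP1 = (BS[0] >= BS[1]).astype(np.float64)    # binary weight map BP_1
    W1 = np.clip(gaussian_filter(BP1, sigma), 0.0, 1.0)  # refined weight
    return W1 * B1 + (1.0 - W1) * B2             # fused base layer F_B
```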

Fusion of Detail Layers.
The detail layer mainly includes the edge structure information of the source images. Five classical saliency detection algorithms, spectral residual (SR) [34], frequency-tuned (FT) [35], maximum symmetric surround (MSS) [36], median mean difference (MMD) [37], and visual weight map (VWM) [22], are compared to find a better measurement of the saliency of the edge structure in the detail layers. The results are shown in Figure 4. It is evident that FT extracts the character and leaves well but misses the road and the grass. MMD improves the saliency of the road, but the contrast of the grass is insufficient. SR detects many salient positions, but they are not very continuous. The saliency of VWM is obvious, but some information is still lost. The saliency of MSS is obvious in representing the roads and grass, and it is more comprehensive. Thus, we use MSS as the saliency extraction method for the detail layer, which contains many edge structures.
MSS is defined as follows:

$S(i, j) = \left\| X_\mu(i, j) - X_f(i, j) \right\|,$

where S(i, j) is the output saliency value at position (i, j), and X_f(i, j) is the corresponding CIELAB pixel vector of the input image after the Gaussian filter. The value range of the Gaussian-filtered gray image in this paper is adjusted to [0, 100] to obtain the brightness value as X_f(i, j). ‖·‖ indicates the L2 norm. X_μ(i, j) is defined as follows:

$X_\mu(i, j) = \frac{1}{H} \sum_{m = i - i_0}^{i + i_0} \; \sum_{n = j - j_0}^{j + j_0} X(m, n),$

where i_0 and j_0 represent the offsets of the symmetric surround subwindow, which are defined as follows:

$i_0 = \min(i, r - i), \qquad j_0 = \min(j, c - j),$

where r and c represent the width and height of the input image, and H = (2 × i_0 + 1)(2 × j_0 + 1) is the number of pixels in the subwindow. The initial weight map DP_i^k is obtained by comparing the obtained saliency maps. It is defined as follows:

$DP_i^k = \begin{cases} 1, & S_i^k = \max\left(S_1^k, S_2^k\right), \\ 0, & \text{otherwise}. \end{cases}$

Binary weight maps applied directly to the fusion of detail layers may produce artifacts and blurring. Considering the advantage of CoF in texture feature extraction and the visual features of the detail layer, we preserve the inter-texture edges by smoothing the binary weight maps with the co-occurrence filter. To improve the speed, we optimize CoF by removing the range filter of the normalized co-occurrence matrix. The ICoF weight maps are defined as follows:

$W_i^D = \mathrm{ICoF}\left(DP_i\right),$

where δ_s denotes the standard deviation of the spatial Gaussian filter in ICoF. The parameter settings are consistent with those of the original co-occurrence filter, and δ_s² = 2√15 + 1. The obtained weight maps guide the fusion of the detail layers to obtain the fused detail layer F_D:

$F_D = \sum_{i=1}^{2} W_i^D D_i.$

Two-Scale Image Reconstruction.

Since the detail layer and the base layer are obtained by subtraction, the reconstruction is obtained by merging the fusion results of the detail and base layers:

$F = F_B + F_D.$
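The MSS measure can be sketched for a single grayscale channel as follows. The paper operates on CIELAB vectors; a single luminance channel and an integral image for O(1) window means are simplifications of this sketch:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def mss_saliency(img, sigma=1.0):
    """Grayscale sketch of maximum-symmetric-surround saliency: each pixel
    is compared with the mean of the largest window centred on it that
    stays inside the image.
    """
    I = img.astype(np.float64)
    rows, cols = I.shape
    # integral image: S[r, c] = sum of I[:r, :c]
    S = np.pad(I, ((1, 0), (1, 0))).cumsum(0).cumsum(1)
    Xf = gaussian_filter(I, sigma)          # Gaussian-smoothed image X_f
    out = np.empty_like(I)
    for i in range(rows):
        for j in range(cols):
            i0 = min(i, rows - 1 - i)       # symmetric vertical offset
            j0 = min(j, cols - 1 - j)       # symmetric horizontal offset
            r1, r2 = i - i0, i + i0 + 1
            c1, c2 = j - j0, j + j0 + 1
            area = (r2 - r1) * (c2 - c1)    # H = (2*i0+1)(2*j0+1)
            mean = (S[r2, c2] - S[r1, c2] - S[r2, c1] + S[r1, c1]) / area
            out[i, j] = abs(mean - Xf[i, j])   # |X_mu - X_f|
    return out
```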

Experimental Results and Discussion
To evaluate the performance of the proposed method, classical and recently proposed fusion algorithms are used for comparison. The experimental results are assessed with both subjective and objective evaluation.

Parameter Setting.
The parameters of the fast co-occurrence filter were introduced in the previous section. For the two parameters r and ε of the guided filter, multiple groups of experiments combined with subjective and objective evaluation show that the values of the two parameters are positively correlated with the size of the fused image. This paper introduces a window factor t that divides the sum of image rows and columns to obtain the final window size r:

$r = \left\lceil \frac{m + n}{t} \right\rceil,$

where m and n represent the rows and columns of the image, and the value is rounded up to an integer. It is known that not all objective evaluation indicators can effectively reflect the fusion effect. To avoid conflicts between different evaluation indicators when determining the value of the window factor, we use Q^{AB/F} as the objective feedback value of the fusion results for different t. The final value of r is determined based on the trend of Q^{AB/F} of the fusion results along with different t. The experimental results are shown in Figure 6. It can be seen from Figure 6 that Q^{AB/F} reaches a peak when t is set to 34. Thus, the window factor is determined to be 34. ε is obtained by dividing r by a constant g, that is, ε = r/g. According to the parameters of the guided filter for fusing base layers in [19], g = 150.
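The adaptive parameter rule can be written in a few lines (the function name is ours):

```python
import math

def guided_filter_params(m, n, t=34, g=150):
    """Adaptive guided-filter parameters: window size r is the rounded-up
    sum of image rows and columns divided by the window factor t, and the
    regularization eps is r/g. Defaults t=34 and g=150 follow the paper.
    """
    r = math.ceil((m + n) / t)   # r = ceil((m + n) / t)
    eps = r / g                  # eps = r / g
    return r, eps
```

For the 270 × 360 test images used here, this rule gives r = 19 and ε ≈ 0.127.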

Evaluation Metrics.
To verify the performance of the presented algorithm against existing approaches, we apply six evaluation metrics to the objective fusion assessment: entropy (EN), standard deviation (SD), multiscale structural similarity (MS-SSIM) [44], normalized feature mutual information (NFMI) [45], the edge retention fusion quality indicator Q^{AB/F} [46], and the visual information fidelity degree VIFF [47]. These metrics describe the fusion results from different perspectives; larger values indicate that more source image information is retained and better fusion performance is achieved [48].
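Two of these metrics, EN and SD, are simple to compute directly. A sketch for 8-bit grayscale images (helper name is ours):

```python
import numpy as np

def entropy_and_sd(img, levels=256):
    """EN: Shannon entropy of the grey-level histogram (bits).
    SD: standard deviation of pixel values, a proxy for contrast.
    img: integer-valued image with values in [0, levels).
    """
    hist = np.bincount(img.ravel(), minlength=levels).astype(np.float64)
    p = hist / hist.sum()            # grey-level probabilities
    p = p[p > 0]                     # drop empty bins (0*log 0 := 0)
    en = -(p * np.log2(p)).sum()     # EN = -sum p log2 p
    sd = img.astype(np.float64).std()
    return en, sd
```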

Qualitative Assessments.
Using qualitative analysis, Figures 7-12 show the fusion results of different algorithms for the infrared and visible images of Figure 5. The GFF algorithm can enhance the visual quality of the fused image by employing appropriate weights on the corresponding regions. Although additional structure and detail can be transferred from the guidance image to the fused one via the GFF method, the regions around certain edges cannot be described appropriately by the local linear model in this method. Because no explicit edge-aware constraint treats the edges, it may generate halos in the fused images. A few blurring artifacts can be observed in the GFF fusion, such as the lost grass detail (boxes II and III) in Figure 7(a), the blurred branches (box I) in Figure 8(a), the reduced brightness of the trunk portion (box II) of the tree in Figure 9(a), the reduced brightness of the tent (box II) in Figure 10(a), the lost cloud information of the sky background (box I) in Figure 11(a), and the blurred street lamp (box I) and blurred building (box II) in Figure 12(a). CP decomposes the source image into decomposition layers with various resolutions and spatial frequencies. CP achieves a good visual effect but lacks adaptability. The CP fusion results show some blurring, such as the low contrast in Figure 7(b), the blurred branches (box I) in Figure 8(b), the black block appearing in the lower right corner (box III) in Figure 9(b), the low brightness of the jeep and soldiers in Figure 11(b), and the blurred street lamp (box I) and blurred legs of the crowd (box II) in Figure 12(b). TSSD uses mean and median filters for visual saliency extraction. The weight map obtained from the saliency map guides the fusion of the base and detail layers, which greatly improves the transfer of complementary information from the source images.
But the limited ability of the mean and median filters in saliency extraction compromises the fusion performance, such as the lost surrounding fence and grass information (boxes II and III) in Figure 7(c), the blurred surrounding grass (box II) in Figure 8(c), the blurred trunk portion (box II) of the tree in Figure 9(c), the low brightness of the tent (box II) in Figure 10(c), the low brightness of the jeep and soldiers (box II) in Figure 11(c), and the low brightness of the crowd (box II) and street lamp (box I) in Figure 12(c). The fusion process is converted into an ℓ1-TV minimization problem by GTTV, in which the fundamental intensity distribution of the infrared image and the gradient changes of the visible image are preserved through the data fidelity and regularization terms, respectively. The thermal radiation and the appearance information in the infrared and visible images can thus be preserved, although adaptability is lacking. Blur appears, such as the blurred branches (box I) in Figure 8(d), the blurred trunk (box II) of the tree in Figure 9(d), the lost information of the tent in Figure 10(d), the blurred jeep and soldiers (box II) in Figure 11(d), and the blurred crowd (box II) in Figure 12(d).
CSR uses convolutional sparse representation to address detail preservation and strong sensitivity to misregistration in image fusion. It obtains good fusion performance for infrared and visible images. But the regularization parameters in the CSR model and the spatial size of the dictionary filters lack self-adaptability, which compromises the fusion quality, such as the brightness of the target person (box I) in Figure 7(e), the dark background in Figure 8(e), the low brightness of the trunk portion (box II) of the tree in Figure 9(e), the low brightness of the jeep and soldiers (box II) in Figure 11(e), and the low brightness of the street lamp (box I) in Figure 12(e).
WLS employs the RGF and Gaussian filter to decompose the input images into base and detail layers, and it constructs weight maps by choosing various features of the IR and visible images. Effective visual information can be appropriately transferred into the fused image through WLS while reducing the noise from the IR image. But the lack of parameter flexibility compromises the fusion quality, such as the low brightness of the target person (box I) in Figure 7(f), the dark background in Figure 8(f), the blurred trunk portion (box II) of the tree in Figure 9(f), the low-contrast vehicle and soldiers (box II) in Figure 11(f), and the low-contrast street lamp (box I) and crowd (box II) in Figure 12(f).
Three visual property measurements are utilized by MVFMF to generate decision maps that are optimized through gradient domain-guided filtering, providing a visual description of the detailed data of the original images. Although the fusion quality is significantly enhanced by the MVFMF decision map construction model, the feasibility of constant parameters for all source images is not satisfactory in some instances. Slight blurs and artifacts can be observed in the fused images, such as the lost grass information (boxes II and III) in Figure 7(g), the blurred branches (box I) in Figure 8(g), the low brightness of the trunk portion (box II) of the tree in Figure 9(g), the lost information of the tent (box II) in Figure 10(g), the lost cloud information of the sky background (box I) in Figure 11(g), and the lost building information around the street lamp (box I) in Figure 12(g). CBF extracts the detail information by using joint bilateral filtering and avoids the gradient reversal artifacts of the bilateral filter. It uses a weighted average of pixel values to fuse the source images and improves the visual quality of the fused image. However, the weight construction may compromise the contrast of the fused image. Slight blurs and corresponding artifacts appear in the fused images, such as the blurred target person (box I) in Figure 7(h), the blurred edge of the tree (box I) and the blurred figure of the man in Figure 8(h), the blurred trunk portion (box II) of the tree in Figure 9(h), the vehicle and soldiers (box II) in Figure 11(h), and the street lamp (box I) and crowd (box II) in Figure 12(h).
TEMTD is a novel infrared and visible image fusion method based on target-enhanced MST decomposition. It addresses the problem of preserving thermal radiation features in traditional MST-based methods. It can simultaneously maintain the thermal radiation characteristics of the infrared image and the texture details of the visible image through a specific fusion rule design. It uses the decomposed infrared low-frequency information to determine the fusion weight of the low-frequency bands and to highlight the target. The common "max-absolute" fusion rule is applied for the high-frequency bands, which may compromise the fusion performance, such as the blurred bush (boxes II and III) in Figure 7(i), the blurred branches (box II) in Figure 9(i), the blurred bush at the bottom left of Figure 10(i), the blurred sky in Figure 11(i), and the street lamp in Figure 12(i).
LLRR is a novel fusion framework for fusing infrared and visible images. A projection matrix L learned by latent low-rank representation is applied to extract the detail parts and base parts of the input images at several representation levels, thereby extracting multilevel salient features.
The final fused image is reconstructed by adaptive fusion strategies designed specifically for the detail parts and the base parts, respectively. The LLRR framework provides an efficient decomposition approach for extracting multilevel features from an arbitrary number of input images. The adaptive fusion rules improve the ability in texture and structural information extraction, although some blur remains, such as the bush in Figure 7(j). The boundary detection and edge preservation features of the co-occurrence filter are utilized by CoF for weight optimization, which improves the fusion quality of the base layers and detail layers. However, using the same fusion rule for the base and detail layers compromises the fusion performance, such as the lost grass information (boxes II and III) in Figure 7(k), the blurred grass (box II) in Figure 8(k), the blurred details of the branches (box I) in Figure 9(k), the blurred grass at the lower right corner (box II) in Figure 10(k), the lost cloud information of the sky background in Figure 11(k), and the lost information of the street lamp in Figure 12(k). We perform different image filtering for saliency extraction based on the visual features of the base layers and detail layers. By and large, compared with the other algorithms, the proposed algorithm not only retains the desired salient data of the infrared image but also preserves the background data of the visible image.

Quantitative Assessments.
In the current subsection, the six quality measurement indices discussed in the previous subsection are applied to the test images. The obtained results are presented in Tables 2-6, which demonstrate that the metric values of the presented approach are higher than the corresponding ones obtained with existing approaches; thus more data is preserved from the original images, and superior fusion quality is obtained. As is known, Q^{AB/F} measures the edge data of the fused image retained from the source images. NFMI measures the feature information retained by the fused image from the source ones based on mutual information. MS-SSIM is based on structural similarity and measures the retention of the structural data of the source images in the fused image. EN indicates the amount of information contained in the fused image as a whole. SD reflects the contrast of the image through the deviation of each pixel from the average pixel value. VIFF measures the fidelity of the effective visual information of each region.
The histogram of the average metric values (Q^{AB/F}, NFMI, VIFF, and MS-SSIM) of the fusion approaches is presented in Figure 13. It is easy to see that the average Q^{AB/F} values of GFF, MVFMF, and GF-ICoF are obviously higher than those of the other methods, which demonstrates that GFF, MVFMF, and GF-ICoF preserve the edge data from the original images well. The average NFMI values of GFF, CSR, LLRR, MVFMF, CoF, and GF-ICoF are obviously higher than those of the other fusion methods, demonstrating that more significant feature data is transferred from the source images to the fused one through these methods. Moreover, the mean VIFF and MS-SSIM values of TSSD and GF-ICoF are obviously greater than those of the other fusion approaches, which shows that TSSD and GF-ICoF transfer more significant structure data from the original images.
The histogram of the mean metric values (EN, SD) for the fusion approaches is depicted in Figure 14. The average SD values of MVFMF, LLRR, CoF, and GF-ICoF are clearly greater than the corresponding values of the other fusion approaches, which demonstrates that the fusion results of MVFMF, LLRR, CoF, and GF-ICoF have higher contrast and better visual quality. The proposed method sufficiently retains the edge and structural data of the source images, and the fused image obtained with the presented approach has the best visual quality in terms of contrast. Overall, the proposed approach is preferable to the existing algorithms according to the different evaluation metrics.

The time complexity of each algorithm is estimated from the average running time over 10 runs on images of size 270 × 360. The results are shown in Table 6. The time consumption of CSR is the highest, since its parameter training, dictionary learning, and sparse representation cost much time. In LLRR, the extraction of the detail and base parts of the source images is implemented by learning with latent low-rank representation, which also results in high time consumption. The algorithm models of CP, TSSD, and GFF are simple and consume little time. The time consumption of WLS, CBF, TDMTD, and MVFMF is slightly higher than that of the previously mentioned methods. The time cost of the proposed method is similar to that of the GTTV algorithm. The CoF fusion algorithm consumes somewhat more time because it fuses the base layer iteratively, which further increases the fusion time; compared with the CoF fusion algorithm, the time efficiency of this work is improved by about 90%. From the subjective and objective results, we conclude that the proposed algorithm is an effective infrared and visible image fusion algorithm.
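The per-algorithm timings are averages over 10 runs. A minimal harness for this kind of measurement might look like the following; this is a generic sketch with a placeholder pixel-wise average standing in for a fusion algorithm, not the authors' benchmarking code:

```python
import time
import numpy as np

def average_runtime(fn, *args, runs=10):
    """Mean wall-clock time of fn(*args) over `runs` calls, in seconds."""
    total = 0.0
    for _ in range(runs):
        t0 = time.perf_counter()   # monotonic high-resolution timer
        fn(*args)
        total += time.perf_counter() - t0
    return total / runs

# Stand-in "fusion" on a 270 x 360 image pair: a pixel-wise average
# (a placeholder, not any of the compared algorithms).
ir = np.zeros((270, 360))
vis = np.ones((270, 360))
t = average_runtime(lambda a, b: (a + b) / 2.0, ir, vis)
```

Averaging over repeated runs smooths out scheduling jitter, which matters when the per-image runtimes being compared differ by only tens of milliseconds.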

Conclusions
As conventional infrared and visible image fusion methods suffer from low contrast and loss of background texture, a novel fusion approach is presented using guided and improved co-occurrence filters. The strength of the guided filter in extracting background information and the strength of the co-occurrence filter in extracting edge structure information are combined to improve the fusion performance for infrared and visible images. The co-occurrence filter is improved by removing the range filter and globally synthesizing the co-occurrence information, which halves its filtering time while still preserving cross-texture edges. The qualitative assessments demonstrate that the fusion results of the proposed method retain the thermal radiation and appearance data of the infrared and visible images, respectively. The quantitative comparisons on seven metrics with recent fusion approaches indicate that the proposed approach transfers more significant edge and structural data from the source images to the fused image. Future work will further improve the speed of the proposed method and apply it to other image fusion applications such as medical and remote sensing imaging.
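The overall two-scale structure of the method (Gaussian base/detail decomposition, per-layer fusion, superposition) can be sketched as below. Note that the per-layer rules here are deliberately simple placeholders, averaging for the base layers and maximum absolute value for the detail layers, standing in for the guided-filter and improved co-occurrence-filter fusion of the actual method:

```python
import numpy as np

def gaussian_blur(img, sigma=2.0):
    """Separable Gaussian filtering used for the base/detail split."""
    r = int(3 * sigma)
    x = np.arange(-r, r + 1)
    k = np.exp(-x**2 / (2 * sigma**2))
    k /= k.sum()
    tmp = np.apply_along_axis(np.convolve, 0, img, k, mode="same")
    return np.apply_along_axis(np.convolve, 1, tmp, k, mode="same")

def fuse(ir, vis, sigma=2.0):
    """Two-scale fusion skeleton: decompose, fuse per layer, superpose."""
    base_ir, base_vis = gaussian_blur(ir, sigma), gaussian_blur(vis, sigma)
    det_ir, det_vis = ir - base_ir, vis - base_vis
    fused_base = (base_ir + base_vis) / 2.0  # placeholder for guided-filter fusion
    # placeholder for improved co-occurrence-filter fusion of detail layers
    fused_det = np.where(np.abs(det_ir) >= np.abs(det_vis), det_ir, det_vis)
    return fused_base + fused_det            # superposition of the two layers
```

The skeleton makes the divide-and-conquer structure explicit: the final image is exactly the sum of one fused base layer and one fused detail layer, so improving either per-layer rule improves the result independently of the other.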
Data Availability
The two image datasets used to support the findings of this study are included within the open data collections at http://www.imagefusion.org/ and https://figshare.com/articles/TNO_Image_Fusion_Dataset/1008.

Conflicts of Interest
The authors declare that they have no conflicts of interest.