Airborne Infrared and Visible Image Fusion for Target Perception Based on Target Region Segmentation and Discrete Wavelet Transform

Infrared and visible image fusion is an important precondition for realizing target perception by unmanned aerial vehicles (UAVs), so that a UAV can perform its various given missions. Texture and color information in visible images is abundant, while target information in infrared images is more prominent. Conventional fusion methods are mostly based on generic region segmentation; as a result, a fused image suitable for target recognition cannot actually be acquired. In this paper, a novel fusion method for airborne infrared and visible images based on target region segmentation and the discrete wavelet transform (DWT) is proposed, which gains more target information while preserving more background information. Fusion experiments are conducted under three conditions: the target is unmoving and observable in both the visible and infrared images, the targets are moving and observable in both images, and the target is observable only in the infrared image. Experimental results show that the proposed method generates a better fused image for airborne target perception.


Introduction
Unmanned aerial vehicles (UAVs) are aircraft that have the capability of flight without an onboard pilot. A UAV can be remotely controlled, semiautonomous, autonomous, or a combination of these. UAVs can execute given missions in a wide range of domains [1]. In order to complete these various missions, a UAV first needs to be equipped with sensor payloads to acquire images of the mission area and realize environment perception. The sensor payloads include infrared sensors and visible-light sensors. Fusion of infrared and visible images of the same scene is the basis of target detection and recognition for UAVs.
However, the images captured by airborne sensors are dynamic, which adds difficulty to visible and infrared image fusion. In order to acquire situation assessment, it is very important to extract the target information. In fact, texture and color information in visible images is very abundant, while the target information, especially for an artificial target, is more prominent in infrared images. Accordingly, we can segment the images based on target regions, which utilizes the target information more effectively.
Image fusion can be performed at four levels of information representation: signal, pixel, feature, and symbolic levels [2]. An infrared and visible image fusion method based on region segmentation was proposed in [3, 4], which adopted the nonsubsampled contourlet transform (NSCT) to fuse the given regions and optimize the quality of the fused image; however, that method did not capture the target information effectively. Yin et al. proposed an infrared and visible image fusion method based on color contrast enhancement, which is propitious to target detection and improves observer performance [5]. Hong et al. presented a fusion framework for infrared and visible images based on data assimilation and a genetic algorithm (GA) [6]. Shao et al. introduced the fast discrete curvelet transform (FDCT) and focus measure operators to realize the fusion of infrared and visible images [7]. Other fusion methods for infrared and color visible images have been compared in [8-10], in which the fusion method based on the discrete wavelet transform (DWT) performed well. Though NSCT, FDCT, and other novel transforms are superior to DWT in some respects, the transform itself should not be the key and core problem for infrared and visible image fusion [11]. In this paper, we use DWT only as a means to research fusion concerning target detection and perception, since DWT has lower computational complexity [12]. Yao et al. proposed a new approach to airborne visible and infrared image fusion [13, 14] and researched target fusion detection [15], using the relevant information in different frames to meet the demand for real-time fusion.
Dynamic image fusion has its own characteristics, which require that the fusion method be consistent and robust in both time and space [16-18]. In order to utilize the different region features and obtain more effective target and background information, a method of visible and infrared image fusion in the DWT domain based on dynamic target detection and target region segmentation is proposed. First, image segmentation is performed based on the detected candidate target regions, so that the information between frames can be used to attain stability and consistency over time. Finally, different fusion rules are designed according to the characteristics of the target regions to complete the visible and infrared image fusion.

Image Segmentation Based on Target Regions
According to the accuracy and speed requirements of airborne image processing, the moving target in airborne images can be detected using the frame difference method based on background motion compensation [19]. The algorithm flow of target region detection is shown in Figure 1.

Motion Information Extraction Based on Frame Difference
On the basis of motion compensation for the image background, target detection can be done by applying the frame difference method to the image sequence. Regions whose pixel values remain constant can be regarded as background regions, while regions whose pixel values change between frames are moving target regions, which carry the motion information of the targets. Using the inverse transform parameters, we make motion compensation for frame n - 1 and compute the difference between the former frame and the current one:

D_n(x, y) = |I_n(x, y) - I'_{n-1}(x, y)|,

where D_n(x, y) denotes the frame difference at point (x, y), I_n(x, y) denotes the pixel value of frame n at point (x, y), and I'_{n-1}(x, y) is the motion-compensated frame n - 1. The change of pixel values in unmoving regions should be zero; however, because of random noise, luminance change, and weather change, small differences remain that resemble salt-and-pepper noise. In order to extract the moving target regions, we need to select a proper threshold to segment the difference image. Thus, we can acquire the moving target regions and the relevant motion information. This method is easy to realize and robust to illumination, which makes it a proper method for airborne image processing.
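
As a minimal sketch, the thresholded frame difference described above can be written in a few lines of NumPy; the function name and the threshold value here are illustrative choices, and real airborne imagery would need the threshold tuned to the noise level after motion compensation.

```python
import numpy as np

def frame_difference_mask(prev_frame, curr_frame, threshold=25):
    """Threshold the absolute difference between two motion-compensated
    frames; pixels above the threshold are candidate moving-target pixels."""
    diff = np.abs(curr_frame.astype(np.int16) - prev_frame.astype(np.int16))
    return diff > threshold

# Toy example: a 5x5 static background with one changed pixel.
prev = np.zeros((5, 5), dtype=np.uint8)
curr = prev.copy()
curr[2, 3] = 200                      # the "target" moved here
mask = frame_difference_mask(prev, curr)
print(mask.sum())                     # -> 1 (only the changed pixel survives)
```

A morphological opening or the clustering step of the next section would then remove any isolated salt-and-pepper responses.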

Target Clustering and Image Segmentation
After obtaining the motion information from neighboring frames, the individual target regions have not yet been differentiated, and some unpredicted noise may appear in the target regions. In order to distinguish the different moving regions and filter noisy points in the target regions, a target clustering algorithm is proposed, as shown in Figure 2.
The criterion for judging whether a point belongs to a certain cluster is distance. According to prior knowledge, the expected distances among different clusters can be established, and a threshold can be set beyond which a new cluster is created. Once each target is distinguished, the number and the range of points in each cluster are computed. Clusters whose point count is less than a certain number can be regarded as noisy or false points and eliminated. For the confirmed target regions, the region size and boundary are computed, and each target region is marked out in the source images.
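
The clustering step above can be sketched as a greedy distance-based pass over the detected moving pixels; the function name, distance threshold, and minimum cluster size below are illustrative assumptions, not the paper's exact parameters.

```python
import numpy as np

def cluster_points(points, dist_thresh=10.0, min_points=3):
    """Greedy distance-based clustering of moving pixels.
    A point joins the nearest existing cluster whose centroid lies within
    dist_thresh; otherwise it starts a new cluster. Clusters with fewer
    than min_points points are discarded as noise, and the bounding box
    (x_min, y_min, x_max, y_max) of each survivor is returned."""
    clusters = []                                 # each cluster: list of (x, y)
    for p in points:
        best, best_d = None, dist_thresh
        for c in clusters:
            d = np.linalg.norm(np.asarray(p) - np.mean(c, axis=0))
            if d < best_d:
                best, best_d = c, d
        if best is None:
            clusters.append([p])
        else:
            best.append(p)
    boxes = []
    for c in clusters:
        if len(c) >= min_points:                  # filter noisy/false points
            arr = np.array(c)
            boxes.append((arr[:, 0].min(), arr[:, 1].min(),
                          arr[:, 0].max(), arr[:, 1].max()))
    return boxes

# Two well-separated blobs plus one isolated noise point.
boxes = cluster_points([(0, 0), (1, 0), (0, 1), (1, 1),
                        (50, 50), (51, 50), (50, 51),
                        (100, 100)])              # last point is noise
print(len(boxes))                                 # -> 2 confirmed regions
```

The returned boxes mark the target regions R in the source images; everything outside them is the background region B.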
Suppose the visible image is E and the infrared image is I. Each image is divided into at least two parts: the target region R and the background region B, as shown in Figure 3. Several target regions may exist in the source images. For the different regions, different fusion strategies and methods are adopted to gain a better fusion effect.

Visible and Infrared Image Fusion Based on Target Regions in DWT Domain
First, input the registered source visible image E and infrared image I. Compute the DWT of each of E and I to a specified number of decomposition levels N. Suppose (x, y) denotes the coordinates of any pixel point; g_E^i(x, y) and f_E^N(x, y) denote the high-frequency and low-frequency subband coefficients of the visible image, respectively; g_I^i(x, y) and f_I^N(x, y) denote the high-frequency and low-frequency subband coefficients of the infrared image, respectively; F denotes the fused image; and g_F^i(x, y) and f_F^N(x, y) denote the high-frequency and low-frequency subband coefficients of the fused image. The fusion rules for the different regions are as follows.
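
To make the decomposition concrete, here is a minimal one-level 2-D Haar DWT in NumPy, producing one low-frequency approximation subband f and three high-frequency detail subbands g. It is a stand-in for the N-level DWT used in the paper; in practice a library such as PyWavelets would be used.

```python
import numpy as np

def haar_dwt2(img):
    """One level of a 2-D Haar DWT. Returns the low-frequency approximation
    LL and the three high-frequency detail subbands (LH, HL, HH).
    Image sides are assumed to be even."""
    a = img.astype(float)
    # transform along rows: averages (low-pass) and differences (high-pass)
    lo = (a[:, 0::2] + a[:, 1::2]) / 2.0
    hi = (a[:, 0::2] - a[:, 1::2]) / 2.0
    # transform along columns of each half
    LL = (lo[0::2, :] + lo[1::2, :]) / 2.0
    LH = (lo[0::2, :] - lo[1::2, :]) / 2.0
    HL = (hi[0::2, :] + hi[1::2, :]) / 2.0
    HH = (hi[0::2, :] - hi[1::2, :]) / 2.0
    return LL, (LH, HL, HH)

img = np.arange(16).reshape(4, 4)
LL, (LH, HL, HH) = haar_dwt2(img)
print(LL.shape)  # -> (2, 2): each subband has half the side length
```

Applying haar_dwt2 recursively to LL yields the deeper levels i = 2, ..., N of the decomposition.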
Step 1. For a target region that exists only in the visible image or only in the infrared image, the high-frequency and low-frequency subband coefficients of F are taken directly from the image in which the region is observable:

g_F^i(x, y) = g_E^i(x, y) and f_F^N(x, y) = f_E^N(x, y), if the region exists only in E,
g_F^i(x, y) = g_I^i(x, y) and f_F^N(x, y) = f_I^N(x, y), if the region exists only in I.

3.1
Step 2. For the common target region R, according to a similarity measurement between the two images, we use either the selective rule or the weighted average rule to fuse the source images. The similarity of the two images in region R is defined as

M_EI(R) = 2 Σ_{(x,y)∈R} I_E(x, y) I_I(x, y) / ( Σ_{(x,y)∈R} I_E(x, y)^2 + Σ_{(x,y)∈R} I_I(x, y)^2 ).

3.2
Then compute the energy of all high-frequency subband coefficients in region R of the two images, and use it as the fusion measure:

S_P(R) = Σ_{(x,y)∈R} g_P^i(x, y)^2, P ∈ {E, I}.

3.3
If M_EI(R) < α, where α is a similarity threshold (e.g., 0.8), then apply the selective fusion rule to the two images:

g_F^i(x, y) = g_E^i(x, y), if S_E(R) ≥ S_I(R),
g_F^i(x, y) = g_I^i(x, y), if S_E(R) < S_I(R).

3.4
If M_EI(R) ≥ α, then apply the weighted average fusion rule to the two images and define the weighted coefficients

ω_min = 1/2 - (1/2)(1 - M_EI(R))/(1 - α), ω_max = 1 - ω_min.

3.5

Mathematical Problems in Engineering
The fusion process in the common target region R can then be described as follows:

g_F^i(x, y) = ω_max g_E^i(x, y) + ω_min g_I^i(x, y), if S_E(R) ≥ S_I(R),
g_F^i(x, y) = ω_min g_E^i(x, y) + ω_max g_I^i(x, y), if S_E(R) < S_I(R).

3.6
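
Step 2's selective/weighted rule can be sketched for one region's coefficients as follows; this is a simplified reading of equations (3.2)-(3.6), and the array names and default α are assumptions.

```python
import numpy as np

def fuse_region(gE, gI, alpha=0.8):
    """Fuse the high-frequency coefficients of a common target region R.
    If the match measure is low, select the subband with larger energy;
    otherwise take a weighted average biased toward the larger energy."""
    S_E = np.sum(gE ** 2)                       # region energies, eq. (3.3)
    S_I = np.sum(gI ** 2)
    M = 2.0 * np.sum(gE * gI) / (S_E + S_I)     # match measure, eq. (3.2)
    if M < alpha:                               # selective rule, eq. (3.4)
        return gE if S_E >= S_I else gI
    # weighted-average rule, eqs. (3.5)-(3.6)
    w_min = 0.5 - 0.5 * (1.0 - M) / (1.0 - alpha)
    w_max = 1.0 - w_min
    if S_E >= S_I:
        return w_max * gE + w_min * gI
    return w_min * gE + w_max * gI

# Dissimilar subbands: low match, so the higher-energy (visible) input wins.
print(fuse_region(np.array([3.0, 0.0]), np.array([0.0, 1.0])))
```

The same function applied per subband level i reproduces the per-region behavior of Step 2.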
Step 3. For the common background region B, take different fusion strategies for the high-frequency and low-frequency subbands, respectively. For the low-frequency subband, we use the average method directly:

f_F^N(x, y) = ( f_E^N(x, y) + f_I^N(x, y) ) / 2.

3.7

For the high-frequency subbands, we adopt the window-based fusion rule proposed by Burt and Kolczynski [20]. First, compute the energy of the high-frequency subband coefficients in a local window and use it as the fusion measure:

S_P^i(x, y) = Σ_{(m,n)∈N(x,y)} ω(m, n) g_P^i(x + m, y + n)^2, P ∈ {E, I},

3.8

where N(x, y) is the window centered at (x, y) and ω(m, n) are weighting coefficients whose sum is 1. Following the similarity-based fusion method in a local window, the similarity is defined as

M_i(x, y) = 2 Σ_{(m,n)∈N(x,y)} ω(m, n) g_E^i(x + m, y + n) g_I^i(x + m, y + n) / ( S_E^i(x, y) + S_I^i(x, y) ).

3.9
If M_i(x, y) < α_i, where α_i is the similarity threshold, then apply the selective fusion rule to the two images:

g_F^i(x, y) = g_E^i(x, y), if S_E^i(x, y) ≥ S_I^i(x, y),
g_F^i(x, y) = g_I^i(x, y), if S_E^i(x, y) < S_I^i(x, y).

3.10
If M_i(x, y) ≥ α_i, then apply the weighted average fusion rule to the two images and define the weighted coefficients

ω_min^i(x, y) = 1/2 - (1/2)(1 - M_i(x, y))/(1 - α_i), ω_max^i(x, y) = 1 - ω_min^i(x, y).

3.11

The fusion process in the common background region B can then be described as follows:

g_F^i(x, y) = ω_max^i(x, y) g_E^i(x, y) + ω_min^i(x, y) g_I^i(x, y), if S_E^i(x, y) ≥ S_I^i(x, y),
g_F^i(x, y) = ω_min^i(x, y) g_E^i(x, y) + ω_max^i(x, y) g_I^i(x, y), if S_E^i(x, y) < S_I^i(x, y).

3.12
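
Step 3's window-based rule for the background high-frequency subbands might look like the following per-pixel sketch. The uniform 3×3 window weights, zero padding at the borders, and the small ε guarding against division by zero are my assumptions.

```python
import numpy as np

def local_energy(g, w):
    """Weighted energy of coefficients in a sliding window (zero-padded)."""
    r = w.shape[0] // 2
    gp = np.pad(g, r)
    out = np.zeros_like(g, dtype=float)
    for m in range(w.shape[0]):
        for n in range(w.shape[1]):
            out += w[m, n] * gp[m:m + g.shape[0], n:n + g.shape[1]] ** 2
    return out

def fuse_background(gE, gI, alpha=0.8):
    """Per-pixel Burt-Kolczynski-style rule: select the higher-energy
    coefficient where the local match is low, weighted-average otherwise."""
    w = np.full((3, 3), 1.0 / 9.0)                   # weights summing to 1
    S_E, S_I = local_energy(gE, w), local_energy(gI, w)
    # local match measure: weighted cross-correlation over the same window
    gEp, gIp = np.pad(gE, 1), np.pad(gI, 1)
    corr = np.zeros_like(gE, dtype=float)
    for m in range(3):
        for n in range(3):
            corr += w[m, n] * gEp[m:m + gE.shape[0], n:n + gE.shape[1]] \
                            * gIp[m:m + gI.shape[0], n:n + gI.shape[1]]
    M = 2.0 * corr / (S_E + S_I + 1e-12)             # eps avoids 0/0
    w_min = 0.5 - 0.5 * (1.0 - M) / (1.0 - alpha)
    w_max = 1.0 - w_min
    sel = np.where(S_E >= S_I, gE, gI)               # selective rule
    wavg = np.where(S_E >= S_I,
                    w_max * gE + w_min * gI,
                    w_min * gE + w_max * gI)
    return np.where(M < alpha, sel, wavg)
```

When the two subbands agree perfectly, the rule reduces to passing the common coefficients through unchanged, which makes a convenient sanity check.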

Target Is Unmoving and Observable Both in Visible and Infrared Images
First, the experiment of fusing visible and infrared images in which the target is unmoving and observable in both images is conducted. There is a static tank target in the source images. Based on target region segmentation, we obtain the small rectangular region containing the tank, as shown in Figures 4(i) and 4(j); the other part is the background region. Thus, the source images can be divided into different parts, and different fusion rules are adopted to produce the fused image. The source images are also fused using different pyramid methods with a decomposition level of 3, while the fusion methods based on the discrete wavelet transform adopt a decomposition level of 2.
The evaluation metrics of the different fusion methods are given in Table 1; the definitions of these metrics can be found in [21, 22]. It can be seen that our method not only preserves abundant texture information but also maintains image edges well, and the fusion effect is better than that of the other methods. In fact, the result of the weighted average method has a blurred background and an unclear target, which shows that the source information has not been utilized efficiently. The results of the pyramid decomposition methods have abundant background detail, but the target information has not been preserved. The basic wavelet method preserves both the details and the target information. However, all of the above methods are based on a single rule applied to the whole image, which does not exploit the different region characteristics and inevitably results in information loss. The result of our method, which adopts different fusion rules for the different regions, shows that the background information is more abundant and the target is more prominent.
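
For reference, one widely used fusion-quality metric of this kind is the grey-level entropy of the fused image. The sketch below is illustrative only and is not necessarily among the exact metrics of Table 1; see [21, 22] for those definitions.

```python
import numpy as np

def entropy(img, bins=256):
    """Shannon entropy of the grey-level histogram; higher values usually
    indicate that the fused image carries more information."""
    hist, _ = np.histogram(img, bins=bins, range=(0, 256))
    p = hist / hist.sum()
    p = p[p > 0]                       # drop empty bins (0 * log 0 := 0)
    return float(-np.sum(p * np.log2(p)))

half = np.zeros((8, 8))
half[:4] = 128                         # two equally likely grey levels
print(entropy(half))                   # -> 1.0 (one bit per pixel)
```

A constant image scores zero, since a single grey level carries no information; richer fused images score higher.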

Target Is Moving and Observable Both in Visible and Infrared Images
Second, the experiment of fusing visible and infrared images in which the targets are moving and observable in both images is conducted (note: the source images are from [16]). The results in Figure 5 show that the background information is more abundant and the targets are more prominent than with the basic wavelet method.

Target Is Observable Only in Infrared Image
In an actual environment, the target can be sheltered by other objects yet still be observed by the infrared sensor, as in Figures 6(a) and 6(b), in which the person target is unobservable in the visible image but observable in the infrared image. From Figures 6(c) and 6(d), it can be seen that the target information changes when using the basic wavelet method, becoming dark inside and blurred along the edges, while the target is clear using our method, which means that the target and background information have been mostly preserved and can meet the demands of target extraction.

Conclusions
In this paper, we proposed a new approach to infrared and visible image fusion based on detected target regions in the DWT domain, which can help UAVs realize environment perception. Unlike conventional fusion methods based on generic region segmentation, we proposed a frame difference method for target detection, which is used to segment the source images, and then designed different fusion rules based on the target regions to fuse the visible and infrared images, gaining more target information and preserving more background information from the source images. In the future, the method can be extended to other source images for fusion, and the time performance of our method can be improved using GPUs (graphics processing units) and other hardware.

Figure 1 :
Figure 1: Flow of target region detection.

Figure 4 :
Figure 4: Results on condition that the target is observable both in visible and infrared images.

Figure 5 :
Figure 5: Results on condition that targets are moving and observable both in visible and infrared images.

Figure 6 :
Figure 6: Results on condition that the target is observable only in the infrared image.

Table 1 :
Evaluation metrics of different image fusion methods.