Image Matching by Using Fuzzy Transforms

We apply the concept of Fuzzy Transform (for short, F-transform) to improve the results of image matching based on the Greatest Eigen Fuzzy Set (for short, GEFS) with respect to the max-min composition and the Smallest Eigen Fuzzy Set (for short, SEFS) with respect to the min-max composition, already studied in the literature. The direct F-transform of an image can be compared with the direct F-transform of a sample image to be matched, and we use suitable indexes to measure the grade of similarity between the two images. We perform our experiments on the image dataset extracted from the well-known Prima Project View Sphere Database, comparing the results obtained with this method with those of the method based on GEFS and SEFS. Other experiments are performed on frames of videos extracted from the Ohio State University dataset.


Introduction
Solution methods of fuzzy relational equations have been well studied in the literature (cf., e.g., [1–15]) and applied to image processing problems such as image compression [16–19] and image reconstruction [7, 8, 20–22]. In particular, Eigen Fuzzy Sets [23–25] have been applied to image processing and medical diagnosis [2, 6, 7, 16]. If an image R of sizes N × N (pixels) is interpreted as a fuzzy relation R : {1, 2, …, N} × {1, 2, …, N} → [0, 1], the concepts of the Greatest Eigen Fuzzy Set (for short, GEFS) of R with respect to the max-min composition and of the Smallest Eigen Fuzzy Set (for short, SEFS) of R with respect to the min-max composition [2, 24, 25] were studied and used in [26, 27] for an image matching process defined over square images. The GEFS and SEFS of the original image are compared with the GEFS and SEFS of the image to be matched by using a similarity measure based on the Root Mean Square Error (for short, RMSE). The advantage of using GEFS and SEFS is in terms of memory storage: we can indeed compress an image dataset (in which each image has sizes N × N) into a dataset in which each image is stored by means of its GEFS and SEFS, which have total dimension 2N.
The main disadvantage of using GEFS and SEFS is that we cannot compare images in which the number of rows differs from the number of columns. Our aim is to show that we can use an F-transform for image matching problems, reducing an image dataset of sizes N × M (in general, N is not necessarily equal to M) into a dataset of dimensions comparable with the one obtained by using GEFS and SEFS when N = M, thus gaining convenience in terms of memory storage.
The F-transform based method [28–30] is used in the literature for image and video compression [29, 31–33], image segmentation [20], and data analysis [22, 34]; indeed, in [31, 32] the quality of the decoded images obtained by using the F-transform compression method is shown to be better than that obtained with fuzzy relation equations and fully comparable with the JPEG technique.
The main characteristic of the F-transform method is that it maintains an acceptable quality in the reconstructed image even under strong compression rates; indeed, in [20] the authors show that the segmentation process can be applied directly to the compressed images. Here we use the direct F-transform in image matching analysis with the aim of reducing the memory used to store the image dataset. In fact, we compress a monochromatic image (or a band of a multiband image) R of sizes N × M via the direct F-transform into a matrix F of sizes n × m, using a compression rate ρ = (n × m)/(N × M).
By using a distance, we compare the F-transform of each image with the F-transform of the sample image. We also adopt a preprocessing phase in which each image is compressed at several compression rates. In Figure 1 we show the preprocessing phase on a dataset of color images. We compress each color image in the three monochromatic components corresponding to the three bands R, G, and B.

Advances in Fuzzy Systems
At the end of the preprocessing phase we can use the compressed image dataset for image matching analysis. Supposing that the original image dataset is composed of s color images of sizes N × M and that a compression rate ρ = (n × m)/(N × M) is used, the compressed image dataset is constituted in total by 3s(n × m) pixels.
In Figure 2 we schematize the image matching process. The sample image is compressed by the F-transform method; then we compare the three compressed bands of each image, obtained via F-transform, with those deduced for the sample image by using the Peak Signal to Noise Ratio (for short, PSNR). At the end of this process, we determine the image in the dataset with the greatest overall PSNR with respect to the sample image.
Here a monochromatic image, or a band of a color image, of sizes N × M is interpreted as a fuzzy relation R whose entries R(i, j) are obtained by normalizing the intensity P(i, j) of each pixel with respect to the length L of the scale, that is, R(i, j) = P(i, j)/L. We show that our F-transform approach can also be applied in image matching processes to images of sizes N × M (possibly N ≠ M), giving results analogous to those obtained with the GEFS and SEFS based method. The comparison tests are made on the 256 × 256 color image dataset extracted from the View Sphere Database, an image dataset consisting of a set of images of objects in which each object is photographed from various directions by a camera placed on a semisphere whose center is the object itself. We also use the Ohio State University color video dataset samples for our tests. Each video is composed of frames consisting of color images; we show the results for the Mom-Daughter and sflowg motions. In Section 2 we recall the concepts of the F-transform in two variables. In Section 3 we recall the GEFS and SEFS based method; in Section 4 we propose our image matching method based on F-transforms. Our experiments are illustrated in Section 5, and Section 6 concludes.
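As a minimal sketch of this normalization (the function name and variables are ours, not the paper's), an 8-bit grayscale image with scale length L = 255 becomes a fuzzy relation as follows:

```python
# Normalize an image into a fuzzy relation: each intensity P(i, j) in
# {0, ..., L} is mapped to R(i, j) = P(i, j) / L in [0, 1].
# Illustrative sketch; names are not from the paper's code.

def to_fuzzy_relation(pixels, L=255):
    """Return the image as a list of rows of membership values in [0, 1]."""
    return [[p / L for p in row] for row in pixels]

image = [[0, 128, 255],
         [64, 192, 32]]
R = to_fuzzy_relation(image)  # entries now lie in [0, 1]
```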

Max-Min and Min-Max Eigen Fuzzy Sets
Let X be a nonempty finite set and let R : X × X → [0, 1] and A : X → [0, 1] be such that

A ∘ R = A,

where "∘" is the max-min composition. In terms of membership functions, we have that

A(y) = max_{x ∈ X} min{A(x), R(x, y)}

for all y ∈ X, and A is defined to be an Eigen Fuzzy Set of R.
Let A_i : X → [0, 1], i = 1, 2, …, be defined iteratively by A_1(y) = max_{x ∈ X} R(x, y) and A_{i+1} = A_i ∘ R. It is known [2, 24, 25] that there exists an integer p ∈ {1, …, card X} such that A_p is the GEFS of R with respect to the max-min composition. We also consider the equation

R ◻ B = B,

where "◻" denotes the min-max composition, that is, in terms of membership functions,

B(y) = min_{x ∈ X} max{B(x), R(x, y)}

for all y ∈ X; B is also defined to be an Eigen Fuzzy Set of R with respect to the min-max composition. It is easily seen that this equation is equivalent to

B′ ∘ R′ = B′,

where R′ and B′ are pointwise defined as R′(x, y) = 1 − R(x, y) and B′(x) = 1 − B(x) for all x, y ∈ X. Since A′_p, for some p ∈ {1, …, card X}, is the GEFS of R′ with respect to the max-min composition, it is immediately proved that the fuzzy set B : X → [0, 1] defined as B(x) = 1 − A′_p(x) for every x ∈ X is the SEFS of R with respect to the min-max composition.
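The construction above can be sketched as follows; this is a minimal implementation assuming the standard fixed-point scheme (start from A_1(y) = max_x R(x, y) and iterate the max-min composition until the fuzzy set stops changing), with the SEFS obtained through the complement relation. Function names are ours:

```python
# Sketch: compute the GEFS of R w.r.t. max-min composition by
# fixed-point iteration, and the SEFS through the complement relation
# R'(x, y) = 1 - R(x, y). Assumed scheme, names are ours.

def max_min_compose(A, R):
    n = len(A)
    return [max(min(A[x], R[x][y]) for x in range(n)) for y in range(n)]

def gefs(R):
    """Greatest eigen fuzzy set: the fixpoint A with A o R = A."""
    A = [max(R[x][y] for x in range(len(R))) for y in range(len(R))]
    while True:
        A_next = max_min_compose(A, R)
        if A_next == A:        # fixpoint reached within card X steps
            return A
        A = A_next

def sefs(R):
    """Smallest eigen fuzzy set w.r.t. min-max, via B(x) = 1 - A'_p(x)."""
    R_c = [[1 - v for v in row] for row in R]
    return [1 - a for a in gefs(R_c)]

R = [[0.5, 0.2],
     [0.9, 0.4]]
A = gefs(R)   # satisfies A o R = A
B = sefs(R)
```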
In [27] a distance based on GEFS and SEFS is used for image matching over images of sizes N × N. Indeed, considering two single-band images of sizes N × N, say R_1 and R_2, such distance is given by

d(R_1, R_2) = √((1/N) Σ_{x∈X} (A_1(x) − A_2(x))²) + √((1/N) Σ_{x∈X} (B_1(x) − B_2(x))²),   (17)

where X = {1, 2, …, N} and A_i, B_i are the GEFS and SEFS, respectively, of the fuzzy relation R_i obtained by normalizing in [0, 1] the pixels of the image R_i, i = 1, 2.
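As a minimal sketch, under our reading of (17) as the sum of the RMSEs between the two images' GEFS and between their SEFS (the eigen sets A1, B1 and A2, B2 are assumed precomputed; names are ours):

```python
import math

# Distance between two single-band images from their eigen fuzzy sets:
# the RMSE between the GEFS plus the RMSE between the SEFS.
# Sketch of our reading of (17); names are ours.

def rmse_vec(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)) / len(u))

def eigen_distance(A1, B1, A2, B2):
    """A_i, B_i: GEFS and SEFS of image i as lists of length N."""
    return rmse_vec(A1, A2) + rmse_vec(B1, B2)
```

For color images, the extension (18) is then just the sum of `eigen_distance` over the three bands.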
In [26, 27] experiments are presented over color images of sizes 256 × 256 concerning two objects (an eraser and a pen) extracted from the View Sphere Database. Each object is put at the center of a semisphere on which a camera is placed in 91 different directions. The camera takes an image (photograph) of the object for each direction, which can be identified by two angles θ (0° < θ < 90°) and Φ (−180° < Φ < 180°), as illustrated in Figure 3.
A sample image R_1 (with given θ = 11°, Φ = 36° for the eraser and θ = 10°, Φ = 54° for the pen) is to be compared with another image R_2 chosen among the remaining 90 directions. GEFS and SEFS are calculated in the three components of each image in the RGB space, for which it is natural to assume the following extension of (17):

d(R_1, R_2) = d_R(R_1, R_2) + d_G(R_1, R_2) + d_B(R_1, R_2),   (18)

where d_R(R_1, R_2), d_G(R_1, R_2), d_B(R_1, R_2) are the measures (17) calculated in each band R, G, B. For image matching, the GEFS and SEFS components in each band are extracted from each image, thus forming a dataset with reduced storage memory. An image is compared with the images in the dataset by using (18). If the dataset contains s color images of sizes N × N, the dimension of the original dataset is 3sN², while the dimension of the GEFS and SEFS dataset is 6sN, so we have a compression rate given by

ρ = 6sN / (3sN²) = 2/N.

So we obtain a compression rate ρ = 0.007813 if N = 256.

The Image Matching Process via F-Transforms
We consider an image dataset formed by color images of sizes N × M. In the preprocessing phase we compress each image of the dataset by using the direct F-transform. Each image is divided into blocks of sizes N(B) × M(B), and each block is compressed into a block of sizes n(B) × m(B). Thus the images are coded with a compression rate ρ = (n(B) × m(B))/(N(B) × M(B)). In our experiments we set the sizes of the original and compressed blocks so that ρ is comparable with the compression rate of the GEFS and SEFS based method. For example, for N = M = 256, we use N(B) = M(B) = 24 and n(B) = m(B) = 2, so ρ = 0.006944.
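A sketch of the direct F-transform of a single block follows, assuming a uniform triangular fuzzy partition (a common choice in the F-transform literature; the paper does not fix the basis functions here). An N × M block is reduced to n × m components F[k][l], the means of the pixels weighted by the basis functions; names are ours:

```python
# Direct F-transform of one grayscale block with an assumed uniform
# triangular partition. F[k][l] = sum_ij I(i,j) A_k(i) C_l(j)
#                               / sum_ij A_k(i) C_l(j).
# Illustrative sketch; requires n, m >= 2.

def triangular_partition(N, n):
    """Membership values A_k(i) of n triangular basis functions on N points."""
    h = (N - 1) / (n - 1)
    nodes = [k * h for k in range(n)]
    return [[max(0.0, 1 - abs(i - nodes[k]) / h) for i in range(N)]
            for k in range(n)]

def direct_f_transform(block, n, m):
    N, M = len(block), len(block[0])
    A, C = triangular_partition(N, n), triangular_partition(M, m)
    F = [[0.0] * m for _ in range(n)]
    for k in range(n):
        for l in range(m):
            num = sum(block[i][j] * A[k][i] * C[l][j]
                      for i in range(N) for j in range(M))
            den = sum(A[k][i] * C[l][j]
                      for i in range(N) for j in range(M))
            F[k][l] = num / den
    return F
```

With the block sizes above, a 24 × 24 block compressed to n = m = 2 components gives ρ = 4/576 ≈ 0.006944.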
In the reduced dataset we store the F-transform components of each image. We use the PSNR between a sample image I_1 and an image I_2, defined for every compression rate ρ (cf. (9)) as

PSNR_ρ(I_1, I_2) = 20 log_10 (L / RMSE(I_1, I_2)),   (20)

where the RMSE (Root Mean Square Error) is given by (cf. (10))

RMSE(I_1, I_2) = √( Σ_i Σ_j (I_1(i, j) − I_2(i, j))² / (N × M) ).   (21)

If we have color images, we define an overall PSNR as

PSNR(I_1, I_2) = PSNR_R(I_1, I_2) + PSNR_G(I_1, I_2) + PSNR_B(I_1, I_2),   (22)

where PSNR_R(I_1, I_2), PSNR_G(I_1, I_2), PSNR_B(I_1, I_2) are the similarity measures (20) calculated in each band R, G, B at compression rate ρ. In our experiments we compare the results obtained by using the F-transform based method with the PSNR (20) against those obtained by using the GEFS and SEFS based method with the distance (18). We use color image datasets of 256 gray levels and sizes 256 × 256 pixels, available in the View Sphere Database; for each object considered, the best matching image I_2 of the object is the one that maximizes the PSNR (22). In other experiments we use our F-transform method over color video datasets in which each frame is a color image of 256 gray levels and sizes 360 × 240 pixels.
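The similarity index can be sketched as follows, under our reading of (20)-(21): the PSNR between two compressed components, with the RMSE taken over all entries and L = 255 for 8-bit data; names are ours:

```python
import math

# PSNR between two n x m component matrices: higher values mean the
# components (and hence the images) are more similar.
# Sketch of our reading of (20)-(21); names are ours.

def rmse(F1, F2):
    n, m = len(F1), len(F1[0])
    return math.sqrt(sum((F1[i][j] - F2[i][j]) ** 2
                         for i in range(n) for j in range(m)) / (n * m))

def psnr(F1, F2, L=255):
    """PSNR in dB; identical components give infinity."""
    e = rmse(F1, F2)
    return float("inf") if e == 0 else 20 * math.log10(L / e)
```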

Results of Tests
We compare the results obtained by using the GEFS and SEFS and the F-transform based methods for image matching on all the image datasets, each of sizes 256 × 256, extracted from the View Sphere Database. In the first image dataset, concerning an eraser, we consider as sample image R_1 the image obtained from the camera in the direction with angles θ = 11° and Φ = 36°. For brevity, we consider a dataset of 40 test images, and we compare R_1 with the images taken in the 40 other directions. In Table 1 (resp., Table 2) we report the distances (17) and (18) (resp., the PSNR (20) and (22) with L = 255) obtained using the GEFS and SEFS (resp., F-transform) based method.
In Figure 4 we show the trend of the PSNR index obtained by the F-transform method with respect to the distance (18) obtained using the GEFS and SEFS method.
As we can see from Tables 1 and 2, both methods give the same answer: the image most similar to the eraser image in the direction θ = 11° and Φ = 36° (Figure 5) is the one at θ = 10° and Φ = 54° (Figure 6). The trend in Figure 4 shows that the value of the distance (18) increases as the PSNR decreases.
In order to have a further confirmation of our approach, we have considered a second object, a pen, contained in the View Sphere Database, whose sample image R_1 is obtained from the camera in the direction with angles θ = 10° and Φ = 54°. We also limit the problem to a dataset of 40 test images, whose best distances (17) and (18) (resp., PSNR (20) and (22) with L = 255) under the GEFS and SEFS (resp., F-transform) based method are reported in Table 3 (resp., Table 4).
In Figure 7 we show the trend of the PSNR index obtained by the F-transform method with respect to the distance (18) obtained by using the GEFS and SEFS method. As we can see from Tables 3 and 4, in both methods the image most similar to the sample image in the direction θ = 10° and Φ = 54° (Figure 8) is the one at θ = 10° and Φ = 18° (Figure 9). Also in this example, the trend in Figure 7 shows that the value of the distance (18) increases as the PSNR decreases.
Now we present the results over a sequence of frames of a video, Mom-Daughter, available in the Ohio University sample digital color video database. Each frame is a color image of sizes 360 × 240 with 256 gray levels in each band. We use our method with a compression rate ρ = 0.006944; that is, in each band every frame is decomposed into 150 blocks, and each block of sizes 24 × 24 is compressed to a block of sizes 2 × 2. Since N ≠ M, the GEFS and SEFS based method is not applicable. We set the sample image to be the first frame of the video. We expect that the image with the highest PSNR with respect to the sample image is the one whose frame number is closest to that of the sample image. In Table 5 we report the best results obtained using the F-transform based method in terms of (20) and (22) with L = 255. As expected, albeit with slight variations, all the PSNRs diminish as the frame number increases, and the second frame (Figure 11) is the frame with the greatest PSNR with respect to the first frame (Figure 10), which contains the sample image.
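The matching step can be sketched as follows: each frame's compressed bands are compared with the sample's, and the frame with the greatest overall PSNR is kept. Summing the three band PSNRs is one plausible reading of (22); `psnr` stands for any per-band index on the F-transform components, and the names are ours:

```python
# Pick the frame in the dataset whose compressed bands are closest
# (greatest overall PSNR) to the sample frame's compressed bands.
# Sketch; the per-band psnr function is passed in by the caller.

def best_match(sample_bands, dataset, psnr):
    """sample_bands: (R, G, B) components; dataset: {frame_id: (R, G, B)}."""
    def overall(bands):
        return sum(psnr(s, b) for s, b in zip(sample_bands, bands))
    return max(dataset, key=lambda frame_id: overall(dataset[frame_id]))
```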
In Figure 12 we show the trend of the PSNR (22) with the frame number. A similar trend is obtained for all the sample video frames in the video dataset. For brevity, we report here only the results obtained in another test performed on the sequence of frames of a further video in the Ohio sample digital video database, the video sflowg. The PSNR in Figure 15 diminishes as the frame number increases, and the second frame (Figure 14) is the frame with the greatest PSNR with respect to the first frame (Figure 13), which contains the sample image.
To support the validity of the F-transform method for all the sample frames, we measure, for the frame with the greatest PSNR with respect to the sample frame, the corresponding value PSNR_0 obtained by using the original frame instead of the compressed frame decoded via the inverse F-transform. In Figure 16 we show the trend of the difference PSNR_0 − PSNR with respect to PSNR_0. The trend indicates that this difference is always less than 2. This result shows that if we compress the images in the dataset with rate ρ = 0.006944 by using the F-transform method, we can use the compressed image dataset for image matching processes, comparing the decompressed images with a sample image despite the loss of information due to the compression.
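For completeness, here is a sketch of the inverse F-transform used to decode a compressed block, under the same assumed uniform triangular partition as the direct transform: the reconstruction is I′(i, j) = Σ_{k,l} F[k][l] A_k(i) C_l(j). Names are ours:

```python
# Inverse F-transform: decode an n x m component matrix F back to an
# N x M block as a basis-weighted sum of the components.
# Illustrative sketch with an assumed triangular partition; n, m >= 2.

def triangular_partition(N, n):
    h = (N - 1) / (n - 1)
    nodes = [k * h for k in range(n)]
    return [[max(0.0, 1 - abs(i - nodes[k]) / h) for i in range(N)]
            for k in range(n)]

def inverse_f_transform(F, N, M):
    """Decode an n x m component matrix F into an N x M block."""
    n, m = len(F), len(F[0])
    A, C = triangular_partition(N, n), triangular_partition(M, m)
    return [[sum(F[k][l] * A[k][i] * C[l][j]
                 for k in range(n) for l in range(m))
             for j in range(M)]
            for i in range(N)]
```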

Conclusions
The results on the images of sizes N × N (N = M = 256) of the View Sphere Image Database show that, using our F-transform based method, we obtain the same results in terms of image matching and in terms of reduced memory storage as those reached via the GEFS and SEFS based method, which is applicable only to images with N = M, while our method applies to images of any sizes. Moreover, our tests executed on color video frames of sizes N × M (N = 360, M = 240 pixels, with 256 gray levels) of the Ohio University color video dataset show that, by choosing the first frame as the sample image, we obtain as the image with the highest PSNR the one corresponding to the successive frame, as expected, despite the loss of information on the decoded images caused by the compression process.

Figure 3: The angles θ and Φ identifying two directions (the object is at the origin).

Figure 4: Trend of the PSNR with respect to the distance (18) for the eraser, obtained from the comparison with the sample image at θ = 11° and Φ = 36°.

Figure 7: Trend of the PSNR with respect to the distance (18) for the pen, obtained from the comparison with the sample image at θ = 10° and Φ = 54°.

Figure 12: Trend of the PSNR (ρ = 0.006944) with respect to the frame number for the video Mom-Daughter.

Table 3: Best distances from the GEFS and SEFS based method with ρ = 0.007813 for the pen image dataset, obtained from the comparison with the sample image at θ = 10° and Φ = 54°.