Naval Target Classification by Fusion of Multiple Imaging Sensors Based on the Confusion Matrix

This paper presents an algorithm for the classification of targets based on the fusion of the class information provided by different imaging sensors. The outputs of the different sensors are combined to obtain an accurate estimate of the target class. The performance of each imaging sensor is modelled by means of its confusion matrix (CM), whose elements are the conditional error probabilities in the classification and the conditional correct classification probabilities. These probabilities are used by each sensor to make a decision on the target class. Then, a final decision on the class is made using a suitable fusion rule in order to combine the local decisions provided by the sensors. The overall performance of the classification process is evaluated by means of the "fused" confusion matrix, i.e., the CM pertinent to the final decision on the target class. Two fusion rules are considered: a majority voting (MV) rule and a maximum likelihood (ML) rule. A case study is then presented, where the developed algorithm is applied to three imaging sensors located on a generic air platform: a video camera, an infrared (IR) camera, and a spotlight Synthetic Aperture Radar (SAR).


Introduction
The ability to quickly and reliably recognize noncooperative targets is of primary importance for surveillance operations in Homeland Security (HS) applications. The development of efficient fusion strategies and the improvements in the design of more reliable sensors have increased the research interest in classification techniques in many fields. In particular, automatic surveillance systems based on imaging sensors are gaining significant interest, as proved by recent research work [1-3] aimed at improving the quality of image-based surveillance systems. Recognition techniques can include approaches based either on the human interpretation of the data provided by a sensor system or on automatic methods. Automatic Target Classification (ATC) techniques can use data coming from sensors of different nature, such as infrared and electro-optic cameras and radar systems. As described in [4], the process of target recognition can be conceptualized as composed of five levels or subprocesses: detection, that is, the process of distinguishing the target from thermal noise; discrimination, that is, the capability to extract potential targets from the surrounding clutter; preclassification, that is, a prescreening step that excludes targets of no interest from further processing; classification, that is, the process during which targets are assigned to a specific class according to particular features; and, finally, identification, that is, a more sophisticated process which may refer to the determination of the target's cooperativeness or to the extraction of more specific features, for example, in a maritime environment, the type or the name of a ship previously associated with a naval class. This work assumes that the first three processes [5] have already occurred and is only concerned with the classification process. In particular, we investigate how data coming from sensors of different nature can be combined
to improve the classification task. In ATC the classification task can be accomplished using several approaches. A model-based technique uses a model of the target, obtained, for example, by Computer-Aided Design (CAD) or Electro-Magnetic (EM) simulation [6], and compares the simulated models with the signature of the target under test. The computational load of this methodology can be very high, especially when more than one sensor is used. Another methodology consists of collecting many real versions of the target signature and comparing them with the signature of the current target under test; in this case, however, a very large database is required, and if the target (or the observing environment) changes significantly, the classification process may be unsuccessful.
In this work a classification algorithm with a different approach is developed: the information on the target class is provided by imaging sensors of different nature and is expressed by means of a confusion matrix (CM). This approach allows us to overcome the difficulties related to the high computational load of the methodologies described above and to insert the classification task in the analysis and simulation of a large and complex system. The CM is analytically computed for each imaging sensor and models the performance of the sensor during the classification process. The entries of this matrix are the conditional error probabilities in the classification and the conditional correct classification probabilities. These probabilities amount to the target class likelihood functions and are used by each sensor to make the decision on the target class. The sensor CM is analytically computed as a function of the sensor's sensitivity features and resolution, using a database of prestored reference images. Then a final decision on the class is made, using a suitable fusion rule in order to combine the decisions coming from the different sensors. The overall performance of the classification process is evaluated by means of the "fused" confusion matrix, that is, the CM pertinent to the final decision on the target class. Two fusion rules are considered to combine the class information coming from the different sensors: a majority voting (MV) rule and a maximum likelihood (ML) rule. The ultimate purpose of this fusion process is to combine the outputs of the different imaging sensors to obtain an accurate and reliable estimate of the target class. This analytical approach is then applied to a case study, where three imaging sensors located on a generic platform performing coastal surveillance are considered: a video camera, an infrared (IR) camera, and a spotlight Synthetic Aperture Radar (SAR). The final information on the class, expressed by means of the
"fused" CM, could then be exploited by the system where the sensors are located to perform other surveillance operations, such as the evaluation of the threat level of a noncooperative target.
In [7], different levels of abstraction in the fusion of data coming from different imaging sensors are described: signal-level fusion is the combination of signals from different sensors, performed before the production of images; pixel-level fusion consists of merging different digital images; and feature-level fusion extracts specific features from different images and combines them. The approach developed in this work, using the CM to model the classification capability of the imaging sensor, refers to a higher level of abstraction. A similar approach, where the CM is used to model the classification capability of the sensor, is also used in [8], but there the classification process is used to support the data association and to improve the tracking, especially in the presence of association uncertainty in kinematic measurements. In the literature many applications are proposed where radar images are combined with images from different kinds of sensors [7, 9-11] or where heterogeneous data sets coming from dissimilar imaging sensors are combined at an information fusion level [12]. In [13] we have described the classification algorithm based on the CM applied to visible and infrared images. In the present work three sensors are considered instead of two. In particular, we have considered electromagnetically simulated images from a spotlight SAR, in addition to those from the visible camera and from the IR sensor. The results of a similar case study with three sensors have already been presented in [14]. In the work presented herein we give a more complete and methodical description of the algorithm, we show more details about the numerical case study considered and the figures of the simulated images, and we report in the appendix the entire mathematical details of the analytical computation of the CM.
The main contribution of the proposed classification algorithm is the development of a methodology that allows us to emulate and incorporate the classification process in the study and simulation of a complex multisensor system, without increasing the computational load of the overall simulation. In fact, in [15] the proposed algorithm has been inserted into the simulation of a multisensor system for coastal border surveillance without increasing the computational load of the whole simulator.
The paper is organized as follows. Section 2 describes the classification algorithm, based on the analytical computation of the CM. The fusion of the decisions on the target class coming from different imaging sensors is presented in Section 3. Two decision rules are considered, that is, a majority voting (MV) rule and a maximum likelihood (ML) rule. The performance of the decision rule is described in Section 4. In Section 5 a case study is illustrated, where the developed algorithm is applied to three imaging sensors located on a platform operating in a maritime surveillance scenario: a video camera, an IR camera, and a spotlight SAR. The numerical results for this case study are presented. Finally, in Section 6 some conclusions are drawn. The analytical details of the computation of the CM are reported in the appendix.

The Classification Algorithm
The generic entry of the CM of a classifier is the probability that a target belonging to class i is classified as belonging to class j:

c^(k)_ij = Pr{d_k = j | H_i} = Pr{the kth sensor decides for H_j when H_i is true}, (1)

where H_i represents the hypothesis that the target belongs to class i and {d_k = j} represents the event {the kth sensor decides for H_j}. Thus the ith row of the CM corresponds to the event {the true class of the target is i}, and the class likelihood function for the sensor output j is the jth column of C [8]. The off-diagonal elements of the CM represent the conditional error probabilities during the classification and the diagonal elements are the conditional correct classification probabilities for a given sensor, under the hypothesis H_i:

P^(k)_CC|Hi = Pr{correct classification for the kth sensor | H_i} = Pr{the kth sensor decides for H_i when H_i is true} = c^(k)_ii. (2)

Then the correct classification probability for the kth sensor is

P^(k)_CC = Σ_{i=1}^{M} P^(k)_CC|Hi Pr{H_i}, (3)

where M is the number of hypotheses (i.e., the number of classes considered), the term P^(k)_CC|Hi is the conditional correct classification probability given by the diagonal element c^(k)_ii of the CM, and Pr{H_i} is the probability that the ith hypothesis is true. The error probability conditioned on the ith class, for the kth sensor, is

P^(k)_e|Hi = 1 − P^(k)_CC|Hi = Σ_{j≠i} c^(k)_ij. (4)

The entries of the CM are used to model the performance of each sensor during the classification and to make the decision on the target class. This means that a target detected by the system is declared as belonging to class j with a probability p^(k)_ij derived from the elements of the CM, and this probability is used by the sensor as a threshold for the decision on the class. More specifically, in order to associate a class to an incoming target, a random variable u uniformly distributed in the interval [0, 1] is generated and compared with the thresholds given by the entries of the sensor CM: the kth sensor then decides for H_n when u falls in the interval delimited by the cumulative probabilities Σ_{m=1}^{n−1} c^(k)_im < u ≤ Σ_{m=1}^{n} c^(k)_im. This is done to simulate the classification event without generating the data. The simulation of the classification event based on the elements of the CM is shown in Figure 1.
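The row-sampling mechanism described above can be sketched as follows; this is a minimal illustration under our own naming, not the authors' implementation, and the example CM row is invented:

```python
import random

def simulate_decision(cm_row, rng=random.random):
    """Draw a simulated class decision from one row of a confusion matrix.

    cm_row[j] is Pr{sensor decides class j | true class i}; the row sums
    to 1. A uniform random number in [0, 1] is compared against the
    cumulative probabilities, so class j is returned with probability
    cm_row[j], exactly as in the threshold comparison of Figure 1.
    """
    u = rng()
    cumulative = 0.0
    for j, p in enumerate(cm_row):
        cumulative += p
        if u < cumulative:
            return j
    return len(cm_row) - 1  # guard against floating-point round-off

# Example row: a class-0 target is classified correctly 80% of the time
row = [0.8, 0.1, 0.05, 0.05]
decision = simulate_decision(row)
assert 0 <= decision < 4
```

Passing `rng` as a parameter keeps the sampler deterministic under test while defaulting to `random.random` in simulation.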
In the classification approach described here, the entries of the CM are computed in an approximated closed form by means of an analytical evaluation, whose details are described in the appendix. The parameters required for the analytical evaluation for each sensor are: (i) the signal-to-noise ratio (SNR) at the output of the sensor; (ii) the sensor resolution; (iii) a set of reference images stored in a database; and (iv) the cross-correlation between the images of the database. The CM can be expressed as the following function:

C^(k) = f(SNR, N_H, N_V, M), (5)

where SNR is the signal-to-noise ratio, N_H and N_V represent the sensor resolution in terms of number of pixels on the horizontal and vertical planes, respectively, and M is the dimension of the reference database. In order to simplify the analysis, the following assumptions are made: (1) the detection of the target has already occurred (e.g., it has been performed by a radar system); (2) the image database for each sensor is exhaustive, that is, the possibility that the image of the target under test is not contained in the database is not considered; (3) the reference images of each database do not contain noise, but noise is added during the analytical computation of the CM; (4) the noise added on each image is Gaussian and independent from pixel to pixel.
As described in more detail in the appendix, the computation of the entries of the ith row of the CM is derived from the computation of the classification error probability for the ith class. The error probability is computed in an incremental way, by defining the elementary error event in the space composed of all the possible hypotheses (H_1, ..., H_M) and by adding the contribution of this event to the overall error probability. The partial contributions for the ith class are assigned to the off-diagonal elements c^(k)_ij. The diagonal elements can be computed as

c^(k)_ii = 1 − Σ_{j≠i} c^(k)_ij. (6)

In the case considered here, the dimension of the reference database is equal to the number of classes considered, since the hypothesis of an exhaustive database is made. The images of the reference database for each sensor can be derived from a CAD model of the target. The algorithm for the computation of the CM is schematically represented in Figure 2. An example of database construction is mentioned in the case study of Section 5 and described in [13].

Fusion of the Decisions on Target Class
The purpose of the fusion process is to combine the outputs of all the imaging sensors in the system to obtain an accurate and reliable estimate of the target class. As stated before, the performance of each imaging sensor during the classification process is modelled by means of its confusion matrix. The fusion process is described in Figure 3 in the case of K imaging sensors. For simplicity, let us consider K = 3 sensors. For each imaging sensor, the CM is analytically computed as described in Section 2 and its entries are used to make a local decision on the class, that is, d_1, d_2, and d_3. Then these local decisions are combined using a suitable decision rule. Thus the observed data is a three-dimensional vector d = (d_1, d_2, d_3) whose elements {d_k} are discrete random variables (r.v.) that take values in the set I_S = {1, 2, ..., M}, where M is the number of classes considered, and represent the decision on the target class coming from each imaging sensor. Moreover, we assume that the elements of d are mutually independent, that is, the decisions made by different sensors are independent.
Let us consider the set I_d, that is, the set of all the observable sequences of K = 3 elements that can be constructed with the M elements of the set I_S. The dimension of I_d is M^3. Our purpose is to map the three-dimensional vectors d into a scalar value belonging to the set I_S and representing the final estimated class of the target, that is, d_f in Figure 3. This means that there are M possible hypotheses {H_1, H_2, ..., H_M}. We assume that these hypotheses have the same a priori probability:

Pr{H_i} = 1/M, i = 1, ..., M. (8)

We indicate with g(d) the fusion rule, that is, the function that maps the observed vector d into a final decision in favour of one of the M hypotheses:

d_f = g(d), (9)

where {d_f = j} represents the event {we decide in favour of H_j}. This approach, based on the fusion of the decisions made by each sensor through the CM entries, allows us to manage the combination of information coming from very dissimilar imaging sensors and to compensate for the sensor parameter differences, such as the fields of view, the resolutions, and the noise features. The overall performance of the fusion process can be expressed by means of a "fused" confusion matrix, that is, the matrix pertinent to the final decision on the target class d_f. Two fusion strategies are investigated and compared in this work: one based on the majority voting decision rule and the other based on the maximum likelihood decision rule.
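The observable set I_d described above can be enumerated directly; a minimal sketch (the variable names are ours):

```python
from itertools import product

M = 4  # number of classes in the set I_S = {1, ..., M}
K = 3  # number of sensors

# All observable decision vectors d = (d_1, ..., d_K), one entry per sensor,
# listed in lexicographic order as in Table 1
I_d = list(product(range(1, M + 1), repeat=K))

# The dimension of I_d is M^K: 64 sequences for M = 4 and K = 3
assert len(I_d) == M ** K
```

Lexicographic enumeration also fixes the sequence indexing: for example, the eighth sequence is (1, 2, 4).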

Majority Voting Decision Rule (MV).
The majority voting decision rule consists in choosing the target class that occurs most often in the observed sequence. In the case of the three-dimensional sequences d = (d_1, d_2, d_3) considered here, the MV rule can be analytically expressed as follows:

d_f = arg max_q L_q(d_1, d_2, d_3), (10)

where L_q(d_1, d_2, d_3) is the number of times the value q appears in the sequence d = (d_1, d_2, d_3), that is, the number of occurrences of the qth class in the observed sequence. When L_q(d) = 1 for each of the three observed values, that is, when all three decisions differ, the MV rule is not applicable. In these cases, we choose the class in favour of which the "more reliable sensor" has decided. Note that the more reliable sensor is the sensor for which the conditional correct classification probability, given by the diagonal elements of the CM, is higher. For instance, if the sequence d^[8] = (1, 2, 4) is observed, the final class will be d_f = q for which Pr{d_k = q | H_q} is maximum, for k = 1, 2, 3. In this example we have d_1 = 1, d_2 = 2, and d_3 = 4; then we consider the diagonal elements c^(1)_11 for the first sensor, c^(2)_22 for the second sensor, and c^(3)_44 for the third one, and we decide d_f = 4 if

max{c^(1)_11, c^(2)_22, c^(3)_44} = c^(3)_44. (11)

The observable three-dimensional vectors d^[m] are all the possible sequences that can be made with the M = 4 elements of the previously defined set I_S. Table 1 shows all the 64 observable sequences. According to the decision rule described by (10), we can construct a fusion table for the MV decision rule, as shown in Table 2. The last column of the table contains the final decision on the target class made according to the MV rule.
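The MV rule with the reliability-based tie-break can be sketched as follows; this is an illustrative implementation under our own naming, and the confusion matrices it consumes are assumed to be nested lists with 0-based indices:

```python
from collections import Counter

def majority_vote(d, cms):
    """Majority-voting fusion of the per-sensor class decisions in d.

    d   : tuple of K class labels (1-based), one decision per sensor.
    cms : list of K confusion matrices (nested lists, 0-based indices),
          consulted only when all decisions differ, to defer to the
          sensor with the highest conditional correct classification
          probability for the class it chose.
    """
    label, count = Counter(d).most_common(1)[0]
    if count > 1:                      # some class occurs at least twice
        return label
    # All K decisions differ: pick the decision of the "more reliable"
    # sensor, i.e. the largest diagonal CM entry among the chosen classes
    best_sensor = max(range(len(d)),
                      key=lambda k: cms[k][d[k] - 1][d[k] - 1])
    return d[best_sensor]
```

For the sequence (1, 2, 4) of the example above, the function returns class 4 whenever c^(3)_44 exceeds both c^(1)_11 and c^(2)_22.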
Using this table, we can construct the "fused" matrix F after the fusion of the information on the target class:

F = [f_ij], i, j = 1, ..., M. (12)

The entries of this matrix are

f_ij = Pr{d_f = j | H_i} = Σ_{m∈D_j} Pr(d = d^[m] | H_i), (13)

where D_j is the decision zone of H_j, that is, the set of m's for which we decide in favour of H_j. It is defined as

D_j = {m : g(d^[m]) = j}. (14)

The elements of the sum Pr(d = d^[m] | H_i) can be computed as

Pr(d = d^[m] | H_i) = Pr{d_1 = j | H_i} Pr{d_2 = k | H_i} Pr{d_3 = n | H_i}, (15)

that represents the product of the entries of the CMs of the three sensors:

Pr(d = d^[m] | H_i) = c^(1)_ij c^(2)_ik c^(3)_in, (16)

in (16) with d^[m] = (j, k, n) and j, k, n, i = 1, ..., 4.
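Equations (13)-(16) can be evaluated numerically by enumerating every observable sequence; a sketch under our own naming, with the fusion rule passed in as a function of the decision vector:

```python
from itertools import product

def fused_cm(cms, fusion_rule, M=4):
    """Fused confusion matrix F for a given fusion rule.

    F[i][j] accumulates, over every observable sequence d that the
    fusion rule assigns to class j+1 (its decision zone D_j), the
    product of the per-sensor CM entries
    Pr(d | H_{i+1}) = prod_k c^(k)[i][d_k - 1].
    """
    K = len(cms)
    F = [[0.0] * M for _ in range(M)]
    for d in product(range(1, M + 1), repeat=K):
        j = fusion_rule(d) - 1                 # final decision for d
        for i in range(M):                     # true-class hypothesis
            p = 1.0
            for k in range(K):
                p *= cms[k][i][d[k] - 1]
            F[i][j] += p
    return F
```

Passing the rule as a function lets the same routine serve both the MV and the ML fusion strategies.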
Note that the sum with respect to j of the elements in (13), that is, the sum of the elements of each row of the fused CM, is equal to 1 by construction. In fact, the combination of elementary events (i.e., single-sensor decision events) belonging to three distinct probability sets, each of which sums to 1, again provides a set of probabilities whose sum is equal to 1.

Maximum Likelihood Decision Rule.
In many applications, the most common approach utilized to distinguish between two or more hypotheses is based on the Bayes rule, which assumes a priori knowledge of the probabilities of the hypotheses under test. The Bayes rule is based on the minimization of the expectation of the cost function C_ij, defined as the cost assigned to the decision to choose in favour of H_j when H_i is true [16]. The analytical formulation of the Bayesian approach applied to the decision on the target class is

d_f = g(d) = arg max_j Pr(H_j | d). (17)

The rule expressed by (17) is called the M-ary maximum a posteriori probability (MAP) decision rule, since Pr(H_j | d) is the probability that the hypothesis H_j is the true one after the observation of the data d; thus it is an a posteriori probability.
As stated before, this decision rule assumes prior knowledge of the a priori probabilities of the hypotheses.
According to the Bayes theorem, the a posteriori probability Pr(H_j | d) can be expressed as

Pr(H_j | d) = p(d | H_j) Pr(H_j) / p(d), (18)

where p(d | H_j) is the likelihood function of the jth hypothesis and p(d) is the unconditional probability of the observed vector.

Figure 3: Fusion of the decision on the target class.
When no assumption on the prior probability of the hypotheses can be made, the decision rule of (17) can be expressed as

d_f = g(d) = arg max_j p(d | H_j). (19)

This is called the M-ary maximum likelihood (ML) decision rule, since p(d | H_j) is the likelihood function of the jth hypothesis. Note that the decision rule (19) provides the minimum error probability only when the prior probabilities Pr(H_q) are all equal.
According to the ML rule, in order to decide the final class of the target using the observed data sequence d, we have to choose the hypothesis H_q that maximizes the following probability mass function:

p(d | H_q), q = 1, ..., M, (20)

where

p(d | H_q) = Pr{d_1 = j | H_q} Pr{d_2 = k | H_q} Pr{d_3 = n | H_q}. (21)

The elements of the product in (21) are the entries of the confusion matrices C^(1), C^(2), and C^(3), respectively. The joint conditional probability mass function of d can thus be expressed as follows:

p(d | H_q) = c^(1)_qj c^(2)_qk c^(3)_qn, d = (j, k, n). (22)

This is shown in Figure 4 for the sequence d^[8] = (1, 2, 4). According to the ML decision rule described above, we can derive a fusion table for all the observable sequences, as shown in Table 3. The last column of the table contains the final decision on the target class made according to (19).
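Under equal priors, the ML rule of (19)-(22) reduces to maximizing a product of CM entries over the candidate class q; a minimal sketch (naming is ours, CMs are nested lists with 0-based indices):

```python
def ml_fusion(d, cms, M=4):
    """M-ary maximum likelihood fusion of per-sensor decisions.

    Chooses the class q (1-based) maximizing the joint likelihood
    p(d | H_q) = prod_k c^(k)[q-1][d_k - 1], assuming the sensor
    decisions in d are mutually independent and priors are equal.
    """
    def likelihood(q):
        p = 1.0
        for k, dk in enumerate(d):
            p *= cms[k][q - 1][dk - 1]
        return p
    return max(range(1, M + 1), key=likelihood)
```

Unlike the MV rule, the ML rule weights each sensor's vote by how plausible the whole observed sequence is under each hypothesis, so a single highly reliable sensor can outvote two unreliable ones.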
Similarly to the case of the MV rule, from this table we can evaluate the fused confusion matrix F, by using (13) and (14).

Performance Analysis
The performance of the decision rule can be expressed in terms of the closeness of the fused confusion matrix F to the identity matrix, which represents the ideal case. In fact, an ideal classification process is characterized by probabilities of error (off-diagonal elements of the CM) equal to zero and by probabilities of correct classification (diagonal elements) equal to one, that is,

F_ideal = I_M, (23)

where I_M is the identity matrix of order M. The conditional correct classification probabilities for the fused matrix F can be expressed from its diagonal elements, similarly to those of the CMs of the K sensors C^(1), C^(2), ..., C^(K):

P_CC|Hi = Pr{d_f = i | H_i} = f_ii. (24)

The probability of correct classification is then

P_CC = Σ_{i=1}^{M} P_CC|Hi Pr{H_i} = (1/M) Σ_{i=1}^{M} P_CC|Hi, (25)

where in the last part of the equality we have used the assumption (8) of equal a priori probability for the hypotheses. By replacing (24) in (25), we obtain

P_CC = (1/M) Σ_{i=1}^{M} f_ii = tr(F)/M, (26)

where we have considered that the sum of the diagonal elements represents the trace of matrix F. To evaluate the performance of the fusion process, we consider the probability of correct classification expressed in (26) and we select as the best performing matrix the one for which the probability of correct classification, and hence the ratio tr(F)/M, is the nearest to 1, that is, its maximum value. This occurs when the trace of matrix F at the numerator is close to M, which is the trace of the identity matrix. From this point of view, the correct classification probability is an indication of the closeness of matrix F to the identity I_M. The same performance criterion can be explained by considering an alternative interpretation. In order to evaluate the closeness of the fused matrix F to the identity, we can define the following quality factor [14]:

Q = 1 − (tr(I_M) − tr(F))/M = tr(F)/M. (27)

The parameter Q belongs to the interval [0, 1]; it is close to 1 when the fused matrix F is very close to the identity and close to 0 when F is significantly different from the identity. As we can see by comparing (26) and (27), the parameter Q is
equivalent to the probability of correct classification. Thus, the best fused matrix is the one for which this quality factor is nearest to 1, which is also the maximum value of the correct classification probability. The difference tr(I_M) − tr(F) at the numerator in expression (27) represents the sum of the off-diagonal elements of F:

tr(I_M) − tr(F) = Σ_i Σ_{j≠i} f_ij. (28)

This property is due to the fact that the sum of all the elements of the matrix F is equal to M, which in turn follows from the fact that the sum of the elements in each row of the CM is equal to 1.
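The quality factor reduces to a normalized trace; a one-function sketch (the function name is ours):

```python
def quality_factor(F):
    """Q = tr(F) / M: closeness of the fused CM F to the identity I_M.

    Under equal a priori class probabilities this coincides with the
    correct classification probability P_CC; Q = 1 for an ideal
    classifier and Q approaches 0 for a poor one.
    """
    M = len(F)
    return sum(F[i][i] for i in range(M)) / M
```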

Numerical Case Study
In this section a case study is presented, where the developed algorithm is applied to three imaging sensors located on a generic air platform: a video camera, an infrared (IR) camera, and a spotlight Synthetic Aperture Radar (SAR). A numerical example, concerning the classification process performed by the three imaging sensors, is provided for four classes of naval targets. This case study allowed us to include and test the algorithm proposed in this work inside the simulation of a complex multisensor system, which performs its operations in a realistic scenario for maritime border surveillance. The considered system is notional. The numerical values considered in this example reflect a typical maritime situation, with standard environmental conditions.

The System and the Scenario.
The background of this case study is represented by an integrated multisensor system for coastal surveillance. The focus herein is on the classification process, in particular on the fusion of the target class data coming from different imaging sensors. This is a part of a research activity whose final goal is the development of a computer simulator that emulates the main functions of the integrated multisensor system for coastal surveillance (see [15, 17, 18]). The integrated multisensor system is composed of two platforms of multiple sensors: a land-based platform, located on the coast, and an air platform, moving in front of the coast. The land platform is equipped with a Vessel Traffic Service (VTS) radar, an infrared (IR) camera, and a station belonging to an Automatic Identification System (AIS) that provides information on the target cooperativeness. The air platform carries an Airborne Early Warning Radar (AEWR), which can operate in a spotlight SAR mode, a video camera, and a second IR camera. The mission of the system is the detection, tracking, identification, and classification of multiple targets that enter a sea region, the assessment of their threat level, and the selection of a suitable intervention on them. The threat evaluation and the selection of the intervention are performed by a command and control centre (C2), which coordinates all the operations of the multisensor system. The threat evaluation logic is based on a deterministic comparison between the target kinematical parameters detected by the two radars and some tolerance thresholds on the speed, on the distance between the target and the coast, and on the direction. This logic also takes into account the class information provided by the imaging sensors of the system. The three imaging sensors of the air platform are considered herein. After the decision on the target class is made by each imaging sensor, according to the algorithm described in Section 2, the fusion of
the decisions is performed as described in Section 3. The information on the class is generally not very reliable at long
According to the evaluation logic, a Threat Level in the set (TL0, TL1, TL2) is assigned to the noncooperative targets, where TL0 indicates a neutral target, TL1 a suspect target, and TL2 a threat target. The intervention is only selected for the targets assessed as threats and consists of the allocation of a system resource in order to inspect the nature of the target. Two types of resources are considered here: a helicopter and a patrol boat; both resources are used only for the target inspection [15, 18]. The architecture of the surveillance multisensor system we refer to is shown in Figure 5. The simulated scenario is composed of the geographical area considered, the position of the sensors in this area, the multiple naval targets that enter the scene, and the resources of the system. Four classes of naval targets are considered in this scenario.

Numerical Results
The CMs of the three sensors have been computed with the analytical algorithm described above, for the four classes of naval targets. The analytical computation of the CM requires the setup of a database of reference images. In this numerical example the reference database for each sensor is composed of simulated images; no real data have been considered for now. This database has been constructed using a three-dimensional (3D) CAD model of the naval targets considered in the scenario. The same CAD models have been exploited for the construction of the reference database for the video camera, for the IR camera, and for the spotlight SAR. The sizes considered for the naval targets are: (10 × 4.6 × 3) m for the dinghy; (15 × 4.7 × 5.3) m for the motor boat; (16 × 5.3 × 7) m for the fishing boat; (100 × 33.5 × 37.6) m for the oil tanker. For the video camera the image generation is simply obtained by the projection of the 3D CAD on the camera focal plane. For the IR camera, the images are simulated using a specific simulation software, the Open-source Software for Modelling and Simulation of Infrared Signatures (OSMOSIS) [19], developed at the Royal Military Academy of Brussels. For the spotlight SAR the CADs have been processed by a software for the simulation of electromagnetic (EM) images. An example of the simulated images of the dinghy, for a view angle equal to 45°, is shown in Figures 6(a)-6(c) for the video camera, the IR camera, and the spotlight SAR, respectively. The distance between the sensor and the target is 5 km for the video camera and 1 km for the IR camera, where the temperature information is represented by the gray scale of the images.
The SNR over the single pixel of the reference images has been evaluated by considering the noise level of each imaging sensor. As concerns the electro-optical (EO) sensors, we have considered the Noise Equivalent Illuminance (NEIL) for the video camera and the Noise Equivalent Temperature Difference (NETD) for the IR camera. In both cases we have taken into account that the SNR decreases with the distance between the target and the sensor because of the atmospheric attenuation. For the video camera we have analytically computed the atmospheric extinction coefficient [20], assuming a wavelength of 550 nm. For the IR camera the extinction coefficient has been computed by LOWTRAN (LOW resolution TRANsmission model) for standard weather conditions: a temperature equal to 30 °C, a relative humidity equal to 43%, and a sea state equal to 0. For the evaluation of the SNR in the case of the spotlight SAR, we have considered the radar equation, revisited in order to take into account the SAR geometry [21, 22]. The simulated images of the three sensors refer to the same geometrical and environmental conditions, but the SNR value can differ from one sensor to another, due to the different nature of the sensors. The classification approach described in this work allows compensating for the sensor parameter differences, such as the fields of view, the resolutions, and the noise features.
In this case study we have assumed that the decisions coming from the three sensors are aligned in time. In a future development of this work we will also consider time misalignment in the decisions, by introducing a delay in the fusion process in order to take into account the sampling rate of the slowest sensor.
The CMs of the three imaging sensors considered in the case study are C^(1) = [c^(1)_ij] for the video camera, C^(2) = [c^(2)_ij] for the IR camera, and C^(3) = [c^(3)_ij] for the spotlight SAR. According to the definition given in (1), the generic entry of the matrix C^(1) is the following probability:

c^(1)_ij = Pr{video camera decides for H_j when H_i is true}.

Similarly, the entries of matrices C^(2) and C^(3) are defined as

c^(2)_ij = Pr{IR camera decides for H_j when H_i is true},
c^(3)_ij = Pr{spotlight SAR decides for H_j when H_i is true}.

Moreover we have: (i) {d_1 = j} ≡ {video camera decides for H_j}; (ii) {d_2 = j} ≡ {IR camera decides for H_j}; (iii) {d_3 = j} ≡ {spotlight SAR decides for H_j}.
Thus, the conditional correct classification probabilities are c^(1)_ii = Pr{d_1 = i | H_i} for the video camera, c^(2)_ii = Pr{d_2 = i | H_i} for the IR camera, and c^(3)_ii = Pr{d_3 = i | H_i} for the spotlight SAR.
In the case study a distance between the target and the sensor of 10 km and a view angle of 180° have been considered. The resulting CMs for the video camera, the IR camera, and the spotlight SAR are reported in Tables 4, 5, and 6, respectively. These tables show that the least reliable sensor, as concerns the classification of the four targets considered, is the spotlight SAR. On the other hand, this sensor has a larger coverage than the other two sensors. The correct classification probability conditioned on Class 4 (oil tanker) is always P_{CC|H4} = 1, because the size of this class of target (100 m) differs significantly from the sizes of the other targets considered. The fused CMs obtained by the majority voting rule, F_MV, and by the maximum likelihood rule, F_ML, are shown in Tables 7 and 8, respectively. From these tables we can observe that the best performing fused matrix, that is, the one nearest to I_4, is the one obtained by the ML decision rule, F_ML. The goodness of the CMs in Tables 4-8 is expressed by means of the probability of correct classification P_CC, which is equal to the quality factor Q defined in (27). As shown in Table 9, the value of P_CC nearest to 1 is the one corresponding to the fused matrix F_ML. With respect to the most reliable sensor, that is, the video camera in the numerical example considered here, the fusion process provides an improvement in the correct classification probability of 3.59% for the MV rule and of 5.24% for the ML rule. The value P_CC ≡ Q for all the CMs considered in this numerical example is also shown graphically in Figure 7.
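The two fusion rules and the construction of the fused CM can be sketched as follows, assuming (as is standard for this kind of decision fusion) that the local decisions are conditionally independent given the true class. The fused CM is obtained exactly, by enumerating every tuple of local decisions; the two-class matrix in the usage example is illustrative, not one of the case-study CMs, and the tie-breaking convention in the MV rule is one plausible choice (the paper resolves ties through its fusion tables).

```python
import itertools
import numpy as np

def ml_fusion(decisions, cms):
    """Maximum-likelihood rule: pick the class k maximizing
    Pr{d_1,...,d_S | H_k} = prod_s c^(s)_{k, d_s}, under conditional
    independence of the local decisions given the true class."""
    likelihoods = [np.prod([cm[k, d] for cm, d in zip(cms, decisions)])
                   for k in range(cms[0].shape[0])]
    return int(np.argmax(likelihoods))

def mv_fusion(decisions, cms):
    """Majority voting rule; ties are broken by the ML rule (an assumed
    convention, standing in for the paper's fusion table)."""
    counts = np.bincount(decisions, minlength=cms[0].shape[0])
    winners = np.flatnonzero(counts == counts.max())
    return int(winners[0]) if len(winners) == 1 else ml_fusion(decisions, cms)

def fused_cm(cms, rule):
    """Fused CM F[i, j] = Pr{final decision = j | H_i}, computed exactly by
    enumerating all tuples of local decisions and summing their joint
    probabilities under H_i."""
    n = cms[0].shape[0]
    F = np.zeros((n, n))
    for i in range(n):
        for ds in itertools.product(range(n), repeat=len(cms)):
            p = np.prod([cm[i, d] for cm, d in zip(cms, ds)])
            F[i, rule(list(ds), cms)] += p
    return F

# Illustrative two-class example: three identical sensors with 0.8 on the
# diagonal; fusing three such decisions raises P_CC from 0.8 to 0.896.
C = np.array([[0.8, 0.2], [0.2, 0.8]])
F_ml = fused_cm([C, C, C], ml_fusion)
print(np.round(F_ml, 3))
```

With a fused CM in hand, P_CC ≡ Q follows as the average of its diagonal entries weighted by the class priors (the mean of the diagonal for equiprobable classes), which is how the matrices in Tables 4-8 are compared in Table 9.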

Conclusions
This work describes a classification algorithm based on the fusion of the class information provided by multiple imaging sensors. The classification algorithm automatically exploits the a priori knowledge provided by the sensor CM, which is used to model the sensor performance during the classification process. The entries of the CM are the conditional error probabilities in the classification and the conditional correct classification probabilities, and they are used by each sensor to make its decision on the target class. The CM is analytically computed as a function of the sensor SNR, the sensor resolution, a set of simulated reference images stored in a database, and the cross-correlation between the reference images. Then a final decision on the class is made, using a suitable fusion rule, in order to combine the decisions coming from the three sensors. The fusion, operated on the single decisions, allows us to manage the combination of information coming from very dissimilar imaging sensors and to compensate for the sensor parameter differences. The overall performance of the classification process is evaluated by means of the fused CM, that is, the matrix pertinent to the final decision on the target class. Two decision rules are described in the paper: a majority voting (MV) rule and a maximum likelihood (ML) rule. A numerical example is finally proposed, in which the described classification algorithm is applied to a case study with three imaging sensors located on a generic platform: a video camera, an IR camera, and a spotlight SAR, operating within a multisensor system for coastal surveillance. The final information on the class is used in the multisensor system as a support to other processes required during the surveillance operation. This methodology allowed us to include the classification process in the simulation of a complex multisensor surveillance system without increasing the overall computational load [18]. As a final remark, we note that in this analysis we have assumed that a recognition process always occurs. Future developments of the described approach are expected to refine the model, by considering the possibility that the image under test is not contained in the image database and by evaluating the performance of the joint process of recognition and classification.

In the appendix, denoting by E_{jk} = Σ_{i=1}^{M} (y_{k,i} − y_{j,i})² the cross energy between an image belonging to the kth class and another image belonging to the jth class, the conditional error probability P_{ERR|Hk} is expressed as a union of error events and divided into contributions, so that it can be computed by considering two elements of the union at a time and by reiterating M−3 times the operation expressed in (A.17).
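The cross energy E_{jk} appearing in the appendix computation is a plain sum of squared pixel differences between two reference images; a minimal sketch, with two made-up 3-pixel "images":

```python
import numpy as np

def cross_energy(y_k: np.ndarray, y_j: np.ndarray) -> float:
    """Cross energy E_jk = sum over the M pixels of (y_{k,i} - y_{j,i})^2
    between a reference image of class k and one of class j."""
    return float(np.sum((np.ravel(y_k) - np.ravel(y_j)) ** 2))

# Toy example: (1-0)^2 + (2-2)^2 + (3-1)^2 = 5
print(cross_energy(np.array([1.0, 2.0, 3.0]), np.array([0.0, 2.0, 1.0])))  # → 5.0
```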
More details about this analytical computation can be found in [18].

Figure 1: Simulation of the classification event based on the elements of the CM.

Figure 2: Computation of the CM.

Figure 5: Architecture of the integrated multisensor system for coastal surveillance.

Table 2: Fusion table for the MV rule.

Table 3: Fusion table for the ML rule.

Table 7: Fused CM obtained by the MV rule, F_MV.

Table 8: Fused CM obtained by the ML rule, F_ML.

Table 9: Performance of the CMs.