Evaluating Subpixel Target Detection Algorithms in Hyperspectral Imagery

,


Introduction
Ideally, one would like to choose a hyperspectral detection algorithm for use in a particular scenario with the assurance that it would be "optimal," that is, that the type of algorithm to be used and its free parameters would be optimized for the particular task for which it is being considered. Of course, in such cases, the complexity of real-world scenarios and the difficulties of predicting the exact target signature in situ, make it hard to believe that we can predict the optimal target detection algorithm ahead of time. Because the responses of these algorithms can vary depending on target placement, we adapted the Rotman-Bar Tal Algorithm (RBTA) [1] for comparing point target detection algorithms, used for infrared broadband images, to the analysis of hyperspectral imagery [2][3][4]. The RBTA implants targets and evaluates the response of the detecting algorithm to their presence in every pixel in the dataset. Indeed, our development of new algorithms based on this tool has been validated by results obtained by other researchers in actual field tests [5,6].
An inherent weakness of the RBTA method is its assumption that subpixel targets will each be contained within a single pixel. In light of our recent work [7], which showed that even very small targets can affect several pixels, here we fine-tuned the RBTA method to account for this possibility.
Sections 2-6 describes the RBTA in detail. We show how the simulation of target detection performance is dependent on the spatial correlation of the pixels present in the target.
Sections 7-12 analytically considers the expected performances of several detection algorithms under conditions of "pixel phasing," that is, a small target located simultaneously in several adjacent pixels. Our improved RBTA (IRBTA) takes into account target blurring and pixel phasing. The results presented in  show that the superiority of the ACE algorithm and the importance of accounting for target blurring are validated in a real data analysis based on the RIT target detection blind test experiment. Conclusions are presented in Section 17.

Determining the "Best Algorithm" for Target Detection
Manolakis et al. [8] claimed that to identify the best algorithm for target detection, we need datasets with reliable ground truth spanning a diverse range of targets and backgrounds under various scenarios, target fill factors, and atmospheric conditions. Statistically significant numbers of target and background pixels are necessary to construct reliable ROC curves. Because in many cases this degree of data confirmation is unavailable, we suggest an alternative approach for estimating the best algorithm from among several detectors for specific backgrounds and targets. We start by presenting the RBTA [1]. The algorithm was originally developed for broadband infrared images with subpixel targets, but we altered it to account for pixel blur (atmospheric and system effects which would cause the emitted power of the target to be spread over several pixels) and multipixel targets in hyperspectral imagery.
To estimate detector performance, Rotman and Bar-Tal proposed a multistep process that begins with an analysis of the unmodified reflectance image that is available in the website without any embedded targets. (We assume that ideally no targets are present in the datacube being analyzed; if one were present, it would slightly distort the histogram of the image. We trust that such a distortion will not disturb the overall analysis of the image statistics). The algorithm being tested is evaluated for each pixel, and the results are summarized in what we call a false-alarm histogram. Next we embed targets into every pixel and evaluate each of the algorithms. This is done independently for each pixel (rather than simultaneously) so that surrounding pixels are not changed prior to algorithm evaluation. The results are arranged in a target detection histogram. Each histogram (false-alarm and target detection) is then normalized; a variable threshold is set and the area of the false-alarm and target histograms to the right of the threshold are measured. For any particular threshold, a pair of P FA and P D (probability of false alarm and probability of detection) values are generated. The threshold is swept through all possible detector outputs, generating a set of these pairs. When graphed, these points produce the ROC (receiver operating characteristic) curves.
We note that the target implantation mechanism as given here has ignored several possibly significant effects which would affect the values found of PD. In particular, the target spectrum is a nearly noiseless lab spectrum that does not have the same artifacts, noise, and degradation as the real imagery. Additionally, this approach assumes the data has been perfectly atmospherically compensated by RIT's algorithm, which is not necessarily true. In our opinion, this seems to limit the use of our method rather than to invalidate it. Since the atmospheric conditions at the time of the measurement were not known, we cannot implant atmospherically corrected signatures or validate the reflectance dataset that is available in RIT website.
Instead, we are testing the response of the algorithms to an implanted nonatmospherically corrected target which has been substituted in the reflectance dataset as described above; in each examined pixel, the fraction of the laboratory signature replaced the fraction of the background signal. While inaccurate atmospheric correction may result in an unknown decrease in the target detection, we note that the final comparisons are for variations in algorithm selection for a given target signature. The method should not be used to calculate absolute values for the probability of detection of a particular target which indeed has been altered by atmospheric and other effects. Rather, we are attempting to determine which algorithms will have a superior probability of detecting a target of this type in the scenario. Future work should include a quantitative determination to what degree atmospheric effects change the ranking of different algorithms.
This methodology can be used for the following reasons: as a rule, the ROC curve, which are generated tend to have probabilities of detection which range from 0 to 1; the value of probabilities of false alarm, on the other hand, vary from 0 to some chosen threshold P FA-MAX . This threshold is normally set quote low; a standard value would be 0.01. This is appropriate since the acceptable use of most detection algorithms could only be in the range where a small percentage of the pixels in the image would be false alarms. Now, the exact distribution of the background pixels is crucial for the analysis of our detection algorithms; it will indeed be the exceptional pixels in the tail of the distribution which will determine the ROC curve. However, since the probability of detection is being determined by the entire P D scale from 0 to 1, all pixels contribute. In other words, the target detection scheme in this paper is extremely sensitive to a few false alarms; it is much less sensitive to a few pixels with missed "synthetic" target signatures. As such, subtle effects affecting the exact form of the target signature in situ are not being measured; rather the average response of the algorithm to the target signatures placement in all the pixels is the key factor. For our above goal, that is, the comparison of different target algorithms, we believe our method to be reliable.
To summarize, ROC curve evaluation entails the following steps as demonstrated in Figure 1

CEM.
In many cases, it is convenient to scale the matched filter such that it has a value of 1 when the target signature fills the pixel being examined. This scaling can be achieved by normalizing the matched filter to its value when operating on the designated target spectrum: where s is the reference signature of the target, R is the background correlation matrix, that is, an [L × L] matrix, L is the number of bands, and x is the observed pixel.  whitened space and thus leads to planar decision surfaces in that space. An important characteristic of the CEM algorithm is that its output is correlated to the target's fractional abundance in signature x, assuming the target signature is well isolated from the other endmembers, mixing is linear, and the relative abundances of the endmembers follow a Dirichlet distribution [9].

GLRT and ACE.
Manolakis and his group [10][11][12][13][14] have described a number of stochastic target detection algorithms, including that attributed to Kelly [15] for solving to the Neyman-Pearson decision/detection theory for maximizing the probability of detection of a target with a fixed probability of false alarms. The solution uses a GLRT expressed as where s and x are the same as for (1), m g is the global mean, G is the background covariance matrix, and M is the total number of samples. The ACE algorithm, a variation of the GLRT algorithm, is expressed as with a maximum value of 1 for the case of x = s and a minimum value of 0 when x = m g .
In the context of target detection, the sign of (s − m g ) T G −1 (x − m g ) is important, as only positive abundances are of interest. (In contrast, this would not be the case for thermal gas detection, for example, where the target could be either absorptive or emissive in nature). Thus, in practice, a signed version of the GLRT algorithm is used as follows: The corresponding ACE algorithm for target detection, also a variation of the GLRT algorithm, is expressed as Because real data does not necessarily match the assumptions from which the above algorithms are derived, that is, a background probability distribution function assumed to be multivariate Gaussian with zero mean bias and an additive target model, we generally cannot expect that any of the algorithms will be optimal or even that one will consistently outperform another [8]. Nevertheless, it was shown by Manolakis [13] that for a limited dataset, although each of the algorithms exhibited some degree of success in target detection, the ACE algorithm performed best on the limited dataset tested.
In Figure 1, step 1, note that the target is not in all the positions simultaneously; rather, the result is obtained sequentially. Steps 4 and 8 are generated by one minus the cumulative histogram using the results from step 3 and 7, respectively, (these are the probability of detection-PD). In Step 9, we plot PD values (step 4) versus the PFA (step 8).

Subpixel Target Detection Using Local Spatial Information
Improving target detection involved replacing the global mean with the local mean. Using the local mean is definitely double edged: on one hand, we would expect that the closer the points used to evaluate the background are to the suspected target, the more likely it is that the estimate will be accurate. On the other hand, the noise in the estimate will decrease given more points entering into the estimation, assuming that the background is stationary and the noise is linearly added to the background and independent thereof. Our empirical experience confirmed by several studies (4) and (5) is that the closer we choose the pixels the better, with the condition that we do not have target contamination of the background pixels. It is this proviso that we wish to test here.
We note that we are not dealing here with a "local" covariance matrix which would change when evaluating each pixel in the image. Rather, we use the same covariance matrix throughout the image; it will simply be based on the difference of the sample pixels and their "local" background.
Since we are dealing with a subpixel target, which in the physical domain can affect only pixels in a limited spatial area surrounding the center of the target, we used the eight nearest neighbors approach to estimate the value of the test pixels. The CEM algorithm does not use the mean and will therefore be unaffected by the above changes. The GLRT can be improved as follows: and for target detection with m 8 , the mean of the eight nearest neighbors, replacing the global mean m g . For the ACE detector, the same procedure (replacing m g by m 8 ) may be followed. Segmentation [16][17][18] or even more local covariance matrices [2,4,6,19] can be used to improve the covariance matrix. Common to all these methods is an increased need for high performance computational resources, while the corresponding influence each method has on detection ability is uncertain and highly dependent on the pictures being analyzed. Used in parallel, the algorithms create new difficulties through the combination of results from different segments. We used a global covariance matrix, but adapted it to local variations by using the local rather than the global mean, that is, where X is a two-dimensional matrix (M × L), in which M is the number of pixels and L is the number of bands, m g [1 × L] is the mean vector of X, and M g is m g replicate M times. When we use M 8 for the covariance matrix, we do not need to replicate the mean, because M 8 is also of size [M ×L], and this is the appropriate covariance matrix for whitening X − M 8 .

Data
We tested our algorithms on the online reflectance data sets and the hyperspectral data collected over Cooke City. The Cooke City imagery was acquired on 4 July 2006 using the HyMap VNIR/SWIR sensor with 126 spectral bands. Two hyperspectral scenes are provided with the dataset, one intended to be used for development and testing (the "Self Test" scene, where the positions of some targets are known) and the other intended to be used for detection performance evaluation (the "Blind Test" scene, where the position of targets is unknown). The data was corrected for atmospheric effects and available in the website but the exact atmospheric condition and the atmospheric correction algorithm are not available in the website and we assume that the reflectance dataset is good but not perfect. In Figure 2, we present the image in false color. The target signatures, used both in the algorithm for detection and in the implantation of the synthetic targets in the RBTA method were laboratory measured and in reflectance units. The GSD is approximately 3 m. In Figure 3, we present the spectral signature of the targets in the blind test image.
The list of all targets is presented in Table 1 below.

Analytical and Simulated Performances of GLRT and ACE
6.1.1. Simple Case. The general form for local target detection as described in Section 3 is Journal of Electrical and Computer Engineering 5  with m 8 as the mean of eight neighbors. G, the global-local covariance matrix, is computed as where we can get GLRT and ACE as functions of Ψ 1 + Ψ 2 : For the case in which the PUT (pixel under testing) x is exactly s, we obtain the following results: Let us define the scalar C as− Therefore, when x is exactly s, GLRT and ACE can be written as Assuming that the data is normally distributed, C is chisquare distributed with E(C) = L, where L is the number of bands. For the case in which M E(C) = L, we can assume that

Pixel Phasing Case
When imaging, the target can often fall across several pixels even if its total size is only a single pixel; we will call this effect pixel phasing even though it is a natural consequence of imaging system quantization. The pixel phasing effect can be demonstrated by a target one pixel in size, the imaging of which leads to pixel phasing registration defined by p,  such that 0 < p < 1 (0 corresponds to perfect sampling, with the target completely replacing the background) ( Figure 4). From the point of view of the central pixel, it is not important the spatial location of the fraction within the pixel nor the location of the remainder of the target signature. Assuming uniform backgrounds of m old 8 for both center pixels, they can now be given as in Figure 4. We obtain the following: where x new is the new PUT for the pixel phasing case and where m new 8 is the new mean for the background. We now evaluate the terms (s − m new8 ) and (x − m new8 ) as follows: The GLRT result now becomes where D Local (x) is the general local detector for the pixel phasing case. For the case in which N C, we calculate that GLRT miss sampling (x) ∼ = p 2 + 14p + 49 64 where GLRT miss sampling (x) is the expected GLRT value for the pixel phasing case and M L. The GLRT expected value degrades as a function of p. But for ACE (Ψ 1 = 0, Ψ 2 = 1) we still get expected values of 1: In this model, the complete lack of ACE degradation as a function of pixel phasing may explain why ACE is a more robust detector than GLRT in many test cases, as noted in the literature [8,20].

Ranking the Algorithms by RBTA
The difficult task of synthesizing a synthetic image to help predict which algorithm to select is simplified and detector selection is facilitated if we synthesize only the target signature of our real image. Suppose we want to determine the proper detector for a specific target. We have already selected our method (e.g., CEM, GLRT, or ACE), and now we want to select the size of the local window. One approach is to assume that the best size for the local window is that under which the PUT value can be predicted with minimum error vis-à-vis the real PUT (in which we normalize each band by the mean values of the pixels in that band).
The approach outlined above depends only on the background image, not on the target signature, and it entails two assumptions: first, estimating signature values will improve our detector results independent of the different target signatures and second, the target has no effect on its neighbors. Address these assumptions in the following sections.

How to Use RBTA
The implementation of RBTA, which depends on our ability to implant realistic signals into backgrounds and measure detector response, should be done carefully. We cannot expect the real signature to be identical to a library signature, but we can hope for a high level of similarity. The low percentage of the target signature that actually enters any particular pixel is demonstrated in Figure 6; the response of the CEM filter, which responds proportionally to the percentage of the target fill in the tested filter, was maximum at 0.06.
As a rule, to test and challenge our algorithms by examining the area under the ROC curve, we need to test targets which neither "saturate" the ROC curve (with a probability of detection close to one with no false alarms detected) nor result in a "diagonal" ROC curve (in which the probability of detection equals the probability of false alarms. As the allowable false alarm rate decreases, the strength of our synthetic implanted target would need to increase; if we know what the acceptable false alarm rate is, we can select the target percent that will demonstrate the dynamic range around this rate and get results for our detectors ( Figure 5).
For the values found experimentally for p, the target was easily detectable and saturated our ROC curve. Thus, we only embedded 0.0075 of the target signature in the background pixels to generate the target detection histogram. In our results, we found that for GLRT and ACE, the best local window size is 3 × 3 pixels (CEM has no local form). We also see that using bigger windows to estimate the pixel signature value gets us closer to the performance of global detectors that use a global mean. In this case, it is clea that local detectors are superior to global detectors.
In terms of real data, we must expect each target to affect more than one pixel even if its total physical size is at the subpixel level. A discussion of this point follows below and leads to improvement of the RBT algorithm.

Improvements to RBTA
10.1. Target Size. As will be discussed in Sections 11-12 the apparent target size in the final digital image is related both to its physical size and to various atmospheric and sensor effects, for example, its point spread function (PSF), Gibbs effect, crosstalk between pixels, spatial sampling, band-to- band misregistration [5], and motion compensation. Thus, a target of a single pixel could actually occupy several pixels. In the RIT blind test, there are two 3×3-m targets, that is, exactly the size of the ground sample resolution (GSD) for the self and blind test images. Figure 6 shows a sample target of this size.

Spatial Sampling Effect
If we take into account only the spatial sampling effect, we can estimate the percent of pixel area partially occupied by the target. Notice that even targets of subpixel size often spread over neighboring pixels (Figure 7).
Put formally, the percent of pixel area covered as a function of target size, target location, and target orientation is where θ represents the clockwise rotation of the target relative to the pixel grid, and α, β represent the proportions of target length and width, respectively, relative to pixel physical dimension, Δx, Δy are the transition of the target origin relative to the pixel origin, as demonstrated in Figure 8 UnitBox The expected value E[Sα, β] is If we set θ to a constant value of zero and the target length and width are half the size of a pixel, we can simulate all the locations of the target where it's covering the same percent of pixel area (Figure 9). Calculating (25) for a different size target using a numerical example produced the results shown in Figure 10.
In the graphs depicting pixel coverage as a function of physical target size, the x-axis is the ratio either between the target area and the pixel area (Figure 10(a)) or between the target length and the pixel length (Figure 10(b)). The blue line in both figures represents the percent of target within the pixel, while the green line is the percent of the pixel expected to be covered. It is intuitive that a very small target will be located in only one pixel, covering a small percent of that pixel. It is less intuitive, however, that the expected pixel to be covered will be entirely covered only by a target with an area four times that of the pixel.

Point Spread Function Effect
The PSF effect, present in any optical system, is not always known. Let us assume that the PSF is a typical, rotationally symmetric Gaussian filter of size 3 × 3 with standard deviation sigma 1/2. Figure 11 demonstrates the synthetic spread effect that emerges from the convolution of the optical PSF (Figure 11(a)), and the physical pixels phasing due to target size (Figure 11(b)). For the spatial sampling we took the mean case representing the average pixel phasing that we could expect. Figure 11(c) represents the total effect, for example, convolution between (a) and (b). We devised an improved RBTA (IRBTA) and embedded the pixel signature and its neighbors with ratios as shown in Figure 11(c).
A comparison of Figures 5 and 12 shows that global detector and local detectors that both use only 7 × 7 frames Journal of Electrical and Computer Engineering   have exactly the same performance. Evidently, the target spread effect is only localized in a 5×5 window. Furthermore, the most significant effect is for a local 3 × 3 window. Indeed, the local 3 × 3 seems robust, and it outperformed all other detectors. We expected these detectors to be the best for this target in this image.
For our next stage of proof of concept, we need to compare real detection performances to these simulated results. We obtain this by using the self-test dataset and by submitting our algorithms to the RIT target detection blind test and comparing the ranking of our algorithms for each of the target signatures available in the set; we can then see if the IRBTA predictions of the preferred algorithms are true.

Scoring Methodology and Result Presentation.
Detection algorithm performances are given according to the RIT target detection methodology applied to the aforementioned Cooke City hyperspectral dataset. The score is based on a comparison of the values given to the background pixels in the image to the value given to the target. The target value is defined as the maximum value given the pixels in the target area; the metric then counts how many pixels there are in the overall image greater than or equal to the target value. Since the threshold needed to detect the target would have to be less than or equal to the target value, all points above this value are false alarms. The score given for any algorithm/target combination would thus be the number of pixels above the target value. Perfect detection would equal the value 1, since the only pixel equal or above the target value would be the target itself; no false alarms are present.
In Tables 2, 3, and 4 the scores for the self-test targets were calculated, and those for the blind test were obtained by uploading our results to the RIT website. We have colored coded the results for the ease of the reader. A value of 1 (italic background) indicates perfect performance (no false positives). We marked all scores greater than 448 (e.g., PFA = 2 * 10 −3 ) with bold italic backgrounds. We will associate a false alarm score greater than this as a fundamentally undetected target; ranking above these values is irrelevant All results (i.e., scores) were divided among the three tables: Table 2 presents the global method (CEM, GLRT and ACE) while Tables 3 and 4 contain the results for the local GLRT and ACE, respectively, with different sized windows. The best score for each group is marked with a bold. Ignoring (as stated above) the PFA values > 2 * 10 − 3 (bold italic background), then the global ACE is clearly the best method from among the global methods. The local GLRT and ACE with the 3 × 3 windows are the best of the local GLRT and ACE methods, respectively. Indeed, the overall performance of the local ACE algorithm with 3 × 3 windows was superior to all other algorithms. All these results were obtained using the IRBTA without any need of ground truth. Some targets, for example, V3 in the blind test and V2 and V3 in the self test, degraded with the 3 × 3 window of the local ACE algorithm, but those targets were effectively undetectable with all three algorithms.

Global Methods: Results
Within the results for the global methods (Table 2), we marked detector scores indicating a significant advantage relative to other detectors in bold. Our results clearly show that the ACE detectors outperformed the other global detectors.

Local GLRT and Local ACE: Results
Similar to the global methods analysis, here (Tables 3 and  4) we also applied yellow highlights to cells with exceptional   detector scores that signified a significant advantage relative to other detectors. From our analysis, it is clear that for both the GLRT (Table 3) and ACE (Table 4) detectors, the optimal size of the local window for this size target is 3 × 3, a result that is in accordance with our estimation by IRBTA.

Benchmark Results
The RIT website provides a comparison between the results of different algorithms which have been submitted throughout the world. Table 5, which presents the web rank for the blind test, clearly shows that the performance of the local ACE 3×3 algorithm was superior relative to dozens of results that have been uploaded to the site. We achieved perfect detection for V1 (the nearest score was 15 for algorithms that work well only for V1) and reasonable detection for V2 (the nearest score was 196). There was no algorithm that achieved detection for V3 (the best score was 112). (The above conclusions are correct if we do not take into account one algorithm that was submitted to the RIT site, labeled "WTA", i.e., "Winner Take All." This algorithm exhibited perfect detection for all targets, including the vehicles, However, since the WTA algorithm has not been published, we cannot test it. In addition, there are objective reasons to believe that the algorithm does not actually exist; perfect detection of all targets with no false alarms ever would be an almost impossible result). While we included WTA in our scoring in Table 5, our comments following the Table do not consider  this algorithm). It is possible to compare the actual results obtained by the target detection algorithms on the RIT system to those obtained from the IRBTA simulation. In Tables 6 and 7, we show the results of several different algorithms when detecting the target V1; when we examine the results from the RIT test, we see that the ACE algorithm outperformed the GLRT algorithm, and that there was a definite preference for the 3×3 frame compared to the other frames (Table 6). These results are mirrored in our IRBTA results (Table 7). This (and results on the other targets) confirms our hypothesis that the RBTA can be used to simulate and predict target detection probabilities a priori in scenarios where actual targets are not yet present.

Conclusion
In this paper, we showed that there is no "best hyperspectral detection algorithm" for all images and targets. We noted the significant effect spatial distribution has on detector  performances, and we showed that the RBTA can be used to select the proper detectors from among several detectors but without any need for ground truth. However, point targets can influence their neighboring pixels, due either to the PSF or to the target spreading across more than one pixel. To  account for this potential source of inaccuracy, therefore, we introduced the improved RBTA (IRBTA), whose exact method of use depended on the target size. In addition, we showed that when detectors calculated the mean for estimating the pixel signature value, we did not need ground truth to find the best estimate. We tested our concept through the selection of the best detectors from among stochastic algorithms for target detection, that is, the constrained energy minimization (CEM), generalized likelihood ratio test (GLRT), and adaptive coherence estimator (ACE) algorithms, using the dataset and scoring methodology of the Rochester Institute of Technology (RIT) Target Detection Blind Test project. The results showed that our concepts predicted the best algorithms for the particular images and targets provided by the website.