Adaptive Reference Image Set Selection in Automated X-Ray Inspection

The automatic radioscopic inspection of industrial parts usually uses reference-based methods. These methods select, as a benchmark for comparison, image data from good parts to detect anomalies in the parts under inspection. However, parts can vary within the specification during the production process, which makes comparison of older reference image sets with current images of parts difficult and increases the probability of false rejections. To counter this variability, the reference image sets have to be updated. This paper proposes an adaptive reference image set selection procedure to be used in the assisted defect recognition (ADR) system in turbine blade inspection. The procedure first selects an initial reference image set using an approach called the ADR Model Optimizer and then uses the positive rate (callout rate) in a sliding-time window to determine the need to update the reference image set. Whenever there is a need, the ADR Model Optimizer is retrained with new data consisting of the old reference image sets augmented with falsely rejected images to generate a new reference image set. The experimental results demonstrate that the proposed procedure can adaptively select a reference image set, leading to an inspection process with a high true positive rate and a low false positive rate.


Introduction
The highly competitive manufacturing industry has demanded higher quality and lower manufacturing costs for the past several decades. These requirements have led to great technological advances in the automation of manufacturing processes [1, 2]. One of the critical components of any manufacturing process is part inspection. Part inspection consists of measuring varied attributes of the parts, such as dimensions, shape, mass, locations, and sizes of machining operations, to ensure that they meet the required quality standards [3]. Part inspection usually employs methods of nondestructive evaluation (NDE) so as not to damage the inspected parts or affect their future usefulness. NDE methods include diverse techniques, such as radiographic X-ray imaging, penetrant testing, and eddy-current testing. Among these techniques, radiographic X-ray imaging is the most commonly used for locating abnormal features inside manufactured parts, for example, aluminum wheels, steering gears of cars, and the turbine blades of jet engines [4, 5].
A variety of methods have been developed for automated anomaly detection of industrial parts via computer-aided analysis of X-ray images [6]. These methods can be divided into two categories: reference-based and nonreference-based methods [7]. The methods in the latter category, nonreference-based methods, are often used when reference images are unavailable [7]. Various kinds of defects or anomalies are defined, and methods such as expert systems, artificial neural networks, or general filters are used to differentiate them from the characteristics of normal images [8-12]. Because it is difficult to define all possible defects or anomalies, the application of these methods is quite limited. When reference images are available, methods in the first category, the reference-based methods, are usually used, since the reference images or their statistics can be chosen as the benchmark. A test image is compared with the benchmark, and if significant differences exist, the test image is classified as anomalous [5, 6, 13-15].
The reference-based methods allow a set-actual comparison, which is not possible with the nonreference-based methods, and are efficient for detecting low-contrast defects [16]. The performance of the reference-based methods relies heavily on the reference images [2]. However, parts can vary within the specification during the production process [16]. For example, in aluminum die casting, abrasion and wear are common during the lifetime of a mold, and sand cleaning of the molds leads to variations in wall thickness. These subtle variations are visible in the X-ray images, which makes the comparison of older reference image sets with current images under inspection difficult.
In this paper, we propose an adaptive reference image set selection procedure. It first selects an initial reference image set using the ADR Model Optimizer [2], based on feature extraction from the output images of ADR, and then uses a performance metric based on the positive rate (callout rate) in a sliding-time window to determine the need to update the reference image set. Whenever there is a need, a new reference image set is generated by retraining the ADR Model Optimizer with a new set of image data consisting of the older reference image sets and the falsely rejected images.
The rest of the paper is organized as follows. Section 2 gives a brief description of the ADR system and describes in detail the ADR Model Optimizer, a nonadaptive approach to selecting a reference model image set from a large defect-free image set for ADR. Section 3 states the adaptation problem of reference model image set selection. Section 4 presents the proposed procedure, which adaptively selects a reference model image set for ADR. Section 5 discusses the experimental results, and Section 6 concludes the paper.

Overview of ADR and the Model Optimizer
2.1. ADR. ADR is a reference-based X-ray inspection system used for anomaly detection of turbine blades of jet engines from different perspectives (views) [5]. The diagram of the ADR defect recognition process for a single view is shown in Figure 1. The system consists of two phases: a modeling phase and an evaluation phase [5]. In the modeling phase, a nonparametric Parzen-windows approach is used to build a statistical model at each pixel based on the low-level features extracted from a set of aligned and normalized reference model images from defect-free blades. The model is defined as

$$M = \{P_d(u, v),\ P_d^0(u, v),\ I_T(u, v),\ I_0(u, v),\ d,\ T_p,\ T_s,\ \sigma_k\},$$

where $P_d(u, v)$ is the defect probability at pixel $(u, v)$, $P_d^0(u, v)$ is the defect prior at that pixel based on domain knowledge, $I_T(u, v)$ is a template image chosen from the good parts and used for spatial alignment, $I_0(u, v)$ is the baseline image for appearance normalization, $d$ is the defect index, $T_p$ is the probability threshold separating normal from abnormal variations, $T_s$ is the minimum defect size, and $\sigma_k$ is the standard deviation of the Gaussian kernel. Low-level image features such as the intensity value are extracted, and a nonparametric statistical distribution is created for each feature at pixel $(u, v)$. For the pixels with probabilities over the threshold $T_p$, an 8-connected component algorithm is used, and only regions larger than size $T_s$ are assumed to be defects. In the evaluation phase, a test image is preprocessed by the same operations as in the modeling phase, including registration and normalization. Low-level features of the preprocessed test image are then extracted, and the probability of each pixel being normal or abnormal is calculated by comparison with the built statistical models. Pixels are called out if the probability is over the threshold $T_p$ and the defect area is larger than $T_s$. They are marked in blue and red in the output image, representing less material and excess material, respectively, as illustrated in Figure 2.
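The per-pixel modeling and callout logic described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the ADR implementation: it uses raw intensity as the only feature, treats a low Parzen-window density under the reference samples as a stand-in for a high defect probability, and all names (`parzen_density`, `abnormal_mask`, `large_regions`, `density_floor`) are hypothetical.

```python
import math
from collections import deque

def parzen_density(x, samples, sigma):
    """Gaussian Parzen-window density estimate at x from 1-D reference samples."""
    norm = 1.0 / (len(samples) * sigma * math.sqrt(2.0 * math.pi))
    return norm * sum(math.exp(-0.5 * ((x - s) / sigma) ** 2) for s in samples)

def abnormal_mask(test, refs, sigma, density_floor):
    """Flag pixels whose test intensity is unlikely under the reference images.

    Assumption: density below `density_floor` plays the role of a defect
    probability above the probability threshold.
    """
    rows, cols = len(test), len(test[0])
    return [[parzen_density(test[r][c], [ref[r][c] for ref in refs], sigma)
             < density_floor for c in range(cols)] for r in range(rows)]

def large_regions(mask, min_size):
    """Return 8-connected regions of flagged pixels larger than min_size."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    regions = []
    for r0 in range(rows):
        for c0 in range(cols):
            if not mask[r0][c0] or seen[r0][c0]:
                continue
            seen[r0][c0] = True
            queue, region = deque([(r0, c0)]), []
            while queue:
                r, c = queue.popleft()
                region.append((r, c))
                # Visit all 8 neighbours (diagonals included).
                for dr in (-1, 0, 1):
                    for dc in (-1, 0, 1):
                        nr, nc = r + dr, c + dc
                        if (0 <= nr < rows and 0 <= nc < cols
                                and mask[nr][nc] and not seen[nr][nc]):
                            seen[nr][nc] = True
                            queue.append((nr, nc))
            if len(region) > min_size:
                regions.append(region)
    return regions

# Toy data: three aligned 4x4 reference images and one test image.
refs = [[[value] * 4 for _ in range(4)] for value in (99, 100, 101)]
test = [[100] * 4 for _ in range(4)]
test[0][0] = test[1][1] = 200   # two diagonally adjacent outliers
test[3][3] = 200                # one isolated outlier
mask = abnormal_mask(test, refs, sigma=2.0, density_floor=0.01)
regions = large_regions(mask, min_size=1)
```

In the toy data the two diagonal outliers form one 8-connected region and are kept, while the isolated pixel is discarded by the size filter.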

2.2. ADR Model Optimizer. ADR uses a reference-based method for anomaly detection of turbine blades. A reference model image set is selected from good parts, and statistical models created from the reference model image set are established as the benchmark, with which the images of turbine blades under inspection are compared. The selection of the reference model image set is critical. The ADR Model Optimizer is an automatic approach we developed [2] to select a static reference model image set from a large defect-free image set. Figure 3 gives the schematic diagram of the ADR Model Optimizer.
As shown in Figure 3, given an anomaly-free image set $S$, the approach selects a model image set $M_{12}$ from $S$ in two steps, where $M_{12} = M_1 \cup M_2$. $M_1$ and $M_2$ are the two model image sets selected in Steps 1 and 2, respectively, with the corresponding sizes $N(M_1), N(M_2) \ll N(S)$, the size of set $S$.
Step 1. The selection of the model set, $M_1$, as shown in Figure 3 is as follows.
(2) Feed $M_0$ into ADR as the model set, and use $S$ as the test set.
(3) Run ADR to inspect every image in $S$. The images considered anomalous by ADR are called out, with indications marked in the callout images. Define the callout image set as $S_{\mathrm{out1}}$.
(4) Based on the indication features of $S_{\mathrm{out1}}$, including the indication size, types, and locations, select $M_1$ from $S_{\mathrm{out1}}$.
Step 2. The selection of the model set, $M_2$, as shown in Figure 3 is as follows.
Replace the model set $M_0$ with $M_1$. Repeat (2), (3), and (4) in Step 1. Note that the callout image set in this step is defined as $S_{\mathrm{out2}}$, and based on the features of $S_{\mathrm{out2}}$, $M_2$ is selected from $S_{\mathrm{out2}}$.
The final selected model image set is $M_{12}$, where $M_{12} = M_1 \cup M_2$. From [2], we see that the ADR Model Optimizer can find a model set of optimal size in two steps, ensuring a low false positive rate with an acceptable true positive rate. The approach has been validated on different types of blades and on varied views of each blade type.

Problem Statement
The X-ray image data of turbine blades can vary within the specification over time during the production process. To adapt to the variation, the reference model image set should not be static but should be updated as needed. The problem is to choose performance metrics that measure and detect the variation and to develop methods for revising the current model set when significant variation has been detected. The revising method should involve as little human intervention as possible and should find a new model set that lowers the false rejection (false positive) rate while maintaining an acceptable true positive rate.

Proposed Procedure
The framework of the proposed procedure to adaptively select reference model image sets is illustrated in Figure 4.
The procedure first selects an initial reference model set MOA. ADR then uses MOA to inspect the image data generated after point A, and the callout rate in a sliding window (CR STW) is used to measure the variation of the image data. If CR STW exceeds a set threshold Thr, the reference model image set MOA is updated to MOB.
For the selection of MOA, the defect-free image data are obtained through human analysis of the image data generated from time point O to point A. The defect-free image data are fed into the ADR Model Optimizer to generate the initial reference model set MOA.
For the selection of the updated reference model set MOB, new defect-free image data are obtained through human analysis of the callout image data from point A to point B. The new defect-free image data, augmented with the old reference model set MOA, are fed into the ADR Model Optimizer to generate the new reference model set MOB.
The following subsections discuss the performance metric for image data variation, CR STW, and the update method in detail.

4.1. Callout Rate in a Sliding Window. Image data can vary within the specification during the production process. The false positive (false rejection) rate in a sliding-time window (FPR STW) can be used to measure the variation. However, FPR STW is difficult to obtain in practice: it equals the number of false positive images divided by the number of defect-free images, so the entire defect-free image set in the sliding-time window must be identified before it can be calculated. To counter this, FPR STW can be replaced by the callout rate in the sliding-time window (CR STW). FPR STW is usually approximately proportional to CR STW since the majority of the callout images are false positives and the number of true positive images is very limited.
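The proportionality argument can be made explicit. As a sketch, write $N$ for the number of images in the window, $N_{\mathrm{good}}$ for the number of defect-free images among them, and FP and TP for the numbers of false and true positive callouts (these symbols are introduced here for illustration only):

```latex
\mathrm{FPR} = \frac{\mathrm{FP}}{N_{\mathrm{good}}},
\qquad
\mathrm{CR} = \frac{\mathrm{FP} + \mathrm{TP}}{N}
            = \mathrm{FPR}\,\frac{N_{\mathrm{good}}}{N} + \frac{\mathrm{TP}}{N}.
```

When $\mathrm{TP}/N$ is small and the share of defect-free images $N_{\mathrm{good}}/N$ changes slowly, CR tracks FPR up to a roughly constant factor. Unlike FPR, CR needs only the callout decisions and the image count, not the ground-truth labels.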
For CR STW, the size of the time window should be selected carefully. If it is too small, CR STW reflects only short-term changes in the image data. If it is too big, CR STW cannot reflect the change in a timely manner. The appropriate window size depends on the specific situation and can be obtained through extensive experiments.
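A minimal sketch of the metric and the threshold check follows. It assumes daily counts of inspected and called-out images are available, pools the counts over the window rather than averaging daily rates, and reports partial windows during the first days; these are design assumptions of the sketch, and the function names are hypothetical.

```python
from collections import deque

def sliding_callout_rate(daily_counts, window_days):
    """Yield (day, callout rate over the last `window_days` days).

    daily_counts: iterable of (day, n_inspected, n_called_out), in day order.
    The rate is None when no images fall inside the window.
    """
    window = deque(maxlen=window_days)  # old days drop out automatically
    for day, n_inspected, n_called_out in daily_counts:
        window.append((n_inspected, n_called_out))
        total = sum(n for n, _ in window)
        called = sum(c for _, c in window)
        yield day, (called / total if total else None)

def first_day_over(daily_counts, window_days, threshold):
    """Return the first day whose windowed callout rate exceeds the threshold."""
    for day, rate in sliding_callout_rate(daily_counts, window_days):
        if rate is not None and rate > threshold:
            return day
    return None

# Toy counts: the callout rate rises on days 3 and 4.
counts = [(1, 100, 5), (2, 100, 5), (3, 100, 20), (4, 100, 30)]
rates = list(sliding_callout_rate(counts, window_days=2))
update_day = first_day_over(counts, window_days=2, threshold=0.2)
```

With a one-day window the same toy data would cross the threshold one day earlier, which illustrates the sensitivity trade-off described above.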

4.2. Model Set Update Method. When significant variation of the image data is detected, the reference model image set needs to be updated. The proposed update method is based on the ADR Model Optimizer. For the ADR Model Optimizer, all the collected image data need to be analyzed by human experts to obtain the defect-free image data, on which the ADR Model Optimizer is then trained to obtain the reference model set. For the update method, only the callout (rejected) images, rather than all the image data, need to be analyzed by human experts to identify the falsely rejected (false positive) images. These false positive images are defect-free and represent normal variations. Using the false positive image data augmented with the old reference model image set, the ADR Model Optimizer is retrained to generate a new reference model image set.
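The update step can be sketched as follows. This is an illustration under assumptions: `is_false_positive` stands in for the expert review of callout images and `optimizer` for a retraining call to the ADR Model Optimizer; both are hypothetical callbacks, not an existing API.

```python
def update_model_set(old_model_set, callout_images, is_false_positive, optimizer):
    """Build the retraining data and return the new reference model set.

    Only the callout images are reviewed; those judged to be false positives
    are defect-free and are added to the old model set before retraining.
    """
    false_positives = [img for img in callout_images if is_false_positive(img)]
    training_set = list(old_model_set) + [
        img for img in false_positives if img not in old_model_set
    ]
    return optimizer(training_set)

# Toy usage: images are labels, and the "optimizer" simply returns its input
# so that the assembled training set can be inspected.
reviewed_ok = {"c", "e"}  # callouts the expert judged defect-free
new_model_set = update_model_set(
    old_model_set=["a", "b"],
    callout_images=["c", "d", "e"],
    is_false_positive=lambda img: img in reviewed_ok,
    optimizer=lambda training_set: training_set,
)
```

The point of the design is the review cost: the expert looks at the callouts only, which are a small fraction of all inspected images.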

Experimental Results and Discussion
The proposed procedure for reference model image set selection has been validated through extensive experiments on X-ray images of the trailing edge view of turbine blades of blade type "A".
A total of 13440 images generated in 122 days are used, in 3 categories: 9835 defect-free images, 235 images with strong indications, and 3370 images with weak indications. The images with strong indications correspond to turbine blade parts that cause safety issues for jet engines and should be discarded. The images with weak indications correspond to parts with minor anomalies that can still be used in jet engines. The performance metrics used are the callout rate (CR), the false positive rate (FPR), the true positive rate (TPR) of the images with strong indications, and the TPR of the images with weak indications, each computed in a sliding-time window. Note that, to protect proprietary information, the ADR system is tuned arbitrarily rather than at its best operating point, and the reported CR, TPR, and FPR values are not the actual numbers in the production line.
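For reference, the four metrics can be computed from labeled records as in the sketch below. The record layout and the names `window_metrics`, `TPR_S`, and `TPR_W` are assumptions made for illustration; a rate is returned as `None` when the window contains no images of the corresponding category.

```python
def window_metrics(records):
    """Compute CR, FPR, and the two TPRs for the images in one time window.

    records: iterable of (category, called_out), where category is one of
    "good", "strong", or "weak" and called_out is a bool.
    """
    totals = {"good": 0, "strong": 0, "weak": 0}
    called = {"good": 0, "strong": 0, "weak": 0}
    for category, called_out in records:
        totals[category] += 1
        called[category] += int(called_out)

    def ratio(numerator, denominator):
        return numerator / denominator if denominator else None

    return {
        "CR": ratio(sum(called.values()), sum(totals.values())),
        "FPR": ratio(called["good"], totals["good"]),
        "TPR_S": ratio(called["strong"], totals["strong"]),
        "TPR_W": ratio(called["weak"], totals["weak"]),
    }

# Toy window: 8 defect-free images (2 called out), 2 images with strong
# indications (1 called out), and no images with weak indications.
records = ([("good", True)] * 2 + [("good", False)] * 6
           + [("strong", True), ("strong", False)])
metrics = window_metrics(records)
```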
Figure 5 shows the distribution of the number of images generated by day for each category. From Figure 5, we see that the number of images generated varies from day to day. The number of images with strong indications is small compared to the number of defect-free images and images with weak indications.
The ADR inspection starts with an initial reference model image set MOA. Figure 6 shows the performance of the initial reference model image set: CR, FPR, the TPR of the images with strong indications (S set), and the TPR of the images with weak indications (W set), each in a sliding-time window of 15 days. The initial reference model image set MOA is selected on Day 37, with about 3000 defect-free images gathered.
As shown in Figure 6, the callout rate observed in a sliding-time window of 15 days (CR STW, the red-filled-circle line) changes after the model MOA has been selected on Day 37. The false positive rate observed in the same time window (FPR STW, the purple-up-triangle line) follows a similar trend to CR STW, as expected. For the images with strong indications, the true positive rate in the time window (TPR STW, the green-filled-diamond line) fluctuates over time but is acceptable for this view of the blade type. For the images with weak indications, TPR STW (the green-filled-pentagram line) is not high, but these images correspond to blades with minor anomalies that can still be used in jet engines; they are of less concern than the images with strong indications and are not discussed further. Note that TPR STW for images with weak indications does not exist between Day 51 and Day 66 because no images with weak indications were generated during this time interval, as is also shown in Figure 5.
From Figure 6, we see that CR STW is above the set threshold (the red dashed line) on Day 100, indicating that significant parts variation has occurred and the reference model image set needs to be updated. FPR STW is above 20% (the blue dashed line) on Day 103, which also reflects the significance of the parts variation near Day 100, as expected.
Figure 7 shows the callout rate observed in a sliding-time window of 15 days before and after the update of the reference model image set on Day 100 (point B). From Figure 7, we see that after the model set is updated, CR STW (the blue-diamond-filled line) decreases below the set threshold (the red dashed line). The corresponding FPR STW and TPR STW for the images with strong indications are shown in Figure 8.
As shown in Figure 8, after the model set is updated, FPR STW decreases below 10%, and TPR STW for the images with strong indications remains at an acceptable level, illustrating the effectiveness of the model update method.
The above experiments use a time window of 15 days to observe the variations of the performance metrics. We also investigate the selection of the sliding-time window size. Figure 9 shows FPR STW for window sizes of 1, 5, 15, and 25 days with the initial model set MOA.
From Figure 9, if the observing window is too small, such as 1 day or 5 days, FPR STW is oversensitive. If the window size is 25 days, FPR STW is much more stable but not sensitive enough to reflect the parts' variations in a timely manner. As a trade-off, we select a sliding-time window size of 15 days. More experiments are needed to determine the optimal window size heuristically.

Conclusions
We proposed a procedure to adaptively select reference model image sets for ADR, the reference-based inspection system for turbine blades. The procedure defines the callout rate in a sliding-time window as the performance metric for parts variation. If this metric is above the set threshold, the old reference model image set is updated. The update method involves little human intervention and can generate a new reference model image set with a much lower false positive rate than the old set while keeping an acceptable true positive rate for images with strong indications. The proposed procedure might be extended to other reference-based inspection systems for reference data selection.

Figure 1: Diagram of ADR system defect recognition process.

Figure 2: Callout images of ADR with markings for blade type A. The blue marking represents less material, and the red represents excess material.

Figure 4: Framework of the proposed adaptive method to select the reference model image sets.

Figure 5: Distribution of the number of images generated by day for each category.

Figure 6: Performance for the initial reference model image set MOA over time.

Figure 7: Callout rate in a sliding window of 15 days before and after the update of the reference model image set.

Figure 8: False positive rate and true positive rate of images with strong indications in a sliding window of 15 days before and after the update of the reference model image set.

Figure 9: FPR in different sliding-time windows for MOA.