A Novel Vision-Based PRPL Multistage Image Processing Algorithm for Autonomous Aerial Refueling

Autonomous aerial refueling (AAR) technology can effectively increase the flight endurance of unmanned aerial vehicles (UAVs). Drogue detection and target tracking are critical for the probe-and-drogue refueling system in the docking stage. This paper proposes a novel vision-based multistage image processing algorithm for drogue detection and target tracking in AAR. The algorithm divides the whole task into four stages: preprocessor, recognizer, predictor, and locker (PRPL). An adaptive threshold segmentation (ATS) algorithm and a support vector machine (SVM) classifier are used in the preprocessor and recognizer for drogue detection. An improved kernelized correlation filter (IKCF) tracking algorithm and a scale-adaptive method that adjusts the window position and the image resolution are adopted in the predictor and locker for target tracking in complex dynamic environments. Finally, the proposed PRPL multistage image processing strategy is tested on an autonomous aerial refueling testbed. The results indicate that the proposed algorithm achieves higher precision, better reliability, and better real-time capability than conventional algorithms. The average processing time is within 11 ms in various environments, which meets the requirement for drogue detection and tracking in AAR.


Introduction
AAR technology can greatly extend the flight duration and increase the payload of UAVs, and it has long been favored by experts in the military and national defense fields. Compared with manned aerial refueling, which carries a higher risk and is more difficult to operate, UAV-based refueling greatly reduces operational costs and risks [1][2][3]. There are two major methods for AAR: the boom-and-receptacle refueling system and the probe-and-drogue refueling system, and their operating procedures differ. In the boom-and-receptacle method, the tanker is the active part: it steers a rigid retractable boom, extended from its rear, to a socket installed on the UAV. The advantages of this method are a high fuel transfer rate and low sensitivity to air turbulence, while the disadvantages are its complex structure and the fact that it can refuel only one aircraft at a time [4]. In the probe-and-drogue method, the tanker trails a refueling drogue attached to the end of a flexible hose, and the refueling probe mounted on the UAV is then controlled to dock into the drogue. The advantages of this method are that two or more aircraft can be refueled at the same time and that its structure is simple. The disadvantage is that the refueling drogue is susceptible to atmospheric turbulence [5,6]. At present, the probe-and-drogue refueling method is widely used at home and abroad.
There are two key technologies in the docking of an autonomous probe-and-drogue refueling system: vision-based drogue detection and target tracking, and relative pose measurement. The former obtains the exact location of the drogue in the visual images [7], and the latter establishes the relative pose between the UAV and the drogue [8]. A large amount of research has been carried out on vision systems for probe-and-drogue refueling tasks, which fall into two categories: active vision systems and passive vision systems [9,10]. An active vision system introduces obvious features, such as special light sources or marking points installed on the drogue, to assist visual navigation, while a passive vision system makes no changes to the drogue and relies only on the drogue's own characteristics for detection and positioning [11][12][13]. Vision-based Navigation (VisNav) is a commonly used active vision system that has been adopted in several AAR projects. A set of light-emitting diode (LED) beacons is distributed on the profile of the drogue, and their light spots are detected by a sensing cell installed on the UAV. The beacons are lit in order, and the sequence of detected beacons is ensured by a communication link [14]. Pollini et al. used an inexpensive charge-coupled device (CCD) camera with an infrared filter to detect LED beacons installed on the drogue [15]. Wang et al. placed a layer of red marking tape on the ring of the drogue to highlight the circular feature [16]. In an active vision system, the drogue target can be detected from less image information, with faster processing speed and higher reliability. However, the main disadvantage is that the tanker must be slightly modified to supply electrical power for the beacons [17]. A passive vision system needs no additional hardware installed on the drogue or the tanker. Martínez et al. developed a vision-based position estimation method relying only on the characteristics of the drogue itself and employed a Sobel edge template matching strategy for drogue detection [18]. Yin et al. proposed a drogue detection and target tracking method based on the approximately circular shape and dark interior of the drogue's refueling port, in which edge information is used to obtain the image feature [19]. Gao et al. developed a drogue detection method using low-rank and sparse matrix decomposition of local multifeatures, in which the sequences of drogue images are decomposed into a low-rank background and sparse maneuvering targets [20]. Qin et al. proposed a coarse-to-fine two-level detection structure based on the characteristics of the inner refueling port, and a row and column scanning method over the local image region was adopted to obtain high-precision shape parameters [21]. Huang et al. developed a multidirection closest point search to extract all possible image areas and a row-column scanning method to obtain the inner circle edge points for detecting and tracking the drogue [22]. Xu et al. presented an algorithm with adaptive boosting and a convolutional neural network classifier to detect the drogue in complex environments [23].
The main challenge for passive vision systems remains improving reliability and real-time capability. Therefore, it is essential to develop an economical and efficient image processing algorithm with strong robustness and a high success rate for the AAR docking task [24].
In this paper, an image processing algorithm is developed specifically for the probe-and-drogue refueling system. A PRPL multistage drogue detection and tracking strategy based on a monocular passive vision system is proposed. A high-fidelity test platform using a full-size drogue and a micro six-rotor aircraft for the AAR docking task is employed to evaluate the proposed image processing algorithm.
The rest of this paper is organized as follows. Section 2 presents the PRPL multistage detection and tracking strategy for the drogue, including the strategy overview, the drogue detection method, and the target tracking method in detail. In Section 3, three experimental scenes are set up on the simulated AAR test platform, and the results are analyzed in detail. Section 4 summarizes the conclusions of this paper and future work.

PRPL Multistage Detection and Tracking Strategy
2.1. Strategy Overview. In the docking stage of the probe-and-drogue refueling system, the position of the drogue is calculated by the visual acquisition equipment and the corresponding image processing algorithm. The drogue is connected to the tail of the tanker by a flexible hose and moves irregularly in the wake vortex of the tanker and in atmospheric turbulence. This places higher demands on the drogue detection and tracking algorithm. In this paper, the detection and tracking of the drogue target are divided into four phases: detection preprocessing, detection recognition, tracking prediction, and tracking locking. Accordingly, the preprocessor, recognizer, predictor, and locker are designed to handle the detection and tracking of the drogue. The PRPL multistage strategy diagram of drogue detection and tracking is shown in Figure 1. During the docking phase, real-time video data is obtained by a camera installed on the receiver. The video data is converted into an image sequence $I = \{I_1, I_2, \ldots, I_n\}$. Then, the preprocessor processes the first frame $I_1$ of the image sequence and generates the candidate region set $RS_p$ of the drogue target. The recognizer extracts the normalized features of $RS_p$, and the candidate regions are judged by the SVM classifier. Afterwards, the position $P_1$ and size $S_1$ of the drogue area in the image are obtained. According to the detected characteristics of the target region, the target tracker is initialized, and the predictor is used to locate the target area quickly. Finally, the locker is used to verify the target position and update the model for the subsequent images $I_2, I_3, \ldots, I_n$. If the verification fails, that is, the target cannot be tracked, the drogue detection process is performed again on the next frame and the previous operations are repeated.
As seen in Figure 1, the preprocessor and recognizer form the drogue detection stage, and the predictor and locker form the drogue tracking stage. The drogue position in the image is obtained reliably from the locker and can be used in the next step to solve the relative pose between the drogue and the UAV for the docking task.
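The control flow described in this overview can be sketched as a small loop. This is a minimal illustration, not the implementation used in the paper: the function name `run_prpl`, the `Region` tuple layout, and the three callables `detect`, `predict`, and `lock` are hypothetical stand-ins for the preprocessor plus recognizer, the predictor, and the locker.

```python
from typing import Callable, Iterable, List, Optional, Tuple

# Hypothetical region layout: (center_x, center_y, size).
Region = Tuple[float, float, float]


def run_prpl(
    frames: Iterable[object],
    detect: Callable[[object], Optional[Region]],
    predict: Callable[[object, Region], Region],
    lock: Callable[[object, Region], Optional[Region]],
) -> List[Optional[Region]]:
    """Return one drogue region (or None) per frame."""
    results: List[Optional[Region]] = []
    target: Optional[Region] = None
    for frame in frames:
        if target is None:
            # Detection stage: preprocessor + recognizer on the whole frame.
            target = detect(frame)
        else:
            # Tracking stage: the predictor proposes a region, the locker
            # verifies and refines it, or returns None when tracking fails.
            target = lock(frame, predict(frame, target))
        results.append(target)
    return results
```

When `lock` returns `None`, the loop falls back to `detect` on the next frame, which mirrors the re-detection rule stated above.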

Drogue Detection Method.
The main objective of drogue detection is to scan the global image and select the target area by recognition. The traditional detection method scans the entire image with windows of different scales, builds a library of recognition areas, and then identifies the target area. This is time-consuming, and the resulting library is large and highly redundant. Therefore, a detection preprocessor and a recognizer are designed for the detection task. The preprocessor is responsible for reducing the dimensionality and redundancy of the recognition area library. The recognizer is responsible for determining the target area from the recognition library.

Adaptive Threshold-Based Preprocessor.
Based on the color and shape features of drogue, a cascade classifier is built to generate a detection and recognition library. The detection preprocessing cascade classifier is shown in Figure 2.
The outline diagram of the drogue is shown in Figure 3. Because the inner refueling port of the drogue appears dark, candidate target areas can be obtained by threshold segmentation of the grayscale image, which forms the color feature classifier. Because changes in ambient light cause the gray value of the target to change greatly, the choice of threshold is important, so an adaptive threshold selection is used.
The image grayscale histogram shows the overall distribution of grayscale values in the image, and it changes as the lighting changes. Therefore, a threshold selected from the grayscale histogram can partly eliminate the impact of lighting changes on target detection. The histogram of the grayscale image can be regarded as a one-dimensional matrix $H = (h_0, h_1, \ldots, h_{255})$, where $h_i$ $(i = 0, 1, \ldots, 255)$ is the number of pixels whose gray value is $i$. The local minima of $H$ are used for image segmentation because they separate the target from the background more effectively.
Firstly, the current value $h_i$ is compared with the previous one $h_{i-1}$. Then, the resulting value $f_i$ is compared with the next one. Finally, the local minimum is determined, where $b_i = 1$ indicates that gray level $i$ is a local minimum of the histogram. The adaptive threshold set is obtained from these local minima.
Then, $t_1, t_2, \ldots, t_m$ are used in turn to judge the validity of the target region until effective target fitting information is obtained.
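One way to realize the local-minimum search is sketched below. It is an illustration under stated assumptions rather than the paper's exact rule: the sign of the first difference plays the role of the intermediate comparison, flat runs are skipped, and the helper names `gray_histogram` and `local_minimum_thresholds` are hypothetical.

```python
from typing import Iterable, List, Sequence


def gray_histogram(pixels: Iterable[int]) -> List[int]:
    """Count how many pixels take each gray value 0..255."""
    hist = [0] * 256
    for value in pixels:
        hist[value] += 1
    return hist


def local_minimum_thresholds(hist: Sequence[int]) -> List[int]:
    """Return gray levels where the histogram stops falling and starts rising.

    Assumption: a flat valley is reported once, at its last bin, and the two
    end bins of the histogram are never reported.
    """
    thresholds: List[int] = []
    prev_sign = 0
    for i in range(1, len(hist)):
        diff = hist[i] - hist[i - 1]
        sign = (diff > 0) - (diff < 0)
        if sign == 0:
            continue  # plateau: keep the last non-zero slope
        if prev_sign < 0 and sign > 0:
            thresholds.append(i - 1)
        prev_sign = sign
    return thresholds
```

The returned list corresponds to the candidate thresholds that are then tried in turn. Real histograms are usually noisy, so smoothing the histogram before the search may be needed; that step is not described in the text and is left out here.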
Based on the target set obtained from the color feature in the previous step, the shape feature classifier is used to classify the regions in the target set. Firstly, the segmented regions are processed with morphological operations to remove noise points. Then, the contour of each remaining effective region is extracted, and the boundary and area of the region are determined from the contour. Because the inner refueling port of the drogue is close to a standard circle, the shape feature classifier criteria are established on three quantities.
The criteria are the length-to-width ratio, the area ratio, and the circularity, where $l$, $w$, $c$, and $s$ represent the length, width, perimeter, and area of the target region, respectively, and $t_r$, $t_a$, $t_b$, and $t_c$ are the thresholds for the aspect ratio, area ratio, and circularity.
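A possible form of the three shape tests is sketched below. The exact inequalities are not recoverable from the text, so the definitions here are assumptions: the aspect ratio is taken as short side over long side, the area ratio compares the region area with the ellipse inscribed in its bounding box and is bounded by $t_a$ and $t_b$, and the circularity is the common $4\pi s / c^2$ measure. The function name `passes_shape_filter` is hypothetical; the default thresholds are the values quoted for Figure 4.

```python
import math


def passes_shape_filter(
    l: float,
    w: float,
    c: float,
    s: float,
    t_r: float = 0.7,
    t_a: float = 0.8,
    t_b: float = 1.2,
    t_c: float = 0.65,
) -> bool:
    """Check a candidate region against assumed roundness criteria.

    l, w: bounding-box length and width; c: contour perimeter; s: region area.
    """
    if min(l, w, c, s) <= 0:
        return False
    aspect = min(l, w) / max(l, w)            # 1.0 for a square bounding box
    area_ratio = s / (math.pi * l * w / 4.0)  # 1.0 for an inscribed ellipse
    circularity = 4.0 * math.pi * s / (c * c)  # 1.0 for a perfect circle
    return aspect >= t_r and t_a <= area_ratio <= t_b and circularity >= t_c
```

For example, a circle of radius 10 passes all three tests, while an elongated strip fails the aspect ratio test and a filled square fails the assumed area ratio bound.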

Wireless Communications and Mobile Computing
A SVM classifier-based recognizer is designed to determine the target region from the candidate regions set quickly and accurately.
Because the drogue image has an obvious edge structure, the histogram of oriented gradients (HOG) feature is used to extract the feature vector, which usually involves the following steps: gamma color correction, gradient calculation, gradient feature statistics, feature vector normalization, and feature vector concatenation. Since HOG features are sensitive to scale changes, all candidate regions are normalized to 128 × 128 pixels. Then, an SVM classifier based on HOG features is designed [25].
Define the sample set to be classified as $\{(x_k, y_k)\}_{k=1}^{n}$, where $x_k$ is the HOG feature vector, $y_k$ is the sample classification label, and $n$ is the number of samples. The linear SVM classifier is used to obtain the maximum-margin separating hyperplane $\omega^T x + b = 0$, where $\omega$ and $b$ are the two parameters of the SVM, which can be solved by quadratic optimization [26]. Based on the detection preprocessor and recognizer, the detection process of the drogue is shown in Figure 4 for $t_r = 0.7$, $t_a = 0.8$, $t_b = 1.2$, and $t_c = 0.65$.
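For reference, the textbook hard-margin form of the linear SVM problem is written out below. This is the standard formulation and is assumed here, since the text does not state whether a soft-margin variant with slack variables was used; the label convention $y_k \in \{-1, +1\}$ is likewise an assumption.

```latex
\[
\min_{\omega,\, b} \; \frac{1}{2}\lVert \omega \rVert^{2}
\quad \text{s.t.} \quad
y_k\left(\omega^{T} x_k + b\right) \ge 1, \qquad k = 1, \ldots, n,
\qquad y_k \in \{-1, +1\}
\]
\[
\hat{y}(x) = \operatorname{sign}\left(\omega^{T} x + b\right)
\]
```

Under this convention a candidate region with HOG feature vector $x$ is accepted as the drogue when $\omega^{T} x + b > 0$.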

Drogue Tracking Method.
Drogue target tracking models the features of the detected region and predicts the position of the target in the next image frame. It is important to improve the robustness, processing speed, and accuracy of the target tracking model. Therefore, a tracking predictor and a locker are designed for the tracking task. The predictor is responsible for establishing and updating the tracking model and for predicting the region of the target in the next image frame based on that model. The locker is responsible for judging tracking failure, extracting the target edge in the predicted region, and calculating the target scale and position accurately. The target position calculated by the locker is fed back to the predictor for the model update.

Improved KCF-Based Predictor.
Target tracking places extremely high demands on real-time performance and reliability. This paper develops an improved KCF tracking (IKCFT) algorithm based on highly reliable detection results. Feature extraction and modeling of the target are the key technologies for target tracking, and their speed determines the real-time performance of tracking. The HOG feature, which offers fast extraction and high robustness, and a simple and efficient ridge regression classification model are adopted. Meanwhile, the model is trained with positive and negative samples by exploiting the fast inversion property of circulant matrices.
The training samples of the tracking model are generated by cyclic shifts of the target region, where $x = [x_1\ x_2\ \cdots\ x_n]^T$ is the target sample and the shifted copies form the rows of a circulant matrix $X$. Suppose the training sample set is $(x_i, y_i)$ and its regression function is $f(x_i) = \omega^T x_i$. The objective of the ridge regression model is to minimize the regularized squared error between $f(x_i)$ and $y_i$, where $y = [y_1\ y_2\ \cdots\ y_n]^T$ is the label value corresponding to each row of the feature matrix $X$, $\lambda$ is the regularization coefficient, and $\omega$ is the tracking model parameter. By the nature of the circulant matrix, $X$ can be diagonalized by the discrete Fourier matrix $F$, so $\omega$ can be calculated in the Fourier domain.
In the resulting solution, $\odot$ denotes the element-wise product of vectors and $\hat{x}^*$ is the complex conjugate of $\hat{x}$, the discrete Fourier transform of $x$. Because target classification is nonlinear, a kernel function [27] is introduced to map the feature vector $x$ from the nonlinear space to the linear space $\Phi(x)$, and the linear regression model is converted to $f(x_i) = \omega^T \Phi(x_i)$. There are two main advantages of using the circulant matrix for tracking. On the one hand, a large number of negative samples can be obtained from the circulant matrix instead of being extracted from the background. On the other hand, the solution of the ridge regression model parameters becomes a simple and efficient element-wise product of vectors instead of a complex matrix inversion. This greatly improves the training speed of the model.
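The quantities named in this derivation match the standard kernelized correlation filter formulation, which is restated below as a reference. It is given under the assumption that the paper follows that standard form; the symbols $\alpha$ and $k^{xx}$, $k^{xz}$ (kernel correlation vectors) do not appear in the text and are introduced only for the kernelized case.

```latex
\[
X = C(x) =
\begin{bmatrix}
x_1 & x_2 & \cdots & x_n \\
x_n & x_1 & \cdots & x_{n-1} \\
\vdots & \vdots & \ddots & \vdots \\
x_2 & x_3 & \cdots & x_1
\end{bmatrix},
\qquad
\min_{\omega} \sum_{i} \left( f(x_i) - y_i \right)^{2} + \lambda \lVert \omega \rVert^{2}
\]
\[
\omega = \left( X^{H} X + \lambda I \right)^{-1} X^{H} y,
\qquad
X = F \,\mathrm{diag}(\hat{x})\, F^{H}
\]
\[
\hat{\omega} = \frac{\hat{x}^{*} \odot \hat{y}}{\hat{x}^{*} \odot \hat{x} + \lambda}
\]
\[
\hat{\alpha} = \frac{\hat{y}}{\hat{k}^{xx} + \lambda},
\qquad
\hat{f}(z) = \hat{k}^{xz} \odot \hat{\alpha}
\]
```

The division in the third and fourth lines is element-wise, which is why training costs only a few Fourier transforms rather than an $n \times n$ matrix inversion.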
The sample set $Z = [z_1\ z_2\ \cdots\ z_n]$ to be detected is obtained from the predicted region and its displacements. The sample with the largest response $f(z_j) = \omega^T \Phi(z_j)$ is selected as the new detected target region, where $z_j$ corresponds to the offset of the target movement.

Scale Adaptive-Based Locker.
Because the KCF tracking algorithm uses a fixed tracking window size, it is ineffective when the scale of the tracked target varies widely. In addition, accumulated tracking error makes it prone to tracking deviation during long-term tracking. Furthermore, it will track the wrong target when the true target leaves the field of view. To address these problems, a tracking and locking feedback process is added to the KCF algorithm.
Because the scale of the tracked target does not change much in a short time, detection and correction are performed frame by frame. This paper uses the ratio of the inner circle size of the drogue to the tracking window size as the correction criterion. The tracking window size stays the same during the entire tracking process. The size of the target relative to the tracking window is kept constant by changing the resolution of the whole image, which solves the problem of target size change. The tracking phase strategy diagram is shown in Figure 5.
As shown in Figure 5, the tracking region is obtained after the resolution of the input image is adjusted and the tracker makes its prediction. A polar coordinate conversion changes the boundary of the target from a circle to a line, which simplifies the edge extraction of a circular target. Then, the tracking effectiveness is verified using the extracted target edge and the size of the tracking area. If the target tracking is successful, the tracking area is corrected and the tracking model is updated.
In this paper, polar coordinate conversion is used to detect the edge of the circular target. The circular edge detection is converted to line edge detection, and wrong edges generated by occlusion can be effectively filtered out. Assuming that the center of the circular target approximately coincides with the center of the image area, once the target area is obtained, a polar coordinate system is established with the center of the target area as the pole and the horizontal rightward direction as the polar axis. The polar representation is then unwrapped into a Cartesian image whose x-axis is the polar angle and whose y-axis is the polar radius. The polar coordinate transformation diagram is shown in Figure 6.
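The unwrapping step can be illustrated with a small pure-Python sketch. It assumes the usual mapping $x = x_0 + \rho\cos\theta$, $y = y_0 + \rho\sin\theta$ with nearest-neighbour sampling; the function names `unwrap_polar` and `edge_radius_per_angle`, the angular resolution, and the zero fill value for out-of-image samples are all illustrative choices, not details taken from the paper.

```python
import math
from typing import List, Optional, Sequence


def unwrap_polar(
    image: Sequence[Sequence[int]],
    cx: float,
    cy: float,
    max_radius: int,
    n_angles: int = 360,
) -> List[List[int]]:
    """Resample a grayscale image so rows index radius and columns index angle."""
    height, width = len(image), len(image[0])
    unwrapped: List[List[int]] = []
    for r in range(max_radius):
        row: List[int] = []
        for k in range(n_angles):
            theta = 2.0 * math.pi * k / n_angles
            x = int(round(cx + r * math.cos(theta)))
            y = int(round(cy + r * math.sin(theta)))
            inside = 0 <= x < width and 0 <= y < height
            row.append(image[y][x] if inside else 0)
        unwrapped.append(row)
    return unwrapped


def edge_radius_per_angle(
    unwrapped: Sequence[Sequence[int]], level: int
) -> List[Optional[int]]:
    """For each angle, return the first radius whose value reaches `level`."""
    n_angles = len(unwrapped[0])
    edges: List[Optional[int]] = []
    for k in range(n_angles):
        hit = next((r for r, row in enumerate(unwrapped) if row[k] >= level), None)
        edges.append(hit)
    return edges
```

A dark disc centred on the pole becomes a dark horizontal band in the unwrapped image, so its boundary appears as a nearly straight horizontal edge. Each edge sample `(k, r)` maps back to the original image through the same cosine and sine expressions.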

Afterwards, the converted image is smoothed, horizontal edges are detected, the result is threshold segmented and processed with a morphological opening, and the edge points are extracted, as shown in Figure 5. Finally, the edge points in the Cartesian coordinate system are mapped back to the original image for ellipse parameter fitting.
The iterative least squares ellipse method is used to fit all the effective edge points [22], which yields the accurate target size and center position $(a, b, x_0, y_0)$. As the camera or the drogue moves, the size and position of the drogue in the image change accordingly. When the size of the tracking window stays the same, an increase or decrease in the size of the target causes the tracking model to lose some target features or to introduce wrong background features. Meanwhile, because the position changes, the position of the target at the next moment must be predicted. The predicted position always contains errors, and as the errors accumulate, tracking drift occurs and eventually leads to tracking failure. In this paper, the tracking error is corrected in real time using the inner circle center of the target, and the image resolution is adjusted in real time so that the size of the drogue target in the image stays unchanged relative to the tracking window. This method ensures the correctness of the tracking model and prevents the accumulation of prediction errors.

The image resolution is adjusted according to the fitted target size and position information. In the adjustment criteria, $l$ and $w$ are the length and width of the tracking window. After the resolution adjustment, the new image $I'$ is obtained, and the center position of the prediction window $W$ is corrected, where $W = (\mathrm{win}_{cx}, \mathrm{win}_{cy})$.
The tracking model is updated with the region $I'(W)$, which denotes the part of the image $I'$ covered by the window $W$. The process diagram of the drogue tracking algorithm is shown in Figure 7.
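The idea of keeping the target size fixed relative to the window can be sketched as follows. The actual adjustment criteria are not recoverable from the text, so this is only one plausible reading: $a$ and $b$ are taken to be the fitted semi-axes in pixels, the reference ratio `target_ratio` between the inner circle diameter and the shorter window side is a hypothetical parameter, and both function names are illustrative.

```python
from typing import Tuple


def resolution_scale(
    a: float, b: float, win_l: float, win_w: float, target_ratio: float = 0.5
) -> float:
    """Scale factor that makes the fitted diameter equal target_ratio * window side.

    Assumption: a and b are ellipse semi-axes, and the larger one sets the size.
    """
    if a <= 0 or b <= 0:
        raise ValueError("ellipse semi-axes must be positive")
    diameter = 2.0 * max(a, b)
    return target_ratio * min(win_l, win_w) / diameter


def corrected_window_center(x0: float, y0: float, scale: float) -> Tuple[float, float]:
    """Place the window center on the fitted ellipse center in the resized image."""
    return (x0 * scale, y0 * scale)
```

With a 128 × 128 window and a fitted radius of 50 pixels, the image would be shrunk by a factor of 0.64 so that the inner circle again spans half of the window; a target that appears twice as large gives half that factor.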

Results and Discussion
3.1. Testbed for AAR Docking Task. The PRPL multistage drogue detection and tracking algorithm proposed in this paper is tested experimentally on the testbed for the AAR docking task shown in Figure 8. The testbed has three parts: the drogue, the UAV, and the ground station. A full-scale refueling drogue model is mounted on a 2D mobile platform at a height of 3.5 meters. A micro six-rotor aircraft is used as the UAV, and an industrial camera is installed on it to capture visual images of the drogue motion. The operation of the whole system is monitored by the ground station, which guarantees the controllability and safety of the experiments.
An FL3-U3-20E4M-C industrial camera is installed on the UAV shown in Figure 8; it can capture moving images at 110 Hz with a resolution of 1280 × 720. In this paper, the images are processed at a resolution of 800 × 600 to reduce the image processing time. In addition, the camera is connected to a full-fledged computer running the Ubuntu 16.04 operating system, which hosts the vision system. The proposed image processing algorithm is implemented in C++, and the image data is handled with the OpenCV libraries.
In order to analyze the performance of the proposed algorithm, three experimental scenarios are designed to test the precision, success rate, and processing time. As shown in Figure 9, scene 1 is a weak light condition with a complicated background, scene 2 is a moderate light condition with a complicated background, and scene 3 is a bright light condition with a simple background.

3.2. Precision, Real-Time Capability, and Reliability of the Drogue Detection Method. For target detection, real-time capability and reliability are particularly important. In this paper, the precision, success rate, and processing time of the adaptive threshold segmentation detection (ATSD) algorithm are compared with those of the sliding window detection (SWD) method. Here, precision is the percentage of frames whose center error is below a given threshold, evaluated as the threshold varies from 0 to 50 pixels. Success rate is the percentage of frames whose overlap rate between the tracking region and the calibration region is higher than a given threshold, evaluated as the threshold varies from 0 to 1.
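The two metrics can be computed as in the following sketch. The overlap rate is assumed to be the intersection-over-union of axis-aligned boxes given as `(x, y, width, height)`, which is the usual convention for such curves but is not stated explicitly in the text; all function names are illustrative.

```python
import math
from typing import Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x, y, width, height), assumed layout


def center_error(p: Tuple[float, float], q: Tuple[float, float]) -> float:
    """Euclidean distance in pixels between two region centers."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def overlap(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    iw = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    ih = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = iw * ih
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def precision_at(errors: Sequence[float], threshold: float) -> float:
    """Fraction of frames whose center error is below the threshold."""
    return sum(e < threshold for e in errors) / len(errors)


def success_rate_at(overlaps: Sequence[float], threshold: float) -> float:
    """Fraction of frames whose overlap rate is higher than the threshold."""
    return sum(o > threshold for o in overlaps) / len(overlaps)
```

Sweeping `threshold` from 0 to 50 pixels in `precision_at`, or from 0 to 1 in `success_rate_at`, produces the precision and success curves; the single success figure quoted at an overlap threshold of 0.6 is `success_rate_at(overlaps, 0.6)`.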
The experimental video data is selected from scene 1 at close (2~5 m), middle (5~10 m), and far (10~20 m) distances. The precision, success rate, and processing time at different distances are compared in Table 1. Here, the success rate is the value at an overlap threshold of 0.6. As shown in Table 1, the distance of the target causes a large change in the size of the target in the image, which decreases the detection success rate. If the resolution of the image is changed to adapt to the change of size, the processing time of target detection increases greatly, and the real-time performance decreases. For the adaptive threshold segmentation detection algorithm, the detection success rate does not change significantly when the distance changes. It is highly robust to the target size and consumes less time. Sample detection results of the two algorithms are shown in Figure 10. The performance of the algorithms at far distance is shown in Figure 11.
3.3. Precision, Real-Time Capability, and Reliability of the Drogue Tracking Method. In order to analyze the performance of the adaptive threshold segmentation detection and improved KCF tracking (ATS-IKCFT) algorithm proposed in this paper, its precision, success rate, and processing time are compared with those of the traditional KCF tracking (KCFT) method [28] and the tracking learning detection tracking (TLDT) method [29].
Sample tracking results of the methods for scene 2 at different distances and light conditions are compared in Figure 12. As shown in Figure 12, the KCFT and TLDT methods adapt to changes of the target scale less well than ATS-IKCFT. When the target scale changes, these methods either bring in background interference or lose some of the target feature information, which causes tracking errors to accumulate and the tracking region to drift. The tracking accuracy and success rate are shown in Figures 13 and 14. They show that the tracking accuracy and success rate of ATS-IKCFT are higher than those of both the KCFT and TLDT methods. The ATS-IKCFT algorithm has high robustness and accuracy in different environments. In terms of real-time tracking, the ATS-IKCFT algorithm also performs well. The average processing time in various environments is within 11 ms, and it reaches 3 ms in more favorable environments, which meets the requirements of high-dynamic and high-frequency output. The processing time of the three experiments is shown in Figure 15.

Conclusions
Drogue detection and target tracking are critical for the probe-and-drogue refueling system of AAR. In this paper, a novel PRPL multistage image processing strategy based on vision navigation is proposed for drogue docking tasks. The detection preprocessor, detection recognizer, target tracking predictor, and target tracking locker are designed and implemented for the detection and tracking of the drogue. The preprocessing cascade classifier accelerates target detection and improves its robustness. The histogram adaptive threshold classifier makes it possible to select an appropriate threshold for target segmentation under different lighting conditions. The tracking feedback phase enables the tracker to adapt to changes of the target scale, greatly reduces the accumulation of tracking errors, and ensures the effectiveness of the tracking model. Experiments are conducted with a full-scale drogue model, and high-fidelity moving images are captured by a micro six-rotor aircraft. The results show that the proposed PRPL multistage image processing strategy for the drogue has better accuracy, real-time capability, and reliability than conventional methods and can meet the requirements of AAR in the docking stage.
Future work will focus on the development of a relative navigation algorithm between the drogue and the UAV based on the vision system. It will provide a meaningful reference for the practical application of AAR technology.

Data Availability
No data were used to support this study.

Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this article.