Unmanned Aerial Vehicle Navigation Using Wide-Field Optical Flow and Inertial Sensors

1 Division of Engineering, Pennsylvania State University, Reading, PA 19610, USA
2 Department of Mechanical and Aerospace Engineering and Lane Department of Computer Science and Electrical Engineering at WVU, Morgantown, WV 26506, USA
3 Aerospace Engineering Department, University of Kansas, Lawrence, KS 66045, USA
4 Department of Mechanical and Aerospace Engineering at West Virginia University, Morgantown, WV 26506, USA


Introduction
Information about the velocity and attitude of an aircraft is important for purposes such as remote sensing [1], navigation, and control [2]. Traditional low-cost aircraft navigation relies on the use of both inertial sensors and the Global Positioning System (GPS) [3][4][5]. While GPS can provide useful information to an aircraft system, this information is not always available or reliable, for example when flying in urban environments or other GPS-denied areas (e.g., under radio-frequency jamming or strong solar storms). GPS is not self-contained within the aircraft system; rather, the information comes from external satellites. Insects, such as the honeybee, have demonstrated impressive capabilities in flight navigation without receiving external communications [6]. One significant information source that is used by insects as well as birds is vision [6][7][8]. This information can also be made available to an aircraft through the use of onboard video cameras. The challenge with this information-rich data is correctly processing and integrating the vision data with the other onboard sensor measurements [9].
Vision data can be processed using feature detection algorithms such as the Scale-Invariant Feature Transform (SIFT) [10], among other techniques, to obtain optical flow vectors. Optical flow is useful for aircraft systems because it is rich in navigation information, simple to represent, and easy to compute [11]. One benefit of this information is that it can be used to extract velocity information about the aircraft, which in turn can be used for aircraft positioning. Optical flow has been used for autonomous navigation applications such as relative heading and lateral position estimation of a quadrotor helicopter [12,13]. Other work has considered the use of optical flow for UAV take-off and landing [14] and landmark navigation [15]. Another potential benefit of optical flow is that it implicitly contains information about the aircraft attitude angles. This implicit information has been used in related work for UAV attitude estimation using horizon detection and optical flow along the horizon line [16,17] and pose estimation for a hexacopter [18], a lunar rover [19], and spacecraft [20]. While this work is useful, these vehicles have significantly different dynamic characteristics than a typical airplane. Consequently, more analysis of the application of optical flow to airplanes is necessary.
This work presents a combined velocity and attitude estimation algorithm for airplanes using wide-field optical flow. The algorithm does not require horizon detection, so the horizon does not need to be visible in the image frame in order to obtain attitude information. The algorithm relies on the optical flow computed using a downward facing video camera, measurements from a laser range finder and an Inertial Measurement Unit (IMU) that are mounted in parallel to the camera axis, and a flat ground assumption to determine information about the aircraft velocity and attitude. Many of the existing experiments for optical flow and inertial sensor fusion are done using helicopter platforms and focus on position and velocity estimation [21,22]. This work considers an airplane system rather than a helicopter, which has a significantly different flight envelope and dynamics. Additionally, the regulation of attitude information through the use of optical flow is considered, which is not typically done in existing applications. This work takes advantage of all detected optical flow points in the image plane, including wide-field optical flow points which were often omitted in previous works [23][24][25]. These wide-field optical flow points are of significant importance for attitude estimation, since they contain roll and pitch information that is not observable from the image center. Although this work considers the use of a laser range finder to recover the distance between the image scene and the camera, it is possible to determine this information using other techniques [26]. In fact, it has been demonstrated that the scale is an observable mode for the vision and IMU data fusion problem [27]. The presented formulation was originally offered in its early stages of development in [28]. Since this original publication, the implementation and tuning of the formulation have been refined, and additional results have been generated. In particular, a simplified formulation is offered which reduces the filter states, and the inclusion of a range state is considered. The main contribution of this paper is the analysis of a stable vision-aided solution for the velocity and attitude determination without the use of GPS. This solution is verified with respect to two sets of actual UAV flight testing data.
The rest of this paper is organized as follows. Section 2 presents the different considered formulations and framework for this problem. Section 3 describes the experimental setup which was used to collect data for this study. The results are offered in Section 4 followed by a conclusion in Section 5.

Optical Flow Equations.
Optical flow is the projection of 3D relative motion into a 2D image plane. Using the pinhole camera model, the 3D position $(\eta_x, \eta_y, \eta_z)$ in the 3D camera body frame can be mapped into the 2D image plane with coordinates $(\mu, \nu)$ using

$$\begin{bmatrix} \mu \\ \nu \end{bmatrix} = \frac{f}{\eta_z} \begin{bmatrix} \eta_x \\ \eta_y \end{bmatrix} \tag{1}$$

where $\mu$, $\nu$, and $f$ are given in pixels and $f$ is the focal length. For a downward looking camera that is parallel to the aircraft $z$-axis, and with a level and flat ground assumption, the optical flow equations, referred to as (2) in the remainder of this paper, have been derived in [29]. In these equations, $\phi$, $\theta$ are the roll and pitch angles, $p$, $q$, $r$ are the roll, pitch, and yaw body-axis angular rates, $u$, $v$, $w$ are the body-axis ground velocity components of the aircraft, and $\dot{\mu}$, $\dot{\nu}$ are the components of optical flow in the 2D image plane, given in pixels/sec. Equation (2) captures the relationship between optical flow at various parts of the image plane and other pieces of navigation information. By considering only the area close to the image center ($\mu \approx 0$, $\nu \approx 0$), the narrow-field optical flow model can be simplified [23][24][25]; however, this removes the roll and pitch dependence of the equation and is therefore not desirable for attitude estimation purposes.
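To illustrate why off-center pixels carry roll and pitch information, the following Python sketch predicts optical flow at an arbitrary pixel. It is derived here from the pinhole model and rigid-body kinematics of a static, flat, level ground plane; it is not copied from [29], and its axis alignment and sign conventions may differ from the published form. The function name `predict_flow`, the argument `rng` (range along the camera axis), and the assumption that the camera axes coincide with the body axes are illustrative choices.

```python
import math


def predict_flow(mu, nu, f, rng, vel, rates, roll, pitch):
    """Predicted optical flow (pixels/s) at pixel (mu, nu).

    Assumptions (illustrative, not taken from the paper): camera x, y, z axes
    coincide with the body axes, the scene is static flat level ground, and
    `rng` is the distance to the ground along the camera z-axis (image center).
    """
    u, v, w = vel          # body-axis ground velocity, m/s
    p, q, r = rates        # body-axis angular rates, rad/s
    # Unit normal of the ground plane ("down") expressed in body axes.
    nx = -math.sin(pitch)
    ny = math.sin(roll) * math.cos(pitch)
    nz = math.cos(roll) * math.cos(pitch)
    # Depth along the camera z-axis of the ground point seen at (mu, nu).
    depth = rng * f * nz / (mu * nx + nu * ny + f * nz)
    # Translational terms scale with 1/depth; rotational terms do not.
    mu_dot = ((mu * w - f * u) / depth - f * q + r * nu
              + (p * mu * nu - q * mu ** 2) / f)
    nu_dot = ((nu * w - f * v) / depth + f * p - r * mu
              + (p * nu ** 2 - q * mu * nu) / f)
    return mu_dot, nu_dot
```

Under these assumptions the depth at the image center equals the measured range regardless of attitude, so the predicted flow there does not change with roll or pitch, whereas the depth at an off-center pixel does.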

State Space
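A diagram describing the definition of the range coordinate, $L$, is provided in Figure 1. Note that the range coordinate, $L$, is equivalent to the camera $z$ coordinate, $\eta_z$. In order to determine the dynamics of the velocity states, the time derivative of the velocity vector observed from the fixed navigation frame is equal to the time rate of change as observed from the moving body axis frame plus the change caused by rotation of the frame [30]:

$$\left.\frac{d\mathbf{V}}{dt}\right|_{n} = \left.\frac{d\mathbf{V}}{dt}\right|_{b} + \boldsymbol{\omega} \times \mathbf{V}$$

where $\mathbf{V} = [u\ v\ w]^T$ and $\boldsymbol{\omega} = [p\ q\ r]^T$. The IMU measures the acceleration with respect to the fixed gravity vector, as in

$$\mathbf{a} = \left.\frac{d\mathbf{V}}{dt}\right|_{n} - \mathbf{R}_n^b \begin{bmatrix} 0 \\ 0 \\ g \end{bmatrix}$$

where $\mathbf{a} = [a_x\ a_y\ a_z]^T$ is the measured acceleration, $g$ is the gravitational acceleration, and $\mathbf{R}_n^b$ is the rotation matrix from the navigation frame to the body frame:

$$\mathbf{R}_n^b = \begin{bmatrix} \cos\theta\cos\psi & \cos\theta\sin\psi & -\sin\theta \\ \sin\phi\sin\theta\cos\psi - \cos\phi\sin\psi & \sin\phi\sin\theta\sin\psi + \cos\phi\cos\psi & \sin\phi\cos\theta \\ \cos\phi\sin\theta\cos\psi + \sin\phi\sin\psi & \cos\phi\sin\theta\sin\psi - \sin\phi\cos\psi & \cos\phi\cos\theta \end{bmatrix}$$

Combining these results gives the dynamics for the velocity states [31]:

$$\begin{bmatrix} \dot{u} \\ \dot{v} \\ \dot{w} \end{bmatrix} = \begin{bmatrix} rv - qw - g\sin\theta + a_x \\ pw - ru + g\sin\phi\cos\theta + a_y \\ qu - pv + g\cos\phi\cos\theta + a_z \end{bmatrix} \tag{7}$$

The dynamics of the attitude states are defined using [32]

$$\begin{bmatrix} \dot{\phi} \\ \dot{\theta} \\ \dot{\psi} \end{bmatrix} = \begin{bmatrix} p + q\sin\phi\tan\theta + r\cos\phi\tan\theta \\ q\cos\phi - r\sin\phi \\ (q\sin\phi + r\cos\phi)\sec\theta \end{bmatrix} \tag{8}$$

To define the dynamics for the bias parameters, a first-order Gauss-Markov noise model was used. In a related work [33], the Allan deviation [34] approach presented in [35,36] was used to determine the parameters of the first-order Gauss-Markov noise model for the dynamics of the bias on each IMU channel. The Gauss-Markov noise model for each sensor measurement involves two parameters: a time constant and a variance of the wide-band sensor noise. Using this model, the dynamics for the bias parameters are given, for each IMU channel $j$, by

$$\dot{b}_j = -\frac{b_j}{\tau_j} + n_j$$

where $\boldsymbol{\tau}$ is a vector of time constants and $\mathbf{n}$ is a zero-mean noise vector with variance given by a diagonal matrix of the variance terms for each sensor. The time constant and variance terms were calculated in [33] for each channel of the same IMU that was considered for this study.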
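The first-order Gauss-Markov bias model described above can be sketched in a few lines of Python. This is a minimal illustration using a simple Euler step; the function name `propagate_bias`, the treatment of the driving noise as one Gaussian sample per step with standard deviation `sigma_n`, and the numerical values in the usage note are assumptions, not the parameters identified in [33].

```python
import random


def propagate_bias(bias, tau, sigma_n, dt, noise_source=random):
    """One Euler step of b_dot = -b / tau + n, applied per sensor channel.

    bias, tau, sigma_n are equal-length sequences (one entry per IMU channel).
    Assumption: n is drawn once per step as a zero-mean Gaussian sample with
    standard deviation sigma_n; a careful implementation would scale the
    noise with the step size.
    """
    updated = []
    for b_j, tau_j, s_j in zip(bias, tau, sigma_n):
        n_j = noise_source.gauss(0.0, s_j)
        updated.append(b_j + dt * (-b_j / tau_j + n_j))
    return updated
```

For example, `propagate_bias([0.1], [100.0], [0.0], 0.02)` shows the noise-free behavior: the bias decays slowly toward zero at a rate set by the time constant.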
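The state dynamic equations have been defined in continuous-time using the following format:

$$\dot{\mathbf{x}} = \mathbf{f}_c(\mathbf{x}, \mathbf{u})$$

where $\mathbf{f}_c$ is the nonlinear continuous-time state transition function. In order to implement these equations in a discrete-time filter, a first-order discretization is used [37]:

$$\mathbf{x}_k = \mathbf{x}_{k-1} + T_s\, \mathbf{f}_c(\mathbf{x}_{k-1}, \mathbf{u}_{k-1}) \triangleq \mathbf{f}(\mathbf{x}_{k-1}, \mathbf{u}_{k-1})$$

where $k$ is the discrete time index, $\mathbf{f}$ is the nonlinear discrete-time state transition function, and $T_s$ is the sampling time of the system.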
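A minimal Python sketch of the first-order discretization is given below. It assumes the standard rigid-body velocity and attitude kinematics with a 3-2-1 Euler angle sequence and a body $z$-axis pointing down; the function names `f_c` and `euler_step` and the value of `G` are illustrative assumptions.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2 (assumed value)


def f_c(x, inp):
    """Continuous-time velocity and attitude kinematics.

    x   = (u, v, w, roll, pitch, yaw)
    inp = (ax, ay, az, p, q, r), assumed already corrected for sensor bias.
    """
    u, v, w, phi, theta, _psi = x
    ax, ay, az, p, q, r = inp
    sphi, cphi = math.sin(phi), math.cos(phi)
    sth, cth, tth = math.sin(theta), math.cos(theta), math.tan(theta)
    return (
        r * v - q * w - G * sth + ax,            # u_dot
        p * w - r * u + G * sphi * cth + ay,     # v_dot
        q * u - p * v + G * cphi * cth + az,     # w_dot
        p + q * sphi * tth + r * cphi * tth,     # roll rate
        q * cphi - r * sphi,                     # pitch rate
        (q * sphi + r * cphi) / cth,             # yaw rate
    )


def euler_step(x, inp, ts):
    """First-order discretization: x_k = x_{k-1} + Ts * f_c(x_{k-1}, u_{k-1})."""
    return tuple(x_i + ts * dx_i for x_i, dx_i in zip(x, f_c(x, inp)))
```

In steady level flight the accelerometer reads approximately `(0, 0, -G)`, so `euler_step` leaves the state unchanged, which is a quick sanity check on the gravity sign convention.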
To formulate the observation equations, optical flow information is utilized. In particular, each optical flow point identified from vision data consists of four values: $\mu$, $\nu$, $\dot{\mu}$, $\dot{\nu}$. These values are obtained using a point matching method [38] and the Scale-Invariant Feature Transform (SIFT) algorithm [10]. Note that the method for optical flow generation is not the emphasis of this research [38]; therefore, any other optical flow algorithm can be used similarly within the proposed estimator, without any loss of generality.
During the state estimation process, the image plane coordinates $(\mu, \nu)$ are taken as inputs to the observation equation, allowing the optical flow $(\dot{\mu}, \dot{\nu})$ to be predicted at that point in the image plane using (2), where $\eta_z$ is provided by the laser rangefinder measurement, $L$. These computed observables are then compared with the optical flow measurements of $(\dot{\mu}, \dot{\nu})$ from the video in order to determine how to update the states. Since multiple optical flow points can be identified within a single time step, this creates a set of $N_k$ observation equations, where $N_k$ is the number of optical flow points at time step $k$.
Since (7) and (8) are derived from kinematics, the only uncertainty that must be modeled is due to the input measurements. Therefore, the input vector is given by

$$\mathbf{u} = \hat{\mathbf{u}} - \mathbf{b}$$

where $\hat{\mathbf{u}}$ is the measured input vector and $\mathbf{b}$ is the vector of sensor biases, which follow a first-order Gauss-Markov noise model as determined in [33].
The uncertainty in the measurements is due to the errors in the optical flow estimation from the video. It is assumed that each optical flow measurement $\mathbf{y}_i$ has an additive measurement noise vector, $\mathbf{v}_i$, with corresponding covariance matrix, $\mathbf{R}_i$. For this study, it is also assumed that each optical flow measurement carries equal uncertainty and that errors along the two component directions of the image plane also have equal uncertainty and are uncorrelated; that is,

$$\mathbf{R}_i = \sigma^2 \mathbf{I}$$

where $\sigma$ is the scalar uncertainty (standard deviation) of the optical flow measurements and $\mathbf{I}$ is a 2 × 2 identity matrix.

Simplified Formulation.
The motion of a typical airplane is mostly in the forward direction; that is, the speed of the aircraft is primarily contained in the component $u$, while $v$ and $w$ are small. With this idea, assuming that $v$ and $w$ are zero, the formulation is simplified by reducing the state vector, $\mathbf{x}$, bias state vector, $\mathbf{b}$, and input vector, $\mathbf{u}$, to

$$\mathbf{x} = [u\ \ \phi\ \ \theta]^T, \quad \mathbf{b} = [b_{a_x}\ \ b_p\ \ b_q\ \ b_r]^T, \quad \mathbf{u} = [a_x\ \ p\ \ q\ \ r]^T$$

with corresponding reductions of the optical flow input vectors, $\mathbf{d}_i$, and output vectors, $\mathbf{z}_i$. Note that this simplified formulation removes the $v$ and $w$ states, which removes the need for $y$-axis and $z$-axis acceleration measurements. Since the yaw state is not contained in any of the state or observation equations, it has also been removed. Due to the assumption that $v$ and $w$ are zero, only the $\mu$-direction of optical flow is relevant. With these simplifications, the state dynamics become

$$\begin{bmatrix} \dot{u} \\ \dot{\phi} \\ \dot{\theta} \end{bmatrix} = \begin{bmatrix} a_x - g\sin\theta \\ p + q\sin\phi\tan\theta + r\cos\phi\tan\theta \\ q\cos\phi - r\sin\phi \end{bmatrix}$$

The dynamics of the bias states remain the same as in the full formulation, except that the bias states for $a_y$ and $a_z$ have been removed. The observation equations from (2) are simplified in the same way, by setting $v$ and $w$ to zero and retaining only the forward component of optical flow.

The advantage of considering this simplified formulation is primarily to reduce the computational complexity of the system. The processing of vision data leads to a relatively large number of measurement updates, which can significantly drive up the computation time of the system, particularly for higher sampling rates. This simplified formulation not only reduces the computation time through a reduction of states, but also significantly reduces the processing and update time for optical flow measurements, since only the forward component of flow is used. This formulation could be more practical than the full state formulation for real-time implementation, especially on systems which are limited in onboard computational power due, for example, to cost or size constraints.

Inclusion of a Range State.
It is possible to include a state to estimate the range in order to recover the scale of the optical flow images. To determine the dynamics of the range state, the flat ground assumption is used. With this assumption, consider the projection of the range vector onto the Earth-fixed $z$-axis, that is, "down," as shown in Figure 1, by taking the projection through both the roll and pitch angles of the aircraft:

$$z = -L\cos\phi\cos\theta$$

Here, the negative sign is used because the $L$ coordinate is always positive, while the $z$ coordinate will be negative when the aircraft is above the ground (due to the "down" convention). Taking the derivative with respect to time yields

$$\dot{z} = -\dot{L}\cos\phi\cos\theta + L\dot{\phi}\sin\phi\cos\theta + L\dot{\theta}\cos\phi\sin\theta$$

Compare this $z$-velocity equation with that obtained from rotating aircraft body velocity components into the Earth-fixed frame:

$$\dot{z} = -u\sin\theta + v\sin\phi\cos\theta + w\cos\phi\cos\theta$$

Equating these two expressions for $z$-velocity gives

$$-\dot{L}\cos\phi\cos\theta + L\dot{\phi}\sin\phi\cos\theta + L\dot{\theta}\cos\phi\sin\theta = -u\sin\theta + v\sin\phi\cos\theta + w\cos\phi\cos\theta$$

Simplifying this relationship leads to

$$\dot{L} = L\left(\dot{\phi}\tan\phi + \dot{\theta}\tan\theta\right) + u\tan\theta\sec\phi - v\tan\phi - w$$

Substituting in the dynamics for the roll and pitch angles and simplifying leads to the following expression for the range state dynamics:

$$\dot{L} = L\left(p\tan\phi + q\tan\theta\sec\phi\right) + u\tan\theta\sec\phi - v\tan\phi - w$$

Note that, for level conditions, that is, roll and pitch angles are zero, the equation reduces to

$$\dot{L} = -w$$

which agrees with physical intuition. In order to implement the range state in the simplified formulation, the following expression can be used:

$$\dot{L} = L\left(p\tan\phi + q\tan\theta\sec\phi\right) + u\tan\theta\sec\phi$$

Information Fusion Algorithm.
Due to the nonlinearity, the nonadditive noise, and the time-varying number of optical flow measurements, which ranges from 0 to 300 per frame with a mean of 250, the Unscented Information Filter (UIF) [39][40][41] was selected for the implementation of this algorithm [42]. The advantage of the information filtering framework over Kalman filtering is that redundant information vectors are additive [39][40][41]; therefore, the time-varying number of outputs obtained from optical flow can easily be handled with relatively low computation, since the coupling between the errors in different optical flow measurements is neglected.
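Returning briefly to the range state introduced before the filter discussion, its dynamics can be checked numerically with a short Python sketch. The expression below follows from the flat-ground projection through roll and pitch combined with the standard 3-2-1 Euler angle kinematics; the function name `range_rate` and the argument `rng` for the range are illustrative assumptions.

```python
import math


def range_rate(rng, u, v, w, roll, pitch, p, q):
    """Time derivative of the range along the body z-axis over flat ground.

    Assumes z = -rng * cos(roll) * cos(pitch) in a north-east-down frame and
    standard 3-2-1 Euler angle kinematics; the yaw rate r cancels out.
    """
    sec_roll = 1.0 / math.cos(roll)
    attitude_term = p * math.tan(roll) + q * math.tan(pitch) * sec_roll
    velocity_term = u * math.tan(pitch) * sec_roll - v * math.tan(roll) - w
    return rng * attitude_term + velocity_term
```

For wings-level, zero-pitch flight the function returns `-w`: a positive (downward) body-axis velocity shortens the range. Setting `v = w = 0` gives the form that applies to the simplified formulation.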
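The UIF algorithm is summarized as follows [41]. Consider a discrete time nonlinear dynamic system of the form

$$\mathbf{x}_k = \mathbf{f}(\mathbf{x}_{k-1}, \mathbf{u}_{k-1}) + \mathbf{w}_{k-1}$$

with measurement equations of the form

$$\mathbf{z}_k = \mathbf{h}(\mathbf{x}_k, \mathbf{d}_k) + \mathbf{v}_k$$

where $\mathbf{h}$ is the observation function and $\mathbf{w}$ and $\mathbf{v}$ are the zero-mean Gaussian process and measurement noise vectors. At each time step, sigma-points are generated from the prior distribution using

$$\boldsymbol{\chi}_{k-1} = \begin{bmatrix} \hat{\mathbf{x}}_{k-1} & \hat{\mathbf{x}}_{k-1} + \sqrt{(n+\lambda)\mathbf{P}_{k-1}} & \hat{\mathbf{x}}_{k-1} - \sqrt{(n+\lambda)\mathbf{P}_{k-1}} \end{bmatrix}$$

where $n$ is the total number of states and $\lambda$ is a scaling parameter [42]. Now, the sigma-points are predicted using

$$\boldsymbol{\chi}^{(i)}_{k|k-1} = \mathbf{f}\left(\boldsymbol{\chi}^{(i)}_{k-1}, \mathbf{u}_{k-1}\right), \quad i = 0, 1, \ldots, 2n$$

where $(i)$ denotes the $i$th column of a matrix. The a priori statistics are then recovered:

$$\hat{\mathbf{x}}_{k|k-1} = \sum_{i=0}^{2n} \eta^m_i \boldsymbol{\chi}^{(i)}_{k|k-1}, \qquad \mathbf{P}_{k|k-1} = \mathbf{Q} + \sum_{i=0}^{2n} \eta^c_i \left(\boldsymbol{\chi}^{(i)}_{k|k-1} - \hat{\mathbf{x}}_{k|k-1}\right)\left(\boldsymbol{\chi}^{(i)}_{k|k-1} - \hat{\mathbf{x}}_{k|k-1}\right)^T$$

where $\mathbf{Q}$ is the process noise covariance matrix, and $\boldsymbol{\eta}^m$ and $\boldsymbol{\eta}^c$ are weight vectors [42]. Using these predicted values, the information vector, $\mathbf{y}$, and matrix, $\mathbf{Y}$, are determined:

$$\mathbf{Y}_{k|k-1} = \mathbf{P}^{-1}_{k|k-1}, \qquad \mathbf{y}_{k|k-1} = \mathbf{Y}_{k|k-1}\hat{\mathbf{x}}_{k|k-1}$$

For each measurement, that is, each optical flow pair, the output equations are evaluated for each sigma-point, as in

$$\boldsymbol{\zeta}^{(i,j)}_{k|k-1} = \mathbf{h}\left(\boldsymbol{\chi}^{(i)}_{k|k-1}, \mathbf{d}^{(j)}_k\right)$$

where $\boldsymbol{\zeta}$ denotes an output sigma-point and the superscript $(i, j)$ denotes the $i$th sigma-point and the $j$th measurement. The computed observation is then recovered using

$$\hat{\mathbf{z}}^{(j)}_{k|k-1} = \sum_{i=0}^{2n} \eta^m_i \boldsymbol{\zeta}^{(i,j)}_{k|k-1}$$

Using the computed observation, the cross-covariance is calculated:

$$\mathbf{P}^{xz,(j)}_{k|k-1} = \sum_{i=0}^{2n} \eta^c_i \left(\boldsymbol{\chi}^{(i)}_{k|k-1} - \hat{\mathbf{x}}_{k|k-1}\right)\left(\boldsymbol{\zeta}^{(i,j)}_{k|k-1} - \hat{\mathbf{z}}^{(j)}_{k|k-1}\right)^T$$

Then the observation sensitivity matrix, $\mathbf{H}$, is determined:

$$\mathbf{H}^{(j)}_k = \left[\mathbf{Y}_{k|k-1}\mathbf{P}^{xz,(j)}_{k|k-1}\right]^T$$

The information contributions can then be calculated:

$$\mathbf{i}^{(j)}_k = \mathbf{H}^{(j)T}_k \mathbf{R}^{-1}\left[\mathbf{z}^{(j)}_k - \hat{\mathbf{z}}^{(j)}_{k|k-1} + \mathbf{H}^{(j)}_k\hat{\mathbf{x}}_{k|k-1}\right], \qquad \mathbf{I}^{(j)}_k = \mathbf{H}^{(j)T}_k \mathbf{R}^{-1}\mathbf{H}^{(j)}_k$$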
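The measurement-update half of this procedure can be sketched with NumPy as follows. This is an illustrative implementation of a sigma-point information update via statistical linearization, followed by the additive combination of the per-measurement contributions. It is not the authors' code, and the function names, the scaled unscented transform weights, and the default values of `alpha`, `beta`, and `kappa` are assumptions.

```python
import numpy as np


def sigma_points(x, P, lam):
    """Return an n x (2n+1) matrix whose columns are the sigma-points."""
    n = x.size
    S = np.linalg.cholesky((n + lam) * P)  # one matrix square root choice
    cols = [x] + [x + S[:, i] for i in range(n)] + [x - S[:, i] for i in range(n)]
    return np.column_stack(cols)


def uif_update(x_pred, P_pred, measurements, h, R,
               alpha=1.0, beta=2.0, kappa=0.0):
    """Information-form update with a variable number of measurements.

    measurements: list of (d_j, z_j) pairs; h(x, d_j) predicts z_j.
    R: noise covariance shared by every measurement (assumed equal).
    """
    n = x_pred.size
    lam = alpha ** 2 * (n + kappa) - n
    wm = np.full(2 * n + 1, 0.5 / (n + lam))   # mean weights
    wc = wm.copy()                             # covariance weights
    wm[0] = lam / (n + lam)
    wc[0] = wm[0] + 1.0 - alpha ** 2 + beta
    chi = sigma_points(x_pred, P_pred, lam)
    Y_prior = np.linalg.inv(P_pred)            # prior information matrix
    Y = Y_prior.copy()
    y = Y_prior @ x_pred                       # prior information vector
    R_inv = np.linalg.inv(R)
    dx = chi - x_pred[:, None]
    for d, z in measurements:
        zeta = np.column_stack([h(chi[:, i], d) for i in range(2 * n + 1)])
        z_hat = zeta @ wm
        P_xz = dx @ np.diag(wc) @ (zeta - z_hat[:, None]).T
        H = (Y_prior @ P_xz).T                 # pseudo measurement matrix
        # Contributions from different measurements simply add.
        y = y + H.T @ R_inv @ (z - z_hat + H @ x_pred)
        Y = Y + H.T @ R_inv @ H
    P_new = np.linalg.inv(Y)
    return P_new @ y, P_new
```

Because each measurement only adds a term to `y` and `Y`, a frame with no optical flow points leaves the prediction untouched, and a frame with hundreds of points costs one small matrix product per point rather than one large joint inversion. For a linear observation function this update reproduces the standard Kalman filter result.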

Experimental Setup
The research platform used for this study is the West Virginia University (WVU) "Red Phastball" UAV, shown in Figure 2, with a custom GPS/INS data logger mounted inside the aircraft [28,43]. Some details for this aircraft are provided in Table 1.
The IMU used in this study is an Analog Devices ADIS-16405 MEMS-based IMU, which includes triaxial accelerometers and rate gyroscopes. Each suite of sensors on the IMU is acquired at 18-bit resolution at 50 Hz over ranges of ±18 g and ±150 deg/s, respectively. The GPS receiver used in the data logger is a Novatel OEM-V1, which was configured to provide Cartesian position and velocity measurements and solution standard deviations at a rate of 50 Hz, with 1.5 m RMS horizontal position accuracy and 0.03 m/s RMS velocity accuracy. An Optic-Logic RS400 laser range finder, pointing downward, was used for range measurement with an approximate accuracy of 1 m and range of 366 m. In addition, a high-quality Goodrich mechanical vertical gyroscope is mounted onboard the UAV to provide pitch and roll measurements to be used as sensor fusion "truth" data, with reported accuracy of within 0.25° of true vertical. The vertical gyroscope measurements were acquired at 16-bit resolution with measurement ranges of ±80 deg for roll and ±60 deg for pitch.
A GoPro Hero video camera is mounted at the center of gravity of the UAV for flight video collection, pointing downwards. The camera was previously calibrated to a focal length of 1141 pixels [29]. Two different sets of flight data were used for this study, each using different camera settings. The first flight used an image resolution of 1920 × 1080 pixels and a sampling rate of 29.97 Hz. The second flight used an image resolution of 1280 × 720 pixels and a sampling rate of 59.94 Hz. All the other sensor data were collected at 50 Hz and, after manual synchronization, resampled to the camera time for postflight validation.
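The resampling step can be sketched in Python as follows. Linear interpolation is an assumption made for this sketch, since the interpolation method is not stated here, and the function name `resample` is hypothetical. Any manual synchronization offset is assumed to have been applied to the time stamps beforehand.

```python
from bisect import bisect_right


def resample(t_src, y_src, t_query):
    """Linearly interpolate samples (t_src, y_src) at the times in t_query.

    t_src must be strictly increasing. Query times outside the source time
    span are clamped to the first or last sample.
    """
    out = []
    for t in t_query:
        if t_src[0] >= t:
            out.append(y_src[0])
            continue
        if t >= t_src[-1]:
            out.append(y_src[-1])
            continue
        j = bisect_right(t_src, t)  # index of the first source time after t
        frac = (t - t_src[j - 1]) / (t_src[j] - t_src[j - 1])
        out.append(y_src[j - 1] + frac * (y_src[j] - y_src[j - 1]))
    return out
```

For example, a 50 Hz IMU channel can be evaluated at 29.97 Hz camera frame times by passing the IMU time stamps as `t_src` and the frame times as `t_query`.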

Selection of Noise Assumptions for Optical Flow Measurements.
Since the noise properties of the IMU have been established from previous work [33], only the characteristics of the uncertainty in the laser range and optical flow measurements need to be determined. The uncertainty in the laser range finder measurement is modeled as 1 m zero-mean Gaussian noise, based on the manufacturer's reported accuracy of the sensor. The optical flow errors are more difficult to model. Due to this difficulty, different assumptions of the optical flow uncertainty were considered. Using both sets of the flight data, the full state UIF was executed for each assumption of optical flow uncertainty. To evaluate the performance of the filter, the speed estimates were compared with reference measurements from GPS, which were mapped into the aircraft frame using roll and pitch measurements from the vertical gyroscope and approximating the yaw from the heading as determined by GPS. The roll and pitch estimates were compared with the measurements from the vertical gyroscope. Due to the possibility of alignment errors, only standard deviation of error was considered. Each of these errors was calculated for each set of flight data, and the results are offered in Figure 4.
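The mapping of the GPS reference velocity into the aircraft frame can be sketched in Python as below. The sketch assumes a north-east-down (NED) GPS velocity and a 3-2-1 (yaw, pitch, roll) Euler angle sequence; the function names `gps_heading` and `ned_to_body` are illustrative.

```python
import math


def gps_heading(v_ned):
    """Approximate yaw by the course over ground from GPS velocity (rad)."""
    return math.atan2(v_ned[1], v_ned[0])


def ned_to_body(v_ned, roll, pitch, yaw):
    """Rotate a NED velocity vector into body axes (3-2-1 Euler sequence)."""
    vn, ve, vd = v_ned
    sphi, cphi = math.sin(roll), math.cos(roll)
    sth, cth = math.sin(pitch), math.cos(pitch)
    spsi, cpsi = math.sin(yaw), math.cos(yaw)
    u = cth * cpsi * vn + cth * spsi * ve - sth * vd
    v = ((sphi * sth * cpsi - cphi * spsi) * vn
         + (sphi * sth * spsi + cphi * cpsi) * ve + sphi * cth * vd)
    w = ((cphi * sth * cpsi + sphi * spsi) * vn
         + (cphi * sth * spsi - sphi * cpsi) * ve + cphi * cth * vd)
    return u, v, w
```

Note that approximating yaw by the GPS heading forces the reference lateral velocity toward zero in level flight, so any sideslip or crosswind drift is absorbed into the heading rather than appearing in the reference value of the lateral component.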
Figure 4 shows how changing the assumption on the optical flow uncertainty affects the estimation performance of the total ground speed, roll angle, and pitch angle. The relatively flat region in Figure 4 for assumed optical flow standard deviations from approximately 3 to 9 pixels indicates that this formulation is relatively insensitive to tuning of these optical flow errors. It is also interesting to note in Figure 4 that Flight #1 and Flight #2 have optimum performance at different values of $\sigma$. This, however, makes sense, as Flight #2 has twice the frame rate of Flight #1; therefore, the assumed noise characteristics should be one half that of Flight #1. From Figure 4, the optical flow uncertainties were selected to be $\sigma^2 = 5^2$ pixels$^2$ for Flight #1 and $\sigma^2 = 2.5^2$ pixels$^2$ for Flight #2.

Full State Formulation Estimation Results.
Using each set of flight data, the full state formulation using UIF was executed. The estimated components of velocity are shown for Flight #1 in Figure 5 and for Flight #2 in Figure 6. These estimates from the UIF are offered with respect to comparable reference values from GPS, which were mapped into the aircraft frame using roll and pitch measurements from the vertical gyroscope and approximating the yaw angle with the heading angle obtained from GPS. From each of these figures, the following observations can be made. The forward velocity, $u$, is reasonably captured by the estimation. The lateral velocity, $v$, and vertical velocity, $w$, however, demonstrate somewhat poor results. This does, however, make sense, as the primary direction of flight is forward, thus resulting in good observability characteristics in the optical flow in the forward direction, while the signal-to-noise ratio (SNR) for the lateral and vertical directions remains small for most typical flight conditions. However, since these lateral and vertical components are only a small portion of the total velocity, the total speed can be reasonably approximated by this technique. The total speed estimates are shown in Figure 7 for Flight #1 with GPS reference. The attitude estimates for the roll and pitch angles are compared with the vertical gyroscope measurements as a reference, as shown in Figure 8. In order to demonstrate the effectiveness of this method in regulating the drift in attitude estimates that occurs with dead reckoning, the estimation errors from the UIF are compared with the errors obtained from dead reckoning attitude estimation. These roll and pitch errors are offered in Figure 9 for Flight #2. Figure 9 demonstrates the effectiveness of the UIF in regulating the attitude errors from dead reckoning.
In order to quantify the estimation results, the mean absolute error and standard deviation of error of the estimates are calculated for the velocity components with respect to the GPS reference and also for the roll and pitch angles with respect to the vertical gyroscope reference. These results are provided in Table 2 for Flight #1 and Table 3 for Flight #2, where $V$ is the total speed as determined by

$$V = \sqrt{u^2 + v^2 + w^2}$$

It is shown in Tables 2 and 3 that reasonable errors are obtained in both sets of flight data for the velocity and attitude of the aircraft. Larger errors are noted in particular for the lateral velocity state, $v$, which is due to observability issues in the optical flow. Note that mean errors in the roll and pitch estimation could be due to misalignment between the vertical gyroscope, IMU, and video camera. The attitude estimation accuracy reported in Tables 2 and 3 is similar to the reported accuracy of loosely coupled GPS/INS attitude estimation using similar flight data [43].
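The two error metrics and the total speed can be computed with a short Python sketch such as the following. The function names are illustrative, and the use of the sample (n - 1) standard deviation is an assumption, since the normalization is not stated here.

```python
import math


def total_speed(u, v, w):
    """Magnitude of the body-axis velocity vector."""
    return math.sqrt(u * u + v * v + w * w)


def error_stats(estimate, reference):
    """Return (mean absolute error, standard deviation of error).

    The standard deviation removes the mean error, so a constant offset
    (e.g., a sensor misalignment) affects only the first metric.
    """
    errors = [e - r for e, r in zip(estimate, reference)]
    n = len(errors)
    mean_abs = sum(abs(e) for e in errors) / n
    mean = sum(errors) / n
    std = math.sqrt(sum((e - mean) ** 2 for e in errors) / (n - 1))
    return mean_abs, std
```

Reporting both metrics is useful here because a fixed misalignment between the vertical gyroscope, IMU, and camera inflates the mean absolute error of roll and pitch without changing the standard deviation of error.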

Simplified Formulation Estimation Results.
Since it was observed in the full state formulation results that the lateral and vertical estimates were small, the simplified formulation was implemented in order to investigate the feasibility of a simplified version of the filter that estimates only the forward velocity component and assumes the lateral and vertical components are zero. The forward velocity, $u$, for Flight #1 is offered in Figure 10, while the roll and pitch errors with respect to the vertical gyroscope reference are shown in Figure 11 for the UIF and dead reckoning (DR). Additionally, it is shown in Tables 4 and 5 that the simplified formulation results in significantly higher attitude estimation errors with respect to the full state formulation. These increased attitude errors are likely due to the assumption that lateral and vertical velocity components are zero. To investigate this possible correlation, the roll and pitch errors are shown in Figure 12 with the magnitude of the lateral and vertical velocity as determined from GPS for a 50-second segment of flight data which includes takeoff. Figure 12 shows that there is some correlation between the attitude estimation errors and the lateral and vertical velocity, though it is not the only source of error for these estimates.

Results Using Range State.
The results for each flight for both the full state formulation and simplified formulation were recalculated with the addition of the range state. The statistical results for these tests are offered in Tables 6-9. In order to compare the results from the different cases, the standard deviation of error is shown graphically for Flight #1 in Figure 13 and Flight #2 in Figure 14. It is shown in Figures 13 and 14 that the simplified formulation offers poorer estimation performance, as expected, particularly for the attitude estimates. The addition of the range state does not affect the performance significantly.

Conclusions
This paper presented vision-aided inertial navigation techniques which do not rely upon GPS, evaluated using UAV flight data. Two different formulations were presented: a full state estimation formulation which captures the aircraft ground velocity vector and attitude, and a simplified formulation which assumes all of the aircraft velocity is in the forward direction. Both formulations were shown to be effective in regulating the INS drift. Additionally, a state was included in each formulation in order to estimate the distance between the image center and the aircraft. The full state formulation was shown to be effective in estimating aircraft ground velocity to within 1.3 m/s and regulating attitude angles within 1.4 degrees standard deviation of error for both sets of flight data.

Figure 1: Diagram of the range coordinate.

Flight Data.
Two sets of flight data from the WVU "Red Phastball" aircraft were used in this study. Each flight consists of approximately 5 minutes of flight. The top-down flight trajectories from these two data sets are overlaid on a Google Earth image of the flight test location in Figure 3. Six unique markers have been placed in Figure 3 in order to identify specific points along the trajectory. These markers will be used in future figures in order to synchronize the presentation of data.

Figure 4: Comparison of errors for different assumed optical flow uncertainties.

Figure 9: Roll and pitch estimation errors as compared to dead reckoning (DR) for Flight #2.

Figure 12: Comparison of attitude estimation errors with respect to lateral and vertical velocity.

Table 2: Flight #1 error statistics for estimated states.

Table 3: Flight #2 error statistics for estimated states.

Table 6: Flight #1 error statistics for estimated states with range state.

Table 7: Flight #2 error statistics for estimated states with range state.

Table 8: Simplified formulation error statistics for Flight #1 with range state.

Table 9: Simplified formulation error statistics for Flight #2 with range state.