A Calculation Method for Vehicle Movement Reconstruction from Videos

This paper proposes a new enhanced method based on one-dimensional direct linear transformation for estimating vehicle movement states in video sequences. The proposed method utilizes the contoured structure of target vehicles, and the data collection procedure is found to be relatively stable and effective, providing better applicability. The movements of vehicles in the video are captured by active calibration regions while the spatial consistency between the vehicle's driving track and the calibration information is maintained. The vehicle movement states in the verification phase are first estimated using the proposed method, and the estimated states are then compared with the actual movement states recorded in the experimental test. The results show that, for a camera perspective of 90 degrees, in all driving states of low speed, high speed, or deceleration, the error between estimated and recorded speed is less than 1.5%, the error of accelerations is less than 7%, and the error of distances is less than 2%; similarly, for a camera perspective of 30 degrees, the errors of speeds, distances, and accelerations are less than 4%, 5%, and 10%, respectively. The proposed method is found to be superior to other existing methods.


Introduction
As video images record the vehicles' motions within the monitoring range and can provide objective raw data for further quantitative analysis, video surveillance equipment has been applied to provide important clues for detection and forensics in vehicle-related applications. For instance, Byon et al. [1][2][3] used the vehicle state, identified in terms of speed and acceleration on the associated road section from video recordings, as input information to transportation mode detection algorithms. Tarkowski et al. [4] used video recordings to reconstruct a motorcycle-motor car collision with both vehicles moving in the same direction and performing simultaneous maneuvers: a left turn and overtaking. Wu et al. [5] calculated the wheel track on an operational highway by video analysis when analyzing the design of visual intervention pavement markings.
Vehicles' speed has a significant impact on the severity of road traffic accidents [6][7][8][9], and a reliable method to extract valid data and reconstruct vehicle movement is the critical success factor for video-based accident analysis.
There have been several traditional techniques, such as background subtraction and vehicle segmentation, using hand-crafted features that are sensitive to noise. For instance, the optical flow method [10], the interframe difference method [11], and the background difference method [12] detect moving objects and their associated speeds from the change of pixels, while the color method [13] is usually used in the field of video velocimetry. Gao et al. [14] developed a method of vehicle speed identification based on video images, which detects the vehicle's moving distance with reference to a fixed object and the structure of the target vehicle in the road scene. According to the definitions of interval running speed and point speed, the speed of the target vehicle can be estimated from the video frame rate. At present, this has become the main method of using video images for vehicle speed identification in the practice of forensic identification of traffic accidents [15][16][17][18][19]; however, due to the location of the calibration and the video frame rate, the method cannot fully reflect all motion states of the target vehicle within the monitoring range. Kumar et al. [20] presented a semiautomatic 2D solution for vehicle speed estimation from monocular videos: they employed the state-of-the-art object detector Mask-RCNN to generate reliable vehicle bounding boxes and proposed a two-stage algorithm to approximately transform the measurements in the image domain to the real world. However, none of these methods can estimate the vehicle trajectory (especially a curved trajectory) except in the case of videos recorded from overlooking perspectives. Furthermore, low-speed and multiobjective situations lead to larger detection errors.
The combination of photogrammetry and computer technology provides a better method for road geometry measurement and vehicle speed calculation [21]. The transformation relationship between the image point coordinate system and the object space coordinate system can be established by using the direct linear transformation principle of close-range photogrammetry [22], which is suitable for photographic measurements of images taken by the currently dominant nonmeasurement cameras [23,24]. Yang et al. [25] collected information on the pavement and vehicle body in traffic accident scenes based on the direct linear transformation principle of close-range photogrammetry, which can be used as the input condition of software for accident simulation. Using road surveillance video, Miao [26] and Han [27] selected four fixed points on the road markings as control points. By matching the control points' image plane coordinates and object plane coordinates, combined with the direct linear transformation principle, the track of a target vehicle on the ground and the associated speed of the vehicle can be estimated. However, this method takes four fixed points on the road as control points, and the selection of road control points is often affected by the number and distribution of road marks and the road grade [28]. Moreover, the relative distances among the four feature points need to be measured where accidents occur, resulting in significant limitations with respect to the application conditions and the required set-up precision. Bullinger et al. [29] presented a method to reconstruct three-dimensional object motion trajectories in stereo video sequences: they embed the vehicle trajectories into the environment reconstruction by combining the object point cloud of each image pair with the corresponding camera poses contained in the background SfM (Structure from Motion) reconstruction.
This method proved more reliable than monocular trajectory reconstruction approaches, but invalid stereo matches caused by strong surface reflections are not easy to avoid. Hasith et al. [30] used a detector to detect object bounding boxes in an image frame and associated these detections to tracks using the Hungarian algorithm, based on pairwise dissimilarity scores calculated between detections in the current frame and the tracks in memory; they proposed a method to address MOT (multiple object tracking) by defining a dissimilarity measure based on object motion, appearance, structure, and size. It is an efficient and advanced framework, but its tracking accuracy still needs to be improved.
Overall, the traditional and existing methods of estimating vehicle motion cannot fully reflect the driving states of the vehicle, or their required conditions are greatly restricted, or they do not have enough accuracy.
This paper develops a one-dimensional direct linear transformation method to solve for the entire trajectory of a vehicle in the image sequence of a recorded video, by capturing linear structure features of the vehicle body and using the trajectory of the vehicle as calibration information. The calibration information of the method is derived from the profile of the target vehicle. The data collection process is found to be stable and effective, which eliminates the dependence, present in other existing methods, on information such as fixed distances in the image and on the quantity and distribution of road markings. The calibrated area covers the whole motion of the vehicle in the video image, which keeps the vehicle's trajectory and the calibration information spatially consistent as much as possible. Based on the video frame rate, the minimum step length of the entire vehicle trajectory is used to calculate the distance, which reduces extrapolation during the solving process and improves the accuracy of the results.

Basic Assumptions and Calculating Methods
In order to discretize the vehicle's continuous movement recorded in a video into an image sequence, the video frame rate (i.e., recording frequency) and the vehicle's position in each image frame are used to compute the displacement during each time interval between consecutive image frames.

Basic Assumptions.
Because of the continuity of a vehicle's movement and the high video frame rates commonly used in emerging camera devices, it can be assumed that the moving trajectory within two adjacent image frames approximates linear motion. Therefore, the corresponding displacement between adjacent frames of the video can be assumed to be equal to the actual distance traveled, as shown in Figure 1.

Solution Ideas.
A direct linear transformation does not need the interior and exterior orientation elements of the camera, and it is suitable for photographic measurements of images taken by nonmeasurement cameras. The basic formulas of the direct linear transformation are as follows:

x = (l_1·X + l_2·Y + l_3·Z + l_4) / (l_9·X + l_10·Y + l_11·Z + 1),
y = (l_5·X + l_6·Y + l_7·Z + l_8) / (l_9·X + l_10·Y + l_11·Z + 1),          (1)

where x and y are the image plane coordinates; X, Y, and Z are the object space coordinates; and l_1∼l_11 are the direct linear transformation factors. It is assumed that two adjacent frames can be approximated as a linear motion, so the linear feature of the vehicle body profile can be considered to lie on the same line in two adjacent frames. Treating the variables Y and Z as constants, equation (1) can be rewritten as

x = (l_1·X + l_2) / (l_3·X + 1).          (2)

Equation (2) is the one-dimensional direct linear transformation formula, which solves the one-dimensional linear transformation from the image space into the object space. According to Figure 2, the four intersections where the line at the outer end of the wheel axis meets the rim edges in the No. N frame image are set as reference points, named a_N, b_N, c_N, and d_N. The four marked points make up three segments along an approximately straight line. Then, any three of them are selected as control points, and, based on the pixel distances and the actual distances between them, the one-dimensional direct linear transformation coefficients L_N = [l_1, l_2, l_3] of the No. N frame image are solved. On the next frame image (No. N + 1), the object space distance between a_{N+1} and a_N is calculated from the image point coordinates of a_{N+1} and a_N combined with the direct linear transformation coefficients L_N.
Then, the No. N + 1 frame image's corresponding one-dimensional direct linear transformation coefficients L_{N+1} are found from the image distances and the actual distances of any three of the points a_{N+1}, b_{N+1}, c_{N+1}, and d_{N+1}. Similarly, from the image point coordinates of a_{N+2} and a_{N+1} in the next frame image (No. N + 2), combined with the direct linear transformation coefficients L_{N+1}, the actual distance between a_{N+2} and a_{N+1} can be calculated. Calibration and calculation continue frame by frame in the direction of the vehicle's movement. Finally, the driving state of the vehicle is obtained.
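As a concrete illustration, the calibration and back-projection steps above can be sketched in code. This is a minimal sketch, not the authors' implementation: it assumes equation (2) with three (or more) control points whose actual positions along the body line are known, and all variable names and numbers are hypothetical.

```python
import numpy as np

def solve_1d_dlt(img_x, obj_X):
    """Fit [l1, l2, l3] of equation (2), x = (l1*X + l2) / (l3*X + 1),
    from control points with image coordinates img_x and known object
    positions obj_X along the body line (least squares, >= 3 points)."""
    img_x = np.asarray(img_x, dtype=float)
    obj_X = np.asarray(obj_X, dtype=float)
    # Rearranged: l1*X + l2 - l3*(x*X) = x  (linear in l1, l2, l3)
    A = np.column_stack([obj_X, np.ones_like(obj_X), -img_x * obj_X])
    coeffs, *_ = np.linalg.lstsq(A, img_x, rcond=None)
    return coeffs  # [l1, l2, l3]

def image_to_object(x, coeffs):
    """Invert equation (2) to map an image coordinate x back to an
    object position X along the calibrated line."""
    l1, l2, l3 = coeffs
    return (x - l2) / (l1 - x * l3)
```

With L_N fitted from three control points on frame N, the object space distance between a_N and a_{N+1} is simply the difference of the two back-projected positions, image_to_object(x_{N+1}, L_N) − image_to_object(x_N, L_N).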

Calculating Procedure
The solution procedure described in the "Solution Ideas" section can be summarized in the following steps: (1) Extract the image and time-stamp of the target vehicle, frame by frame, over the desired duration of the video. (2) In each frame, take the four intersection points formed between the straight line at the outer end of the wheel axis on one side of the target vehicle and the rim edges as the reference points, respectively set to a, b, c, and d. (3) In the No. N frame image, select any three of the reference points as control points and solve the one-dimensional direct linear transformation coefficients L_N from their pixel distances and actual distances. (4) From the image point coordinates of point a in the No. N and No. N + 1 frame images, combined with L_N, calculate the actual motion distance between the No. N and No. N + 1 frame images. (5) Solve the coefficients L_{N+1} of the No. N + 1 frame image in the same way. (6) For the No. N + 2 frame image, combining with the direct linear transformation coefficients L_{N+1}, calculate the actual motion distance between the No. N + 1 and No. N + 2 frame images. (7) Calculate the direct linear transformation coefficients frame by frame by the method above, taking point a as the reference point of the actual driving distance of the target vehicle.

Figure 2: Distance calibration between adjacent image sequences.

(8) According to the driving distance of the target vehicle in the image sequence, combined with the frame rate of the video, calculate the vehicle's speed and acceleration corresponding to the specified images, frame by frame. (9) Output the file information, vehicle feature point spacing, extracted video image range, image sequence and time-stamp information, feature point image coordinates on each frame, one-dimensional direct linear transformation coefficients L_N, the frame-to-frame distance of feature point a, and the speed and acceleration in every frame.
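Steps (8) and (9) amount to simple difference quotients over the frame interval. A minimal sketch (illustrative only; the names are not from the paper):

```python
import numpy as np

def kinematics_from_displacements(dS_m, fps):
    """dS_m: per-frame-interval displacements of reference point a (meters).
    Speed over each interval is dS/dt; acceleration is the change of
    speed between consecutive intervals, again divided by dt."""
    dS = np.asarray(dS_m, dtype=float)
    dt = 1.0 / fps
    v = dS / dt                 # m/s, one value per frame interval
    a = np.diff(v) / dt         # m/s^2, one value per pair of intervals
    return v, a
```

At 30 fps, for example, a frame displacement of 0.35 m corresponds to 10.5 m/s (37.8 km/h), on the order of the speeds reported in the validation tests below.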

Design of Validation Test Scheme
This section describes the validation test set-up as follows: (i) Test materials: a manual transmission car, a Kistler S-350 photoelectric five-wheel instrument, LED monitors, two cameras, a SIRIUS digital collector, a notebook computer (for channel set-up and data collection), three-dimensional laser scanners, a checkerboard, tape, and BMW labels.
(ii) Test site: a 150-m-long straight lane.
(iii) Test scenario: Figure 3(a) shows the test vehicle and equipment, and Figure 3(b) shows the test scenario. There are six BMW labels on both sides of the test lane and two cameras on the side of the driveway. The camera directions are set at 90° and 30° to the direction of the vehicle. The test vehicle is equipped with the five-wheel instrument, digital collector, LED monitor, and a car camera. The 90° camera is 10 m away from the side of the lane, and the test car runs three times at different driving speeds. DWDEsoft saves the collected video and data information to the notebook computer. The frame rate of the video collected in the test is 30 fps, and the resolution is 640 × 480 pixels. The collected data include travel speed and distance (collection frequency: 100 Hz).

In the Camera Perspective of 90 Degrees
(1) Image Preprocessing and Interframe Displacement Solution. In PC-Rect 4.2, the image of the checkerboard taken by the test camera is processed and the lens distortion correction file is obtained. In MATLAB, the collected video is decomposed into multiple frames; the video frame rate is 30 fps, the total number of frames is 73, and the file format is RGB 24-bit. Then, 28 continuous frames are extracted as the test car passes through the video region, and each image is corrected by applying the lens distortion processing, as shown in Figure 4 (the original image). The feature point coordinates are extracted from each corrected frame and substituted into equation (2). Applying the same processing to all frame images, the displacement of the test vehicle between two adjacent frames, dS (called the frame displacement in the following), is calculated; the results are shown in Table 1.
(2) Driving Speed Solution. From Table 1, the frame displacement dS is in the range of 33-38 cm, with a mean of 34.84 cm. The video frame rate is 30 fps, and according to v = dS/dt, we solve the interframe velocity v, as Figure 7 shows. Meanwhile, through the SIRIUS data collection, we record the test speed v_0 at a collection period of 0.01 s; the comparison of v_0 and v is shown in Figure 8. From Figures 7 and 8, the fourth-order fitted curve of the calculated speed v over time t is basically the same as the test-gathered speed curve, and the overall trend is decreasing. The standard deviation band of v (μ ± σ) ranges between 36.3 km/h and 39.3 km/h, with a mean of 37.7 km/h. The recorded v_0 ranges between 37.2 km/h and 38.9 km/h, which falls within the range of 36.3 km/h∼39.1 km/h. The average recorded speed is 38.1 km/h, which differs from the calculated average v by approximately 1.0%.
This shows that the proposed method is valid for speed calculation.
(3) Acceleration Solution. Then, we take the derivative of the fitted speed curve in equation (4), resulting in the acceleration a. Since the data collection instrument does not record the acceleration value over time, this paper takes the derivative of v_0 with respect to t and obtains the scattered acceleration values a_0. Then, we fit a third-order polynomial to the scattered values a_0, as shown in Figure 9.
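The differentiation step can be sketched as fitting a polynomial to the speed samples and differentiating the fitted coefficients. This is a hedged illustration of the procedure, not the paper's code; the order-4 default mirrors the fourth-order fitting mentioned above, and the data in the usage example are synthetic.

```python
import numpy as np

def accel_from_speed(t, v, order=4):
    """Fit a polynomial v(t) of the given order, then differentiate the
    fitted coefficients to get the acceleration a(t) = dv/dt."""
    t = np.asarray(t, dtype=float)
    v = np.asarray(v, dtype=float)
    p = np.polyfit(t, v, order)     # speed-fit coefficients
    dp = np.polyder(p)              # coefficients of the derivative
    return np.polyval(dp, t)        # acceleration at each sample time
```

For a nearly uniform run (speed almost linear in time), the fitted higher-order terms are close to zero and the returned acceleration is nearly constant.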
There is a large deviation between the a_0 fitted curve and the calculated a. It is also noted that the distribution of a_0 is scattered. This is because the change of the test speed is small (<2 m/s²), close to a uniform motion, while the sampling period is as short as 0.01 s, so the speed error reflected in the acceleration is amplified 100 times by the sampling frequency.
We notice that the time T is only 0.9 s and that the change in vehicle acceleration is usually approximately linear over such a short period under nonemergency situations. Therefore, it is reasonable to make a linear comparison of the acceleration values, as shown in Figure 10. The differences between the slopes of the two fitted lines and between their mean values are 7.6% and 5.1%, respectively. From this result, we can see that this method is also applicable to the solution of the mean acceleration value.
(4) Travel Distance Calculation. The travel distance S is found by integrating the frame displacement dS over the time T. Figure 11 shows the driving distance S calculated by this method compared with the test record S_0.
S and S_0 both reflect that the car is approximately under uniform motion, and their linear regression equations have similar slopes and intercepts. The cumulative driving distance in 0.9 s is S = 9.65 m, and the driving distance of the test record is S_0 = 9.47 m; the difference between them is 1.9%. It is found that the calculation of the driving distance is valid.

In the Camera Perspective of 30 Degrees
(1) Value Correction of Feature Point Coordinates. After the video frame extraction and distortion correction processing are applied, we extract 42 continuous frame images of the test vehicle. The images are renumbered in the order of 0 to 41.
Due to the 30° deflection angle between the lens orientation and the driving direction of the vehicle, it is difficult to ensure that the feature points of the vehicle body are located on the same line. As shown in Figure 12, there is no guarantee that the pixel distance between adjacent feature points in the image coordinate system can be correctly estimated.
First, 3 feature points are selected for each frame (126 feature points in all). Second, a linear regression over the 126 feature points is conducted, as shown in Figure 13. The ordinate value of each feature point is then revised to the fitted value, and it is found that the modified ordinate values change more uniformly and regularly, as shown in Table 2.
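This correction can be sketched as a single regression: fit one straight line to all extracted feature points and replace each point's ordinate with the fitted value, enforcing the collinearity that equation (2) assumes. A minimal sketch under that assumption (names are illustrative):

```python
import numpy as np

def correct_ordinates(xs, ys):
    """Fit one straight line to all feature points and snap each
    ordinate onto the fitted line, enforcing collinearity."""
    xs = np.asarray(xs, dtype=float)
    ys = np.asarray(ys, dtype=float)
    slope, intercept = np.polyfit(xs, ys, 1)
    return slope * xs + intercept
```

After the correction, the modified ordinates lie exactly on one line, so pixel distances between adjacent feature points can be measured along it consistently.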
(2) Driving Speed Solution. According to the corrected coordinates and equation (2), the same method solves the frame displacement dS between two adjacent frames. Dividing dS by the frame interval gives the average speed v between frames, as shown in Figure 14. The comparison of the calculated v and the recorded v_0 is shown in Figure 15.
As Figure 14 shows, the standard deviation band of v (μ ± σ) ranges between 33.6 km/h and 38.7 km/h, with a mean of 36.1 km/h. The recorded v_0 ranges from 36.5 km/h to 38.7 km/h, resembling the estimated range of 36.7 km/h∼38.6 km/h. The average recorded speed is 37.5 km/h, and the deviation from the calculated average v is 3.7%, meeting the error standards.
(3) Acceleration Solution. Linear regressions of the curve a and the scattered a_0 are shown in Figure 16. The difference between the mean values of the two accelerations is 5.6%. Although the slopes of the two fitted lines have opposite signs, both lines lie below 0, indicating that the acceleration is always negative.

In the Camera Perspective of 90 Degrees
(1) Driving Speed Solution. 10 consecutive frames are extracted from the video and the same methods are applied to calculate the mean speed v, which is compared with the recorded speed v_0; the result is shown in Figure 18. The average speeds of v and v_0 are 76.99 km/h and 76.77 km/h, respectively, which differ from each other by approximately 0.3%.
(2) Travel Distance Calculation. The linear regression of the travel distance S gives a slope of 21.37 m/s (76.93 km/h) with a coefficient of determination R² = 1, indicating that the regression is significant and that the vehicle is approximately under uniform motion. The difference between the estimated S = 6.42 m and the observed S_0 = 6.40 m is 0.3%, meeting the error requirements. Figure 19 shows the comparison between the estimated and observed cumulative driving distances. The comparison of the acceleration values is not discussed here, as the experiment is very close to uniform motion and similar to the situation described in the "Low-Speed Running" section. The next section, "Deceleration Running," will discuss the deceleration values in particular.
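The regression used here reduces to fitting S = v·t + b and reading the slope as the constant-speed estimate, with the coefficient of determination measuring how uniform the motion is. A brief sketch (synthetic values, not the test data):

```python
import numpy as np

def fit_uniform_motion(t, S):
    """Linear regression of cumulative distance vs time. The slope is
    the constant-speed estimate; R^2 near 1 indicates uniform motion."""
    t = np.asarray(t, dtype=float)
    S = np.asarray(S, dtype=float)
    v, b = np.polyfit(t, S, 1)
    S_hat = v * t + b
    ss_res = float(np.sum((S - S_hat) ** 2))
    ss_tot = float(np.sum((S - S.mean()) ** 2))
    return v, b, 1.0 - ss_res / ss_tot
```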

In the Camera Perspective of 30 Degrees
(1) Driving Speed Solution. 25 continuous frame images of the test vehicle are extracted first, and the coordinates of 75 feature points are corrected, as shown in Figure 20.
Based on the modified feature point coordinates, we obtain v, which is shown in Figure 21 together with v_0. The average of v is 75.31 km/h, and the standard deviation band (μ ± σ) ranges from 70.58 km/h to 80.07 km/h. The average of v_0 is 76.75 km/h, which differs from the calculated average v by approximately 1.9%, meeting the requirements.
(2) Travel Distance Calculation. The cumulative driving distance S and the test record S_0 are shown in Figure 22. After the regression, with a coefficient of determination R² = 0.98, the vehicle is approximately under uniform motion. In 0.8 s, the difference between the calculated S = 16.73 m and the record S_0 = 17.05 m is 1.9%.

In the Camera Perspective of 90 Degrees
(1) Driving Speed Solution. 10 frames of continuous images through the video region are extracted, and the same methods are applied to calculate the mean speed v, which is compared with the recorded speed v_0, as shown in Figure 23.
We applied an F-test to v and v_0 for a significance test; the resulting H value is 0 and the P value is 0.38, which is greater than the significance level of 0.05. This indicates that the variances are homogeneous and that there is no significant difference; the mean speed difference is 1.3%.
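The F-test referred to above compares the variances of the two speed series and returns the same (H, P) pair that MATLAB's vartest2 reports. A hedged sketch of an equivalent computation (assuming SciPy is available; the data in the test are illustrative, not the experimental series):

```python
import numpy as np
from scipy.stats import f as f_dist

def f_test_equal_variance(a, b, alpha=0.05):
    """Two-sided F-test for equality of variances.
    Returns (H, P): H = 0 means the null hypothesis of equal
    variances cannot be rejected at level alpha."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    F = a.var(ddof=1) / b.var(ddof=1)
    dfa, dfb = a.size - 1, b.size - 1
    p = 2.0 * min(f_dist.cdf(F, dfa, dfb), f_dist.sf(F, dfa, dfb))
    return int(p < alpha), min(p, 1.0)
```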
(2) Travel Distance Calculation. The above calculations of acceleration show that the acceleration scatter values obtained directly through dv/dt are dispersed and carry a larger error.
Therefore, for the solution of the acceleration in the deceleration state, we instead solve the cumulative driving distance S by curve fitting and differentiate twice to obtain the acceleration curve. The calculated result of S is shown in Figure 24, with an error of about 1.1% and a nonlinear change.
(3) Acceleration Solution. Because the deceleration of the test vehicle is basically unchanged through the video region, a quadratic fit of S gives S = −3.1t² + 18t with a residual of 0.043, and a quadratic fit of S_0 gives a residual of 0.095. Taking the second derivative of both fits gives a = −6.2 m/s² and a_0 = −5.8 m/s². The resulting acceleration difference is 6.9%.
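Since a quadratic S(t) = c2·t² + c1·t + c0 has the constant second derivative 2·c2, the acceleration follows directly from the fitted leading coefficient. A sketch of this step, using the fitted curve S = −3.1t² + 18t reported above as synthetic input:

```python
import numpy as np

def accel_from_distance(t, S):
    """Quadratic fit of cumulative distance vs time; the constant
    acceleration is twice the leading coefficient (d^2 S / dt^2)."""
    c2, c1, c0 = np.polyfit(np.asarray(t, float), np.asarray(S, float), 2)
    return 2.0 * c2

t = np.linspace(0.0, 0.8, 9)
S = -3.1 * t**2 + 18.0 * t       # the fitted curve reported in the text
a = accel_from_distance(t, S)    # recovers approximately -6.2 m/s^2
```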

In the Camera Perspective of 30 Degrees

By the same method, the driving speed and distance can be obtained, as shown in Figures 25 and 26. There is no significant difference in driving speeds; the mean speed error is 1.2%, the distance error is 1.1%, and the acceleration error is 9%.

Conclusion
In this paper, a method based on a direct linear transformation is proposed to calculate the moving state of a vehicle in a video image; it can fully reflect the running state of the vehicle, and the required data are found to be stable and easy to obtain. The application conditions are thoroughly described throughout the paper. Examples are given to show the estimation of the vehicle's speed, acceleration, and driving distance under 2 camera angles for 3 different motion states.
In order to verify the validity of the method, vehicle motion states are estimated in experimental test cases. The results show that, for the 90° camera perspective, in the three states of low-speed, medium-to-high-speed, and deceleration running, the error between calculated and recorded speed is less than 1.5%, the acceleration error is less than 7%, and the driving distance error is less than 2%. For the 30° camera perspective, in the same three states, the speed error is less than 4%, the driving distance error is less than 5%, and the acceleration error is less than 10%. The performance of the proposed method in terms of errors in vehicle speed and distance is found to be better than that of other existing methods, and the acceleration error complies with the error requirement, which shows that the method presented in this paper is effective in solving the driving state of vehicles in videos.
This paper does not currently attempt to estimate vehicle movement states or running trajectories from an overlooking camera view. We will continue our research by adding vehicles' cross-lane curve states in the near future, and the selection criteria for the feature points of the body profile under an overlooking camera will be newly defined.

Data Availability
All of the data related to this paper are available for peer researchers to validate.

Disclosure

The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

Conflicts of Interest
The authors declare no conflicts of interest.