Taxi Efficiency Measurements Based on Motorcade-Sharing Model: Evidence from GPS-Equipped Taxi Data in Sanya

Urban traffic congestion has become a global problem and has gained particular importance in recent years in the transportation sector, especially in taxi markets. To unlock the potential of the Internet of Vehicles (IoV) in Intelligent Transportation Systems (ITS), it is vital to make efficiency measurements. In this study, a Distance Formula was built to calculate distances from GPS data based on mathematical equations, and a Motorcade-Sharing (MS) Model was proposed to improve the efficiency of collaborative vehicles. Experimental data from 2191 GPS-equipped taxis in Sanya, China, was used to compare original results with modelled results. Measurement results showed that the MS Model yielded 10.54% more leisure taxis, 5 fewer overdriving taxis, and a 33.73% saving in total running distance compared to the original. This indicates that applying the MS Model could not only alleviate urban traffic congestion but also optimize urban taxi markets, and that it has a bright future for taxis and other collaborative vehicles. Future directions include improving the MS Model and expanding the data.


Introduction
Nowadays, there are plenty of urgent social issues in the transportation research field. For example, urban traffic congestion has become a global problem and has gained particular importance in recent years in the transportation sector, especially in taxi markets. With urban congestion worsening and autonomous vehicles developing, taxi markets face increasing opportunities as well as challenges. Appropriate countermeasures must be taken to reverse, or at least alleviate, the worsening situation.
At the same time, more and more advanced technologies are emerging (e.g., intelligent traffic systems and autonomous vehicles). To unlock the potential of the Internet of Vehicles (IoV) in Intelligent Transportation Systems (ITS), it is vital to make efficiency measurements. In this study, the Motorcade-Sharing Model was proposed for promoting taxi efficiency, and GPS-enabled taxi data from Sanya was adopted to support comparisons.
The rest of this paper is organized as follows: Section 2 reviews related works; Section 3 introduces formulas, models, data, and tools; Section 4 describes results from the perspectives of path and efficiency; Section 5 presents a brief discussion; Section 6 summarizes the main conclusions and contributions and proposes future directions.

Literature Review
This section reviews several related problems and prior works, including the Internet of Vehicles, collaborative vehicle routing, and taxi service improvement.

Internet of Vehicles Problem.
Intelligent Transportation Systems (ITS) have had a significant impact on our lives [1], and they will constitute a crucial component of modernized society [2,3]. Many advanced technologies are involved in ITS, for example, radio frequency identification [4], wireless sensor networks [5,6], network coding [7,8], network energy [9], database storage [10,11], data mining [12], and cloud computing security [13]. In recent years, a significant change in ITS has been that much more data are collected from a variety of sources and can be processed into various forms [14]. Numerous leading multidisciplinary journals have published special issues on big data, for example, Nature in 2008 [15] and Science in 2011 [16].
The Internet of Things (IoT) is a novel paradigm that is rapidly gaining ground in the scenario of modern wireless telecommunications, as defined by Atzori et al. [17]. In the IoT paradigm, many of the objects that surround us will be on the network [18-20]. According to Miorandi et al. (2012) [21], the Internet is predicted to exist as a seamless fabric of classic networks and networked objects by 2022. Many of the hurdles of the Smart City Project, such as digitization services, can easily be solved by IoT, as discussed by Ahmed and Rani [22].
Vehicular Ad Hoc Networks (VANETs) are self-organizing networks that can significantly improve traffic safety and travel comfort, as defined by Chaqfeh et al. [23]. As an emerging concept that extends the notion of VANETs, the Internet of Vehicles (IoV) has a bright future. For example, Umer et al. (2018) proposed a dual ring connectivity model for VANETs under heterogeneous traffic flow [24].

Collaborative Vehicle Routing Problem.
The Vehicle Routing Problem (VRP) is to find a plan for assigning vehicles such that demands are satisfied and the total mileage covered by the fleet is minimized [25]. Many related works have already been done in this field. Some considered various window constraints [26], while others applied intelligent algorithms [27,28].
In recent years, more and more companies have tended to reduce distribution costs by creating a collaborative environment on the Internet so as to share information and resources, such as customers, warehouses, and fleets [29]. Researchers have developed numerous modeling methods. For example, Yuan et al. (2017) proposed the K−2FMM approach to improve the accuracy of measuring travel time variability [30]. Wang et al. (2017) proposed a semi-nonparametric generalized multinomial logit model to extend the standard Gumbel distribution [31]. Tang et al. (2017) proposed an improved hierarchical fuzzy inference method based on a map-matching algorithm [32]. Al-Mayouf et al. (2018) proposed an intersection-based segment aware algorithm for geographic routing in VANETs [33]. Laha and Putatunda (2018) proposed three incremental learning methods for next pickup location prediction problems [34]. Xu and Cai (2018) proposed a variable neighborhood search algorithm for the consistent Vehicle Routing Problem [35].
The Collaborative Vehicle Routing Problem (CVRP) is to find the best solution for assigning vehicles such that demands are satisfied while comprehensive efficiency is maximized. It helps decrease the waiting time of passengers, but studies on it are far fewer than those on VRP.

Taxi Service Improving Problem.
Taxi service covers a variety of aspects, such as safety, convenience, efficiency, comfort, and economy. Utilizing large-scale GPS data to improve taxi services has become a popular research problem in the areas of data mining, intelligent transportation, geographical information systems, and the Internet of Things [36].
Above all, traffic safety is vital. Safety projects have become more and more popular all over the world, such as the zero-accident goal by 2050 in the EU, the Hazardous Material Cooperative Research Program in the United States, and the 5-year safety production plan in China [37]. Some studies have already focused on measures to enhance traffic safety. For example, Zou et al. (2018) proposed a developed empirical Bayes method and applied it to actual crash datasets [38]. Jain et al. (2018) proposed an algorithm to control traffic congestion and road safety for the Social Internet of Vehicles [39]. According to the evaluation results of a transportation system discussed by Lam [40], journey time was a significant factor in passenger waiting time when vehicle shortages occur. In addition, according to La et al. (2013), time pressure could affect crashes for taxi drivers [41].
In recent years, the rapid technical growth of autonomous vehicles has tended to accelerate the development of the autonomous taxi market. For example, an autonomous taxi service has come into use on a 4.5 km campus road at Seoul National University [42]. With the development of technology and policy, the autonomous taxi market has a bright future. Meanwhile, understanding the origin-destination distribution of taxi trips is very important for improving the quality of taxi services [43].
To sum up, some helpful works have already been done. However, few studies have proposed an MS Model or taken big data into consideration. Besides, autonomous vehicles need to be assigned continuous directives in time, while it is difficult for complex algorithms to respond immediately in a big data setting.
How can collaborative vehicle routing be achieved and taxi service improved by the MS Model based on the Internet of Vehicles? The next section introduces the methods and data used to explore this question.

Measurement Steps.
To solve this problem, 4 steps are necessary (see Figure 1). The first is acquiring data, including the demands of passengers and the supplies of taxis. The second is measuring distances between passengers and taxis based on their real-time locations. The third is working out modelled results based on the MS Model. The last is comparing original results with modelled results.
There are two key points in the measurement. One is the Distance Formula (see Section 3.2), and the other is the Motorcade-Sharing Model (see Section 3.3).

Distance Formula.
There are five steps in this section and the notations for formulas are in the Appendix (see Appendix A).
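The primary function of the Distance Formula is to calculate distances from GPS data. First, in order to describe the location of taxis in a mathematical way, the spherical (polar) coordinate system was established in Formula (1).

$$P(\rho, \theta, \varphi) \tag{1}$$

where $\rho \ge 0$, $0 \le \theta \le \pi$, and $0 \le \varphi < 2\pi$. The variable $\rho$ represents the radial distance. The variable $\theta$ represents the polar angle. The variable $\varphi$ represents the azimuthal angle.
Second, in order to calculate the distance between two locations, the rectangular coordinate system was established in Formula (2).

$$x = \rho \sin\theta \cos\varphi, \quad y = \rho \sin\theta \sin\varphi, \quad z = \rho \cos\theta \tag{2}$$

Suppose that there are two locations of taxis, called $P_1$ and $P_2$. According to Formulas (1) and (2) and $\rho = R$, their polar coordinates, $P_1(R, \theta_1, \varphi_1)$ and $P_2(R, \theta_2, \varphi_2)$, can be converted into rectangular coordinates, $P_1(R \sin\theta_1 \cos\varphi_1, R \sin\theta_1 \sin\varphi_1, R \cos\theta_1)$ and $P_2(R \sin\theta_2 \cos\varphi_2, R \sin\theta_2 \sin\varphi_2, R \cos\theta_2)$.
Third, in order to calculate the distance between two locations, the cosine of the angle $\alpha$ between vectors $\overrightarrow{OP_1}$ and $\overrightarrow{OP_2}$ was calculated by Formula (3).

$$\cos\alpha = \frac{\overrightarrow{OP_1} \cdot \overrightarrow{OP_2}}{|\overrightarrow{OP_1}| \, |\overrightarrow{OP_2}|} \tag{3}$$

Formula (3) can be simplified into Formula (4) and the derivation process is in the Appendix (see Appendix B).

$$\cos\alpha = \sin\theta_1 \sin\theta_2 \cos(\varphi_1 - \varphi_2) + \cos\theta_1 \cos\theta_2 \tag{4}$$

Fourth, according to the equation of arc length calculation, the distance $d$ between $P_1$ and $P_2$ was calculated by Formula (5).

$$d = R\alpha = R \arccos\left(\sin\theta_1 \sin\theta_2 \cos(\varphi_1 - \varphi_2) + \cos\theta_1 \cos\theta_2\right) \tag{5}$$

where $R = 6378.1370$ km, which represents the radius of the earth.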
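Formulas (4) and (5) translate directly into code. The following Python sketch is illustrative and is not the authors' implementation. It assumes that the polar angle equals 90° minus the latitude and that the azimuthal angle equals the longitude; the function name `great_circle_km` is hypothetical.

```python
import math

EARTH_RADIUS_KM = 6378.1370  # radius of the earth quoted in the text


def great_circle_km(lat1_deg, lon1_deg, lat2_deg, lon2_deg,
                    radius_km=EARTH_RADIUS_KM):
    """Arc length between two GPS points via the spherical law of cosines."""
    # Assumption: polar angle = 90 deg - latitude, azimuthal angle = longitude.
    theta1 = math.radians(90.0 - lat1_deg)
    theta2 = math.radians(90.0 - lat2_deg)
    phi1 = math.radians(lon1_deg)
    phi2 = math.radians(lon2_deg)
    # Formula (4): cosine of the central angle between the two position vectors.
    cos_alpha = (math.sin(theta1) * math.sin(theta2) * math.cos(phi1 - phi2)
                 + math.cos(theta1) * math.cos(theta2))
    # Clamp against floating-point drift so acos stays inside its domain.
    cos_alpha = max(-1.0, min(1.0, cos_alpha))
    # Formula (5): arc length = radius * central angle.
    return radius_km * math.acos(cos_alpha)
```

The arccosine form loses precision when two points are very close together, which can matter for positions sampled a minute apart; the haversine form is a common alternative in that case.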
In other words, the distance between two taxi locations can be calculated approximately from their longitudes and latitudes in the GPS data and the radius of the earth.
Suppose that the variable $D$ represents the total distance between two locations of taxis, the variable $D_o$ represents the total occupied distance between them, and the variable $D_u$ represents the total unoccupied distance between them. Then $D$ can be computed by Formula (7) [44].

$$D = D_o + D_u \tag{7}$$

Finally, the target of this study is to minimize the total distance, as indicated in Formula (8).

$$\min D = \min (D_o + D_u) \tag{8}$$

Suppose that the demand of passengers is invariable, which means that $D_o$ is constant. Then Formula (8) can be simplified into Formula (9).

$$\min D_u \tag{9}$$

Minimizing the total distance is therefore equivalent to minimizing the total unoccupied distance. In the next section, the MS Model is established to solve this problem.

Motorcade-Sharing Model.
(i) Proximity Strategy. The nearest unoccupied taxi has priority to pick up passengers immediately; in other words, passengers are picked up by the unoccupied taxi that is closest to them at that time.
(ii) Passivity Strategy. All unoccupied taxis pick up passengers passively on the basis of the Proximity Strategy above, and they stay parked after their passengers arrive. That is to say, only occupied taxis keep driving, while the others stay parked until needed.
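The Proximity Strategy can be sketched as a greedy matching. The Python example below is a minimal illustration under stated assumptions, not the authors' algorithm: passengers are served in request order, ties go to the first taxi listed, positions are hypothetical planar coordinates, and the function name `assign_nearest` is invented for this sketch.

```python
import math


def assign_nearest(passengers, idle_taxis, dist=math.dist):
    """Give each passenger, in request order, the closest taxi still unoccupied."""
    available = dict(idle_taxis)  # copy so the caller's mapping is untouched
    assignment = {}
    for pid, ppos in passengers.items():
        if not available:
            break  # no idle taxi left; remaining passengers keep waiting
        nearest = min(available, key=lambda tid: dist(available[tid], ppos))
        assignment[pid] = nearest
        del available[nearest]  # this taxi is now occupied
    return assignment


# Hypothetical planar positions (arbitrary units), not taken from the Sanya data.
passengers = {"Passenger1": (0.0, 0.0), "Passenger2": (5.0, 0.0), "Passenger3": (9.0, 9.0)}
taxis = {"Taxi1": (4.0, 3.0), "Taxi2": (1.0, 0.0), "Taxi3": (9.0, 8.0)}
print(assign_nearest(passengers, taxis))
```

With these positions, Taxi2 serves Passenger1, Taxi1 serves Passenger2, and Taxi3 serves Passenger3. Passing a great-circle function as `dist` would adapt the sketch to longitude and latitude inputs.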
The MS Model needs an online platform to exchange information between taxis and passengers. Taxis and passengers send their real-time locations to the online platform (see Figure 2(a)) and get their partners' information in return (see Figure 2(b)). Suppose that there are only 3 taxis and 3 passengers online (see Figure 4(a)). In the original setting, Taxi1 picks up Passenger1, Taxi2 picks up Passenger2, and Taxi3 picks up Passenger3. According to the MS Model, taxis and passengers send their real-time locations to the online platform (see Figure 3(a)) and get their partners' information in return (see Figure 3(b)). In the modelled result (see Figure 4(b)), Taxi3 picks up Passenger3 as usual. However, Taxi2 picks up Passenger1 because Taxi2 is closer to Passenger1 than Taxi1 is. Meanwhile, Taxi1 and Passenger2 keep waiting for a second. If there is still no better choice in the following seconds, Taxi1 picks up Passenger2 in the end. Autonomous taxis have no conflicts of interest and need an online platform to share information. Thus, the MS Model is especially suitable and useful for autonomous vehicles to improve comprehensive efficiency in the future.
When more than one taxi is online, those taxis can improve comprehensive efficiency by cooperating with each other through the MS Model. Modelled routing without overlapping appears to be more efficient than the original setting; however, evidence from real data is needed to verify this.

Data and Tool.
(ii) Data Processing. According to the original data, the locations of 2506 taxis in Sanya were recorded every 15 seconds from 9:00 a.m. to 9:59 a.m. on Nov. 15th, 2016, adding up to 766,042 records. First, the locations of each taxi were averaged every 60 seconds to improve the accuracy of the data; that is, the average location of each taxi was recorded once per minute. Second, flawed data was removed to ensure the integrity of the data. As a result, there were 2191 taxis and 131,460 records in the experimental data.

(iii) Implement Tool. Several kinds of software were adopted in this study. Excel was applied for original data processing and result storage. Matlab was applied for programming (see the file named "Programming.pdf" in the Supplementary Materials). Stata was applied for drawing.
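The per-minute averaging described under Data Processing can be illustrated with a short sketch. The authors state that their programming was done in Matlab, so the Python below is not their code. The record layout, the function name `average_per_minute`, and the cleaning rule (keep only taxis that have a position in every time bin) are assumptions made for illustration.

```python
from collections import defaultdict


def average_per_minute(records, bin_seconds=60, expected_bins=None):
    """Average (lon, lat) of each taxi within fixed time bins.

    records: iterable of (taxi_id, seconds_since_start, lon, lat).
    Returns {taxi_id: {bin_index: (mean_lon, mean_lat)}}.
    """
    # Running sums per taxi and bin: [sum_lon, sum_lat, count].
    sums = defaultdict(lambda: defaultdict(lambda: [0.0, 0.0, 0]))
    for taxi_id, t, lon, lat in records:
        acc = sums[taxi_id][int(t // bin_seconds)]
        acc[0] += lon
        acc[1] += lat
        acc[2] += 1
    averaged = {
        taxi_id: {b: (s_lon / n, s_lat / n) for b, (s_lon, s_lat, n) in bins.items()}
        for taxi_id, bins in sums.items()
    }
    if expected_bins is not None:
        # Assumed cleaning rule: drop taxis with any missing time bin.
        averaged = {k: v for k, v in averaged.items() if len(v) == expected_bins}
    return averaged


# Hypothetical records: taxi "A" covers two minutes, taxi "B" only one.
records = [
    ("A", 0, 109.50, 18.20), ("A", 15, 109.52, 18.22),
    ("A", 30, 109.54, 18.24), ("A", 45, 109.56, 18.26),
    ("A", 60, 109.60, 18.30), ("B", 0, 109.40, 18.10),
]
clean = average_per_minute(records, expected_bins=2)
```

The reported figures are consistent with one averaged record per taxi per minute, since 2191 taxis over 60 minutes give 131,460 records.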

Results
In this section, modelled results of 2191 GPS-equipped taxis in Sanya were compared with original results from the perspectives of path and efficiency.

Path Comparison.
We adopted scatter diagrams rather than line diagrams because the former are clearer. Taxi positions were located every minute at a fixed monitor. Thus, crowded districts appear as dark areas. To begin with, there was little visible difference between original results (see Figure 5(a)) and modelled results (see Figure 5(b)) in the path comparison of all 2191 taxis because of compression. However, disparities gradually appeared when focusing on specific local areas. Narrowing the scope to longitudes in the range of [109.4°, 109.6°] and latitudes in the range of [18.2°, 18.4°] gave a more specific result (see Figure 6). Finally, narrowing the scope much further, to longitudes in the range of [109.495°, 109.505°] and latitudes in the range of [18.245°, 18.255°], gave another specific result (see Figure 7).
It can be seen from Figures 5-7 that modelled paths were visibly smoother and less obstructed than original paths; thus, the MS Model helped alleviate urban traffic congestion. However, numerical comparisons are needed to quantify this.
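The zoomed views of Figures 6 and 7 amount to a bounding-box filter on the position records. A minimal Python sketch follows; the helper name `in_box` and the sample positions are hypothetical, while the ranges are the ones quoted in the text.

```python
def in_box(lon, lat, lon_range, lat_range):
    """True if a position lies inside the closed longitude/latitude ranges."""
    return (lon_range[0] <= lon <= lon_range[1]
            and lat_range[0] <= lat <= lat_range[1])


# Ranges quoted in the text for Figures 6 and 7: (lon_range, lat_range).
FIG6 = ((109.4, 109.6), (18.2, 18.4))
FIG7 = ((109.495, 109.505), (18.245, 18.255))

# Hypothetical (lon, lat) positions, for illustration only.
positions = [(109.50, 18.25), (109.45, 18.30), (109.70, 18.25)]
fig6_points = [p for p in positions if in_box(*p, *FIG6)]
fig7_points = [p for p in positions if in_box(*p, *FIG7)]
```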

Efficiency Comparison.
All 2191 taxis were divided into groups by their running distances. First, the efficiencies of running distances were compared (see Table 1). There were two main findings in Table 1. On the one hand, the modelled total distance (23,292.05 km) was 33.73% lower than the original total distance (35,147.37 km), which implied that the modelled scheme was more fuel-efficient and thus more economical. On the other hand, the modelled scheme reduced distances in groups above 20 km and increased them in groups below 20 km, which implied that the modelled scheme was more efficient across the fleet rather than for only part of it.
Second, the efficiencies of running taxis were compared (see Table 2). There were three main findings in Table 2. Above all, modelled taxis with 0 km (289) were about five times as many as original taxis with 0 km (58); this increase of 231 leisure taxis (10.54% of the 2191 taxis) implied that the modelled scheme was more fuel-efficient and thus more economical than the original. Furthermore, modelled taxis in groups above 60 km (5) were half the number of original taxis in groups above 60 km (10); this reduction of 5 overdriving taxis implied that the modelled scheme was safer than the original. Last but not least, the modelled scheme reduced the number of taxis in groups above 20 km and increased it in groups below 20 km, which implied that the modelled scheme was more efficient across the fleet rather than for only part of it.
It can be seen from Tables 1 and 2 that the modelled scheme yielded more fuel-efficient, more economical, and safer taxis in general compared with the original. Thus, the MS Model helped optimize urban taxi markets.
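The headline percentages can be reproduced from the totals quoted above. The short Python check below uses only numbers stated in the text; the variable names are chosen for this illustration, and the 10.54% figure is read here as a share of the 2191-taxi fleet.

```python
original_km, modelled_km = 35147.37, 23292.05   # total running distances
fleet = 2191                                    # taxis in the experimental data
idle_original, idle_modelled = 58, 289          # taxis with 0 km
over_original, over_modelled = 10, 5            # taxis above 60 km

saving_pct = (original_km - modelled_km) / original_km * 100
idle_gain_pct = (idle_modelled - idle_original) / fleet * 100
over_reduction = over_original - over_modelled

print(f"{saving_pct:.2f}% of total running distance saved")
print(f"{idle_gain_pct:.2f}% of the fleet newly idle")
print(over_reduction, "fewer taxis above 60 km")
```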

Discussion
In the comparisons of the last section, five factors were not considered; they are described as follows.
First, the Proximity Strategy has limitations. In reality, it is difficult for taxis to pick up passengers according to their relative positions even if information can be shared adequately. It required the online platform to keep good
Second, the Passivity Strategy has limitations. In reality, it is difficult for taxis to stay parked after passengers reach their destinations because there are other requirements (e.g., traffic control, refueling, and driver shift changes). Sometimes, it is more reasonable for taxis to approach busy areas rather than stay put. That is to say, the MS Model might have much lower efficiency because of traffic conditions.
Third, information sharing is essential for the MS Model. Several management plans for vehicular traffic (e.g., big data collection, cloud computing, platform construction, and autonomous techniques) should also be fully prepared. That is to say, the MS Model might have much lower efficiency because of facility failures.
Fourth, in most cities in China, morning rush hours are often centered in the period of 7:00-9:00 a.m. [45]. Thus, the experimental data in this study (9:00-9:59 a.m.) fell outside rush hours. Because traffic conditions may vary over different time periods, more data should be analyzed and presented in future research. That is to say, the MS Model might have much higher efficiency during rush hours.
Finally, this study focused on the efficiency comparison between the MS Model and the original. Thus, path optimizations were not considered, and taxis drove the original paths in the MS Model. That is to say, the MS Model might have much higher efficiency with optimized routes.
To sum up, the results could be made more accurate by improving the model or expanding the sample size. Nevertheless, the trend would not change: the modelled scheme would still result in higher efficiency than the original. In addition, the MS Model is suitable and useful for autonomous vehicles.

Conclusions
In this study, the authors proposed the Motorcade-Sharing (MS) Model for promoting taxi efficiency and adopted GPS-enabled taxi data from Sanya to make efficiency comparisons using the Distance Formula. Given the arguments above, the major findings of this article are as follows.
(i) The application of the MS Model could alleviate urban traffic congestion, and it resulted in less obstructed paths and more leisure taxis than the original (see Section 4.1).
(ii) The application of the MS Model could optimize urban taxi markets, and it resulted in more fuel-efficient, more economical, and safer taxis than the original (see Section 4.2).
(iii) The application of the MS Model could improve efficiency according to the statistical results (see Section 4.2), and it has a bright future in the field of taxis and other collaborative vehicles.
The contributions of this study are as follows: (i) A Distance Formula was built to calculate distances from GPS data based on mathematical equations. (ii) The Motorcade-Sharing Model was proposed to improve the efficiency of collaborative vehicles. (iii) It was verified that the modelled scheme had higher efficiency than the original.
Future research can proceed in two directions. One is improving the MS Model, and the other is expanding the sample size.

A. Notations for Formulas
See Table 3.

Figure 2: Information exchange design. (a) Information flows in online platform. (b) Information flows out from online platform.

Figure 4: Routing comparison of a hypothetical example. (a) Original vehicle routing. (b) Improved vehicle routing.

Figure 6: Path comparison of taxi positions located minutely at a fixed monitor where the longitude was in the range of [109.4°, 109.6°] while the latitude was in the range of [18.2°, 18.4°]. (a) Original positions. (b) Modelled positions.
(i) Data Source. Data in this study was collected from the big data platform Travel Cloud, which was developed by the Ministry of Transport of the People's Republic of China (see Data Availability). The GPS data of taxis in this study was provided by the Sanya Traffic and Transportation Bureau in Hainan Province, China. Major data items include vehicle ID, longitude, latitude, and time (see the file named "Data.xlsx" in the Supplementary Materials).

Table 1: Efficiency comparison of running distances. The data in the 2nd and 4th columns are accumulations of the corresponding distances (unit: km).

Table 2: Efficiency comparison of running taxis. The data in the 2nd and 4th columns are numbers of taxis with the corresponding distances.