Modeling and Prediction of Ride-Sharing Utilization Dynamics



Introduction
The increasing availability of portable technologies gives new fuel to studies on metropolitan transportation optimization, pushing urban design one step closer towards the long sought concept of "smart cities" [1,2]. Mobile devices and ubiquitous connectivity make it easier than ever to collect data on the way people live in cities, and big-data analytic methods facilitate the extraction of actionable insights from it. City administrators and policy makers can in turn act upon such results to enhance city management, channeling current advancements in data analysis for the immediate improvement of urban quality of life.
Many of the fundamental problems in big cities nowadays relate to cars. The high number of vehicles congests the streets, and vehicles standing in traffic jams increase air pollution and travel times, significantly raising passengers' stress levels. The availability of large-scale datasets, together with recent advancements in the analysis of big data and the development of novel models of human mobility, gives rise to new possibilities to study urban mobility.
Such new models include, for example, the work of [3], in which large-scale mobile phone data were analyzed in order to characterize individual mobility, showing that human travel patterns are far from random and are efficiently describable by a single spatial probability distribution. Similarly, [4] shows that mobile phone data can be used as a proxy to examine urban mobility, and [5] analyzes social network data of different cities to find that mobility highly correlates with the distribution of urban points of interest. Mobile technologies are also the enablers of many successful consumer applications, such as Waze [6], that provide traffic-aware city navigation by using data provided by the community. Alternative ways of moving in the city, such as autonomous mobility-on-demand and short-term car rental, have been identified among the possible solutions to the ever-growing transport challenge [7].
Ride sharing has the potential of improving traffic conditions by reducing the number of vehicles on the roads, reducing CO2 emissions and the fuel consumption per person, and giving the riders the opportunity to socialize with people (who otherwise would have been fierce "road competitors"). A recent study [8] shows that traffic in the city of Madrid can be reduced by 59% if people are willing to share their home-work commute ride with neighbors. Even if they are not willing to ride with strangers, but only with friends of friends (for safety reasons), the potential reduction is still up to 31%. Another recent study [9] has shown that on-demand route-free public transportation based on mobile phones outperforms standard fixed-route assignment methods when comparing traveling times. These results encourage the deployment of, and policies supporting, ride sharing in urban settings.
However, despite such evidence, the ride-sharing adoption rate in cities worldwide is slower than what can be expected given the clear benefits of ride-sharing [10,11]. One important reason, as suggested by [12][13][14] and others, is the uneven, and often unstable, potential benefits associated with ride-sharing. When the value that can be extracted from using a service such as Lyft [15], Uber [16], or Sidecar [17] is high in one part of the city but significantly lower in another neighborhood, or worse, suddenly decreases for a period of two days, potential users of the service are much more likely to opt for private car usage [18].
In this work we propose a data-driven framework to dynamically predict the impact, or potential utilization, of ride sharing in a city, at different times and in different regions. Specifically, the technique we propose provides both policy makers and ride-sharing operators with tools for assessing the future benefit of ride-sharing, encapsulated through the percentage of rides that can be saved by merging nearby departures and destinations. Simply put, a shared taxi service can use the proposed technique in order to know ahead of time what the ride-sharing demand is going to be (at various places in the city), whereas municipal services can dynamically change tolls and service fees in order to incentivize the use of ride-sharing in "low hours" that are predicted in advance.
Our method is based on analyzing the network features of the dynamic origin-destination (O-D) matrix as represented by data collected by various sources, such as mobile phone call records or sensors mounted on the taxis themselves. In our research, we show a clear correlation between such properties and the portion of "merge-able rides". We have analyzed the efficacy of our proposed network-oriented method using a dataset of over 14 million taxi trips taken in New York City during January 2013 [19].
This work is structured as follows: Section 2 presents an overview of the relevant related research in the field. In Section 3, we discuss the data and analytic methodologies that were used for this work: starting with the calculation of the average ride-sharing potential as a function of the maximum delay a taxi user would be willing to sustain, we demonstrate that more than 70% of the rides can be shared when users are willing to accept a delay of up to 5 minutes. We then demonstrate that urban ride-sharing potential is not only highly dynamic, but that it can also be predicted by analyzing the rides that took place in the city a few hours beforehand. We present a method for constructing a dynamically changing network from the taxi rides, and analyze the topological properties of this network (Section 4). We analyze the dynamics of these properties over time, and demonstrate our ability to accurately predict changes in the utilization of ride-sharing several hours in advance. Concluding remarks and suggestions for future work are contained in Section 5.

Related Work
Network features can signal, and are often used to predict, events or properties that are external to the network but influence it. A network can often be built on easily available data and serve as an important source for predictions regarding various (seemingly unrelated) events and large-scale decision-making processes [20][21][22]. Features of a phone call network can signal the occurrence of an emergency situation or predict trust among individuals [23], and specific behaviors in a Twitter account can identify a spammer [24]. Such discoveries have sparked the interest of researchers in different research fields, who could benefit from this new ability to model large-scale human dynamics. One of the fields most influenced by this evolving research thrust is the data-driven study of human mobility and its potential application for Intelligent Transportation Systems [11][25][26][27][28].
It has recently been shown that in trying to detect semantic network events (such as an accident or a traffic jam) it is crucial to understand the underlying structure of the network in which these events take place [29,30], the role of the link weights [31], as well as the response of the network to node and link removal [32]. Past research [33] has pointed out the existence of powerful patterns in the placement of links, or that clusters of strongly tied individuals tend to be connected by weak ties [31]. It was also shown that this finding provides insight into the robustness of the network to particular patterns of link and node removal, as well as into the spreading processes that take place in the network [34,35]. In addition, recent work has demonstrated the trade-off between the number of individuals (the width of the data) and the amount of information available from each one (the depth of the data), with respect to the ability to accurately model crowd behavior [36][37][38]. An analytical approach to this problem, discussing the (surprisingly large) amount of personal information that can be deduced by an "attacker" who has access to the meta-data of one's personal interactions, can be found in [39][40][41].
One of the first works that examined the statistical distribution of event appearance in mobility and communication networks found that these follow a power law [42], and that such a distribution is significantly affected by anomalous events that are external to the networks [43]. A method for filtering mobile phone Call Data Records (CDRs) in space and time using an agglomerative clustering algorithm in order to reconstruct the origin-destination urban travel patterns was recently suggested in [44].
Recent works analyzing data collected through the pervasive use of mobile phones have broadly supported the notion that most human mobility patterns are affected by a relatively small number of factors, are easily modeled, and are very predictable [4][45][46][47]. A comprehensive survey of the ride-sharing literature can be found in [48], and another recent relevant study, which developed a spatial, temporal, and hierarchical decomposition solution strategy for ride-sharing, is presented in [49].
To date, much of the research related to ride sharing has focused on understanding the characteristics of ride-sharing trips and users. In a recent survey of app-based, on-demand rideshare users in San Francisco, researchers found that 45% of ridesharers stated they would have used a taxi or driven their own car had ridesharing not been available, while 43% would have taken transit, walked, or cycled [50].
A recent work by Santi et al. [51] introduces a way of quantifying the benefits of sharing. The study applies to a GPS dataset of taxi rides in New York City and uses the notion of a shareability network to quantify the impact and the feasibility of taxi-sharing. When passengers have a 5-minute flexibility on the arrival time, and they are willing to wait up to 1 minute after calling the cab, over 90% of the sharing opportunities can be exploited and 32% of travel time can be saved. The authors have also shown that the problem is computationally tractable when sharing a taxi between two people with the option of en-route pick-up. Furthermore, sharing solutions involving more people are not tractable, but do not provide a significant improvement with respect to solutions involving only two people. Similar results have been demonstrated using a theoretical model analyzing an Autonomous Mobility On Demand system, demonstrating that a combined predictive positioning and ridesharing approach is capable of reducing customer service times by up to 29% [52].
An extensive simulation infrastructure for ride-sharing analysis is suggested in [53], allowing the initialization and tracking of a wide variety of realistic scenarios and monitoring the performance of the ride-sharing system from different angles, considering different stakeholders' interests and constraints. The simulation infrastructure is claimed to use an optimization algorithm that is linear in the number of trips and makes use of an efficient and fully parallelized indexing scheme.
In another study by Cici et al. [8], mobile phone data and social network data were used to estimate the benefits of ride sharing on the daily home-work commute. Mobile phone data are easier to collect than GPS traces and have a higher penetration, providing a good sample of a city's mobility. Social network data are used to study the effect of friendship on the potential of ride sharing, showing that if people want to travel only with friends then the expected ride-sharing benefits are negligible. On the other hand, when people are willing to ride with friends of friends, the achieved efficiency resembles that of the variant that also allows riding with strangers (implying that safety issues may have a significant effect on the actual success of a ride-sharing solution).
A similar study has been presented by [54] calculating shareability curves using millions of taxi trips in New York City, San Francisco, Singapore, and Vienna, showing that a natural rescaling collapses them onto a single, universal curve.
The authors presented a model that predicts the potential for ride sharing in any city, using a few basic urban quantities and no adjustable parameters. The issue of pricing policies in ride-sharing services has gained significant attention recently, with the booming expansion of commercial ride-sharing services such as Uber, Lyft, and others. The work of [55] studies dynamic pricing policies for ride-sharing platforms. As such platforms are two-sided, this requires economic models that capture the incentives of both drivers and passengers. In addition, such platforms support high temporal resolution for data collection and pricing.
The combination of the latter requires stochastic models that capture the dynamics of drivers and passengers in the system.
In [56] the authors highlight the impact of the demand pattern of the underlying network on the platform's optimal profits and aggregate consumer surplus. In particular, the authors establish that both profits and consumer surplus are maximized when the demand pattern is balanced across the network's locations. In addition, the authors show that profits and consumer surplus are monotonic with the "balancedness" of the demand pattern (as formalized by the pattern's structural properties). The work of [57] proposes a recommendation framework to predict and recommend whether and where ride-sharing users should wait in order to maximize their chances of getting a ride. In the framework, a large-scale GPS data set generated by over 7,000 taxis in a period of one month in Nanjing, China was utilized to model the arrival patterns of occupied taxis from different sources.
The recent work of Alexander and Gonzalez [11] uses smartphone data in order to model the behavior of an urban population in Boston, in an attempt to assess the impact of an efficient ride-sharing service on the urban traffic, and specifically on the expected levels of congestion. This data-centric approach leads to a highly accurate modeling of the mobility patterns in the city. However, much like most of the recent work on this subject, the researchers have followed an aggregative modeling approach that tries to find the static, long-term, definitive mobility patterns, purposely omitting any dynamic fluctuations.
In another study, researchers from the Microsoft Research Center [58] analyzed the ride data of 12,000 taxis during 110 days in order to model the mobility patterns of potential passengers. Using this probabilistic model, the researchers were able to build a recommendation system for taxi drivers that would maximize their profits (yielding a 10% improvement in overall profits) and a second recommendation system for passengers, advising them where to turn in order to maximize their chances of finding a vacant taxi (with 67% accuracy). A similar study can be found in [59].
A recent review of dynamic ridesharing systems [60] focused on the optimization problem of finding efficient
As a first step in modeling the feasibility and efficiency of ride-sharing schemes using taxi rides in New York City, a comprehensive understanding of the data itself is required. How do the rides distribute over the various geographic locations? Are there patterns that emerge when observing the O-D matrix of the various rides? Can we use those in order to predict the destinations of passengers when they board a taxi at a certain location? The figures below attempt to answer some of the
Power Law distribution) strongly implying the potential of a network-centric approach as the method of choice with respect to the modeling of the dynamics of the data. Some of the following illustrations analyzing the dataset's statistical properties were first presented in our previous publication [70]. These illustrations appear here to contribute to the reader's understanding of the nature of the data and the behavior dynamics it encapsulates.
Figure 1 reports the distribution of rides per day of the week and per hour of the day. As can be seen in the figure, the number of rides has a far-from-uniform time distribution. More specifically, the number of rides is higher in the middle of the week and lower during the weekend. In addition, the daily rides distribution peaks, as expected, in the morning hours and around 6-7 pm.
We use the set of taxi ride records to construct a "rides network" $G_{t_1,t_2}$, comprising $|V|$ nodes representing equally sized square regions of New York City, and a set of $|E|$ edges, such that each edge $(u,v) \in E$ corresponds to a connection between two regions $u,v \in V$ if and only if there exists at least one ride from region $u$ to region $v$ in the time frame referred to by the network. Such a connection exists if and only if a ride started at some time $t$, departing at $u$ and reaching $v$, or vice versa, such that $t_1 \le t \le t_2$ is contained in the time period defined for the network $G_{t_1,t_2}$.
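The construction of the rides network for a given time window can be sketched as follows. This is a minimal illustration, not the authors' implementation: the `Ride` record layout and the function name `build_rides_network` are hypothetical, rides are assumed to be already mapped to tile indices, and edges are treated as undirected, following the "or vice versa" clause of the definition.

```python
from collections import Counter
from typing import Iterable, NamedTuple


class Ride(NamedTuple):
    """One taxi ride, already mapped to tiles (hypothetical record layout)."""
    pickup_time: float  # seconds since the start of the dataset
    origin: int         # tile index of the pick-up location
    destination: int    # tile index of the drop-off location


def build_rides_network(rides: Iterable[Ride], t1: float, t2: float) -> Counter:
    """Return the edge weights of the rides network for the window [t1, t2].

    An edge (u, v) exists iff at least one ride with t1 <= pickup_time <= t2
    connects tiles u and v in either direction; its weight is the ride count.
    """
    edges = Counter()
    for ride in rides:
        if t1 <= ride.pickup_time <= t2:
            # Store each undirected edge under a canonical (small, large) key.
            edge = (min(ride.origin, ride.destination),
                    max(ride.origin, ride.destination))
            edges[edge] += 1
    return edges


rides = [Ride(10, 1, 2), Ride(20, 2, 1), Ride(30, 1, 3), Ride(500, 3, 4)]
network = build_rides_network(rides, t1=0, t2=60)
nodes = {tile for edge in network for tile in edge}
```

In this toy example the ride starting at second 500 falls outside the window, so tile 4 does not appear in the resulting network.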
As we create edges only based on rides that took place during a certain period of time, the network may change (and quite significantly so) for various values selected for $t_1$ and $t_2$. As the time period defined by these values increases the
matches between passengers and drivers. This ride-matching optimization problem determines vehicle routes and the assignment of passengers to vehicles considering the conflicting objectives of maximizing the number of serviced passengers, minimizing the operating cost, and minimizing passenger inconvenience. Another study [61] presented an algorithm that increases the potential destination choice set for ride-sharing schemes by considering alternative destinations that are within given space-time budgets.
On a similar note, a recent study [62] analyzed the potential benefits of introducing meeting points in a ride-sharing system. With meeting points, riders can be picked up and dropped off either at their origin and destination or at a meeting point that is within a certain distance from their origin or destination. The increased flexibility results in additional feasible matches between drivers and riders, and allows a driver to be matched with multiple riders without increasing the number of stops the driver needs to make. A similar approach for the optimization of such meeting points was discussed in [63].
The challenge of ride matching was also discussed in works such as [64,65] or [66], which have demonstrated that 2,000 vehicles (15% of the taxi fleet in New York) with a capacity of 10 passengers (or 3,000 vehicles with a capacity of 4 passengers) can serve 98% of the New York taxi demand within a mean waiting time of 2.8 minutes and a mean trip delay of 3.5 minutes.
A path-merging approach, which instead of merging rides to and from the same locations calculates new paths that go through the same locations as the original trips, in the same order, and thus improves the ability to merge rides, was discussed in [67].
A recent theoretical study [68] tackled the combinatorial optimization of the ridesharing matching problem by proving the equivalence between classical centroid clustering problems and a special case of set partitioning called metric k-set partitioning. An efficient expectation-maximization algorithm was then used to achieve a 69% reduction in total vehicle distance, as compared with no ridesharing.
A fully decentralized reputation-based approach is discussed in [69], using a peer-to-peer architecture to provide a self-assembling ride-sharing infrastructure capable of functioning with no central authority or regulator.

Dataset and Methodology
Our analysis was performed using a dataset of 14,776,615 taxi rides collected in New York City over a period of one month (January 2013) [19]. Each ride record consists of the following fields: pick-up time, pick-up longitude, pick-up latitude, drop-off longitude, drop-off latitude, number of passengers per ride, average velocity, and overall trip duration. Time granularity is second-based, and positional information has been collected via GPS technology by the data provider. From this raw data sample, we omit records containing missing or erroneous GPS coordinates, as well as records that represent rides that started or ended outside Manhattan, yielding a cleaned dataset containing 12,784,243 rides.
the degrees of the nodes of the network $G$. The "degree" of a node $v \in V$ is the number of nodes $v$ is connected to through edges in $E$, where such nodes represent the actual destinations that passengers who boarded a taxi at location $v$ chose to go to. The degree of a node therefore represents the number of possible destinations a passenger boarding a taxi at location $v$ may choose to go to. An important observation is that the popularity of a node, as reflected both by its in-degree (i.e., the number of origins passengers depart from in order to get to $v$) and by its out-degree (i.e., the number of destinations passengers leaving $v$ may go to), is independent of the geographic size or shape of node $v$, as all nodes refer to equally sized square regions.
Interestingly, analyzing the distribution of this property reveals that whereas there are some nodes with a high degree (probably corresponding to main train stations or large administration facilities), the vast majority of the nodes have a very low degree. In other words, for the vast majority of the locations in New York, it is extremely easy to predict the destination of a passenger starting a ride there (as a low degree implies a low number of possible destinations, and a high chance of guessing the correct one). This observation is quite remarkable, as it implies that taxi users are much more predictable than may seem. Indeed, it seems that when one boards a taxi, one's destination can quite accurately be predicted.
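The degree distribution discussed above can be computed along the following lines. This is a sketch under stated assumptions: trips are given as (origin tile, destination tile) pairs, edges are treated as directed so that the out-degree counts distinct destinations, and the function names are hypothetical.

```python
from collections import Counter, defaultdict


def out_degrees(trips):
    """Map each origin tile to the number of distinct destination tiles."""
    destinations = defaultdict(set)
    for origin, destination in trips:
        destinations[origin].add(destination)
    return {origin: len(dests) for origin, dests in destinations.items()}


def degree_histogram(degrees):
    """Count how many origins share each out-degree value."""
    return Counter(degrees.values())


trips = [(1, 2), (1, 3), (1, 2), (2, 1), (3, 1), (4, 1)]
degrees = out_degrees(trips)
histogram = degree_histogram(degrees)
```

A histogram dominated by small degree values corresponds to the situation described in the text, where most origins lead to only a handful of destinations.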
Specifically, in 24% of the possible origins of a taxi ride in New York City, the number of possible destinations of a passenger leaving these origins is on average 5, and in 43% of the origins it is 10. A quick calculation yields that if at some point in time we would pick a random person just boarding a taxi anywhere in New York, we would have more than 7.5% of
network is expected to contain more edges, with the densest network received for $G = G_{-\infty,\infty}$ being the network that is based on the complete aggregation of all the rides. In order to encapsulate the traffic properties of a certain point in time $t$ we would observe the time period surrounding $t$. Similarly, in order to analyze the network dynamics, that is, the way it changes over time, we would analyze the evolution of the network properties for networks created in nonidentical, yet partially overlapping, time periods.
This methodology is extensively used in Section 4.
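The partially overlapping time periods mentioned above can be generated with a simple sliding window. The sketch below is illustrative only: the window length and step are hypothetical parameters, not values taken from the paper.

```python
def sliding_windows(start, end, length, step):
    """Yield (t1, t2) windows of the given length, advanced by step.

    Consecutive windows overlap whenever step < length.
    """
    if length <= 0 or step <= 0:
        raise ValueError("length and step must be positive")
    t1 = start
    while t1 + length <= end:
        yield (t1, t1 + length)
        t1 += step


# Two-hour windows advanced by one hour over a six-hour span (in seconds).
windows = list(sliding_windows(0, 6 * 3600, length=2 * 3600, step=3600))
```

Each (t1, t2) pair would then define one network, so that the evolution of a network property can be tracked from one window to the next.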
For different granularities of city partitioning (reflected through the use of different sizes of the square regions), different ride networks would be produced. However, network theory implies that changing this parameter would not affect the existence of various mathematical invariants such as the network's "scale-freeness" or its expected small diameter [71], but would rather mainly change the sparsity of the network and its number of nodes. During this work we have examined several sizes of square regions, ranging from 0.0156 square miles to 1 square mile, obtaining similar results. The analysis below is based on square tiles of 0.39 square miles (i.e., 1 square kilometer). In such a case, when taking $G = G_{-\infty,\infty}$, the network that aggregates all the rides, it comprises 813 nodes and 58,014 edges. Figure 2 illustrates the geographical distribution of the nodes on the map of New York. Figure 3 illustrates the distribution of the number of trips on the various O-D routes in the taxi network. By weight we refer to the number of trips that took place through this edge, and by frequency we refer to the number of edges that have a specific weight. Note the small number of edges that have more than 500 rides (approximately 5,000 edges out of 58,000 edges). Similarly, over 47,000 edges have fewer than 50 rides passing through them. This observation coincides well with the fact that human mobility is known to follow a power-law distribution [3].
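The partition of the city into equally sized square tiles can be sketched as a mapping from GPS coordinates to a tile index. This is only an illustration: the reference corner (`ref_lat`, `ref_lon`), the equirectangular approximation, and the function name `tile_of` are assumptions made here, not details given in the paper.

```python
import math

KM_PER_DEG_LAT = 111.32  # approximate length of one degree of latitude in km


def tile_of(lat, lon, ref_lat=40.49, ref_lon=-74.26, tile_km=1.0):
    """Return the (row, col) index of the square tile containing (lat, lon).

    Uses a flat-earth (equirectangular) approximation around a hypothetical
    south-west reference corner, which is adequate at city scale.
    """
    km_per_deg_lon = KM_PER_DEG_LAT * math.cos(math.radians(ref_lat))
    row = math.floor((lat - ref_lat) * KM_PER_DEG_LAT / tile_km)
    col = math.floor((lon - ref_lon) * km_per_deg_lon / tile_km)
    return row, col


origin_tile = tile_of(40.49, -74.26)
north_tile = tile_of(40.49 + 1.5 / KM_PER_DEG_LAT, -74.26)
```

Changing `tile_km` corresponds to changing the granularity of the partition, which alters the number of nodes and the sparsity of the resulting network.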
As we analyze the network properties of the graph implied by the taxi rides, it is interesting to observe the characteristics of
(1) The routing-agnostic scheme is significantly less sensitive to temporary changes in the infrastructure, such as detours, traffic jams, accidents, and so on.
(2) Merging rides based only on their origin and destination makes our ride-sharing policy entirely agnostic to the routing decision of the driver.Alternatively, the approach that is based on allowing rides to be merged even if they do not leave from the same origin, but are rather partially overlapping, depends on the assumption that the route of the "containing ride" indeed passes through the origin of the second ride.
This assumption in turn depends on either perfectly guessing the routing decisions of the driver, or on the ridesharing service dictating those decisions to the driver.
(3) As a result, our routing-agnostic approach is also expected to be easier to implement in real-life scenarios, as it requires less cooperation from the drivers.
(4) In addition, the increased simplicity of the routing-agnostic approach makes it easier to optimize from a computational point of view. The routing-aware approach discussed in [51] has a time complexity of $O(n^2 \log(n))$ when merging pairs of rides [72], becomes much harder when triple-ride merging is allowed [73], and eventually becomes computationally infeasible for larger numbers of rides to be merged [51].
(5) When comparing the merging efficiencies of our proposed routing-agnostic approach with the routing-aware one, it is shown that whereas the latter is slightly more efficient when long wait times are allowed (increasing our proposed 73% sharability to 93% for a 5-minute maximal delay), the improvement for shorter wait times becomes significantly smaller (this is illustrated by comparing Figure 5 to Figure 3 in [51]).
Figure 6 shows the probability density function (pdf) of the number of rides per edge. As can be seen from the figure, the distribution is heavy tailed and seems to follow a
guessing precisely his or her destination. This probability is about three times higher than that of rolling "Snake Eyes" (two 1's with a pair of six-sided dice). See Figure 4 for more details.
In this context, it is also important to note that in this work we are less interested in the specific characterization of nodes having high (or low) degrees, but rather in the dynamics those values represent over time, as discussed in detail in the following sections.
In order to analyze the "sharability", or the ability to merge rides using the same vehicle at overlapping times, we applied a simplified version of the methodology used by Santi et al. [51] to calculate the potential benefits of ride sharing: Let $T_i = (o_i, d_i, t_i^o, t_i^d)$, $i = 1, \ldots, k$, be $k$ trips, where $o_i$ denotes the origin of the trip, $d_i$ the destination, and $t_i^o$, $t_i^d$ the starting and ending times, respectively. We say that multiple trips $T_i$ are shareable if there exists a route connecting all of their origins $o_i$ and destinations $d_i$ in any order where each $o_i$ precedes the corresponding $d_i$.
Sharability, or 'ridesharing utilization', is expressed in terms of the number of rides that can be 'merged', as a function of the guaranteed quality of service, expressed through the number of latency minutes acceptable to the passengers: the maximum time delay in catching a ride and arriving at the destination, representing the maximum discomfort that a passenger can experience using the service. In other words, given a predefined level of discomfort passengers are willing to undertake (expressed as a prolonged wait time), the ride-sharing utilization depicts the portion of rides that are redundant and can be saved by merging with other rides to and from the same locations.
Our analysis aims at finding pairs of rides, which are represented in the network by the same edge (i.e., have the same origin and destination), that can be shared. For each edge, we examine its corresponding set of originating rides, and count the number of ride pairs that can be merged, taking into consideration the maximum time delay parameter.
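The per-edge counting of mergeable ride pairs described above can be sketched as follows. This is a minimal illustration rather than the authors' procedure: it assumes rides are (pickup time, origin tile, destination tile) tuples, pairs consecutive pick-ups on the same directed edge greedily, and lets each ride join at most one pair. Because the text does not state here which ratio is reported, both the fraction of rides travelling in a merged pair and the fraction of vehicle trips eliminated are computed.

```python
from collections import defaultdict


def count_merged_pairs(rides, max_delay):
    """Count ride pairs that share an edge and start within max_delay seconds.

    rides: iterable of (pickup_time, origin_tile, destination_tile) tuples.
    Each ride joins at most one pair (pairwise merging only).
    """
    pickups_by_edge = defaultdict(list)
    for pickup_time, origin, destination in rides:
        pickups_by_edge[(origin, destination)].append(pickup_time)

    pairs = 0
    for times in pickups_by_edge.values():
        times.sort()
        i = 0
        while i + 1 < len(times):
            if times[i + 1] - times[i] <= max_delay:
                pairs += 1  # merge two consecutive pick-ups on this edge
                i += 2
            else:
                i += 1
    return pairs


rides = [(0, 1, 2), (120, 1, 2), (1000, 1, 2), (60, 3, 4)]
pairs = count_merged_pairs(rides, max_delay=300)
shared_fraction = 2 * pairs / len(rides)  # rides that travel in a merged pair
saved_fraction = pairs / len(rides)       # vehicle trips eliminated
```

Sweeping `max_delay` over a range of values would produce a curve of the kind the text attributes to Figure 5.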
The main difference between our approach and the one discussed in [51] is that we only merge rides that leave the same origin 'tile' and go to the same destination 'tile'. There are several advantages for this approach:
Figure 4: The distribution of node degrees in the taxi rides network, representing the number of possible destinations a passenger boarding a taxi at some location in the city may choose to go to. Note the surprisingly high number of origins with very low degrees (number of possible destinations). The nodes' degrees are unaffected by the size or shape of the actual geographic region they refer to, as all nodes refer to equally sized square patches of the city.
Journal of Advanced Transportation
some cases, even the merging of two rides at a time might have resulted in overcrowding of the vehicle.
In order to assess the effect of these two potential phenomena on our analysis, we can observe the distribution of the number of passengers per trip in the data. While doing so, we artificially segregate trips made using private taxi cabs (that can board up to 4 passengers) and trips made with larger vehicles (capable of boarding from 5 to 48 passengers): We examine two approaches for the assessment of the actual theoretical ride-sharing utilization.
Greedy merging, assuming an even distribution of number of passengers: in this approach, we analyze the merging process in a two-phase greedy approach. In the first phase, we assume that all the original trips that can be merged are indeed merged, and are done so under the assumption that the number of passengers is distributed approximately uniformly, with respect to the various geographic locations. Then, the resulting merged trips are merged again, if possible.
This analysis approach should result in a lower bound for the actual ride-sharing utilization, as in real life our ride-matching algorithm would aspire to maximize the number of merged rides, where possible.
Optimal merging: in this approach we assume that whenever two rides are merged, the number of passengers they have receives the value that would result in the most efficient merging scheme possible (confined to the overall distribution of the numbers of passengers for rides). This analysis approach should result in an upper bound for the actual ride-sharing utilization, as in real life there will be times where the only way to merge rides would be in a suboptimal way.
Following is a detailed analysis of both approaches:
Greedy merging: the expected distribution of the merged trips for the first phase would be:
(i) In 24.23 percent of the pairs, we would merge a trip that has 1 passenger with a trip that has 1 passenger. This results in a merged trip of 2 passengers.
(ii) In 23.84 percent of the pairs, we would merge a trip that has 1 passenger with a trip that has 2 passengers. This results in a merged trip of 3 passengers. These trips cannot be merged, assuming the greedy 2-step approach.
(iii) In 15.48 percent of the pairs, we would merge a trip that has 1 passenger with a trip that has 3 passengers. This results in a merged trip of 4 passengers, that cannot be further merged.
(iv) In 5.87 percent of the pairs, we would merge a trip that has 2 passengers with a trip that has 2 passengers. This results in a merged trip of 4 passengers, that cannot be further merged.
(v) In 30.58 percent of the pairs, we would not be able to merge the trips, as these would be pairs that either
power-law. In other words, most of the edges (i.e., pairs of origin-destination) induce a small number of rides, while a small number of edges induce an extremely high number of rides.
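The first-phase breakdown of the greedy merging analysis, items (i) to (v) above, follows the pattern of drawing two passenger counts independently and checking whether their sum fits in one cab. The sketch below illustrates that calculation under explicit assumptions: the independence of the two draws, the capacity of 4, the function name, and in particular the passenger-count distribution `example_dist` are all hypothetical and are not the dataset's actual values.

```python
from itertools import combinations_with_replacement


def pair_type_probabilities(passenger_dist, capacity=4):
    """Probability of each unordered passenger-count pair that fits in one cab.

    Returns (merged, unmergeable), where merged maps (a, b) with a <= b to the
    probability of drawing that pair, and unmergeable is the remaining mass.
    Assumes the two rides' passenger counts are independent draws.
    """
    merged = {}
    for a, b in combinations_with_replacement(sorted(passenger_dist), 2):
        if a + b <= capacity:
            p = passenger_dist[a] * passenger_dist[b]
            # Unequal counts can occur in two orders, equal counts in one.
            merged[(a, b)] = p if a == b else 2 * p
    unmergeable = 1.0 - sum(merged.values())
    return merged, unmergeable


# Illustrative distribution only; not taken from the taxi dataset.
example_dist = {1: 0.50, 2: 0.25, 3: 0.15, 4: 0.10}
merged, unmergeable = pair_type_probabilities(example_dist)
```

With a capacity of 4, the only mergeable combinations are (1, 1), (1, 2), (1, 3), and (2, 2), which matches the four mergeable cases enumerated in the list.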
Figure 5 presents the percentage of shareable rides as a function of the maximum time delay parameter. Results are encouraging: more than 70% of the rides can be shared when passengers can accept a delay of up to 5 minutes. As expected, the benefit of ride sharing increases when the passengers are willing to tolerate greater discomfort, and the percentage of shareable rides is more than 90% when passengers can wait 30 minutes or more.
It should be noted that the simplified analysis illustrated in Figure 5 assumes that two rides that took place at the same time can always be merged, regardless of the number of passengers in each ride. Since the average number of passengers per ride is 1.7 and most of the rides involve a single passenger, the number of saved rides could have been even higher by merging more than 2 rides at a time. On the other hand, in

ride-sharing utilization of the current supply and demand scheme (as appears in Section 4.2), as well as (b) serve as a prediction method for estimating changes in this utilization, in the near future, up to a few hours (as shown in Section 4.3).

The Need for Dynamic Ridesharing Optimization and Prediction.
Mainstream transportation analysis models (such as [74][75][76][77][78] and many more) approach the problem of transportation forecasting and analysis through the use of long-term data aggregation. Simply put, the dominating approach today sees the accurate approximation of the "steady state", or "average state", of the transportation system as the most efficient way to understand the behavior of the system, and to use this understanding in order to reach better decisions [79]. Such decisions are often concerned with the locations, type, or size of new infrastructures that should be built, large-scale budget investment alternatives, or long-term policy revisions [80].
When examining the rapidly expanding field of ridesharing, this approach suffers from an inherent limitation, as it is not well suited to the nature of the decisions ridesharing operators and regulators are required to make. As ridesharing uses existing roads and metropolitan infrastructure, does not require setting fixed-place stations or fixed paths, and often uses existing vehicles, it is mostly located "outside" the realm of these analysis methodologies. Furthermore, ridesharing introduces a new set of factors that traditional methods usually cannot easily cope with, such as dynamic changes in fares, which may significantly influence network properties such as global congestion [81].
Analyzing ridesharing using the existing models would be inefficient at best. Taking the static approach using a long-term aggregation of the supply and demand would inevitably result in a model that is optimized for the average states of the rides network, ignoring its inherent volatility (caused by daily and weekly patterns as well as by irregular spikes created by events such as street parties, sports events, etc.).
Interestingly, as shown in Section 4.2, the dynamic rides network spends only an extremely small portion of the time in those average network states. Furthermore, our analysis demonstrates that overlooking the dynamic nature of the traffic scheme disregards the vast majority of the network states, as manifested in the O-D matrix, as well as the possible ridesharing utilization of it. Specifically, this phenomenon is demonstrated in Figure 7, which reveals that the system spends approximately 33% of the time in states that have a potential utilization of either 50% above the monthly average, or 50% below it.
Ignoring this dynamic nature of the urban rides system through the use of a static analysis model (which is the mainstream approach of today) will be inherently limited in its efficiency. The key to unlocking the development of effective next generation ridesharing systems, therefore, lies in an analysis that is rooted in the understanding of its dynamic nature, and the way to use it in order to develop proactive strategies that dynamically adapt their forecast using an ad-hoc analysis of the network's state.
(a) have one of the trips with 4 passengers, or (b) have a trip with 2 passengers and a trip with 3 passengers, or (c) have two trips with 3 passengers each.
The second phase will, therefore, be able to merge another 0.2423 ⋅ 0.2423 ⋅ 100 = 5.87 percent of the original pairs, which reflects a 5.87 ⋅ 2 = 11.74 percent increase. Overall, this would sum up to 100 − 30.58 + 11.74 = 81.16 percent of the naive potential utilization (namely, the utilization that is calculated under the assumption that all rides are merge-able, and that we do not merge more than two rides).
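The greedy two-phase arithmetic above can be reproduced with a short sketch. It assumes that rides are paired independently at random according to the passenger distribution reported for the dataset, that a vehicle holds at most 4 passengers, and that the second phase merges two first-phase trips of 2 passengers each; the function names are hypothetical.

```python
from itertools import product

# Share of trips by passenger count, as reported for the dataset.
PASSENGER_SHARE = {1: 0.4922, 2: 0.2422, 3: 0.1572, 4: 0.1084}
CAPACITY = 4  # assumed maximal number of passengers per merged trip


def pair_outcomes(share, capacity=CAPACITY):
    """Probability of each merged size for a random pair; key None = not mergeable."""
    outcomes = {}
    for (a, p_a), (b, p_b) in product(share.items(), repeat=2):
        size = a + b if a + b <= capacity else None
        outcomes[size] = outcomes.get(size, 0.0) + p_a * p_b
    return outcomes


def greedy_lower_bound(share, capacity=CAPACITY):
    """Fraction of the naive potential utilization kept by greedy 2-step merging."""
    first_phase = pair_outcomes(share, capacity)
    unmergeable = first_phase[None]
    # Second phase: two merged trips of 2 passengers can be merged once more.
    second_phase = first_phase[2] * first_phase[2]
    return 1.0 - unmergeable + 2.0 * second_phase


if __name__ == "__main__":
    print(round(100 * greedy_lower_bound(PASSENGER_SHARE), 2))
```

Working from the unrounded shares gives a value within a few hundredths of a percent of the 81.16 percent quoted in the text; the small gap comes from rounding the intermediate percentages.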
Optimal merging: assuming an optimal merging scheme we can calculate the merging of the relevant New York City data as follows: (i) The 10.84 percent of the rides that have 4 passengers cannot be merged at all. (ii) The 24.22 percent of the rides that have 2 passengers would be merged among themselves. (iii) The 15.72 percent of the rides that have 3 passengers would be merged with a matching 15.72 percent of the rides that have 1 passenger. (iv) This would leave another (49.22 − 15.72 =) 33.5 percent of the rides, which have 1 passenger. These rides would be merged in a 4-to-1 ratio, virtually implying a 33.5 ⋅ 1.5 = 50.25 percent saving.
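The tally in items (i)-(iv) can be checked with a few lines of Python. This is only a sketch of the arithmetic as stated: the factor of 1.5 for the remaining single-passenger rides is taken directly from the text rather than derived, and the function name is hypothetical.

```python
def optimal_upper_bound(share_percent):
    """Sum the optimal-merging contributions, in percent of all rides.

    share_percent: dict mapping passenger count -> percent of rides.
    Assumes there are at least as many 1-passenger rides as 3-passenger rides.
    """
    ones = share_percent[1]
    twos = share_percent[2]
    threes = share_percent[3]

    merged_twos = twos                # 2-passenger rides merged among themselves
    merged_threes = threes            # 3-passenger rides, each matched with ...
    matching_ones = threes            # ... an equal share of 1-passenger rides
    remaining_ones = ones - threes    # single-passenger rides still unmatched
    merged_singles = 1.5 * remaining_ones  # weight of 1.5 as used in the text

    return merged_twos + merged_threes + matching_ones + merged_singles


if __name__ == "__main__":
    shares = {1: 49.22, 2: 24.22, 3: 15.72, 4: 10.84}
    print(round(optimal_upper_bound(shares), 2))
```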
Altogether, the actual optimal theoretical utilization would sum up to 24.22 + 15.72 + 15.72 + 50.25 = 105.91 percent (namely, under the assumption of optimal merging the benefit from merging 4 rides of a single passenger more than compensates for the loss due to rides with 4 passengers). Therefore, the actual theoretical utilization for the New York City taxi dataset, denoted as $U^{*}$, would be bounded by:

$$0.8116 \cdot U \le U^{*} \le 1.0591 \cdot U \qquad (1)$$

such that $U$ is the potential utilization that is calculated throughout this work, using the method that was described above, ignoring the effect of multiple merges, as well as the effect of over-population of rides.

Analyzing the Dynamic Ride-Sharing Network
In the previous section we have described the taxi data that were used for this study, illustrated various mathematical properties of these, and discussed the way they can be analyzed for the purpose of assessing the potential ability of ride-sharing schemes to merge rides between similar locations (denoted as the ride-sharing potential utilization). In this section we demonstrate the inability of static analytic approaches to efficiently model this utilization and suggest an alternative approach that is based on the construction of multiple network-snapshots, derived using a sliding-window based aggregation of the taxi rides. We show that this technique can serve as a valuable methodology for both (a) assessing the potential

properties of this dynamic network, which we show are not only highly correlated with the potential ride-sharing utilization at the corresponding points in time, but can also predict the utilization a few hours ahead of time.
We divide the rides dataset into hourly aggregated snapshots, creating 31 × 24 = 744 sub-networks, each denoted by $G_{t,t+1}$, such that $t$ represents the $t$-th hour in the month. An illustration of one such sub-network is shown in Figure 8. Intuitively, we see that most of the nodes are highly connected, but a considerable number of nodes are connected to only one other node in the network.
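The hourly partition can be sketched as follows. The record layout (origin node, destination node, pick-up time in seconds since the start of the month) and the function name `hourly_snapshots` are assumptions for illustration only; each snapshot is stored as a mapping from an origin-destination edge to the number of rides on it.

```python
from collections import defaultdict

SECONDS_PER_HOUR = 3600
HOURS_IN_MONTH = 31 * 24  # 744 hourly sub-networks for a 31-day month


def hourly_snapshots(rides, hours=HOURS_IN_MONTH):
    """Split rides into one edge-weight map per hour of the month.

    rides: iterable of (origin, destination, pickup_time_s) tuples, with the
    time measured from the start of the month (hypothetical layout).
    """
    snapshots = [defaultdict(int) for _ in range(hours)]
    for origin, destination, pickup_time_s in rides:
        hour = int(pickup_time_s // SECONDS_PER_HOUR)
        if 0 <= hour < hours:  # ignore rides that fall outside the month
            snapshots[hour][(origin, destination)] += 1
    return snapshots
```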
Similarly to Figure 5, in which the potential benefit of ride-sharing over the entire data was shown, we have performed the same calculation for every hourly network separately. Figure 9 presents the average potential ride-sharing

A potential example for this approach can be found in [82], containing a computational study aimed at identifying environments in which the use of "dedicated drivers" is most useful. As urban supply and demand environments are constantly (and significantly) changing (as demonstrated in our analysis of the New York taxi data), it is likely that a strategy that detects the times when the use of such drivers is most efficient and, upon such detection, launches these drivers to supply the demand (this can be done using a dynamic change in the commission drivers are required to pay, giving such drivers a temporary priority on certain roads, or forbidding them from granting service on a regular basis except when their service is required) would achieve superior performance compared to a static strategy that does not react to such changes.
Another example can be the work of [83], in which the size of a carsharing fleet is optimized in order to maximize the monetary operational savings. Again, such an approach reaches the global optimization assuming a static approach, whereas the incorporation of the dynamic nature of the system could yield a significant improvement. This could be done for example by allowing the fleet operators to dynamically use the services of a public service (such as Uber or Lyft), rented cars, or private drivers. Using such services when needed would allow operators to reduce the ongoing basic cost.

Dynamic Network Analysis.
As discussed in previous sections, one of the main hurdles that prevents the wide adoption of ride-sharing might be the high volatility of its potential utilization, and its extreme unpredictability. In this section, we propose to mitigate this problem by using a dynamic network that represents the evolving travel patterns in the city. That is, a multitude of rides-networks, representing data of fixed-length periods of time, each starting at a different, equally spaced point in time. Such a "sliding window" approach is useful for tracking changes in various

(4) Average Betweenness Centrality: each node in the network has a computable betweenness centrality score [84], representing the portion of "shortest paths" between all the node-pairs in the network that pass through it. Formally, for a network node $v \in V$ this is defined as:

$$g(v) = \sum_{s \neq v \neq t} \frac{\sigma_{s,t}(v)}{\sigma_{s,t}}$$

where $\sigma_{s,t}$ is the total number of shortest paths from node $s$ to node $t$ and $\sigma_{s,t}(v)$ is the number of those paths that pass through $v$. Averaging these values yields an estimation of the network's efficiency, with respect to the number of nodes whose adequate availability is required in order to preserve the network's ability to maintain efficient flow without increasing the length or durations of trips between arbitrary points [28,85].
(5) Average Closeness Centrality: the closeness centrality of a node [86] is a measure of centrality in a network, calculated as the reciprocal of the sum of the lengths of the shortest paths between the node and all other nodes in the graph. Thus the more central a node is, the closer it is to all other nodes. For a node $v \in V$, the measure is defined as:

$$C(v) = \frac{1}{\sum_{u \in V} d(v,u)}$$

where $d(v,u)$ is the length of the shortest path between $v$ and $u$. Averaging the closeness centrality over all the network's nodes yields an estimation of the compactness of the network, that is, how short the travel is between an arbitrary pair of network nodes.

(6) Average Eigenvalue Centrality: eigenvalue centrality [87] (also called eigencentrality or eigenvector centrality) is a measure of the influence of a node in a network. It assigns relative scores to all nodes in the network based on the concept that connections to high-scoring nodes contribute more to the score of the node in question than equal connections to low-scoring nodes.
For a given graph $G = (V, E)$ with an adjacency matrix $A$, the centrality score of a node $v \in V$, denoted as $x(v)$, is defined as

$$x(v) = \frac{1}{\lambda} \sum_{u \in M(v)} x(u)$$

where $M(v)$ is the set of the neighbors of $v$ and $\lambda$ is the graph's largest positive real eigenvalue. This can be accurately estimated by taking the $v$-th component of the eigenvector that corresponds to the largest positive real eigenvalue.
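As an illustration of how the largest eigenvalue and the associated centrality scores can be estimated, the following sketch uses power iteration on an undirected, connected graph given as a neighbor map. The function name, the input layout, and the choice of iterating with the adjacency matrix plus the identity (to avoid oscillation on bipartite graphs) are assumptions of this sketch, not a description of the tooling used in the study.

```python
import math


def eigenvector_centrality(neighbors, iterations=1000, tol=1e-12):
    """Estimate the largest adjacency eigenvalue and the eigenvector centrality.

    neighbors: dict mapping each node to the set of its neighbors
    (assumed undirected, connected and non-empty).
    Returns (eigenvalue, dict of node -> score with unit Euclidean norm).
    """
    nodes = list(neighbors)
    x = {v: 1.0 for v in nodes}
    for _ in range(iterations):
        # One step with A + I: same eigenvectors as A, eigenvalues shifted by 1.
        y = {v: x[v] + sum(x[u] for u in neighbors[v]) for v in nodes}
        norm = math.sqrt(sum(value * value for value in y.values()))
        y = {v: value / norm for v, value in y.items()}
        converged = max(abs(y[v] - x[v]) for v in nodes) < tol
        x = y
        if converged:
            break
    # Rayleigh quotient x^T A x (x has unit norm) recovers the eigenvalue of A.
    eigenvalue = sum(x[v] * sum(x[u] for u in neighbors[v]) for v in nodes)
    return eigenvalue, x
```

Averaging the returned scores over all nodes gives the kind of network-level feature described in item (6), and the returned eigenvalue is the quantity tracked over time in the later analysis.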
The use of eigenvalues to analyze propagation phenomena over networks can be seen for example in [88], where its usability for predicting the epidemic potential of viruses is demonstrated.
We use a linear regression to fit each of these features to the calculated potential utilization, as well as a multiple linear regression to fit the potential utilization to the entire set of network properties. As can be seen in Figure 10, these features show a

utilization taken on all hourly networks, as a function of the maximal delay allowed (notice that this is in fact a lower bound, since we artificially prevent passengers from being merged with rides "outside" their hourly network). It can be seen that this produces a lower utilization than the previous calculation using the overall aggregation (approximately 10% decrease), caused by the fact that each pair of nodes has a lower probability of being connected.
We now extract a set of six common network properties for each traffic-network $G_{t,t+1}$, to be used as the feature values representing each network. These features encapsulate various topological aspects of the network and enable us to project each hourly collection of traffic data (containing a large and a priori unknown number of rides) into a single coordinate in a 6-dimensional feature-space.

The figure is based on the result presented in [70].
matrix over time. The use of eigenvalues to analyze propagation phenomena over networks can be seen for example in [88,89], where its usability for predicting the epidemic potential of viruses (both human and computer-based) is demonstrated. Additional mathematical analysis on the role of eigenvalues in the analysis of network structures can be found in [90]. This property, known to encapsulate various behavioral characteristics of the people whose mobility patterns the network is depicting, displays a clear (and easy to predict and understand) daily pattern, on top of which significant and erratic spikes are added, as can be seen in Figure 12. These spikes seem to appear sporadically, lacking any clear patterns or internal regularity, implying again the need for understanding the dynamic aspects of the network.

Now, let us perform a similar analysis over the potential ride-sharing utilization, looking at its evolution over time. The results of this analysis, presented in Figure 7, clearly demonstrate dynamics similar to those of the two network properties mentioned earlier. Specifically, it can be seen that alongside the dominating daily pattern (and a weaker, but still easy to see, weekly one), there are clear changes in the potential utilization.
These changes take various shapes and forms, from a sudden decrease in the daily peak (as can be seen around $t$ = 1400), to changes in the intra-weekly peaks (the first week analyzed showing a 'U-shaped' form among its days, the second week showing an equal-peaks dynamics, and the third week showing an extremely high Monday and Tuesday, and weaker Wednesday, Thursday and Friday), and others. Surprisingly, the magnitude of these changes may even exceed the dominating daily pattern. For example, the change between the first Tuesday (around $t$ = 200) and the third Tuesday ($t$ = 4100) is 90% compared to the monthly average, whereas the average change in potential utilization between workdays and weekends is only 70%.
high correlation with the potential utilization for this hourly network (the figure reports the adjusted $R^2$ to account for the different number of predictors).
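For readers who want to reproduce this kind of comparison, the sketch below shows an ordinary least-squares fit for a single feature together with the standard adjusted $R^2$ correction, $1 - (1 - R^2)(n - 1)/(n - p - 1)$ for $n$ samples and $p$ predictors. The function names and the toy inputs are hypothetical, and the multiple-regression fit itself is not shown.

```python
def fit_simple_linear(x, y):
    """Ordinary least-squares slope and intercept for one predictor."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = s_xy / s_xx
    return slope, mean_y - slope * mean_x


def r_squared(y, y_hat):
    """Coefficient of determination of predictions y_hat against observations y."""
    mean_y = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, y_hat))
    ss_tot = sum((a - mean_y) ** 2 for a in y)
    return 1.0 - ss_res / ss_tot


def adjusted_r_squared(r2, n_samples, n_predictors):
    """Penalize R^2 for the number of predictors used in the model."""
    return 1.0 - (1.0 - r2) * (n_samples - 1) / (n_samples - n_predictors - 1)
```

With many samples and few predictors the correction is small, which is why a single-feature fit and the combined fit can be compared on the same scale.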

Ride-Sharing Potential Prediction.
In the previous section, we have shown that the monthly rides can be partitioned into hourly aggregative snapshots, each with different characteristics (and specifically, network-oriented ones) and different ridesharing potentials. In addition, we have demonstrated the correlation between these network properties and the ridesharing potentials of the rides from which the corresponding networks are derived (as appears in Figure 10). In this section, we discuss whether this correlation can also be used for predictive purposes. Specifically, can we deduce from the current values of various network properties how the ridesharing potential will change compared to its current value?
In order to do so, we first analyze the evolution of various network properties of the hourly aggregative rides network $G_{t,t+1}$ over time. Figure 11 illustrates the evolution of the mean nodes' degree of the rides network as a function of time (that is, the average over all of the network's nodes' degrees, for all the dynamic hourly networks). For the sake of clarity, we have increased the time granularity used in the analysis, so that the hourly networks are now generated with 5-minute intervals, thus significantly overlapping, and subsequently generating a smoother and easier to read graph. The change from the monthly average of the mean degree as a function of time is portrayed, clearly showing a dominant daily pattern. However, on top of this pattern we can see significant hourly fluctuations, tens of percent in magnitude. This reveals the existence of strong volatility in the rides dynamics alongside the predicted daily and weekly dynamics.
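The overlapping windows and the "change from the monthly average" can be sketched as below. The helper names are hypothetical, and the sketch assumes that only windows lying entirely inside the month are kept.

```python
def sliding_window_starts(total_s, window_s=3600, step_s=300):
    """Start times (in seconds) of all windows that fit inside [0, total_s]."""
    return list(range(0, total_s - window_s + 1, step_s))


def relative_to_mean(series):
    """Express each value as its relative deviation from the series average."""
    mean = sum(series) / len(series)
    return [(value - mean) / mean for value in series]


if __name__ == "__main__":
    month_s = 31 * 24 * 3600
    starts = sliding_window_starts(month_s)
    print(len(starts))  # number of hourly windows at 5-minute spacing
```

Applying `relative_to_mean` to a feature such as the mean degree, evaluated once per window, yields a series of the kind plotted in Figure 11.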
A similar dynamics is observed when analyzing the evolution of the largest eigenvalue of the rides-network's adjacency

Figure 10: Adjusted $R^2$ of the correlation between seven features of the hourly rides network and the potential ride-sharing utilization for this network. Most features have low quality of fit, but the combined mixture of all seven results in a remarkably high correlation ($R^2$ = 0.82). Features are (1) the number of nodes in the network, (2) the number of edges, (3) the averaged degree, (4) the averaged betweenness centrality, (5) the averaged closeness centrality, and (6) the averaged eigenvector centrality.

compared to the rides between $t$ and $t+1$. That is, the change in the momentary ride-sharing utilization between "now" (time $t$) and "in an hour" (time $t+1$). It is easy to see that this representation reveals a clear and strong negative correlation between the two.
Trying to increase our lookahead and predict the change in the dynamic ride-sharing utilization over a 2-hour timeframe, Figure 14 illustrates the correlation between the value of the largest eigenvalue of the rides network at time $t$ and the change in the potential utilization between time $t$ (aggregated to $t+1$) and $t+2$ (aggregated to $t+3$). Again, a clear strong negative correlation is easily visible. For example, in times where the value of the largest eigenvalue of the rides network is smaller than 0.012, the potential ride-sharing utilization was

At this point, we ask the following question: "can we find a statistical correlation between current values of the rides network properties and future values of the potential ride-sharing utilization?". This question is of interest, as such a correlation would allow us to predict future changes in the potential utilization, providing valuable tools for ride-sharing users, operators, and regulators alike.
We first address this question by comparing network property values at time $t$ with the potential utilization at time $t+1$ (1 hour prediction). Figure 13 presents an example of such a comparison, in the form of a scatter plot showing for each point in time $t$ a dot whose X-axis is the mean nodes' degree of the network $G_{t,t+1}$ and whose Y-axis is the change in the potential utilization of the rides between $t+1$ and $t+2$

(400 m and 800 m, denoting the pick-up and drop-off distances that still allow rides to be merged), 3 values of time tolerance (30 s, 2 minutes and 5 minutes, denoting the time passengers would be willing to wait in order to merge their rides) and 3 values of prediction horizon (no prediction, 1 hour prediction and 2 hours prediction). The results include a scatter plot of the data, effects of the various properties, ANOVA, and other statistical analyses as appearing in Supplementary Figures 17-34.
The effectiveness of the prediction as a function of the prediction horizon (i.e., the distance between the point in time where the prediction is calculated and the point in time this

statistically guaranteed (during the month of the observation) to significantly increase in the coming 2 hours. Similarly, a largest eigenvalue of 0.014 would indicate a significant decrease in the ride-sharing potential within the next 2 hours.
Figures 13 and 14 are based on the analysis of the first 3 weeks of January 2013. These observations were then validated using the last week of January, as can be seen in Figures 15 and 16.
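The pairing of a current network property with a future change in utilization, and the split into a three-week analysis period and a held-out last week, can be sketched as follows. The function names, the assumption of one value per hour, and the use of the Pearson coefficient as the correlation measure are illustrative choices of this sketch.

```python
import math


def lagged_pairs(feature, utilization, horizon=1):
    """Pair feature[t] with the change utilization[t + horizon] - utilization[t]."""
    last = min(len(feature), len(utilization)) - horizon
    return [(feature[t], utilization[t + horizon] - utilization[t])
            for t in range(last)]


def pearson(pairs):
    """Pearson correlation coefficient of a list of (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    s_xx = sum((x - mean_x) ** 2 for x, _ in pairs)
    s_yy = sum((y - mean_y) ** 2 for _, y in pairs)
    return s_xy / math.sqrt(s_xx * s_yy)


def split_by_hour(pairs, train_hours=21 * 24):
    """First three weeks for analysis, the remainder for validation."""
    return pairs[:train_hours], pairs[train_hours:]
```

A strongly negative coefficient on the analysis period that persists on the held-out week would correspond to the behavior reported for Figures 13-16.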
Having demonstrated the predictive power of the dynamic network's properties with respect to the network's future ride-sharing potential, we can now construct a multiple linear regression model that would fit all of these 6 properties. We have created 18 models, for 2 values of distance tolerance

Supplementary Figures 41-46, created for a scenario with a distance tolerance of 800 meters, a time tolerance of 5 minutes, and a prediction horizon of 2 hours.
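The 18 model configurations follow from the full combination of 2 distance tolerances, 3 time tolerances, and 3 prediction horizons. A minimal sketch of enumerating that grid is shown below; the dictionary keys and the function name are hypothetical.

```python
from itertools import product

DISTANCE_TOLERANCE_M = (400, 800)   # pick-up / drop-off distance tolerance
TIME_TOLERANCE_S = (30, 120, 300)   # 30 s, 2 minutes, 5 minutes
PREDICTION_HORIZON_H = (0, 1, 2)    # no prediction, 1 hour, 2 hours


def model_grid():
    """Enumerate every combination of the three experiment parameters."""
    return [
        {"distance_m": d, "time_s": s, "horizon_h": h}
        for d, s, h in product(DISTANCE_TOLERANCE_M, TIME_TOLERANCE_S,
                               PREDICTION_HORIZON_H)
    ]
```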

Summary and Future Work
As the popularity of ride-sharing systems grows, their user base gradually transforms from early adopters to mainstream consumers. Whereas the first are characterized by a keen affection

prediction refers to) is illustrated in Supplementary Figures 35-40, showing the $R^2$ of the model (both ordinary and adjusted) as a function of the time horizon (between 0 and 12 hours), for several values of distance tolerance and time tolerance. It can clearly be seen that in general (and as expected) the accuracy of the model decreases as the prediction horizon increases (that is, when the model tries to predict the behavior of the system further into the future).
The effect of each feature, depicted by the adjusted response plot for its various values, is presented in

(an NP-hard optimization problem) in real-time and developed heuristics to quantify potential ride-sharing demand.
These algorithms reroute trips in order to match them with similar, overlapping trips, explicitly capturing demand for ridesharing relative to passengers' willingness to experience prolonged travel time. However, finding an optimal solution to this problem is not computationally plausible (even under extreme limitations of the problem's space [94]), and even the calculation of approximation heuristics would be computationally intense when done ad-hoc. Therefore, the ability to use current traffic dynamics in order to predict properties of an efficient near-future ride-sharing scheme, such as the method we propose in this work, can be used to make this process significantly more efficient [95,96].
Future work should focus on the analysis of the correlation we find in this paper, trying to detect traces of possible causalities. Are network properties merely correlated with ride-sharing utilization, or do they possess an active influence over it? Evidence of the latter would enable us to offer urban designers and policy makers an innovative tool for encouraging and facilitating the adoption of ride-sharing systems. Alternatively, incentives and fees could be better moderated, used as "remedies" in the case of a change in the travel patterns, in order to balance it and maintain a sustainable ride-sharing paradigm. Another approach could be the pipelining of the dynamic ride-sharing utilization forecast as the input of models intended to predict the benefits of ride-sharing on the overall traffic [97].
Recent works have demonstrated the benefit of tracking the network's dynamics in order to improve collaborative decision making [98,99].A possible continuation of the current work can analyze ride-sharing optimization as a case of decentralized decision-making process, using the technique that is presented here.
As the prediction of future ride-sharing potential is ultimately needed for optimization purposes (of the overall travel time, congestion or any other utilization metric) of a dynamic coverage problem, comparing the performance of any proposed method to the theoretical results that are available for various types of such decentralized collaborative coverage challenges (see [100][101][102][103][104][105] and specifically [106]) can also be of value.
Finally, as our suggested approach is agnostic to the actual route taken by the drivers, it would be interesting to see whether the introduction of ride-sharing affects additional factors such as detours (which for a merged ride may become cost-effective), usage of toll-routes, etc.

Data Availability
The taxi data used to support the findings of this study, encompassing a dataset of over 14 million individual taxi trips taken in New York City, are accessible at the NYC Taxi repository [19].

Conflicts of Interest
The authors declare that they have no conflicts of interest.
for innovative solutions that are powered by cutting-edge technologies and aim to disrupt the governing paradigm in the field, the latter are often interested mainly in the advantages these services can offer them with as small a change in their habits as possible. With respect to ride sharing, these new users are willing to sustain far less wait-time and are far more susceptible to inconvenience than their preceding tech-savvy innovation-hungry early users. The key to a scalable mature ride-sharing infrastructure is found in the level of service such systems will provide, mainly measured by the availability of vehicles when they are needed. Alas, the availability maximization is immediately linked to a reduction in the financial savings that the service can offer. In other words, a further expansion of ride-sharing is being constrained, among others, by the ability to offer high utilization, defined as the ability to "merge" similar rides in a way that would not require the passengers to sustain more than a minimal delay in their trips.
This optimization problem was extensively discussed in previous literature (a comprehensive literature review can be found in Section 2). However, the conventional approach to this problem assumed a static environment which needs to be optimized. By finding the optimal number of cars, or optimal pricing policy, the efficiency (or potential) of the system was assumed to be calculable in a robust way: a key component in the decision of operators where to deploy new systems, in the design of relevant urban legislation by municipal policy makers, and of course in the likelihood of passengers to use these services.
In this work we discussed the dynamic nature of ride-sharing systems. Specifically, we were interested in whether ride-sharing utilization is stable over time (which coincides with the implicit assumption of most previous works in this field) or undergoes significant and often rapid changes (which would imply the inherent inefficiency of schemes assuming a static nature). We modeled the ride-sharing utilization using the known New York Taxi dataset and clearly showed that it is highly dynamic, and that any system designed for the "average" utilization would be highly inefficient.
We then showed that, assuming a dynamic approach, the taxi data can be modeled as a sequence of data-snapshots, resulting in a dynamic traffic-network model. Several recent works have shown that network features can effectively be used to predict a variety of events and properties, e.g., emergency situations, individuals' personality and spending behaviors [91,92]. We used a similar technique in order to project the taxi data into a feature space comprised of topological features of the dynamic network implied by this traffic. This (dynamic) feature space is then used to model the dynamics of ride-sharing utilization over time.
Using this approach we were able to demonstrate a clear correlation between the utilization of the ride-sharing system over time and several topological features of the network it creates.In addition, we demonstrated that the potential benefit of ride sharing expressed as the percentage of rides that can be shared with a limited discomfort for riders can also be predicted a few hours in advance.Such prediction can be used as a tool for an accurate short-term forecasting of the ride-sharing potential in cities and metropolitan areas.
Researchers in [8,51,93] and others have focused on addressing the computational challenges of trip-matching

Figure 1:
Probability Density Function (PDF) of the number of rides per day of week/hour of day. Afternoon peaks are centered on average around 7 pm.

Figure 2:
Illustration of the rides network $G$, portrayed on the map of NYC. It can be seen that the network has high density through the city, with a few empty spots in Staten Island.

Figure 3:
Edge weights of the taxi rides network $G$, denoting the number of trips per edge (namely, between every two nodes in the city). The X-axis denotes the number of trips per edge (representing a pair of origin-destination nodes), and the Y-axis (shown in a log scale) represents the number of edges that have such a number of trips.

(i) 49.22 percent of the trips have 1 passenger. (ii) 24.22 percent of the trips have 2 passengers. (iii) 15.72 percent of the trips have 3 passengers. (iv) 10.84 percent of the trips have 4 passengers.

Figure 5:
Percentage of merged rides (for the entire network).

Figure 6:
Probability Density Function of the number of rides per edge.

Figure 7:
Dynamics of the potential ride-sharing utilization over time. X-axis denotes the time, given in 5-minute granularity. Y-axis denotes the change of the potential utilization compared to its monthly average.

Figure 8:
An illustration of the rides sub-network $G_{144,145}$, denoting the structure that is implied by the aggregation of the rides between the 144-th and the 145-th hour of the month.

(1) Number of Nodes: the number of nodes in the network $G$, denoted as $|V|$, representing the number of unique pick-up and drop-off locations of rides made during this time window. Note that although all the networks refer to the same dataset, and the same geographic environment, different networks may have different values of $|V|$, since at different time segments different locations may be "active".
(2) Number of Edges: the number of edges in the network $G$, denoted as $|E|$, representing the number of unique pick-up to drop-off pairs of rides made during this time window. This is also the number of nonzero elements of the temporal O-D matrix that is derived from this network.
(3) Network Density: the average degree of the network's nodes, defined as $|E|/|V|$. This property represents the average number of unique drop-off locations per pick-up location (and vice versa) and is associated with the predictability of rides made during this time window, and is also related to the system's entropy.
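The first three features can be computed directly from the list of rides in a time window. The sketch below treats each ride as a directed (pick-up, drop-off) pair, which is an assumption about how edges are counted; the function name and the returned keys are hypothetical.

```python
def basic_features(rides):
    """Number of nodes, number of edges and density of one time window.

    rides: iterable of (pickup_node, dropoff_node) pairs; repeated pairs
    count as a single edge.
    """
    edges = set(rides)
    nodes = {node for edge in edges for node in edge}
    n_nodes = len(nodes)
    n_edges = len(edges)
    density = n_edges / n_nodes if n_nodes else 0.0  # |E| / |V|
    return {"nodes": n_nodes, "edges": n_edges, "density": density}
```

For example, a window with rides a→b (twice), b→c and c→a has 3 nodes, 3 unique edges and a density of 1.0.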

Figure 9:
Potential of ride-sharing utilization, measured as the percentage of potentially merged rides (averaged over all subnetworks), as a function of maximal delay agreeable by the passengers.

Figure 11:
Dynamics of the mean degree of the rides network nodes. X-axis denotes the time, given in 5-minute granularity. Y-axis denotes the change of the mean degree of the network compared to its monthly average.

Figure 12:
Dynamics of the largest eigenvalue of the rides-dynamic network over time. X-axis denotes the time, given in 5-minute granularity. Y-axis denotes the change of the largest eigenvalue of the network compared to its monthly average.

Figure 13:
Change in potential ride-sharing utilization (Y-axis), 1 hour prediction, as a function of the mean degree of the rides network (X-axis).

Figure 14:
Change in potential ride-sharing utilization (Y-axis), 2 hours prediction, as a function of the largest eigenvalue of the rides network (X-axis).

Figure 15:
Change in potential ride-sharing utilization (Y-axis), 1 hour prediction, as a function of the mean degree of the rides network (X-axis), created for the last week of the data.

Figure 16:
Change in potential ride-sharing utilization (Y-axis), 2 hours prediction, as a function of the largest eigenvalue of the rides network (X-axis), created for the last week of the data.