A Space-Time Model for Demand in Free-Floating Carsharing Systems

A novel modelling approach is proposed to estimate the spatiotemporal distribution of demand for free-floating carsharing. The proposed model is based on a Poisson regression model for right-censored data and estimates possibly time-varying demand rates of small subareas of a service region based on booking data with spatiotemporal information on pickups and dropoffs of cars. The approach allows operators to gain insights into the spatiotemporal distribution of demand for their service and to estimate the loss of demand due to unavailability of cars. Moreover, it can be used as an input to improve the design of the service through relocation techniques or to analyze the service with macrosimulation models. In addition, the approach is applied to a case study with real data.


Introduction
Carsharing is a collaborative mode of transportation that, if used appropriately, can improve urban transport services from a user and environmental perspective. Among the environmental impacts that have attracted the attention of scientists are the reduction in vehicle kilometers travelled [1], emission of pollutants (according to [2], up to 56% reduction), energy consumption [2], and congestion [3]. Among the social impacts, the reduction in the number of privately owned vehicles has been highlighted (according to [2], up to 13 vehicles could be replaced with one shared car).
Carsharing may be classified into station-based and free-floating systems [4]. In station-based systems, users start and end their trip at stations distributed within the service region. In free-floating systems, in turn, users pick up a vehicle parked near the origin of their trip using an app for booking and end their trip by dropping the vehicle at some chosen parking place within the service region. In comparison, station-based carsharing seems to be easier to operate because vehicles are distributed in a few known locations, while free-floating carsharing offers the user higher flexibility, and therefore, vehicles are spread throughout the service region.
Due to its flexibility, free-floating carsharing often suffers from a mismatch between the positioning of supply and the orientation of demand; i.e., the dropoff places do not correspond to where other users want to pick up a car [5]. Therefore, to operate free-floating carsharing efficiently by positioning vehicles where demand exists, it is crucial to know how demand is distributed across the service region. The contribution of this article is to develop an approach specifically suited to estimate the spatiotemporal distribution of demand in free-floating carsharing systems.
The challenges that carsharing research faces may be grouped into the following three categories: (i) the definition of a transport system that is congenial to the needs of users (trying to discover what these needs are), (ii) analysis of the environmental impact of this service, and (iii) the economic efficiency of the supply service (aligned with the expected demand). Demand is a crucial input for tackling any of these challenges, and therefore, we believe that a sound demand estimation model is essential for improving research on free-floating carsharing systems.
In this paper, we develop an approach to estimate the spatiotemporal distribution of demand in a local free-floating carsharing system. Thereby, we define the total demand as the number of cars that would have been booked in the presence of an infinite number of available cars distributed across the service region, i.e., as the number of car pickups (observed demand) plus the loss of demand due to the unavailability of cars. However, this problem is complex, as free-floating carsharing systems deal with significant fluctuations in demand, depending on daytime and the area of a city [6], and demand is stochastic since it varies even under identical circumstances. In addition, demand can hardly ever be measured directly. In most cases, we only have data on effective bookings, while data on prematurely cancelled bookings are often unstructured or incomplete. Therefore, the total demand needs to be estimated based on incomplete data, using additional assumptions on how people search and decide for booking cars and advanced statistical techniques. Our approach requires the following four inputs, which we believe to be available in most cases: (i) position and time of available cars and pickup place and time, (ii) study area divided into cells and time divided into intervals, with shape and dimension of the cells and the interval length being the model inputs, (iii) assumptions on how users search and decide for booking a car (although our approach supports various assumptions, we will assume that users start searching with a preferred pickup place in mind and book only if a car is available within a circle around that place, with the radius of that circle being the model input), and (iv) specification of a function for how the total demand varies across time, including unknown parameters to be estimated from the data. This time function can include month, weekday, and daytime effects.
The remaining part of the paper is structured as follows. In Section 2, the relevant literature is reviewed. The postulated statistical model to estimate the spatiotemporal distribution of the total demand is described in Section 3. In Section 4, the approach is tested on real data from a service in a major city in Switzerland, and Section 5 summarizes the work together with the conclusions.

Related Research
The last 10 years of research have seen great growth in interest in vehicle sharing [7,8], but there are still a lot of questions that need to be addressed. The main reason for these unanswered questions is that vehicle sharing is still missing flexible service strategies that can maintain a high level of service and guarantee long-term profitability. In this section, we summarize the literature on demand forecast methods for vehicle-sharing services.
This section is organized according to five approaches. First, research that combines demand forecasting with relocation strategies is discussed. Then, studies using multiagent simulation tools are described, followed by studies based on the stated preferences of users. Afterwards, we focus on studies that use statistical models to estimate demand, and finally, selected research using neural networks and techniques for censored data is presented. The section ends with the identification of the research gap that we try to fill with this work.

Demand Estimation Combined with Relocation Strategies.
To be successful, carsharing requires the availability of resources (in terms of available vehicles and available parking spots) in the proximity of the desired origin and destination of a trip to keep the service attractive [9]. As introduced earlier, our approach yields an estimate of the loss of demand that could be used as an input for a free-floating user-based relocation system, a solution for rebalancing the stochasticity of the cars' spatial distribution by orienting them towards the expected demand. The relocation problem is the most studied problem on the supply side of carsharing and has its first references in [10,11]. Further information about user-based relocation approaches can be found in [12][13][14][15][16].
Some papers include both a relocation strategy and its related demand estimation methodology. Wang et al. [17], for example, adapt a model of logistical inventory management to forecast demand and relocate vehicles in a one-way carsharing system. This work focuses on the specific forecasting method. Cucu et al. [18] check the balance between the stations, investigating them in relation to the time of departure, the day of the week, the weather conditions, and the traffic conditions associated with their addresses. Stokkink and Geroliminis [9] develop a user-based vehicle relocation approach through the incentivization of customers and a predictive model for the state of the system based on Markov chains, following the concepts of the previous work of Repoux et al. [19]. This approach is specifically designed for one-way station-based carsharing systems, which are different from our free-floating case. In this approach, the input demand for the Markov chains is computed using the approximation method described by Raviv and Kolka [20]. Their method assumes that the arrival processes of renters and returners are nonhomogeneous Poisson processes and estimates the rates using an approximation of a user-defined function. Jian et al. [21] develop a discrete choice model that includes vehicle availability as a parameter that directly affects the user's mode choice. In this way, supply and demand are strictly linked together. The model aims to determine the optimal relocation decisions to maximize the carsharing profit. The decision variables are as follows: (i) number of vehicles relocated from node i to j at time t, (ii) number of vehicles available at node i at time t, (iii) number of users booking one-way trips from node i to j at time t, and (iv) number of users booking round trips from node i to j at time t. This work has an interesting supply-demand focus, but it is not applied to free-floating carsharing, and as input, it also needs the total travel demand of each origin-destination pair at each time step (sum of demand from carsharing and other transport services).

Activity-Based Multiagent Simulations.
A large fraction of current platforms used for demand estimation of new one-way carsharing systems is based on activity-based multiagent simulations, i.e., microscale computational models for simulating the actions and interactions of autonomous agents [22] that allow modelling the interaction of supply and demand. For station-based operators, Benarbia et al. [23] propose an agent-based relocation strategy based on real-time inventory control within the framework of generalized stochastic Petri nets (PNs) and a discrete event simulation. The work of Balac et al. [24], using a multiagent simulation tool (MATSim), investigates the effects of supply on the demand of existing round-trip carsharing (also implemented for one-way station-based systems). The use of MATSim with relocation agents is described by Paschke et al. [25]. MATSim was also used by Ciari et al. [26] to estimate the demand for one-way carsharing in the urban area of Zurich. This type of solution needs complex inputs, such as the entire transport network (including public transit scheduling) and population data, from which the daily plans of each user are generated. This methodology analyses in detail the convenience of carsharing for each user, shows the potential demand for this service, and is useful for analyzing the possible impact of new policies and new services. On the other hand, it does not allow analyzing the positioning of the vehicles with respect to the users who actually use the service. The analysis tool is therefore more complicated to set up than the one we propose, and the goal is slightly different. Furthermore, this type of demand estimation has not yet been carried out for free-floating carsharing.

Stated Preference Technique.
Many of the initial studies aimed at understanding the potential demand of station-based carsharing in the urban modal split. For example, Catalano et al. [27] calibrate a modal split model using the stated preference (SP) technique. An overview of the literature up to 2013 can be found in Jorge and Correia [28]. Until that time, demand estimation had been developed almost exclusively for round-trip station-based carsharing. The stated preference technique is still used to capture behavioural patterns related to a specific location. Recent examples can be found in [29,30]. Lately, Zhou et al. [31] have also adopted a stated preference methodology to elicit consumers' valuation of vehicle self-driving capability, a factor rarely examined in the literature. Regression models indicate that latent demand for this new technology is associated with respondents' travel patterns, demographics, values, lifestyles, and environmental concern [32]. Stated preference techniques provide real choice data on some individuals, which can then be translated to a larger scale of the same environment on the basis of a series of hypotheses. However, these methods require time to assess the service, they are not adaptable to territories with different characteristics, and they do not provide information on the latent demand related to the positioning of vehicles.

Regression Models.
Descriptive statistics or regression models may also be used for demand estimation. For example, Wagner et al. [13] predict future demand for free-floating carsharing using neighbourhood data and point of interest (POI) data. The technique includes zero-inflated and geographically weighted regression models, from which they derive indicators for the area's attractiveness. Willing et al. [33] extend that approach by additionally including daytime and weekday effects in the model. Within this family of methodologies, but without regression models, Gammelli et al. [34] predict shared mobility demand by incorporating a censored likelihood function, capable of handling time-varying supply, within a Gaussian process model. Finally, Negahban [35] proposes a methodology that combines simulation, bootstrapping, and subset selection to estimate the true demand in a bikesharing service. Among these approaches, only the last two take into account how supply and demand are interconnected. Compared to the neural network approaches listed as follows, they may also be less accurate in predicting future demand but easier to apply in different scenarios. Many of the abovementioned studies ignore situations where there are not enough vehicles available, in which case a part of the demand is lost. Vehicle availability is a key factor to attract new users for a carsharing service, and for this reason, low availability can limit the creation of new demand [42]. Demand for carsharing is difficult to model since the availability of vehicles is intrinsically dependent on the number of trips and vice versa [28]. The supply-demand interaction for shared cars is illustrated by Li et al. [43], who analyze a free-floating carsharing system in a dynamic user equilibrium model. Among the abovementioned studies, Stokkink and Geroliminis [9] and Repoux et al. [19] focus on the loss of demand, while Raviv and Kolka [20], Jian et al. [21], Balac et al. [24], and Ciari et al. [26] focus on how supply influences demand, going a step beyond the simple supply-demand balance. Finally, Negahban [35], Gammelli et al. [34], and Huttel et al. [41] (together with our work) model the relationship between supply and demand, taking into consideration the influence of supply on demand: they do so by treating the number of pickups (the observed demand) as a censored measurement of the total demand (the number of pickups plus the loss of demand due to unavailability of cars).

Related Research Conclusions.
Data detail, accessibility and reliability, high computational time, calibration, and validation still remain major challenges for travel demand estimation for carsharing systems [22].Local characteristics make it complex to standardize many of the listed methods.
For this reason, we tried to develop a method that is easy to apply and flexible while keeping high reliability. This flexibility allows future integration with an origin-destination commuting matrix (including the estimate of its carsharing modal split). Among the listed methods, some are more easily transferable than others, but we believe that ours is easier to reapply while maintaining convincing results. This transferability does not limit high-detail spatial and temporal analysis. Finally, this is also one of the first carsharing demand estimation methods suited for a free-floating service.
Carsharing with shared autonomous vehicles can provide the combined benefits of autonomous driving technology and access-based consumption [44]. The advent of self-driving vehicles will address carsharing's problems related to parking and noncompetitive access times. Solving these problems will make carsharing a service that is almost equal to an automated-vehicle taxi service. In these future scenarios, users will not need to walk to pick up a car that is parked far from them, because the car will drive to their position. This means that the latent demand connected to the positioning of vehicles (related to accessibility) will be highly reduced, but it will remain important to properly distribute cars following the expected demand in order to further reduce the time between the request and the start of the trip.
The problem that we reviewed does not have a universally optimal resolution methodology, but rather a series of parallel methodologies, to be chosen on the basis of the characteristics and constraints of the analyzed service, the availability of data, and the granularity and logic of the desired outputs. The methodology that we present, compared to the state of the art, brings together different characteristics and allows making high-resolution demand forecasts for free-floating carsharing services using few data as inputs. When total origin-destination demand data are available, some of the analyzed works, such as the study by Jian et al. [21], could also be combined with our method to refine our output. To summarise the gap that we want to fill, our methodology for carsharing total-demand estimation has the following four strengths: (1) suitability for (and application to) a free-floating carsharing system, (2) high temporal and spatial resolution, (3) transferability due to a low amount of local-geography-related inputs, and (4) computation of the loss of demand, given a certain supply configuration. As far as we know, previous studies never used methodologies that allowed focusing in parallel on all four listed targets. We believe that this combination of strengths makes the proposed methodology a valuable tool for transportation research, especially for contexts that require an agile application and do not allow for time-consuming data preparation.

Space-Time Model
We postulate and implement a statistical model to estimate the total demand for cars of a free-floating carsharing system at a given time and location within a service region. For this, the service region is divided into a grid of disjoint, bounded, and equally sized cells $i = 1, 2, \ldots, I$ and time is divided into discrete, consecutive, and equally long intervals $t = 1, \ldots, T$. The index $i$ refers to a 2-dimensional square-shaped cell defined by center coordinates (e.g., 47°33′17″N 07°35′26″E) and a common side length (e.g., 250 m) and $t$ to the interval $[\tau_t, \tau_t + \Delta_t]$ with $\tau_t$ the time stamp (e.g., June 16, 2022, 08:00:00) and $\Delta_t$ the interval length (e.g., 1 hour). For illustration, Figure 1 shows a discretization of a major city in Switzerland into 222 adjacent cells with 500 m side length.
Square-shaped cells are chosen for simplicity and because they can be scaled up easily to any region. The model, however, can use cells of any other shape, such as hexagons or shapes better adapted to the service region. The size of the cells should be chosen just small enough to allow precise conclusions about the spatial distribution of the total demand, but not smaller, as smaller cell sizes increase the number of parameters and therewith the expected computational complexity. For square-shaped cells, we recommend setting the side lengths between 50 m and 500 m. The duration of the intervals, $\Delta_t$, should be as large as the total demand can be assumed to be constant within the intervals, which will be an implicit model assumption. Setting $\Delta_t$ to 1 hour, as in our case study of Section 4, may be a good rule of thumb.
The variable of interest is the total demand for booking a free-floating vehicle. Let $D_{it}$ be a random variable for the number of users considering booking a car at cell $i$ and time interval $t$. We assume that $D_{it}$ is Poisson distributed with rate $\lambda_{it}$, that is,
$$D_{it} \sim \mathrm{Poisson}(\lambda_{it}\,\Delta_t).$$
Note that, throughout this document, random variables are denoted with capital letters and associated observations with small letters.
The total demand $D_{it}$ is not directly observable, as when vehicles are not available, the system does not register any information on users looking for a vehicle. We try to estimate the abovementioned rate $\lambda_{it}$ based on the number of cars available for rent, denoted by $C$, and the number of pickups observed, denoted by $P$, in the proximity of cell $i$ at interval $t$. The model accounts for four situations and is built sequentially: first, we formulate a model that predicts pickups of available cars by linear combinations of the total demand rates $\lambda_{it}$. Second, we extend that model by allowing the rates $\lambda_{it}$ to vary across time. Third, we take into account situations where the number of available cars was potentially insufficient to satisfy the total demand. Fourth, a smoothing approach is proposed to consider that neighboring cells are expected to have a similar total demand and to simplify the parameter estimation.
To build up the model, we first discuss specific small case examples to outline the logic of how the model spatially links the total demand rates with the observable pickups, and how the model can deal with time-varying total demand. Afterwards, the model is generalized to any grid and extended to situations where not enough cars are available.

Spatially Linking the Total Demand and Pickups.
A picked car at cell $i$ and interval $t$ (that is, $p_{it} = 1$) does not necessarily imply that the demand originates from cell $i$. It is also possible that the user would have preferred to pick up the car from an adjacent cell, where there was no car available at the time. To model the number of pickups $P$ as a function of the total demand rates $\lambda_{it}$, we assume that users have a preferred pickup cell (i.e., the origin of the demand) but choose with equal probability any car standing in a cell not further away than $r_{\max}$ (e.g., 500 m) from that origin cell, as measured by the distance between the centers of the cells. If no car is close enough, the demand gets lost. Note that $r_{\max}$ becomes operative only if the center-to-center distances of neighboring cells are smaller than $r_{\max}$; otherwise a pickup is simply linked to the demand from the same cell.
The $r_{\max}$ assumption is used for its simplicity, while the following model can accept other assumptions better suited for the considered problem. For example, $r_{\max}$ may be set to different values across sections of the service region. The prerequisite for an alternative assumption is that it defines for each cell $i$ a corresponding set of cells that could be the origin of demand for a pickup from cell $i$. In practice, $r_{\max}$ is generally unknown and may be determined by using the rule of thumb of Seign and Bogenberger [45] of about 300-500 m, by conducting a survey, or by choosing $r_{\max}$ such that a goodness-of-fit measure (such as the likelihood criterion) is optimized.
Figure 2 shows a square grid with 5×5 cells and one car (or more than one) at cell 18. As an example, we arbitrarily set $r_{\max}$ such that a picked-up car can be assigned to a demand from cell 18 or the cells around it. We call the corresponding set of cells, which is highlighted in yellow in Figure 2, the demand area for cars in cell 18. If we assume temporarily that the rate parameters do not vary over time (i.e., $\lambda_{it} \equiv \lambda_i$) and that the number of cars available in cell 18, $C_{18,t}$, is higher than necessary to satisfy the total demand for cars, then
$$P_{18,t} = D_{12,t} + D_{13,t} + D_{14,t} + D_{17,t} + D_{18,t} + D_{19,t} + D_{22,t} + D_{23,t} + D_{24,t}. \tag{1}$$
Since the right-hand side of the above equation is a sum of Poisson distributed random variables, $P_{18,t}$ is also Poisson distributed with parameter $(\lambda_{12} + \lambda_{13} + \ldots + \lambda_{24})\,\Delta_t$ (see e.g., [46] Exercise 4.40). Given that the $r_{\max}$ assumption holds, this equation collects all relevant information for estimating the parameters $\lambda_i$ for the situation shown in Figure 2. The demand of cells further away than $r_{\max}$ from cell 18 cannot be served because there is no car in proximity, and therefore, the situation does not provide information on the corresponding parameters $\lambda_{it}$. In order to get estimates for each of the parameters $\lambda_{it}$, we need data from various moments in time such that each cell is part of the demand area of a standing car at least once. Otherwise, if a cell $i$ is never part of a demand area, then the corresponding parameters $\lambda_{it}$ cannot be identified.
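The demand area of the Figure 2 example can be sketched in a few lines. The block below is only an illustration under stated assumptions: cells are numbered 1 to 25 row by row, the side length of 250 m and the value of r_max = 400 m are arbitrary choices, and the function names are hypothetical.

```python
import math

def cell_center(cell, n_cols, side):
    """Center (x, y) of a square cell, numbered from 1 row by row (assumption)."""
    row, col = divmod(cell - 1, n_cols)
    return ((col + 0.5) * side, (row + 0.5) * side)

def demand_area(car_cell, n_rows, n_cols, side, r_max):
    """Cells whose center is not further away than r_max from the car's cell center."""
    cx, cy = cell_center(car_cell, n_cols, side)
    area = []
    for cell in range(1, n_rows * n_cols + 1):
        x, y = cell_center(cell, n_cols, side)
        if math.hypot(x - cx, y - cy) <= r_max:
            area.append(cell)
    return area

# Illustrative values: 250 m cells and r_max = 400 m, so that direct and
# diagonal neighbours (250 m and about 354 m away) belong to the demand area.
area_18 = demand_area(18, n_rows=5, n_cols=5, side=250.0, r_max=400.0)
print(area_18)  # [12, 13, 14, 17, 18, 19, 22, 23, 24]
```

With time-constant rates, the Poisson parameter of the pickups from cell 18 is then the sum of the rates over `area_18`, multiplied by the interval length.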
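In reality, users might have more than one vehicle in their proximity. For example, Figure 3 presents a situation where two cars are available, one at cell 9 and another at cell 18. Here, the demand areas around cells 9 and 18 overlap at cells 13 and 14. To handle such situations, the total demand rates from cells which have more than one vehicle within reach (i.e., closer than $r_{\max}$) may be split in half, such that each of the two cells with vehicles obtains half of the total demand rates. This results in the following two equations for the expected number of pickups:
$$\begin{aligned} E[P_{9,t}] &= \left(\lambda_3 + \lambda_4 + \lambda_5 + \lambda_8 + \lambda_9 + \lambda_{10} + \tfrac{1}{2}\lambda_{13} + \tfrac{1}{2}\lambda_{14} + \lambda_{15}\right)\Delta_t, \\ E[P_{18,t}] &= \left(\lambda_{12} + \tfrac{1}{2}\lambda_{13} + \tfrac{1}{2}\lambda_{14} + \lambda_{17} + \lambda_{18} + \lambda_{19} + \lambda_{22} + \lambda_{23} + \lambda_{24}\right)\Delta_t. \end{aligned} \tag{2}$$
Alternative rules for dividing the cells in intersections of demand areas could be considered. For example, if we believe that users always pick up the closest car, then we would assign $\lambda_{13}$ to the cars of cell 18 and $\lambda_{14}$ to the cars of cell 9. Therewith,
$$\begin{aligned} E[P_{9,t}] &= \left(\lambda_3 + \lambda_4 + \lambda_5 + \lambda_8 + \lambda_9 + \lambda_{10} + \lambda_{14} + \lambda_{15}\right)\Delta_t, \\ E[P_{18,t}] &= \left(\lambda_{12} + \lambda_{13} + \lambda_{17} + \lambda_{18} + \lambda_{19} + \lambda_{22} + \lambda_{23} + \lambda_{24}\right)\Delta_t. \end{aligned} \tag{3}$$
It is not always possible to fully separate the demand areas with the closest-car rule. If, for example, the car of cell 18 in Figure 3 is moved to cell 17, then cell 13 has the same distance to both cars. In these cases, we split cell 13 in half as in equation (2).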
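The two rules for overlapping demand areas can be compared with a small sketch. It assumes a 5×5 grid numbered row by row, distances measured in units of the cell side length, and r_max = 1.5 so that direct and diagonal neighbours are within reach. The function names and the dictionary layout are hypothetical.

```python
import math

N_COLS = 5  # 5x5 example grid, cells numbered 1..25 row by row (assumption)

def dist(a, b):
    """Center-to-center distance between cells a and b, in cell side lengths."""
    ra, ca = divmod(a - 1, N_COLS)
    rb, cb = divmod(b - 1, N_COLS)
    return math.hypot(ra - rb, ca - cb)

def shares(car_cells, n_cells=25, r_max=1.5, closest_only=False):
    """Map (car cell j, origin cell i) to the share of cell i's rate given to j."""
    u = {}
    for i in range(1, n_cells + 1):
        reachable = [j for j in car_cells if dist(i, j) <= r_max]
        if not reachable:
            continue  # no car within r_max: the demand of cell i is lost
        if closest_only:
            d_min = min(dist(i, j) for j in reachable)
            # Ties (equal distances) keep all closest cars and are split equally.
            reachable = [j for j in reachable if math.isclose(dist(i, j), d_min)]
        for j in reachable:
            u[(j, i)] = 1.0 / len(reachable)
    return u

u_equal = shares([9, 18])                       # equal split in the overlap
u_closest = shares([9, 18], closest_only=True)  # closest-car rule
u_tie = shares([9, 17], closest_only=True)      # car moved from cell 18 to 17

print(u_equal[(9, 13)], u_equal[(18, 13)])  # 0.5 0.5
print(u_closest[(18, 13)], u_closest[(9, 14)])  # 1.0 1.0
print(u_tie[(9, 13)], u_tie[(17, 13)])  # 0.5 0.5
```

For every origin cell with at least one car within reach, the shares over the car cells sum to one, which is the constraint the general model places on these weights.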
The second step of the model accounts for total demand variation across time. For example, total demand may change between mornings and evenings, between weekdays, or between seasons. To account for these variations, we extend the equations above with further parameters. For example, suppose that the rates vary between weekdays (Monday-Friday) and weekends (Saturday-Sunday) so that the total demand rate of cell $i$ is $\lambda_i$ on weekdays and $\lambda_i + \beta$ on weekends. Let $v_t$ be an indicator with value 1 if time interval $t$ corresponds to a weekend, and 0 otherwise. Equation (2) for pickups of cars from cell 9 with an additional weekend effect extends to
$$E[P_{9,t}] = \left(\lambda_3 + \lambda_4 + \lambda_5 + \lambda_8 + \lambda_9 + \lambda_{10} + \tfrac{1}{2}\lambda_{13} + \tfrac{1}{2}\lambda_{14} + \lambda_{15}\right)\Delta_t + 8\,\beta\,v_t\,\Delta_t. \tag{4}$$

Basic Model.
Equation (4) refers to a specific situation for the considered 5×5 grid and a simple specification for time effects. For general situations, grids, and specifications for time effects, we relate the total demand for available cars in some cell $j$ at time $t$ to a linear combination of the expected total demand of the individual cells at a chosen reference time interval and a linear combination of further parameters multiplied with time-related variables $Z$, as follows:
$$E[P_{jt}] = \sum_{i=1}^{I} u_{jti}\,\Delta_t\,\lambda_i + \sum_{p=1}^{P} z_{jtp}\,\beta_p = x_{jt}^{T}\lambda + z_{jt}^{T}\beta. \tag{5}$$
Elements $u_{jti}$ indicate the share of the total demand rate of cell $i$ that is assigned to cars standing in cell $j$. They take values between 0 and 1, and the sum over all cells $j$ with vehicles must be 1 (i.e., $\sum_j u_{jti} = 1$). The linear predictor on the right of equation (5) has two components: $x_{jt}$ and $\lambda$ are design and parameter vectors of length $I$ (number of cells) to predict the expected number of pickups of cars from cell $j$ for some reference time interval, and $z_{jt}$ and $\beta$ are design and parameter vectors of length $P$ that take into account time effects relative to the reference time. For example, expressing equation (4) in the form of equation (5) yields $x_{jt}^{T} = (0, 0, \Delta_t, \Delta_t, \Delta_t, 0, 0, \Delta_t, \Delta_t, \Delta_t, 0, 0, \Delta_t/2, \Delta_t/2, \Delta_t, 0, \ldots, 0)$ and $z_{jt}^{T} = 8 v_t \Delta_t$. The proposed model is quite flexible with respect to how total demand is spatially distributed and which time effects are taken into account. The main restriction is that the dependencies between the expected pickups and the possibly time-varying total demand rates have to be linear in the unknown parameters. The linearity restriction is not as limiting as it might appear at first sight. Nonlinear evolution over time may be modeled using dummy variables or polynomials that are still linear in their parameters. Section 4 presents a case study with real data to provide a hands-on specification.
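As a rough numerical illustration of the linear predictor, the sketch below builds the design vector for cars in cell 9 of the 5×5 example (full shares for cells 3, 4, 5, 8, 9, 10, and 15, half shares for cells 13 and 14) and evaluates the expected number of pickups on a weekend interval. All rate values are made up for illustration, and the helper names are hypothetical.

```python
def design_row(shares_j, n_cells, delta_t):
    """Design vector x_jt: entry i holds (share of cell i given to cell j) * delta_t."""
    return [shares_j.get(i, 0.0) * delta_t for i in range(1, n_cells + 1)]

def expected_pickups(x, lam, z, beta):
    """Linear predictor x'lambda + z'beta."""
    return sum(a * b for a, b in zip(x, lam)) + sum(a * b for a, b in zip(z, beta))

delta_t = 1.0  # interval length in hours
shares_9 = {3: 1.0, 4: 1.0, 5: 1.0, 8: 1.0, 9: 1.0, 10: 1.0,
            13: 0.5, 14: 0.5, 15: 1.0}
x_9 = design_row(shares_9, n_cells=25, delta_t=delta_t)

v_t = 1                             # 1 on weekends, 0 otherwise
z_9 = [sum(x_9) * v_t]              # sum of shares times delta_t, i.e. 8 * v_t * delta_t
lam = [0.1] * 25                    # made-up demand rates per cell and hour
beta = [0.05]                       # made-up weekend effect

mu = expected_pickups(x_9, lam, z_9, beta)
print(round(mu, 6))  # 1.2
```

The weekend entry of the time design vector equals the sum of the spatial design vector here because the same additive weekend effect is assumed for every cell.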
If we assume that the number of cars is always sufficient to satisfy the total demand of the corresponding demand areas, then the parameters $\lambda$ and $\beta$ can be estimated using tools for Poisson regression models, such as maximum likelihood estimation. For some cell $j$ with cars at interval $t$, the probability of the observed number of pickups $p_{jt}$ is as follows:
$$P(P_{jt} = p_{jt}) = \frac{\mu_{jt}^{\,p_{jt}} \exp(-\mu_{jt})}{p_{jt}!}, \qquad \mu_{jt} = x_{jt}^{T}\lambda + z_{jt}^{T}\beta. \tag{6}$$
The assumption that the number of cars available is always sufficient to satisfy the total demand from the corresponding demand areas may be realistic if $\Delta_t$ is chosen small enough so that no more than one pickup is expected within a time interval. In general, however, this assumption does not hold, e.g., when two users want to book the same and only available car at practically the same time. Moreover, bypassing the assumption by decreasing $\Delta_t$ blows up the data volume and therewith further increases the already considerable computational effort for estimation.
Using a Poisson model for right-censored data [47] allows accounting for situations where the total demand possibly exceeded the number of available cars. The censored Poisson model assigns different probabilities depending on whether the number of pickups is smaller than or equal to the number of available cars. In the first case, we assume that the demand was fully satisfied and compute its probability using the Poisson density function. In the second case, where the number of picked cars equals the number of available cars, we assume that the number of picked cars is right censored, i.e., could have been larger if more cars had been available. Therefore, we compute its probability as the cumulative Poisson density function from the number of picked cars to infinity. Expressed mathematically, with $\mu_{jt} = x_{jt}^{T}\lambda + z_{jt}^{T}\beta$ the linear predictor of equation (5) and $c_{jt}$ the number of available cars,
$$P(P_{jt} = p_{jt}) = \begin{cases} \dfrac{\mu_{jt}^{\,p_{jt}} \exp(-\mu_{jt})}{p_{jt}!} & \text{if } p_{jt} < c_{jt}, \\[2ex] \displaystyle\sum_{k=p_{jt}}^{\infty} \dfrac{\mu_{jt}^{\,k} \exp(-\mu_{jt})}{k!} & \text{if } p_{jt} = c_{jt}. \end{cases} \tag{7}$$
Estimates for $\lambda$ and $\beta$ can be obtained by maximizing the log-likelihood,
$$\ell(\lambda, \beta) = \sum_{j,t} \log P(P_{jt} = p_{jt}), \tag{8}$$
where the sum runs over all cells $j$ and intervals $t$ with available cars. Since no closed-form solution exists to maximize equation (8), we developed a gradient-based implementation in R [48] based on the optimizer function nlminb() [49]. The implementation allows the parameters $\lambda$ (and possibly $\beta$) to be log-transformed to avoid negative estimates for rate parameters and automatically drops $\lambda_i$ parameters associated with cells that were never in a demand area and therefore cannot be assessed. The estimating equations and the developed R functions are available on request.
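The censored Poisson log-likelihood described above can be sketched with the Python standard library. This is not the authors' R implementation, only a minimal illustration: the function names are hypothetical, the means passed in stand for the values of the linear predictor, and the optimization step itself is left out.

```python
import math

def poisson_logpmf(k, mu):
    """Log of the Poisson probability of k events with mean mu > 0."""
    return k * math.log(mu) - mu - math.lgamma(k + 1)

def poisson_tail(k, mu):
    """P(X >= k) for X ~ Poisson(mu), computed as 1 - P(X <= k - 1)."""
    if k == 0:
        return 1.0
    cdf = sum(math.exp(poisson_logpmf(x, mu)) for x in range(k))
    return max(1.0 - cdf, 1e-300)  # guard against log(0) in extreme cases

def censored_loglik(pickups, cars, means):
    """Sum of log-probabilities; an observation is censored when pickups == cars."""
    total = 0.0
    for p, c, mu in zip(pickups, cars, means):
        if p < c:
            total += poisson_logpmf(p, mu)        # demand fully satisfied
        else:
            total += math.log(poisson_tail(p, mu))  # demand may have exceeded supply
    return total

# Two observations with one available car each and an expected demand of 2:
# no pickup (uncensored) and one pickup (censored at the number of cars).
print(censored_loglik([0, 1], [1, 1], [2.0, 2.0]))
```

In an estimation routine, the means would be recomputed from the current parameter values in every iteration and the resulting log-likelihood passed to a numerical optimizer.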

Smoothing.
The postulated model does not assume any relationship between total demand rates from adjacent cells. In general, it is reasonable to think that adjacent cells might have similar rates, or that the spatial distribution of these parameters should change smoothly across the service region. Only in particular cases, like geographical circumstances (e.g., a river) or other demand singularities (e.g., location of a big demand attractor), might the total demand rate parameters experience an abrupt spatial change.
We propose to use a kernel smoothing approach (e.g., [50] Chapter 6) to construct dependencies between the total demand rate parameters. The idea is to estimate "pseudo" total demand rates $\tilde{\lambda}_k$ for $K < I$ chosen supporting points, and calculate the total demand rates $\lambda_i$ of the $I$ cells as weighted sums of these $\tilde{\lambda}_k$, that is,
$$\lambda_i = \sum_{k=1}^{K} w_{ik}\,\tilde{\lambda}_k.$$
Figure 4 shows as an example the use of $K = 9$ supporting points for a grid with 5×5 cells. The supporting points are located at the edges of the cells, which is not a requirement.
Any kernel function can be used to compute the weights $w_{ik}$, such as the Epanechnikov [51] or the Gaussian kernel. We propose an implementation that uses higher weights if a supporting point is closer to cell $i$: let $r_{ik}$ be the Euclidean distance between the center coordinates of cell $i$ and the supporting point $k$. Using the standard Gaussian kernel, the $w_{ik}$ are computed from the distances $r_{ik}$. Now, we can rewrite the linear predictor of our model as
$$E[P_{jt}] = \sum_{i=1}^{I} x_{jti} \sum_{k=1}^{K} w_{ik}\,\tilde{\lambda}_k + z_{jt}^{T}\beta,$$
with $x_{jti}$ the $i$-th element of $x_{jt}$, which is again linear in its parameters $\tilde{\lambda}$ and $\beta$ and can be estimated with the previously described tools.
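A possible way to compute such weights is sketched below. The source does not preserve the exact weight formula, so the bandwidth parameter and the normalization of each row to sum to one are assumptions of this sketch, as are the function names.

```python
import math

def gaussian_weights(cell_centers, support_points, bandwidth=1.0, normalize=True):
    """Weight matrix w[i][k] from a Gaussian kernel of the distance r_ik.

    The bandwidth and the row normalization are assumptions of this sketch.
    """
    weights = []
    for cx, cy in cell_centers:
        row = []
        for sx, sy in support_points:
            r = math.hypot(cx - sx, cy - sy) / bandwidth
            row.append(math.exp(-0.5 * r * r) / math.sqrt(2.0 * math.pi))
        if normalize:
            total = sum(row)
            row = [w / total for w in row]
        weights.append(row)
    return weights

def smooth_rates(weights, pseudo_rates):
    """Cell rates as weighted sums of the pseudo rates at the supporting points."""
    return [sum(w * lam for w, lam in zip(row, pseudo_rates)) for row in weights]

# Three cell centers on a line and two supporting points at both ends
# (coordinates in units of the cell side length, made-up pseudo rates).
cells = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
support = [(0.0, 0.0), (2.0, 0.0)]
w = gaussian_weights(cells, support, bandwidth=1.0)
rates = smooth_rates(w, [0.2, 0.6])
print([round(r, 3) for r in rates])
```

With normalized rows, each cell rate is a weighted average of the pseudo rates, so the middle cell in this example lies exactly halfway between the two supporting values.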
Te simulation studies in the Appendix show that estimates for λ from the smoothing approach can have lower variance than those from the original model, due to the smaller number of unknown parameters involved.Te downsides of this are potential biases for the λ i s, which can also be seen in the Appendix.
The smoothing approach involves specifying the location and the number of supporting points. An equally spaced grid is used most frequently, for reasons of simplicity. More supporting points allow capturing demand distributions with finer structures [52] but increase the wiggliness of the estimated surface and the computational effort. A practical implementation is provided in Section 4.
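The weighting step can be sketched in Python as follows. The bandwidth, the normalization of the weights to sum to one per cell, the placement of the supporting points, and the pseudo rates are all assumptions made for illustration; the paper's exact weight formula is not reproduced here.

```python
import math

def gaussian_weights(cell_centers, support_points, bandwidth):
    """Row-normalized Gaussian kernel weights w[i][k] from Euclidean distances r_ik."""
    weights = []
    for cx, cy in cell_centers:
        row = []
        for sx, sy in support_points:
            r = math.hypot(cx - sx, cy - sy)
            row.append(math.exp(-0.5 * (r / bandwidth) ** 2))
        total = sum(row)
        weights.append([w / total for w in row])  # assumption: weights sum to 1
    return weights

def smooth_rates(weights, pseudo_rates):
    """Cell rates as weighted sums of the pseudo rates of the supporting points."""
    return [sum(w * lam for w, lam in zip(row, pseudo_rates)) for row in weights]

# 5 x 5 grid of unit cells with centers (0.5, 0.5), ..., (4.5, 4.5).
cells = [(x + 0.5, y + 0.5) for y in range(5) for x in range(5)]
# K = 9 supporting points on a 3 x 3 lattice (hypothetical placement).
support = [(x, y) for y in (1.0, 2.5, 4.0) for x in (1.0, 2.5, 4.0)]
W = gaussian_weights(cells, support, bandwidth=1.0)
# Hypothetical pseudo rates with a peak at the central supporting point.
pseudo = [0.1, 0.2, 0.1, 0.2, 0.4, 0.2, 0.1, 0.2, 0.1]
rates = smooth_rates(W, pseudo)
```

With normalized weights, every cell rate is a convex combination of the pseudo rates, so 25 cell rates are driven by only 9 free parameters and cannot leave the range spanned by them.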

Case Study
We consider data provided by the Swiss commercial company Mobility (https://www.mobility.ch) from their so-called Mobility-Go free-floating carsharing service in a major city of Switzerland during 2021, where the service was operated with about 128 cars. The raw data consisted of 28,682 records of individual rentals, excluding service trips, and include the vehicle number as well as the coordinates and time stamps of the pickup and dropoff. To discretize space and time, each record was assigned to one of 747 cells of 250 m side length, based on the pickup coordinates, and to one of 8,759 hourly intervals (e.g., June 16, 8 to 9 o'clock), based on the pickup time. The number of cars available for an interval was computed as the number of cars at the beginning of that interval plus the number of cars dropped off during the interval.
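The discretization described above can be sketched in Python. The sketch assumes projected coordinates in meters, a grid origin at (0, 0), and naive local time stamps that ignore the daylight-saving changeover; the function names and the toy inputs are hypothetical.

```python
from collections import Counter
from datetime import datetime

CELL_SIDE_M = 250.0
YEAR_START = datetime(2021, 1, 1)

def cell_of(x_m, y_m, x0_m=0.0, y0_m=0.0):
    """(column, row) of the 250 m grid cell containing a projected coordinate."""
    return (int((x_m - x0_m) // CELL_SIDE_M), int((y_m - y0_m) // CELL_SIDE_M))

def hour_of(ts):
    """Index of the hourly interval since the start of 2021."""
    return int((ts - YEAR_START).total_seconds() // 3600)

def cars_available(parked_at_start, dropoffs):
    """Cars available per (cell, hour).

    parked_at_start maps (cell, hour) to the number of cars parked at the
    beginning of the hour; each dropoff (x, y, time stamp) adds one car to
    the cell and hour in which it occurs.
    """
    avail = Counter(parked_at_start)
    for x, y, ts in dropoffs:
        avail[(cell_of(x, y), hour_of(ts))] += 1
    return avail

# Hypothetical example: one car parked at the start, one dropped during the hour.
hour = hour_of(datetime(2021, 6, 16, 8, 30))
start = {((2, 0), hour): 1}
drops = [(610.0, 240.0, datetime(2021, 6, 16, 8, 45))]
avail = cars_available(start, drops)
```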

Descriptive Analysis.
To provide an overview of the data used, we divided Basel into nine equally sized districts, defined by their cardinal direction from the center of the city. Figure 5 shows the average number of pickups per hour by time of day, weekday, and month for each district, together with the average of the nine districts.
The time of day shows a classical temporal demand pattern with little demand during the night (below 0.2 PU/h per district), a first strong increase in the morning between 7:00 and 9:00, and a moderate continuous increase until the daily peak at 19:30 in the evening. The values are higher in the afternoon because the graph also includes nonworking days. If we only consider working days, the daytime profile is much more balanced between morning and afternoon.
Regarding the days of the week, we can see that, as expected, Saturday is the busiest day with about 0.45 pickups per hour and district, followed by Friday and Sunday. For workdays, we see a slight growth from Monday to Thursday, while we could have expected a flat profile for these days. Along the months, we see a tendency of higher values in cold months and lower values during summer. The relative maximum in May is not self-explanatory, but it is important to remember that little information can be extrapolated from this graph, because it shows a pattern observed only once and not one that was repeated many times, such as the daily profile. The pattern along the months could have been influenced by factors such as pricing, policies, information campaigns, or the COVID-19 pandemic, which was still relevant in 2021.
The nine districts into which we divided the city differ in some cases from the average profile, both in terms of frequency and in terms of profile shape. Readers can examine Figure 5 to understand the differences between the districts.
As mentioned above (Section 2), some relocation models (classified as nonpredictive) rely on a few indicators to characterize the demand. Reiss and Bogenberger [53] used three indicators for each district to detect the attractiveness of a district: demand factor, origin-destination factor, and idle times. Here, we check the balance between supply and demand in the nine districts by using their demand factor, which is defined as the ratio between rentals and vehicles in a district.
In Table 1, we can consider the demand factor as a parameter to evaluate where relocation should be carried out (from the districts with many cars and a low demand factor to the districts with a high demand factor). We note that the south-west and north-west districts have higher availabilities of cars than the center, despite having a considerably lower number of pickups. Candidates for receiving cars through relocation are the center and the north districts, which exhibit, with 84% and 77%, the highest ratios between pickups made and car availability. In contrast, the south-east district has, on average, twice as many cars available as pickups and is therefore the best candidate from which to take cars away for relocation. Here, the demand factor is applied on a large scale, but it could give further information if applied to smaller districts. Similar conclusions on relocation can be obtained from the proposed modelling approach applied in the subsequent analyses, which uses a much more detailed spatial resolution by design.
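As a small illustration of how the demand factor ranks districts for relocation, consider the following Python sketch. The daily averages are hypothetical and are not the values of Table 1; they were only chosen to reproduce the ratios mentioned in the text (84%, 77%, and about one half).

```python
def demand_factor(pickups, cars):
    """Ratio of daily pickups (P) to daily available cars (C)."""
    return pickups / cars

# Hypothetical daily averages (P, C); not the values of Table 1.
districts = {
    "center": (21.0, 25.0),
    "north": (7.7, 10.0),
    "south-east": (6.0, 12.0),
}
factors = {name: demand_factor(p, c) for name, (p, c) in districts.items()}
ranking = sorted(factors, key=factors.get, reverse=True)
receive, give = ranking[0], ranking[-1]  # relocation target and source
```

Under these assumed numbers, the district with the highest factor is the first candidate to receive cars and the one with the lowest factor is the first candidate to give cars away.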
Using the dummy variable specification allows us to deal with nonlinear evolution across time, which can be identified from Figure 5. We implemented four specifications of the proposed model ($M_1$ to $M_4$), which differ in how overlaps of demand areas are treated (cf. Section 3, Spatially linking the total demand and pickups) and in whether or not smoothing (cf. Section 3, Smoothing) was applied. The models $M_1$ and $M_2$ use overlapping demand areas (total demand rates from cells that have more than one vehicle within reach are distributed evenly across the intersecting demand areas), whereas $M_3$ and $M_4$ use the closest car demand areas (total demand rates are assigned, whenever possible, to the demand area of the closest car). Smoothing was applied only to the models $M_2$ and $M_4$. A grid of 81 uniformly spread supporting points was used for smoothing, reducing the total number of unknown parameters from 767 to 101.
To estimate the models, the corresponding design vectors $x_{jt}$ and $z_{jt}$ had to be prepared based on the stated assumptions for the demand areas around cars and the time effects. Smoothing additionally required the preparation of the weights $w$, see equation (11).
The number of unknown parameters is 767 for the basic models (747, one for each cell, and 20 temporal parameters) and 101 for the smoothing models (81 supporting points and 20 temporal parameters). All parameters λ and β were estimated on the log scale. This ensures that the estimated total demand rates are always larger than zero.

Spatial Distribution of the Total Demand Rates.
Figure 6 shows the estimated λ coefficients, which refer to the estimated total demand rates (per hour) on a Monday in September 2021 from 0 to 6 o'clock AM. Most obvious is that the basic models result in a patchwork of estimated rates, while the smoothing models do not. Comparing the two basic models, it can be seen that overlapping demand areas result in more peaks than the closest car demand areas. This may be related to our findings from the simulation studies in the Appendix, according to which the total demand rate estimates of overlapping demand areas have higher variance, see Figure 7.
Figure 6 shows that the basic models may not be able to estimate the total demand rates for all cells, see the black squares in the north-east. This is because there was never a car available for rent within a circle of radius $r_{\max}$ around these cells. Smoothing yields estimates for these cells; however, these estimates should be interpreted carefully since kernel smoothing approaches are known for boundary bias (e.g., [50], Sec. 6.1).
In terms of model fit, we found that the log-likelihood, the Akaike information criterion (AIC) [54], the root mean squared error (RMSE), and the mean absolute error (MAE) of the basic models are slightly superior to those of the smoothing models, see Table 2. While the superiority of the basic models regarding the log-likelihood, RMSE, and MAE was expected because the smoothing models are merely restricted submodels with fewer parameters, the superiority regarding the AIC indicates that the smoothing is too strong and should be improved, e.g., by adding supporting points or placing them more efficiently. Furthermore, the models $M_1$ and $M_2$ with overlapping demand areas perform insignificantly better than the corresponding models $M_3$ and $M_4$ with closest car demand areas regarding the log-likelihood, AIC, and RMSE, but insignificantly worse regarding the MAE.
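The AIC argument can be made concrete with a short Python sketch. The parameter counts are those reported in the text; the log-likelihood values are hypothetical and are not taken from Table 2.

```python
def aic(loglik, n_params):
    """Akaike information criterion: 2k - 2 log L (lower is better)."""
    return 2 * n_params - 2 * loglik

K_BASIC, K_SMOOTH = 767, 101  # parameter counts reported in the text
# Hypothetical log-likelihoods, not the values of Table 2.
ll_basic, ll_smooth = -50000.0, -50800.0

aic_basic = aic(ll_basic, K_BASIC)
aic_smooth = aic(ll_smooth, K_SMOOTH)
# Log-likelihood gain the basic model needs to offset its extra parameters.
break_even = K_BASIC - K_SMOOTH
```

The basic model carries 666 more parameters, so it only attains the lower AIC if its log-likelihood exceeds that of the smoothing model by more than 666. A basic model that still wins on AIC therefore suggests that the restriction imposed by smoothing costs more fit than the saved parameters are worth.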
Figure 8 shows the estimated total demand rates of the model $M_2$ with overlapping demand areas and smoothing, which is the best among the smoothing models according to the log-likelihood. Figure 8 is identical to the top right plot of Figure 6, but with a finer color scale to facilitate closer examination. The plot highlights two regions with higher total demand within the center of the service region, which can be attributed to regions close to the train station and the old town, and two local peaks in the north-east and south-east.
Figures 6 and 8 present the estimated distribution of the total demand rates for the reference time, a Monday in September 2021 from 0 to 6 o'clock AM. The time effects discussed below allow total demand rates to vary over time, e.g., because the demand might vary across weekdays. However, because the considered models assume that time effects are constant across the whole service region, the estimated spatial distribution will not change and only the rates will be increased in every cell by the same factor.

Temporal Distribution of Total Demand Rates.
Figure 9 shows the estimates of the four models regarding the three considered types of time effects. The shape of the coefficients along time is very similar between the four models. The plot of the month effects on the top left reveals that the total demand was highest in January and that there was a temporary peak in May and June. We expect this pattern to be related to the COVID-19 pandemic and not to be repeated in 2022. Estimates for weekday and daytime effects match expectations better: we find a clear peak in total demand on Saturdays, and higher total demand in the afternoon and evening than at night and in the morning. Some coefficients almost reach the value zero, which is the lowest possible value because the coefficients are estimated on the log scale. This is especially notable for the month effects, where the estimates indicate that the total demand of July, August, October, and November was practically the same as in the reference month September. To find out whether the estimates with values close to 0 relate to convergence difficulties, we used different optimizer routines and applied a number of small model modifications, such as changing the reference categories and the side lengths of the cells. However, the optimizer routines reported convergence, and the model modifications neither removed the close-to-zero coefficients nor changed the findings for time effects fundamentally. Moreover, the order of the estimated time effects is consistent with the results from the descriptive analyses, cf. Figures 5 and 9. For these reasons, we assume that the estimated coefficients are reliable.

Loss of Demand.
The estimated models may be used to estimate the loss of demand. In line with our model, we distinguish between the following two types of loss of demand:
(1) No cars in proximity: In situations where there is no offer, the entire demand is lost. For some cell $i$ and time interval $t$ that is further away than $r_{\max}$ from the nearest cell with cars, we estimate this type of loss of demand, $\hat{l}^{(1)}_{it}$, as the estimated total demand at the baseline setting (in our case: Mondays in September 2021, 0 to 6 o'clock AM) plus the estimated time effect for interval $t$.
(2) Not enough cars in proximity: In situations where all available cars are picked up, loss of demand occurs because more cars could have been rented with a larger offer. Consider some cell $j$ with $c_{jt} > 0$ cars and $p_{jt} = c_{jt}$ pickups at time $t$. The conditional expectation of the total demand $D_{jt}$ is in this situation

$$E(D_{jt} \mid D_{jt} \geq p_{jt}) = \frac{\sum_{d = p_{jt}}^{\infty} d \, P(D_{jt} = d)}{\sum_{d = p_{jt}}^{\infty} P(D_{jt} = d)},$$

where the probabilities are evaluated at the model estimate for the expected total demand. Therefore, we estimate this type of loss of demand as $\hat{l}^{(2)}_{jt} = \hat{E}(D_{jt} \mid D_{jt} \geq p_{jt}) - p_{jt}$, i.e., the estimated conditional expectation minus the number of pickups. The peculiarity of this type of loss of demand is that our model implies that it cannot be attributed solely to cell $j$, but to all cells not further away than $r_{\max}$ from cell $j$. However, we did not find an analytical formula for dividing $\hat{l}^{(2)}_{jt}$ among the neighboring cells, and therefore, it is attributed to cell $j$ in the following results. It should be noted that in the presented case study this second type of loss of demand is practically negligible compared to the first type.
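The second type of loss can be computed numerically as in the following Python sketch. It assumes that the total demand of the demand area is Poisson distributed with a known mean `mu`; the value of `mu` and the function names are hypothetical.

```python
import math

def poisson_pmf(k, mu):
    """Poisson probability P(D = k) for mean mu > 0."""
    return math.exp(k * math.log(mu) - mu - math.lgamma(k + 1))

def expected_demand_given_sold_out(mu, p):
    """E(D | D >= p) for D ~ Poisson(mu).

    Uses E(D) = mu, so the numerator sum over d >= p equals mu minus the
    finite sum over d < p, which avoids an infinite summation.
    """
    tail = 1.0 - sum(poisson_pmf(k, mu) for k in range(p))   # P(D >= p)
    partial = sum(k * poisson_pmf(k, mu) for k in range(p))  # sum over d < p
    return (mu - partial) / tail

def loss_not_enough_cars(mu, p):
    """Estimated lost rentals when all p available cars were picked up."""
    return expected_demand_given_sold_out(mu, p) - p

# Hypothetical example: expected demand of 1 car per hour, 2 cars, both rented.
loss = loss_not_enough_cars(mu=1.0, p=2)
```

For this assumed example the conditional expectation is about 2.39, so roughly 0.39 rentals are estimated to be lost in that interval.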
The two proposed estimates for loss of demand above refer to the number of cars not rented by the carsharing system compared to the same system with the same demand but an infinite number of cars available. The following results on the estimated loss of demand are based on model $M_2$ (overlapping demand areas and smoothing) and the 2021 data used for estimating the model.
For the entire service region, we estimated an average loss of demand of 14.2 cars per day, of which 12.9 were lost because there were no cars in proximity and 1.31 because there were not enough cars in proximity. Compared to the average number of pickups per day of 77.0, this means that the number of rentals could be increased by about 18.5% by providing an unlimited number of cars, assuming that a larger offer would not itself increase the demand.
To detect the loss of demand locally, Figure 10 shows the spatial distribution of the average loss of demand per day, based on the model $M_2$. According to Figure 10, loss of demand is especially pronounced in the center of the city and not exactly at the total demand peaks situated on the left and bottom of the center, see Figure 8.

Summary and Conclusions
This article proposes a novel model approach to estimate the spatial and temporal distribution of total demand rates for free-floating carsharing. The proposed model is based on a Poisson regression model for right-censored data and estimates possibly time-varying demand rates of discrete cells of the service region based on booking data with spatiotemporal information on pickups and dropoffs of cars. The model is quite flexible as it can accommodate various shapes of cells of selectable size and different temporal effects. The model was successfully applied in a case study in a major city of Switzerland with data from the year 2021.
The proposed model is useful for the following purposes. First, the model provides insights to operators on how total demand was spatially distributed and how it evolved over time. This insight can hardly be gained using simple descriptive statistics, because total demand is often not directly observable and therefore must be estimated using auxiliary variables such as the number of pickups, and an advanced modelling technique such as regression. Second, the model may be used to estimate the loss of demand due to unavailability of cars. These insights may prove useful to designate convenient dropoff places in incentive schemes for user-based relocations or to extract input parameters for macrosimulation models.
Limitations.
The total demand rates estimated with our approach refer to the free-floating carsharing service that provided the data. Therefore, in competitive situations with multiple services, they cannot be interpreted as the global demand rates of the considered service region. If global demand rates are of interest, the model must be estimated using data that combine the competing services. Moreover, the estimated total demand rates do not take into account other transport services such as public transport. Therefore, they refer to a given split of available transport services and may be sensitive to launches or discontinuations of other transport services.

Future Work.
Further investigations could focus on practical aspects of the model. Implementations for larger and more frequented service regions would help to define the scope of our approach and to improve guidelines for model specification. Furthermore, operators may be interested in forecasting future total demand. Forecasting involves extrapolation and has not yet been elaborated with our model approach, partly because it seemed difficult to implement for data from the COVID-19 era. A forecasting approach should additionally take into account auxiliary predictor variables such as weather and should be able to deal with temporal correlation (e.g., by using a model with autoregressive errors) to provide reliable prediction intervals.

Appendix
To validate our method, we performed a simulation study using a 5 × 5 grid of square-shaped cells with side lengths 0.2. The total demand rates of the cells were proportional to a multivariate normal density centered at the center cell 13 and varied between 0.05 and 0.3 cars per hour. To include time effects, the individual total demand rates were increased in the evenings (18-24 o'clock) by 0.1 and on weekend days by 0.2. Figure 11 illustrates the specified total demand rates by a map and a scatterplot.
We generated pickups for independent hourly intervals. For this, we first generated for each interval and cell the number of cars available using a Poisson distribution with a common rate for all cells, and then generated the corresponding pickups based on our postulated model (equation (6)) and the specified total demand rates. Simulations were performed for six scenarios regarding data generation and model specification. Each scenario was replicated 512 times, resulting in 512 estimated models per scenario.
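A strongly simplified version of this data generation can be sketched in Python. The sketch considers a single isolated cell, so the spatial linking of equation (6) is replaced by the assumption that pickups equal the demand capped at the number of cars available; the demand rate of 0.3 and the seed are hypothetical.

```python
import math
import random

def rpois(rng, rate):
    """Draw one Poisson variate with Knuth's multiplication method (small rates)."""
    limit = math.exp(-rate)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def simulate_cell(n_intervals, car_rate, demand_rate, seed=1):
    """Return (cars, pickups) per interval; pickups are demand capped at the cars."""
    rng = random.Random(seed)
    data = []
    for _ in range(n_intervals):
        cars = rpois(rng, car_rate)
        demand = rpois(rng, demand_rate)
        data.append((cars, min(demand, cars)))
    return data

sim = simulate_cell(n_intervals=4321, car_rate=0.16, demand_rate=0.3)
# Intervals in which every available car was rented are right censored.
censored = sum(1 for cars, picks in sim if cars > 0 and picks == cars)
```

Counting the sold-out intervals shows how often the observed pickups are only a lower bound on the demand, which is the situation the censored likelihood is designed for.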
For the baseline scenario, we used a car rate per cell of 0.16, which corresponds to the average of the total demand rates on weekdays between 0 and 18 o'clock used in this simulation study. Demand areas, which need to be found to construct the vectors $x_{jt}$ of the postulated model, included adjacent and diagonally adjacent cells of the cells with cars ($r_{\max} = \sqrt{2 \cdot 0.2^2} \approx 0.28$). Data for 4,321 hourly intervals were generated, which corresponds to the number of hours of the first half year of 2021 (including summer time changeover). Fitted models from the baseline scenario are correctly specified and therefore should identify the data-generating total demand rates.
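The relation between $r_{\max}$ and the demand area can be checked with a short Python sketch. It assumes that the 25 cells are numbered row by row from 1 to 25, which is consistent with cell 13 being the center cell; the function names and the small numerical tolerance are illustrative choices.

```python
import math

SIDE = 0.2  # side length of a cell
N = 5       # cells per row and column

def center(cell):
    """Center of a cell numbered 1..25 row by row (numbering is an assumption)."""
    row, col = divmod(cell - 1, N)
    return ((col + 0.5) * SIDE, (row + 0.5) * SIDE)

def demand_area(cell, r_max, tol=1e-9):
    """Cells whose centers lie within r_max of the center of the given cell."""
    cx, cy = center(cell)
    return [
        j for j in range(1, N * N + 1)
        if math.hypot(center(j)[0] - cx, center(j)[1] - cy) <= r_max + tol
    ]

r_max = math.sqrt(2 * SIDE ** 2)      # distance to a diagonally adjacent cell
baseline = demand_area(18, r_max)     # adjacent and diagonally adjacent cells
too_small = demand_area(18, 0.2)      # only cells sharing an edge
too_large = demand_area(18, 0.4)      # also cells two steps away along a row or column
```

Under this numbering, a car at cell 18 has a baseline demand area of nine cells, whereas the misspecified radii 0.2 and 0.4 shrink it to five cells or extend it to twelve.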
For the alternative scenarios, we halved and doubled the car rate, misspecified the $r_{\max}$ parameter for estimation ($r_{\max} = 0.2$ and $r_{\max} = 0.4$ instead of $r_{\max} = 0.28$), halved and doubled the number of time intervals, used the smoothing approach with 3 × 3 supporting points, and considered demand areas that include all adjacent or diagonally adjacent cells, including cells that have a closer car somewhere else.

A. Results
Figure 12 shows the distribution of the estimated parameters for the baseline scenario. The estimates vary around the predefined total demand rates, suggesting that the estimation procedure is able to identify the data-generating total demand rates if the model is correctly specified.
Figure 13 compares the estimated parameters for cells 1 (lowest total demand rate) and 13 (highest total demand rate) between the baseline and three alternative scenarios. The left panel shows the effect of changing the number of cars available. It can be seen that increasing the number of cars increases the accuracy of the estimates. Interestingly, the accuracy of the estimates for fewer cars is about the same as or slightly better than for the baseline scenario.
The middle panel of Figure 13 shows the effect of misspecifying the parameter $r_{\max}$, i.e., the maximum deviation users would accept from the preferred pickup cell. While misspecifying $r_{\max}$ seems not to affect the estimation of the total demand rate of cell 1, it does affect that of cell 13. Specifically, choosing $r_{\max}$ too small results in an upward bias, and choosing $r_{\max}$ too large results in a downward bias. This seems plausible because increasing $r_{\max}$ implies that the total demand is spread over more cells.
The right panel of Figure 13 shows the effect of decreasing or increasing the number of observations. As expected, the accuracy improves with increasing data size.
The left-hand side of Figure 7 compares parameter estimates between the baseline scenario and a smoothed estimation with 9 supporting points, which were evenly distributed within the surface of the 5 × 5 grid. As could have been expected, the smoothing approach decreases the variance of the estimates; however, in the case of cell 1, it introduces a bias by overestimating the total demand rate.
The right-hand side of Figure 7 compares the parameter estimates between the overlapping and the closest car demand areas. In both cases, the estimates vary around the data-generating total demand rates. The estimates for the overlapping areas around cars have slightly higher variance.

Figure 2: Example situation in a 5 × 5 grid with cars at cell 18 and the corresponding demand area (yellow).

Figure 5: Average number of car pickups per hour for the Mobility-Go data 2021, along daytime, weekday, and month and separately by 9 self-defined districts of the city (south-west). The black line represents the average of the 9 colored lines.

Figure 6: Estimated spatial distribution of total demand rates per hour for renting cars at the baseline setting (Mondays in September 2021, 0 to 6 o'clock AM) from the different models $M_1$ to $M_4$.

Figure 7: Simulation results comparing the baseline with alternative scenarios (smoothing and specification of demand areas): estimated parameters (boxplots) and corresponding true values (red).

Figure 9: Estimated total increase in total demand of months, weekdays, and day intervals as compared to a Monday in September between 0 and 6 o'clock (reference).

Figure 11: Total demand rates used for the simulation study. (a) Spatial distribution of the demand rates on weekdays between 0 and 18 o'clock. (b) The same demand rates along the cell numbers and effects for evenings (time.eve) and weekend days (time.we).

Figure 12: Simulation results for the baseline scenario: estimated parameters (boxplots, black) and corresponding true values (red).

Figure 13: Simulation results comparing the baseline with alternative scenarios (car rate, misspecification of $r_{\max}$, and data size): estimated parameters (boxplots) and corresponding true values (red). For reasons of clarity, only the parameters for the cells 1 (lowest total demand rate) and 13 (highest total demand rate) are shown.
18,t > $\sum_{i \in \{12, 13, \ldots, 24\}} D_{it}$), then the expected number of pickups from cell 18 is equal to the sum of the expected total demand of the individual cells.

Table 1: Daily average of the number of pickups (P), the number of cars available for rent (C), and the demand factor (P/C) by district, and the average over the districts.

Table 2: Goodness-of-fit measures of the models $M_1$ to $M_4$. The RMSE and the MAE were computed using the differences between the observed and the expected number of pickups for each cell and interval with cars, $p_{jt} - E_M(P_{jt} \mid c_{jt})$.