Maximum Closeness Centrality k-Clubs: A Study of Dock-Less Bike Sharing

In this work, we investigate a new paradigm for dock-less bike sharing. Recently, it has become essential to accommodate connected and free-floating bicycles in modern bike-sharing operations. This change comes with an increase in the coordination cost, as bicycles are no longer checked in and out from bike-sharing stations that are fully equipped to handle the volume of requests; instead, bicycles can be checked in and out from virtually anywhere. In this paper, we propose a new framework for combining traditional bike stations with locations that can serve as free-floating bike-sharing stations. The framework we propose here focuses on identifying highly centralized $k$-clubs (i.e., connected subgraphs of restricted diameter). The restricted diameter reduces coordination costs as dock-less bicycles can only be found in specific locations. In addition, we use closeness centrality as this metric allows for quick access to dock-less bike sharing while, at the same time, optimizing the reach of service to bikers/customers. For the proposed problem, we first derive its computational complexity and show that it is NP-hard (by reduction from the 3-Satisfiability problem), and then provide an integer programming formulation. Due to its computational complexity, the problem cannot be solved exactly in a large-scale setting, such as that of an urban area. Hence, we provide a greedy heuristic approach that is shown to run in reasonable computational time. We also present and analyze a case study in two cities of the state of North Dakota: Casselton and Fargo. Our work concludes with a cost-benefit analysis of both models (docked vs. dock-less) to suggest the potential advantages of the proposed model.


Introduction
Bike-sharing systems (BSSs) have become a prominent mode of transportation around the world, especially in urban areas. BSSs bring a number of advantages to existing transportation networks. Among them, we note increased personal mobility, reduced transportation costs, reduced traffic congestion, decreased use of and dependence on fossil fuels, increased public transit visibility, enhancement of downtown areas along with the economic development that follows, health benefits, and increased environmental awareness [1][2][3].
Since their introduction in Europe in the 1960s, BSSs have undergone a series of developments. The most recent of these developments is referred to as the dock-less or free-floating BSS. In a dock-less BSS, residents who are interested in using a bicycle can check bicycles out and in throughout an urban area using nothing more than their smartphones. The bicycles are equipped with a global positioning system (GPS), thus enabling users to locate the nearest available bicycle and to unlock it with the use of an app. Riders are then allowed to drop off (check back in) the bicycle anywhere within a geographic area (referred to as the geo-fenced area). Within that area, bicycles are allowed to be parked legally. The trip ends as soon as the checked-out bicycle is parked and securely locked anywhere in the geo-fenced area.
As is obvious from the description, dock-less or free-floating bicycles offer enhanced convenience and improved accessibility, which in turn translates to increased personal mobility, compared to conventional bike sharing. The enhanced convenience stems from the fact that users no longer have to wait for a parking spot to become available in a bicycle dock so as to return their bicycle after the trip (especially in heavily trafficked areas). However, as with many other technologies, dock-less BSSs also present new challenges. The one we deal . . . used to maintain some order. This policy would help mitigate the hodgepodge of problems that can result from adopting a dock-less system [11].
As our framework will optimize the reach of dock-less bike-sharing operations, while also restricting the size of the system, our model will also alleviate some of the problems involved with rebalancing. To further elaborate on our model, we offer Figure 1. One side of the figure presents a conventional dock-based BSS. The transportation network is presented with nodes and edges (representing streets), with the bicycle docks being noted with blue rectangular nodes: observe that docks are not necessarily located at nodes only, but can also be located along the edges of the network. The other side of the figure shows our proposed framework. We now allow for a geo-fenced area (represented by the shaded area) where users can check bicycles out and in from anywhere. This allows more people to have fast access to bicycles and reduces the need for docks within that area. As a result, these docks could be moved to other areas, beyond the geo-fenced area, to extend bike sharing to other residents. In addition, the area where bicycles can be dropped off anywhere is significantly smaller, making it easier for operators to find and collect bicycles so as to rebalance their inventory. Last, we note that the shaded area of the network forms a 2-club (i.e., a subgraph of diameter equal to 2).
We can summarize our contributions in the following three components: (i) First, we use the $k$-club concept, combined with closeness centrality, so as to identify candidate locations that could be geo-fenced. We also allow for a weight at each node of the network: this modification speeds up the $k$-club formation in the heuristic algorithm devised. (ii) Then, we turn our attention to a real-world application. We present an experimental study on the cities of Fargo and Casselton. In the study, we analyze the exact optimization model and the heuristic devised and compare them in terms of computational time and solution obtained. In each $k$-club obtained for varying values of $k$, riders (commuters) are able to reach any other neighboring site within a fixed distance (controlled by $k$), implying that the virtual locations provide better accessibility to demand points. (iii) Last but not least, we present potential strategies for operators to further manage the inventory by applying incentives and making bicycle collection and rebalancing more cost-effective.
The remainder of the paper is organized as follows: the next section reviews related literature on BSS design and discusses how it relates to the objectives of this work. Then, we provide the necessary mathematical background, define all notation used, and derive the computational complexity of the problem studied. The next section presents the mathematical formulation that can be solved using a commercial optimization solver and also proposes a greedy heuristic to solve it. In the following section, we discuss two computational experiments that reveal our findings in two real-world transportation networks: namely the smaller city of Casselton, ND, and the larger city of Fargo, ND. However, due to the size of the network in Fargo, we only test and present the results of the heuristic approach. The last section of the paper is devoted to our conclusions and a brief overview of future plans.

Related Works
There is a plethora of studies on bike-sharing systems. These studies generally fall into three major areas: (1) General quantitative analysis; (2) Facility location problems; (3) Redistribution problems. The first body of literature focuses on the quantitative analysis of existing BSSs, analyzing their characteristics and examining empirical evidence of usage patterns in places including Dublin [12], Beijing [13], Montreal [14], Brisbane [15], Helsinki [16], Paris [17], Switzerland [18], and New York [19]. Nair et al. examine several aspects of the Velib BSS in Paris, France [17]. Their findings show that integrating transit and BSS can yield higher utilization. Bachand-Marleau et al. surveyed residents of Montreal, Quebec, Canada, to determine the factors leading to BSS use as well as the frequency of use [14]. Campbell and Brakewood quantify the impact that BSSs have on bus ridership in New York City [19]. They conclude that either bike-sharing members substitute bike sharing for bus trips or the implemented BSS led to travel behavior changes of nonmembers. Audikana et al. studied the impact of a BSS in a small city (fewer than 100,000 residents) in Switzerland [18]. They suggested that BSS network density, along with the partnerships developed, plays a critical role in its success.
The second stream of literature focuses on the strategic design of BSSs, where the ultimate goal is to find the locations, capacity, and coverage areas of BSSs [20]. These studies try to determine the number and location of stations, fleet size, and network structure of the underlying BSSs. They consider various objectives, including the maximization of demand coverage, the minimization of transportation cost, and the minimization of the overall cost. Lin et al. address the strategic design problem by formulating it as a hub location inventory model [21]. In their work, they consider both total costs (travel cost of users, bike inventory costs, facility costs) and service level (bicycle lanes) in their model. The authors then propose a heuristic method to find high-quality solutions. In a similar study, Lin and Yang propose a nonlinear integer method to determine the optimal location, bike lanes, and routes [22]. Their model assumes a penalty for uncovered demand but does not consider relocation (rebalancing) of bikes. Martinez et al. present a mixed integer linear program to maximize the net revenue by simultaneously optimizing the locations of stations, the fleet size, and bike relocation activities for a regular operation day [23]. Nair and Miller-Hooks formulate an equilibrium network design model to address the same objective as the previous study [24]. They propose a metaheuristic solution approach to overcome the intractability of the exact solution for real-life, large-scale networks. In another study, Reijsbergen identifies alternative locations with the aid of spatial data and simulation techniques: more specifically, a data-driven approach is presented to determine how attractive city areas are for station placement [25]. The literature offers other methodologies, not based on facility location models, to define the location of the stations. Garcia-Palomares et al. develop a GIS-based model to calculate the spatial distribution of the potential demand for trips and find the locations of bike stations using the location-allocation modeling approach [26].
Finally, a third group of the literature is associated with the relocation of bicycles in a BSS. The problem arises from demand imbalance leading to accumulation of bicycles at some stations (and consequently, limited bicycle availability at other stations). Vogel and Mettfeld apply a system dynamics method to model the effect of dynamic repositioning on the service level [27]. Shu et al. develop a stochastic network flow model with proportionality constraints to determine bike flow in a bike-sharing network. They also present a numerical analysis on the Singapore BSS and find that period distribution is the most effective for system performance [28]. Forma et al. develop a 3-step heuristic and mixed integer linear programming model for repositioning [29]. The first step involves clustering the stations based on geographic location and inventory levels using a heuristic method. In the second and third steps, they employ a mixed integer linear program to find the best routes for repositioning vehicles. Alvarez-Valdes et al. address the static repositioning problem using simulation techniques in two stages [30]. In the first stage, they estimate the levels of unsatisfied demand for a set of stations in a given period. In the second stage, they use the estimation as an input to their redistribution algorithm. Schuijbroek et al. combine service level requirements and vehicle routes to rebalance the inventory [31]. They propose a "cluster-first route-second" heuristic considering the service level feasibility and approximate routing costs simultaneously. Yan et al. develop four planning models for leisure-based BSSs given deterministic and stochastic demands [32]. They apply nonlinear time-space network models to integrate bike repositioning and vehicle routing with user dissatisfaction estimations. In a recent study, Celebi et al. propose a hybrid approach jointly considering location decisions and capacity allocation [33]. Their goal is to find the optimal configurations of a BSS by combining set-covering and queuing models to determine service levels.

Journal of Advanced Transportation 4
Most of the previous work that addresses physical bike station location problems focuses on issues such as station capacity decisions and demand predictions, among others. To the best of our knowledge, this paper is the first to suggest a solution to problems that have arisen from the emergence of dock-less bike-sharing systems with the aid of a $k$-club. The ultimate goal is to locate potential hubs in a city, referred to as $k$-clubs, by geo-fencing a suitably small area of a city.

Definitions and Notation
Let $G(V, E)$ be an undirected network, with $V$ symbolizing the vertices (intersections of the transportation/biking network) and $E$ the edges (streets in the transportation/biking network). Every node $i \in V$ is assumed to be assigned a nonnegative parameter, $w_i \ge 0$, referred to as the weight at this specific location. This weight parameter can be used to capture different aspects of the problem at hand, depending on the application. For example, the weight of a node could capture socio-economic attributes such as population, points of interest in the vicinity, number of jobs, etc. Another possible way to model and use the weight parameter is through the interactions between different pairs of origin and destination, such as traffic flows (outgoing traffic from an origin node, incoming traffic to a destination node, or simply a summation of outgoing and incoming traffic at a specific node). Either way, we assume a distinct, nonnegative number explaining the level of attraction for that node.
We say that $(i, j) \in E$ if there exists an edge starting from node $i$ and ending in node $j$, in which case we write that $a_{ij} = 1$. We also denote with $N(i) = \{j \in V : a_{ij} = 1\}$ the open neighborhood of node $i$. We write that the diameter of graph $G$ is $D$ if the maximum shortest path distance between two nodes in the graph is $D$. Clearly, all pairs of nodes in the graph will be located at a distance $\ell$ from one another with $0 \le \ell \le D$. Let $d_{ij}$ be the distance between two nodes $i$ and $j$, and $d_{Sj} = \min_{i \in S} d_{ij}$ the distance of a node $j$ to a set of nodes $S$. Then, for any set of nodes $S \subseteq V$, we define a function $f : 2^V \to \mathbb{R}$ as $f(S) = \sum_{j \in V} w_j d_{Sj}$. Last, we use $\mathcal{P}$ to denote all paths of length less than or equal to $k$. Similarly, $\mathcal{P}_{ij}$ is the set of all paths of length at most $k$ connecting two nodes $i$ and $j$ ($i \ne j$). Clearly, we have that $\mathcal{P}_{ij} \subseteq \mathcal{P}$. The decision version of the problem we are trying to solve is provided in Definition 2. Before that, we need to provide the definition of a $k$-club.
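As a minimal sketch of these definitions, the snippet below computes $d_{Sj}$ with a multi-source breadth-first search and then evaluates $f(S)$, assuming that $f(S)$ is the weight-times-distance sum over all nodes (which is how the objective is described later in the text), that distances are hop counts, and that the graph is connected. The function names, the path graph, and its weights are hypothetical and chosen only for illustration.

```python
from collections import deque

def distance_to_set(adj, S):
    """Multi-source BFS: hop distance from each reachable node to the nearest node of S."""
    dist = {s: 0 for s in S}
    queue = deque(dist)
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def f(adj, w, S):
    """Assumed objective: sum over all nodes j of w_j * d_Sj (graph assumed connected)."""
    d = distance_to_set(adj, S)
    return sum(w[j] * d[j] for j in adj)

# Hypothetical path graph 1-2-3-4-5 with illustrative weights.
adj = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
w = {1: 4, 2: 1, 3: 7, 4: 2, 5: 3}

print(f(adj, w, {3}))     # 4*2 + 1*1 + 7*0 + 2*1 + 3*2 = 17
print(f(adj, w, {2, 3}))  # 4*1 + 1*0 + 7*0 + 2*1 + 3*2 = 12
```

Growing the set from {3} to {2, 3} lowers the value because node 1, which carries a comparatively large weight, ends up one hop closer to the set.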
Detecting a $k$-club of maximum cardinality is a well-known NP-hard problem [34,43]. Hence, it is expected that
One of the gaps in the current state-of-the-art is that most studies focus only on either user accessibility or rebalancing strategies to manage supply and demand within an urban area. As described in the Introduction section, our contribution is to fill exactly that gap and propose a framework that allows for both high accessibility for the users and reliable and cost-effective rebalancing and coordination for BSS operators. Our proposed model relies on the definition of a $k$-club from graph theory, whose definition and related literature are offered in the next paragraphs.
Given a simple undirected graph, a $k$-club is a subset of vertices inducing a subgraph of diameter at most $k$. These structures represent cohesive subgroups in social network analysis with common applications in network-based data mining and clustering. Several authors have discussed mathematical formulations for identifying $k$-clubs of maximum cardinality, as well as various methods to locate $k$-clubs within a network [34][35][36]. In addition to using $k$-clubs, our work also focuses on the centrality of a group of a specific structure. Group centrality, introduced by Everett and Borgatti, aims to identify groups or classes of high centrality [37]. Centrality measures aim to characterize the importance of an element in a network. They typically fall into three main classes [38], referred to as degree (i.e., the number of connections of a specific element in the network), closeness (i.e., how close an element is to every other element in the network), and betweenness centrality (i.e., how important an element is in the communications between any two other elements in the network, assuming all such communications take place using the shortest path between the elements).
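The $k$-club definition refers to the diameter of the induced subgraph, so paths may not leave the chosen vertex set. The short sketch below illustrates that point; the function names and the 5-cycle example are illustrative assumptions, not part of the original study.

```python
from collections import deque

def induced_diameter(adj, nodes):
    """Diameter of the subgraph induced by `nodes`; float('inf') if it is disconnected."""
    nodes = set(nodes)
    worst = 0
    for s in nodes:
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                # Only walk through vertices that belong to the candidate set.
                if v in nodes and v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        if len(dist) < len(nodes):
            return float("inf")
        worst = max(worst, max(dist.values()))
    return worst

def is_k_club(adj, nodes, k):
    """True if `nodes` induces a connected subgraph of diameter at most k."""
    return induced_diameter(adj, nodes) <= k

# Hypothetical 5-cycle 1-2-3-4-5-1.
cycle = {1: [2, 5], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4, 1]}

print(is_k_club(cycle, {1, 2, 3}, 2))  # True: the induced path 1-2-3 has diameter 2
print(is_k_club(cycle, {1, 3}, 2))     # False: 1 and 3 are 2 hops apart only via node 2
```

The second call shows why a pair of vertices that is close in the full graph may still fail the test once the connecting vertex is left out of the set.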
More recently, researchers have focused on highest betweenness groups [39]. Finally, another extension of identifying highly centralized groups has to do with the added restriction that the group induces a subgraph "motif", such as being a complete subgraph/clique [40,41], or inducing a star [42].
In this paper, we propose an integer programming formulation and a heuristic algorithm to find the most centralized $k$-club in a transportation network based on closeness centrality. The resultant $k$-club consists of a set of nodes in which the maximum traversing distance is $k$ hops (by definition), and the total population-weighted distance to a node in the $k$-club is minimized (as it will be the $k$-club with maximum closeness centrality). Based on this result, a BSS operator could then designate the area covered by the $k$-club as the geo-fenced area where dock-less bike sharing is allowed and satisfy the following objectives: (1) Maximize demand coverage (as the area obtained is the most centralized, with respect to closeness centrality); (2) Minimize distances traversed for rebalancing operations (as the geo-fenced area is of restricted diameter); (3) Offer a large, convenient geographical area for checking in/out the available bikes without need for physical stations. As the success of a BSS heavily depends on the network of bike paths and bike stations in the community, this is an important objective facilitated by our framework.
our problem, as described in Definition 2, will also be shown to be NP-complete, rendering the optimization version NP-hard. This is exactly what we show in Theorem 1. Before we do that, we define 3-Satisfiability (3SAT), a famous NP-complete problem.
The problem can be shown to be in NP, as both verifying that a subset $S$ forms a $k$-club and that $f(S) \le \ell$ can be done in polynomial time.
Now consider an instance of 3SAT with $m$ clauses on $n$ literals. We will reduce it to a version of our problem using the following gadget/transformation. First, create two nodes for every literal and its complement ($V_\ell$); we connect every node by a chain of $k - 1$ nodes ($V_{\ell \times \ell}$) to every other node, but its complement (this forms edge set $E_\ell$). Moreover, create one node for every clause ($V_c$); connect each node in $V_c$ by a chain of $M - 1$ nodes ($V_{c \times \ell}$) to the literals that the corresponding clause consists of ($E_c$), where $M \gg k$. Finally, assume that all nodes in $V_c$ have a weight of 1, while all other nodes in $V \setminus V_c$ have a weight of 0. We will show that the 3SAT instance has a feasible assignment if and only if the constructed graph $G(V, E)$ with $V = V_\ell \cup V_{\ell \times \ell} \cup V_c \cup V_{c \times \ell}$ and $E = E_\ell \cup E_c$ has a $k$-club $S \subseteq V$ such that $f(S) \le m \cdot M$. The gadget is also shown in Figure 2.
Assume that the 3SAT instance has a feasible assignment. Then, it is easy to see that, by construction, the nodes corresponding to the literals in the assignment form a $k$-club (let them be $S$). Moreover, the assignment satisfies all clauses, hence there exists at least one node in $S$ that is at a distance of $M$ from each node in $V_c$. Hence, we have that $f(S) \le m \cdot M$.
For the other direction of the proof, assume there exists a $k$-club $S \subseteq V$ such that $f(S) \le m \cdot M$; yet, there exists no feasible assignment of literals to satisfy the 3SAT instance. We distinguish between four cases:
(1) $S$ consists of exactly one node $u_\ell \in V_\ell$ and nodes in $V_{\ell \times \ell}$ in as many as all $2n - 1$ chains connecting them to all other literals (but its complement).
(2) $S$ consists of exactly one node $u_c \in V_c$ and nodes in $V_{c \times \ell}$ in as many as 3 chains connecting $u_c$ to the literals clause $c$ contains.
(3) $S$ consists of only nodes in $V_{c \times \ell}$ in exactly one chain connecting a literal-node $u_\ell \in V_\ell$ to a clause-node $u_c \in V_c$.
(4) $S$ consists of several nodes in $V_\ell$, along with the nodes in $V_{\ell \times \ell}$ in all chains necessary to connect them within $k$ hops.
Case 1. Let $u_\ell \in V_\ell$ be the literal-node in $S$. From the nodes in the chains connecting $u_\ell$ to the other literals (but the node corresponding to its complement), one chain can have at most $\hat{k} \le k - 1$ nodes in $S$ and the remaining chains can have at most $k - \hat{k}$, where $1 \le \hat{k} \le \lceil k/2 \rceil$. Now, at best, this literal can satisfy at most $m - 1$ clauses (since by assumption there exists no satisfiable assignment) whereas the literal that satisfies the remaining clause is located within a distance of $k - \hat{k}$ from $S$. Hence, we have: . . . This contradicts the assumption that $S$ is a $k$-club with $f(S) \le m \cdot M$.
Case 2. Let $u_c \in V_c$ be the clause-node in $S$. Since we have a 3SAT instance, $u_c$ has exactly 3 chains around it, and $S$ contains at most $\hat{k} \le k$ nodes from one chain with the remaining chains having at most $k - \hat{k}$ nodes in $S$. The three literal-nodes connected through the chains to clause-node $u_c$ can satisfy at most $m - 2$ other clauses (apart from $c$). Hence, at best, we have: . . . By assumption, though, we have that $f(S) \le m \cdot M$, which, combined with inequality (3), leads to: . . . which is a contradiction.
Case 3. A similar contradiction to Case 2 is obtained when the $k$-club consists only of nodes in $V_{c \times \ell}$. Let the $k$-club be at a distance of $\hat{k}$ from the clause-node and at a distance of $M - k - \hat{k}$ from the literal-node $\ell$ of that chain. We then have one clause at a distance of $\hat{k}$, at most $m - 2$ clauses (as, otherwise, literal $\ell$ satisfies all clauses, a contradiction) at a distance of $M - k - \hat{k} + M$, and at least 1 clause at a distance of, at best, $M - k - \hat{k} + k + M$, leading to: . . . This leads to the same contradiction as in Case 2.

Mathematical Formulation
In this section, we present our mathematical formulation and a greedy heuristic algorithm to solve larger scale instances. We also present some computational results on generated and real-life instances for smaller $k$-clubs ($k = 2, 3$).
Formulation. We begin this section with the definition of our variables. We will use two sets of binary variables, defined as follows.
. . . where $w_i$ and $w_j$ are the weight parameters (or importance) of the origin and destination locations and $d_{ij}$ is (as defined earlier) the distance between the origin $i$ and destination $j$. In this work, we slightly change the interaction term in the numerator given in (9). Starting from some origin $i$, we search all adjacent (nearby) locations $j \in N(i)$ so as to add one to the $k$-club being built. Since the term $w_i$ is the same for all considered locations (as $(i, j) \in E$), we drop it from consideration and hence are left with a ratio of the importance of candidate location $j$ (given in the weight parameter $w_j$) versus the distance. The algorithm is initialized with all nodes in the node set being in the candidate list, $\mathcal{I}$, and the starting $k$-club, $S$, being empty. Then, for every node in the candidate list, we "add" it to $S$ and calculate the shortest paths from every node to any node inside $S$. Then, the ratio becomes the summation of fractions $w_j / 2^{d_{Sj}}$. The node with maximum ratio is then actually added to $S$, and the candidate list is updated with only neighboring nodes that satisfy the $k$-club criterion. A pictorial example and its calculations are provided in Example 1.
Example 1. Assume that we have the graph of Figure 3 with weights $w_1 = w_2 = w_6 = w_7 = 5$, $w_3 = w_4 = w_5 = 10$, and we are looking for a 2-club. Initially, $\mathcal{I}$ contains all 7 nodes and $S$ is empty.
Starting from node 1, we see that it is located at a distance of 0 from itself, a distance of 1 from nodes 2 and 3, a distance of 2 from node 4, a distance of 3 from node 5, and a distance of 4 from nodes 6 and 7. Hence, we have that $r_1 = 5/2^0 + 5/2^1 + 10/2^1 + 10/2^2 + 10/2^3 + 5/2^4 + 5/2^4 = 16.875$. In the example, it is easy to see that exactly the same is true for nodes 2, 6, and 7.
Similarly, for nodes 3 and 5, we have $r_3 = r_5 = 10/2^0 + 10/2^1 + 5/2^1 + 5/2^1 + 10/2^2 + 5/2^3 + 5/2^3 = 23.75$. Finally, for . . .
We can now proceed to describe the mathematical formulation, shown in (7). It is based on the maximum $k$-club chain formulation presented in [34]. Newer formulations for identifying $k$-clubs (as in, e.g., [44]) can also be employed, but are not explored here. The objective function in (8a) aims to minimize the total weighted distance every node outside the $k$-club needs to traverse until it accesses a node in the $k$-club. The constraint family in (8b) ensures that a path can only be within the $k$-club if every node that belongs to it belongs to the $k$-club. Constraints (8c) enforce that every node in the graph is at a distance $0 \le d \le D$ from a node in the $k$-club. The following constraints, shown in (8d), recursively enforce that a node can be at a distance of $\ell + 1$ from the $k$-club only if it neighbors a node that is itself located at a distance of $\ell$. The constraint family in (8e) ensures that two nodes cannot both belong to the $k$-club unless there exists at least one path of $k$ hops or fewer connecting them that lies within the $k$-club. Finally, the binary nature of all variables involved is enforced with (8f) and (8g).
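To make the verbal description of the constraint families easier to follow, one model that is consistent with it could be written as below. This is only a sketch under stated assumptions, not necessarily the authors' exact formulation: the variable names $x_{i\ell}$ (node $i$ lies at distance $\ell$ from the $k$-club, with $\ell = 0$ meaning membership) and $y_P$ (path $P$ lies entirely inside the $k$-club) are hypothetical, and the equality form of the third line is an assumption.

```latex
\begin{align*}
\min \quad & \sum_{i \in V} \sum_{\ell = 0}^{D} \ell \, w_i \, x_{i\ell}
  && \text{(role of (8a))} \\
\text{s.t.} \quad
  & y_P \le x_{i0}
  && \forall P \in \mathcal{P},\ \forall i \in P
  \quad \text{(role of (8b))} \\
  & \sum_{\ell = 0}^{D} x_{i\ell} = 1
  && \forall i \in V
  \quad \text{(role of (8c))} \\
  & x_{i,\ell+1} \le \sum_{j \in N(i)} x_{j\ell}
  && \forall i \in V,\ \ell = 0, \dots, D - 1
  \quad \text{(role of (8d))} \\
  & x_{i0} + x_{j0} \le 1 + \sum_{P \in \mathcal{P}_{ij}} y_P
  && \forall i, j \in V,\ i \ne j
  \quad \text{(role of (8e))} \\
  & x_{i\ell} \in \{0, 1\}, \quad y_P \in \{0, 1\}
  && \text{(roles of (8f) and (8g))}
\end{align*}
```

Under these assumptions, a node assigned to level $\ell + 1$ needs a neighbor at level $\ell$, so every assigned level is at least the true distance to the $k$-club, and minimization pushes each node down to exactly that distance. The number of $y_P$ variables grows with the number of paths of length at most $k$, which is in line with the later remark that the model becomes very large on some networks.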

Greedy Heuristic.
The above formulation is difficult to solve, as the underlying problem was shown to be NP-hard (with a decision version being NP-complete per Theorem 1). Hence, along with solving the formulation using a commercial solver, we also devise a practical heuristic. In our case, we opted for a greedy heuristic that always grows the $k$-club at hand by choosing a node with a maximum weight-to-distance ratio: that is, if a node is located near many nodes with large weights, it is more likely to be selected. This approach is shown in Algorithm 1. The backbone of the heuristic method is the spatial interaction model known as the gravity model (as it is similar to Newton's law of gravity). Its basic formula is as follows:
of nodes increases, the growth rate is much slower for the heuristic algorithm. This is verified by Table 1 for identifying highly central 2- and 3-clubs. Note that, with the exception of the Berlin graph, the heuristic approach is on average 3 and 7 times faster than the exact optimization model for $k = 2$ and $k = 3$, respectively. The case of the Berlin network is very important. In this transportation network, the exact optimization fails to find a solution within reasonable computational time, and instead spends hours trying to prove optimality. This happens because the diameter of the graph is large, and the number of binary variables in model (6) becomes prohibitively large.

Case Study
In this section, we investigate two case studies from the state of North Dakota, in the cities of Fargo and Casselton. Case studies and real-world visualization are necessary to put the problem in its related context and understand its implications. However, due to the computational complexity of our problem, reaching a solution within reasonable computing time is challenging. Hence, the exact optimization model of (6) was only solved on the (smaller) city of Casselton, whereas in the (larger) city of Fargo, we only present the results of the heuristic (as in Algorithm 1).

Data Description.
Casselton is a city in the state of North Dakota, with a population of 2,329 in the 2010 census. To the best of our knowledge, there is no bike-sharing program planned for deployment in the near future. Figure 4 illustrates the overall geography of the city and the population distribution in proportionally graduated circles. The network for the city of Casselton was built with TIGER/Line® road data and block population with ArcGIS 5.0. All roads were converted to sets of vertices and edges representing intersections and road segments, respectively. There are $|V| = 400$ vertices and $|E| = 523$ edges in the resulting graph. The block population polygons are turned into point features for weighting the graph vertices. According to a National Association of City Transportation Officials (NACTO) report [6], to achieve an increase in ridership as well as in overall system utility, bike-sharing kiosks should be located no more than 1000 feet apart from one another. Therefore, every single vertex has the potential to become a dock-less bike station within 1000 feet. Then, each vertex is weighted based on its closeness to the population points.
For the city of Fargo, due to its size, only the greedy heuristic of Algorithm 1 was put to the test. The population in Fargo is 105,545. At the moment, a bike-sharing system is in place, with 11 stations in the locations shown in Figure 5 with a triangle. The same figure also presents the geography of the city and the population in proportional circles. The network for the city of Fargo is obtained in the same way as the one for Casselton. The final graph contains $|V| = 2989$ vertices and $|E| = 4302$ edges, which is indeed large-scale for the exact optimization solver.
The key realization here is that the distances are no longer between the candidate node and every other node in the graph, but instead between $S$ including the candidate node and every other node in the graph. We also note that node 5 will have exactly the same ratio, by construction of the example. Let us add node 3 to $S$ (hence, $S = \{3, 4\}$), and $\mathcal{I} = \{1, 2, 5\}$.
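The greedy procedure described in prose can be sketched in a few lines of Python. The snippet is an illustration under stated assumptions rather than a reproduction of Algorithm 1: the edge list stands in for Figure 3 and is inferred from the distances quoted in the example, ties are broken by the smallest node label, and the loop stops once no neighboring node can be added without violating the diameter bound. All function names are hypothetical.

```python
from collections import deque

def hop_distances(adj, sources, allowed=None):
    """BFS hop distance from the nearest source, optionally staying inside `allowed`."""
    dist = {s: 0 for s in sources}
    queue = deque(dist)
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist and (allowed is None or v in allowed):
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def is_k_club(adj, nodes, k):
    """True if `nodes` induces a connected subgraph of diameter at most k."""
    nodes = set(nodes)
    for s in nodes:
        dist = hop_distances(adj, [s], allowed=nodes)
        if len(dist) < len(nodes) or max(dist.values()) > k:
            return False
    return True

def ratio(adj, w, club):
    """Sum of w_j / 2^(d_Sj) over every node j reachable from the set `club`."""
    dist = hop_distances(adj, club)
    return sum(w[j] / 2 ** d for j, d in dist.items())

def greedy_k_club(adj, w, k):
    club, candidates = set(), set(adj)
    while candidates:
        # Tentatively add each candidate and keep the one with the largest ratio;
        # max() returns the first maximum, so ties go to the smallest label.
        best = max(sorted(candidates), key=lambda v: ratio(adj, w, club | {v}))
        club.add(best)
        frontier = set().union(*(adj[u] for u in club)) - club
        candidates = {v for v in frontier if is_k_club(adj, club | {v}, k)}
    return club

# Assumed edge list, inferred from the distances quoted in Example 1.
edges = [(1, 2), (1, 3), (2, 3), (3, 4), (4, 5), (5, 6), (5, 7), (6, 7)]
adj = {v: set() for v in range(1, 8)}
for a, b in edges:
    adj[a].add(b)
    adj[b].add(a)
w = {1: 5, 2: 5, 3: 10, 4: 10, 5: 10, 6: 5, 7: 5}

print(ratio(adj, w, {1}))        # 16.875, matching r_1 in the example
print(ratio(adj, w, {3}))        # 23.75, matching r_3 in the example
print(greedy_k_club(adj, w, 2))  # {3, 4, 5} on this assumed graph
```

On this assumed graph, node 4 has the largest initial ratio and is selected first, node 3 wins the tie with node 5 in the second round, and the candidate list then becomes {1, 2, 5}, in agreement with the example.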

Computational Results
The developed algorithm and optimization model were implemented in Python and all numerical experiments were conducted on a Lenovo laptop with an Intel 2.50 GHz quad-core processor and 8 GB of RAM. To diversify the experiments and fully explore the behavior of the proposed algorithm as well as the optimization approach, two different sets of instances were considered. The first set of instances consists of Watts-Strogatz small-world graphs with a varying number of nodes, edges, and diameter (stylized as 1-6). The second group consists of three city networks (Sioux Falls, Eastern Massachusetts/EMA, and Berlin) from a networks repository for transportation research [45]. In Table 1, we present the computational times as well as information for each network (such as the number of nodes, the number of edges, and the diameter).
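For readers who wish to build comparable synthetic instances, the following sketch generates a Watts-Strogatz style graph using only the standard library and reports its edge count and diameter. The parameters used here (20 nodes, 4 nearest neighbors, rewiring probability 0.3, seed 1) are hypothetical; the text does not state the settings behind the generated instances, and this is not the authors' generator.

```python
import random
from collections import deque

def watts_strogatz(n, nearest, p, seed=None):
    """Adjacency sets of a ring lattice whose edges are rewired with probability p."""
    if not 0 < nearest < n:
        raise ValueError("need 0 < nearest < n")
    rng = random.Random(seed)
    half = nearest // 2
    adj = {v: set() for v in range(n)}
    # Ring lattice: join every node to its `half` successors on the ring.
    for v in range(n):
        for step in range(1, half + 1):
            u = (v + step) % n
            adj[v].add(u)
            adj[u].add(v)
    # Rewire each lattice edge (v, v + step) with probability p, keeping endpoint v.
    for step in range(1, half + 1):
        for v in range(n):
            u = (v + step) % n
            if rng.random() < p:
                free = [t for t in range(n) if t != v and t not in adj[v]]
                if free:
                    t = rng.choice(free)
                    adj[v].remove(u)
                    adj[u].remove(v)
                    adj[v].add(t)
                    adj[t].add(v)
    return adj

def diameter(adj):
    """Largest hop distance between two nodes; float('inf') if the graph is disconnected."""
    worst = 0
    for s in adj:
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        if len(dist) < len(adj):
            return float("inf")
        worst = max(worst, max(dist.values()))
    return worst

def edge_count(adj):
    return sum(len(nbrs) for nbrs in adj.values()) // 2

lattice = watts_strogatz(20, 4, 0.0)
rewired = watts_strogatz(20, 4, 0.3, seed=1)
print(edge_count(lattice), diameter(lattice))  # 40 edges, diameter 5 without rewiring
print(edge_count(rewired), diameter(rewired))  # rewiring keeps the edge count at 40
```

Rewiring replaces one edge with another, so the edge count is preserved while the diameter typically shrinks; a rewired graph may occasionally be disconnected, in which case the reported diameter is infinite.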
Although the computational time expectedly grows for both the commercial solver and the heuristic as the number

Python. For solving the optimization model, we used Gurobi 7.5 [46]. We are now ready to present our findings in the next section.

Results
We investigate three measures obtained by both the heuristic and the exact optimization:
(1) Number of nodes selected in the k-club (cardinality);
(2) Population located in the selected nodes (immediate access);
(3) Distance-weighted cost from all nodes to the k-club (general accessibility).
The number of nodes in the k-club represents the desirable, potentially geo-fenced, sites where a rider could check in/out a bike. The population measure represents the number of residents within the k-club: they are the ones with immediate access to a location with bicycles. Finally, the distance-weighted cost describes the total distance a commuter (from any location in the network) should walk to reach some node in the k-club to get access to a bike. Therefore, as was also shown in our optimization model, lower distance-weighted costs are preferable.
Table 2 summarizes the results for k ∈ {2, ..., 9} in Casselton. The population represents the number of residents living in the k-club. The distance-weighted cost is the actual objective function of our optimization model. Finally, time shows the computational time required to solve the problem.
Starting with the population, the exact optimization consistently covers a smaller population than the heuristic approach. On the other hand, the distance-weighted cost represents the distance that the residents living outside the k-club must travel to access the designated geo-fenced areas. The optimization model expectedly offers better results than the heuristic on this measure for all k-clubs obtained. Finally, when looking at the computational time, it becomes clear that even in a small city like Casselton, the exact optimization approach is prohibitively expensive, with k = 7 taking a little less than 10 hours, and k = 9 requiring more than 24 hours of computation before it terminates, reporting a suboptimal solution and an optimality gap of 56.8%. The heuristic, though, is significantly and consistently faster, with a small uptick in computational time that is linear in k.
Figures 6 and 7 present the solutions within the city and show the sets of nodes selected. Both the heuristic and the optimization approaches suggest groups of vertices located near one another, since the resulting set of nodes forms a k-club. However, the heuristic approach starts with the most populated point in the city and expands the set of nodes around that same point as the diameter of the set (k) increases. On the other hand, the optimization model is more dynamic, as it tries to minimize the overall distance-weighted cost.
We note that the heuristic is also inconsistent, as there are cases (see, e.g., k = 4 vs. k = 5) where a solution worsens, as far as the distance-weighted cost is concerned, as k increases. This happens because the heuristic of Algorithm 1 myopically chooses the "best" candidate node to add so long as it respects the k-club diameter requirement. Because of this, the population immediately covered is bigger in the solution from the heuristic than in the one from the optimization model. We note, though, that this is not necessarily good, as it might result in locations where a high number of residents have immediate access to dock-less bike sharing, but other residents have to travel very far to access it.
In the case of Fargo, as shown in Figure 8, we only applied the heuristic algorithm to validate our model, as exact optimization for the values of k that would be meaningful ran out of memory. Figure 8 illustrates the k-club heuristic solutions for Fargo, for k ∈ {10, 11, 12, 20}. The potential sites were located in a highly populated area next to the university campus. The existing 11 bike stations already in operation in Fargo are only blocks away from the suggested 10-club. Table 3 summarizes the numerical results. Because Fargo has a larger overall population per block, the corresponding numbers in the table are, as expected, much larger than the ones for Casselton.

Cost-Benefit Analysis
Equipment, installation, and maintenance are three significant costs involved in implementing a bike-sharing program. The main drawback of physical bike station systems (known as kiosk systems) is their high acquisition and operating costs. Stations are costly: they require tens of thousands of dollars to manufacture and install, along with several thousand dollars to acquire customized bikes. Moreover, kiosk systems mandate constant bike rebalancing. This happens because every bike needs to be returned to a kiosk: if the kiosk is full, the riders must find another location with available spots, resulting in higher operational cost and a decrease in customer satisfaction. The cost of each bike is estimated at $1,234 [47]. Assuming a cost of $1,000 on average for each bike, the cost for a typical kiosk with 11 docks will range from $29,000 to $34,000, excluding operating costs. Figure 9 shows the relationship between the cost and the number of docks. These figures are even higher at the planning stage ($55,000 per station) [48]. The optimal number of docks is another critical factor in a bike-sharing program. Increasing the number of docks leads to higher costs and a pile-up of bikes in one location, which consequently results in higher rebalancing cost. At the same time, it leads to higher customer satisfaction. The dock-less option would at least avoid the initial capital investment and pave the way to introduce bike-sharing programs to cities, without sacrificing customer satisfaction with the program.

Conclusions
In this work, we discussed a new paradigm for selecting where a dock-less (geo-fenced) bike-sharing system should be enabled within an urban area. This paradigm tries to address the disadvantages of kiosk-based bike-sharing programs, such as high equipment costs and costs associated with customer dissatisfaction due to a lack of bikes/docks at the desired location. Also, the proposed model offers a better solution to existing dock-less problems.
We modeled our problem as one of detecting a connected set of nodes of restricted diameter (that is, where any two nodes are reachable within k hops using nodes inside the set), or a k-club. The goal was to find a k-club of maximum closeness, so as to make sure that all other nodes in the transportation network are close enough to the bike-sharing locations. We showed that, as expected, the problem is NP-hard, and provided an integer programming formulation to solve it. We also proposed a greedy heuristic, which is computationally inexpensive. As k increases for the obtained k-club, we should expect the coordination costs to increase alongside, as greater values of k imply larger geo-fenced areas. From a practical perspective, BSS operators would have to trade off the size of the geo-fenced area (the larger, the more easily accessible and more convenient to users) against the rebalancing costs (the smaller, the more easily coordinated and cheaper for BSS operators).
We also used our methods to study the resulting setup in two cities of the state of North Dakota, Casselton (of smaller population) and Fargo (of bigger population). The potential cost savings of the dock-less approach could decrease the initial capital investment for introducing a bike-sharing program in a city. It also leads to an increase in the number of virtual docks (capacity) without blocking streets or pedestrian walkways. One might say that dock-less bike sharing brings chaos to cities, due to the freedom of allowing bike check in/out anywhere in a geo-fenced area. Our approach could mitigate this situation and leverage the dock-less alternative by enabling only some areas with this capability. At the moment, the model uses population as the only location weight.
Future directions for our work include the following. First, we could investigate the identification of multiple k-clubs of varying sizes within a city. This would allow BSS operators to have multiple smaller geo-fenced areas or fewer larger geo-fenced areas to cover all bike-sharing demands. As a second direction, we should consider more ways to build the weight parameter in our framework. For example, we plan to investigate how k-club formation and the geo-fenced areas change as we consider city points of interest, distance to nearby transit points, and origin-destination demands throughout the day, among others. Finally, another future avenue for our research would be to investigate more closely the interactions between different operators (e.g., dock-less bike sharing and scooter sharing, or dock-less bike sharing and public transit) with respect to different geo-fenced areas.
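To make the k-club definition and the three reported measures (cardinality, covered population, distance-weighted cost) concrete, the following minimal Python sketch may help. It rests on several assumptions that are not taken from the paper: distances are hop counts on an unweighted, connected graph; the distance-weighted cost is the population-weighted sum of each node's hop distance to its nearest k-club node; and the greedy rule (start at the most populated node, then add the feasible neighbour that yields the lowest cost) is only one plausible reading of a myopic heuristic, not the authors' Algorithm 1. All function names are hypothetical.

```python
from collections import deque


def bfs_distances(adj, sources, allowed=None):
    """Hop distance from the nearest node in `sources`; if `allowed` is
    given, the search may only walk through nodes in that set."""
    dist = {s: 0 for s in sources}
    queue = deque(sources)
    while queue:
        v = queue.popleft()
        for u in adj[v]:
            if u not in dist and (allowed is None or u in allowed):
                dist[u] = dist[v] + 1
                queue.append(u)
    return dist


def is_k_club(adj, nodes, k):
    """True if the subgraph induced by `nodes` is connected with diameter <= k."""
    nodes = set(nodes)
    for v in nodes:
        dist = bfs_distances(adj, [v], allowed=nodes)
        if len(dist) < len(nodes) or max(dist.values()) > k:
            return False
    return True


def measures(adj, population, club):
    """Return (cardinality, covered population, distance-weighted cost).
    Assumes the whole graph is connected, so every node reaches the club."""
    dist = bfs_distances(adj, list(club))
    cost = sum(population[v] * dist[v] for v in adj)
    covered = sum(population[v] for v in club)
    return len(club), covered, cost


def greedy_k_club(adj, population, k):
    """Illustrative greedy: grow the set one neighbour at a time while the
    k-club property holds, always taking the lowest-cost feasible candidate."""
    club = {max(adj, key=lambda v: population[v])}
    while True:
        frontier = sorted({u for v in club for u in adj[v]} - club)
        feasible = [u for u in frontier if is_k_club(adj, club | {u}, k)]
        if not feasible:
            return club
        best = min(feasible, key=lambda u: measures(adj, population, club | {u})[2])
        club.add(best)


if __name__ == "__main__":
    # Hypothetical toy network: a path of five blocks with a dense middle block.
    adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
    population = {0: 1, 1: 1, 2: 5, 3: 1, 4: 1}
    club = greedy_k_club(adj, population, k=2)
    print(sorted(club), measures(adj, population, club))
```

Note that `is_k_club` measures distances inside the induced subgraph only. On the toy path, the set {0, 2} is therefore not a 2-club even though the two nodes are two hops apart in the full graph, which is the property that keeps a geo-fenced area connected.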
Data Availability
The geospatial data used to support this study are available from the North Dakota GIS Hub Data Portal at https://gishubdata.nd.gov. The processed data resulting in the graphs produced are available upon request from Ali-Rahim Taleqani at ali.rahimtaleqani@ndsu.edu. Previously reported networks were also used to support this study and are available at https://github.com/bstabler/TransportationNetworks. The dataset is cited at the relevant place within the text as reference [45].

Conflicts of Interest
The authors declare that they have no conflicts of interest.