Multiobjective Route Selection Based on LASSO Regression: When Will the Suez Canal Lose Its Importance?

With coronavirus disease 2019 reshaping the global shipping market, many ships in the Europe-Asia trades that would normally sail through the Suez Canal have begun to detour via the much longer route around the Cape of Good Hope. To explain and predict the route choice, this paper employs the least absolute shrinkage and selection operator (LASSO) regression to estimate fuel consumption based on automatic identification system and ocean datasets and designs a multiobjective particle swarm optimization to find Pareto optimal solutions that minimize the total voyage cost and total voyage time. The weighted sum method is then introduced to deal with the route selection. Finally, a case study is conducted on real data from CMA CGM, a leading worldwide shipping company, and four scenarios of fuel prices and charter rates are built and analyzed. The results show that the detour around the Cape of Good Hope is preferred only in the scenario of low fuel price and low charter rate. In addition, the paper suggests that the Suez Canal authority should cut the canal toll in line with our results to win back the ships, because offering a discount on the canal toll is verified to be effective.


Introduction
Despite bleak prospects of worldwide market amid the coronavirus disease 2019 (COVID-19) pandemic, shipping remains the backbone of the global economy [1]. However, shipping companies must reconsider their decision-making of ship operations in order to survive the crisis. Suddenly, Europe-Asia sailing via the Cape of Good Hope rather than the shorter route via the Suez Canal looks like an attractive option. As shown in Figure 1 [2], since the end of March 2020, at least 32 such sailings took place via the Cape of Good Hope. Many ships operated by the three major shipping alliances, 2M, Ocean Alliance, and THE Alliance, have all chosen the longer route.
Sailing around the Cape of Good Hope, more than 3,000 nautical miles and at least 5 days longer than transiting through the Suez Canal, may seem to be a strange move. But the preference for this sailing option is not unprecedented. When the fuel price nosedived in late 2015, many ships sailing from the US east coast to Asia did the same until the Suez Canal slashed the transit charge. Apparently, bypassing the Suez Canal is driven by external factors like fuel price, charter rate, and canal transit charge. The additional fuel and charter costs of switching to the longer route are negligible, owing to the drop of fuel prices and charter rates. More importantly, the growing costs can be largely offset by the benefits of avoiding the expensive canal transit charge. It is reported that Maersk shells out approximately 350,000 USD per ship for transiting through the Suez Canal [4]. Under the combined effects of external factors, the detour around the Cape of Good Hope offers shipping companies an effective measure to minimize total voyage cost.
In practice, however, the minimal cost is not the only objective in the decision-making of ship operations. The detour via the longer route inevitably increases the total voyage time, which will lower the service level and may annoy shippers. In fact, some shippers are weighing up whether to stop booking slots from shipping companies making the detour [5]. To maintain market share, shipping companies must work to improve the service level, i.e., minimize the total voyage time. Hence, the decision of whether to detour is a multiobjective problem aimed at trading off minimal total voyage cost against minimal total voyage time.
Considering the growing popularity of detouring around the Cape of Good Hope, this paper aims to disclose how external factors influence the decision of whether to detour. To clarify the mechanism of influence, the main obstacle lies in the difficulty of precisely estimating the total voyage cost. For instance, it is very difficult to estimate fuel consumption, which is impacted by various factors, such as sailing speed, draft, wind direction, and current direction. Most scholars only roughly estimated fuel consumption with a cubic function of sailing speed. Other scholars, namely, Fagerholt et al. [6] and Yao et al. [7], proposed a fuel consumption function based on empirical data from a shipping company but did not consider the impact of external factors on fuel consumption. The inaccurate estimation of fuel consumption will lead to errors in predicting the decision-making of shipping companies. To solve the problem, this paper employs the least absolute shrinkage and selection operator (LASSO) regression model, which reflects the correlations among different eigenvariables, to estimate fuel consumption. In addition, a particle swarm optimization (PSO) technique-based solver is proposed to solve this multiobjective problem. Finally, since the decision-making of ship operations is an immediate choice, the weighted sum (WS) method is introduced. Striving to solve an emerging and valuable issue, this paper makes the following contributions:
(1) This paper analyzes the determinants of the ship detouring behavior from the Suez Canal to the Cape of Good Hope, which is a realistic problem with critical significance for the shipping industry but has been investigated by only a few scholars so far.
(2) Some cutting-edge big data techniques and optimization methods were applied in combination to solve the problem. Specifically, the data preparation, training LASSO regression model, multiobjective particle swarm optimization (MOPSO) algorithm, and WS method were integrated into a novel optimization framework of sailing speed and sailing route.
(3) Four scenarios reflecting the fluctuation of shipping market conditions were proposed and analyzed for the ship detouring problem. For each scenario, the paper also calculated the suggested Suez Canal toll that is able to win back the ships detouring around the Cape of Good Hope.
The remainder of this paper is organized as follows: Section 2 makes a thorough review of the related literature; Section 3 establishes a LASSO regression model to estimate fuel consumption, introduces a multiobjective optimization model, describes the solving algorithm MOPSO, and introduces the WS method; Section 4 presents and analyzes the optimization results through a case study; and Section 5 puts forward the conclusions.

Literature Review
Ship operations decision-making mainly includes fleet management, ship scheduling, route planning, and speed setting. Christiansen et al. [8] summarized the studies on ship scheduling and route planning. Mansouri et al. [9] provided a survey of existing research on sustainable decision-making of ship operations. Fuel consumption is an important variable in the decision-making of sailing speed and route. Fagerholt et al. [10] obtained fuel consumption through linear interpolation and optimized the sailing speed and route. Zhen et al. [11] also relied on linear interpolation to ascertain fuel consumption and proposed a tabu search (TS) algorithm to minimize the fuel cost. Zhen et al. [12] combined a two-stage iterative algorithm and a fuzzy logic method with ε-constraint into a novel approach to solve the sailing speed and route decision problem subject to changing fuel price. Lee et al. [13] optimized the speed of liner shipping under the weather impact, revealing that the sailing speed affects the transit time between ports and, in turn, impacts the service level. To optimize the sailing route, Gkerekos and Lazakis [14] presented a novel framework based on a data-driven model, which plans the ship routes in view of historical ship performance and current weather conditions.
Most speed optimization models assume that fuel consumption is a cubic function of the sailing speed [15]. In real-world scenarios, fuel consumption is affected by various factors other than sailing speed. Characterizing fuel consumption by sailing speed and load, Wen et al. [16] optimized the route and speed of multiple ships under time, cost, and environmental constraints. Wang and Meng [17] explored the deterministic speed optimization problem, a subproblem of the container routing problem: the relationship between sailing speed and fuel consumption was analyzed based on historical data, and the fuel consumption was found to depend on voyage legs, for the weather varies from leg to leg. Kim and Lee [18] introduced the optimization-based decision support system (DSS) to ship scheduling and used the Linear, Interactive, Discrete Optimizer (LINDO) to maximize the profit of cargo transport. Windeck and Stadtler [19] also developed a DSS for the low-carbon shipping network design problem under weather factors. In recent years, big data analytics has begun to attract attention in operations research [20]. For example, Lee et al. [13] applied big data methods to meteorological archives and predicted fuel consumption based on the massive weather data at different points of the sea, creating a systematic strategy to extract the weather information from massive archive data for route planning. However, their research is not sufficiently comprehensive, for the fuel consumption of ships is affected not only by weather but also by the state of the sea and various other external factors.
Many other scholars have developed big data-driven models for ship fuel consumption by considering the complicated impacts of external factors. Zheng et al. [21] used artificial neural network (ANN) to predict fuel consumption and optimize the sailing speed. Based on the noon report data, ANN is also applied by Beşikçi et al. [22] to predict fuel consumption of a tanker according to eigenvariables including ship speed, mean draft, and cargo load. Drawing on the wavelet neural network (WNN), Wang et al. [23] established a model to optimize energy efficiency in real time and used the model to determine the optimal engine speeds based on the data collected from GPS receiver, wind speed sensor, water depth sensor, and other technologies. In addition, some relevant regression approaches were developed by Lepore et al. [24], Wang and Yang [25], and Wang et al. [23]. For example, Wang et al. [26] discovered the close correlations between various eigenvariables that affect fuel consumption and selected these eigenvariables with the LASSO regression algorithm. In this paper, the LASSO regression, which has been proven to be effective by Wang et al. [26] in analyzing the impacts of multiple eigenvariables, is adopted to predict fuel consumption during navigation.
For the decision-making of sailing speed and route, many intelligent algorithms have been adopted to solve the optimization model. With the aid of the PSO algorithm, Zheng et al. [21] minimized the total fuel consumption by determining the sailing speed between every two stations; the global optimal sailing speed was acquired through comparison between different improved PSO algorithms. Moore et al. [27], as the first to apply the PSO algorithm in multiobjective optimization, highlighted the importance of individual and swarm searches but ignored the maintenance of swarm diversity. Lee et al. [13] introduced the MOPSO algorithm to minimize the fuel cost and maximize the service level and obtained the Pareto optimal solution. Cariou et al. [28] developed a heuristic approach based on a genetic algorithm (GA) and then solved the problem of large-scale combinatorial optimization of speed, route, and cargo flow. Gkerekos and Lazakis [14] modified Dijkstra's shortest path algorithm through heuristic fitting and applied the algorithm recursively until finding the optimal route.
To sum up, the PSO algorithm has rarely been combined with data-driven estimation of fuel consumption. Table 1 compares the few relevant studies in terms of fuel consumption estimation, number of objectives, and solving algorithm. To solve the decision-making of ship speed and route amid the COVID-19 pandemic, this paper selects the LASSO regression model proposed by Wang et al. [26] and the MOPSO algorithm proposed by Nguyen and Kachitvichyanukul [31].

Methodology
The methodology in this paper consists of four parts. Section 3.1 introduces the LASSO regression that can predict fuel consumption precisely. In Section 3.2, a mathematical model is put forward. Then, in Section 3.3, a MOPSO algorithm is designed to solve the mathematical model. Finally, in Section 3.4, the WS method is introduced to deal with route selection.

Fuel Consumption Estimation Based on LASSO Regression
The scale and units of the original data vary across data fields. For instance, some eigenvariables contain both positive and negative values. To prevent solver instability, the original data should be normalized before being imported to the training model. Here, the original data is preprocessed through Z-score normalization.
The Z-score normalization, which yields a mean of zero and a standard deviation of one, can be expressed as

$$x_i' = \frac{x_i - \bar{x}}{\sigma} \tag{1}$$

where $X = \{x_i\}$, $i = 1, 2, \ldots, n$, is the original dataset, $\bar{x}$ is the mean of the original values, and $\sigma$ is their standard deviation.
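As a minimal sketch of formula (1), the snippet below normalizes one data column with the Python standard library. The sample readings are hypothetical, and the use of the population standard deviation is an assumption, since the paper does not state which estimator was used.

```python
from statistics import fmean, pstdev


def z_score(values):
    """Return a Z-score normalized copy of a list of numbers."""
    mean = fmean(values)
    std = pstdev(values)  # population standard deviation (assumption)
    if std == 0:
        raise ValueError("cannot normalize a constant column")
    return [(v - mean) / std for v in values]


# Hypothetical signed readings of one eigenvariable, for illustration only.
readings = [3.2, -1.5, 4.8, 0.6, -2.1]
normalized = z_score(readings)
```

After normalization, columns with very different units (for example, draft in meters and temperature in degrees) share a comparable scale.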

LASSO Regression Model.
The LASSO is a parsimonious model that adds a penalty equivalent to the absolute magnitude of the regression coefficients and tries to minimize them [32]. The model can be described as minimizing the residual sum of squares (RSS), also known as the sum of squared residuals, where the residual in statistics refers to the deviation of the predicted data from the actual value. As the penalty grows (equivalently, as the constraint bound tightens), all coefficients are shrunk towards zero, and the coefficients not strongly associated with the outcome are set exactly to zero, which is equivalent to removing these variables from the model. Therefore, the LASSO is an excellent tool for processing data with complex collinearity.
Suppose there is a set of N samples, and the i-th sample consists of the vector $x_i = (x_{i1}, x_{i2}, \ldots, x_{ip})$ composed of p covariates and the response variable $y_i$. Then, optimize the model:

$$\min_{\beta_0, \beta} \; \frac{1}{N} \sum_{i=1}^{N} \left( y_i - \beta_0 - x_i^{T} \beta \right)^2 \tag{2}$$

$$\text{subject to} \quad \sum_{j=1}^{p} |\beta_j| \le t \tag{3}$$

where $\beta = (\beta_1, \beta_2, \ldots, \beta_p)$ is the regression coefficient vector under the sparse assumption and $t \ge 0$ bounds the sum of the absolute coefficients. Without loss of generality, the data can be centered so that $\sum_i x_{ij}/N = 0$ and $\sum_i y_i/N = 0$. Letting $\bar{y}$ and $\bar{x}$ be the mean values of $y_i$ and $x_i$, the unbiased estimate of the intercept is $\beta_0 = \bar{y} - \bar{x}^{T}\beta = 0$. Then, by denoting the sample matrix $X = (x_1, x_2, \ldots, x_N)^{T}$ and the output vector $y = (y_1, y_2, \ldots, y_N)^{T}$ and adopting the $L^p$ norm of vectors, $\|x\|_p = \left( \sum_{i=1}^{m} |x_i|^p \right)^{1/p}$, formulas (2) and (3) can be simplified as

$$\arg\min_{\beta} \left\{ \frac{1}{N} \| y - X\beta \|_2^2 \right\} \quad \text{subject to} \quad \|\beta\|_1 \le t \tag{4}$$

The model can be further transformed into the penalized least squares function, which is also known as the Lagrangian form [33,34]:

$$\min_{\beta} \left\{ \frac{1}{N} \| y - X\beta \|_2^2 + \lambda \|\beta\|_1 \right\} \tag{5}$$

where λ is the regularization parameter (λ ≥ 0). According to Lagrangian duality, λ has a data-dependent relationship with t.
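To illustrate how the L1 penalty in the Lagrangian form (5) drives coefficients to exactly zero, here is a small pure-Python sketch. It uses cyclic coordinate descent with soft-thresholding, which is a different solver from the LARS algorithm used in this paper, and it assumes the objective is scaled by 1/N. The function names and the toy data are hypothetical.

```python
def soft_threshold(rho, gamma):
    """Shrink rho towards zero by gamma, returning 0 inside [-gamma, gamma]."""
    if rho > gamma:
        return rho - gamma
    if rho < -gamma:
        return rho + gamma
    return 0.0


def lasso_cd(X, y, lam, n_iter=200):
    """Minimize (1/N)*||y - X b||^2 + lam*||b||_1 by coordinate descent.

    Assumes the columns of X and the response y are already centered.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0  # (1/N) * correlation of column j with the partial residual
            z = 0.0    # (1/N) * squared norm of column j
            for i in range(n):
                fit_without_j = sum(X[i][k] * beta[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - fit_without_j)
                z += X[i][j] ** 2
            rho /= n
            z /= n
            # Setting the subgradient to zero gives a threshold of lam / 2.
            beta[j] = soft_threshold(rho, lam / 2.0) / z if z > 0 else 0.0
    return beta


# Toy centered data: y depends only on the first column.
X_toy = [[-2.0, 1.0], [-1.0, -1.0], [0.0, 0.0], [1.0, -1.0], [2.0, 1.0]]
y_toy = [-6.0, -3.0, 0.0, 3.0, 6.0]
beta_mild = lasso_cd(X_toy, y_toy, lam=1.0)     # first coefficient shrunk, second zero
beta_strong = lasso_cd(X_toy, y_toy, lam=20.0)  # penalty large enough to zero everything
```

With a mild penalty the relevant coefficient is shrunk slightly below its least squares value of 3, while the irrelevant one is removed; with a strong penalty both vanish.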

Solving the LASSO Regression Model.
The LASSO regression model is generally solved by combining k-fold cross-validation with the least angle regression (LARS) algorithm (LassoLarsCV) [30,35]. In this research, λ (or, equivalently, t) is estimated by using 10-fold cross-validation. λ, a constant parameter, was estimated by minimizing formula (5), while β was solved with the LARS algorithm so that the residual error was reduced continuously until it was less than a preset constant.

Fuel Consumption Estimation.
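Through the above analysis, a LASSO regression model was built to predict fuel consumption. Following the multiple linear regression formulation [36], the fuel consumption F is represented as follows:

$$F = \sum_{j=1}^{p} \beta_j x_j + b \tag{6}$$

where $x_j$ are the eigenvariables, $\beta_j$ are the corresponding regression coefficients, and b is the intercept.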

Mathematical Model.
This paper focuses on the decision of whether to detour around the Cape of Good Hope amid COVID-19. The objective is to minimize the total voyage cost, while the voyage time is not strictly restricted. The total voyage cost was broken down into the sailing fuel cost, the berthing fuel cost, the charter cost, and the transit charge of the Suez Canal. The sailing fuel cost accounts for a large proportion of the total voyage cost. Hence, it is important to predict the sailing fuel consumption. During the voyage, the fuel consumption of the ship is affected by various eigenvariables, including but not limited to sailing speed, draft, weather conditions, and sea conditions. Some eigenvariables are strongly correlated, such as wind speed and wind force. The strong correlations make the fuel consumption estimation a typical multicollinearity problem. Hence, this paper employs the LASSO regression method proposed by Wang et al. [26] to select the eigenvariables and improve the interpretability and accuracy of the fuel consumption estimation.
After the LASSO regression model was determined, the mathematical model of our problem was established to describe the total cost and total time of voyage and to reveal the correlation between fuel consumption, sailing speed, and other factors. The two objectives of our problem conflict with each other: the total voyage cost is positively correlated with the sailing speed, while the total voyage time is negatively correlated with the sailing speed. To rationalize the decision of whether to detour, it is necessary to find the Pareto optimal solution of the tradeoff relationship between the two objectives. The model, aiming to minimize the total cost and total time of voyage, was solved by the MOPSO algorithm.

Table 1: Comparison of relevant studies.
Study | Fuel consumption estimation | Number of objectives | Solving algorithm
Wen et al. [16] | A function of speed and payload | Multiple | Heuristic branch-and-price
Fagerholt et al. [10] | Linear interpolation | Single | The commercial optimization software Xpress MP
Sheng et al. [29] | The third power of speed | Single | Formula derivation
Zhen et al. [11] | Linear interpolation | Single | A TS-based solving method
Zhen et al. [12] | Linear interpolation | Multiple | A hybrid strategy coupling the two-stage iterative algorithm and fuzzy logic method with ε-constraint
Cariou et al. [28] | The third power of speed | Single | A GA-based heuristic
Lee ...
Our research considers a liner ship operating on a given route with a set of ports of call. Here, the mandatory time window of arrival of each port is considered. However, the sailing speed between two ports visited in sequence is variable due to the navigation environment and other factors. Therefore, we set up the nodes $N = \{1, 2, \ldots, n\}$ including ports that can be visited by the ship.
Each node has the arrival time $t_i^{\mathrm{arrive}}$ and the departure time $t_i^{\mathrm{leave}}$ of the ship. For a node that is not a port, we can set $t_i^{\mathrm{leave}} - t_i^{\mathrm{arrive}} = 0$. Additionally, the trip from port i to port i + 1 was defined as leg i. The parameters of the optimization model are explained in Table 2.
The total voyage cost of the ship consists of four parts: the sailing fuel cost, the berthing fuel cost, the charter cost, and the Suez Canal toll. The sailing fuel consumption for leg i was described by the LASSO regression model $f(v_i, E_i)$, where $v_i$ is the sailing speed and $E_i$ denotes the various eigenvariables at leg i. The berthing fuel cost per hour at port was fixed because only the auxiliary engine of the ship operates during a port call. Let k be the mean amount of fuel consumed per hour at port. Then, the fuel cost per hour (α) at port can be expressed as $\alpha = kP_{\mathrm{fuel}}$. The charter cost depends on the total voyage time and is indirectly affected by fuel price: the falling fuel price will change the supply-demand relationship of the shipping market, which in turn changes the ship's charter rate. Finally, if the ship sails through the Suez Canal, the operator must pay an expensive one-off toll. Therefore, the first objective of the model, given as objective (7), seeks to minimize the total voyage cost of the ship on the given route ($M_1$).
On the given route, the voyage time of the ship should be as short as possible. Therefore, the second objective, given as objective (8), is to minimize the total voyage time ($M_2$). The objective (7), minimizing the total voyage cost, conflicts with the objective (8), minimizing the total voyage time. Constraint (9) sets a limit on the total voyage time. Constraint (10) ensures that the sailing speed of the ship falls between the lower and upper limits in all legs.
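The following sketch shows one way the two objectives could be evaluated for a candidate vector of leg speeds, based only on the cost breakdown described above. All function and parameter names are hypothetical, the fuel function is a toy placeholder standing in for the LASSO model f(v_i, E_i), and the way the charter cost is prorated by the hour is an assumption rather than the paper's exact formulation.

```python
def voyage_objectives(leg_nm, speeds_kn, port_hours, fuel_price, k_berth,
                      charter_per_day, canal_toll, leg_fuel):
    """Return (total voyage cost in USD, total voyage time in hours).

    leg_fuel(speed, distance) is a caller-supplied callable giving the
    sailing fuel (tons) burnt on one leg; here it replaces f(v_i, E_i).
    """
    sailing_hours = [d / v for d, v in zip(leg_nm, speeds_kn)]
    sailing_fuel = sum(leg_fuel(v, d) for v, d in zip(speeds_kn, leg_nm))
    total_hours = sum(sailing_hours) + sum(port_hours)

    alpha = k_berth * fuel_price                    # berthing fuel cost per hour
    cost = (fuel_price * sailing_fuel               # sailing fuel cost
            + alpha * sum(port_hours)               # berthing fuel cost
            + charter_per_day * total_hours / 24.0  # charter cost (prorated, assumption)
            + canal_toll)                           # zero when the canal is bypassed
    return cost, total_hours


def toy_leg_fuel(speed, distance):
    """Hypothetical placeholder fuel model, not the fitted LASSO model."""
    return 1e-4 * speed ** 2 * distance


# One 240-nautical-mile leg at 12 knots followed by a 10-hour port call.
cost_sc, hours_sc = voyage_objectives(
    [240.0], [12.0], [10.0], fuel_price=190.6, k_berth=5.2,
    charter_per_day=20833.3, canal_toll=450000.0, leg_fuel=toy_leg_fuel)
```

Raising a leg speed shortens the voyage but, with any fuel model that grows with speed, raises the fuel cost, which is the conflict between the two objectives noted above.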

MOPSO Algorithm.
The PSO is a metaheuristic algorithm that has been successfully applied to many real-world scenarios [37]. In the algorithm, each particle in the swarm is treated as a possible solution to the problem, and the optimal solution is searched for based on the behaviors of particles and the interaction between particles.
Considering only one objective, the original PSO proposed by Kennedy and Eberhart [38] cannot be directly applied to multiobjective problems. The MOPSO, aimed at solving problems with different priorities, has been developed in recent years, for example, handling multiobjective optimization problems with a multiswarm cooperative particle swarm optimizer [39], a bare-bones multiobjective particle swarm optimization algorithm [40], and variable-size cooperative coevolutionary particle swarm optimization for feature selection on high-dimensional data [41]. This paper selects the MOPSO framework developed by Nguyen and Kachitvichyanukul [31], as it is one of the most classical MOPSO variants and has proven efficient for this problem. The MOPSO can memorize the current search situation and make timely adjustments to the search strategy, resulting in excellent global convergence and robustness.
To solve the multiobjective problem, the MOPSO combines the Pareto sorting mechanism to find the historical optimal solution of particles and update the noninferior solution set. The noninferior solutions are searched for in parallel using efficient clusters. In multiobjective optimization, the MOPSO iteratively outputs a set of noninferior solutions that do not dominate each other. Then, the global optimal position can be obtained by randomly screening the noninferior solution set. The steps of the MOPSO framework are summarized as follows:
Step 1. Z-score normalization, as in formula (1), was adopted to normalize the two objective functions of our problem, i.e., total voyage cost and total voyage time.
Step 2. The initial swarm was set up, and each particle was given an initial speed and position. The other parameters, such as learning rate, upper and lower limits of inertia weight, maximum number of iterations, and swarm size, were also initialized.
Step 3. The fitness of each particle was calculated from the two normalized objectives. After the algorithm was initialized and updated, the constraints of the two objective functions were satisfied by limiting the control variables to the specified range.
Step 4. The fitness of particles was compared. The optimal position (P-best) and noninferior solution set of particles were updated according to the dominance relations. The global optimal position (G-best) of particles was randomly selected from the noninferior solution set.
Step 5. The position and velocity vectors of each of the N particles in the swarm were, respectively, represented as

$$x_i = (x_{i1}, x_{i2}, \ldots, x_{iD}), \qquad v_i = (v_{i1}, v_{i2}, \ldots, v_{iD})$$

where $x_i$ denotes the position of particle i, $v_i$ denotes the velocity of particle i, and D is the number of decision variables. In the evolution process, the velocity and position of each particle were, respectively, updated as

$$v_{id}^{k+1} = \omega v_{id}^{k} + c_1 r_1 \left( p_{id}^{k} - x_{id}^{k} \right) + c_2 r_2 \left( p_{gd}^{k} - x_{id}^{k} \right)$$

$$x_{id}^{k+1} = x_{id}^{k} + v_{id}^{k+1}$$

where ω is the inertia weight, k denotes the iteration number, $r_1$ and $r_2$ are uniform random variables in the interval [0, 1], $d \in \{1, \ldots, D\}$ is the dimension of the vector, $c_1$ and $c_2$ are the acceleration constants, $p_{id}^{k}$ and $p_{gd}^{k}$ denote the personal best (P-best) and the global best (G-best) positions, respectively, $x_{id}^{k}$ is the previous position of particle i, and $v_{id}^{k+1}$ is the new velocity of particle i.
Step 6. If the maximum number of iterations was reached, the set of Pareto optimal solutions was outputted; otherwise, steps 3-6 were repeated.
Step 7. The upper and lower bounds of sailing speed were determined according to the actual situation and ship performance. Then, the minimum values of the two objectives ($M_1^L$ and $M_2^L$), i.e., the positive ideal point A, and the maximum values of the two objectives ($M_1^U$ and $M_2^U$), i.e., the negative ideal point B, were obtained.
Step 8. In the set of Pareto optimal solutions, the solution with the minimum relative distance from the positive and negative ideal points was screened out as the optimal tradeoff solution $C(M_1^*, M_2^*)$, and the corresponding speed was taken as the optimal sailing speed. Formula (15) shows the line connecting positive ideal point A and negative ideal point B, and the intersection of this line and the Pareto frontier is the optimal tradeoff solution $C(M_1^*, M_2^*)$:

$$M_2 - M_2^L = \frac{M_2^U - M_2^L}{M_1^U - M_1^L} \left( M_1 - M_1^L \right) \tag{15}$$

The selection of the best tradeoff solution from the Pareto frontier is shown in Figure 2.
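The core operations in the steps above can be sketched as three small helpers: a Pareto dominance filter (Step 4), the standard PSO velocity and position update with speed clamping (Step 5), and a discrete version of the tradeoff selection (Step 8). This is an illustrative sketch, not the framework of Nguyen and Kachitvichyanukul [31]. All names are hypothetical, and two simplifications are assumed: the ideal points A and B are taken from the extremes of the front itself, and the "intersection" with line AB is approximated by the front point closest to that line after min-max scaling.

```python
import random


def dominates(a, b):
    """True if objective vector a Pareto-dominates b (both minimized)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def nondominated(points):
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points if q is not p)]


def pso_update(x, v, p_best, g_best, w, c1, c2, lo, hi, rng):
    """One velocity/position update for a single particle, clamped to [lo, hi]."""
    new_x, new_v = [], []
    for d in range(len(x)):
        r1, r2 = rng.random(), rng.random()
        vel = w * v[d] + c1 * r1 * (p_best[d] - x[d]) + c2 * r2 * (g_best[d] - x[d])
        new_v.append(vel)
        new_x.append(min(max(x[d] + vel, lo), hi))  # respect the speed limits
    return new_x, new_v


def pick_tradeoff(front):
    """Return the front point nearest to line AB in min-max scaled space."""
    lo1, hi1 = min(p[0] for p in front), max(p[0] for p in front)
    lo2, hi2 = min(p[1] for p in front), max(p[1] for p in front)
    span1 = (hi1 - lo1) or 1.0
    span2 = (hi2 - lo2) or 1.0

    def gap(p):
        # After scaling, line AB is the diagonal u == w.
        return abs((p[0] - lo1) / span1 - (p[1] - lo2) / span2)

    return min(front, key=gap)


# Hypothetical (cost, time) pairs; (5.0, 5.0) is dominated by (4.0, 4.0).
candidates = [(1.0, 9.0), (2.0, 5.0), (4.0, 4.0), (6.0, 2.0), (9.0, 1.0), (5.0, 5.0)]
front = nondominated(candidates)
best = pick_tradeoff(front)
moved_x, moved_v = pso_update([10.0], [0.0], [12.0], [14.0], w=0.5, c1=2.0, c2=2.0,
                              lo=8.0, hi=20.0, rng=random.Random(0))
```

In a full run these helpers would sit inside the iteration loop of Steps 3 to 6, with G-best drawn at random from the current nondominated set.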

Weighted Sum (WS) Method.
After obtaining the optimal tradeoff solution of the total voyage cost and the total voyage time through the MOPSO algorithm, the WS method [42] is used to evaluate the two routes under specific fuel prices and charter rates. The purpose is to provide a decision reference for operators. The idea of the WS method is to convert the multiobjective optimization problem into a single-objective optimization problem by using a convex combination of objectives. In what follows, $\Delta M_1$ is the total voyage cost gap, $\Delta M_2$ is the total voyage time gap, $\Delta^c M_2$ is the loss of economic value, U is a single-objective utility function, ω is the weight of cost, and τ is the market price of the cargo at the destination. The steps of the WS framework are summarized as follows:
Step 1: the total voyage cost gap $\Delta M_1$ and the total voyage time gap $\Delta M_2$ between the two routes in a given scenario are calculated.
Step 2: the total voyage time gap $\Delta M_2$ is converted, by means of τ, into the loss of economic value $\Delta^c M_2$ due to the delayed delivery of the cargo in this period [16].
Step 3: the multiobjective problem is transformed into a single-objective utility function U by introducing the weight of cost ω, which shows the route decision of operators when they have different preferences for voyage cost and voyage time. The WS method then solves the resulting scalar optimization problem.
In summary, the mathematical model in Section 3.2 minimizes the total voyage cost and the total voyage time, using the fuel consumption predicted by the LASSO regression model in Section 3.1. The model was solved by the MOPSO algorithm in Section 3.3. Route selection was obtained through the WS method in Section 3.4. As shown in Figure 3, the algorithm is implemented in four phases: data preparation, training the LASSO regression model, finding the optimal solution with the MOPSO algorithm, and route selection with the WS method.
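A hedged sketch of the three WS steps follows. The exact formulas are not reproduced here, so the sketch assumes that the gaps are taken as Route CGH minus Route SC, that the time gap is valued at τ per day, that U is the convex combination ω·ΔM1 + (1 − ω)·ΔcM2, and that a negative U favors the detour. The function name is hypothetical; the input figures are the Pareto tradeoff values reported for the 365.5 USD/ton scenario in Section 4.4 and τ from Section 4.1.

```python
def weighted_sum_choice(cost_sc, hours_sc, cost_cgh, hours_cgh,
                        tau_per_day, weight_cost):
    """Return (preferred route, utility U) under the stated assumptions."""
    delta_cost = cost_cgh - cost_sc                    # Step 1: cost gap (USD)
    delta_hours = hours_cgh - hours_sc                 # Step 1: time gap (hours)
    delta_value = tau_per_day * delta_hours / 24.0     # Step 2: time gap in USD
    utility = weight_cost * delta_cost + (1.0 - weight_cost) * delta_value  # Step 3
    return ("CGH" if utility < 0 else "SC"), utility


route, utility = weighted_sum_choice(
    cost_sc=2_711_000, hours_sc=793.3,
    cost_cgh=2_849_000, hours_cgh=919.9,
    tau_per_day=10_000, weight_cost=0.5)
```

Under these assumptions the high fuel price scenario favors Route SC for an equal weighting, since the detour is both costlier and slower; varying `weight_cost` between 0 and 1 shows how operator preferences shift the decision in scenarios where the two gaps have opposite signs.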

Case Study
The data selection and preparation are introduced in Section 4.1. The fuel consumption estimation process is presented in Section 4.2. The setting of MOPSO parameters is shown in Section 4.3. Finally, Section 4.4 gives a comprehensive analysis of the route choice.

Data Selection and Preparation.
The case study uses the parameters of a real container ship named "CMA CGM Chile," which transports dry cargoes from the Indian Ocean to Le Havre, France. According to the records of the automatic identification system (AIS) provided by Elane Inc., the ship passed through the Suez Canal (Route SC, Figure 4) during the voyage from December 23rd, 2019, to January 25th, 2020, departing from Qingdao, then visiting Ningbo, Daqu, Yangshan, Yantian, Pasir Panjiang, and Sokhna in sequence, and finally arriving at Le Havre. However, the ship detoured around the Cape of Good Hope (Route CGH, Figure 5) during another voyage from March 16th, 2020, to April 25th, 2020. The order of ports visited by the ship was Qingdao, Ningbo, Daqu, Yangshan, Yantian, Pasir Panjiang, and Le Havre. Although the ship detoured around the Cape of Good Hope, the ports visited in the actual voyage did not change. Therefore, it is appropriate for this paper to select such a range of ship navigation for calculation. In addition, several legs other than those between the ports of call in the actual voyage are given in the figures to validate the proposed fuel consumption estimation model.
Here, the fuel costs of the two routes were estimated based on two different fuel prices: $P_{\mathrm{fuel}} = 365.5$ USD/ton (the fuel price on March 20, 2019) and $P_{\mathrm{fuel}} = 190.6$ USD/ton (the fuel price on December 27, 2019). The berthing fuel consumption per hour is set as k = 5.2 ton/hour [28]. The detour will increase the total voyage time, incurring more charter cost, and the charter rate is affected indirectly by the fuel price. Here, the charter cost is set to 20,833.3 USD/day at the fuel price of 190.6 USD/ton [43]. The toll to pass through the Suez Canal is set to 450,000 USD (the toll on January 4th, 2020). In addition, the arrival time $t_i^{\mathrm{arrive}}$ and the departure time $t_i^{\mathrm{leave}}$ of the ship at node i were obtained from AIS data. Here, the market price of the cargo at the destination is τ = 10,000 USD/day. Finally, the maximum and minimum sailing speeds were taken from the shipping logs. The values of the related parameters are listed in Table 3.
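As a quick worked check of the berthing cost parameter, the relation $\alpha = kP_{\mathrm{fuel}}$ from Section 3.2 can be evaluated at the two fuel prices given here, with k = 5.2 ton/hour. The resulting hourly costs are derived values, not figures quoted from Table 3.

$$\alpha_{365.5} = 5.2 \times 365.5 = 1900.6 \ \text{USD/hour}, \qquad \alpha_{190.6} = 5.2 \times 190.6 = 991.12 \ \text{USD/hour}$$

The berthing fuel cost per hour therefore falls by roughly 48% between the two scenarios, in direct proportion to the fuel price.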

Fuel Consumption Estimation.
The main dataset available from AIS and the ocean dataset includes mean speed, mean draft, course, current speed, wind speed, wind force, seawater temperature, seawater salinity, and effective wave height. Among them, weather data such as current speed, wind speed, wind force, and seawater temperature were extracted from the .nc files obtained from the Copernicus Marine Environment Monitoring Service [44]. Part of the original dataset is presented in Table 4. The original dataset, involving 490 samples and 10 eigenvariables, was used to train our model. Table 5 shows the original dataset after being processed by Z-score normalization. The normalized dataset was randomly divided into a training set and a test set at the ratio of 4 : 1 and used to verify the effectiveness and reliability of our estimation model.
Taking fuel consumption as the response and the other eigenvariables as inputs, our estimation model was optimized by computing the best λ, i.e., the one yielding the least RSS. As mentioned in Section 3.1.3, the training set was divided into 10 equal subsets for cross-validation. In this way, the optimal values of λ and b were determined as 0.020955 and 0.0000928, respectively.
In this case study, the five eigenvariables (as marked by * in Table 6) corresponding to the nonzero regression coefficients were selected for fuel consumption estimation. As shown in Table 6, the eigenvariables were loosely correlated with fuel consumption, except for sailing speed. e eigenvariables like current speed, wind speed, and wind force had very small effects on fuel consumption. e sailing  speed makes the greatest impact on the fuel consumption. As a result, the fuel consumption cost increases significantly with sailing speed, which can also increase the total voyage cost. e correlations of the selected eigenvariables are also illustrated in Figure 6. e LASSO regression model solves the multicollinearity between variables, as evidenced by the strong correlation between wind speed and wind force. A comparison between our estimation model and a general linear regression estimation is provided to verify the performance. e estimation effects of the two models on the same test set were measured by the mean absolute percentage error (MAPE) and root mean square error (RMSE) ( Table 7). e fitting performance of the two methods is compared in Figure 7. Apparently, our model outperformed the general linear regression model in the estimation of fuel cost and achieved lower RMSE and MAPE than that model. Compared with Wang et al. [26], another study utilizes LASSO-based model for ship fuel consumption and achieves a RMSE of 7.4, and the performance of our model seems better. While the workflows of LASSO algorithm in two papers are similar, the RMSE value depends on the data selection. Wang et al. [26] collected data from 97 ships with different sizes. In contrast, our paper focuses on only one ship (CMA CGM Chile), in order to achieve the best accuracy on this ship and precisely simulate the routing decision. Our aim is not to develop a general fuel consumption that can be applied to all ships, indeed, which is even impossible. 
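For reference, the two error metrics used in the comparison (MAPE and RMSE) can be computed as in the short sketch below. The function names and the sample values are hypothetical and are not taken from Table 7.

```python
from math import sqrt


def rmse(actual, predicted):
    """Root mean square error, in the units of the response."""
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be nonzero)."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical observed and estimated fuel consumption values.
observed = [100.0, 200.0, 400.0]
estimated = [110.0, 190.0, 400.0]
err_rmse = rmse(observed, estimated)
err_mape = mape(observed, estimated)
```

Because RMSE is scale dependent while MAPE is relative, RMSE values from datasets with different ships and consumption levels are not directly comparable, which is the caveat raised in the comparison with Wang et al. [26].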
These results show that the LASSO algorithm can estimate the fuel consumption of the test ship with considerable precision. The rest of the case study applies the MOPSO algorithm based on the estimated fuel consumption model.
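For reference, the two error measures used in the comparison can be computed as in the following minimal sketch. The numbers are hypothetical and are not taken from Table 7.

```python
import math


def mape(actual, predicted):
    """Mean absolute percentage error in percent; assumes no actual value is zero."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean square error, in the same unit as the inputs."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


# Hypothetical fuel consumption values, for illustration only.
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 190.0, 400.0]

print(round(mape(actual, predicted), 3))  # 5.0
print(round(rmse(actual, predicted), 3))  # 8.165
```

MAPE is scale-free, whereas RMSE carries the unit of the response, which is one reason RMSE values from studies with different data are hard to compare directly.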

Setting of MOPSO Parameters.
The MOPSO parameters were configured as follows: the swarm size N is 200, and the maximum number of iterations is 100. Under this setting, the program took 47 seconds on average over 30 repeated runs. In 24 out of the 30 runs, the results were basically consistent, indicating the stability of the MOPSO algorithm. Figure 8 shows the convergence curves of the two objective functions. It can be seen that the MOPSO algorithm had converged to the optimal values of the two objectives by the 100th iteration.
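The Pareto optimal solutions sought by MOPSO are those not dominated by any other candidate in both objectives. The sketch below shows only this dominance test for two minimized objectives (total voyage cost and total voyage time); it is not the MOPSO algorithm itself, and the candidate values are hypothetical.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and strictly better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(points):
    """Keep the points that no other point dominates (both objectives minimized)."""
    return [p for p in points if not any(dominates(q, p) for q in points if q != p)]


# Hypothetical candidates: (total voyage cost in USD, total voyage time in hours).
candidates = [
    (2_700_000, 800.0),
    (2_600_000, 850.0),
    (2_750_000, 820.0),  # dominated by (2_700_000, 800.0)
    (2_500_000, 900.0),
    (2_650_000, 900.0),  # dominated by (2_600_000, 850.0)
]

front = sorted(pareto_front(candidates), key=lambda s: s[1])
print(front)
```

Sorted by voyage time, the surviving points show cost falling as time rises, which is the trade-off a decision-maker faces when choosing among non-dominated solutions.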

Analysis on the Current Situation: Why Detour?
Recall the phenomenon introduced in Section 4.1: the ship "CMA CGM Chile" sailed through Route SC at a fuel price of 365.5 USD/ton and detoured to Route CGH at a fuel price of 190.6 USD/ton. Using our proposed algorithms, we are able to explain why the ship chose to detour. As shown in Figure 9, the optimal solutions of both Route SC (depicted in green) and Route CGH (depicted in red) follow the pattern that the total voyage cost decreases as the total voyage time rises. Since the decision-maker has to choose from the optimal solutions, it is clear that the two objectives, minimizing the total voyage cost and minimizing the total voyage time, conflict with each other. To compare the optimal solutions of Route SC and Route CGH, we applied the principle of selecting the optimal solution from the Pareto frontier described in Section 3.3. Figure 9(a) displays the total voyage cost and total voyage time of Route SC at the fuel price of 365.5 USD/ton. It can be seen that the total voyage cost of the Pareto optimal solution was 2,711,000 USD, and the total voyage time was 793.3 hours. As shown in Figure 9(a), the total voyage cost of Route CGH stood at 2,849,000 USD, and the total voyage time lasted 919.9 hours. Route CGH thus had a greater total voyage cost and a longer total voyage time than Route SC. Hence, it is more cost-effective to choose Route SC at the fuel price of 365.5 USD/ton. After the outbreak of COVID-19, the global fuel price plunged. Figure 9(b) displays the total voyage cost and total voyage time of Route SC at the fuel price of 190.6 USD/ton. The total voyage cost was 1,822,000 USD, much smaller than that in the scenario of 365.5 USD/ton. The total voyage cost of Route CGH was 1,696,000 USD, down by 1,153,000 USD from that at the fuel price of 365.5 USD/ton and 126,000 USD lower than that of Route SC.
Although the total voyage time of Route CGH at the fuel price of 190.6 USD/ton was 140.2 hours (about 5.8 days) longer than that of Route SC, by balancing the different objectives, the ship chose the longer route at this fuel price. Table 8 compares the voyage time and speed of each leg on the two routes, as estimated by the methodology proposed in this paper, with the actual values collected from AIS data. The relative error of voyage time and speed was less than 9%, except for Leg 7 of Route CGH in the scenario of 190.6 USD/ton. It can also be seen that the sailing speed fluctuated over Legs 1 to 16 of the voyage. On the short-distance Legs 1 to 4, the speed obtained by the model is basically consistent with the actual speed. Similarly, on the long-distance Legs 5 to 16, the estimated values are also close to the actual values. In short, the estimates closely match the actual values. The results again indicate that our methodology can predict the actual voyage with good precision.
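The comparison can be summarized with the differences implied by the figures quoted above. The symbols C (total voyage cost) and T (total voyage time) are shorthand introduced here for illustration only.

```latex
\begin{align*}
\text{365.5 USD/ton:}\quad
  & C_{\mathrm{CGH}} - C_{\mathrm{SC}} = 2{,}849{,}000 - 2{,}711{,}000 = 138{,}000\ \text{USD}, \\
  & T_{\mathrm{CGH}} - T_{\mathrm{SC}} = 919.9 - 793.3 = 126.6\ \text{h}, \\
\text{190.6 USD/ton:}\quad
  & C_{\mathrm{SC}} - C_{\mathrm{CGH}} = 1{,}822{,}000 - 1{,}696{,}000 = 126{,}000\ \text{USD}, \\
  & T_{\mathrm{CGH}} - T_{\mathrm{SC}} = 140.2\ \text{h} \approx 5.8\ \text{days}.
\end{align*}
```

At the higher fuel price Route CGH is worse on both objectives, whereas at the lower fuel price it trades about 5.8 extra days for a saving of 126,000 USD.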

Scenario Analysis: When to Detour?
The fuel prices and charter rates are time-varying factors. Figure 10 shows the change in fuel price and charter rate from December 20th, 2019, to July 3rd, 2020, collected from Clarksons [43]. According to the actual distributions of fuel price and charter rate, four scenarios were designed (Figure 11): high fuel price and high charter rate (HFHC), high fuel price and low charter rate (HFLC), low fuel price and high charter rate (LFHC), and low fuel price and low charter rate (LFLC). Although the HFLC scenario occurs only occasionally in reality, it was simulated to provide more comprehensive results. The combinations of fuel price and charter rate in HFLC were generated randomly within the range of HFLC and are represented by yellow rectangles in Figure 11. The actual combinations of fuel price and charter rate are represented by blue dots. Then, the total voyage cost and the relationship between the four scenarios were analyzed in detail. Table 9 provides the total voyage costs and the total voyage times of Route SC and Route CGH in Scenarios HFHC, HFLC, LFHC, and LFLC. In Scenarios HFHC, LFHC, and HFLC, the total voyage cost of Route SC was at least 120,700 USD, 83,700 USD, and 130,300 USD lower than that of Route CGH, respectively, as shown by the entries marked by * in Table 9, and Route SC retains an absolute advantage in total voyage time owing to its shorter sailing distance. Therefore, the operator will naturally choose Route SC in these three scenarios. By contrast, in Scenario LFLC, the total voyage cost of Route SC was higher than that of Route CGH for all fuel prices and charter rates, although Route SC took less time, as shown by the entries corresponding to LFLC in Table 9, marked by *. In this case, the operator's preference must be analyzed to find out which route is more competitive. Moreover, to induce the operator to select Route SC in Scenario LFLC, how should the authority of the Suez Canal adjust its canal toll?
In fact, the canal toll is a significant factor influencing the route choice when fuel price and charter rate are fixed. On April 1st, 2020, the authority of the Suez Canal announced a 6% discount for European ships [45], but the detours of many ships, including the CMA CGM ship, show that this discount was not sufficient to prevent the detour around the Cape of Good Hope. Here, taking the fuel price of 193.58 USD/ton and the charter rate of 20,500 USD/day on April 3rd, 2020, as an example from the Scenario LFLC set, we tested the impact of different discounts on the original canal toll (450,000 USD) on the choice between Route SC and Route CGH using the WS method in Section 3.4. The route choice is complicated by the fact that adjusting the canal toll affects both the total voyage cost gap ΔM1 and the total voyage time gap ΔM2 between the two routes. Therefore, we used utility (U) to represent the degree of satisfaction of the decision-maker. The values of utility can be calculated by the WS method, as mentioned in Section 3.4. The higher the value of U, the more attractive Route SC is. The results for ΔM1, ΔM2, and U are shown in Table 10. We also calculated the Suez Canal toll (Ppass) and the toll discount corresponding to U = 0, at which Route SC and Route CGH are equally attractive, by linearly interpolating between the Suez Canal tolls whose utilities are closest to zero. In Table 10, the positive U closest to zero and the negative U closest to zero under each cost preference are marked by *, and the Ppass corresponding to U = 0 was derived from these values. To keep ships from detouring to Route CGH, the Suez Canal Authority needs to keep U ≥ 0 when adjusting Ppass and the toll discount.
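The interpolation step can be sketched as follows. Given two tolls whose utilities bracket zero, the break-even toll is the point where the straight line through them crosses U = 0. The function name and the sample tolls and utilities are hypothetical and are not values from Table 10; only the original toll of 450,000 USD comes from the text.

```python
def toll_at_zero_utility(toll_a, u_a, toll_b, u_b):
    """Linearly interpolate the toll at which U = 0, given two (toll, U) pairs that bracket zero."""
    if u_a == u_b:
        raise ValueError("utilities must differ to interpolate")
    if u_a * u_b > 0:
        raise ValueError("utilities must bracket zero (one >= 0, one <= 0)")
    return toll_a + (0.0 - u_a) * (toll_b - toll_a) / (u_b - u_a)


def discount_percent(toll, original_toll=450_000.0):
    """Discount relative to the original canal toll, in percent."""
    return 100.0 * (1.0 - toll / original_toll)


# Hypothetical bracket: U is positive at the lower toll and negative at the higher toll.
break_even = toll_at_zero_utility(400_000.0, 0.25, 420_000.0, -0.75)
print(break_even, round(discount_percent(break_even), 2))
```

Any toll at or below the break-even value keeps U ≥ 0 in this sketch, so the corresponding discount is the smallest one that makes Route SC at least as attractive as Route CGH.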
The results in Table 10 have two implications. First, if the operator does not pay much attention to the voyage cost, Route SC remains clearly competitive even without any discount, that is, when the canal toll is still 450,000 USD. For example, when the operator's cost preference is reduced to 0.3, that is, when the operator pays more attention to the loss of economic value caused by the delayed delivery of the cargo, the operator will undoubtedly choose Route SC. In other words, if the arrival time of the ship at the destination is very important, for instance, when the operator is not allowed to violate the delivery agreement, the ship will not detour via Route CGH. However, April 2020 was a relatively special time window. COVID-19 triggered many chain reactions in the global shipping market: some operators had no time to attend to the delivery of cargo in transit, many ports were closed down, and the fluctuation of fuel price and charter rate was abnormal. Route CGH became a cost-saving choice for some operators at that time. The second implication is that if the authority of the Suez Canal is willing to discount the canal toll, the discount will certainly affect the route selection, regardless of the operator's preference. The more weight the operator places on cost, the larger the discount the authority of the Suez Canal must offer for the operator to choose Route SC. For example, when the cost preference equals 1, the authority of the Suez Canal needs to give a discount of more than 30.13% (marked by * at the bottom of Table 10) on the canal toll to change the operator's route choice. As analyzed above, no discount is needed at a cost preference of 0.3.
For the cost preference of 0.4, the authority of the Suez Canal only needs to reduce the canal toll to about 408,300 USD; that is, a 9.27% discount (marked by * at the bottom of Table 10) is enough to bring the ship back to Route SC.
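As a quick check of how the quoted discounts map to tolls, a discount d on the original toll of 450,000 USD gives the following values; the notation is introduced here for illustration.

```latex
\begin{align*}
P_{\mathrm{pass}} &= 450{,}000 \times (1 - d), \\
d = 9.27\%:\quad P_{\mathrm{pass}} &= 450{,}000 \times 0.9073 = 408{,}285\ \text{USD}, \\
d = 30.13\%:\quad P_{\mathrm{pass}} &= 450{,}000 \times 0.6987 = 314{,}415\ \text{USD}.
\end{align*}
```

The first value rounds to about 408,300 USD, and the second is the toll below which Route SC becomes preferable for an operator who considers cost alone.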
Finally, we conducted a computational experiment on the canal toll discounts that the authority of the Suez Canal should offer to operators with different preferences under all possible fuel prices and charter rates in the Scenario LFLC set. The results in Table 11 show that when the operator's cost preference is higher than 0.5, a canal toll discount of at least 4.60% (marked by * in Table 11) is needed for the ship to return to Route SC. Of course, only with a large discount of 56.32% (marked by * in Table 11) on the canal toll will none of these ships detour via Route CGH. On the other hand, when the operator's cost preference is less than 0.5, there is no need to reduce the canal toll in some cases, such as ω = 0.1 or 0.2. Even for operators with a time preference, if the authority of the Suez Canal expects all of them to choose Route SC, a 38.19% discount (marked by * in Table 11) is still required.

Conclusions
We observed that, for a period of time after the outbreak of COVID-19, some ships in the Europe-Asia trades chose to detour via the Cape of Good Hope, which prompted us to explore the reasons behind this choice. In view of the different factors that need to be considered in the operation of a real container ship, including external factors like fuel price, charter rate, canal transit charge, and navigation environment, as well as controllable factors like sailing speed and sailing route, a research framework comprising data preparation, LASSO regression training, the MOPSO algorithm, and the WS method was constructed. First, LASSO regression was used to build a ship fuel consumption estimation model, which can effectively select variables from multiple eigenvariables with strong multicollinearity. Next, a mathematical model was established to minimize the total voyage cost and the total voyage time and was solved by the MOPSO algorithm. After that, the WS method was introduced to deal with the route selection. Finally, a case study on a real container ship named "CMA CGM Chile" was carried out to verify the framework. The results show that the proposed method can be used as a decision support tool for route planning.
In the case study, the optimal decisions were obtained for different combinations of fuel price and charter rate. In most cases, the Suez Canal was found to be the more attractive option. Four scenarios (HFHC, HFLC, LFHC, and LFLC) were designed based on actual fuel prices and charter rates. The detour around the Cape of Good Hope is preferred only in the LFLC scenario. Therefore, in the LFLC scenario, the authority of the Suez Canal needs to reduce the canal toll, which always has an effect on winning back ships. Our computational experiment shows that the authority of the Suez Canal has to offer a discount of at least 4.60% to operators who are sensitive to cost. For operators with a time preference, the canal toll need not be adjusted in some cases. However, if the authority of the Suez Canal wants to win back all the ships, a huge discount of 56.32% on the canal toll is required. The limitation of this paper lies in the neglect of accidents and piracy, which might also affect route planning. To better reflect the actual situation, future research will take the occurrence probability of accidents and piracy during the voyage into account. In addition, the emission cost may need to be considered if the authorities of the Suez Canal or the Cape of Good Hope put forward environmental policies.

Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.