Student Enrollment and Teacher Statistics Forecasting Based on Time-Series Analysis

Education competitiveness is a key feature of national competitiveness. It is crucial for nations to develop and enhance student and teacher potential to increase national competitiveness. The decreasing population of children has caused a series of social problems in many developed countries, directly affecting education and com.petitiveness in an international environment. In Taiwan, a low birthrate has had a large impact on schools at every level because of a substantial decrease in enrollment and a surplus of teachers. Therefore, close attention must be paid to these trends. In this study, combining a whale optimization algorithm (WOA) and support vector regression (WOASVR) was proposed to determine trends of student and teacher numbers in Taiwan for higher accuracy in time-series forecasting analysis. To select the most suitable support vector kernel parameters, WOA was applied. Data collected from the Ministry of Education datasets of student and teacher numbers between 1991 and 2018 were used to examine the proposed method. Analysis revealed that the numbers of students and teachers decreased annually except in private primary schools. A comparison of the forecasting results obtained from WOASVR and other common models indicated that WOASVR provided the lowest mean absolute percentage error (MAPE) and root mean square error (RMSE) for all analyzed datasets. Forecasting performed using the WOASVR method can provide accurate data for use in developing education policies and responses.


Introduction
For decades, the near-global decline in fertility has led to considerable socioeconomic changes. The low fertility rate observed in many countries is likely the result of economic, social, cultural, and institutional transformations [1]. Some theories linking broad social changes to fertility decline may be relevant to all countries. Common trends for fertility patterns are also present in many regions. Other theories discuss the situation unique to a particular country. Although the fertility transition has taken place globally, the rate of fertility decline, levels that have been hit, and current fertility rates differ by country. The decline in fertility rates in certain societies is likely to result from an interplay of global phenomena, regional policies, and local forces. Weakening economic and global competitiveness and decreasing birthrates will challenge the competitiveness of a nation by leaving it with a labor shortage.
Although demographic change is considered a constant force, educational institutions are pioneers in each generation's shifting composition. Demographers have predicted that schools will change significantly, which will have an impact on the general public as well. The number of students attending school in Taiwan has decreased, which inevitably has an effect on teacher training programs. Given the difference between the teachers' supply and demand, the cultivation of teachers is ultimately affected. The out-of-balance teaching market in Taiwan is a problem that demands attention from the government, educational administrators, and educators, all of whom are responsible for implementing new strategies to rebalance the teaching market.
Current education sustainability heavily depends on the strategic and budgetary planning of each institution, particularly for student enrollment and teacher recruitment. For instance, colleges and universities must achieve enrollment goals annually to achieve the institutional mission and maintain economic vitality. Teacher statistics also require considerable attention to understand the retirement wave and status of the current teaching market. With a view to proposing the most suitable strategies, administrators turn to enrollment professionals to forecast prospective numbers of students and teachers, built on their area of interest. Using this information will assist administrations in accurately distributing resources and constructing future decisions. The forecast presented in this paper can be adapted to project new enrollment or recruitment goals for any given term, regardless of institution type.
Time-series analysis can be used for the trend analysis of time-series data. Time-series data refer to data that are arranged according to a series of time periods or intervals. Time-series analysis involves testing linear or nonlinear relationships among dependent variables. Many linear and nonlinear approaches have been proposed for forecasting in various time-series studies [2,3]. Linear approaches include exponential smoothing (ETS) [4] and regression-based models, such as autoregressive integrated moving average (ARIMA) [5] and Trigonometric Seasonal Box-Cox Transformation with ARMA residuals Trend and Seasonal Components (TBATS) [6]. The ETS method, which was proposed in 1963 by Brown [4], is a data averaging approach that considers three factors: the error, trend, and season. The maximum likelihood estimation is used in ETS to optimize initial values and parameters, and the optimal ETS model is selected. Moreover, the weight of ETS data decays exponentially. The ARIMA model, which is a well-known timeseries prediction method, was proposed by Box and Jenkins [5]. In the ARIMA method, several fragments formed after a time series that has passed are used as input. Moreover, regression analysis is performed to establish a mathematical prediction model, which is often used for the prediction of short-term economic trends. TBATS, which was proposed by Livera in 2011 [6], is a novel method that integrates trigonometric seasonality, Box-Cox transformation, ARIMA error trends, and seasonal components. The TBATS model is an extension of ETS. The TBATS model can forecast whether seasonal data exist and can analyze the data. These methods, which are heavily dependent on linear assumptions, involve using historical datasets to forecast future flow through a univariate or multivariate mathematical function. However, these models are theoretically linear and can be hindered by their weak nonlinear fitting capabilities. The linear assumption makes it difficult for the aforementioned models to process complex nonlinear problems and obtain ideal prediction results.
The spatiotemporal pattern mining method is applied to track the sequence of frequently occurring events in spatiotemporal datasets. Many spatiotemporal pattern-based forecasting and detection methods have been proposed to the prediction accuracy of the sequence of frequently occurring events. For instance, Dubey et al. used a fast and accurate widearea backup protection scheme for transmission lines based on the synchronized phasor measurement [7]. Cui et al. integrated the spatiotemporal model of system measurement into a flexible Bayes classifier for network attack detection [8]. Sun et al. also used an optimized temporal-spatial scheduling strategy, in the presence of distributed generators, to schedule appropriate charging requirements of plug-in electric vehicles [9]. Furthermore, Cui et al. used the generalized graph Laplacian matrix to visualize the spatiotemporal relationship of the distributed layer phasor measurement unit data [10]. Reinforcement learning (RL) is a powerful technique in machine learning that helps generalization because it enables the design of model-free methods. In recent years, various studies have been conducted using RL. Oh and Wang proposed an RLbased energy storage systems operation strategy which was used to investigate the wind power generation forecast uncertainty [11]. Chen et al. proposed an offline predetermination-online-practice mode and embodied it as model-free control based on RL [12]. Another type of powerful machine learning technique is extreme learning (EL). The calculation is based on a single hidden layer feedforward neural network, which calculates the random weight between the input layer and the hidden layer. By using the EL method and error correction technique, Li et al. proposed a combined statistical method for wind power forecasting [13]. Nonlinear approaches such as radial basis function (RBF) neural networks, multilayer perceptron (MLP) neural networks [14,15], SVR [16], bagging predictors [17], and regression-based trees [18] have attracted considerable research interest. These approaches have demonstrated sufficient nonlinear fitting capabilities for forecasting demand. The SVM method, which was proposed by Valdimir and Vapnik in 1995 [2], maps all data points into a high-dimensional space and then generates a hyperplane to maximize the boundary between two classes and separate them. SVR was proposed by Vapnik et al. in 1997 [19]. Compared with SVM [3], SVR comprises loss functions, penalty factors, and other elements for improving the robustness of the model. In the SVR method, the total distance from each point to a hyperplane is calculated after mapping data points to a high-dimensional hyperplane. The hyperplane with the smallest total distance is the best solution. SVR has been used successfully to solve various problems in numerous fields, such as medicine [20], and has been proven to be a superior prediction model for time-series analysis [21] and regression analysis [22]. Research based on ANNs and SVR, which have promising nonlinear adaptability, has widened the application of nonlinear methods. Cang et al. proposed a nonlinear combination method that involves the use of MLP neural networks to map the nonlinear relationship between inputs and outputs [23]. Oliveira suggested that SVR can be used to estimate the software project effort. The findings of Oliveira indicate that the performance of SVR is superior to that of RBF neural networks [24]. However, the use of inappropriate parameter settings influences the 2 Computational Intelligence and Neuroscience implementation of ANN and SVR methods. Studies have indicated that the accuracy of the aforementioned methods strongly depends on the values of their hyperparameters. Therefore, thorough guidelines must be developed [25,26]. Hyperparameter optimization methods have attracted considerable research attention and have been applied in various areas. Many machine learning algorithms, such as the genetic algorithm (GA) [27], particle swarm optimization (PSO) [28], and differential evolution (DE) algorithm [29], have been proposed for optimizing SVM hyperparameters. Luo introduced a novel artificial intelligence approach for predicting the vertical load capacity of driven piles in cohesionless soils through SVR optimized by the GA [30]. Huang et al. combined the PSO algorithm with a backpropagation neural network to establish a demand estimation model [31]. Furthermore, Hasanipanah et al. proposed a new hybrid PSO-SVR model for predicting air overpressure caused by mine blasting [32]. Kuo and Li presented a three-stage method that integrates wavelet transforms, firefly algorithm-based K-means algorithms, and firefly algorithm-based SVR for forecasting Taiwanese exports [33]. Seyedpoor proposed a combination of SVR and the DE algorithm for identifying damage in moment frame connections [34]. Support vector regression (SVR) was proposed by Vapnik et al. in 1997 [19]. The SVR algorithm includes the insensitive loss and penalty factor functions; thus, it has higher robustness than the support vector machine (SVM) algorithm [2,3]. SVR has been proven to be a superior forecasting model for time series [21] and regression analysis [22]. It has three hyperparameters: the regularization parameter (C), bandwidth of the kernel function (σ), and tube size of the ε-insensitive loss function (ε). These hyperparameters considerably affect the accuracy of SVR forecasting. However, automatic adjustment of the aforementioned three hyperparameters in SVR remains a prominent challenge for improving the accuracy of SVR forecasting. The whale optimization algorithm (WOA) exhibits a superior performance to other well-known heuristic methods in solving global optimization problems [35] because it can strike a balance between exploitation and exploration during iterations [36]. The WOA can effectively avoid the problem of local optima and maintain rapid convergence. The WOA can be used with an optimization algorithm to avoid the selection of unsuitable hyperparameters, which can result in overfitting or underfitting [37]. Furthermore, the WOA has the advantages of global optimization capabilities, few control hyperparameters, and easy implementation. The WOA has been successfully applied in various optimization problems, such as photovoltaic cell parameter estimation [38], wind speed prediction [39], and energy-related carbon dioxide emission prediction [40]. In this study, we propose the WOASVR algorithm, which is a combination of the WOA and SVR algorithm, for the highaccuracy time-series forecasting analysis of student enrollment and teacher statistics in Taiwan. The WOA was used to obtain suitable hyperparameters for SVR. Experimental results indicated that the WOA outperformed the GRID and PSO algorithms in terms of reducing regression errors in SVR parameter estimation.

Related Works
Fertility rates are decreasing worldwide. The world's total fertility rate dropped from 5.0 children per woman in 1960 to 2.5 children per woman in 2014 [41]. In the early 1970s, about 43% of the world's population lived in high-fertility countries where women on average had five or more children over their reproductive years and about 18% lived in countries with fertility rates below the replacement level (i.e., 2.1 children per woman) [42]. At present, approximately 46% of the world's population live in countries with subreplacement fertility and only 8% live in high-fertility countries [42]. As a country enters into demographic transition, its fertility may decline. What was probably unexpected was the enormous population affected by the decline in this short period of time [43][44][45], which is as Caldwell [46] described "unpredicted and unprecedented." If the fertility rate continues to remain at such an ultralow level, high-income Asian economies will face serious challenges resulting from depopulation and rapid aging. The shrinking labor force and increasing economic burden of supporting elderly people will pose serious threats to their socioeconomic development and sustainability. At present, governments in Asian societies with ultralow fertility recognize the need to raise fertility, but "exactly what should be done remains elusive" [47,48]. Formulation of effective population and fertility policies entails a good understanding of the commonalities and uniqueness of the situation in East and Southeast Asia, in comparison with Western countries.
McDonald [49] has pointed out that low fertility rates in Asian societies are related to (1) rising economic risk and insecurity (particularly after the 1997 Asian financial crisis); (2) the difference in gender equality at home and at work; and (3) the lack of support from governments, employers, and society for family needs of young adults [49]. In Asian countries, marriage is often seen as a package of childrearing and bearing, caring for seniors, and other family obligations, which puts a much heavier burden on women than on men [45]. Furthermore, these societies place high societal expectations on children's education and career achievements, which places great pressure on parents, particularly mothers [50].
As the country with the lowest birthrate [51], the birthrate of the Taiwanese has become the primary problem of population change. Taking a long view, the low birthrate will not be a short-term problem for Taiwan, but a wave of shocks to the future. In education, primary, secondary, and tertiary education institutions are facing challenges that they have not seen in the past. The education problems faced by the younger generation include a decrease in the number of students enrolled and a surplus of teachers [52]. As a result of the low birthrate, primary schools have reported that the wave of decline has affected heavily [48], and with the instability of the retirement system resulting from previously announced adjusted pension policies, the older group of school teachers retired several years ago [53]. Therefore, the faculty in primary schools represents a younger population, and more teachers are available than are required; this has resulted in a wide range of "stray" teachers, and those with a Computational Intelligence and Neuroscience teaching license are yet unable to obtain a formal position due to a lack of vacancies [52]. Without new vacancies opening in the near future and with birthrates continuing to decline, the problem of surplus education graduates and excess of existing teachers demands immediate attention from authorities. In the aspect of the reduction of student enrollment, schools must face the phenomenon of reducing the number of classes year by year, which also implies that school funding cuts are likely to occur [54]. This is happening not only in urban schools; schools in rural areas are battling the crisis more intensively. Many more agricultural counties, such as Kaohsiung, Pingtung, and Tainan counties, have faced a tendency of combining multiple smaller schools as a type of policy adjustment [55]. With the survival of many schools at risk, the politics of education reform will without doubt become significantly more difficult for the Taiwan government.

SVR. SVM is a machine learning method based on statistical learning theory (Vapnik-Chervonenkis theory)
and structural risk minimization [56]. The concept aims to find the support vector in the data space to distinguish two different categories to construct a hyperplane in the highdimensional feature space. This hyperplane can maximize the boundary distance between the two categories and distinguish the two categories of data correctly to obtain high classification accuracy.
In 1997, the introduction of Vapnik's insensitive loss function ε [57] was extended to solve nonlinear regression estimation and time-series prediction. The basic idea is to give a set of data (x i , y i ), ..., (x n , y n ), where x i ∈ R d and y i ∈ R, x i is the input vector, y i is the target value, i � 1, 2, . . ., n, where n is the sample size of the training data, the x is transformed into a high-dimensional feature space F mapping through a nonlinear mapping ϕ(x), and linear regression is performed in the high-dimensional feature space. The SVR linear function is as follows: where w is a weight vector, which represents the flatness of f(x) in a high-dimensional space, and b is a deviation value; ϕ represents a high-dimensional feature space, which is a nonlinear mapping of the input space x. The coefficients of the parameters w and b can be estimated by minimizing the structural risk. The formula is as follows: where R reg (C) and R emp (C) represent the regression error and empirical error, respectively, which are calculated by the ε-insensitive loss function as equation (3). (1/2)w 2 is the penalty term; C is the penalty constant, which is used to control the degree of error penalty. The penalty term is traded off from the empirical error. To obtain w and b, a relaxation variable is added to equation (4). The formula is as follows: where ξ i and ξ * i are imported to measure all training data that fall outside the ε-insensitive loss interval. With Lagrange multipliers, satisfy α i × α * i � 0, α i ≥ 0, α * i ≥ 0, and include them into equation (1), as in the following equation: with the Lagrange multiplier brought into equation (4) to obtain the maximal dual equation. The formula is as follows: Computational Intelligence and Neuroscience where K(x i , x j ) is defined as the kernel function, and its value is the inner product of two vectors in the feature space ϕ(x i ) and ϕ(x j ). The kernel function can avoid complex calculations in high-dimensional spaces. In this study, a Gaussian radial basis kernel function (RBF) is used: where σ is the bandwidth of the RBF kernel function.
3.2. WOA. WOA was proposed by Mirjalili and Lewis [58]. It was inspired by whales' upward spiral bubble-net hunting behavior. In WOA, every humpback whale in the search space is the candidate solution of the optimization problem. The search whale is used to determine the global optimal solution. Given the initial random candidate solution, WOA updates the candidate solution until the end condition is met. A humpback whale randomly swims to search for prey, and spiral bubble-net predation establishes a mathematical model. The WOA has three different behavior patterns: encircling prey, bubble-net attack method, and search for prey. The details of the three behavior patterns are introduced subsequently.

Encircling Prey.
The humpback whales identify the prey position and surround the prey. It is assumed that the current position of the best individual whale is the position of target prey or closest to the target prey. Other whales in the population update their positions according to the current position of the prey. The updating function can be formulated by where t is the current iterations, is the current best solution position vector, and X �→ is the current solution position vector. The coefficient vectors A → and C → are as follows: where r → is a random vector between [0, 1], a → is linearly reduced from 2 to 0 during the iteration, and Max t is the maximum iterations.

Bubble-Net Attacking Method.
According to the foraging behavior of humpback whales using a bubble net, two kinds of behavior mechanisms exist: a shrinking encircling mechanism and a spiral updating position. The positions of contraction encirclement and spiral renewal are as follows: (1) Shrinking encircling mechanism: when a → decreases linearly from 2 to 0 during the iteration process, this behavior reduces A → in equation (10) by equation (12) to achieve contraction envelopment, in which A → is [−a, a] random value within the interval. Therefore, by setting the random value A → between [−1, 1], the position of the individual whale group appears at any position between the current position and the current position of the best solution.
(2) Spiral updating position: this stage calculates the distance between the individual whale group and its prey and then creates a spiral model between the individual whale group and its prey position to simulate the spiral swimming behavior of the humpback whale. The model can be formulated by where D ′ �→ is the distance between the current best position of the individual whale and its prey, b is a constant defining the spiral shape, and l is a random value between [−1, 1].
When humpback whales shrink around prey and move to feed along the spiral shape path at the same time, it is assumed that the probability of two behavior mechanisms selected in the process of updating the individual position of the whale group is 0.5. The position updating can be formulated by

Search for Prey.
When |A| ≥ 1, in the stage of searching for prey, individuals of the whale group randomly search for prey according to each other's position. A → takes a random value; when it is greater than 1 or less than −1, it forces the whale group to deviate from the prey's location to search for other more suitable prey, so that WOA has the best global search ability. The mathematical model is as follows: where X rand ���� �→ is a randomly selected position from the current whale group.

Selecting SVR Parameters
Using WOA. The SVR parameters are selected using WOA, as presented in Figure 1 and Algorithm 1. When setting up the SVR model, the parameters can influence the prediction effect. SVR includes three parameters: the C penalty constant, the ε-insensitive  Figure 1: WOASVR flowchart.
Initialize the whale population X i (i � 1, 2, 3, . . . , n) Initialize a, A, and C Calculate the fitness value of each whale X * � the best whale (1) t � 0 (2) while (t < Max_iter) (3) for each search whale (4) Update a, A, C, l, and p Update the position of the current whale by Encircling prey (8) else (9) if |A| ≥ 1 (10) Select a random whale (X rand ) Update the position of the current whale by search for prey (11) end (12) end (13) else (14) if (p ≥ 0.5) (15) Update the position of the current whale by bubble-net attacking method (16) end (17) end (18) end for (19) Check if any whale goes beyond the search space and amend it (20) Calculate the fitness value of each whale (21) Update X * if there is a better solution (22) t � t + 1 (23) end while (24) return X * ALGORITHM 1: Whale optimization algorithm. 6 Computational Intelligence and Neuroscience loss function, and the bandwidth of the σ kernel function. An improper approach could cause the overfitting or underfitting of parameter selections. In this study, the WOA algorithm was used to select the parameters of the SVR model. Figure 1 illustrates a flowchart of WOASVR. The process is as follows: Step 1: the WOA parameters are initialized, and then a whale is randomly generated in the search space. Each whale i is represented by x i � {C, ε, σ}.
Step 2: to evaluate fitness value, put the three parameters C, ε, and σ into the SVR model to predict the problem, and use k-fold cross-validation (CV) during the training phase to avoid overfitting and calculate the validation error value, divide the data randomly into k sets, then use one set as testing data and the remaining k − 1 sets as training data; repeat until each set has been used as test data. The final prediction result is compared with the actual result. The mean absolute percentage error (MAPE) is used as the adaptation function, and the k MAPEs are averaged to obtain the final MAPE cv . The value of k is set to 4, which is calculated as follows: where y i is the actual value, f i is the predicted value, and n is the sample size of the test data.
Step 3: use the fitness function of equation (18) to calculate the fitness value of the individual whale group. The best individual whale group with the best fitness value is saved as X * .
Step 4: if the current number of iterations (t) ≤ the maximum number of iterations (Max t ), update a, A, C, l, and p.
Step 5: when p < 0.5, A < 1 uses equation (9) to update the current position of the individual whale group. If A ≥ 1, randomly select the individual whale group position X rand from the current whale group and use equation (17) to update the current individual whale herd position.
Step 6: when p ≥ 0.5, use equation (13) to update the current position of the individual whale group.
Step 7: use the fitness function of equation (18) to calculate the fitness value of the individual whale group, and find and save the best individual whale group in the current group (X * ).
Step 8: determine whether the termination condition is met. If the condition is met, output the adaptive value position X * of the individual whale group; otherwise t � t + 1, repeat steps (4) to (7), and X * is the SVR model's optimal parameters.
Step 9: use the optimal parameters to train the SVR model.
Step 10: use the best trained SVR model to predict the result.

Performance Criteria.
To evaluate the forecast performance of the WOASVR model, two common statistical measures were used in this study, namely, the root mean square error (RMSE) and MAPE, for comparing the deviation of the actual and predicted values. The RMSE and MAPE metrics are expressed in (19) and (20), respectively.
where y i is the actual value, f i is the predicted value, and n is the sample size of the test data.

Datasets and Preprocessing. A hybrid model, combination of WOA and SVR (WOASVR), is presented to forecast student enrollment and teacher statistics in Taiwan.
To evaluate the proposed approach, we applied it to data on student enrollment and teacher statistics as a case study. From 1999 to 2018, data were collected in a Ministry of Education database, and demographics were categorized for public and private schools [59]. The training data used to train the algorithms consisted of the annual data for 1999-2012. The forecast accuracy was evaluated using the testing data, which consisted of the annual student enrollment and teacher statistics data for 2013-2018.

SVR Parameter
Settings. SVR has three hyperparameters: the regularization parameter (C), bandwidth of the kernel function (σ), and tube size of the ε-insensitive loss function (ε). These hyperparameters considerably affect the accuracy of SVR forecasting. The traditional SVR model uses a grid search method (GRIDSVR) to determine optimal hyperparameters. GRIDSVR increases the hyperparameters exponentially [60]. Therefore, the search parameters of GRIDSVR were set as follows: C � [2 0 , 2 1 , 2 2 , . . ., 2 22 ], σ � [2 −10 , 2 −9 , 2 −8 , . . ., 2 0 ], and ε � [2 −10 , 2 −9 , 2 −8 , . . ., 2 0 ]. In this study, two popular machine learning algorithms, namely, PSO and WOA, are proposed to optimize SVR hyperparameters. The population size in PSO was set as 50, the acceleration factors c 1 and c 2 were set as 2.0, and the maximum number of iterations was set as 100. The aforementioned hyperparameters were selected in accordance with the study of Bratton and Kennedy [61]. The population size in WOA was set as 20, and the maximum number of iterations was set at 100. The training results obtained from GRIDSVR, PSOSVR, and WOASVR methods with the selected hyperparameters for student and teacher forecasting are presented in Tables 1 and 2, respectively.

Comparison of the ETS, ARIMA, TBATS, GRIDSVR, PSOSVR, and WOASVR.
In this study, a WOASVR model which combined WOA with SVR algorithms was proposed.

Computational Intelligence and Neuroscience
The WOA method was used to optimize the SVR parameters and enhanced the performance of SVR model. The proposed model was applied to number forecasting of student enrollment and teachers to obtain minimal prediction error. To evaluate the performance of time-series forecast, the proposed model (WOASVR) was compared with an SVR model optimized by PSO or a grid search algorithm and the statistical models included ARIMA [5], ETS [4], and TBATS [6]. Mean absolute percentage error (MAPE) and root mean square error (RMSE) were used as a performance interpretation metric. In a practical time-series forecasting experience, a limited dataset is a frequent drawback for statistical models; it can be quite challenging to achieve the desired results with limited samples. The SVR approach with its strong generalization capability can effectively overcome technical challenges such as minimal datasets and nonlinear, high-dimensional, and local minimum values. Tables 3 and 4 present the average MAPE values of the forecasts obtained using ARIMA, ETS, SVR, PSOSVR, and WOASVR for both datasets (public school and private school). As shown in

Model Performance.
To achieve high efficiency, prediction accuracy, and stability, optimal hyperparameters must be determined for the SVM algorithm. However, the selection of appropriate SVR hyperparameters is a vital challenge. In this study, we proposed the WOASVR algorithm, which is a combination of WOA and SVR algorithms, for the high-accuracy time-series forecasting analysis of the student enrollment and teacher statistics in Taiwan. The algorithm WOA was used to determine suitable hyperparameters for SVR. The predictive power of the WOASVR approach was compared with those of five well-known algorithms: ARIMA, ETS, TBATS, GRIDSVR, and PSOSVR. The datasets of student and teacher numbers between 1991 and 2018 were used to compare the performance of the proposed algorithm with that of the aforementioned five  GRIDSVR, grid search support vector regression; PSOSVR, particle swarm optimization support vector regression; WOASVR, whale optimization algorithm support vector regression; C, penalty factor; ε, epsilon; σ, sigma. 8 Computational Intelligence and Neuroscience methods. The hyperparameters acquired using the WOASVR model were more accurate than those acquired using the GRIDSVR and PSOVR models; thus, the WOASVR algorithm was more effective on the optimization of SVR hyperparameters than the GRIDSVR and PSOSVR algorithms. The WOA algorithm can provide high local optima avoidance and convergence speed during the course of iteration. Moreover, the WOASVR algorithm achieved higher forecasting accuracy than the other methods. The results indicate that the WOASVR model is a superior method for forecasting student enrollment and teacher statistics.
Only two parameters were considered during the implementation of the WOASVR method: namely, the size ARIMA, autoregressive integrated moving average; ETS, exponential smoothing; TBATS, Trigonometric Seasonal Box-Cox Transformation with ARMA residuals Trend and Seasonal Components; GRIDSVR, grid search support vector regression; PSOSVR, particle swarm optimization support vector regression; WOASVR, whale optimization algorithm support vector regression; boldface, the best values in each row.               of the population in WOA and the maximum number of iterations. A large population size and number of iterations enhanced the searching ability for determining the trends of student and teacher numbers in Taiwan. However, the computational time also increased with the population size and number of iterations. In this study, the WOASVR method required a population size of only 20, whereas the PSOSVR method required a population size of 50. Using the common PSO parameter optimization approaches, the overall performance revealed that WOA achieved a better result and reduced operating costs in fewer iteration. Thus, the parameter-optimizing ability of WOASVR was superior to that of PSOSVR [58]. The high search capability of WOA was due to the population location update mechanism, which is presented in equation (17). Equation (17) requires population to move randomly in the initial steps of the iterative operation. However, in the remainder of the iterative operation, the high development and convergence derived from equation (15) are emphasized. In accordance with equation (15), the population repositions itself around the current best solution or moves in a spiral path to seek the new best solution. Because the aforementioned two stages were conducted separately and each stage only required approximately half number of the iterations for parameter optimization, WOASVR exhibited a better local optimal avoidance and convergence speed than the other methods [58].

Trend Evaluation.
In the early 2000s, new education institutions were still opening in Asia to cater to the increasing number of children, particularly the large numbers of babies born in the dragon years 1988 and 2000. Two decades later, many schools are closing. Ministry of education in various countries is left with reluctant decisions regarding shrinking school cohorts and is either closing or merging schools. The considerable decline of birthrates has evidently made an impact on student enrollment, with elementary schools first pushed to the edge. In this study, accurate forecasts from the proposed methodology offer findings that are capable of providing the government, administrators, and educators a picture of future school attendance trends and what these trends mean for the demand for education. Applicable strategies are also provided to accommodate the found trends.
Numerous trends are worth noting. First, as shown in Figures 2(a) and 2(b), the trends for public and private primary student enrollment were in opposite directions. Whereas public school student enrollment is declining, private schools are surprisingly gaining students. This finding can be explained by the emergence of the quantityquality trade-off theory developed by Becker and associates [62][63][64][65][66][67]. According to their model, the increasing marginal cost of quality (child output) in terms of quantity (number of children) leads to a trade-off between quantity and quality. Countries with a low fertility rate tend to be wealthier, with higher per capita incomes [67]. Smaller families could invest more in each child, thus improving education, health, and cognitive ability. Compulsory education provided by the Taiwanese state was meant to enable all citizens to receive basic education by providing education based on a standard academic curriculum. Private schools charge parents higher cost tuition than the norm claim to provide students additional assistance in academic training or opportunities to participate in additional extracurricular activities to cultivate additional "talents". Additional tuition is also often used to provide students better facilities and amenities than those provided to their public counterparts. Our results suggest that with the low fertility rate in Taiwan [51], current Taiwanese parents are willing to invest a relatively large amount of money in their children's education; therefore, they select private schools over public schools for their children's education. The trend of our forecast is consistent with the quantity-quality trade-off theory of Becker et al. [62][63][64][65][66][67]. Small Taiwanese families, which have strong financial capability, are willing to invest considerable money for their children's education, health, and cognitive skill cultivation.
The numbers of student enrollment and teachers also caught our attention. As shown in Figures 2(c) and 2(d), a quite considerable increase of student enrollment occurred in 2014. Students who were enrolled as freshmen that year were mostly born in 2000, the year of the dragon. There is typically a bump in the number of children born in the dragon year, considered in Chinese culture to be auspicious, which demands that school administrations and governments cater to this projected population. Lastly, another trend (presented in Figure 3) was a major drop in the number of teachers in 2015, possibly the result of retirement waves among public school teachers. In 2013, with the aim to prevent the existing pension system from going bankrupt, new pension policies were introduced by the Taiwanese government, which delayed the official retirement age. Around the same time, in hopes to foster future generations' key competency and the nation's economic competency, a new law named 12-year Curriculum for Basic Education was approved [68]. The law readjusted the Taiwan's education system to align with demographic changes and current problems the island was facing, and many changes were unfamiliar to teachers. Many teachers, under the pressure of facing foreign curriculums and the extended retirement age policy, decided to retire before the law was implemented, hence the considerable drop in the number of teachers. For the many challenges ahead, the following strategies are elaborated for future reference.

Further Strategic Implications.
It is inevitable that schools should be repurposed after a declining fertility trend. Schools bolster the local community and support residents' well-being, which contributes to a positive neighborhood [69]; therefore, to repurpose schools, developing community hubs on unused school space is essential [70]. A practical strategy is to place recruitment efforts in searching for new markets. Elder-friendly community models, as defined by the World Health Organization, are ones that promote active aging [71], which is a process that enhances life quality while aging. Engagement in meaningful activities, such as learning, can contribute to good physical and psychological health [72]. With the extra resources and facilities that were once used to accommodate the younger population's education needs, schools can provide the elder population an opportunity to learn as well.
A reform of the higher education system demands attention. Following universal expansion and reform, Taiwan's higher education system has received wide recognition and is considered reputable in Asia. Due to the "410 Demonstration for Education Reform" policy, the number of higher education institutions in Taiwan rapidly increased, from 130 in 1994 to 164 in 2007 [73]. In 2008, the percentage of high school graduates who entered university hit 95%, and it has remained this high since. This policy has then caused a failure of the system to be discriminative, and with the considerable birthrate decline, the future of Taiwan's higher education system is even less positive. Today, facilitating a university elimination mechanism for endangered schools is a necessity for Taiwan, by either shutting down those with poor performance or transforming them into other purposes. A well-thought scanning initiative will decrease wasteful investments and steer effort toward upgrading higher education quality, which is an additional implementation Taiwanese higher education currently needs [74]. Government and school administrators should rethink the concepts and directions of school management and adjust administration strategies and policies in response to the declining birthrate in Taiwan. To increase national competence, accurate forecasting of student enrollment and teacher statistics is critical for ensuring that human resources can be effectively developed in the future.

Conclusion
Inevitable demographic change is a force all nations must face today. With the phenomenon of a declining birthrate, government education departments, schools, and educators should rethink the concept and direction of school management and adjust administration strategies and policies in response to this social trend. To increase national competence, accurate forecasting of student enrollment and teacher statistics is critical to ensuring that human resources can be effectively developed. In this study, we proposed the WOASVR algorithm which was combined with WOA and SVR for the forecasting of student enrollment and teacher statistics. WOA was used to tune the suitable parameters for SVR. The datasets of student and teacher numbers between 1991 and 2018 were used to evaluate the performance of the proposed algorithm. The predictive power of the WOASVR approach was compared with well-known algorithms in five models: ARIMA, ETS, TBATS, GRIDSVR, and PSOSVR. The parameters acquired using the WOASVR model were more accurate than those acquired using GRIDSVR and PSOVR, which indicates that WOASVR is more effective at optimizing SVR parameters than GRIDSVR or PSOSVR. The results indicate WOA algorithm has the ability to provide high local optima avoidance and convergence speed during the course of iteration. Moreover, WOASVR achieved higher forecasting accuracy than the other methods, indicating that the WOASVR model is a superior method for forecasting student enrollment and teacher statistics.

Data Availability
The data supporting this study are openly available from the Ministry of Education, ROC, Taiwan, at https://english.moe. gov.tw/mp-1.html.