Cell Traffic Prediction Based on Convolutional Neural Network for Software-Defined Ultra-Dense Visible Light Communication Networks



Introduction
With the rapid development of communication networks and the continuous expansion of network scales, wireless communication systems face great challenges: they must support a large number of concurrent services with high connection density and heterogeneous service demands. This is especially the case in dense urban areas, where there are large-scale distributed wireless communication systems [1]. As a result, mobile communication traffic has witnessed explosive growth. Cisco predicts that IP traffic will triple from 2017 to 2022 [2]. In view of this space, and spectrum changes, and the redundant deployment of equipment will lead to high maintenance and energy consumption costs. To this end, this paper proposes a new network framework based on the software-defined network (SDN), the ultra-dense network (UDN), and combined optical fiber and visible light communication (FVLC), which is called the software-defined ultra-dense FVLC (SDUD-FVLC) network. Specifically, the architecture uses the idea of SDN to decouple the control plane and data plane of the device [5][6][7], making the network architecture more reasonable and orderly. It is applied to the FVLC network to optimize its structure in an ultra-dense network.
Many researchers have noticed that the control scheme proposed by the control plane is of paramount importance for resource allocation, and accordingly, they have proposed different solutions [8,9]. However, these solutions all plan for the state of the existing traffic and therefore react to dynamic traffic changes with a certain delay. To deal with this delay, we believe that it is necessary to proactively evaluate the state of traffic changes and formulate a control scheme based on the evaluation results. To this end, we propose a traffic prediction model based on convolutional neural networks. Convolutional neural networks can capture the characteristics and trends of dynamic traffic and predict the network traffic for a period of time in the future. On this basis, the behavior of the control plane becomes predictable and thereby well managed. To achieve this goal, this paper focuses on building a better traffic prediction model and proposes a 2D convolutional neural network (2D-CNN) model.
Specifically, traffic prediction for a cell is a time series analysis problem, and research on time series problems has become extensive. For example, Chen et al. [10] investigated the temporal and spatial characteristics of cellular network traffic and proposed a model of traffic changes over time and space based on weekly cellular network traffic data.
There are also other studies that make accurate predictions based on the nonstationarity and seasonality of cellular networks [11]. The prediction of cellular traffic in these studies can be regarded as a time series analysis problem whose performance depends on linear statistical models, such as autoregressive integrated moving average (ARIMA), α stability [12], and entropy theory [13]. Artificial intelligence and machine learning technologies have also been introduced to help solve traffic prediction problems, such as linear regression (LR) and support vector regression (SVR) methods [14,15]. However, with the development and popularity of 5G networks and the formation of ultra-dense network architectures, the traffic patterns of cellular networks have become highly complex. Classical methods based on linear models and simple neural networks are unable to provide precise predictions of network traffic in the 5G era [16]. In order to enable high-quality traffic prediction for a selected cell, some researchers have also used deep learning methods, like recursive neural computing [17], deep transfer learning [18], and deep meta-learning [19]. On the other hand, most of the aforementioned works can only handle grid-based traffic, which is of little help for cell traffic prediction of micro base stations in UDN.
In contrast, the authors in [20] employ a convolutional neural network to predict cell traffic and use the traffic of a selected cell and 80 adjacent cells to form a 9 × 9 2D image. Stacking these images over time forms a 3D dataset. The authors also divide the time correlation into two parts, a daily cycle and an hourly cycle, and then use a two-channel convolutional neural network for traffic prediction. Similarly, the authors in [21] use the same strategy to restructure the cell traffic data into 3D data according to spatial and temporal characteristics. In order to capture the correlation of the time series better, the data are sorted into three temporal components: hourly cycle, daily cycle, and weekly cycle. A three-channel convolutional neural network framework then takes these components as input to predict the cell traffic.
However, both of the above studies focus on the prediction of cellular traffic data of large base stations across a whole city. In addition, the two models place high requirements on the input data: they need the historical traffic data of 80 cells adjacent to the selected cell. For a micro base station, transmitting and storing the historical data of 80 cells will undoubtedly consume considerable network resources and memory. In UDN, the number of micro base stations is large, and their locations are densely distributed. Therefore, the above two models are inappropriate for predicting the cell traffic of densely distributed micro base stations in the UDN environment.
Considering the above constraints and challenges, the 2D-CNN model proposed in this paper is suitable for SDUD-FVLC networks. First, it does not require much historical data from the data plane to precisely predict cell traffic. Second, the control plane can formulate a reasonable control scheme based on the predicted cell state and traffic trend to reduce energy consumption and balance communication load. To summarize, the major contributions of this paper are as follows:
(i) This paper proposes a new network framework to meet the requirements of future networks.
(ii) Through traffic prediction, the future state of a cell can be known in advance, so that the control scheme can be formulated proactively to avoid unnecessary delay.
(iii) Using a CNN for traffic prediction can better capture the temporal characteristics of cell traffic.
(iv) Only a small amount of historical data from the cell is needed to precisely predict the traffic direction, which can save network resources and reduce the load on the control plane.
The remainder of the paper is organized as follows. Section 2 elaborates on the new network architecture of SDUD-FVLC. Section 3 discusses the source of the dataset used in the experiment and introduces 2D-CNN. In Section 4, the proposed 2D-CNN model is elaborated, and the related algorithm is given. Section 5 presents the comparison models and evaluation metrics, and then compares and analyses the experimental results of the different models. Finally, Section 6 concludes the paper.

Software-Defined Ultra-Dense Visible Light Communication Networks
In order to meet the challenges brought by UDN, a novel UDN architecture based on a software-defined network (SDN) is proposed in [22][23][24]. SDN is programmable, scalable, and flexible, which can effectively mitigate the problems of high energy consumption and large redundancy of UDN. Using the concept of SDN, the communication equipment is decoupled, the control plane and the data plane are separated, and the data plane is controlled through a unified control plane, thereby simplifying the management and control of the UDN. However, the arrival of 5G networks and the approaching 6G era have put forward higher requirements and challenges for the transmission speed of traffic and the quality of network services. In order to improve the quality of network service, a VLC network architecture based on SDN has been proposed [25,26]. Nevertheless, most indoor VLC communication environments are not very complicated, so SDN cannot play its maximum role there, and VLC is limited by its communication range.
Thus, the existing software-defined VLC schemes cannot efficiently solve the problems in the ultra-dense network environment. By combining optical fiber and VLC to form the structure of FVLC, the communication range of VLC can be expanded, so that it can be applied to large and ultra-dense areas, improve network service quality, and simplify network management and control. Therefore, in this paper, we look ahead to the future network architecture and propose a novel network architecture based on UDN, SDN, and FVLC. In this architecture, we divide the ultra-dense network area into many small areas. In each small area, we apply VLC communication and connect the devices through an optical fiber to form the FVLC structure in the ultra-dense network, which can improve the quality of service and transmission rate of the network. However, this inevitably forms a complex network environment, and it is difficult to control and deploy the ultra-dense network as a whole. Therefore, we use SDN to decouple the equipment in each small area and to control and allocate the data planes of the small areas in the ultra-dense area through a single control plane. In this way, the LED devices in FVLC only need to be responsible for data transmission, which greatly simplifies the function of the LEDs and greatly reduces their production and use costs. The resulting network architecture is shown in Figure 1.

Wireless Large Traffic Dataset.
The data used in this paper are obtained from the dataset provided by Telecom Italia [27], a large European telephone service provider. The dataset mainly includes the communication records of telephone services, SMS services, and Internet activities (62 days, 500 million records) in Milan and Trentino. In the spatial dimension, the dataset divides the entire city into an H × W grid, with each square in the grid referred to as a "unit"; each unit covers 0.05 square kilometers. The data used in this paper come from the Milan dataset, where H = W = 100, meaning that the city area is divided into 100 × 100 units. In the time dimension, the dataset recorded traffic from 00:00 on November 1, 2013, to 00:00 on January 1, 2014, at a 10-minute interval.
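Because the records arrive at 10-minute intervals while the model described later works on hourly values, some aggregation step is needed. The sketch below is a minimal illustration and not the authors' preprocessing: it assumes that hourly traffic is the sum of six consecutive 10-minute records, and the function name `aggregate_hourly` is hypothetical.

```python
def aggregate_hourly(ten_minute_values, slots_per_hour=6):
    """Sum consecutive 10-minute records into hourly totals.

    Assumption: hourly traffic is the sum of six 10-minute slots;
    an incomplete trailing hour is dropped.
    """
    full_hours = len(ten_minute_values) // slots_per_hour
    return [
        sum(ten_minute_values[h * slots_per_hour:(h + 1) * slots_per_hour])
        for h in range(full_hours)
    ]


# Thirteen 10-minute records: two full hours plus one leftover slot.
records = [1.0] * 6 + [2.0] * 6 + [5.0]
hourly = aggregate_hourly(records)
```

Here `hourly` holds one value per complete hour, so the leftover thirteenth record does not contribute.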

Temporal and Spatial Characteristics.
In this paper, we focus on the temporal characteristics of cell traffic. Figure 2 shows the time distribution of five different services (phone in, phone out, SMS in, SMS out, and Internet traffic) in randomly selected units in Milan over a two-week period (336 hours). It can be seen that the activities of the five services follow daily and weekly cycles. Obvious features, such as a significant reduction in traffic during the weekend, can be observed, suggesting that areas such as factories, business districts, and campuses are almost unpopulated on weekends. In addition, the activity levels of the five services differ slightly. The amount of Internet traffic is always greater than that of the other four services. Incoming phone traffic is almost the same as outgoing phone traffic, while incoming SMS activity is noticeably higher than outgoing SMS activity. Traffic prediction depends on these important temporal characteristics of mobile traffic, and convolutional neural networks can extract them well.
In addition, the prediction of cell traffic in the FVLC-UDN environment is closely related to the time series. In other words, the characteristics of the time series play a decisive role in the prediction of cell traffic. In our experiments, comparing predictions based on both temporal and spatial characteristics with predictions based only on time series features [20,21] shows that temporal correlation is more important than spatial correlation. The essence of the spatial characteristics is the similarity of the time series of adjacent cells. Spatial information is therefore not very helpful for traffic prediction, yet it requires a large amount of external data from nonselected base stations. In terms of resource utilization, adding spatial features to traffic prediction implies a large consumption of network resources. Taking [20] as an example, for the traffic prediction of a selected cell, the model requires several days of historical data from 80 cells adjacent to the target cell. Sorting and transmitting this input data consumes tremendous resources.
In the 5G era, UDNs consume a huge amount of network resources, and therefore, more attention should be paid to the management and reasonable use of these resources. Consequently, in the model proposed in this paper, we discard the spatial characteristics of cell traffic and focus only on its temporal characteristics, which still achieves excellent performance. In this way, the model requires less data and reduces the resources consumed by data transmission and processing.

2D Convolutional Neural Network.
In machine learning, the convolutional neural network (CNN) architecture has matured and has been used successfully in a variety of tasks, such as face recognition, scene labeling, and image classification [28]. With the development of the convolutional neural network framework, the data it can process have expanded from 1D data to 2D data and from 2D data to 3D data. The emergence of the 3D-CNN architecture makes it easier to manipulate volumetric data such as video and to extract the spatial and temporal characteristics of 3D data [29].
In the model proposed in this paper, we use the 2D-CNN architecture to exploit the temporal characteristics of the cellular network and predict the traffic of the cell in FVLC-UDNs. The structure of 2D-CNN is shown in Figure 3. In 2D-CNN, the convolution kernel slides in two dimensions. In Figure 3, $f_h$ and $f_w$ are the height and width of the convolution kernel, and $i$ and $j$ are the height and width of the 2D image. By sliding the convolution kernel with a stride of 1, the entire time series data can be covered effectively. When the convolution kernel reaches the edge of the data, zero padding is used so that the features of the edge values of the 2D image are not ignored. The 2D-CNN model described above can extract the data characteristics of 2D images well, thus making the traffic prediction for cellular networks more accurate.
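The sliding-kernel operation with stride 1 and zero padding can be made concrete with a short sketch. The code below is a plain-Python illustration under stated assumptions, not the authors' implementation: it handles a single channel, assumes odd kernel sizes so that the output keeps the input size, and uses a hypothetical function name `conv2d_same`.

```python
def conv2d_same(image, kernel):
    """2D cross-correlation (the 'convolution' used in CNNs), stride 1, zero padding.

    `image` is a list of rows; `kernel` is f_h x f_w with odd sizes,
    so the output has the same height and width as the input.
    """
    rows, cols = len(image), len(image[0])
    f_h, f_w = len(kernel), len(kernel[0])
    pad_h, pad_w = f_h // 2, f_w // 2
    output = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            total = 0.0
            for u in range(f_h):
                for v in range(f_w):
                    rr, cc = r + u - pad_h, c + v - pad_w
                    # Positions outside the image contribute zero (zero padding).
                    if 0 <= rr < rows and 0 <= cc < cols:
                        total += image[rr][cc] * kernel[u][v]
            output[r][c] = total
    return output


image = [[1.0, 2.0, 3.0],
         [4.0, 5.0, 6.0],
         [7.0, 8.0, 9.0],
         [10.0, 11.0, 12.0]]
box = [[1.0] * 3 for _ in range(3)]  # 3 x 3 kernel of ones
result = conv2d_same(image, box)
```

With a kernel of ones, each output value is the sum of the in-range neighbors, so corner outputs only see the four values that actually exist in the image.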

Time Series Modeling.
In this paper, to fully utilize the influence of the historical traffic data of the base station on the prediction accuracy, the daily period and the weekly period are taken as two features of the 2D image, and the traffic data in this specific period can be regarded as another feature. On this basis, the three features are expanded in an hourly cycle to form the data image as shown in Figure 4. When using a convolutional neural network for training, the model can extract the characteristics of the hourly cycle, daily cycle, and weekly cycle of traffic well, thus improving the accuracy of prediction.
As shown in Figure 4, the three features are continuously extended to form a 2D data image of size L × K. In this paper, we select 12 hours of continuous time and use the daily cycle, weekly cycle, and traffic data as three features, so L = 12 and K = 3, and we build a 2D data image of size 12 × 3. We construct the original dataset into a sequence of such 2D images, that is, into the continuous dataset $[X_{t-p}, X_{t-(p-1)}, \ldots, X_{t-1}]$. The value in the upper left corner of the 2D image is the continuous time in hours; that is, the 2D images are arranged in continuous time. At this point, we have finished modeling the input data (training set and test set). The original dataset spans 62 days. We select the last 7 days of data as the test set, and all previous traffic is divided into the training set and the validation set (at a ratio of 7 : 2), thus completing all processing of the data.
Figure 5 shows the overall framework of the 2D-CNN model proposed in this paper. It consists of several 2D convolution layers, max-pooling layers, Flatten layers, and several fully connected layers. The 2D convolutional layer is directly connected to the 2D image built in Section 3.1. Through a series of connected layers, the final prediction is obtained. By feeding the historical data of the selected cell into the 2D-CNN model, the cell traffic of the next hour is predicted, so that the control plane can provide an efficient control scheme based on the cell status in the next hour, which helps attain the goals of energy saving, load balancing, and delay reduction.
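The construction of the 12 × 3 images and the chronological split can be sketched as follows. This is an illustration under assumptions that the text does not spell out: the daily-cycle feature is encoded as the hour of day, the weekly-cycle feature as the day of week, hour index 0 is taken as 00:00 on day 0, and the split is applied to windows rather than raw hours. The names `build_images` and `chronological_split` are hypothetical.

```python
def build_images(hourly_traffic, window=12):
    """Turn an hourly series into (12 x 3 image, next-hour target) pairs.

    Assumed encoding: column 0 = hour of day (daily cycle),
    column 1 = day of week (weekly cycle), column 2 = traffic value.
    """
    images, targets = [], []
    for t in range(window, len(hourly_traffic)):
        rows = [
            [float(h % 24), float((h // 24) % 7), float(hourly_traffic[h])]
            for h in range(t - window, t)
        ]
        images.append(rows)
        targets.append(float(hourly_traffic[t]))
    return images, targets


def chronological_split(samples, test_hours=7 * 24, train_parts=7, val_parts=2):
    """Hold out the last 7 days for testing; split the rest 7:2 into train/validation."""
    head, test = samples[:-test_hours], samples[-test_hours:]
    n_train = len(head) * train_parts // (train_parts + val_parts)
    return head[:n_train], head[n_train:], test


# Synthetic 62-day hourly series, used only to exercise the functions.
traffic = [float(h % 24) for h in range(62 * 24)]
images, targets = build_images(traffic)
train, val, test = chronological_split(list(zip(images, targets)))
```

Keeping the split chronological, rather than shuffling before splitting, avoids leaking future traffic into the training set.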

Convolution Layer.
Using a 2D convolution layer, the features of the 2D image dataset modeled in Section 3.1 can be extracted effectively. After the size of the convolution kernel is selected, the kernel is slid over the surface of the 2D image, and when the edge of the image is encountered, zero padding is used so that the data characteristics of the image edge are not ignored. The result of the convolution operation on the data framed by the kernel is passed on as the input of the next network layer.

Max-Pooling Layer.
Because a large amount of 2D image data is used for training, the max-pooling layer is used to reduce the amount of calculation, memory usage, and the number of parameters (and thereby the risk of overfitting). After the size and stride of the receptive field in the pooling layer are determined, only the maximum input value of each receptive field enters the next layer of the network, while the other inputs are discarded. Besides reducing the amount of computation, memory usage, and the number of parameters, the max-pooling layer also introduces a certain degree of invariance to small changes. The max-pooling layer also has some disadvantages. First, it is destructive, since most of the inputs are discarded. Second, in some applications, invariance is not desirable, and the pooling layer cannot be used for them. In this model, however, the max-pooling layer reduces the amount of calculation, memory usage, and the number of parameters, so that the model runs faster and delivers the prediction for the selected base station in a more timely manner.
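As an illustration of how a pooling layer keeps only the maximum of each receptive field, consider the sketch below. The pooling size and stride (2 × 1 with stride 2 along the time axis) are hypothetical values chosen for the example; the text does not state the ones used in the model.

```python
def max_pool2d(feature_map, pool_h=2, pool_w=1, stride_h=2, stride_w=1):
    """Keep only the maximum of each receptive field; other inputs are discarded."""
    rows, cols = len(feature_map), len(feature_map[0])
    pooled = []
    for r in range(0, rows - pool_h + 1, stride_h):
        pooled_row = []
        for c in range(0, cols - pool_w + 1, stride_w):
            window = [
                feature_map[r + u][c + v]
                for u in range(pool_h)
                for v in range(pool_w)
            ]
            pooled_row.append(max(window))
        pooled.append(pooled_row)
    return pooled


feature_map = [[1.0, 9.0, 2.0],
               [4.0, 3.0, 8.0],
               [5.0, 6.0, 7.0],
               [0.0, 2.0, 1.0]]
pooled = max_pool2d(feature_map)
```

The four input rows shrink to two output rows, which is the source of the savings in computation and memory; the discarded values are also why pooling is described as destructive.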

Flatten Layer.
The output data of the max-pooling layer are 2D data, while the input data of the fully connected layer are 1D data. Through the Flatten layer, the output of the max-pooling layer is "flattened"; that is, the multidimensional data are converted to 1D to meet the input requirements of the fully connected layer.
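Flattening is a pure reshaping step with no trainable parameters. A minimal sketch, assuming the pooled output is stored as nested Python lists (the helper name `flatten` is hypothetical):

```python
def flatten(feature_maps):
    """Flatten nested lists of any depth into one 1D list (row-major order)."""
    flat = []
    for item in feature_maps:
        if isinstance(item, list):
            flat.extend(flatten(item))
        else:
            flat.append(item)
    return flat


pooled = [[[4.0, 9.0], [5.0, 6.0]],
          [[1.0, 0.0], [2.0, 3.0]]]  # 2 channels x 2 rows x 2 columns
vector = flatten(pooled)
```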

Fully Connected Layer.
After the 2D output of the max-pooling layer is converted into 1D data through the Flatten layer, it is fed to the dense part of the network, which is composed of three fully connected layers. The input and hidden layer neurons use ReLU as the activation function, and ReLU also acts as the activation function for the output neurons. We train our 2D-CNN model to predict traffic by minimizing the mean absolute error (MAE) between the predicted value $\hat{y}_n$ and the actual value $y_n$. The loss function can be defined as

$$\mathrm{MAE} = \frac{1}{N}\sum_{n=1}^{N}\left|y_n - \hat{y}_n\right|,$$

where $y_n$ is the actual value, $\hat{y}_n$ is the predicted value, and $N$ is the size of the dataset. The complete model is shown in Figure 5. It shows the connection mode of each network layer and the flow of data through the model.
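The dense part of the network and the MAE loss can be sketched in a few lines. The weights below are arbitrary toy values chosen so that the result is easy to check by hand, and the layer sizes are hypothetical; only the structure (three fully connected layers, ReLU everywhere, MAE loss) follows the description above.

```python
def relu(values):
    return [max(0.0, v) for v in values]


def dense(inputs, weights, biases):
    """One fully connected layer: weights[k] holds the input weights of neuron k."""
    return [
        sum(w * x for w, x in zip(neuron_weights, inputs)) + b
        for neuron_weights, b in zip(weights, biases)
    ]


def forward(flat_input, layers):
    """Apply each (weights, biases) layer followed by ReLU, output layer included."""
    activations = flat_input
    for weights, biases in layers:
        activations = relu(dense(activations, weights, biases))
    return activations


def mae(actual, predicted):
    return sum(abs(y - y_hat) for y, y_hat in zip(actual, predicted)) / len(actual)


layers = [
    ([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]),   # hidden layer 1 (2 neurons)
    ([[1.0, 1.0], [1.0, -1.0]], [0.0, 0.0]),  # hidden layer 2 (2 neurons)
    ([[0.5, 0.5]], [1.0]),                    # output layer (1 neuron)
]
prediction = forward([2.0, 3.0], layers)[0]
loss = mae([4.0, 2.0], [prediction, 2.0])
```

Using ReLU on the output neuron keeps predictions non-negative, which matches the fact that traffic volumes cannot be negative.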

Model Algorithm.
We trained the 2D-CNN model through the process of Algorithm 1, which consists of two parts: training instance construction and model training.
The 2D input images are constructed from the original traffic dataset and then fed into the model for training. After the data are constructed and the model is initialized as shown in Figure 5, training proceeds on small, randomly selected batches of training instances using a widely used optimization technique. The model is trained with the loss function defined in Section 4.2.4. During training, in order to prevent overfitting or the loss wandering around the minimum, we use early stopping to end training in time, so as to optimize the performance of the model.
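The training procedure (random mini-batches plus early stopping) can be outlined as below. This is a sketch, not Algorithm 1 itself: `train_step` and `validate` are hypothetical caller-supplied callbacks standing in for the optimizer update and the validation-loss computation, and the batch size and patience are illustrative values that the text does not specify.

```python
import random


def train_with_early_stopping(train_step, validate, instances,
                              batch_size=32, max_epochs=100, patience=5, seed=0):
    """Mini-batch training loop that stops when validation loss stops improving."""
    rng = random.Random(seed)
    best_loss, epochs_without_gain, history = float("inf"), 0, []
    for _ in range(max_epochs):
        order = list(instances)
        rng.shuffle(order)  # random selection of mini-batches
        for start in range(0, len(order), batch_size):
            train_step(order[start:start + batch_size])
        loss = validate()
        history.append(loss)
        if loss < best_loss:
            best_loss, epochs_without_gain = loss, 0
        else:
            epochs_without_gain += 1
            if epochs_without_gain >= patience:
                break  # early stopping: no improvement for `patience` epochs
    return best_loss, history


# Toy run with scripted validation losses in place of a real model.
fake_losses = iter([5.0, 4.0, 3.0, 3.5, 3.2, 3.1, 2.0])
batches_seen = []
best, history = train_with_early_stopping(
    train_step=batches_seen.append,
    validate=lambda: next(fake_losses),
    instances=range(10), batch_size=4, patience=3,
)
```

In the toy run, training stops after three consecutive epochs without improvement, so the final scripted loss of 2.0 is never reached; a larger patience trades extra training time for a lower chance of stopping too early.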

Evaluation Index.
In this section, we describe the quantitative evaluation metrics, following common practice in time series prediction. We use the root mean square error (RMSE) as an evaluation index. RMSE is defined as

$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{n=1}^{N}\left(y_n - \hat{y}_n\right)^2},$$

where $y_n$ is the real value in the dataset, $\hat{y}_n$ is the value predicted by the model, and $N$ is the number of test samples. A smaller RMSE indicates better performance.
The mean absolute error (MAE) is the average absolute difference between two time series. The definition of MAE has been introduced in Section 4.2.4. We also use the loss value of the trained model as an index to evaluate the performance of these models. A smaller MAE means that the model has higher performance.
We also use the mean absolute percentage error (MAPE), which is defined as

$$\mathrm{MAPE} = \frac{1}{N}\sum_{n=1}^{N}\left|\frac{y_n - \hat{y}_n}{y_n}\right|.$$

In this index, $y_n$, $\hat{y}_n$, and $N$ are the same as above, and a smaller MAPE means higher performance.
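The three error metrics are straightforward to compute. The sketch below uses only the standard library; MAPE is returned as a fraction (multiply by 100 for a percentage), which is an assumption about the reporting convention, and it is undefined whenever an actual value is zero.

```python
import math


def rmse(actual, predicted):
    return math.sqrt(sum((y - p) ** 2 for y, p in zip(actual, predicted)) / len(actual))


def mae(actual, predicted):
    return sum(abs(y - p) for y, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean absolute percentage error as a fraction; actual values must be nonzero."""
    return sum(abs((y - p) / y) for y, p in zip(actual, predicted)) / len(actual)


actual = [2.0, 4.0, 8.0]
predicted = [1.0, 4.0, 10.0]
scores = {
    "rmse": rmse(actual, predicted),
    "mae": mae(actual, predicted),
    "mape": mape(actual, predicted),
}
```

Because RMSE squares the errors before averaging, the single error of 2 in the example dominates it more than it dominates MAE.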
At the same time, we use the direction accuracy (DirAcc) to compare the predicted direction of change with the direction of change of the actual sequence. DirAcc aggregates, over the test data, the per-step indicator

$$d_n = \begin{cases} 1, & (y_n - y_{n-1})(\hat{y}_n - \hat{y}_{n-1}) \ge 0, \\ 0, & \text{otherwise}. \end{cases} \tag{5}$$

For this index, $y_n$ and $y_{n-1}$ represent the actual data at times $T$ and $T-1$, respectively, and $\hat{y}_n$ and $\hat{y}_{n-1}$ are the corresponding predicted values. The higher the value of the indicator, the better the prediction. From the definition of the indicator, we can see that DirAcc ≤ M (M is the size of the test data).
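The direction indicator can be computed directly from the two sequences. In the sketch below, the indicator follows the sign test described above; how the indicators are aggregated into DirAcc is an assumption, so the code exposes both the raw count of correct directions and the fraction of correct directions.

```python
def direction_indicators(actual, predicted):
    """d_n = 1 when actual and predicted changes agree in sign (product >= 0), else 0."""
    return [
        1 if (actual[n] - actual[n - 1]) * (predicted[n] - predicted[n - 1]) >= 0 else 0
        for n in range(1, len(actual))
    ]


actual = [1.0, 2.0, 1.5, 1.5, 3.0]
predicted = [1.1, 1.8, 1.9, 1.7, 2.5]
d = direction_indicators(actual, predicted)
hits = sum(d)               # count of correctly predicted directions
fraction = hits / len(d)    # share of correctly predicted directions
```

Note that a step where the actual traffic does not change counts as correct under the "product ≥ 0" rule, whatever direction the model predicts.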
RMSE, MAE, MAPE, and DirAcc will be used to evaluate the performance of these models. In the next section, we will show some common and widely used time series prediction models. By comparing the evaluation indexes with the model proposed in this paper, we can verify the superior performance of the model in cellular network traffic prediction.

Comparison Model.
In order to evaluate the proposed model more intuitively and to better understand its performance, we select three comparison models that have achieved good results in time series prediction and have been applied in practice: HA, ARIMA, and SVR. For these three models and the 2D convolutional neural network model proposed in this paper, we calculate the four evaluation indexes described in Section 5.1. A direct comparison of the data clearly shows the performance of the 2D-CNN model and the accuracy of its predictions.
The HA model adopts the strategy of the historical average: it predicts the value of the next week from the average value of the last week. This is a relatively simple strategy for predicting cellular network traffic. It is suitable for regular and uniform historical sequences. For cellular network traffic, which is irregular and affected by the external environment, the HA model performs poorly, as can be seen from Table 1.
Combining the autoregressive (AR) model, the moving average (MA) model, and the differencing method yields the ARIMA model, which is commonly used for time series prediction. The AR model extracts the relationship between the current value and the historical values and uses the historical data of the variable itself for prediction. The MA model focuses on the accumulation of error terms in the autoregressive model, and differencing is then used to integrate AR and MA into one model. ARIMA is a mature model that is widely used in time series prediction.
In order to obtain a better prediction effect in the field of regression, we use the idea of the support vector machine (SVM) to construct a support vector regression (SVR) model. According to the distribution of the training set, SVR defines an interval region around the regression function; it does not calculate a loss for samples that fall inside this region but calculates a loss for samples that fall outside it, so as to improve the prediction accuracy of the model.
Table 1 shows the evaluation index values of the 2D convolution model and the three comparison models (HA, ARIMA, and SVR). The 2D-CNN model proposed in this paper achieves better results than the other three models on all four evaluation metrics (RMSE, MAE, MAPE, and DirAcc).
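A historical-average baseline takes only a few lines. The sketch below assumes hourly data and a one-week window of 168 values, and it returns a single forecast value; the exact averaging scheme used in the experiments is not specified, so this is one plausible reading.

```python
def historical_average_forecast(history, window=7 * 24):
    """Predict the next value as the mean of the most recent `window` observations."""
    recent = history[-window:]
    return sum(recent) / len(recent)


# Two weeks of hourly traffic: a quiet week followed by a busy week.
history = [10.0] * 168 + [20.0] * 168
forecast = historical_average_forecast(history)
```

The forecast simply reproduces the level of the last week, which illustrates why the method struggles when traffic is irregular.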
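To make the interplay of differencing and autoregression concrete, the sketch below implements a deliberately reduced variant, ARIMA(1,1,0): first-order differencing followed by an AR(1) fit without intercept. It omits the MA component entirely and is not the configuration used in the experiments, which the text does not report; all function names are hypothetical.

```python
def difference(series):
    """First-order differencing: x_t - x_{t-1}."""
    return [b - a for a, b in zip(series, series[1:])]


def fit_ar1(series):
    """Least-squares estimate of phi in x_t = phi * x_{t-1} + e_t (no intercept)."""
    num = sum(prev * cur for prev, cur in zip(series, series[1:]))
    den = sum(prev * prev for prev in series[:-1])
    return num / den


def forecast_arima_110(series):
    """One-step forecast: fit AR(1) on the differences, then undo the differencing."""
    diffs = difference(series)
    phi = fit_ar1(diffs)
    return series[-1] + phi * diffs[-1]


series = [1.0, 2.0, 4.0, 8.0, 16.0]
next_value = forecast_arima_110(series)
```

In practice a library implementation with order selection would be used; the point here is only that the forecast is the last observation plus a predicted change.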
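The interval-based loss behind the SVR baseline is commonly called the ε-insensitive loss. The sketch below shows only this loss, not a full SVR solver, and the tube half-width `epsilon=0.5` is an illustrative value rather than one taken from the experiments.

```python
def epsilon_insensitive_loss(actual, predicted, epsilon=0.5):
    """Mean SVR-style loss: zero inside the epsilon tube, linear outside it."""
    return sum(
        max(0.0, abs(y - p) - epsilon) for y, p in zip(actual, predicted)
    ) / len(actual)


loss = epsilon_insensitive_loss([1.0, 2.0, 3.0], [1.25, 2.0, 4.5])
```

Only the third sample lies outside the tube, so it alone contributes to the loss; small errors inside the tube are ignored by design.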

Performance Analysis of 2D Convolutional Neural Network Model.
During the experiment, we randomly select a certain area in Milan, and the traffic prediction results for four different services (phone in, phone out, SMS in, and SMS out) are shown in Figure 6. We can observe that the prediction results accurately capture the dynamic trend of real traffic in the 2D images. Traditional time series models perform only moderately on complex and nonstationary traffic data and show a higher RMSE and lower accuracy for all four services.
In addition, the prediction results in Figure 6 show that the 2D model can also effectively predict sudden changes in the traffic trend. By taking the daily period, weekly period, and traffic as the features of the 2D data image and then extending the three features over the hourly period, the 2D convolutional neural network model proposed in this paper achieves better performance on all evaluation indexes for the four kinds of services.

Algorithm 1. Input: training set historical data $[X_{t-p}, X_{t-(p-1)}, \ldots, X_{t-1}]$. Output: trained 2D convolutional neural network model.

Conclusions
Compared with traditional UDN, the newly proposed UDN architecture based on SDN and FVLC can better adapt to the high requirements of the future networks. In this paper, a 2D-CNN deep learning model is proposed to predict the traffic of cells in SDUD-FVLC architecture, so that the network state can be perceived from a global perspective and used to solve the delay of the current control scheme. Compared with the traditional time series prediction model, 2D-CNN can capture the dynamic performance of traffic well and requires less data than 3D models, which reduces the load of the control plane. Also, the model presents a better prediction effect and higher performance, which provides a feasible solution to solve the delay problem of the SDUD-FVLC network control plane.

Data Availability
The data used to support the findings of this study are included within the article. The data presented in this study are also available at https://doi.org/10.1038/sdata.2015.55.

Conflicts of Interest
The authors declare that they have no conflicts of interest.

Authors' Contributions
SZ was responsible for investigation and methodology. ZW and SD contributed to formal analysis. QC and YD were responsible for software. SZ, QC, and LY validated the study. SZ and YD were responsible for resources. SZ and LY conceptualized the study and wrote the original draft. SZ and YY were responsible for visualization and data curation. SD and ZK reviewed and edited the manuscript. LY supervised the study and acquired funding. ZW and LY were responsible for project administration. All authors have read and agreed to the published version of the manuscript.