A Decision Tree-Based Online Traffic Classification Method for QoS Routing in Data Center Networks



Introduction
Network traffic type identification is a key issue in network performance improvement, and Internet traffic classification (ITC) has become an important research direction in the network field [1]. High-accuracy traffic classification plays an important role in intrusion detection [2], quality of service (QoS) guarantee [3], and traffic engineering [4]. In the data center, the data center network (DCN), which connects the computing resources and services, is growing in size and network throughput [5]. As the application types and service capabilities of data center networks increase, how to meet the QoS requirements of network flows has become a hot topic.
For unknown flows, network devices need to obtain their QoS requirements in order to guarantee the QoS of each flow. Because most flows in the network do not use the type of service (ToS) field in the IP header, network forwarding devices cannot quickly identify the type of a flow and provide the corresponding QoS guarantee. Real-time traffic classification enables network operators to respond quickly to flows of different network applications and to improve QoS guarantees effectively. Traditional QoS guarantee mechanisms, which rely on custom switches or routers with predefined mechanisms such as IntServ [6] or DiffServ [7] together with routing policies, are not flexible enough. With the development of software-defined networking (SDN), the analysis of flow QoS requirements and routing strategies has moved to the controller [8]. Although this improves QoS guarantees to a certain extent, SDN switches need to upload flow information to the controller for analysis, and the controller must then install the corresponding flow entries in the switch to route the flow appropriately. This not only increases the bandwidth consumption between the data plane and the control plane but also increases the processing delay of packets.
As network programmability extends to the data plane, it becomes possible to perform logical operations within switches, for example in-network computing, which is well supported by programmable switches [9]. At the same time, because programmable switches (such as P4 switches) can maintain the state of network flows [10], the flow features used for machine learning can be extracted in the switch. Based on these premises, this paper achieves online classification of network traffic and reroutes each flow in time according to the QoS requirements of its type. The main contributions of this paper are as follows:
(i) An online classification and identification algorithm for network flows is proposed based on the decision tree. The dataset is formed by extracting the related features of the first n packets of each network flow from the original offline network traffic traces. Using the dataset, the decision tree model is trained offline, and the model is then used to realize online real-time classification of network flows in the programmable switch.
(ii) We have optimized the feature values so that they can be extracted by programmable switches. At the same time, in order to reduce the complexity of the decision tree and improve the classification recall, a pruning strategy is used and a decision tree optimization and reconstruction method is proposed.
(iii) We use application types of network flows with different QoS requirements as classification objectives so that programmable switches can route flows appropriately. The network administrator can set the QoS requirement for flows of each network application type and then effectively guarantee the QoS of the identified flows based on the network status information.
The rest of this paper is organized as follows. Section 2 introduces the related work on traffic classification and the motivation for online traffic classification for QoS routing based on programmable switches. The overall framework of OCQR is described in Section 3. Section 4 analyses the construction and optimization of the offline decision tree model. Section 5 describes the online classification and QoS routing mechanism implemented in the programmable switch. Section 6 evaluates the performance of OCQR in an environment based on P4 software switches. Finally, Section 7 summarizes this paper.

Related Work and Motivation
2.1. Traffic Classification. The ITC method enables network operators to classify network traffic and manage it according to the network state. In recent years, a variety of ITC technologies have been proposed, such as port-based classification, payload-based classification, and machine learning-based classification [11].
Port-based traffic classification uses TCP/UDP port numbers to distinguish traffic types. For example, network traffic using the well-known port number 80 is often considered web traffic. However, with the explosive development of network applications, a large number of network flows use random port numbers. At the same time, the adoption of newer network technologies (such as NAT) has made port-based classification increasingly inaccurate, and it has gradually been abandoned [12,13].
The payload-based classification method analyses the payload of network flows and reconstructs session state and application information from the content of each packet for application identification. However, because of its implementation complexity, this kind of deep packet inspection has a high overhead on traffic identification devices, and it has difficulty analysing encrypted traffic.
With the rapid development of machine learning (ML) technology, ML has been widely studied and applied in the field of traffic classification [14]. By extracting features such as the average packet length and packet time interval of network flows, a network traffic classifier is constructed using classical algorithms such as support vector machine (SVM) [15], hidden Markov model [16,17], naive Bayes [18], decision tree [19], K-NN [20], and random forest [21]. Although machine learning has shown good performance in the classification of offline network flow data, in the traditional distributed network environment, data collection and algorithm execution are difficult to carry out in traditional switches.
As the SDN controller has a global view and powerful computing ability, SDN-based ITC for QoS guarantee has attracted the attention of researchers. Sun et al. [22] proposed a network flow classification method that uses a variety of machine learning techniques to obtain QoS requirements by accurately classifying network flows and then plans the path of each flow to ensure QoS. Zheng et al. [8] used a deep neural network to classify traffic data according to QoS requirements and made routing decisions according to transmission delay and bandwidth demand. In addition, because different video flows have different requirements in terms of real-time behaviour, bandwidth, and packet loss rate, Tang et al. [23] proposed a fine-grained flow classification method based on the fractal characteristics of flows. This method shows superior performance in fine-grained classification of video traffic. The SDN-based ITC mechanism can adopt complex machine learning models to improve classification accuracy and thereby provide an effective guarantee for the QoS of the corresponding flows. However, during flow feature extraction, the switch needs to upload packets to the controller, which increases the bandwidth consumption between the data plane and the control plane. At the same time, after flows are classified by the model, the corresponding flow entries need to be installed in the switch to make QoS routing decisions, which increases packet processing delay.

Motivation and Challenges.
In order to reduce the network bandwidth consumed by flow feature extraction and to decrease packet processing delay, this paper considers using the programmable data plane to carry out online classification of network traffic. By deploying the classification model directly in the programmable switch, network flows are classified and identified without affecting line-rate packet forwarding, and they can be rerouted quickly to meet their QoS requirements.
Although programmable switches have arithmetic logic units (ALUs) that make them computationally capable, floating point, exponential, and division operations that require multiple clock cycles are not supported, in order to ensure line-rate forwarding of packets. Therefore, it is not practical to deploy existing complex machine learning models directly on the programmable switch.
The decision tree, a rule-based supervised machine learning method, is one of the most popular classification techniques in machine learning and data mining. A decision tree model is represented by a sequence of branching statements, which are evaluated from the root node of the tree until a leaf node carrying the predicted classification label is reached. It is simple, interpretable, and efficient, so it is widely used in the classification of network packets [24]. Because programmable switches have state maintenance capability and support value comparison in the packet processing pipeline, the decision tree algorithm is a natural choice for real-time classification of network flows in programmable switches.
To avoid the complex operations required to train the decision tree model, the model is trained offline and deployed online to classify network flows in real time. After identifying the flow type, the programmable switch promptly carries out the corresponding QoS routing for the flow according to the real-time link state of the network. However, the computing and storage resources of current programmable switches are limited [10]. The following challenges need to be solved in actual deployment.

Extraction of Flow Features.
Because complex operations cannot be performed in the packet processing pipeline of the programmable switch, it is necessary to consider whether the programmable switch can extract the relevant features when generating the offline training dataset. For example, the mean, standard deviation, and other feature values commonly used for classification cannot be computed directly, because they require division.

Efficiency of Real-Time Classification.
The online classification of network traffic flows inevitably adds processing logic to the programmable switch and thus increases the forwarding delay of packets. Therefore, the complexity of the decision tree model needs to be reduced while maintaining a certain classification accuracy.

System Overview
The construction of the traffic QoS routing system based on online decision tree classification (OCQR system) is mainly divided into three steps: offline training of the decision tree model, programmable switch (P4 switch) configuration, and online flow classification and QoS routing. The overall system framework is shown in Figure 1.

Decision Tree Model Offline Training.
The flow features extracted from the first n packets of each flow in raw traffic trace files are used as the dataset to train the initial decision tree model. Since the initial decision tree model has high complexity and cannot run effectively in programmable switches with limited computing and memory resources, it must be optimized to obtain a decision tree model suitable for online classification. Therefore, in model training, we use a pruning strategy to reduce the size of the decision tree and design a leaf node modification strategy to improve the recall of application type classification.

P4 Switch Configuration.
Based on the original forwarding rules of the network, the network administrator embeds the optimized decision tree model into the P4 code logic and configures the switches that support online flow classification for QoS routing with the compiled P4 code.

Flow Online Classification and QoS Routing.
In a programmable switch that supports flow classification, information such as the size and arrival time of the first n packets of a flow is recorded, and the related features are extracted for classification. The first n packets are forwarded along the default path, whereas subsequent packets of a classified flow are rerouted to a path with a QoS guarantee. The "QoS requirement-port" mapping can be determined by obtaining the link parameters of the data center network through a method such as in-band network telemetry [25].

Flow Definition.
A network flow is usually defined by a five-tuple: source IP address, destination IP address, protocol type, source port, and destination port. Network flows can also be unidirectional or bidirectional. A unidirectional flow is strictly distinguished by the source and destination addresses defined in the five-tuple. A bidirectional flow refers to the packet sequence in both directions of the same connection, which shares the same five-tuple without strictly distinguishing source from destination IP address or source from destination port [26]. To better classify network flows, this paper treats the two directions with the same five-tuple as one flow so as to make full use of bidirectional flow features.
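As a minimal sketch of this bidirectional definition, the following Python snippet builds one key shared by both directions of a connection by ordering the two (address, port) endpoints. The names `FiveTuple` and `bidirectional_key` are hypothetical and the addresses are made up; the paper does not prescribe this representation.

```python
from typing import NamedTuple, Tuple


class FiveTuple(NamedTuple):
    src_ip: str
    dst_ip: str
    protocol: int
    src_port: int
    dst_port: int


def bidirectional_key(pkt: FiveTuple) -> Tuple:
    """Return a key that is identical for both directions of a connection."""
    endpoint_a = (pkt.src_ip, pkt.src_port)
    endpoint_b = (pkt.dst_ip, pkt.dst_port)
    # Sorting the endpoints removes the source/destination distinction.
    low, high = sorted((endpoint_a, endpoint_b))
    return (pkt.protocol, low, high)


# Illustrative packets (made-up addresses and ports).
fwd = FiveTuple("10.0.0.1", "10.0.0.4", 6, 40000, 443)
rev = FiveTuple("10.0.0.4", "10.0.0.1", 6, 443, 40000)
other = FiveTuple("10.0.0.1", "10.0.0.4", 6, 40001, 443)
```

Here `fwd` and `rev` map to the same key, while `other` (a different source port) is treated as a separate flow.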

Evaluation Metrics.
To evaluate the classification results, we use the confusion matrix commonly used in ML [27]. The confusion matrix contains the four values shown in Table 1. For flows of application type i, a true positive (TP) is a flow whose actual application type is i and whose predicted type is also i. A false negative (FN) is a flow whose actual application type is i but whose predicted type is j. A false positive (FP) is a flow whose actual application type is j but whose predicted type is i. A true negative (TN) is a flow whose actual application type is j and whose predicted type is also j.
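The definitions above can be turned into per-type precision, recall, and F1 score with a few lines of Python. This is a sketch only: the matrix layout (`confusion[actual][predicted]`) and the numbers in the example are assumptions made for illustration, not results from the paper.

```python
def per_class_metrics(confusion, i):
    """Precision, recall and F1 for class i.

    confusion[a][p] is the number of flows whose actual class is a
    and whose predicted class is p (assumed layout).
    """
    k = len(confusion)
    tp = confusion[i][i]
    fn = sum(confusion[i][j] for j in range(k) if j != i)
    fp = sum(confusion[j][i] for j in range(k) if j != i)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Made-up two-class example: rows are actual classes, columns predictions.
confusion = [[8, 2],
             [1, 9]]
precision0, recall0, f1_0 = per_class_metrics(confusion, 0)
```

For class 0 in this example, TP = 8, FN = 2 and FP = 1, so precision is 8/9 and recall is 8/10.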

Flow Feature Selection.
To quickly classify the type of a network flow for timely QoS routing, we extract the relevant features of the first n packets of the flow. Common features used in traffic classification include time series, header, and statistical features. To classify flows of different network application types while taking into account the state maintenance capabilities of programmable switches, we selected the time series, statistical, and bidirectional flow features shown in Table 2. These features have achieved good results in previous machine learning-based traffic classification [28]. In addition, many studies have shown that features of the first 20 packets of a flow yield good classification accuracy [29]. Note that the larger n is, the later the switch completes flow classification and the weaker the QoS guarantee for the flow becomes. The standard formula for the standard deviation of packet length is shown in formula (6), where $len_i$ represents the length of the i-th packet and $mean_{len}$ represents the average length of the first n packets.
Because the programmable switch does not support operations such as floating point and division, we apply squaring, multiplication, and other operations to both sides of formula (6) and convert it into formula (7). The right side of formula (7) then contains only addition, subtraction, and multiplication operations, which can be performed in a programmable switch. The standard deviation of the packet time interval is calculated in the same way.
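One possible reading of this transformation, assuming formula (6) is the population standard deviation (dividing by n), is the identity n^2 * std^2 = n * sum(len_i^2) - (sum(len_i))^2, whose right side needs only integer addition, subtraction, and multiplication. The sketch below checks that identity in Python; the function name `scaled_variance` and the packet lengths are illustrative, and the exact form of formula (7) in the paper may differ by a constant factor.

```python
import math
import statistics


def scaled_variance(lengths):
    """Return n^2 * variance using only integer +, - and *.

    Keeps two running accumulators (sum and sum of squares), which is
    the kind of per-flow state a switch register could hold.
    """
    n = 0
    total = 0
    total_sq = 0
    for x in lengths:
        n += 1
        total += x
        total_sq += x * x
    return n * total_sq - total * total


# Made-up packet lengths (bytes) for the first n = 5 packets of a flow.
lengths = [60, 1500, 52, 1500, 60]
scaled = scaled_variance(lengths)
reference = (len(lengths) * statistics.pstdev(lengths)) ** 2
matches = math.isclose(scaled, reference, rel_tol=1e-9)
```

Because n is fixed for all flows, the scaled value orders flows in the same way as the true standard deviation, which is what a threshold comparison in a decision tree needs.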

Proposition 1.
The classification result of the decision tree does not change when a square, multiplication (by a positive number), addition, or subtraction operation is applied to a continuous positive feature.
Proof. Assume that feature A takes the n values $a_1, a_2, \ldots, a_n$ with $0 < a_1 < a_2 < \cdots < a_n$, and that an operation $f(a)$ is applied to the values of feature A. When $f(a) = k \times a^2 + m$ $(k > 0)$, we have $f(a_1) < f(a_2) < \cdots < f(a_n)$, because $0 < a_1 < a_2 < \cdots < a_n$ and $f(a)$ increases monotonically on the interval $[0, +\infty)$. Suppose that the decision tree T uses feature A at a splitting node, that is, a value $a'$ of feature A is used as the splitting point and the dataset D is divided into two parts: $D_1$ $(a \le a')$ and $D_2$ $(a > a')$. After $f(a)$ is applied to all values of feature A, taking $f(a')$ as the splitting point still divides the dataset D into $D_1$ $(f(a) \le f(a'))$ and $D_2$ $(f(a) > f(a'))$. Therefore, the classification result of decision tree T is not affected.
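The argument of the proof can be checked numerically. The sketch below, with made-up positive feature values and arbitrary constants k = 3 and m = 7, splits a dataset at a threshold before and after applying f(a) = k * a^2 + m and compares the two partitions.

```python
def split(values, threshold):
    """Partition sample indices into (value <= threshold, value > threshold)."""
    left = [i for i, v in enumerate(values) if v <= threshold]
    right = [i for i, v in enumerate(values) if v > threshold]
    return left, right


def f(a, k=3, m=7):
    """Monotonically increasing on positive inputs when k > 0."""
    return k * a * a + m


# Made-up positive feature values and splitting point.
values = [4, 9, 1, 12, 6]
threshold = 6

original = split(values, threshold)
transformed = split([f(v) for v in values], f(threshold))
```

Both calls put samples 0, 2, and 4 on the left and samples 1 and 3 on the right, so a tree trained on the transformed feature can reproduce the same partition by transforming its threshold.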

Classification Objectives.
Traditional network traffic classification mostly takes specific applications as the classification objectives, as in [30,31]. However, too many classification objectives require a more complex decision tree model to classify effectively, which makes online traffic classification less effective. In computer networks, the QoS requirement can be characterized by bandwidth, end-to-end delay, packet loss, and jitter [32]. At the same time, network flows of the same application type have similar QoS requirements. According to the existing QoS frameworks (IETF, NGN, ITU-T, etc.), the QoS requirements of different flow types are obtained through analysis [33], as shown in Table 3. We use these five application types as classification objectives to analyse network traffic.

Dataset Generation.
Because we use the statistical information of the first n packets of a flow as classification features to achieve real-time classification, existing traffic classification datasets such as Moore [34], which derive features from the whole flow, are not suitable. Other datasets, such as Mirage [28], do not provide the packet traces of the original traffic, so a valid evaluation of online classification is impossible. Considering these limitations, we use the original traffic traces (pcap files) of some applications in ISCX's VPN-nonVPN dataset [35,36] and Tor-nonTor dataset [37,38] for experimental analysis. The applications corresponding to each application type are shown in Table 4. The experimental dataset is formed by extracting the relevant information of the first n packets of each original network flow and calculating the features described in Section 4.2.1.

Algorithm Selection.
During decision tree model training, different algorithms for selecting splitting features, such as ID3, C4.5, and CART, produce different decision trees, as shown in Table 5.
(1) ID3 Algorithm. The splitting feature is selected by information gain. The feature with the largest information gain is chosen as the splitting node, and the selection proceeds greedily from top to bottom to form the decision tree. Information gain is the difference in the entropy of the data before and after feature A is used to classify it, that is,
\[ \mathrm{Gain}(D, A) = H(D) - H(D \mid A). \tag{8} \]
In formula (8), $H(D)$ denotes the information entropy of dataset D before feature A is selected for classification, and $H(D \mid A)$ denotes the information entropy of dataset D after feature A is selected for classification. When the information gain of feature A is greater than that of all other features, feature A is selected as the splitting node.
(2) C4.5 Algorithm. The splitting feature is selected by information gain ratio. The information gain ratio $\mathrm{Gain}_R(D, A)$ is the ratio of the information gain $\mathrm{Gain}(D, A)$ of dataset D on feature A to the entropy $H_A(D)$ of dataset D with respect to the values of feature A, that is,
\[ \mathrm{Gain}_R(D, A) = \frac{\mathrm{Gain}(D, A)}{H_A(D)}, \tag{9} \]
where
\[ H_A(D) = -\sum_{i=1}^{n} \frac{|D_i|}{|D|} \log_2 \frac{|D_i|}{|D|}. \tag{10} \]
In formula (10), n is the number of values of feature A, and $D_i$ is the subset of D in which feature A takes its i-th value. C4.5 first selects, from all candidate splitting features, those whose information gain is higher than the average and then selects the one with the highest gain ratio as the splitting node.
(3) CART Algorithm. The splitting feature is selected by the Gini coefficient, and the optimal binary splitting point of the feature is determined at the same time. The Gini coefficient is calculated as follows:
\[ \mathrm{Gini}(p) = \sum_{k=1}^{K} p_k (1 - p_k) = 1 - \sum_{k=1}^{K} p_k^2, \tag{11} \]
where K represents the number of categories and $p_k$ represents the probability of the k-th category. For dataset D, when a value of feature A is used to divide the dataset into $D_1$ and $D_2$, the Gini coefficient under the condition of feature A is
\[ \mathrm{Gini}(D, A) = \frac{|D_1|}{|D|} \mathrm{Gini}(D_1) + \frac{|D_2|}{|D|} \mathrm{Gini}(D_2). \tag{12} \]
The feature with the largest gain in Gini coefficient is selected as the node attribute of the decision tree.
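The three splitting criteria can be compared on a toy example. The sketch below implements the textbook definitions of entropy, information gain, gain ratio, and Gini coefficient in plain Python; the function names and the labelled samples are illustrative and are not taken from the paper's dataset.

```python
from collections import Counter
from math import log2


def entropy(labels):
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def gini(labels):
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def weighted(measure, parts):
    """Size-weighted average of an impurity measure over the parts of a split."""
    total = sum(len(p) for p in parts)
    return sum(len(p) / total * measure(p) for p in parts if p)


def information_gain(labels, parts):          # ID3 criterion
    return entropy(labels) - weighted(entropy, parts)


def gain_ratio(labels, parts):                # C4.5 criterion
    total = len(labels)
    split_info = -sum(len(p) / total * log2(len(p) / total)
                      for p in parts if p)
    return information_gain(labels, parts) / split_info if split_info else 0.0


def gini_gain(labels, parts):                 # CART criterion
    return gini(labels) - weighted(gini, parts)


# Toy dataset: a binary split that separates the two types perfectly.
labels = ["Voice"] * 4 + ["Text"] * 4
parts = [["Voice"] * 4, ["Text"] * 4]
```

For this perfect split the information gain and gain ratio are both 1 bit, and the Gini coefficient drops from 0.5 to 0.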
The ID3 algorithm cannot handle continuous features and is therefore not applicable to flow classification, which relies on continuous features. The decision tree model generated by the C4.5 algorithm is a multi-branch tree, that is, a parent node can have multiple child nodes, and its execution efficiency is much lower than that of a binary tree model. In addition, although the ID3 and C4.5 algorithms can mine as much information as possible from the training sample set, the decision tree models they generate have relatively many branches and a relatively large scale, whereas the binary splitting adopted by the CART algorithm reduces the scale of the decision tree. Therefore, considering online flow classification in programmable switches with limited computing and storage resources, the CART algorithm was selected as the decision tree algorithm of OCQR.
To verify the classification effect of the features of the first n packets in the CART-based decision tree model, 75% of the dataset is used as the training set to train the decision tree model, and the remaining 25% is used as the validation set to evaluate the trained model. The results are shown in Table 6. In Table 6, the tree scale is the number of all nodes in the tree, the maximum depth is the maximum length from the root to a leaf of the tree, and the average depth is the average length from the root node to all leaf nodes of the tree.
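The three size measures used in Table 6 can be computed from any binary tree representation. The sketch below assumes a hypothetical encoding in which a leaf is a string label and an internal node is a tuple `(feature, threshold, left, right)`, and it counts depth in edges from the root; the feature names and thresholds are made up.

```python
def tree_stats(node, depth=0):
    """Return (number_of_nodes, list_of_leaf_depths) for a binary tree."""
    if isinstance(node, str):                 # leaf: a class label
        return 1, [depth]
    _feature, _threshold, left, right = node  # internal splitting node
    left_nodes, left_depths = tree_stats(left, depth + 1)
    right_nodes, right_depths = tree_stats(right, depth + 1)
    return 1 + left_nodes + right_nodes, left_depths + right_depths


# Hypothetical tree; names and thresholds are for illustration only.
tree = ("total_len", 2000,
        "Voice",
        ("max_len", 1400,
         "Text",
         ("fwd_pkts", 6, "Rtvideo", "Mstreaming")))

scale, leaf_depths = tree_stats(tree)
max_depth = max(leaf_depths)
avg_depth = sum(leaf_depths) / len(leaf_depths)
```

This example tree has a scale of 7 nodes, a maximum depth of 3, and an average leaf depth of 2.25.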
As can be seen from Table 6, except for the poor classification of multimedia streams, the classification precision, recall, and F1 score of the other four application types exceed 88%, and the classification accuracy reaches 96%. In terms of tree scale, at the same accuracy, the tree is smallest when n = 15, indicating that the difference between the application flow types is most significant when n = 15. At the same time, the classification index values of nearly 100% for Rtvideo and Text indicate that the decision tree model may be overfitted.

Model Pruning.
As a method of reducing overfitting in decision trees, pruning not only improves the classification effect but also reduces the complexity of decision trees by actively removing some branches [39]. Pruning consists of prepruning and postpruning [40]. Prepruning restricts the size of a decision tree by imposing conditions on its growth while the model is being built. Postpruning improves the classification accuracy of a decision tree by pruning some subtrees after the tree has been completely built.
Because the computing and storage resources of the programmable switch are limited and line-rate forwarding must be maintained, the processing logic for network flows should be as simple as possible. To reduce the impact of the decision tree algorithm running in programmable switches on packet processing delay, it is necessary to reduce the size of the decision tree as much as possible.
Table 5 (entries recovered from the extracted text): algorithm, tree structure, splitting feature selection.
ID3: multi-branch tree, information gain.
C4.5: multi-branch tree, information gain ratio.
CART: binary tree, Gini coefficient.
(1) Prepruning. Since the model trained with the CART algorithm is a binary tree, the tree depth is the maximum number of comparisons needed to classify a sample. When a programmable switch implements classification, it can directly use if-else statements at each splitting node and obtain the classification type at the leaf node. Therefore, using the prepruning strategy to limit the depth of the tree effectively bounds the maximum time needed to run the decision tree in the programmable switch. When training the decision tree model for OCQR, the prepruning parameters are as follows: the maximum tree depth is 10, the minimum number of samples at each leaf node is 3, and the minimum number of samples at each split node is 4.
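To illustrate how a binary tree maps onto if-else statements, the sketch below writes a depth-2 tree as nested comparisons in Python. The feature names and thresholds are entirely hypothetical (they are not the trained OCQR model); the point is only that the number of comparisons on any path is bounded by the tree depth and that integer comparisons are sufficient.

```python
def classify(total_len: int, max_interval_us: int, fwd_pkts: int) -> str:
    """Hypothetical depth-2 decision tree expressed as if-else statements."""
    if total_len <= 2000:                # comparison 1 (root)
        if max_interval_us <= 25000:     # comparison 2
            return "Voice"
        return "Text"
    if fwd_pkts <= 6:                    # comparison 2
        return "Rtvideo"
    return "Mstreaming"


# Two made-up feature vectors.
small_fast_flow = classify(total_len=1500, max_interval_us=20000, fwd_pkts=5)
large_flow = classify(total_len=9000, max_interval_us=100, fwd_pkts=8)
```

With a maximum depth of 10, as in the prepruning parameters above, no flow would need more than 10 such comparisons.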
(2) Postpruning. Using postpruning to remove some subtrees not only alleviates overfitting but also reduces the complexity and the average depth of the tree. For the programmable switch that executes the decision tree model, reducing the average depth of the tree effectively reduces the average time and computing overhead of flow classification.
Cost complexity pruning (CCP) [41] is used as the postpruning algorithm for OCQR to further prune the decision tree model after the prepruning strategy has been applied. By calculating the cost complexity pruning path of the decision tree, the decision tree model after each pruning step in the path can be obtained. Our analysis shows that when the decision tree size is greater than 70, the accuracy of the classification model can exceed 90%. Therefore, when using the postpruning strategy, we choose a decision tree model whose size is as small as possible but not less than 70.
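The selection rule in the last sentence can be sketched as a small helper that scans a pruning path and returns the smallest tree whose size is still at least 70. The `(alpha, tree_size)` pairs below are invented for illustration and are not the pruning path measured in the paper; `pick_pruned_tree` is a hypothetical name.

```python
def pick_pruned_tree(path, min_size=70):
    """Pick the smallest tree on a pruning path with size >= min_size.

    path: iterable of (ccp_alpha, tree_size) pairs, one per pruning step.
    """
    candidates = [step for step in path if step[1] >= min_size]
    if not candidates:
        raise ValueError("no tree on the path satisfies the size bound")
    return min(candidates, key=lambda step: step[1])


# Made-up pruning path: larger alpha prunes more and yields smaller trees.
path = [(0.0, 1201), (0.0004, 311), (0.0011, 95), (0.0016, 73), (0.0030, 41)]
chosen_alpha, chosen_size = pick_pruned_tree(path)
```

In this made-up path the rule selects the tree with 73 nodes and rejects the 41-node tree as too small.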
We again use 75% of the dataset as the training set, train the decision tree model with the prepruning strategy, and then optimize the model with the postpruning strategy. The remaining 25% of the dataset is used as the validation set to validate the pruned model, and the results are shown in Table 7.
Comparing these results with those in Table 6, it can be seen that the decision tree has more than one thousand nodes before pruning and fewer than 80 nodes after pruning. Although the decision tree model shows some underfitting after such heavy pruning, the decline in each of its classification evaluation metrics is relatively small.

Leaf Node Modification.
In the QoS routing of network flows, misclassifying a flow of one network application type as another type may have a serious impact. For example, if Mstreaming-type flows, which have a high bandwidth demand, are misclassified as Voice-type flows, they will be rerouted to a link with low latency but insufficient bandwidth. This not only prevents the flow from obtaining its QoS guarantee but also affects the performance of the link, and even of the whole network, after rerouting. It is therefore especially important to improve the classification recall so that flows of a given application type are classified as accurately as possible. However, on the one hand, packet loss, delay jitter, and packet fragmentation in the network cause the extracted features to deviate from the actual features of the flow; on the other hand, as shown in Table 7, the decision tree model obtained by offline training shows some underfitting because of the training dataset and pruning.
To this end, we optimized and reconstructed the trained decision tree model. As shown in Figure 2, the Gini coefficient of each leaf node's classification is evaluated (see formula (11)); when the Gini coefficient exceeds 0.5, the leaf's type is set to general flow. Such flows are forwarded along default paths to ensure basic QoS requirements and to stabilize network performance. Let Nor (Normal) denote the number of flows of application type i that are classified as general flow in the modified decision tree model. The classification recall of application type i is then redefined so that these flows are taken into account.
We modified the three decision tree models in Table 7 and evaluated the modified models with the same proportion of validation data. Figure 3 shows the traffic classification recall of the five network application types for different values of n with the model before pruning, after pruning, and after modification, respectively. As can be seen from Figure 3, pruning alone decreases the classification recall of the different network application types to a certain extent, because the scale of the decision tree has been reduced to about 1/20 of the original. However, after modification of the pruned decision tree model, the classification recall of the different network application types improves greatly. This ensures that even if some flows are not classified accurately, the network can still use the default forwarding path and guarantee basic QoS for them. Thus, all decision tree models are modified in subsequent experiments.
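The leaf relabelling rule can be sketched as follows, assuming each leaf stores the number of training flows of each type that reached it. The function names, the label "General", and the counts are illustrative assumptions, not the paper's implementation.

```python
def leaf_gini(class_counts):
    """Gini coefficient of a leaf from its per-class training sample counts."""
    total = sum(class_counts.values())
    return 1.0 - sum((c / total) ** 2 for c in class_counts.values())


def leaf_label(class_counts, threshold=0.5, general="General"):
    """Relabel impure leaves as general flows; keep the majority class otherwise."""
    if leaf_gini(class_counts) > threshold:
        return general
    return max(class_counts, key=class_counts.get)


# Made-up leaves: one nearly pure, one mixing three types evenly.
pure_leaf = {"Voice": 9, "Text": 1}
mixed_leaf = {"Voice": 3, "Text": 3, "Rtvideo": 3}
pure_label = leaf_label(pure_leaf)
mixed_label = leaf_label(mixed_leaf)
```

The first leaf has a Gini coefficient of 0.18 and keeps the label Voice, while the second has a coefficient of about 0.67 and becomes a general flow. Since a leaf containing only two types has a Gini coefficient of at most 0.5, a threshold of 0.5 only relabels leaves that mix at least three types.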

Online Flow Classification Implementation
The trained and modified decision tree model is embedded into the code logic of the programmable switch to realize online classification and appropriate QoS routing of network flows. The packet processing logic for a flow is shown in Figure 4. The first n packets of a network flow are indexed by the smaller of the forward and reverse five-tuple hash values (min_hash), and the packet size, arrival time, forward and reverse packet sizes, and other information are recorded. After the relevant information of the n packets has been extracted, the feature values used in decision tree classification are calculated, and the flow is then classified according to the decision tree model. After the application type of the flow has been identified, min_hash is used as an index to record the type in the switch, so that subsequent packets of the flow can be recognized directly from their min_hash value and rerouted to a path that meets the QoS requirements (see Algorithm 1).
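The processing logic described above can be simulated in Python as a rough sketch. It is not Algorithm 1 itself: the use of CRC32 as the hash, the `FlowTable` class, the toy classifier, and the value n = 4 are all assumptions made for illustration, and only packet sizes are recorded to keep the example short.

```python
import zlib


def tuple_hash(src_ip, dst_ip, proto, src_port, dst_port):
    """Hash one direction of a five-tuple (CRC32 is an assumed choice)."""
    data = f"{src_ip}|{dst_ip}|{proto}|{src_port}|{dst_port}".encode()
    return zlib.crc32(data)


def min_hash(src_ip, dst_ip, proto, src_port, dst_port):
    """Smaller of the forward and reverse hashes, shared by both directions."""
    forward = tuple_hash(src_ip, dst_ip, proto, src_port, dst_port)
    reverse = tuple_hash(dst_ip, src_ip, proto, dst_port, src_port)
    return min(forward, reverse)


class FlowTable:
    def __init__(self, n, classifier):
        self.n = n                    # number of packets used for features
        self.classifier = classifier  # maps recorded sizes to a flow type
        self.sizes = {}               # min_hash -> sizes of first packets
        self.flow_type = {}           # min_hash -> identified type

    def on_packet(self, five_tuple, size):
        """Return the flow type for rerouting, or None for the default path."""
        key = min_hash(*five_tuple)
        if key in self.flow_type:
            return self.flow_type[key]
        sizes = self.sizes.setdefault(key, [])
        sizes.append(size)
        if len(sizes) == self.n:      # n-th packet: classify and record type
            self.flow_type[key] = self.classifier(sizes)
            del self.sizes[key]
        return None                   # first n packets use the default path


def toy_classifier(sizes):
    # Stand-in for the decision tree; the threshold is made up.
    return "Mstreaming" if sum(sizes) > 3000 else "Text"


table = FlowTable(n=4, classifier=toy_classifier)
fwd = ("10.0.0.1", "10.0.0.4", 6, 40000, 443)
rev = ("10.0.0.4", "10.0.0.1", 6, 443, 40000)
first_four = [table.on_packet(t, s)
              for t, s in [(fwd, 1500), (rev, 60), (fwd, 1500), (rev, 60)]]
fifth = table.on_packet(fwd, 1500)
```

Packets from both directions update the same entry, the first four packets take the default path, and the fifth packet is recognized directly from the recorded type.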

Experimental Environment Settings.
In order to evaluate the effect of OCQR, we use the Mininet network simulation tool [42] and the bmv2-based P4 software switch [43] to build a simulation environment on a Dell P580 workstation (Intel Core i9-10900x processor, 32 GB memory, and Ubuntu 18.04 OS). Mininet and bmv2, which are supported by the P4 Language Consortium, can reproduce the real functions of a P4 environment, so they are widely used in P4 verification and evaluation [44].
Considering the performance limitations of the software switch and the QoS requirements of the different application flows shown in Table 3, we construct our experimental topology by extracting all shortest paths between a pair of edge switches from a 6-ary fat-tree topology [45], which is commonly used in data center networks, as shown in Figure 5.
The parameter configuration of all six shortest paths from S11 to S12 in Figure 5 is shown in Table 8. The default forwarding rule of each switch is to select the shortest path according to the destination address. For example, the default forwarding path from host h1 to host h4 is S11-S7-S1-S9-S12.
The edge switches S11 and S12 perform classification, marking, and rerouting of flows, and the intermediate switches reroute each flow according to the differentiated services code point (DSCP) that the edge switches mark in the TOS field of the packet's IP header. Take a packet of a Text-type flow as an example. For a flow sent from h1 to h4, if switch S11 recognizes that the packet belongs to a Text-type flow, it first modifies the TOS field (DSCP = 4) and then forwards the packet to port 2. After receiving the packet and recognizing from the DSCP value that it belongs to a Text-type flow, switch S8 directly forwards it from port 3, as shown by the blue path in Figure 5.
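The marking step can be illustrated with the bit layout of the TOS byte, in which the DSCP occupies the upper six bits and the lower two bits carry ECN. The sketch below is a Python illustration of that layout; the helper names and the `S8_PORT_BY_DSCP` table are hypothetical, and only the Text entry (DSCP = 4, port 3 at S8) comes from the example above.

```python
def set_dscp(tos: int, dscp: int) -> int:
    """Write a 6-bit DSCP into a TOS byte, preserving the two ECN bits."""
    if not 0 <= dscp < 64:
        raise ValueError("DSCP must fit in 6 bits")
    return (dscp << 2) | (tos & 0x03)


def get_dscp(tos: int) -> int:
    """Read the DSCP back from a TOS byte."""
    return (tos >> 2) & 0x3F


# Hypothetical match table of intermediate switch S8: DSCP -> egress port.
S8_PORT_BY_DSCP = {4: 3}  # Text-type flows leave through port 3

marked_tos = set_dscp(tos=0x00, dscp=4)          # marking at edge switch S11
egress_port = S8_PORT_BY_DSCP[get_dscp(marked_tos)]
```

With DSCP = 4 the TOS byte becomes 0x10, and the intermediate switch can select the egress port with a single table lookup on the DSCP value.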

Flow Generation.
To effectively evaluate the online classification effect of the decision tree model, we use the tcpreplay [46] network tool to replay the original packets of the different types of network traffic flows from the raw traffic traces that were used to extract features during offline model training.
Figure 5: Experimental topology.
Different flows in the original traffic trace have different source and destination IP address pairs. If IP address-based routing rules were used, not all flows could be forwarded. At the same time, the classification features include bidirectional flow information. If the original traffic trace were replayed bidirectionally without modification, link delay and jitter would cause a large difference between the bidirectional flow features collected at the edge switches (S11, S12) and the features extracted directly from the packets of the network flow. To this end, we process the network and the original traffic traces as follows:

(1) The shortest path algorithm is used to calculate the port p_j on the shortest path from switch S_i to the destination host, and the default port forwarding rule matches the MAC address (dst_mac_k) of the destination host h_k; this rule is written as "dst_mac_k : p_j."
(2) The path performance parameter setting depends on the links between switches S1-S6 and switches S9-S10.
In this case, after the original traffic traces are replayed by host h1, the packets received at port 3 of switch S11 yield the same time series and statistical features as those in the source pcap files.
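Step (1) can be illustrated with a breadth-first search from the destination host. The sketch below is an assumption-laden example rather than the authors' configuration: the link list is a hypothetical reconstruction that is consistent with the six S11-S12 shortest paths and with the default path S11-S7-S1-S9-S12, and because the port numbers of Figure 5 are not reproduced here, the next-hop neighbour stands in for the port p_j.

```python
from collections import deque

# Assumed link list for illustration only; it is not taken from Table 8 or Figure 5.
LINKS = [
    ("h1", "S11"), ("S11", "S7"), ("S11", "S8"),
    ("S7", "S1"), ("S7", "S2"), ("S7", "S3"),
    ("S8", "S4"), ("S8", "S5"), ("S8", "S6"),
    ("S1", "S9"), ("S2", "S9"), ("S3", "S9"),
    ("S4", "S10"), ("S5", "S10"), ("S6", "S10"),
    ("S9", "S12"), ("S10", "S12"), ("S12", "h4"),
]


def build_adjacency(links):
    """Turn the undirected link list into neighbour lists (insertion order kept)."""
    adj = {}
    for a, b in links:
        adj.setdefault(a, []).append(b)
        adj.setdefault(b, []).append(a)
    return adj


def default_next_hops(links, dst_host):
    """BFS outward from the destination: each node records the neighbour through
    which it was first reached, which lies on one of its shortest paths."""
    adj = build_adjacency(links)
    next_hop = {}
    visited = {dst_host}
    queue = deque([dst_host])
    while queue:
        node = queue.popleft()
        for neigh in adj[node]:
            if neigh not in visited:
                visited.add(neigh)
                next_hop[neigh] = node   # plays the role of "dst_mac_k : p_j"
                queue.append(neigh)
    return next_hop


def default_path(links, src, dst_host):
    """Follow the default rules hop by hop from src to the destination host."""
    hops = default_next_hops(links, dst_host)
    path = [src]
    while path[-1] != dst_host:
        path.append(hops[path[-1]])
    return path
```

On this assumed topology, `default_path(LINKS, "h1", "h4")` follows h1, S11, S7, S1, S9, S12, h4. Ties between equal-length paths are broken by the order of the link list, which is a choice made for this sketch only.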

Effect of Pruning.
To evaluate the effect of pruning on the network, the processing time of packets in the switch is analysed. Using the decision tree model constructed at n = 10, we calculate the average processing time of the first 20 packets of 1000 network flows of different application types (200 flows for each type) at switch S11, as shown in Figure 6. As can be seen from Figure 6, the processing delay of the 10th packet of a flow under OCQR with pruning is about 300 μs less than under OCQR without pruning. This is because when the 10th packet arrives at the switch, the number of if-else statements the switch must execute to obtain the flow type depends on the depth of the decision tree. The average depth of the decision tree model after pruning is smaller than that without pruning, which reduces the amount of computation that the switch needs to perform and thus reduces the processing delay. In addition, when the 10th packet arrives, the switch needs not only to extract the packet information but also to calculate the relevant feature values and identify the application type through the decision tree model. Because evaluating the decision tree accounts for only part of this work, the delay difference between packet processing with and without pruning is relatively small. At the same time, due to the poor performance of the P4 software switch, the packet processing delay is still about 1.4 ms even without the OCQR method.
Furthermore, in Figure 6, when using OCQR, the processing delay of the first 10 packets is higher than that of subsequently arriving packets, and the processing delay of the 10th packet is nearly three times higher. This is because when the first 10 packets arrive at the switch, the switch needs to maintain the relevant flow information, and additional operations such as feature extraction and decision tree classification are required when processing the 10th packet. After the flow is classified, subsequent packets of the flow only need to be matched to their application type based on the hash value of the five-tuple, so the processing delay is significantly shorter.
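The three processing phases described above (state maintenance for the first n packets, classification on the n-th packet, and a hash lookup afterwards) can be sketched in Python. This is only an illustrative model of the control flow, not the P4 implementation: the class name is hypothetical, packet sizes stand in for the real features, the `classify` callable stands in for the decision tree, and hash collisions are ignored.

```python
from typing import Callable, Dict, List, Optional, Tuple

# src IP, dst IP, src port, dst port, protocol
FiveTuple = Tuple[str, str, int, int, int]


class FlowClassifier:
    """Keeps per-flow state for the first n packets, classifies on the n-th
    packet, and answers later packets of the flow with a table lookup."""

    def __init__(self, n: int, classify: Callable[[List[int]], str]):
        self.n = n
        self.classify = classify                    # stand-in for the decision tree
        self.pending: Dict[int, List[int]] = {}     # flow key -> packet sizes so far
        self.flow_type: Dict[int, str] = {}         # flow key -> decided type

    def on_packet(self, five_tuple: FiveTuple, size: int) -> Optional[str]:
        key = hash(five_tuple)
        if key in self.flow_type:                   # already classified: lookup only
            return self.flow_type[key]
        sizes = self.pending.setdefault(key, [])
        sizes.append(size)                          # packets 1..n: maintain state
        if len(sizes) < self.n:
            return None                             # type not known yet
        label = self.classify(sizes)                # n-th packet: features + tree
        self.flow_type[key] = label
        del self.pending[key]                       # per-packet state no longer needed
        return label
```

The n-th packet is the only one that triggers the `classify` call, which is consistent with the delay peak at the 10th packet, while every later packet takes the short lookup branch.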

Online Classification QoS Routing.
In order to evaluate the QoS routing effect of OCQR, we choose the final pruned decision tree model built on the features of the first 10 packets and use it to classify and reroute the network traffic in the programmable switch. The experiment replays some of the original traffic flows of different application types and monitors the actual classification and QoS routing result for each application type by counting the number of flows that arrive at switches S2-S6. The results are shown in Figure 8. As can be seen from Figure 8, the classification and QoS routing results for Rtvideo and Text flows are the best, and those for Mstreaming flows are the worst. At the same time, by comparing the QoS scheduling results for each application type in Figure 8 with the recall of the decision tree model in Figure 3, it can be found that when n = 10, the classification performance of the model on the offline dataset is close to the classification performance of QoS routing for different application types after OCQR is actually deployed in the programmable switch. This indicates that OCQR can be deployed in a data center network based on programmable switches and run effectively, and that the QoS routing of different application flows is consistent with expectations.
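The comparison with the offline recall amounts to computing, for each application type, the fraction of replayed flows that were routed as that type. A minimal Python sketch is given below. It assumes that each replayed flow has been reduced to a pair (true type, observed type), where the observed type is inferred from the switch among S2-S6 at which the flow was counted. That switch-to-type mapping is not given in this section and is therefore left to the caller.

```python
from collections import Counter


def per_class_recall(records):
    """records: iterable of (true_type, observed_type) pairs, one per replayed flow.

    Returns a dict mapping each true type to the fraction of its flows that
    were observed (routed) as that same type.
    """
    total = Counter()
    correct = Counter()
    for true_type, observed_type in records:
        total[true_type] += 1
        if observed_type == true_type:
            correct[true_type] += 1
    return {t: correct[t] / total[t] for t in total}
```

The resulting per-type values can then be set side by side with the offline recall of the model at the same n.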

Comparison with SDN-Based Methods.
When the SDN architecture is used to implement online classification of network flows, complex classification models can be run in the SDN controller. However, the SDN-based method needs to upload the packets of the network flow to the controller, which not only increases the bandwidth occupation between the data plane and the control plane but also increases the processing delay of the packets. For this reason, we experimentally evaluated the bandwidth utilization between the data plane and the control plane for SDN-based classification methods and for OCQR, as shown in Figure 9. As Figure 9 shows, the SDN-based method uses more bandwidth as n increases, while OCQR uses none. This is because with SDN-based methods, the larger n is, the more packets need to be uploaded to the controller, whereas OCQR extracts features and classifies flows directly within the P4 switch and therefore uses no bandwidth between the two planes. Due to the poor performance of the P4 software switch used in the experiment, packets cannot be forwarded at line rate, and the delay between the data plane and the control plane cannot be quantified clearly. Therefore, the packet processing delay of the SDN-based method and of OCQR is not evaluated in our experiment.
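The dependence on n can be made concrete with a simple back-of-the-envelope model in Python. It assumes, for illustration only, that an SDN-based classifier uploads the first n packets of every flow in full and that all packets have the same average size. The function names and the example numbers are hypothetical and are not measurements from Figure 9.

```python
def sdn_upload_bytes(num_flows: int, n: int, avg_packet_bytes: float) -> float:
    """Bytes sent to the controller if the first n packets of each flow are uploaded."""
    return num_flows * n * avg_packet_bytes


def ocqr_upload_bytes(num_flows: int, n: int, avg_packet_bytes: float) -> float:
    """In-switch classification uploads no packets for classification."""
    return 0.0
```

Under this model the upload volume grows linearly with n: for example, doubling n from 10 to 20 doubles the result of `sdn_upload_bytes`, while `ocqr_upload_bytes` stays at zero for any n.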

Conclusion
In this paper, we propose OCQR, an online traffic classification method for QoS routing that runs in data center networks based on programmable switches. With the offline-trained decision tree model, OCQR classifies flows using the features extracted from the first n packets of different application flows and reroutes the flows according to the corresponding QoS requirements. Firstly, we transform some classification features so that the programmable switch can effectively extract the features required by the decision tree model with less impact on packet processing delay, and we optimize the decision tree model using the pruning strategy and the leaf node modification strategy. Secondly, to achieve the QoS guarantee of flows, we reroute flows according to the corresponding QoS requirements and the performance of each link in the data center network. Finally, OCQR is validated experimentally in a simulation environment built with the P4 software switch. The decision tree can only classify the traffic types that exist in the training dataset, which means that flows of general application types that are not adequately trained cannot be processed effectively. Therefore, to improve the recall of traffic classification in our future work, it is necessary to find a more appropriate classification method that identifies the flow type with a certain confidence (such as Gaussian Bayesian classification).

Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.