Integrated Packet Classification to Support Multiple Security Policies for Robust and Low Delay V2X Services

One of the key applications in the 5G system is Vehicle-to-Everything (V2X). Ultra-low delay communication is essential for the safety of users and pedestrians in V2X. However, as cyberattacks become more sophisticated and diverse, it becomes hard to satisfy low delay constraints. To protect networks from such attacks, even a single piece of network security equipment provides multiple security functions, resulting in an inevitable additive delay in packet processing. In this paper, we suggest a new packet classification paradigm to resolve this issue. The proposed algorithm integrates multiple policy rule-sets into a single rule-set and classifies incoming packets using the integrated rule-set. Thus, it has the unique feature of providing high classification performance regardless of the number of security policies. Through extensive performance evaluations, we confirm that the performance improvement grows as the total number of rule-sets increases, without significant memory overhead. We expect that it will mitigate the delay issue of existing network equipment for upcoming services such as V2X.


Introduction
Vehicle-to-Everything (V2X) service is one of the most promising applications in the 5G system. It frequently exchanges information between drivers, pedestrians, vehicles, and transportation infrastructure [1][2][3][4][5][6], and the information should be delivered with low delay and high reliability for the safety of the persons involved.
Modern cyberattacks have become more sophisticated and diverse, and as a result, the security functions installed in modern security equipment have also become more complex and varied. To protect networks from various cyberattacks, single multifunction network equipment has been introduced [7]. For example, unified threat management (UTM) supports multiple rule-sets using multiple policy tables as shown in Figure 1. Such integrated network equipment has advantages in security but is at a disadvantage with respect to the strict delay requirements of V2X. Each security policy is implemented by complicated packet classification that searches for the matching rule with the highest priority by comparing each field of every rule with the incoming packet header. As the integrated equipment must perform packet classification independently for each policy rule-set, the classification cost increases as the number of rule-sets increases [8][9][10][11][12][13][14]. Multiple classifications are a bottleneck of network performance, especially in terms of delay [15][16][17][18][19][20]. Therefore, high-performance and scalable packet classification is essential for supporting V2X.
In this paper, we propose a new packet classification algorithm with a feature that distinguishes it from its competitors. Although most existing classification algorithms suffer from deteriorated performance as the total number of rule-sets increases, the proposed algorithm achieves high classification performance regardless of the total number of rule-sets. It can therefore effectively support reliable and low delay V2X services. Figure 2 shows the overall architecture of software-defined networking (SDN) for V2X. Packet classification is a basic function of the OpenFlow controller and the SDN switch.
To increase the performance of packet classification, high-end SDN switches adopt expensive hardware-based solutions. However, the OpenFlow controller usually adopts software-based packet classification, since a hardware solution cannot achieve the flexibility needed to support the various security requirements of customers.
This algorithm targets OpenFlow controllers and software-based SDN switches to reduce the burden of packet classification. It can play a very important role in providing high performance and high security simultaneously. The remainder of this paper is organized as follows: Section 2 briefly presents related work, and the motivation of this research is explained in Section 3. In Section 4, the proposed algorithm is described in detail. The performance evaluation results are compared with those of competitors in Section 5. Finally, Section 6 concludes.

Related Work
Although many factors can be used to evaluate the performance of packet classification algorithms, classification speed and memory requirements are the most important. However, most algorithms cannot support high classification speed with a low memory requirement.
Packet classification is divided into hardware- and software-based approaches [21][22][23][24][25][26][27]. Hardware-based packet classification can achieve a very high classification speed that is impossible for the software-based one. Most modern network equipment adopts hardware packet accelerators to provide 100 Gbps performance with multiple rule-sets. However, the hardware must be redesigned to satisfy various user requirements such as adding a new field to the rule structure. Moreover, hardware-based solutions usually adopt expensive memory called ternary content-addressable memory (T-CAM) for classification. Since the supported rule-set size is determined by the size of the T-CAM, supporting large rule-sets is very costly. The strongest advantage of the software-based approach is flexibility. If the field structure must be changed, this can easily be supported by modifying software. Another merit of the software-based approach is cost. If a larger rule-set is needed, the user can increase the rule-set capacity of the network equipment simply by adding dynamic random-access memory (DRAM), which is much cheaper than T-CAM.
Well-known algorithms belonging to the software-based approach are exhaustive search, cross-producting-based classification, tuple space search, and decision tree-based algorithms [21]. We now briefly describe each software-based algorithm.
Exhaustive search linearly compares each rule with the keys, from highest to lowest priority, to find the matching rule. Because of this search procedure, the packet classification performance degrades as the rule-set size increases. However, it requires the smallest memory among all packet classification algorithms and supports very fast updates. Above all, it is easy to implement. As a result, it is suitable for a system with a small rule-set.
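The linear scan described above can be sketched as follows. This is a minimal illustration, not code from the paper: the `Rule` type, the range-based field encoding, and the function names are assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple

Range = Tuple[int, int]  # inclusive (low, high) bounds for one header field


@dataclass(frozen=True)
class Rule:
    """Hypothetical rule: one inclusive range per header field plus an action."""
    fields: Tuple[Range, ...]
    action: str


def matches(rule: Rule, packet: Sequence[int]) -> bool:
    """A packet matches when every field value lies inside the rule's range."""
    return all(lo <= value <= hi for (lo, hi), value in zip(rule.fields, packet))


def exhaustive_search(rules: Sequence[Rule], packet: Sequence[int]) -> Optional[Rule]:
    """Scan rules from highest to lowest priority and return the first match.

    The cost grows linearly with the number of rules, which is why this
    approach only suits small rule-sets.
    """
    for rule in rules:  # rules are assumed to be sorted by decreasing priority
        if matches(rule, packet):
            return rule
    return None
```

With the rules kept in priority order, the first hit is by construction the highest-priority match, so no further bookkeeping is needed.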
Cross-producting-based classification independently performs a search for each field and merges the intermediate results [28][29][30][31][32][33]. This procedure is repeated until the final matching rule is found. It is one of the fastest classification algorithms, but it requires a huge amount of memory and time to build a classification table. Since it cannot support incremental updates, it needs to rebuild the entire table whenever a rule-set is updated. Despite such critical weaknesses, its classification performance is almost similar to that of the hardware-based approach. Therefore, a lot of research is still going on to address these weaknesses.
Tuple space search probes each sub-rule-set, called a tuple, to find the matching rule [34][35][36][37]. A tuple is defined by a combination of the prefix lengths of the five fields, and the set of tuples is called the tuple space. Since each rule of a rule-set belongs to only one of the tuples, tuple space search scales well with the rule-set size. Although it achieves only moderate classification performance, it supports fast updates, i.e., inserting or deleting a rule. Therefore, it has been adopted in Open vSwitch [38]. However, the classification performance decreases in proportion to the number of tuples, so further research is required to improve the performance.
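As a rough illustration of tuple space search, the sketch below keeps one hash table per combination of prefix lengths and probes every table on lookup. It is simplified to two 8-bit prefix fields for readability; the class name, the field width, and the convention that a smaller priority number wins are assumptions of this example, not details from the paper or from Open vSwitch.

```python
from typing import Dict, Optional, Tuple

WIDTH = 8  # assumed field width in bits, kept small for readability


def mask(value: int, prefix_len: int) -> int:
    """Keep only the top prefix_len bits of a WIDTH-bit value."""
    return value >> (WIDTH - prefix_len)


class TupleSpace:
    """One exact-match hash table per (src_len, dst_len) prefix-length tuple."""

    def __init__(self) -> None:
        self.tuples: Dict[Tuple[int, int], Dict[Tuple[int, int], Tuple[int, str]]] = {}

    def insert(self, src: int, src_len: int, dst: int, dst_len: int,
               priority: int, action: str) -> None:
        # Each rule lands in exactly one tuple, so an update touches one table.
        table = self.tuples.setdefault((src_len, dst_len), {})
        table[(mask(src, src_len), mask(dst, dst_len))] = (priority, action)

    def classify(self, src: int, dst: int) -> Optional[Tuple[int, str]]:
        """Probe every tuple once; the cost is proportional to the tuple count."""
        best: Optional[Tuple[int, str]] = None
        for (src_len, dst_len), table in self.tuples.items():
            hit = table.get((mask(src, src_len), mask(dst, dst_len)))
            # A smaller priority number is treated as higher priority here.
            if hit is not None and (best is None or hit[0] < best[0]):
                best = hit
        return best
```

The loop in `classify` makes the trade-off visible: inserting or deleting a rule is a single hash-table operation, while every lookup pays one probe per tuple.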
Decision tree-based algorithms provide classification performance comparable with that of cross-producting-based classification but require much less memory. Thus, the decision tree-based algorithm is one of the most actively researched algorithms at present. A decision tree-based algorithm partitions a rule-set into multiple sub-rule-sets, and a search traverses the tree until it reaches a leaf node. When it reaches a leaf node, it searches for the matching rule with the highest priority among all rules stored in that leaf node. The overall classification performance is known to have logarithmic complexity in terms of the rule-set size. The partitioning criteria are controlled by two factors: the space factor, the maximum ratio of the total number of rules in all sub-rule-sets to the original rule-set size, and binth, the maximum number of rules allowed in a leaf node. Hence, the classification performance and the table size can be adjusted according to the requirements of applications.
A large space factor increases the number of partitions but decreases the height of the decision tree, resulting in fast classification. However, the total number of duplicated rules increases, which generates a large decision tree. On the other hand, a large binth reduces the number of partitions, so the tree size decreases, but the search cost in the leaf node increases, which lowers classification performance.
Well-known algorithms belonging to the decision tree-based approach are HiCuts and HyperCuts [39,40]. Although they provide high classification performance, they still suffer from a large decision tree due to significant rule duplication. Recently, EffiCuts was introduced to decrease rule duplication [41]. EffiCuts is based on HyperCuts but groups rules by their wildcard fields and generates a separate tree for each group. This approach significantly reduces rule duplication, so the total tree size is greatly decreased. However, using separate trees deteriorates the classification performance. As a mitigation, trees with similar wildcard characteristics can be merged to increase classification performance while the overall tree size remains almost the same. EffiCuts is also known to support fast updating [48,49].
We now describe the operation of EffiCuts in detail. First, EffiCuts splits the total rule-set into predefined categories according to how many wildcard fields each rule contains, where a wildcard field is a field on which the rule has a large matching range, typically at least 50% of the total range of the field. For a 5-tuple rule-set, we have four cases as follows:
(i) Category 1: rules with four wildcard fields
(ii) Category 2: rules with three wildcard fields
(iii) Category 3: rules with two wildcard fields
(iv) Category 4: rules with one or zero wildcard fields
For example, assume that the matching ranges of a rule for source IP, destination IP, source port, destination port, and protocol are ANY, ANY, 0 to 32768, 80, and 0 to 128. The rule has four wildcard fields (all except the destination port) and therefore belongs to Category 1. Since each category contains only similar rules, EffiCuts builds a decision tree for the sub-rule-set belonging to each category, which reduces replicated rules while building the decision tree. Although EffiCuts generates multiple decision trees, the total tree size is very small compared to the original HyperCuts. However, the number of decision trees affects the total classification performance. To reduce the number of trees, EffiCuts merges similar categories. This tree merging process increases the total tree size, but it can still avoid excessive replication of rules. By doing so, EffiCuts achieves high classification performance and a low memory requirement simultaneously.
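The category assignment described above can be sketched in a few lines. This is an illustrative sketch only: the field widths, the 50% threshold constant, and the function names are assumptions, and placing a rule with five wildcard fields into Category 1 is a choice made here because the text lists only four cases.

```python
from typing import Dict, List, Sequence, Tuple

Range = Tuple[int, int]  # inclusive (low, high) bounds for one field

# Assumed sizes of (src IP, dst IP, src port, dst port, protocol) value spaces.
FIELD_SPACE = (2**32, 2**32, 2**16, 2**16, 2**8)
WILDCARD_THRESHOLD = 0.5  # a field is "wildcard" if it covers >= 50% of its space


def wildcard_count(rule: Sequence[Range]) -> int:
    """Count the fields whose matching range covers at least half the field space."""
    count = 0
    for (lo, hi), space in zip(rule, FIELD_SPACE):
        if (hi - lo + 1) / space >= WILDCARD_THRESHOLD:
            count += 1
    return count


def category(rule: Sequence[Range]) -> int:
    """Map 4 wildcard fields to Category 1, 3 to 2, 2 to 3, and 1 or 0 to 4.

    Assumption: a rule with five wildcard fields is also put in Category 1.
    """
    return min(4, max(1, 5 - wildcard_count(rule)))


def group_by_category(rules: Sequence[Sequence[Range]]) -> Dict[int, List[Sequence[Range]]]:
    """Split a rule-set into per-category sub-rule-sets, one tree per group."""
    groups: Dict[int, List[Sequence[Range]]] = {}
    for rule in rules:
        groups.setdefault(category(rule), []).append(rule)
    return groups
```

Applied to the example rule from the text (ANY, ANY, 0 to 32768, 80, 0 to 128), `wildcard_count` finds four wildcard fields, so the rule falls into Category 1.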
Table 1 summarizes each feature of packet classification algorithms.

Motivation
As shown in Table 1, the software-based approach consumes much memory to achieve high classification performance. However, a memory requirement of high complexity results in low scalability in terms of rule-set size. Although decision tree-based algorithms have a memory requirement of high complexity, i.e., $O(N^D)$, where $N$ and $D$ denote the rule-set size and the number of dimensions, respectively, the latest decision tree algorithms show very low memory requirements.
To verify the memory requirement, we performed the following experiment. We synthesized multiple firewall rule-sets whose sizes range from 20K to 100K using ClassBench [50].
Then, we built the total decision tree and calculated the ratio of the tree size to the rule-set size for each rule-set, where the space factor and binth were configured to the best values. Figure 3 shows the experimental results obtained from EffiCuts. EffiCuts shows almost the same ratio regardless of the rule-set size, which means that EffiCuts achieves almost $O(N)$ for memory requirement. Thus, it can decrease the decision tree size by 100 times for 100,000 rules compared to HiCuts or HyperCuts [41].
Figure 4 shows the ratio of the average memory access number to the rule-set size for the same configuration.

Mobile Information Systems
We synthesized the packet data using each rule-set and searched the decision tree for every packet in the data. We counted the total number of memory accesses during the search process and obtained the ratio of this total to the total number of packets. As the rule-set size increases, the number of memory accesses for EffiCuts also increases in Figure 4. However, the ratio of the access number to the rule-set size decreases, as shown in Figure 4. From Figures 3 and 4, we can finally find two characteristics as follows:
Characteristic 1: $$M(R_1) + M(R_2) > M(R_1 + R_2),$$ where $M(R)$ is the average memory access number for rule-set $R$.
Characteristic 2: $$S(R_1) + S(R_2) \approx S(R_1 + R_2),$$ where $S(R)$ is the size of the decision tree for rule-set $R$.
For example, we can see that $M(T_{20K}) + M(T_{40K}) > M(T_{20K} + T_{40K}) \sim M(T_{60K})$ from Figure 4 and $S(T_{20K}) + S(T_{40K}) \approx S(T_{20K} + T_{40K}) \sim S(T_{60K})$ from Figure 3, respectively, where $T_n$ denotes the testing rule-set of size $n$ used in the experiment and K means 1,000.
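A short derivation may help connect the two measured trends to the two characteristics. It is a sketch under simplifying assumptions that are not stated in the paper: the memory access number and the tree size are treated as functions $m(n)$ and $s(n)$ of the rule-set size $n$ only, the ratio $m(n)/n$ is taken to be strictly decreasing (the trend reported for Figure 4), and the ratio $s(n)/n$ is taken to be a constant $\rho$ (the trend reported for Figure 3).

```latex
% Assumption: m(n)/n is strictly decreasing in n, and a, b > 0.
\begin{align*}
m(a+b) &= a\,\frac{m(a+b)}{a+b} + b\,\frac{m(a+b)}{a+b} \\
       &< a\,\frac{m(a)}{a} + b\,\frac{m(b)}{b} \;=\; m(a) + m(b).
\end{align*}
% Assumption: s(n) = \rho n with a constant ratio \rho.
\begin{align*}
s(a) + s(b) \;=\; \rho a + \rho b \;=\; \rho\,(a+b) \;=\; s(a+b).
\end{align*}
```

Under these assumptions, a decreasing access-to-size ratio yields the strict inequality of Characteristic 1, and a constant size ratio yields the approximate equality of Characteristic 2.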
Until now, existing research on packet classification has focused on classification with a single rule-set. However, network systems with multiple rule-sets have become popular, and a fast classification algorithm oriented toward a single rule-set is limited in the performance it can achieve for multiple rule-sets. Therefore, multiple rule-sets must be considered when designing high-performance classification algorithms. Characteristics 1 and 2 suggest a new guideline for developing packet classification algorithms. According to Characteristic 1, if a system has multiple rule-sets, it is advantageous to integrate them into one rule-set and construct a single decision tree to improve classification speed. Characteristic 2 also implies that the size of the decision tree for the integrated rule-set is not significantly larger than the sum of the sizes of the decision trees for the individual rule-sets.
We finally conclude that a packet classification algorithm based on integrated rule-sets has many advantages, and we suggest a new classification algorithm utilizing the features of integrated rule-sets.

Proposed Algorithm
The proposed algorithm performs packet classification using an integrated rule-set that combines all rule-sets in a system. First, we briefly present the features of the proposed algorithm and then describe the algorithm in detail. For simplicity of explanation, we assume that each rule consists of five tuples, but the algorithm can easily be extended to rules with more fields.

Minimized Classification Cost.
The proposed algorithm can complete the packet classification for all rule-sets with one search. Therefore, it minimizes the overhead caused by repetitive classification. Maintaining high packet classification performance regardless of the number of rule-sets is a very important feature of the proposed algorithm.

Early Packet Drop.
An integrated rule-set has the advantage of not only decreasing the classification overhead but also removing unnecessary classification. For example, Figure 5 shows the existing and proposed packet classifications. Assume that an incoming packet is allowed by rule-sets 0 to k − 1 but rejected by rule-set k. In this case, the packet classifications for rule-sets 0 to k − 1 are ultimately unnecessary, since the packet cannot be forwarded due to rule-set k. However, packet classification for each rule-set is performed in sequence, so the existing approach cannot avoid these unnecessary classifications.

Building Decision Tree.
The proposed algorithm builds a decision tree using EffiCuts after merging all rule-sets into one large rule-set. However, it needs two unique procedures, called "fast rule skipping" and "early drop marking," in each leaf node to improve the search performance.

Fast Rule Skipping.
The proposed algorithm requires an additional table, called the "rule-set starting index table," to store the index of the first rule of each rule-set. When we reach a leaf node while traversing the tree, we must find the matching rule for each rule-set. The original EffiCuts searches for matching rules linearly, which takes a long time. To increase the search performance, we skip the unvisited rules of rule-set k and go to the next rule-set k + 1 as soon as we find a matching rule in rule-set k. This is called "rule skipping." For example, if we find a matching rule $r_2$ for the ACL rule-set in the leaf node as shown in Figure 6, we do not need to check rules $r_3$ and $r_5$ anymore. In this case, we look up the starting index of the firewall rule-set, i.e., 4, and skip $r_3$ and $r_5$. Thus, we can directly start searching for the matching rule in the firewall rule-set.
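A leaf search with rule skipping might look like the sketch below. It is not the authors' implementation: the `LeafRule` type, the one-dimensional key, and the layout of the leaf (rules of each rule-set stored contiguously in priority order, with `start_index` playing the role of the rule-set starting index table) are assumptions made to keep the example small.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple


@dataclass(frozen=True)
class LeafRule:
    """Hypothetical one-dimensional rule stored in a leaf node."""
    name: str
    low: int
    high: int
    action: str

    def matches(self, key: int) -> bool:
        return self.low <= key <= self.high


def search_leaf(rules: Sequence[LeafRule], start_index: Sequence[int],
                key: int) -> Tuple[List[Optional[LeafRule]], int]:
    """Find the first matching rule of every rule-set stored in one leaf.

    start_index[k] is the position of the first rule of rule-set k in `rules`.
    Returns the per-rule-set matches and the number of rules compared.
    """
    results: List[Optional[LeafRule]] = []
    compared = 0
    for k, begin in enumerate(start_index):
        end = start_index[k + 1] if k + 1 < len(start_index) else len(rules)
        hit: Optional[LeafRule] = None
        for i in range(begin, end):
            compared += 1
            if rules[i].matches(key):
                hit = rules[i]
                break  # skip the rest of rule-set k and jump to start_index[k + 1]
        results.append(hit)
    return results, compared
```

For a leaf laid out like the example in the text, with ACL rules at positions 0 to 3 and the firewall rule-set starting at index 4, a match on the second ACL rule moves the search directly to position 4 without comparing the remaining two ACL rules.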

Early Packet Drop Marking.
Assume that we build a node of a decision tree. Each node corresponds to a disjoint hypercube of the search space. Let us define some notations for describing "early packet drop marking":
(i) $\Delta_v$: the search space for node $v$
(ii) $k$: the total number of rule-sets
(iii) $n_p^v$: the total number of rules of rule-set $p$ belonging to node $v$
(iv) $r_p^v[i]$: the $i$th rule of rule-set $p$ belonging to node $v$ when the rules are sorted in order of decreasing priority
(v) $S(\cdot)$: the set of all keys matching a given rule
(vi) $D_p^v(i)$: the set of all keys matched with action "drop" by the first to $i$th rules of rule-set $p$ belonging to node $v$ when the rules are sorted in order of decreasing priority.
Then, it is recursively defined as
$$D_p^v(i) = \begin{cases} D_p^v(i-1) \cup \left( S(r_p^v[i]) \setminus \bigcup_{j<i} S(r_p^v[j]) \right), & \text{if the action of } r_p^v[i] \text{ is "drop"}, \\ D_p^v(i-1), & \text{otherwise}, \end{cases} \tag{1}$$
where $D_p^v(0) = \emptyset$.

Assume that an incoming packet matches $r_p^v[i]$ and $r_q^v[j]$ for rule-sets $p$ and $q$, respectively, where the actions of $r_p^v[i]$ and $r_q^v[j]$ are "allow" and "drop." In this case, the packet should be dropped by rule-set $q$. If all packets matching $r_p^v[i]$ always match a rule with action "drop" in another rule-set, knowing in advance that the packet will be dropped is very helpful for increasing classification performance. This idea can be generalized as follows. If $S(r_p^v[i]) \cap \Delta_v \subseteq D_q^v(n_q^v)$, any packet matched with $r_p^v[i]$ in node $v$ is dropped, where $p \neq q$. Thus, while building a node $v$, the proposed algorithm finds every rule $r_p^v[i]$ satisfying $S(r_p^v[i]) \cap \Delta_v \subseteq D_q^v(n_q^v)$, where $p < q$, and marks $r_p^v[i]$ with "early packet drop." If a packet matches a rule marked "early packet drop" during the search, the search procedure finishes and the packet is dropped. This "early packet drop marking" significantly increases the performance.
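The marking step can be illustrated on a toy one-dimensional key space. The sketch below is an interpretation rather than the paper's procedure: it reads the drop set of a rule-set as the keys whose highest-priority matching rule in that rule-set has action "drop", and it marks a rule when all of its keys inside the node fall into the drop set of some later rule-set. The tuple encoding of rules and all function names are hypothetical.

```python
from typing import Iterable, List, Sequence, Set, Tuple

Rule = Tuple[int, int, str]  # (low, high, action) over an inclusive 1-D range


def keys(rule: Rule, space: Iterable[int]) -> Set[int]:
    """Keys of the node's search space that match the rule."""
    return {key for key in space if rule[0] <= key <= rule[1]}


def drop_set(rule_set: Sequence[Rule], space: Sequence[int]) -> Set[int]:
    """Keys whose highest-priority match in this rule-set has action 'drop'."""
    covered: Set[int] = set()   # keys already claimed by a higher-priority rule
    dropped: Set[int] = set()
    for rule in rule_set:       # rules are sorted by decreasing priority
        first_match = keys(rule, space) - covered
        if rule[2] == "drop":
            dropped |= first_match
        covered |= first_match
    return dropped


def mark_early_drop(rule_sets: Sequence[Sequence[Rule]],
                    space: Sequence[int]) -> List[List[bool]]:
    """Mark rule i of rule-set p when a later rule-set q drops all of its keys."""
    drop_sets = [drop_set(rule_set, space) for rule_set in rule_sets]
    marks: List[List[bool]] = []
    for p, rule_set in enumerate(rule_sets):
        row = []
        for rule in rule_set:
            matched = keys(rule, space)
            later = range(p + 1, len(rule_sets))
            row.append(bool(matched) and any(matched <= drop_sets[q] for q in later))
        marks.append(row)
    return marks
```

A search that hits a marked rule can stop immediately and drop the packet, since every key matching that rule is known to be dropped by a rule-set that would be examined later.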

Proposed Packet Classification Performance Analysis.
The proposed algorithm merges multiple rule-sets into an integrated one and constructs a decision tree. We now present a numerical analysis of our algorithm. For ease of analysis, assume that the rules are homogeneous and that the decision tree is a perfectly balanced B-tree. Let us define some notations as follows:
(i) $k$: the total number of rule-sets.
(ii) $B$: binth, the maximum number of rules allowed in a leaf node.
(iii) $c$: the number of children of each node. For ease of analysis, we assume that $c$ is fixed.
(iv) $s$: the space factor, the maximum ratio of the total number of rules in all sub-rule-sets to the original rule-set size.
(v) $N$: the number of rules in each rule-set. We also assume that $N$ is fixed.

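Total Packet Classification Cost Analysis.
Assume that EffiCuts has $N$ rules in the root node. If it has $c$ child nodes and the space factor is $s$, each first-level child node has at most $sN/c$ rules. In a similar way, we can calculate the number of rules in a leaf node as
$$N \left( \frac{s}{c} \right)^{h_{\mathrm{EffiCuts}}}, \tag{2}$$
and it should be equal to or less than $B$, where $h_{\mathrm{EffiCuts}}$ is the height of the decision tree. From (2), we can find the height as
$$h_{\mathrm{EffiCuts}} = \log_{c/s} \frac{N}{B}. \tag{3}$$
For the proposed algorithm, whose root node holds the $kN$ rules of the integrated rule-set, we can similarly obtain the height as
$$h_{\mathrm{proposed}} = \log_{c/s} \frac{kN}{B}. \tag{4}$$
Since EffiCuts traverses one tree per rule-set while the proposed algorithm traverses a single tree, the total packet classification costs are approximated by the number of tree levels visited:
$$C_{\mathrm{EffiCuts}} \approx k\, h_{\mathrm{EffiCuts}} = k \log_{c/s} \frac{N}{B}, \qquad C_{\mathrm{proposed}} \approx h_{\mathrm{proposed}} = \log_{c/s} \frac{kN}{B}. \tag{5}$$
Now, we calculate the difference between the two costs:
$$C_{\mathrm{EffiCuts}} - C_{\mathrm{proposed}} \approx (k-1) \log_{c/s} \frac{N}{B} - \log_{c/s} k = \log_{c/s} \frac{(N/B)^{k-1}}{k}. \tag{6}$$
With $c/s > 1$, this difference is positive whenever $(N/B)^{k-1} > k$, which holds for any $k \geq 2$ when $N \gg B$. Then, we can conclude that the proposed algorithm provides higher classification performance than EffiCuts.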
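The height and cost comparison can be checked numerically with a small script. This is a sketch under stated assumptions: the cost is taken to be the number of tree levels visited (one tree per rule-set for separate classification, a single tree for the integrated rule-set), the heights follow $\log_{c/s}$ of the root rule count divided by binth, and the child count `c = 8` used below is a hypothetical value, while binth 30 and space factor 2 are the settings reported in the evaluation.

```python
import math


def height(rules_at_root: float, binth: float, children: float,
           space_factor: float) -> float:
    """Tree height at which the per-leaf rule count shrinks to binth.

    Each level multiplies the rule count by space_factor / children, so the
    height is log base (children / space_factor) of rules_at_root / binth.
    """
    return math.log(rules_at_root / binth, children / space_factor)


def cost_separate(k: int, n: float, binth: float, children: float,
                  space_factor: float) -> float:
    """Assumed cost model: k independent trees, each built from n rules."""
    return k * height(n, binth, children, space_factor)


def cost_integrated(k: int, n: float, binth: float, children: float,
                    space_factor: float) -> float:
    """Assumed cost model: one tree built from the k * n integrated rules."""
    return height(k * n, binth, children, space_factor)


if __name__ == "__main__":
    # k = 3 rule-sets of 20,000 rules, binth 30, space factor 2, c = 8 (assumed).
    for k in (1, 3, 10):
        separate = cost_separate(k, 20_000, 30, 8, 2)
        integrated = cost_integrated(k, 20_000, 30, 8, 2)
        print(f"k={k}: separate={separate:.2f}, integrated={integrated:.2f}")
```

Under this model the gap widens with the number of rule-sets, because the separate cost grows linearly in k while the integrated cost grows only with the logarithm of k.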

Total Decision Tree Size Analysis.
Since we assume that the decision tree is a perfectly balanced B-tree, EffiCuts requires at most $2^{h_{\mathrm{EffiCuts}}} - 1$ nodes for one rule-set, so the total number of nodes is $k(2^{h_{\mathrm{EffiCuts}}} - 1)$. Similarly, the proposed algorithm requires $2^{h_{\mathrm{proposed}}} - 1$ nodes. Using $h_{\mathrm{EffiCuts}} = \log_{c/s}(N/B)$ and $h_{\mathrm{proposed}} = \log_{c/s}(kN/B)$, the difference between the two node counts is
$$k\left(2^{h_{\mathrm{EffiCuts}}} - 1\right) - \left(2^{h_{\mathrm{proposed}}} - 1\right) = 2^{\log_{c/s}(N/B)} \left(k - k^{\log_{c/s} 2}\right) - k + 1.$$
Since $2^{\log_{c/s}(N/B)} \left(k - k^{\log_{c/s} 2}\right) - k + 1 < 0$ if $1 < c/s < 2$ and $k \geq 2$, the proposed algorithm creates a larger tree than EffiCuts in that case. However, we found that $c \gg s$ for most nodes in a decision tree. This means that the tree built by the proposed algorithm is not significantly larger than that of EffiCuts.
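The sign of the node-count difference can be explored numerically. The sketch below assumes the heights $\log_{c/s}(N/B)$ and $\log_{c/s}(kN/B)$ and the $2^h - 1$ node count used in the analysis; the function names and the sample parameter values are illustrative choices, not values from the paper.

```python
import math


def height(rules_at_root: float, binth: float, ratio: float) -> float:
    """Tree height as log base ratio (= c / s) of rules_at_root / binth."""
    return math.log(rules_at_root / binth, ratio)


def node_difference(k: int, n: float, binth: float, ratio: float) -> float:
    """Nodes of k separate trees minus nodes of the single integrated tree."""
    separate = k * (2 ** height(n, binth, ratio) - 1)
    integrated = 2 ** height(k * n, binth, ratio) - 1
    return separate - integrated


def closed_form(k: int, n: float, binth: float, ratio: float) -> float:
    """Closed form of the same difference, as given in the analysis."""
    return 2 ** height(n, binth, ratio) * (k - k ** math.log(2, ratio)) - k + 1


if __name__ == "__main__":
    for ratio in (1.5, 4.0):
        diff = node_difference(3, 20_000, 30, ratio)
        print(f"c/s={ratio}: separate - integrated = {diff:.1f}")
```

For a ratio below 2 the difference is negative, so the integrated tree is the larger one, while for a ratio well above 2 the difference turns positive, which matches the observation that $c \gg s$ keeps the integrated tree from growing significantly.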

Performance Evaluation
We compared the performance of the proposed algorithm with EffiCuts. Since EffiCuts is almost the only decision tree-based packet classification algorithm that supports fast classification and large rule-set sizes simultaneously, we chose it as the competitor. We measured the average and worst-case numbers of classification memory accesses and the decision tree size, using the optimal bucket size and space factor for each evaluation. Since the average number of classification memory accesses defines the overall performance of the network equipment, it is the most important metric. The worst-case number of classification memory accesses represents the maximum queuing delay required to guarantee in-order packet forwarding. Last, the total decision tree size is also a critical factor representing scalability in terms of rule-set size. Considering that modern network traffic increases exponentially and rule-sets become larger and more complicated to support various services, we chose these three metrics for performance evaluation.
Evaluating the performance of the proposed algorithm requires multiple rule-sets. Thus, three types of rule-sets, FW, ACL, and IPC, were generated using ClassBench [50]. Each rule consists of five tuples, and the rule-set size was varied from 20K to 100K in steps of 20K, where K means 1,000.
Accordingly, the integrated rule-set size ranged from 60K to 300K. For each evaluation, binth and the space factor were set to the optimal values, i.e., 30 and 2, respectively.
Figure 7 shows the average classification performance in terms of the average number of memory accesses according to the size of the total integrated rule-set. The proposed algorithm requires about 2.5 times fewer memory accesses than EffiCuts regardless of the rule-set size. Its memory access number is almost similar to that of a single rule-set. This confirms that the integrated rule-set has many benefits for increasing classification performance.
Figure 8 shows the results for the worst-case packet classification performance. The proposed algorithm decreases the memory access number by 2.2 times compared to the competitor regardless of the total rule-set size. Although the improvement is slightly smaller than that for the average memory access number, it confirms that the proposed algorithm is also very effective at increasing the packet classification performance in the worst case.
The worst-case performance actually affects the packet processing delay, since most network equipment must guarantee that packets are processed in sequence, keeping the order of outgoing packets the same as that of incoming packets. As the worst-case classification performance is improved, the equipment can efficiently provide in-order packet forwarding while minimizing packet queuing delay.
Figure 9 compares the decision tree sizes of the proposed algorithm and EffiCuts. As mentioned earlier as Characteristic 2, the proposed algorithm generates a decision tree whose size is only about 20% larger than that of EffiCuts for 300K rules. Therefore, the proposed algorithm does not suffer from a significantly increased tree size caused by rule-set integration.
Although we used three rule-sets for most performance evaluations, it is also important to investigate the performance as the number of rule-sets increases, in order to evaluate scalability in terms of the number of rule-sets. Figure 10 shows the ratio of the results of EffiCuts to those of the proposed algorithm for the memory access number and the decision tree size as the number of rule-sets increases from 1 to 10.
As shown in Figure 10, the decision tree size of the proposed algorithm is almost the same as that of EffiCuts regardless of the number of rule-sets, but its memory access number decreases quickly relative to EffiCuts. For 10 rule-sets, the proposed algorithm achieves 3 times higher classification performance while the decision tree size increases by just 10%.
Thus, we can see that our proposed algorithm can provide high classification performance without a significant cost in decision tree size.

Conclusions
In this paper, we proposed a new packet classification algorithm that achieves high packet classification performance without a significant increase in memory requirement. It can be adopted in modern high-performance network equipment that uses various classification rule-sets such as routing, switching, QoS, and other rule-sets. Existing network equipment with multiple rule-sets performs classification independently for each rule-set, resulting in deteriorated performance as the number of rule-sets increases. Our algorithm combines all rule-sets and achieves high performance that cannot be provided by existing algorithms. We expect that it will help to enable robust and low delay V2X services in modern networks.

Data Availability
The source code data used to support the findings of this study are currently under embargo while the research findings are commercialized. Requests for data, 12 months after publication of this article, will be considered by the corresponding author.


Figure 1: Multiple packet classification using multiple rule-sets in UTM equipment.

Figure 3: The ratio of the total decision tree size for EffiCuts to the total number of rules as the number of rules increases.

Figure 5: Packet classifications of the existing and proposed algorithms. (a) Previous packet. (b) Proposed packet.

Figure 7: The comparison results of EffiCuts and the proposed algorithm for the average number of memory accesses as the total rule-set size increases, where three rule-sets are used.

Figure 9: The comparison results of EffiCuts and the proposed algorithm for the decision tree size as the total rule-set size increases, where three rule-sets are used.

Figure 10: The ratio of the results for EffiCuts to those for the proposed algorithm for the decision tree size and the average memory access number as the number of rule-sets increases, where each rule-set size is fixed to 20K.

Table 1: Comparison of the features of software-based packet classification algorithms according to algorithm type.
With existing classification, the unnecessary classifications for rule-sets 0 to k − 1 cannot be avoided. In the proposed classification algorithm, all rule-sets are integrated into one larger rule-set, which has almost the same effect as searching multiple rule-sets simultaneously. Thereby, the problem of existing classification is mitigated in the proposed one.