Distributed Controller Placement in Software-Defined Networks with Consistency and Interoperability Problems



Introduction
The support for a network of everything and for the many on-demand applications and services offered by different multitenant cloud computing service providers has pushed the Internet to evolve into a large and complex infrastructure. A prior study reveals that the number of active Internet devices demanding these services will rise from 26.66 billion in 2019 to 41 billion in 2027 [1]. This growth is expected to increase exponentially to 125 billion by 2030, as an average of 127 devices is connected to the Internet daily [2]. The signs of this became more visible during the COVID-19 pandemic. These devices can generate heterogeneous traffic in cyberspace to the tune of 800 ZB. Most of this traffic emanates from applications with conflicting quality of service (QoS) requirements. Communication is no longer exclusively client-server but also involves machine-to-machine (M2M) exchanges. Managing these devices, which generate various kinds of traffic with different QoS demands, is quite challenging in today's traditional networks.
As such, network design and management have become much more difficult. The integration of network control logic and the data forwarding entity is considered the major drawback of the traditional IP network architecture [3,4]. The prohibition on customization in its proprietary, tightly coupled network devices makes network management, policy design, and innovation time-consuming, inflexible, and error-prone [1,5]. These limitations drive the need for a new architecture that can cope with current demand dynamics. Software-defined networking (SDN) [6] is considered a viable alternative to the traditional architecture because it breaks the vertical integration of network devices to eliminate their dependencies. SDN's main innovation is to detach the intelligence of networking from layer-three devices and hand it over to a programmable controller. Therefore, the role of these devices is limited to packet forwarding based on the controller's instructions [6]. The controller's global view and access to network statistics enable it to formulate policies to manage the devices flexibly. In addition, the control plane (CP) creates, distributes, and enforces the network policies upon request by switches whenever a new flow arrives that has no corresponding rule entry in the switch flow table [7]. The architecture of SDN was initially designed to use a single centralized CP to respond to all the events that emanate from the data plane (DP) [8]. Although a single controller is sufficient to manage small-scale networks, it introduces overhead, scalability, and reliability concerns in large-scale networks, where failure tendencies are high. Multiple controllers have shown better performance on large-scale networks. Therefore, it is imperative to use multiple controllers in a distributed architecture (dmCP) to address these issues. However, deploying multiple controllers also opens new design and performance challenges at the CP. These issues are associated with interoperability, consistency, and the controller placement problem (CPP) [9].
Several efforts have been made to offer solutions to these issues [9][10][11][12]. Each has a particular application area, parameters, and performance metrics. The proposed methods and algorithms have different degrees of strength and weakness. The state-of-the-art solutions are numerous and vary considerably. This work reviews these solutions to give the research community insight into these activities, although some attempts were made in the past to provide such a review, as reported in [4], [13][14][15][16][17][18][19][20][21]. However, most of these works focused strictly on CPP alone, and other essential design challenges were not considered. For example, the authors in [4] studied CPP solutions focused on the controller's capacity and traffic conditions. Even so, they did not consider the application environment or cost factors.
The authors in [13] focused on SDN with multiple controllers from a design logic perspective. The paper reviewed the proposed scalability, reliability, and load balancing solutions. However, it did not consider issues associated with interoperability between the controllers. Jalili et al. [15] reviewed the control plane deployment mode and compared in-band and out-of-band solutions. However, optimization objectives, application environment, and controller type were overlooked. The work in [16] focused on CPP design principles and architectures. But like [15], its discussion did not touch on optimization objectives and the deployment environment. This aspect is partially covered in [17][18][19], as these works categorized the CPP solutions in terms of performance metrics such as latency, reliability, cost, and combinations of these in a multiobjective optimization (MOO) scenario. However, controller type and other SDN application areas, such as WSN or IoT, were not considered. The authors restricted their scope to CPP in DCN and WAN, although wireless scenarios were partially touched on in [19].
Similarly, some works have applied deep learning and machine learning techniques to integrate SDN with IoT; however, these AI techniques are outside the scope of the present document. On the other hand, Isong et al. [20] considered optimization objectives such as latency, QoS, and resilience. Additionally, they discussed various schemes for solving the CPP and their limitations, categorizing them into optimal and heuristic-based suboptimal solutions. Furthermore, they underlined CPP application areas in next-generation networks such as VANETs, IoT, and telecom. Lastly, the authors of [14,21] conducted an in-depth analysis of CPP solution strategies applied to optimize CPP performance metrics such as latency and reliability. However, none of the papers [4], [13][14][15][16][17][18][19][20][21] considered interoperability and consistency issues among either homogeneous or heterogeneous controllers deployed in a dCP. A survey of controllers for scalability, consistency, reliability, and security is provided in [22], but it did not consider CPP and interoperability. In other words, none of the papers report how the multiple controllers placed in the dCP interoperate or synchronize their domain information via the east-west interface (EWi) to arrive at a consistent network state for effective service provisioning. These issues and CPP are fundamental in designing a control plane with multiple controllers in SDN.
So, unlike these other articles, this study looks at how homogeneous and heterogeneous controllers have been used to solve CP design problems in csCP and dmCP. In dmCP, the discussion covers controller-to-controller (C-C) interoperability, information synchronization problems in obtaining a consistent global network view among the controllers, and CPP. Table 1 provides a comparison summary to bring out these differences. The interoperability and consistency discussion focuses on the EWi and consistency properties, respectively, while the discussion of CPP solutions focuses on design objectives, load balance, application environment, and security. The key contributions of this research are summarised as follows:
(i) A highlight of CP challenges with historical context to trace the source of the problem that evolved into CPP
(ii) A presentation of the most commonly used CPP mathematical modelling and problem formulation approaches
(iii) An indication of an alternative solution approach pursued through the DP rather than the dmCP option
(iv) A critical review of efforts to address dmCP issues related to interoperability, consistency, and CPP solutions proposed in both csCP and dmCP, along with design objectives, the application environment, and security
(v) An identification and discussion of potential future research directions
As shown in Figure 1, the rest of the paper is organized as follows: Section 2 briefly overviews SDN and discusses its CP design options and performance issues. Sections 3 and 4 discuss the challenges of interoperability and consistency in dmCP, respectively, and review the approaches proposed to address them. Section 5 presents a review of the state-of-the-art solutions to CPP. Finally, the conclusion and future research directions are provided in Section 6.
Journal of Electrical and Computer Engineering

Overview of SDN
SDN comprises three planes with two interfaces to manage the communication between them, as shown in Figure 2: the application plane (AP), the control plane (CP), and the data plane (DP). The communication between AP and CP is managed through the northbound (NB) interface. The AP is a set of network applications such as network virtualization, firewalls, intrusion detection systems (IDSs), routing, QoS, and mobility management. These applications are translated into high-level networking policies and exposed to the CP. The CP is considered the network's brain that manipulates the network forwarding entities based on the designed network policies. It is responsible for routing

Table 1: Comparison summary of existing surveys.
[13] | √ X X | The multicontrol plane from a design logic perspective
[14] | √ X X |
[15] | | In-band and out-band solutions
[16] | | Design principles and architecture
[17] | √ X X | Performance metrics, such as latency, reliability, cost, and MOO
[18] | √ X X | Performance metrics, such as latency, reliability, cost, and MOO
[19] | √ X X | Classify CPP based on optimization/performance objectives and wireless environment
[20] | √ X X | Focused on the solution algorithms or approaches used to optimize the well-known CPP performance objectives
[14] | √ X X | Performance metrics
[21] | √ X X | Taxonomy of CPP optimization
[22] | X √ | Scalability, consistency, reliability, and security

[1] and Open vSwitch Database (OVSDB) [6] are among the early southbound management protocols used in SDN. However, OpenFlow is considered the most popular southbound standard. Over the last decade, SDN has attracted interest from both academia and industry in terms of research and deployment. Many institutions widely apply it in data centre networks (DCN) and wide area networks (WAN). It is also observed to be influencing the design of IoT [24] applications such as VANET and WBAN [25], and of next-generation technologies such as 5G [26].

SDN Control Plane Design Options and Performance Issues.
The CP of SDN can be designed with a single controller (csCP) or multiple controllers (dmCP). While csCP can meet performance requirements in small networks, it struggles to scale to large and dynamic network scenarios due to high control message processing overhead. It also raises reliability concerns because the controller is a single point of failure (SPOF), and failure tendencies are higher when the network is large. Figure 3 provides a visual representation of performance issues in each design option. The CP uses the Link Layer Discovery Protocol (LLDP) to discover the DP devices.
After the discovery, the CP is responsible for ensuring real-time, fine-grained maintenance of the network state. To that end, it monitors the topology at regular intervals to collect network statistics for the operation of network applications at the AP. Some of the things it monitors are traffic arrival patterns, traffic types, and topology changes due to events such as failures, which have increased lately [27]. This enables it to recalculate new instructions and install them in the switch's flow table at the DP upon the occurrence of any of these events. It performs these functions centrally, either reactively or proactively.

Centralized Single Control Plane: Performance Issues.
As the name implies, in csCP, a lone controller is configured to control the entire network. The single controller centrally manages all the devices at the DP (see Figure 4). Although it can satisfy performance obligations if the network is of average size, it easily suffers performance degradation if the network begins to grow or span a wide geographical area. Table 2 provides the summarised features of some of these controllers.

Scalability and Overhead.
One of the performance issues suffered by a csCP is its inability to handle a network whose DP devices extend over a geographically large area or rapidly generate a high number of events. The network's diameter influences the flow setup time in SDN; the time tends to be higher when the switch is further away from the controller. Furthermore, a network can grow to the point where numerous flow setup requests from many switches become a performance bottleneck. This is because the number of flow setup requests is directly proportional to the number of switches, implying that the overall cost, in terms of network load, of configuring a flow route across N switches is about 94 + 144N. The authors in [28] report that a large network can have switches that generate up to 10 million flow requests per second. This is beyond the capacity of a single controller, as some controllers can only accommodate 6000 flow requests per second [13].
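As a rough illustration of the figures quoted above, the following sketch evaluates the 94 + 144N cost expression and estimates how many controllers of a given capacity a request rate would require. The function names are hypothetical, the unit of the cost expression is not stated in the text, and the controller count is a simple lower bound that ignores placement, headroom, and uneven load.

```python
import math


def flow_setup_cost(n_switches: int) -> int:
    """Cost expression quoted in the text: 94 + 144 * N for N switches."""
    return 94 + 144 * n_switches


def min_controllers(request_rate: float, controller_capacity: float) -> int:
    """Lower bound on controllers needed if load were spread perfectly evenly."""
    return math.ceil(request_rate / controller_capacity)


# Figures taken from the text: 10 million requests/s, 6000 requests/s per controller.
print(flow_setup_cost(10))                  # 1534
print(min_controllers(10_000_000, 6_000))   # 1667
```

Even under this idealized even split, the quoted request rate would need well over a thousand controllers of that capacity, which is why a single controller cannot absorb it.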
On the other hand, a DCN might have a dynamic environment where high-volume network events are generated rapidly within a short period. In such a situation, the csCP suffers from communication and processing overhead that can prolong response time, causing serious delays that can hurt time-constrained applications. This is likely to happen because the controller may not have adequate CPU, memory, and bandwidth capacity to process and respond to this many DP events [6].

Reliability: Single Point of Failure.
A csCP is vulnerable to a single point of failure (SPOF) [29] because if the single controller is down, the switches under it have no other controller to fall back on. This is critical, as it affects service availability and security if the network is compromised by an attack, such as a DoS attack that floods the network with numerous fake flow setup requests. Furthermore, the singularity of the CP makes it more vulnerable to attacks such as spectrum sensing data falsification (SSDF), also known as the Byzantine attack, and defending an SDN controller against such an attack is difficult. If successful, the adversaries acquire full control of all network devices and can behave arbitrarily to disrupt the network. The only known defence against such a threat is a 3f + 1 switch-to-controller mapping, where f is the number of faulty controllers to be tolerated.
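The 3f + 1 mapping mentioned above can be illustrated with a short sketch. The function names are hypothetical, and the acceptance rule (a switch acts on an instruction only when at least f + 1 controllers returned the same one) is an assumption borrowed from general Byzantine fault-tolerant practice, not a scheme described in the text.

```python
from collections import Counter


def controllers_required(f: int) -> int:
    """Controllers a switch must be mapped to in order to tolerate f Byzantine ones."""
    if f < 0:
        raise ValueError("f must be non-negative")
    return 3 * f + 1


def tolerated_faults(n_controllers: int) -> int:
    """Largest f such that 3f + 1 <= n_controllers."""
    return max((n_controllers - 1) // 3, 0)


def accept_reply(replies, f):
    """Return the instruction backed by at least f + 1 identical replies, else None.

    Assumed rule: with at most f faulty controllers, f + 1 matching replies
    include at least one from a correct controller.
    """
    if len(replies) < controllers_required(f):
        return None
    value, votes = Counter(replies).most_common(1)[0]
    return value if votes >= f + 1 else None


# One faulty controller out of four cannot override the three correct ones.
print(accept_reply(["fwd:2", "fwd:2", "drop", "fwd:2"], f=1))  # fwd:2
```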
There are two approaches to mitigating these challenges. One is a DP modification approach in which some decision-making is delegated to the forwarding devices. The other is to redesign the CP with multiple controllers. Using the former approach, DevoFlow [30] proposed decreasing the communication rate between the CP and DP by implementing a wildcard rules mechanism at switches. The mechanism empowers switches to make some local routing decisions involving matching mice flows, which frees up the controller to focus solely on elephant flows and reduces overhead significantly. In DIFANE [31], a distributed DP framework for handling all data packets is proposed. In this architecture, if traffic flows that do not match a precached rule arrive at an ingress switch, the ingress switch encapsulates the packets and redirects them to a designated authority switch that determines how they are forwarded. This is done based on partitioning rules. The authority switch must deal with all such packets in the data plane and update the ingress switch to cache the decided rules locally. The approach employs a link-state routing technique that detects changes in network topology without the controller's involvement to curtail its traffic overhead. However, both techniques violate the basic SDN principle of freeing DP elements from any function but packet forwarding. Moreover, they may require a complex modification or replacement of the southbound interface (SBi) API, for example in OpenFlow-compliant DP elements, which may increase the design cost. Alternatively, a technique in [32] uses a partitioning algorithm to proactively generate wildcard rules and install them in the switches to handle mice flows without the controller. The technique performs the "action" of rewriting the server's IP address and forwarding the packet to the egress port. In contrast, the approach in [33] modified the packet handling process such that only the first packet to arrive is sent to the CP, while later packets are handled locally. This is achieved via the blackhole mechanism. In a similar approach, the authors of [34] proposed a technique that reduces CP overhead via packet-in filtering: it screens packets with duplicate information and drops them. However, these techniques incur high packet loss due to the blackhole mechanism and the filtering, respectively. The authors of [35] proposed a scheme that reduces the number of control messages between CP and DP using an out-of-band controller to avoid a hybrid architecture.
Conversely, Isyaku [36] proposed a technique that reduces CP overhead using flow timeouts and eviction mechanisms. It considers traffic characteristics to select for eviction those flows whose reinstallation would cause the least overhead on the CP. Similarly, to improve csCP scalability, Ethane [37] and NOX [38] enhance enterprise networks by allowing administrators to define policies such that unmatched requests pass through the controller for centralized control. However, both approaches suffer from SPOF and support only relatively small networks. Other approaches [39,40] adopt parallelism-based optimization using multicore systems and multithreading to reduce flow setup latency in the CP. On the other hand, the authors of [41,42] embraced routing-scheme-based optimization and entry aggregation with early match using the hidden Markov model to scale the CP and reduce the number of events it processes. These approaches aim to optimize this process with respect to the flow table.

Control Plane with Multiple Controllers.
A CP with multiple controllers is designed to solve the csCP limitations.
To achieve that, the csCP is modifed to deploy multiple controllers in a distributed architecture to manage the network.Figure 5 depicts an example of this design option.
The dmCP uses a load-sharing mechanism to allocate the DP switches among different controller instances as appropriate.
The controllers communicate via an east-to-west interface (EWi) to synchronise their information for global network knowledge. In addition, the interface provides a channel to coordinate activities such as data transmission, leader selection, failover, and load balancing among controllers.
Architecturally, the CP with multiple controllers is designed either as a logically centralized CP or a logically distributed CP.
The multiple controllers work together to perform the functions of a single controller in a logically centralised CP. This is accomplished by constantly synchronising their network state and policies to provide a consistent network view. However, intensive state synchronisation among controllers to maintain a logically centralised CP can result in significant bandwidth consumption and high latency in large networks. In a logically distributed CP, by contrast, each controller only has a view of the domain for which it is responsible; it makes decisions for its local domain alone and shares with other controllers only the information they need. This is opposed to logically centralised designs, where each controller must have a global view of the entire network to make decisions. Consensus algorithms are applied in most distributed control plane designs to achieve eventual correctness and consistency [43]. The consistency level achieved by these designs may be strong or weak. Strong consistency requires that the state of each controller instance be replicated and transmitted to all controllers through consensus, so a consistent network state is only reached once consensus completes. The procedure introduces overhead and delays, limiting responsiveness and potentially resulting in suboptimal performance. The eventual consistency model, in contrast, omits consensus and assures an at-least-once delivery invariant; it integrates information as it becomes available and reconciles updates once each domain knows of them. This supports faster reaction and higher update rates, but at the cost of a temporarily inconsistent network view, which may cause inappropriate network behaviours.
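The trade-off between the two consistency models can be made concrete with a toy sketch. It is not a consensus protocol: the strong update simply writes to every replica before returning, and the eventual update writes locally and reconciles later with a last-writer-wins rule on version numbers. The class and function names, and the reconciliation rule, are assumptions made for illustration.

```python
class Controller:
    """Toy controller replica holding a versioned view of network state."""

    def __init__(self, name):
        self.name = name
        self.view = {}      # key -> (version, value)
        self.pending = []   # local updates not yet propagated to peers

    def apply(self, key, version, value):
        # Last-writer-wins: keep the entry with the highest version.
        current = self.view.get(key)
        if current is None or version > current[0]:
            self.view[key] = (version, value)


def strong_update(controllers, key, version, value):
    """Return only after every replica has applied the update (slow, consistent)."""
    for controller in controllers:
        controller.apply(key, version, value)


def eventual_update(origin, key, version, value):
    """Apply locally and return at once; peers see the change only after a sync."""
    origin.apply(key, version, value)
    origin.pending.append((key, version, value))


def anti_entropy(controllers):
    """Propagate all pending updates so that the replicas converge."""
    for src in controllers:
        for update in src.pending:
            for dst in controllers:
                if dst is not src:
                    dst.apply(*update)
        src.pending.clear()


c1, c2 = Controller("c1"), Controller("c2")
eventual_update(c1, "link:s1-s2", 1, "down")
print(c2.view.get("link:s1-s2"))   # None: c2 still holds a stale view
anti_entropy([c1, c2])
print(c2.view["link:s1-s2"])       # (1, 'down')
```

The window between the local write and the sync is exactly the period in which a controller may act on a stale network view.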
ONIX [44] is an example of these CP architectures, with distributed controller instances deployed on one or more physical machines. The control architecture of ONIX maintains a global network view within a network information base (NIB) data structure with two distinct update and distribution mechanisms. ONIX guarantees consistency of network state using distributed locking and Paxos consensus algorithms. It also incorporates replication and transactional database modes to ensure that the consistency attained is reliable. In addition, it contains a distributed hash table (DHT) mode that provides an extensive API to verify the consistency. ONIX is therefore well suited to networks that require high availability and experience frequent events. However, despite all these consistency checks, it lacks confidentiality and integrity mechanisms to ensure secure state exchange among the controller instances.
In contrast, ONOS [45] and ODL [46] employ stringent access control techniques and security services to prevent repudiation and elevation-of-privilege risks, which makes them preferable where security is crucial. ONOS [45], a distributed control architecture, operates different instances of Floodlight controllers on multiple servers, with each server responsible for a subset of OpenFlow switches. The controllers broadcast network events using a publish/subscribe method, and intercontroller communication is handled via various routes. ONOS leverages Titan's transactional semantics on top of Cassandra's consistent data store to ensure the consistency and integrity of the network state. In addition, the secure mode (SM) of ONOS provides protected access and granular control over internal data structures and libraries.
OpenDaylight (ODL) [46] is another controller with a distributed architecture, designed by clustering multiple controllers. ODL can keep a centralized, logical network view using the Akka framework and the RAFT consensus algorithm. The consensus algorithm is incorporated to enable the clustered ODL instances to achieve network consistency. The algorithm randomly selects one of the cluster members to serve as the leader and then transmits all the most recent data changes to that leader for update processing. ODL is an open-source controller that can accommodate various specialized security modules such as secure network bootstrapping infrastructure (SNBI), AAA service, and Defense4All. As a result, ODL can preserve the integrity of the data and is recognized as one of the most secure dCPs. These features facilitate SDN integration with conventional network architecture.
Like ODL [46], DISCO [47] is another horizontally distributed CP architecture. But unlike ODL, it has limited security mechanisms because of an inherent vulnerability of the Floodlight controller. DISCO employs the Advanced Message Queuing Protocol (AMQP) to design an extensible dCP suitable for heterogeneous WASDN that addresses concurrent control strategy inconsistencies in multicontroller architectures. A single Floodlight instance of DISCO is assigned to each autonomous domain. AMQP allows it to transmit information to other controller instances via an EWi in publish-and-subscribe mode. However, even though DISCO is suitable for a large network under different administrative controls, like the Internet, scalability and SPOF concerns still exist because of its strategy of one controller instance per network domain. Katta et al. [48] proposed Ravana as an alternative to DISCO for fault tolerance at both DP and CP. Ravana is a three-phase replication procedure that preserves the consistency of (1) DP switch runtime, (2) the control interface, and (3) controller instance runtime in a logically centralized CP architecture with a master-slave design. Instead of merely keeping the controller state consistent in one phase, it considers the data plane by incorporating a mechanism to guarantee switch state consistency. Ravana demands strong consistency guarantees when processing failure events to preserve exactly-once semantics. Because the method is based on the Ryu controller platform, it is susceptible to the same security flaws as Ryu itself [49]. Another fault-tolerant dCP, with an additional transparency feature in controller instance capacity, is proposed in [50] as IRIS-HiSA. The architecture of IRIS-HiSA comprises a cluster of controller instances organized in a physically distributed manner, with each instance having access to network state information of the global topology. Controller instances are activated by its session management module when there is a failure or overload incident, and switches are assigned to them according to their residual capacity. All the controller instances share their domain information through a publish-subscribe procedure, and consistency of the controllers' network knowledge is pursued using the Hazelcast consensus algorithm. Instead of using traditional topological partitioning for differential QoS provisioning, Hydra [51] is an alternative distributed CP architecture that divides a network according to the functionality and role of network control applications, so that network applications are configured on distinct controllers. Hydra uses the Paxos consensus algorithm for fault tolerance and consistency. However, due to the functional slicing, communication between applications across partitions may encounter high latency. Conversely, Elasticon [52] is built with a load measurement module and an adaptation module that adjusts load via switch migration across controller instances in the event of topology changes, making it well suited to adaptive load balancing in dCP.
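Load adjustment via switch migration, as attributed to Elasticon above, can be sketched with a simple greedy rule. This is not Elasticon's actual algorithm: the capacity threshold, the choice of the lightest switch, and the function name are all assumptions made for illustration.

```python
def migrate_switches(assignment, load, capacity):
    """Greedily move switches off overloaded controllers (hypothetical policy).

    assignment: controller name -> list of switch names (modified in place)
    load:       switch name -> request rate it generates
    capacity:   maximum request rate one controller should carry
    Returns the list of (switch, source, destination) moves performed.
    """
    def total(controller):
        return sum(load[s] for s in assignment[controller])

    moves = []
    for src in list(assignment):
        while total(src) > capacity and len(assignment[src]) > 1:
            dst = min(assignment, key=total)                       # least-loaded controller
            switch = min(assignment[src], key=lambda s: load[s])   # lightest switch on src
            if dst == src or total(dst) + load[switch] > capacity:
                break                                              # no safe move available
            assignment[src].remove(switch)
            assignment[dst].append(switch)
            moves.append((switch, src, dst))
    return moves


assignment = {"c1": ["s1", "s2", "s3"], "c2": ["s4"]}
load = {"s1": 50, "s2": 40, "s3": 30, "s4": 20}
print(migrate_switches(assignment, load, capacity=100))  # [('s3', 'c1', 'c2')]
```

A real adaptation module would also weigh migration cost and the latency between the switch and its new controller, which this sketch leaves out.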

Distributed Control Plane Challenges and Design Issues.
Architecturally, the CP with multiple controllers is designed either as a logically centralized or a logically distributed CP. However, as summarised in Table 3, both designs face challenges such as consistency, interoperability, and the controller placement problem (CPP) (see Figure 6). Two of these challenges are about ensuring that the behaviour of the multiple controllers matches what is offered by a single controller. This requires coordination to ensure consistency and interoperability among the controllers, which remains a major challenge in SDN [43].
Meanwhile, many dCPs rely on consensus algorithms such as Paxos and RAFT to achieve a consistent network state. However, these algorithms are observed to be theoretically unsuitable and practically ineffective because they inhibit availability and incur extra latency, primarily when controllers are distributed across a WAN [43]. Furthermore, interoperability among the multiple controllers becomes even more difficult due to the controllers' heterogeneity. The other problem a CP with multiple controllers faces is identifying the number of required controllers and their placement in the network topology. This problem is called the controller placement problem (CPP) [9].

Controllers Consistency Problem.
Controller consistency in mCP refers to its ability to always have stable, up-to-date global knowledge of all network states and policies. The update always aims to preserve one or more of the following aspects: (1) network state, such as connectivity and capacity, and (2) policy. Inconsistency in CP connectivity can cause traffic blackholes (i.e., dead-end paths), isolation, or forwarding loops [54] (i.e., a situation where traffic keeps going back and forth without reaching its destination). Forwarding loops can deplete switch buffers to the extent of impairing availability in the network.
In contrast, inconsistency in capacity can cause transient congestion and latency problems during updates [55]. Lastly, consistency in policy ensures that the requirements desired by the operator, such as path selection (as in Isyaku et al. [56]), ACLs, and firewalls, are adhered to. Inconsistency in policy updates might have security implications and cause QoS violations. To avoid all these, the controllers must constantly share their state, policies, and version information. However, achieving consistent global knowledge of an entire network across all controllers is one of the most challenging tasks in dmCP, unlike in csCP, where global network knowledge is easily acquired during network policy updates. Consistency in mCP covers three aspects of SDN device operation: (1) controllers' uniform state consistency, (2) consistency in switches' flow table rules, and (3) controllers' version update consistency [22]. If a rule update happens at the time of a version update, some packets of some flows may still be in transit; they may then be forwarded by a mix of old and new rules, leading to inconsistent packet forwarding decisions, and late status updates cause jitter. Likewise, controller overhead may appear due to the high frequency of synchronization attempts. In contrast, state desynchronization in the interval between two syncs can bring connectivity consistency problems such as forwarding loops and black holes [57].

Controllers Interoperability Problem.
Similarly, it is essential to note that information synchronization for a global view among the controllers is only made possible by an effective EWi. However, the lack of a unified EWi makes interoperability between different controllers a prominent problem in SDN. The problem is compounded even further when dealing with heterogeneous controllers. The motivation for designing a CP with heterogeneous controllers can be seen from a security perspective: it avoids the common-mode faults of homogeneous controllers [10]. Thus, providing an abstraction that can support the integration and interoperation of heterogeneous controllers from different vendors, as OpenFlow did for DP elements, is a big challenge. Moreover, the variations in data models among different controllers hamper this collaboration [22].

Controllers Placement Problem.
Another well-known issue confronting a CP with multiple controllers is the controller placement problem (CPP) [9]. To design a CP for any given network topology, the CPP is formulated to answer questions such as: How many controllers are needed for the network? What are their optimum locations in the topology? And how can they be mapped to the DP devices to satisfy the QoS requirements of the network? These issues have received substantial research attention, as knowing the number of controllers to use and where to put them is a prerequisite to meeting QoS metrics and fault tolerance. For instance, answers to these questions are necessary to design an efficient dmCP for SD-WAN and DCN, where latency and reliability are among the most important performance requirements.
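The placement and mapping questions above can be illustrated with a small exhaustive search that, for a fixed number of controllers k, picks the sites minimizing the worst-case switch-to-controller latency (a k-center style objective). This is only one of many possible CPP formulations; the function names and the five-node line topology are hypothetical, and brute force is practical only for small networks.

```python
from itertools import combinations


def all_pairs_latency(nodes, links):
    """Floyd-Warshall shortest-path latencies over undirected weighted links."""
    inf = float("inf")
    dist = {u: {v: (0 if u == v else inf) for v in nodes} for u in nodes}
    for u, v, w in links:
        dist[u][v] = min(dist[u][v], w)
        dist[v][u] = min(dist[v][u], w)
    for k in nodes:
        for i in nodes:
            for j in nodes:
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist


def place_controllers(nodes, links, k):
    """Try every k-subset of nodes; keep the one with the lowest worst-case latency."""
    dist = all_pairs_latency(nodes, links)
    best = None
    for sites in combinations(nodes, k):
        worst = max(min(dist[s][c] for c in sites) for s in nodes)
        if best is None or worst < best[0]:
            best = (worst, sites)
    worst, sites = best
    # Map each switch to its nearest chosen controller site.
    mapping = {s: min(sites, key=lambda c: dist[s][c]) for s in nodes}
    return sites, worst, mapping


# Hypothetical line topology a-b-c-d-e with unit link latencies.
nodes = ["a", "b", "c", "d", "e"]
links = [("a", "b", 1), ("b", "c", 1), ("c", "d", 1), ("d", "e", 1)]
sites, worst, mapping = place_controllers(nodes, links, k=2)
print(sites, worst, mapping)
```

Repeating the search for increasing k until the worst-case latency meets a target is one simple way to approach the question of how many controllers are needed.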

Consistency Problem in Distributed Control Plane
asynchronously. However, because of the asynchronism, there will be delays before all the switches apply the update. Thus, until the last switch applies the update, flows might be controlled by a combination of old and new rules. In this circumstance, different forms of invariant violation, such as a forwarding loop, firewall bypass, isolation, or black hole, might be experienced in the network. This is called the "consistent update problem." The challenge, therefore, is to update the DP devices in a manner where no invariant is violated. The solution to the problem has been approached in three ways. The first method meticulously arranges the switch update sequence to guarantee that no invariant violations occur [58]. The second method involves a 2-phase commit, i.e., marking (tagging) incoming packets at the ingress switch with a unique identifier and then processing them according to whether they belong to a new or existing flow entry [11,59,60]. In the third strategy, switches switch over to the new flow entries almost simultaneously, which is made possible by using switches whose clocks are precisely synchronised with one another [61]. In dmCP, a consistency issue arises because even though the multiple controllers are partitioned from each other, the DP devices they control remain connected [62].
In this situation, as explained in 2.3.1, the controllers synchronise their domain information with one another to update their states, versions, and rules with the help of consensus algorithms. The update always aims to preserve one or more consistency properties, i.e., (1) connectivity, (2) capacity, or (3) policy. For instance, inconsistency in CP connectivity can cause traffic blackholes (i.e., dead-end paths) or forwarding loops (traffic keeps going back and forth without reaching its destination). The latter can deplete switch buffers and impair availability and connectivity.
Another instance is that inconsistency in capacity can cause transient congestion and latency problems during network state or policy updates. Lastly, in addition to connectivity and capacity, inconsistency in policy updates might have security implications or cause QoS violations. For example, some networks might have a policy that forces some flows to traverse a firewall or to be routed via a certain subpath because of their QoS requirements. Therefore, such policies must be updated consistently throughout the network.
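The forwarding-loop case can be made concrete with a small simulation. The sketch below is illustrative only: the three-switch topology, the rule tables `old` and `new`, and the helper `trace` are hypothetical and are not taken from any of the cited works. It shows that a mixture of old and new rules can loop even though each rule set is correct on its own, and that a different update order avoids the loop.

```python
def trace(rules, src, dst):
    """Follow next-hop rules from src and report what happens to the packet."""
    node, seen = src, [src]
    while node != dst:
        nxt = rules.get(node)
        if nxt is None:
            return "blackhole", seen          # no matching rule: dead end
        if nxt in seen:
            return "loop", seen + [nxt]       # revisiting a node: forwarding loop
        seen.append(nxt)
        node = nxt
    return "delivered", seen

# Hypothetical rule sets: next hop per switch for traffic towards "d".
old = {"s1": "s2", "s2": "s3", "s3": "d"}     # s1 -> s2 -> s3 -> d
new = {"s1": "s3", "s3": "s2", "s2": "d"}     # s1 -> s3 -> s2 -> d

# Asynchronous update: s3 applies its new rule while s2 still uses the old one.
mixed = dict(old, s3=new["s3"])
bad_outcome = trace(mixed, "s1", "d")

# A careful order (s2, then s3, then s1) keeps every intermediate state loop-free.
state, safe_outcomes = dict(old), []
for switch in ("s2", "s3", "s1"):
    state[switch] = new[switch]
    safe_outcomes.append(trace(state, "s1", "d")[0])

print(bad_outcome)
print(safe_outcomes)
```

In this toy case the ordering that updates the switch closest to the destination first is safe; in general, finding such an order is the scheduling problem discussed in the following sections.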
Meanwhile, many dCP designs rely on consensus algorithms such as Paxos and RAFT to achieve a consistent network state. However, the algorithms are observed to be theoretically unsuitable and practically ineffective because they inhibit availability and incur extra latency, especially when controllers are distributed across a WAN [43]. For instance, Paxos [7] involves a four-delay state updating method: prepare-request, promise, accept-request, and accept-response. In large-scale SDN, flow requests can reach up to 11 million per second [63]. Every request may require a state change and a consensus run; this can hinder quick network reconfigurations and create a bottleneck on the CP and DP.
Similarly, this might not be suitable for some use cases, such as 5G, where low-latency applications have a connection setup requirement of <15-30 ms [63]. Although RAFT has optimised Paxos [9], the core notion remains. Besides, these algorithms are complex and hard to implement, which makes their implementations error-prone. Additionally, RAFT [43] is susceptible to Byzantine failures [64].
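A rough calculation illustrates why the four message delays matter for a 15-30 ms setup budget. The sketch below is a back-of-the-envelope model, not a Paxos implementation: it assumes the four steps run sequentially, ignores processing and queuing time, and uses hypothetical one-way inter-controller delays of 2, 5, and 10 ms.

```python
# One consensus run is modelled as four sequential one-way message delays:
# prepare-request, promise, accept-request, accept-response.
PAXOS_MESSAGE_DELAYS = 4

def consensus_latency_ms(one_way_delay_ms, message_delays=PAXOS_MESSAGE_DELAYS):
    """Lower bound on the time for one consensus run (network delay only)."""
    return one_way_delay_ms * message_delays

def fits_budget(one_way_delay_ms, budget_ms):
    """True if a single consensus run fits inside the connection setup budget."""
    return consensus_latency_ms(one_way_delay_ms) <= budget_ms

# Hypothetical one-way delays between controllers (LAN-like to WAN-like).
for delay in (2, 5, 10):
    print(delay, consensus_latency_ms(delay),
          fits_budget(delay, 15), fits_budget(delay, 30))
```

Under these assumptions, a 10 ms one-way delay already costs 40 ms per consensus run, which exceeds both ends of the budget before any switch has been programmed.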
Furthermore, each aspect's consistency can be either weak or strong. Strong consistency requires that the state of every controller instance be replicated and propagated only through mutual consensus. After any state update, the leader facilitates the conflict-free distribution of the update to all instances. In contrast, the eventual (weak) consistency model omits consensus and guarantees at least one delivery invariant. The consistency model used by the replication process affects the synchronization overhead in terms of load, response times, availability, and the processing order of commits.
This study classifies the proposed solutions for preserving consistency among controllers in dmCP according to their targeted objectives. We consider two kinds of update that take place inside the controllers: (1) the network state update, which preserves the "connectivity" and "capacity" properties, and (2) the "policy" update, which concerns network operation procedures such as security checks and service differentiation.

Network State (Connectivity and Capacity).
In response to network-changing events, controllers in a dCP update their knowledge to prevent (1) connection disruption and (2) network capacity violations. An inconsistent view of connectivity can lead to issues such as black holes, isolation, and forwarding loops that may deplete switch buffers and limit network availability, while congestion and update delay can be caused by inconsistent controller information on devices' residual capacity [47]. Therefore, the following subsections review the research efforts to update networks while maintaining these consistency properties.

Controllers' Consistency for Connectivity Preservation.
Mahajan and Wattenhofer [58] proposed update schedules based on combinatorial dependencies that do not require any packet tagging [60]. This allows some updated connections to become available sooner. The authors also provide an initial algorithm that, given the current state, swiftly updates routes in a loop-free way, with the controller greedily trying to update as many nodes as possible. However, the greedy operation of the algorithm might trap it in a local optimum. In another approach, Nguyen et al. [65] designed EZ-Segway, a distributed method for updating network state consistently and swiftly while avoiding anomalies such as loops, blackholes, and congestion. In EZ-Segway, the controllers precompute any information needed by the switches before the update. The data are sent to the switches, which implement the change using a combination of direct message passing and partial knowledge. This removes communication and computation bottlenecks at the CP and thus enhances the performance of update processes. However, using partial data to apply the update may not guarantee loop and blackhole freedom in the system.
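The idea of greedily updating as many nodes as possible per round can be sketched as follows. This is not the algorithm of [58]; it is a minimal, conservative illustration under stated assumptions: routing is towards a single destination, each switch has one old and one new next hop, and a switch may join a round only if the forwarding graph stays acyclic when every switch in that round is allowed to use either rule. The function names and the example topology are hypothetical.

```python
def has_cycle(edges):
    """Depth-first search for a cycle in a directed graph {node: set(next_hops)}."""
    WHITE, GREY, BLACK = 0, 1, 2
    color = {}

    def visit(u):
        color[u] = GREY
        for v in edges.get(u, ()):
            c = color.get(v, WHITE)
            if c == GREY or (c == WHITE and visit(v)):
                return True
        color[u] = BLACK
        return False

    return any(color.get(u, WHITE) == WHITE and visit(u) for u in list(edges))

def greedy_loop_free_rounds(old, new):
    """Group switch updates into rounds; within a round any interleaving is loop-free."""
    done = {n for n in old if old[n] == new[n]}        # nothing to change
    pending = [n for n in old if n not in done]
    rounds = []
    while pending:
        batch = []
        for n in pending:                              # greedily grow the round
            trial = batch + [n]
            edges = {}
            for m in old:
                if m in done:
                    edges[m] = {new[m]}
                elif m in trial:
                    edges[m] = {old[m], new[m]}        # may be in either state
                else:
                    edges[m] = {old[m]}
            if not has_cycle(edges):
                batch = trial
        if not batch:
            raise ValueError("no loop-free step exists under this conservative check")
        rounds.append(batch)
        done.update(batch)
        pending = [n for n in pending if n not in done]
    return rounds

old = {"s1": "s2", "s2": "s3", "s3": "d"}
new = {"s1": "s3", "s3": "s2", "s2": "d"}
rounds = greedy_loop_free_rounds(old, new)
print(rounds)
```

Because the round is grown in a fixed order and never backtracks, the sketch can return more rounds than necessary, which mirrors the local-optimum concern raised above.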
In the rule replacement approach, Forester et al. [66] suggest that a straightforward way to ensure blackhole freedom is to install new or default matching rules with higher priority before deleting the old ones. However, implementing this might induce a forwarding loop. Besides, the technique is also restricted by TCAM capacity limits. In another work, Canini et al. [67] proposed a solution based on principles of self-stabilization within a bounded communication delay. Known as "Renaissance," the method manages multiple controllers' connection and communication outages without compromising performance. This solution is an improvement over their previously proposed FixTag [68], a wait-free technique that tackles the rule update consistency problem via a transactional interface. Zhou et al. [69] propose a consistency layer that actively and passively snapshots the cross-domain control states to reduce the complexity of service realization and to ensure a consistent link state for maximum throughput and minimum latency. The technique adopts both reactive and passive snapshots of a cross-domain layer in a WAN to keep the state of the network controllers consistent.
In a different approach, Mizrahi and Moses [70] propose a technique that synchronises the switches' clocks to update the network in real time. Given perfect clock synchronisation and switch execution behaviour, the method preserves loop freedom and avoids communication loss. Similarly, in [71], Mizrahi and Moses modify the precision time protocol (PTP) to achieve microsecond update accuracy in SDN, because the standard network time protocol (NTP) lacks appropriate synchronisation behaviour for SDN. This leads to an increase in the number of messages necessary for time synchronisation across the whole network. However, when two unsplittable flows need to be swapped in the network with no alternative paths available, synchronized updates are considered optimal, and the new flow paths can minimize the induced congestion. Furthermore, despite these benefits, clock synchronisation approaches do not prevent random fluctuations in command execution time on switches. This prompted the development of prediction-based scheduling techniques [72].

Controllers' Consistency for Capacity Preservation.
Panda et al. [62] investigate the extent to which CAP theorem trade-offs apply to SDN with dCP. They examine network consistency properties that require tenant isolation and middlebox traversal for some traffic and prove that they cannot all be enforced without losing availability. The authors posit that linearizability is typically unneeded for ensuring effective enforcement of most network consistency properties since the monitored policies typically have simple correctness criteria. For this reason, in [43], Panda et al. designed a simple coordination layer (SCL) that avoids consensus algorithms like Paxos or Raft to achieve consistency in dCP. SCL broadcasts all CP communication to avoid the need for bootstrapping on the controllers. The approach offers simplicity and eventual correctness, with a response-time advantage. However, it might give the DP OvS conflicting instructions because of the response time. Hence, an implementation may consume more bandwidth and requires replacing the topology discovery module of the CP with the log provided by the controller proxy in SCL. Also inspired by [62], Sakic et al. [63] and Bannour et al. [73] develop adaptive eventually consistent and self-adaptive multilevel consistency models, respectively, to solve blockage possibilities in a highly consistent dCP. These models are intended to facilitate developer implementation of numerous application-specific consistency models. In [63], Sakic et al. 
integrate eventual consistency models with a novel cost-based approach, where rigorous synchronization is used for crucial activities involving many network resources, while less critical changes are transmitted intermittently across cluster nodes. However, these techniques suffer from traffic separation overhead. Another study [74] proposed a fast and generic system that imposes customizable network consistency during updates and information synchronization. The authors design a customizable consistency generator (CCG) to act as a shim layer between CP and DP, intercepting and scheduling real-time updates issued by the controllers. The authors use Mininet to emulate a fat-tree network with shortest-path routing and a load-balancing application running on NOX. However, CCG might require architecture modification and incur customization overhead. Luo et al. [75] also argue that during the update process, in-transit packets might use wrong versions of rules, and "hot" links could be overloaded due to the unplanned update order. Even though earlier proposals like the 2-phase commit and CCG have provided generic and customizable solutions to the problem of misused rules, no flexible approach exists to avoid transient congestion on hot links under varied user requirements such as update deadlines, transient throughput, or loss. Motivated by this, they proposed a customizable update planner (CUP). But just like CCG, CUP might incur customization overhead. Aslan and Matrawy [76] published an adaptation technique that chooses feasible values for the consistency level indicators that satisfy a specific application indicator. The authors use online K-means clustering to determine an appropriate mapping between consistency level and application indicator. In a similar approach, Zhang et al. 
[77] also propose a synchronization strategy for controller information in dCP that adapts to the current network state. The authors formulate an optimization problem subject to overhead and availability constraints. The controllers' roles are classified as leader, acceptor, and learner. However, the technique did not consider fair load balancing among the controllers, which could reduce the synchronization overhead further.
The 2-phase commit [60] process used to ensure per-packet consistency can also be used to ensure per-flow consistency to avoid congestion. While this approach reduces congestion, it is insufficient for full bandwidth guarantees. Therefore, as demonstrated by Mizrahi [78]. However, these techniques suffer from flow reassembly overhead at the destination. In a different approach, Jin et al. [81] consider different update timings of network switches to avoid congestion. They construct a dependency graph of the various updates and send them out in a greedy fashion once the requisite conditions are met. Flows are then slowed down when the greedy traversal of the dependency graph leads to a standstill. In a similar approach, Zheng et al. [82,83] examine the use of consistent timed updates to reduce congestion in the context of flow migration and latency by employing switch clock synchronization. The authors presented Chronus, a mechanism that allows for scheduling individual node updates in SDNs at precise times. The strategy is found to reduce transient congestion and conserve flow table space by reducing rule size by 60 per cent. However, this reduction might be achieved at the cost of some policy inconsistency.
Amiri et al. [84] also explore node ordering as an alternative to a 2-phase commit. With the flow version numbers removed from the packet header ("marking"), they offer a technique that uniquely identifies flows based on their source and destination. By omitting the 2-phase commit procedures, the process minimizes complexity and overhead. However, every node has an old and a new forwarding rule for every flow. The difficulty is determining the proper sequence in which to apply these updates without causing congestion or forwarding loops.
Another option proposed by Botelho et al. [85,86] is the use of a distributed data store so that applications on controllers are scarcely aware of any inconsistency. This way, latency problems would not even arise. However, the high memory requirement of the data store might not allow it to work well with the TCAM of OpenFlow switches. In contrast, Levin et al. show in [55] that load balancers and other distributed network functions can operate under weak (eventual) consistency while still providing adequate performance for production networks. They made their observation by experimenting with two state-distribution trade-offs: between staleness and optimality, and between application logic complexity and robustness to control state. They suggest similar investigations for other control applications such as routing and security. Guo et al. [57] build on Levin et al.'s work by reducing synchronization overhead using a load variance-based synchronization (LVS) technique. Unlike the periodic synchronization (PS) technique, LVS incurs less overhead because it is activated only when the load exceeds a particular threshold. However, conflicts in state update distribution cannot be ruled out because of weak consistency.
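The overhead difference between periodic and threshold-triggered synchronization can be illustrated with a toy trace. The sketch below is not the LVS algorithm of [57]; it only models the trigger condition as described above (synchronise when the load exceeds a threshold) against a fixed-period baseline. The load samples, the threshold of 30, and the function names are hypothetical.

```python
def periodic_sync_steps(loads, period=1):
    """PS baseline: synchronise at every `period`-th sample regardless of load."""
    return [t for t in range(len(loads)) if t % period == 0]

def threshold_sync_steps(loads, threshold):
    """Threshold-triggered variant: synchronise only when the load exceeds the threshold."""
    return [t for t, load in enumerate(loads) if load > threshold]

# Hypothetical per-interval load samples of one controller domain.
loads = [10, 12, 11, 40, 45, 13, 12, 50]

ps_steps = periodic_sync_steps(loads)
lvs_like_steps = threshold_sync_steps(loads, threshold=30)

print(len(ps_steps), ps_steps)
print(len(lvs_like_steps), lvs_like_steps)
```

In this trace the threshold-triggered variant sends three synchronization messages instead of eight, at the price of peers holding stale load values during the quiet intervals.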

Controllers' Consistency for Policy Preservation.
The 2-phase commit approach proposed by Reitblatt et al. in [11,60] is the foundation of most rule update techniques for consistency preservation in SDN. It broadened the scope of consistency properties beyond network forwarding to include policy. The authors emphasise the per-packet consistency (PPC) criterion, which requires each packet to be forwarded on either its old or its new path during an update (never on a mixture of both). The concept revolves around labelling packets at the ingress switch so that either all old or all new rules are applied consistently across the network, but not both. This way, the problem of forwarding loops and inconsistent policy on packets is avoided. Fayazbakhsh et al.'s FlowTags [87] uses the same principles as Qazi et al. [88] to ensure network-wide policy compliance with middlebox usage in SDN. However, the technique is known to be memory hungry, as it requires additional free memory slots. This is its major drawback, considering how scarce and expensive memory is on an OpenFlow switch. Meanwhile, Katta et al. [59] designed a variant of the original technique that addresses this limitation using an incremental 2-phase commit. The update is broken down into smaller updates that can be executed sequentially without putting much pressure on the TCAM.
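The tagging mechanism and its memory cost can be sketched in a few lines. This is a simplified illustration of the 2-phase idea, not the implementation of [11,60]: flow tables are plain dictionaries keyed by a version tag, the topology and helper names are hypothetical, and in-flight packets are assumed to have drained before the old rules are removed.

```python
def install_path(tables, version, path):
    """Install next-hop rules for `path` that match only packets tagged `version`."""
    for here, nxt in zip(path, path[1:]):
        tables[here][version] = nxt

def route(tables, version, src, dst):
    """Forward a packet using only the rules that match its version tag."""
    path = [src]
    while path[-1] != dst:
        path.append(tables[path[-1]][version])
    return path

old_path = ["s1", "s2", "s3", "d"]
new_path = ["s1", "s3", "s2", "d"]

tables = {s: {} for s in ("s1", "s2", "s3")}
install_path(tables, 1, old_path)
ingress_tag = 1                       # tag stamped on packets at the ingress switch

# Phase 1: install the new rules everywhere; the ingress still stamps the old tag.
install_path(tables, 2, new_path)
during = route(tables, ingress_tag, "s1", "d")
peak_rules = max(len(t) for t in tables.values())   # both versions are resident

# Phase 2: flip the ingress tag; new packets now see only the new rules.
ingress_tag = 2
after = route(tables, ingress_tag, "s1", "d")

# Cleanup: remove the old rules (assumes old-tagged packets have left the network).
for t in tables.values():
    t.pop(1, None)

print(during, after, peak_rules)
```

Every packet follows either the old or the new path in full, which is the PPC property, while `peak_rules` shows the doubling of table entries that makes the approach memory hungry.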
Another technique, software transactional networking (STN), was introduced by Canini et al. [89] to resolve concurrency issues that arise from the concurrent execution of control applications. Every policy update is either fully implemented or does not affect any packet because STN relies on all-or-nothing principles. Consequently, this can lead to transitional delay. To specify actual policy composition principles, the authors deemed it necessary to extend Pyretic. In another effort, Canini et al. proposed a consistent policy composition (CPC) technique [68] for concurrent network policy updates. The method uses a replicated state machine, which offers a transactional interface to address the issue of conflicting policy updates. The method consistently matches the desired network behaviour from the operator's perspective. However, this is only possible if switch ports allow atomic read-modify-write operations. This prerequisite is the weak security spot of the technique. Schiff et al. [90] propose a synchronization framework for policy updates based on atomic transactions, implemented in-band on the DP devices. The technique achieves consistency in the event of controller failure via a publish/subscribe model. When a network-changing event occurs, the controller immediately publicizes it to its coresources, which update their local information using the standard OpenFlow protocol [91]. But the technique has not been integrated with any SDN controller.
Using a carefully calculated rule replacement sequence, Vissicchio et al. [92] investigate how network policies can be enacted on OpenFlow switches based on per-packet consistency (PPC). They use a greedy algorithm, GPIA, that updates the flow tables of switches in polynomial time to discover a sequence of rule replacements per switch that does not violate PPC. The method has zero memory overhead and can be implemented in a heterogeneous network because, unlike a 2-phase commit, it does not tag packets. The authors nevertheless hybridize the 2-phase commit and rule replacement approaches to propose FLIP in [93]. As expected, the 2-phase commit in FLIP hampers its smooth usage in hybrid SDN, and its complexity is not polynomial. But FLIP is powerful, as it can handle a higher number of update incidences. In [94], McClurg et al. offer another method for preserving arbitrary network policies using rule replacement order. The paper models the update-consistency properties as linear temporal logic formulas, which can be used to automatically generate updates that continue to uphold the original properties.

Summarised Insights.
A distributed control plane updates the states of its controllers to maintain a consistent network view whenever network-changing events occur, such as the arrival of a new flow or a link or node failure. The update always aims to preserve one or more of the following aspects of the network state: (1) connectivity, (2) capacity, and (3) network policy. Violating any of them can cause black holes, forwarding loops, or congestion. Different consistency strategies have been put forward for different situations. Table 4 summarises these strategies and the consistency property each one optimises. These consistency properties are interdependent, i.e., every property must be preserved. Yet, none of the existing platforms attempts to consider all three properties at once; some only meet one or two.
Furthermore, based on these strategies, preserving connectivity to ensure loop freedom is the best understood of the three because even straightforward greedy techniques fare relatively well, as demonstrated in [66]. We also suppose the problem can be minimised through adaptive and hybrid idle-hard timeout allocation and flow eviction mechanisms similar to [36] or with the approach in [58]. However, the greedy approach might not be suitable for optimizing the makespan metric because the number of controller-to-switch interaction rounds might be higher. Obtaining a logarithmic-time algorithm to optimize the makespan might therefore require AI techniques such as deep learning, as applied by Poularakis et al. in [95-97]. On the other hand, capacity consistency for congestion freedom is a stronger requirement than connectivity consistency for loop freedom [66].
The 2-phase commit [60], dependency graph [81], and rule replacement are the most widely used methods to address congestion. Though [80] posits that splittable flow can be solved in polynomial time, the authors of [84] show that omitting the 2-phase commit can reduce complexity and overhead. Regarding network policy consistency, the two major techniques are packet tagging with a 2-phase commit, which avoids rule mismatch, and complete rule replacement. Each approach has its merits and demerits: the former gains algorithmic simplicity at the cost of switch memory consumption, while the latter deals with the problem head-on at the expense of high algorithmic complexity. So, considering the TCAM limitation of OvS, the former might not be a good approach [1].
Lastly, another insight is that most of the consistency methods used in these works do not address security, the synchronization rate, or when to synchronise. Moreover, none of the techniques took the network application into account to assess the impact of the synchronization rate on its performance. For instance, it has been shown that load balancers can cope with weak consistency and still deliver good performance.

Controller Interoperability/Heterogeneity Problem
4.1. Problem Description. Interoperability across distributed controllers in an SDN architecture requires an EWi, just as the NBi and SBi are required for communicating with the AP and DP. In contrast to the broad adoption of standardized SBi through initiatives like OpenFlow, there is no standard EW communication interface between the controllers in dCP [22]. Even so, the gap has not received the requisite attention from the research community. Lately, however, there have been some fruitful attempts [60,93-105], and the existing solutions vary across a range of performance metrics. Some approaches are mainly concerned with achieving state synchronization in networks where the dCP comprises homogeneous controllers and cannot coordinate the coexistence of heterogeneous controllers from different platforms [93,94]. In other words, they have only solved the interoperability issue for a vendor's unique SDN where all controllers are of the same type. Because their EWi is private and their data models differ, controllers from different vendors cannot interoperate to share the same network. However, this interoperability is crucial for the survival of the modern-day network for the foreseeable future as SDN increasingly proves its advantages. Fortunately, some EWi for dCP SDN networks that use heterogeneous controllers have recently been proposed [60,95-105]. However, studies on SDN's east-west interface are still preliminary, and there are no established industry standards yet. Controllers also differ widely in capacity. For instance, Ryu [49] could only process 6K requests per second, NOX [38] about 30K requests per second, Floodlight [98] about 250K requests per second, and Maestro [40] about 300K requests per second. Therefore, controllers are chosen to be included in a CP for effective resource utilization and cost optimization based on their capacity relative to the DP 
devices they coordinate. In addition, modern networks are extremely complex, demanding the use of specialist services such as advanced VPN, deep packet inspection, firewalling, and intrusion detection. As this list grows, the need for techniques to implement new network regulations increases. While many controller systems support several services, not all of them do. It is also quite unlikely that a single controller vendor offers the performance gold standard for all services. Consequently, network operators may be forced to choose between not providing a service and making an expensive, disruptive, or unworkable migration to a different controller platform. Another motivation is security, namely avoiding homogeneous controller common-mode faults. Threats to network security can arise in an SDN composed of multiple homogeneous controllers. Because the controllers are homogeneous, their designs share the same underlying mechanisms, meaning that any vulnerability in one controller is present in all the other identical controllers. Consequently, if attackers could exploit one of these vulnerabilities, they would be able to execute malicious attacks, such as message leaks or DDoS, on the remaining controllers to bring the whole network down. This might have devastating consequences for the entire network. This phenomenon is called the "homogeneous controller common-mode fault" (HC-CMF) [10]. Consider an SDN in which all the controllers are homogeneous (e.g., Floodlight), each managing its own SDN subnet. By exploiting CVE-2014-2304, a known flaw in an OpenFlow protocol for the Floodlight 0.9 version, an attacker can easily crash the entire system after realizing that Floodlight controllers are used in all the other subnets. Using the same vulnerability, the attacker can just as easily eliminate any number of backup Floodlight controllers located on the control plane. However, the network might be 
insulated from this threat if the deployed controllers are heterogeneous because, in practice, different controllers are susceptible to different vulnerabilities. This is partly because heterogeneous controllers are independent of each other and have no dependencies at runtime. The same vulnerability hardly ever occurs among controllers written in different programming languages. For instance, the NOX [38], Ryu [49], and Floodlight [98] controllers are programmed in C++, Python, and Java, respectively. This makes the trigger mechanisms of their vulnerabilities different. Thus, a heterogeneous dCP is considered one of the best defences against the "homogeneous controller common-mode fault" (HC-CMF). Because it is almost impossible for an adversary to make a heterogeneous CP exhibit abnormal behaviour by exploiting a vulnerability of only one of its controllers, this is considered one of the motivations for designing dCP with heterogeneous controllers.

EWi for Homogeneous Control Plane.
In 2012, SDNi [12] was introduced by Huawei as a message exchange protocol for multidomain SDNs. It provides a mechanism for controllers to synchronise their data over the connection between them. ODL employs SDNi to support cross-domain communication among its controllers. Benamrane et al. [99] proposed a communication interface for distributed control plane (CIDC) to synchronise the controllers' information via the EWi. The technique allows network managers to customize the controller's function by selecting from the following three communication modes: (1) notification on (events or mode), (2) service (e.g., load balancing, security, etc.), and (3) full (both mode and event). The effectiveness of CIDC is tested in simulated network configurations that include firewall and load balancer services. But CIDC is only used for interoperability between dCPs that use Floodlight and ODL. Adedokun and Adekale [100] assert that they improved the original CIDC system to create mCIDC. Nonetheless, mCIDC's architecture lacks a clear indication of where the modification is located.

EWi for Heterogeneous Control Plane.
One of the earliest efforts to facilitate sharing network views between different domains in a multidomain network was initiated by Lin et al. [101,102]. The work proposed a high-performance mechanism known as east-west bridge (EWBridge), which uses JSON to support controllers from different platforms. The method specifies which pieces of information should be shared and how. In addition, EWBridge provides a method for protecting network privacy by hiding the underlying hardware behind a virtual network. EWBridge has been utilized in CERNET, Internet2, and CSTN (CSTNET). Another effort came a year later from Dixit et al. in FlowBricks [103]. FlowBricks is also a framework for composing heterogeneous controllers on the same CP. The framework integrates the services implemented on the heterogeneous controllers into the same network traffic. This is motivated by the fact that no single controller can provide the best-in-class implementation of all desired network services such as VPN and firewall. FlowBricks is implemented as a module in Floodlight. Yu et al. [104] also introduced the Zebra architecture. Zebra is divided into the HCM module for managing heterogeneous controllers and the DRM module for managing domain relationships. Zebra recorded a remarkable improvement in request completion time and scalability-related metrics like throughput and latency. The proposal's authors were optimistic that it would stimulate the interest of relevant stakeholders, leading to further fruitful research.
Meanwhile, Qi and Li [105] point out that heterogeneous controllers in a large network are very challenging to coordinate because their APIs are not unified. Thus, they propose a controller management system that generates a global network view with unified APIs for both AP and DP in a way that shields the heterogeneity of the controllers. The method comprises four modules: heterogeneous controller management (HCM), domain relationships management (DRM), a database, and a front end. They used a fat-tree-based DCN topology split into three domains running Ryu, Floodlight, and POX controllers, respectively.
Another recent effort is Yu et al.'s [106] proposal of WECAN. WECAN is designed to act as an EWi for controllers in SDN dCP. It consists of three parts: (1) a controller  [107] propose DSF, an adaptive framework for EWi design in heterogeneous dmCP that synchronises topologies using a standardized communication protocol. DSF has been tried out with both the ONOS and Floodlight controllers. However, although the authors claim that DSF can run on different platforms, the current version of DSF only works on Java-based control platforms.
The authors of MNOS [108] and Mcad-SA [109] explore a heterogeneous SDN control plane that can deal with security challenges. They developed a dCP to counteract hijacking and modification attempts. In particular, cyberspace mimic defence (CMD) is the core concept underpinning MNOS. By incorporating CMD into the design of SDN controllers, they could produce an N-variant controller framework with dynamic, heterogeneous, and redundant properties. CMD protects the controllers against backdoor or modification attacks, but not against attacks such as DDoS or known OpenFlow vulnerabilities. In addition, MNOS is limited by the variety of controllers used. Recently, Yi et al. [10] were inspired by these works to formulate a security-aware heterogeneous CP (SQHCP) that optimizes delay, resource utilization, and failure rate, addressing the common-mode fault (CMF) of SDNs with homogeneous controllers. The technique involves two steps: step one determines the number of heterogeneous controllers using a knapsack approach based on dynamic programming to ensure tight control plane security.
Step two partitions the network using the K-means clustering algorithm. The inclusion of heterogeneity in [10] is security motivated: because of the CMF of a homogeneous CP, attackers familiar with one controller's vulnerabilities can bring down the entire network. The authors remove this threat by using heterogeneous controllers such as NOX, Ryu, Floodlight, and ONOS in the network.
Similarly, Hoang et al. [110] have recently proposed SINA, an EWi that guarantees the interoperability of a decentralised and heterogeneous SDN system. A unique consistency algorithm for an adaptive quorum-based replication mechanism is also provided. The former shows that SINA's consistency strategy, active replication based on broadcasting, is valid. The latter evaluates SINA's Q-learning-based quorum-based replication strategy. SINA achieves better read and write latency and overhead than other well-known interfaces. Despite its excellent consistency, active replication overuses system resources (bandwidth, processing capacity, etc.), which is one of its downsides. In similar work, Moeyersons et al. [111] proposed "Domino," a pluggable framework for managing heterogeneous SDN. Domino incorporates a microservice architecture that allows users to integrate multiple SDN controllers.

Summarised Insight. All the techniques cited in Table 5 offer EWi solutions for communication in dCP. Among all the proposed techniques, only [10,108,109] addressed security issues related to hijacking, flow rule modification attacks, and CMF. Furthermore, none of them provides protection against attacks such as DDoS or known OpenFlow protocol vulnerabilities. SDNi is an IETF initiative and is already in use in production networks. However, as there is no generally accepted specification for the east-west interface, the growth of SDN is stymied in extremely large-scale network settings.

Problem Description.
Managing a large-scale network with a single controller is inefficient because of its lack of scalability and reliability; hence, CPP has emerged as a fundamental research subject in SDN. The idea was first conceived by Heller et al. [9]. For any given network, the problem is to determine (i) how many controllers must be employed in the network and (ii) where they should be positioned so that the average and maximum latency between controller and switch are reduced. The problem has been modelled over the years to consider multiple factors, including reliability, load balancing, cost, and security. Accordingly, we highlight the CPP's mathematical modelling and problem formulation, solution approaches, and design objectives in Sections 5.2-5.4.
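The two latency objectives named above are commonly written as follows. This is a sketch in the spirit of the formulation of Heller et al. [9]; the symbols P (the chosen set of controller locations), S (the set of switches), and d(s, c) (the shortest-path latency from switch s to location c) are assumptions introduced here for illustration and are not the notation of Table 6.

```latex
% Average-case latency of a placement P
L_{\mathrm{avg}}(P) = \frac{1}{|S|} \sum_{s \in S} \min_{c \in P} d(s, c)

% Worst-case latency of a placement P
L_{\mathrm{wc}}(P) = \max_{s \in S} \min_{c \in P} d(s, c)
```

Minimizing the first expression corresponds to a k-median problem and minimizing the second to a k-center problem, which is one way to see why CPP is related to facility location.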

Mathematical Modelling and Problem Formulation Approaches. CPP is an NP-hard problem similar to the well-known facility location problem (FLP) [129]. With FLP, the controllers are treated as facilities, while the switches are treated as demand points. However, other mathematical problems such as Knapsack, Vertex Cover (VC), and Set Cover [130] are also reducible to CPP [131]. Furthermore, techniques such as the field matching problem [132] are used in [133,134] and the dominating set problem [135] in [136].
To formulate CPP, the network being considered is modelled as a graph G = (V, E, S), with V, S, and E representing the controllers, switches, and links, respectively. Let n = |V| be the number of controllers and k = |S| the number of switches; the solution of the CPP is to find the number n and the mapping V ⟶ S of controllers to switches in the network. Every CPP solution aims to improve network efficiency as measured by QoS KPIs such as latency, throughput, overhead, loss, and response time. Accordingly, the goal of any given CPP method is to optimize one or a combination of these metrics subject to several constraints. Table 6 summarises most of the widely used mathematical symbols in CPP formulation.
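The formulation above can be made concrete with a minimal sketch that enumerates every placement of k controllers on a small graph and keeps the one with the lowest average latency. It assumes, for illustration only, that every node hosts a switch and is also a candidate controller site, and that link weights are latencies; the function names and the 5-node line topology are hypothetical.

```python
from itertools import combinations

INF = float("inf")


def all_pairs_shortest_paths(n, links):
    """Floyd-Warshall on an undirected weighted graph with nodes 0..n-1."""
    dist = [[0 if i == j else INF for j in range(n)] for i in range(n)]
    for u, v, w in links:
        dist[u][v] = min(dist[u][v], w)
        dist[v][u] = min(dist[v][u], w)
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist


def placement_latencies(dist, placement):
    """Return (average, worst-case) latency from each node to its nearest controller."""
    nearest = [min(row[c] for c in placement) for row in dist]
    return sum(nearest) / len(nearest), max(nearest)


def brute_force_cpp(dist, k):
    """Try every k-subset of nodes; keep the lowest (average, worst-case) pair."""
    best = None
    for placement in combinations(range(len(dist)), k):
        avg, worst = placement_latencies(dist, placement)
        if best is None or (avg, worst) < best[:2]:
            best = (avg, worst, placement)
    return best


# Hypothetical 5-node line topology with unit link latencies.
links = [(0, 1, 1), (1, 2, 1), (2, 3, 1), (3, 4, 1)]
dist = all_pairs_shortest_paths(5, links)
one_controller = brute_force_cpp(dist, 1)
two_controllers = brute_force_cpp(dist, 2)
print(one_controller, two_controllers)
```

Exhaustive enumeration is only practical for small instances, since the number of candidate placements grows combinatorially with the network size; this is the usual motivation for the heuristic approaches discussed in this survey.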

Solution Approaches to Controller Placement Problem.
After the problem is formulated, many different solution-seeking approaches can be applied to solve it. As shown in Figure 7, optimal solutions are sought with optimization approaches such as linear programming (LP), integer linear programming (ILP), and binary integer programming (BIP) [137]. For suboptimal solutions, heuristic and meta-heuristic algorithms are usually used on expanded data sets. These include simulated annealing (SA) and evolutionary algorithms (EA) such as particle swarm optimization (PSO) and its variant NCPSO, the genetic algorithm (GA) and its variant NSGA-II, KnEA, Firefly, and Manta Ray Foraging algorithms. Similarly, random and greedy methods, game theory (such as Nash and nonzero-sum games), and machine learning techniques are also applied [17].
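To illustrate why heuristic results are described as suboptimal, the following minimal sketch implements one simple greedy heuristic: it adds, one at a time, the controller site that most reduces total latency. This is an illustrative example and not the algorithm of any specific cited work; the distance matrix and function names are hypothetical.

```python
def total_latency(dist, placement):
    """Sum of each node's latency to its nearest controller in the placement."""
    return sum(min(row[c] for c in placement) for row in dist)


def greedy_placement(dist, k):
    """Add one controller at a time, each time picking the best single addition."""
    placement = []
    for _ in range(k):
        remaining = [c for c in range(len(dist)) if c not in placement]
        best = min(remaining, key=lambda c: total_latency(dist, placement + [c]))
        placement.append(best)
    return placement


# Hypothetical line of 5 nodes with unit link latencies: dist[i][j] = |i - j|.
dist = [[abs(i - j) for j in range(5)] for i in range(5)]
greedy = greedy_placement(dist, 2)
greedy_cost = total_latency(dist, greedy)
optimal_cost = total_latency(dist, [1, 3])  # best pair, found by inspection
print(greedy, greedy_cost, optimal_cost)
```

On this instance the greedy heuristic commits to the centre node first and ends with a total latency of 4, whereas the pair (1, 3) achieves 3. The greedy search is fast but cannot undo an early choice, which is the trade-off that motivates meta-heuristics such as SA.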

Controller Placement Problem Design Objectives.
Figure 8 illustrates the performance metrics for the CPP.
One of these metrics is controller-to-switch latency. Latency has several components, such as propagation, packet transmission, switch processing, controller processing, and controller queuing latency [138]. Various CPP methods have attempted to minimize the effects of one or more of these delays. Reliability is another important metric that has been studied extensively. In an SDN, failure might originate from either the DP or the CP [139]. In the former, failure comes from the forwarding devices or links; in the latter, failure arises from software or hardware. Some existing works that proposed a resilient dmCP designed their techniques to guarantee control path redundancy via multiple control message paths. Others elect to shorten the length of the control path, while others deploy multiple controllers, as shown in Figure 9. These proposals are reviewed in Section 5.4.1 and summarised in Table 7. Next is load balance, measured by synchronization, statistics collection, and flow setup overhead. Controllers' maximum and average loads are also measured. CPP with load-balancing efforts is summarised in Section 5.4.2 and Table 8. The capacity and model of the controllers used also affect CPP design. This study classified the CPP as capacitated, uncapacitated, homogeneous, or heterogeneous [58]. Section 5.4.3 reviews CPP when deployment and application environments are considered.
In addition to DCN and WAN, SDN is widely used in emerging technologies such as WSN, VANET, and IoT. Section 5.4.4 reviews these proposals, and Table 9 compares them. Similarly, numerous CPP solutions have incorporated a cost component. According to [13], capital expenditure (CAPEX) and operating expenditure (OPEX) are the two main ways in which cost is viewed in CPP. CAPEX refers to the funding allotted for hardware components. It usually depends on their quantity and on features such as processing power and number of ports, while OPEX addresses energy costs.
The end-to-end transmission line for the exchange of control messages consists of one or more switches at the DP, a controller, and a link. A loss of packets in the exchange may occur due to congestion, a faulty component (controller, switch, or link), or a security incident. In any of these cases, the loss of control packets will severely degrade the network's behaviour. Several research efforts mitigate this problem by proposing a resilient dmCP [140-143]. As shown in Figure 9, some of these efforts achieve resiliency in dmCP by guaranteeing redundancy in the control path via multiple paths. Others shorten the control path by using fewer nodes along it, which reduces the number of probable failure points, while others seek resilience in the event of failure by connecting switches to multiple controllers [14]. Table 7 provides a summary comparison of the techniques reviewed.
Hu et al. [140] introduced a novel reliability metric to formulate a fault-tolerant CPP (FTCPP) as an ILP. The objective function aims to maximize the expected percentage of valid control paths, i.e., the logical links between the DP and the controllers. The authors proposed three techniques to solve the problem: random placement, brute force, and the l-w-greedy algorithm. They used networks from the ITZ repository to validate the methods. However, the proposed technique is not benchmarked against any previous study, perhaps because it is one of the pioneering works on CPP with reliability.
For this reason, its performance could not be substantiated at the time, but it can be now that different approaches are available. Guo and Bhattacharya [142] adopt a partition approach to achieve the triplet of reliability, scalability, and security in SDN. By considering cascading failures in interdependent networks, they proposed a network partition technique for controller placement. Simulation results show that the expected network failure tendency is inversely proportional to the average path length of the network. However, the technique was not evaluated on a real topology; instead, the authors used the igraph library to generate three synthetic networks with ring, binary tree, and Erdős-Rényi random topologies for the evaluation. The approach in [143] used MIP to design a capacity-aware CP technique. The authors use two strategies to prevent network outages, with smooth failover plans in the event of one. In the first strategy, a node is assumed to be joined to the controller via two separate control paths, while in the second strategy, a node is assumed to be linked to multiple controller replicas via two separate paths. The effect of different topology sizes and numbers of controllers on the average path length is investigated using networks from the SNDlib database.
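The reliability objective used for FTCPP in [140], the expected percentage of valid control paths, can be illustrated with a minimal sketch. It rests on simplifying assumptions that are not taken from [140]: links fail independently with known probabilities, and a control path is valid only if all of its links are up. The link names, probabilities, and function names are hypothetical.

```python
from math import prod


def path_survival_probability(path_links, failure_prob):
    """Probability that every link on the path is up (independent failures assumed)."""
    return prod(1.0 - failure_prob[link] for link in path_links)


def expected_valid_path_percentage(control_paths, failure_prob):
    """Average survival probability over all control paths, as a percentage."""
    probs = [path_survival_probability(links, failure_prob)
             for links in control_paths.values()]
    return 100.0 * sum(probs) / len(probs)


# Hypothetical per-link failure probabilities and switch-to-controller paths.
failure_prob = {"a": 0.1, "b": 0.2, "c": 0.0}
control_paths = {"s1": ["a"], "s2": ["a", "b"], "s3": ["c"]}
percentage = expected_valid_path_percentage(control_paths, failure_prob)
print(round(percentage, 2))
```

Under this model, longer control paths multiply more survival terms and therefore lower the metric, which is consistent with the observation that shortening control paths reduces the number of probable failure points.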
The work in [144] proposed two optimization models to improve reliability in CP. In the first model, the authors formulate a CPP under comprehensive network states (CPCNS), while in the second model, they formulate the problem for a single-link failure (CPSLF). They then proposed an optimal CP algorithm and a greedy algorithm to solve the two problems. Their motivation for adopting this approach was to address single-link failures, which are reported to account for 70% of network failures.
Another study [145] used a greedy algorithm to design a technique for finding multiple control paths for exchanging control messages in SDN. They used a clustering-based global optimization to find the shortest path among them. For average reliability and minimum computational complexity, a reliability factor is defined. The authors in [146] adopt the strategy of network partitioning to design a distributed and reliable CP. Depending on the subnet size, a formula for calculating the reliability of each network domain is proposed. Similarly, controllers are distributed according to the load of each subnet. Hence, the authors considered the packet loss rate and the node degree for the assignment. They also designated a coordinator to detect any nonactive node, so that an appropriate controller can be relocated to take charge of the failed subnet. The coordinator considers the calculated reliability of the failed subnet and its distance relative to the new controller before the reallocation. However, although all these techniques may have succeeded in minimizing the length of the control path to reduce the possible failure points, the shortest path selected may not be the best path in terms of other QoS metrics like bandwidth. Although control messages may not have a high bandwidth demand, if there are many requests, the shortest path with limited bandwidth may be overloaded, causing a high response time and subsequent failure.
In contrast to minimizing the control path length to ensure CP reliability, other techniques assign switches to multiple controllers for redundancy [112,147-154]. For instance, Tanha et al. [147] proposed a technique that maps each switch on the DP to one primary controller and multiple levels of backup controllers. Sridharan et al. [148] designed an algorithm for mapping switches to multiple controllers in a distributed controller architecture. The algorithm distributes flow setup requests among the multiple controllers to minimize the controllers' response time and satisfy the elastic constraints. The outcome reveals that increasing the network budget increases the network's resilience level. In [149], switches are assigned to a primary controller and one or more backup controllers to guard against controller failures. Killi and Rao [150] propose a technique for controller placement that minimises worst-case latency in the event of controller failures. The authors assumed that switches have a failure-foresight ability, meaning that the switches are aware of the current state of the controllers. The authors employ LP to mathematically formulate a reliable and capacity-aware CPP that can withstand the failure of up to (k − 1) controllers. The technique is evaluated using networks from ITZ. However, the authors did not consider the average switch-to-controller and controller-to-controller latency. Therefore, in [151], Killi and Rao proposed another mathematical model that investigates the worst-case latency in the event of single-link failure. They formulate a reliability-aware CPP as an ILP to determine the expected percentage of control path loss. The authors develop a greedy algorithm to solve the reliability problem. The aim is to answer the question of how many controllers are enough to maximize reliability when they are carefully placed. However, the greedy approach cannot guarantee an optimal solution to the problem. Thus, a different problem variant is
formulated in [152] as the capacitated next controller placement (CNCP) strategy. In CNCP, the failure-foresight assumption of switches made in [150] is relaxed. The authors also assign multiple controllers to each switch to ensure redundancy in the event of controller failure. The problem is expressed as a MILP, and a simulated annealing algorithm is designed to solve it heuristically. The technique is also tested on ITZ networks. Perrot and Reynaud [112] propose another resilient controller placement strategy in which switches are assigned to 1+ controllers for redundancy against possible controller failures. The problem is formulated as an ILP, with the authors assuming that every controller has the same failure probability. Reference [153] proposed a delay-guaranteed, capacity-aware CP for SD-WAN. The authors consider controller-to-controller node failure to design a resilient and capacity-aware CP (RCCP) using a master-slave (M/S) model that can restore network operation in the event of failure. They deployed the technique on networks from ITZ for validation. However, the technique lacks inclusiveness and flexibility in adapting to delay-sensitive applications and their service level agreements (SLAs). Conversely, the work in [155] proposed a method for dealing with CP failure recovery using a backup controller selection algorithm that minimizes average failure recovery time without the load oscillation caused by switch migration. To design the proposed technique, the authors consider QoS requirements as well as controller activation costs. They assume that flow requests have the same processing time. However, this assumption is unlikely to hold because of flow variability and different controller capacities. In another technique, Hu et al.
[154] proposed a solution, arguing that the inappropriate selection of slave controllers to handle the additional load of switches whose master controller is down can lead to what they described as "controller chain failure." To avoid it, they designed an adaptive slave controller assignment (ASCA) technique. ASCA has three modules, namely, the slave selection module, the slave assignment module, and the adaptive adjustment module. The problem is formulated as an ILP with the objective of minimizing the load variance difference (LVD), since a lower LVD yields better fault tolerance and load-balancing performance. They used KnEA to design a heuristic to solve the problem. The solution has been shown to be effective in avoiding the chain failure of controllers. However, the migration of the affected switches to the slave controller does not take into account the type of flow coming out of these switches, which can lead to QoS violations. The work also lacks a clear explanation of how the load distribution affects the queuing delay, and it does not show the differences in controllers' utilization due to load differences. Lastly, Bannour et al.
[156] proposed novel two-stage context-based techniques that cover load imbalance and latency metrics to optimize response time in the event of failure. In stage I, they created an information-gathering mechanism to collect and transmit topology information to the placement algorithm in stage II. The information is used by the cluster leader election scheme to select a leader. The followers send their cluster's neighbourhood latency status to their respective leaders to be used for synchronization and global topology building. Among the leaders, one is designated as the "hyper leader," responsible for building the global logical topology and running the Dijkstra algorithm. In stage II, partitioning around medoids (PAM-B, aka the "k-Medoid method") is used to partition switches into K clusters of controllers, whose quality is gauged by the average dissimilarity of all nodes to their nearest medoid. Then, NSGA-II is used to optimize the solution.
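The medoid-based partitioning step can be sketched as follows. This is a generic steepest-descent swap search in the style of PAM, not the PAM-B variant of [156], whose details are not given here; the distance matrix, initial medoids, and function names are hypothetical.

```python
def clustering_cost(dist, medoids):
    """Total dissimilarity of every node to its nearest medoid."""
    return sum(min(dist[i][m] for m in medoids) for i in range(len(dist)))


def pam(dist, initial_medoids):
    """Repeatedly apply the single medoid/non-medoid swap that lowers the
    cost the most, and stop when no swap improves the clustering."""
    medoids = list(initial_medoids)
    cost = clustering_cost(dist, medoids)
    while True:
        best_cost, best_trial = cost, None
        for pos in range(len(medoids)):
            for candidate in range(len(dist)):
                if candidate in medoids:
                    continue
                trial = medoids[:pos] + [candidate] + medoids[pos + 1:]
                trial_cost = clustering_cost(dist, trial)
                if trial_cost < best_cost:
                    best_cost, best_trial = trial_cost, trial
        if best_trial is None:
            return medoids, cost
        medoids, cost = best_trial, best_cost


# Hypothetical switches on a line, forming two natural groups.
positions = [0, 1, 2, 10, 11, 12]
dist = [[abs(a - b) for b in positions] for a in positions]
medoids, cost = pam(dist, [0, 1])
print(sorted(medoids), cost)
```

Dividing the returned cost by the number of nodes gives the average dissimilarity used as the quality measure. Unlike K-means, the medoids are always actual nodes, so each cluster centre is a location where a controller could be installed.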
One drawback of these methods is their deployment and maintenance costs. Deployment costs include the purchase price and ongoing operating costs of network devices such as controllers and switches. This consists of the cost of purchasing and installing the controllers in the network and the cost of connecting the controllers to the switches. In addition, energy costs are an operational expense.
Therefore, as an alternative to deploying multiple controllers, some techniques choose to deploy multiple control paths instead. The multiple paths guarantee at least two disjoint paths connecting the DP and the CP, which protect the control path against single link and node failures by switching to an alternate path. One example of this approach is proposed by Muller et al. [141], where a proactive capacity-aware CPP solution called Survivor enhances CP resilience to failure and improves recovery. Survivor uses the path diversity approach to ensure the redundancy of transmission paths between the CP and the DP. This way, an auxiliary connection between the two planes is usually available. The redundancy reduces connection loss by 66%. Survivor also adds a mechanism that periodically checks each controller's load against its capacity to avoid overload.
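Whether a switch has at least two disjoint control paths to its controller can be checked with a unit-capacity maximum-flow computation. The sketch below is illustrative and is not the method used by Survivor [141]. It counts edge-disjoint paths only, so it covers single-link failures but not node failures, which would require a node-disjoint variant. The topologies and function name are hypothetical.

```python
from collections import defaultdict, deque


def count_edge_disjoint_paths(links, source, target):
    """Count edge-disjoint paths in an undirected graph by augmenting
    unit-capacity flows along BFS paths (Edmonds-Karp style)."""
    if source == target:
        raise ValueError("source and target must differ")
    capacity = defaultdict(int)
    neighbours = defaultdict(set)
    for u, v in links:
        capacity[(u, v)] += 1
        capacity[(v, u)] += 1
        neighbours[u].add(v)
        neighbours[v].add(u)
    flow = 0
    while True:
        # Breadth-first search for a path with residual capacity.
        parent = {source: None}
        queue = deque([source])
        while queue and target not in parent:
            u = queue.popleft()
            for v in neighbours[u]:
                if v not in parent and capacity[(u, v)] > 0:
                    parent[v] = u
                    queue.append(v)
        if target not in parent:
            return flow
        # Push one unit of flow back along the path that was found.
        v = target
        while parent[v] is not None:
            u = parent[v]
            capacity[(u, v)] -= 1
            capacity[(v, u)] += 1
            v = u
        flow += 1


# Hypothetical topologies: switch at node 0, controller at node 2.
ring = [(0, 1), (1, 2), (2, 3), (3, 0)]
line = [(0, 1), (1, 2)]
ring_paths = count_edge_disjoint_paths(ring, 0, 2)
line_paths = count_edge_disjoint_paths(line, 0, 2)
print(ring_paths, line_paths)
```

A placement would satisfy the two-path requirement for a switch only when the count is at least 2, as in the ring; in the line topology, a single link failure disconnects the switch from its controller.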
The work in [157] proposes a novel, reliable controller deployment mechanism using a K-critical technique. The authors aim to construct a robust control layer that considers network characteristics such as interference while selecting appropriate controllers. The paper first shows how and why choosing only the shortest control path is an inefficient way of improving control layer load and robustness. Similarly, Zhong et al. [131] defined two metrics for checking control path reliability by looking at how many switches may lose connection with their controllers when a single-link failure occurs. They then formulate a problem that finds the minimum coverage area of a controller's neighbourhood in the network. Furthermore, the technique keeps a list of backup links in case a link fails due to unforeseen circumstances. The goal is to increase dependability and simultaneously reduce the number of controllers required. In addition, the study proposes a heuristic based on particle swarm optimization that begins with all switches as controllers and then generates a nearly optimal, practicable solution. However, none of [131,141,157] accounts for the switch-to-controller delay, intercontroller latency, or controller load.
(1) Summarised Insight. All the techniques that minimize the length of the control path may have succeeded in reducing the possible failure points, but the shortest path selected may not be the best path in terms of other QoS metrics like bandwidth. Although control messages may not have a high bandwidth demand, if there are many requests, the shortest path with limited bandwidth may be overloaded, causing a high response time and subsequent failure. Therefore, it is important to minimize the number of potential failure points while also optimizing the QoS parameters, which requires taking other QoS metrics into account in relation to the total number of flow requests. As for the multiple controllers or multiple control path approaches, one drawback is the significant initial investment and ongoing maintenance costs. Deployment costs include the purchase price and ongoing operating costs of network devices such as controllers and switches. This includes the cost of purchasing and installing the controllers in the network and connecting the controllers to the switches. Energy costs are operational expenses.

5.4.2. Load Balance Aware CPP. The task of handling flow requests at the CP has made it a source of performance bottlenecks [159]. As a result, various works proposed CPP solutions to balance load across the CP. There are two approaches to achieving this objective: the controller clustering approach (CCA) and the switch migration approach (SMA), discussed in Sections 5.4.2.1 and 5.4.2.2, respectively. Table 8 shows the summary comparison of each technique proposed under this category.
(1) Controller Clustering Approach (CCA). The controller clustering approach (CCA) can be considered a proactive approach to dCP load balancing in SDN. In CCA, one controller is designated as the super controller while the others are subordinates. The super controller coordinates the functions of the subordinates and balances the load across them centrally.
BalanceFlow [113] is a typical CP load-balancing scheme using CCA. It is based on a hierarchical deployment of controllers in which one of them adopts a super role and reallocates flow setup requests to the others in the event of traffic changes. It uses the multicontroller feature of OpenFlow 1.2 to regulate the process. The controllers maintain and synchronise their load information periodically via an EWi. The method has the advantage of flexible tuning of flow requests by each controller without introducing extra latencies. However, it introduces additional overhead on the CP. The approach in [114] formulated CPP as an ILP to regulate the number and location of controllers and their assignment to switches under dynamic traffic conditions. The objective function of the ILP seeks to optimize the weighted sum of statistics collection, synchronization, flow setup, and reassignment costs. The authors design a scheme comprising three modules: monitoring, reassignment, and provisioning. They proposed two algorithms: a greedy algorithm based on the knapsack problem and a meta-heuristic based on SA. Although the simulated annealing method takes more time, it proved more effective than the greedy strategy. In [115], the authors propose an innovative framework known as MDCP with the objective of minimizing overhead. The authors formulate the problem as a measurement-aware CPP, which considers synchronization and communication costs. MDCP is designed to be application-agnostic, cost-effective, and lightweight. To avoid computational complexity, a discrete approximation algorithm and a connectivity ranking algorithm are developed to obtain the desired placements. Experiments were carried out on 240 network topologies to validate the technique. The results reveal a 40% reduction in CP overhead. Furthermore, Selvi et al.
in [116] propose a cooperative load balancing scheme for hierarchical controller deployment (COLBAS) similar to [113]. COLBAS is a low-cost greedy algorithm in which controllers report their load regularly and coordinate with one another to achieve LB. However, the COLBAS strategy is centred only on implementing LB for distributed controllers, without any security considerations for the distributed controller architecture. In addition, the algorithms incorporated into these designs may not be precise enough in collecting the controllers' load. Periodic collection of controllers' loads may also result in resource waste. In the static approach of [117], the authors attempted to minimize the burden on switches using the concept of stress centrality to specify the weight of each node based on the number of edge-disjoint paths. On this basis, the authors proposed a controller placement algorithm to alleviate the burden. The algorithm runs in polynomial time to compute the appropriate placement position of the controller. Network topologies from ITZ are used for validation on a simulator running the Floyd-Warshall algorithm. However, one major limitation of the suggested technique is that it can only be used for intracluster controller placement or single-controller networks. Similar to BalanceFlow and COLBAS, Sufiev and Haddad [118] proposed a cluster vector (CV) approach to achieve load balancing in the CP. The authors simplify the load-balancing operation by defining a self-label CV, which contains the addresses of controllers in the same cluster. The approach breaks the dependency of slave controllers on the super controller, as found in BalanceFlow [113], by designing high-level operations and low-level operations in the controllers. The CV is built into every controller so that a regular controller can discover the address of another regular controller in an inherently reliable way. In this method, regular controllers can poll other regular controllers to gather load data.
(2) Switch Migration Approach (SMA). The switch migration approach is a reactive way to restore load balance across controllers in dCP, as it only comes into play when a load imbalance occurs. SMA posits that whenever a controller is down, or the load of one controller exceeds its capacity, all or a portion of the load of that controller is transferred to another controller. This way, the controller's memory and CPU availability and its resilience are improved so that it responds swiftly to any request a switch might make [13].
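The reactive behaviour described above can be sketched as a simple loop that moves switches away from an overloaded controller. This is a minimal illustration under stated assumptions, not the algorithm of any cited scheme: loads are non-negative numbers, the migration target is always the other controller with the most spare capacity, and the controller names, switch names, and capacities are hypothetical.

```python
def controller_loads(assignment, switch_load, capacity):
    """Sum the load of the switches currently mapped to each controller."""
    loads = {controller: 0.0 for controller in capacity}
    for switch, controller in assignment.items():
        loads[controller] += switch_load[switch]
    return loads


def rebalance(assignment, switch_load, capacity):
    """Migrate switches until no controller is overloaded or nothing fits."""
    assignment = dict(assignment)  # work on a copy
    migrations = []
    while True:
        loads = controller_loads(assignment, switch_load, capacity)
        overloaded = [c for c in capacity if loads[c] > capacity[c]]
        if not overloaded:
            break
        source = max(overloaded, key=lambda c: loads[c] - capacity[c])
        # Target: the other controller with the most spare capacity.
        target = max((c for c in capacity if c != source),
                     key=lambda c: capacity[c] - loads[c], default=None)
        # Try the heaviest switches first so that fewer migrations are needed.
        movable = sorted((s for s, c in assignment.items() if c == source),
                         key=lambda s: switch_load[s], reverse=True)
        chosen = next((s for s in movable
                       if target is not None
                       and loads[target] + switch_load[s] <= capacity[target]),
                      None)
        if chosen is None:
            break  # no feasible migration; more capacity would be needed
        assignment[chosen] = target
        migrations.append((chosen, source, target))
    return assignment, migrations


capacity = {"c1": 100.0, "c2": 100.0}
switch_load = {"s1": 60.0, "s2": 50.0, "s3": 30.0, "s4": 20.0}
initial = {"s1": "c1", "s2": "c1", "s3": "c1", "s4": "c2"}
final, migrations = rebalance(initial, switch_load, capacity)
print(final, migrations)
```

In this example, controller c1 starts at a load of 140 against a capacity of 100, and a single migration of s1 leaves both controllers at 80. A migration is accepted only if the target stays within its capacity, which prevents the overload from simply moving to another controller.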
The elastic distributed controller (ElastiCon) proposed by Dixit et al. [52] is the pioneering load-balancing approach that uses SMA. ElastiCon is a dmCP designed to grow or shrink dynamically in response to traffic changes. It has three modules: the load measurement module (LMM), the load adaptation decision module (LADM), and the action module (AM). The LMM keeps track of the load of each controller. It collects the load information and sends it to the LADM, which then determines the load allocation among controllers. Based on the LADM's decision, the AM carries out a load-balancing sequence, such as identifying the immigrant switch. The switch is then migrated to the target controller to achieve the required load balance.
However, the proposed scheme does not provide clear information on the controller locations that can reduce CP latency, nor does it address the key challenge of a resilient architecture. The authors of [119] explore SM with network utility optimization for a scalable CP within the constraints of limited resources. They formulate the problem as a network utility maximization (NUM) problem and develop a distributed hopping algorithm (DHA) to address it. However, the scheme suffers from high migration costs and frequent load shifting. For this reason, a trade-off between migration costs and load balance rate is considered in [120] to design an SM decision-making (SMDM) scheme. The authors formulated the SM procedure as a bin-packing problem and applied a greedy algorithm to solve it.
Similarly, Wang et al. [121] developed an SMA based on a load informing strategy (LILB). Each controller actively and periodically synchronises its load information with the other controllers to detect load imbalance. The algorithm then traverses the load measurement, load informing, decision, and SM modules to restore load balance at the CP with maximum throughput and minimum load oscillation. Nevertheless, the algorithm may be susceptible to CMF [10] because it does not accommodate heterogeneous controllers. The authors in [122] combined an SMA with security features to protect important controllers against DDoS and eavesdropping via load relief, adjusting load differences among the controllers. This is a flexible SMA designed using a 3D-EMD algorithm. Nonetheless, the specific CMF [10] vulnerabilities remain unaddressed in the proposed strategy, as it does not support heterogeneous controllers in the system. Also, in [123], Tarai and Shailendra used SMA to tackle the challenges of poor resource utilization and wastage owing to load imbalance, as well as security, in deploying IoT devices for smart cities. They formulate the problem as an LP to obtain the initial device placement. Subsequently, they develop an SMA to minimize migration costs in the event of load imbalance. The limitation of this scheme is that the selection of immigrant switches and the destination controller does not adhere to traffic engineering principles. In contrast, the solution proposed by Sahoo et al.
[124] uses a multicriteria decision-making procedure called the "technique for order preference by similarity to an ideal solution" (TOPSIS) to select the underutilised target controller to which immigrant switches are reassigned in their load-balancing framework for the SDN control plane. A zero-sum game theory is used in [125] to choose the immigrant switch(es) from the overloaded controller(s) and a recipient controller for the migration. The recipient controllers take the role of players in the game, while the immigrant switches serve as commodities. Numerical results reveal that the technique can relieve controllers from loads beyond their capacities. However, despite its apparent speed, game theory is probably not well suited for use in a wide-area SDN. Similarly, solving the switch migration problem (SMP) either requires a long time to reach the optimal result (which may not be acceptable under a dynamic traffic distribution) or generates heuristic solutions with inadequate migration performance. For the case where multiple controllers are overloaded, instead of relieving their load one by one via independent SM executions, the authors of [126] proposed the SMCLBRT technique, which executes the operation for all the affected controllers at once to save time. To improve the selection of outmigration controllers, they consider response time delay in addition to the current load. However, because the strategy migrates switches from several controllers simultaneously, it is resource-intensive and might lead to congestion. In an idea closely similar to [126], Mahjoubi et al. [127], in their proposed LBFT technique, grouped all the immigrant switches after their identification and migrated them all at once to restore load balance. The scheme is effective in terms of failover recovery time and packet loss, but at the expense of RTT. However, like the other proposals, it does not consider traffic flow classification.
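The TOPSIS procedure mentioned above for choosing a target controller can be sketched in its standard textbook form. The criteria (CPU headroom, memory headroom, latency to the switch), the weights, and the candidate values below are hypothetical and are not taken from [124].

```python
from math import sqrt


def topsis(matrix, weights, is_benefit):
    """Return a closeness score in [0, 1] for each alternative (row).
    Higher scores are closer to the ideal solution."""
    n_criteria = len(weights)
    # Vector-normalise each criterion column, then apply the weights.
    norms = [sqrt(sum(row[j] ** 2 for row in matrix)) or 1.0
             for j in range(n_criteria)]
    weighted = [[weights[j] * row[j] / norms[j] for j in range(n_criteria)]
                for row in matrix]
    columns = list(zip(*weighted))
    # Ideal and anti-ideal points depend on whether a criterion is a benefit or a cost.
    ideal = [max(col) if is_benefit[j] else min(col)
             for j, col in enumerate(columns)]
    anti = [min(col) if is_benefit[j] else max(col)
            for j, col in enumerate(columns)]
    scores = []
    for row in weighted:
        d_ideal = sqrt(sum((v - b) ** 2 for v, b in zip(row, ideal)))
        d_anti = sqrt(sum((v - a) ** 2 for v, a in zip(row, anti)))
        total = d_ideal + d_anti
        scores.append(d_anti / total if total else 0.0)
    return scores


# Hypothetical candidates: [CPU headroom, memory headroom, latency in ms].
candidates = [
    [0.2, 0.3, 5.0],  # controller A
    [0.8, 0.7, 2.0],  # controller B
    [0.5, 0.5, 4.0],  # controller C
]
scores = topsis(candidates, weights=[0.4, 0.3, 0.3],
                is_benefit=[True, True, False])
best = scores.index(max(scores))
print(scores, best)
```

Controller B is best on every criterion, so it coincides with the ideal point and receives the top score, while controller A coincides with the anti-ideal point. In less clear-cut cases, the weights determine how headroom is traded against latency.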
One common shortcoming of the reviewed CPP solutions that balance load using a switch migration approach is their failure to accommodate heterogeneous controllers in their proposed designs. In other words, the solutions only support homogeneous controllers. However, as explained in Section 4.1.1, this exposes them to security vulnerabilities such as the "homogeneous controller common-mode fault" (HC-CMF) [10]. Any potential vulnerability in one controller is reflected in all identical controllers. If attackers could exploit one of these vulnerabilities, they would be able to bring the whole network down, with devastating consequences. However, the network might be insulated from this threat if the controllers deployed are heterogeneous.
Apart from these works, many other methods like [128,160-163] have presented various load balancing solutions at dCP employing a switch migration approach without necessarily resolving a CPP to get the initial CP to DP mapping.
(3) Summarised Insight. Based on this review, we may deduce some possible deficiencies in CCA: memory, CPU, and bandwidth limitations may reduce the performance of the centralized "super controller." First, the super controller collects load information periodically and routinely exchanges a large number of messages with other controllers, resulting in a decrease in system performance. Second, there is the possibility of a SPOF: if the super controller fails, the entire load-balancing technique fails, which compromises the availability of the distributed controllers. Third, each load-balancing operation necessitates two network transmissions: one for collecting load and one for issuing commands. In such a scenario, the aggregated load data may be outdated, and the command may lag behind the actual load status. Similarly, none of the methods considers that network entities have different processing needs based on characteristics such as flow table size, queue size, and flow request variability.
Furthermore, the inability to accommodate heterogeneous controllers is one crucial shortcoming, with security implications, shared by all these CPP solutions with load-balancing mechanisms, whether they use the CCA or the switch migration approach. This means that the solutions can only be used with homogeneous controllers. However, security flaws and vulnerabilities like the "homogeneous controller common-mode fault" (HC-CMF) are possible, as detailed in Section 4.1.1. HC-CMF posits that any security flaw in one controller is present in every other controller of the same model. A successful exploit of even a single vulnerability might compromise the entire network, with far-reaching effects. However, if the installed controllers are heterogeneous, the network may be protected from this threat. Moreover, the designs presented in these publications achieve load balancing across controllers by handing control of a switch over to another controller. However, they do not consider the possibility that each controller may have a unique routing strategy, so moving the switch directly could disrupt ongoing operations. Finally, flow classification is ignored when switches are reallocated, which can compromise the quality of service for some traffic types.

Deployment and Application Environment Aware CPP.
As summarised and compared in Table 9, several approaches have been proposed to address the CPP while considering deployment or application environment.
For example, the works in [164][165][166] proposed a framework for flow processing-aware CP that considers data flow processing and control applications. The framework aims to support SDN architecture deployment in DenseNets. The flexible flow processing-aware controller placement framework (FlexFCPF) places and flexibly reassigns controller devices to manage future wireless networks efficiently. They formulate the problem as a mixed integer quadratically constrained program (MIQCP), for which they design a greedy heuristic. The study in [185] advocates integrating the SDN concept in managing the emerging 5G technology because 5G involves huge carrier aggregation and dynamic bandwidth provisioning. Thus, optical integration with wireless at both the fronthaul and backhaul is imminent, and elastic optical networking is necessary at the core to optimize resource allocation and utilization centrally. Similarly, massive MIMO and digital beamforming mechanisms require a lot of computing power in 5G. In this vein, SDN will come in handy in providing agile and flexible distributed management of 5G from a centralized controller. As such, [167] joined the vision of integrating 5G with SDN. This involves separating the control and user DP functions of the evolved packet core (EPC), which gives rise to the serving packet gateway controller (S/PGW-C) and the packet data gateway-user (S/PGW-U). The authors focused on the placement problem of the serving gateway controller (SGW-C) in a 5G network. They made a trade-off in the problem formulation between minimizing the SGW relocation rate and balancing the traffic load among the underlying SGW-C virtual network functions (VNFs). They formulate the problem as an ILP optimization model and apply game theory, using the Nash bargaining game and threat point techniques, to obtain a fair (Pareto-optimal) solution.
Another work [174] proposed a hybrid hierarchical architecture of multiple SDN controllers to manage 5G networks. The architecture is designed as a federation of multiple subnetwork controllers, each focusing on a single subsection of the network but centrally coordinated by a hierarchically superior controller. It is made up of a global controller module, an area controller module, and user equipment (UE) with publisher/subscriber, routing, and topology modules. They integrate a data distribution service (DDS) into the publisher and subscriber module of the ODL controller at each control level, and they experiment with three different use case scenarios to check the functionality and performance of the architecture. Similarly, the authors of [183] also harvest the potential of SDN to enhance the management of 5G technology.
Another technique in [168] introduced two approaches for CPP formulation and controller-to-switch assignment in wireless and wired SDN environments. In the first approach, they investigate the controller's average response time when the connection between the controllers and switches is wired. In the second approach, they consider a per-link response time constraint using chance-constrained stochastic programming (CCSP) when the transmission links between the controllers and the switches are wireless.
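As a hedged illustration of what a per-link chance constraint can look like, the generic form below bounds the probability that the response time on a wireless controller-switch link exceeds a limit. The symbols $T_{ij}$ (random response time between switch $i$ and controller $j$), $\tau$ (response time limit), $\epsilon$ (tolerated violation probability), and $y_{ij}$ (binary assignment variable) are assumptions chosen for this sketch and are not taken from [168].

```latex
% Generic per-link chance constraint (illustrative, not the formulation in [168]):
% every assigned controller-switch link must meet the response time limit
% with probability at least 1 - epsilon.
\Pr\left\{ T_{ij} \le \tau \right\} \ge 1 - \epsilon
\qquad \forall (i, j) \ \text{such that} \ y_{ij} = 1
```

In the wired case the corresponding requirement would typically be stated on an average, for example a bound on the expected response time, rather than on a probability.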
Other approaches [168] study the CPP in cellular networks, factoring in the uncertainties of user mobility, and propose C3P2 and CPPA, respectively. These are static and dynamic joint stochastic CP and evolved node B (eNB) controller assignment methods that minimize the number of controllers required to manage the eNBs in the cellular network with respect to response time, probability, request rate, and user mobility. Other works [170,171] simulate a 6-queue system with Bernoulli arrival processes of different rates to investigate the optimality of their controller placement techniques in wireless networks with respect to throughput under delayed channel state information (CSI). They model the problem with static and dynamic controller positions to track how the delay in CSI affects the network's throughput. With this, they were able to characterize the variability of the throughput in different regions, which allows them to define network policies that best stabilize the system for all traffic dynamics.
In the wireless sensor network (WSN) environment, an approach in [172] applied SDN principles for energy-efficient resource allocation. First, they formulate an LP optimization problem to minimize the energy consumption of sensor nodes subject to quality-of-service constraints. Afterwards, they propose a software-defined centralized adaptive bandwidth and power allocation scheme (CABPA). Numerical analysis suggests a positive result for the scheme. In a VANET environment, the techniques proposed in [173] use SDN for load balancing. First, they investigated an optimal rebating strategy to balance latency and cost in VDVNs and formulated a mathematical model of the problem. Then, they proposed a two-stage game (IGA) that optimizes the rebating strategy to balance latency and cost, and they solved it with a metaheuristic genetic algorithm.
Through simulation, they show that the number of packets transmitted through cellular links positively correlates with the rebate ratio and the other parameters. Another approach [175] also demonstrates how SDN can be deployed in the management of IoT and WBAN applications. The authors emulate SDN functionalities on Mininet with a personal digital assistant (PDA) acting as the OpenFlow switch under the central control of the programmable SDN controller. They measure deployment complexity and network overhead and conclude that the approach is simple, reliable, and cost-effective. Building on the recommendation of [175], the authors of [176,177] propose an architecture, named SDWBAN, to support the application-specific requirements of WBAN. Unlike classical SDN, a novel HUBsFlow is designed to replace OpenFlow as the SBi protocol in the architecture. In another effort, the authors of [180,181] proposed an SDWBAN framework that allows centralized administrative control of incoming data traffic, giving the flexibility to differentiate sensitive data with deadline constraints from normal data that require only best effort in WBAN applications. In a similar health-related application, the authors of [25] applied SDN to optimize the routing of medical emergency packets in a WBAN application. Using a mesh topology, the SDN controller is tasked with defining the best forwarding node while considering propagation delay and intrabandwidth.
Recently, the authors of [182] conducted a proof-of-concept experiment on using multiple controllers for heterogeneous wireless networks. To that end, the authors of [184] proposed a novel technique to tackle the WCPP in a heterogeneous wireless network environment of Wi-Fi and 4G LTE-U. The method aims to improve the throughput, link failure rate, and transparency of the SBi in cases where hybrid providers coexist and can be chosen as link-layer technologies. The problem of determining the placement of LTE-U and Wi-Fi-based controllers is modelled as an optimization problem, and two heuristic algorithms are proposed to find its solutions.

Security Aware CPP.
Driven by the future of the Internet, the solution in [186] proposed a decentralised SDN framework (D-SDN) that supports both the physical and logical distribution of the CP. D-SDN incorporates identity-based cryptography (IBC), which requires a trusted third party (TTP) for secret key generation, into the definition of the hierarchy of controllers. The feature is compatible with Internet organizational and administrative structures, and it supports administrative decentralization and autonomy to enhance the integrated security feature. For proof of concept, the authors experiment with two use cases: network capacity sharing and public safety service. The mechanism in [187] presents a secure and reliable design of SDN using a cloud-based multiple CP. It is dynamic and uses isolated instance mapping of controller resources in a cloud using a Byzantine mechanism. The architecture aims to minimize the number of controllers that must be mapped to satisfy the security requirement of each switch. They model the problem as controller assignment in fault-tolerant SDN (CAFTS), subject to controller capacity and control message latency. The Byzantine mechanism offers architectural security features. Leveraging the visualization of controllers' replicas via a Byzantine fault-tolerance protocol, the authors proposed a cost-effective requirement-first assignment algorithm to solve the CAFTS. The experiments reveal that it significantly reduces the CAPEX of the CP. Zhou et al. [122] combined the SM technique with security features to protect significant controllers against DDoS, by relieving the load of overloaded controllers, and against eavesdropping, by adjusting the load differences among the controllers. They consider two types of adversarial tactics. The first is a reconnaissance attack in which the hacker tracks controller traffic, and the second is a saturation attack through IP spoofing. To mitigate these attacks, the authors proposed a flexible SM model designed using a 3D-EMD algorithm. However, the scheme incurred high computation time and lacked a flow classification module to give differential treatment to flows with QoS requirements. The approach in [123] tackled the CPP with security in a heterogeneous WAN. They addressed issues of delay and resource wastage due to load imbalance, fault tolerance, and insecurity in deploying IoT devices for the smart city via SM. They formulate the problem as an LP to obtain the initial controller placement. Later, they designed an optimization algorithm to minimize migration costs in the event of load imbalance as the network evolves. A consensus protocol is integrated to address malicious security issues. The limitation of this scheme is that the selection of immigrant switches and destination controllers did not adhere to traffic engineering principles: traffic flows were not classified according to characteristics such as type, variability, and QoS requirement. Defending against a spectrum sensing data falsification (SSDF) Byzantine attack on an SDN controller is extremely difficult. If successful, the adversaries acquire full control of all network devices and can behave arbitrarily to disrupt the network. Protection against such a threat requires a 3f + 1 mapping of a switch to controllers, which has the consequence of overload. In [149,188], the authors propose a novel primary-backup controller mapping and remapping approach in which a switch is mapped to only f + 1 primary and f failover controllers to cope with simultaneous Byzantine attacks. They formulate two separate minimization problems, primary and backup controller mapping (PBCM) and remapping (PBCR), as ILPs. They then design two heuristics, MINCON and MINRUS, to solve the problems for large networks. The performance study shows that the optimal mapping requires up to 50% fewer controllers than an existing scheme and that the heuristics perform within 8% of the optimum. See Table 10 for a summarised comparison of these techniques.
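The difference between the two mappings discussed above can be made concrete with a little arithmetic. The sketch below only counts controllers per switch under each scheme; it assumes that f denotes the number of Byzantine controllers to be tolerated (the text does not define f explicitly), and the function names are hypothetical. It does not reproduce the PBCM/PBCR formulations or the reported 50% result, which concerns the total number of controllers in the network.

```python
from typing import Tuple


def full_bft_mapping(f: int) -> int:
    """Controllers per switch under the classical 3f + 1 mapping."""
    if f < 0:
        raise ValueError("f must be non-negative")
    return 3 * f + 1


def primary_backup_mapping(f: int) -> Tuple[int, int]:
    """(primary, backup) controllers per switch: f + 1 primaries, f backups."""
    if f < 0:
        raise ValueError("f must be non-negative")
    return f + 1, f


if __name__ == "__main__":
    for f in range(1, 4):
        primary, backup = primary_backup_mapping(f)
        print(
            f"f={f}: 3f+1 mapping uses {full_bft_mapping(f)} controllers per switch; "
            f"primary-backup uses {primary} active + {backup} standby"
        )
```

For f = 1, for instance, a switch talks to four controllers under the 3f + 1 mapping but to only two active controllers (plus one standby) under the primary-backup mapping, which is where the reduction in per-switch control load comes from.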

Open Issues and Future Study
The literature presents several EWI communication, consistency, and CPP solutions for the various use cases in SDN. Despite this, there is a pressing need for more inquiry into these issues, as many concerns remain unresolved. This section points out some of the problems left unsolved and offers suggestions for new lines of inquiry.

Resources Utilization Related Issues in dmCP Controller Placement.
Adaptive and fair resource allocation-based controller placement is desirable to cope with the dynamic nature and traffic variability of today's networks. Many of the existing CP approaches with a load-balancing bias use network partitioning and controller clustering. They are therefore proactive and static in their resource mapping, with no provision to adjust to traffic changes. The few works that incorporate traffic dynamics use SM or a controller reallocation (CR) mechanism to restore load distribution fairness, and they track controller overload using a threshold value. These approaches can be described as reactive, as they act only once load imbalance has occurred. The SM trigger in these solutions is a speculative fixed threshold parameter that lacks any experimental reference [126]. Depending on the threshold size, it may lead to premature or delayed detection and thus to network instability. The reactive nature of the approach can also delay the restoration of load balance. Furthermore, there is a migration cost to consider. For example, load oscillation problems may surface if an inappropriate switch or controller is selected. In addition, controller chain failure might occur if switches are not properly mapped to the appropriate controller. Therefore, it will be interesting to explore TE prediction principles to avoid the highlighted issues while addressing fair load distribution in a dynamic environment.
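To illustrate the contrast between a reactive fixed-threshold trigger and a prediction-based one, the following minimal sketch compares the two on a synthetic load trace. Everything in it is an assumption made for illustration: the function names, the load values, the threshold of 0.85, and the simple one-step forecast (current load plus an exponentially smoothed trend) are hypothetical and are not drawn from any of the surveyed schemes.

```python
from typing import Sequence


def reactive_trigger(loads: Sequence[float], threshold: float) -> int:
    """Index of the first sample whose observed load exceeds the threshold, or -1."""
    for t, load in enumerate(loads):
        if load > threshold:
            return t
    return -1


def predictive_trigger(loads: Sequence[float], threshold: float, alpha: float = 0.5) -> int:
    """Index of the first sample whose one-step-ahead forecast exceeds the threshold.

    The forecast is the current load plus an exponentially smoothed trend
    (smoothing factor alpha), a deliberately simple stand-in for a traffic
    prediction model.
    """
    trend = 0.0
    for t in range(1, len(loads)):
        delta = loads[t] - loads[t - 1]
        trend = alpha * delta + (1 - alpha) * trend
        if loads[t] + trend > threshold:
            return t
    return -1


if __name__ == "__main__":
    # Synthetic controller utilisation samples (fraction of capacity).
    trace = [0.50, 0.55, 0.62, 0.70, 0.79, 0.88, 0.95]
    print("reactive trigger at sample", reactive_trigger(trace, 0.85))
    print("predictive trigger at sample", predictive_trigger(trace, 0.85))
```

On this trace the forecast-based trigger fires one sample before the observed load crosses the threshold, which is the kind of head start a prediction-based scheme would use to start migration before the controller is actually overloaded. A real design would also need to guard against false alarms and load oscillation.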

CAP Theory Issues in dmCP.
It is difficult to achieve consistency, availability, and partition tolerance together in the dmCP of SDN, as stated by the CAP theorem. Designers of SDN with partition requirements must deal with performance trade-offs when choosing a consistency level. They face a choice between weak (eventual) consistency for high availability and strong consistency at the expense of availability. Weak consistency can assure the availability of network resources, but it leads to state staleness in the network, causing abnormal application behaviour. Strong consistency ensures correct adherence to all network policies, but at a steep price in network unavailability. Most current CP works for large-scale networks sacrifice one for the other. Therefore, it will be interesting to consider a hybrid approach that merges these conflicting levels of consistency to find the optimum trade-off between consistency and availability.

Heterogeneity of Controllers in dmCP Controller Placement.
Another design challenge in SDN dmCP is the placement and interoperability of heterogeneous controllers from different vendors. This involves both the CPP and a knowledge-sharing problem, in which the consistency and compatibility of controller instances must be handled. Overcoming these challenges might require a generic and all-inclusive standardization of the SBi, NBi, and EWBi. The motivation for this can be seen from several perspectives. First, from a security perspective, a homogeneous CP poses a possible vulnerability because the controllers share a common-mode fault, i.e., a common vulnerability point. If adversaries are familiar with the vulnerabilities of one controller, they can easily bring down the entire network through the vulnerability common to all controllers. Second, interoperability between different controller platforms and traditional IP networks can significantly encourage and simplify the commercial adoption of SDN. So far, very few studies have looked in this direction, so additional research in this area will be a worthwhile contribution.
Security Aware Controller Placement.
Given that the control of a network in SDN is centralized in the controller, all transactions that involve the CP ought to be considered critical, as any disruption owing to a successful attack can be catastrophic to the business. The CP is susceptible to several security threats, such as man-in-the-middle (MITM) attacks, DoS and DDoS attacks, saturation attacks, control packet sniffing and tampering, CP isolation, and IP spoofing. This can be attributed to vulnerabilities such as weak authentication, incomplete encryption, and information disclosure. Therefore, a CPP scheme that incorporates security measures such as role-based authentication, a DDoS blocking application (DBA), secure socket layer (SSL), the locator ID separation protocol (LISP), a strong message authentication code (MAC) algorithm, or content-oriented networking architecture (CONA) would be a significant contribution. To the best of our knowledge, no existing solution offers these features.

CP with Mobility Tolerance in Wireless Environment.
The architecture of current mobile networks suffers from the same complexities as the traditional network. Fortunately, the concept of "software-defined mobile networking" (SDMN) is expected to be instrumental in modifying the architecture of the current LTE, the emerging 5G, and the IoT framework for applications such as WBAN and VANET. However, despite the SDMN architecture's potential in resource and mobility management, it cannot be fully utilized until the fundamental issue of CP design that it raises is resolved. In addition to determining the number and placement of controllers in the network, CPP in SDMN must determine the best controller position compatible with a dynamically changing topology. Likewise, it must work in harmony with "on the fly" ubiquitous and heterogeneous networks and with handoff support across different radio access networks. Second, in SDMN, controllers will rely heavily on statistics that may include user metadata received from APs, BST, and even UEs for mobility and location tracking. This raises security and privacy concerns, as the information can be exploited for malicious purposes. These factors also affect problem formulation and solution methodologies. Thus, the privacy issues and the spatiotemporal changes in the system parameters required by the algorithms pose additional intrinsic challenges that further complicate the problem model. Research in this direction is therefore still open to contributions.

Conclusion
The SDN architecture is structured in a way that makes the controller the most important component for smooth and efficient operation, as it makes every decision. For this reason, the design and operation of the CP face challenges that need significant attention to facilitate the adoption of SDN. These challenges might be specific to an application scenario, and several solutions have been proposed over the years to address them. This paper critically reviews the issues associated with interoperability, consistency, and CPP in control plane design with multiple controllers. The discussion covers the origin of the problem, alternative solutions found at the DP, its evolution, and the state-of-the-art solutions proposed to address it. The objective is to provide an updated account of the evolution of the problem and its proposed solutions to guide future research. To accomplish this, we begin with a brief overview of SDN fundamentals and then narrow down to the subject matter, where we discuss the source of the problem. The proposed solutions are then reviewed based on their application environment, controller type, and optimization objectives. The findings of the critical review reveal that a substantial number of solutions have been proposed, with different degrees of strength and weakness with respect to their optimization objectives. On that basis, certain future research directions are pointed out and briefly discussed.

Figure 3: Taxonomy of SDN control plane performance issues.
To harvest the potential of SDN for managing 5G technology, the authors of [183] integrate the logically centralized, physically distributed (LC-PD) SDN CP architecture in the management of a 5G network. They conduct experiments on Mininet to demonstrate how the LC-PD architecture can optimize the overall quality of service (QoS) of a 5G network. They provide proof-of-concept experimental results and recommend adopting the SDN LC-PD CP architecture in 5G technology.

Table 1: Comparison of related papers.
Journal of Electrical and Computer Engineering
computation, network monitoring, balancing the load on the network, enforcing security policies, and many more. The DP is released from all control functions and focuses on forwarding traffic based on the decisions made by the control plane. The CP manipulates the network's behaviour by installing flow rules in the DP flow table, a logical data structure that stores the flow entries for each corresponding traffic flow. The communication between the CP and DP is managed through the southbound interface.
Figure 1: Organization of the review.
[Figure: SDN architecture. Control plane (controller); SBi (e.g., OpenFlow, OpenState, ForCES, Protocol Oblivious Forwarding (POF), OpFlex); NBi (e.g., REST, OSGi); data plane (DP).]

Table 2: Comparison of controllers.

Table 3: Distributed control plane summary table.
and Moses [70], strategies that go beyond simple flow switching are needed to ensure that the update does not violate the network's capacity constraints. Therefore, Hong et al. [78] established the standard model for capacitated updates as part of their SWAN. It offers an LP formulation for splittable flows that, if satisfied, results in a migration plan with x updates. It provides the foundation for Liu et al.'s zUpdate [79], a technique for updating DCNs with zero losses. The authors also demonstrate that consistent migration is doable with ⌈1/s⌉ − 1 updates if all flow links have free capacity slack s. However, for smooth migration, it is best to eliminate some of the network's background traffic or to temporarily limit its throughput if it contains noncritical traffic. Brandt et al. [80] additionally offered a method to distribute flows over the network. They provided an algorithm that attempts to produce slack capacity on all links, and they demonstrated that splittable flow migration can always be solved in polynomial time. The core concept is to repeatedly divide flows into new pathways until sufficient capacity is available to use the algorithm of

Table 4: Comparison of related works on dmCP consistency.
Each controller has unique performance characteristics, even though services can be simply transferred between them. Some controllers may have quick response times, while others are better suited for large-scale applications. It may be advantageous to execute numerous services on multiple controllers to align controller performance characteristics with service requirements. For example, services that passively sample packets in the network may need to scale to keep up with the volume of network traffic, whereas services that reactively insert flow table entries to route flows require low response times.

Table 5: Comparison of controller interoperability.

Table 6: Most commonly used CPP mathematical symbols.
$i$ is associated with a controller $c_i$
$y_{ij} = \begin{cases} 1 & \text{if the mapping of } v_i \text{ to } c_j \text{ exists} \\ 0 & \text{otherwise} \end{cases}$: binary variable indicating whether a switch $s_i$ is associated with a controller $c_j$.
mixed integer linear programming (MILP), or quadratic programming (QP) are used. Brute-force methods can also be used when dealing with small data sets on an optimizer such as CPLEX
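The binary assignment variable and the brute-force option mentioned above can be illustrated on a toy instance. The sketch below enumerates every set of k controller sites, assigns each switch to its nearest chosen controller, and keeps the placement with the lowest average switch-to-controller latency. The latency matrix, the function name, and the choice of average latency as the objective are assumptions made for this example; the surveyed works use a variety of objectives and constraints, and real instances are typically handed to a solver.

```python
from itertools import combinations
from typing import List, Sequence, Tuple


def brute_force_placement(
    latency: Sequence[Sequence[float]], k: int
) -> Tuple[float, Tuple[int, ...], List[int]]:
    """Exhaustively search for k controller sites minimizing average latency.

    latency[i][j] is the latency between node i and node j. Returns the best
    average latency, the chosen sites, and assignment[i] = controller site of
    switch i (so y_ij = 1 exactly when assignment[i] == j).
    """
    n = len(latency)
    best = None
    for sites in combinations(range(n), k):
        # Each switch is mapped to its nearest chosen controller site.
        assignment = [min(sites, key=lambda j: latency[i][j]) for i in range(n)]
        cost = sum(latency[i][assignment[i]] for i in range(n)) / n
        if best is None or cost < best[0]:
            best = (cost, sites, assignment)
    if best is None:
        raise ValueError("k must be between 1 and the number of nodes")
    return best


if __name__ == "__main__":
    # Four nodes on a line (0-1-2-3) with unit latency per hop.
    latency = [
        [0, 1, 2, 3],
        [1, 0, 1, 2],
        [2, 1, 0, 1],
        [3, 2, 1, 0],
    ]
    for k in (1, 2):
        cost, sites, assignment = brute_force_placement(latency, k)
        print(f"k={k}: sites={sites}, assignment={assignment}, avg latency={cost}")
```

The search visits every combination of k sites, so its cost grows combinatorially with the network size, which is why exhaustive search is only practical for small data sets.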

Table 7: Comparison of resilience aware controller placement.
Minimizing control links, multiple controller: MC, multiple control path: MCP.

Table 8: Comparison of controller placement with load balance awareness.

Table 9: Comparison of deployment environment aware controller placement.

Table 10: Comparison of related works on dmCP security.