A Secure Network Coding Based on Broadcast Encryption in SDN

By allowing intermediate nodes to encode received packets before forwarding them, network coding improves the capacity and robustness of multicast applications. However, it is vulnerable to pollution attacks. Several signature schemes have been proposed to thwart such attacks, but most of them must be homomorphic, which makes their keys difficult to generate and manage. In this paper, we propose a novel fast and secure switch network coding multicast (SSNC) scheme for software defined networks (SDN). In our scheme, the complicated secure multicast management is separated from the fast data transmission by means of the SDN. Multiple multicasts are aggregated into one multicast group according to the service requirements and the network status. The controller then routes the aggregated multicast group with network coding; by using broadcast encryption, only trusted switches are allowed to join the network coding. The proposed scheme can use traditional cryptography without homomorphy, which greatly reduces the computational complexity and improves the transmission efficiency.


Introduction
The inflexible transport mode underlying today's networks restricts their development. Its capability and structure are not well suited to the requirements of emerging transmission services, and the gap between service requirements and basic network capabilities keeps growing. Increasing efforts have been devoted to finding more reconfigurable networks, such as the software defined network (SDN) [1,2] (e.g., OpenFlow [3,4]) and the Flexible Architecture of Reconfigurable Infrastructure (FARI) [5].
Multicast [6] is an efficient way of disseminating information to large groups of users and plays a significant role in network services. Unfortunately, traditional IP multicast protocols are very complex, because the router must not only transmit the data but also forward control messages to obtain the group membership and the global network information [7]. In the new network paradigm (e.g., SDN), the data plane and the control plane are separated: the logically centralized controller controls the behavior of the entire network, and the router is only responsible for forwarding data. Designing a scalable and secure approach for multicast is therefore significant for the future network.
Network coding [8] is a novel transmission mechanism that allows intermediate nodes to encode the received packets, which improves the capacity of multicast applications and enhances network robustness and throughput relative to traditional "store-and-forward" transmission [9]. However, if malicious nodes in the network forward fake vectors or invalid combinations of received vectors, the polluted vectors quickly spread to other nodes and even to the whole network, so that only part of the packets obtained by the receivers are uncorrupted vectors. To provide secure random linear network coding [10], most recent schemes use homomorphic cryptographic techniques to sign or hash the original data. The development of SDN brings a new way of providing secure services.
In this paper, we propose a scheme called secure switch network coding (SSNC). Multiple multicast groups are aggregated into one multicast group according to the service requirements and the network status. The controller then routes this multicast group with network coding. The switches on the same path receive the same attribute. When a data packet enters a switch in the SDN and the transaction's resource requirements are met, the switch can decrypt the received data packets and then combine the packets according to the encoding matrix. SSNC thus advances in the following aspects: (a) the controller is responsible for managing the multicast group, so it can prevent attackers from sending data and illegal users from receiving it; (b) the controller authenticates and authorizes the switches, allowing only trusted switches that meet the service requirements to join the paths of the multicast tree; (c) the switches only forward the data according to the flow entries; and (d) security is achieved by using broadcast encryption and AES, without homomorphism.

Related Work
In a typical linear network coding scenario [11], the sender first breaks the message into a sequence of vectors in an $n$-dimensional linear space over a finite field. The sender transmits these message vectors to its neighboring nodes. Intermediate nodes randomly and linearly combine the arriving vectors according to their local encoding matrices and then forward the new vectors to their adjacent nodes. Receivers can recover the original messages from a sufficient number of arriving packets by using the encoding matrix.
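The encode-and-recover cycle described here can be illustrated with a minimal sketch. It assumes a small prime field of size 257 and random coefficients drawn by a single coder; the field size, the function names, and the Gauss-Jordan decoder are illustrative choices and are not taken from the scheme itself.

```python
import random

P = 257  # small prime field size, chosen only for illustration


def encode(messages, coeffs):
    """Return one coded vector: sum_i coeffs[i] * messages[i] (mod P)."""
    length = len(messages[0])
    return [sum(c * m[k] for c, m in zip(coeffs, messages)) % P
            for k in range(length)]


def solve(coeff_rows, coded_rows):
    """Recover the original vectors by Gauss-Jordan elimination mod P."""
    n = len(coeff_rows)
    # Augmented matrix [coefficients | coded payloads].
    aug = [list(c) + list(y) for c, y in zip(coeff_rows, coded_rows)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if aug[r][col] % P), None)
        if pivot is None:
            raise ValueError("coefficient matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        inv = pow(aug[col][col], -1, P)  # modular inverse (Python 3.8+)
        aug[col] = [(x * inv) % P for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col]:
                factor = aug[r][col]
                aug[r] = [(x - factor * y) % P
                          for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def demo(seed=1):
    """Encode three random message vectors and decode them again."""
    rng = random.Random(seed)
    messages = [[rng.randrange(P) for _ in range(4)] for _ in range(3)]
    while True:
        coeffs = [[rng.randrange(P) for _ in range(3)] for _ in range(3)]
        coded = [encode(messages, row) for row in coeffs]
        try:
            return messages, solve(coeffs, coded)
        except ValueError:
            continue  # rare: random matrix not invertible, draw again
```

A receiver that collects as many linearly independent coded vectors as there are original vectors can invert the coefficient matrix, which is what `solve` does.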
When malicious nodes forward fake vectors or invalid combinations of received vectors, the receivers have no way to decode the vectors [12], and network resources are wasted. It is therefore crucial to prevent pollution attacks in practical applications of network coding. As Figure 1 illustrates, pollution attacks cannot be mitigated by standard signatures or MACs: because the receivers do not have the original message vectors, they cannot verify a signature on the original vectors. Signing the entire message before transmission is likewise of no use.
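The following sketch shows why an ordinary MAC does not help once packets are combined. It assumes HMAC-SHA256 as the standard MAC, a hypothetical key shared by source and receiver, and an illustrative prime field of size 257; none of these choices comes from the schemes discussed here.

```python
import hashlib
import hmac

P = 257  # illustrative prime field


def tag(key, vector):
    """Ordinary (non-homomorphic) HMAC-SHA256 over a vector of ints."""
    return hmac.new(key, bytes(str(vector), "ascii"), hashlib.sha256).digest()


def combine(a, b, ca, cb):
    """Linear combination ca*a + cb*b over the field."""
    return [(ca * x + cb * y) % P for x, y in zip(a, b)]


key = b"source-and-receiver-secret"   # hypothetical shared key
v1, v2 = [1, 2, 3], [4, 5, 6]
t1, t2 = tag(key, v1), tag(key, v2)

coded = combine(v1, v2, 2, 3)          # what an intermediate node forwards
# The receiver sees only `coded`; neither t1 nor t2 authenticates it.
matches_any = any(hmac.compare_digest(tag(key, coded), t) for t in (t1, t2))
```

Because the tags of `v1` and `v2` say nothing about `2*v1 + 3*v2`, the receiver cannot tell a legitimate combination from a polluted one, which is the gap that homomorphic schemes, and the approach of this paper, aim to close.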
To defend against pollution attacks in network coding, researchers have proposed several hashing and signature schemes. Zhao et al. [13] introduced a signature scheme that breaks a file into a number of blocks viewed as vectors spanning a subspace. Each node verifies the integrity of a received vector by checking its membership in that subspace, based on the signature on the subspace. The intermediate nodes only need to verify the signature without doing anything else. In this scheme, however, the signature is too long, the public key can be used for only a single file, and the sender needs to know the entire file before generating the authentication information. Boneh et al. [14] therefore proposed homomorphic signature schemes with better performance, in which both the public key size and the per-packet overhead are constant. This scheme signs individual vectors instead of the entire subspace, but it suffers from expensive computational overhead due to the bilinear pairing operations.
Yu et al. [15] proposed probabilistic key predistribution and message authentication codes (MACs) to defend against pollution attacks. The sources need to add multiple MACs to the data before forwarding them, and multiple nodes can verify different parts of the message via shared secret keys. Agrawal et al. [16,17] designed a homomorphic MAC system that allows the integrity of network coded data to be checked. However, this system is collusion resistant only up to a predetermined collusion bound c and is vulnerable to tag pollution attacks. Based on the idea of [18], Li et al. [19] proposed RIPPLE, a symmetric-key-based in-network solution for network coding authentication that builds on homomorphic MACs and TESLA [20]. In RIPPLE, to calculate the MAC labels, the sources must know the longest path from each node to the source. Wu et al. [21] designed KEPTE, a key predistribution-based tag encoding scheme. The source generates multiple tags for each packet, and the intermediate nodes generate a new tag from the received tags and then verify the correctness of the packet. This scheme, however, requires a key distribution center.
In the SDN architecture, the complicated secure multicast management can be separated from the fast data transmission [22,23]. We need to verify that a switch has certain authenticated attributes, so the session keys should be held only by the switches with those attributes. Broadcast encryption can be used to distribute the session key to those switches. When a switch receives a broadcast message, it can verify that the message really comes from the controller. Because the session key of the packet and the attributes of the switch are authenticated and issued by the PKG, only a secure switch can combine received packets according to the encoding matrix and then encrypt and forward the new packet according to the flow entry. The packets are thus forwarded only among the secure switches in the SDN.

Broadcast Encryption
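The main construction of broadcast encryption for $n$ users is as follows. Bro.Setup($n$): In this broadcast encryption scheme, a trustee is needed to generate the public information used in the following construction. The trustee chooses a suitable bilinear group $\mathbb{G}$ of prime order $p$.
It chooses random values $g \leftarrow \mathbb{G}$ and $\alpha \leftarrow \mathbb{Z}_p$ and sets $g_i = g^{\alpha^i}$ for $i = 1, 2, \ldots, n, n+2, \ldots, 2n$, and $v = g^{\gamma}$ for a random $\gamma \in \mathbb{Z}_p$. The public key is $PK = (g, g_1, \ldots, g_n, g_{n+2}, \ldots, g_{2n}, v)$. The private key for user $i \in \{1, \ldots, n\}$ is set as $d_i = g_i^{\gamma}$. The algorithm outputs the public key PK and the $n$ private keys $d_1, \ldots, d_n$.
The encryption algorithm then outputs a pair (Hdr, $K$), where Hdr is the broadcast header and $K$ is the session key. A private key is only one group element in $\mathbb{G}$, and the ciphertext Hdr is only two group elements. The ciphertexts and private keys are of constant size, but the public key grows linearly in the number of users.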
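The encryption and decryption steps are not spelled out in this section. Assuming the construction is the Boneh-Gentry-Waters broadcast encryption scheme (an assumption suggested by the two-element header and by the product over $j \in S$, $j \neq i$ that appears in the experiment section), they can be sketched as follows, with $g_i = g^{\alpha^i}$, $v = g^{\gamma}$, $d_i = g_i^{\gamma}$, and $e$ the bilinear map on $\mathbb{G}$. The names $t$, $C_0$, and $C_1$ are introduced here only for illustration.

```latex
\begin{aligned}
\textbf{Bro.Encrypt}(S, PK):\quad
  & t \leftarrow \mathbb{Z}_p, \qquad K = e(g_{n+1}, g)^t = e(g_n, g_1)^t, \\
  & \mathrm{Hdr} = (C_0, C_1)
    = \Bigl( g^t,\ \bigl( v \cdot \prod_{j \in S} g_{n+1-j} \bigr)^t \Bigr). \\[4pt]
\textbf{Bro.Decrypt}(S, i, d_i, \mathrm{Hdr}, PK):\quad
  & K = \frac{e(g_i, C_1)}
             {e\bigl( d_i \cdot \prod_{j \in S,\, j \neq i} g_{n+1-j+i},\ C_0 \bigr)}. \\[4pt]
\textbf{Correctness:}\quad
  & \frac{e(g, g)^{t \gamma \alpha^i} \prod_{j \in S} e(g, g)^{t \alpha^{n+1-j+i}}}
         {e(g, g)^{t \gamma \alpha^i} \prod_{j \in S,\, j \neq i} e(g, g)^{t \alpha^{n+1-j+i}}}
    = e(g, g)^{t \alpha^{n+1}} = e(g_{n+1}, g)^t .
\end{aligned}
```

Only the term with $j = i$ survives the division, and recovering it requires the private key $d_i$ of a user in $S$; the element $g_{n+1}$ itself is never published.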

SSNC Authentication Method
We mainly use network coding to implement the multicast application. In the SDN, a legal sender can generate and send out original packets without being aware of the encoding and decoding operations, whereas a legal recipient can receive these packets from the network. For multicast transmission, the controller is responsible for deciding whether to use network coding. When multiple multicast trees are aggregated to use network coding for transmission, the controller is responsible for routing the paths, updating the flow tables of the switches, and calculating the local encoding matrix for each switch involved in the transmission. Finally, the controller generates the network-coding flow-table entries and sends them to each switch via the interface.
Figure 2 shows that the control layer mainly consists of the control server. In the controller, the general module is responsible for basic operation and management, such as knowing the global network topology, generating and updating the flow entries, and monitoring the network state. To create a strategy, the general module also exchanges basic parameters with the other modules. The state-aware module monitors and analyzes the network state to promptly discover abnormal switch nodes and the current traffic distribution. The network-topology module obtains the whole network structure to provide a global view. The flow-entry module generates and updates the flow-table entries of the switches. The multicast module manages the group members, the routing of multicast trees, the decision of whether to use network coding for multicast, the selection of the network-coding path, and the generation of a coding matrix based on the parameter information provided by the general module. The security module mainly generates and manages the certificates and completes the authorization and authentication of the member switches in cooperation with the other modules.
The data layer is composed of switches and users, which are only involved in data transmission. In the switch, the encoding module is responsible for encoding the data according to the local encoding matrix assigned by the controller. The decoding module runs only when the switch is connected to a receiver; it decodes according to the decoding matrix and then sends the data to the receiver. The security module stores private keys and certificates and verifies both the credibility of the data and its own credibility by connecting with the controller.
Network coding needs to establish the forwarding paths between all the senders and receivers. In traditional networks, however, it is difficult to know all the nodes on a path, so most research signs the source data or the space spanned by the vectors and ensures security through source authentication at each node. If the switches on the path are all genuine and trusted during the whole data transmission, other switches can be effectively prevented from mounting replay and tampering attacks. Authentication, authorization, and integrity checks must therefore be provided in network coding transmission, and only authenticated and authorized switches are used.

System Setup.
The trusted controller runs the Bro.Setup($n$) algorithm in the SDN, taking as input the number of switches $n$. Every switch $i$ then receives the system setup outputs: the public key PK and its private key $d_i$.

Switch Session Keys Distribution.
Multiple multicast groups are aggregated into one group according to the quality-of-service level and the network status. The controller then routes this group with network coding to obtain the subset $S \subseteq \{1, \ldots, n\}$ of switches on the path. Given the public key PK and the subset $S$, the Bro.Encrypt($S$, PK) algorithm outputs a pair (Hdr, $K$), where Hdr is the broadcast ciphertext and $K \in \mathcal{K}$ is a symmetric multicast session key. Because a malicious node could forge a pair (Hdr, $K$), the controller chooses a collision-resistant hash function $H$ and signs the hash value of Hdr with the RSA algorithm: $\sigma = \mathrm{enc}(H(\mathrm{Hdr}), \mathrm{Private.rsa})$. The message $(\sigma, \mathrm{Hdr}, S)$ is then broadcast to the set $S$. After the message passes verification, switch $i \in \{1, \ldots, n\}$ takes as input the subset $S$, its id $i$, its private key $d_i$, the header Hdr, and the public key PK. If $i \in S$, the Bro.Decrypt($S$, $i$, $d_i$, Hdr, PK) algorithm outputs the multicast session key $K \in \mathcal{K}$, which can then be used to encrypt and decrypt the multicast data. To improve the performance of the multicast transmission, the encryption function can be embedded in hardware.
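The sign-then-verify step on the broadcast header can be sketched as below. This is a toy under loud assumptions: it uses unpadded textbook RSA over a modulus built from two Mersenne primes, SHA-256 as the hash function, and a made-up byte string standing in for the serialized Hdr. The function names are hypothetical, and a real deployment would use a vetted RSA signature implementation with proper padding and key sizes.

```python
import hashlib

# Toy textbook RSA built from two Mersenne primes; far too small and
# unpadded for real use. All names here are hypothetical.
P_PRIME = 2**127 - 1
Q_PRIME = 2**89 - 1
N = P_PRIME * Q_PRIME
E = 65537
D = pow(E, -1, (P_PRIME - 1) * (Q_PRIME - 1))  # private exponent


def digest(hdr: bytes) -> int:
    """H(Hdr) reduced into the RSA modulus range."""
    return int.from_bytes(hashlib.sha256(hdr).digest(), "big") % N


def controller_sign(hdr: bytes) -> int:
    """sigma = enc(H(Hdr), Private.rsa)."""
    return pow(digest(hdr), D, N)


def switch_verify(sigma: int, hdr: bytes) -> bool:
    """Check dec(sigma, Public.rsa) == H(Hdr) before using Hdr."""
    return pow(sigma, E, N) == digest(hdr)


hdr = b"C0||C1 (serialized broadcast header)"   # placeholder header bytes
sigma = controller_sign(hdr)
genuine = switch_verify(sigma, hdr)
forged = switch_verify(sigma, b"header injected by an attacker")
```

A switch would run `switch_verify` first and call the broadcast decryption only when it returns true, so a header forged by a malicious node is dropped before any pairing computation is spent on it.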
In SSNC, after the secure multicast forwarding tree is established, each switch, sender, and receiver on the multicast paths obtains the session key $K$. At the sender, the original data is represented by a vector $v = (v_1, v_2, \ldots, v_k)$. Each data vector is encrypted with the key $K$, denoted $\{v_i\}_K$, and different encrypted vectors $\{v_i\}_K$ are sent to different next-hop switches. The switches located on the same path as the sender have the same quality-of-service and security-level requirements. Only switches holding the same session key $K$ can decrypt the packets they receive; they then linearly combine the received data vectors into a new data vector according to the predistributed local encoding matrix. After that, the switch encrypts the new vector and forwards it to the next switches. A receiver located on the same path as the sender decrypts each received packet with the session key $K$ and decodes the packets by network coding to recover the original data.
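The per-switch processing (decrypt, combine, re-encrypt) can be sketched as follows. Because the Python standard library has no AES, the sketch substitutes a SHA-256 counter keystream as a plain placeholder for AES; the prime field of size 251, the nonces, the key string, and all function names are likewise illustrative assumptions rather than parts of the scheme.

```python
import hashlib

P = 251  # prime below 256 so each symbol fits in one byte (illustrative)


def keystream_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Placeholder for AES: XOR with a SHA-256 counter keystream."""
    out = bytearray()
    for block in range(0, len(data), 32):
        pad = hashlib.sha256(key + nonce + block.to_bytes(4, "big")).digest()
        out.extend(b ^ p for b, p in zip(data[block:block + 32], pad))
    return bytes(out)


def switch_forward(key, received, coeffs, out_nonce):
    """Decrypt every incoming packet, combine linearly, re-encrypt."""
    plain = [keystream_xor(key, nonce, body) for nonce, body in received]
    mixed = bytes(sum(c * pkt[k] for c, pkt in zip(coeffs, plain)) % P
                  for k in range(len(plain[0])))
    return out_nonce, keystream_xor(key, out_nonce, mixed)


K = b"multicast-session-key"          # hypothetical session key
v1, v2 = bytes([1, 2, 3]), bytes([10, 20, 30])
incoming = [(b"n1", keystream_xor(K, b"n1", v1)),
            (b"n2", keystream_xor(K, b"n2", v2))]
nonce, body = switch_forward(K, incoming, coeffs=[2, 3], out_nonce=b"n3")
recovered = list(keystream_xor(K, nonce, body))  # next hop decrypts with K
```

The coefficients `[2, 3]` play the role of one row of the predistributed local encoding matrix. Since they never travel with the packet, a node without $K$ sees neither the payload nor the combination that produced it.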

Rekeying.
When a switch breaks down or can no longer meet the quality-of-service level or the security requirements, it should be removed from the subset $S$. The controller may need to route a new path for the aggregated multicast to replace the faulty switch. To enhance the security of the multicast, the controller also rekeys periodically. In both cases the controller generates new keys for the subset $S$ and broadcasts them to all switches, but only the switches belonging to the subset $S$ can decrypt the new keys.
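The two rekeying triggers, membership change and elapsed period, can be sketched with a small controller-side model. The class name, the 300-second default period, and the 16-byte key length are hypothetical, and the broadcast encryption call is represented only by a comment.

```python
import secrets


class RekeyingController:
    """Tracks the authorized subset S and the current session key."""

    def __init__(self, members, period_s=300.0):
        self.members = set(members)      # subset S of switch ids
        self.period_s = period_s         # hypothetical rekey period
        self.epoch = 0
        self.last_rekey = 0.0
        self.key = secrets.token_bytes(16)

    def _rekey(self, now):
        self.key = secrets.token_bytes(16)
        self.epoch += 1
        self.last_rekey = now
        # A real controller would now run Bro.Encrypt(S, PK), sign Hdr,
        # and broadcast it; here we only return what would be addressed.
        return self.epoch, frozenset(self.members)

    def remove_switch(self, switch_id, now):
        """Drop a failed or untrusted switch and rekey immediately."""
        self.members.discard(switch_id)
        return self._rekey(now)

    def tick(self, now):
        """Periodic rekey; returns None when the period has not elapsed."""
        if now - self.last_rekey >= self.period_s:
            return self._rekey(now)
        return None
```

For example, `RekeyingController({1, 2, 3}).remove_switch(2, now=1.0)` advances the epoch and addresses the new key to switches 1 and 3 only, so the removed switch cannot follow the session any further.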

Maintenance of Multicast Path.
Once the multicast groups are determined, the controller is responsible for aggregating them and deciding whether to use network coding. If it decides to use network coding for transmission, it routes the group and selects secure and reliable switches to construct the multicast transmission path. It is very important to establish a trust relationship between the switches and the controller [24]. The controller can keep a white list of trusted and authenticated devices and build the multicast paths from this list. The list changes dynamically based on anomaly and failure detection algorithms [25]. If the trustworthiness of a switch is questioned or the switch behaves abnormally, it is reported by other switches or controllers, automatically quarantined by them, and thus removed from the list.

Performance and Security Analysis
5.1. Security Analysis. The set of the whole network is denoted by $N$, the number of all switches is $n$, the set of the switches in one secure multicast path is denoted by $S$ ($S \subset N$) and contains $|S|$ switches, the number of multicasts in the network is $m$, and the attacker is denoted by A.

Eavesdropping Attack.
An eavesdropping attacker can wiretap one or more links in the network; resisting such an attack means that information is not leaked to unauthorized users. In network coding, the original data is divided into many data vectors before being sent out, and the data vectors are combined into new data vectors. The original data can be decoded only if A obtains enough data vectors and the corresponding coding matrix. To keep the eavesdropper from obtaining the correct data, traditional network coding encrypts the coefficients or part of the data, but the encryption scheme usually needs to be homomorphic. In SSNC, the coding coefficients are predistributed to the trusted switches by the controller instead of being transmitted with the data, and the multicast packets are transmitted as ciphertext encrypted with AES. Each trusted switch is authenticated by the controller, but A may impersonate the controller to send a session key $K$. Therefore, in SSNC, source authentication with RSA is provided when the session key $K$ is broadcast.

Pollution Attack.
This attack is usually launched by unauthorized nodes that inject polluted packets into the information flow. Suppose an attacker A forges a message $\{v^*\}_{K^*}$. A switch in $S$ accepts the data $v^*$ only if it holds the fake session key $K^*$, so A would have to break the secret key of the AES. Without breaking the AES, A has to either impersonate a trusted switch to join the secure multicast group or impersonate the controller to distribute the session key $K^*$ to the switches of $S$. The second problem is resolved with the RSA signature described in Section 4.2. To resolve the first problem, a switch is authenticated and authorized by the controller when it first connects to the controller, and while the system is running, the controller re-authenticates and re-authorizes the switches at regular intervals.
The security of the whole system depends on that of the RSA, AES, and broadcast encryption used in our scheme. From the above analysis, it can be concluded that our system is capable of resisting pollution attacks and eavesdropping attacks.

Performance Analysis.
The communication overhead of network coding refers to the bandwidth cost of distributing the authentication information to each switch. Apart from the data vectors, this overhead mainly consists of two parts, the keys and the signatures. The computational overhead mainly refers to encoding and decoding the vectors and verifying the signatures. The storage overhead refers to storing keys, certificates, and other security parameters.
In traditional secure network coding, homomorphic hashes, signatures, and MACs are the methods most frequently used to defend against pollution attacks. However, they are always computationally expensive. The advantages of network coding are realized only in networks consisting of faithful switches [26]. In SSNC, therefore, to take advantage of the SDN, the control layer completes the complex and expensive computations. The whole secure network coding is divided into two stages. The first stage sets up the system: the controller selects from the network $N$ the trusted switches that meet the QoS requirements to form the set $S$. By using broadcast encryption, only the secure switches can obtain the session key $K$. The controller also needs to route the multicast and to generate and distribute the keys. In the second stage, the data layer only needs to encode and forward the data using the session key $K$.
Each intermediate switch only needs to store its session keys and the controller's public key. Because the controller provides the source authentication service, the switch stores the controller's public key, whose size is denoted $|K_c|$. One session key is shared by each group, and its size is denoted $|K_s|$. For a switch, the total storage overhead is therefore $|K_c|$ plus $|K_s|$ for each secure group to which the switch belongs. The controller stores an RSA key pair and all the session keys, so its total storage is the size of the key pair plus $|K_s|$ for each session key.
In the SSNC scheme, the encoding matrix and the session keys are distributed to each switch before the data transfer stage, so the packets do not need to carry any additional information. We denote the size of the data vector as $|v|$.
The performance of SSNC depends mainly on the setup stage, because in the forwarding stage the messages can be encrypted by hardware. Routing the multicast with network coding is therefore very important. When the group changes very frequently, SSNC pays a higher cost in the setup stage, and its performance may become poor; in the worst case, the keys must be updated across the whole network. To reduce the rate of change of the multicast paths, we aggregate multicasts of the same class into one aggregated path.
To demonstrate the effectiveness of SSNC, we compared it with the following three schemes (Table 1).
In the scheme presented by Ho et al. in [11], the source executes the signature algorithm, while the intermediate nodes execute the combination and verification algorithms. Their experiments show that the time overhead of the signature algorithm is much longer than that of combination and verification. The intermediate nodes need to run a pseudorandom generator and a pseudorandom function to combine the received data; the time overheads of these two algorithms are denoted $T_{\mathrm{PRG}}$ and $T_{\mathrm{PRF}}$. In SSNC, a node first decrypts the vectors and then combines them, whereas in the scheme in [11] the switch first combines the vectors and then verifies them, in which case it is hard to tell which vector was forged. Besides, the scheme in [11] needs to generate multiple keys for one multicast, and each switch is given different keys, so it requires a complicated key management mechanism. The size of a key is denoted $|K_{\mathrm{HMAC}}|$. Each node needs to store two sets of keys so that it can act as both a data source and a receiver. In contrast, SSNC key management is simpler and more effective. In the data transmission phase, besides the original vector, the scheme in [11] also needs to attach tags, the extension of the data vector, and the data source id. The sizes of the extended vector and of the tag (the latter denoted $|\mathrm{tag}|$) are set in advance, and the id of the data source is a constant. This scheme needs to distribute keys before transmission.
In the scheme in [7], the source computes an orthogonal vector and the signatures. The intermediate nodes verify the signatures, whose security is based on the Diffie-Hellman problem; the computational complexity of this verification is high. This scheme uses public key cryptography: the source keeps a private key and distributes the public key. Although the number of keys needed is small, the size of the keys is related to the size of the file. We denote the size of the public and secret keys as $|K_{\mathrm{space}}|$. The size of the public key is $6(m + h)/(mh)$ times that of the file, where the file is divided into $m$ blocks and $h$ is the size of each block. During transmission, the main communication overhead is the public key and the signature vector. The vector is signed by a standard algorithm, and the signature size is denoted $|\sigma|_{\mathrm{DSA}}$.
In [8], the scheme combines linear subspaces with homomorphic signatures and uses a public key and signature of constant size. The internal nodes verify the signature according to the public key, the id, and the dimensionality of the signed vector. The computational complexity is similar to that of the scheme in [7], but the storage overhead is much smaller. The size of the keys is constant and denoted $|K_{\mathrm{NCS}}|$. The size of the signature in this scheme is also constant; it usually takes up $\log_2 p$ bits or more and is denoted $|\sigma|_{\mathrm{NCS}}$.
Compared with these three classical solutions, key generation is simpler and key management is more effective in our scheme.

Network Experiment Performance.
We evaluated the performance of our scheme on an experimental network. We tested SSNC on Mininet, one of the popular network emulation platforms. Each switch is connected to a receiver, and one controller is connected to all switches. The controller runs the network coding algorithm [27] according to the topology, broadcasts the session keys, and updates the flow entries of the switches. In traditional secure network coding, the controller only needs to broadcast the keys. Therefore, in the scheme we present, the number of control messages increases with the number of switches on the network coding path. We emulated 5 aggregated multicast sessions. The number of control messages is shown in Figure 3.
Before the multicast data packets are sent, the switch decryption time is dominated by the $|S| - 2$ group operations needed to compute $\prod_{j \in S,\, j \neq i} g_{n+1-j+i}$, which takes considerable time. When the switch forwards the data packets, however, it only needs to run the AES algorithm, which can be accelerated in hardware. In this experiment, we used Dell desktop computers with a 2.93 GHz Intel Core i3 530 CPU and 4 GB of main memory. The sender sends messages to three receivers in different network topologies. We timed the following operations: (1) broadcasting the session keys: the controller calculates the forwarding tree and the session key $K$, and the switch decrypts the broadcast message to get the key; (2) forwarding the data packets: the switch decrypts and encrypts the packets using AES. The multicast results are shown in Table 2. The first packet of a multicast session takes more time to arrive at the receiver, but the total time of the session is shorter.
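The count of $|S| - 2$ group operations can be checked with a short sketch. It assumes the indexing of the product above, in which switch $i$ multiplies the public elements $g_{n+1-j+i}$ for every $j \in S$ with $j \neq i$; the function names and the example values of $n$ and $S$ are hypothetical.

```python
def decrypt_indices(n, subset, i):
    """Indices k of the public elements g_k multiplied during decryption."""
    if i not in subset:
        raise ValueError("switch i is not in the authorized subset S")
    return [n + 1 - j + i for j in sorted(subset) if j != i]


def multiplication_count(subset):
    """Multiplying |S| - 1 elements together takes |S| - 2 operations."""
    return max(len(subset) - 2, 0)


n = 8                      # hypothetical number of switches
S = {1, 3, 4, 7}           # hypothetical authorized subset
idx = decrypt_indices(n, S, i=3)
```

Here switch 3 multiplies $g_{11}$, $g_8$, and $g_5$, which takes two group operations. The index $n + 1 = 9$ never appears because $j \neq i$, so the product uses only published elements, and its cost grows linearly with the size of $S$ while the per-packet AES cost does not.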

Conclusions
In this paper, a novel fast and secure switch network coding multicast scheme for software defined networks is proposed. It separates the security management from the data transmission. During the management stage, the controller routes the aggregated multicast and broadcasts the session keys. Only secure switches that meet the quality-of-service requirements can join the multicast forwarding tree. Our scheme ensures that each switch verifies that data comes from other trusted switches, which greatly improves the ability to prevent pollution and eavesdropping attacks. During data transmission, a node only needs to forward the data, which preserves the high performance of network coding. To provide more flexible and efficient multicast services, further efforts are needed to improve the system dynamics, especially in dealing with frequent changes of members.

Multicast Data Forwarding.
When a switch receives the broadcast message $(\sigma, \mathrm{Hdr}, S)$, where $\sigma$ is the controller's RSA signature, it must first verify the identity of the controller. The switch decrypts $\sigma$ to obtain $H(\mathrm{Hdr}) = \mathrm{dec}(\sigma, \mathrm{Public.rsa})$, computes the hash of the received Hdr with the function $H$, and checks that the two values match.

Figure 3: The number of the control messages.

Table 1: The comparison of computation, storage, and communication overhead.

Table 2: Time of the two stages.