A Survey of Key Technologies for Constructing Network Covert Channel

In order to protect user privacy or guarantee free access to the Internet, the network covert channel has become a hot research topic. It refers to an information channel in which the messages are covertly transmitted under the network environment. In recent years, many new construction schemes of network covert channels are proposed. But at the same time, network covert channel has also received the attention of censors, leading to many attacks. The network covert channel refers to an information channel in which the messages are covertly transmitted under the network environment. Many users exploit the network covert channel to protect privacy or guarantee free access to the Internet. Previous construction schemes of the network covert channel are based on information steganography, which can be divided into CTCs and CSCs. In recent years, there are some covert channels constructed by changing the transmission network architecture. On the other side, some research work promises that the characteristics of emerging network may better fit the construction of the network covert channel. In addition, the covert channel can also be constructed by changing the transmission network architecture. The proxy and anonymity communication technology implement this construction scheme. In this paper, we divide the key technologies for constructing network covert channels into two aspects: communication content level (based on information steganography) and transmission network level (based on proxy and anonymity communication technology). We give an comprehensively summary about covert channels at each level. We also introduce work for the three new types of network covert channels (covert channels based on streaming media, covert channels based on blockchain, and covert channels based on IPv6). In addition, we present the attacks against the network covert channel, including elimination, limitation, and detection. Finally, the challenge and future research trend in this field are discussed.


Introduction
With the rapid development of information technology, Internet has penetrated into every aspect of people's lives. However, when people enjoy the convenience brought by the network, there have been many issues of information leakage and user privacy breaches [1]. For example, there have emerged malicious attacks which aimed at stealing confidential government data, such as GhostNet [2], ShadowNet [3], and Axiom [4]. On the other hand, repressive governments have deployed increasingly sophisticated technology to block the disfavored Internet content [5]. So, many users cannot access Internet freely. e network covert channel can covertly transmit secret messages. It can hide covert traffic in a large amount of overt communication traffic. Many researches show that the use of network covert channel can protect user privacy and guarantee users' right to free access to Internet [6][7][8]. e secure transmission of secret messages in the communication process refers to two aspects: one is the communication content security [9] and the other is the communication connection security [6,10,11]. Network covert channel can effectively improve the security of these two aspects.
In terms of communication content security, encryption technology is widely used to protect the communication content of both sides, such as SSL (secure sockets layer), representatives: proxy technology [19,20] and anonymous communication technology [11]. e proxy can be divided into two categories: end-to-end proxy (such as HTTP proxy [21]) and end to middle proxy (such as Telex [22]). In addition, anonymous communication technology can also conduct a new covert transmission path. ere are many mature anonymous communication systems such as Tor [23], I2P [24], and Loopix [11].
On the other side, some research work promises that the characteristics of emerging networks may better fit the construction of the network covert channel. With the development of emerging networks, many network covert channels in the new network environment (streaming media network, blockchain network, and IPv6) have been proposed.
e covert channels based on streaming media network hide secret messages in audio and video traffic and use popular streaming media applications as the carrier.
ere are three typical covert channels: Facet [25], Cov-ertCast [26], and DeltaShaper [27]. e blockchain network has the characteristics of participant-anonymity, flooding propagation, and tampering resistance [28]. e covert channels based on blockchain network can utilize participant-anonymity and flooding propagation to increase the concealment of communicating parties. e tampering resistance can also be used to guarantee the robustness of covert channel. In this context, the models of covert channels based on blockchain network are proposed [10,28] and three covert channels (Zombiecoin [10], Botchain [29], and Chainchannels [30]) have been actually deployed in blockchain network. e IPv6 network is also a compelling platform for constructing covert channels. e IPv6 header and its extensions have many reserved fields or other fields which can embed information, thus leading to many possible covert channels [31].
However, because the network covert channel is a good method to cope with repressive government, it has also received the attention of censors [32]. Compared with ordinary eavesdroppers, the national-level censors have a global traffic view and have a stronger ability to analyze traffic. More and more attacks against the covert channel have appeared, which has an impact on channel concealment, robustness, and transmission efficiency [33][34][35].
Although there are many studies on covert channels, there is no comprehensive survey for the construction technologies they use and corresponding attacks. In addition, there is also less research on the covert channels in the new network environment. Compared with the already published studies, the main contributions of this paper are as follows: (1) Previous studies only considered the network covert channel based on information steganography, but not the covert channel based on the changing network architecture. According to different principles of covert channel construction technologies, we divide covert channels into two levels: communication content and transmission network, which can comprehensively include existing covert channels. And, we conduct a comprehensive analysis on the covert channels under each construction technology.
(2) e characteristics of the new network create many convenient conditions for the construction of network covert channels. However, they are not considered in other reviews. We present the covert channels in the new network environments including streaming media, blockchain, and IPv6, which makes up for deficiencies in existing work. It would highly facilitate for the researchers to understand the research status and provide research ideas for the subsequent design of covert channels in those new network environments. (3) We emphasize the challenging problems facing the construction of covert channels: the IP blocking or other blocking technology reduces the channel availability; the use of ML and DL technology makes the covert channel easier to expose. We discuss how to improve the ability to resist those problems, such as using adversarial examples, constructing reversible network, covert channel.
In order to improve the readability, we list the abbreviations used in our article in Table 1. e rest of the paper is organized as follows: Section 2 gives the research background of network covert channels. In Section 3, we present the network covert channel construction technology at communication content level and transmission network level. In Section 4, we provide the covert channels in the new network environment. In Section 5, we present network covert channel metrics including concealment, robustness, and throughput. In Section 6, we show the attacks against network covert channels. en, we discuss the challenges and suggest future research directions in Section 7. Lastly, Section 8 presents conclusion.

Network Covert Channel
Definition. e covert channel was originally proposed by Lampson [36]. Its purpose is to transmit secret messages to the recipient in an unconventional manner without being noticed by the observer. e classic communication scenario of the covert channel is the prisoner problem [37]: Alice and Bob are held in two rooms of a prison. ey want to escape and they need to transmit the escape plan to each other, but the watcher Wendy monitors them. erefore, Alice and Bob need to complete the information exchange without alerting Wendy.
With the development of Internet, the network covert channel has emerged. It is a kind of channel that transmits covert messages in violation of communication restriction rules in network environment [7]. e goal of network covert channels is not only to ensure that communication content is not discovered but also to protect the identities of both parties. e prisoner model of the network covert channel is shown in Figure 1.

Adversary Scenario.
In order to improve the availability of covert channel, we must describe the adversaries appropriately. Common attack methods used by adversaries are described in detail in Section 6. ere are four attributes which can describe different types of adversaries: (i) According to the attack mode, the adversaries are divided into passive and active: the passive adversary observes the communication traffic and analyzes it; the active adversary not only observes and analyzes it but also can generate, modify, delete, or delay traffic. (ii) According to the location, the adversaries are divided into external and internal: external adversaries are located outside the path of covert information transmission and internal attackers are located on the path or control some middleware on the path. (iii) According to the adversaries' resource, the adversaries are divided into global and partial: a global adversary can observe the total channel, but a partial adversary can only attack a part of the channel. (iv) According to the variability of attackers' resource, the adversaries are divided into invariable and adaptive: an invariable adversary cannot change resources he occupied after the attack begins. But, adaptive adversaries may constantly change the occupied resources during the attack.
Because of the differences in security targets and features of each covert channel, the adversary each channel is assuming is different. An adversary is always assumed to have  [22], an end-to-middle proxy, the adversary is assumed to be a person who controls the infrastructure of the network within its jurisdiction and can observe, alter, block, or inject network traffic. So, he is not only a passive adversary but also a partial and active adversary. In Tor [23], a low-latency anonymous communication system, if an adversary operates some onion nodes and modifies the data that flow through them, he is an internal and active adversary. In the meantime, if he observes all of the onion nodes, he is also a global and passive adversary.

Network Covert Channel Classification.
Most of the existing researches divide the network covert channels into CTCs and CSCs according to the different message modulation methods [38,39]: (i) Network covert storage channels: CSCs include the secret messages into storage objects at the sender and then reading them at the receiver. Reserved bits or unused bits of the protocol are mainly used to transmit information. (ii) Network covert timing channels: CTCs include the secret messages into the timing behavior at the sender and then extract the covert messages at the receiver. Normally, the delays in network packets are used to deliver covert messages.

Typical Techniques for Constructing Network Covert Channels
Most covert channels tend to encode covert information into a storage field or time behavior and then transmit it in accordance with common network transmission processes. It is essentially a form of information hiding technology (also called information steganography). e classification method mentioned in Section 2 is also based on this technology. e information steganography is a hidden transmission technology at the level of communication content, which does not involve the transmission network level. At present, there are many covert channels based on changing the structure of the transmission network. ey covertly transmit information by designing new network transmission paths. So, the classification mentioned in Section 2 can no longer include all network covert channel construction schemes. We glean the covert channel researches and make a close reading of the study of techniques they use. In order to make a comprehensive summary, we divide the construction technology as follows: (i) Covert channel construction technology at the communication content level (ii) Covert channel construction technology at the transmission network level Correspondingly, covert channels can also be divided into communication content level and transmission network level. In Section 3.1, we will introduce the two types of technologies and the typical covert channels under each construction technology in detail.

Construction Technology at the Communication Content
Level. As we stated above, the construction technology at the communication content level is based on information steganography, which includes CTC construction technology and CSC construction technology. e covert channels using this construction technology transmit secret messages through information steganography. It can be divided into CTC construction technology and CSC construction technology. [17,[39][40][41]. Wendzel et al. [39] use pattern language markup language (PLML) to classify covert channels into 11 different patterns. We refer to the classification method proposed by Wendzel et al. [39] and divide CTCs construction technology into four categories: interarrival time (C1), rate modulation (C2), PDU order modulation (C3), and PDU retransmission (C4).

CTCs. A large body of literatures deals with the study
(1) Interarrival Time (C1). e CTC construction technology based on interarrival time (C1) is the most common. It transmits messages by altering timing intervals between network PDUs. Most studies on CTCs are based on this.
Cabuk et al. proposed On-Off CTC [42], which is a time window-based method. e sender and receiver share a time  window T w . When transmitting information, the total transmission time is divided into equal and disjoint time intervals T i according to the T w . In a T i , if the sender transmitted a packet, it represents the bit "1;" if the sender remained silent during T i , it represents the bit "0." Shah et al. [43] proposed the Keyboard Jitterbug. In Jitterbug CTC, the sender and receiver share a value w. e sender sends a packet with extra delay to the server when the user types the keyboard. If the delay is an integer multiple of w, it represents the bit "1;" if the delay is an integer multiple of (w/2), it represents the bit "0." In addition, the Jitterbug CTCs do not require the sender and receiver to keep time in sync, just that their respective clocks are accurate.
With the development of steganography technology, the more concealed statistical-based CTCs have appeared. Brodley and Spafford [44] proposed the TRCTC (time-replay CTC). It sorts the overt network packet intervals and records them in the set S 0 and S 1 , respectively. When transmitting bit "1," the sender randomly selects a packet interval from set S 1 and replays it; when transmitting bit "0," the sender randomly selects a packet interval from set S 0 and replays it.
However, the packet interval sent from set S 0 may be transferred to set S 1 due to the network jitter, causing a high bit error. erefore, TRCTC must ignore the value of delay on part of the boundary to ensure that the receiver can correctly receive the secret messages.
L-N CTC (L-bits to n-packets scheme) proposed in [45] introduces a new data embedding method, which not only improves the channel capacity but also reduces the bit error rate. It can embed an L-bits secret message into the delays of N-consecutive packets. To represent different combinations of L-bits, the packet interval of L-N CTC will be evaluated around the normal network delay d or exponential times of d.
(2) Rate Modulation (C2). e covert channel sender alters the data rate of a traffic flow from itself or a third party to the covert channel receiver. For example, Li et al. [46] analyzed a covert channel in the real switch. e sender exhausts the performance of a switch to affect the throughput of a connection from a third party to a receiver over time.

(3) Package Arrival Time Modulation (C3).
is type of covert channel encodes hidden information by modifying the arrival time of multiple packets. For example, Tahir et al. [47] presented Sneak-Peek, a high speed covert channels in data center networks. In Sneak-Peek, the packets sent by the sender change some special packet sequences in the shared resource queue, and the receiver decodes secret messages from queuing delays of the special packets.
(4) PDU Retransmission (C4). In order to encode secret messages, the covert channel retransmits previously sent or received PDUs such as DNS requests, selected IEEE 802.11 packets, and selected TCP segments. Mazurczyk et al. [48] presented RSTEG (retransmission steganography). e main innovation of RSTEG is to not acknowledge a successfully received packet in order to force the sender to retransmit. e retransmitted packet will carry the secret messages.
(1) Size Modulation (C5). is type of covert channel encodes the covert information by changing the size of some special data such as the length of overall data packets and the length of a header element. e study in [49] discussed the technique of implementing CSCs by altering the size of TCP databursts. e TCP databurst is the number of TCP segments sent by a host before waiting for a TCP ACK packet.
is type of covert channel encodes covert information by changing the order of PDU elements or header fields in the packet. e IPv6 extension header fields, header fields in the HTTP protocol, and options in the DHCP protocol are frequently used to encode covert messages [39].
(3) Random Value Modification (C7). If header elements in the network protocol contain random values, then these fields can be used to represent hidden information. e study in [17] proposed that the case and the least significant bit (LSB) of the values in some header fields can be used to encode secret messages.

(4) Redundant Field (C8).
is kind of covert channel encoded hidden data into a reserved or unused header/PDU element. Rowland [50] proposed embedding covert channels in different unused areas in the IPv4 header and in the TCP header.

(5) Multimedia Data Modification (C9)
. is channel uses multimedia data (such as text, image, and video) as the carrier for secret transmission. For example, the information hiding is realized in [51] through embedding secret information in the characteristic data area of digital video by a steganography scheme based on chaotic mapping.
Wendzel et al. [39] categorize network covert channels at communication content level regarding three aspects (semantic, syntax, and noise). e semantic means whether the pattern modifies header elements in a way that leads to a different interpretation of the changed PDU. e syntax means whether the PDU structure is modified, and the noise means whether the channel is affected by noise. All CTCs do not change the structure of a PDU. But, they are greatly influenced by network delay or jitters, so they are all noisy and the robustness is not very well [38]. For CSCs, the fields used are not modified when transferring and hence there is few channel noise. But on the other hand, these channels are easy to be detected by the outside observer.
Besides the three aspects, CTCs have another characteristic, which is whether it is based on statistics [7]. For example, in many CTCs based on C1, in order to make the extra delay similar to the time series characteristics of the overt traffic, the delay will be carefully selected to ensure that it can not only fit the delay distribution of the overt traffic but also transmit secret messages.

Security and Communication Networks
So, we summarize the surveyed papers of CTCs and CSCs in Table 2 from the following aspects: category, semantic, syntax, noise, and statistics. In addition, we also give the description of each covert channel.

Construction Technology at the Transmission
Network Level 3.2.1. Proxy Technology. Normally, users go directly to Internet sites to get network information, but nowadays, users often get information through the proxies. As a channel service in the Internet environment, the proxy service has multiple functions such as improving access performance, resource access control, and security protection and protecting user identity information. In addition, proxy services have the features of hidden, dynamic, and diverse. More importantly, proxies can help users break through content filtering restrictions and access the websites blocked by censors. e proxy can be divided into two categories according to the traffic transmission path: E2E proxy (end-to-end proxy) and E2M proxy (end-to-middle proxy). Figure 2 shows the designs of E2E proxy (using HTTP proxy as one example) and E2M proxy (using Telex scheme as one example).
(1) End-to-End Proxy. e E2E proxy connects clients and servers directly. To get content from the server, the client sends a request to the proxy, and the proxy gets the content from the server and returns it to the client. e most common E2E proxy is the HTTP proxy [21], which acts like a web server and correctly accepts request and returns response. In order to extend the scope of the application protocol, the socks proxy is proposed. It simply passes data packets and does not care what application protocol they are. With the development of tunneling and cryptography, VPN proxy has emerged. It uses the technology of tunneling, encryption, decryption, and identity authentication, which means it is more secure. Existing mature VPN proxies include PPTP (Pointto-Point Tunneling Protocol) and L2TP (Layer-Two Tunneling Protocol), which are located in the second layer of the TCP/IP protocol, VTP (VLAN Trunking Protocol), and IPSec (IP Security), which are located in the third layer of the TCP/IP protocol.
However, E2E proxies are exposed to adversaries, and their activities are easily spotted. e global and active adversaries are able to block many of E2E proxies by discovering and banning the IP addresses of the servers on which they rely [20,22]. To overcome this problem, researchers propose the E2M proxy.
(2) End-to-Middle Proxy. e traditional E2E proxy relays data to a specified server. Different from that, the E2M proxy is located in the path to a server and it can redirect the connection to an alternative destination. e E2M proxy needs a router at the friendly ISP to host it so that it can control the connection to an unblocked decoy server. en, E2M proxy determines whether to block the connection and redirect it to a censored server by recognizing a steganographic tag. From the perspective of the censor, the E2M proxy user appears to be in contact only with the decoy server. e censor cannot block E2M proxy without blocking all connections that pass through participating ISPs, which is a large, primarily legitimate category of Internet traffic. So, E2M proxy will provide increased resistance to IP blocking.
ere are four existing publications on end-to-middle proxy: Telex [22], Decoy Routing [63], Cirripede [64], and Tapdance [20]. e designs for the four systems are largely similar, although there are differences in some aspects, such as the embedded steganographic tag, blocking strategy, and deployment requirement. e comparison of the four E2M proxies is shown in Table 3.

Anonymity Communication Technology.
e anonymous communication system is designed to access content blocked by censored anonymously. It uses technologies such as anonymous domain generation, traffic obfuscation, and broadcast/multicast to covertly forward messages. It can prevent attackers from acquiring communication relationships or the identity of the senders and receivers. at is, in the anonymous communication system, not only the channel is covert but also the identity of both parties in the communication.
e concept of mix-net proposed by Chaum [65] in 1981 is considered the origin of anonymity communication system. e core idea of mix-net is to encrypt and obfuscate messages based on mix nodes. Inspired by the mix-net, many anonymous communication systems are proposed, such as Crowds [66], P5 (Peer-to-Peer Personal Privacy Protocol) [67], Tor (the Second-Generation Onion Routing) [23], and I2P (Invisible Internet Project) [24]. Tor is currently the most active and most popular anonymous communication system. It has about 8 million daily users [68]. e core idea of Tor is onion routing [69]: firstly, select three suitable relay nodes and establish links with these relay nodes hop by hop; secondly, the client encrypts the secret messages 3 times; then, each relay node decrypts them in order. No node can know whether its previous node in the chain is the sender or the relay node. Likewise, no node can know whether its next node in the chain is the receiver or the relay node, so Tor can protect user privacy well.
According to different transmission delays, anonymous communication systems can be divided into high-latency systems and low-latency systems. High-latency systems are based on mix-net. High-latency systems are used for applications that can tolerate delays, such as anonymous e-mail services. However, most applications require timeliness, such as web browsing and live chat. erefore, low-latency systems have a wider range of applications and attract more attention. Most of the research related to anonymous communication is also about low-latency systems. According to the network structure, the low-latency system can be divided into P2P anonymous network and non-P2P anonymous network. According to whether the routing path is determined, P2P anonymous network can be further divided into structured and unstructured network models. In a structured anonymous network, it is determined which nodes If the connections were not tagged, Telex proxy would proceed to the notblocked website as normal.
(3) Telex proxy determines whether there is a tag If the connection has the tag, Telex proxy deciphers the tag and diverts the connection to blocked website.
(1) (2) Figure 2: E2E and E2M concept (example users). Alice is connecting to the HTTP proxy, and Bob is connecting to Telex proxy.
Security and Communication Networks data stream will pass through, while in an unstructured anonymous network, the path of the data stream is unknown.
Besides the mix-net, anonymous communication systems also use many techniques to increase the system's covertness.
e common techniques include anonymous domain generation, broadcast/multicast, probabilistic traffic routing mechanism, and traffic obfuscation. Some newer technologies are also used, such as zero knowledge proofs in [8] and verifiable shuffle technique in [70]. But because they are specific to a certain system, we will not go into details. e anonymous domain generation is a mechanism similar to DGA (domain generation algorithm) used by malicious software. It uses a random private key, calculates the corresponding public key, and uses the public key as part of the domain. So, it can guarantee the anonymous communication system's anonymity and security. e core idea of broadcast/multicast technology is each node broadcasts a message to other nodes in the system in each cycle of the system operation. DC-Nets (Dining Cryptographers) [71] is a typical example of using this technology. It is an anonymous communication system based purely on broadcast/multicast. An improved example is P5 [67], which divides the nodes into multiple broadcast groups to improve the scalability of the system. e probabilistic traffic routing mechanism is often used to build unstructured anonymous networks.
is mechanism is implemented through DHT-based (distributed hash table) routing protocols and random walk protocols. In systems that use probabilistic traffic routing mechanism, nodes can decide whether to forward traffic to the next node based on a certain probability. Because it is difficult for each node on the path to determine whether its predecessor is the original sender of the message or just an intermediate forwarding node, it effectively guarantees the anonymity of the original sender. Crowds [66] is a typical example of using this mechanism. Other examples are Torsk [72] and BitBlender [73].
Traffic obfuscation can erase or randomize the statistical characteristics of covert traffic, so that the load of covert traffic looks like a uniform random bitstream or a "benign" protocol [74]. Its goal is to make it difficult for adversaries to distinguish between the obfuscated traffic and overt traffic. Current traffic obfuscation techniques are as follows: (i) Randomization: randomization refers to the use of encryption, random padding, and other methods to randomize the characteristics of covert traffic. For example, the Tor project [75] has developed a variety of randomization mechanisms, including Obfsproxy3 [76], Obfsproxy4 [77], Dust [78], and ScrambleSuit [79]. (ii) Protocol mimicry: the main idea of protocol mimicry is imitating or masquerading as popular whitelisted protocols which are rarely suspected by adversaries. For example, SkypeMorch [80] is a transport layer plugin that integrates traffic between Tor clients and Tor bridges into Skype traffic. Another example is StegoTorus [81]. e core idea is to segment Tor traffic and simulate other overt traffic such HTTP. (iii) Join dummy traffic: to provide stronger anonymity, some systems generate additionally dummy traffic. is technology is used in many existing system, such as P5 [67] and Loopix [11]. (iv) Tunneling: the tunneling technology is one extreme of the mimicry logic, which simply encapsulated data into an (usually encrypted) overlay protocol. A famous example is meek [82] deployed with Tor.
Meek core idea is to use different domain names in different places, one in the SNI for DNS requests (set to noncensored URL) and the other in the HTTP host field (set to censored URL). Meek uses cloud platform as relay node and redirects Tor traffic to Meek server.
We summarize the main technologies used in each anonymous communication system and application scenarios in Table 4.  In this context, many covert channels based on streaming media have been proposed. ey use audio and video traffic allowed by the observer as overlay traffic and build covert channels. e carrier is a popular encrypted streaming application such as Skype. is technique can help users watch the censored video, and it can work without requiring changes to the carrier application.
Facet [25] is a covert communication system for transmitting censored video, and it relies on the assumption that the observer is unwilling to indiscriminately block all or most sessions of the cover protocol (Skype). To the outside observer, the Facet client is just having a Skype session. Facet consists of clients, Facet servers, and emulators. e procedure of a Facet connection is as follows: (1) e Facet client and Facet server establish initial connections (2) A Facet client sends a uniform resource locator (URL) of the censored video to the Facet server (3) Facet server downloads video from blocked video sites such as YouTube, Vine, or Vimeo (4) e emulator simulates the video content as a Skype session and resends it to the client at a lower resolution (5) e Facet client ends the connection and the Facet server destructs the emulators and ends the session Different from Facet, CovertCast [26] supports that multiple clients receive data transmitted in a specific live stream in the real time. And, CovertCast is scalable, with the server workload independent of the number of clients receiving content. To the observer, CovertCast traffic is similar to the traffic that users watch someone broadcasting on a given live-streaming platform such as YouTube. CovertCast consists of users, CovertCast clients, and CovertCast servers. e communication process of CovertCast is as follows: (1) CovertCast server crawls a censored website and modulates its content into images.
(2) CovertCast server broadcasts images by livestreaming video services and begins to download the next website. (4) e user's web browser sends a request through the user's proxy.
(5) CovertCast client creates a response with the corresponding website.
In Facet or CovertCast, the format of covert messages is restricted to video (in Facet) or Web content (in CovertCast). In DeltaShaper [27], the covert TCP/IP packets are encoded and embedded into the video stream transmitted by the video channel of a popular videoconferencing application such as Skype between the communication endpoints, which means that DeltaShaper allows for tunneling arbitrary TCP/IP traffic. To the observer, the client and server are just engaged in a Skype session. DeltaShaper consists of client endpoint and server endpoint. e same procedure is applied at both endpoints of a Skype call, thus DeltaShaper supports bidirectional communication. e procedure is as follows: (1) e sender modulates covert data into images and encodes them in a video stream which is fed to Skype (2) Skype transmits this video to the receiver's Skype instance (3) e receiver captures the stream from the Skype video buffer (4) e receiver's decoder extracts the payload from the stream We give an overview about the three covert channels based on streaming media with respect to the covert message transmitted platform, whether it is bidirectional and whether it supports multiple clients in Table 5.

Building Covert Channels Using Blockchain Technology.
With the development of covert communication countermeasure technology, the traditional network covert channels based on TCP/IP architecture have the risk that the channel is regulated, the traffic is easy to be tracked, and the identity information of the communicators is easy to be recognized [85], which makes it difficult to meet the security requirements of data covert transmission. Blockchain is one of the representatives of the new generation of information technology. It has a large number of active users, a large number of transaction data packets, and many ways to embed the secret data. For example, there are many fields in the blockchain ledger structure that can store data. e blockchain network adopts transaction transmission mechanism based on flood forwarding, which ensures that information can be effectively, quickly, and reliably transmitted to all nodes in the network. A decentralized flooding mechanism can also protect the recipient of a covert transmission of data by avoiding stealing communication privacy by monitoring a single server or a single communication link.
Li et al. [28] proposed a model of covert timing channel in the blockchain network and uses a formal method to model and proves the anti-interference and tamper-resistance. Secondly, they constructed a scenario of the covert channel in the blockchain network based on the time interval of business operations. ey also present covert channel evaluation vectors for blockchain networks containing detection resistance, robustness, and transmission efficiency.
Brenner et al. [10] proposed a model of covert storage channel in the blockchain. ey explored the possibility of applying blockchain technology to the transmission of C&C (command and control) instructions in a field of blockchain protocol and described the prototypes of Zombiecoin [10], which are based on Bitcoin.
Besides the researches on the model of blockchain-based covert channel, there are three systems that have been actually deployed: Zombiecoin [86], Botchain [29], and Chaninchannels [30]. e designs of these three systems are very similar. In the three systems, communication participants are expected to covertly transmit messages through blockchain. Firstly, they apply to be the client nodes of the blockchain network and negotiate labels in advance so that the receiver can identify the transactions containing covert messages from thousands of transactions. To ensure security, both sides also need to negotiate the encoding, encryption algorithm, and the way of message embedding. en, the sender encodes, encrypts, and embeds the messages into certain transactions according to the negotiated algorithm and sends them to the server nodes of blockchain. After the flooding propagation mechanism of blockchain network, the receiver identifies the special transactions through negotiated labels and extracts the covert messages.
Ali et al. [86] proposed Zombiecoin 2.0, which validated the claims in [10] and deployed successfully over the blockchain network. In this system, the covert messages are directly inserted in the output script function OP_RE-TURN (available under https://en.bitcoin.it/wiki/ OP_RETURN), which is a field of particular blockchain implementation and originally used to carry additional transaction information. e sender and the receiver negotiate a pair of prenegotiated public-private keys as the label to identify the transactions that contain covert messages.
e receiver identifies these transactions by scanning the ScriptSig (the unlocking script in Bitcoin to verify whether a transaction is passed) which contains the sender's public key and the digital signature (computed over the transaction) using the private key. e receiver verifies the signature and decodes the messages.
Chainchannels [30] realized a new way of embedding covert messages in blockchain with key leakage and did some cryptographic proofs. It uses a subliminal channel in digital signatures to insert secret messages totally. Many blockchain-based virtual currencies use the ECDSA (elliptic curve digital signature algorithm,) and the subliminal channel can substitute the nonce used in ECDSA with the secret messages. e cryptographic characteristics of ECDSA ensure that no errors will occur during the process of extracting the secret message, thus increasing the system. In terms of the label to identify the special transactions, Chainchannel uses a pair of prenegotiated public-private keys as the label, which is the same as Zombiecoin 2.0.
Botchain [29], proposed by Cybaze, is a fully functional botnet which is based on Bitcoin protocol. is system also utilizes OP_RETURN to embedding secret messages, which is similar to Zombiecoin 2.0 [10]. In terms of the label to identify the special transactions, Botchain [29] uses prenegotiated virtual currency wallet addresses as the label, which is different from the Zombiecoin 2.0 and Chainchannels.
We compare the three systems in term of five aspects: secret message embedding method, used field to embed secret messages, label embedding method, used field to embed labels, and suitable platform in Table 6.

IPv6
Covert Channels. IPv6 (Internet Protocol version 6, also called the NextGeneration Internet Protocol or IPng) is a next-generation IP protocol designed by the Internet Engineering Task Force (IETF) to replace IPv4. e IPv6 header field is reduced by four (header length, identification, flags, fragment offset, and header checksum), and the options are replaced with extended headers.
However, the grammar rules of the IPv6 are not perfect, which makes the construction of IPv6 covert channels very easy. In [31], several possible covert channels have been analyzed in the IPv6 header and its extensions. In [87], Yang think that IPv6 packets are a good carrier for information hiding and propose potential covert channels. Ullrich et al. [88] also discuss the use of IPv6 covert channels.
We conclude that there are four types of IPv6 covert channels. e first type is based on reserved fields of the IPv6 extended headers. e fields that can be used are as follows: (i) e router alert options in the hop-by-hop option header (2 bytes/packet) (ii) e reserved field in the routing extension header when routing type is 0 (4 bytes/packet) (iii) e reserved field in the fragment extension header (10 bits totally/packet) (iv) e reserved field in the authentication extension header (2 bytes/packet) (v) e binding update option in the destination options header (4 bits/packet) e second type is based on the order. Covert channels of this type hide information by ordering several special parameters. is method does not insert extra characters, so the risk of channel exposure is relatively low.
(i) Encoding covert messages based on the difference between the order of extension headers and the suggested order of RFC2460 (available under https:// tools.ietf.org/html/rfc2460) (8 bits/packet) (ii) Encoding covert messages based on the difference between the order of N addresses in 0 routing type header and preshared order (N bits/packet) e third type is based on some random values. e reason for this type of channel is that some fields are incompletely defined or the design of the inspection mechanism is not strict. For example, when the reassembly process is performed, the destination host only inspects the next header value of the first fragment and ignores the next header values of fragments that differ. is causes that the sender can set a false next header value to transmit secret messages: (i) Set one or more false router addresses in the routing extension header when routing type is 0 (up to 2048 bytes/packet) (ii) Set false values in the field of traffic class, flow label, hop limit, and source address in the IPv6 header (8 bits, 20 bits, 1 bit, and 16 bytes/packet, respectively) (iii) Set a false padding value in the extension headers including hop-by-hop option extension header, destination option extension header, and ESP extension header (up to 256 bytes/packet) (iv) Set a false next header in the IPv6 header (varies) and fragment extension header (at least 8 bits/ fragment) e channels proposed above are all covert storage channels. e three types of channels can be classified in turn into C8, C6, and C7 introduced in Section 3. e fourth type is based on tunneled traffic. Tunneling technology is not only used in anonymous communication systems but also used to

Summary.
e characteristics of the new network bring natural convenience to the construction of network covert channels. For the covert channel based on streaming media, the real-time and interactive features improve the transmission efficiency of secret messages. In addition, the widespread popularity of streaming media applications has also made channels more difficult to expose. For the covert channel based on blockchain, the participant-anonymity can protect user identity, and flooding propagation makes it impossible for observers to determine the true recipient of the secret messages and to know who is receiving the messages. In addition, the tampering resistance strongly guarantees the robustness. For the IPv6 covert channel, the extended headers and grammar rules open up multiple possible construction methods.
On the other hand, due to the complexity of the new network environment, there are few effective attacks against the three new type covert channels (the attacks include elimination, limitation, and detection, which are introduced in Section 6). erefore, compared with the traditional network covert channels, they are more secure.

Network Covert Channel Metrics
e evaluation metrics of the network covert channel include 3 aspects: concealment, robustness, and transmission efficiency. Concealment refers to the ability to be undetected by adversaries. In this regard, we propose for the first time to divide the metrics of concealment into message concealment and identity concealment. Robustness refers to the ability of the network covert channel to accurately transmit data. And, transmission efficiency refers to the maximum rate at which the channel can transmit data without error.

Concealment.
A successful network covert channel requires high concealment. It includes not only the concealment of the communication content but also the concealment of the identity of the communicating parties. e adversaries can expose the covert channel through these two aspects. So, we list the concealment metrics as follows.

Concealment of Communicating Parties.
e concealment of communicating parties consists of anonymity and unobservability. Anonymity means that users can communicate without disclosing their identity. Unobservability refers to the indistinguishable state of traffic with covert messages in the overt traffic set.
ere are many methods proposed to measure the anonymity [92][93][94], which can be categorized into three classes: measurement based on continuous interval, measurement based on the size of anonymous set, and measurement based on entropy.
(1) Measurement based on continuous interval: Reiter and Rubin [92] described the degree of anonymity, which is widely adopted. e degree of anonymity is defined as a continuous interval which ranges from absolute privacy to provably exposed. e six key points are as follows: (i) Absolute privacy: the adversary cannot even perceive the presence of communication (ii) Beyond suspicion: though the attacker can see evidence of a sent message, the sender appears no more likely to be the originator of that message than any other potential sender in the system (iii) Probable innocence: the observer thinks that each sender appears no more likely to be the originator than to not be the originator (iv) Possible innocence: the originator of the message is likely to be someone else (v) Exposed: the originator of the message is unlikely to be someone else (vi) Provably exposed: the adversary can prove the identity of sender or receiver to others (2) Measurement based on the size of anonymity set: Berthold et al. [93] proposed the degree of anonymity can be defined by the size of the group. For example, the anonymity may be measured as follows: where N is the number of possible senders. (3) Measurement based on entropy: Diaz et al. [94] issued that because an observer may have some background knowledge, objects in an anonymous set may have different probabilities. So, they proposed the measurement based on entropy. e entropy of sender set is defined as equation (2), where S � s 1 , s 2 , . . . , s N is the anonymity set and p i is the probability that the possible object s i is the real sender. e p i is concluded by the observer's background knowledge: So, A − H(X) represents the information obtained by an observer who has known some background knowledge. e degree of anonymity d a is defined as (2) Unobservability. Anonymity can measure the security of the identity information of the communication subject, while unobservability is a measure of the user's behavioral security. Adversaries with traffic analysis capabilities can use protocol fingerprint characteristics to determine the user behavior, such as whether the senders or receivers are sending or receiving messages. Tan et al. [18] proposed to use the relative entropy D between the communication behavior of the covert communication system and the overt network behavior to measure the unobservability. e degree of unobservability d u is defined as equation (4). e D m is the max. relative entropy: When the probability distribution of the network behavior seen by the attacker is completely consistent with that of the normal network behavior, the relative entropy between them can be minimized, that is, the unobservable degree d u is minimized.

Concealment of Communication Content.
Since constructing a covert channel based on time information modulation will cause some time characteristics of the channel to change, most of the research on the metrics of message concealment is aimed at the CTCs. Due to the technical specificity of other types of covert channel, there is no universal method to measure the concealment of these channels [7]. So, we list the following methods used to measure the message concealment of CTCs.
(1) Kolmogorov-Smirnov Test. KS test (Kolmogorov-Smirnov test) points an upper bound between the cumulative probability of experience and the cumulative probability of the target distribution at each data point [95]. Archibald and Ghosal [96] leverage it as a method to assess the concealment of covert channel. How to calculate the test score is shown as where F(x) and G(x) are empirical cumulative probability distributions of the IPDs of the overt traffic and covert traffic, respectively.
(2) Kullback-Leibler Divergence Test. e KL divergence is a measure of relative entropy between two target distributions [97], which means the KL divergence is used to show the distance between two random variables. It is leveraged as a metric for detecting CovertCast [98] and CTCs in [96,99]. e KL divergence from P to G is denoted as e p(x) and g(x) are two probability distributions of the covert traffic sample and the overt traffic sample, respectively.
(3) Standard Deviation Test. Wu et al. [100] use the dispersion of standard deviation to assess concealment. It can measure the variation of a stream throughout the transmission process. is method starts by separating traffic into nonoverlapping windows of size w.
en, the standard deviation is computed for each window, which is shown in equation (7). Finally, as presented in equation (8), it calculates the standard deviation of the pairwise difference as the metric of concealment C t : (4) Entropy-Based Test. Because covert channels will cause the change of ER (entropy rate), it is used to detect covert channels in [28,85,101,102]. e ER describes the uncertainty of a random variable sequence, and the sequence length m approaches infinity. It is defined as Porta et al. [103] proposed to use the entropy rate of finite samples as the estimated value of ER by calculating CCE (corrected conditional entropy), which is shown in equations (10) and (11). e estimated value of ER is the minimum value of CCE when m changes: (i) e traffic containing covert messages may be affected by network noise or adversary noise, resulting in errors in the messages received by the receiver. So, the first definition [28] is the error probability obtained by comparing the number of error bits S error to the total number of bits transmitted S all . e BER is defined as (ii) e other aspect to define BER is from the perspective of message recovery. Because the encoding method used by the covert channel may lose some of the original information, there will be errors in the process of decoding the final message after the receiver obtains the covert message. Houmansadr and Borisov [104] define the BER as the error probability obtained by comparing the original message to the final message after decoding. e definition is shown in equation (13), where e(x, y) � 1 for x ≠ y and e(x, y) � 0 for x � y: where k is the decoded message length; m(i) is the ith bit of the original message; m ′ (i) the i-th bit of the message obtained after transmission.

Transmission Efficiency.
e transmission efficiency of the network covert channel is evaluated using the amount of information contained in a unit symbol or the amount of information transmitted in a unit time.
ere are three definitions about the transmission rate, which are enumerated as follows: (i) Wu et al. [100] define the transmission efficiency as the maximum possible error-free information transmission rate. Equation (14) shows the definition. e N(t) represents the amount of information transmitted by N-ary coding in time t: (ii) Houmansadr and Borisov [104] proposed to take each covert data packet as a unit. ey define the transmission efficiency as the number of bits of covert messages transmitted by each data packet. e definition is shown as follows: where K is the number of bits of covert messages sent using N + 1 packets.
(iii) Li et al. [28] discussed the transmission efficiency in blockchain network environment. It is defined as the amount of information transmitted per unit time. e following equation shows the transmission efficiency C: where v is the information carrier transmission rate; a is the number of modulation symbols that each covert information carrier can carry; and p i is the probability of occurrence of the i-th encoded character in the encoding table. (iv) Wang et al. [105] use channel capacity as a method to measure the transmission efficiency of CTCs. It is defined as where bit is the amount of information carried by each packet interval, and ipd is the average network packet interval.

Attacks against Network Covert Channels
Attacks against network covert channels can be divided into three categories according to [35]: (i) Elimination: removing covert channels or making the covert channel completely unusable (ii) Limitation: reducing the transmission efficiency of covert channels (iii) Detection: discovering the existence of covert channels or the identity of both parties in the communication Eliminating attacks is the most difficult to perform. On the one hand, attackers need the ability to monitor and modify traffic, such as national censors. On the other hand, the specificity of certain channel structures makes it impossible to eliminate them fundamentally. For the limitation attack, while it works, it may also interfere with normal network communication. So, most researches on attacks have focused on detection. In the remainder of this section, we will give attacks against different network covert channels. can greatly affect the transmission efficiency of CTCs. In addition, detection attacks against CTCs have been studied in recent years and can be divided into two categories. One is to use statistical methods to detect the shape, regularity, and randomness of traffic. e other is to use ML or DL technology like the detection attacks against CSCs, except that the fingerprints used are time-dependent.

Attacks against Network Level Channels.
Due to the particularity of the network level covert channel construction methods, no research has shown that they can be effectively limited. In the following, we will introduce the attacks on the network level channels from the aspects of elimination and detection.

Proxy.
Circumventing Internet censorship is a mechanism commonly used by censors. It contains IP blocking, URL blocking, DNS hijacking, keyword filtering, network protocol blacklist/whitelist, etc. Among them, the IP blocking and URL blocking can make proxy server unusable and eliminate it. In addition, there are three methods for detection attacks against proxy: (i) e attacker first extracts the characteristics of the packet and generates regular expressions based on this. And then, the attacker inspects the content of the traffic based on regular expressions. (ii) e attacker injects the traffic watermark into traffic and observes if the traffic from the target host contains the watermark. (iii) e attacker uses ML or DL technology to find the traffic produced by proxy. e key to this approach is to determine the traffic fingerprint of the corresponding proxy service.

Anonymous Communication System. Circumventing
Internet censorship is also used to eliminate this channel. For example, censors block the IPs of some known entry nodes and bridge nodes in Tor [33,80]. Likewise, the research [34] measured I2P censorship at a global scale and found that censors can hinder access to I2P using several blocking techniques, such as URL blocking and DNS hijacking. Detection attacks on anonymous communication systems include two aspects. One is to detect covert communication traffic to discover network nodes. For example, He et al. [106] propose Tor traffic could be identified using TLS fingerprint (cipher suite and digital certificate) or message length distribution characteristics. Wang et al. [85] perform detection attacks against obfuscation tools which are configured in Tor. e other is to detect the association between nodes, in order to discover the connection between sender and receiver, which destroys the anonymity of the channel.
is attack includes the following ways: (i) Predecessor attack: the adversary has control some nodes and collects relevant information. When he knows the node is on the senders' path, the precursor node of this node is more likely to be a sender. is attack requires many controlled nodes to work together, so the predecessor attack is also called the collusion attack. (ii) Sybil attack: Sybil attack means that malicious attackers control some nodes by imitating the identity of nodes in the system. ese malicious nodes leak system information to the attacker, and the attacker can infer the routing forwarding and data redundancy strategies of the system, so as to launch a precursor attack. (iii) Replay attack: replay attack means that an attacker records the message to be tracked first and then sends it back. e attacker tracks the message by observing the output of the mix node until the recipient is found. In deterministic encryption schemes, resisting replay attacks is a difficult problem. Mix nodes must remember the message they have processed in order to prevent attackers from discovering recipient. (iv) Message tagging attack: tagging attack is initiated by an internal attacker who controls the first and last node. e attackers mark messages at the first node. In this way, the attacker can identify the message in the last node according to the tag, thus linking the sender and receiver. (v) N-1 attack: the N-1 attack is also known as flooding attack. e attacker's goal is to track the path of a target message. He isolates any message other than a target message, and a certain number of forged messages are sent at the same time. us, when all messages flow out of mix, the only message that is not forged is the target message that the attacker wants to track. (vi) Flow correlation attack: the adversary observes the traffic that one particular mix is receiving at a special port and then finds the corresponding traffic at output ports. is attack can be performed by noting the timing of the packet between the traffic at the input and the output port or with the help of ML and DL to correlate traffic.

Summary.
Due to the existence of a large number of network covert channel construction techniques, each attack method is generally only effective for specific channels. Even the elimination or limitation attacks against some channels have not been studied. Now, research focuses on detection attacks. Most detection attacks are done by collecting traffic fingerprints and training ML or DL models. e success of this attack lies in selecting the appropriate traffic fingerprint and model. We summarise the attacks against covert channels in Table 7.

Challenges and Future Directions
After investigating the attacks against covert channels in Section 6, we see that there are two challenges that still remain: (i) e IP blocking or other blocking technology has made most transmission network level channels unavailable (ii) e use of ML and DL technology makes the covert channel easier to expose So, the future directions for covert channel are to improve the ability to resist attacks. In addition, we discuss several specific research methods in the following.

Measuring Internet Censorship.
In order to avoid being eliminated by censors, the covert channel can be designed specifically from the perspective of analyzing the censorship technology used by the censors. Measurement research on censorship is a prerequisite for this approach.

Reversible Network Covert
Channel. Many of the CSCs and CTCs proposed alter some traffic features permanently when embedding data. So, they are easy to be detected by ML or DL. erefore, we can use RDHT (reversible data hiding techniques) to construct the channel which is able to revert the covert traffic to its original form.

New Network Environments.
Under some new network environments such as IOT (the Internet of ings), SDN (software defined network), and ICS (industrial control systems), there is no mature covert communication system. With the continuous innovation of new network technologies, we can use the characteristics of the new network to construct more concealed channels.

Conclusion
As a key technology in the field of network security, covert channels have always been an effective way to protect user privacy. With the development of information steganography and Internet, the network covert channel construction technology continues to be innovated. But at the same time, because the network covert channel can copy with repressive governments, censors are also starting to pay close attention to covert channels. e elimination attack, limitation attack, and detection attack against covert channels have a significant impact on security of covert channels. Although not all attacks can be successful because each channel has different characteristics, there are always one or more attacks that can effectively affect certain performance of covert channel. So, many network covert channels cannot simultaneously satisfy high availability, strong robustness, and high transmission efficiency. e characteristics of the new network can be used to increase the three aspects to some extent.
In this article, we have presented a comprehensive literature review, focusing on the techniques for constructing network covert channels, covert channel metrics, and attacks against network covert channels. e covert channels in the new network environment (streaming media, blockchain, and IPv6) have been introduced and compared. We identify challenges to explore the future direction of improvement and propose possible research methods. We believe this review will contribute to the development of this research area.