Effective Capacity Maximization in beyond 5G Vehicular Networks: A Hybrid Deep Transfer Learning Method

How to improve delay-sensitive traffic throughput is an open issue in vehicular communication networks, where a great number of vehicle to infrastructure (V2I) and vehicle to vehicle (V2V) links coexist. To address this issue, this paper proposes to employ a hybrid deep transfer learning scheme to allocate radio resources. Specifically, the traffic throughput maximization problem is first formulated by considering interchannel interference and statistical delay guarantee. The effective capacity theory is then applied to develop a power allocation scheme on each channel reused by a V2I and a V2V link. Thereafter, a deep transfer learning scheme is proposed to obtain the optimal channel assignment for each V2I and V2V link. Simulation results validate that the proposed scheme provides a close performance guarantee compared to a globally optimal scheme. Besides, the proposed scheme can guarantee lower delay violation probability than the schemes aiming to maximize the channel capacity.


Introduction
The rapid evolution of mobile communication technologies is ushering in the era of the Internet of Everything, where unprecedented changes will take place in all walks of life and have a profound impact on every aspect of our daily interactions [1,2]. Vehicular communications, widely regarded as a promising technology to enable intelligent transportation, autonomous driving, and other potential applications of smart vehicles in beyond 5G networks, have attracted extensive attention from both academia and industry [3]. Typically, the link types of vehicular communications include vehicle to infrastructure (V2I), vehicle to vehicle (V2V), and vehicle to everything (V2X) [4]. Notably, different communication link types usually have to provide certain quality of service (QoS) guarantees [5]. For instance, an autonomous driving vehicle is expected to transmit its rough position information to the infrastructure to help the base station (BS) perceive the whole vehicular network. Besides, such vehicles constantly exchange various types of instantaneous information with adjacent vehicles to ensure transportation safety. Both types of information should be transmitted with low delay, and the exchange of instantaneous information between adjacent vehicles over V2V links is inherently more delay-sensitive than V2I communications. Moreover, since the spectrum resources are quite limited in existing cellular systems, how to effectively provide differentiated QoS guarantees for different traffic is a critical issue for vehicular communication networks [6].
Traffic throughput maximization usually serves as an objective to improve the overall spectrum efficiency of a given communication network [7]. However, as a vehicular network is generally required to provide different QoS guarantees for different traffic, traditional resource allocation schemes that aim to maximize the channel capacity may be no longer applicable. Considering the low delay constraint, actual resource optimization problems are usually more complex [8,9]. Additionally, spectrum sharing is another potential solution to improve the spectrum efficiency of a communication system. In a vehicular network, V2I and V2V communications can reuse the same channel to transmit data, which dramatically increases the number of access links for vehicular communications. Nevertheless, the introduced interference between the V2V and V2I links further complicates the performance analysis of vehicular networks. In summary, a fundamental challenge for vehicular communication networks is to design an efficient resource allocation scheme to maximize the network traffic throughput under diverse QoS requirements and various interference constraints.
To address this challenge, this paper focuses on a vehicular network where V2I and V2V links share the limited spectrum resource. Specifically, an interference model and a statistical delay model are both established, based on which a traffic throughput maximization problem is formulated aiming to acquire the optimal power allocation and spectrum sharing scheme. Subsequently, the optimization problem is decomposed into a power allocation subproblem for each pair of cellular user (CUE) and V2V user (VUE) and a spectrum sharing subproblem for the whole vehicular network. Firstly, the power allocation subproblem is solved analytically based on the effective capacity theory, with both the statistical information of small-scale channel fading and the instantaneous information of large-scale channel fading taken into account. Secondly, a supervised deep learning algorithm is proposed to solve the spectrum sharing problem. Moreover, to overcome the mismatch problem caused by the varying distribution of hidden network information and states, we propose a deep transfer learning algorithm to adapt fast to new scenarios and to achieve optimization under certain QoS requirements for vehicular networks. Simulation results validate the accuracy of our proposed learning schemes, and the performance analyses show that traditional channel capacity maximization schemes may incur a high delay violation probability for delay-sensitive traffic, which is systematically alleviated by our proposed learning schemes.
The contributions of this paper are summarized as follows:
(i) An analytical model is established to jointly consider the statistical delay guarantee and interchannel interference. Compared to the traditional model based on the average delay, our proposed analytical model better fits the context of beyond 5G networks, where the delay performance is commonly measured in a probabilistic dimension.
(ii) A power allocation scheme is proposed to maximize the throughput of delay-sensitive traffic for a given CUE-VUE pair. The highlight of our proposed scheme is that its computation complexity only relates to the number of power levels, which enables its application on a real-world vehicular transmitter. In addition, the power allocation scheme can guarantee the traffic delay requirement for both CUE and VUE, while other conventional schemes only guarantee either one of the vehicular links.
(iii) A deep learning-based spectrum sharing scheme is proposed to quickly obtain the optimal channel reuse strategy. Based on the offline deep learning algorithm, a deep transfer learning algorithm is further developed to deal with the mismatch problem commonly encountered in new scenarios, where the hidden information is dynamic and only a few training samples can be obtained.
The remainder of this paper is organized as follows. In Section 2, related work is introduced and discussed. In Section 3, the network model, interference model, and delay model are presented. Section 4 proposes the traffic maximization scheme, and Section 5 compares and discusses the simulation results. Finally, the paper is concluded in Section 6.

Related Work
In the literature, existing studies on throughput maximization for vehicular communications can be briefly summarized as follows.
In [10], a low complexity data routing policy was designed to maximize the data throughput from vehicles to roadside units. In [11], a data transmission and scheduling scheme was proposed to maximize the traffic throughput and reduce the resource contention for nonadjacent V2V communications. In [12], an information spread problem in vehicular networks with V2I and V2V links was formulated and solved, where the channel capacity of V2I links was maximized based on the Doppler effect. In [13], a coalition game model was introduced to optimize resource allocation and maximize the throughput of individual V2V links under a minimum V2I throughput requirement. In [14], a novel power allocation and spectrum sharing algorithm was proposed to optimize the throughput of V2I links while guaranteeing the minimum throughput requirement of V2V links. The abovementioned works [10-14] have designed novel resource optimization schemes for different scenarios. However, in these studies, the throughput performance was simply characterized by the Shannon channel capacity, and the transmission delay was not taken into account.
Since the transmission delay is a critical metric in vehicular communications, a significant number of researchers have paid close attention to improving the delay performance. In the literature, the transmission delay was usually analyzed in terms of its average and used as an indicator to calculate the tradeoff with other performance metrics based on the Lyapunov theory [15-18]. In [15], the TV white space bands were used to supplement the bandwidth for the computation offloading of vehicular terminals. The computation offloading and bandwidth allocation decisions were jointly optimized to balance the task delay and the cost of the TV white space bands. In [16], the extreme value theory and Lyapunov theory were employed to analyze the tail distribution of the age of information in a given vehicular network. A power control scheme was proposed to guarantee the mean delay requirement. In [17], a vehicle-centric approach was designed to optimize the node association and resource reallocation by taking the additional latency caused by the overhead into account. In [18], the long-term time-averaged total system capacity was maximized while satisfying the strict ultrareliable and low-latency requirements of vehicle communications. Generally, the mean delay is leveraged to characterize delay-tolerant traffic. However, there are many types of delay-sensitive traffic in vehicular networks, where the positions of vehicles, wireless channel states, and traffic arrival intervals are all highly dynamic. Hence, the statistical delay guarantee is more useful for practical vehicular networks. In [19], the capacity of V2I links was maximized under a given delay and delay violation probability requirement, where the closed-form power allocation solution was derived for each V2I and V2V reuse link. However, [19] only considered the delay requirement of V2V links.
To the best of our knowledge, how to provide the delay guarantee for both V2I and V2V links at the same time is still an open problem. In addition, deep learning-based techniques are becoming more and more popular in wireless communications [20]. In [21], the authors integrated a convolutional neural network and a long short-term memory network to predict the channel state information. In [22], the authors constructed a feature learning framework for IoT applications to effectively classify data and detect anomaly events, using an RBF-BP hybrid neural network. In [23], the deep learning assisted optimization methods for resource allocation in vehicular communications were introduced and compared. In [24], a multiagent reinforcement learning framework was proposed for the spectrum sharing in vehicular networks with V2I and V2V links. These studies indicate that deep learning is an effective tool for optimization in wireless communications. Hence, in this paper, we propose a hybrid deep transfer learning scheme to address the aforementioned problem in vehicular communications.

System Model
We consider a multivehicle single-cell network as depicted in Figure 1, where there are $M$ vehicles as CUEs and $N$ ($N \le M$) pairs of proximate vehicles as VUEs. The CUEs transmit information to the BS through V2I communications with orthogonal channels, while the VUEs employ V2V communications to send and receive data by sharing the spectrum resource with CUEs. The total bandwidth of the considered network is $B_{tot}$. We assume that each CUE can only occupy one channel at a time, and a channel can only be allocated to one CUE. Hence, the channel bandwidth allocated to a CUE can be denoted as $B = B_{tot}/M$. In order to avoid strong interference between V2I and V2V links, each VUE can only reuse one channel, and each channel can only be shared with one VUE. For notational convenience, we use $\mathcal{M} = \{1, 2, \ldots, m, \ldots, M\}$ and $\mathcal{N} = \{1, 2, \ldots, n, \ldots, N\}$ to denote the sets of CUEs and VUEs, respectively. In addition, all the CUEs and VUEs are equipped with a single antenna.

Communication Model.
We denote the channel power gain from the $m$th CUE to the BS by $g_m^C = \varphi_m^C h_m^C$, where $\varphi_m^C$ and $h_m^C$ characterize the large-scale and small-scale fading components, respectively. The large-scale fading parameter $\varphi_m^C$ can be further modeled as the product of the path loss constant, the random log-normal shadowing parameter $\omega_m^C$, and the distance term $l^{-\alpha}$, where $l$ represents the distance between the CUE and the BS and $\alpha$ is the power decay exponent. Similarly, we use $g_n^V = \varphi_n^V h_n^V$, $g_{n,B} = \varphi_{n,B} h_{n,B}$, and $g_{m,n} = \varphi_{m,n} h_{m,n}$ to represent the channel power gain of the $n$th VUE, the interference power gain from the $n$th VUE to the BS, and the interference power gain from the $m$th CUE to the $n$th VUE, respectively. In addition, due to the high mobility of vehicles and the varying delay requirement of different data traffic, it is impractical for the BS to always obtain the instantaneous small-scale fading information. However, the statistical information is easily accessible by the BS from the feedback of vehicles within hundreds of time slots. Hence, in this paper, we assume that all the CUEs and VUEs undergo small-scale Rayleigh fading. In other words, the small-scale fading parameters $h_m^C$, $h_n^V$, $h_{n,B}$, and $h_{m,n}$ are independent and exponentially distributed with unit mean in each time slot.
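The fading model above can be sketched numerically. The snippet below is a minimal illustration, not the authors' simulator: it builds one large-scale gain from a path loss constant, a log-normal shadowing draw, and a distance term, then samples unit-mean exponential small-scale gains per time slot. The function names and all numeric values (path loss constant, 8 dB shadowing deviation, 100 m distance, decay exponent 3) are assumptions chosen only for illustration.

```python
import random


def large_scale_gain(path_loss_const, shadowing_std_db, distance_m, decay_exp, rng):
    """Path loss constant x log-normal shadowing x distance^(-alpha)."""
    shadowing_db = rng.gauss(0.0, shadowing_std_db)   # shadowing in dB
    shadowing = 10.0 ** (shadowing_db / 10.0)         # convert to linear scale
    return path_loss_const * shadowing * distance_m ** (-decay_exp)


def sample_power_gains(phi, num_slots, rng):
    """Draw g = phi * h per slot, with h ~ Exp(1) (power gain under Rayleigh fading)."""
    return [phi * rng.expovariate(1.0) for _ in range(num_slots)]


rng = random.Random(0)
# Hypothetical parameter values, not taken from the paper.
phi = large_scale_gain(1e-3, 8.0, 100.0, 3.0, rng)
gains = sample_power_gains(phi, 20000, rng)

# The empirical mean of g / phi should be close to the unit mean of h.
mean_ratio = sum(gains) / len(gains) / phi
```

Because only the statistics of the small-scale fading are assumed known at the BS, samples like these are what a Monte Carlo evaluation of the delay constraints would average over.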

Wireless Communications and Mobile Computing
As the channel (interference) power gain is time-varying for both V2I and V2V links, the considered network should make a decision on the power management and spectrum sharing for all CUEs and VUEs whenever the statistical channel information changes. The transmission power of the $m$th CUE and the $n$th VUE is denoted by $p_m^C$ and $p_n^V$, respectively. Besides, binary indicator $\tau_{m,n}$ is employed to characterize the channel reuse between the $m$th CUE and the $n$th VUE, where $\tau_{m,n} = 1$ means that the $m$th CUE and the $n$th VUE share the same channel. As a result, the signal to interference plus noise ratio (SINR) of the $m$th CUE and the $n$th VUE holds as
\[ \gamma_m^C = \frac{p_m^C g_m^C}{N_0 B + \sum_{n \in \mathcal{N}} \tau_{m,n} p_n^V g_{n,B}}, \qquad \gamma_n^V = \frac{p_n^V g_n^V}{N_0 B + \sum_{m \in \mathcal{M}} \tau_{m,n} p_m^C g_{m,n}}, \tag{1} \]
where $N_0$ denotes the power spectral density of background noise. According to Shannon's theorem, the channel capacity of the $m$th CUE and the $n$th VUE in each time slot can be obtained as
\[ R_m^C = B \log_2 (1 + \gamma_m^C), \qquad R_n^V = B \log_2 (1 + \gamma_n^V). \]
In order to characterize the delay performance more intuitively, we model the delay metric according to the philosophy behind 5G ultrareliable low latency communications (uRLLC). Specifically, statistical delay characteristics are analyzed in this paper, as shown in
\[ \Pr\{D_m^C > d_m^C\} \le \varepsilon_m^C, \qquad \Pr\{D_n^V > d_n^V\} \le \varepsilon_n^V, \]
where $D_m^C$ and $D_n^V$ denote the traffic delay of the $m$th CUE and the $n$th VUE. For the $m$th CUE, its statistical delay performance means that the probability of the traffic delay exceeding threshold $d_m^C$ should be kept below $\varepsilon_m^C$, which also holds for the $n$th VUE. In a V2X network, a vehicle sustaining a higher traffic arrival rate under a specific delay requirement is able to update its information to other vehicles or the infrastructure in a more timely manner. Let $\lambda_m^C$ ($m \in \mathcal{M}$) and $\lambda_n^V$ ($n \in \mathcal{N}$) denote the maximum arrival rate (i.e., traffic throughput) sustained by the $m$th CUE and the $n$th VUE under the delay requirement, respectively. We model the traffic throughput under the delay requirement as the largest arrival rate for which the corresponding delay violation constraint still holds. In this paper, we aim to maximize the traffic throughput for the considered network under diverse delay requirements through optimizing the power and spectrum allocation for each CUE and VUE.
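The optimization problem regarding resource allocation can be formulated as
\[
\begin{aligned}
\text{P1:} \quad \max_{\{p_m^C\}, \{p_n^V\}, \{\tau_{m,n}\}} \quad & \sum_{m \in \mathcal{M}} \lambda_m^C + \sum_{n \in \mathcal{N}} \lambda_n^V \\
\text{s.t.} \quad & \text{C1: } \Pr\{D_m^C > d_m^C\} \le \varepsilon_m^C, \ \forall m \in \mathcal{M}, \\
& \text{C2: } \Pr\{D_n^V > d_n^V\} \le \varepsilon_n^V, \ \forall n \in \mathcal{N}, \\
& \text{C3: } 0 \le p_m^C \le p_{\max}^C, \ \forall m \in \mathcal{M}, \\
& \text{C4: } 0 \le p_n^V \le p_{\max}^V, \ \forall n \in \mathcal{N}, \\
& \text{C5: } \sum_{n \in \mathcal{N}} \tau_{m,n} \le 1, \ \forall m \in \mathcal{M}, \\
& \text{C6: } \sum_{m \in \mathcal{M}} \tau_{m,n} \le 1, \ \forall n \in \mathcal{N}, \quad \tau_{m,n} \in \{0, 1\}.
\end{aligned}
\]
In P1, C1 and C2 represent the delay constraints on the traffic delays $D_m^C$ and $D_n^V$ of CUEs and VUEs, respectively. C3 and C4 constrain the transmission power range of CUEs and VUEs, respectively. C5 and C6 are the spectrum sharing constraints ensuring that the channel of each CUE is reused by at most one VUE and each VUE reuses the channel of at most one CUE. Considering C5 and C6, P1 is a mixed integer nonlinear programming (MINLP) problem that cannot be solved by traditional convex optimization approaches. Moreover, due to the lack of tractable expressions to characterize C1 and C2, P1 is more challenging to solve than MINLP problems in which the traffic throughput is modeled by the channel capacity without considering delay requirements. Therefore, we propose a hybrid deep transfer learning method to achieve the optimal resource allocation.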
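To make the interference coupling in the system model concrete, the following minimal sketch evaluates the SINR and Shannon capacity of one CUE and one VUE. It assumes the standard form in which the noise power is the noise density times the channel bandwidth and the only interferer is the reuse partner selected by the binary indicator. The function names and every numeric value are illustrative assumptions, not the paper's simulation settings.

```python
import math


def sinr_cue(p_c, g_c, p_v, g_nb, tau, noise_psd, bandwidth):
    """SINR of a CUE at the BS; the VUE interferes only when tau == 1."""
    return p_c * g_c / (noise_psd * bandwidth + tau * p_v * g_nb)


def sinr_vue(p_v, g_v, p_c, g_mn, tau, noise_psd, bandwidth):
    """SINR of a VUE receiver; the CUE interferes only when tau == 1."""
    return p_v * g_v / (noise_psd * bandwidth + tau * p_c * g_mn)


def capacity(bandwidth, sinr):
    """Shannon capacity in bit/s for one channel."""
    return bandwidth * math.log2(1.0 + sinr)


# Hypothetical numbers: 2 MHz channel, 1e-17 W/Hz noise density, 0.1 W transmit power.
B = 2e6
N0 = 1e-17
shared = sinr_cue(0.1, 1e-9, 0.1, 1e-10, 1, N0, B)   # channel reused by a VUE
alone = sinr_cue(0.1, 1e-9, 0.1, 1e-10, 0, N0, B)    # channel not reused
rate_shared = capacity(B, shared)
rate_alone = capacity(B, alone)
```

Comparing `rate_shared` with `rate_alone` shows the price of reuse on a single link; the optimization trades this loss against the throughput gained by admitting the VUE.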

Joint Power Allocation and CUE-VUE Association Optimization
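As the interference only exists in a channel that is reused by a CUE and a VUE, P1 can be decomposed into a power allocation subproblem P2 for a given CUE-VUE pair and a CUE-VUE association subproblem P3 for a given power allocation:
\[ \text{P2:} \quad \max_{p_m^C, p_n^V} \ \lambda_m^C + \lambda_n^V \quad \text{s.t. C1, C2, C3, C4}, \]
\[ \text{P3:} \quad \max_{\{\tau_{m,n}\}} \ \sum_{m \in \mathcal{M}} \sum_{n \in \mathcal{N}} \tau_{m,n} \Gamma_{m,n}^{*} \quad \text{s.t. C5, C6}, \tag{10} \]
where $\Gamma_{m,n}^{*}$ denotes the maximum traffic throughput of the pair formed by the $m$th CUE and the $n$th VUE obtained from P2. According to the effective capacity theory, the delay violation probabilities of the traffic delays $D_m^C$ and $D_n^V$ can be characterized as
\[ \Pr\{D_m^C > d_m^C\} \approx e^{-\theta_m^C \beta_m^C(\theta_m^C) d_m^C}, \tag{11} \]
\[ \Pr\{D_n^V > d_n^V\} \approx e^{-\theta_n^V \beta_n^V(\theta_n^V) d_n^V}, \]
where $\beta_m^C$ and $\beta_n^V$ denote the effective capacity and $\theta_m^C$ and $\theta_n^V$ are the nonnegative QoS exponential parameters that can be further optimized. For a stable vehicular network, the effective capacity of CUE and VUE can be calculated as [25]
\[ \beta_m^C(\theta_m^C) = -\frac{1}{\theta_m^C} \ln E\left[e^{-\theta_m^C R_m^C}\right], \tag{13} \]
\[ \beta_n^V(\theta_n^V) = -\frac{1}{\theta_n^V} \ln E\left[e^{-\theta_n^V R_n^V}\right]. \]
Combining (11) and (13), we have
\[ \Pr\{D_m^C > d_m^C\} \approx \left(E\left[e^{-\theta_m^C R_m^C}\right]\right)^{d_m^C}. \tag{15} \]
From (15), the delay violation probability of CUE can be reduced by increasing $\theta_m^C$. However, according to (13), the effective capacity of CUE decreases with $\theta_m^C$, which implies that only a low $\lambda_m^C$ can be guaranteed. Similar results can be derived for VUE. Hence, P2 can be transformed into the following feasible problem:
\[
\begin{aligned}
\text{P4:} \quad \max_{p_m^C, p_n^V, \theta_m^C, \theta_n^V} \quad & \beta_m^C(\theta_m^C) + \beta_n^V(\theta_n^V) \\
\text{s.t.} \quad & \text{C1: } \left(E\left[e^{-\theta_m^C R_m^C(p_m^C, p_n^V)}\right]\right)^{d_m^C} \le \varepsilon_m^C, \\
& \text{C2: } \left(E\left[e^{-\theta_n^V R_n^V(p_m^C, p_n^V)}\right]\right)^{d_n^V} \le \varepsilon_n^V, \\
& \text{C3: } 0 \le p_m^C \le p_{\max}^C, \\
& \text{C4: } 0 \le p_n^V \le p_{\max}^V.
\end{aligned}
\]
To solve P4, the following theorem is derived.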

Theorem 1.
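If $\{p_m^{C*}, p_n^{V*}, \theta_m^{C*}, \theta_n^{V*}\}$ denotes the optimal solution for P4, the following equations must hold:
\[ \left(E\left[e^{-\theta_m^{C*} R_m^C(p_m^{C*}, p_n^{V*})}\right]\right)^{d_m^C} = \varepsilon_m^C, \qquad \left(E\left[e^{-\theta_n^{V*} R_n^V(p_m^{C*}, p_n^{V*})}\right]\right)^{d_n^V} = \varepsilon_n^V. \tag{18} \]
Proof. Firstly, we assume that for the optimal solution $\{p_m^{C*}, p_n^{V*}, \theta_m^{C*}, \theta_n^{V*}\}$, the delay constraint of the CUE is loose, i.e.,
\[ \left(E\left[e^{-\theta_m^{C*} R_m^C(p_m^{C*}, p_n^{V*})}\right]\right)^{d_m^C} < \varepsilon_m^C. \]
According to (13), the effective capacity of CUE, i.e., $\beta_m^C(\theta_m^C)$, is a continuously decreasing function in $\theta_m^C$, while the delay violation probability $\left(E\left[e^{-\theta_m^C R_m^C(p_m^C, p_n^V)}\right]\right)^{d_m^C}$ is also a continuously decreasing function in $\theta_m^C$. As a result, there always exists $\tilde{\theta}_m^C = \sigma \theta_m^{C*} < \theta_m^{C*}$ ($\sigma \to 1^-$) meeting the delay constraint as
\[ \left(E\left[e^{-\tilde{\theta}_m^C R_m^C(p_m^{C*}, p_n^{V*})}\right]\right)^{d_m^C} \le \varepsilon_m^C. \]
Also,
\[ \beta_m^C(\tilde{\theta}_m^C) > \beta_m^C(\theta_m^{C*}). \]
Hence, power allocation scheme $\{p_m^{C*}, p_n^{V*}, \tilde{\theta}_m^C, \theta_n^{V*}\}$ guarantees a higher traffic throughput than $\{p_m^{C*}, p_n^{V*}, \theta_m^{C*}, \theta_n^{V*}\}$ under the delay and power constraints, which is a contradiction. On the other hand, we assume that for the optimal solution $\{p_m^{C*}, p_n^{V*}, \theta_m^{C*}, \theta_n^{V*}\}$, the following inequality holds:
\[ \left(E\left[e^{-\theta_n^{V*} R_n^V(p_m^{C*}, p_n^{V*})}\right]\right)^{d_n^V} < \varepsilon_n^V. \]
A similar contradiction to the assumption can then be observed. As a result, the optimal power allocation must meet C1 and C2 with equality.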
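According to (18), the effective capacities of CUE and VUE can be further simplified. Taking the logarithm of both sides of (18) gives $\ln E\left[e^{-\theta_m^C R_m^C}\right] = \ln \varepsilon_m^C / d_m^C$, and substituting this into the effective capacity expression (13) yields
\[ \beta_m^C = -\frac{\ln \varepsilon_m^C}{\theta_m^C d_m^C}, \qquad \beta_n^V = -\frac{\ln \varepsilon_n^V}{\theta_n^V d_n^V}. \]
Therefore, P4 can be further transformed to
\[
\begin{aligned}
\text{P5:} \quad \max_{p_m^C, p_n^V} \quad & \Gamma_{m,n}(p_m^C, p_n^V) = -\frac{\ln \varepsilon_m^C}{\theta_m^C d_m^C} - \frac{\ln \varepsilon_n^V}{\theta_n^V d_n^V} \\
\text{s.t.} \quad & \text{C1: } \left(E\left[e^{-\theta_m^C R_m^C(p_m^C, p_n^V)}\right]\right)^{d_m^C} = \varepsilon_m^C, \\
& \text{C2: } \left(E\left[e^{-\theta_n^V R_n^V(p_m^C, p_n^V)}\right]\right)^{d_n^V} = \varepsilon_n^V, \\
& \text{C3: } 0 \le p_m^C \le p_{\max}^C, \\
& \text{C4: } 0 \le p_n^V \le p_{\max}^V.
\end{aligned} \tag{31}
\]
Note that $\theta_m^C$ and $\theta_n^V$ can be directly obtained according to C1 and C2 of (31) once transmission power $\{p_m^C, p_n^V\}$ is determined. Another theorem is proposed to further optimize the power allocation.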

Theorem 2.
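The optimal solution to P5 always satisfies either $p_m^{C*} = p_{\max}^C$ or $p_n^{V*} = p_{\max}^V$.
Proof. Firstly, we denote the optimal power allocation by $\{p_m^{C*}, p_n^{V*}\}$ and assume the following two expressions hold at the same time:
\[ p_m^{C*} < p_{\max}^C, \qquad p_n^{V*} < p_{\max}^V. \]
Additionally, let $p_m^{C*} < \hat{p}_m^C = \xi p_m^{C*} < p_{\max}^C$ and $p_n^{V*} < \hat{p}_n^V = \xi p_n^{V*} < p_{\max}^V$, where $\xi \to 1^+$. According to (1), scaling both powers by $\xi$ scales the signal and interference terms but not the noise term, so the SINRs $\gamma_m^C$ and $\gamma_n^V$ satisfy
\[ \gamma_m^C(\hat{p}_m^C, \hat{p}_n^V) > \gamma_m^C(p_m^{C*}, p_n^{V*}), \qquad \gamma_n^V(\hat{p}_m^C, \hat{p}_n^V) > \gamma_n^V(p_m^{C*}, p_n^{V*}). \]
Since the channel capacity increases with the SINR, and, according to the delay constraints C1 and C2 in P5, $\theta_m^C$ is decreasing with $R_m^C$ and $\theta_n^V$ is decreasing with $R_n^V$, we have
\[ \Gamma_{m,n}(\hat{p}_m^C, \hat{p}_n^V) > \Gamma_{m,n}(p_m^{C*}, p_n^{V*}). \]
Hence, power allocation scheme $\{\hat{p}_m^C, \hat{p}_n^V\}$ guarantees a higher traffic throughput than $\{p_m^{C*}, p_n^{V*}\}$ under the delay and power constraints, which contradicts the assumption. Furthermore, for any power allocation $\{p_m^C < p_{\max}^C, p_n^V < p_{\max}^V\}$, we can improve the traffic throughput under C1-C6 by increasing $p_m^C$ and $p_n^V$ in equal proportion until one of them reaches the corresponding maximum. Consequently, Theorem 2 is proved.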
From Theorem 2, P5 is decomposed into two one-dimensional optimization problems, i.e., optimizing $p_n^V$ to maximize $\Gamma_{m,n}(p_{\max}^C, p_n^V)$ and optimizing $p_m^C$ to maximize $\Gamma_{m,n}(p_m^C, p_{\max}^V)$. Thereafter, by comparing the corresponding optimal $\Gamma_{m,n}^{*}(p_{\max}^C, p_n^V)$ with $\Gamma_{m,n}^{*}(p_m^C, p_{\max}^V)$, we can choose the greater one as the optimal power allocation solution for P5. Note that in the procedure of solving these two one-dimensional optimization problems, the optimal $\theta_m^C$ and $\theta_n^V$ can be derived by using the bisection method on C1 and C2 in P5. Algorithm 1 summarizes how to ascertain the optimal power allocation for a given CUE-VUE pair. Notably, given power accuracy $\Delta p$, if C1 and C2 in P5 can be solved analytically, the computation complexity of Algorithm 1 is $O((p_{\max}^C + p_{\max}^V)/\Delta p)$. Otherwise, according to the bisection method, the computation complexity is $O(\log_2(\theta_{\max})(p_{\max}^C + p_{\max}^V)/\Delta p)$, where $\theta_{\max}$ denotes the maximum value of the QoS exponent. Typically, such a maximum value is small, and thus, the bisection method can converge rapidly.
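The procedure described above (two one-dimensional sweeps, with a bisection on the QoS exponent at each power level) can be sketched as follows. This is a minimal illustration under stated assumptions rather than the authors' Algorithm 1: the expectation is replaced by a Monte Carlo average over sampled Rayleigh fading, rates use a normalized bandwidth with delays counted in slots, and every name and number in `link` is hypothetical. The solver also assumes at least one sampled rate is positive, which is why the power grid starts at one step rather than zero.

```python
import math
import random


def solve_theta(rates, delay, eps, iters=60):
    """Bisection for the theta at which (mean(exp(-theta * R))) ** delay == eps."""
    def gap(theta):
        mean = sum(math.exp(-theta * r) for r in rates) / len(rates)
        return delay * math.log(mean) - math.log(eps)

    lo, hi = 0.0, 1.0
    while gap(hi) > 0.0:            # grow the bracket until the constraint is met
        lo, hi = hi, 2.0 * hi
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if gap(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return hi


def pair_throughput(p_c, p_v, fading, link):
    """Sum of both effective capacities when both delay constraints are tight."""
    rates_c, rates_v = [], []
    for h_c, h_v, h_nb, h_mn in fading:
        sinr_c = p_c * link["phi_c"] * h_c / (link["noise"] + p_v * link["phi_nb"] * h_nb)
        sinr_v = p_v * link["phi_v"] * h_v / (link["noise"] + p_c * link["phi_mn"] * h_mn)
        rates_c.append(math.log2(1.0 + sinr_c))   # normalized bandwidth, per slot
        rates_v.append(math.log2(1.0 + sinr_v))
    theta_c = solve_theta(rates_c, link["d_c"], link["eps_c"])
    theta_v = solve_theta(rates_v, link["d_v"], link["eps_v"])
    return (-math.log(link["eps_c"]) / (theta_c * link["d_c"])
            - math.log(link["eps_v"]) / (theta_v * link["d_v"]))


def allocate_power(p_c_max, p_v_max, step, fading, link):
    """Search only the two edges permitted by Theorem 2."""
    def levels(p_max):
        return [step * k for k in range(1, round(p_max / step) + 1)]

    candidates = [(p_c_max, p_v) for p_v in levels(p_v_max)]
    candidates += [(p_c, p_v_max) for p_c in levels(p_c_max)]
    scored = [(pair_throughput(p_c, p_v, fading, link), p_c, p_v)
              for p_c, p_v in candidates]
    return max(scored)              # (best throughput, p_c, p_v)


rng = random.Random(1)
fading = [tuple(rng.expovariate(1.0) for _ in range(4)) for _ in range(400)]
# Hypothetical large-scale gains, noise power, delay bounds (slots), and violation targets.
link = {"phi_c": 1.0, "phi_v": 2.0, "phi_nb": 0.1, "phi_mn": 0.1, "noise": 0.1,
        "d_c": 5, "eps_c": 1e-3, "d_v": 5, "eps_v": 1e-5}
best_gamma, best_p_c, best_p_v = allocate_power(1.0, 1.0, 0.2, fading, link)
```

The number of throughput evaluations grows with the number of power levels on the two edges, which matches the complexity argument in the text; each evaluation adds a fixed number of bisection steps. A useful sanity check is that for a constant rate the effective capacity returned by the tight constraint equals that rate.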

Deep Learning-Based CUE-VUE Matching.
After obtaining the optimal power allocation and the maximum traffic throughput under the delay constraints of each CUE-VUE pair, we further propose a supervised deep learning approach to solve P3 in (10). In order to get the training labels, we apply the Hungarian algorithm to optimize the CUE-VUE matching, and the deep learning model should be properly trained to guarantee high accuracy. As the number of CUEs may be greater than that of VUEs, i.e., $M > N$, there are $(M - N)$ channels that are not reused by any VUE. In order to maximize the traffic throughput of those $(M - N)$ CUEs under their delay requirements, we introduce a set of virtual VUEs, denoted by $\mathcal{N}'$ and defined as
\[ \mathcal{N}' = \{N + 1, N + 2, \ldots, M\}. \]
Also, for a given CUE-VUE pair, if the VUE is a virtual VUE, we fix its transmission power as 0, and then, (31) can be transformed into a simple power optimization problem for a single CUE. It is easy to verify that the effective capacity of the CUE is an increasing function of transmission power $p_m^C$. Hence, the optimal power allocation for this CUE-VUE pair can be obtained as $\{p_{\max}^C, 0\}$, and the maximum traffic throughput can be calculated by
\[ \Gamma_{m,n}^{*} = -\frac{\ln \varepsilon_m^C}{\theta_m^C d_m^C}, \]
where $\theta_m^C$ can be obtained by solving
\[ \left(E\left[e^{-\theta_m^C R_m^C(p_{\max}^C, 0)}\right]\right)^{d_m^C} = \varepsilon_m^C. \]
Therefore, P3 can be transformed into
\[
\begin{aligned}
\text{P6:} \quad \max_{\{\tau_{m,n}\}} \quad & \sum_{m \in \mathcal{M}} \sum_{n \in \mathcal{N} \cup \mathcal{N}'} \tau_{m,n} \Gamma_{m,n}^{*} \\
\text{s.t.} \quad & \sum_{n \in \mathcal{N} \cup \mathcal{N}'} \tau_{m,n} = 1, \ \forall m \in \mathcal{M}, \\
& \sum_{m \in \mathcal{M}} \tau_{m,n} = 1, \ \forall n \in \mathcal{N} \cup \mathcal{N}', \quad \tau_{m,n} \in \{0, 1\}.
\end{aligned}
\]
Note that P6 is a bipartite matching problem [26] and can be solved optimally by the classical Hungarian algorithm. Specifically, the Hungarian algorithm is a sequential and combinatorial optimization algorithm first proposed to solve assignment problems [27]. The computation complexity of the Hungarian algorithm calculating the optimal CUE-VUE pairing $\{\tau_{m,n}^{*}\}$ is $O(M^3)$, which is prohibitively high when the number of CUEs is large. Hence, we develop a supervised deep learning approach to solve P6 after obtaining the offline labels from the Hungarian algorithm.
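The label generation step can be illustrated on a toy instance. The sketch below pads a rectangular throughput matrix with virtual VUE columns and then finds the best one-to-one matching. Note that it uses exhaustive search over permutations as a simple stand-in for the Hungarian algorithm, which is only practical for very small M; the function names and the numbers in the example are hypothetical.

```python
from itertools import permutations


def pad_with_virtual_vues(gamma, gamma_alone):
    """Append (M - N) virtual VUE columns to an M x N throughput matrix.

    gamma_alone[m] is the throughput of CUE m when its channel is not reused,
    i.e., when it is paired with a virtual VUE transmitting at zero power.
    """
    num_cues, num_vues = len(gamma), len(gamma[0])
    return [list(row) + [gamma_alone[m]] * (num_cues - num_vues)
            for m, row in enumerate(gamma)]


def best_matching(square):
    """Brute-force optimal assignment (stand-in for the Hungarian algorithm)."""
    size = len(square)

    def total(perm):
        return sum(square[m][perm[m]] for m in range(size))

    best = max(permutations(range(size)), key=total)
    return list(best), total(best)


# Toy example: 3 CUEs, 2 real VUEs, so one virtual VUE (column index 2) is added.
gamma = [[7, 1],
         [2, 4],
         [3, 3]]
gamma_alone = [6, 5, 1]
padded = pad_with_virtual_vues(gamma, gamma_alone)
matching, throughput = best_matching(padded)
# matching[m] is the column (real or virtual VUE) assigned to CUE m.
```

For realistic sizes, a polynomial-time solver such as SciPy's `scipy.optimize.linear_sum_assignment` with `maximize=True` could replace `best_matching`; the padding step stays the same.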
Firstly, we generate channel realizations with random positions of CUEs and VUEs. For each channel realization, $10^6$ small-scale fading realizations are generated to solve C1 and C2 in P5. Then, we calculate the maximum traffic throughput $\Gamma_{m,n}^{*}$ sustained for each CUE-VUE pair and form a throughput matrix $\{\Gamma_{m,n}^{*}\}$. The total number of training samples $K_{train}$ is $5 \times 10^4$. In each sample, the traffic throughput for a CUE-VUE pair varies over a large value range and may be very different from those in other samples, so that it takes a long time to obtain the optimal training parameters. As a result, we normalize $\Gamma_{m,n}^{*}$ for each training sample as in (32). For each training sample, we can easily deduce the corresponding label $\{\tau_{m,n}^{*}\}$. Note that there are $M$ "1" and $M(M - 1)$ "0" elements in each $\{\tau_{m,n}^{*}\}$. This implies that each label is quite sparse, which may lead to poor training performance. Hence, we focus on the position of the "1" element for each $m \in \mathcal{M}$ and use a number from 1 to $M$ to represent it. For example, when "1" is spotted at the 5th column of the considered row, that row can then be characterized by 5. Therefore, the sparsity of the labels can be avoided.
Algorithm 1: Optimal power allocation algorithm.
1. Initialize statistical information of small-scale fading, locations of CUEs and VUEs, $p_m^C = 0$, $p_n^V = 0$, $p_m^{C*} = 0$, $p_n^{V*} = 0$, optimal traffic throughput $\Gamma_{m,n}^{*} = 0$, power accuracy $\Delta p$;
In the model training stage, we construct a fully connected neural network (FNN) with $K_{FNN} = 5$ layers. In each layer, there are some neurons to be optimized and one activation function to introduce the nonlinear characteristics, as depicted in Figure 2. In detail, there are 1000 neurons in each middle (hidden) layer, and the ReLU activation function is employed. Besides, the input and output vectors of the $k$th layer are denoted by $x_k$ and $y_k$, and we have

\[ y_k = \mathrm{ReLU}(W_k x_k + b_k) = \max(0, W_k x_k + b_k), \]
where $W_k$ and $b_k$ are the weight matrix and bias vector in the $k$th layer. In order to predict the CUE-VUE pair matching from 1 to $M$, the output layer has $M$ outputs. The training parameters of the FNN are initialized with Gaussian variables with zero mean and unit variance. In each epoch, a batch of training samples is randomly chosen from all training samples for parameter training. The loss function is defined over the labels and the network outputs, where $y_{i,j}$ denotes the $j$th element of the label in the $i$th sample and $\hat{y}_{i,j}$ denotes the training result. The training parameters can be optimized by the Adam algorithm to minimize the loss function [28]. The deep learning-based CUE-VUE pairing can be summarized by Algorithm 2.
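The dense label representation used for training can be sketched in a few lines. The encoder below maps each row of a 0/1 matching matrix to the 1-based column of its "1" entry, as described in the text. The decoder, which rounds real-valued network outputs to the nearest valid column, is an assumption of this sketch, since the text does not specify how outputs are mapped back to a matching.

```python
def encode_label(tau):
    """Map an M x M 0/1 matching matrix to M column indices in 1..M."""
    return [row.index(1) + 1 for row in tau]


def to_matrix(columns):
    """Inverse of encode_label: rebuild the sparse 0/1 matching matrix."""
    size = len(columns)
    return [[1 if columns[m] == n + 1 else 0 for n in range(size)]
            for m in range(size)]


def decode_prediction(outputs):
    """Assumed decoder: round each output and clip it into the range 1..M."""
    size = len(outputs)
    return [min(size, max(1, round(y))) for y in outputs]


tau = [[0, 0, 1],
       [1, 0, 0],
       [0, 1, 0]]
label = encode_label(tau)                       # one integer per CUE
restored = to_matrix(label)                     # round trip back to the matrix
predicted = decode_prediction([2.9, 1.2, 1.8])  # hypothetical FNN outputs
```

Rounding each output independently can produce the same column twice, so a practical decoder would need a repair step to restore a valid one-to-one matching; that step is omitted here.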

Deep Transfer Learning for New Scenarios.
As the FNN is trained offline, it works well only for a V2V network with the identical data distribution. However, in practical V2V networks, the traffic arrivals, channel fading, and positions of vehicles are highly dynamic and nonstationary. Since we only train the FNN with the maximum traffic throughput, hidden parameters such as the positions of vehicles and large-scale channel fading are missing from the input but have significant impacts on both the throughput matrix and the optimal CUE-VUE matching. If the distribution of these parameters changes, the throughput matrix will be affected, and the offline FNN will no longer perform well.
To address the mismatch problem, one potential approach is to retrain the FNN for each new scenario. However, it is hard to acquire enough training samples from a new scenario, and the traffic of both CUEs and VUEs is delay-sensitive. Hence, we resort to deep transfer learning to overcome the mismatch, especially when new training samples cannot be effectively obtained within a short time. The transfer learning framework is depicted in Figure 3. Specifically, we choose the offline trained model $\{W, b\}$ as the initial parameter setting. In addition, fine-tuning is employed to adjust the model parameters for the new scenario.
Algorithm 2: Deep learning-based CUE-VUE pairing.
3. Obtain the maximum traffic throughput $\{\Gamma_{m,n}^{*}\}$ sustained for each CUE-VUE pair from Algorithm 1;
4. Calculate the optimal matching scheme $\{\tau_{m,n}^{*}\}$ as training labels;
5. Normalize the training samples $\{\Gamma_{m,n}^{*}\}$ according to (32);
6. Convert the training labels $\{\tau_{m,n}^{*}\}$ to column indices to reduce the sparsity;
7. Train the FNN parameters with data samples until the loss function converges;
8. Output the optimal model.
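The benefit of starting from the offline model can be illustrated with a deliberately tiny stand-in. The sketch below replaces the FNN with a one-feature linear model trained by plain gradient descent, pretrains it on a "source" scenario, and then compares fine-tuning against training from scratch on a slightly shifted "target" scenario with the same small step budget. The model, data, learning rate, and step counts are all assumptions for illustration and do not reproduce the paper's network or optimizer.

```python
def mse(params, data):
    """Mean squared error of the linear model y = w * x + b."""
    w, b = params
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)


def train(params, data, steps, lr=0.1):
    """Plain gradient descent on the MSE, starting from the given parameters."""
    w, b = params
    n = len(data)
    for _ in range(steps):
        grad_w = sum(2.0 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2.0 * (w * x + b - y) for x, y in data) / n
        w, b = w - lr * grad_w, b - lr * grad_b
    return (w, b)


xs = [0.0, 0.25, 0.5, 0.75, 1.0]
source = [(x, 2.0 * x + 1.0) for x in xs]   # offline scenario with ample training
target = [(x, 2.2 * x + 0.9) for x in xs]   # new scenario with a shifted distribution

pretrained = train((0.0, 0.0), source, steps=2000)    # offline model
fine_tuned = train(pretrained, target, steps=20)      # transfer: warm start
from_scratch = train((0.0, 0.0), target, steps=20)    # no transfer: cold start

warm_loss = mse(fine_tuned, target)
cold_loss = mse(from_scratch, target)
```

With the same 20 update steps, the warm-started model ends with a much smaller target loss than the cold-started one because its initial parameters are already close to the new optimum, which is the intuition behind reusing the offline parameters as the initial setting.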

Simulation Results
In this section, simulation results are presented and discussed. Network parameters and scenarios involved are set as follows, unless otherwise stated. Similar to [14], we set up a simulation scenario with a 6-lane freeway (3 lanes in each direction) passing through a single cell, where the BS is located at the center of the roadside. The lane width is set to 4 m. The vehicles are randomly dropped according to a Poisson point process with density 2.5 s × $v$, where $v$ (km/h) denotes the velocity of the vehicle. Then, we randomly choose CUEs and VUEs. Note that the CUE-VUE pair always includes two adjacent vehicles. In the simulation, the carrier frequency is set to 2 GHz, and the cell radius is set to 500 m. For the BS, the antenna height is set to 25 m, the antenna gain is set to 8 dBi, the receiver noise is set to 5 dB, and the distance to the freeway is set to 35 m. For each vehicle, the antenna height is set to 1.5 m, the antenna gain is set to 3 dBi, the receiver noise is set to 9 dB, and the velocity is set to 60 km/h. The numbers of CUEs and VUEs are both set as $M = N = 5$. The total bandwidth of the considered vehicular network $B_{tot}$ is set to 10 MHz, and thus, the bandwidth for each CUE is $B = 2$ MHz. The delay requirement and the maximum tolerable violation probability for each CUE are set to $d_m^C = 1$ ms and $\varepsilon_m^C = 10^{-3}$. The delay requirement and the maximum tolerable violation probability for each VUE are set to $d_n^V = 1$ ms and $\varepsilon_n^V = 10^{-5}$. Also, the maximum transmission power of each CUE and VUE is set to $p_{\max}^C = p_{\max}^V = 20$ dBm. We simulate 2000 channel realizations and output the average result. Figure 4 depicts the maximum traffic throughput sustained by a CUE-VUE pair under different vehicle velocities.
It is shown that the throughput of both CUE and VUE decreases as the velocity increases. This is because the V2V distance increases with the vehicle velocity. As a result, the data transmission capability of VUEs degrades seriously due to the path loss. Hence, the VUE has to increase its transmitting power to guarantee the low delay requirement, which introduces higher interchannel interference to the CUE. Though the communication distance from the CUE to the BS changes slightly, the high interference from the VUE still affects the traffic throughput of the CUE.
[...] assignment problem with offline data samples. It is observed that both training loss and testing loss converge within around 110 epochs. Note that we need to predict $M$ integers whose values are from 1 to $M$, and consequently, the loss is low enough to guarantee the optimal channel assignment. In addition, the training accuracy is also presented in Figure 6, which verifies that our proposed FNN model is effective in solving the channel assignment problem with the testing accuracy above 90%. The model accuracy is expected to improve further if more training samples are fed into the FNN. Figure 7 depicts the traffic throughput supported by different channel assignment schemes. Specifically, the global optimal scheme calculated by the Hungarian algorithm has the complexity of $O(M^3)$. Our proposed FNN is a deep learning-based channel assignment scheme, where the channel capacity of the optimal scheme is leveraged to find the optimal channel assignment that maximizes the network throughput under given delay constraints. The throughput achieved by our proposed FNN is close to that calculated by the Hungarian algorithm. Hence, we can use the trained model to predict the channel assignment rapidly while only incurring slight resource overheads.
In addition, the channel capacity provided by FNN is always higher than the corresponding traffic throughput while satisfying the delay constraint. In short, if the channel assignment aims for the channel capacity maximization, it will lead to a severe overestimation of the traffic throughput. Figure 8 depicts the performance improvement by the deep transfer learning on the FNN training of new scenarios, where the distribution of hidden parameters varies. In the FNN training, the vehicle velocity is set to $v = 100$ km/h, whereas in the transfer learning process, we initialize the parameter as $v = 60$ km/h. It is verified that with the knowledge transfer, our proposed FNN model converges much faster than the model with a random initialization. Hence, the proposed transfer learning scheme can rapidly retrain the channel assignment model for new scenarios and guarantee sufficiently high accuracy.

Conclusion
In this paper, a joint power allocation and spectrum sharing scheme was proposed to maximize the delay-sensitive traffic throughput for vehicular communications. Specifically, the interchannel interference model and traffic delay model were established to derive the optimal power allocation for each CUE-VUE pair. Thereafter, an FNN was designed to deal with the channel assignment problem and speed up the allocation decision. Furthermore, a deep transfer learning scheme was proposed to leverage the offline knowledge to learn new scenarios where hidden parameters were unstable and training samples were insufficient. The effectiveness of the hybrid deep transfer learning scheme was also validated by extensive simulations. The results and analyses revealed that using the channel capacity to characterize the traffic throughput would incur a severe performance overestimation and degrade the traffic delay performance.

Data Availability
The simulation data used to support the findings of this study have not been made available because of the funding constraint.

Conflicts of Interest
The authors declare that they have no conflicts of interest.