Study and Analysis of Multiconnectivity for Ultrareliable and Low-Latency Features in Networks and V2X Communications

Department of Telecommunication Networks and Data Transmission, The Bonch-Bruevich Saint-Petersburg State University of Telecommunications (SPbSUT), Russia
School of Data Science and Technology, Heilongjiang University, Harbin 150080, China
School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, China
Department of Computer Science, Faculty of Computers and Information, Menoufia University, Shibin el Kom 32511, Egypt
LaSTI Laboratory, ENSAK, Sultan Moulay Slimane University, Beni Mellal, Morocco
Department of Mathematics and Computer Science, Faculty of Science, Menoufia University, Al Minufiyah 32511, Egypt

Furthermore, as shown in Figure 1, a comparison of the main parameters of 4G and 5G networks reveals a set of improvements in the key 5G metrics, namely the achievable data rates, density of subscriber devices, energy efficiency, and delivery times [10,11]. However, URLLC implementation requires the delay to be decreased from 10 ms to 1 ms. This is achieved through technical features of the network at the access level [12][13][14][15], in particular by changing the format of transmitted frames, or by changing the structure of the network at other levels [3], primarily by bringing the point of service provision closer to the user (cloud and edge organization) [16][17][18][19][20].
Recently, significant efforts have been made to meet the delay requirements through efficient interaction with base stations [21][22][23]. More specifically, the division of resources (slicing) is used to provide services with different requirements for operating parameters through virtual network organization [24,25].
The organization of multiple (parallel) connections with duplication of transmitted data is considered one of the dominant solutions for fulfilling the requirements for ultralow latency and ultrahigh reliability [3,21,26]. This approach reduces the probability of packet loss to the target level by maintaining redundancy. Moreover, in the overwhelming majority of works devoted to URLLC, reliability is interpreted as the probability of successful data delivery; the requirements for this probability depend on the application and are significantly higher than current ones (e.g., the probability of successful delivery is 0.99999 in [22], while it reaches 0.999999999 in [3]).
Indeed, the probability of data delivery is the ultimate goal of the functioning of the network. However, this probability is influenced not only by probabilistic processes associated with the transmission and reception of messages (i.e., equipment functioning processes) but also by probabilistic processes associated with equipment failures (i.e., failure of network elements), which disrupt the normal functioning of the network.
It should be noted that, in the domestic literature, the concept of reliability is associated with this latter process. However, the process of technical failures (equipment failures) is virtually ignored in the aforementioned literature. In addition, the overwhelming majority of publications cite research results for equipment that is assumed to be fully reliable and free of failures. For example, in [27], hardware reliability is considered together with packet losses, and the requirements are higher than existing ones (e.g., the required probability of a good state in that work is 0.9999).
Consequently, based on this interpretation and the generally accepted approach in the domestic literature, URLLC should be rendered as communications with ultralow losses and delays, as it is defined in Russian. However, this is a narrower definition because, in the absence of additional reservations regarding hardware reliability, the term reliability includes the concept of losses.
More recently, a new multiconnectivity scheme was proposed in [28] for 5G networks, in which failures of virtual network functions, local servers within MEC, and communication links are formulated as an integer problem whose objective is to deploy network slices optimally at minimum cost. An efficient algorithm based on a genetic algorithm is then proposed to solve this problem and derive a near-optimal solution. In [29], end-to-end latency is reduced for downlink communication using cooperation among multiple access points. In addition, proactive radio resource allocation is formulated as an optimization problem for the open-loop uplink. Finally, a Lyapunov optimization-based technique is developed to solve this problem efficiently.
Wireless Communications and Mobile Computing
It is also worth noting that a single cause of data loss (i.e., a hardware or software failure or random events during the transmission process) is very difficult to isolate in a modern network (interference leading to errors, resource shortage, or hardware failure). In other words, the cause is complex, and the more possible causes the model takes into account, the more accurately it describes the real network. It is, therefore, logical to use the term URLLC in its general interpretation and to add to the model factors that take into account the fault tolerance of the equipment.
In the above-mentioned work [27], the concept of hardware reliability is stated, and even numerical requirements for the probability of failure are provided. Motivated by these considerations, in this paper we propose a novel approach based on multiple connections (multiconnectivity) for improving system reliability, in which the hardware reliability of network elements is taken into account. We show that implementing URLLC using multiconnectivity in 5G networks increases the reliability of data delivery but at the same time increases the load on the network. A model is proposed that makes it possible to estimate the optimal number of parallel connections at which the maximum connection reliability is achieved. The distribution function of the message delivery time in the URLLC network is investigated. A simulation model was built, which confirmed the proposed analytical models. The main contributions reported in this study can be summarized as follows:
(i) Multiple connections, obtained by increasing the number of delivery routes and "propagating" data, are considered; they provide increased reliability both by reducing the likelihood of data loss due to congestion and by reducing the likelihood of losses due to technical malfunctions of the route elements
(ii) The duplication (multiplication) of data is analyzed, namely, its impact on the probability of losses, the amount of delay, and the traffic intensity in the network
(iii) The data delivery routes are simulated using a queuing system model with a combined service discipline
(iv) The optimal number of routes, at which the minimum probability of losses is achieved, is defined, and this solution is obtained for both "equal" and "unequal" routes
(v) A simulation model was built, which confirmed the proposed analytical models

Model and Problem Statement
Heterogeneity is a key feature of 5G networks; it enables the implementation of URLLC and allows several (k) parallel connections (routes) to be organized between the customer's device and the point of service delivery using networks of different technologies (see Figure 2). In the general case, these connections can be implemented in different ways and pass through various network elements; therefore, we call them routes below.
Furthermore, the use of multiple routes allows the parallel transmission of one data packet through different network elements. First, this increases the reliability of delivery: the packet is lost only if its copy is lost on every route, so, assuming independent losses, the resulting loss probability is less than the smallest loss probability among the routes used:

$$p = \prod_{i=1}^{k} p_i, \tag{1}$$

where $p_i$ denotes the probability of data loss on the ith route. Second, this makes it possible to reduce the delivery time of the packet, because only the first of the arriving copies is used at the point of service provision (i.e., the delivery time is equal to the minimum time over all routes used):

$$t = \min\{t_1, t_2, \dots, t_k\}, \tag{2}$$

where $t_1, t_2, \dots, t_k$ are the delivery times for each of the k routes. Under the assumption that the delivery times are independent and identically distributed with distribution function $F(t)$, the distribution function of the delivery time for k routes operating in accordance with (2) is

$$F_k(t) = 1 - \left(1 - F(t)\right)^k. \tag{3}$$

From (3), it is easy to see that if $F(t)$ is an exponential distribution, then $F_k(t)$ is also an exponential distribution, whose mathematical expectation is smaller by a factor of k. This qualitative example shows that the expected delivery time decreases in inverse proportion to the number of routes.
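The two effects described above, a loss probability equal to the product of the per-route loss probabilities and a delivery time equal to the minimum over the routes, can be checked numerically. The following is a minimal sketch, assuming independent routes and exponentially distributed delivery times; the function names and parameter values are illustrative and are not taken from the paper.

```python
import math
import random


def system_loss(route_losses):
    """Loss probability when a packet is lost only if every route loses its copy."""
    return math.prod(route_losses)


def mean_min_delay(k, mean_delay, trials=100_000, seed=1):
    """Monte Carlo estimate of E[min(t_1, ..., t_k)] for i.i.d. exponential delays."""
    rng = random.Random(seed)
    rate = 1.0 / mean_delay
    total = 0.0
    for _ in range(trials):
        # Only the first arriving copy is used, so the delay is the minimum.
        total += min(rng.expovariate(rate) for _ in range(k))
    return total / trials


if __name__ == "__main__":
    print("loss over three routes:", system_loss([0.01, 0.02, 0.05]))
    for k in (1, 2, 5):
        estimate = mean_min_delay(k, mean_delay=1.0)
        print(f"k={k}: simulated mean {estimate:.3f}, theory {1.0 / k:.3f}")
```

For exponential delays the simulated mean should approach the single-route mean divided by k, which is the behaviour the text describes.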
It is evident from expression (1) that the probability of data loss also decreases as the number of parallel routes k increases.
However, when considering the routing system in a single communication network, it is also evident that the transmission of the same packet along k routes increases the traffic intensity in this network by a factor of k.
Thus, with an even distribution of traffic along routes, if the system serves the traffic of n users, m of which use URLLC with multiple connections (multiconnectivity), the resulting traffic intensity is defined by expression (4). If the share of URLLC users is η = m/n, then the traffic intensity of each of the routes is given by expression (5), where k is the number of parallel channels, η is the share of clients using URLLC, and $a_i$ is the traffic intensity in the ith route without using URLLC.
Each of the routes is a sequence of several network sections (channels) and, in the general case, can be described as a multiphase queuing system. Here, we assume that the most significant contribution to the probability of losses and the amount of delay is made by only one of the route sections; the influence of the remaining sections is considered negligible. This simplification is probably acceptable if the subscriber access section is considered the most "complex." Under this assumption, we describe the channel model as a single-phase queuing system of the G/G/1/R type. As noted above, data loss can also occur for technical reasons in case of failure of route elements. Consider a model that takes into account both of these factors. A formalized model of k routes between the UEs of multiple users and an SP is shown in Figure 3.
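The displayed expressions for the traffic intensity are not reproduced in this text. One form consistent with the stated definitions of k, η, and the per-route intensity without URLLC is given below as an assumption, not as the authors' exact expression; the hat notation is likewise assumed. It follows from noting that a share 1 − η of the traffic on a route is sent once, while a share η is sent over all k routes:

```latex
\hat{a}_i = a_i\left[(1-\eta) + k\,\eta\right] = a_i\left[1 + \eta\,(k-1)\right]
```

For example, under this assumption, η = 0.1 and k = 3 give a per-route intensity of 1.2 times the intensity without URLLC.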
In this model, each of the routes is a sequence of three elements, two of which (Q and t) describe the queuing system G/G/1/R, and element A takes into account the finite reliability of the route elements.
The probability of packet loss due to the finite size of the buffer in each of the routes is described by the approximate expression (6) from [18], where ρ is the load intensity and R is the buffer size. The value of S in expression (6) depends on the properties of the traffic flow and the service (transmission) time of the data packet and is given by (7), where $C_a$ is the coefficient of variation of the time intervals between the arrivals of traffic packets and $C_b$ is the coefficient of variation of the service (transmission) time of the data packet. For example, for the Poisson flow model, S = 1.
We also assume that the organization of URLLC does not affect the size of the data packets in any way. Under this assumption, expression (5) yields a similar expression (8) for the load intensity in each of the routes. Then, taking into account (6) and (8), we can estimate the probability of losses in each of the k routes, which gives (9). Taking into account (1) and (9), we obtain expression (10) for the probability of losses in a system with k routes. Figure 4 shows the dependence of the probability of packet loss on the number of routes used for parallel packet transmission. In this model, it is assumed that routes are chosen randomly, i.e., traffic is equally likely (evenly) distributed between routes.
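The trade-off between redundancy and added load can be illustrated with a short sketch. Because the approximate loss expression and the load expression are not reproduced in this text, the forms used here are assumptions: `buffer_loss` is a hypothetical stand-in in which S scales the buffer size in the exponent (for S = 1 it reduces to the exact M/M/1 blocking formula with room for R packets), and the per-route load is assumed to grow as ρ(1 + η(k − 1)). The function names are illustrative.

```python
def buffer_loss(rho, R, S=1.0):
    """ASSUMED loss approximation for a G/G/1/R route (not taken from the paper).

    p = (1 - rho) * rho**(S*R) / (1 - rho**(S*R + 1)).
    With S = 1 this is the exact M/M/1 blocking probability for capacity R.
    """
    e = S * R
    if abs(rho - 1.0) < 1e-12:
        return 1.0 / (e + 1.0)  # limit of the expression as rho -> 1
    return (1.0 - rho) * rho**e / (1.0 - rho ** (e + 1.0))


def system_loss(k, rho, eta, R, S=1.0):
    """Loss with k equal routes: every copy must be lost, but each route is busier."""
    rho_k = rho * (1.0 + eta * (k - 1))  # ASSUMED load growth due to duplication
    return buffer_loss(rho_k, R, S) ** k


def best_k(rho, eta, R, S=1.0, k_max=8):
    """Exhaustive search for the number of routes with the smallest loss."""
    return min(range(1, k_max + 1), key=lambda k: system_loss(k, rho, eta, R, S))


if __name__ == "__main__":
    for k in range(1, 9):
        print(k, system_loss(k, rho=0.3, eta=0.5, R=10))
    print("best k:", best_k(rho=0.3, eta=0.5, R=10))
```

With no URLLC share (η = 0) duplication adds no load, so the loss in this sketch keeps falling as k grows; an interior minimum appears only when the duplicated traffic is a noticeable part of the load.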

As shown in the above graph, the probability of losses due to a limited buffer size depends nonlinearly on the number of routes used for sending a packet. This behavior is expected, since an increase in the number of delivery routes k leads to an increase in traffic (as can be seen from (5)). Up to a certain value of k, the decrease in losses provided by the parallel model (1) dominates; beyond that value, the increase in per-route losses described by model (9) dominates.
Thus, there is a number of routes for which system losses are minimal. For the given example, the optimal values of k were 3, 4, and 5 for traffic with S = 1, 0.8, and 0.7, respectively, i.e., this value is different for streams with different properties.
It should be noted that the degree of influence of an increase in the number of routes on the probability of losses sharply decreases with an increase in k. Figure 5 shows the dependence of the coefficient of change in the probability of losses on the number of delivery routes.

As can be seen from the above graph, the greatest decrease in the probability of losses occurs when the second route is switched on (by a factor of 3 to 10). Moreover, the addition of the third route changes the probability of loss by less than half.
As noted above, equipment reliability also affects the likelihood of data loss. We characterize the reliability of the data delivery route by its availability factor, i.e., the probability of a healthy state. In turn, the probability of the healthy state of the route is determined by the probabilities of the healthy state of each of its elements.
The probability of packet loss due to technical failure is defined by expression (11), where $p_d$ is the probability that data is transmitted through the network element and φ is the probability of route failure; $p_d$ is numerically equal to the traffic intensity.
Taking into account (8) and (11), we can determine the probability of packet loss in the ith route due to a technical malfunction, expression (12), where $\varphi_i$ is the probability of failure of the ith route. Taking into account (10) and (12), we can calculate the probability of losing a data packet, expression (13), which depends both on the number of traffic delivery routes and on the probability of a healthy state of each of the routes; here $p_i^{(U)}$ is the probability of data loss due to failure of route elements. Figure 6 shows an example of the dependence of the probability of losses (5) on the number of routes, provided that all the probabilities of route failure are equal: $p_i^{(U)} = 0.05$, $i = 1, \dots, 8$. The choice of a relatively high probability of failure is due to the fact that the elements of the route can be not only the equipment of the telecom operator, which has relatively high reliability, but also user devices in the case of D2D connections, whose operation can be interrupted by a simple battery discharge or a software failure.
From the given example, it can be seen that the probability of data delivery losses has a dependence on the number of routes, similar to (10) with a difference in numerical values, which is due to taking into account the reliability of the route.
The optimal number of routes is determined by the load intensity (8) and flow properties (7). For the given example with S = 1, 0.8, and 0.7, the optimal value is achieved at $\hat{\rho}_i$ = 0.56, 0.58, and 0.80, respectively. As the obtained results and the given examples show, the optimal number of routes depends on the intensity and properties of the packet flow and is determined from the model (13). However, the solution to this problem cannot be expressed in closed form. Therefore, numerical optimization methods are required to solve it.
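The way congestion losses and technical failures combine can be sketched as follows. The combined expression is not reproduced in this text, so the rule used here is an assumption: a copy on a route is taken to be lost if either the buffer overflows or the route has failed, the two events are treated as independent, and the packet is lost only if the copies on all routes are lost. The function names are hypothetical.

```python
import math


def route_loss(p_buffer, p_fail):
    """ASSUMED per-route loss: buffer overflow OR route failure, independent events."""
    return 1.0 - (1.0 - p_buffer) * (1.0 - p_fail)


def system_loss(p_buffers, p_fails):
    """The packet is lost only if the copies on all routes are lost."""
    return math.prod(route_loss(pb, pf) for pb, pf in zip(p_buffers, p_fails))


if __name__ == "__main__":
    # Three routes with equal failure-related loss 0.05 and small congestion loss.
    print(system_loss([1e-3, 1e-3, 1e-3], [0.05, 0.05, 0.05]))
```

Under this assumption, a failure-related loss of 0.05 per route bounds the system loss from below by 0.05 raised to the power k, even when congestion losses are negligible, which is consistent with the observation that reliability shifts the numerical values of the loss curve.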

Experimental Environment and Software
To obtain numerical results, we used the Mathcad software for analytical modeling and the AnyLogic simulation system for multiconnectivity simulation. AnyLogic is a general-purpose simulation system. It includes a discrete-event modeling engine and allows models of systems to be built from queuing system models. The system makes it possible to choose flexibly the degree of detail of the simulated processes, to simulate random processes with different properties, and to collect the necessary statistics during the simulation. Its performance is high enough to collect the required amount of data in a reasonable time and to obtain sufficiently accurate estimates of the simulated processes. One of the advantages of this system is a simple and intuitive user interface that allows the necessary functionality to be implemented by combining a fairly rich library with user-written additions in the Java language.
The results shown in Figures 4-6 were obtained in Mathcad by analytical modeling. Figure 7 shows a simulation model built in AnyLogic. The model contains a source of user traffic that is implemented on UE elements, RT1, ..., RT5; each of the routes is modeled by a queuing system consisting of a queue (q1, ..., q5) and a delay element (transfer1, ..., transfer5) that simulates the process of transmitting a data packet. Both the observed traffic and the background traffic served by this direction, which is produced by the elements BT1, ..., BT5, arrive at the inputs of the channels. The introduction of
The example simulation shown in Figure 7 demonstrates the difference in delivery times. The upper histogram (blue bars) shows the probability density of the minimum delay for the route group. The lower histogram (red bars) shows the average delivery time for all routes. From the given example, it can be seen that the minimum delivery time for a group of routes is less than the average value by a factor equal to the number of routes (in this case, 5 times).
In the given example, the share of user traffic was 7% of the total traffic in each of the routes. In this simulation example, the parameters of the routes were identical.
The results obtained confirm the expression for the distribution of the minimum delivery time (3), as well as the above conclusions for the exponential distribution of the delivery time.
When the distribution of delivery time changes (if it differs from exponential), the results change. For example, with an increase in the probability of losses in each of the routes, the distribution of delivery time becomes different from exponential and the ratio between the minimum delivery time and its average value changes.
It should be noted that as the dependence between flows in different routes increases, the reduction in delay becomes smaller. The reduction in delivery time achieved by using a group of routes is greater the more independent the flows are, i.e., when the share of duplicated traffic is small in relation to the total traffic of the routes and when the routes themselves differ from each other.
The given model was also used to optimize the choice of the number of routes, which will be considered below.

Method for Choosing the Optimal Number of Routes
The problem of minimizing losses for equal values of the probability of failure $p_i^{(U)}$ and an equiprobable distribution of traffic over k routes can be written as problem (14): minimize the loss probability (13) over the number of routes k. The analytical solution of problem (14) is complicated by the fact that it cannot be expressed in closed form from (13). However, in practical problems, the maximum number of possible routes $k_{\max}$ is not too large, and problem (14) is an integer optimization problem. In many cases, for realistic values of $k_{\max}$, it can be easily solved by a simple search over k.
For the general case and to improve the efficiency of searching for an extremum, it is proposed to use an integer binary search algorithm [19].
The proposed algorithm in Figure 8 consists of cyclically dividing the entire set of possible values of k and discarding the half in which there cannot be an extremum until the search range is narrowed down to a single value at the next iteration. The idea of the method is similar to the dichotomy method [19], with the difference that the search for a solution is performed on the set of integer values of the argument.
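The halving procedure described above can be sketched as follows. This is a minimal illustration and not a transcription of the algorithm in Figure 8: it assumes that the loss function is unimodal in k (it decreases to a single minimum and then increases), and the function name `argmin_unimodal` is hypothetical.

```python
def argmin_unimodal(f, k_min, k_max):
    """Integer binary search for the minimizer of a unimodal function on [k_min, k_max].

    ASSUMPTION: f strictly decreases up to its minimum and does not decrease after it.
    """
    lo, hi = k_min, k_max
    while lo < hi:
        mid = (lo + hi) // 2
        if f(mid) > f(mid + 1):
            lo = mid + 1  # still descending: the minimum lies to the right of mid
        else:
            hi = mid      # ascending or flat: the minimum is at mid or to its left
    return lo


if __name__ == "__main__":
    # Toy loss curve with its minimum at k = 4.
    print(argmin_unimodal(lambda k: (k - 4) ** 2, 1, 8))
```

Each iteration discards half of the remaining range, so the number of loss evaluations grows logarithmically with the number of candidate routes instead of linearly as in a simple search.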
If the routes are not equal, for example, if the probabilities of data loss due to the failure of route elements or the loads differ between routes, then the problem of route selection arises. To solve this problem, it is proposed to use the dynamic programming method. In this case, the solution to the problem is not the number of routes k but the set of route numbers. Theoretically, such a solution can be obtained for each data packet sent, but in practice, it would require too much computation. Therefore, it is proposed to consider separate traffic flows $a_i$, $i = 1, \dots, n$, where n is the number of flows produced by devices, and to find a set of routes $R_i$ for each of these flows.
In this case, the objective function is given by (15), where $R_i = \{r_{i,j}\}$ is the set of routes for serving the ith flow, $r_{i,j}$ is the jth route serving the ith flow (taken from the set of available routes), $m_i$ is the number of routes serving the ith flow, n is the number of flows, and k is the total number of routes; the objective function also depends on the load of the gth route.
The proposed algorithm is shown in Figure 9. It consists of two nested loops: an outer loop over the flows and an inner loop over the available routes. In the inner loop, a set of routes is selected to serve the flow. A route is included in the set of selected routes for a given flow if its inclusion leads to the largest decrease in the value of the objective function. The objective function is calculated taking into account the traffic of the flow in question.
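The two-loop selection described above can be sketched as follows. As described, each step adds the route that gives the largest decrease in the objective, which reads as a greedy selection, and the sketch follows that description; it is not a transcription of Figure 9. The objective function of the paper is not reproduced in this text, so `objective` and `route_loss` are hypothetical stand-ins: per-route loss is taken as M/M/1 blocking with buffer R combined with an independent route failure, and the objective is the sum over flows of the probability that all copies are lost.

```python
import math


def route_loss(load, p_fail, R=10):
    """HYPOTHETICAL per-route loss: M/M/1 blocking with buffer R, or route failure."""
    if abs(load - 1.0) < 1e-12:
        p_buf = 1.0 / (R + 1)
    else:
        p_buf = (1.0 - load) * load**R / (1.0 - load ** (R + 1))
    return 1.0 - (1.0 - p_buf) * (1.0 - p_fail)


def objective(assign, flows, base_load, p_fail):
    """Sum over flows of the probability that every copy of a packet is lost."""
    load = list(base_load)
    for i, routes in enumerate(assign):
        for g in routes:
            load[g] += flows[i]  # each selected route carries a copy of flow i
    # A flow with no routes contributes 1.0 (empty product): all its packets are lost.
    return sum(math.prod(route_loss(load[g], p_fail[g]) for g in routes)
               for routes in assign)


def select_routes(flows, base_load, p_fail):
    """Greedy selection: outer loop over flows, inner loop over candidate routes."""
    assign = [set() for _ in flows]
    for i in range(len(flows)):
        while True:
            best_g = None
            best_val = objective(assign, flows, base_load, p_fail)
            for g in range(len(base_load)):
                if g in assign[i]:
                    continue
                assign[i].add(g)  # tentatively add route g to flow i
                val = objective(assign, flows, base_load, p_fail)
                assign[i].remove(g)
                if val < best_val:
                    best_g, best_val = g, val
            if best_g is None:  # no remaining route improves the objective
                break
            assign[i].add(best_g)
    return assign


if __name__ == "__main__":
    flows = [0.05, 0.05]
    base_load = [0.3, 0.5, 0.7]
    p_fail = [0.01, 0.05, 0.01]
    print(select_routes(flows, base_load, p_fail))
```

Because a route is added only when it lowers the objective, the loop for each flow stops after at most as many additions as there are routes.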

Discussion
The results obtained show, firstly, that the use of multiconnectivity can increase the likelihood of data delivery and, secondly, reduce the delay in their delivery. The graphs shown in Figures 4-6 show that the nature of the dependence is preserved when changing the properties of the flow. All dependencies for different values of S have the same character. The probability of losses has a pronounced minimum (Figures 4 and 6), which indicates the existence of an optimal number of parallel routes, exceeding which, on the contrary, reduces the probability of delivery due to an increase in the generated load.
Thus, multiconnectivity is effective only under certain conditions, which should be selected based on the requirements for the probability of delivery and the load on the communication directions. Such a choice requires a certain number of operations to analyze the state and make a decision, which is problematic given that the time to complete this work should be significantly less than the allowable delivery time. We propose to use traffic analysis and forecasting methods for this purpose, as well as precomputed sets of routes. This will reduce the time spent on this work. The solution to this problem is a further direction of research in this area.

Conclusions
In this paper, we conducted an analysis that showed that multiple connections, obtained by increasing the number of delivery routes and "propagation" of data, provide increased reliability both by reducing the likelihood of data loss due to congestion and by reducing the likelihood of data loss due to technical malfunctions of route elements. We have also shown that multiple connections can reduce the average delivery time of data, and that this approach is most effective when streams are independent, i.e., when the share of "useful" traffic is relatively small in the total traffic of the route. It was also shown that duplication (multiplication) of data and their transmission along several routes, along with a decrease in the probability of losses and the amount of delay, leads to an increase in traffic in the network, which in turn can lead to the opposite effect, i.e., increase the likelihood of losses due to traffic growth. We proposed a model of data delivery routes, described using a queuing system model with a combined service discipline, which allows us to estimate the dependence of the loss probability on the number of routes chosen for data delivery. The analysis showed that, for routes that are "equal" in terms of load (with an equally probable traffic distribution) and probability of failure, an optimal number of routes can be found at which the minimum probability of losses is achieved. To solve this problem, it is proposed to use the integer binary search method. It was also shown that with "unequal" routes, the problem of traffic distribution arises.
To solve this second problem, it is proposed to consider the traffic flows generated by users and to select delivery routes for each of these flows using a dynamic programming method; the result is a set of delivery routes for each of the flows.
In ongoing and future work, the privacy and security requirements related to the communication channel will be addressed using blockchain technology.

Data Availability
The datasets generated and analysed during the current study are available from the corresponding author upon reasonable request.