High-Performance Server-Based Live Streaming Transmission Optimization for Sports Events in Smart Cities

Smart cities allow cities to run more efficiently and have been adopted by many cities. Building a smart city generates a large amount of data, and live sports events have come to be regarded as an integral part of smart cities. As the quality of life improves, people expect a better viewing experience for sports events. To this end, this paper proposes a live streaming transmission optimization method based on a high-performance server, called HPTO, which consists of two main modules, that is, high-performance server optimization and transmission optimization. Specifically, for the server optimization, this paper devises a distributed storage strategy that avoids producing internal disk fragments and improves the writing efficiency of sports videos. For the transmission optimization, this paper devises a deep-learning-based video compression strategy that saves the storage space of the server and accelerates the transmission of sports videos. In addition, this paper conducts simulation experiments based on PyCharm. The experimental results show that HPTO has higher storage efficiency, shorter transmission time, and a lower packet loss rate than the benchmarks, which indicates that the two proposed optimization strategies (server optimization and transmission optimization) are effective.


Introduction
Smart cities [1][2][3] usually apply modern network infrastructures and information techniques to improve citizens' quality of life and make cities run more efficiently. Recently, several cities around the world have committed to building smart cities, including Chicago, Milton Keynes, Busan, and Shanghai. As living standards continue to improve, citizens want to watch live sports with a satisfactory quality of experience. However, this behavior generates a large amount of data, which is expected to reach the ZB level. In fact, the live streaming transmission of sports events is usually highly concurrent [4,5], which places very high requirements on the storage system and the transmission path. To guarantee the sustained and steady operation of smart cities while sports events are being watched, two kinds of optimization have been adopted, that is, server optimization and transmission optimization.
Server optimization refers to improving the storage system. Current storage systems usually rely on file system management or a raw disk design [6,7]. A file-management-based storage system produces many internal disk fragments because the magnetic head moves frequently. Besides, the file system needs to maintain index information and attribute information, which introduces redundant information and is unfavorable to the storage of sports video data. In contrast, in a raw-disk-based storage system the application programs perform the reading and writing operations directly, which greatly improves I/O efficiency. However, the live streaming of sports events is highly concurrent, which causes the data storage spaces to be relatively scattered and thus also produces internal disk fragments. Moreover, when a malfunction happens, the storage system has a large failure probability; that is to say, the raw-disk-based storage system is not highly reliable. Different from the above, in this paper the storage system of the server is optimized based on a distributed storage method.
Transmission optimization consists of two kinds of methods. One is the selection of the transmission path, that is, routing [8]. However, smart cities usually depend on the backbone network, and the transmission path is usually assigned by the service provider in advance. Thus, for the live streaming transmission of sports events, optimizing the transmission path brings no obvious improvement in transmission performance. The other is content compression; that is, the sports video is compressed to a smaller size while its attributes remain unchanged. Video compression has two advantages. On one hand, the sports video can be transmitted efficiently. On the other hand, storage space on the high-performance server is saved, which reduces its load so that the server works smoothly. The principle of video compression is to encode the original video through an encoder [9,10], where the encoder has two functions, that is, (i) reading the signals of the video and (ii) discerning and counting the residual signals. In this paper, deep learning is used to perform the compression of sports video.
For smart cities, this paper proposes a high-performance-server-based live streaming transmission optimization method, named HPTO. The main contributions of HPTO are summarized in the following three aspects: (i) a distributed storage strategy is devised to avoid the generation of internal disk fragments and improve the writing efficiency, that is, server optimization; (ii) a deep learning method is used to compress the sports video to save storage space and accelerate the transmission; (iii) extensive experiments are conducted, including neural network verification, video compression verification, and live streaming optimization verification.
The rest of the paper is organized as follows. Section 2 introduces the server optimization. Section 3 presents the transmission optimization. Section 4 reports the experiment results. This paper is concluded in Section 5.

Storage Structure of Sports Video.
To avoid producing internal disk fragments as a result of the concurrent and random writing of live streams, this paper handles sports video by designing a high-speed buffer structure and a disk logical storage structure. In addition, this paper uses a buffer mapping policy, which belongs to the bcache-based hybrid storage technology, to connect the high-speed buffer with the disk logical storage. To be specific, a solid state disk (SSD) [11] performs dedicated high-speed caching for the written live streams, where the group of pictures (GOP) is used as the basic unit and a fixed buffer segment is allocated for each channel's live stream. The whole storage structure of sports video is shown in Figure 1. The high-speed buffer structure consists of a superblock, a buffer bitmap, and buffer segments. Among them, the superblock is a superfield that records parameter information including creation time, buffer size, buffer number, and allocation status. In particular, the field that the superblock corresponds to is assigned the value "0xEF53" when the formatting command is executed. The buffer bitmap field describes the usage status of the subsequent buffer segments. The remaining fields are buffer segments, which serve as the basic unit for allocating and recycling buffer space for the live streams of sports video. In this paper, the size of a buffer segment is set to 16 MB. In particular, when the remaining space cannot hold a complete buffer segment, it is reserved. Besides, when the last GOP of a live stream has been completely written, the corresponding buffer space is recycled. The disk logical storage structure consists of a superblock, a data block bitmap, a primary index, a secondary index, and buffer segments. As in the high-speed buffer structure, the field that the superblock corresponds to is assigned the value "0xEF53" when the formatting command is executed.
The primary index records parameter information including the ID of the live stream, the starting and ending time, the bit rate type, and the GOP. The secondary index is used specifically to record the detailed information of each GOP. The buffer mapping strategy allows multiple hard disk drives (HDDs) to use the same SSD as the caching disk. To be specific, the "echo" command is used to attach the "cset.uuid" of the caching disk from the high-speed buffer structure to the disk logical storage structure; at the same time, the write mode is set to "writeback."
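The allocation and recycling of buffer segments described above can be illustrated with a minimal sketch. It assumes a plain in-memory bitmap and a hypothetical `owner` table that maps each channel to its segment; the class and method names are illustrative and are not taken from the paper.

```python
SEGMENT_SIZE = 16 * 1024 * 1024  # 16 MB buffer segment, as stated in the text


class SegmentBuffer:
    """Toy model of the high-speed buffer: a bitmap plus fixed-size segments."""

    def __init__(self, buffer_size: int) -> None:
        # Space that cannot hold a complete segment is reserved (left unused).
        self.segment_count = buffer_size // SEGMENT_SIZE
        self.reserved_bytes = buffer_size % SEGMENT_SIZE
        self.bitmap = [False] * self.segment_count  # True means "in use"
        self.owner = {}  # hypothetical bookkeeping: channel id -> segment index

    def allocate(self, channel_id: str) -> int:
        """Give the channel one free segment; return it again if already assigned."""
        if channel_id in self.owner:
            return self.owner[channel_id]
        for index, used in enumerate(self.bitmap):
            if not used:
                self.bitmap[index] = True
                self.owner[channel_id] = index
                return index
        raise MemoryError("no free buffer segment")

    def recycle(self, channel_id: str) -> None:
        """Release the segment once the last GOP of the stream has been written."""
        index = self.owner.pop(channel_id)
        self.bitmap[index] = False
```

For example, a 40 MB buffer yields two usable segments and 8 MB of reserved space; a third concurrent channel cannot be served until one of the first two streams finishes and its segment is recycled.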

Storage Management.
The raw disk usually refers to a special character device that has not been formatted and cannot be managed by the Unix/Linux file system [12]; thus its space management is very inflexible, and on-demand enlargement is very difficult to satisfy. Given this, this paper leverages the logical volume (LV) to enlarge the storage of the high-performance server. The whole storage management of the high-performance server is shown in Figure 2.
To be specific, when the unallocated space of the LV group (LVG) can satisfy the enlargement requirement, a volume of the required size is allocated from it. Then, the "rawdevice" installed and used upon it is bound to "/dev/raw/raw[*]." Otherwise, when the enlargement requirement cannot be satisfied, an additional physical disk is added first, and then the same installation and binding operations are carried out.
If the enlargement requirement still cannot be satisfied by the above-mentioned method, the current file system scales out by adding an external high-performance server. In particular, the added high-performance server reports its state to the state manager through the heartbeat protocol.
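The three-level enlargement decision described above can be sketched as a small planning function. This is only an illustration under stated assumptions: the `VolumeGroup` type, the step strings, and the rule that spare disks are tried in order are hypothetical, and no real LVM or raw-device command is issued.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class VolumeGroup:
    free_bytes: int  # unallocated space in the LV group
    spare_disks: List[int] = field(default_factory=list)  # sizes of disks that may be added


def plan_enlargement(group: VolumeGroup, needed: int) -> List[str]:
    """Return the ordered steps that would satisfy an enlargement request."""
    steps = []
    free = group.free_bytes
    # Add physical disks only while the group is still too small.
    for size in group.spare_disks:
        if free >= needed:
            break
        free += size
        steps.append(f"add physical disk of {size} bytes")
    if free < needed:
        # Local enlargement is impossible, so scale out instead.
        return ["add external high-performance server", "report state via heartbeat"]
    steps.append(f"allocate {needed} bytes from the volume group")
    steps.append("bind raw device under /dev/raw")
    return steps
```

Calling `plan_enlargement(VolumeGroup(10, [30, 30]), 50)` returns two disk-addition steps followed by the allocation and binding steps, whereas a group with too few spare disks falls through to the scale-out branch.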

Writing Method of Sports Video.
In this paper, sports video is written by a single thread. At first, the live streams of sports video are ordered according to the time at which storage was requested. Then, they are scheduled concurrently via the buffers, where the single-thread coding approach is used to handle these concurrent live streams. The single-thread coding approach performs consecutive storage via hugepages, which effectively avoids the time spent waiting for addressing under multithreaded concurrency; therefore, the transmission efficiency is improved, and the storage time is decreased. In particular, a buffer is only used to store one channel's live stream, which guarantees that the physical storage space is contiguous.
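A minimal sketch of this writing scheme follows, assuming that each request is a hypothetical `(request_time, channel_id, gop_bytes)` tuple and that a `bytearray` stands in for a contiguous buffer. Hugepage allocation is not modeled here.

```python
from collections import defaultdict


def write_streams(requests):
    """Write GOPs in request-time order with a single loop (one thread).

    Each channel owns exactly one buffer, so its GOPs stay contiguous.
    """
    buffers = defaultdict(bytearray)  # one buffer per channel
    order = []  # channels in the order they were served
    for request_time, channel_id, gop in sorted(requests, key=lambda r: r[0]):
        buffers[channel_id] += gop  # append to the channel's own buffer
        order.append(channel_id)
    return dict(buffers), order
```

Because only one loop touches the buffers, no locking is needed, and the GOPs of different channels never interleave within a buffer.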

Transmission Optimization Based on Video Compression
3.1. Methodology.
In smart cities, a higher video compression ratio is needed because the transmission of live streams is subject to limited transmission rate and storage space. In particular, the live stream of a sports video consists of many images; that is to say, video compression can also be regarded as image compression. Current video image compression mainly compresses the intraframe data, that is, it stores the previous video image information with less data [13]. This paper proposes a video segment compression method to realize the transmission optimization, where two key frames are extracted and used to store one video segment. In addition, on the video decoder side, a frame interpolation method is used to recover the previous video segment data. In particular, a three-dimensional convolutional neural network (3DCNN), which is a deep learning method [14,15], is used to classify the sports video segments by analyzing the temporal and spatial information of video frame sequences, in which there are three kinds of video segments, that is, radical change, gradual change, and ordinary change. In fact, 3DCNN has attracted attention for video information processing, since it adds the time dimension to the spatial dimensions in order to capture the contextual information between different frames of the sports video.
In order to guarantee efficient and accurate classification, a large enough surface dataset is necessary. Otherwise, it is considerably difficult or even impossible to train a high-precision 3DCNN. Given this, this paper presents a large dataset with 249621 sports video segments, including 135202 radical change segments, 103146 gradual change segments, and 11263 ordinary change segments. Meanwhile, each sports video segment consists of a sequence of 16 images, and two adjacent sports video segments overlap by 8 frames in order to prevent missed detections.
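The segmentation rule above (16 frames per segment, 8 frames of overlap) amounts to a sliding window with a stride of 8. The sketch below illustrates it; the function name is hypothetical, and dropping trailing frames that do not fill a complete segment is an assumption, since the text does not say how they are handled.

```python
def split_into_segments(frames, length=16, overlap=8):
    """Cut a frame sequence into fixed-length segments that overlap."""
    if not 0 <= overlap < length:
        raise ValueError("overlap must be in [0, length)")
    stride = length - overlap  # 8 frames with the settings used in the text
    # Trailing frames that cannot fill a whole segment are dropped (assumption).
    return [
        frames[start:start + length]
        for start in range(0, len(frames) - length + 1, stride)
    ]
```

With 32 input frames this yields three segments starting at frames 0, 8, and 16, so every change between consecutive frames falls inside at least one segment.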

3DCNN-Based Video Compression.
The 3DCNN used in this paper has five convolution layers, three pooling layers, and three fully connected layers. Among them, each convolution layer uses a rectified linear unit (ReLU) as the activation function together with the local response normalization (LRN) operation. The first two fully connected layers each include 2048 neurons, and the last fully connected layer includes three neurons, where each neuron corresponds to one kind of sports video segment. The detailed information of the 3DCNN in this paper is shown in Table 1.
Furthermore, the ordinary change video segment's head frame and tail frame are used to express the whole sports video segment, and the whole length cannot exceed 32 frames.
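The key-frame idea can be illustrated with a minimal sketch in which a frame is a flat list of pixel values. The text does not specify which frame interpolation method the decoder uses, so the linear interpolation below is an assumption chosen only for illustration, and both function names are hypothetical.

```python
def compress_segment(frames):
    """Keep only the head frame, the tail frame, and the segment length."""
    if not 2 <= len(frames) <= 32:  # the text limits a segment to 32 frames
        raise ValueError("segment must contain between 2 and 32 frames")
    return frames[0], frames[-1], len(frames)


def reconstruct_segment(head, tail, length):
    """Rebuild the segment by linear interpolation between the key frames."""
    frames = []
    for i in range(length):
        t = i / (length - 1)  # 0.0 at the head frame, 1.0 at the tail frame
        frames.append([(1 - t) * h + t * p for h, p in zip(head, tail)])
    return frames
```

A segment whose pixels change linearly over time is recovered exactly by this decoder; for real footage the recovered intermediate frames would only approximate the originals, which is why the scheme is applied to ordinary change segments.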

Experiment Method.
In smart cities, the proposed HPTO is implemented on an Intel(R) Core(TM) i5-8500 CPU @ 3.00 GHz with 8.00 GB of RAM, running the 64-bit Ubuntu 16.02 operating system. The programming language is Python, running in PyCharm. The verification of HPTO covers three aspects. At first, the 3DCNN-based deep learning method in the transmission optimization part is verified. In particular, two CNN benchmarks [16,17] are used for comparison, and the recall ratio, precision ratio, and four F values (F1/F2/F3/F4) are evaluated. Then, the video compression method based on 3DCNN is verified, in which one video compression benchmark [18] is used for comparison with bit rate as the evaluation metric. Finally, the whole live streaming transmission optimization scheme, including high-performance server optimization and transmission optimization, is verified. Here, two live streaming optimization benchmarks [19,20] are used for comparison, and the storage efficiency, transmission time, and packet loss rate are evaluated. The five above-mentioned benchmarks are denoted by KumarB, LiuB, RaghaB, HeB, and LiB, respectively, and they are introduced as follows: (i) Kumar et al. [16] did a comparative study on CNN for real-time image classification and object recognition, in which CNN was shown to be capable of producing optimized video image classification and object recognition; (ii) Liu et al. [17] proposed a memristor-based 3DCNN to recognize and classify human behaviors in video with 6 main actions; (iii) Raghavendra et al. [18] devised different image compression techniques without any data loss; (iv) He et al. [19] proposed an uncoded multiuser video streaming system by exploiting the diversities of video contents and channel conditions of multiple users; (v) Li et al. [20] presented a joint optimization method for conversational HD video service, taking into account the linkage between video coding and transmission.
Furthermore, the seven above-mentioned performance evaluation metrics are introduced as follows. The recall ratio rec is defined as
$$\mathrm{rec} = \frac{TP}{TP + FN},$$
where TP indicates true positives and FN indicates false negatives. The precision ratio pre is defined as
$$\mathrm{pre} = \frac{TP}{TP + FP},$$
where FP indicates false positives. The F value is defined in terms of pre and rec with a weighting parameter α. When α = 1, the F1 value, that is, the harmonic mean of pre and rec, is obtained:
$$F_1 = \frac{2 \cdot \mathrm{pre} \cdot \mathrm{rec}}{\mathrm{pre} + \mathrm{rec}}.$$
The bit rate is defined as the number of bits transmitted per second (kbps). The storage efficiency is defined as the utilization rate of the high-performance server's storage space.
The transmission time is defined as the time difference between the time point when the first video segment of the live stream is sent from the high-performance server side and the time point when the last video segment of the live stream arrives at the decoder side. The packet loss rate is defined as the ratio of the number of lost video segments to the total number of video segments.
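The classification and loss metrics used in the experiments can be computed with a few helper functions. The sketch below assumes the standard definitions of recall, precision, and F1 in terms of TP, FP, and FN, together with the packet loss rate as defined in the text; the general F value for α other than 1 is deliberately not implemented, and the function names are illustrative.

```python
def recall(tp: int, fn: int) -> float:
    """Share of actual positives that were detected: TP / (TP + FN)."""
    return tp / (tp + fn)


def precision(tp: int, fp: int) -> float:
    """Share of detections that were correct: TP / (TP + FP)."""
    return tp / (tp + fp)


def f1_value(pre: float, rec: float) -> float:
    """Harmonic mean of precision and recall (the F value for alpha = 1)."""
    return 2 * pre * rec / (pre + rec)


def packet_loss_rate(lost_segments: int, total_segments: int) -> float:
    """Ratio of lost video segments to all video segments."""
    return lost_segments / total_segments
```

For instance, with 90 true positives, 10 false negatives, and 30 false positives, the recall ratio is 0.9 and the precision ratio is 0.75.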
Mobile Information Systems

3DCNN Verification.
The experiment results on recall ratio over six simulation runs are shown in Table 2. We can observe that the proposed HPTO has the best recall ratio, followed by LiuB and KumarB. In particular, the recall ratio of HPTO reaches about 99%, which is about 2% and 4.5% higher than those of LiuB and KumarB, respectively. HPTO and LiuB have a higher recall ratio than KumarB because they use the 3DCNN structure to recognize and classify the live streams of sports video. Here, we emphasize that 3DCNN adds the time dimension to the spatial dimensions in order to capture the contextual frame information in the sports video, and it performs better than traditional CNN structures. As for HPTO and LiuB, the former performs deep training based on Table 1, while the latter makes no further improvement to the 3DCNN structure. Therefore, HPTO has a higher recall ratio than LiuB. The experiment results on precision ratio over six simulation runs are shown in Table 3. We also find that the proposed HPTO has the highest precision ratio, followed by LiuB and KumarB, for reasons similar to those given above.
Based on Tables 2 and 3, the average experiment results on F values including F1, F2, F3, and F4 are shown in Table 4. We can observe that, with the increase of α, the corresponding F value becomes smaller and smaller. As a matter of fact, the evaluation based on F1 has the highest reference value. In particular, the larger F1 value means that the corresponding strategy has better performance. As can be seen from Table 4, the proposed HPTO has the largest F1 value, which means that HPTO has the best classification effect on the live streaming of sports video.

Video Compression Verification.
This section considers six kinds of sports videos (NBA, CBA, German Bundesliga, Serie A, World Cup, and AOTC) and two kinds of encoding structures (H.264/AVC and HEVC). The average experiment results on bit rate are shown in Table 5. The improvement degrees on bit rate are shown in Table 6. We can observe that the proposed HPTO has an obvious advantage in terms of improving the bit rate. In particular, the improvement rate of bit rate compared to the benchmark can reach 65.52% (Serie A) with the H.264/AVC encoding structure and 85.29% (World Cup) with the HEVC encoding structure. This further indicates that the proposed video compression optimization scheme is efficient.

Live Streaming Optimization Verification.
The experiment results on storage efficiency over six simulation runs are shown in Table 7. We can observe that HPTO has the highest storage efficiency, followed by HeB and LiB. In particular, the storage efficiency of HPTO reaches about 86%, whereas those of HeB and LiB only reach about 79% and 72%, respectively. Different from the two benchmarks, HPTO optimizes live streaming in two aspects, that is, server optimization and transmission optimization. To be specific, the server optimization improves the storage structure, the storage management method, and the writing method. In addition, 3DCNN is employed to optimize the video compression in the transmission optimization part. In fact, the 3DCNN structure and the video compression scheme have both produced efficient experiment results, which can be found in Section 4.2 and Section 4.3.
The experiment results on transmission time over six simulation runs are shown in Table 8. We can find that the proposed HPTO has the smallest transmission time, followed by LiB and HeB. In particular, the transmission time of HeB is more than twice that of HPTO; this is because HPTO reduces the storage time and saves more storage space to accelerate the transmission of live streams. Different from HeB, LiB presents a joint optimization method for conversational HD video service by considering the linkage between video coding and transmission; thus it has a smaller transmission time than HeB. The experiment results on packet loss rate over six simulation runs are shown in Table 9. We can find that the proposed HPTO has the lowest packet loss rate, followed by HeB and LiB, which means that HPTO provides the best viewing experience for sports video. In particular, the packet loss rate of HPTO almost reaches 0%, which indicates that server optimization and transmission

Conclusions
In smart cities, the live streaming optimization of sports video is very important because it has a direct influence on the viewing quality. In this paper, a live streaming transmission optimization method based on server optimization and transmission optimization is proposed. The server optimization includes storage structure optimization, storage management optimization, and writing method optimization. The transmission optimization mainly depends on video compression based on the 3DCNN structure. The experiments include 3DCNN verification, video compression verification, and live streaming optimization verification, with evaluation of seven metrics, that is, recall ratio, precision ratio, four F values, bit rate, storage efficiency, transmission time, and packet loss rate. All experiment results show that the proposed HPTO performs better than the benchmarks in optimizing the live streaming of sports video.
In the future, we will test more metrics and more applications. In addition, we also plan to make a real demo for the proposed live streaming transmission optimization mechanism by connecting some high-performance servers.

Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest
The authors declare that they have no conflicts of interest.