A Dynamic Multitarget Detection Algorithm in front of Vehicle Based on Embedded System and Internet of Things

There are few studies on the classification and dynamic multitarget detection of targets in front of vehicles. In order to solve this problem, a dynamic multitarget detection algorithm is proposed. First, dynamic multitarget detection of objects that may be displaced at any time is formulated; secondly, a multitarget detection algorithm based on an improved You Only Look Once version 3 (YOLOv3) is proposed for the detection of high-probability multitarget risk events in front of the vehicle. The improved YOLOv3 model uses a lightweight backbone network suited to embedded real-time detection. In this paper, the lightweight MobileNetv2 replaces Darknet-53 for feature extraction. Moreover, group normalization and an adjusted optimizer are applied to multitarget feature extraction. The results show that, in comparison with the original YOLOv3 algorithm, the missed detection rate of the improved multitarget detection algorithm is less than 5%, the number of model parameters is reduced by 95%, and the CPU inference time is reduced to 78% of the original.


Introduction
The introduction of smart Internet of Things technology within the traffic management system is essential in light of the rapid expansion of urban modernization [1]. Effectively adapting to complex and changing traffic management needs is inseparable from the support of emerging innovations. Sophisticated computer systems, Internet of Things technology, telecommunications, automatic control technology, and other technologies have been used to create an intelligent traffic management system whose features include real-time, accurate, and efficient monitoring. To achieve efficient traffic control, the objective of a smart transportation system is to create an evolutionary combination of the three parts of traffic: driver, vehicle, and road [2].
Target detection's major function and task is to accurately locate and recognize the relevant target objects in an image. The main purpose of the Internet of Things is to provide network capabilities to the equipment and supplies used in daily life, forming information networking and interconnection and maintaining the coverage of an information network service [3]. Vehicle signal recognition and vehicle image acquisition under the Internet of Things mainly focus on dynamic multitarget acquisition, which is currently the mainstream improvement direction of multitarget detection algorithms [4]. Traditional target detection techniques, such as HOG features combined with a support vector machine (SVM), have two main problems: first, the sliding-window region selection strategy lacks pertinence and suffers from high time complexity and window redundancy [5]; second, the combination of manually designed features and object detection lacks robustness [6]. Multitarget detection can achieve a technical breakthrough and raise the intelligence level of automobile driving faster by using algorithms that build on the advantages of embedded systems and deep convolutional neural networks in the Internet of Things [7]. For a unified algorithm architecture for joint pedestrian and vehicle recognition, highly significant regional or redundancy strategies must be adopted to select specific targets [8].
Through online self-learning multitarget tracking methods, multitarget recognition in front of the vehicle can be realized, and the technology can automatically upgrade the recognition algorithm so that the system tracks targets more stably [9]. Some scholars have studied a basic network architecture combining the single-shot detector (SSD) network with Visual Geometry Group (VGG)-16, in which the backbone is replaced by the residual network ResNet-26; this improves the detection algorithm and the real-time accuracy of traffic detection [10]. Other scholars applied fully convolutional network technology to three-dimensional scanning data, combining the three-dimensional point dimension with a two-dimensional grid as the feature extraction method, forming different 2D end-to-end fully convolutional networks from candidate regions to detect vehicle targets and bounding boxes, and achieved good results. Vehicle technology scholars proposed designing a feature convolution kernel library composed of multiple forms and color Gabor filters [11], training and screening the optimal feature extraction convolution kernel group to replace the low-level convolution kernel group of the original network and improve detection accuracy [12]. In the upgrade from single-image detection to multitarget image detection, an adaptive threshold strategy is added to reduce the missed-alarm and false-alarm rates and realize target detection in complex traffic scenes [13]. In addition, there are innovative studies on recognizing traffic-hazard warning signs, which turn target detection in complex traffic environments into joint detection together with signs, forming an effective network combination method. The above studies propose relevant methods and specific improvement directions from different angles and technical levels.
The main research value of a dynamic multitarget detection algorithm in front of the vehicle in an embedded system combined with the Internet of Things lies in the following: much research focuses on how to improve the algorithm, and the detection algorithm for multiple targets is upgraded accordingly. Any improvement of multitarget detection technology must ensure that the algorithm meets specific requirements and delivers operational value. Given traffic development trends, single-category target detection can no longer meet the needs of the traffic scene in front of the vehicle: targets have become complex and diverse, and detection accuracy still needs to improve. In the next few years, road traffic will become more complex, and it is necessary to find approaches that effectively and accurately identify targets using the Internet of Things. Thus, such technology will become more critical for identifying multiple targets, such as different kinds of vehicles, mixed pedestrians and objects, and the electric motorcycles and two-wheeled vehicles that often appear and become dangerous targets; it is therefore necessary to upgrade and iterate the multitarget detection algorithms of automatic driving technology.
Considering the problems stated above, and according to the analysis in this paper, the targets to be inspected in front of the vehicle in a complex traffic environment are divided into dynamic targets and static targets. A dynamic target may be displaced in front of the vehicle at any time. The main road users include four-wheel and two-wheel categories: cars, trucks, and buses are four-wheel vehicles, while bicycles, motorcycles, and people are placed in the two-wheel category. Pedestrians and cyclists pose many potential safety problems. A static target is one that will not be displaced in front of the vehicle; the auxiliary road reference is the traffic signal. The single-stage YOLOv3 algorithm is taken as the base algorithm; aiming at the problem that its model is large and unsuitable for embedded devices, its inference time during CPU detection is improved. The rest of this paper is organized as follows. In Section 2, the basic principle of the proposed algorithm and its optimization is presented. In Section 3, the experimental results are discussed. Finally, the paper is concluded in Section 4.

Algorithm Principle and Optimization
Multitarget detection algorithms based on deep learning can be divided, by detection mode, into the two-stage algorithms represented by the regions with convolutional neural networks (R-CNN) series and the single-stage algorithms represented by the YOLO series. The two-stage detection concept is as follows: first, an algorithm generates location information [14]; secondly, classification produces category information. In the proposed model, the focus is on real-time detection [15]. The single-stage detection method introduces a different concept: the dynamic multitarget image in front of the vehicle is transformed directly into network output [16] by regressing the bounding box positions and categories in the output layer [17], turning the multitarget detection problem into a regression problem and improving detection speed [18][19][20].
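As a hedged illustration of this regression formulation, the following sketch decodes a YOLOv3-style bounding box from raw network outputs; the grid cell, anchor, and stride values in the example are illustrative assumptions, not values taken from this paper.

```python
import math

def decode_box(tx, ty, tw, th, cx, cy, pw, ph, stride):
    """Decode YOLOv3-style regression outputs (tx, ty, tw, th) against
    grid cell (cx, cy), anchor box (pw, ph), and the feature-map stride."""
    sigmoid = lambda v: 1.0 / (1.0 + math.exp(-v))
    bx = (sigmoid(tx) + cx) * stride   # box centre x in input-image pixels
    by = (sigmoid(ty) + cy) * stride   # box centre y
    bw = pw * math.exp(tw)             # box width scaled from the anchor
    bh = ph * math.exp(th)             # box height
    return bx, by, bw, bh

# Zero offsets place the box at the centre of cell (6, 6) with the anchor's size.
print(decode_box(0, 0, 0, 0, cx=6, cy=6, pw=116, ph=90, stride=32))
# (208.0, 208.0, 116.0, 90.0)
```

Because the box is produced in a single forward pass, with no separate proposal stage, detection reduces to one regression evaluated densely over the grid.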

YOLOv3
Algorithm. YOLO is a series of algorithms, and YOLOv3 is its third version. The proposed work uses a network structure that follows the YOLOv3 algorithm's principle; the network structure diagram is shown in Figure 1.
The YOLOv3 architecture does not use classic backbone network structures such as VGG-16 or ResNet-50. ImageNet classification is used to pretrain the backbone network for object detection; ImageNet is recognized as an authoritative dataset for evaluating the capabilities of deep convolutional neural networks, and many new networks are developed against it to improve on existing networks. The proposed model uses Darknet-53 as the backbone feature extractor; the structure of the network is shown in Figure 2. Basic YOLOv3 has neither pooling layers nor fully connected layers. In forward propagation, size transformation is realized by changing the stride of the convolution layers: each stride-2 convolution halves the image edge, reducing the area to 1/4 of its original size, and after five such downsamplings the feature map is 1/32 of the original image. The YOLOv3 algorithm adopts the feature pyramid network (FPN) idea [21,22], detecting multiscale targets of different sizes and producing output feature maps at three scales, 13 × 13, 26 × 26, and 52 × 52, which makes the detection effect of YOLOv3 noticeably better than that of the original YOLO algorithm.
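The downsampling arithmetic above can be checked with a short sketch; the 416 × 416 input size commonly used with YOLOv3 is an assumption for illustration.

```python
# Each stride-2 convolution halves the spatial edge; five of them
# reduce a 416x416 input to the 13x13 feature map (1/32 of the edge).
size = 416
for _ in range(5):
    size //= 2
print(size)  # 13

# The three output scales 13x13, 26x26, 52x52 correspond to strides 32, 16, 8.
strides = [416 // s for s in (13, 26, 52)]
print(strides)  # [32, 16, 8]
```
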
Comparing the YOLOv3 algorithm with the Faster R-CNN two-stage detection algorithm, the former has an obvious advantage in detection speed, but there are shortcomings: the trained model is large, it is not suitable for embedded devices, and its inference time on the CPU is high. To solve this problem, the following methods are adopted to optimize the YOLOv3 algorithm: model optimization is carried out through backbone network optimization and model pruning optimization.
This paper considers two aspects of optimization, backbone network adjustment and normalization adjustment, as shown in Figure 3.

Lightweight Improved Model.
MobileNetv2 is utilized as the backbone network to replace Darknet-53 for feature extraction in the lightweight model's design. MobileNet has been optimized into MobileNetv2 [23][24][25]. The model is useful for resolving the compatibility issue between mobile terminals and embedded devices. MobileNet builds deep neural networks using depthwise separable convolution, based on a streamlined design. In Figure 3, the first layer is a standard convolution layer, followed by depthwise convolution and pointwise (point-by-point) convolution layers. The lightweight model architecture is thus a separated deep neural network. Let the number of input channels be M and the number of output channels be N. A standard convolution with kernel K = (D_K, D_K, M, N) producing a D_G × D_G output feature map has computation cost

D_K × D_K × M × N × D_G × D_G.     (1)

It is split into a depthwise convolution and a pointwise convolution. The depthwise convolution is responsible for filtering; its kernel size is (D_K, D_K, 1, M) and its output feature map is (D_G, D_G, M). The pointwise convolution is responsible for channel conversion; its kernel size is (1, 1, M, N) and its output is (D_G, D_G, N). The depthwise convolution is computed as

G_{k,l,m} = Σ_{i,j} K_{i,j,m} · F_{k+i−1, l+j−1, m},     (2)

where K is the depthwise convolution kernel of size (D_K, D_K, 1, M): the m-th filter of K is applied to the m-th channel of F to produce the m-th channel of G. The combined cost of the depthwise and pointwise convolutions is

D_K × D_K × M × D_G × D_G + M × N × D_G × D_G,     (3)

so the ratio to the standard convolution cost in (1) is 1/N + 1/D_K². From the perspective of overall computation, the reduction is substantial.
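A minimal NumPy sketch of the depthwise convolution in (2), assuming "valid" padding and unit stride, makes the per-channel filtering concrete; the toy tensor sizes are illustrative and follow the (D_G, D_G, M) convention above.

```python
import numpy as np

def depthwise_conv(x, k):
    """Depthwise convolution: the m-th (D_K x D_K) filter is applied only
    to the m-th channel of x, so there is no cross-channel mixing."""
    h, w, m = x.shape
    dk = k.shape[0]                       # kernel shape (D_K, D_K, 1, M)
    out = np.zeros((h - dk + 1, w - dk + 1, m))
    for c in range(m):                    # one filter per channel
        for i in range(out.shape[0]):
            for j in range(out.shape[1]):
                out[i, j, c] = np.sum(x[i:i+dk, j:j+dk, c] * k[:, :, 0, c])
    return out

x = np.ones((4, 4, 2))                    # toy input, M = 2 channels
k = np.ones((3, 3, 1, 2))                 # one 3x3 filter per channel
print(depthwise_conv(x, k).shape)         # (2, 2, 2); every entry is 9.0
```

A pointwise (1 × 1, M → N) convolution would then mix the channels, completing the depthwise separable factorization.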
Through the above derivation, it can be seen that the number of parameters is significantly reduced by using depthwise separable convolution. MobileNet is therefore further improved with the linear bottleneck and the inverted residual block, as shown in Figure 4. For feature extraction, pointwise (PW) convolution is used together with depthwise (DW) convolution [26]. Because a DW convolution computes each channel's features independently, it cannot adjust the number of channels flowing from the upper layer to the output; the improvement therefore mainly adds PW convolutions around the DW convolution, as illustrated in Figure 5. When the number of channels flowing from the upper layer is small, DW can only extract features in a low-dimensional space, which is undesirable. Therefore, a PW convolution is added before each DW to expand the dimension, and the final PW projection uses a linear bottleneck: a nonlinear activation effectively increases nonlinearity in high-dimensional space but destroys features in low-dimensional space. The function of this second PW is dimension reduction.
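The parameter and computation reduction derived above can be checked numerically; the layer sizes below (3 × 3 kernel, 32 → 64 channels, 112 × 112 output) are illustrative assumptions, not values from this paper.

```python
def conv_cost(d_k, m, n, d_g):
    """Multiply-accumulate count of a standard convolution K = (D_K, D_K, M, N)."""
    return d_k * d_k * m * n * d_g * d_g

def separable_cost(d_k, m, n, d_g):
    depthwise = d_k * d_k * m * d_g * d_g   # per-channel filtering
    pointwise = m * n * d_g * d_g           # 1x1 channel conversion
    return depthwise + pointwise

std = conv_cost(3, 32, 64, 112)
sep = separable_cost(3, 32, 64, 112)
print(round(sep / std, 4))   # 0.1267, i.e. 1/N + 1/D_K**2 = 1/64 + 1/9
```

For a 3 × 3 kernel the separable form thus costs roughly an eighth to a ninth of the standard convolution, which is the source of the lightweight model's savings.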

Normalization Adjustment.
In the training of deep convolutional neural networks, the problems of network deepening, difficult training, and slow convergence arise [27]. BN and GN are shown in Figure 6. Gradients can vanish by the time they reach the lower layers of the neural network. On top of BN, the optimized algorithm adds the group normalization method, which normalizes the same group within the same feature map, where groups are divided along the channel dimension. The normalization operation is therefore independent of the batch size and avoids its influence.
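A minimal NumPy sketch of group normalization, assuming NHWC layout and omitting the learned scale/shift for brevity, shows why the operation is independent of batch size: the statistics are computed per sample and per channel group, never across the batch.

```python
import numpy as np

def group_norm(x, groups, eps=1e-5):
    """Normalize an NHWC tensor per (sample, channel-group); the batch
    dimension never enters the statistics, unlike batch normalization."""
    n, h, w, c = x.shape
    g = x.reshape(n, h, w, groups, c // groups)
    mean = g.mean(axis=(1, 2, 4), keepdims=True)   # per sample, per group
    var = g.var(axis=(1, 2, 4), keepdims=True)
    g = (g - mean) / np.sqrt(var + eps)
    return g.reshape(n, h, w, c)

x = np.random.randn(2, 8, 8, 16)
y = group_norm(x, groups=4)
print(y.shape)  # (2, 8, 8, 16)
```

The same computation yields identical per-sample results for a batch of 1 or 64, which is what makes GN attractive for the small batches typical of embedded training and inference.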

Experimental Analysis and Visualization
The experimental platform is divided into two portions, a hardware platform and a software platform. Hardware: two NVIDIA GeForce GTX 1080 Ti GPUs, an Intel Core i7-7700 CPU, and 32 GB of memory. Software: the TensorFlow 1.13.1 GPU deep learning framework, the PyCharm Community IDE, and Linux Ubuntu 16.04. To study the dynamic multitarget in front of the vehicle, the 10 target types in the BDD100K data set are analyzed, Python script files are generated, and the seven types of target image data and label files studied in this paper are extracted.
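A hedged sketch of this extraction step: the record layout below imitates BDD100K-style JSON labels, but the exact field names and the filtering logic are assumptions for illustration, not the paper's actual scripts.

```python
# The seven target categories studied in this paper.
KEEP = {"car", "bus", "truck", "bike", "motor", "rider", "person"}

def filter_labels(records):
    """Keep only images containing at least one of the seven target
    categories, dropping all other annotations from each record."""
    out = []
    for rec in records:
        boxes = [l for l in rec.get("labels", []) if l.get("category") in KEEP]
        if boxes:
            out.append({"name": rec["name"], "labels": boxes})
    return out

records = [
    {"name": "a.jpg", "labels": [{"category": "car"},
                                 {"category": "traffic light"}]},
    {"name": "b.jpg", "labels": [{"category": "train"}]},
]
print(filter_labels(records))
# [{'name': 'a.jpg', 'labels': [{'category': 'car'}]}]
```
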

Experiment Data Set.
The experimental data sets are shown in Table 1. The statistics cover instance objects from BDD100K and the team's own test set. The target categories in the experiment are car, bus, truck, bike, motor, rider, and person.

Visual Analysis.
Visual analysis is carried out with the original YOLOv3 algorithm and the optimized YOLOv3 algorithm, respectively, and the loss values of the models during training are recorded, as shown in Figure 7. The comparison shows that the optimized algorithm's loss curve drops more steeply at first and then descends more gradually as training converges. The missed detection rate and false detection rate on the target detection test set are considered essential evaluation indicators for environment perception in unmanned driving, as they directly reflect the model's trustworthiness.

Visualization of Target Detection.
The visualization of the target detection effect is shown in Figures 8 and 9: the improved YOLOv3-MobileNetv2 model is compared with the original YOLOv3, and the dynamic multitarget detection in front of the test vehicle is visualized.
In real-scene detection, multitarget detection can be carried out on all detection objects, including cars, trucks, bicycles, pedestrians, and motorcycles. The experiments show that the effect of multitarget detection in front of vehicles is significantly improved: the improved YOLOv3-MobileNetv2 model markedly improves the visualized dynamic multitarget detection in front of the test vehicle.

Conclusion
A dynamic multitarget detection algorithm for the area in front of the vehicle, based on improved YOLOv3, is proposed in this paper. The lightweight network MobileNetv2 replaces Darknet-53 as the backbone for feature extraction, and the normalization and optimizer are adjusted to accelerate the convergence of the network. The two models trained with and without the optimizations are compared and analyzed from the perspectives of loss visualization, missed and false detection rates, model size, and inference time. The experiments show that the optimized model improves mAP by 0.5%, reduces the parameter count by about 89% compared with the basic YOLOv3 model, and reduces CPU inference time to about 70% of the original. The visualization of the test set and actual scene detection qualitatively verifies that the proposed algorithm has a good detection effect for dynamic multitarget detection in front of vehicles. In current research on environment perception for intelligent connected vehicles, image-level research uses the camera as a sensor, covering target detection, target tracking, and semantic or instance segmentation. There is still room for improvement in accuracy and speed; common industry practice is sensor fusion, which matches and fuses the point cloud information obtained by radar with the image information obtained by the camera to achieve higher perception accuracy. The significance of this paper is that it optimizes the speed of camera-based target detection, saving computing time and space for subsequent fusion. The algorithm in this paper targets only embedded systems for environmental detection and has not been transplanted to other embedded platforms, and the robustness of the model has not been specifically analyzed; follow-up research will focus on these aspects.

Data Availability
The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.

Conflicts of Interest
The authors declare that they have no conflicts of interest.