Intelligent Perception and Positioning Technology of Internet of Things by K-Nearest Neighbor Matching Algorithm

To study the intelligent sensing and positioning technology of the Internet of Things (IoT) combined with the K-nearest neighbor (KNN) algorithm, the KNN matching algorithm and its optimizations are introduced in the context of indoor Wi-Fi positioning. The study proposes a weighting K-nearest neighbor (WKNN) algorithm by weighted Euclidean distance, an adaptive weighted Euclidean distance K-nearest neighbor Wi-Fi localization algorithm, and an optimal K-value Wi-Fi fingerprint localization algorithm, and verifies their positioning errors experimentally. The experimental results show that the lowest error when signal values are acquired continuously for 3 s in experimental environment A is 1.8815 m, which is 10.13% lower than the error of acquiring only 1 s at the same K-value. The lowest error of scheme two in environment B is 1.8862 m, which is 7.06% lower than the error of scheme one at the same K-value. The optimal K-value Wi-Fi fingerprint positioning algorithm by distance constraint has better positioning accuracy than the other KNN positioning algorithms, and its positioning fluctuation is smaller. The average positioning error of the optimal K in environment A is 1.2987 m, which is 0.2797 m less than the average of the traditional positioning algorithm. In environment B, the average positioning error of the optimal K is 1.5353 m, which is 0.3253 m less than the average of the traditional positioning algorithm. Therefore, the proposed optimal K-value Wi-Fi positioning algorithm has better performance.


Introduction
With the rapid development of Internet information technology, the Internet of Things (IoT) transforms the physical world into a digital world through sensing devices. IoT realizes the connection between things and things and people and things [1]. Location-Based Service (LBS) [2] is the basis for IoT to perceive the network direction. It is divided into indoor and outdoor positioning. Outdoor positioning is usually achieved through satellite positioning and base station positioning. At present, Global Positioning System (GPS) [3] is the most widely used and the most mature technology. Base station positioning is applied to mobile phone users. Mobile phone base station positioning is used to obtain the location information of mobile terminal users through the telecommunication mobile operator network. Indoor positioning includes Wi-Fi positioning, Radio Frequency Identification (RFID) positioning [4], and visual positioning of image acquisition by sensors. Currently, Wi-Fi positioning is a relatively mature technology with many applications. Wi-Fi has been popularized. So, the experiment does not need to lay special equipment for positioning. This research is aimed at using Wi-Fi fingerprint data more effectively to locate objects. Cui et al. [5] proposed an improved adaptive genetic algorithm to optimize the BP (Back Propagation) neural network. In this method, genetic algorithm selection, crossover, and mutation operations are used to optimize the weights and deviations of the BP neural network. On the one hand, the algorithm improves the selection operator in the adaptive genetic algorithm by maintaining the optimal strategy. That is, the population of each generation is classified from high to low according to adaptability. Then, the highest 20% of the population will be passed on directly to the next generation, while the worst 20% will be eliminated. 
The remaining 80% of the population will be selected by roulette wheel selection according to each individual's selection probability, so that the population size remains unchanged. On the other hand, the crossover and mutation probability equations in the adaptive genetic algorithm are improved. Han [6] proposed a new millimeter-wave positioning system that can meet the needs of IoT applications and established the positioning model at the system level. With the widespread deployment of access points (APs) and the global popularity of smartphones, Wi-Fi-based indoor positioning has attracted great attention from scholars. In indoor environments, locating and tracking objects plays an important role in IoT applications and services. However, because the received AP signal strength is highly unstable, achieving high accuracy with Wi-Fi positioning technology is a challenging problem. Zhang et al. [7] proposed an AP selection algorithm by multiobjective optimization to improve indoor Wi-Fi positioning accuracy. The adaptive AP selection algorithm can be easily applied to various real scenarios, a learning algorithm is used to obtain its optimal solution, and its performance is better than that of the classic algorithm. Research on IoT positioning technology is extensive, but there is little research on the combination of the K-nearest neighbor algorithm, the distance-constrained optimal K optimization algorithm, and indoor positioning technology. Such research can provide new inspiration for indoor positioning technology and effectively improve the accuracy of indoor positioning.
The study uses an experimental research method. By studying the principle of the K-nearest neighbor algorithm, the K-nearest neighbor matching algorithm is optimized. The innovation is that the K-nearest neighbor matching algorithm is applied to indoor Wi-Fi positioning, and an optimal K-value algorithm constrained by the weighted Euclidean distance is proposed. The algorithm improves the accuracy of indoor positioning and reduces positioning errors, and the experiments verify the performance of IoT positioning technology in indoor positioning. The overall framework is shown in Figure 1.

Materials and Methods
2.1. The Architecture and Core Technology of the IoT. The prominent feature of the IoT is acquiring various information of the physical world through various sensing methods and then combining the Internet, wired network, and wireless mobile communications for information transmission and interaction [8]. The IoT uses intelligent computing technology to analyze and process information, thereby enhancing people's perception of the material world and achieving intelligent decision-making and control.
2.1.1. IoT Network Architecture. The architecture of the IoT is mainly composed of three levels: the perception layer (perception control layer), the network layer, and the application layer [9]. The network layer is also called the transport layer. The specific structure is shown in Figure 2. The perception layer of the IoT mainly completes the collection and conversion of information and is the basis for realizing the comprehensive perception of the IoT [10]. It relies on RFID, cameras, GPS, and similar devices, using sensors to collect device information and RFID technology to achieve transmission and identification within a certain range. The network layer is mainly responsible for the safe and error-free transmission of the information collected by the sensors to the application layer [11]. The application layer mainly solves the problems of information processing and the human-machine interface [12].
2.1.2. Wi-Fi Positioning Technology of the IoT. The core technologies of the IoT mainly include RFID technology and wireless network technology. Among them, Wi-Fi indoor positioning in the wireless network technology of the IoT has the advantages of easy expansion, automatic data update, and low cost [13]. The principle of Wi-Fi positioning technology is that each wireless router access point (AP) [14] that converts a wired network to a wireless network has a unique media access control (MAC) address [15]. Generally, a wireless AP is not moved after deployment. After the device turns on the Wi-Fi function, it can search for nearby wireless AP signals. Regardless of whether the wireless AP is encrypted or connected to the device, the MAC address of the wireless AP can be obtained. The device sends the obtained MAC address of the wireless AP to the location server. After the server receives the MAC address of the wireless AP, it can calculate the location of the device.
The location fingerprint method [16] is a widely used indoor nonranging algorithm. Due to the limitation of the indoor space structure, the signals in different physical spaces are not evenly distributed. If the characteristic information received by the wireless signals of different spatial geographic locations is used as the fingerprint of the current geographic location, the similarity between the fingerprint and the established spatial fingerprint database can be compared to obtain the coordinates of the spatial geographic location to be measured. This method does not need to understand the distance relationship between the AP point and the mobile terminal device but only needs to collect the fingerprint signal of the reference point in advance to establish a fingerprint database and perform positioning according to the AP intensity value to match [17]. The schematic diagram of the location fingerprint positioning principle is shown in Figure 3.
When the offline database is established, the indoor space to be measured should be divided into a reasonable number of observation points according to a certain size ratio, and the observation points should be selected as far as possible to cover the entire space to be measured. Then, the real-time AP value and coordinate position of the observation point are stored, a mapping group is established, and an offline fingerprint database is formed.
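The offline stage described above can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the function name `build_fingerprint_db`, the dictionary layout, and the choice to average several scans per observation point are hypothetical and are not taken from the paper.

```python
from statistics import mean


def build_fingerprint_db(survey):
    """Map each observation point (x, y) to a single RSSI vector.

    `survey` maps a coordinate to a list of scans; each scan lists the
    RSSI (dBm) of every AP in a fixed order. Averaging the scans is an
    assumption of this sketch, not a step stated in the text.
    """
    db = {}
    for coord, scans in survey.items():
        # zip(*scans) groups the readings AP by AP
        db[coord] = [mean(ap_readings) for ap_readings in zip(*scans)]
    return db


# Hypothetical survey: two observation points, three APs, two scans each
survey = {
    (0.0, 0.0): [[-40, -70, -65], [-42, -72, -63]],
    (2.0, 0.0): [[-55, -60, -70], [-53, -62, -70]],
}
fingerprint_db = build_fingerprint_db(survey)
```

Each entry of `fingerprint_db` is the mapping group mentioned above: a coordinate paired with the signal strength vector observed there.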

2.2. K-Neighbor Algorithm and Optimization. Wi-Fi positioning usually uses the "nearest neighbor method" to judge position [18]. There are usually four methods used for online positioning, among them K-nearest neighbor (KNN) and weighting K-nearest neighbor (WKNN) [...]. KNN works with the received signal strength indication (RSSI) [22] collected by the mobile terminal equipment in real time and the RSSI vector sets of the observation points: it selects the observation point coordinates with the smallest Euclidean distance and uses them to estimate the final position coordinates of the point to be located. The mobile terminal is used to obtain, at the point to be located, the signal strength value of each AP deployed in advance in the positioning area. The received signal strength vector is expressed as

$$R_i = (RSSI_{i1}, RSSI_{i2}, \cdots, RSSI_{in}) \tag{1}$$

where $n$ is the number of APs that can be detected at the location to be located and $i$ denotes the point to be located. The fingerprint data established in the fingerprint database are expressed as

$$F_j = (rssi_{j1}, rssi_{j2}, \cdots, rssi_{jn}) \tag{2}$$

where $j$ denotes the jth record and $F_j$ is the sampling value of the jth observation point. The Euclidean distance between the signal strength of the point to be located and the signal strength of the observation point is expressed as

$$dis_j = \sqrt{\sum_{k=1}^{n} (RSSI_{ik} - rssi_{jk})^2} \tag{3}$$

Then, the coordinates of the observation point that is closest to the point to be located are shown in the following equation:

$$(\hat{x}, \hat{y}) = (x_{j^*}, y_{j^*}), \quad j^* = \arg\min_j dis_j \tag{4}$$

However, because indoor signals are highly dispersed and unevenly distributed, positioning by the single nearest distance is not accurate enough. Therefore, the Euclidean distances between the signal strength collected at the point to be located and the signal strengths in the fingerprint database are arranged in ascending order. The average of the coordinates of the first K (K ≥ 2) observation points with the smallest Euclidean distance is used as the final coordinate of the terminal device to be located, as shown in the following equation:

$$(\hat{x}, \hat{y}) = \frac{1}{K} \sum_{j=1}^{K} (x_j, y_j) \tag{5}$$

2.2.1. WKNN Positioning Algorithm.
The KNN algorithm takes the geometric center of the coordinate positions of the selected observation points as the final result. However, the position of each observation point is different, its distance to the point to be located is different, and its contribution to the positioning is therefore also different. WKNN takes the distance between each observation point and the point to be located as a weighting factor. The smaller the Euclidean distance between the observation point and the point to be located, the greater the assigned weight. Then, the coordinates of the point to be located are expressed as

$$(\hat{x}, \hat{y}) = \sum_{j=1}^{K} \frac{1/(dis_j + \varepsilon)}{\sum_{m=1}^{K} 1/(dis_m + \varepsilon)} (x_j, y_j) \tag{6}$$

where $dis_j$ is the Euclidean distance between the jth observation point and the point to be located, $K$ is the number of selected observation points with the smallest Euclidean distance, and $\varepsilon \neq 0$ is a small value that prevents the denominator from being zero. The strength of the received wireless signal decreases as the distance increases, so the weight factor is negatively correlated with the Euclidean distance.
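As a minimal sketch of the plain K-nearest average and the inverse-distance weighting described above, the following Python code may help. The function names, the dictionary-based fingerprint database, and the sample values are hypothetical; `eps` plays the role of the small value ε.

```python
import math


def signal_distance(a, b):
    """Euclidean distance between two RSSI vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def nearest_k(rssi, db, k):
    """Return the k (distance, coordinate) pairs closest in signal space."""
    ranked = sorted((signal_distance(rssi, fp), coord) for coord, fp in db.items())
    return ranked[:k]


def knn_position(rssi, db, k):
    """Plain KNN: unweighted mean of the k nearest observation points."""
    coords = [c for _, c in nearest_k(rssi, db, k)]
    return (sum(x for x, _ in coords) / len(coords),
            sum(y for _, y in coords) / len(coords))


def wknn_position(rssi, db, k, eps=1e-6):
    """WKNN: neighbors weighted by 1 / (distance + eps), then normalized."""
    nbrs = nearest_k(rssi, db, k)
    weights = [1.0 / (d + eps) for d, _ in nbrs]
    total = sum(weights)
    x = sum(w * c[0] for w, (_, c) in zip(weights, nbrs)) / total
    y = sum(w * c[1] for w, (_, c) in zip(weights, nbrs)) / total
    return (x, y)


# Hypothetical database: three observation points, two APs
db = {(0.0, 0.0): [-40, -70], (4.0, 0.0): [-60, -50], (0.0, 4.0): [-50, -60]}
query = [-42, -68]
knn_estimate = knn_position(query, db, 2)
wknn_estimate = wknn_position(query, db, 2)
```

In this toy example the nearest observation point is four times closer in signal space than the second one, so the WKNN estimate is pulled toward it while the KNN estimate sits halfway between the two.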

2.2.2. KNN Positioning Algorithm by Weighted Euclidean Distance. The weight-based KNN positioning algorithm assumes that the wireless signal strength of every AP received at a point to be measured carries the same weight. In practice, the signal strength received from different AP points is different, so this algorithm has drawbacks. WKNN with weighted Euclidean distance improves WKNN by giving a relatively high weight to AP points whose received signal strength is high. Conversely, AP points with low signal strength are given low weight values. In this way, the accuracy of calculating the fingerprint similarity of the points to be located improves. Suppose the AP signal strength detected at a point to be located is expressed as $(RSSI_1, RSSI_2, \cdots, RSSI_l)$, where $l$ is the number of AP points, and the corresponding fingerprint value in the fingerprint database is $(rssi_1, rssi_2, \cdots, rssi_l)$. The weight corresponding to each AP is given by equation (7). Then, the weighted Euclidean distance between the point to be located and the observation point is shown in the following equation:

$$dis = \sqrt{\sum_{j=1}^{l} w_j (RSSI_j - rssi_j)^2} \tag{8}$$

where $w_j$ is the weight of the jth AP from equation (7), $RSSI_j$ is the wireless signal strength of the jth AP in the set of signal strength values received at the point to be located, and $rssi_j$ is the wireless signal strength of the jth AP received at the observation point.
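A rough Python sketch of a weighted Euclidean distance follows. The weight formula used here is an assumption made only for illustration: each AP is weighted in proportion to 1/|RSSI| (RSSI in negative dBm) and the weights are normalized to sum to 1, so stronger APs count more. It should not be read as the paper's own weight equation; `ap_weights` and `weighted_euclidean` are hypothetical names.

```python
import math


def ap_weights(rssi):
    """ASSUMED weighting: proportional to 1/|RSSI|, normalized to sum to 1.

    Expects strictly negative dBm values, so a stronger (less negative)
    signal receives a larger weight.
    """
    inverse = [1.0 / abs(v) for v in rssi]
    total = sum(inverse)
    return [v / total for v in inverse]


def weighted_euclidean(rssi, fingerprint):
    """Weighted Euclidean distance between a measured vector and a fingerprint."""
    weights = ap_weights(rssi)
    return math.sqrt(sum(w * (a - b) ** 2
                         for w, a, b in zip(weights, rssi, fingerprint)))


measured = [-40, -80]  # the first AP is the stronger one
d_strong_mismatch = weighted_euclidean(measured, [-43, -80])  # 3 dB off on the strong AP
d_weak_mismatch = weighted_euclidean(measured, [-40, -83])    # 3 dB off on the weak AP
```

With these sample values the same 3 dB mismatch produces a larger distance when it occurs on the strong AP, which is the behavior the weighting is meant to achieve.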

2.2.3. Adaptive WKNN Positioning Algorithm. Due to a series of uncertainties in the indoor environment, if K is fixed, the coordinates of some of the nearest observation points may deviate considerably from the point to be located. Therefore, an adaptive K-algorithm for WKNN is proposed. First, WKNN by weighted Euclidean distance is used to select the best K-value from the global positioning error. Second, a reasonable distance threshold is selected according to the experimental scene, and the observation point with the greatest signal strength similarity, that is, the coordinate point with the smallest weighted Euclidean distance, is used as the reference. Other observation points whose distance from this nearest observation point is greater than the threshold are considered to be farther away from the point to be located and should not participate in the position calculation. The algorithm process is given by equations (9), (10), (11), and (12).
WKNN by weighted Euclidean distance is used to select the best K-value from the global positioning error. With the K-value fixed, the K reference points are sorted by weighted Euclidean distance, and their coordinates are expressed as

$$\{(x_1, y_1), (x_2, y_2), \cdots, (x_K, y_K)\} \tag{9}$$

Find the coordinates $(x_1, y_1)$ corresponding to the minimum weighted Euclidean distance to the point to be located among the K reference points. Then, calculate the Euclidean distance between this coordinate and each of the remaining K − 1 reference points, as expressed in the following equation:

$$dis_{1k} = \sqrt{(x_1 - x_k)^2 + (y_1 - y_k)^2} \tag{10}$$

where $k$ indexes the remaining K − 1 reference points. Under the condition of equation (11), when $dis_{1k} \le dis_{min}$, the kth reference point is near the first reference point, where $dis_{min}$ is the distance threshold. If the distance is larger than the threshold, the point is considered relatively discrete; it should not affect the positioning and is discarded.
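The discard rule above, followed by averaging of the retained points, might be sketched in Python as follows. The function name `adaptive_k_position`, the argument layout, and the sample threshold are hypothetical; the neighbors are assumed to arrive already ordered so that the first one has the smallest weighted Euclidean distance.

```python
import math


def adaptive_k_position(neighbors, dis_min):
    """Drop reference points farther than dis_min from the first one, then average.

    `neighbors` is a list of (x, y) coordinates of the K reference points,
    with neighbors[0] the most similar point in signal space.
    Returns the averaged coordinate and the number of points kept.
    """
    x1, y1 = neighbors[0]
    kept = [p for p in neighbors
            if math.hypot(p[0] - x1, p[1] - y1) <= dis_min]
    n = len(kept)  # at least 1, since the first point is at distance 0
    return (sum(p[0] for p in kept) / n, sum(p[1] for p in kept) / n), n


# Hypothetical K = 4 neighbors; the last one is an outlier in physical space
neighbors = [(1.0, 1.0), (2.0, 1.0), (1.0, 2.0), (9.0, 9.0)]
position, used = adaptive_k_position(neighbors, dis_min=2.0)
```

Here the fourth reference point lies far from the other three, so only three points contribute to the estimate and the effective K shrinks from 4 to 3.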
Finally, all the reference points satisfying the condition are taken as the usable points, $t \le K$ in number, and the average of these points is the coordinate position $(\hat{x}, \hat{y})$ of the final positioning point, as shown in the following equation, where the sum runs over the $t$ usable points:

$$(\hat{x}, \hat{y}) = \frac{1}{t} \sum_{m=1}^{t} (x_m, y_m) \tag{12}$$

2.2.4. WKNN Positioning Algorithm with Optimal K. The WKNN positioning algorithm with adaptive K has been introduced above. Due to the influence of the complex indoor environment, if the coordinates of the most similar observation point themselves carry a large error, the position error of the point to be located will also be large. Therefore, an optimal K-algorithm by the weighted Euclidean distance is designed. This algorithm considers the weighted coordinates obtained by the weighted Euclidean distance algorithm when K takes values from 1 to 6. Using the Inverse Distance Weighting (IDW) interpolation algorithm [23], the signal strength value at the weighted coordinate is calculated for each value of K. In theory, the closer a coordinate is to the point to be located, the smaller the weighted Euclidean distance between their signal strengths, that is, the higher their similarity. For each value of K, the similarity between the signal strength at the weighted coordinate and the actual signal strength of the point to be located is obtained. The weighted coordinate whose signal strength is closest to that of the point to be located is taken as the positioning result (Figure 5).

The RSSI signal strength value is unique for each location within the site. The AP signal value corresponding to each sampling point is received by using the Rsscollect signal collector [24]. Before constructing the fingerprint database, it is necessary to preprocess the collected data to remove redundancy.
Regardless of whether the indoor environment is within the visible range, the strength of the collected RSSI signal usually has the characteristics of a normal distribution. Gaussian filters are used to store high-probability data below 90%. The filtered signal strength value and its corresponding relative physical coordinates are stored in the fingerprint database to construct an offline fingerprint database [25].
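One common way to apply Gaussian filtering to RSSI samples is sketched below in Python. The details are assumptions made for illustration, since the text does not spell them out: a normal distribution is fitted to the samples of one AP at one observation point, samples outside the central interval of `z` standard deviations are dropped (z = 1.645 corresponds to the central 90% of a normal distribution), and the mean of the remaining samples is stored. The function name `gaussian_filter_rssi` is hypothetical.

```python
from statistics import mean, pstdev


def gaussian_filter_rssi(samples, z=1.645):
    """Keep samples within z standard deviations of the mean.

    Returns the retained samples and their mean. The interval width is an
    assumed parameter of this sketch.
    """
    mu = mean(samples)
    sigma = pstdev(samples)
    if sigma == 0:
        return list(samples), mu  # all samples identical, nothing to filter
    kept = [s for s in samples if abs(s - mu) <= z * sigma]
    return kept, mean(kept)


# Hypothetical readings (dBm) of one AP at one observation point, with one outlier
samples = [-60, -61, -59, -60, -62, -58, -60, -90]
kept, filtered_mean = gaussian_filter_rssi(samples)
```

The single low reading, which could come from a momentary occlusion, is discarded, and the stored value moves from the raw mean of -63.75 dBm to -60 dBm.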

Collection of the Signal Strength of the Observation Point. First, choose a reasonable test site. Second, deploy enough AP points to ensure that the Wi-Fi signal covers the entire test area, and place the AP points as asymmetrically as possible to avoid the influence of spatial symmetry of signal strength. In the experiment, a Vivo X9 Android smartphone equipped with a signal collector is used to collect the RSSI signal strength value of each observation point and to record the corresponding physical coordinates of that point. To ensure the positioning performance, the handheld smartphone should be kept at the same horizontal level as much as possible during signal acquisition. When collecting observation point data, the three-axis acceleration sensor data displayed on the collector interface can be used to correct the posture of the portable mobile terminal: when the phone is held level in front of the chest, the acceleration on the x-axis and y-axis is theoretically 0 and the gravitational acceleration falls on the z-axis. Therefore, when collecting the observation point signal, this real-time posture reading is used as a reference to reduce the posture error of the handheld mobile terminal. The experiments are carried out in the corridors of two teaching buildings at a university. Experimental environment A is 80 m long and 10 m wide. Experimental environment B is 30 m long and 20 m wide. The site environment is shown in Figure 6. The signal strength is affected by pedestrian movement, interference of indoor wireless channels, jitter of portable mobile terminals, and occlusion by indoor objects. If only a single set of experimental data is used to locate the point to be located, a large positioning deviation will occur. Therefore, to simplify the calculation, the data measured continuously for 3 seconds are averaged, and the averaged signal value is taken as the signal collected at the point to be located at the intermediate moment.
Then, over the 48 s collection period in field A, the average of every consecutive 3 s is used as the collected RSSI signal value, and the corresponding moment is taken as a point to be located. This gives a total of 16 points to be located (Table 1). Similarly, the coordinate values of 16 points to be located are obtained in test site B (Table 2).
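The windowing arithmetic can be illustrated with a few lines of Python. The sketch assumes one RSSI scan per second, which the text does not state, and the function name `window_means` and the synthetic readings are hypothetical.

```python
def window_means(readings, window=3):
    """Average consecutive non-overlapping windows of per-second RSSI vectors."""
    averaged = []
    for start in range(0, len(readings) - window + 1, window):
        chunk = readings[start:start + window]
        # zip(*chunk) groups the readings AP by AP within the window
        averaged.append([sum(column) / window for column in zip(*chunk)])
    return averaged


# 48 synthetic one-second scans of two APs (assumed 1 Hz sampling)
readings = [[-50 - (t % 3), -70] for t in range(48)]
points = window_means(readings)
```

With 48 one-second scans and a 3 s window, the function yields 48 / 3 = 16 averaged vectors, matching the 16 points to be located in each test site.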
Unlike stationary single-point positioning, indoor mobile positioning of pedestrians places higher requirements on server performance, mobile terminal equipment, and positioning matching algorithms. In indoor positioning, limited by the direction and speed of human movement, the RSSI collected at a moving point is unstable, which decreases the positioning accuracy. Therefore, on the basis of the optimal K WKNN algorithm, the spatial proximity and temporal continuity between adjacent positioning points are used to process and optimize the positioning results through information such as the value of the acceleration sensor. First, the real-time acceleration data between the points to be located are used to estimate the moving distance within 3 s. This distance is used as the spatial constraint between the previous and next position coordinates of the pedestrian. The optimal K WKNN algorithm is used to find the optimal position coordinates of the current point to be located. This coordinate is taken as the center of a circle, and the moving distance within 3 seconds between the points to be located is used as the space constraint radius. The circle is used to correct the Wi-Fi location fingerprint positioning of the pedestrian at the next moment. In this way, refined positioning of mobile users can be achieved, thereby improving positioning accuracy. Assumption: when people walk indoors, they produce acceleration values in three directions: horizontal, front and back, and vertical. These three acceleration values are periodic in the time series. The acceleration value in the [...]

Due to the complexity of the indoor environment, the signal strength used by the optimal K WKNN algorithm fluctuates up and down, which affects the positioning accuracy.
The sampling time interval is 3 s, so the points to be located need to meet the distance constraint of movement within 3 s. The built-in three-axis acceleration sensor of the mobile phone is used to calculate the pace and step length of pedestrians between the points to be located. The moving distance is used as a constraint condition, and the positioning result of the optimal K WKNN algorithm is corrected. Therefore, the overall positioning performance is improved. Low-pass filtering is used to preprocess the acceleration data, reduce the detection error rate, and eliminate the gravity part, as shown in the following equation:
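A possible form of the distance-constrained correction is sketched below in Python. The text says that the moving distance acts as a constraint radius around the previous position but does not state how a violating estimate is corrected; pulling the estimate back onto the circle is an assumption of this sketch, and `constrain_to_radius` is a hypothetical name.

```python
import math


def constrain_to_radius(previous, estimate, radius):
    """Limit the new estimate to a circle of `radius` around `previous`.

    An estimate inside the circle is returned unchanged; one outside is
    moved along the same direction onto the circle (assumed correction rule).
    """
    dx = estimate[0] - previous[0]
    dy = estimate[1] - previous[1]
    distance = math.hypot(dx, dy)
    if distance <= radius:
        return estimate
    scale = radius / distance
    return (previous[0] + dx * scale, previous[1] + dy * scale)


# Hypothetical values: the pedestrian moved at most 5 m in the last 3 s
corrected = constrain_to_radius((0.0, 0.0), (6.0, 8.0), radius=5.0)
unchanged = constrain_to_radius((0.0, 0.0), (1.0, 1.0), radius=5.0)
```

In the first call the raw fingerprint estimate is 10 m from the previous position, which is not reachable in 3 s under the assumed step estimate, so it is pulled back to 5 m in the same direction.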

Error Comparison of Different Collection Schemes. In experimental environments A and B, the signal strength collected in 1 s (scheme one) and the average of the signal strength collected over 3 s (scheme two) are used to compare the positioning errors at different K-values (Figure 7). Figure 7 suggests that in both experimental environment A and experimental environment B, the error of averaging 3 s of values is significantly smaller than the error of collecting only a single set of data. In experimental environment A, the lowest error of scheme two is 1.8815 m, which is 10.13% lower than the error of scheme one at the same K-value; in experimental environment B, the lowest error of scheme two is 1.8862 m, which is 7.06% lower than the error of scheme one at the same K-value.
Figure 7 shows that in experimental environment A, scheme one has the lowest error when K = 2 and scheme two has the lowest error when K = 4. In experimental environment B, scheme one has the lowest error when K = 3 and scheme two has the lowest error when K = 4. The error values of all the positioning points under the two schemes are compared in Figure 8. Figure 8 reveals that, compared with scheme one, the positioning error of most points to be measured is reduced in scheme two. Combining the results of Figure 5, the same conclusion can be drawn; that is, averaging the values measured continuously for 3 s gives a smaller error and more stable positioning performance. Therefore, the method of averaging the signal strength measured continuously for 3 seconds is used to determine the coordinates of the point to be located.

Performance Analysis of WKNN Algorithm by Weighted Euclidean Distance. The size of the weighted Euclidean distance is used to judge the similarity of signal strength. Normally, the similarity of signal strength decreases as the weighted Euclidean distance increases. Figure 9 shows the positioning error distribution of the points to be located for values of K between 1 and 6 in the different experimental scenarios.
As Figure 9 shows, in experimental environment A, the first positioning point has the smallest positioning error when K = 2, the second positioning point has the smallest positioning error when K = 6, the third and fourth points have the smallest positioning error when K = 1, the fifth and sixth points have the smallest error when K = 3, and the seventh and eighth points have the smallest error when K = 1. To minimize the error from the ninth point to the sixteenth point, the value of K is different. The same is true for experimental environment B, and the value of the best positioning effect K for each point to be positioned is not unique.

Performance Analysis of Adaptive K WKNN Algorithm. From the above performance analysis of the weighted Euclidean distance algorithm, it can be concluded that in experimental environment A, when K = 4, the positioning effect is relatively stable and there is no significant jitter. The adaptive K-algorithm has been introduced above. After K = 4, the value with the best comprehensive positioning performance, is selected, the coordinate point with the smallest weighted Euclidean distance is used as the reference, and the discrete points among the 4 observation points are removed. The location error comparison between WKNN by weighted Euclidean distance (K = 4) and adaptive K-value WKNN in the experimental environment is shown in Figure 10. Figure 10 shows that if the value of K is fixed, it is not possible for every point to be located to have the optimal positioning effect. If the K-value is changed according to a certain rule so that different positioning points can independently select the optimal K-value, then the positioning accuracy of each positioning point can be improved and the error can be reduced.

Performance Analysis of the Optimal K WKNN Algorithm. The optimal K-algorithm builds on adaptive K by combining it with the weighted Euclidean distance for positioning. For location point 2 and location point 3 of experimental environments A and B, the weighted Euclidean distance between the signal strength calculated by the interpolation algorithm at the weighted coordinate for each K-value and the signal strength measured at the point to be located is shown in Table 3. Table 3 suggests that for anchor point 2 the weighted Euclidean distance is the smallest when K = 5, and for anchor point 3 it is the smallest when K = 1. Therefore, the K-values of anchor point 2 and anchor point 3 are K = 5 and K = 1, respectively.
The data in Table 4 show that for anchor point 2 the weighted Euclidean distance is the smallest when K = 2, and for anchor point 3 it is the smallest when K = 3. Therefore, the K-values of anchor points 2 and 3 are K = 2 and K = 3, respectively.
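The selection rule used in these tables, choosing the K whose candidate coordinate best reproduces the measured signal, can be sketched in Python as follows. This is an illustration under assumptions: the candidate coordinates per K are passed in directly, the IDW exponent of 2 is a conventional default, and a plain (unweighted) Euclidean distance in signal space stands in for the weighted one. The names `idw_rssi` and `select_optimal_k` and all sample values are hypothetical.

```python
import math


def idw_rssi(point, db, power=2):
    """Interpolate an RSSI vector at `point` by inverse distance weighting."""
    numerators = None
    denominator = 0.0
    for coord, fingerprint in db.items():
        d = math.hypot(point[0] - coord[0], point[1] - coord[1])
        if d == 0.0:
            # The point coincides with an observation point
            return [float(v) for v in fingerprint]
        w = 1.0 / d ** power
        denominator += w
        if numerators is None:
            numerators = [w * v for v in fingerprint]
        else:
            numerators = [n + w * v for n, v in zip(numerators, fingerprint)]
    return [n / denominator for n in numerators]


def select_optimal_k(measured, candidates, db):
    """Pick the K whose candidate coordinate has the most similar interpolated RSSI.

    `candidates` maps each K to the coordinate estimated with that K.
    """
    def mismatch(k):
        predicted = idw_rssi(candidates[k], db)
        return math.sqrt(sum((m - p) ** 2 for m, p in zip(measured, predicted)))

    best_k = min(candidates, key=mismatch)
    return best_k, candidates[best_k]


# Hypothetical database and candidate coordinates for K = 1, 2, 3
db = {(0.0, 0.0): [-40, -70], (4.0, 0.0): [-70, -40]}
candidates = {1: (0.0, 0.0), 2: (2.0, 0.0), 3: (3.0, 0.0)}
best_k, best_position = select_optimal_k([-55, -55], candidates, db)
```

In this toy case the measured vector matches the signal interpolated midway between the two observation points, so the candidate for K = 2 is selected.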
The improved algorithms are simulated and analyzed, and the error distributions of the positioning points are compared for three algorithms in experimental environments A and B: WKNN by weighted Euclidean distance, the adaptive K WKNN algorithm, and the optimal K WKNN algorithm.
Figure 11 shows that the optimal K positioning algorithm has the smallest fluctuation in the error distribution among the three algorithms. In experimental environment A, the average positioning error of the optimal K is 1.2987 m, which is 0.2797 m less than that of the traditional weighted Euclidean distance WKNN positioning algorithm. In experimental environment B, the average positioning error of the optimal K is 1.5353 m, which is 0.3253 m less than that of the traditional weighted Euclidean distance WKNN positioning algorithm. Compared with the other two algorithms, the optimal K positioning algorithm performs poorly at only a very few points; overall, its fluctuation is small and relatively stable, its error is small, and its positioning is more accurate.
The positioning error comparison between the distance-constrained optimal K positioning algorithm and the other three algorithms is shown in Figure 12.
In Figure 12, the average positioning error of the distance-constrained optimal K WKNN algorithm in experimental environment A is 1.5005 m. Compared with the optimal K WKNN algorithm, the positioning error is reduced by 0.1713 m. Compared with the adaptive K WKNN, the positioning error is reduced by 0.2195 m. Compared with WKNN by weighted Euclidean distance, it is reduced by 0.3978 m. Similar results are obtained in experimental environment B. Therefore, the optimal K indoor positioning algorithm by distance constraints has a smaller error value and better performance.

Conclusion
Indoor positioning technology in IoT is currently a hot research topic. The K-nearest neighbor algorithm is used in Wi-Fi indoor positioning technology, where its positioning accuracy is high. In this study, the KNN algorithm is upgraded: on the basis of WKNN optimized by the weighted Euclidean distance, an adaptive K-value algorithm and an optimal K-value algorithm with a distance constraint are proposed. The optimal K algorithm has the best stability and flexibility in indoor positioning. Wi-Fi positioning is low in cost and simple to apply, and its prospects for practical application are promising. Some shortcomings still exist. For example, the Wi-Fi signal has a certain degree of volatility. If the established fingerprint database is fixed and cannot be updated in real time, the positioning accuracy will be reduced. In the future, reasonable algorithms will be used to update the fingerprint database in real time. In addition, only the acceleration sensor is used to find the moving distance between the points to be located within the period. In future work, a sensor such as a gyroscope will be used to find the moving direction of the point to be located, so that points with relatively large errors can be further eliminated and the amount of calculation reduced. The designed algorithm has not yet been applied to an actual scene. Later, the system may be applied to specific scenarios, and the experimental data will then be analyzed according to the actual scene.

Data Availability
The data used to support the findings of this study are included within the article.

Conflicts of Interest
The authors declare that they have no conflicts of interest to report regarding the present study.

Wireless Communications and Mobile Computing