Active RFID Attached Object Clustering Method with New Evaluation Criterion for Finding Lost Objects

Active radio frequency identification (RFID) tags that can communicate with smartphones using Bluetooth low energy technology have recently received widespread attention. We have studied a novel approach to finding lost objects using active RFID. We hypothesize that users can deduce the location of a lost object from information about surrounding objects in an environment where RFID tags are attached to all personal belongings. To help find lost objects, the system calculates the proximity between pairs of RFID tags from their RSSI series and estimates the groups of objects in the neighborhood. We developed a method that calculates the proximity of the lost object to those around it using a distance function between RSSI series and estimates the group by hierarchical clustering. Different combinations of distance functions and clustering algorithms yield different clustering results, and there is currently no method to evaluate directly whether a combination is suitable for the application. Thus, we propose the number of nearest neighbor candidates (NNNC) as the criterion to evaluate the clustering results. The simulation results show that the NNNC is an appropriate evaluation criterion for our system because it allows the combinations of distance functions and clustering algorithms to be evaluated exhaustively.


Introduction
Radio frequency identification (RFID), which involves wireless communication of data to identify RFID tags attached to objects, is considered a key technology in the Internet of Things (IoT) field. In recent years, active RFID tags that use Bluetooth low energy (BLE) technology to communicate have attracted increasing attention. BLE is supported by many mobile operating systems (e.g., Android, iOS, and Windows Phone), and many smartphone-based products that use BLE tags for finding lost objects have been released. These products use the received signal strength indicator (RSSI) to report the location of the object. However, they cannot provide sufficient information to identify an object's position; that is, users only know that the lost object is within a certain range and whether it is moving closer or farther away. The authors have studied a method to support users in finding lost objects more effectively. We hypothesize that users can determine the location of lost objects using information about the surrounding objects. In this paper, we introduce a method to calculate the proximity between active RFID tags using an RSSI series. Our approach estimates the group to which the lost object belongs from its proximity to surrounding objects using a distance function and hierarchical clustering. There are many combinations of distance functions and hierarchical clustering algorithms, and different combinations give different group estimation (clustering) results, but there is no criterion for evaluating these results. We therefore propose the number of nearest neighbor candidates (NNNC) as the evaluation criterion.
The remainder of this paper is organized as follows. In Section 2, we describe the requirements of the proposed system and the problems faced in existing methods. Section 3 presents the framework of our system. In Section 4, we propose the evaluation criterion. Section 5 presents the results of evaluation of the existing method using the NNNC. In Section 6, we describe application of the NNNC. Finally, Section 7 concludes the paper and identifies areas for future work.

Figure 1: A user is moving in a room with a smartphone. The smartphone senses data from RFID tags and sends this data to a support system for finding lost objects. The system records the data in a database and presents information about objects that are close to the lost object. The user can then identify the location of the lost object. (Panels: sensing phase and finding phase.)

Support System for Finding Lost Objects
Finding lost objects is a constant need in people's day-to-day lives. According to published statistical research on finding lost objects [1], common strategies used for finding objects can be classified into five categories: the locus search (33%), exhaustive search (24%), retrace search (19%), memory search (11%), and delegation search (11%). The percentages in parentheses show the fraction of people selecting each technique when finding lost objects. The locus search looks where the object is normally to be found, the retrace search follows the sequential order of a person's prior physical locations, and the memory search relies on a person's recollection of prior interactions with the object. Because these three strategies together are the most common, most people can be said to be trying to recall the location of a lost object from memory. We believe that presenting a list of objects that may be located around the lost object will aid the search and thereby compensate for the memory lapses experienced during the locus, retrace, and memory searches.

System Requirements.
As shown in Figure 1, our support system for finding lost objects functions in two phases: sensing data to estimate the group of RFID tags and finding the lost object using information about the proximity of objects. RFID tags that only transmit beacons are attached to all personal belongings. In the sensing phase, when the user walks around indoors with a smartphone, the smartphone receives the IDs of the RFID tags, measures their RSSIs, and logs the time of reception. The system records the data from the smartphone in a database. In the finding phase, the user inputs the ID of the lost object, and the system estimates the group of objects that are near the lost object from the RSSIs. The user can determine the location of the lost object from the information about its group presented by the system. The basic concept of this system is similar to that of Konishi et al.'s system [2]. Unlike their system, ours uses the RSSIs to estimate object groups and employs a smartphone to collect sensing data. In realizing such a system, we must consider the following requirements.

Input
The system must have access to the ID, RSSI series, and reception time from every RFID tag as input.

Output
The system provides information about groups of objects that are around the lost object as output.
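The input requirement can be illustrated with a minimal sketch of how logged receptions might be grouped into one time-ordered RSSI series per tag. The class name `Observation`, its field names, and the sample values are hypothetical and are not taken from the system described here.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    """One beacon reception logged by the smartphone (hypothetical field names)."""
    tag_id: str
    rssi_dbm: float
    received_at: float  # reception time in seconds


def build_rssi_series(observations):
    """Group observations by tag ID and sort each group by reception time."""
    series = defaultdict(list)
    for obs in observations:
        series[obs.tag_id].append((obs.received_at, obs.rssi_dbm))
    return {tag: sorted(points) for tag, points in series.items()}


# Illustrative log entries only; not measured data.
log = [
    Observation("key", -70.0, 2.0),
    Observation("cup", -65.0, 1.0),
    Observation("key", -72.0, 1.0),
]
series = build_rssi_series(log)
```

Each value of `series` is a list of (time, RSSI) pairs, which is the form of input that a distance function between RSSI series would need.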

Problems in Applying Existing Methods to the System.
Indoor tracking and localization is a key research issue in indoor applications such as routing and location services. Many studies have been conducted on methods for obtaining location information about various objects. However, it is difficult to apply these existing methods to our system. There are numerous well-known metrics for localization systems, for example, angle of arrival (AOA), time of arrival (TOA), and time difference of arrival (TDOA), but none of these is suitable for smartphones. The AOA [3] measures the relative angle between transmitters from the direction of propagation of a wireless signal using an antenna array. This cannot be employed in common smartphones, as they do not have an antenna array. The TOA and TDOA [4] compute the distance between the transmitter and receiver by using the transmission time. They require accurate time synchronization between the transmitter and receiver for positioning. Therefore, they cannot be applied to our assumed environment, where an inexpensive RFID tag is used as the transmitter. In contrast, no special hardware is required to measure the RSSI, and it can be obtained from all transmitters that communicate wirelessly.
Various location estimation methods that use the RSSI have been proposed. For instance, there is a well-known method that computes distance using the RSSI and a channel propagation model that has been created in advance [5][6][7]. However, multipath fading and interference can cause the RSSI to fluctuate considerably. Accordingly, the computed distance has low accuracy; in addition, users have the burden of creating a channel propagation model for each environment. Location fingerprint methods [8][9][10] provide accurate position estimates by considering the RSSI from each point as characteristic of that location. Again, users have the burden of creating a characteristic database for each environment. The centroid method [11] and the approximate point-in-triangulation test (APIT) method [12] produce more cost-effective location estimates than the above-mentioned methods. These methods depend on the relative positional relationship between anchor nodes, which have a known position. Users must therefore set up reference nodes in each room, and the estimation accuracy depends on the number of reference nodes. Overall, finding lost objects using existing localization methods places a burden on the user, and no existing method is suitable for our system.
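The sensitivity of model-based ranging to RSSI fluctuation can be illustrated with a short calculation. The sketch below assumes a generic log-distance propagation model; the function name and the numerical values are illustrative assumptions and do not come from the cited methods.

```python
def distance_error_factor(fluctuation_db, path_loss_exponent):
    """Multiplicative error in an estimated distance caused by an RSSI error.

    Inverting a log-distance model gives a distance proportional to
    10 ** (-RSSI / (10 * exponent)), so an RSSI error of `fluctuation_db`
    scales the distance estimate by the factor returned here.
    """
    return 10 ** (fluctuation_db / (10 * path_loss_exponent))


# Illustrative values (assumptions, not measurements from this paper).
factor = distance_error_factor(fluctuation_db=6.0, path_loss_exponent=2.0)
print(f"A 6 dB RSSI error scales the estimated distance by about {factor:.2f}")
```

Under these assumed values, a fluctuation of a few decibels is already enough to roughly double or halve the computed distance, which is consistent with the low accuracy noted above.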

Method Using Distance Function and Hierarchical Clustering
To estimate the group of RFID tags using only RSSIs, we focus on the change in the RSSI values associated with the movement of the receiver. In free space, the RSSIs from RFID tags will decrease with distance according to the Friis equation. This can be expressed as

$$\mathrm{RSSI}(d) = P_t + G_t + G_r - \mathrm{FSPL}(d), \tag{1}$$

where $G_t$ (dB) and $G_r$ (dB) are the gains of the transmit antenna in the RFID tag and the receive antenna in the device, respectively, $P_t$ (dBm) is the transmit power of the RFID tag, and $\mathrm{FSPL}(d) = 20 \log_{10}(4\pi d/\lambda)$ is the free-space path loss at the transmitter-receiver distance $d$ for wavelength $\lambda$. If the transmit power of the RFID tag is constant, changes in the RSSI with respect to movement in the radial direction will follow the free-space path loss model, because the antenna gains are nearly constant. Accordingly, we consider changes in the RSSI associated with movement in the radial direction of the RFID tags to be similar (Figure 2). From the above, we aim to estimate the nearest neighbor of the target RFID tag attached to the lost object by converting the similarity of RSSI to proximity information. In prior work [13], the present authors presented a method for calculating the proximity of the lost object to those around it using a distance function between RSSI series and estimating the group by hierarchical clustering. Figure 3 shows a functional block diagram of the finding phase of the developed support system for finding lost objects. First, a similarity calculation is performed using a distance function. The system uses this to quantify the similarity of RSSIs in the RSSI series. Distance functions define the spatial or temporal difference between two elements in a set. The distances between multiple elements are given in the form of a matrix, called the distance matrix. These functions are major components of data mining techniques such as time-series analysis. Therefore, a distance function is appropriate to our challenge, because its goal is to measure the similarity among time-series data. Second, groups of objects are estimated using hierarchical clustering. After measuring the relative distance between data in the RSSI series, the system forms clusters of RFID tags in a neighborhood. The hierarchical clustering algorithm exports the clustering result as a matrix called the cophenetic matrix. Finally, the results are displayed as a dendrogram, which is a common method of presenting clustering results. Dendrograms display the process of cluster generation and therefore enable the user to intuitively identify objects surrounding the lost object. The details of the different distance functions and clustering algorithms are given in Appendix A.
For instance, in the locus search, the list of objects around the lost object helps users to find the location of the lost object. Information about where the surrounding objects are normally found makes it easy for users to recall the location of the lost object. In addition, in the retrace search and memory search, the lists in time order help users to recall the sequential order of their prior physical locations and prior interactions with the object.
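The pipeline of this section (free-space RSSI, a distance function between RSSI series, and hierarchical clustering that yields a cophenetic matrix) can be sketched as follows. This is a minimal illustration and not the authors' implementation: the wavelength, tag layout, receiver path, and the choice of Euclidean distance with complete linkage are all assumptions made for the example.

```python
import math

WAVELENGTH_M = 0.125  # roughly the 2.4 GHz band used by BLE (assumed value)


def free_space_rssi(tx_dbm, distance_m, gain_tx_db=0.0, gain_rx_db=0.0):
    """RSSI in dBm from transmit power, antenna gains, and free-space path loss."""
    fspl_db = 20 * math.log10(4 * math.pi * distance_m / WAVELENGTH_M)
    return tx_dbm + gain_tx_db + gain_rx_db - fspl_db


def euclidean(a, b):
    """Euclidean distance between two equally long RSSI series."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def distance_matrix(series):
    return [[euclidean(a, b) for b in series] for a in series]


def cophenetic_matrix(dist):
    """Naive complete-linkage clustering; returns the cophenetic matrix."""
    n = len(dist)
    clusters = [[i] for i in range(n)]
    coph = [[0.0] * n for _ in range(n)]
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Complete linkage: cluster distance is the largest pairwise distance.
                height = max(dist[i][j] for i in clusters[a] for j in clusters[b])
                if best is None or height < best[0]:
                    best = (height, a, b)
        height, a, b = best
        # Every pair split across the two merged clusters gets the merge height.
        for i in clusters[a]:
            for j in clusters[b]:
                coph[i][j] = coph[j][i] = height
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return coph


# Hypothetical layout: tags 0 and 1 sit together, tag 2 is across the room.
tags = [(1.0, 1.0), (1.5, 1.0), (8.0, 8.0)]
path = [(x * 0.5, 5.0) for x in range(1, 20)]  # receiver walks along y = 5 m
series = [[free_space_rssi(0.0, math.dist(p, tag)) for p in path] for tag in tags]
coph = cophenetic_matrix(distance_matrix(series))
```

In this toy setting, the two co-located tags produce nearly identical RSSI series, so they merge at a lower height than either does with the distant tag. A library implementation would normally replace the naive clustering loop.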

Evaluation Criterion of Group Estimation Accuracy
Methods to search for a nearest neighbor can be divided into two categories: hierarchical approaches and others. Among the non-hierarchical approaches, approximate nearest neighbor [14] and locality sensitive hashing [15] are well-known methods for quickly searching for a nearest neighbor in a large set of data points in a high-dimensional space. In addition, there is a method that attempts to increase accuracy by combining multiple distance functions [16]. The hierarchical approach uses distance functions and clustering algorithms to search for a nearest neighbor [13]. There are many combinations of distance functions and clustering algorithms. Therefore, a criterion for evaluating clustering results is important for comparing combinations of the elemental technologies of hierarchical clustering. The cophenetic correlation coefficient is a conventional measure of the stability of clustering results. It is defined as the Pearson correlation between the distance matrix and the cophenetic matrix. A value of 1.0 means that the concordance between the distance matrix and the clustering result is perfect. With the cophenetic correlation coefficient as a base, we expect that the Pearson correlation between the matrix of the actual distances between the RFID tags and the cophenetic matrix can quantify how well the clustering result reflects the actual positional relationship of the RFID tags. However, two problems are encountered when using the Pearson correlation. First, it cannot directly evaluate whether the combination is suitable for the system. Our objective is to estimate the nearest RFID tag to the RFID tag attached to the lost object. The cophenetic correlation coefficient provides information only about the linear relationship between the actual distance matrix and the cophenetic matrix, and not directly about the validity of the clustering result. Second, it is difficult to determine a threshold that defines a good result. For instance, it is not easy to determine whether a calculation result of 0.75 is good. Therefore, considerable experimentation is required to define such a threshold. Other methods, such as the Goodman-Kruskal gamma statistic [17] and the Mantel test [18], can also evaluate clustering results. However, they have the same problems as the cophenetic correlation coefficient. From the above, there is no method to evaluate whether a combination of elemental technologies is suitable for a system that uses hierarchical clustering.
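The cophenetic correlation coefficient discussed above can be computed as the Pearson correlation over the off-diagonal entries of the two matrices. The following is a minimal sketch; the helper names and the three-element toy matrices are illustrative assumptions.

```python
import math


def upper_triangle(matrix):
    """Flatten the entries above the main diagonal of a square matrix."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def cophenetic_correlation(dist, coph):
    """Correlation between a distance matrix and a cophenetic matrix."""
    return pearson(upper_triangle(dist), upper_triangle(coph))


# Toy matrices for three elements (illustrative values only).
dist = [[0, 1, 4], [1, 0, 5], [4, 5, 0]]
coph = [[0, 1, 5], [1, 0, 5], [5, 5, 0]]  # complete-linkage result for `dist`
c = cophenetic_correlation(dist, coph)
```

The value `c` is close to but below 1.0 here, which illustrates the second problem: the coefficient alone does not say whether the nearest neighbor of a given element was identified correctly.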

Criterion of RFID Clustering Result.
As the existing methods are not suitable for evaluation in our study, we define the NNNC as a new evaluation criterion. The NNNC indicates the average number of candidates for the nearest RFID tag to each RFID tag. The minimum value of 1.0 means that the clustering algorithm first merged each pair of RFID tags in a nearest neighbor relationship into one cluster, and the result is satisfactory. The NNNC may not reach 1.0 even for the best clustering result. The purpose of the NNNC is to compare the elemental technologies by evaluating a clustering result based on the neighborhood relationship between tags. Hence, the NNNC reflects how well a combination of distance function and clustering algorithm performs in finding lost objects.
Let $t_i$ ($i = 1, 2, \ldots, n$) be $n$ RFID tags. We consider the nearest neighbor matrix $N$ that takes a binary value (0 or 1). If there are $n$ RFID tags, the matrix will have a size of $n \times n$. $N_{ij}$ represents the relationship between the nearest RFID tags by taking a value of 1 when $t_j$ is the nearest RFID tag to $t_i$.
The correct nearest neighbor matrix $N^{c}$, obtained from the actual distance matrix $D$, shows the correct relationship between the nearest RFID tags:

$$N^{c}_{ij} = \begin{cases} 1 & \text{if } D_{ij} = \min_{k \neq i} D_{ik}, \\ 0 & \text{otherwise.} \end{cases} \tag{3}$$

The estimated nearest neighbor matrix $N^{e}$, obtained from the cophenetic matrix $C$, shows the relationship between the nearest RFID tags estimated from clustering:

$$N^{e}_{ij} = \begin{cases} 1 & \text{if } C_{ij} = \min_{k \neq i} C_{ik}, \\ 0 & \text{otherwise.} \end{cases} \tag{4}$$

In hierarchical clustering, an element refers to a subject to be classified. In this work, the subjects are RFID tags. The elements merge into clusters progressively according to an algorithm, eventually forming one large cluster. In the process, there are cases where an element merges into a cluster that comprises a plurality of elements. If the element is the RFID tag attached to the lost object, the number of candidates to be considered for its nearest neighbor RFID tag increases to the number of elements in the cluster. Therefore, we multiply the estimated nearest neighbor matrix by the number of elements. Next, we multiply the estimated nearest neighbor matrix and the correct nearest neighbor matrix:

$$M = N^{e} \times N^{c}. \tag{6}$$

The main diagonal element $M_{ii}$ of $M$ represents the number of candidates for the nearest neighbor RFID tag to $t_i$. When $M_{ii} = 0$, the nearest neighbor RFID tag to $t_i$ cannot be estimated from the clustering result. Therefore, the value of $M_{ii}$ is replaced by the number of RFID tags in the estimation.
Finally, the NNNC is calculated as the average of the diagonal elements:

$$\mathrm{NNNC} = \frac{1}{n} \sum_{i=1}^{n} M_{ii}. \tag{7}$$

To support the recall of the location of a lost object, presenting as many objects as possible near the lost object is important. To achieve this, the NNNC evaluates the clustering result based on the sequence in which clusters merge. Figure 4 shows an example of the calculation of the NNNC for both good and bad clustering results. The figure in the top right corner shows the actual positions of the RFID tags. The figures in the center left and center right show the dendrograms and cophenetic matrices obtained by hierarchical clustering. The clustering result in the center right is a good result that correctly reflects the actual placement of the RFID tags. In contrast, the clustering result in the center left does not reflect the actual placement of the RFID tags. In Figure 4, the nearest neighbor RFID tag to $t_4$ is $t_3$. However, the bad clustering result shows that the candidates for the nearest neighbor RFID tag to $t_4$ are $t_1$, $t_2$, and $t_3$. In addition, it does not show $t_4$ as a nearest neighbor RFID tag to $t_3$. Therefore, the NNNC of the bad clustering result is larger than that of the good clustering result. We confirmed the validity of the NNNC through simulation experiments.
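The calculation can be sketched in a few lines under one possible reading of the definition above. In this sketch, the candidates for a tag are all tags that share the smallest cophenetic distance to it, the candidate count is used when the true nearest tag is among them, and the total number of tags is used otherwise. The function names, the tie handling, the fallback value, and the four-tag example are assumptions made for illustration.

```python
def nearest_candidates(matrix, i):
    """Indices j != i that attain the minimum of row i (ties give several)."""
    others = [j for j in range(len(matrix)) if j != i]
    smallest = min(matrix[i][j] for j in others)
    return {j for j in others if matrix[i][j] == smallest}


def nnnc(actual_dist, coph):
    """Average number of nearest neighbor candidates over all tags."""
    n = len(actual_dist)
    total = 0
    for i in range(n):
        truth = nearest_candidates(actual_dist, i)
        estimate = nearest_candidates(coph, i)
        # If the true nearest tag is among the candidates, the user has to
        # check every candidate; otherwise fall back to the number of tags
        # (assumed fallback value).
        total += len(estimate) if truth & estimate else n
    return total / n


# Hypothetical placement: four tags on a line at 0, 1, 5, and 6 m.
positions = [0.0, 1.0, 5.0, 6.0]
actual = [[abs(a - b) for b in positions] for a in positions]

# Good result: (tag 0, tag 1) and (tag 2, tag 3) merge first.
good = [[0, 1, 6, 6], [1, 0, 6, 6], [6, 6, 0, 1], [6, 6, 1, 0]]
# Bad result: tags 2 and 3 are chained onto the first cluster one by one.
bad = [[0, 1, 2, 3], [1, 0, 2, 3], [2, 2, 0, 3], [3, 3, 3, 0]]

print(nnnc(actual, good), nnnc(actual, bad))
```

With these toy matrices, the good result attains the minimum value and the bad result yields a larger value, mirroring the behavior described for Figure 4.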

Indoor Path Loss Model and RSSI Fluctuation.
An indoor path loss model must consider fluctuation in addition to the attenuation due to free-space path loss. Shadowing, interference, and multipath fading are said to be the main causes of the fluctuation of path loss [19]. First, the shadowing effect has been modeled as a random variable following a zero-mean Gaussian distribution in the lognormal shadowing model [20]. Second, we believe that the effect of the interference is random because we assume that a large number of terminals communicate randomly. Lastly, the fluctuation of the received power due to multipath fading can be modeled as a random value that follows the Nakagami-Rice distribution [21,22]. Based on the above discussion, we simulated an environment where shadowing, interference, and multipath fading exist by considering $X_\sigma$ in the equation

$$\mathrm{PL}(d) = \mathrm{PL}(d_0) + 10 \gamma \log_{10}\!\left(\frac{d}{d_0}\right) + X_\sigma, \tag{8}$$

where $d$ is the transmitter-receiver distance and $\gamma$ is the path loss exponent. The intercept $\mathrm{PL}(d_0)$ is the path loss in dB at the reference distance $d_0$ and is given by the free-space path loss $\mathrm{PL}(d_0) = 20 \log_{10}(4\pi d_0/\lambda)$. $X_\sigma$ (dB) is a zero-mean Gaussian variable with standard deviation $\sigma$ and represents the shadowing, interference, and multipath fading effect. From (1) and (8), the indoor RSSI is calculated as

$$\mathrm{RSSI}(d) = P_t + G_t + G_r - \mathrm{PL}(d), \tag{9}$$

where $P_t$, $G_t$, and $G_r$ are the transmit power and the antenna gains used in (1).
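The indoor model above can be sketched as a small function that draws the Gaussian fluctuation term for each reception. The default values for the path loss exponent, the standard deviation, the reference distance, and the wavelength are placeholder assumptions and are not the parameters of the simulation in this paper.

```python
import math
import random


def indoor_rssi(distance_m, tx_dbm=0.0, gains_db=0.0, exponent=3.0,
                sigma_db=4.0, ref_distance_m=1.0, wavelength_m=0.125, rng=random):
    """RSSI in dBm under log-distance path loss with Gaussian fluctuation."""
    # Free-space path loss at the reference distance (the intercept).
    pl_ref = 20 * math.log10(4 * math.pi * ref_distance_m / wavelength_m)
    # Zero-mean Gaussian term standing in for shadowing, interference, and fading.
    fluctuation = rng.gauss(0.0, sigma_db) if sigma_db > 0 else 0.0
    path_loss = (pl_ref
                 + 10 * exponent * math.log10(distance_m / ref_distance_m)
                 + fluctuation)
    return tx_dbm + gains_db - path_loss


# Noise-free values show the deterministic part of the model.
near = indoor_rssi(1.0, sigma_db=0.0)
far = indoor_rssi(10.0, sigma_db=0.0)

# A seeded generator makes a noisy series reproducible.
rng = random.Random(0)
noisy = [indoor_rssi(5.0, sigma_db=8.0, rng=rng) for _ in range(5)]
```

With the fluctuation switched off, a tenfold increase in distance lowers the RSSI by ten times the path loss exponent in dB, which is a convenient sanity check before adding noise.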

Verification of the Validity of the Evaluation Criterion.
We verified the validity of the evaluation criterion by simulation using MATLAB. We made a virtual room and set three groups of two RFID tags that transmit a radio wave at fixed intervals. The receiver moved straight between two random points at a constant speed, and the RSSI was calculated when the RFID tags transmitted the radio waves. Figure 5 shows the placement of the RFID tags, an example of a movement pattern, and the calculated RSSI. Table 1 shows the parameters of the simulation. We created 10,000 movement patterns at random and checked whether the NNNC evaluates the clustering result as expected. For evaluation, we defined a score that reflects the number of groups within which RFID tags were placed in their expected group. In this simulation, we placed RFID tags in three groups (($t_1$, $t_2$), ($t_3$, $t_4$), and ($t_5$, $t_6$)). The score was increased by 1 for each group whose RFID tags the clustering result classified correctly. The maximum score was 3 and the minimum score was 0. Figure 6 and Table 2 show one result of the simulation. When clustering classified all RFID tags into the correct group, the NNNC was at its minimum. The NNNC increased as the clustering result became less satisfactory. The result shows that the NNNC is an appropriate criterion for evaluating clustering results.

Exhaustive Evaluation of the Distance Function and Clustering Algorithm
In this section, we evaluate the combinations of distance function and clustering algorithm using the proposed evaluation criterion to determine the group estimation accuracy of the method. Of special interest is verifying whether our system is robust to RSSI fluctuations. If the group estimation accuracy is high in environments where the fluctuation of RSSI is very large because of shadowing, interference, and multipath fading, then our system can be used in various environments such as offices, industrial facilities, and storehouses. In the evaluation experiment, the following evaluation parameters are considered: the physical arrangement of the RFID tags, the movement pattern of the receiver, the antenna pattern, and effects on the radio wave propagation path such as shadowing, interference, and multipath fading.

Simulation Result.
We simulated the radio wave propagation path, including the shadowing and Nakagami-Rice fading, using QualNet. The room size was 10 m × 10 m, and 10 parallel RFID tags were placed randomly. The receiver moved in accordance with a random waypoint model; that is, the smartphone started at a random point in the room. Next, a random point was selected as the waypoint, and the receiver moved to this waypoint at a random speed ranging between 0 and the maximum speed. We used 10 RFID tag placements and 100 movement patterns for each RFID tag placement. Then, we simulated the RSSI series by changing the standard deviation of the RSSI fluctuations from 0 to 8 in increments of 2. A standard deviation as large as 8.0 is typically observed in industrial environments [23]. Figure 7 shows RSSI fluctuations for different sigma values. It can be seen that the trend of change in the RSSI is obscured by shadowing and the Nakagami-Rice fading.
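The random waypoint movement described above can be sketched as follows. This is an illustrative stand-in and not the QualNet model: the maximum speed, the sampling interval, the number of samples, and the small lower bound on speed are assumptions chosen for the example.

```python
import math
import random


def random_waypoint(room_m=10.0, max_speed=1.5, steps=100, dt=0.5, rng=None):
    """Return `steps` receiver positions sampled roughly every `dt` seconds."""
    rng = rng or random.Random()
    x, y = rng.uniform(0, room_m), rng.uniform(0, room_m)
    positions = [(x, y)]
    while len(positions) < steps:
        # Pick the next waypoint and a speed for this leg.
        wx, wy = rng.uniform(0, room_m), rng.uniform(0, room_m)
        # The lower bound avoids a leg that never finishes (assumption).
        speed = rng.uniform(0.1, max_speed)
        leg = math.hypot(wx - x, wy - y)
        n_samples = max(1, int(leg / (speed * dt)))
        # Interpolate linearly from the current point to the waypoint.
        for k in range(1, n_samples + 1):
            positions.append((x + (wx - x) * k / n_samples,
                              y + (wy - y) * k / n_samples))
        x, y = wx, wy
    return positions[:steps]


positions = random_waypoint(rng=random.Random(0))
```

Evaluating an RSSI model at each returned position for every tag would then give one RSSI series per tag for a single movement pattern.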
The other simulation parameters are shown in Table 3. After RSSI simulation, hierarchical clustering of the RSSI series was performed for each combination of the distance function and the clustering algorithm, and the NNNC was calculated. Figure 8 shows the change in the NNNC associated with increasing RSSI fluctuations, indicated by the increasing standard deviation of the RSSI fluctuations. Each plotted point is the average of 1000 NNNCs. As can be seen from Figure 8, the combinations of the Euclidean distance with complete-linkage, the unweighted pair group method with arithmetic mean (UPGMA), and Ward's method showed high group estimation accuracies.
Overall trends indicate that the NNNC increases linearly with increasing RSSI fluctuation. However, these three combinations restrained the increase of the NNNC. In particular, the combination of the Euclidean distance and Ward's method resulted in the lowest NNNC when the fluctuations were the largest. To evaluate the clustering result from the point of view of finding lost objects, we focus on the value of the NNNC when the standard deviation of the RSSI fluctuation is 8. The minimum value of the NNNC is approximately 5.5, obtained with the combination of Euclidean distance and Ward's method.
Figure 10 shows the change in the NNNC when the antenna pattern is as shown in Figure 9. The increasing tendency of the NNNC shown in Figure 10 is similar to the tendency shown in Figure 8. In addition, the difference between the NNNC values in Figures 8 and 10 was very small. Based on the above, we believe that the influence of the antenna pattern of the RFID tag is limited.

Application of the NNNC
In this section, we describe how users such as system designers use the NNNC. The NNNC is used for selecting the elemental technologies of hierarchical clustering for a support system for finding lost objects before the system is implemented. The selection procedure is as follows.
(1) The user obtains RSSI series data observed in an environment where the physical distances between tags are known and calculates the correct nearest neighbor matrices.
(2) The user generates clustering results by different elemental technologies.
(3) The user evaluates the clustering results in terms of NNNC.
(4) The user selects the technologies indicating the minimum value of the NNNC.
We describe a scenario in which the NNNC is applied to a system for finding lost objects as an example of an application and discuss how the design process increases the quality of the application. In Figure 10, the combination of cosine distance and single-linkage method shows the largest NNNC, while the combination of Euclidean distance and Ward's method shows the smallest NNNC. If the value of the NNNC is large, the number of nearest neighbor candidates is large, so that it is difficult to find the lost object. Therefore, the combination of Euclidean distance and Ward's method is concluded to be the best method. The results for all combinations, including the best combination of Euclidean distance and Ward's method, show that the NNNC increases to large values as the noise increases. This is explained by examining the RSSI series observation data in Figure 7. That is, the correlation between RSSI series becomes difficult to identify on account of the increased noise. As a result, two groups of nearest neighbors are clustered into the same group even by the best combination. The system designer could conceive the idea of applying a moving average filter to improve the combination. The moving average filter is a common filter for smoothing time-series data while keeping important patterns and removing unimportant patterns such as noise. For example, Figure 7(f) is obtained by applying a 15-point moving average filter to Figure 7(e), and the combination of Euclidean distance and Ward's method with the moving average filter shows the smallest NNNC even under the heavy noise in Figure 10. As shown in this example, the NNNC is used not only for selecting elemental technologies but also for improving technologies to increase the quality of applications for finding lost objects.
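The smoothing step in this scenario can be sketched as a simple moving average over an RSSI series. The source specifies only that the filter has 15 points; the centered window and the handling of the series ends below are assumptions made for illustration.

```python
def moving_average(series, window=15):
    """Centered moving average; the window shrinks near the ends of the series."""
    half = window // 2
    smoothed = []
    for i in range(len(series)):
        lo = max(0, i - half)
        hi = min(len(series), i + half + 1)
        smoothed.append(sum(series[lo:hi]) / (hi - lo))
    return smoothed


# Small check with a 3-point window on illustrative values.
example = moving_average([0.0, 3.0, 6.0], window=3)
```

Applying the filter to every RSSI series before computing the distance matrix leaves the rest of the pipeline unchanged, so the filtered and unfiltered variants can be compared with the same evaluation criterion.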

Conclusion
In this paper, we introduced a method of finding lost objects in indoor environments using RSSI values and proposed a novel evaluation criterion. We assumed that users can determine the location of lost objects using smartphone applications that determine the proximity between active RFID tags. Our system informs users of the position of a lost object by determining which objects are near the lost object using the RSSI series from the estimated group of RFID tags. The distance function, the clustering algorithm, and the effectiveness of their combination are very important to the successful operation of our system. Hence, there is a need to define a criterion that evaluates the combinations for comparison. The NNNC that we have proposed can quantitatively identify the most suitable elemental technologies for systems using hierarchical clustering. By simulation, we confirmed that the NNNC can suggest the most suitable combination for finding lost objects. When we evaluated the suitability of existing popular distance functions and hierarchical clustering algorithms for our system using the NNNC, we found that the combination of the Euclidean distance and Ward's method yielded the highest group estimation accuracy.
The NNNC could be applied to a nearest neighbor search using hierarchical clustering, such as group estimation in a crowd of people in an online-to-offline (O2O) scenario. For example, the movement record and history of visited stores could be considered as feature quantities used in the distance function. It would be possible to use the number of people in the estimated group to execute more effective online marketing.

Figure 2: Change in RSSI associated with the movement of the smartphone or RSSI sensor.

Figure 3: Functional block diagram of the support system for finding lost objects.

Figure 4: Example of the calculation of the NNNC. Panels: placement of the tags in the real world, a bad clustering result, and a good clustering result.

Figure 5: (a) Placement of the RFID tag and an example of a movement pattern. (b) Calculated RSSI.

Figure 8: The NNNC for each standard deviation of the RSSI fluctuations for each combination. The antenna pattern of the RFID tag is omnidirectional.

Figure 9: The antenna pattern of the RFID tags used in the simulation.

Figure 10: The NNNC for each standard deviation of the RSSI fluctuations for each combination. The antenna pattern of the RFID tag is a measured pattern. Plotted combinations: cosine distance with the single-linkage, UPGMA, and complete-linkage methods; correlation distance with the single-linkage, complete-linkage, and UPGMA methods; Euclidean distance with the single-linkage, complete-linkage, UPGMA, and Ward's methods; and Euclidean distance with Ward's method after a 15-point moving average filter.

Table 1: Parameters of the simulation used to verify the validity of the evaluation criterion.

Table 2: NNNCs of the result of the cosine distance and complete-linkage method.

Table 3: Parameters of the simulation used for exhaustive evaluation of the distance function and clustering algorithm.