Solving Sensor Ontology Metamatching Problem with Compact Flower Pollination Algorithm

To implement cooperation among applications on the Internet of Things (IoT), we need to describe the meaning of diverse sensor data with the sensor ontology. However, there exists a heterogeneity issue among different sensor ontologies, which hampers their communication. Sensor ontology matching is a feasible solution to this problem, which is able to map the identical ontology entity pairs. This work investigates the sensor ontology meta-matching problem, which indirectly optimizes the sensor ontology alignment's quality by tuning the weights used to aggregate different ontology matchers. Due to the large number of entities and their complex semantic relationships, swarm intelligence (SI) based techniques are emerging as a popular approach to optimizing the sensor ontology alignment. Inspired by the success of the flower pollination algorithm (FPA) in the IoT domain, this work further proposes a compact FPA (CFPA), which introduces a compact encoding mechanism to improve the algorithm's efficiency. On this basis, compact exploration and exploitation operators are proposed, and an adaptive switching probability is presented to trade off these two search strategies. The experiment uses the ontology alignment evaluation initiative (OAEI)'s benchmark and real sensor ontologies to test CFPA's performance. The statistical comparisons show that CFPA significantly outperforms other state-of-the-art sensor ontology matching techniques.


Introduction
To implement cooperation among applications on the Internet of Things (IoT) [1], we need to describe the meaning of diverse sensor data with the sensor ontology and express it in a machine-interpretable way. As the kernel technique of the semantic web [2], sensor ontologies, such as the CSIRO sensor ontology (CSIRO) [3], semantic sensor network ontology (SSN) [4], and MMI device ontology (MMI) [5], have been widely used in the IoT domain. Although they share a lot of overlapping information, a heterogeneity issue also exists among them. For example, a sensor concept might be defined with different terminologies or contexts, which hampers their communication. Sensor ontology matching is a feasible solution to this problem, which is able to map the identical ontology entity pairs [6].
To find the semantically identical sensor concepts, it is necessary to use an ontology matcher to measure two concepts' similarity value. However, due to the limitations of the natural language processing domain, no single ontology matcher is able to ensure its effectiveness in various heterogeneous contexts. The problem of how to comprehensively combine different ontology matchers so that their advantages and disadvantages complement each other, and thereby enhance the final ontology alignment's quality, i.e., the sensor ontology meta-matching problem, has attracted many researchers' attention. Fernandez et al. [7] propose a fuzzy theory-based method of aggregating the ontology matchers. Later on, the global and local ontology alignment extracting technique [8] is presented to find the alignment from diverse alignments obtained by different ontology matchers. Their proposal is able to take each correspondence's preference into consideration, which improves the alignment's quality. The generative adversarial network (GAN) [9] is used to iteratively combine different matchers. The Siamese neural network (SNN) [10] is also used to train problem-specific ontology matchers on the basis of the existing ones. The semisupervised learning-based method [11] first requires the expert to provide a partial alignment and then uses it to train a Bayesian probability model and determine the rest of the alignment. The multiobjective evolutionary algorithm (MOEA) [12], coevolutionary algorithm (CEA) [13], evolutionary tabu search algorithm (ETSA) [14], co-firefly algorithm (CFA) [15], and differential evolution algorithm (DEA) [16] are also proposed to optimize the sensor ontology alignment. Particle swarm optimization (PSO) [17] is also used to determine a high-quality sensor ontology alignment, where a simulated annealing (SA) strategy is introduced to improve the algorithm's performance by trading off exploitation and exploration.
Inspired by the success of swarm intelligence (SI) in the ontology matching domain, this work investigates a newly emerging SI algorithm, i.e., the flower pollination algorithm (FPA) [18], which has been successfully applied in wireless sensor networks (WSN) [19] to address complex optimization problems. To overcome the population-based FPA's disadvantages, such as slow convergence speed [20], this work proposes a compact FPA (CFPA) and uses it to address the sensor ontology meta-matching problem. In particular, CFPA uses a probability vector (PV) [21] to represent the whole population, and on this basis, it simulates the original FPA's search process. Since CFPA does not need to tune any parameters and significantly simplifies the population-based FPA's evolving operations, it helps to improve FPA's searching efficiency. To be specific, the contributions made in this work are as follows: (1) we present the mathematical formulation of the sensor ontology meta-matching problem; (2) we propose a problem-specific CFPA to efficiently address the problem, which uses the compact exploitation operator and compact exploration operator to mimic FPA's evolving process and an adaptive switching probability to trade off CFPA's exploitation and exploration; and (3) we apply CFPA to the ontology alignment evaluation initiative (OAEI)'s benchmark and the task of matching sensor ontologies. The results reveal that CFPA is able to efficiently solve the sensor ontology meta-matching problem. The rest of the paper is organized as follows: the sensor ontology meta-matching problem is defined in Section 2; CFPA is presented in detail in Section 3; the statistical experimental results are shown in Section 4; and finally, Section 5 draws the conclusions.

Sensor Ontology Meta-Matching Problem
The ontology matcher measures two sensor concepts' similarity with a real number in [0, 1]. The higher the similarity value, the more likely it is that the two concepts are identical. In general, there are three kinds of ontology matchers, which are based on strings, linguistics, and ontology structure [22]. An ontology matcher calculates the similarity value by taking into consideration only one or two linguistic features, and thus none of them is able to ensure the result's confidence when facing different heterogeneous contexts. Usually, it is necessary to aggregate their results, which helps to enhance the final value's confidence. For the convenience of this work, a sensor ontology $O$ is defined as a 3-tuple $(C, P, R)$, where $C$, $P$, and $R$ are, respectively, the sensor concept set, the concepts' property set, and the concepts' relationship set [23]. To overcome two sensor ontologies' heterogeneity issue, we need to find their entity mappings, and each correspondence is defined as a 4-tuple $(e_1, e_2, simValue, rel)$, where $e_1$ and $e_2$ are the two ontologies' entities, $simValue$ is their similarity value, and $rel$ is the two entities' semantic relationship [24]. In this work, we aim at finding the identical sensor concepts from two ontologies, and thus a correspondence's $rel$ is equivalence. Given two ontologies $O_1$ and $O_2$, an ontology matcher is executed to determine their corresponding alignment, which is a set of entity correspondences [25]. In this work, the alignment is denoted by a matrix with real numbers in [0, 1] as its elements, whose rows and columns correspond to the two entity sets and whose elements are the corresponding entities' similarity values.
To combine these matchers, we assign weights to their corresponding similarity matrices and then aggregate these matrices into the final one. The sensor ontology meta-matching problem investigates how to find an optimal weight set that determines a high-quality alignment [26]. Here, we model the sensor ontology meta-matching problem as a single-objective optimization problem, which takes maximizing the alignment's quality as the objective. Given a sensor alignment, the more correspondences it has and the higher the mean similarity value of all the correspondences is, the better its quality. Based on this, we use two quality metrics on an alignment $A$, which are computed from $|O_1|$, $|O_2|$, $|A|$, and $sim_i$, where $|O_1|$ and $|O_2|$ are the numbers of the two ontologies' entities, $|A|$ is the number of correspondences in the alignment, and $sim_i$ is the $i$-th correspondence's similarity value. After that, we calculate the two metrics' harmonic mean to comprehensively measure the alignment's quality. On this basis, the mathematical model of the problem is to maximize $F(W)$ over the weight set $W$, where $w_i$ is the $i$-th weight of the ontology matcher's corresponding matching matrix, and $F(W)$ first uses $W$ to aggregate all the matching matrices and then uses the function $f()$ to calculate the quality of the final matrix's corresponding alignment.
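As a minimal sketch of the aggregation step described above, the following Python snippet combines matcher similarity matrices with a weight vector and scores the resulting alignment. Only the weighted sum and the harmonic mean follow directly from the text. The extraction rule (a fixed threshold), the two metric definitions (`coverage` and `mean_sim`), the function names, and the toy matrices are all illustrative assumptions and should not be read as the formulas used in this work.

```python
from typing import List, Tuple

Matrix = List[List[float]]
Alignment = List[Tuple[int, int, float]]


def aggregate(matrices: List[Matrix], weights: List[float]) -> Matrix:
    """Weighted sum of similarity matrices; the weights are assumed to sum to 1."""
    rows, cols = len(matrices[0]), len(matrices[0][0])
    return [[sum(w * m[r][c] for w, m in zip(weights, matrices))
             for c in range(cols)] for r in range(rows)]


def extract_alignment(sim: Matrix, threshold: float = 0.5) -> Alignment:
    """Assumed extraction rule: keep every entity pair whose similarity reaches the threshold."""
    return [(r, c, v) for r, row in enumerate(sim)
            for c, v in enumerate(row) if v >= threshold]


def quality(alignment: Alignment, n1: int, n2: int) -> float:
    """Harmonic mean of two assumed metrics: coverage and mean similarity."""
    if not alignment:
        return 0.0
    # Assumption: coverage relates |A| to the smaller ontology, capped at 1.
    coverage = min(1.0, len(alignment) / min(n1, n2))
    mean_sim = sum(v for _, _, v in alignment) / len(alignment)
    return 2 * coverage * mean_sim / (coverage + mean_sim)


# Toy example with two hypothetical matchers on two 2-entity ontologies.
string_sim = [[0.9, 0.1], [0.2, 0.8]]
struct_sim = [[0.7, 0.3], [0.0, 0.6]]
combined = aggregate([string_sim, struct_sim], [0.5, 0.5])
alignment = extract_alignment(combined)
score = quality(alignment, n1=2, n2=2)
```

In this sketch the objective $F(W)$ would correspond to calling `aggregate`, `extract_alignment`, and `quality` in sequence for a candidate weight vector.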

Compact Flower Pollination Algorithm
FPA is inspired by the pollination of natural flowers, and its evolving process consists of two distinct operators, i.e., global pollination and local pollination. In their update formulas, $t$ is the current generation, $x_i^t$ is the $i$-th pollen in the $t$-th generation, $x_p^t$ and $x_q^t$ are two neighbor pollens, $x^*$ is the best pollen found, and $L$ is the step length drawn from the Lévy distribution [27]. FPA's exploration and exploitation are controlled by a switching probability $sp \in [0, 1]$. In each generation, for each pollen, FPA generates a random number in [0, 1] and compares it with $sp$ to decide which operator to apply, and after that, FPA tries to update the best pollen. Classic FPA suffers from slow convergence, and to overcome this drawback, this work proposes CFPA, whose main components, i.e., the encoding mechanism and the exploration and exploitation operators, are presented in the following sections.
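For reference, the two pollination rules are commonly written in the FPA literature in the form below, expressed here with the symbols defined above. This is a sketch of the standard form only: the uniform random factor $\epsilon$ and the sign convention of the global step are assumptions, since presentations of FPA differ on both.

```latex
% Global pollination (exploration): Levy-flight step relative to the best pollen
x_i^{t+1} = x_i^{t} + L \left( x^{*} - x_i^{t} \right)

% Local pollination (exploitation): perturbation along two neighbor pollens
x_i^{t+1} = x_i^{t} + \epsilon \left( x_p^{t} - x_q^{t} \right),
\qquad \epsilon \sim U(0, 1)
```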

Encoding Mechanism.
CFPA uses the Gray code (GC) [28], a popular binary encoding mechanism, to encode pollen. To be specific, we use GC to encode the integers in [0, 100], and when decoding, we normalize all the integers to obtain the corresponding weights. For example, given four ontology matchers, we need to encode four integers in a pollen; assuming these are 20, 20, 40, and 80, the aggregating weights for the matching matrices are 0.125, 0.125, 0.25, and 0.5, respectively. In this work, we utilize one PV to describe a population, whose dimension is equal to the length of the pollen, and each of its elements is the probability of being 1 on the corresponding bit of the pollen. In the beginning, all of PV's elements are initialized as 0.5, and PV is updated at the end of each generation according to the best pollen found.
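The encoding and decoding steps can be sketched as follows. The snippet uses the standard binary-reflected Gray code. The choice of 7 bits per integer, the fallback to equal weights when all integers decode to 0, and the function names are assumptions made for illustration.

```python
from typing import List

BITS_PER_WEIGHT = 7  # assumption: 7 bits are enough to cover the integers 0..100


def gray_encode(n: int) -> int:
    """Binary-reflected Gray code of a non-negative integer."""
    return n ^ (n >> 1)


def gray_decode(g: int) -> int:
    """Inverse of gray_encode: XOR together all right shifts of g."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n


def encode_pollen(values: List[int]) -> str:
    """Concatenate the fixed-width Gray codes of the integers into one bit string."""
    return "".join(format(gray_encode(v), f"0{BITS_PER_WEIGHT}b") for v in values)


def decode_weights(pollen: str) -> List[float]:
    """Decode each 7-bit chunk and normalize the integers so the weights sum to 1."""
    chunks = [pollen[i:i + BITS_PER_WEIGHT]
              for i in range(0, len(pollen), BITS_PER_WEIGHT)]
    ints = [gray_decode(int(chunk, 2)) for chunk in chunks]
    total = sum(ints)
    if total == 0:  # assumption: use equal weights if every integer is 0
        return [1.0 / len(ints)] * len(ints)
    return [v / total for v in ints]


# The example from the text: four matchers with integers 20, 20, 40, and 80.
pollen = encode_pollen([20, 20, 40, 80])
weights = decode_weights(pollen)
```

With these assumptions, a pollen for four matchers is a 28-bit string, and decoding the example yields the weights 0.125, 0.125, 0.25, and 0.5 given in the text.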
Given a PV $(0.2, 0.4, 0.6, 0.8)^T$, we generate four random numbers in [0, 1], e.g., 0.1, 0.3, 0.5, and 0.9. Since 0.1 < 0.2, the first bit of the new pollen is 1; similarly, since 0.9 > 0.8, the last bit of the newly generated pollen is 0. When updating PV, if a bit of the elite pollen is 1 (or 0), the corresponding PV element is increased (decreased), which makes the pollen generated thereafter closer to the elite pollen. When all the probabilities are close to 1 or 0, CFPA converges.
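A minimal sketch of generating a pollen from PV and of updating PV towards the elite pollen is given below. Since each PV element is the probability of the bit being 1, a bit is set to 1 when its random draw falls below that element. The update step size, the clamping to [0, 1], and the function names are assumptions.

```python
import random
from typing import List, Optional


def generate_pollen(pv: List[float], draws: Optional[List[float]] = None) -> List[int]:
    """Sample one bit per PV element: 1 if the random draw is below the probability."""
    if draws is None:
        draws = [random.random() for _ in pv]
    return [1 if d < p else 0 for d, p in zip(draws, pv)]


def update_pv(pv: List[float], elite: List[int], step: float) -> List[float]:
    """Move each probability towards the elite pollen's bit, clamped to [0, 1]."""
    updated = []
    for p, bit in zip(pv, elite):
        p = p + step if bit == 1 else p - step
        updated.append(min(1.0, max(0.0, p)))
    return updated


# The example from the text, with the random draws fixed for reproducibility.
example = generate_pollen([0.2, 0.4, 0.6, 0.8], draws=[0.1, 0.3, 0.5, 0.9])
shifted = update_pv([0.5, 0.5, 0.5, 0.5], example, step=0.1)
```

Here `example` is the pollen 1110: the first three draws fall below their probabilities and the last one does not.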

Exploration and Exploitation Operators.
The exploitation operator aims at searching a particular pollen's neighborhood, while the exploration operator tries to search unexplored positions. The pseudocode of the exploration and exploitation operators is shown in Algorithm 1 and Algorithm 2, respectively.
Here, we introduce the exponential crossover operator (EC) [29] to implement CFPA's exploration and exploitation operators. Given two pollens, EC randomly copies a certain number of sequential bits' values from the first one to the second one. Essentially, the new pollen is generated by a perturbation of its parents, which is very exploitative. With respect to the exploration operator, we use EC to mix a newly generated pollen $pollen_{new}$ and the elite pollen $pollen_{elite}$, while in the exploitation operator, we first mix two newly generated pollens $pollen_p$ and $pollen_q$ to obtain an intermediate pollen and then mix it with $pollen_{new}$. Essentially, the exploration operator generates new pollen by moving it towards the global optimum, and the exploitation operator moves the newly generated pollen in the direction determined by its neighbor pollens.
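The two operators can be sketched on bit lists as follows. The exponential crossover here follows its usual form from differential evolution: start at a random position and keep copying consecutive bits (wrapping around) while a random draw stays below a crossover rate. The rate `cr`, the copy directions (from the elite or intermediate pollen into the new pollen), and the function names are assumptions, not details taken from Algorithms 1 and 2.

```python
import random
from typing import List, Optional


def exponential_crossover(donor: List[int], receiver: List[int],
                          cr: float = 0.6,
                          rng: Optional[random.Random] = None) -> List[int]:
    """Copy a circular run of consecutive bits from donor into a copy of receiver."""
    rng = rng or random.Random()
    n = len(receiver)
    child = list(receiver)
    pos = rng.randrange(n)          # random start of the copied run
    for _ in range(n):              # copy at least one and at most n bits
        child[pos] = donor[pos]
        pos = (pos + 1) % n
        if rng.random() >= cr:      # stop extending the run
            break
    return child


def explore(pollen_new: List[int], pollen_elite: List[int],
            rng: Optional[random.Random] = None) -> List[int]:
    """Assumed direction: pull the new pollen towards the elite pollen."""
    return exponential_crossover(pollen_elite, pollen_new, rng=rng)


def exploit(pollen_new: List[int], pollen_p: List[int], pollen_q: List[int],
            rng: Optional[random.Random] = None) -> List[int]:
    """Mix two neighbor pollens first, then mix the result into the new pollen."""
    intermediate = exponential_crossover(pollen_p, pollen_q, rng=rng)
    return exponential_crossover(intermediate, pollen_new, rng=rng)


child = exponential_crossover([1] * 8, [0] * 8, cr=0.6, rng=random.Random(0))
```

Because the copied bits are always one contiguous (circular) run, the child differs from the receiver only inside that run.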

Pseudocode of Compact Flower Pollination Algorithm.
The pseudocode of CFPA is presented in Algorithm 3. CFPA first initializes all the elements of PV as 0.5 and then uses them to initialize the elite pollen $pollen_{elite}$. In each generation, CFPA adaptively updates the switching probability $sp$ and then uses it to decide whether to execute exploration or exploitation. Here, $sp$ is the probability of executing the exploration operator. In the early phase, the algorithm mainly focuses on exploration, i.e., $sp$ is large, while in the late phase, CFPA puts the emphasis on exploitation, i.e., $sp$ is small. At the end of each iteration, CFPA tries to update $pollen_{elite}$ and PV. Finally, when reaching the maximum iteration number $maxT = 3000$, the algorithm terminates and returns $pollen_{elite}$. Here, we update PV with a step that is determined by the pollen's length, and how to adaptively set the optimal step length for updating PV is one of our future research directions.
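The overall loop can be sketched in Python as below, using the schedule $sp = e^{-t/maxT}$ from Algorithm 3. This is an illustration under stated assumptions rather than the implementation evaluated in this work: the PV step of `1 / length`, the crossover rate, the copy directions inside the operators, and the toy fitness function (counting 1 bits) are all hypothetical choices.

```python
import math
import random
from typing import Callable, List


def switching_probability(t: int, max_t: int) -> float:
    """sp = exp(-t / max_t): 1 at the start, about 0.37 at t = max_t."""
    return math.exp(-t / max_t)


def sample(pv: List[float], rng: random.Random) -> List[int]:
    """Generate one pollen from the probability vector."""
    return [1 if rng.random() < p else 0 for p in pv]


def crossover(donor: List[int], receiver: List[int],
              rng: random.Random, cr: float = 0.6) -> List[int]:
    """Exponential crossover: copy a circular run of bits from donor into receiver."""
    n, child, pos = len(receiver), list(receiver), rng.randrange(len(receiver))
    for _ in range(n):
        child[pos] = donor[pos]
        pos = (pos + 1) % n
        if rng.random() >= cr:
            break
    return child


def cfpa(fitness: Callable[[List[int]], float], length: int,
         max_t: int = 3000, seed: int = 0) -> List[int]:
    rng = random.Random(seed)
    pv = [0.5] * length
    step = 1.0 / length  # assumption: a step derived from the pollen length
    elite = sample(pv, rng)
    for t in range(max_t):
        sp = switching_probability(t, max_t)
        new = sample(pv, rng)
        if rng.random() < sp:   # exploration: move towards the elite pollen
            new = crossover(elite, new, rng)
        else:                   # exploitation: move along two neighbor pollens
            intermediate = crossover(sample(pv, rng), sample(pv, rng), rng)
            new = crossover(intermediate, new, rng)
        if fitness(new) > fitness(elite):  # the winner becomes the elite pollen
            elite = new
        # Shift every PV element towards the corresponding elite bit.
        pv = [min(1.0, max(0.0, (p + step) if bit else (p - step)))
              for p, bit in zip(pv, elite)]
    return elite


best = cfpa(fitness=sum, length=16, max_t=200, seed=1)
```

In the meta-matching setting, `fitness` would decode the pollen into weights, aggregate the matching matrices, and return the alignment quality.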

Experimental Setup.
The benchmark of the ontology alignment evaluation initiative (OAEI) [30] and the task of matching three real sensor ontologies are used to test CFPA's performance. In Tables 1-4, we compare CFPA with five state-of-the-art sensor ontology matching techniques, i.e., compact coevolutionary algorithm (CCEA) [13], compact evolutionary tabu search algorithm (CETSA) [14], compact co-firefly algorithm (CCFA) [15], compact differential evolution algorithm (CDEA) [16], and simulated annealing particle swarm optimization (SAPSO) [17], on all testing cases in terms of recall, precision, and f-measure. The three kinds of ontology matchers used in this work are the N-gram distance [31] (string-based matcher), WordNet-based distance [32] (linguistic-based matcher), and profile-based distance [33] (structure-based matcher), and the configurations of the SI-based competitors follow their original literature. The results shown in the tables are the averages of thirty independent runs.
To fairly compare with other matching techniques, we use recall, precision, and f-measure [34] to evaluate the obtained alignments. In Table 5, we briefly describe the testing cases used in the experiment, and in Tables 1-4, the testing cases 1XX, 2XX, and 3XX are the ones starting with the numbers 1, 2, and 3, respectively.
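For readers unfamiliar with these measures, the sketch below computes recall, precision, and f-measure of an obtained alignment against a reference alignment, both represented as sets of entity pairs. The entity names are made up for illustration and are not taken from the testing cases.

```python
from typing import Set, Tuple

Correspondence = Tuple[str, str]


def evaluate(found: Set[Correspondence],
             reference: Set[Correspondence]) -> Tuple[float, float, float]:
    """Return (recall, precision, f-measure) of `found` against `reference`."""
    correct = len(found & reference)
    precision = correct / len(found) if found else 0.0
    recall = correct / len(reference) if reference else 0.0
    if precision + recall == 0:
        return recall, precision, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return recall, precision, f_measure


# Hypothetical alignments: two of the three found correspondences are correct.
reference = {("o1:Sensor", "o2:Device"),
             ("o1:Observation", "o2:Measurement"),
             ("o1:Platform", "o2:Platform")}
found = {("o1:Sensor", "o2:Device"),
         ("o1:Platform", "o2:Platform"),
         ("o1:Output", "o2:Unit")}
recall, precision, f_measure = evaluate(found, reference)
```

In this toy case two of three found correspondences are correct and two of three reference correspondences are found, so all three measures equal 2/3.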

Statistical Experiment.
We utilize the statistical t-test [35] to compare the competitors' performance in terms of recall, precision, and f-measure. Tables 1 and 2 show the six SI-based sensor ontology matching techniques' mean recall, precision, and f-measure and the corresponding standard deviations on all the testing cases, and Tables 3 and 4, respectively, present the t values and p values on recall, precision, and f-measure.
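As an illustration of how a t value can be obtained from two sets of independent runs, the sketch below computes Welch's t statistic and its approximate degrees of freedom using only the standard library. Which variant of the t-test is used in this work is not stated here, so the choice of Welch's test is an assumption, and the sample values are made-up numbers rather than results from the tables.

```python
import math
import statistics
from typing import Sequence, Tuple


def welch_t(a: Sequence[float], b: Sequence[float]) -> Tuple[float, float]:
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    mean_a, mean_b = statistics.mean(a), statistics.mean(b)
    # Squared standard errors of the two sample means (sample variance / n).
    se_a = statistics.variance(a) / len(a)
    se_b = statistics.variance(b) / len(b)
    t = (mean_a - mean_b) / math.sqrt(se_a + se_b)
    df = (se_a + se_b) ** 2 / (se_a ** 2 / (len(a) - 1) + se_b ** 2 / (len(b) - 1))
    return t, df


# Made-up f-measure samples for two hypothetical matchers (five runs each).
runs_a = [0.90, 0.92, 0.91, 0.93, 0.89]
runs_b = [0.85, 0.86, 0.84, 0.87, 0.83]
t_value, dof = welch_t(runs_a, runs_b)
```

The resulting t value would then be compared with the critical value of the t distribution for the computed degrees of freedom at the chosen significance level.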

Wireless Communications and Mobile Computing
As can be seen from Tables 1 and 2, CFPA's results are much better than those of other SI-based sensor ontology matching techniques.
Thanks to the adaptive switching probability, CFPA is able to better trade off the algorithm's exploitation and exploration, which ensures not only the solution's quality but also the algorithm's stability. From Tables 3 and 4, except for those testing cases where the results of CFPA and the other competitors are the same, our approach outperforms the other SI-based sensor ontology matching techniques at the 5% significance level. Since CFPA does not need to tune any parameters, it is more stable than other SIs. In addition, CFPA's adaptive switching mechanism and two compact evolutionary operators are able to significantly improve the algorithm's performance, which makes it search efficiently for better solutions.

Algorithm 3: Compact flower pollination algorithm.
** Initialization **
generation t = 0;
set all elements in PV as 0.5;
pollen_elite = generatePollen(PV);
** Iteration **
while t < maxT do
    sp = e^(-t/maxT);
    *** UpdatePollen ***
    if random(0, 1) < sp
        pollen_new = exploration;
    else
        pollen_new = exploitation;
    end if
    [winner, loser] = compete(pollen_new, pollen_elite);
    update pollen_elite and PV according to winner and loser;
    t = t + 1;
end while
return pollen_elite;

Conclusion
To support communication among IoT applications, it is necessary to describe the sensor data at a semantic level.
Recently, sensor ontology has become a popular knowledge modeling technique in the IoT, which is able to provide semantic meanings for diverse sensor data. However, there exists a heterogeneity issue between different sensor ontologies, which hampers IoT applications' cooperation. Sensor ontology matching is a feasible solution to this problem, which aims to find identical sensor concepts at the semantic level. This work investigates the sensor ontology meta-matching problem, which aims to indirectly optimize the sensor ontology alignment's quality by tuning the weights used to aggregate different ontology matchers. Inspired by the success of FPA in the IoT domain, we further propose a CFPA to efficiently address the sensor ontology meta-matching problem. In particular, we introduce the compact encoding mechanism to improve the algorithm's searching efficiency and the adaptive switching probability to trade off the algorithm's exploitation and exploration. The experiment compares CFPA with five state-of-the-art SI-based sensor ontology matching techniques, and the experimental results show that CFPA outperforms them.
In the future, we will further improve CFPA to match large-scale sensor ontologies, especially at the instance level. We are also interested in further improving CFPA to match specific ontologies in the biomedical and geographical domains. When dealing with large-scale matching tasks, efficiency-improving strategies should be introduced, such as ontology partitioning and correspondence pruning. In addition, the problem of how to choose a suitable background knowledge base to distinguish complex entity correspondences needs to be addressed.

Data Availability
The data used to support this study can be found at http://oaei.ontologymatching.org.
The symbols $f_t$, $r_t$, and $p_t$, respectively, stand for the t-values on f-measure, recall, and precision.