Similarity-based clustering strategy for mobile ad hoc multimedia databases

Multimedia data are becoming popular in wireless ad hoc environments. However, traditional content-based retrieval techniques are inefficient in ad hoc networks because of limitations in node mobility, computation capability, memory space, and network bandwidth, and because of data heterogeneity. To provide an efficient platform for multimedia retrieval, we propose to cluster ad hoc multimedia databases based on their semantic contents, and to construct a virtual hierarchical indexing infrastructure overlaid on the mobile databases. This content-aware clustering scheme uses a semantic-aware framework as the theoretical foundation for data organization. Several novel techniques are presented to facilitate the representation and manipulation of multimedia data in ad hoc networks: 1) using concise distribution expressions to represent the semantic similarity of multimedia data, 2) constructing clusters based on the semantic relationships between multimedia entities, 3) reducing the cost of content-based multimedia retrieval through the restriction of semantic distances, and 4) employing a self-adaptive mechanism that dynamically adjusts to the content and topology changes of the ad hoc network. The proposed scheme is scalable, fault-tolerant, and efficient in performing content-based multimedia retrieval, as demonstrated by a combination of theoretical analysis and extensive experimental studies.


Introduction
In recent years, mobile ad hoc networks have become increasingly popular for building temporary network connections in special areas, such as battlefields or disaster sites, where infrastructure is destroyed or too expensive to build. An ad hoc network is a collection of cooperative mobile nodes that communicate with each other without the intervention of access points. These mobile nodes are capable not only of storing and processing data, but also of performing complex operations through their communications, such as on-demand routing [4,5] or multimedia data streaming [2]. Within the scope of ad hoc networks, most previous research focuses on routing protocols that adapt to the dynamic network topology [6]; however, information retrieval is becoming a prominent issue in a variety of recent applications [3,37]. From the viewpoint of information retrieval, the organization of ad hoc networks can be classified into three categories [32].
Some earlier ad hoc network prototypes are linked to centralized data source nodes (data centers) that host constantly updated directories of data contents [4]. Queries issued from the client nodes are resolved at the data centers, and the results are forwarded back to the requesting nodes through unicasting. Such a centralized organization does not scale well and has a single point of failure. Moreover, the data center behaves as a hotspot, and its movement within the area could increase the network traffic.
The recently proposed ad hoc network frameworks are decentralized and have no data centers: the network is unstructured, and mobile nodes form peer-to-peer connections to resolve queries through cooperation with each other [7]. Flooding is the most common approach for information retrieval in such ad hoc networks, since the requesting node has no information about the data contents of other nodes and has to employ a blind search. The flooding approach drastically consumes system resources, i.e., storage, bandwidth, and energy, and hence offers good performance only when dealing with text information [21]. Considering the sheer size of multimedia data, the performance deterioration is even more drastic. In addition, the flooding strategy may cause duplicate queries and retrieval results, which may further increase the cost of query processing [30].
To overcome the shortcomings of the blind search approach, the literature has proposed structured ad hoc networks in which the network topology does not change drastically [22,23]. In such networks, the data objects are not allocated to the network nodes randomly, but at specified locations that make subsequent queries easier to satisfy [3]. Such a design improves the efficiency of information retrieval in some cases at the expense of the flexibility and scalability of the ad hoc network. In practical applications, the network topology and the data contents of the mobile nodes are constantly changing, which increases the difficulty of efficient multimedia data retrieval.
Motivated by the aforementioned challenges, we introduce a semantic-based, self-stabilizing scheme that uses an overlaid infrastructure to organize multimedia data sources in an ad hoc network. The fundamental idea is to organize multimedia data based on a concise and abstract description of their semantic contents, and to cluster data sources with similar data contents. With the help of the proposed scheme, multimedia data retrieval in ad hoc networks becomes a directed search process that offers reduced network traffic, energy consumption, and response time, regardless of the distribution, heterogeneity, and autonomy of the multimedia data sources.
The remaining part of this paper is organized into six sections. Section 2 introduces the background information and related work. Section 3 outlines the preliminary concepts of the system model. Section 4 introduces the clustering of mobile nodes according to their data contents and presents the dynamic adjustment of the clusters in accordance with content and topology changes in an ad hoc network. Section 5 analyzes the performance of the proposed scheme. Section 6 evaluates the proposed scheme with experimental results. Finally, Section 7 concludes the paper.

Multimedia content extraction
Due to the sheer size of multimedia data, considerable research work has been done on extracting semantic contents that concisely describe multimedia objects [8][9][10]. In this work, the semantic contents can be obtained through the recognition of objects [34], which is performed in two phases:
- The multimedia object is first partitioned into several segments, which indicate the most significant visual/audio components [36], and
- The semantics of the segments are obtained through Latent Semantic Analysis (LSA) [8,10,11].
The segmentation method employed in this work is similar to the binary-partition-tree approach [9]. Similar pixels are merged together as homogeneous regions, and these small regions are recursively combined into segments of the multimedia data object.
The low-level feature representation obtained from segmentation cannot represent the semantic contents, and therefore does not provide an ideal basis for semantic-based retrieval. Hence, the LSA is employed to obtain the semantics of these segments. In this work, the data set of the LSA includes two sets of entities: the objects whose semantics are already known (training samples), and multimedia segments whose semantics remain unknown [8]. The LSA uses Singular Value Decomposition (SVD) to uncover the hidden semantic relationships (e.g. synonymy and polysemy) between data objects [11]: the multimedia segments are classified into proper semantic categories after the singular value decomposition, and are assigned proper semantics [13]. In addition, the latent semantic information obtained in the analysis process can be trained to reflect the personalized concepts of users [11], improving the accuracy of content representation.

Content-based retrieval
Based on the method by which multimedia data contents are extracted, the literature has recognized several content-based retrieval techniques [14][15][16][35,37]. The traditional centralized content-based retrieval approaches are based on the feature representation of multimedia data and can be categorized into three classes:
- The partition-based retrieval approaches recursively divide the multimedia objects (or k-dimensional feature spaces) into disjoint partitions, with clustering or classifying algorithms, while generating a hierarchical indexing structure on these partitions. The earlier models in this class include Quadtree [14], K-D-tree [15], and VP-tree [15]. Recent research has focused on models based on clusters [16,17]. Because clustering methods usually demand substantial system resources, several schemes were proposed to reduce the dimension of the feature vector space, thereby alleviating the system load [17]. The recently proposed dimension reduction methods include locality preserving projection (LPP) [38] and heavy frequency vector (HFV) [17]. The partition-based approaches normally employ very complex computations on features [15], which make these models inefficient for real-time multimedia applications. Moreover, the partition-based models do not consider feature spaces of different dimensions, i.e., multimedia objects from heterogeneous data sources [2].
- The region-based retrieval approaches employ small regions (either in the form of Minimum Bounding Rectangles or Spheres) to cover all the multimedia objects. Based on these bounding regions, a balanced tree is constructed as the indexing structure. This class of indexing models includes the R-tree and its variations (e.g. R*-tree, R+-tree, and SR-tree) [18,35]. Relative to the partition-based approach, the region-based methods improve storage utilization by avoiding forced splits. However, due to the shapes of the bounding regions, the region-based approaches have the inherent weakness of overlapping [18], which deteriorates the search performance. Furthermore, the data objects grouped in the same region may not share common semantic contents.
- The structure-based retrieval approaches use fully structured or semi-structured representation models to imply the semantic contents and to facilitate content-based multimedia retrieval. The literature reports research work on 3-dimensional modeling of multimedia data objects, such as B-spline-based facial expression [12] and cross-modal representation [8]. Some research work also proposes prototypes that enclose the semantic features of data objects, such as XML-based mobile image representation and retrieval [37], while others present logic-based or hierarchical organization of multimedia data objects [13]. Although these models achieve good performance in their defined domains, none of them has considered multimedia data manipulation in an unreliable and resource-restricted environment such as an ad hoc network.
The disadvantages of the centralized strategy have motivated research on various decentralized information retrieval models for ad hoc networks [19][20][21]. Some recently proposed models in the literature try to make use of data distribution to facilitate information retrieval [22][23][24]. The Content-Addressable Network (CAN) model was proposed as a decentralized infrastructure that provides hash-table-like functionality on large-scale distributed networks [22]. A Distributed Hash Table (DHT) is employed to map the data objects into a k-dimensional logical Cartesian space, which allows the storage and retrieval of (key, object) pairs. Further improvements of CAN have led to logical overlay networks (e.g. pSearch [23] and SSW [24]) that use dimension-reduction techniques to reduce search cost. These content-navigated retrieval models are based on the feature vector representation of object contents [23], which is inefficient in describing multimedia data. In practical multimedia database systems, the feature vectors are often of fixed size to facilitate computation and representation. However, in many instances, some features may be null. The null features do not contribute to the semantic contents of multimedia objects, yet they still occupy space in the feature vectors and hence lower storage utilization.
In an ad hoc network with heterogeneous distributed data sources, traditional centralized retrieval methods may not work efficiently. Due to node mobility and the dynamic topology, the centralized strategy may require extra network traffic to locate the data sources [5]. In many cases, the extra traffic load is prohibitively high for ad hoc networks, considering their limited bandwidth and computation resources. Moreover, the centralized data server is a single point of failure, which reduces the robustness and scalability of the system.

Object-Based Indexing Infrastructure -Summary-Schemas Model (SSM)
The SSM was originally proposed as an object-oriented search engine that allows imprecise access to text information in a multi-database environment [25]. It is a global layer located on top of multiple preexisting autonomous and heterogeneous local databases. The SSM comprises three major components: a thesaurus, the local nodes, and the summary-schemas nodes (Fig. 1). The thesaurus defines a set of standard access terms, the semantic categories they belong to, and their semantic relationships. It may utilize any of the off-the-shelf thesauruses (e.g. Roget's thesaurus) as its basis. A Semantic Distance Metric (SDM) is defined to provide a quantitative measurement of "semantic similarity" between the user query and the database contents [25,26]. A local node is a physical database containing physical data. A summary-schemas node is a logical data entity, a meta-data (summary schema), representing the concise and abstract contents of its children's schemas. A summary schema uses fewer terms to describe its information contents than the union of the terms in the input schemas, while still capturing the semantic contents of the input terms.
The SSM organizes data objects in a hierarchical fashion based on synonym, hypernym, and hyponym relationships. Synonyms are semantically similar objects in different data formats or same-format objects in different physical locations. The SSM employs synonym links to connect and cluster the "similar-content" data objects together. A hypernym is the generalized description of the common characteristics of a set of data objects. To find the proper hypernyms of data objects, the SSM checks the on-line thesaurus to obtain the mapping from data objects to hypernym terms. The higher-level hypernyms that describe more comprehensive concepts can be obtained by abstracting the hypernym terms of data objects. Recursive application of the hypernym relation generates the hierarchical meta-data of the SSM. This in turn conceptually gives a concise semantic view of all the globally shared data objects. As the counterpart of the hypernym, a hyponym is the specialized description of the characteristics of data objects. In the SSM prototype, the hyponym links compose the routes from the most abstract descriptions to the specific data objects. More details about the SSM can be found in [25][26][27].

Extended Summary-Schemas Model (ESSM)
As mentioned before, the SSM was originally proposed for efficient management of textual and traditional databases in a wired multi-database environment. The original SSM prototype cannot be directly applied to accessing multimedia data in ad hoc networks because of two challenges: 1) unlike textual and conventional databases, multimedia data usually contain more complex semantic attributes (e.g. color, texture, and annotations) that cannot be represented by conventional data models; 2) the underlying fixed and predefined network topology of the SSM does not accommodate the infrastructure-free and dynamic nature of ad hoc networks.
Considering these limitations, we proposed an ontology-based framework, Semantic Ad-hoc Image Retrieval (SAIR), that employs the linguistic characteristics of images to facilitate content-based retrieval in ad hoc networks [33]. In this paper, we strengthen the representation capability of the SSM and propose a clustering strategy for organizing multimedia data in ad hoc networks. The proposed scheme is called the Extended Summary-Schemas Model (ESSM). The major properties of the ESSM can be summarized as follows:
- The semantic contents of multimedia data objects on each mobile node are extracted and represented using concise data-content-distribution expressions. The distribution-based representation approach proposed in this work is capable of accommodating the heterogeneity of the data sources. The details of this representation approach are discussed in Section 3.2.
- The mobile nodes are partitioned into clusters based on the semantic similarity of their data contents. The mobile nodes with similar data contents are called Content-Related Nodes, and the clusters are denoted as Content-Based Clusters. These clusters are recursively aggregated to form larger clusters. A more detailed description appears in Section 4.1.

Multimedia content representation
To represent the contents of multimedia objects in ad hoc networks, we present a novel semantic-based scheme, which can be briefly described as follows: Given a set of multimedia data objects distributed among a collection of mobile nodes, our goal is to find the data content distribution over the mobile nodes and set up the content-based relationships among these nodes.
The ad hoc network can be considered as an undirected connected graph $G = (N, C)$, where $N = \{n_1, n_2, \dots, n_r\}$ is the set of mobile nodes and $C = \{c_1, c_2, \dots, c_s\}$ is the set of wireless connections between mobile nodes. $X = \{x_1, x_2, \dots, x_m\}$ is the set of multimedia data objects. Each data object $x_i$ is represented as an n-dimensional semantic vector. The elements in the semantic vector could be manually added keywords, automatically extracted features, or relevance feedback generated by users. The distribution pattern of the data objects over the mobile nodes can be considered as a many-to-many relationship, i.e., each data object may be distributed among multiple mobile nodes, and each mobile node may contain a collection of data objects.
Figure 2 illustrates an ad hoc network and the data objects distributed among the mobile nodes. The multimedia data objects, represented as small circles, are considered as data points in the semantic space. Each mobile node $n_i$ in the network may contain a set of data objects from the semantic space, denoted as $\chi(n_i) = \{x^i_1, x^i_2, \dots, x^i_r\}$. The information contents of a node collectively represent an area in this semantic space (in Fig. 2, this is represented as a rectangle). As can be noted from Fig. 2, in this information space, mobile nodes may contain overlapping data (duplication) or semantically overlapping information. For example, both node A and node B contain the data objects in the semantic region of $\chi(A) \cap \chi(B)$. Generally, if an object $x_j$ is distributed over a collection of mobile nodes, we use $\psi(x_j)$ to denote the set of mobile nodes containing the replicas of $x_j$.

Definition 1. The Semantic Similarity
The semantic similarity between two data objects is defined based on the Euclidean distance between their corresponding semantic vectors. Formally, the distance between two data objects $x_i$ and $x_j$ is defined as a cosine distance function $dist(x_i, x_j) = \cos$
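Because the formula for dist is cut off in this copy, the following Python sketch only illustrates two common ways of measuring the distance between semantic vectors: the Euclidean distance mentioned in the text, and a distance derived from the cosine of the angle between the vectors. The function names `euclidean_dist` and `cosine_dist` are hypothetical, and the form `1 - cos` is an assumption rather than the authors' confirmed definition.

```python
import math

def euclidean_dist(x, y):
    """Euclidean distance between two equal-length semantic vectors."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same dimension")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def cosine_dist(x, y):
    """Assumed cosine-based distance: 1 - cos(angle between x and y)."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same dimension")
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_x * norm_y)
```

Either function can serve as the dist(., .) used by a retrieval routine, as long as the same choice is applied consistently to queries and stored objects.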

Retrieval formalization
The content-based multimedia retrieval (nearest-neighbor query) in an ad hoc network can be described as follows: given the set of multimedia data objects $X \subset R^n$ and the ad hoc network $G$, for a specific integer constant $k$ and a given query object $x_q$, return $k$ data objects $x_1, \dots, x_k \in X$ such that, for any other object $x' \in X$ and any returned object $x_i$, $dist(x_q, x_i) \le dist(x_q, x')$ is always satisfied. Formally, the nearest-neighbor retrieval is defined as:

Definition 2. The Nearest-Neighbor Retrieval
Given an object set $X = \{x_1, x_2, \dots, x_m\}$ and a query object $x_q$, the nearest-neighbor retrieval of $x_q$ within $X$, denoted as $\Xi_k(x_q, X)$, is the set of the $k$ objects in $X$ that are closest to $x_q$ under $dist(\cdot, \cdot)$.

Fig. 2. An illustrative example of the relationship between data objects and nodes (panels: the n-dimensional semantic space of data objects; the ad hoc network).

In the context of ad hoc networks, the resolution of a nearest-neighbor query may cause flooding and force pair-wise comparisons of data objects in each mobile node. The flooding approach is resource intensive and hence may not be applicable in real-time applications. Therefore, alternative approaches should be devised to perform content-based multimedia retrieval with optimized search cost.

Definition 3. The Range-Constrained Nearest-Neighbor Retrieval
Given a data object set $X = \{x_1, x_2, \dots, x_m\}$, a query object $x_q$, and a distance threshold $d$, the range-constrained nearest-neighbor retrieval returns the data set $\Xi_k^d(x_q, X, d)$, which contains the nearest neighbors of $x_q$ in $X$ whose semantic distance to $x_q$ is within $d$.

Given a multimedia data object set $X = \{x_1, x_2, \dots, x_m\}$ and a query object $x_q$, there always exists a distance threshold $d$ satisfying $\Xi_k(x_q, X) = \Xi_k^d(x_q, X, d)$.

Proof: For a given distance threshold $d$, the multimedia data object set $X$ is divided into two partitions: the objects whose semantic distance to $x_q$ is less than $d$ (i.e., within the sphere centered at $x_q$ with radius $d$) and the objects outside the sphere. By adjusting the distance threshold, a sphere that encloses exactly the data objects in $\Xi_k(x_q, X)$ can be obtained. In other words, given the multimedia data object set $X$, the distance threshold can be considered as a function whose parameter is the number of nearest neighbors. Therefore, the threshold $d$ satisfying $\Xi_k(x_q, X) = \Xi_k^d(x_q, X, d)$ always exists.
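The relationship between the two retrieval definitions can be illustrated with a minimal Python sketch. It assumes a Euclidean stand-in for the semantic distance and an inclusive threshold (distance at most d); the function names `knn`, `range_knn`, and `threshold_for` are hypothetical and not taken from the paper.

```python
import math

def dist(x, y):
    """Euclidean distance, used here as a stand-in for the semantic distance."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def knn(query, objects, k):
    """Return the k objects closest to the query (ties keep input order)."""
    return sorted(objects, key=lambda x: dist(query, x))[:k]

def range_knn(query, objects, k, d):
    """Return the nearest neighbors (at most k) whose distance to the query is <= d."""
    return [x for x in knn(query, objects, k) if dist(query, x) <= d]

def threshold_for(query, objects, k):
    """Smallest d for which range_knn returns the same set as knn (k >= 1)."""
    return max(dist(query, x) for x in knn(query, objects, k))

objects = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (3.0, 3.0), (5.0, 1.0)]
query = (0.0, 0.5)
d = threshold_for(query, objects, 3)
```

In this sketch the threshold is simply the distance from the query to its k-th nearest neighbor, which mirrors the argument of the proof: a suitable d can always be read off from the unconstrained result. A smaller d prunes the result set, which is what allows a search to skip nodes whose contents lie outside the sphere.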

Clustering methodology
The main purpose of our semantic-based multimedia representation is to provide a paradigm that concisely describes the semantic contents of multimedia data objects and cluster the semantically similar ones.Based on this paradigm, the multimedia data contents of each mobile node can be automatically recognized and represented as semantic vectors with strong mathematical backbone.This representation would also allow simple and efficient analysis of similar data objects which assists classification, clustering, and integration of multimedia data.A clustering method that utilizes the aforementioned semantic-based paradigm is given in this section.

Definition 4. The Hyper-Rectangle Region
Given a set of multimedia data objects $X^* = \{x^*_1, x^*_2, \dots, x^*_h\}$, each object $x^*$ can be represented as a vector of attributes $\varphi_x = (\varphi^1_x, \varphi^2_x, \dots, \varphi^n_x)$. The hyper-rectangle region $H(X^*)$ is a collection of intervals showing the n-dimensional minimum bounding region of $X^*$, i.e., one interval per dimension that spans the minimum and maximum attribute values of the objects in $X^*$.

Definition 5. The Node Content
Given a mobile node $n_i$ that contains $m$ multimedia data objects $\chi(n_i) = \{x^i_1, x^i_2, \dots, x^i_m\}$, the semantic content of node $n_i$, denoted as $S(n_i)$, is defined as a binary tuple that pairs the hyper-rectangle region $H(\chi(n_i))$ with the distribution density $F_{dd}(\chi(n_i))$, where $F_{dd}(\cdot)$ is a distribution density function over the semantic space.

Claim 1: Given the data object set $\chi(n_i)$, the distribution density function $F_{dd}(\chi(n_i))$ can be computed based on the density distributions of $\chi(n_i)$ in each dimension.
As noted in Definition 5, $F_{dd}(\cdot)$ is a distribution density function in the n-dimensional semantic space $R^n$. The $F_{dd}(\cdot)$ function can be defined inductively: $F_{dd}(R^n) = F^*_{dd}(R^{n-1}) \cdot f_d(R \mid F^*_{dd})$, where $F^*_{dd}$ is the distribution function in the $(n-1)$-dimensional sub-space and $f_d(R \mid F^*_{dd})$ is the conditional distribution density in the $n$-th dimension. Due to the independence among the dimensions, $f_d(R \mid F^*_{dd})$ is equivalent to the mapping of $F_{dd}(\cdot)$ along this dimension, which is denoted as $f^n_d(\cdot)$. Therefore, $F_{dd}(R^n)$ can be represented as the product of the density functions along each dimension, denoted as $F_{dd}(R^n) = \prod_{j=1}^{n} f^j_d(\cdot)$. For the multimedia data objects in the n-dimensional hyper-rectangle region of $\chi(n_i)$, each object $x$ is represented as a vector of attributes $\varphi_x = (\varphi^1_x, \varphi^2_x, \dots, \varphi^n_x)$. Hence, the density function of the $j$-th dimension, $f^j_d(\chi(n_i))$, is described as the probabilistic distribution over the interval $[\min\{\omega^j_1, \dots, \omega^j_r\}, \max\{\omega^j_1, \dots, \omega^j_r\}]$, where $\omega^j_t$ is the $j$-th feature value of the $t$-th object and $r = |\chi(n_i)|$ is the number of objects in $\chi(n_i)$, and $F_{dd}(\chi(n_i))$ can be computed as the product of the 1-dimensional probabilistic distribution functions.
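A minimal Python sketch of the node content follows: a per-dimension bounding interval plus a product of one-dimensional densities. The paper does not say how each one-dimensional density is estimated, so the equal-width histogram used here, the `bins` parameter, and all function names are assumptions made for illustration.

```python
def hyper_rectangle(objects):
    """Per-dimension (min, max) intervals bounding a set of semantic vectors."""
    dims = len(objects[0])
    return [(min(o[j] for o in objects), max(o[j] for o in objects))
            for j in range(dims)]

def marginal_density(objects, j, bins=4):
    """Histogram estimate of the density along dimension j over its interval."""
    lo, hi = hyper_rectangle(objects)[j]
    width = (hi - lo) / bins or 1.0  # degenerate interval: fall back to unit width
    counts = [0] * bins
    for o in objects:
        counts[min(int((o[j] - lo) / width), bins - 1)] += 1
    total = len(objects)
    return lo, hi, width, [c / (total * width) for c in counts]

def joint_density(objects, point, bins=4):
    """Product of the marginal densities at `point` (independence assumed)."""
    value = 1.0
    for j, coord in enumerate(point):
        lo, hi, width, density = marginal_density(objects, j, bins)
        if coord < lo or coord > hi:
            return 0.0  # outside the node's hyper-rectangle
        value *= density[min(int((coord - lo) / width), bins - 1)]
    return value

objs = [(0.0, 0.0), (1.0, 2.0), (2.0, 2.0), (4.0, 4.0)]
```

The product form keeps the summary compact: a node only has to store n short one-dimensional histograms instead of a full n-dimensional one, at the price of the independence assumption stated in the text.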

Definition 6. The Content-Related Nodes
Based on the aforementioned representation of multimedia data objects, each mobile node $n_i$ can obtain a hyper-rectangle region $H(\chi(n_i))$ which encloses the data objects in $\chi(n_i)$. Given a pair of nodes $n_0$ and $n_1$, they are content-related iff their hyper-rectangle regions satisfy an overlap condition on $H(\chi(n_0)) \wedge H(\chi(n_1))$, which is defined as the overlapping region of $H(\chi(n_0))$ and $H(\chi(n_1))$.

Definition 7. The Content-Based Cluster
Suppose an ad hoc network $G$ comprises $r$ mobile nodes $n_1, n_2, \dots, n_r$. Let $n_i \approx n_j$ denote the content-related relationship between $n_i$ and $n_j$, and $n_i \not\approx n_j$ denote that $n_i$ and $n_j$ are not content-related. Then a content-based cluster $C_i$ is a set of mobile nodes grouped by the content-related relationship.

Based on the content distribution of each mobile node, the ad hoc network is divided into clusters with similar or overlapping multimedia data objects. A new cluster-level content distribution function is generated by describing the contents of the mobile nodes within a cluster. A query will, with high probability, be directed to and resolved within a specific cluster. This characteristic reduces the search space for a query, which should result in improved query processing.
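The clustering idea can be sketched in Python as follows. The exact content-related condition and the exact cluster definition are not recoverable from this copy, so the sketch assumes that two nodes are content-related when their hyper-rectangles overlap at all, and that a cluster is a connected component of that relation. Both choices, and all function names, are illustrative assumptions.

```python
def overlap(rect_a, rect_b):
    """Intersection of two hyper-rectangles, or None if they are disjoint."""
    region = []
    for (lo_a, hi_a), (lo_b, hi_b) in zip(rect_a, rect_b):
        lo, hi = max(lo_a, lo_b), min(hi_a, hi_b)
        if lo > hi:
            return None
        region.append((lo, hi))
    return region

def content_related(rect_a, rect_b):
    """Assumed test: two nodes are content-related if their regions overlap."""
    return overlap(rect_a, rect_b) is not None

def content_based_clusters(node_rects):
    """Group nodes into connected components of the content-related relation."""
    names = list(node_rects)
    unvisited = set(names)
    clusters = []
    for start in names:
        if start not in unvisited:
            continue
        unvisited.discard(start)
        cluster, stack = [], [start]
        while stack:
            node = stack.pop()
            cluster.append(node)
            related = [m for m in names if m in unvisited
                       and content_related(node_rects[node], node_rects[m])]
            for m in related:
                unvisited.discard(m)
                stack.append(m)
        clusters.append(sorted(cluster))
    return clusters

node_rects = {
    "A": [(0, 2), (0, 2)],
    "B": [(1, 3), (1, 3)],
    "C": [(2.5, 4), (2.5, 5)],
    "D": [(10, 12), (10, 12)],
}
```

In this example A and C do not overlap directly but end up in the same cluster through B, while D forms a cluster of its own. A stricter condition, such as a minimum overlap volume, would only require changing `content_related`.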

Definition 8. The Cluster-Level Hyper-Rectangle Region
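Given a cluster $C_i = \{n_{i1}, n_{i2}, \dots, n_{iq}\}$, the cluster-level hyper-rectangle region, denoted as $H_c(C_i)$, is the minimum bounding rectangle that encloses the hyper-rectangle regions of each node in cluster $C_i$. In each dimension $j$, its interval spans the minimum and maximum of the values in $\omega^j(H(\chi(n_{il})))$ over all nodes $n_{il}$ of the cluster, where $\omega^j(H(\chi(n_i)))$ is the set of the $j$-th feature values of the data objects in $H(\chi(n_i))$.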
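As a small illustration, the cluster-level region can be computed by merging the per-node intervals dimension by dimension. The Python sketch below assumes each node region is already available as a list of (min, max) pairs; the function name `cluster_rectangle` is hypothetical.

```python
def cluster_rectangle(node_rects):
    """Minimum bounding rectangle enclosing the rectangles of all member nodes."""
    dims = len(node_rects[0])
    return [(min(r[j][0] for r in node_rects), max(r[j][1] for r in node_rects))
            for j in range(dims)]

members = [[(0, 2), (0, 2)], [(1, 3), (1, 3)], [(2.5, 4), (2.5, 5)]]
cluster_box = cluster_rectangle(members)

# The same function can be applied again to cluster-level boxes, which is how
# higher levels of a hierarchy could summarize several clusters.
parent_box = cluster_rectangle([cluster_box, [(10, 12), (10, 12)]])
```

Because the merge only needs the interval endpoints, a centroid can maintain the cluster-level region from node summaries without seeing the underlying data objects.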

Definition 9. The Cluster-Level Content Distribution
Given a cluster $C_i = \{n_{i1}, n_{i2}, \dots, n_{iq}\}$, the cluster-level content distribution, denoted as $S_c(C_i)$, is defined as a binary tuple that pairs the cluster-level hyper-rectangle region $H_c(C_i)$ with a cluster-level distribution density function $F^c_{dd}(\cdot)$.

The cluster-level content distribution information is further integrated and fused to form higher-level clusters, whose contents are represented using distribution functions in larger hyper-rectangle regions. This process is performed recursively until the whole ad hoc network is represented as one cluster. To accommodate the cluster-level content distribution information, a centroid node is selected for each cluster. The centroid is a node $n^*$ that has the most powerful capabilities, such as computation speed, memory space, and communication bandwidth. If there is a tie among several centroid candidates in a cluster, we choose as the centroid the node whose data content has the maximum overlap with the cluster-level hyper-rectangle region of this cluster.

Definition 10. The Cluster Centroid
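Given a cluster $C_i = \{n_{i1}, n_{i2}, \dots, n_{iq}\}$, let each node $n_{ij}$ be described by a vector of $k$ hardware characteristics, such as computation capability and communication speed. Then the centroid of $C_i$ is the node that maximizes a score computed from these characteristics and the coefficients $c_1, c_2, \dots, c_k$, where $c_1, c_2, \dots, c_k$ are predefined coefficient parameters.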
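The centroid selection, including the tie-breaking rule described earlier, can be sketched in Python. The scoring formula is missing from this copy, so the weighted sum in `capability_score` is an assumption, as are the function names and the dictionary layout of `nodes`.

```python
def capability_score(hardware, coefficients):
    """Assumed score: weighted sum of a node's hardware characteristics."""
    return sum(c * h for c, h in zip(coefficients, hardware))

def overlap_volume(rect_a, rect_b):
    """Volume of the overlap of two hyper-rectangles (0.0 if disjoint)."""
    volume = 1.0
    for (lo_a, hi_a), (lo_b, hi_b) in zip(rect_a, rect_b):
        side = min(hi_a, hi_b) - max(lo_a, lo_b)
        if side < 0:
            return 0.0
        volume *= side
    return volume

def select_centroid(nodes, coefficients, cluster_rect):
    """Pick the highest-scoring node; break ties by overlap with the cluster region."""
    return max(
        nodes,
        key=lambda name: (
            capability_score(nodes[name]["hardware"], coefficients),
            overlap_volume(nodes[name]["rect"], cluster_rect),
        ),
    )

nodes = {
    "n1": {"hardware": (2.0, 4.0, 2.0), "rect": [(0, 2), (0, 2)]},
    "n2": {"hardware": (3.0, 2.0, 2.0), "rect": [(1, 4), (1, 5)]},
    "n3": {"hardware": (1.0, 1.0, 1.0), "rect": [(0, 1), (0, 1)]},
}
coefficients = (0.5, 0.25, 0.25)
cluster_rect = [(0, 4), (0, 5)]
centroid = select_centroid(nodes, coefficients, cluster_rect)
```

Here n1 and n2 have the same capability score, so the choice falls to n2, whose region covers more of the cluster-level rectangle.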
To reduce the search cost, an indexing hierarchy is built as a wrapper on top of the content-based clusters. The construction of the indexing hierarchy is a two-step process: (1) Within a cluster, the mobile nodes are grouped and one node is selected as the centroid of the cluster. This centroid node acts as the lowest indexing node in the indexing hierarchy (Definitions 11 and 12) and contains the semantic data contents of the cluster. Connections between the centroid node and the other nodes provide a virtual infrastructure within the cluster. (2) Based on the virtual infrastructures of the clusters, the high-level indexing nodes are created by abstracting the data contents of several clusters. Similarly, each high-level indexing node is mapped to the ad hoc node that has the most powerful capabilities among its children nodes.
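The two-step construction can be sketched as a small tree of bounding regions through which a query point is routed. This Python sketch is illustrative only: it groups clusters in input order with a fixed `fan_out`, whereas the paper groups by content similarity, and the names `IndexNode`, `build_level`, and `route` are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class IndexNode:
    name: str
    rect: list                      # per-dimension (lo, hi) intervals
    children: list = field(default_factory=list)

def bounding(rects):
    """Minimum bounding rectangle of several hyper-rectangles."""
    dims = len(rects[0])
    return [(min(r[j][0] for r in rects), max(r[j][1] for r in rects))
            for j in range(dims)]

def build_level(nodes, fan_out, level=1):
    """Group nodes under parents of up to `fan_out` children until one root remains."""
    if fan_out < 2:
        raise ValueError("fan_out must be at least 2")
    if len(nodes) == 1:
        return nodes[0]
    parents = []
    for i in range(0, len(nodes), fan_out):
        group = nodes[i:i + fan_out]
        parents.append(IndexNode(f"L{level}-{i // fan_out}",
                                 bounding([g.rect for g in group]), group))
    return build_level(parents, fan_out, level + 1)

def contains(rect, point):
    return all(lo <= p <= hi for (lo, hi), p in zip(rect, point))

def route(node, point):
    """Return the names of the leaf clusters whose region may hold the query point."""
    if not contains(node.rect, point):
        return []
    if not node.children:
        return [node.name]
    hits = []
    for child in node.children:
        hits.extend(route(child, point))
    return hits

leaves = [
    IndexNode("C1", [(0, 4), (0, 5)]),
    IndexNode("C2", [(10, 12), (10, 12)]),
    IndexNode("C3", [(3, 6), (4, 8)]),
    IndexNode("C4", [(20, 22), (0, 1)]),
]
root = build_level(leaves, fan_out=2)
```

A query is forwarded only into subtrees whose summarized region contains the query point, so clusters with unrelated contents are never contacted.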

Scalability and self-adaptive capability
The proposed indexing hierarchy is able to reorganize itself as the network conditions change. The cluster centroid nodes are selected according to the hardware characteristics of the mobile nodes, which are likely to remain static and independent of the network topology. However, the dynamic nature of both the ad hoc network and the data contents of the mobile nodes requires dynamic maintenance of the proposed indexing scheme to guarantee optimized and accurate data retrieval.

Definition 11. The Capacity Entropy
Given a cluster $C_i = \{n_{i1}, n_{i2}, \dots, n_{iq}\}$, let each node $n_{ij}$ have a vector of $k$ hardware characteristics, and let $P = \{c_1, c_2, \dots, c_k\}$ be a vector of predefined coefficients. The capacity entropy of node $n_{ij}$, denoted as $E_c(n_{ij}, C_i)$, is computed from the hardware vector of $n_{ij}$, the coefficients in $P$, and the content of $n_{ij}$ relative to cluster $C_i$.

Because each cluster chooses the node with the largest capacity entropy as its centroid, the centroid node at the top level of the indexing hierarchy should have the largest capacity entropy throughout the cluster. Suppose $n'$ is the current centroid node. If another node $n''$ achieves the largest capacity entropy as a result of content changes, then $n''$ should become the new centroid node.
Based on the definition of capacity entropy, an adaptive scheme that dynamically adjusts the centroid node of a cluster can be proposed. The key point in this scheme is that the current centroid node of the cluster needs to keep up-to-date information about the capacity entropies of the other nodes in its cluster. Since the hardware features of a mobile node $n_i$ remain unchanged for a long period, the only information that needs to be updated is the content summary of node $n_i$, which can be concisely described as a logic expression $S(n_i)$. The updated $S(n_i)$ can be obtained by allowing query packets to piggyback it when they are forwarded to the centroid node. The detailed update strategy is as follows:
- Dynamic maintenance: Each mobile node $n_i$ can process a sequence of queries $Q^1_i, \dots, Q^m_i$. For any content-based query $Q^t_i$ submitted to node $n_i$, if $Q^t_i$ cannot be resolved locally, it is forwarded to the centroid node of $n_i$'s cluster in an attempt to resolve it against the cluster-level content distribution information. If $S(n_i)$ has changed since the last query was forwarded to the centroid node, $S(n_i)$ is attached to the query packet and forwarded to the centroid node. The new capacity entropy of $n_i$ is computed at the centroid node, and corresponding adjustments are performed if a new largest entropy is found.
- Node joining: For a new node joining the ad hoc network, an attempt is first made to locate its content-related cluster, and then its content is added to the indexing hierarchy.
- Node leaving: If a mobile node breaks down or leaves the network, its content summary is removed from the cluster centroid and ultimately from the indexing hierarchy.
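The three maintenance cases above can be sketched as a small centroid-side bookkeeping class in Python. Everything here is a simplification under stated assumptions: the capacity entropy formula is not available in this copy, so `_entropy` uses a weighted hardware score plus a single number standing in for the content summary S(n_i); the class and method names are hypothetical.

```python
class ClusterIndex:
    """Toy centroid-side record of member summaries (illustrative only)."""

    def __init__(self, coefficients):
        self.coefficients = coefficients
        self.hardware = {}   # node name -> hardware vector
        self.summary = {}    # node name -> content summary, here a single number
        self.centroid = None

    def _entropy(self, name):
        # Assumed stand-in for the capacity entropy: weighted hardware score
        # plus a content term taken from the node's summary.
        score = sum(c * h for c, h in zip(self.coefficients, self.hardware[name]))
        return score + self.summary[name]

    def _reelect(self):
        # The node with the largest entropy becomes (or stays) the centroid.
        self.centroid = max(self.hardware, key=self._entropy, default=None)

    def join(self, name, hardware, summary):
        """Node joining: add the node's content summary to the cluster index."""
        self.hardware[name] = hardware
        self.summary[name] = summary
        self._reelect()

    def leave(self, name):
        """Node leaving: remove the node's summary from the cluster index."""
        self.hardware.pop(name, None)
        self.summary.pop(name, None)
        self._reelect()

    def forward_query(self, name, query, updated_summary=None):
        """Dynamic maintenance: a forwarded query may piggyback a changed summary."""
        if updated_summary is not None:
            self.summary[name] = updated_summary
            self._reelect()
        return self.centroid

index = ClusterIndex(coefficients=(1.0, 1.0))
index.join("a", (2.0, 1.0), 0.5)
index.join("b", (1.0, 1.0), 0.2)
```

Piggybacking the summary on a query that must travel to the centroid anyway means that content updates add no extra packets; the cost is that the centroid's view is only as fresh as the last unresolved query from each node.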

Performance analysis
The performance of the proposed scheme is affected by several factors, such as the initialization overhead of building the indexing hierarchy (including clustering and centroid selection), the cost of performing content-based multimedia queries, and the overhead of maintaining the indexing hierarchy when the network status changes. To facilitate content-based retrieval, these factors need to be analyzed in detail.
Our analysis is based on the following notations:
- $k$: the number of nodes in the ad hoc network.
- $m$: the minimum number of children for an indexing node (minimum fan-out).
- $P_J$: the probability of a node joining the network.
- $P_R$: the probability of a node being removed from the network.
- $P_M$: the probability of a modify operation.
- $Q$: the query rate.

Initialization overhead
In the initialization step, a randomly selected node, say $n_i$, is chosen as the coordinator to construct the indexing hierarchy. The selection of the coordinator node takes $\Theta(k)$ hops. The coordinator needs to send each node a message to announce the coordinator's location and to collect data content descriptions, which takes $O(k) \cdot k$ hops. Hence, the initialization overhead amortized over each node is $O(k)$ hops.

Searching cost
Resolution of a query requires at most $2\log_m(k)$ steps to traverse the ESSM indexing hierarchy. Each forwarding operation takes at most $k$ hops. Hence the average search cost for a query in the proposed scheme is $O(k \log_m(k))$ hops. In contrast, the flooding strategy requires $\Omega(k^2)$ hops to resolve a query, since a network of $k$ mobile nodes can have $\Theta(k^2)$ connections, and a query may be transmitted on each connection at least once.
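To give a feel for the gap between the two bounds, the following Python sketch evaluates the expressions from the text for a few network sizes. The function names are hypothetical, and the numbers are only indicative: one expression is an upper bound and the other a lower-bound order, so their constant factors are not directly comparable.

```python
import math

def essm_search_hops(k, m):
    """Upper bound from the text: 2*log_m(k) forwarding steps of at most k hops each."""
    return 2 * math.log(k, m) * k

def flooding_hops(k):
    """Order of the flooding cost: about one transmission per connection, k^2."""
    return k * k

for k in (16, 256, 4096):
    print(k, round(essm_search_hops(k, m=4)), flooding_hops(k))
```

The ratio between the two grows roughly as k / log_m(k), so the advantage of the hierarchy widens as the network grows.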

Maintenance overhead
As noted before, the proposed indexing hierarchy does not change as long as the data source contents are intact. The indexing hierarchy changes only if a data source is inserted into or deleted from the network, or if the contents of an existing node are modified. Moreover, content changes on each node are piggybacked on unresolved queries and forwarded to the cluster centroid, which may trigger maintenance overhead due to the reconstruction of the indexing hierarchy.
A new node $n_{k+1}$ joining the network requires at most $k \log_m(k)$ hops to be included in its related cluster. Hence, the average cost for processing new nodes is $2 P_J Q k \log_m(k)$ hops. Similarly, the processing of leaving nodes requires $P_R Q k \log_m(k)$ hops. The modification operation can be viewed as a deletion of a node followed by an insertion of a new node. Consequently, its cost is that of one deletion plus one insertion, incurred with probability $P_M$.

Experimental evaluation
In this section, we present the experimental analysis of the proposed content-aware clustering scheme against the flooding-based search scheme [4,5,30]. As noted earlier, in an ad hoc network with distributed data sources, a flooding-based blind search strategy may cause extra computational and communication overhead by forwarding queries to "irrelevant" nodes. In contrast with the flooding-based schemes, the ESSM organizes nodes with similar contents into clusters, and forwards query packets only to the content-related nodes.
The evaluation consists of a series of experiments conducted using both a real dataset and a synthetic dataset. Our comparative analysis is based on various performance metrics such as accuracy, search cost, scalability, and physical characteristics of the mobile nodes.

Experimental setup
The experiments were run on an ad hoc network prototype based on the CMU extension to ns-2 version 2.26 [29]. However, since ns-2 does not support content-based retrieval, a semantic-representation module was developed and added to facilitate multimedia data organization. In addition, the routing and data transmission processes were implemented under the framework of the extended summary-schemas hierarchy.
The experiment was initialized by assuming a default number of pre-existing nodes in the network and randomly setting up the connections between the nodes. A mixture of operations, including querying, node joining, and node leaving, was then randomly submitted to the network. The query generation time follows an exponential distribution, as in previous work [16]. The access pattern in the queries follows a Zipf-like distribution, which is widely used to model non-uniformly distributed queries [35]. The input parameters are summarized in Table 1. Most of these parameters are self-explanatory. More details for some parameters are given below.
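A workload of this kind can be sketched with the Python standard library. The sketch assumes that "query generation time" refers to exponentially distributed gaps between consecutive queries, and that the Zipf-like pattern gives the item of rank $r$ a weight of $1/r^{\alpha}$; the function names, the exponent `alpha`, and the numeric values are illustrative choices, not parameters taken from Table 1.

```python
import random


def query_arrival_times(rate, n, rng):
    """Arrival times of n queries with exponential inter-arrival gaps (mean 1/rate)."""
    t, times = 0.0, []
    for _ in range(n):
        t += rng.expovariate(rate)
        times.append(t)
    return times


def zipf_like_targets(num_items, n, alpha, rng):
    """Draw n item indices where the item of rank r has weight 1 / r**alpha."""
    weights = [1.0 / (rank ** alpha) for rank in range(1, num_items + 1)]
    return rng.choices(range(num_items), weights=weights, k=n)


rng = random.Random(0)  # fixed seed so that a run is repeatable
times = query_arrival_times(rate=10.0, n=5, rng=rng)
targets = zipf_like_targets(num_items=32, n=1000, alpha=1.0, rng=rng)
```

With `alpha` close to 1, a handful of popular items attract most of the queries, which is the non-uniform behaviour the Zipf-like model is meant to capture.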
Node Movement Parameters: Each node in the experimental environment randomly selects waypoints within a 670 m × 670 m flat area. The node density can be adjusted by changing the number of mobile nodes in the flat area, in the range of 1 to 16,384. The node moving pattern follows the random waypoint movement model [5]: initially, the nodes are placed randomly in the area. Each node selects a random destination and moves toward the destination with a speed selected randomly from $[0, v_{max}]$.
After reaching its destination, the node pauses for a period of time and repeats this movement pattern.
Dataset Parameters: To examine both the accuracy and the scalability, we used two experimental datasets as test beds, a synthetic dataset and a real dataset, as follows:
-The synthetic dataset employed in the experiments is similar to the one used in [34]. It includes up to 65,536 data points in a 256-dimension feature space, whose feature values are assigned on each dimension by a random number generator following a normal distribution in the interval [0, 1].
-The real dataset consists of 2,560 images of 32 semantic categories from the Corel dataset [36] (see Table 1). Each image in the dataset is represented as a vector of 435 features (color histograms and texture wavelet coefficients) and 4 annotation keywords. It is a large and heterogeneous image dataset. 2,048 images in the test dataset were used to train a semantic subspace learning module (i.e., LPP) integrated in the ESSM system that deduces the relationships among the keywords and the semantic categories. The remaining 512 images were used as the candidate pool for query examples.
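The random waypoint movement described under Node Movement Parameters can be sketched for a single node as follows. This is a simplified discrete-time illustration and not the ns-2 mobility generator used in the experiments; the function name, the time step `dt`, and the fixed pause length are assumptions.

```python
import math
import random


def random_waypoint(steps, v_max, pause, side=670.0, dt=1.0, rng=None):
    """Trace one node's (x, y) positions under the random waypoint model."""
    rng = rng or random.Random()
    pos = (rng.uniform(0, side), rng.uniform(0, side))  # random initial placement
    trace = [pos]
    dest, speed, wait = None, 0.0, 0.0
    while len(trace) < steps:
        if wait > 0:
            wait -= dt  # pausing at a reached waypoint
        else:
            if dest is None:  # pick a new waypoint and a speed in [0, v_max]
                dest = (rng.uniform(0, side), rng.uniform(0, side))
                speed = rng.uniform(0, v_max)
            dx, dy = dest[0] - pos[0], dest[1] - pos[1]
            dist = math.hypot(dx, dy)
            step = speed * dt
            if dist <= step:  # waypoint reached within this tick
                pos, dest, wait = dest, None, pause
            else:  # move one tick along the straight line to the waypoint
                pos = (pos[0] + dx / dist * step, pos[1] + dy / dist * step)
        trace.append(pos)
    return trace


trace = random_waypoint(steps=200, v_max=1.0, pause=2.0, rng=random.Random(1))
```

Because the speed is drawn from an interval that includes 0, a node can occasionally move very slowly for a long time, which is a known property of this model.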

Retrieval process
The content-based multimedia retrieval under the framework of ESSM can be considered as a process of checking the content distribution of the integrated information: the semantic content of the query (i.e., a semantic vector) is compared with a series of distribution density functions navigated in the ESSM infrastructure to find a cluster that contains the semantically most similar images. In the experiments reported in this section, the randomly generated query issued to a mobile node is resolved as follows:
1. The query object is represented as a semantic vector, which is forwarded to the ESSM infrastructure and compared against the contents of the clusters (hyper-rectangle regions and distribution density functions).
2. By comparing the overlap between the query search sphere and the cluster-level contents, the ESSM finds a cluster that is semantically most relevant to the query. The centroid node of the selected cluster then examines the semantic content of the query, and forwards it to the mobile multimedia databases that may contain the semantically most similar objects.
3. For a specific mobile multimedia database, the content-based retrieval employs a comparison metric including both semantic concepts and low-level features to find the nearest neighbors to the query object.
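The overlap comparison in step 2 can be illustrated with a small geometric sketch. It assumes that each cluster is summarized only by an axis-aligned hyper-rectangle given as a pair of lower and upper corner vectors, and it ignores the distribution density functions that the ESSM also consults; the function names and the cluster labels in the test data are hypothetical.

```python
import math


def sphere_rect_distance(center, lower, upper):
    """Euclidean distance from a point to an axis-aligned hyper-rectangle (0 if inside)."""
    sq = 0.0
    for c, lo, hi in zip(center, lower, upper):
        nearest = min(max(c, lo), hi)  # clamp the coordinate into [lo, hi]
        sq += (c - nearest) ** 2
    return math.sqrt(sq)


def candidate_clusters(query, radius, clusters):
    """Ids of clusters whose region intersects the query sphere, nearest first."""
    hits = []
    for cid, (lower, upper) in clusters.items():
        d = sphere_rect_distance(query, lower, upper)
        if d <= radius:  # the search sphere touches or overlaps the region
            hits.append((d, cid))
    return [cid for _, cid in sorted(hits)]
```

A query sphere intersects a hyper-rectangle exactly when the distance from its center to the rectangle does not exceed its radius, so clusters that fail this test can be skipped without visiting their nodes.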
The ESSM shows its adaptation to the ad hoc environment in the experiments. As mentioned in Section 2.2, the centralized search strategy achieves good performance; however, it is not suitable for ad hoc networks. The ESSM can therefore be considered effective if it obtains results comparable to those of the centralized search strategy. Consequently, we used the results from a centralized search system as the standard for content-based image retrieval, and evaluated the ESSM by the percentage of its search results that match those of the centralized search scheme. Figure 3 shows the 5-nearest-neighbor retrieval results of the ESSM with different hop-count constraints. The environment comprises 2,048 images distributed among 128 mobile multimedia databases. The query is an image of a horse (the leftmost image in the bottom row of Fig. 3). As can be seen from Fig. 3, the ESSM finds images that are very similar to those returned by the centralized search. Limiting the number of hops (10-30 hops) reduces the accuracy of the search results, i.e., the results are not identical to those of the centralized search. However, a close observation shows that the returned results are still semantically related to the query image, i.e., they are mainly mammals. This shows that, unlike conventional feature-based CBR systems, the ESSM retrieves multimedia objects based on gradually increasing granularities of semantic categories.

Retrieval performance evaluation
As mentioned earlier, the main purpose of the simulator is to evaluate the performance improvement of the proposed clustering scheme based on the following performance metrics:
-Accuracy is the percentage of the results generated by a search scheme (either the proposed clustering model or flooding) that match the results of the centralized strategy. A higher matching percentage implies higher retrieval accuracy.
-Search path length is the average number of hops spent on locating the data source node that contains the semantically nearest neighbors of the query image.
-Search cost is the average number of messages incurred during the process of query resolution. Flooding-based schemes may have a short search path length, but their search cost is high due to duplicate query replies.
-Maintenance cost is the number of messages spent on adjusting the indexing hierarchy according to the topology and content changes in the network.
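The accuracy metric defined above amounts to a set overlap with the centralized result. A minimal sketch, assuming that each result is a list of image identifiers and that order within the result is ignored (the function name is hypothetical):

```python
def retrieval_accuracy(results, reference):
    """Percentage of the reference (centralized) results also returned by the scheme."""
    if not reference:
        raise ValueError("reference result set must not be empty")
    reference_set = set(reference)
    matched = len(set(results) & reference_set)
    return 100.0 * matched / len(reference_set)


# A 5-nearest-neighbor query in which three of the five centralized results are found.
print(retrieval_accuracy([1, 2, 3, 9, 8], [1, 2, 3, 4, 5]))  # prints 60.0
```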

Retrieval accuracy evaluation
The retrieval accuracy was evaluated for different simulation settings, varying the number of mobile nodes, the speed of node movement, the rate of query submissions, and the number of hops allowed during the query resolution process. In these experimental runs, we assumed 2,048 real images randomly scattered on mobile nodes. The maximum fan-out of the indexing hierarchy was set to 10, and the number of nearest neighbors in the content-based search was set to 10.
Figure 4 illustrates the impact of the number of hops on the retrieval accuracy. For a network of 128 mobile nodes moving at a maximum speed of 1 m/s, we limited the number of hops during the query resolution. Given this fixed network scale (128 mobile nodes), the number of hops was varied from 10 to 80. As can be observed, the retrieval results generated by the ESSM are more semantically related to the query image than those generated by flooding. Also note that both schemes (ESSM and flooding) achieve better accuracy when queries are generated at a lower rate. This can be explained by the fact that a higher query rate increases the workload of each node in storing and forwarding the queries to its neighboring nodes. The superior performance of the ESSM in contrast with flooding stems from its content-based clustering capability, since the search domain is restricted to a few clusters that are semantically most relevant to the query.
In another simulation run, we examined the impact of data density on the search accuracy. Using the same real image dataset, we varied the number of mobile nodes from 256 down to 32, increasing the average number of images per node from 8 to 64. With fixed mobility and query rate (a moving speed limit of 1 m/s and an average of 10 queries per second), the number of hops was limited to 64. Figure 5 depicts the experimental results. Since the probability of finding semantically related results increases as the data density increases, both schemes offer better accuracy as data density increases. However, the accuracy of flooding-based schemes is still lower than that of the ESSM. Also note that the ESSM achieves over 90% search accuracy at a relatively low data density (16 images per node). This implies that in a relatively large-scale network (2048/16 = 128 nodes), the ESSM almost achieves its peak performance with a small search cost (less than 64 hops). In the scope of wireless ad hoc networks, this is an interesting observation, since the mobile nodes usually have small storage and cannot support large data density.

Search cost evaluation
In an ad hoc network, the search path length is usually calculated as the number of hops needed to deliver the query to the mobile nodes that contain the requested data. The real image dataset was used in this simulation run. Several factors, such as mobility, node number, and query rate, have a direct impact on the search cost; hence, we ran our simulator with different combinations of these parameters.
Figure 6 illustrates the number of hops needed during the query resolution for various network densities. As can be seen from Fig. 6, the flooding scheme finds the data source node using fewer hops. This is due to the fact that flooding broadcasts the query to all its neighboring nodes and reaches the data source node using the shortest path. However, due to the broadcasting and duplicate replies, the search cost of flooding may be prohibitively high.
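The difference between search path length and search cost under flooding can be made concrete with a breadth-first traversal. The sketch assumes an idealized flood in which every node that receives the query forwards it once over each of its links, with no hop limit and no early termination; the function name and the example topology are illustrative only.

```python
from collections import deque


def flood(adjacency, source, target):
    """Flood a query from source; return (hops to target, link-level messages sent)."""
    depth = {source: 0}
    queue = deque([source])
    messages = 0
    while queue:
        node = queue.popleft()
        for neighbor in adjacency[node]:
            messages += 1  # one copy of the query per outgoing link
            if neighbor not in depth:  # first arrival: record hops, forward later
                depth[neighbor] = depth[node] + 1
                queue.append(neighbor)
    return depth.get(target), messages


# A ring of six nodes: the target opposite the source is three hops away,
# yet every node still forwards the query over both of its links.
ring = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
print(flood(ring, 0, 3))  # prints (3, 12)
```

The first copy of the query reaches the target along a shortest path, while the message count grows with the total number of links, which is why a short path length does not imply a low search cost.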
Figure 7 shows the comparison of the search cost of both schemes in the same environment. As anticipated, the ESSM retrieves the images at a much lower search cost than flooding-based schemes due to its content-based clustering characteristic. In addition, it can be concluded from Fig. 7 that as node density increases, the ESSM offers even better performance than the traditional flooding schemes, i.e., a steady performance improvement as the network scales up.
We also evaluated the impact of node mobility on search cost. For an ad hoc network of 64 nodes, the maximum speed of mobile nodes was varied from 10 to 50. Figure 8 shows the simulation results. As can be concluded, the search cost of both schemes increases as the node mobility increases; however, the ESSM resolves queries at a much lower cost. This is due to the very nature of the flooding scheme: higher network traffic and higher workload at mobile nodes. In contrast, the ESSM resolves the queries within the scope of content-related nodes, and avoids forwarding queries to irrelevant nodes.

Scalability evaluation
In a separate simulation run, the search cost of both schemes was evaluated as the network scales up. In this simulation run, we varied the number of nodes from 128 to 512 and randomly distributed 65,536 synthetic data points on the mobile nodes. Figure 9 illustrates the result. Similar to our earlier observation (Fig. 7), one can conclude that the ESSM is scalable to large network sizes and large numbers of data objects.

Maintenance overhead
As mentioned in Section 5, the self-adjustment capability of the proposed model incurs maintenance overhead that needs to be evaluated. In contrast to the ESSM, the maintenance cost of the flooding scheme is limited to the messages needed to update the neighborhood relationships between mobile nodes. Figure 10 shows the average search cost and the maintenance cost of both strategies as the network becomes denser. As one can conclude, even taking the maintenance overhead into account, the ESSM still offers better performance than flooding.

Conclusions
We proposed a novel dynamic content-aware clustering scheme that facilitates content-based multimedia retrieval in ad hoc networks. This scheme is based on a concise descriptive framework and makes use of the data content distribution in ad hoc networks to reduce the search cost without incurring high maintenance overhead. We have quantified the efficiency and effectiveness of our scheme with respect to various performance metrics: retrieval accuracy, search cost, and maintenance overhead. Through extensive theoretical and experimental analysis, we found that our content-based indexing methodology has the following features:
-The ESSM is a decentralized non-flooding search strategy performing content-based multimedia retrieval in ad hoc networks. As shown in our simulation results, it can achieve high accuracy while visiting only a small portion of mobile nodes.
-We employed semantic-based clustering to organize multimedia data: the content-related mobile nodes are grouped into clusters. As witnessed by the simulation results, this approach reduces the search cost dramatically relative to the traditional flooding strategy.
-Our model is dynamic and capable of self-organizing as the network status changes. This further offers scalability and robustness in large-scale networks.

Table 1: Input parameters to the experimental system