Semantic prefetching strategy to manage location dependent data in mobile information systems 1

In mobile computing environments, location plays an important role in determining user’s interests. If service providers can supply information reflecting users’ semantics, overall system performance can be improved. Users can avoid queries, resulting in unnecessary network usage. A query result can also be transferred to a local device. Through our proposed semantic prefetching scheme, we make it possible to send data to mobile clients in advance. Researchers have been, for a long time, trying to develop a method to overcome mobile information system issues. These issues consist mainly of low bandwidth, high delay and frequent disconnections. In this paper, we represent the semantic prefetching strategy, which utilizes users’ interests, relating to a users’ current location. Our performance results show that this method is superior when compared with traditional semantic cache schemes.


Introduction
Users freely move in mobile computing environments.However, as their location changes, the corresponding value of their location attribute may vary.Location-Based Services (LBSs) and Geographic Information System (GIS), are core attributes of mobile systems, both of which provide location based services.These systems recognize a users' particular location as important information, representing not only a cellular based system, but also, enabling the use of Global Positioning Systems (GPSs).Location information, with corresponding system developments, are becoming a critical issue in mobile computing environments.Consequently, the endeavor to provide a greater precision, in terms of predicting users requirements, for particular location is a continuing focus.Through this paper we contribute to this goal, by proposing a new mechanism to improve the precision in locating a particular user.
Traditional cache management schemes were first realized in a wired environment [3,10].The cache scheme is becoming even more important in the wireless environment due to the fact that wireless technology still lacks the level of stability that mobile systems require, in terms of reliable communication.Contrary to a wired environment, which does not need to keep track of user's location, wireless environments possess the burden of having to constantly keep track of user's location.To define these semantics, we have applied semantic locality.We explain that characteristics of semantic locality to include not only spatial locality of an existing physical database, but also semantic information based on a users' location.Knowing the exact location of the user is essential for LBSs and GISs, and this is becoming even more important in GPSs, within cellular systems.
Semantic cache includes these localities, and represents them through a semantic description [7,9,17,18].Semantic cache consists of semantic descriptions and contents received from previous queries.Because the semantic cache will have a semantic description and data received from a previous query, even if the connection is lost or has entered an unreachable state, a user is able to acquire information contained in the semantic cache.In addition, traditional prefetching strategy makes up for weak points in the cache strategy.The prefetching scheme can reduce a cache miss and additional network use occurring from a cache miss, by advance transmission of data.
The proposed semantic prefetching scheme successfully overcomes the limitations of wireless environments by transferring data and semantic information to users simultaneously.The proposed scheme focuses on data services, based on the assumption that the service is supported by a connection capable of high bandwidth rapid transmission [6].We propose the semantic prefetching scheme with a semantic description for LDD in mobile information systems.
Figure 1 shows the infrastructure for data, transmitted to specific areas.The system consists of high bandwidth fine-grained cells.Considering the philosophy of "anytime anywhere" services, a data communication system must also be designed to accommodate mobile environments.
This paper is organized as follows: In Section 2, we describe some of the recent research related to cache management schemes in mobile information systems.In Section 3, we provide a model suitable for mobile information systems.Section 4 represents a proposed semantic prefetching technique and deals with query processing and cache management schemes.In Section 5, we explain simulation results and analysis according to various scenarios, -divided into uniform and zipf distribution.Finally, in Section 6, we conclude our paper, and discuss suggestions for future work.

Related work
Our research approaches two aspects of data and mobility in mobile environments.We consider two components relating to a users' particular location.This is divided into sections denoted as "locationaware" and "location-based"."Location-aware" means to know where the client is located, whereas "Location-based" means data is transmitted to user relating to the current location [1,19,20].These terminologies both need to be applied for efficient data management in mobile environments.With these terminologies, conventional cache and semantic cache strategies have been constantly developed to form successful mobile information systems.Strategies have been introduced to reduce the network traffic, and if disconnection occurs, a cache guarantees a hit ratio using information in the client cache [3,4,10,11].Prefetching research has not only improved the level of data accuracy but also reducing cache miss.With this strategy, Infostation infrastructure has the ability to transmit large amounts of data in congested area [5,6].

Cache and semantic cache
Traditional cache mechanism is based on page cache.Page cache is sensitive to physical database organization.If we consider spatial locality in LBSs, the cache hit ratio can be diminished within the page cache.This results in frequent network communication for gaining the requested data since the page cache does not retain semantic information, clients largely depend on the status of the network connection with the server.We pay attention to how close query results conform to the local cache.During the disconnection period we cannot ensure consistency of a query result supporting a user's semantics.At this point in time, recent research has proposed semantic cache techniques for gaining information that includes users' interest and semantic method [4,[12][13][14][15].The semantic cache maintains semantic descriptions and contents of previous queries to the mobile client.The semantic cache transmits total data not as a page unit but as a semantic region or a semantic segment unit.
If the semantic cache is used with a main cache, we can reduce communication cost and network traffic.This is because clients only transfer remainder queries in the case of probe and remainder queries.Semantic cache strategy has overcome a drawback occurring with disconnection states in mobile computing environments, combining data and semantic descriptions.The semantic cache is composed of a set of elements referred to as semantic descriptions.We represent the semantic description S d = {S R , S A , S P , S L , S C } to form [7,9,[16][17][18] If we are aware of correct coordinates of a particular region regarding location information, a mobile client can consider established queries based on these coordinates or the specific region.The semantic cache strategy can yield expected results using a set of elements called the semantic predicate.Research of cache replacement regarding LDD have been proposed by Franklin [4] and Dunham [7].

Prefetching
The prefetching technique reduces the cases of cache miss and cost of bandwidth usage.The prefetching technique can expect high efficiency regarding requests as it supports fast query response time.
The conventional prefetching technique is used to overcome the delay condition of processors during execution periods.It is generally divided into hardware prefetching and software prefetching [8].The hardware prefetching does not need the assistance of a programmer or compiler.Without taking the trouble to recompile, it can dramatically improve the speed of existing programs.It does not increase the size of program, as unnecessary instructions are not added to the code base.As a result, hardware prefetching can apply to data and cache for the purpose of efficiency, whereas software prefetching is applicable for specific applications.As a software prefetching is not considered as part of the hardware, it can be inexpensive, and therefore, a benefit in reducing prefetching time.
Our paper focuses on data prefetching among several prefetching techniques, which are components of hardware prefetching.Moreover, our paper represents some methods to utilize the data prefetching in mobile environments.Our paper overcomes problems of data consistency using the semantic description.

Infostation
As the necessity of data communication becomes more important within a cellular environment, mobile information systems are making a new attempt to adjust infrastructure for data.Many researchers are processing a new challenge, which is to consider data communication and multimedia service in a cellular infrastructures.
For construction of new architecture, we must consider one of two solutions.One approach is to make an exclusive transmission path for data, remaining at an existing base station.Another approach is that the existing base station is modified, yielding an expanded structure.In this way, the data communication approach is a continuously progressing effort to accomplish new challenges, passing on the goal of sending suitable information, to users.
Infostation provides wireless information services in a limited coverage area.As users with notebook computers or PDAs pass an Infostation, they automatically receive useful information, without stopping to enter requests.This could be information that is most relevant near the Infostation, such as restaurant data, and university course information.Or it could be information of general interest, such as news, articles or music.
Research regarding Infostations are focused on data services, supporting high bandwidth availability with low transmission times [6].These service areas guarantee stable transmission without frequent disconnections.In developing a new technology such as 3 G or 4 G, data communication services deliver multimedia data packets.Since our paper applies to the prefetching technique, a system model explains expanded base stations (eBSs) that are similar to the Infostation concept.Infostations, having a high bandwidth, -ultra high speed access(n 100 Mbps), providing data services with high bit rate and capacity at the application level.If an Infostation is used in a congested area that includes many data and users, system performance will improve.Figure 2 represents the Infostation environment.
In Fig. 2, a mobile client obtains location dependent data in an Infostation area, and the GIS server generates geographic information based on the area.The location of each mobile client is recognized by the GPS satellite.Each Infostation is connected in a wired environment.GIS DB1, GIS DB2 and GIS DB3 represent a database composing LDD about each region.

The expanded Base Station(eBS) Model
The proposed model is based on client-server architecture divided under an eBS area.The eBS supports data transmission using high bandwidth.Our system model assumes that mobile clients can get LDD related to a query from the server, such as in Fig. 3.
We assume a few situations for the system model.First, a mobile client has visited each eBS area at least once.The second is that each server in the eBS area retains log files left by users.Log files are written by query contents represented from semantic descriptions, and are managed by the server.A mobile client manages to cache and query processing information.Our cache replacement strategy considers that semantic descriptions and data in a client cache is efficiently managed with a semantic description and data received from the new prefetching technique.We define a region segment as composed of semantic descriptions and data, and thus these contents will be specified for mobile information systems.
The prefetching technique statistically sends the most frequently used information in the current location, but our proposed technique referred to semantic prefetching, sends semantic information based on users' log files.
Our system model assumes that the network architecture is organized into two cases.One is continuously constructed under a cellular system of one eBS per cell.The other is only constructed by a discrete architecture offering one eBS per fine-grained area.These are necessary in order for users to support a volume of informational content.The present wireless environments are continuously attempting to change the existing infrastructure for data communication, therefore this paper assumes the former case for utilization under an identical data communication model.

Semantic prefetching
Our scheme with semantic information could help the users obtain information relating to their interests.Although users pass by an eBS area at high speed, a semantic prefetching scheme correctly supports requested data with semantic information.We examine this semantic prefetching technique in detail.

Basic concepts
The cache contents of a mobile client must be corresponded to within a query.A server sends information to a mobile client, expected for future use with a semantic description defines to the relation and F A defines the attribute.F P represents the predicate, expressing the location range of the mobile client, referencing the F L which means prior location.Finally, the F C are contents satisfied to locations visited previously.This paper describes cache replacement strategy and re-definition of LDD with semantic description using the AND, OR operations.It is applied to each Semantic-Least Recently Used(S-LRU) strategy and Semantic-Least Frequently Used(S-LFU) strategy.

Definition 1. Given a Log File
Semantic prefetching is composed of a set of items, defined as a prefetching description.
Definition 2. Given a Log File L f = {R i } and its attribute set where each P j is a conjunctive of compare predicates, i.e.P j =b j 1 ∧b j 2 ∧. ..∧b j l , each b jt is a compare predicate and F L = n i=1 m j=1 ( L i ,L j ) Lemma 1 and 2 are formally defined by the existing semantic segment used in a semantic cache mechanism.In definition 3 and 4, we assume that the user selects the function θ for the purpose of comparing prefetching descriptions to semantic descriptions.
Let S, T and W be sets with semantic descriptions where θ is a sorting function.If we suppose that S-LFU follows the frequency of the query divided into two parts: hot data and cold data, by users' semantic description.In S-LFU, the factor of θ is frequency.Whereas, we suppose that S-LRU follows the recent access of the query: recent data and old data by users' semantic description.In S-LRU, the factor of θ is time.
In this situation, LDD has an effect on F L and S L .F L are previously physical coordinates queried from the current area, which is maintained to the server.Also, S L are physical coordinates saved to the cache.Cache must be renewed to the current location.In addition, when a query part of a cache accords with the query part of F L received from an eBS, the cache will maintain its probe query, the remainder query is again transferred to the server through the general BS.As a mobile client can go into a new eBS area in several directions, results of the request vary according to the location.When a mobile client enters a new area, how the client cache is organized is important.First of all, the common part, referred to as part of the unified contents, and its remainder part are organized into a set table maintained by five semantic description elements.
In this paper, our scheme manages LDD maintained in server and client cache.Data maintained in the cache must conform to user queries.This is satisfied with a semantic description table.The data and semantic description can make one part a common set and the other part an ordering set.Ordering sets choose between the order of query frequency and the order of the latest query.Therefore, although a mobile client moves in a eBS area, the data and semantic description in the cache keeps up their validation.

Query processing
A segment of LDD in a query maintains a semantic description and data content.After making a new segment part under continuous queries like Fig. 4, the semantic segment is disjoined to probe queries and remainder queries.Probe queries are parts accorded with the semantic description table reconstructed by semantic prefetching.Concurrently, remainder queries are not accorded with the semantic description table.If the semantic segment includes well-analyzed query information, remainder queries will decrease considerably.Consequently, we can get responses regarding requested queries using probe & remainder queries.The semantic cache strategy has increased the hit ratio within a mobile client cache in an existing network architecture, but if a user in a fine-grained area wants to get information, the semantic cache strategy makes it impossible to support large-scale contents of LDD.Semantic prefetching strategy overcomes these limitations.Minimizing the contents of remainder queries, semantic prefetching strategy reduces additional communications cost and query response time.

Cache management scheme
When caching the received data, we would also store the received semantic description in the cache.After inserting the received semantic description into the existing cache contents, under the S-LRU strategy, the cache can be reconstructed in an intersection and union part.The former is a correspondence part and the latter includes intersection part and all parts of both the semantic descriptions.A remaining part of the cache can be maintained.
In this paper, including semantic information, cache management schemes have two different kinds of techniques such as an S-LRU strategy, which is least managed by recently used data, and an S-LFU strategy, which is least managed by frequently used data.Our simulation performs the S-LRU strategy for several experiments.
The S-LFU strategy will deal with statistical data related to the user's frequency after the implementation.If the cache data of a mobile client fully contains a user's query, the user can immediately get data from the local cache.The cache size of a mobile client is limited, so it is important to manage several data streams received from server, using a filtration like replacement strategy.In the S-LFU strategy, hot data and semantic descriptions are represented in the order of high frequency information.On the contrary, cold data and semantic descriptions are represented to the order of low frequency information.The server maintains both hot and cold information by statistically calculating the frequency of contents.In this case, the server can calculate the frequency by counters and information can be organized for a mobile client cache, considering the order of frequency, when a mobile client requests information.The determination of how to consider the order of frequency is adjusted in a user's environment during implementation.
We propose the S-LRU strategy for managing our cache like Fig. 6.In Figs 5 and 6, the total cache size defines N and the current cache size can represent the K/N % notation by the arbitrary K value, and intersection part also denotes the LData+(Sd c ∩ Sd F ).In the current cache size, the remainder part, not including the intersection part denotes Sd remainder .The remainder part of the current cache size defines the K part.That is, we can represent to the (1 − K/N )% notation equal to the K part.In 6, the contents of the cache are maintained in the data and semantic description.We denote the N part, which is oversized to the cache size N .If a mobile client receives the excessive data under the semantic prefetching scheme, the cache deletes the contents of N part.
The relation between a server and a mobile client is shown to Fig. 7.According to the progressing step of Fig. 7, although the eBS network has a disconnection status, the mobile client can get the result for a requested query from the updated cache.The server maintains the semantic description part and sends a users' updated information made by a former semantic description, to a client when the client enters the area.In this case, overhead occurs in the server, since it has to maintain several users' information.This paper has no consideration regarding a server's overhead.
Execution functions of server and mobile client show the server and mobile client algorithm.This is represented to each execution part.
Example 1. supposes a mobile client moves around the eBS area.The server side action is determined from a mobile client's location.When a user enters an eBS area, the server transmits predicted information in advance.For efficient transmission, the server collects data based on previous information and semantic predicate.This is the main contribution of ideas called to semantic prefetching.The ability of the server depends on data prediction.
In the server algorithm, there are two functions, F ind data and Send data, each represented to prefetch requesting data.The F ind data function identifies the fact that the location of a client changes from one eBS to another using the GPS, updated data, and other necessary data is found using information organized in log files.Furthermore, this function maintains log files in the database when a mobile client leaves this area and periodically updates data.After finishing the F ind data function, the Send data function sends the requested data to the client.The server transfers prefetching contents from the client using the Send data function.
Example 2. Suppose that a mobile client exists in an eBS area.Immediately on entering the eBS area, a mobile client receives semantic prefetching data.Although the mobile client keeps a cache, a user

Performance evaluation
We evaluate the performance of our model through extensive simulations using scenario data.We describe our simulation environment in Sections 5.1, 5.2, 5.3 and 5.4.We demonstrate our experiment results in Section 5.5.

Resource and model parameter
The cache strategy experiments using semantic cache and semantic prefetching consider the total cache hit.The factor sets each value of parameters.It is compared with the cache hit ratio regarding the In this paper, the replacement strategy regarding cache hits utilize the S-LRU strategy.Having specific bandwidth, we suppose that our network utilizes the FIFO queue model.We suppose that networks for the eBS and the BS both exist in our model together.Using 5*5 service area for experiments, the total number of queries requested by mobile client is 400.We suppose that the number of continuous queries per area is 10.The number of cache hits regarding the probe query differs depending on the size of the query segment.In our simulation, we have several experiments with various sizes of query segments.A total of 30 pieces of data is used in our simulation.We implement 400 pieces of LDD, including each data type, set into the service area fixed to the eBS.Our eBS area are divided into physical 100*100 coordinates.Using the coordinates, we represent the location of mobile client and LDD.
During this papers research, consideration related to the update is out of the question, as the overhead pertaining to update operations, doesn't apply.Query information and previous data used by a user is logged to the server.When a client re-enters the service area, the user can receive information.The server must support the service and database of users.For a semantic cache and semantic prefetching strategy, semantic regions are maintained through buffer management for the user.In semantic cache and semantic prefetching techniques, we use semantic information for supporting the proposed scheme, no consideration is given for an overhead of indexing management for mobile clients.Parameter values used to compare and analyze our simulation are determined depending on the number of user's queries, the size of the query segment between the normal semantic cache and semantic prefetching semantic cache.

Simulator for semantic prefetching technique
Our simulator used for verification is implemented in the Java language.We simulate and compile a model using a Sun Blade 1000.Experimental environments are of client-server architecture, used for mobile communication.The simulator is made of an eBS class and a client class, applying methods of the client-server architecture.An eBS class includes a method to manage data made from an experiment of uniform distribution and zipf distribution, and client class represents the cache with current location coordinates for clients, data attributes, and segment size together.We suppose that the client has visited the area at least once.Therefore, at first, users can receive the data using previous information.Data used to our simulation is prefetched with each distribution method, and query patterns sent from the client is organized differently by moving patterns according to the 4 scenarios.

Simulation assumption and specification
In existing prefetching, the server provides data, frequently used by users, to the client without semantic information.Considering the frequency of data, semantic prefetching provides previous query information and services maintained by semantic information about a user.In this paper, when clients move around a set location, and if characteristics consist of continuous queries, we define semantic relation and semantic attributes as one set.If a user queries different attributes of the same relation, we assume that it is a different data type.The data gained from our simulation is LDD.The semantic predicate including the condition attribute, is organized by the segment area, having a specific distance from the current point.When the query is processed, the queried area is overlapped with a nearby cell area, we define the query part as assigned to the physical address for the relevant area.We repeatedly organize the moving pattern by visiting the 20 areas.

Scenario
In the processing part of our simulation, we use 4 scenarios for presentation of real-world moving patterns.Each scenario determines 30 data types and physical locations of X, Y coordinates of each data represent the same data type using the uniform or zipf distribution.A uniform distribution, sometimes also known as a rectangular distribution, is a distribution that has constant probability.On the other hand, zipf distribution means a distribution of probabilities of occurrence that follows Zipf's law.
In these 4 scenarios, it is important to determine the moving pattern of clients.We can use a random walk model for real-world moving patterns, but we set specific moving characteristics of the client for the simulation as follows.Examining 4 scenarios, we consider the different mobility and query patterns.As we discuss the scenarios, we compare traditional semantic cache to prefetching semantic cache, applying uniform and zipf distribution.

Experiment results
As shown to Fig. 10, the point represents the cache hit ratio using the semantic prefetching strategy is higher than the traditional semantic cache.We try to adjust factors such as proper cache hit, moving pattern, and query pattern.We compare traditional semantic cache to semantic prefetching using uniform distribution and zipf distribution, depicted in Fig. 10.
As started above, we adopt several situations: the cache hit ratio of data requested by mobile client, determination of different or similar query pattern, definition of low or high locality, space of the remaining local cache and re-visiting number of the area by client.
The size of a segment depends on the request type of a user.In this experiment we assume that the query pattern of a user is adequate and regularly adjusts the size of a segment about each query.
Figure 11 presents a comparison of the number of cache hits, adjusting the query area, that is, the size of a segment according to each scenario.The size of a segment is determined as the size of a regular quadrilateral made by the radius parameter, setting the current location of the user.If the parameter sets a large value, the probability of cache hits relating to the probe query increases dramatically.In this Fig.11, each scenario shows that the number of cache hits is remarkably differentiated, depending on the size of segment.In each scenario, the rate of increase greatly reduces the case of the small segment size.Our paper organizes the size the segment by the distance when simulation executes in a proposed modeling.Therefore, we can get some limited results like Fig. 11.We can organize the segment area including the additional elements in the real-world implementation.We consider the additional elements including the geographic character, the process of query information and the dynamic organization of the segment area.In this way, we can understand the improvement of the cache hit relating to the request of the user and how the segment area organizes.
Figure 12 represents the comparison of response time in 4 scenarios.When mobile client requests LDD, the cache plays an important part in rapidly searching data.Semantic prefetching helps the cache support data by providing a rapid response time.Compared with the semantic cache, semantic prefetching graphs show the decreasing response time.Response time is a measure that assures efficiency of semantic prefetching.Distribution of data such as uniform and zipf applies to compare semantic prefetching to semantic cache.
Finally, Fig. 13 shows the number of bytes sent across the network per query.In this case, the use of semantic prefetching results in a significant reduction in message volume.In a network constrained environment, such communication costs may be the dominant factor.Although semantic prefetching has network traffic in the beginning, the mobile client subsequently compiles information in the local cache.
It should be noted that in environments where data communication costs are expensive, the ability to process independently of the server can result in significant savings in addition to performance gains.

Conclusion and future work
In this paper, we propose the semantic prefetching scheme and new cache replacement strategy.By applying the semantic prefetching through an eBS in mobile information systems, we reduce the network traffic.The server keeps semantic information represented by a semantic description and collects data in a query segment area.Although users try to request current location data, we can get LDD from the local cache without additional network connection.With respect to communications cost and availability of the local cache, we have excellent results for mobile information systems.In future work, we will develop semantic prefetching schemes for real-world applications and try various probability based approaches.

Figure 5
represents the S-LFU strategy.It shows the most basic cache management of the S-LFU strategy.

Fig. 9 .
Fig. 9. Mobile client. is limited in gaining a cache hit.We consider a new cache replacement strategy for using semantic prefetching.Mobile clients select cache contents and delete useless data, depending on the cache size.In a mobile client algorithm, there are four functions, represented by Receive data, Compare data, Select data, and Delete data.The Receive data function adds the contents of prefetching data to the current cache.The Compare data function remakes the contents of the cache by comparing contents received by semantic prefetching the contents of the current cache.The Select data function maintains contents of the new client cache.Finally, the Delete data function determines whether to delete after checking the cache size.
. S R represents relation, S A indicates attribute, S P marks predicate with location information named to S L .S C has contents of data.The examples show a statement of the semantic cache query.L x , L y represent the user's x, y coordinates in current position, and h xpos , h ypos , d xpos , d ypos represent the location of the LDD requested.This example shows that it reconstructs general query statements into an SQL pattern.

Table 1
, and query segment size.For this simulation, we basically coordinate one server per eBS.Table1represents parameters used to our simulation model.CACHE MAX SIZE sets from the 10% of the S CACHE MAX SIZE.After querying, the mobile device sequentially checks the contents of the cache for responding to the query.