Multiscale Feature Model for Terrain Data Based on Adaptive Spatial Neighborhood

Multiresolution hierarchy based on features (FMRH) has been applied in the field of terrain modeling and has obtained significant results in real engineering. However, it is difficult to schedule multiresolution data in FMRH from external memory. This paper proposes a new multiscale feature model and related strategies that cluster spatial data blocks and solve the scheduling problems of FMRH using spatial neighborhoods. In the model, nodes with similar errors in different layers are placed in one cluster. On this basis, a spatial index algorithm for each cluster, guided by the Hilbert curve, is proposed. It ensures that multiresolution terrain data can be loaded without traversing the whole FMRH; therefore, the efficiency of data scheduling is improved. Moreover, a spatial closeness theorem for clusters is put forward and proved. It guarantees that the union of the data blocks composes the whole terrain without any data loss. Finally, experiments have been carried out on many different large-scale data sets, and the results demonstrate that the schedule time is shortened and the efficiency of I/O operations is apparently improved, which is important in real engineering.


Introduction
Terrain models are widely applied in actual engineering, such as film, games, and simulation. One of the grand challenges is that they require huge data sets in order to guarantee precision. Especially in recent years, with the rapid development of spatial visualization technology, the requirement for terrain detail has become higher and higher; therefore, terrain scenes are more and more complex. Owing to the massive data, the establishment, management, and indexing of terrain models is very difficult in scientific visualization engineering.
The capacity of RAM is limited for storing huge data sets; therefore, the support of external storage is needed when rendering large-scale terrain. According to the requirements of the scene, it is important to dynamically load data into memory for rendering the scene in real-time visualization engineering. However, there is a serious speed gap between RAM and external storage, so frequent interactive operations between them directly affect the efficiency of the system.
This has become a bottleneck in the processing of large-scale data. As a result, how to establish a model for huge data sets and provide an effective scheduling method is one of the research hotspots.
The pyramid is a kind of data model composed of multiresolution levels [1]. It can provide data at all kinds of resolutions without real-time sampling, because the model generates the different resolution levels of the digital elevation model (DEM) in advance. It has been widely applied in commercial software. However, it has two drawbacks.
(1) The pyramid model increases the storage space of the data; that is to say, a lot of data is repeated on its different layers. Hence, decreasing the amount of loaded data comes at the cost of consuming external storage.
(2) There is still a mass of redundant data in the data set loaded into memory. No matter whether the terrain is plain or rugged, in the classical method the data resolution within each level is uniform, and interlaced sampling is adopted between adjacent levels, which does not consider the geometric features of the terrain.
Lindstrom and Silva [2] proposed a dynamic data loading method using file mapping. This method is extremely simple, and it takes advantage of the operating system to divide data pages automatically. Dai et al. [1] brought up a method of incremental data update, which dynamically updates local terrain data according to the movement of the viewpoint and the geometric center of the data page. Li et al. [3] adopted the technique of incremental horizon to eliminate the invisible parts of the model.
To further improve the efficiency of data scheduling, the simplified grid model is expressed as a data stream by many scholars [4]. Space-filling curves, including the Hilbert curve [5], Z curve [6], and Π curve [7], are a common approach to linearizing terrain data in external storage. Even though those methods improve the efficiency of data scheduling in different ways, indexing external storage still takes additional time. In recent years, new approaches based on cluster analysis have been proposed to simplify the data model in external storage and improve the efficiency of data scheduling [8, 9]. But the out-of-core methods based on clustering are almost all aimed at irregular data sets, whereas the primary research object in terrain modeling engineering is large-scale grid data; therefore, the methods mentioned above are not capable of handling the large-scale data in real engineering. In addition, parallel algorithms using hardware to deal with out-of-core terrain data have been proposed, but they are not widely applied because of their requirements for high-end hardware [10].
This paper utilizes a multiresolution out-of-core model based on geometric terrain features. It has the following advantages. (1) It consists of a physical model and a logical model, in which the physical model is the original full-resolution data, so the model is independent of the data scale. (2) The logical model is a multiresolution hierarchical structure over the physical model, which establishes the index of data blocks according to the geometric features of the terrain. (3) It decreases the amount of data loaded into memory while keeping the external storage constant.
In this paper, we propose a strategy of scheduling data based on the logical model after cluster analysis. It decreases the time cost of searching for target data in external storage by clustering the data blocks whose static geometric errors are similar. Meanwhile, we put forward a method that sorts the data blocks in a cluster via a space-filling curve and presents an encoding strategy for the multiresolution terrain model. It resolves two primary problems: (1) mapping terrain data blocks from a two-dimensional structure into one-dimensional form; (2) intercepting the data blocks in a cluster for the multiresolution model in a local area.

Related Work
As we all know, the effect of 3D modeling based on vectors is usually good; however, there are still many problems, such as processing large-scale data and accelerating rendering. In recent years, several scholars have achieved significant research progress in the following aspects.
Level of detail (LOD) is a classical simplification method for multiresolution models. There are three categories of LOD algorithms, namely, triangulation algorithms based on quadtrees [11], simplified algorithms based on adaptive grids [12], and progressive mesh algorithms [13].
The triangulation algorithm based on a quadtree divides one data block into four blocks according to a screen error and the viewpoint. Its advantage is that the hierarchy is built using quadtree codes, which makes it easy to solve the crack problem of adjacent blocks along common edges. Representative methods include the algorithm based on a constrained quadtree [14], the simplified method based on an implicit restricted quadtree [15], and so on. The advantage of this kind of algorithm is that it adopts an efficient strategy to refine layers from the top to the bottom. However, a mass of information for each quadtree node needs to be stored, so it consumes massive storage resources.
The simplified algorithm based on adaptive regular grids evaluates whether model data should be eliminated according to the model error and then decides how to divide the grid area adaptively [12].
The progressive mesh algorithm adopts embedded grids to remove cracks between layers by setting a "skirt" for the different multiresolution layers [13]. The literature [16, 17] takes advantage of geometry clipping to optimize the algorithm, creating data buffers in order to accelerate real-time roaming of the terrain scene. Meanwhile, Sun et al. [18] described a technique to compress texture based on the mipmap structure. Although this kind of algorithm improves model precision, it needs higher CPU consumption, so there are difficulties for general 3D vector models when dealing with large-scale data sets.
Cluster analysis is also a valid strategy for solving problems in the construction of multiresolution models, and it is widely applied in many areas. The literature [19, 20] described methods that solve the problem of computing coordinate means by using cluster analysis.
When rendering a multiresolution terrain model, it is necessary to deal with huge-scale data. Jeong et al. [21] proposed a new method to render complex scenes that would have been hard to imagine rendering in early years. With the development of graphics hardware, the GPU has become an important computing resource for dealing with huge-scale data. The literature [22, 23] shows more details of this new approach.

Cluster Analysis Based on Similar Error
3.1. Cluster with Similar Error. Multiresolution hierarchy based on features (FMRH) is a structure built on the physical model. According to the structure of FMRH proposed in the literature [24], a mass of redundant data is eliminated from the view of terrain features, so that the multiresolution model can be stored in the minimum capacity of external storage. However, the efficiency of scheduling data from external storage is not improved merely because the amount of physical data decreases. In fact, the level of a data block is not inherently tied to its error; that is to say, the errors of data blocks on the same level may not be similar. The error of the current data block is related only to its source (parent) data block or its sub-data blocks, because the hierarchical structure proposed in our paper is built by a depth-first traversal. As shown in Figure 1, B^l_1, B^{l+1}_{13}, and B^{l+2}_{133} are blocks on the same branch in FMRH, and they satisfy the following relation:

CovSE(B^l_1) ≥ CovSE(B^{l+1}_{13}) ≥ CovSE(B^{l+2}_{133}).  (1)

The errors of nodes on the same layer, such as B^{l+1}_{13} and B^{l+1}_{14}, are uncorrelated, and the error difference between B^{l+2}_{133} and B^{l+2}_{143} may be great, so data cannot be loaded into memory layer by layer. Hence, it is necessary to traverse the whole hierarchical structure in external storage to load data. As a consequence, selecting multiresolution data blocks on different levels increases the burden of rendering the scene in real time.
A cluster algorithm that satisfies the spatial constraint and the maximum static error of closure based on FMRH is presented in this section. The main idea of this algorithm is to partition the nodes in external storage that have similar errors into clusters and to propose a method of spatial indexing for each cluster. According to this strategy, the multiresolution terrain data of the current scene is obtained rapidly for each frame, error threshold, viewpoint position, and visible area, without traversing the entire FMRH structure of the terrain.
According to the literature [24] and formula (1) in this paper, the static spatial error threshold can be computed, and a series of error thresholds is obtained by adjusting the parameter. The bigger the threshold, the smaller the amount of extracted data and the lower the resolution of the model; conversely, the smaller the threshold, the larger the amount of data and the higher the resolution in memory. Let ε_max^metric and ε_min^metric be the maximum and minimum static spatial error thresholds. ε_min^metric equals the threshold used when establishing the FMRH; namely, the parameter is set to 0.03. ε_max^metric is the basic condition which preserves the profile feature of the terrain, so the parameter is set to 0.2 in the experiment. The threshold ε_min^metric guarantees the higher-resolution data required in memory by FMRH, while ε_max^metric provides the lower-resolution data which maintains the profile feature of the terrain under the lowest requirement of the simplified model. Thus, the static errors of all data blocks in FMRH range from ε_min^metric to ε_max^metric.
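As a concrete illustration, a series of thresholds between ε_min^metric and ε_max^metric can be generated as follows. The geometric spacing is our assumption for the sketch; in the paper each threshold comes from formula (1) by adjusting its parameter between 0.03 and 0.2.

```python
def threshold_series(eps_min, eps_max, count):
    """Generate `count` static error thresholds from eps_min up to eps_max.

    Geometric spacing is an illustrative assumption; in the paper each
    threshold is derived from formula (1) by varying its parameter.
    """
    if count < 2:
        raise ValueError("need at least two thresholds")
    ratio = (eps_max / eps_min) ** (1.0 / (count - 1))
    return [eps_min * ratio ** i for i in range(count)]
```

Any monotone spacing would do; what matters for the clustering below is only that the thresholds partition [ε_min^metric, ε_max^metric] into disjoint intervals.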
Definition 1. Let ε^i_metric and ε^{i+1}_metric be two adjacent static error thresholds. A similar error cluster (SEC) C_i is the set of data blocks of FMRH whose static errors fall in the interval [ε^i_metric, ε^{i+1}_metric), where it satisfies the following constraints: (1) no data block in C_i is an ancestor of another data block in C_i; (2) the union of the areas covered by the data blocks in C_i is Ω, where i can be the index of every spatial layer in the structure of FMRH and Ω denotes the entire closed terrain space.

Cluster Algorithm Based on FMRH.
A cluster instance is described in Figure 2; Figure 2(a) is a cluster which satisfies the constraints in Definition 1. The data blocks marked within the transparent rectangle build the terrain area in Figure 2(b). These data blocks are distributed over four levels of FMRH and satisfy a uniform error threshold. They obviously construct a closed spatial area, which is proved in Section 3.3.
According to the construction algorithm of FMRH [24], the static error of closure of each data block is recorded when constructing the FMRH model. Hence, the work is to traverse the structure of FMRH and put data blocks with similar static errors into the proper clusters. In order to satisfy the second constraint in Definition 1, it is better to traverse the structure of FMRH with a depth-first strategy. More details of the cluster algorithm based on FMRH are shown as follows (Algorithm 1).
According to Algorithm 1, data blocks in the same error interval are aggregated into one cluster; these data blocks of different resolutions are distributed in different layers, and their union covers the terrain space of FMRH.
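The core of the traversal can be sketched as follows. This is a simplified reading of Algorithm 1, not the paper's exact pseudocode: each block is assigned to the cluster of its error interval, and a block is skipped when an ancestor already sits in the same cluster, which is the rule that prevents overlapping coverage.

```python
import bisect

class Block:
    """An FMRH data block: static error of closure plus child blocks."""
    def __init__(self, error, children=()):
        self.error = error
        self.children = list(children)

def cluster_fmrh(root, thresholds):
    """Depth-first clustering of FMRH blocks by static-error interval.

    thresholds is an ascending list [e_0, ..., e_m]; cluster i collects
    blocks whose error lies in [e_i, e_{i+1}). A block is skipped when an
    ancestor already sits in the same cluster, keeping the covered areas
    of a cluster from overlapping (second constraint of Definition 1).
    """
    clusters = [[] for _ in range(len(thresholds) - 1)]

    def visit(block, ancestor_cluster_ids):
        i = bisect.bisect_right(thresholds, block.error) - 1
        i = max(0, min(i, len(clusters) - 1))
        if i not in ancestor_cluster_ids:
            clusters[i].append(block)
            ancestor_cluster_ids = ancestor_cluster_ids | {i}
        for child in block.children:
            visit(child, ancestor_cluster_ids)

    visit(root, frozenset())
    return clusters
```

Because errors decrease monotonically down a branch (formula (1)), the blocks of one branch normally spread across distinct clusters; the ancestor check only fires in the special case the next section discusses.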

Proof of Cluster Space Closeness Proposition.
The majority of data blocks with different resolutions on the same branch are distributed into different clusters by the depth-first traversal, and the static errors satisfy formula (1) along a branch from the top to the bottom in FMRH. The special case is that the errors of data blocks which are adjacent or distributed in close layers on the same branch are similar, so that they fall into the same error threshold interval. As shown in Figure 3, three branches are marked by black dotted lines. Assume that there is only a small difference between CovSE(B^{l+1}_{13}) and CovSE(B^{l+2}_{134}); then B^{l+1}_{13} and B^{l+2}_{134} would be classified into one cluster if only the error threshold interval of each data block were considered. But this goes against the second constraint in Definition 1; in fact, there is an overlapped partition in space between B^{l+1}_{13} and B^{l+2}_{134}. To solve this problem, in Algorithm 1 the current block is not classified into cluster C_i if an ancestor data block is already in C_i. Thus, there is no overlapped area among the regions covered by the different data blocks of each cluster.

Algorithm 1 (the cluster algorithm based on FMRH). Input: the external-storage structure of FMRH. Output: the set of clusters C = {C_1, C_2, ..., C_{m-1}}. Description: this algorithm generates the set of clusters by traversing the structure of FMRH in depth-first order, according to the relationship between static errors and thresholds.

Theorem 2. The union of the data blocks in each cluster C_i generated by Algorithm 1 composes the entire closed terrain space; that is, ⋃_{B ∈ C_i} B = Ω.

Proof. We use induction. Let L be the maximum spatial layer in FMRH.
When L = 0, there is only one spatial layer in FMRH_0. According to Definition 1, C_0 = {B^0_1} consists of a single data block, so the space is obviously closed.
Assume the theorem holds when L = k; that is, for any cluster C_i generated on FMRH_k, ⋃_{B ∈ C_i} B = Ω. When L = k + 1, FMRH_{k+1} either adds one layer of data blocks on a branch of FMRH_k (denoted as Path A) or keeps the branch unchanged (denoted as Path B). In the case of Path B, since there is no change on the branch, all clusters generated on the basis of FMRH_k are already closed in space by the induction hypothesis, so the closeness proposition holds. The remaining work is to prove the spatial closeness proposition for the branches of Path A.
Without loss of generality, choose any branch on Path A (shown in Figure 3) and any cluster C_i. According to the depth-first traversal in Algorithm 1, the current data block B^l_j on the branch, where 1 ≤ l ≤ k + 1 (Figure 3), falls into one of two categories. If an ancestor data block of B^l_j is already included in C_i, then B^l_j is not added to C_i before traversing the (k+1)-th layer. Otherwise, B^l_j and its three brother data blocks are added to C_i by the recursive traversal of the four sub-data blocks in step 2.2.2. In the former case, C_i is unchanged, so the closeness proposition still holds. In the latter case, B^l_j and its three brother data blocks together exactly cover their father data block, so the closeness proposition also holds. This ends the proof of Theorem 2.

Index of Levels of Data Blocks in Cluster
The union of all data blocks in each cluster composes a closed terrain space through our algorithm. When loading data, the proper cluster is obtained accurately as long as the range of the error threshold is ascertained. Therefore, multiresolution data is loaded with no redundancy and without further searching and judging. However, one frame of a roaming scene requires only a small part of the data to be loaded into memory. Thus, encoding and sorting the data blocks in each cluster by spatial characteristics is necessary in order to establish the relationship between the levels of the multiresolution model and the data blocks in the cluster. It also realizes rapid indexing of local data blocks.
As a matter of fact, the reordering of large-scale grid data is widely applied in terrain visualization, volume rendering, and matrix operations, and its advantage is that it preserves the locality of data access. The best strategy for linearizing multidimensional data is the technique of space-filling curves. The reordered data can be intercepted along the filling order according to the requirements, while the closeness property in space still holds. This characteristic corresponds to the goal of indexing the feature clusters and takes advantage of the locality of space-filling curves. It also enables the interception of data in any range.

Strategy of Hilbert Space-Filling Curves
This paper focuses on two-dimensional space-filling curves, because elevation data can be mapped into a two-dimensional space. The characteristics of these curves are listed as follows.
(1) They are guaranteed to traverse every data block in the space, and data blocks that are adjacent along the curve are also adjacent in a local area of the space. (2) It is easy to form triangle strips for real-time rendering, and shared vertices can be used more than once, which takes advantage of modern graphics hardware resources. This paper adopts Hilbert curves to guide the index coding of data blocks in clusters, which improves the efficiency of loading terrain data.
The basic approach to generating Hilbert curves is recursion, formed by subdividing the current square into four smaller ones and connecting their centers. The literature [5] proposes a series of indexes for the Hilbert curve along different coordinate axes. However, the Hilbert curve cannot be put into use directly, because the data blocks in each cluster are presented at multiple resolutions.
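For reference, the classic bit-manipulation form of the two-dimensional Hilbert index can be sketched as follows; `order` is the number of subdivision levels, so the grid is 2^order × 2^order cells.

```python
def hilbert_d2xy(order, d):
    """Convert a distance d along the Hilbert curve of a
    2**order x 2**order grid into (x, y) cell coordinates."""
    x = y = 0
    t = d
    s = 1
    while s < (1 << order):
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:               # rotate/reflect the quadrant if needed
            if rx == 1:
                x, y = s - 1 - x, s - 1 - y
            x, y = y, x
        x += s * rx               # move into the selected quadrant
        y += s * ry
        t //= 4
        s *= 2
    return x, y
```

Consecutive indices map to neighboring cells, which is exactly the locality that the index coding of clusters relies on.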
It is necessary to consider the level information of data blocks in order to apply the Hilbert curve to the multiresolution model. The train of thought is as follows. First of all, ascertain the filling order of the child data blocks (whose number is at most four), which is determined by their father data block, by using the level information of the data blocks in the cluster. The traversed data blocks form a multiresolution filling curve in a certain arrangement from the top to the bottom by repeatedly executing the above judgment. The details are stated as follows.
(1) Divide the data blocks in cluster C_i into four types: I, II, III, and IV. This division plays two roles. On one hand, it determines the sequence of the sub-data blocks of the current data block along the space-filling curve. On the other hand, it takes advantage of the self-similarity of the Hilbert curve to summarize production rules. Therefore, once the type of a sub-data block is ascertained, it can be used to recognize the position in the curve of the data blocks of the next layer.
(2) Sort the sub-data blocks according to the type of the current data block. For each of types I, II, III, and IV, the four sub-data blocks are marked 0 to 3 in the order determined by that type.
(3) Let the type of the current data block be T, and let the type of each of its child data blocks be one of T_0, T_1, T_2, and T_3. In order to determine the types of the child data blocks, the following production rules are proposed.
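The exact production rules depend on the chosen base orientation. One standard Hilbert orientation table, which is our assumption and not necessarily the paper's exact labeling of types I-IV, together with a traversal that emits multiresolution leaves in Hilbert order, might look like this:

```python
# For each block type: the visiting order of its four sub-blocks as
# (qx, qy) quadrant offsets, and the type each sub-block inherits.
# One standard orientation table; the paper's labeling may differ.
HILBERT_RULES = {
    "I":   ([(0, 0), (0, 1), (1, 1), (1, 0)], ["II", "I", "I", "IV"]),
    "II":  ([(0, 0), (1, 0), (1, 1), (0, 1)], ["I", "II", "II", "III"]),
    "III": ([(1, 1), (1, 0), (0, 0), (0, 1)], ["IV", "III", "III", "II"]),
    "IV":  ([(1, 1), (0, 1), (0, 0), (1, 0)], ["III", "IV", "IV", "I"]),
}

def multires_order(node, x=0.0, y=0.0, size=1.0, btype="I"):
    """Yield the leaf blocks of a quadtree in Hilbert order.

    node is either a leaf payload or a dict {(qx, qy): child}; leaves may
    sit on any level, so the emitted curve is multiresolution."""
    if not isinstance(node, dict):
        yield (x, y, size, node)
        return
    quads, subtypes = HILBERT_RULES[btype]
    for (qx, qy), subtype in zip(quads, subtypes):
        half = size / 2.0
        yield from multires_order(node[(qx, qy)],
                                  x + qx * half, y + qy * half, half, subtype)
```

Because the table is self-similar, the type of a child is fully determined by the type of its father, which is exactly the production-rule mechanism described above.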

Algorithm of Generating Multiresolution Model Using Hilbert Space-Filling Curve
The necessary condition that each cluster can be sorted by a space-filling curve is proved as follows. Due to Definition 1 and Theorem 2, there is at most one data block per branch in the same cluster, and the set of all data blocks in a cluster composes a closed spatial area. Thus, a cluster is a set of non-intersecting, spatially closed data blocks. An instance of a cluster is shown as follows.
By traversing the data blocks in a cluster from the top layer to the lowest layer, a space-filling curve is formed. The data blocks are already sorted in the cluster, because each cluster is produced in depth-first recursive order. The principle is that when a branch has traversed all of its data blocks in the current cluster, the traversal backtracks to the original data block and then executes the depth-first recursive operation on its brother data blocks. As a result, adjacent data blocks selected in a cluster are the nearest ones in FMRH space. Taking a cluster as an example, more details of generating space-filling curves are described in Algorithm 2.
An instance of coding a cluster is as follows. Taking the area of the bottom-left corner as an example, it contains 11 data blocks.

(1) Divide the new data source into square physical data blocks of size (2^n + 1) × (2^n + 1). For an irregularly shaped data source, the manner of mending boundaries is adopted to form several physical data block files. Then the physical file blocks are arranged so that the original irregular extent of the terrain is retained, while each file satisfies the (2^n + 1) × (2^n + 1) size condition.
(2) Store a physical block using a head file and a data body file. If a block is to be updated, replace the old data file with the new data file and modify the head file; otherwise, add the new head file and body file into the physical model. For each physical block, establish the corresponding logical model and perform cluster analysis. The steps of updating the logical model are as follows.
(2) Get the set of clusters C, and perform cluster analysis for FMRH. (3) Establish the adjacent cluster set for each pair of neighboring physical file blocks.

Multiresolution Data Schedule
Most data schedule methods are view-dependent; they compute the error of the simplified model at the same time as scheduling data, so as to load a multiresolution model and decrease the amount of data loaded into memory. The efficiency of I/O operations is then low, because the burden on the schedule program is too heavy. In our approach, the multiresolution model is established directly, with little calculation, once the error threshold of the model is obtained, because the schedule program is based on the multiresolution logical model and the data block clusters proposed previously. Figure 4 describes the framework of data scheduling based on the above idea. The majority of the work has already been completed in Sections 3 and 4. The questions that still need to be solved are as follows.
(1) Establishing the relationship between the screen error of the view-dependent model and the static error of closure, so as to ascertain the error threshold of the data to be loaded into memory according to the requirements, and to determine to which cluster the data source belongs.
(2) Locating and obtaining the real data block according to the index of the cluster in the physical model.

Algorithm 2. Input: cluster C. Output: the space-filling curve (code arrangement of data blocks) CodeString. Description: this algorithm generates the space-filling curve which contains all data blocks in cluster C by judging the type and position of the data blocks in the cluster. (1) Initialization: (1.1) Define the top layer and the bottom layer in cluster C. (1.2) Extract the first and the last data block from cluster C; according to this, decide the type of the data block in the top layer.

Searching Strategy of Target Cluster
The view-dependent simplification technology selects the proper multiresolution terrain model according to the position and direction of the viewpoint: finer layers are used for terrain near the viewpoint and coarser layers for areas far from the viewpoint. We adopt the static error of closure to realize the multiresolution model in external storage, because there is no viewpoint information when organizing the data in external storage. In the schedule program, the physical data blocks actually needed are determined via the static error, which is calculated from the screen error in order to select the proper cluster index. The relation between the screen error of the view-dependent model and the static error of closure is set up in this section. Let e be the dynamic error threshold of the view-dependent model, and let ε be the static error threshold. Let λ be the number of screen pixels in a unit distance, and let d_min be the nearest distance from the viewpoint to the screen. The relation is stated below:

ε = e · d_min / λ.  (2)

Applying formula (2) and setting the value of e to, for example, 0.5, 1, 1.5, or 2 pixels, the required static error threshold ε of the current multiresolution scene can be obtained. Then the index of the cluster is ascertained by comparing ε with the cluster static error thresholds established in advance. Assume that ε lies in the interval between ε^i_metric and ε^{i+1}_metric; then the multiresolution data blocks recorded by cluster C_i are loaded into memory.
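Under this relation, the schedule-time computation reduces to one multiplication and a binary search over the precomputed thresholds. The sketch below assumes the worst-case projection at distance d_min, as in formula (2); the parameter names are ours.

```python
import bisect

def static_threshold(e_pixels, d_min, pixels_per_unit):
    """Map a screen-space tolerance e (in pixels) to a static error
    threshold. Assumed projection: an object-space error eps at distance
    d covers about eps * pixels_per_unit / d pixels, so bounding that by
    e for every visible d >= d_min gives eps = e * d_min / pixels_per_unit."""
    return e_pixels * d_min / pixels_per_unit

def select_cluster(eps, thresholds):
    """Index i such that thresholds[i] <= eps < thresholds[i + 1]."""
    i = bisect.bisect_right(thresholds, eps) - 1
    return max(0, min(i, len(thresholds) - 2))
```

For example, with e = 1 pixel, d_min = 10 and λ = 50 pixels per unit, the static threshold is 0.2, which selects the coarsest cluster of the earlier example thresholds.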

Data Prefetching and Strategy of Incremental Data Schedule.
For the data prefetching operation, a double-threaded pattern is adopted in order to render the terrain system in real time and guarantee coherent roaming of the scene. Incremental data scheduling is used to decrease the amount of data loaded into memory by taking advantage of the correlation of different frames in space and time. The main thread renders the foreground scene, and the other thread realizes the prescheduling of data blocks stored in external storage.
As for a single file, apply the cluster index of Section 5.1 to ascertain the selected cluster C_i. Determine the codes of the beginning and end data blocks in C_i according to the required position and size of data in the current scene. For example, assuming the codes of the beginning and end blocks are 10 and 24, this part of multiresolution data is clipped and loaded into memory by searching the related physical data block files from the structure of FMRH.
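Because the blocks of a cluster are sorted by their curve codes, clipping the scene's data reduces to extracting one contiguous code run; a minimal sketch (the code values are illustrative):

```python
import bisect

def clip_cluster(sorted_codes, begin, end):
    """Return the contiguous run of block codes in [begin, end] from a
    Hilbert-sorted cluster; the curve's locality makes this run cover a
    compact terrain area."""
    lo = bisect.bisect_left(sorted_codes, begin)
    hi = bisect.bisect_right(sorted_codes, end)
    return sorted_codes[lo:hi]
```

The returned codes are then resolved to physical block files through the FMRH index, as described above.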
The aim of the multifile data schedule operation is to locate which files the required data blocks reside in. It is easy to determine the required data blocks by comparing the latitude and longitude extents of the physical data blocks, recorded in the head file of each data source, with the position of the current scene. There are only five kinds of position relation, as shown in Figure 5, because the data blocks are partitioned by regular rules. A set of clusters corresponds to each physical data block. The method in Section 5.1 selects the proper cluster C_i for each physical data block in the set of clusters and implements the same process for each C_i as for a single file. In addition, it is very important to match the data across the boundary of two files using the adjacent cluster sets. Finally, the obtained data is prepared in the data buffer by the file mapping technology.
Most data schedule operations involve a single file, and the amount of data for one frame of a roaming scene is not large. The maximum number of files is four when the multifile condition is satisfied (illustrated in Figure 5), so at most four file mappings are created. This can be implemented using the file mapping API supported by VC. Another simple method that improves the efficiency of data scheduling and decreases the amount of loaded data is incremental data scheduling. A method similar to the literature [25] is adopted, which forms supplementary strip-shaped data blocks in the direction in which the viewpoint moves and puts them into a data buffer in order to supply data for the next frame. The difference is that our method is based on multiresolution data strips of the FMRH structure, whereas the method proposed in the literature [25] is based on a nested strip-shaped data structure in which each strip consists of data blocks of the same resolution, with no relation to terrain features; it therefore loads much more data than our incremental data schedule. More details on data prefetching and the strategy of incremental data scheduling are described in Algorithm 3.
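A minimal sketch of the double-threaded prefetching pattern described above; the loader function and block codes are hypothetical, and a real system would bound the cache and evict stale blocks.

```python
import queue
import threading

def prefetch_loop(requests, cache, load_block):
    """Background thread body: take block codes from the request queue,
    load them from external storage, and park them in the shared cache."""
    while True:
        code = requests.get()
        if code is None:                 # sentinel: stop prefetching
            break
        if code not in cache:
            cache[code] = load_block(code)

# Usage sketch: the render thread posts the codes predicted for the next
# frame, then later reads them from the cache without blocking on I/O.
requests = queue.Queue()
cache = {}
worker = threading.Thread(
    target=prefetch_loop,
    args=(requests, cache, lambda code: "block-%d" % code))
worker.start()
for code in (10, 11, 12):                # hypothetical strip of codes
    requests.put(code)
requests.put(None)
worker.join()
```

The queue decouples the render thread from I/O latency, which is what keeps roaming coherent even when the viewpoint moves quickly.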

Results and Discussion
We have experimented on ten data sources and analyzed the results, in order to verify the model based on huge data sets in external storage and the validity of the scheduling. These data sources contain GTOPO30 global elevation data; the elevation data of Jilin province, Zhujiang, and the Tian-chi (crater) lake of the Changbai Mountains in China; and the Colorado Grand Canyon, Mount Rainier, Crater Lake, Puget Sound, Seattle, and Yakima in America. The total amount of elevation data is about 8,111,735,217 bytes ≈ 7.554 GB, and that of texture is 21.37 GB. We used a PC with a 3.0 GHz Pentium 4 CPU, 1 GB system memory, and a GeForce FX 5900 graphics card. We used VC++ as the IDE for software development.
We have measured the amount of loaded data in memory and the schedule time for the 10 data sets by setting the screen error threshold to 0.5 and 2.5 pixels, respectively. The resolution of the screen is 1024 × 768; the clipped window size of GTOPO30 and Zhujiang is 129 × 129, that of Jilin province, Tian-chi crater, Colorado Grand Canyon, Puget Sound, and Mount Rainier is 257 × 257, and that of Seattle, Yakima, and Crater Lake is 513 × 513.
Table 1 describes the time and the amount of loaded data when the screen error is 0.5 pixels. There are four situations. The first one loads data from the physical block model directly; without loss of generality, levels of detail are adopted as well, similar to the pyramid model, but the data loaded at the same time is of the same resolution. The second one loads multiresolution data blocks through the FMRH structure in our logical model. The third one loads data blocks by using the hierarchical clusters to index the FMRH structure. The fourth one loads data blocks by using clusters to index the FMRH structure, with the nodes in each cluster encoded and arranged by the space-filling curve. Three results can be obtained from Table 1. First, compared with using physical files, the data quantity in memory and the data schedule time are reduced apparently after using the logical structure of FMRH. Second, the data quantity loaded in memory is invariant, but the data schedule time is shortened obviously after using the cluster-based data schedule. Third, using the space-filling curve on the basis of cluster analysis is significant, as the data schedule time is further shortened. The main reason for these results is that the physical model has been simplified according to the ups and downs of the terrain by the FMRH structure, so a mass of redundant data is eliminated before loading into memory; this ensures that only a small amount of multiresolution data is loaded into memory each time, so the data schedule time decreases. Another reason is that cluster analysis groups data blocks of similar errors in the FMRH structure when establishing the logical model; locating the suitable cluster for a data block during scheduling shortens the time of searching for data blocks and realizes a highly efficient data schedule. Finally, using the space-filling curve to encode and arrange the data blocks in clusters fits the file mapping well; meanwhile, owing to the locality of the space-filling curve, the method can retrieve local data blocks rapidly.
As the screen-error threshold increases, both the quantity of data loaded into memory and the schedule time decrease markedly. Table 1 shows the result of the first load into memory; in practice, incremental data scheduling is adopted when roaming the scene, so as the frame rate increases, the data-schedule time and the I/O response time decrease further. With a screen-error threshold of 2.5, the data quantity loaded in different frames is shown in Table 2. The data quantity and loading time of the first frame are large, while those of the 10th, 100th, and 1034th frames tend to be stable. Compared with the first frame, the data quantity held in memory in later frames decreases markedly and the scheduling efficiency increases significantly, because incremental data scheduling is adopted.
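The incremental scheduling idea described above, where later frames transfer only the blocks that are not already resident, can be sketched as a set difference. The following Python sketch is illustrative only; the block identifiers and set representation are hypothetical stand-ins, not the paper's data structures.

```python
# Minimal sketch of incremental data scheduling: each frame only transfers
# the set difference against the blocks already resident in memory.
# Block identifiers (layer, index) are hypothetical.

def incremental_schedule(resident, required):
    """Return (to_load, to_evict) given the resident and required block sets."""
    to_load = required - resident      # blocks missing from memory
    to_evict = resident - required     # blocks no longer needed by the scene
    return to_load, to_evict

# Frame 1 loads everything; later frames load only the increment.
frame1 = {("L2", 0), ("L2", 1), ("L3", 4)}
frame2 = {("L2", 1), ("L3", 4), ("L3", 5)}

load1, _ = incremental_schedule(set(), frame1)
load2, evict2 = incremental_schedule(frame1, frame2)
assert load1 == frame1               # first frame: full load
assert load2 == {("L3", 5)}          # later frames: small increments
assert evict2 == {("L2", 0)}
```

This mirrors the behavior reported in Table 2: the first frame pays the full cost, and subsequent frames stabilize at a much smaller transfer volume.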

Conclusions
This paper proposed a strategy of rapid data scheduling based on cluster analysis, which improves the efficiency of I/O operations. It is a significant extension of terrain modeling methods for large-scale data in real engineering applications. More details follow.
First, a cluster analysis method based on FMRH is proposed to solve the linearization problem of the multiresolution terrain model. Each cluster satisfies two constraints: data blocks with similar errors are classified into one cluster, and the multiresolution data blocks in a cluster cover a whole closed terrain spatial zone. Because data loaded into memory must satisfy a given screen-error threshold, the first constraint conforms to the data requirement of the simplified model in memory. The second constraint guarantees that each cluster can be loaded as a unit, so that no data block is lost.
Moreover, the proposed strategy sorts the data blocks of each cluster along a space-filling curve and encodes them. Because the nodes within a cluster are close in space and are obtained through depth-first traversal, small adjustments and local updates of the clustering can be mapped to a one-dimensional sequence of contiguous spatial data blocks. The advantage of these strategies is that there is no need to traverse the whole data set.
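The locality claimed here is the standard property of the Hilbert curve: consecutive one-dimensional indices always map to spatially adjacent cells, so a contiguous run of codes corresponds to a compact terrain region. A minimal sketch of the index/coordinate conversion follows, using the classic bit-manipulation form rather than the paper's own code; `n` is the (power-of-two) grid size.

```python
# Classic Hilbert-curve conversions for an n-by-n grid (n a power of two).
# d is the one-dimensional Hilbert index, (x, y) the cell coordinates.

def d2xy(n, d):
    """Convert a Hilbert index d into (x, y) grid coordinates."""
    x = y = 0
    t = d
    s = 1
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:                    # rotate the quadrant if needed
            if rx == 1:
                x, y = s - 1 - x, s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y

def xy2d(n, x, y):
    """Convert (x, y) grid coordinates into a Hilbert index d."""
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if (x & s) > 0 else 0
        ry = 1 if (y & s) > 0 else 0
        d += s * s * ((3 * rx) ^ ry)
        if ry == 0:
            if rx == 1:
                x, y = s - 1 - x, s - 1 - y
            x, y = y, x
        s //= 2
    return d

# Locality check: consecutive indices are spatially adjacent cells.
pts = [d2xy(8, d) for d in range(64)]
assert all(abs(a[0] - b[0]) + abs(a[1] - b[1]) == 1
           for a, b in zip(pts, pts[1:]))
```

This adjacency of consecutive codes is what makes a cluster's code range map well onto a contiguous file-mapping region.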
Finally, our strategy solves the problem of file-block splicing and adopts an incremental data-scheduling strategy to further reduce the amount of data loaded into memory when roaming the scene. As a result, our method improves the scheduling efficiency, which is important in real engineering. In future work, theories and ideas about scheduling [26], planning [27], and phase transitions [28] will be introduced to further improve the efficiency of terrain modeling.

Figure 1: Multiresolution layer hierarchy of the logical model in external storage.

Figure 2: A cluster instance based on similar static error. (a) Structure of FMRH (transparent boxes denote data blocks gathered into the same cluster as the one corresponding to (b)). (b) The spatially closed terrain area of the cluster corresponding to (a).

Figure 3: Analysis of problems with depth-first traversal of FMRH.

(1) Clear CodeString.
(2) Call the function DeepTraverse_Cluster(B, Type, l).
(3) Modify the index file of the cluster, and save the new index codes of the data blocks in the cluster.
(4) The end.
DeepTraverse_Cluster(currentBlock, Type, l)
Function description: generate the codes of the data blocks in a cluster along a Hilbert space-filling curve.
Parameter description: currentBlock is the current data block, Type is the type of the current data block, and l is the layer of the current data block.
(1) Get the four child data blocks B1, B2, B3, and B4 of currentBlock.
(2) Sort B1, B2, B3, and B4 according to Type (shown in Figures 4(a)-4(d)).
(3) Judge whether each data block Bi (i = 1, 2, 3, 4) is in cluster C:
(3.1) If Bi is a data block in cluster C, then encode Bi by the order after sorting and append it to CodeString.
(3.2) If Bi is not a data block in cluster C, then determine the type of Bi by the following rules and recurse:
(3.2.1) If Type is I, then the types of B1, B2, B3, and B4 are III, I, I, and IV.
(3.2.2) If Type is II, then the types of B1, B2, B3, and B4 are IV, II, II, and III.
(3.2.3) If Type is III, then the types of B1, B2, B3, and B4 are I, III, III, and II.
(3.2.4) If Type is IV, then the types of B1, B2, B3, and B4 are II, IV, IV, and I.
(3.2.5) According to the new order, judge the data blocks on the next layer. For i = 1 to 4: if l + 1 ≤ lmax, then call DeepTraverse_Cluster(Bi, Type_Bi, l + 1), where Type_Bi denotes the new type of data block Bi.
(4) The end of the function.
Algorithm 2: The algorithm for generating the Hilbert space-filling-curve encoding of the multiresolution model.
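The type-rotation rules of Algorithm 2 can be sketched as a short recursion. In the Python sketch below, the child-type tables transcribe rules (3.2.1)-(3.2.4) of the listing; everything else, including the quadtree representation and the child ordering (which in the paper comes from Figure 4), is a hypothetical stand-in for illustration.

```python
# Sketch of Algorithm 2's recursion. CHILD_TYPES transcribes the paper's
# rules (3.2.1)-(3.2.4); the quadtree below is a toy stand-in in which a
# block is a tuple of quadrant indices, and child ordering is fixed rather
# than taken from Figure 4.

CHILD_TYPES = {
    "I":   ("III", "I",   "I",   "IV"),
    "II":  ("IV",  "II",  "II",  "III"),
    "III": ("I",   "III", "III", "II"),
    "IV":  ("II",  "IV",  "IV",  "I"),
}

def children(block):
    """Toy quadtree: the four children extend the block's quadrant path."""
    return [block + (q,) for q in range(4)]

def encode_cluster(block, btype, level, max_level, in_cluster, codes):
    """Depth-first traversal; append in-cluster blocks to codes in order."""
    for i, child in enumerate(children(block)):
        if child in in_cluster:
            codes.append(child)              # rule (3.1): encode in order
        elif level + 1 <= max_level:         # rule (3.2.5): recurse deeper
            encode_cluster(child, CHILD_TYPES[btype][i],
                           level + 1, max_level, in_cluster, codes)

# Encode a cluster consisting of all 16 level-2 blocks.
in_cluster = {(a, b) for a in range(4) for b in range(4)}
codes = []
encode_cluster((), "I", 0, 2, in_cluster, codes)
assert len(codes) == 16 and set(codes) == in_cluster
```

The recursion visits each in-cluster block exactly once, which is why the resulting CodeString can serve directly as the cluster's one-dimensional index.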

Figure 4: Framework of data scheduling in external storage.

Figure 5: Five types of position relations of the index of multifile data blocks.

(4) According to the moving direction of the viewpoint, update the values of LONG and LAT, and establish fileN file-mapping buffers.
(5) If fileN is 1, then determine the initial position of the cluster according to (LONG, LAT), and intercept its data blocks as the prefetch set.
(6) If fileN is greater than 1, then determine the position of each cluster, and for i = 1, 2, ..., fileN:
(6.1) Judge the file relation of each pair of adjacent data blocks (as in Figures 5(b), 5(c), and 5(e));
(6.2) Distribute the data blocks among the clusters to form the prefetch set, filling the adjacent data-block clusters in a Z-shape according to their position relations.
(7) If the intersection of the prefetch set and the data already in the buffer is empty, then load the whole prefetch set; otherwise load only the difference between the prefetch set and the already loaded data, that is, load the data incrementally.
(8) The end.
Algorithm 3: The algorithm of data prefetching and incremental data scheduling.
Assume the set C = {C1, C2, ..., Cm-1}, where each Cj indexes data blocks in different layers; each Cj is defined as a cluster.
(1) Clear the clusters C1, C2, ..., Cm-1 in the aggregation C.
(2) Call the function Traverse_FMRH(FMRH, B0, 0).
(3) Generate the index file of the clusters, and save the index of the data blocks for each cluster.
(4) The end.
Traverse_FMRH(FMRH, currentBlock, levelS)
Function description: load the static errors of the data blocks in FMRH, and decide which cluster each block belongs to.
Parameter description: FMRH is a multiresolution hierarchy based on features; currentBlock is the current data block; levelS is the level index of the current data block.
(1) Load the static error of currentBlock from FMRH as currentBlock.CovSE.
(2) Decide which cluster currentBlock belongs to:
(2.1) Construct m - 1 intervals according to the m error thresholds δ1, δ2, ..., δm.
(2.2) Each threshold interval corresponds to a cluster; for j = 1 to m - 1:
(2.2.1) Test currentBlock.CovSE against two constraints: (A) δj ≤ currentBlock.CovSE ≤ δj+1; (B) no ancestor data block of currentBlock is already included in cluster Cj. If both hold, add currentBlock into cluster Cj.
(2.2.2) If currentBlock.leaf is false, then get the four data blocks subBlock on the next level of currentBlock in FMRH, and call Traverse_FMRH(FMRH, subBlock, levelS + 1) for each of them.
(3) The end.
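The interval test in step (2.2.1), where a block joins cluster j when its static error falls between the j-th and (j+1)-th thresholds, can be sketched with a simple binary search over the sorted thresholds. The threshold values and block errors below are illustrative, not taken from the paper.

```python
import bisect

# Sketch of the cluster-assignment test: a block whose static error CovSE
# lies in the interval [thresholds[j], thresholds[j+1]) joins cluster j.
# The threshold values are illustrative only.

def assign_cluster(cov_se, thresholds):
    """Map a block's static error to the index of its threshold interval."""
    # thresholds must be sorted ascending: delta_1 <= ... <= delta_m
    j = bisect.bisect_right(thresholds, cov_se) - 1
    if j < 0 or j >= len(thresholds) - 1:
        return None                    # error lies outside all intervals
    return j

thresholds = [0.5, 1.0, 2.5, 5.0]      # m = 4 thresholds -> 3 clusters
assert assign_cluster(0.7, thresholds) == 0
assert assign_cluster(2.5, thresholds) == 2
assert assign_cluster(9.0, thresholds) is None
```

Because the intervals partition the error range, every in-range block lands in exactly one cluster, which is the first of the two cluster constraints stated in the conclusions.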
The data blocks of each cluster cover a closed terrain area, and the union of all data blocks constructs the complete closed terrain. The following proof demonstrates this property in theory.
Input: the dynamic view-dependent error τ, the data scale of the current scene, and the data position: longitude LONG and latitude LAT.
Output: the prefetched data set, loaded into the file-mapping buffer.
Function description: locate the physical data blocks and load the data buffer in memory according to the screen-error threshold and the position of the scene, as required by the amount of data in memory.
(1) Initialization: clear the prefetch data set.
(2) Traverse the head file of the data source (the physical model), determine which data blocks belong to the scene according to (LONG, LAT), and record the number fileN of involved files.
(3) For each file from i = 1 to fileN, do the following (if it is a single file, execute once; otherwise execute more than once):
(3.1) According to τ and Formula (2), compute the error threshold δ needed by the current scene.
(3.2) Get the clusters, with their static error thresholds arranged as δmin ≤ δ1 ≤ δ2 ≤ ... ≤ δm ≤ δmax; determine which interval δ belongs to, and take the cluster that meets the requirement as the target cluster.
(3.3) If i is less than fileN, then go to step (3).
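Step (2) above, locating the data blocks covered by the scene from (LONG, LAT), might look like the following sketch. The tile grid, its origin, the tile size in degrees, and the window radius are all assumptions made for illustration; the paper's physical block model is not specified at this level of detail.

```python
# Hypothetical sketch of step (2): mapping a viewpoint (LONG, LAT) to the
# grid indices of the data blocks in the clipped window around it.
# Origin, tile size, and window radius are illustrative assumptions.

def window_blocks(lon, lat, origin=(-180.0, -90.0), tile_deg=1.0, radius=1):
    """Return grid indices (row, col) of tiles covering the window."""
    col = int((lon - origin[0]) // tile_deg)
    row = int((lat - origin[1]) // tile_deg)
    return {(r, c)
            for r in range(row - radius, row + radius + 1)
            for c in range(col - radius, col + radius + 1)}

# A 3 x 3 window of 1-degree tiles around an example viewpoint.
blocks = window_blocks(116.4, 39.9)
assert (129, 296) in blocks and len(blocks) == 9
```

The resulting index set is what the scheduler would intersect with the already loaded data to decide what must still be fetched.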

Table 1: Time and amount of data loaded when the screen error is 1 pixel.

Table 2: Quantity of data loaded in different frames when the screen error is 2.5 pixels.