A Perception-Driven Transcale Display Scheme for Space Image Sequences

With the rapid development of multimedia technology, obtaining high-quality motion reproduction for space targets has attracted much attention in recent years. This paper proposes a Perception-driven Transcale Display Scheme, which significantly improves the awareness of multimedia processing. The scheme contains two important modules: transcale description based on visual saliency and perception-driven display of space image sequences. The former concentrates on describing the transcale features of space targets through three algorithms: attention region computing, frame rate conversion, and image resolution resizing. On this basis, the latter focuses on high-quality display of space movements under different scales through three algorithms: target trajectory computing, space transcale display, and space movement display. Extensive quantitative and qualitative experimental evaluations demonstrate the effectiveness of the proposed scheme.


Introduction
With the rapid development of computer technology, spatial rendezvous and docking requires a large amount of image information to be exchanged between navigation information systems. As an important information form, space image sequences play an important role in tasks such as docking control, docking mechanism design, and track interactive control, and they need to be transmitted between these navigation information systems to measure the relative position, relative velocity, and relative attitude of spacecraft [1][2][3][4]. Transcale behavior is an important feature of space image sequences, so the multimedia processing methods they demand differ from those of traditional image processing. At present, the study of space image sequences is still at an initial stage and further exploration is under way. Specifically, there are two main unresolved problems. The first is how to achieve high-quality monitoring of space targets. For example, continuous attention to space targets should be strengthened to compensate for the lack of detail in space movements. The second is how to improve image processing capabilities such as smooth motion reproduction, accurate trajectory description, and different display effects; all of these should be finely exhibited to enhance the awareness of multimedia processing. To solve these problems, transcale display methods [5][6][7] have emerged as a new direction for image processing, and their main challenges are as follows.
(1) The change of the attention scale: Existing image processing methods use only the inherent characteristics of the image sequences and lack the attention perceived by observers. The description needs to be converted from the whole sequence to the space targets, so as to better capture the fine details of space image sequences.
(2) The change of the frame rate scale: In existing frame interpolation methods, the details of the interpolated frames are not clear during frame rate conversion, which affects the smoothness of space image sequences. New methods are therefore needed to obtain optimal interpolated frames.
(3) The change of the resolution scale: Because of the limited screen sizes of different devices, the same image sequence frequently needs to be displayed at different sizes. However, existing resizing methods do not change the scale of the important contents, so new methods are needed to improve the definition of the important contents while preserving the global resizing effect.

Facing the above challenges, we propose a Perception-driven Transcale Display Scheme (PTDS) to achieve high-quality space motion reproduction. The main contributions are threefold. Firstly, we construct a transcale display framework from a different perspective, which significantly improves the awareness of multimedia processing. This framework contains two important modules: transcale description based on visual saliency and perception-driven display of space image sequences. Secondly, the module of transcale description is presented to solve the transcale problems; it is the core of PTDS. Finally, the module of perception-driven display is proposed to realize trajectory description and movement display for space targets under changing scales. In sum, PTDS can serve navigation information systems.
The rest of the paper is structured as follows. Section 2 discusses the framework of PTDS. Sections 3 and 4 describe in detail the formulation of PTDS, namely, transcale description and perception-driven display. Section 5 presents experimental work carried out to demonstrate the effectiveness of PTDS. Section 6 concludes the paper.

The Framework of PTDS
Figure 1 shows the framework of PTDS, which contains two important modules: transcale description and perception-driven display. The first module improves the definition of the important contents perceived by observers while changing different scales, including the attention scale, the frame rate scale, and the resolution scale. Determining the attention regions from the observers' viewpoint naturally becomes a key issue. Recently, the visual saliency technique has been used more and more widely in the multimedia field [8][9][10][11]. The reasons for using this technique in image processing are threefold. Firstly, it provides selectivity for image description and strengthens the description details of moving targets, so that computing resources can be allocated by priority to achieve the desired image analysis and synthesis. Secondly, it supports the construction of a flexible description scheme by discriminating saliency regions from non-saliency regions. Finally, it improves the cognitive and decision-making abilities for space image sequences. Therefore, we adopt the visual saliency technique to address this key issue. The module contains the following three parts: (1) Attention region computing: It focuses on the change of the attention scale and captures the visual attention regions of space image sequences. The calculation mainly covers two aspects: spatial attention regions and temporal attention regions.
(2) Frame rate conversion: It focuses on the change of the frame rate scale and improves the motion smoothness of space sequences. The calculation mainly includes three steps: transform of attention blocks, acquisition of neighboring blocks, and prediction of unknown pixels.
(3) Image resolution resizing: It focuses on the change of the resolution scale and improves the definition of space sequences at different resolutions while preserving the global visual effect. The calculation mainly includes two steps: partition interpolation computing and seam carving operation.
On the basis of transcale description, the module of perception-driven display achieves high-quality spatial motion display under changing scales. This module is an application of the first module and can serve the space navigation information processing system. It also contains three parts: (1) Target trajectory computing: It adaptively displays the motion trajectory of space targets according to the setting of scale parameters. The calculation mainly includes five steps: attention region computing, frame rate conversion, boundary computing, trajectory coordinate calculation, and motion trajectory display. (2) Space transcale display: It obtains clear motion details under different time and spatial scales, and the calculation mainly includes five steps: key frame computing, display map calculation, thumbnail computing, transcale display, and thumbnail resizing.
(3) Space movement display: It realizes a motion overview of the space targets in a narrative way. The calculation mainly includes four steps: transition map setting, transition pixel calculation, transition region acquisition, and display map calculation.

The Module of Transcale Description
In this section, we will go over the individual parts of transcale description for space image sequences, including attention region computing, frame rate conversion, and image resolution resizing.

Attention Region Computing.
The approach of attention region computing captures the important contents perceived by observers, i.e., the visual attention regions, including spatial attention regions and temporal attention regions; its flowchart is shown in Figure 2. Spatial attention regions are obtained according to the characteristics of color and image signature, and temporal attention regions are calculated by visual tracking. Specifically, let I[1, n] = {I_t} (1 ≤ t ≤ n) be a space image sequence defined on the 3D space, and let {T, VA, KT} represent the attention description, where T denotes the collection of time stamps, VA denotes the attention regions containing the spatial attention region VA_S and the temporal attention region VA_T, and KT denotes the transfer mark of attention regions.
The computation of spatial attention regions is based on research in biological vision; i.e., the human visual system is very sensitive to contrast in the visual signal [12, 13]. We adopt the histogram-based color contrast method [14] to compute the spatial saliency value of every pixel in I_t. We then collect the pixels with larger spatial saliency values into the set VA_S. Because such a pixel may lie in a very small, relatively unimportant region, it is necessary to eliminate it; three elimination rules are applied over the image grid, where w_initial and h_initial are the initial resolution of the image sequence. On this basis, the boundary values of VA_S(I_t) are acquired using the bounding-box technique.
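As a concrete illustration of the spatial step, the Python sketch below computes a histogram-style global color contrast saliency map and then extracts the bounding box of a thresholded saliency mask. The quantization level, the threshold, and all function names are illustrative assumptions rather than the paper's exact formulation (the paper's experiments use MATLAB).

```python
import numpy as np

def spatial_saliency(image, bins=16):
    """Histogram-based global color contrast saliency (a sketch of the idea
    behind global contrast detection; `bins` per channel is an illustrative
    quantization choice, not the paper's exact setting)."""
    h, w, _ = image.shape
    # Quantize colors so the color histogram stays small.
    q = (image // (256 // bins)).reshape(-1, 3)
    colors, inverse, counts = np.unique(
        q, axis=0, return_inverse=True, return_counts=True)
    colors = colors.astype(float)          # avoid uint8 wrap-around
    freq = counts / counts.sum()
    # Saliency of each quantized color = frequency-weighted color distance
    # to every other color in the image.
    dist = np.linalg.norm(colors[:, None, :] - colors[None, :, :], axis=2)
    color_sal = (dist * freq[None, :]).sum(axis=1)
    return color_sal[inverse].reshape(h, w)

def bounding_box(mask):
    """Bounding box (x1, y1, x2, y2) of a thresholded saliency mask,
    playing the role of the VA_S boundary values."""
    ys, xs = np.nonzero(mask)
    return xs.min(), ys.min(), xs.max(), ys.max()
```

Thresholding the map (e.g., above its mean) and taking the bounding box mimics the elimination-plus-bounding-box step described above.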

The computation of temporal saliency regions is based on observations of human attention continuity [15, 16]. Starting from the obtained spatial saliency regions, the temporal saliency regions are computed by visual tracking. Since the attention regions can vary over time, we use KT to mark the times at which the attention region changes by more than a threshold; every time t that satisfies this condition is successively put into the set KTS. In our experiments, the threshold is set to 0.75. We utilize Bayesian target tracking to compute VA_T(I_t). Let X_t = [x_t, y_t, vx_t, vy_t]^T be the state of VA_T(I_t), in which [x_t, y_t]^T denotes the position vector and [vx_t, vy_t]^T denotes the velocity vector. The set of spatial saliency regions OS_{j1..j2} = {VA_S(I_t)}, t = j1..j2, is taken as the observation set of VA_T(I_t), in which j1 and j2 are adjacent values in KTS and j1 < j2. Within the Bayesian inference framework, we estimate the posterior probability p(X_t | OS_{1..t}) to compute the target state X_t for VA_T(I_t). The calculation consists of prediction and updating:

p(X_k | OS_{1..k-1}) = ∫ p(X_k | X_{k-1}) p(X_{k-1} | OS_{1..k-1}) dX_{k-1},
p(X_k | OS_{1..k}) ∝ p(OS_k | X_k) p(X_k | OS_{1..k-1}),

where X_k and OS_k denote the state and observation value of the k-th frame, respectively, p(X_k | X_{k-1}) denotes the state transition model, which is solved using an affine transformation, and p(OS_k | X_k) denotes the state measurement model, which is computed using a structural local sparse appearance method.
Firstly, each attention block (VAB) is projected using Walsh-Hadamard kernels [17], and the results are stored in the temporary set ST = {ST_e | 1 ≤ e ≤ E}. Algorithm 1 (frame rate conversion) then proceeds as follows. Input: a spatial image sequence I. Output: the changed image sequence I'. (1) Compute the frame number of I as len.
(2) Take the first frame of I as the current frame I_1.
(3) Take the frame following I_1 as I_2 and calculate their attention regions VA_1 and VA_2 according to the method in Section 3.1.
(5) Compute the transformed value of each ST using equation (8) and create the hash tables. (6) FOR m = 1 ... the number of hash tables: FOR each transformed ST in VA_1: obtain the expansion blocks according to the spatial and temporal expansion rules; compute the nearest blocks using equations (9) and (10). Each transformed result is assigned a hash value to accelerate the VAB projection and is placed in the corresponding hash table TB_m[h_m(ST_e)] (1 ≤ m ≤ M).
Secondly, we establish spatial and temporal expansion rules to obtain the expansion blocks for each VAB. For every ST_e ∈ ST, each hash table TB_m is queried according to the expansion rules, and the corresponding expansion block in frame t+1 is produced and put into the matching set. On this basis, we determine the nearest blocks among the obtained expansion blocks using a freedom search, in which the constant in equation (9) is set to 0.5 and a search radius is used; the best block for each ST_e is retained. Then, for each hash table, equation (10) is used to compare and update the nearest blocks. Finally, the predicted values of the pixels in the attention regions are computed through a smoothing operation over the obtained nearest blocks (for the detailed procedure, see Algorithm 1).
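The projection and nearest-block search can be illustrated as follows. This Python sketch projects square blocks onto a few Walsh-Hadamard coefficients and compares those signatures to pick the nearest candidate; the signature size `k` and the brute-force comparison are simplifying stand-ins for the paper's hash tables and bounded search radius.

```python
import numpy as np

def hadamard(n):
    """Sylvester-construction Walsh-Hadamard matrix (n a power of 2)."""
    H = np.array([[1.0]])
    while H.shape[0] < n:
        H = np.block([[H, H], [H, -H]])
    return H

def wh_signature(block, k=4):
    """Project a square block onto its first k x k Walsh-Hadamard
    coefficients -- a cheap signature for hashing and comparing attention
    blocks (illustrative, not the paper's exact kernels)."""
    H = hadamard(block.shape[0])
    coeffs = H @ block @ H.T
    return coeffs[:k, :k].ravel()

def nearest_block(query, candidates, k=4):
    """Index of the candidate whose WH signature is closest to the
    query's -- the 'nearest block' step, without the hash-table shortcut."""
    qs = wh_signature(query, k)
    dists = [np.linalg.norm(wh_signature(c, k) - qs) for c in candidates]
    return int(np.argmin(dists))
```

In the full scheme, blocks sharing a hash bucket of their signature would be compared first, which is what makes the search fast.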

Image Resolution Resizing.
Image resizing has gained significant importance because of rapid increases in the diversity and versatility of display devices. However, existing methods do not produce resized results from the viewpoint of the observers. In this section, we present a new resizing method to improve the resizing quality of the important contents perceived by observers. To accomplish this, two main issues need to be addressed. The first is to determine the important contents from the observers' viewpoint. The second is to improve the definition of the important contents during resizing. For the first issue, we adopt the method of attention region computing (see Section 3.1); for the second, we introduce a method of partition interpolation whose architecture is illustrated in Figure 4, containing two steps: partition interpolation computing and seam carving operation. Specifically, we resize I_t from h_initial × w_initial to h_new × w_new with a resizing factor of 2^s.
The idea of partition interpolation computing is to adopt different calculation approaches for different regions of the original image sequences. For any pixel (x, y) in I_t, the interpolated pixel is computed according to whether it falls inside VA_t, the simplified symbol of VA(I_t) computed in Section 3.1. For pixels in the attention regions, the interpolated value at position (2x+1, 2y+1) is obtained in two steps, initial estimation and energy modification, where the initial value is calculated from two determination functions j_1 and j_2 built from coefficients α_i (i = 1, 2, 3, 4) of neighboring pixels (equation (13)). For any pixel in the general regions, a standard interpolation is used. Then the seam carving method is applied to preserve the contents of the attention regions; that is, the pixels in the different regions are weighted through an energy function.
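The seam carving half of the method rests on a standard dynamic program. The sketch below computes a gradient-magnitude energy map and the minimum-energy vertical seam; this gradient energy is the common textbook choice and stands in for the paper's exact energy function, in which attention-region pixels would receive extra weight.

```python
import numpy as np

def gradient_energy(gray):
    """Simple gradient-magnitude energy map, the usual seam-carving energy
    (a stand-in for the paper's energy function)."""
    gy, gx = np.gradient(gray.astype(float))
    return np.abs(gx) + np.abs(gy)

def min_vertical_seam(energy):
    """Dynamic programming for the minimum-energy vertical seam -- the seam
    removed (or duplicated) while attention regions keep their
    interpolated values."""
    h, w = energy.shape
    cost = energy.astype(float).copy()
    for y in range(1, h):
        left = np.r_[np.inf, cost[y - 1, :-1]]
        right = np.r_[cost[y - 1, 1:], np.inf]
        cost[y] += np.minimum(np.minimum(left, cost[y - 1]), right)
    # Backtrack from the cheapest bottom cell.
    seam = np.empty(h, dtype=int)
    seam[-1] = int(np.argmin(cost[-1]))
    for y in range(h - 2, -1, -1):
        x = seam[y + 1]
        lo, hi = max(0, x - 1), min(w, x + 2)
        seam[y] = lo + int(np.argmin(cost[y, lo:hi]))
    return seam
```

Boosting the energy inside VA_t would steer seams away from the attention regions, which is the effect the partition method aims for.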

The Module of Perception-Driven Display
In this section, we describe in detail the formulation of the proposed module of perception-driven display, containing three algorithms: target trajectory computing, space transcale display, and space movement display. To some extent, all of them are applications of the module of transcale description and can be directly applied in navigation information systems.
(1) Target Trajectory Computing. To display the motion trajectories of space targets under different scales, the approach of target trajectory computing contains the following five steps. Step 1. For each frame I_t (1 ≤ t ≤ n), VA_t is computed by utilizing the method of attention region computing. Step 2 converts the frame rate, Step 3 computes the boundary values of each attention region, and Step 4 calculates the trajectory coordinates. Step 5. The motion trajectory at scale λ is computed by sampling the trajectory coordinates every λ frames.
(2) Space Transcale Display. The aim of transcale display is not only to show motion pictures at different time scales but also to display them at different resolution scales. This algorithm includes the following five steps. Step 1. For each I_t, the key frames K_j (j = 1...c) are computed and the corresponding attention regions VAK_j are obtained.
Step 2. The transcale display map KM is captured by merging the attention regions of the different key frames.
Step 3. For each K_j (j = 1...c), the corresponding thumbnail VU_j is computed by reducing the original image to the same resolution.
Step 4. The transcale display map for I_t is computed. Step 5. Using the method of image resolution resizing, each VU_j can also be displayed with high quality at different resolutions.
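A minimal sketch of the display-map assembly in the steps above: each key frame is reduced to a common thumbnail resolution and the thumbnails are concatenated into one map. The nearest-neighbor sampling and side-by-side layout are assumptions for brevity, not the paper's exact merge.

```python
import numpy as np

def transcale_display_map(key_frames, thumb_h, thumb_w):
    """Assemble key-frame thumbnails into one display map: reduce every
    key frame to the same resolution (nearest-neighbor sampling) and
    concatenate them horizontally."""
    thumbs = []
    for frame in key_frames:
        h, w = frame.shape[:2]
        ys = np.arange(thumb_h) * h // thumb_h   # source rows to sample
        xs = np.arange(thumb_w) * w // thumb_w   # source columns to sample
        thumbs.append(frame[ys][:, xs])
    return np.concatenate(thumbs, axis=1)
```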
(3) Space Movement Display. To outline the motion process of space targets, a movement display algorithm is proposed, which focuses on the smooth transition of space movement. Specifically, let K_p and K_q be consecutive key frames in I_t and let their transition regions be KR_p(u, v) and KR_q(u', v'), respectively, in which u, u' ∈ [1, height], v ∈ [width - r + 1, width], v' ∈ [1, r], r denotes the transition radius, and height and width are the current resolution of the key frames. The algorithm consists of the following four steps. Step 1. Both KR_p and KR_q are resized to the target resolution height × 2r, and the transition region maps TM_i (i = p, q) are obtained.
Step 3. Repeat Steps 1 and 2 until all the key-frame transition regions are calculated, and record each new key frame.
Step 4. Merge all the obtained key frames to form the display map MD, where the merge runs over the total number of key frames.
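The smooth seam between the strips of K_p and K_q can be illustrated with a linear cross-fade over the transition radius r. The linear weighting is an assumption; the paper's exact transition-pixel formula may weight differently.

```python
import numpy as np

def blend_transition(region_p, region_q):
    """Linearly blend the right transition strip of key frame K_p with the
    left strip of K_q (both of width r), giving a smooth seam for the
    movement display map."""
    assert region_p.shape == region_q.shape
    r = region_p.shape[1]
    # Weight falls from 1 to 0 for K_p and rises from 0 to 1 for K_q.
    alpha = np.linspace(1.0, 0.0, r)[None, :]
    if region_p.ndim == 3:                 # color strips
        alpha = alpha[..., None]
    return alpha * region_p + (1.0 - alpha) * region_q
```

At the left edge the result equals K_p's strip, at the right edge K_q's, so no straight-line seam remains between the merged key frames.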

Experimental Results and Discussion
In this section, we conduct quantitative and qualitative experiments to evaluate the performance of PTDS, including the transcale performance (Sections 5.1, 5.2, and 5.3) and the display performance (Sections 5.4, 5.5, and 5.6). We downloaded space videos from Youku, segmented them into clips, converted the clips into image sequences, and formed the data shown in Table 1. In all experiments, we used the MATLAB platform and a PC with a 2.60 GHz Intel(R) Pentium(R) Dual-Core CPU and 1.96 GB of main memory.

Evaluation of Attention Scale.
Figure 5 shows the attention regions obtained using PTDS on sample frames of the different image sequences (from top to bottom, sequences S1, S2, S3, and S4). Each row exhibits the results of one space image sequence and each column exhibits sample frames of that sequence. Each subfigure consists of three parts: the purple-bordered region on the left marks the attention region, the blue-bordered region in the upper right marks the spatial attention region, and the green-bordered region in the lower right marks the temporal attention region. Specifically, the first row shows the results for frames 100, 128, and 134 from S1, and the second row shows the results for frames 119, 121, and 135 from S2. Observing these rows, we can see that the spatial attention regions capture only part of the space targets, for example, in the result in the first row and first column. This means that both the calculation of spatial attention regions and that of temporal attention regions are necessary for obtaining accurate attention regions, especially for image sequences with large motion amplitude. Moreover, the third row shows the results for frames 10, 24, and 30 from S3, and the last row shows the results for frames 8, 45, and 88 from S4. These sequences have relatively complicated backgrounds, which greatly interfere with the detection of attention regions. Nevertheless, the proposed method still obtains good results, and these results lay the foundation for transcale display.
Figure 6 shows the quantitative comparison with the other methods, including image signature detection (ISD) [18] and global contrast detection (GCD) [14]. Here we adopt the overlap score [19] as a quantitative indicator, defined as the area of the intersection of R_S and R_G divided by the area of their union, where R_S denotes the obtained attention result and R_G denotes the corresponding ground-truth bounding box. The larger the overlap score, the more accurate the result. From this figure, we can see the average overlap scores of the three methods. Our method achieves a larger score than the other methods, showing that the attention regions obtained with our method are closer to the truly significant regions of the original image sequences.
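The overlap score described above is the standard intersection-over-union measure and can be computed directly from the two bounding boxes (the (x1, y1, x2, y2) box format is an assumption):

```python
def overlap_score(box_a, box_b):
    """Overlap score between a detected box R_S and ground truth R_G:
    intersection area over union area."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the intersection rectangle (0 if disjoint).
    iw = max(0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union else 0.0
```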

Evaluation of Frame Rate Scale.
The quality of interpolated frames is a key factor in evaluating the change of the frame rate scale. In the experiments, we removed one of every two consecutive frames of the original image sequence and reconstructed the removed frame using different methods: three-step search (TSS), adaptive rood pattern search (ARPS), Horn and Schunck (H&S), CNF [20], CSH [21], and our proposed method. Figures 7 and 8 show the frame interpolation results for sequences S5 and S6; the visual differences among the algorithms can be seen in the red-bordered region of each subfigure. TSS and ARPS exhibit a poor interpolation effect, H&S and CNF introduce a suspension effect, CSH produces disappointing results, and PTDS shows the clear details of the spacecraft. From these figures, it is evident that the proposed method performs comparatively better in terms of visual quality. Figure 9 shows the average quantitative values for each image sequence using the different methods. The left part gives the average PSNR values, the traditional quantitative measure of accuracy, and the right part gives the MSSIM results, which assess image quality from the viewpoint of image formation, under the assumption of a correlation between human visual perception and image structural information. Our method achieves the highest average values of both.
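The PSNR measure reported in Figure 9 can be computed as follows; the peak value of 255 assumes 8-bit frames.

```python
import numpy as np

def psnr(reference, reconstructed, peak=255.0):
    """Peak signal-to-noise ratio between an original frame and its
    interpolated reconstruction, in dB (higher is better)."""
    mse = np.mean((reference.astype(float) - reconstructed.astype(float)) ** 2)
    if mse == 0:
        return float("inf")   # identical frames
    return 10.0 * np.log10(peak ** 2 / mse)
```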

Evaluation of Resolution Scale.
We resize the original image sequences using different methods, including scaling, best cropping, improved seam carving (ISC) [22], and our method, and then compare the obtained results to evaluate the performance on the change of resolution scales.
Figure 10 shows the comparison results for sequence S7 resized from 320×145 to 500×230. Figure 10(b) shows that with the scaling method the launch base becomes vaguer than before. Figure 10(c) shows that with the best cropping method the launch base is only partly displayed, so original information goes missing. Figure 10(d) shows that with the improved seam carving method the prominent part of the launch base becomes smaller than before, indicating that ISC is not suitable for image enlarging. Figure 10(e) shows that our method clearly displays the prominent objects of the original frames and preserves the global visual effect when the resolution scales are changed. Similarly, resizing results for sequence S8 are shown in Figure 11.

Evaluation of Target Trajectory Computing.
Figure 12 shows the target trajectory of sequence S2 at scales λ = 1, λ = 10, and λ = 20. As the value of λ increases, the displayed movement amplitude of the space targets becomes greater and the description of the trajectory becomes coarser, which clearly characterizes the change of movement during automatic homing. Similarly, the target trajectory of sequence S5 is shown in Figure 13.
Figure 14 shows the detailed positions of the target trajectories corresponding to Figure 12, described by computing the position of the center pixel of each attention region. The larger the scale λ, the fewer the description points and the rougher the trajectory of the center point, and vice versa. Similarly, Figure 15 shows the target trajectories corresponding to Figure 13. In short, PTDS exhibits good performance in displaying space target trajectories under different scales, which lays a foundation for space tasks such as condition monitoring and motion tracking.
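The scale behavior just described (larger λ, fewer and coarser description points) can be sketched by sampling attention-region centers every λ frames; the (x1, y1, x2, y2) box format is an assumption.

```python
def box_center(box):
    """Center pixel of an attention-region bounding box (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)

def trajectory_at_scale(boxes, lam):
    """Trajectory at scale lam: attention-region centers sampled every
    lam frames -- larger lam gives fewer, coarser description points."""
    return [box_center(b) for b in boxes[::lam]]
```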

Evaluation of Space Transcale Display.
Similarly, the process of movement from far to near is given in Figure 16(b). From this figure, we can see that the proposed space transcale display algorithm clearly exhibits the motion process of the space targets at different scales. By observing the accompanying attention regions, motion details of space targets, such as target pose and operating status, are displayed at different distances and times, which fully reflects the transcale characteristics of the PTDS scheme. Figure 17 summarizes the objective evaluation values of space transcale display corresponding to Figure 16, including average gradient, edge intensity, spatial frequency, image definition, and quality score. As the number of key frames increases, the values of these evaluation indices also increase, indicating that the space transcale display algorithm obtains high-quality transcale display maps.

Evaluation of Space Movement Display.
A comparison of movement display results shows a baseline result on the left and our method on the right. In the left part, an obvious straight line separates the two key frames, while in the right part, smooth and delicate transition areas achieve a seamless representation of the movement process.

Conclusions
In this paper, we focus on the transcale problem of spatial image sequences and propose a novel display scheme which improves the awareness of multimedia processing.
The contribution of this scheme manifests in two aspects. On the one hand, space targets receive sustained attention through visual saliency technology, and the details of the targets' movement are enhanced. On the other hand, the motion of spatial image sequences can be smoothly reproduced and effectively shown on display devices with different sizes and resolutions. Experimental results show that the proposed method outperforms representative methods in terms of image visualization and quantitative measures. In the future, we will further study the transcale characteristics of space image sequences and explore the mutual influence between

Figure 2: The flowchart of the module of attention region computing.

Figure 3: The computation of pixels in attention regions.
Compute the value of the pixels in VA_1; END FOR; END FOR. (7) Merge all the transformed ST. (8) Compute the interpolated frame I_1' by combining the attention region with the non-attention region. (9) Take the frame following I_2 as I_3, and let I_3 → I_2. (10) Repeat steps (3) to (9) until the frame number of the current frame is (len-1). (11) Synthesize the interpolated frames and the original frames. Return I'. (Algorithm 1: Frame rate conversion algorithm.)

Figure 4: The method of partition interpolation.

Figure 6: Comparison results of the average overlap score.
Figure 16 gives transcale displays for sequences S5 and S9, in which the attention regions are shown in the left part of each subfigure. Figure 16(a) shows both the whole process of detecting from near to far and the motion details of the space targets.

SA(p_{x,y}) = SA(c_{x,y}), where p_{x,y} denotes any pixel in I_t and SA(p_{x,y}) denotes its spatial saliency value, obtained from the global contrast of its quantized color c_{x,y} against all other colors; the value is then smoothed over the `near` colors that are the nearest neighbors of c_{x,y} in color space, following [14].
Note that we test three different values for this parameter, namely, 8, 16, and 32, and find that 16 is appropriate according to the experimental results. The set of VABs is denoted as VB = {VAB_i}, and the computation procedure for the contained pixels is elaborated as follows.
In the state measurement model, d denotes a patch of 32×32 pixels divided from VA_S(I_m) (1 ≤ m ≤ k), N denotes the number of divided patches, α denotes the sparse coding of d, M denotes the normalization term, and the weighted sum of the sparse codes is formed accordingly; Dig(·) is the function computing diagonal elements.
3.2. Frame Rate Conversion. To enhance the motion smoothness of space image sequences, frame rate conversion is performed; its core is to compute high-quality intermediate frames, i.e., to calculate the pixel values of the intermediate frames. Here a partition method is used: for non-attention regions, the pixel values are left unchanged, and for attention regions, the pixel values are accurately computed. The main computation steps comprise transform of attention blocks, acquisition of neighboring blocks, and prediction of unknown pixels, as shown in Figure 3. Specifically, the attention regions in any two consecutive frames, i.e., VA(I_t) and VA(I_t+1), are divided into overlapping image blocks, and each block is defined as an attention block (VAB for short).
Step 2. Using the method of frame rate conversion, I[1, n] is changed into HI[1, N] = {HI_t} (1 ≤ t ≤ N and N > n), in which N is the number of changed frames, and VA_t is updated in HI_t. Step 3. The four boundary values are computed for each VA_t, including the left boundary BV_L, right boundary BV_R, top boundary BV_T, and bottom boundary BV_B, taken as the minima and maxima of the coordinates of the pixels in the attention region. Step 4. For each VA_t, the trajectory coordinates are calculated, including the horizontal coordinate VA_row and the vertical coordinate VA_column, computed from the pixels in the attention region, where x_l, x_r, y_t, and y_b are the boundary values of VA_t.