Performance Optimization of 3-DOF Application Scene Based on 360-Degree Panoramic Technology with Depth Information

High-quality 3D scenes often render poorly and perform inadequately on low-end devices. How to better exploit 360-degree panoramic technology, replacing traditional pure-model scenes through visual deception, is therefore of great significance for improving picture quality and optimizing performance on low-end equipment. In this paper, a three-degree-of-freedom (3-DOF) indoor simulation scenario is used as an example. On the basis of 360-degree panoramic technology, depth information is acquired by laser ranging, converted into a two-dimensional depth image with the help of spherical coordinates, combined with a transparency information image, compressed using the DEFLATE compression algorithm, and finally stored. The data are optimized through bilinear sampling, and with the help of geometric calculation, the occluded parts of 3D objects are removed or rendered translucent as appropriate during rendering, so that the occlusion relationship is handled correctly and a real integration of the virtual environment with the 3D objects is achieved. Multiple sets of data show that the method delivers a significant performance improvement while ensuring geometric realism.


Introduction
With the development of digital media technology, 3D rendering effects have continuously improved and been applied to various fields [1][2][3][4][5][6][7][8][9][10]. However, generating 3D images requires heavy computation, which results in significantly higher performance requirements for the operating equipment. In order to make high-quality 3D images available on low-end devices, 360-degree panoramic technology is often used instead of the pure modeling scheme (PM) [11][12][13][14][15][16]. Nevertheless, because the virtual background is noninteractive, it is difficult to achieve real integration with internal 3D objects, with the result that this technology is generally used only for distant scenes, or on its own. Therefore, to further replace the 3D scene and increase the sense of reality, it is necessary to resolve the occlusion relationship between these two.
As early as when augmented reality was first proposed, relevant studies on occlusion consistency were carried out [17][18][19][20][21]. Among them, the technique that can be directly applied to 3-DOF application scenarios is the occlusion scheme based on environmental modeling with rendering-order elimination (OEM) [22]. Exploiting the invariant nature of the virtual scene, this technique reconstructs the environment with a low-polygon mesh; with depth testing and depth writing enabled, the rendering order is adjusted so that the environment model is rendered first, the parts of later-rendered 3D objects behind it are excluded, and the opacity of the environmental low-polygon model is set to 0. The main problem with this solution is that the accuracy of the low-polygon model is limited, so the masking of small objects is poor; if the model accuracy is increased, the performance requirements on the equipment rise significantly, preventing the scheme from playing an optimal role. Another common solution is a technique borrowed from 2D games and implemented in 3D: the scene content is layered into multiple 360-degree panoramas based on distance to create a sense of different visual levels. This approach is simpler, but its problem is also obvious: there is no transition effect when 3D objects move between layers.
In this paper, through the comparison and testing of various solutions, we propose an occlusion scheme based on a 360-degree panoramic depth image (OPDI) for 3-DOF application scenarios of indoor simulation. On the premise of optimizing performance and preserving spatial realism, the scheme uses the depth information of the 360-degree panoramic environment and eliminates the occluded parts of objects through per-pixel depth calculation, so that 3D objects form a visually convincing occlusion relationship with the environment using 360-degree panoramic image technology alone.

Method
This section introduces the optimization scheme for a 3-DOF application scenario using 360-degree panoramic technology with depth information in an interior design simulation. The 360-degree panoramic image is acquired by the camera. The 360-degree panoramic depth information is obtained by laser ranging, converted through spherical coordinates, compressed using the DEFLATE compression algorithm, and saved as a 16-bit single-channel 360-degree panoramic grayscale image; based on this depth image and the scene's translucent objects, a 360-degree panoramic transparency information image is then drawn. Finally, when rendering the 3D object, the depth and transparency information of the environment relative to the camera is retrieved for each spatial voxel and compared with the corresponding voxel of the 3D object, so as to determine how it is rendered and displayed. Figure 1 shows the specific process, where parallelograms represent data and rectangles represent steps.
First, a 360-degree panoramic view of the required indoor environment is captured with a 360-degree panoramic camera, as shown in Figure 2.
At the same position as the camera, the laser ranging sensor rotates at high speed over a range of 360 degrees horizontally and 180 degrees vertically, and the depth information at each angle is detected and collected in turn. The next step is to generate the associated 360-degree panoramic depth image. Unlike traditional 2D image methods, the depth information is organized in a spherical coordinate system, which better suits the properties of the depth data, and a lossless depth image is then generated. Taking the left-handed coordinate system as an example, the corresponding spherical coordinates are shown in Figure 3. The information of each point in spherical coordinates is then mapped to the depth image, where the yaw angle θ(V_e) gives the row coordinate of the depth image pixel and the pitch angle ω(V_e) gives the vertical coordinate. It is worth noting that, for the convenience of later calculations in this spherical coordinate system, the yaw angle θ is measured starting from the +Z axis and the pitch angle ω starting from the −Y axis, while the distance value distance(V_e) is converted to a pixel value P(P_xi, P_yi). The mapping relationship is as follows: In these formulas, the round function rounds a number, and multiple represents the quantization coefficient of the angle, which determines the resolution of the 360-degree panoramic depth image. Since 1/multiple is exactly the step size of the sensor, the density of the collected depth information can be effectively controlled by adjusting this coefficient.
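Since the mapping formulas themselves are not reproduced above, the following Python sketch illustrates one plausible quantization consistent with the description: yaw and pitch are scaled by multiple and rounded to pixel coordinates, and the distance is quantized into the 16-bit pixel range. The millimetre-scale distance encoding is an assumption, not a value taken from the paper.

```python
def angles_to_pixel(theta_deg, omega_deg, distance_m, multiple=2):
    """Map one laser sample to depth-image coordinates and a 16-bit value.

    theta_deg: yaw measured from the +Z axis, in [0, 360]
    omega_deg: pitch measured from the -Y axis, in [0, 180]
    The millimetre quantization of distance is an assumption for
    illustration; the paper only states that round() and the
    coefficient `multiple` (1/multiple = sensor step) are used.
    """
    px = round(theta_deg * multiple)            # pixel coordinate from yaw
    py = round(omega_deg * multiple)            # pixel coordinate from pitch
    value = min(round(distance_m * 1000), 65535)  # clamp to 16-bit range
    return px, py, value
```

With multiple = 2 the image resolution is 720 × 360 samples, i.e. one sample per half degree, matching the stated relationship between multiple and sensor step size.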
The generated depth images are saved as 16-bit single-channel PNG grayscale images. PNG's DEFLATE compression, which processes the data with the Lempel-Ziv (LZ77) [23] algorithm followed by Huffman coding, ensures lossless storage while reducing spatial redundancy. The final result is shown in Figure 4. After acquiring the 360-degree panoramic depth image, the next step is the occlusion removal that is central to this paper, which is performed during graphics rendering: the model-space coordinates of the 3D object vertices are obtained in the vertex shader, converted to world-space coordinates, and then further into camera-space coordinates position_camera(x_camera, y_camera, z_camera).
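As a minimal illustration of the lossless DEFLATE storage (a PNG file's IDAT stream is itself zlib/DEFLATE data), the following Python sketch round-trips a 16-bit depth buffer through zlib. The big-endian packing mirrors PNG's byte order, but the PNG container itself is omitted for brevity.

```python
import struct
import zlib

def compress_depth(depth_values):
    """DEFLATE-compress a row-major sequence of 16-bit depth values.

    This is a stand-in for the 16-bit single-channel PNG storage
    described in the paper; a real PNG would additionally apply
    per-scanline filtering before the same DEFLATE step.
    """
    raw = struct.pack(f">{len(depth_values)}H", *depth_values)
    return zlib.compress(raw, level=9)

def decompress_depth(blob, count):
    """Invert compress_depth, recovering the original values losslessly."""
    raw = zlib.decompress(blob)
    return list(struct.unpack(f">{count}H", raw))
```

Because neighbouring laser samples are highly correlated, the LZ77 stage of DEFLATE removes most of the spatial redundancy, which is why the paper's depth images compress well without loss.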
where matrix_mw is the transformation matrix from model space to world space and matrix_wc is the transformation matrix from world space to camera space. After transformation, the results are passed to the fragment shader, in which the vertex coordinate information is automatically linearly interpolated, ensuring that the current camera-space coordinates of each voxel of the object are obtained. Then, the distance value distance(V_o) is calculated from the camera-space coordinates of each voxel. The formula is as follows: The next step is to obtain the direction of each corresponding voxel in the 360-degree panoramic depth image. From the spatial coordinates of each voxel, the yaw angle θ(V_o) and the pitch angle ω(V_o) relative to the center position are calculated according to the following formula: It is worth noting that the value range of the arctan function is [−π/2, π/2]; therefore, an interval judgment on the angle is also needed to ensure that θ(V_o) lies in the [0, 360] interval. Since ω(V_o) was earlier defined in the [0, 180] interval for convenient data storage, 90 must also be added to it here to comply with the previous convention of measuring ω from the −Y axis.
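The direction computation can be sketched in Python as follows; atan2 and asin are used here in place of the paper's arctan with explicit interval checks and the +90 shift, which is mathematically equivalent and yields the same stated conventions (θ measured from +Z in [0, 360), ω measured from −Y in [0, 180]).

```python
import math

def voxel_angles(x, y, z):
    """Yaw, pitch, and distance of a camera-space point.

    Follows the paper's left-handed conventions: yaw theta is measured
    from the +Z axis (toward +X) and wrapped into [0, 360); pitch omega
    is measured from the -Y axis, so it lies in [0, 180]. atan2/asin
    replace the arctan-plus-interval-judgment described in the text.
    """
    dist = math.sqrt(x * x + y * y + z * z)
    theta = math.degrees(math.atan2(x, z)) % 360.0
    # elevation above the horizontal plane is in [-90, 90]; shifting by
    # +90 gives the pitch measured from the -Y axis in [0, 180]
    omega = math.degrees(math.asin(y / dist)) + 90.0
    return theta, omega, dist
```

Using atan2 avoids the manual quadrant correction: it already returns angles over the full circle, which is then wrapped into [0, 360) with a modulo.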
Based on the yaw angle θ(V_o) and pitch angle ω(V_o) obtained for each 3D model voxel, the angles are mapped to the [0, 1] interval to serve as the UV value of the corresponding pixel of the depth image, and the stored distance P(UV_xi, UV_yi) is read from the MainTex texture using the sampling function tex2D.
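The UV mapping is a simple normalization of the two angle ranges; a Python equivalent of what the shader passes to tex2D is:

```python
def angles_to_uv(theta_deg, omega_deg):
    """Map yaw in [0, 360] and pitch in [0, 180] to texture UV in [0, 1].

    These UV values are what the fragment shader would hand to tex2D to
    look up the stored environment distance for the voxel's direction.
    """
    return theta_deg / 360.0, omega_deg / 180.0
```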
Due to the interval between sampling steps, the final effect exhibits a certain degree of aliasing. Therefore, depending on the step size used when collecting the 360-degree panoramic depth map and the application requirements, bilinear sampling can be selectively applied to the 360-degree panoramic depth image so as to smooth the data. Finally, the distance value in the depth image is obtained from the UV value of the corresponding voxel and compared with that voxel's distance to determine whether the voxel needs to be rejected, where Max is the maximum value that can be stored in the depth image.
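A CPU-side sketch of the bilinear smoothing and the rejection test follows (the paper performs both in the fragment shader); treating a stored Max value as "no depth recorded, never discard" is an assumption about how the comparison handles unmeasured directions.

```python
def bilinear_sample(img, w, h, u, v):
    """Bilinearly sample a row-major depth image (list of floats) at UV
    coordinates in [0, 1], smoothing the stepped laser samples."""
    x = u * (w - 1)
    y = v * (h - 1)
    x0, y0 = int(x), int(y)
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    fx, fy = x - x0, y - y0
    top = img[y0 * w + x0] * (1 - fx) + img[y0 * w + x1] * fx
    bot = img[y1 * w + x0] * (1 - fx) + img[y1 * w + x1] * fx
    return top * (1 - fy) + bot * fy

def is_discard(voxel_dist, env_dist, max_value=65535):
    """IsDiscard(1): reject the voxel when the environment surface in
    that direction is closer to the camera than the voxel itself.
    A stored Max value is assumed to mean 'no depth recorded'."""
    return env_dist < max_value and voxel_dist > env_dist
```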
IsDiscard(1) can already achieve most scene masking effects, but in reality there are cases where full masking is not desired: for example, glass and other transparent environmental objects require translucent processing rather than simple elimination, so a 360-degree panoramic transparency information image is also needed. It is worth mentioning that this image is produced in drawing software as a copy of the depth image: the parts that need to be transparent are filled with different grayscale values according to the intensity of transparency, white being completely opaque, and the resolution can be reduced appropriately according to the actual situation. The resulting 360-degree panoramic environmental transparency information image is shown in Figure 5.
We can use it to calculate the opacity value Alpha of 3D objects to achieve a superposition effect with the 360-degree panoramic environment. The data acquisition method for the 360-degree panoramic transparency information image is the same as for the depth information, where A(UV_xi, UV_yi) is the opacity value of the corresponding point on the transparency information image, and A(UV_xi, UV_yi) = 1 means completely opaque. The parameter factor dynamically controls the transparency intensity, and the saturate function limits the value to the [0, 1] interval. At the same time, we need to redefine the elimination condition IsDiscard(2) on the basis of IsDiscard(1).
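The exact combination of the sampled opacity and factor is not reproduced here, so the Python sketch below assumes one plausible form in which factor scales the transparency contribution before saturate clamps the result; only the roles of A(UV_xi, UV_yi), factor, and saturate are taken from the text.

```python
def saturate(x):
    """HLSL-style saturate: clamp a value to the [0, 1] interval."""
    return max(0.0, min(1.0, x))

def blend_alpha(a_env, factor=1.0):
    """Sketch of the opacity computation for a voxel behind glass.

    a_env is the value sampled from the transparency information image
    (1.0 = fully opaque environment). factor scales the transparency
    intensity. The combination below is an assumed plausible form, not
    the paper's exact formula.
    """
    return saturate(1.0 - (1.0 - a_env) * factor)
```

With this form, a fully opaque environment pixel (a_env = 1) always yields Alpha = 1 regardless of factor, while more transparent pixels let more of the 3D object show through as factor grows.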

Experiment
In order to verify the superiority of the proposed scheme, we conducted three groups of control experiments on the virtual scene in Unity Engine 2020.3.11f1c1: OPDI, OEM, and PM. In the experiments, the Game panel was rendered at a resolution of 1920 px × 1080 px, and an indoor environment (model from the Unity Asset Store) was selected as the scene; the masking effect of the three schemes was tested with a rectangular cuboid as the 3D object. For clearer observation, the display type of the Scene panel in Figures 6-8 is Sprite Mask. Through the Hierarchy panel and the Scene panel, we can clearly see the differences among these schemes in terms of model requirements: OPDI does not use any environment model, while OEM's model is compressed to about 75% of the PM model. The Game panel on the right shows the occlusion effect under the different schemes, with almost no visual difference. OPDI also reproduces the translucent effect almost realistically, as shown in Figure 9.
With the help of the Unity Profiler panel, the performance consumed by the three schemes was statistically analyzed. Figures 10 and 11 and Table 1 show the differences in CPU usage (Rendering), GPU usage (Camera.Render), and memory usage among the schemes over a period of runtime. The runtime configuration is shown in Table 2.
By comparison, it can be found that in both CPU usage (Rendering) and GPU usage (Camera.Render), OPDI takes the shortest processing time, and its memory usage is also the lowest, followed by OEM, with PM last.
Analyzing the data in Table 3 shows that triangles, vertices, and draw calls are all major factors affecting performance. Triangles and vertices represent the numbers of triangles and vertices to be drawn, and excessive per-fragment and per-vertex computation seriously affects GPU usage. Draw calls, on the other hand, represent the number of times the CPU submits data to the GPU and issues rendering commands, so they directly determine CPU usage. Therefore, reducing these three counts has a significant effect on improving performance. The depth-image-based occlusion scheme provided in this paper exploits exactly this property, greatly improving performance while preserving the sense of spatial realism. Moreover, the proposed scheme shows no significant change in the numbers of draw calls, triangles, and vertices, or in the memory required by the scene, across different environmental scenarios; compared with the relatively more complex pure-model native scene scheme, it can optimize performance substantially.

Conclusion
In this paper, taking the 3-DOF application scenario of indoor simulation as an example, a relatively novel performance optimization scheme is proposed that preserves spatial realism. A 360-degree panoramic image is used instead of a real 3D environment; the distance of the 360-degree panoramic environment relative to the camera is mapped to a 2D image through spherical coordinate conversion, and the data are optimized by bilinear sampling, exploiting the alignment properties of the image. The distance of each voxel of a 3D object relative to the camera is then compared with the corresponding information saved in the 360-degree panoramic depth image, and occluded voxels are eliminated to form the corresponding occlusion effect. Similarly, with the help of a 360-degree panoramic transparency information image, occlusion by glass and other such materials produces a translucent effect. Through the method proposed in this paper, both picture quality and spatial realism are ensured, and performance optimization is further achieved. According to the experimental results, the proposed scheme is clearly superior to current mainstream 3D scene optimization schemes in terms of both visual effect and performance. Moreover, the scheme is suitable not only for 360-degree panoramic images but also for 360-degree panoramic video, and therefore has broader application scenarios. For example, to solve the problem of insufficient performance and poor display quality when 360-degree panoramic games with a fixed viewpoint use high-quality rendering on traditional mobile platforms, 360-degree panoramic video can also be used to achieve some visual effects of scene view movement, which is of certain significance for the expansion of virtual reality technology.

Data Availability
The data used in this study are available, and the performance optimization scheme proposed in this paper can be applied in 3-DOF application scenarios. Part of the data can be obtained from the corresponding author upon request (zfq@mju.edu.cn).

Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the study or the publication of this paper.