Ray Tracing Acceleration Algorithm Based on FaceMap

Raster method and ray tracing are two important algorithms in computer graphics. The former, raster method, marked by high speed and efficiency, has the disadvantage of an unrealistic rendering effect. By contrast, ray tracing requires considerable computing resources and time despite its advantages of high fidelity and simple structure, thus only suitable for offline rendering. Given such limitations, a ray-tracing acceleration algorithm that combines the raster method and ray tracing is proposed in this study based on FaceMap, a two-dimensional data structure that stores surface distribution under point light or the view of a camera. Firstly, all triangular surfaces of the 3D model are rasterized by linear surface projection to form the surface distribution of a spherical panorama. Secondly, the structure of FaceMap is adopted to store the distribution information of all the surfaces in the scene. Thirdly, the data on the intersecting surfaces in this direction are collected by projecting the ray onto FaceMap in the ray-tracing process, thus reducing the intersection and propagation operations and improving the efficiency. Four object models including chessboard, grass patch, bust, and typewriter are selected for comparative rendering experiments. Our proposed method is used to compare with Rhino and VRay rendering software, respectively. The results show that our proposed method obtained better rendering effects within greatly reduced rendering time (© 2018Optical Society of America).


Introduction
Raster method and ray tracing are two branches of computer graphics. Raster method has the advantages of fast speed and high e ciency in scene rendering, but the disadvantage is that the rendering degree is far less than the reality; ray tracing has the advantages of high rendering reality and simple structure, and the disadvantage is that it consumes too much calculation, so it can only be used in o ine rendering.
Traditional ray-tracing algorithms adopt the Monte Carlo illumination model [1], which simulates the propagation of light in the real physical environment. After the conduct of a series of operations such as specular re ection, perspective refraction, and di use re ection, an image is nally generated. However, it is di cult to determine which surfaces the light may intersect within the propagation process due to the relatively complex rendered scenes. erefore, the main calculation in the raytracing process lies in the intersection operation of the ray and surface.
To narrow the range of surfaces requiring intersection operation, commonly used techniques are ray-tracing acceleration algorithms including the bounding box method, three-dimensional DDA (3D-DDA) algorithm, and octree division method: (i) Bounding box method: the basic idea of the bounding box is to use geometric objects of regular shapes to wrap each model in the scene, respectively. e adjacent bounding boxes are wrapped in larger boxes to form a tree structure [2]. e intersection operation starts from the root node of the tree and then to the subnodes. It will involve the subbounding boxes or the surfaces of inner models of the bounding box if there is an intersection point. Otherwise, the ray will not intersect with any surface of the inner models in the bounding box.
(ii) 3D-DDA algorithm: the low e ciency of ray intersection operation is mainly due to the uncertainty about the surfaces that may intersect and that are closest to the ray source. DDA algorithm evenly divides the space into three-dimensional grids, within which the corresponding surface is stored [3]. In the process of intersection operation, the ray starts from the grid where the ray source is located and successively intersects with the encountered surfaces inside other grids. If an intersection point appears, it will be the closest to the ray source. (iii) Octree division method: similar to the 3D-DDA algorithm, the octree division method divides the space into nonuniform grids [4]. A cube that can wrap the whole scene is divided into eight blocks, and the number of surfaces in each block is calculated. e block will be further divided into eight sub-blocks if the number is larger than the threshold; otherwise, the division stops. e above steps are repeated until the surface number of every sub-block is smaller than the threshold. e above three methods are all based on space dividing, which means dividing the space capable of wrapping the whole scene into different grids that contain some surfaces inside. When the ray propagates between different grids, the intersection operation is conducted between the ray and the surface within a grid.
Although these methods are efficient in scene rendering, they require additional calculation of ray propagation between grids and fail to determine whether there is an intersection between surfaces in a direction before the ray leaves the scene. Such drawbacks are particularly serious when there are numerous light sources in the scene [5]. e reason is that for each intersection on a surface, a shadow test line is required to be sent out to the light source for the purpose of examining whether it would be obscured by other surfaces.
It can be seen from the above analysis that both the raster method and ray tracing have their own advantages and disadvantages. A natural idea is to combine the advantages of the two algorithms to solve the aforementioned problems. As a remedy, we propose FaceMap, a two-dimensional data structure, to store surface distribution corresponding to the point light source or camera. It is by virtue of FaceMap that the surfaces intersected in a certain direction can be rapidly determined, and the efficiency of intersection operation can thus be improved. e rest of this paper is organized as follows: Section 2 and Section 3 introduce the raster method and the raytracing method, respectively. In Section 4, the ray-tracing acceleration algorithm based on FaceMap is illustrated in detail. Experimental results and analysis are presented in Section 5. Finally, conclusions are drawn in Section 6.

Raster Method
Raster rendering has the advantages of speed and efficiency. Because of its special rendering pipeline flow, raster rendering can be well integrated into the hardware (such as GPU). Besides, customized programming based on particular needs is made possible to obtain the desired picture effect. e pipeline flow mainly consists of vertex conversion, primitive assembly, rasterization, and interpolation coloring [6,7].

Vertex Conversion.
Each 3D rendering engine involves a space camera, whose projection result of a 3D scene is an image seen by a user. e camera itself is in the 3D space, capable of basic movement, rotation, and other different view transformations. e basic unit of the 3D model is a triangular surface with three vertices where the 3D coordinate data are stored in accordance with the world coordinate system. e data do not change when the model remains static. However, if the model coordinates cannot be calculated according to the world coordinate system due to changing the camera's angle of view, they should be converted to the camera's space. is is called vertex conversion. When the coordinate data of the model's vertices are transformed into the visual field of the camera, the basic projection transformation can be conducted. ere are two kinds of projection transformation, namely perspective projection and orthogonal projection. e former is similar to what is observed by human eyes, that is, an object is big when near and small when far, complying with the perspective principle. By comparison, the latter is similar to the scene when all models are compressed to a plane while retaining the original sizes, which is not in line with the perspective principle. erefore, raster rendering mainly focuses on perspective projection.

Perspective Projection.
e camera used in the raster method usually contains four parameters, namely viewing angle θ, near clipping plane distance n, far clipping plane distance f, and screen aspect ratio α, with the near clipping plane, serving as the projection plane, as shown in Figure 1.
For points that have been converted to the camera space, their 3D coordinates can be transformed accordingly by the projection transformation matrix M. At this point, the coordinates are not clipped in the screen space; the coordinates on the near projection plane can then be obtained. In addition, the transformation matrix can be regarded as the process of the space points being transformed into the near projection plane. Accordingly, the reverse process is to transform the points on the projection plane into the vectors in the camera space and thus is often used to solve the coordinates and rotation angles of camera. e transformation matrix M is expressed as follows: en, the points after perspective projection can be transferred to the screen coordinate system through the screen matrix S, and the matrix S is shown as follows:

Triangular Rasterization.
e smallest polygon in a twodimensional plane is a triangle, and any polygon can be viewed as a combination of several triangles. Hence, triangles are often used as basic structure for 3D models. For a triangle that has been transferred to screen coordinate system, its three edges and inside pixels need to be lled with progressive scanning being the frequently adopted method [8], as shown in Figure 2.
e pseudo-code of the process is described in Algorithm 1. e highest and lowest points of a triangle should not exceed the scope of the display screen and, if necessary, should be reduced to 0∼(height − 1). Speci cally, a triangle edge intersects with a certain line y if and only if one vertex of the edge is below y and the other vertex is not. Assume the two vertices of the edge are (x 1 , y 1 ) and (x 2 , y 2 ), respectively; then the equation that solves the x-coordinates of the vertices is expressed as follows: e two intersection points of the triangle edge and the line y can be obtained by the above method. e rasterization of the line is completed after lling the pixels between the two points. However, one drawback of triangle rasterization lies in the serious jagged edges of a triangle. A normal solution to such problem is to increase the sampling points in the edge pixels, as shown in Figure 3. e edge pixel is divided into several pixels for rasterization, and then, the average value of the pixel color is calculated, serving as the color of the edge pixel. is approach can greatly alleviate the jag e ect and meanwhile achieves low consumption since only edges require upsampling.
erefore, it can also be applied to mobile devices such as cell phones [9].
Given the triangular surfaces of the model, its neness can be regarded as equivalent to the number of triangles. However, a detail-emphasized model will signi cantly increase rendering time due to its large number of triangular surfaces.
erefore, existing improved algorithms use a small number of points to simulate a high-accuracy model. After tting surfaces by voxel rendering (as shown in Figure 4), the numbers of vertices and triangular surfaces are reduced, and the rendering of the model is accelerated [10].

Ray Tracing
Scene rendering of ray tracing mainly involves the process of sampling, reconstructing, synthesizing, and resampling the global illumination, for the virtual 3D space that can be viewed as the combination of objects and light sources. Unlike in the raster method, objects in ray tracing are not directly observed by cameras. Instead, certain light distribution forms in space after the light source and objects are illuminated by light, and what the camera captures is the re ected light as a result of the intersection of the ray and surface [11].
Ray tracing mainly computes the intersection of the point-based ray and the triangular surface in the space and then obtains u and v of the intersection point relative to the triangle vertices.
Assume the ray starts from point O, and the unit direction vector is D which intersects with the triangle (V 0 , V 1 , V 2 ) at point P. en, P can be viewed as translated from V 0 corresponding to V 1 and V 2 , the distance between P and O being t, as displayed in Figure 5.
e equation obtained is expressed as follows [12]: By rearranging the above equation and extracting t, u, and v as unknowns, the linear equations can be described as follows: Assume (5) can be rewritten as equation (6).
Next, according to Cramer's rule and the mixed product formula, t, u, and v are solved, as expressed as follows: To avoid repeated operations, assume R D × E 2 and Q T × E 1 , and thus, equation (7) is simpli ed as follows:

Ray-Tracing Acceleration Algorithm
Based on FaceMap e raster method is characterized by high speed and eciency but has the disadvantage of an unsatisfactory rendering e ect. In contrast, the ray-tracing method features in simple mechanism and high delity, but it only applies to o ine rendering due to its high computational complexity and time complexity. erefore, the idea of combining the advantages of the two algorithms becomes very natural and intuitive.

Basic Idea of FaceMap.
Conventional raster projection is a kind of linear plane projection, with linear variations of the distance between points relative to the origin. By contrast, the projection of FaceMap belongs to linear surface projection, with linear variations of angles between points relative to the origin. FaceMap consists of four parts as follows: linear surface projection, curve approximation by bisection method, lling of the curved triangle, and reverse solution of light vector. Figure 6 clearly shows that in terms of ordinary plane projection, the angle between point's narrows with the decrease in the distance between points and the screen edge. Consequently, serious deformation can occur at the screen edge, and the larger the eld angle is, the more serious the deformation becomes.
A typewriter model was extracted via Adobe After E ect 2015 (AE for short), and the conversions of the camera's eld angle are shown in Figure 7. To go into detail, the upper left eld angle was 50°, basically without deformation; the upper    right eld angle was 90°, with tensile deformation near the bottom corners; the bottom left eld angle was 120°, with particularly evident tensile deformation; the bottom right eld angle was 170°(because AE is unable to set the eld angle as 180°, or in other words, the eld angle in plane projection cannot reach 180°), with the whole typewriter model being stretched to a strip.
In comparison, the linear surface projection based on FaceMap is similar to real optical lens in everyday life, with a complete 360-degree visual eld. It can generate various image e ects including wide-angle, sheye, and ultra-wideangle. As shown in Figure 8, the upper left eld angle was 50°, basically without deformation (similar to that in plane projection); the upper right eld angle was 90°, with an overall bent deformation; the eld angles of the lower left and the lower right were 120°and 170°, respectively, both with the e ect closer to that obtained by sheye lens.
FaceMap is a two-dimensional data structure that stores the spherical panorama distribution of the scene. erefore, after mapping the object in the scene to FaceMap, a spherical panorama can be generated. Figure 9 presents the FaceMap schematic diagram of the interior of a typewriter model with spatial origin (0, 0, 0), the eld angle being 360°. Each point in the gure stored the surface data in that direction. A darker color of points indicated a larger number of surfaces in the direction, and a lighter color, a smaller number of surfaces.
In the ray-tracing process, for ray vectors emitted by the camera, surfaces that might intersect can be determined directly by FaceMap. In addition, the shadow test line facing the light source can also nd out whether there is any surface as an obstacle, requiring no light propagation operation.
In this study, the scene surface was projected onto FaceMap by the raster method so as to accelerate the generation. Instead of using ordinary linear plane projection whose maximum eld angle is approximately 180°, this paper adopted the linear surface projection which has a 360degree visual eld for the generation of FaceMap.
As shown in Figure 10, the central point of FaceMap was O. After projecting a triangular surface in the scene onto the sphere of FaceMap, a curved triangle ABC was formed, and the three corresponding edges, denoted as a, b, and c, were all curves. e detailed raster process is shown in Figure 11. e three vertices A, B, and C of the triangle were projected onto the two-dimensional structure of FaceMap by virtue of linear surface projection. en, the bisection method was used to approximate any curved edge of the triangle, and nally, the scanning line lling method was employed to ll the curved triangle.

Linear Surface Projection.
e BIT-VBF left-handed 3D coordinate system was adopted in this paper. e vertex coordinate transformation was viewed as converting points from the world coordinate system to the camera coordinate system, by virtue of the transformation matrix [13]. e position coordinates of any point in 3D space were represented by (x, y, z), while the world coordinate axes were described as axis X(1, 0, 0), axis Y(0, 1, 0), and axis Z(0, 0, 1) successively. Similarly, a camera also possessed position coordinates and three axes. e position of the camera was assumed as camPos, Mathematical Problems in Engineering with the three axes being camAX, camAY, and camAZ, respectively, and the axis coordinates would change according to the rotation of the camera. e angle changes relative to the screen center are linear in linear surface projection. In other words, if the distance between a point and the central origin is two times that of another point, then the angle between this point and the line of sight is also two times that of the other point. e field angle of a surface was denoted as θ, the center as camPos, and the three axes as camAX, camAY, and camAZ. For a certain point v in the space, its vector cv relative to the camera was calculated by Vector operation was expressed as cv � v − camPos.
e transformation matrix can operate a series of transformations of the model, such as translation, rotation, and scaling. Only one point in the space is needed to move and rotate, and then, all the vertices of the model experience coordinate transformation relative to that point, thereby resulting in the converted model. Besides, scaling could be realized merely by multiplying one factor in the matrix [14].  Mathematical Problems in Engineering Next, the screen radius was assumed to be R, so the distance t between point sv after mapping and the screen center was obtained by the following equation: Given the x-coordinate and y-coordinate of tv, the direction of its projection on the XOY plane of FaceMap could be determined. e distance between tv on the XOY plane and the central origin was denoted as Dist, and then, the ratios kx and ky of components in the OX and OY directions were obtained, as calculated by Dist At last, the sx − coordinate and sy − coordinate of point sv projected onto FaceMap were calculated by

Curve Approximation by the Bisection Method.
A curve can be seen as a collection of countless points, while a line in screen space is a collection of finite points. erefore, curves in a space can be approximated by calculating finite pixels. Given the method of perspective projection, the curve mapping method in screen space is described in Algorithm 2.

Filling of the Curved Triangle.
e curved triangle needs to be filled after all its points are obtained. Similar to the rasterization of regular triangles, the filling of the curved triangle also employed the progressive scanning method is described in Algorithm 3. e number of intersections per row, though might more than two due to the curved triangle edges, could only be even, so the filling was conducted in pairs.
For a curved triangle, the intersection of each line and the curved edge is described in Algorithm 4.
is method is capable of conflict detection and the recognition of multipoints on the same line, thus avoiding the occurrence of odd number of intersection points on a line.

Reverse Solution of the Light Vector.
In the reverse ray tracing, each pixel on the screen needs a reverse solution of the ray vector.
By the nature of surface projection, the distance from a point to the screen center is linearly correlated with the angle between the vector corresponding to the point and the sight axis camAZ. Accordingly, for a point (sx, sy) on the screen, the radius of the screen was assumed as R, and the field angle as θ. en, the relation between the coordinates of the point and those of the screen center was dx � sx − R and dy � R − sy. e angle β between its vector and the sight axis was calculated by θ.
After the normalization of both the ray vector and sight axis camAZ, its module became 1. e characteristics of the vector inner product indicated that the axis Z's component Lz of the ray vector could be calculated by Lz � cos(β).
(18) e module of the normalized ray vector was 1, as mentioned above. Next, the components of axis X and axis Y of the ray vector were assumed as Lx and Ly, respectively. According to the calculation of vector modulus, (19) could be expressed as ������������� � Lx 2 + Ly 2 + Lz 2 � 1. (

19)
With the ratio of Lx to Ly being denoted as k, (20) could be derived based on projection characteristics as en, (19) was embedded into (20) to form (21), described as In summary, the ray vector L(Lx, Ly, Lz) was calculated by Nonetheless, this calculated ray vector L was still within FaceMap space, thus needed to be converted to a world coordinate system. erefore, the ray vector of the world coordinate system was denoted as T, and (23) In accordance with Cramer's rule, only one solution exists when D is greater than 0; innumerable solutions exist when D equals 0; no solution exists when D is less than 0. Geometrically, (23) is similar to the intersection operation of three planes, with the normal vectors of the three planes being camAX, camAY, and camAZ, respectively, serving as three sight axes of FaceMap. erefore, the three normal vectors would certainly intersect at a point. Moreover, since the sight axes were normalized unit vectors with modules of 1 and the value of D was also calculated as 1, (25) was simplified as

Renderer Design Based on FaceMap Algorithm.
FaceMap, being abstract in the program, can be inherited by other components to obtain the panorama distribution of the scene model. en, FaceMap can be used or modified Input: A curve information (1) Plot function of space curve (Input the two vertices of the curve): (1) Project the two vertices onto the screen space as two endpoints (2) Add the first endpoint (3) if the distance between the two endpoints is longer than one pixel: (i) Find the midpoint of the two vertices (ii) Curve approximation function (the first vertex and the midpoint) (iii) Add the endpoint after the midpoint projection (iv) Curve approximation function (the midpoint and the second vertex) (4) else Add the second endpoint (2) Curve approximation function (input the two 3D vertices): (1) if the distance between endpoints after vertex projection is longer than one pixel: (i) Find the midpoint of the two vertices (ii) Curve approximation function (the first vertex and the midpoint) (iii) Add the endpoint after the midpoint projection (iv) Curve approximation function (the midpoint and the second vertex) (2) else null Output: Results after mapping ALGORITHM 2: Screen space curve mapping algorithm.

Input: A triangle vertex information
Traverse all points on a curve: if the current point is online y and the next point is below line y: Add the x-coordinate of the current point else null if the next point is online y and the current point is below line y: Add the x-coordinate of the next point else null Output: Intersection information ALGORITHM 4: Algorithm for calculating the intersection of each line and curve edge of curve triangle. according to specific needs. e structure of Renderer based on the FaceMap algorithm is shown in Figure 12: (1) Vector class contains attribute coordinate values x, y, and z and basic vector operations such as addition, subtraction, point multiplication, and cross multiplication (2) Face class is used for storing basic triangular surfaces, involving vertex arrays with a length of 3 and normal arrays with a length of 3 (3) OBJ class stores model data and helps to load Obj model files which are then converted to face object arrays (4) Space vector class represents a basic space point, with attributes including space position, rotation angle, and self-space coordinate axis (5) FaceMap class, deriving from space vector, is also abstract. However, compared with space vector, FaceMap contains additional attributes including field angle and field radius (6) Camera inherits FaceMap. While generating Face-Map structure, it only stores the information of surfaces closest to itself, and then, the closest intersection point can be directly calculated during rendering (7) Point Light also inherits FaceMap. While generating the FaceMap structure, it stores the distribution data of all surfaces. Furthermore, the point light class provides the method of get_lighten_up (). It judges by virtue of its FaceMap whether an entered space point will be blocked by other surfaces, namely whether the point is in shadow (8) Color class is used for color operation and contains the values of three components R, G, and B. While rendering a certain point, shadow test of multiple light sources should be performed. If can be irradiated, the value of illumination color should be accumulated. Color class provides several basic methods including addition, subtraction, numerical multiplication, and conversion to 24 bit color (9) Image class is a two-dimensional image buffer, storing scene graphs for ray-tracing rendering; [10] Renderer class controls the whole Renderer, which contains OBJ model, Camera, Point Light array, and Image buffer.

Selection of the Rendering Model in Comparative
Experiment. A total of four object rendering scenes was used in the comparative experiment, with detailed data of the scene as shown in Table 1.

Experimental
Results and Analysis. Rhino 5, VRay 2.00.02, and Renderer presented in this paper were employed successively to render the four models, and the corresponding effects are shown in Figures 13-16, mainly from the rendering effect for comparison. Judging from the rendering effects as reflected in Figures 13-16, the effects achieved by the proposed Renderer were similar to those by Rhino and VRay, with main differences lying in the deformation at the edge of the visual field and the brightness. Firstly, in terms of the deformation at the field edge, since surface projection was used in the proposed Renderer, the reserve resolution of the ray vector was also based on surface distribution. As shown from the effects of rendering chessboard in Figure 13 and typewriter in Figure 16, lines close to the camera were slightly bent when using Renderer, while they were straight in the results obtained by Rhino and VRay. Consequently, the rendering effects by the proposed Renderer were closer to those by a real camera. e second inference consisted of the brightness of illumination. Distinct illumination models used by different renderers resulted in diverse calculations of brightness and propagation attenuation of the light source, thus causing differences in brightness and contrast degree of the whole image. Nonetheless, the difference in brightness only led to various displaying effects, with a negligible impact on computational cost. e specific time consumption of each rendering shown in Figures 13-16 is listed in Table 2. Table 2 demonstrates that the proposed Renderer spent the least time in total, less than half the total time spent by either of the other two renderers. e computational efficiency of Renderer was 2.34 times that of Rhino and 2.15 times that of VRay 2.0. It is worth noting that the rendering by Renderer was completed in a single-thread environment, indicating an even lower CPU resources occupancy and a higher speed, while Rhino and VRay 2.0 used multithread rendering.
In summary, in this study, the proposed FaceMap-based ray-tracing algorithm, Rhino, and VRaywere successively used to render four object models including chessboard, grass patch, bust, and typewriter. e comparison in terms of rendering effect and rendering time indicated that the  Figure 12: Renderer structure based on FaceMap algorithm. Each scene was rendered in white mode with the same camera view, and eight point light sources were added to the scene. 14 Mathematical Problems in Engineering proposed method achieved relatively better rendering effects and significantly reduced the rendering time in the meanwhile.

Conclusion
is study intends to solve the shortcomings of the poor rendering effect of the raster method and the high computational complexity caused by the ray-tracing method. We propose a ray-tracing acceleration algorithm based on FaceMap by combining the raster method and ray-tracing method. FaceMap can be defined as a data structure that stores the surface distribution corresponding to a point light source or a camera. With the help of FaceMap, one can quickly determine the surfaces that may intersect in a certain direction, which can help improve the efficiency of intersection operation and thus speed up the overall rendering task and reduce computational complexity. e novelty of this study and the advantages of the proposed method can be described as follows: (i) it can directly determine the data of a surface in a certain direction, requiring no calculation of light propagation process; (ii) the camera can rapidly determine intersection points based on FaceMap, which in turn greatly improved the speed of rendering operation; (iii) once generated, the FaceMap corresponding to a specific point light source can be reutilized, thus further reduces computational complexity.
To sum up, based on the theory of rendering acceleration and optimization, this paper proposes a new structure representation and corresponding projection algorithm, which provides a new research angle and an efficient technology for the actual rendering requirements and has a promising application prospect in the field of computer graphics. In spite of what stated, the present study is not without limitations, which can be addressed in future research: (i) FaceMap consumes relatively large memory and considers compression as a feasible solution; (ii) the texture and the material of rendered object have not yet been considered by the current model; (iii) the shadow processing caused by light conditions also needs to be further improved.

Data Availability
e data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest
e authors declare that they have no conflicts of interest regarding the publication of this paper.