Foggy Scene Rendering Based on Transmission Map Estimation

Realistic rendering of foggy scenes is important in game development and virtual reality. Traditional methods have many parameters to control or require a long time to compute, and they are usually limited to depicting homogeneous fog without considering scenes with heterogeneous fog. In this paper, a new rendering method based on transmission map estimation is proposed. We first generate a Perlin noise image as the density distribution texture of heterogeneous fog. Then we estimate the transmission map using the Markov random field (MRF) model and the bilateral filter. Finally, the virtual foggy scene is realistically rendered with the generated Perlin noise image and the transmission map according to the atmospheric scattering model. Experimental results show that the rendered results of our approach are quite satisfactory.


Introduction
Visualization of atmospheric phenomena is an important use of computer graphics, and among these phenomena, fog is one of the most common in game scenes and virtual reality. Realistic fog greatly improves the realism of virtual scenes and makes the generated scenes more attractive. Thus, simulated fog has many applications in computer games, virtual reality, special effects in movies and TV, and so forth.
Real-time simulation is usually limited to depicting homogeneous fog with a constant density distribution. However, in the natural world, the phenomenon is more complex. Fog is a participating medium made of many nonhomogeneous layers moving with the wind and undergoing turbulent motion. Thus, this paper discusses the problem of fog simulation and visualization, with special emphasis on generating heterogeneous fog in real time using an image processing method.
The organization of this paper is as follows. We begin by reviewing existing works on foggy scene simulation. In Section 3 we introduce the atmospheric scattering model and the Perlin noise. In Section 4 we propose the foggy scene rendering algorithm based on the estimated transmission map. In Section 5 we present some experimental results. Finally, in Section 6 we give some concluding remarks.

Related Works
There are several related works concerning the simulation of foggy scenes. These existing methods can be divided into two categories: software-based methods and model-based methods. The software-based methods generate rendering effects using software tools such as Photoshop, 3ds max, and OpenGL. However, although these tools can convert an input no-fog image to a virtual foggy image, the whole procedure has many steps and each step needs user involvement, which makes it hard to control and demands considerable skill.
A more feasible and effective rendering method is the model-based method. Modeling the physical mechanism of fog makes it possible to simulate foggy conditions. Max [3] introduced a single scattering model for light diffusion to generate haze in the atmosphere. Yamamoto et al. [4] proposed a fast rendering method for atmospheric scattering effects by using graphics hardware. Jackèl and Walter [5] simulated the Mie Scattering in 1997, and Nishita et al. [6] used an improved sky illumination to render the fog scene. But this approach of calculating the scattering effect for air particles is quite time-consuming. Sun et al. [1] proposed a practical analytic method to render the single scattering in fog, mist, and haze in real-time. Wang [8,9] is applied to solve the foggy scene simulation problem. Zdrojewska [10] used the Koschmieder model (also called atmospheric scattering model) to synthesize the fog phenomenon by using hardware acceleration in the form of a GPU. Dong et al. [11] proposed an algorithm to simulate the foggy scene for a real image. The method first segments the input image into several semantic regions and then estimates depths for these regions and synthesizes virtual fog using the Koschmieder model.
For the rendering of foggy image based on the Koschmieder model, the most important step is to estimate the transmission map of the model. Besides the color, texture, and geometric position used in Dong's method [11], there are many other features that can be taken as depth cues for the transmission map estimation. These features include focus/defocus, motion parallax, relative height/size, and texture gradient. For example, Hoiem et al. [12] generated maps using machine learning algorithms. The transmission values of scene and object are assigned according to the trained classes. Park and Kim [13] determined the distance from the focal plane by using the blurriness of low depth-of-field in optical physics. Jung et al. [14] assigned transmission maps by using edge information and prior depth knowledge. Han and Hong [15] generated the transmission map employing both vanishing points and super pixels as geometric and texture cues. However, the above methods are unreliable when the selected cues are weak in an input image. Yang et al. [16] generated a depth map using the local depth hypothesis based on the structural information of the input image and salient regions, but user interaction is required for this method.
Generating a transmission map from a single image is an ill-posed problem. Not all the depth cues can be retrieved from an image. To overcome this challenge, we present a novel method to automatically generate a pseudo transmission map which reflects the depth information in the scene of the input image. Based on the estimated map, a virtual foggy scene can be simulated for both game scenes and natural images. To improve the realism of the simulated phenomenon, Perlin noise is also introduced in our method to generate heterogeneous fog. Experimental results indicate that the proposed algorithm can generate realistic rendered results.

Physical Characteristics of Foggy Scenes
To render the foggy scene, we first analyze the physical characteristics of a foggy image, including the model that describes the formation of a foggy image and the noise used for generating heterogeneous fog.

Atmospheric Scattering Model.
Light passing through a scattering medium is attenuated and distributed to other directions. This can happen anywhere along the path and leads to a combination of radiances incident towards the camera, as shown in Figure 1. The atmospheric scattering model widely used to describe the formation of the foggy image is as follows [8,9]:

$$I(\mathbf{x}) = J(\mathbf{x})\,t(\mathbf{x}) + A\,\bigl(1 - t(\mathbf{x})\bigr), \qquad (1)$$

where this equation is defined on the three RGB color channels. $I$ stands for the observed image, $A$ is the airlight color vector, and $J$ is the surface radiance vector at the intersection point of the scene and the real-world ray corresponding to the pixel $\mathbf{x} = (x, y)$. $t(\mathbf{x})$ is called the transmission map and expresses the relative portion of light that manages to survive the entire path between the observer and a surface point in the scene without being scattered. Theoretically, the goal of fog scene rendering is to calculate $I$ from the no-fog image $J$, the estimated transmission $t$, and the airlight $A$.

Perlin Noise for Generating Heterogeneous Fog.
Natural phenomena usually do not change in regular ways but are characterized by a large degree of randomness. Such a feature is also present in fog, which is a volume object of variable density, taking irregular shapes due to wind and air turbulence. Because of that, a random noise function seems like a good candidate to help simulate it. The argument of the function is the two- or three-dimensional coordinates of a point in space, and the result is a pseudorandom value of the fog's density at these coordinates. Noise generated by such a function has a high frequency and hence displays rapid changes between contrasting values, which is not typical for fog density distribution. Therefore, it is better modeled with smooth noise created by interpolating a sequence of random value samples. The method of generating such noise was proposed by Perlin in [17]. The noise function is characterized by two properties: amplitude and frequency. As seen in Figure 2(a), amplitude is the difference between the maximum and minimum noise values. Frequency is the reciprocal of the distance between the noise samples. The main ideas behind simulating the fog phenomenon are to adjust the atmospheric scattering coefficient of the transmission map in the atmospheric scattering model and to add several noises with various frequencies and amplitudes together using Perlin's turbulence texture (see Figure 2(b)).
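The idea of summing smooth noise layers of different frequencies and amplitudes can be sketched as follows. This is a minimal illustration, not the implementation used in the paper: it uses interpolated value noise (random lattice samples blended with a smoothstep curve) rather than Perlin's gradient noise, and the function names and the parameters `octaves`, `base_cells`, and `persistence` are hypothetical choices made for this example.

```python
import numpy as np


def smooth_noise(height, width, cells, rng):
    """One noise layer: random lattice samples, smoothly interpolated.

    `cells` is the number of lattice cells per axis, so a larger value
    means a higher frequency (smaller distance between samples).
    """
    lattice = rng.random((cells + 1, cells + 1))
    ys = np.linspace(0, cells, height, endpoint=False)
    xs = np.linspace(0, cells, width, endpoint=False)
    y0, x0 = ys.astype(int), xs.astype(int)
    fy, fx = ys - y0, xs - x0
    # Smoothstep fade so the interpolation has no visible lattice creases.
    fy = (fy * fy * (3 - 2 * fy))[:, None]
    fx = (fx * fx * (3 - 2 * fx))[None, :]
    top = lattice[y0][:, x0] * (1 - fx) + lattice[y0][:, x0 + 1] * fx
    bottom = lattice[y0 + 1][:, x0] * (1 - fx) + lattice[y0 + 1][:, x0 + 1] * fx
    return top * (1 - fy) + bottom * fy


def turbulence(height, width, octaves=4, base_cells=4, persistence=0.5, seed=0):
    """Sum several layers; each octave doubles frequency and scales amplitude."""
    rng = np.random.default_rng(seed)
    total = np.zeros((height, width))
    norm, amplitude, cells = 0.0, 1.0, base_cells
    for _ in range(octaves):
        total += amplitude * smooth_noise(height, width, cells, rng)
        norm += amplitude
        amplitude *= persistence
        cells *= 2
    return total / norm  # normalized to the range [0, 1]


fog_density = turbulence(120, 160, octaves=5, seed=42)
```

The low-frequency octaves give the large, slowly varying fog banks, while the high-frequency octaves with small amplitude add fine irregularities on top.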

Algorithm Flowchart.
Specifically, the proposed algorithm has three steps to render a foggy scene. The first step generates a Perlin noise image as the density distribution texture of heterogeneous fog. The second step computes the transmission map with the Markov random field (MRF) model and the bilateral filter. The goal of this step is to assign an accurate label to each pixel using the graph-cut based $\alpha$-expansion and to remove the redundant details using the bilateral filter. Finally, with the generated Perlin noise image and transmission map, the foggy scene can be rendered according to the atmospheric scattering model. The flowchart of the proposed method is depicted in Figure 3.

Transmission Map Estimation.
The estimation of the transmission map is the most important step for foggy scene rendering and consists of image segmentation, initial map estimation based on MRF, and refined map estimation using the bilateral filter.

Image Segmentation.
The transmission map describes the portion of the light that is not scattered and reaches the camera. Since the map is a continuous function of depth, it reflects the depth information in the scene. Koenderink [18] experimentally measured the human ability to infer depth from an image and showed that people cannot determine the relative depth of two points unless there is some visible and monotonic surface that connects them. Therefore, an image segmentation technique is used here for estimating the transmission map and enhancing our knowledge of the image structure. The advantage of this technique is that it can often group large homogeneous regions of the image together while dividing heterogeneous regions into many smaller parts.
The mean-shift (MS) algorithm, a classical image segmentation technique, is a robust feature-space analysis approach. It can significantly reduce the number of basic image entities, and due to its good discontinuity-preserving filtering characteristic, the salient features of the overall image are retained. Besides, it is particularly useful for partitioning images in which only several distinct regions are needed to represent different scenes such as sky, mountain, building, lake, and animal, whereas other information within a region is often less important and can be neglected [19]. All these features are very useful for acquiring the relative depth information of scene objects. Thus, the MS method is used as the first step to estimate the transmission map. Figure 4 shows an example of an image segmentation result obtained by the MS method.

Initial Map Estimation Using MRF.
After obtaining the segmentation result, we use the graph-cut based $\alpha$-expansion method to estimate the map $t(\mathbf{x})$, as it is able to handle regularization and optimization problems and has a good track record with vision-specific energy functions [20]. Specifically, each element $t_i$ of the transmission map is associated with a label $x_i$, where the set of labels $L = \{0, 1, 2, \ldots, l\}$ represents the transmission values $\{0, 1/l, 2/l, \ldots, 1\}$. Before labeling, we first convert the input RGB image to a gray-level image. The number of labels is 32, since the labeling unit of the pixel value is set to 8 and $l = 31$. The most probable labeling $x^{*}$ minimizes the associated energy function:

$$E(x) = \sum_{i \in \mathcal{P}} E_i(x_i) + \sum_{(i,j) \in \mathcal{N}} E_{ij}(x_i, x_j), \qquad (2)$$

where $\mathcal{P}$ is the set of pixels in the unknown transmission $t$ and $\mathcal{N}$ is the set of pairs of pixels defined over the standard four-connected neighborhood. The unary function $E_i(x_i)$ is the data term representing the possibility of pixel $i$ having transmission $t_i$ associated with label $x_i$. The smooth term $E_{ij}(x_i, x_j)$ encodes the possibility that neighboring pixels should have similar depth.
For the data function $E_i(x_i)$, which represents the possibility of pixel $i$ having transmission $t_i$ associated with label $x_i$, we first convert the input RGB image to a gray-level image and then compute the absolute difference between each pixel value and the label value. The process can be written as

$$E_i(x_i) = \bigl|\,\omega\, I'_i - L(x_i)\,\bigr|. \qquad (3)$$

In (3), $I'_i$ is the intensity of pixel $i$ in the gray-level image ($0 \le I'_i \le 1$). $L(x_i)$ is the element of the set of labels $\{0, 1/l, 2/l, \ldots, 1\}$ selected by $x_i$. The parameter $\omega$ is introduced to ensure that $I'_i$ and $L(x_i)$ have the same order of magnitude.
The smooth function $E_{ij}(x_i, x_j)$ encodes the possibility that neighboring pixels should have similar depth. Inspired by the work [21], we use the linear cost function, which is solved by $\alpha$-expansion:

$$E_{ij}(x_i, x_j) = g\,|x_i - x_j|. \qquad (4)$$

According to our daily experience, objects which appear closer to the top of the image are usually further away. Thus, if we consider two pixels $i$ and $j$, where $j$ is directly above $i$, we have $d_j > d_i$. We can therefore deduce that the transmission of pixel $j$ must be less than or equal to the transmission of pixel $i$, that is, $x_j \le x_i$. For any pair of labels which violates this trend, a cost $c > 0$ can be assigned to punish this pattern. Thus, the smooth function in (4) can be written as

$$E_{ij}(x_i, x_j) = \begin{cases} c, & x_i < x_j, \\ g\,|x_i - x_j|, & \text{otherwise}. \end{cases} \qquad (5)$$

The parameters $g$ and $c$ are used to control the aspect of the rendering effect. The value of $g$ controls the strength of the image details display and is usually set to 0.01. The cost $c$ controls the strength of the image color display and is usually set to 100. Besides, the weights associated with the graph edges should be determined. If the difference between two neighboring pixel values in the input image is less than 15, the two pixels have a high possibility of having the same transmission value. Thus, the cost of the labeling is increased by 15× in this case to minimize the artifacts due to depth discontinuities.
Substituting the data function and the smooth function into the energy function (2), the pixel labels of the transmission map can be estimated by using the graph-cut based $\alpha$-expansion. In our method, the gco-v3.0 library [22] developed by O. Veksler and A. Delong is adopted for optimizing multilabel energies via $\alpha$-expansion. It supports energies with any combination of unary, pairwise, and label cost terms [20,23]. Thus, we use the library to estimate each pixel label in the initial transmission map. The pseudocode of the estimation process using the gco-v3.0 library is presented in Pseudocode 1.
In Pseudocode 1, $H$ and $W$ are the height and width of the input image. By using the functions (e.g., GCO_SetDataCost, GCO_SetSmoothCost, and GCO_GetLabeling) defined in the optimization library, we can obtain each pixel label $x_i$. Then, a proper intensity value of the initial transmission map can be assigned to each pixel according to its label.
In Figure 5, we show a synthetic example where the image consists of 5 gray-level regions. The image can be accurately segmented into 5 label regions by using the proposed MRF method, and the 5 labels are represented by 5 intensity values, as shown in Figure 5.
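To make the structure of the labeling energy concrete, the sketch below builds a data term and a pairwise term of the kind described above and minimizes them on a tiny image. Several points are assumptions made only for this illustration: the optimizer is iterated conditional modes (ICM), a much simpler local method swapped in for graph-cut $\alpha$-expansion; the gco-v3.0 library is not used; the data term is taken as the absolute difference between the scaled gray level and the label value; the ordering penalty is applied to vertical neighbor pairs and the plain linear cost to horizontal pairs; and the 15× edge weighting is omitted. All function names are hypothetical.

```python
import numpy as np


def data_cost(gray, n_labels=32, omega=1.0):
    """Unary cost |omega * I - L(x)| for every pixel and label: (H, W, n_labels)."""
    levels = np.arange(n_labels) / (n_labels - 1)
    return np.abs(omega * gray[..., None] - levels)


def pair_cost(x_lower, x_upper, g=0.01, c=100.0):
    """Vertical pair cost: penalty c if the upper pixel has the larger label."""
    return np.where(x_lower < x_upper, c, g * np.abs(x_lower - x_upper))


def energy(labels, D, g=0.01, c=100.0):
    """Total energy: data term + horizontal linear term + vertical ordered term."""
    e = np.take_along_axis(D, labels[..., None], axis=2).sum()
    e += g * np.abs(labels[:, 1:] - labels[:, :-1]).sum()
    e += pair_cost(labels[1:], labels[:-1], g, c).sum()
    return float(e)


def icm(D, g=0.01, c=100.0, sweeps=5):
    """Iterated conditional modes (NOT alpha-expansion): greedy per-pixel updates."""
    H, W, K = D.shape
    labels = D.argmin(axis=2)
    cand = np.arange(K)
    for _ in range(sweeps):
        for r in range(H):
            for col in range(W):
                cost = D[r, col].copy()
                if col > 0:
                    cost += g * np.abs(cand - labels[r, col - 1])
                if col < W - 1:
                    cost += g * np.abs(cand - labels[r, col + 1])
                if r > 0:  # neighbor above: the current pixel is the lower one
                    cost += pair_cost(cand, labels[r - 1, col], g, c)
                if r < H - 1:  # neighbor below: the current pixel is the upper one
                    cost += pair_cost(labels[r + 1, col], cand, g, c)
                labels[r, col] = cost.argmin()
    return labels


# Synthetic image with 5 horizontal gray-level bands, darker (farther) at the top.
levels = np.arange(32) / 31
bands = np.repeat(levels[[0, 8, 16, 24, 31]], 2)
image = np.tile(bands[:, None], (1, 4))
labels = icm(data_cost(image))
t_ini = labels / 31.0  # map each label back to a transmission value in [0, 1]
```

ICM only finds a local minimum, so on real images it would generally give a worse labeling than $\alpha$-expansion; it is used here because it needs no external library and keeps the energy terms easy to read.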
The MRF-based algorithm can also be applied for estimating the initial transmission map for real-world images. An illustrative example is shown in Figure 6. Figure 6(b) shows the initial transmission map estimated using the algorithm presented above; its corresponding rendered result is shown in Figure 6(c). One can clearly see that the fog-like feeling given by the simulated foggy image is not very strong due to the redundant image details.

Refined Map Estimation Using Bilateral Filter.
As shown in Figure 6, there is an obvious deficiency in the rendered image at the discontinuities of the transmission map obtained by the MRF model. For example, the trees or mountains should have the same depth values. However, as shown in Figure 6(b), one can clearly see different intensity layers of trees or mountains in the transmission map estimated by the MRF-based algorithm. In order to handle these discontinuities, we apply the bilateral filter, since it can smooth images while preserving edges [24]. Thus, the redundant details of the transmission map $t_{\mathrm{ini}}$ estimated by the algorithm presented above can be effectively removed, which gives the rendered result a better fog-like feeling. This process can be written as

$$t(\mathbf{u}) = \frac{\sum_{\mathbf{p} \in N(\mathbf{u})} W_c(\|\mathbf{p} - \mathbf{u}\|)\, W_s(|t_{\mathrm{ini}}(\mathbf{u}) - t_{\mathrm{ini}}(\mathbf{p})|)\, t_{\mathrm{ini}}(\mathbf{p})}{\sum_{\mathbf{p} \in N(\mathbf{u})} W_c(\|\mathbf{p} - \mathbf{u}\|)\, W_s(|t_{\mathrm{ini}}(\mathbf{u}) - t_{\mathrm{ini}}(\mathbf{p})|)},$$

where $t_{\mathrm{ini}}(\mathbf{u})$ is the initial transmission map corresponding to the pixel $\mathbf{u} = (x, y)$ and $N(\mathbf{u})$ is the neighborhood of $\mathbf{u}$. The spatial domain similarity function $W_c(x)$ is a Gaussian filter with the standard deviation $\sigma_c$: $W_c(x) = e^{-x^2/2\sigma_c^2}$, and the intensity similarity function $W_s(x)$ is a Gaussian filter with the standard deviation $\sigma_s$; it can be defined as $W_s(x) = e^{-x^2/2\sigma_s^2}$. In our experiments, the values of $\sigma_c$ and $\sigma_s$ are set to 3 and 0.4, respectively. Thus, we can obtain the final refined transmission map, as shown in Figure 6(d); Figure 6(e) is the rendered result obtained using the refined map. From Figure 6(e), one can see that the rendered result obtained using the bilateral filter has a better fog-like feeling compared with the result (see Figure 6(c)) obtained without using the filter. However, the refined transmission map is in fact a pseudo depth map instead of a recovery of real depth information. It reflects the relative positions between scene objects and their neighboring regions. In our experiments we find that the pseudo depth map can produce realistic foggy scenes without creating significant errors in the rendered image.
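The refinement step can be illustrated with a brute-force bilateral filter. This is a minimal sketch under stated assumptions: the window radius (two spatial standard deviations) and the edge-replicating border handling are choices made for this example and are not specified in the text, and the function name is hypothetical. The default standard deviations follow the values 3 and 0.4 reported above.

```python
import numpy as np


def bilateral_filter(t_ini, sigma_c=3.0, sigma_s=0.4, radius=None):
    """Edge-preserving smoothing of an initial transmission map in [0, 1]."""
    t_ini = np.asarray(t_ini, dtype=float)
    if radius is None:
        radius = int(np.ceil(2 * sigma_c))  # assumed window size
    H, W = t_ini.shape
    padded = np.pad(t_ini, radius, mode="edge")  # assumed border handling
    num = np.zeros_like(t_ini)
    den = np.zeros_like(t_ini)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            # Neighbor value at offset (dy, dx) for every pixel at once.
            shifted = padded[radius + dy:radius + dy + H,
                             radius + dx:radius + dx + W]
            w_space = np.exp(-(dx * dx + dy * dy) / (2 * sigma_c ** 2))
            w_range = np.exp(-((shifted - t_ini) ** 2) / (2 * sigma_s ** 2))
            num += w_space * w_range * shifted
            den += w_space * w_range
    return num / den


# A noisy two-level map: the filter smooths the noise but keeps the step.
rng = np.random.default_rng(0)
step = np.zeros((16, 16))
step[:, 8:] = 1.0
noisy = np.clip(step + 0.05 * rng.standard_normal(step.shape), 0.0, 1.0)
refined = bilateral_filter(noisy, sigma_c=3.0, sigma_s=0.1)
```

A smaller intensity standard deviation preserves weaker edges, while a larger one lets the filter behave more like an ordinary Gaussian blur.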

Fog Scene Synthesis.
Since we now know the input no-fog image $J(\mathbf{x})$ and the final refined transmission map $t(\mathbf{x})$, and the airlight $A$ is set to 255, we can obtain the simulated foggy image $I(\mathbf{x})$ according to the atmospheric scattering model (see (1)). The simulation result $I(\mathbf{x})$ is calculated by

$$I(\mathbf{x}) = J(\mathbf{x})\,\max\bigl(t(\mathbf{x}), t_0\bigr)^{k} + A\,\Bigl(1 - \max\bigl(t(\mathbf{x}), t_0\bigr)^{k}\Bigr). \qquad (8)$$

In (8), the value of $t_0$ is set to 0.1, and the transmission map $t(\mathbf{x})$ combines the geometric distance $d(\mathbf{x})$ and the medium extinction coefficient $\beta$ (the net loss from scattering and absorption) into a single variable [25]:

$$t(\mathbf{x}) = e^{-\beta d(\mathbf{x})}. \qquad (9)$$

Experimental results show that simulated foggy scenes with different fog densities can be created by adjusting the coefficient $\beta$ with the parameter $k$ in (8), that is, $t^{k} = (e^{-\beta d})^{k} = e^{-(k\beta)d}$. However, using the atmospheric scattering model on the input no-fog image, we can only generate a rendered result with homogeneous fog added, and fog is not always perfectly homogeneous in real situations. Therefore, Perlin noise is also introduced in our method to generate heterogeneous fog, which seems more natural to human visual perception. In this process, given by (10), the rendered result obtained by (8) is combined with Perlin's noise to produce the simulated foggy scene with heterogeneous fog. A weighting parameter is used to control the appearance of Perlin's turbulence texture in our rendered result. We fixed it to 0.15 for all the results reported in this paper. An illustrative example is shown in Figure 7. In this figure, Figure 7(a) is the input no-fog image, Figure 7(b) is the transmission map estimated by the proposed method, Figure 7(c) is the simulated foggy image with uniform fog, and Figure 7(d) is the rendered heterogeneous fog result. One can clearly see that the simulated result in Figure 7(d) is more natural.
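The homogeneous synthesis step can be sketched in a few lines. This is an illustration under assumptions rather than the authors' code: the transmission is clamped from below at 0.1 before the exponent is applied (the exact placement of the clamp is an assumption), the airlight is a scalar 255 for all channels, and the function names are hypothetical. The heterogeneous step is not included because its exact form is not recoverable from the text.

```python
import numpy as np


def transmission_from_depth(depth, beta):
    """t = exp(-beta * d) for a depth map d and extinction coefficient beta."""
    return np.exp(-beta * np.asarray(depth, dtype=float))


def render_homogeneous_fog(J, t, k=1.0, airlight=255.0, t0=0.1):
    """Apply the atmospheric scattering model with transmission raised to k."""
    J = np.asarray(J, dtype=float)
    t_eff = np.maximum(np.asarray(t, dtype=float), t0) ** k  # assumed clamp
    if J.ndim == 3:  # broadcast one transmission value over the color channels
        t_eff = t_eff[..., None]
    return J * t_eff + airlight * (1.0 - t_eff)


# Raising t to the power k is the same as scaling beta by k.
depth = np.array([[0.0, 1.0], [2.0, 5.0]])
same = np.allclose(transmission_from_depth(depth, 0.3) ** 2.0,
                   transmission_from_depth(depth, 0.6))

scene = np.full((2, 2, 3), 100.0)           # uniform gray test image
t_map = np.array([[1.0, 0.8], [0.5, 0.2]])  # hypothetical transmission map
thin_fog = render_homogeneous_fog(scene, t_map, k=1.0)
dense_fog = render_homogeneous_fog(scene, t_map, k=2.0)
```

Pixels with transmission 1 keep their original color, and as $k$ grows every other pixel moves toward the airlight value, which is the density control used in the experiments.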

Experimental Results
Our proposed rendering method works well for a wide variety of game scenes and real-world images. In the experiments, we ran the algorithm in Matlab on a PC with a 3.00 GHz Intel Pentium Dual-Core processor.

Parameter Adjustment for Different Foggy Conditions.
Since the parameter $k$ used to change the value of the medium extinction coefficient $\beta$ controls the fog density of the simulated scene, we can obtain the desired rendered foggy image by tuning this parameter. Figure 8 shows some fog simulation results with different values of $k$. One can clearly see that the larger the value of $k$ is, the denser the fog will be.
Therefore, by changing the value of $\beta$ using the fog's density factor $k$, it is possible to control the visibility distance or the global field of view, which decreases when $k$ rises, and vice versa. Table 1 gives the ten levels of the visibility grades and their corresponding extinction coefficient values widely used in meteorology [26].
From Table 1, one can clearly see that different foggy weather conditions are related to different visibility distances and extinction coefficients. Thus, the value of the parameter $k$ can be selected according to the extinction coefficient values of the ten visibility grades to produce different foggy effects. To verify the effectiveness of the proposed method, a virtual scene in clear weather (see Figure 9(a)) is used as a benchmark image, which contains some targets at different distances, such as 33 m, 80 m, and 200 m. As can be seen in Figure 9, these targets gradually become blurred as the value of the parameter $k$ increases from 0.4 to 20. Since different visibility distances correspond to different foggy weather according to the international visibility grades shown in Table 1, we can simulate various fog effects by selecting proper parameter values. Table 2 gives the parameter values and the corresponding weather conditions for Figure 9. Note that the intrinsic and extrinsic parameters of the camera that creates the virtual scene are unknown for the transmission map generation, so the distance values of the targets shown in Figure 9 only reflect the relative relationship of the visibility. Thus, the weather type of the simulated scene is mainly determined by our subjective judgment.
To further validate the proposed technique for foggy weather simulation, we apply it to several synthetic images. An illustrative example is shown in Figure 10, and its corresponding parameter value and weather condition are given in Table 3. One can clearly see that the simulation results are consistent with human visual perception when the value of $k$ is set according to the extinction coefficient values in the visibility grades. This confirms our observations in Figures 9 and 10.
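The link between visibility distance and extinction coefficient used in meteorology can be sketched with Koschmieder's relation, $V = -\ln(\varepsilon)/\beta$. The contrast threshold $\varepsilon = 0.02$ (giving $V \approx 3.912/\beta$) is an assumption of this example; the values of Table 1 are not reproduced here and may rest on a different threshold. The function names are hypothetical.

```python
import math


def visibility_from_extinction(beta, contrast_threshold=0.02):
    """Visibility distance for an extinction coefficient beta (inverse units)."""
    if beta <= 0:
        raise ValueError("beta must be positive")
    return -math.log(contrast_threshold) / beta


def extinction_from_visibility(visibility, contrast_threshold=0.02):
    """Extinction coefficient that yields the given visibility distance."""
    if visibility <= 0:
        raise ValueError("visibility must be positive")
    return -math.log(contrast_threshold) / visibility


# Scaling beta by a density factor k divides the visibility by k.
beta = extinction_from_visibility(1000.0)            # beta for 1000 m visibility
denser = visibility_from_extinction(2.0 * beta)      # visibility with k = 2
```

Under this relation, doubling the density factor halves the visibility distance, which matches the qualitative behavior described for increasing $k$.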

Image Test Results.
The algorithm proposed in this paper works well for a wide variety of synthetic images and real captured images. Figure 11 shows some examples of the rendering effects for game scene images. The parameter values for the simulated haze and thin fog scenes are $k = 0.8$ and $k = 2.0$, respectively. The transmission maps estimated from the input original images are shown in Figure 11(b). Note that the maps do not recover the real depth information, but the generated foggy scenes are still consistent with our daily life experience. From the figures, we can see that the generated foggy scenes (see Figure 11(c)) look natural due to the differentiation of the fog's density.
We also apply our method to real-world images. The results for real captured images are shown in Figure 12. The heterogeneous fog effect for the scenes is presented in the figure, and the coefficient is set to 1 and 0.5, respectively, to create the haze effect. From these results, we can see that our simulated results are realistic.

Qualitative Comparison.
To verify the effectiveness of the proposed algorithm, we evaluate the visual quality of our rendered images by comparison with the results obtained by commercial software and state-of-the-art rendering methods. Figure 13 shows a comparison between our rendered foggy scene and a real captured foggy image. It can be seen from the figure that there are some differences between our simulation results and the real captured image, especially in the image details. Therefore, neither of the simulated foggy scenes (homogeneous fog or heterogeneous fog) is visually close to the real foggy situation. That is because the changes in the shape, density, and color of real fog are very complicated and irregular. Thus, the fog density and color at each pixel of an image are hard to describe with regular formulas. What we can do is add more descriptions of fog detail transformation in the rendering algorithm to provide viewers with a more realistic sense of fog in the simulated scenes.
Table 2: The parameter values and the corresponding weather conditions for Figure 9.
The comparison between our rendered foggy scenes and the 3ds max result is shown in Figure 14. One can clearly see that the fog effect created by 3ds max (see Figure 14(b)) looks good but not natural enough. That is because the fog density produced by the software is constant over the whole scene, which is not always true in real situations. A more natural way is to let scene objects gradually disappear into the distant fog; thus the atmospheric scattering model is used here for generating the homogeneous fog effect. Thanks to the depth information provided by the estimated transmission map, the realism of the simulated phenomenon is improved by the distribution of the fog, as shown in Figure 14(c). Compared with the homogeneous foggy scene, a much more realistic fog effect can be produced by using the Perlin noise and turbulence. As can be seen in Figure 14(d), the visibility is lower at points far away from the camera and the fog's density is differentiated by the noise textures, just as in the real world.
Figures 15 and 16 allow the comparison of our results with three state-of-the-art rendering methods: OpenGL, which simply blends the fog color with the object color based on the distance of the viewer; Sun's work [1], which is based on a single scattering light transport equation; and Cao's work [2], which is based on an atmosphere point spread function (APSF). Notice that the object outlines in the OpenGL results still seem clear after rendering the original scene with the built-in OpenGL functions, which is not always true for natural foggy images. Both Sun's work and Cao's work can produce impressive rendering results; in particular, the glows around light sources in Sun's result and the blurriness caused by contrast reduction in Cao's result are visually compelling. However, many parameters are involved in these methods, which leads to more user interaction and makes the final simulation effects hard to control as well. Compared with these existing methods, the proposed algorithm can produce comparable results with far fewer parameters and much less user involvement.

Computational Time.
For computational efficiency, the complexity of the OpenGL method is $O(W \times H)$ for an image of size $W \times H$, which implies that the complexity is a linear function of the number of input image pixels due to the point-to-point operations of the built-in OpenGL functions. Thanks to programmable graphics hardware, Sun's method precomputes the numerical functions and stores them as 2D texture maps. Then, the entire analytic computation and table lookups are implemented in simple pixel or vertex shaders in the graphics hardware [1]. Thus, Sun's method is fast enough to achieve real-time rates. The computational time of Cao's method is much longer than that of Sun's method, since the APSF used in Cao's algorithm needs a convolution operation, so its computing time is proportional to the product of the convolution mask size and the input image size, which gives a relatively high time complexity. Our proposed method is faster than Cao's method, and less time is needed to obtain a comparable rendered result of the same size in the Matlab environment. Table 4 shows the computing speeds of the three methods with an image of size 640 × 480.

Conclusions and Future Work
We propose a new algorithm to simulate foggy scenes for input no-fog images. Firstly, we generate a Perlin noise image as the density distribution texture of heterogeneous fog. Then, we estimate the transmission map using the MRF model and the bilateral filter. Finally, the virtual foggy scene is realistically rendered with the generated Perlin noise image and transmission map according to the atmospheric scattering model. The proposed method provides a new way to simulate foggy scenes with fewer algorithm parameters and less user interaction compared with existing techniques. In the future, we intend to further accelerate the rendering speed by exploiting GPU techniques.