The Brigade Renderer: A Path Tracer for Real-Time Games

We present the Brigade renderer: an efficient system that uses the path tracing algorithm to produce images for real-time games. We describe the architecture of the Brigade renderer, and provide implementation details. We describe two games that have been created using Brigade.


Background
Historically, games have been an important driving force in the advance of graphics hardware and rendering algorithms. Effort has evolved from striving for abstract, visually pleasing results to more plausible, realistic rendering. In the former, a distinct visual style is chosen, which does not necessarily require realism. Instead, over-the-top animation styles and matching graphics are used. Examples of this approach are most early 2D computer games, but there are also more recent titles such as Super Mario Galaxy [1] and Okami [2] (Figure 1).
Many modern games strive for realistic graphics, where the goal is to convince the player that the result is (or could be) realistic. Examples are racing games such as the Gran Turismo series [3] and flight simulators such as Tom Clancy's H.A.W.X. [ ] (Figure 2), which use rasterization-based renderers, augmented with various algorithms to add secondary effects such as shadows, reflections, and indirect illumination.
Recently, efforts are being made towards physically correct results. For static scenery and a static light configuration, this can be achieved by precalculating global illumination, or by coarsely calculating radiosity. Examples of this are games based on the Unreal 3 engine [5] (Figure 3). Games using the Frostbite 2 engine [6] support ray tracing of coarse level geometry for glossy reflections. The Unreal engine [7] supports approximate global illumination using cone tracing [8].
Physically based rendering of virtual worlds has strong advantages. The obvious advantage is image fidelity (Figure ). Perhaps of equal importance, however, is production efficiency. Whereas lighting for a scene in a rasterization-based engine typically requires a designer to work around technical limitations of the renderer to make the lighting look right, physically based rendering naturally leads to correct lighting. This limits the design effort to a creative process alone.
Of the available physically based rendering algorithms, stochastic ray-tracing-based methods (path tracing and derived methods) are favored over finite element methods, due to their elegance and efficient handling of large scenes. Unlike rasterization-based approaches, path tracing scales up to photorealism with minimal algorithmic complexity: the only dependencies are compute power and memory bandwidth. Both increase over time. Moore's law states that the number of transistors that can be placed inexpensively on an integrated circuit rises exponentially over time [9]. Although the link between transistor count and application performance is complex, the latter follows the same pattern, with compute power increasing at 71% per year on average, and DRAM bandwidth at 25% per year [10].
Assuming that all other factors remain constant (e.g., scene complexity, screen resolution), we can thus expect that there will be a point where physically based rendering is feasible on consumer hardware.
International Journal of Computer Games Technology

Previous Work
Recently, Whitted-style ray tracing and distribution ray tracing have been shown to run in real time, or at least at interactive frame rates, on CPUs (see, e.g., [12][13][14][15]) and GPUs [16][17][18][19], as well as the streaming processors of modern consoles [20,21].
Interactive path tracing was first mentioned in 1999 by Walter et al. as a possible application of their Render Cache system [22]. Using their system and a sixty-core machine, a scene can be navigated at interactive frame rates. During camera movement, samples are cached and reprojected to construct an approximation for the new camera viewpoint. New samples are created for pixels with a relatively large error. The image converges to the correct solution when the camera is stationary.
Sadeghi et al. use ray packets for their path tracer [23]. Coherence between rays on the paths of a block of pixels is obtained by using the same random numbers for all pixels in the block. This introduces structural noise but remains unbiased. The system is CPU based and achieves about 1.2 M rays per second per core of an Intel Core 2 Quad running at 2.83 GHz.
In their 2009 paper, Aila and Laine evaluate the performance of various ray traversal kernels on the GPU [19]. Although they did not aim to implement a full path tracer, their measurements include a diffuse bounce, for which they report nearly 50 M rays per second on an NVidia GTX285, not including shading.
More recently, Novák et al. used GPU path tracing with path regeneration to double the performance of the path tracing algorithm on stream processors [2 ]. Their system is able to render interactive previews on commodity hardware, achieving 13 M rays per second on an NVidia GTX285 on moderately complex scenes, and is claimed to be "the first efficient (bidirectional) path tracer on the GPU." Van Antwerpen proposed a generic streaming approach for GPU path tracing algorithms and used this to implement three streaming GPU-only unbiased rendering algorithms: a path tracer, a bidirectional path tracer, and an energy redistribution path tracer [25].

Efficient GPU Path Tracing
The unbiased path tracing algorithm with Russian roulette is shown in Algorithm 1. The algorithm aims to find a number of paths that connect the camera to light sources, via zero or more scene surfaces, by performing a random walk. The expected value of the average energy transported via these paths is the solution to the rendering equation [31]. Two extensions are commonly used to make the estimate more efficient. Russian roulette reduces the number of very long paths (which generally contribute little to the final image), and at each nonspecular surface interaction, direct light is explicitly sampled, which reduces variance.
The path tracing algorithm can be efficiently implemented on the GPU, using a single kernel that is executed once per pixel. The kernel loops over the samples for a pixel and outputs the final color. This limits memory access to read-only scene access, and a single write for the final pixel color. CUDA code for this is provided in the Appendix. Example output is shown in Figure 5. For this scene, ray/scene intersection uses a hardcoded scene consisting of axis-aligned rectangular cuboids, spheres, and ellipsoids. Materials are limited to diffuse and dielectric. Using this setup, a single NVidia GTX580 achieves 750 M rays per second, which results in an almost converged image at real-time frame rates.
For more general scenes, we can replace the hardcoded ray/scene intersection by the BVH traversal code proposed by Aila and Laine [19].

The Brigade System
A renderer for games has specific requirements, which differ significantly from other applications. Of these, the requirement of real-time performance probably has the greatest overall impact on the design of a renderer. A modern game runs at 60 fps or more. For certain genres, a lower frame rate is acceptable. For the game Doom , a fixed frame rate of 30 fps is enforced by the renderer [32].
Frame rate translates to a strict millisecond budget, which must be divided over all subsystems. Note that if we choose to run the subsystems in order, the budget available to rendering decreases. If, on the other hand, we run the subsystems and rendering in parallel, we introduce input lag: in a worst-case scenario, user input that occurred at the beginning of frame N will be rendered in frame N+1 and presented to the user just before frame N+2 starts.
Apart from real-time performance, rendering for games requires dynamic scenery. Scene elements may undergo complex movement due to physics as well as hand-crafted animations and procedural effects such as explosions. Contrary to popular belief, global changes to scenery are uncommon in games. Typically, large portions of the scenery are static, to avoid game states in which the user cannot progress.
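As a worked example of the frame budget arithmetic discussed above, the sketch below computes the per-frame budget, the rendering budget that remains when the other subsystems run first, and the worst-case input lag when they run in parallel with rendering. The function names and the 4 ms cost assumed for the non-rendering subsystems are hypothetical values chosen for illustration; they are not measurements from Brigade.

```python
def frame_budget_ms(fps):
    """Total time available for one frame, in milliseconds."""
    return 1000.0 / fps


def sequential_render_budget_ms(fps, other_subsystems_ms):
    """Rendering budget when the other subsystems run first, in order."""
    return frame_budget_ms(fps) - other_subsystems_ms


def parallel_worst_case_lag_ms(fps):
    """Worst-case delay when rendering runs in parallel with the subsystems:
    input at the start of frame N is shown just before frame N+2 starts."""
    return 2.0 * frame_budget_ms(fps)


OTHER_SUBSYSTEMS_MS = 4.0  # hypothetical cost of game logic, physics, audio

for fps in (30, 60):
    print(
        f"{fps} fps: budget {frame_budget_ms(fps):.1f} ms, "
        f"sequential render budget "
        f"{sequential_render_budget_ms(fps, OTHER_SUBSYSTEMS_MS):.1f} ms, "
        f"parallel worst-case lag {parallel_worst_case_lag_ms(fps):.1f} ms"
    )
```

Under these assumptions, running in parallel returns the full frame time to the renderer at the price of up to two frames of latency.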
Tightly coupled to the real-time requirement is the fact that games are interactive applications. The renderer must produce correct results for all possible user input and cannot predict any scenery changes that depend on user interaction.
On top of the generic requirements, there are requirements that evolve over time, most notably rendering resolution and scene complexity. At the time of writing, a typical game renders at a resolution of at least 1280 × 720 (HD 720). A typical scene consists of hundreds of thousands of polygons.
The Brigade rendering system is designed specifically for games and applies and encapsulates the technology of Section 3 in this context. Brigade renders scenes of static and dynamic geometry comprising millions of triangles. It uses a fixed-function shading pipeline and supports diffuse and specular surfaces with textures and normal maps, as well as dielectrics with absorption. The animation system supports rigid animation and skinned meshes. Scenes are illuminated using emissive surfaces, of which an unlimited number may be present.
The rendering system achieves high performance by fully utilizing all compute devices in a heterogeneous architecture (Section 4.2). It implements a synchronization-free balancing scheme to divide the workload over these compute devices (Section 4.3). Adaptive converging (Section 4.5) and dynamic workload scaling (Section 4.7) are used to ensure a real-time frame rate at high-definition resolutions.
4.1. Functional Overview. Figure 6 provides a functional overview of the Brigade renderer. In a typical setup, Brigade is combined with a game engine that provides components not specific to the rendering algorithm, such as artificial intelligence and physics libraries. In terms of abstraction, the functionality provided by Brigade is thus similar to the functionality implemented by OpenGL and DirectX.
The main components of Brigade are as follows.
4.1.1. Scene Graph. The scene and hierarchical scene graph contain all data required for rendering. This includes the object hierarchy, mesh data, materials, textures, cameras, and lights. The object decomposition represented by the scene graph is used to steer acceleration structure construction, which makes the scene graph an essential data structure within the system. For convenience, the scene graph object implements keyframe and bone animation.
4.1.2. Core. The core implements the Render() method, initiates acceleration structure updates, synchronizes scene data changes with the compute devices, and divides work over the tracers, if there is more than one.
4.1.3. Acceleration Structure Updater. The acceleration structure updater maintains the BVH, by selectively rebuilding parts of the acceleration structure based on changes in the scene graph.
4.1.4. Tracers. A tracer is an abstract representation of a compute device or group of similar compute devices. A "compute device" in this context can be a GPU, the set of available CPU cores, or a compute device connected over a network. The tracer holds a copy of the scene data and the acceleration structure and implements the path tracing algorithm with next event estimation and multiple importance sampling. Tracers are assumed to produce identical output for identical input (except for nondeterministic aspects of the rendering algorithm).
The acceleration structure used by the tracers is the only cached data structure that is derived from scene data. All other data can be modified on-the-fly. This includes (all properties of) materials and lights.
In this system, the governing processes run on the CPU, and tracers (which in a typical setup primarily run on the GPUs) function as workers.
4.2. Rendering on a Heterogeneous System. A modern PC is a heterogeneous architecture, which typically consists of a CPU with multiple cores, and at least one GPU.
To efficiently use the available compute power, several options are available.
(1) The rendering algorithm is implemented completely on either the CPU or the GPU.
(2) The rendering algorithm is implemented on both the CPU and the GPU.
(3) Tasks are divided over CPU and GPU.
Each of these options has advantages and disadvantages. A renderer that runs entirely on the CPU or GPU may result in underutilization of the other compute device. An algorithm that is implemented on both the CPU and the GPU will use all resources but requires a greater implementation effort. Dividing tasks over CPU and GPU seems the most attractive option. This is, however, only efficient when CPU and GPU spend equal amounts of time on their assigned tasks.
A fourth option is to use a hybrid solution, where the CPU has specific tasks and uses the frame time that remains to assist the GPU. This is the approach implemented in our system. The CPU is responsible for game logic and acceleration structure maintenance, while the tracers perform the actual rendering. Assuming a CPU tracer is available, this system is able to keep compute devices fully occupied. The process is illustrated in Figure 7.
For each frame, the CPU updates the game state. The resulting changes to the scene graph are then used to update the BVH. The changes to the BVH, as well as any other scene changes, are sent to the tracers, where they are placed in a commit buffer, which the tracers use to iteratively update a local copy of the scene.
Parallel to these activities, the tracers render using the data that was prepared in the previous frame. A tracer starts a frame by processing the changes in the commit buffer, and then renders a part of the frame. CPU tracers are handled slightly differently than GPU tracers, by postponing rendering until the acceleration structure has been updated. This prevents rendering from interfering with acceleration structure maintenance.
When no CPU tracer is available, the CPU can execute game code that does not affect the scene graph after copying scene changes to the commit buffers of the tracers.
4.3. Workload Balancing. The tracer flow is shown in Figure 8. Upon instantiation, the tracer spawns a thread that executes the worker loop. This loop waits for a signal from the core, renders a number of pixels, and signals the core, before going to sleep until the next frame.
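The worker loop described above can be sketched with two events per tracer: one that the core sets to start a frame, and one that the tracer sets when its share is done. This is a minimal illustration using Python threads, not Brigade's implementation; the class name `Tracer`, the helpers `render_frame` and `shutdown`, and the use of row indices as a stand-in for rendered pixels are all assumptions made for the example.

```python
import threading


class Tracer:
    """Hypothetical stand-in for a tracer; all names are illustrative."""

    def __init__(self, name):
        self.name = name
        self.start = threading.Event()  # set by the core: render now
        self.done = threading.Event()   # set by the tracer: share finished
        self.rows = range(0)            # rows assigned for the current frame
        self.rendered = []              # stand-in for rendered pixels
        self.running = True
        self.thread = threading.Thread(target=self._worker_loop, daemon=True)
        self.thread.start()

    def _worker_loop(self):
        while True:
            self.start.wait()           # sleep until the core signals a frame
            self.start.clear()
            if not self.running:
                break
            self.rendered = [(self.name, y) for y in self.rows]  # "render"
            self.done.set()             # signal the core


def render_frame(tracers, assignments):
    """Core side: hand out work once, then wait; no communication in between."""
    for tracer, rows in zip(tracers, assignments):
        tracer.rows = rows
        tracer.done.clear()
        tracer.start.set()
    for tracer in tracers:
        tracer.done.wait()
    return [pixel for tracer in tracers for pixel in tracer.rendered]


def shutdown(tracers):
    for tracer in tracers:
        tracer.running = False
        tracer.start.set()
        tracer.thread.join()


tracers = [Tracer("gpu0"), Tracer("gpu1")]
pixels = render_frame(tracers, [range(0, 4), range(4, 8)])
shutdown(tracers)
print(len(pixels), "rows rendered")
```

Because the core only signals before and waits after rendering, the division of rows has to be decided up front, which is what the balancing schemes below address.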
When more than a single tracer is available, the core estimates an optimal workload division prior to rendering each frame. The advantage of this approach is that no communication between the tracers and the core is required once rendering has commenced, which greatly reduces communication overhead for GPU and network tracers. Dividing the work is nontrivial, however: not every compute device may have the same rendering capacity, and not every line of pixels has the same rendering cost (see Figure 10).
In a game, a typical camera moves in a somewhat smooth fashion. A good workload division for one frame will thus be at least reasonable for the next frame. We exploit this by adjusting the workload balance in an iterative manner.
We implemented four schemes to divide work over the tracers.
4.3.1. Do Not Balance. In this naive scheme, all workers are assigned an equal share of the screen pixels; no balancing is performed. This scheme is included for reference.
4.3.2. Robin Hood. This scheme starts with an equal distribution of the work for each tracer. After completing each frame, the tracer that finished last passes one work unit (one work unit equals four rows of pixels) to the tracer that finished first. When the work is poorly distributed, it may take a large number of frames to properly balance.
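A minimal sketch of the Robin Hood step follows, together with a toy simulation that shows why convergence can be slow. The function name `robin_hood_step`, the two tracer speeds, and the assumption that render time equals units divided by speed are hypothetical and serve only to illustrate the scheme.

```python
def robin_hood_step(units, render_times):
    """Move one work unit from the tracer that finished last to the one
    that finished first. units[i] is the unit count of tracer i and
    render_times[i] the time it needed for the previous frame."""
    units = list(units)
    indices = range(len(units))
    slowest = max(indices, key=lambda i: render_times[i])
    fastest = min(indices, key=lambda i: render_times[i])
    if slowest != fastest and units[slowest] > 1:
        units[slowest] -= 1
        units[fastest] += 1
    return units


# Toy setup (assumed): 720 rows = 180 units, one tracer three times faster.
speeds = [3.0, 1.0]          # hypothetical units per second
units = [90, 90]             # equal initial distribution
frames = 0
while True:
    times = [u / s for u, s in zip(units, speeds)]
    balanced = robin_hood_step(units, times)
    if balanced == units:    # nothing moved: both tracers finish together
        break
    units = balanced
    frames += 1

print(units, "after", frames, "frames")
```

In this toy setup the balance is reached only after 45 frames, one unit at a time, which matches the remark that a poor initial distribution takes many frames to correct.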
4.3.3. Perfect. This scheme calculates the exact amount of work a tracer can handle based on the previous frame, but without considering differences in cost between lines of pixels. This may result in hiccups when many expensive lines are assigned to a tracer at once. The Perfect balancer determines the workload of each worker for the next frame from the unit count and render time of that worker in the current frame.
4.3.4. Perfect Smooth. Same as "Perfect", but the workload per tracer is smoothed over multiple frames, using a smoothing factor in the interval (0, 1).
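The Perfect and Perfect Smooth schemes can be sketched as follows, under the assumption that each tracer's share is made proportional to its measured throughput (work units per second) in the previous frame. The exact formulas used by Brigade may differ; the function names, the parameter name `smoothing`, and the choice to weight the new target by `smoothing` are assumptions. Rounding to whole work units is left out.

```python
def perfect_balance(units, render_times):
    """Assign each tracer a share proportional to its throughput
    (units per second) measured in the previous frame."""
    total_units = sum(units)
    throughput = [u / t for u, t in zip(units, render_times)]
    scale = total_units / sum(throughput)
    return [tp * scale for tp in throughput]


def perfect_smooth_balance(units, render_times, smoothing=0.5):
    """Move only part of the way towards the 'perfect' division.
    smoothing lies in (0, 1); values near 1 approach perfect_balance."""
    target = perfect_balance(units, render_times)
    return [(1.0 - smoothing) * u + smoothing * t
            for u, t in zip(units, target)]


# Previous frame (hypothetical): equal shares, second tracer 3x slower.
units = [90, 90]
render_times = [30.0, 90.0]
print(perfect_balance(units, render_times))         # [135.0, 45.0]
print(perfect_smooth_balance(units, render_times))  # [112.5, 67.5]
```

The unsmoothed version jumps to the estimated optimum in a single frame, which is fast but overcompensates when the per-line cost changes; the smoothed version trades reaction speed for stability.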
Figure 9 shows the efficiency of the four schemes, for a spinning camera in the Aztec scene. For a slow-moving camera, the workload in two subsequent frames is similar. All schemes except the overcompensating Perfect balancer work well. The Robin Hood balancer exhibits poor efficiency for the first frames. For a faster camera, Robin Hood is not able to keep up. For this situation, the aggressive Perfect balancer outperforms even the Perfect Smooth balancer. When more GPUs are used, Perfect Smooth is clearly the optimal scheme.
Table 1 shows the average efficiency of the four balancers over 128 frames, for a slow and a faster moving camera. This table confirms that the Perfect and Perfect Smooth schemes are similar in terms of average efficiency. The table does not, however, show the spikes that are visible in the graphs.
4.4. Double-Buffering Scene Data. For acceleration structure maintenance, we make the following assumptions.
(1) A game world may consist of millions of polygons.
(2) A small portion of these polygons is dynamic.
(3) Several tracers will use the same acceleration structure.
Based on these assumptions, a full rebuild of the BVH for each frame is neither required nor desired, as it would put a cap on maximum scene complexity, even when very few changes occur. We reuse the system described by Bikker [13], where each scene graph node has its own BVH, and a top-level BVH is constructed per frame over these BVHs. Each changed scene graph node is updated, using either full reconstruction or refitting.
Brigade uses a double-buffered approach for BVH maintenance. During a single frame, the CPU updates the BVH based on modifications of the scene graph. The resulting changes to the BVH are sent to the tracers, where they are placed in a commit buffer. At the start of the next frame, the commit buffer is processed, which results in an up-to-date BVH for each of the tracers. This process is illustrated in Figure 7.
Each frame is thus rendered using the BVH constructed during the previous frame. Acceleration structure maintenance thus only becomes a bottleneck when the time it requires exceeds the duration of a frame.
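The double-buffered update can be illustrated with a small sketch. The class `TracerScene`, its method names, and the use of a dictionary in place of a real BVH are assumptions made for the example; the point is only the ordering of operations: changes arrive while the old data is in use and are applied at the start of the next frame.

```python
class TracerScene:
    """Hypothetical tracer-side scene copy with a commit buffer."""

    def __init__(self, bvh_nodes):
        self.bvh = dict(bvh_nodes)   # local copy used for rendering
        self.commit_buffer = []      # changes received during this frame

    def receive(self, node_id, node_data):
        # Called by the core while the tracer still renders the old BVH.
        self.commit_buffer.append((node_id, node_data))

    def begin_frame(self):
        # Apply all pending changes before rendering the next frame.
        for node_id, node_data in self.commit_buffer:
            self.bvh[node_id] = node_data
        self.commit_buffer.clear()

    def render(self):
        # A snapshot of the BVH stands in for the rendered image.
        return dict(self.bvh)


scene = TracerScene({"door": "closed"})

scene.begin_frame()
frame0 = scene.render()          # frame 0 uses the initial BVH
scene.receive("door", "open")    # CPU finishes its update during frame 0
late_frame0 = scene.render()     # still the old data: no mid-frame change

scene.begin_frame()              # commit buffer is processed here
frame1 = scene.render()          # frame 1 sees the update

print(frame0, late_frame0, frame1)
```

The one-frame delay is the cost of letting BVH maintenance overlap with rendering.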
4.5. Converging. To reduce the noise in the final rendered image, several frames can be blended. Each pixel of the final image is calculated as $c_{\text{final}} = (1 - \alpha)\, c_{\text{prev}} + \alpha\, c_{\text{new}}$, where $\alpha \in (0, 1]$. The value of $\alpha$ is chosen either manually or automatically, for example, based on camera speed. For stationary views, this approach results in a higher number of samples per pixel. For nonstationary views, this results in an incorrect image. The result can be improved by linking $\alpha$ to camera movement. For a stationary camera, a small value of $\alpha$ allows the renderer to blend many frames. For a moving camera, a value of $\alpha$ close to 1 minimizes ghosting.
Note that even though the camera may be static, objects in the scene may not be. It is therefore important to limit the minimum value of $\alpha$ to keep the ghosting for dynamic objects within acceptable bounds.
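A minimal sketch of the blending step and of one possible way to tie the blend weight to camera speed follows. The linear mapping, the lower bound `alpha_min`, and the reference speed `full_speed` are hypothetical choices for illustration; the text only requires that the weight is small for a stationary camera, close to 1 for a moving camera, and never below some minimum.

```python
def blend(prev, new, alpha):
    """Per pixel: final = (1 - alpha) * prev + alpha * new."""
    return [(1.0 - alpha) * p + alpha * n for p, n in zip(prev, new)]


def alpha_from_camera_speed(speed, alpha_min=0.1, full_speed=1.0):
    """Map camera speed to a blend weight (assumed linear mapping).
    Returns alpha_min for a stationary camera and 1.0 at or above
    full_speed, so ghosting stays bounded even for static views."""
    t = min(max(speed / full_speed, 0.0), 1.0)
    return alpha_min + (1.0 - alpha_min) * t


accumulated = [0.0, 0.0]
noisy_frame = [1.0, 0.5]

# Stationary camera: most of the history is kept.
still = blend(accumulated, noisy_frame, alpha_from_camera_speed(0.0))

# Fast camera: the new frame replaces the history, avoiding ghosting.
moving = blend(accumulated, noisy_frame, alpha_from_camera_speed(2.0))

print(still, moving)
```

With a constant weight this is an exponential moving average, so older frames fade out gradually rather than being dropped at a fixed window.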
4.6. CPU Single Ray Queries. Brigade exposes a fast, CPU-based, synchronous single-ray query to the game engine, which uses the BVH from the previous frame. This query is useful for a number of typical game situations, such as line-of-sight queries for weapons and AI, collision queries for physics, and general object picking. The single-ray query uses the full detailed scene (rather than, e.g., a coarse collision mesh), including animated objects.
4.7. Dynamically Scaling Workload. Maintaining a sufficient frame rate is of paramount importance to a game application. In this subsection, we propose several approaches to scale the workload.
4.7.1. Adjusting Samples per Pixel. Render time per frame is almost linear in the number of samples per pixel, so the frame rate is almost inversely proportional to it. Brigade reduces the rendered number of samples per pixel when the frame rate drops below a specified minimum and increases this value when the frame rate exceeds a specified maximum.
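The control rule above can be sketched as a simple hysteresis controller. The thresholds, the step size of one sample, the limits on the sample count, and the simulated machine (frame rate inversely proportional to samples per pixel) are all hypothetical; Brigade's actual parameters are not given in the text.

```python
def adjust_spp(spp, fps, fps_min=25.0, fps_max=35.0, spp_min=1, spp_max=64):
    """Lower spp when the frame rate is too low, raise it when there is
    headroom, and leave it alone inside the [fps_min, fps_max] band."""
    if fps < fps_min:
        return max(spp_min, spp - 1)
    if fps > fps_max:
        return min(spp_max, spp + 1)
    return spp


def simulated_fps(spp, fps_at_one_spp=240.0):
    """Hypothetical machine: frame rate inversely proportional to spp."""
    return fps_at_one_spp / spp


spp = 16
history = []
for _ in range(20):
    fps = simulated_fps(spp)
    history.append((spp, round(fps, 1)))
    spp = adjust_spp(spp, fps)

print(history[0], "->", history[-1])
```

The band between the two thresholds prevents the sample count from oscillating when the frame rate hovers around a single target value.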
4.7.2. Balancing Primary Rays and Secondary Rays. By balancing the ratio of primary and secondary rays, the quality of antialiasing and depth-of-field blurring can be traded for secondary effects. Primary rays are faster; increasing their ratio will also improve frame rate.
4.7.3. Scaling Russian Roulette Termination Probability. Changing the termination probability of Russian roulette does not introduce bias, although it may increase variance [33]. Altering the termination probability affects the number of deeper path segments, and thus frame rate. Unlike the previous approach, scaling the termination probability using a factor based on frame rate does not distinguish between primary and secondary rays and allows smooth scaling of performance.
Alternatively, the workload can be reduced by reducing rendering resolution or limiting trace depth. Limiting the maximum recursion depth of the path tracer introduces bias but also improves performance. In practice, due to Russian roulette, deep rays are rare, which limits the effect of a recursion depth cap on performance.
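The claim that the termination probability changes variance but not the expected value can be checked on a toy problem. The sketch below assumes a "furnace" scene in which every surface emits the same radiance and reflects a fixed fraction of incoming light, so the exact answer is emission / (1 - albedo). The scene, the function names, and the chosen survival probabilities are assumptions for illustration and are unrelated to Brigade's tracer code.

```python
import random


def sample_radiance(emission, albedo, survive_prob, rng):
    """One random-walk sample with Russian roulette in a furnace scene."""
    radiance = 0.0
    throughput = 1.0
    while True:
        radiance += throughput * emission      # light picked up at this hit
        if rng.random() >= survive_prob:       # roulette: terminate the path
            return radiance
        throughput *= albedo / survive_prob    # reweight survivors (unbiased)


def mean_and_variance(samples):
    n = len(samples)
    mean = sum(samples) / n
    variance = sum((s - mean) ** 2 for s in samples) / (n - 1)
    return mean, variance


rng = random.Random(1)
EMISSION, ALBEDO = 1.0, 0.5
EXACT = EMISSION / (1.0 - ALBEDO)

results = {}
for survive_prob in (0.8, 0.5, 0.3):
    samples = [sample_radiance(EMISSION, ALBEDO, survive_prob, rng)
               for _ in range(100_000)]
    results[survive_prob] = mean_and_variance(samples)
    mean, variance = results[survive_prob]
    print(f"survive={survive_prob}: mean={mean:.3f} "
          f"(exact {EXACT}), variance={variance:.2f}")
```

All three settings estimate the same value; the lower survival probabilities produce shorter paths, and therefore less work per sample, at the price of a noisier estimate.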
For game development, the scalability of a renderer based on path tracing is an attractive characteristic. A relatively slow system is able to run the path tracer at an acceptable frame rate, albeit perhaps not at an acceptable level of variance. Faster systems benefit from the additional performance by producing more samples per pixel, and thus a smoother image.

Discussion
The rendering system described in the previous section is relatively simple. To a large extent, this simplicity is the result of the chosen rendering algorithm. The path tracer does not rely on any precalculated data, which greatly reduces data dependencies. There are two exceptions, and these are also the most complex parts of the system. The first is the acceleration structure, which is cached and updated iteratively, in a double-buffered fashion. As a result, games cannot make arbitrary changes to the scene graph. The second is the data synchronization between the renderer core and the tracers, which generally run on the GPU(s). Using a commit buffer system, Brigade makes this virtually invisible to the application, and few restrictions apply.
Apart from the tracers, Brigade is a platform-independent system. The tracers abstract away vendor-specific APIs for GPGPU and allow the implementation of networked tracers and CPU-based tracers. When using a CPU tracer, the system is able to achieve full system utilization, with little overhead.

Applied
To validate our results, we have applied the renderer to two student game projects. Both games have been produced in approximately 1 working days.
6.1. Demo Project "Reflect". The Reflect game application is a student game that was developed using an early version of the Brigade engine. The game scenery is designed to simulate actual requirements for game development, and purposely mimics the graphical style of a well-known modern game (Portal 2 [3 ]).
The scenery has the following characteristics:
(i) the scenery consists of approximately 250 k triangles, divided over multiple, separated rooms;
(ii) the scene is illuminated by thousands of area light sources, many of which are dynamic;
(iii) the game world is populated by dozens of dynamic objects.
Art assets for the game were created in Alias Wavefront Maya 2011 and were directly imported into the game.
Like Portal 2, Reflect is a puzzle game, where the player advances by activating triggers that in turn open doors or activate elevators. A "mirror gun" is available to the player to transform flat and curved wall sections into mirrors. These mirrors, as well as glass cube objects, can be used to divert lasers that block the way.
Configuration. Reflect was developed for a dual-CPU/dual-GPU machine (2 hexacore Intel Xeon processors, 2 NVidia GTX 70 GPUs). We implemented a CPU tracer as well as a CUDA GPU tracer. For performance reasons, we limited the path tracers to a single diffuse bounce.
Game-Specific Optimizations. The scenery of the game consists of many rooms, separated by doors. A common optimization in rasterization-based renderers is to disable geometry that is known to be invisible. For a path tracer, this does not significantly improve performance. We did find, however, that turning off lights in those rooms reduces variance, as the path tracer will no longer sample those light sources. This optimization is implemented at the application level: a system of triggers in the scene enables and disables sets of lights.
Performance and Variance. Figure 11 shows two scenes from the game running on a dual-CPU/dual-GPU machine. At 16 spp, the game runs at 10-12 fps. At this sample count, brightly lit scenes are close to acceptable. Darker regions, such as the area under the platform in the right image, show significant temporal noise. Careful level layout helps to reduce objectionable noise levels. To the visual artist, this is counterintuitive: where rasterization-based renderers tend to use small numbers of point light sources, a path tracer benefits from large area lights and incurs no slowdown when those lights are animated.
Materials in the levels are deliberately simple. Although specular surfaces are supported by this version of the renderer, specularity significantly increases noise, making this impractical in most situations.
Observations. The Reflect game struggles to achieve an acceptable frame rate, at a low resolution, on a high-end system. The project does, however, show the potential of path tracing for games. The art for this game was produced in Maya 2011 and was directly imported into the game, leading to very short development cycles, and usable art on the second day of the project. Within the same time span, the programmers implemented a basic physics engine using ray queries that allowed them to navigate the rooms.
The freedom in lighting setup led to a final level that contains approximately 10 k light-emitting polygons. Direct and indirect illumination simply works and results in subtle global illumination, both for static and dynamic objects.
The CPU tracer that was implemented for this project proved to be problematic: keeping the CPU and GPU tracers in sync required significant engineering effort, while the overall contribution of the CPU is quite small.
For "It's About Time," the rendering resolution can be halved to improve frame rate. We developed an updated CUDA tracer that roughly doubles the performance of the first iteration (as used in Reflect), as well as an OpenCL tracer, which produces identical images. A CPU tracer was not developed; the CPU is reserved for acceleration structure maintenance and game logic. The implemented path tracers are unbiased.
6.2.2. Project-Specific Features. One of the puzzles features an animated water surface that responds to the player, consisting of tens of thousands of polygons. For the player character, a detailed set of skinned animations is used. The puzzles make extensive use of rigid animation. As a result, acceleration structure maintenance requires considerable processing. A detailed day-night cycle and an animated cloud system (with shadowing) were implemented to emphasize the strength of the path tracer for dynamic lighting. A standard physics engine was integrated to provide high-quality physics simulation. The level is detailed and consists of 1. M triangles. The artists used a small set of sand stones to construct most of the buildings and ruins.
6.2.3. Game-Specific Optimizations. The game world is illuminated by a sun (or the moon), and some smaller light sources. To reduce variance, we modified the path tracer to always sample two light sources per diffuse surface interaction. One of these rays always probes the primary light source. This significantly reduces variance in most areas. Adaptive converging is used to improve image quality when the camera is (almost) stationary. These application-specific optimizations were implemented in the GPU tracer code.
6.2.4. System Utilization. Figure 12 shows system utilization for the four views of Figure 13, rendered at spp. For the first two views, the CPU is underutilized, as both acceleration structure maintenance and game logic require little processing time. For the other two views, the camera is near a simulated water surface that consists of 18 k polygons. Both the simulation itself and the resulting acceleration structure maintenance require considerable processing time. This also affects the GPU tracers, which use more time to transfer and process the modified scene data.
6.2.5. Memory Use. The Brigade renderer is an in-core rendering system, which stores multiple copies of the scenery. The host system stores a full copy of the scene and synchronizes this data with each of the tracers.
For the 1. M triangle scene of "It's About Time," memory use is 737 MB, which consists of 175 MB triangle data, 2 MB for the acceleration structure, and 520 MB texture data. The size of other data structures is negligible, except for the commit buffer, which must be large enough to store per-frame changes to scene data and the acceleration structure. For "It's About Time," we used a 2 MB commit buffer.
6.2.6. Performance and Variance. Figure 1 shows a single scene from the game, rendered using varying sample counts. As in Reflect, areas that are directly illuminated converge quickly, while shadowed areas exhibit more noise. For the outdoor scenery of "It's About Time," an acceptable quality for most camera views is obtained with 8 or 16 spp. On a system with two NVidia GTX 70 GPUs, we achieve 2 to spp at real-time frame rates, at a quarter of 720p HD resolution (640 × 360). This lets us quantify the remaining performance gap: real-time frame rates at 720p require 8 to 16 times the achieved performance.
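As a rough worked example, the 8 to 16 times performance gap can be combined with the 71% per year growth in compute power quoted in the Background section. The sketch below assumes that this growth rate continues unchanged and that all other factors (scene complexity, resolution, algorithm) stay constant, which is a strong simplification; the function name `years_to_gain` is hypothetical.

```python
import math


def years_to_gain(factor, annual_growth):
    """Years of compound growth needed to reach a given speedup factor.
    Solves (1 + annual_growth) ** years == factor for years."""
    return math.log(factor) / math.log(1.0 + annual_growth)


COMPUTE_GROWTH = 0.71  # 71% per year, as quoted from [10]

for gap in (8, 16):
    years = years_to_gain(gap, COMPUTE_GROWTH)
    print(f"{gap}x gap: about {years:.1f} years at 71% per year")
```

Under these assumptions the gap corresponds to roughly four to five years of hardware progress; the slower growth of DRAM bandwidth would lengthen this estimate if the renderer is bandwidth bound.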

Figure 2: Two examples of modern games that aim for a high level of realism. (a) Tom Clancy's H.A.W.X., (b) Gran Turismo 5.

Figure 6: Functional overview of the Brigade renderer, combined with a generic game engine.

Figure 8: Tracer thread initialization and main loop.

Figure 10: Render cost visualized: pixels representing the sky dome or light sources have a significantly lower cost than other pixels. Cost is represented by greyscale values (brighter is higher cost), per 32 pixels (a full warp). Measured using a tracer implemented in NVidia's CUDA [11] for two scenes: Aztec (a) and Sibenik Cathedral (b).

Figure 12: System utilization for the four views shown in Figure 13.

Figure 13: Four views from "It's About Time."
Double buffering the BVH. The CPU updates the BVH and sends changes to the tracers. Each tracer processes the changes in a commit buffer before rendering the next frame.

Table 1: Average percentage of summed rendering time for all GPUs spent idling due to early completion, for the four balancing schemes, over 128 frames, for a slow and a faster moving camera. Measured for the Aztec scene.
Configuration. "It's About Time" is designed to run on a typical high-end system, using a single hexacore CPU and one or more current-generation NVidia or AMD GPUs. The game renders to standard HD resolution.