There used to be an interesting debate between Professor Philipp Slusallek of the University of Saarbrücken and Nvidia's chief scientist David Kirk at GameStar.de (dead link). The original article has been taken down, but I found a slightly mangled version on the Wayback Machine and I've cleaned it up a bit.

As a supplement to our big report "3D-Grafik 2006" in GameStar 07/2004, we are publishing a GameStar-moderated, English-language discussion between Nvidia's chief scientist, David Kirk, and the renowned computer science professor Philipp Slusallek of the University of Saarbrücken. The topic of this detailed conversation, packed with plenty of additional information: the advantages and disadvantages of current rendering techniques compared to ray tracing.

(Translated from the German original introduction.)

GameStar: Current games are rendered via the well-known rasterization technique. Ray tracing is an old technology, but for the first time we have the power to compute it in real time. Why is ray tracing the better (or the worse) technique for rendering a PC game?

Kirk: Rasterization (Painter's Algorithm or Depth Buffering) has a rendering time that is typically linear in the number of triangles that are drawn, because each polygon must be processed. Since there is specialized hardware for rasterization in modern GPUs, this time is very small per triangle, and modern GPUs can draw 600M polygons per second, or so. Also, Z or depth culling and hierarchical culling can allow the game to not even draw large numbers of polygons, making the complexity even less than linear.
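
(To make Kirk's "linear in the number of triangles" point concrete, here is a minimal, purely illustrative C++ sketch of depth-buffered rasterization. The Tri type and the flat depth/color buffers are placeholders invented for this example; real GPUs do the same work in parallel with dedicated units and many additional optimizations such as bounding boxes and the hierarchical culling he mentions.)

    #include <array>
    #include <cstddef>
    #include <limits>
    #include <vector>

    // Placeholder screen-space triangle: three 2D vertices plus a depth value per vertex.
    struct Vec2 { float x, y; };
    struct Tri  { std::array<Vec2, 3> v; std::array<float, 3> z; };

    // Twice the signed area of triangle (a, b, c); used for the inside/outside test.
    static float edge(const Vec2& a, const Vec2& b, const Vec2& c) {
        return (b.x - a.x) * (c.y - a.y) - (b.y - a.y) * (c.x - a.x);
    }

    // Depth-buffered rasterization: the outer loop visits every triangle exactly once,
    // which is why the cost grows linearly with the number of triangles drawn.
    // Assumes counter-clockwise vertex order and "smaller z is closer".
    void rasterize(const std::vector<Tri>& tris, int w, int h,
                   std::vector<float>& depth, std::vector<int>& color) {
        depth.assign(static_cast<std::size_t>(w) * h, std::numeric_limits<float>::infinity());
        color.assign(static_cast<std::size_t>(w) * h, 0);
        for (std::size_t t = 0; t < tris.size(); ++t) {
            const Tri& tri = tris[t];
            const float area = edge(tri.v[0], tri.v[1], tri.v[2]);
            if (area <= 0.0f) continue;                       // degenerate or back-facing
            for (int y = 0; y < h; ++y) {
                for (int x = 0; x < w; ++x) {
                    const Vec2 p{ x + 0.5f, y + 0.5f };
                    const float w0 = edge(tri.v[1], tri.v[2], p) / area;
                    const float w1 = edge(tri.v[2], tri.v[0], p) / area;
                    const float w2 = edge(tri.v[0], tri.v[1], p) / area;
                    if (w0 < 0 || w1 < 0 || w2 < 0) continue; // pixel outside the triangle
                    const float z = w0 * tri.z[0] + w1 * tri.z[1] + w2 * tri.z[2];
                    if (z < depth[y * w + x]) {               // Z / depth test: keep what's in front
                        depth[y * w + x] = z;
                        color[y * w + x] = static_cast<int>(t) + 1;  // stand-in for shading
                    }
                }
            }
        }
    }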

Although it is tempting to think of GPUs as only rasterization engines, modern GPUs are now highly programmable parallel floating point processors. The rasterization part of the GPU is only a small part; most of the hardware is devoted to 32-bit floating point shader processors. So, it is now possible to demonstrate real-time ray tracing running on GPUs. It is not yet faster than rasterization, but ray tracers are often doing more calculation for global illumination - shadows, reflections, etc.

I don't think of this as "ray tracing vs. rasterization". I think of it as ray tracing and rasterization. Rasterization hardware can be used to accelerate the intersection and lighting calculations that are part of ray tracing. In particular, the initial visibility calculations - what's in front - are best done with rasterization. Reflections, transparency, etc., can be done with ray tracing shader programs.
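
(A rough C++ sketch of the hybrid split Kirk describes: a rasterization pass is assumed to have already filled a G-buffer with the closest surface per pixel, and only the secondary reflection rays are handed to a ray tracer. All of the types and the trace callback are invented for this illustration; this is not code from Nvidia or from any shipping engine.)

    #include <cmath>
    #include <cstddef>
    #include <functional>
    #include <vector>

    // Placeholder types, invented for this illustration only.
    struct Vec3 { float x, y, z; };
    static Vec3  sub(Vec3 a, Vec3 b)    { return { a.x - b.x, a.y - b.y, a.z - b.z }; }
    static float dot(Vec3 a, Vec3 b)    { return a.x * b.x + a.y * b.y + a.z * b.z; }
    static Vec3  scale(Vec3 a, float s) { return { a.x * s, a.y * s, a.z * s }; }
    static Vec3  reflect(Vec3 d, Vec3 n){ return sub(d, scale(n, 2.0f * dot(d, n))); }

    // One G-buffer entry written by a rasterization pass: the closest surface per pixel.
    struct GBufferSample { Vec3 position, normal; bool mirror; };

    // A stand-in for whatever ray tracer is available (CPU code, a shader program, ...):
    // it returns the color seen from 'origin' along direction 'dir'.
    using TraceFn = std::function<Vec3(Vec3 origin, Vec3 dir)>;

    // The hybrid split: rasterization has already resolved "what's in front"; ray
    // tracing is invoked only for the secondary effect, here a mirror reflection.
    void addReflections(const std::vector<GBufferSample>& gbuffer,
                        Vec3 cameraPos, const TraceFn& trace, std::vector<Vec3>& outColor) {
        outColor.resize(gbuffer.size());
        for (std::size_t i = 0; i < gbuffer.size(); ++i) {
            const GBufferSample& g = gbuffer[i];
            if (!g.mirror) { outColor[i] = { 0.0f, 0.0f, 0.0f }; continue; }
            Vec3 view = sub(g.position, cameraPos);            // camera-to-surface direction
            const float len = std::sqrt(dot(view, view));
            if (len > 0.0f) view = scale(view, 1.0f / len);
            const Vec3 r = reflect(view, g.normal);            // secondary ray direction
            outColor[i] = trace(g.position, r);                // ray-traced reflection color
        }
    }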

Slusallek: Since I am not limited by company politics I will try to be a bit controversial in my responses :-).

Ray tracing offers a fairly long list of advantages over rasterization. The fundamental difference is that rasterization can only look at a single triangle at a time. However, most effects require access to at least two triangles: e.g. casting a shadow from one triangle onto another, computing the reflection of one triangle off another, or simulating the indirect illumination due to light bouncing between all triangles in the scene.
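
(The shadow case is the easiest way to see this. In the C++ sketch below, invented purely for illustration, deciding whether a shaded point is lit requires testing a ray against all the other triangles in the scene, something a rasterizer that only ever sees one triangle at a time cannot express directly.)

    #include <vector>

    // Minimal vector and triangle types, invented for this illustration only.
    struct Vec3 { float x, y, z; };
    static Vec3  sub(Vec3 a, Vec3 b)   { return { a.x - b.x, a.y - b.y, a.z - b.z }; }
    static Vec3  cross(Vec3 a, Vec3 b) { return { a.y * b.z - a.z * b.y,
                                                  a.z * b.x - a.x * b.z,
                                                  a.x * b.y - a.y * b.x }; }
    static float dot(Vec3 a, Vec3 b)   { return a.x * b.x + a.y * b.y + a.z * b.z; }

    struct Triangle { Vec3 a, b, c; };

    // Standard Moeller-Trumbore test: does the ray (org, dir) hit 'tri' at a
    // parametric distance 0 < t < maxT?
    static bool hits(const Triangle& tri, Vec3 org, Vec3 dir, float maxT) {
        const float eps = 1e-6f;
        const Vec3 e1 = sub(tri.b, tri.a), e2 = sub(tri.c, tri.a);
        const Vec3 p  = cross(dir, e2);
        const float det = dot(e1, p);
        if (det > -eps && det < eps) return false;    // ray is parallel to the triangle
        const float inv = 1.0f / det;
        const Vec3 s = sub(org, tri.a);
        const float u = dot(s, p) * inv;
        if (u < 0.0f || u > 1.0f) return false;
        const Vec3 q = cross(s, e1);
        const float v = dot(dir, q) * inv;
        if (v < 0.0f || u + v > 1.0f) return false;
        const float t = dot(e2, q) * inv;
        return t > eps && t < maxT;
    }

    // The point at issue: deciding whether a shaded point is lit requires a *second*
    // triangle -- any occluder between the point and the light. (A real tracer would
    // also offset the ray origin slightly to avoid hitting the shaded triangle itself.)
    bool inShadow(Vec3 point, Vec3 lightPos, const std::vector<Triangle>& scene) {
        const Vec3 toLight = sub(lightPos, point);    // unnormalized: t = 1 reaches the light
        for (const Triangle& tri : scene)
            if (hits(tri, point, toLight, 1.0f))      // any hit before the light occludes
                return true;
        return false;
    }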

Rasterization must resort to various tricks to even approximate these effects, e.g. using reflection maps instead of real reflections. For changing environments these maps must be re-computed every frame using costly additional rendering passes. But even worse, they are simply incorrect for almost all geometry, in particular for close-by or curved objects. Try rendering a car that reflects the street correctly -- games don't, because they can't.

Another big advantage of ray tracing is the ability to render huge data sets very efficiently. Recently we implemented a very simple addition to ray tracing that allows for rendering a Boeing 777 model consisting of ~350 million polygons (roughly 30 GB on disk) at 2-3 frames per second on a single dual Opteron system with just 3-4 GB of memory. Since it uses ray tracing you can automatically also render it with shadows, reflections, and complex shading even in such a large model.

You see, ray tracing is a fundamentally new way of doing interactive graphics that opens up many new opportunities for doing things that were impossible before. Many researchers are picking up realtime ray tracing now that we have demonstrated it running with realtime performance.

Kirk: You missed my point, perhaps intentionally. Rasterization-specific hardware is now <5% of GPU core area. Most of the silicon is devoted to instruction processing, memory access, and floating point computation. Given that a GeForce 6800 has 10-20x the floating point performance of the Opteron system you describe, you are a poor programmer if you cannot make a ray tracer run at least twice as fast on a GPU as on a CPU.

There are no barriers to writing a ray tracer on a GPU, except perhaps in your mind. The triangle database can be kept in the GPU memory buffers as texture information (textures are simply structured arrays). Multiple triangles can be accessed through longer shader programs. Although current GPU memory is limited to 256MB-512MB, the root of the geometry hierarchy can be kept resident, and the detail (leaf nodes) kept in system memory and on disk. In your example of ray tracing 30GB of triangle data, you are clearly using hierarchy or instancing to create a 350M polygon database, since at your 2-3 frames per second you do not have time to read that volume of data from disk.
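
(As a small host-side illustration of "textures are simply structured arrays": the C++ sketch below flattens a triangle list into a float buffer, three RGB texels per triangle, which could then be uploaded as a floating-point texture and indexed from a shader. The types and the layout are just one possible convention chosen for this example, not how any particular GPU ray tracer actually packs its data.)

    #include <initializer_list>
    #include <vector>

    // Placeholder geometry types for this sketch.
    struct Vec3     { float x, y, z; };
    struct Triangle { Vec3 v0, v1, v2; };

    // "Textures are simply structured arrays", seen from the host side: flatten the
    // triangle list into an RGB float buffer, three texels per triangle, so that a
    // shader could fetch vertex j of triangle i at texel index 3 * i + j. The actual
    // upload as a floating-point texture (API-specific) is left out.
    std::vector<float> packTrianglesAsTexture(const std::vector<Triangle>& tris) {
        std::vector<float> texels;
        texels.reserve(tris.size() * 9);              // 3 vertices * 3 channels each
        for (const Triangle& t : tris) {
            for (const Vec3* v : { &t.v0, &t.v1, &t.v2 }) {
                texels.push_back(v->x);               // R channel
                texels.push_back(v->y);               // G channel
                texels.push_back(v->z);               // B channel
            }
        }
        return texels;
    }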

By the way, ray tracing is not a new idea. Turner Whitted's original ray tracing research paper was written in 1980. Most of the algorithmic innovation in the technique happened in the late 80s and early 90s. The most interesting recent advance is path tracing, which casts many more rays to get a more global illumination (light inter-reflection) result. Several universities have written path tracers for GPUs that run on extremely large databases.

Slusallek: Well, it seems getting a bit controversial is working better than expected :-)

We fully agree that massive parallelism in GPUs and similar hardware is a great way of achieving high raw performance. However, we also know that the specific architecture of the hardware is an important factor in determining how well that raw performance can be leveraged for specific applications and algorithms.

I have the greatest respect for the research and development that resulted in the GPUs we have today. However, from the results that we, and everyone else I have talked to, are getting from implementing ray tracing on GPUs, I conclude that the current hardware is not well suited for this sort of application. That might change with future implementations (and maybe better programmers :-) but I do see some general and tough architectural issues that need to be solved to make this work well.

The Boeing model does not use instancing at all! It contains roughly 350 million separately stored triangles, which we load on demand as required. And with some of the outside views we are seeing working sets of several GB. The key to rendering such a model is proper memory management, which is already non-trivial on a CPU. Having to deal with the added complexity of a separate GPU, graphics memory separated from main memory, and only limited means of communication with the CPU and, finally, the disks makes this approach so much harder. Due to the many advantages of ray tracing I believe this rendering algorithm is the right choice for future interactive graphics ... and if we can run it well on a GPU, I would be the first to use it.


GameStar: Due to its mathematical correctness, ray tracing could make games look more realistic, e.g. perfect lighting, shadowing, refractions and reflections. Current game engines, even the upcoming Unreal 3 engine, use tricks for doing such effects. Do we need a transition from rasterization to ray tracing for perfect 3D graphics?

Kirk: First of all, I would say that simple ray tracing is not "perfect lighting". Simple ray tracing traces a single ray toward each point light, which results in ugly, hard edged shadows. To get more realistic (and, I will say, still not "perfect") lighting, multiple rays must be traced toward each area light, as well as toward other surfaces which interreflect light. This is much more time consuming. Ray tracing is just one algorithm for solving a large, complex integral equation which researchers have called "the rendering equation". Simple ray tracing attempts to solve the equation by taking a few estimates. Path tracing and stochastic ray tracing increase the quality of the solution by using more estimates and adding some randomness to the process.
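
(For reference, the integral equation Kirk is alluding to is Kajiya's rendering equation from 1986. With f_r the surface's reflectance (BRDF) and L_i the incoming radiance over the hemisphere, it reads as follows, and path tracing estimates the integral with N randomly chosen sample rays:)

    % The rendering equation (Kajiya 1986): outgoing radiance L_o at point x in direction w_o.
    L_o(x,\omega_o) \;=\; L_e(x,\omega_o)
      \;+\; \int_{\Omega} f_r(x,\omega_i,\omega_o)\, L_i(x,\omega_i)\, (\omega_i \cdot n)\, d\omega_i

    % Path tracing / stochastic ray tracing estimate the integral with N random sample
    % directions w_k drawn with probability density p:
    L_o(x,\omega_o) \;\approx\; L_e(x,\omega_o)
      \;+\; \frac{1}{N} \sum_{k=1}^{N}
        \frac{f_r(x,\omega_k,\omega_o)\, L_i(x,\omega_k)\, (\omega_k \cdot n)}{p(\omega_k)}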

I believe that a lot of game engines already incorporate a combination of techniques including ray tracing and radiosity in their rendering paths.

Slusallek: As mentioned before ray tracing is certainly much more "perfect" than rasterization. For example, look at our partners from the automotive industry like Volkswagen and Audi. They need to reliably visualize the final look of the car in order to make far-reaching decisions as early as possible in the design process. They don't need "nice images", they need correct images they can trust. With rasterization this has not been possible even after they tried hard for many years.

After evaluating ray tracing for some time, a large German car company has just decided to buy a complete visualization center for production use from our spin-off inTrace GmbH. It uses realtime ray tracing on a PC cluster instead of the usual rasterization hardware. Image quality, the ability to directly render the CAD data of entire cars with tens of millions of polygons, and reliable visual results have been the main arguments for them.

While the requirements of gaming are probably not as high as in the automotive industry, having the same features would not hurt. Indeed, many games are already trying to add some of these features but this is difficult to do in realtime due to the limitations of today's hardware.

We have already shown many of the advanced rendering features such as indirect lighting (using photon maps and stochastic Monte-Carlo lighting simulation) or volume rendering running in realtime using our software ray tracer. While we already have a complete realtime ray tracing framework for many application domains there still remains a lot of work to be done to really exploit all its advantages.

However, using a PC cluster is certainly not acceptable for gaming. Some hardware support will be required to put ray tracing onto a PC graphics card that fits into everyone's PC. But a prototype of such hardware is already working in our labs.

While I believe that ray tracing is certainly the better platform for next-generation graphics, it is really difficult to predict how fast it might get adopted by industry. We are already talking to hardware companies about producing a ray tracing chip, which would be a big step forward in the right direction.

Kirk: Now who's talking company politics? I see no need for special purpose ray tracing hardware if general purpose programmable GPUs can run the same algorithms faster.

I think that you must be unaware of the features of today's hardware if you find it difficult to do ray tracing on GPUs in realtime. Have you actually made the attempt?

Philipp ... give up on the dark side (specialized hardware)! Come over to the light! Programmable GPUs will double in floating point performance every 6-9 months, and specialized hardware cannot possibly keep up with mass-market processor development.

Slusallek: We did, of course, implement ray tracing on GPUs, and so did others. From all that I have seen, the hardware is not too far off from good software for typical simple test scenes, but things look much less favorable as soon as we go to more complex scenes with dynamic changes, complex shading, global illumination, and all the other advanced things we are all looking for.

All GPUs still contain quite a lot of "specialized hardware" for rasterization, texture filtering, and many other tasks. The same would be true for the ray tracing architecture we have been looking into so far. The core ray tracing algorithm would be implemented as a primitive operation in custom hardware, and it would be used by many small CPUs that take care of shading etc. It would actually be interesting to combine the vertex and pixel shaders of today's GPUs with a ray tracing core.

However, the main reason we are looking at this ray tracing hardware is to explore this design space. What we are seeing so far is very encouraging when compared to CPUs and GPUs, though there is still much research that needs to be done. And a bit of technological competition has always been a good recipe for achieving better results for everyone.


GameStar: Game development consumes more and more time and money. Ray tracing can ease the development process by performing pixel processing and collision detection in one pass. Can ray tracing give developers more time for making better games, or are they happy with doing all these tricks for this and that?

Kirk: I don't agree with the premise of the question. Very little time is now spent by developers in creating the rendering part of the game engine, so ray tracing won't help that. Most time is spent creating the "look" of the shaders and the characters, the models, and the definitions of the materials and environments. Rendering is relatively easy.

Slusallek: Rendering with rasterization is anything but easy if you want to do anything more than drawing simple textured triangles. Too many PhD students have spent way too much of their time improving shadow tricks for rasterization -- and it still does not work correctly. The same is true for many other effects.

Even worse is that each of these tricks has limitations and often interferes with other tricks. Game designers must always keep these limitations in mind when designing content instead of concentrating on their real job -- creating the best gaming environment. There is a reason that people like John Carmack are complaining that content creation is getting too difficult and time consuming and are looking for better alternatives.

In ray tracing all that needs to be done is sending one or more rays, which can be done completely in hardware and does not involve the application at all. The application simply defines the appearance of surfaces via small shader programs and the ray tracing engine does all the rest.

On a per pixel basis a ray tracer even combines the appearances (shaders) of multiple objects automatically and in the correct way whenever necessary, e.g. when seeing the semi-transparent shadow of a translucent object being reflected in a bump-mapped surface. There is nothing the application or a shader writer has to do as the ray tracer automatically and correctly simulates the light traveling through this environment.
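
(A minimal C++ sketch of this "the engine does the rest" structure, showing only reflection for brevity: each surface carries its own small shader, and a shader that wants a global effect simply asks the scene to trace another ray; whatever that ray hits runs its own shader in turn, so the combinations Slusallek describes fall out of the recursion. The same pattern extends to shadow and transparency rays. Everything here, the Scene interface, Material, and so on, is invented for the illustration and is not code from either group.)

    // Placeholder math and scene types, invented for this illustration only.
    struct Vec3 { float x = 0, y = 0, z = 0; };
    static Vec3  add(Vec3 a, Vec3 b)  { return { a.x + b.x, a.y + b.y, a.z + b.z }; }
    static Vec3  sub(Vec3 a, Vec3 b)  { return { a.x - b.x, a.y - b.y, a.z - b.z }; }
    static Vec3  mul(Vec3 a, float s) { return { a.x * s, a.y * s, a.z * s }; }
    static float dot(Vec3 a, Vec3 b)  { return a.x * b.x + a.y * b.y + a.z * b.z; }

    struct Ray { Vec3 origin, dir; };
    struct Material;                          // each surface carries its own small shader

    struct Hit {
        bool            found = false;
        Vec3            point, normal;
        const Material* material = nullptr;
    };

    // The "ray tracing engine" a shader talks to: it only knows how to find the closest
    // hit along a ray. How it does that (kd-trees, custom hardware, ...) is hidden.
    struct Scene {
        virtual Hit trace(const Ray& r) const = 0;
        virtual ~Scene() = default;
    };

    Vec3 shade(const Scene& scene, const Ray& ray, const Hit& hit, int depth);

    struct Material {
        Vec3  baseColor    { 0.5f, 0.5f, 0.5f };
        float reflectivity = 0.0f;            // 0 = matte, 1 = perfect mirror

        Vec3 shadeHit(const Scene& scene, const Ray& ray, const Hit& hit, int depth) const {
            Vec3 color = baseColor;
            if (reflectivity > 0.0f && depth > 0) {
                // Ask the engine for one more ray; whatever it hits runs *its own* shader,
                // so effects from different objects combine without any help from the
                // application. (A real tracer would offset the origin to avoid self-hits.)
                const Vec3 r = sub(ray.dir, mul(hit.normal, 2.0f * dot(ray.dir, hit.normal)));
                const Ray  reflected{ hit.point, r };
                const Hit  h = scene.trace(reflected);
                if (h.found)
                    color = add(mul(color, 1.0f - reflectivity),
                                mul(shade(scene, reflected, h, depth - 1), reflectivity));
            }
            return color;
        }
    };

    // Central dispatch: hand the hit to the shader of whatever surface the ray found.
    Vec3 shade(const Scene& scene, const Ray& ray, const Hit& hit, int depth) {
        if (!hit.found || hit.material == nullptr) return { 0, 0, 0 };   // background
        return hit.material->shadeHit(scene, ray, hit, depth);
    }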

With rasterization each different combination of effects would require its own complex programming, both at the shader and the application level. You had better not ask how much special-effects programming is involved in some of the great demos that show off the latest graphics cards.

Kirk: I'll stick with my original comment ... the rendering framework is essentially a solved problem. The real work in game authoring is model (scene) creation, character creation and animation, and lighting and shader creation for specific effects.

Ray tracing does not eliminate any of that work ... even in ray tracing, the material shaders still have to be written to create any material appearance that is desired. In rasterization, the shader is applied when the triangle is rendered. In ray tracing, intersections are determined, and the shader is applied to determine which additional rays should be generated for lighting and reflection. In either case, it's essentially the same shader.

Slusallek: I am a bit surprised about your comment on shading in rasterization versus ray tracing. You are right, as long as purely local effects are concerned (e.g. computing the color or normal procedurally or via textures). But the difference shows up when you want to do anything more advanced that also requires global effects.

Let's again take reflection as a very simple example to show the difference: yes, where a ray tracing shader simply shoots a ray in the reflected direction and is done, a Cg shader running on a rasterizer could just look up the reflected color from a reflection map. So at this level it is similar.

But how do you get the reflection map in the first place? What resolution do you use? Where do you place the camera? What do you do if the reflected geometry is too close, or the reflective surface is curved, so that the reflection map would produce clearly visible artifacts? And what if the reflected objects need special processing as well, because they are reflective too, or receive shadows, or should show nice global illumination effects?

The problem is that all of this needs to be worked out by the application outside of the graphics hardware. With ray tracing the shader running on the graphics hardware simply shoots one or more rays that get handled within the same hardware, and the application does not even care about these details. I sometimes have the feeling that, with the dominance of rasterization hardware today, too many of us are simply no longer aware of the many limitations that rasterization actually has.


GameStar: If we only talk about transistors, modern graphics chips like the NV40 should be able to perform ray tracing in real time. But they were not built for that - current approaches to running a ray tracer in a pixel shader are slow. Will we see real-time ray tracing on today's video cards, or do we need a completely new architecture with, let's say, a RISC processor for tracing rays into the scene?

Kirk: Researchers have already demonstrated ray tracers running on GPUs that are far faster than CPUs or special-purpose hardware. A modern GPU such as the GeForce 6800 has about 40x the floating point processing power of a 3 GHz Pentium 4 processor, and a general programming model with looping and branching as well. Anyone who believes that GPUs are not yet good at ray tracing is a few years behind the times. :-)

Slusallek: While it is possible to implement ray tracing on today's programmable GPUs, they are not well suited to the task. Even though they offer much higher raw processing power, the final ray tracing performance we and others see is well below even that of current CPUs. GPUs are optimized for simple linear code, while good ray tracing algorithms need to execute complex control flow and even recursive computations. These are hard to map to the GPU at all, and even then they do not perform well.
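
(One concrete example of the control-flow problem: walking a spatial hierarchy is naturally recursive, but the shader models of that generation have no call stack, so the recursion has to be flattened into an explicit stack, as in the C++ sketch below. The Node layout and the two intersection callbacks are placeholders invented for this illustration, not the implementation either side actually uses.)

    #include <functional>
    #include <vector>

    // A toy node of a bounding-volume hierarchy, invented for this sketch: a negative
    // 'left' marks a leaf that owns the triangles [first, first + count).
    struct Node {
        int left  = -1;     // index of the first child, or -1 for a leaf
        int right = -1;     // index of the second child
        int first = 0;      // first triangle owned by a leaf
        int count = 0;      // number of triangles in the leaf
    };

    // Recursion-free hierarchy traversal: the natural recursive walk flattened into an
    // explicit index stack and a single loop -- exactly the kind of irregular,
    // data-dependent control flow being discussed here. The ray/box and ray/triangle
    // tests are passed in as callbacks to keep the sketch self-contained.
    bool traverse(const std::vector<Node>& nodes,
                  const std::function<bool(const Node&)>& rayHitsBox,
                  const std::function<bool(int, float&)>& rayHitsTriangle,
                  float& closestT) {
        bool hit = false;
        closestT = 1e30f;
        if (nodes.empty()) return false;
        int stack[64];                            // explicit stack replacing recursion
        int top = 0;
        stack[top++] = 0;                         // start at the root node (index 0)
        while (top > 0) {
            const Node& n = nodes[stack[--top]];
            if (!rayHitsBox(n)) continue;         // the ray misses this whole subtree
            if (n.left < 0) {                     // leaf: test the triangles it contains
                for (int i = 0; i < n.count; ++i) {
                    float t;
                    if (rayHitsTriangle(n.first + i, t) && t < closestT) {
                        closestT = t;
                        hit = true;
                    }
                }
            } else {                              // inner node: push both children
                stack[top++] = n.left;
                stack[top++] = n.right;
            }
        }
        return hit;
    }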

In contrast, we have recently implemented a prototype of a custom ray-tracing-based graphics card using a single Xilinx FPGA chip. The first results show that this really simple hardware, running at 90 MHz and containing only a small fraction of the floating point units of a rasterization chip, already performs like an 8-12 GHz Pentium 4. In addition, it uses only a tiny fraction of the external memory bandwidth of a rasterization chip (often as low as 100-200 MB/s) and can therefore be scaled simply by using many parallel ray tracing pipelines, both on-chip and/or across multiple chips.

What this shows is that current GPUs are rather inefficient and not well suited for non-trivial code like ray tracing and that multi-core parallel CPUs or dedicated ray tracing hardware are a more promising approach. Maybe newer GPUs can change this but I am skeptical.

In the end I do not care which hardware my ray tracer is running on, as long as it is fast. All that matters is that our research provides the end user and game designer with the best possible graphics and gaming experience -- and this requires ray tracing.

Kirk: I think we can end your skepticism if you attempt writing a ray tracer on a GeForce 6800.

Slusallek: As I said, we have a ray tracer running on GPUs and we are closely tracking its performance as newer GPUs come out.

However, you are ignoring my point, that for the first time there is now a simple but fully functional prototype hardware for ray tracing that is already faster than both GPUs and CPUs for ray tracing. Even better, its performance can be scaled up simply by adding more copies onto the same chip because it is not limited by the usual memory bandwidth problems.

While there is certainly much work still to be done, I think it is fair to say that the conventional wisdom that ray tracing is more complex, slower, not practical for interactive use, etc. has finally been proven wrong. It will be interesting to analyze in detail the architectural differences between GPUs, CPUs, and custom hardware and to discuss their consequences for future 3D graphics and gaming. We are already implementing two different game engines on top of our system in order to study how future games could best take advantage of the new ray tracing features.

It has been much fun to work on this topic and I expect many new and interesting things to happen very quickly in the near future. (DV)
