In this chapter we'll present the foundations of ray tracing as a means of photo-realistic rendering. We'll cover the differences between photo-realistic and real-time rendering, and introduce some common terms: camera, shading, ray tracing, lights, shadows, global illumination, materials, BRDFs, Monte-Carlo methods, biased/unbiased GI algorithms. Some previous knowledge of general computer graphics and high-school level mathematics is expected.
What does "rendering" mean?
Let's take a step back and start with some theory, beginning with the term rendering. Imagine a computer game: it has a virtual world - the levels, weapons, explosions, events, lights, etc. - described in some abstract way. Rendering means turning this abstract description into an image to be presented on the screen.
We call the collection of all elements of this virtual world a scene. The elements in the scene are described with mathematical properties - rotations, translations, etc. The process of turning this scene into an image is what we call rendering.
There are different algorithms for rendering. We'll cover ray tracing in this chapter, while the other major approach is rasterization via DirectX/OpenGL.
DirectX/OpenGL vs. Ray tracing
Rasterization is used in games because of its high speed. All current video hardware is optimized for rasterizing triangle meshes - representations of real-world objects. DirectX and OpenGL are APIs to the underlying video hardware. The main property of the objects taken into account is their shape; after this step the rasterized triangles are colored using shader programs.
In ray tracing the main focus is light - what happens to it, how it moves, how it passes through matter and how it interacts with the scene. We simulate the particle (photon) properties of light, not its wave properties. The photons move along straight rays; those rays are absorbed, reflected from surfaces, refracted through glass, and interact with atmospheric effects.
Representing the 3D shape is not that important. We can use triangle meshes, a mathematical equation for a sphere or other parametric shapes, or subdivision surfaces.
The notion "ray tracing" means both the process and a concrete algorithm for its implementation. As a process this means simulating the light, how it moves through materials, volumes, how it interacts with surfaces. There's also a concrete algorithm which we'll cover later. In the optical industry other ray tracing algorithms are used, which implement more effects than our algorithm. They study what exactly happens with light, for effects like wave interference, diffraction, polarization, which are important there, but not so much in our photorealistic rendering.
Examples of simulating real-world effects
With ray tracing we can simulate all kinds of light effects, some of them atmospheric. Examples of such effects are global illumination, reflections and refractions, and depth of field. With rasterization they can only be faked with tricks like rendering in multiple passes and compositing afterwards.
Consider the following example images:
Depth of field
Layering will work in our first example, but not in the second.
Fish-eye camera
Image by Josef Stuefer
The third example uses a fish-eye camera, which can be simulated "out of the box" with ray tracing. Similar to depth of field, in rasterization this can only be faked using multiple renders, which results in worse quality, slower renders, etc.
Conclusion
Ray tracing has the following advantages: it achieves very realistic results and is easier to implement because it follows the known laws of physics. Its disadvantage is that it's pretty slow - in a game (where rasterization is used) you can play at HD resolution at 60 FPS, while a single high-quality ray-traced image at the same resolution can take minutes, hours or even days to render. As far as the accuracy of the result is concerned, the result from ray tracing is accepted as "the ground truth".
Foundations of Ray tracing
Ray tracers have the following structure: we have a camera which represents the eye in the virtual space. From the camera we cast rays which simulate light's paths in nature. They travel in a straight line until they hit an obstacle. The simulation direction is opposite to what happens in nature, where light travels from its source to the eye. We could trace in the "right" direction as well, but this would be inefficient, because only a small fraction of the cast rays would reach the camera. With the inverted tracing the image resolves orders of magnitude faster. This doesn't introduce any concerns, because most light effects are symmetric. In front of the camera there is an imaginary rectangle which represents the screen.
Camera
Casting rays through the camera
The pyramid in front of the camera is where the rays begin. The rectangular mesh on the right is the virtual screen.
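As an illustration, here is a minimal sketch of generating primary rays through such a virtual screen. It assumes a pinhole camera at the origin looking down the -Z axis; the field of view, image size and function names are illustrative assumptions, not taken from the text.

```cpp
#include <cmath>
#include <cstdio>

struct Vec3 { double x, y, z; };

const double kPi = 3.14159265358979323846;

// Direction of the primary ray through pixel (px, py) of a width x height image,
// for a pinhole camera at the origin looking down -Z with vertical field of view vfovDeg.
Vec3 primaryRayDir(int px, int py, int width, int height, double vfovDeg) {
    double aspect = double(width) / height;
    double halfH  = std::tan(vfovDeg * 0.5 * kPi / 180.0);
    double halfW  = halfH * aspect;
    // Map the pixel center to [-1, 1] on the virtual screen rectangle.
    double u = (px + 0.5) / width  * 2.0 - 1.0;
    double v = 1.0 - (py + 0.5) / height * 2.0;
    Vec3 d { u * halfW, v * halfH, -1.0 };
    double len = std::sqrt(d.x * d.x + d.y * d.y + d.z * d.z);
    return { d.x / len, d.y / len, d.z / len };
}

int main() {
    Vec3 d = primaryRayDir(320, 240, 640, 480, 60.0);  // ray through the image center
    std::printf("%.3f %.3f %.3f\n", d.x, d.y, d.z);
}
```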
Intersection
- We cast rays, and each ray contributes to the result for its respective pixel.
- After a ray is cast we must intersect it with the scene geometry. In general we have to find the closest intersection point with a solid object (see the sketch below).
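A minimal sketch of this "closest hit" search over a list of spheres. The ray-sphere test is the standard quadratic intersection; the struct and function names are illustrative.

```cpp
#include <cmath>
#include <cstdio>
#include <optional>
#include <vector>

struct Vec3 { double x, y, z; };
static Vec3   sub(Vec3 a, Vec3 b) { return { a.x - b.x, a.y - b.y, a.z - b.z }; }
static double dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

struct Ray    { Vec3 origin, dir; };            // dir is assumed to be normalized
struct Sphere { Vec3 center; double radius; };

// Distance along the ray to the nearest hit with one sphere, or nothing on a miss.
std::optional<double> intersect(const Ray& r, const Sphere& s) {
    Vec3 oc = sub(r.origin, s.center);
    double b = dot(oc, r.dir);
    double c = dot(oc, oc) - s.radius * s.radius;
    double disc = b * b - c;
    if (disc < 0) return std::nullopt;
    double t = -b - std::sqrt(disc);            // nearer root first
    if (t < 1e-6) t = -b + std::sqrt(disc);     // the origin may be inside the sphere
    if (t < 1e-6) return std::nullopt;
    return t;
}

// The closest-hit search: keep the smallest positive t over all objects in the scene.
std::optional<double> closestHit(const Ray& r, const std::vector<Sphere>& scene) {
    std::optional<double> best;
    for (const Sphere& s : scene)
        if (auto t = intersect(r, s); t && (!best || *t < *best))
            best = t;
    return best;
}

int main() {
    std::vector<Sphere> scene = { { { 0, 0, -5 }, 1.0 }, { { 0, 0, -10 }, 2.0 } };
    Ray r { { 0, 0, 0 }, { 0, 0, -1 } };
    if (auto t = closestHit(r, scene)) std::printf("nearest hit at t = %.2f\n", *t);
}
```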
Coloring/shading
- After we've found an intersection point, we must calculate how much light from the scene reaches this point and is reflected back to the camera. We call this process shading, and it is the computationally heaviest part of ray tracing.
Tracing through volumes
Light can interact not only with surfaces, but also with volumes. We can simulate:
- atmospheric effects like fog, clouds, smoke
- refraction of the ray direction (as in haze and heterogeneous media)
- attenuation of the light in absorbing media (like tea)
Beer-Lambert's law:
Image taken from Wikipedia
The light intensity decreases exponentially with the distance travelled in the medium.
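In formula form (with I₀ the intensity entering the medium, α its absorption coefficient and d the distance travelled - the symbols here are just the conventional notation):

I(d) = I₀ · e^(−α·d)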
Tracing through media
Apart from being attenuated, light is also scattered in such media, which is an important effect to simulate. Materials like skin, milk and wax exhibit this, resulting in an effect called subsurface scattering. Here's an illustration:
The classical ray tracing algorithm works with the surfaces of the objects in the scene. The properties of these surfaces depend on the materials with which the geometry is covered. For instance, a car would be assigned a "car paint" material; if you assigned a tree material to it instead, it would look pretty strange. There are various ways to describe materials. In physically based renderers (like PBRT, V-Ray, etc.) materials are described by functions of their reflective properties, called BRDFs (Bidirectional Reflectance Distribution Functions).
Lights and shadows
The next topic we'll cover is lights and shadows. Let's see how much light reaches the intersection point. Illumination in a scene comes from lights. When we intersect the geometry, we must check how much this point is illuminated by the lights. The standard approach is to cast rays from the intersection point towards the lights and check whether there's an obstacle that would cause shadowing. These rays are called shadow rays.
Here point A is completely lit by the 2 lights, point B is partially shadowed and point C is completely shadowed.
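A minimal sketch of that visibility test, assuming a hypothetical occlusion query anyIntersection() that reports whether any geometry blocks the segment towards a light; the names are illustrative and the query itself is not implemented here.

```cpp
#include <cmath>

struct Vec3 { double x, y, z; };
static Vec3   sub(Vec3 a, Vec3 b) { return { a.x - b.x, a.y - b.y, a.z - b.z }; }
static double length(Vec3 v)      { return std::sqrt(v.x * v.x + v.y * v.y + v.z * v.z); }

// Hypothetical scene query: does any object block the ray from p in direction dir
// within distance maxDist?  (Implementation not shown in this sketch.)
bool anyIntersection(Vec3 p, Vec3 dir, double maxDist);

// Fraction of the point lights that directly illuminate the shading point.
double directVisibility(Vec3 point, const Vec3* lights, int numLights) {
    const double eps = 1e-4;                 // small offset to avoid self-shadowing
    int visible = 0;
    for (int i = 0; i < numLights; ++i) {
        Vec3 toLight = sub(lights[i], point);
        double dist  = length(toLight);
        Vec3 dir { toLight.x / dist, toLight.y / dist, toLight.z / dist };
        Vec3 origin { point.x + dir.x * eps, point.y + dir.y * eps, point.z + dir.z * eps };
        if (!anyIntersection(origin, dir, dist - 2 * eps))   // the shadow ray
            ++visible;
    }
    return numLights > 0 ? double(visible) / numLights : 0.0;
}
```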
Lights
A good thing about the ray tracing algorithm is that we can use a large set of light types:
- point (omni) - the basic light source in DirectX/OpenGL rendering. It is a very rough approximation of a real-world light source (e.g. lamp)
- directional (parallel rays)
- cone-shaped (spotlight)
- rectangular or another shape with non-zero area - they are more realistic and produce soft shadows
- with a measured IES profile (a given real light source, e.g. a bulb, is measured with a goniophotometer)
- with arbitrary shape from a triangular mesh
Simulating area lights involves casting multiple rays, which generally slows down the rendering.
Shading
Let's recall: shading is calculating how much light from the scene falls on a given point and is reflected towards the camera. It has two main components:
- Light coming directly from the light sources is called direct illumination. The lights are the primary sources of light in the scene, so this part is usually the largest. On the very last image in this chapter this component is drawn in red.
- The light scattered towards the point from other objects in the scene is the indirect illumination; there are dedicated techniques to calculate it. In the same image this component is drawn in green and blue.
Global illumination
In the real world, global illumination (GI) = direct + indirect illumination. Its accurate simulation is important for photo-realistic effects such as ambient occlusion and color bleeding. We'll talk about these later.
Materials and BRDFs
Definition of BRDFs:
fr(x, ωi, ωo) := "probability that a light ray, coming from ωi in point x is reflected in direction ωo?"
The sun in the image represents a light source, and x is a point on the surface. We could also ask "what part of the light from the incoming direction goes out in the outgoing direction", but instead we'll use the probability model, where the BRDF is actually a probability distribution function. Intuitively, we are interested in how much light from one direction continues in another.
Properties of BRDFs
- Symmetry
fr(x, ωi, ωo) = fr(x, ωo, ωi)
- Conservation of energy
∫Ω fr(x, ωi, ωo) cos(θo) dωo ≤ 1
The latter means the BRDF doesn't "generate" light; it may only absorb it. Ω is the hemisphere above the surface.
BRDF Types
- BRDFs of real materials can be measured with a gonioreflectometer. The method lights a given point with a laser and sweeps a sensor over the hemisphere to see what part of the light leaves in each direction. This way we can build a tabulated BRDF. In practice these measured BRDFs are rarely used.
- We can use "ideal" BRDFs which resemble often used materials.
Examples
- Diffuse BRDF (like a sheet of paper) - the light is distributed uniformly in all directions:
fr(x, ωi, ωo) = 1/(2π)
The hemisphere has a solid angle of 2π steradians, so the probability for each direction is 1/(2π).
- Reflective BRDF (like polished metal, a mirror, a water surface):
fr(x, ωi, ωo) = {∞, when ωo=reflect(ωi); 0, else}
- Combined case (like bumpy metal, frosted mirror) - simulates glossy reflection:
Layered materials
Usually real-world BRDFs can be approximated with a combination of several simpler ones. For instance, simulating polished wood may involve two layers with different BRDFs:
- Lacquer (upper layer) - semi-transparent, reflective
- Wood (lower layer) - diffuse, with a texture (meaning the value of the BRDF depends on where x lies in texture space)
Monte-Carlo methods
We'll continue with some theory which is quite important for ray tracing - Monte-Carlo integration methods. Generally speaking, the method can calculate the integral of an arbitrary function to within a given error, so it gives an approximation of the solution. Remarkably, these numeric methods behave well even when the function is not easy to integrate - when it is discontinuous or has large variation. The problem when using Monte-Carlo for such irregular functions is that it produces noise, i.e. some inaccuracy. When we calculate the integral for each pixel, it will have some deviation from the actual value, and we'll see this as noise. It's better, though, to have an accurate image with some noise than a smooth image which is inaccurate.
The following formula describes the Monte-Carlo method:
∫ f(x) dx ≈ (1/N) ∑i=1..N f(xi) / ρ(xi)
- f(x) - the function we're integrating; it can be multi-dimensional, returning color in our case; x can be a vector too
- N - number of samples
- xi - random points in the domain (uniformly distributed in the simplest case)
- ρ(xi) - the probability (density) of choosing xi
The theory says that the error is proportional to 1/√N, so it decreases as the sample count grows. The distribution of the chosen points matters: if we take too many points in one sub-domain, the calculation will be biased towards those values.
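A minimal numeric sketch of the estimator above, integrating an arbitrary one-dimensional function over [0, 1] with uniformly distributed samples (so ρ(xi) = 1); the test function x² is an arbitrary choice for illustration.

```cpp
#include <cstdio>
#include <random>

// Monte-Carlo estimate of the integral of f over [0, 1] with N uniform samples.
template <class F>
double monteCarlo(F f, int N, std::mt19937& rng) {
    std::uniform_real_distribution<double> uniform(0.0, 1.0);
    double sum = 0.0;
    for (int i = 0; i < N; ++i) {
        double x = uniform(rng);          // xi: uniformly distributed, rho(xi) = 1
        sum += f(x);
    }
    return sum / N;                       // (1/N) * sum of f(xi) / rho(xi)
}

int main() {
    std::mt19937 rng(42);
    // The exact integral of x^2 over [0, 1] is 1/3; the error should shrink
    // roughly like 1 / sqrt(N).
    for (int N : { 100, 10000, 1000000 })
        std::printf("N = %7d  estimate = %.5f\n", N,
                    monteCarlo([](double x) { return x * x; }, N, rng));
}
```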
Application of Monte-Carlo
A simple application would be to calculate the illumination from a rectangular light.
The usual Monte-Carlo technique is to take a number of random points on the light's surface and cast rays towards those points. We want to calculate how much light falls on the lower point. For this example we choose 3 points on the light's surface and see which of them are visible from the point of interest; in practice we'd cast 20, 30 or even 1000 rays, depending on the accuracy needed. The function we want to integrate takes a vector - a point on the light's surface - and returns 1 if that point is visible from the point of interest and 0 otherwise: we construct a ray between the two points, and if it doesn't hit an object the function equals 1, otherwise 0.
In our case, if the points on the light are distributed uniformly, the probability of choosing each point is equal, so ρ(xi) is a constant. Intuitively, we can pull it out of the sum and simply multiply the result by the area of the light - this is also why larger lights appear brighter. A sketch of this estimate follows.
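A sketch of that estimate for a rectangular light, assuming the same hypothetical anyIntersection() occlusion query as before. With uniform sampling ρ = 1/area, so the 1/ρ factor turns into a multiplication by the light's area; the axis-aligned layout of the light is an assumption made to keep the example short.

```cpp
#include <cmath>
#include <random>

struct Vec3 { double x, y, z; };

// Hypothetical occlusion query: is the ray from p in direction dir blocked
// within maxDist?  (Implementation not shown.)
bool anyIntersection(Vec3 p, Vec3 dir, double maxDist);

// Illumination estimate for an axis-aligned rectangular light in the plane y = lightY,
// spanning [x0, x0+w] x [z0, z0+h].  Uniform sampling, so rho = 1 / (w * h) and the
// estimator ends with a multiplication by the light's area.
double rectLightEstimate(Vec3 shadePoint, double x0, double z0, double w, double h,
                         double lightY, int numSamples, std::mt19937& rng) {
    std::uniform_real_distribution<double> u01(0.0, 1.0);
    double sum = 0.0;
    for (int i = 0; i < numSamples; ++i) {
        Vec3 p { x0 + u01(rng) * w, lightY, z0 + u01(rng) * h };  // random point on the light
        Vec3 d { p.x - shadePoint.x, p.y - shadePoint.y, p.z - shadePoint.z };
        double dist = std::sqrt(d.x * d.x + d.y * d.y + d.z * d.z);
        Vec3 dir { d.x / dist, d.y / dist, d.z / dist };
        // f(xi) is 1 when the sample point is visible from the shading point, 0 when shadowed.
        if (!anyIntersection(shadePoint, dir, dist - 1e-4))
            sum += 1.0;
    }
    return (sum / numSamples) * (w * h);   // (1/N) * sum of f(xi) / rho(xi)
}
```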
When the function is harder to integrate, it's often better to use a ρ which is not constant. For instance, suppose the light is brighter in its middle area: if we choose points uniformly, check them for shadowing and multiply by the light's power at each particular point, the points from the middle area carry more weight, yet we sample them no more often than the rest. The result is not optimal, because we've taken too few samples from the "interesting" area. Therefore it's better to choose the points non-uniformly - in this example, to take more points from the middle area. This means the ρ part of the equation is no longer a constant, so that it accounts for the non-uniform distribution.
With a small number of samples the Monte-Carlo estimate is not particularly accurate. For instance, if we have two neighbouring points in the scene, for one of them a couple more sample points may land in the lit part of the light than for the other. This gives different values of the integral for the neighbouring points, and we see it as noise. Getting rid of this noise is key to producing a good-looking image. One of the techniques for this is called importance sampling - it places the samples in areas where the function is "interesting", i.e. has a higher value. A user-controlled parameter can determine how many samples we use for the integration. With importance sampling we can use far fewer samples to produce an image of similar quality - for instance 100 instead of 1000 - which speeds up the integration significantly.
Other Monte-Carlo techniques
We'll cover 3 more techniques for Monte-Carlo integration:
- Antialiasing (smoothing of edges) - also present in graphics cards. We achieve it by casting more than one ray per pixel, each ray giving its contribution, and then taking the mean value of all ray contributions (see the sketch after this list).
- A similar technique can be used for implementing glossy reflections. When a light ray hits a point on a glossy reflective surface, it's partially reflected in directions around the purely specular reflection direction. We can use Monte-Carlo to cast additional rays in those secondary directions and compute the total color at this point.
- The third technique can be used for simulating the depth-of-field effect. It's achieved again by casting multiple, slightly displaced rays from the camera, and the pixel's final color is the average of their contributions. This actually simulates the behaviour of a real-world camera lens.
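A sketch of the antialiasing case, assuming a hypothetical tracePixelRay() that returns the color gathered by one primary ray through a given sub-pixel position; the other two techniques average multiple displaced rays in the same way.

```cpp
#include <random>

struct Color { double r, g, b; };

// Hypothetical: trace one primary ray through sub-pixel position (px + sx, py + sy)
// and return the color it gathers.  (Not implemented in this sketch.)
Color tracePixelRay(int px, int py, double sx, double sy);

// Antialiasing: average the contributions of several jittered rays per pixel.
Color renderPixel(int px, int py, int samplesPerPixel, std::mt19937& rng) {
    std::uniform_real_distribution<double> jitter(0.0, 1.0);
    Color sum { 0, 0, 0 };
    for (int s = 0; s < samplesPerPixel; ++s) {
        Color c = tracePixelRay(px, py, jitter(rng), jitter(rng));
        sum.r += c.r;  sum.g += c.g;  sum.b += c.b;
    }
    return { sum.r / samplesPerPixel, sum.g / samplesPerPixel, sum.b / samplesPerPixel };
}
```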
Monte-Carlo methods are used in various areas of the rendering process. We already mentioned methods to calculate the indirect illumination at a particular point of the scene, which is important for Global illumination. Some of them are based again on Monte-Carlo techniques.
The foundational theory behind these calculations is "The Rendering Equation", introduced in 1986 by Jim Kajiya. The formula postulated in his paper is extremely important and is used by practically all commercial products for photo-realistic rendering that support GI.
Global illumination (GI)
The main idea of GI is that light reflected to the camera from a given point is light coming to that point from the whole scene. In some scenes the indirect illumination (the thinner rays in the image) is pretty important.
Next we see some examples of the importance of indirect illumination:
Image taken from Wikipedia
In the first example we assume some ambient lighting which is equal across the scene. You can see how we lose detail on some edges (at the top of the column). When using indirect illumination (the second example) we achieve a far more realistic image.
Effects
Ambient occlusion
The next effect we'll cover is called ambient occlusion. It's caused by the fact that each point near the pistol "sees" a different part of the sky. When we simulate this, it results in the characteristic shadows which can be seen in the following image:
Image taken from Dan Cole's Blog
Color bleeding
The third effect, which we can see in the next image, is called color bleeding. The cubes in the scene are colored gray; in the highlighted areas, surfaces close to a colored surface receive some of its color.
Totally indirect illumination
Imagine a П-shaped corridor with a light placed at one end and some objects at the other. Light reaches the objects only through reflections, i.e. their illumination is entirely indirect.
The Rendering Equation
- Kajiya formalizes this idea in his "fundamental formula of illumination" (The Rendering Equation), a.k.a. the "light transport equation".
- Applying his formula we can calculate the light at each point of the scene.
Image taken from Wikipedia
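Written out, using the same notation as the component-by-component explanation below (θi is the angle between ωi and the surface normal n at x):

Lo(x, ωo) = Le(x, ωo) + ∫Ω fr(x, ωi, ωo) Li(x, ωi) cos(θi) dωi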
The light leaving point x in direction ωo is what we want to calculate. The first component, Le, is the "emitting" part - the light emitted from point x if it belongs to a light source. The hard part is the integral: it integrates over the hemisphere, covering all possible incoming directions (denoted by ωi). For each incoming direction we take Li (the incoming light) and multiply it by the reflectance function (the BRDF) for that point and pair of directions. We additionally multiply by the cosine of the angle between ωi and the surface normal at point x.
- The first component Lo(x, ωo) is what we want to calculate.
- Second, Le(x, ωo) is the light component in the output direction. It is non-zero only if x is part of a light.
- fr is the BRDF function.
- We take all incoming light from all input directions Li(x, ωi) by integrating it, in practice using the Monte-Carlo method.
- We take the cosine between the surface normal n and the input ray ωi. The reason for the cosine is that when the light comes at a grazing angle, it is spread over a larger surface and its intensity at a given point is smaller. This is also known as the "Lambertian term".
- Finally we integrate over all possible input directions to cover all the incoming light.
This means that, according to the formula, calculating the light at a given point requires knowing the light coming from all other points, including the non-emitting ones - everything depends on everything else. But we're not equally interested in all points. Therefore we can construct an algorithm which handles the most important points first, then the less important ones, then the even less important, and so on. Traversing the scene this way, each consecutive level of points matters less, and at some point we can ignore all the rest.
Path tracing
- Kajiya himself proposes such an algorithm, called path tracing. It still works very well more than 25 years later. It's also referred to as "brute force".
- The idea is to construct many paths (chains of multiple rays). Rays start from the camera, and at each intersection point we cast a new, randomized ray (the next segment of the path).
- The calculation at each point is done using The Rendering Equation, evaluating the BRDF for the two adjacent segments of the path. Each path contributes to the final image, but the contributions can be quite diverse - from over-bright pixels to completely black ones. We need to average a large number of paths (thousands, for example) to achieve good results; giving the renderer more time yields better results. A simplified sketch of a single path follows.
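A heavily simplified sketch of one such path, for diffuse surfaces only; closestHit(), surfaceAt data and sampleHemisphere() are hypothetical helpers standing in for the scene query, material lookup and random direction generation, and are not implementations from the text.

```cpp
#include <random>

struct Vec3  { double x, y, z; };
struct Color { double r, g, b; };
struct Ray   { Vec3 origin, dir; };

struct Hit {
    bool  found;
    Vec3  point, normal;
    Color emitted;        // Le: non-zero only on light sources
    Color albedo;         // diffuse reflectance of the surface
};

// Hypothetical helpers (not implemented in this sketch).
Hit  closestHit(const Ray& r);
Vec3 sampleHemisphere(const Vec3& normal, std::mt19937& rng);  // cosine-weighted direction

// One path: follow the ray, and at each bounce add the emitted light weighted by the
// product of surface reflectances accumulated so far (the path "throughput").
Color tracePath(Ray ray, int maxDepth, std::mt19937& rng) {
    Color result     { 0, 0, 0 };
    Color throughput { 1, 1, 1 };
    for (int depth = 0; depth < maxDepth; ++depth) {
        Hit hit = closestHit(ray);
        if (!hit.found) break;                          // the path escaped the scene
        result.r += throughput.r * hit.emitted.r;       // the Le term of the
        result.g += throughput.g * hit.emitted.g;       // rendering equation
        result.b += throughput.b * hit.emitted.b;
        throughput.r *= hit.albedo.r;                   // for a cosine-weighted sample of a
        throughput.g *= hit.albedo.g;                   // diffuse BRDF, fr * cos / pdf
        throughput.b *= hit.albedo.b;                   // reduces to the albedo
        ray = { hit.point, sampleHemisphere(hit.normal, rng) };  // next segment of the path
    }
    return result;
}
```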
Biased/unbiased GI algorithms
We'll now cover the biased and unbiased algorithms for GI.
- Unbiased:
- Path tracing is unbiased - its error shows up only as noise. This means that, given enough time, it converges to the correct result.
- There's also light tracing, where rays are traced from the light sources towards the camera.
- Bi-directional path tracing - combines the above two.
- Metropolis light transport - similar to path tracing, but when it finds a good path (one carrying more energy), it generates more paths based on it. It gives good results in scenes like the following classic pathological GI test: a closed door with a keyhole, and a bright light outside. We need to simulate the light coming through the keyhole. With ordinary path tracing it's very unlikely for a path to pass through the keyhole, so we'd need a huge number of rays to get any result. Once Metropolis light transport finds a path through the keyhole, it starts constructing more paths similar to it.
- Biased - their result may look smoother with fewer samples, but may contain inaccuracies which do not go away no matter how many samples are used. Examples of biased algorithms:
- Photon mapping
- Irradiance mapping - it's pretty good, but leads to worse results in some specific cases.
- Radiosity
Data structures for intersection
All of the Monte-Carlo strategies above need to trace millions, even billions of rays. Therefore:
- Intersection needs to be very fast.
- When the rays far outnumber the geometric primitives (objects and triangles), it's worth using structures for fast intersection (the bounding-box test used by both structures below is sketched after this list).
- K-d tree
- Spatial structure, useful for intersecting rays with triangle meshes
- O(log N) for intersection, O(N log N) for construction in the usual case
- We need to take care that the construction time doesn't negate the gain from faster intersections
- Bounding Volume Hierarchy (BVH)
- Usually constructs much faster than a K-d tree
- Intersections are slightly slower than a K-d tree
- Can be a hybrid: a BVH containing small K-d trees
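Both structures ultimately rely on quickly rejecting rays against axis-aligned bounding boxes; a minimal sketch of that "slab" test follows (a simplified version - degenerate, axis-parallel rays would need extra care).

```cpp
#include <algorithm>

struct Vec3 { double x, y, z; };

// Slab test: does the ray origin + t * dir (t >= 0) hit the axis-aligned box
// [boxMin, boxMax]?  invDir holds the precomputed reciprocals of dir's components.
bool hitsBox(const Vec3& origin, const Vec3& invDir,
             const Vec3& boxMin, const Vec3& boxMax) {
    double tmin = 0.0;
    double tmax = 1e30;                 // effectively "infinity" for this sketch
    const double o[3]   = { origin.x, origin.y, origin.z };
    const double inv[3] = { invDir.x, invDir.y, invDir.z };
    const double lo[3]  = { boxMin.x, boxMin.y, boxMin.z };
    const double hi[3]  = { boxMax.x, boxMax.y, boxMax.z };
    for (int axis = 0; axis < 3; ++axis) {
        double t1 = (lo[axis] - o[axis]) * inv[axis];
        double t2 = (hi[axis] - o[axis]) * inv[axis];
        tmin = std::max(tmin, std::min(t1, t2));   // entry into this slab
        tmax = std::min(tmax, std::max(t1, t2));   // exit from this slab
    }
    return tmin <= tmax;                // the ray overlaps all three slabs at once
}
```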
Parallel algorithms for ray tracing
- Given that each cast ray contributes to a single pixel (and only reads the scene, without modifying it), the algorithm is almost perfectly parallelizable.
- Rendering can be distributed among many processor cores, many machines, many resources; a minimal sketch follows.
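A minimal sketch of splitting the image rows across hardware threads, assuming a hypothetical renderRow() that traces every pixel of one row independently; the interleaved row assignment is just one simple scheduling choice.

```cpp
#include <algorithm>
#include <thread>
#include <vector>

// Hypothetical: trace every pixel of row y and write the results into the framebuffer.
// Rows do not share mutable state, so no locking is needed.  (Not implemented here.)
void renderRow(int y, int width);

void renderImage(int width, int height) {
    unsigned numThreads = std::max(1u, std::thread::hardware_concurrency());
    std::vector<std::thread> workers;
    for (unsigned t = 0; t < numThreads; ++t) {
        workers.emplace_back([=] {
            // Interleave rows among threads: thread t takes rows t, t + N, t + 2N, ...
            for (int y = int(t); y < height; y += int(numThreads))
                renderRow(y, width);
        });
    }
    for (std::thread& w : workers) w.join();
}
```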
Physically based rendering
To achieve maximum photorealism, we need all elements in the scene to follow a physically correct model.
- Materials obeying the Conservation of energy law
- Physically correct BRDFs
- Photometric area and metered (IES) lights, Sun-sky system based on physical models
- Physically correct GI algorithms
Conclusion
In conclusion, let's recall that ray tracing is a technique for generating an image by tracing paths of light from the camera through the pixels of an image plane and simulating the effects of its encounters with virtual objects. Different effects are created by tracing different kinds of rays; the diagram above shows how the basic effects are generated. Primary rays (red) are always traced from the camera into the scene to determine what will be visible in the final image. To create direct illumination and shadows, shadow rays (black) are traced from each rendered point to each light in the scene: if a ray reaches a light, the point is illuminated according to the light's settings; if it hits an object, the point is in shadow. Reflection rays (green) are traced in the direction of the specular reflection vector, which depends on the type of reflection - normal or Fresnel, using the index of refraction of the material. The direction of refraction rays (blue) depends only on the index of refraction of the material. For clear reflections and refractions only a single ray is traced; to create glossy reflections or refractions many rays are traced in a cone, whose spread depends on the glossiness amount.
Subsurface scattering and translucency effects are generated by tracing rays inside the geometry.