Interactive GPU Rendering

The Interactive GPU rendering modes rely on the GPU to perform almost all tasks, using either RTX or CUDA. Graphics cards offer more raw computational power than CPUs, but they can only make full use of it when the work is highly parallel and the same operations are applied to all data. This requires a different programming approach when implementing a GPU render engine. The other big difference from the CPU is that the available memory is usually several times smaller. This may limit the usability of GPUs for very large and complex scenes, especially if the scenes were not designed for GPU rendering from the start.

Due to the architectural differences, V-Ray GPU is a separate render engine, so it may give slightly different results compared to V-Ray. Also, global illumination cannot be turned off on the GPU.

The CUDA engine on GPU devices requires NVIDIA cards with a minimum compute capability of 5.2.*

RTX is not supported on macOS. Under macOS, V-Ray GPU works with CUDA CPU devices.

* CUDA compute capability and card reference

Example

The following code can be used to activate either mode by setting the renderMode of the VRayRenderer:

import vray

with vray.VRayRenderer() as renderer:
    renderer.renderMode = 'interactiveCuda' # or 'interactiveOptix'
    renderer.load('retro.vrscene')
    renderer.startSync()
    # ...

VRayInit init(NULL, true);
VRayRenderer renderer;
renderer.setRenderMode(VRayRenderer::RENDER_MODE_INTERACTIVE_CUDA); // or VRayRenderer::RENDER_MODE_INTERACTIVE_OPTIX
renderer.load("retro.vrscene");
renderer.startSync();
// ...

using (VRayRenderer renderer = new VRayRenderer()){
    renderer.RenderMode = RenderMode.INTERACTIVE_CUDA; // or RenderMode.INTERACTIVE_OPTIX
    renderer.Load("retro.vrscene");
    renderer.StartSync();
    // ...
}

var renderer = vray.VRayRenderer();
renderer.renderMode = 'interactiveCuda'; // or 'interactiveOptix'
renderer.load('retro.vrscene', function(err) {
    if (err) throw err;
    renderer.startSync();
    // ...
});

Parameters

Options for the V-Ray GPU engine are in SettingsRTEngine and most are the same as described in the previous lesson. There are a few that are specific to the GPU:

gpu_bundle_size - Controls the number of rays that are processed together. Smaller values may increase interactivity. It is not recommended to increase this value beyond 512.
gpu_samples_per_pixel - The number of rays that are traced for each pixel during one image pass. The greater the value, the smoother the picture from the very beginning of the rendering, but interactivity may be significantly diminished and image updates will come at longer intervals. Increasing this value also reduces the amount of data transferred from render servers back to the client machine in DR.
low_gpu_thread_priority - Enable this to reduce the load on GPUs that are connected to a display. This should make the operating system UI more responsive. It works best with gpu_samples_per_pixel=1 and gpu_bundle_size=64 (or less).
opencl_resizeTextures - Texture transfer mode for the GPU. Note that on-demand mipmapping is available only in non-interactive mode. (0:Full size textures; 1:Resize textures; 2:On-demand mipmapping)
opencl_textureFormat - Format for the textures on the GPU (0 - 32-bit float per channel; 1 - 16-bit half float per channel, default; 2 - 8-bit per channel)
coherent_tracing - When enabled, V-Ray GPU will spawn secondary rays (GI, reflection, refraction) in similar directions in order to improve rendering speed on GPUs. This has the most effect for interior scenes where light bounces around. The downside is that initially the image will have some artifacts until more samples are accumulated.
enable_bucket_sampler - If 1, V-Ray GPU will check the sampler type in SettingsImageSampler and use the settings there if the sampler type is "bucket". Default is 0 (use progressive).
out_of_core - When true, the V-Ray GPU out-of-core code path will be used.
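
The recommendation given for low_gpu_thread_priority can be captured in a small helper. This is only a minimal sketch: apply_responsive_ui_preset is a hypothetical name, and it assumes that the SettingsRTEngine plugin instance exposes its parameters as plain attributes. Obtaining that instance from the renderer is not shown here; a SimpleNamespace stands in for it so the sketch runs without the SDK.

```python
from types import SimpleNamespace

# Values suggested in the parameter descriptions above for a GPU
# that is also connected to a display.
RESPONSIVE_UI_PRESET = {
    "low_gpu_thread_priority": True,
    "gpu_samples_per_pixel": 1,
    "gpu_bundle_size": 64,
}


def apply_responsive_ui_preset(settings, preset=RESPONSIVE_UI_PRESET):
    """Assign each preset value to the matching attribute of `settings`.

    `settings` is assumed to be a SettingsRTEngine plugin instance (or any
    object with the same attribute names). Returns the sorted names that
    were set.
    """
    for name, value in preset.items():
        setattr(settings, name, value)
    return sorted(preset)


# Stand-in object used only for illustration; it is not part of the V-Ray API.
fake_settings = SimpleNamespace(gpu_bundle_size=256, gpu_samples_per_pixel=16)
changed = apply_responsive_ui_preset(fake_settings)
```

Keeping the values in one dictionary makes it easy to switch between a "responsive desktop" preset and a throughput-oriented one without touching the rest of the setup code.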

Selecting GPU devices

The following example illustrates how you can select specific devices to render with. You may use it, for example, to exclude the GPU that drives the display or an OpenGL viewport. Note that you can also use the low_gpu_thread_priority parameter to reduce the load on the main GPU. By default, all available devices are used.

selected = []
devices = renderer.getComputeDevicesCUDA() # or getComputeDevicesOptix()
# we have to pass a list with the indices of the selected devices
for index, device in enumerate(devices):
    if device['totalMemoryMB'] > 4000:
        selected.append(index)
setDevicesOk = renderer.setComputeDevicesCUDA(selected) # or setComputeDevicesOptix(selected)

std::vector<int> selected;
std::vector<ComputeDeviceInfo> devices = renderer.getComputeDevicesCurrentEngine();
// we have to pass a list with the indices of the selected devices
for (int index = 0; index < static_cast<int>(devices.size()); ++index) {
    if (devices[index].totalMemoryMB > 4000)
        selected.push_back(index);
}
bool setDevicesOk = renderer.setComputeDevicesCurrentEngine(selected);

List<int> selected = new List<int>();
ComputeDeviceInfo[] devices = renderer.GetComputeDevicesCurrentEngine();
// we have to pass a list with the indices of the selected devices
for (int index = 0; index < devices.Length; ++index) {
    if (devices[index].TotalMemoryMB > 4000)
        selected.Add(index);
}
bool setDevicesOk = renderer.SetComputeDevicesCurrentEngine(selected);

var selected = [];
var devices = renderer.getComputeDevicesCUDA(); // or getComputeDevicesOptix()
// we have to pass a list with the indices of the selected devices
for (var index = 0; index < devices.length; ++index) {
    if (devices[index].totalMemoryMB > 4000)
        selected.push(index);
}
var setDevicesOk = renderer.setComputeDevicesCUDA(selected); // or setComputeDevicesOptix(selected)
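
A memory filter like the one above can end up selecting nothing on a machine that only has small cards. The following sketch expresses the same filter as a pure function that returns None in that case, so the caller can skip the set call and keep the default of using all devices. The name pick_devices_by_memory and the sample device entries are hypothetical; the entries are assumed to be dictionaries with a 'totalMemoryMB' key, as in the Python example above.

```python
def pick_devices_by_memory(devices, min_memory_mb=4000):
    """Return the indices of devices with more than `min_memory_mb` of memory.

    Returns None when no device qualifies, which the caller can treat as
    "leave the default selection (all devices) unchanged".
    """
    indices = [
        index
        for index, device in enumerate(devices)
        if device["totalMemoryMB"] > min_memory_mb
    ]
    return indices or None


# Made-up device lists, used only to illustrate the behaviour.
mixed_devices = [
    {"totalMemoryMB": 2048},
    {"totalMemoryMB": 8192},
    {"totalMemoryMB": 24576},
]
small_devices = [{"totalMemoryMB": 2048}, {"totalMemoryMB": 4000}]

selected_mixed = pick_devices_by_memory(mixed_devices)
selected_small = pick_devices_by_memory(small_devices)
```

With a real renderer, the result would be passed to setComputeDevicesCUDA (or setComputeDevicesOptix) only when it is not None.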

"Hybrid" rendering (CUDA)

V-Ray GPU CUDA allows you to use your CPU as an emulated CUDA device. It shows up as "C++/CPU" in the first position of the device list and is disabled by default. You can enable it to improve your render speed, and you can even use it without any GPUs. In some cases rendering with V-Ray GPU on the CPU may be faster than rendering with the "regular" V-Ray on the same CPU, because the GPU code has a simpler structure. The results you get from "hybrid" rendering are identical to the results from V-Ray GPU on a graphics card. In other words, it has the same potential differences from the normal CPU V-Ray engine and the same limitations.
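
Because the emulated device appears in the same device list as the GPUs, it can be switched on or off with the device selection mechanism from the previous section. The sketch below builds the index list with or without the CPU entry. The name hybrid_device_indices is hypothetical, and the code assumes that each device entry has a 'name' key containing the "C++/CPU" label; neither is confirmed by the text above. The sample device names are made up.

```python
CPU_DEVICE_LABEL = "C++/CPU"  # label of the emulated CUDA device, per the text above


def hybrid_device_indices(devices, use_cpu=True):
    """Return indices of all devices, optionally leaving out the CPU device.

    `devices` is assumed to be a list of dicts with a 'name' key.
    """
    selected = []
    for index, device in enumerate(devices):
        is_cpu = CPU_DEVICE_LABEL in device["name"]
        if is_cpu and not use_cpu:
            continue  # skip the emulated device for GPU-only rendering
        selected.append(index)
    return selected


# Made-up device list with the CPU device in the first position.
sample_devices = [
    {"name": "C++/CPU"},
    {"name": "Example GPU 0"},
    {"name": "Example GPU 1"},
]

hybrid = hybrid_device_indices(sample_devices, use_cpu=True)
gpu_only = hybrid_device_indices(sample_devices, use_cpu=False)
```

The resulting list would then be passed to setComputeDevicesCUDA, as in the device selection example.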
