V-RAY NEXT  3DS MAX  DISTRIBUTED RENDERING

This guide covers some of the most common questions and issues with Distributed Rendering along with instructions on how to troubleshoot and resolve them.

Before reading this article, please make sure you are familiar with the Distributed Rendering Setup page.

 

Overview


Distributed Rendering (DR) is a technique for distributing a single render job within a single frame across many computers in a network. This reduces render times by dividing the job into sub-tasks and assigning them to different Render Slave machines. When a Render Slave completes a sub-task, it sends the result back to the Workstation and can then receive new sub-tasks until all tasks are completed. Finally, the image is assembled by combining the output of all Render Slaves.

Since the terms Workstation, Render Slave and a few others are going to be used throughout the entire article, you can find their definitions below:

  • A Workstation is the machine from which a Distributed Rendering job is submitted;
  • A Render Slave is a machine to which a Distributed Rendering job is submitted. It can also be referred to as a Render Server;
  • Distributed Rendering Spawner is a V-Ray instance waiting for render jobs. A Distributed Rendering Spawner must be running on each Render Slave machine participating in Distributed Rendering;
  • V-Ray DR spawner (vrayspawner) is used in CPU distributed rendering. It starts a 3ds Max instance on the slave machine that waits for render requests;
  • V-Ray GPU Render Server is the equivalent of a V-Ray DR spawner when GPU rendering is used. A V-Ray GPU Render Server is a V-Ray standalone instance running in server mode (waiting for render requests).

A consistent setup across all machines is crucial for DR to work correctly. Among other things, all network paths, if used, must be accessible by all machines, and all machines need to run the same version of all software used. An inconsistent setup may lead to rendering discrepancies, and the final image may be incorrect. This is also why proper maintenance and setup integrity are vital.

 

Here is a short description of how Distributed Rendering works:

  • The scene is saved in the temp folder of the Workstation machine
  • The scene is then transferred from the Workstation machine to all Render Slave machines
  • Each Render Slave loads and reads the scene
  • Each Render Slave loads all scene assets
  • Each Render Slave renders the part of the image it was assigned
  • Each Render Slave sends the rendered pixels to the Workstation machine
  • The Workstation machine assembles the final image
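The split, render and assemble steps above can be pictured with a small simulation. This is only a sketch of the general idea, not V-Ray's scheduler: the host names, the `render_tile` stand-in and the round-robin hand-out are all assumptions made for illustration.

```python
from collections import deque

def split_into_buckets(width, height, bucket):
    """Return (x, y, w, h) tiles that cover a width x height image."""
    tiles = []
    for y in range(0, height, bucket):
        for x in range(0, width, bucket):
            tiles.append((x, y, min(bucket, width - x), min(bucket, height - y)))
    return tiles

def render_tile(tile, host):
    """Stand-in for real rendering: fill the tile with the host name."""
    _, _, w, h = tile
    return [[host] * w for _ in range(h)]

def distribute(width, height, bucket, hosts):
    """Hand tiles to hosts in turn and assemble the results into one image."""
    image = [[None] * width for _ in range(height)]
    pending = deque(split_into_buckets(width, height, bucket))
    turn = 0
    while pending:
        tile = pending.popleft()
        # Assumption: simple round-robin. A real scheduler would give the
        # next tile to whichever machine is free.
        host = hosts[turn % len(hosts)]
        turn += 1
        pixels = render_tile(tile, host)
        x, y, w, h = tile
        for row in range(h):
            image[y + row][x:x + w] = pixels[row]
    return image

# Hypothetical host names, used only to label the pixels.
image = distribute(8, 4, 4, ["workstation", "slave-01"])
```

Every pixel of `image` ends up labelled with the host that produced it, which is conceptually similar to what the VRayDRbucket render element records.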

 

 

Could not connect to host {IP_ADDRESS}:20204


This is a general error, which indicates that the Workstation cannot reach a Render Slave at this IP address.

Possible Causes

  • V-Ray Spawner has not been started. Log on to the Render Slave machine and make sure that a V-Ray DR Spawner is running. If the Spawner runs as a service, make sure that “VRaySpawner for 3ds Max 20xx” (where “20xx” is your 3ds Max version) is started as a Windows service. This can be checked in the Windows Services app.
  • Network/Firewall permissions. Since network configuration is a very large topic, it is recommended to involve an IT specialist to debug your network setup in more depth. Check if the following permissions are set correctly:
    • Incoming and outgoing communication should be allowed for V-Ray Spawner on TCP ports 20204 (for CPU Rendering) and 20206 (for GPU Rendering) on the Workstation and Render Slave machines. This ensures proper data exchange between the Workstation and the Render Slaves.
    • Incoming and outgoing communication should be allowed for V-Ray Spawner on TCP port 30304 on the Render Slaves. This ensures proper communication between the Render Slaves and the License Server.
  • Autodesk Backburner is not installed on the Render Slave. To function properly, V-Ray DR requires Autodesk Backburner to be installed on each Render Slave. If 3ds Max repeatedly appears and then disappears in the Windows Taskbar when V-Ray Spawner is launched, Backburner is most likely not installed.
  • A Render Node license is not available. Each Render Slave needs a Render Node license in order to participate in Distributed Rendering. More information about configuring your license setup can be found here.
  • Another process is already using one or more of the communication ports. If the ports used by V-Ray are already taken by another application, the Spawner will not be able to receive and send information. You can stop the Spawner and check if another process is occupying the same port. For more information on tools that can be used to check network and port connectivity, please see the Tools for Testing Network Connectivity page.
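A quick way to test whether a port is reachable from the Workstation is to attempt a plain TCP connection. The sketch below uses only the Python standard library; the `check_slave` helper and its default port list are illustrative assumptions, and a successful connection only shows that something is listening on the port, not that it is a V-Ray Spawner.

```python
import socket

def can_connect(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False

def check_slave(host, ports=(20204, 20206)):
    """Probe the default CPU (20204) and GPU (20206) DR ports on one host."""
    return {port: can_connect(host, port) for port in ports}
```

Calling `check_slave` with the IP address from the error message returns a dictionary such as `{20204: False, 20206: False}` when neither port answers, which points to a Spawner that is not running or to a firewall rule.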


We recommend reading the Setup Distributed Rendering page for all necessary steps to properly configure your DR setup.

This issue is easy to debug with a very simple scene. Make a few basic primitives, add a light and try running a DR job to see if it succeeds or fails. You can re-run the test after checking on each item in the list of possible causes until the issue is resolved.

Make sure that the scene renders locally on the Render Slave machine. Log on to the machine, launch 3ds Max and render the scene. If the scene doesn't render locally either, this may indicate a problem with the 3ds Max installation, the V-Ray installation, the installed plugins, or something else on that machine.

 

 

“Not responding” error message


This message usually appears when:

  • Render Slave has not finished loading yet. After starting the Render Slave, it needs some time to fully load and get ready to receive render jobs. The “Netstat” command or the “DR check tool” can be used to check if the Spawner has finished loading and is listening for render jobs on the chosen port (20204 by default).
  • Render Slave is currently busy with another task.
  • Render Slave is stuck on a previous job. If that’s the case, the “Restart servers on render end” option can be enabled to force-restart the slave after the render job is finished. This option should be used carefully when submitting multiple jobs in a row, because a restart takes some time. If the second job is sent to the Render Slave machine while the Spawner is still restarting, the slave won’t be able to join the rendering until the Spawner has finished loading.
  • Render Slave has crashed and exited. This can happen if there’s a problem with starting 3ds Max, e.g. a corrupt or missing 3ds Max installation. Try starting 3ds Max locally to check if it starts correctly.
  • Backburner is not installed.
  • The username the slave is running under contains non-Latin characters, e.g. Cyrillic.
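The last cause in the list can be checked with a few lines of Python on the Render Slave. This is a rough sketch: it treats "non-Latin" as "non-ASCII", which is a stricter assumption than the text states (an accented Latin letter such as "é" is also flagged).

```python
import getpass

def is_ascii_name(name):
    """True if every character in name is plain ASCII."""
    return name.isascii()

def check_current_user():
    """Return the current login name and whether it is ASCII-only."""
    user = getpass.getuser()
    return user, is_ascii_name(user)
```

Note that `check_current_user` reports the account running the script. If the Spawner runs as a Windows service under a different account, that account name is the one to check.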



Multiple V-Ray Spawners running at the same time


This may happen if there are different versions of 3ds Max installed in the same pipeline. Currently, V-Ray does not have an out-of-the-box solution for avoiding mixed results from multiple V-Ray Spawner instances. Using such a setup may lead to unexpected results like bucket discrepancies.

If, for example, two V-Ray Spawners are running at the same time for 3ds Max 2019 and 2020, and a DR job is submitted from a 3ds Max 2020 Workstation machine, there is no way for V-Ray to determine which job should be handled by which Spawner. This is because both Spawners are using the same port number - 20204 by default.
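When several Spawners are installed on one machine, it can help to write down which port each one uses and check the list for collisions. The snippet below is a minimal sketch of that bookkeeping; the labels and the `find_port_conflicts` helper are hypothetical and not part of V-Ray.

```python
from collections import defaultdict

def find_port_conflicts(spawners):
    """spawners: iterable of (label, port) pairs.
    Return {port: [labels]} for every port used by more than one Spawner."""
    by_port = defaultdict(list)
    for label, port in spawners:
        by_port[port].append(label)
    return {port: labels for port, labels in by_port.items() if len(labels) > 1}

# The situation described above: two Spawners left on the default port.
spawners = [("3ds Max 2019", 20204), ("3ds Max 2020", 20204)]
conflicts = find_port_conflicts(spawners)
```

Here `conflicts` maps port 20204 to both Spawners, which is exactly the ambiguity the Workstation cannot resolve.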

To avoid such issues, each V-Ray Spawner needs to be configured to use a different port number, and the corresponding port must be selected when submitting a DR job from a Workstation. There are several ways to do that depending on how the V-Ray Spawner is launched.

  • If you want to use the V-Ray Spawner as a Windows Service, you can register the service from the command line and set up a port number. In the examples below, “20xx” needs to be replaced with the 3ds Max version used. Port 20209 is just an example; any other number can be used, preferably higher than 20209, as lower numbers are already used by V-Ray in other cases:

    Batch:
    sc create "VRaySpawner 20xx_20209" DisplayName= "VRaySpawner 20xx_20209" binPath= "C:\Program Files\Autodesk\3ds Max 20xx\vrayspawner20xx.exe -port=20209"

    PowerShell:
    New-Service -Name "VRaySpawner 20xx_20209" -DisplayName "VRaySpawner 20xx_20209" -BinaryPathName "C:\Program Files\Autodesk\3ds Max 20xx\vrayspawner20xx.exe -port=20209"

  • If the V-Ray Spawner is launched manually from the Start Menu, the port number can be changed from the Shortcut Properties by adding -port=20209 at the end of the Target value.

  • If the V-Ray Spawner is launched from a Command Line, you can change its port by using a -port=20209 argument:

    "C:\Program Files\Autodesk\3ds Max 2020\vrayspawner2020.exe" -port=20209
    

To submit the job to the correct V-Ray Spawner, the port settings need to be adjusted in the V-Ray Distributed Rendering settings on the Workstation machine.

 

 

Bucket discrepancies


There are a number of possible causes for differences in the render output from the Render Slaves. As a rule of thumb, the Workstation and Render Slave machines need to have identical setups, including 3rd-party plugins, access to shared locations, unified software versions and service packs, etc. Differences in machine configurations will most likely result in rendering the scene differently, even when Distributed Rendering is not used.

 

Here are some of the possible causes for bucket discrepancies:

  • Inaccessible assets. All shared assets need to be accessible by all Render Slaves.
    • A common problem with accessing assets stored on shared locations (with UNC paths or mapped drives) is caused by the V-Ray Spawner running as a Windows Service with Local System Account privileges. The Local System account is a predefined local account on Windows. It's the default account that the service uses, unless otherwise specified. The service has complete unrestricted access to local resources, but has no access to shared locations, as share permissions cannot be granted to the Local System Account. To allow the V-Ray Spawner to access network assets, the V-Ray Spawner service must be registered with a Windows Account with proper access permissions to the network shares.


      This can be done from Windows Services > V-Ray Spawner Properties > Log On tab.

    • Using UNC paths (e.g., \\server\project) to access Network Resources instead of Mapped Drives (Z:\project) is recommended, if possible.
    • Enabling the “Transfer Missing Assets” option in the Distributed Rendering settings is useful when an asset cannot be accessed by the Render Slaves.

 

  • Missing plugins or scripts. If a plugin or a script is used in the scene, it needs to be installed on the Render Slave machines, too. Otherwise, V-Ray will not be able to render the data generated by the plugin.
  • Different versions of 3ds Max, V-Ray and other 3rd-party plugins and scripts. It’s important to use the same versions and service packs for all software, including 3ds Max, V-Ray and all other plugins. Different software versions have different features. If a feature used in the scene is available only in a newer plugin version and not all Render Slaves are updated to this version, a discrepancy will likely occur.
  • Incorrect V-Ray Spawner version, e.g. a render job from a 3ds Max 2020 Workstation is submitted to a V-Ray for 3ds Max 2019 Spawner.
  • Different Windows regional settings. Make sure the Windows regional settings for "Decimal symbol" and "Digit grouping symbol" have the same configuration on all machines.
    You can find the options under Control Panel > Region > Formats > Additional settings.
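Most of the causes above come down to one machine being configured differently from the others. If you keep a simple inventory per machine, the odd one out can be found mechanically. The sketch below assumes such an inventory exists as a dictionary; the host names, setting names and values are made up for illustration, and gathering them from real machines is outside its scope.

```python
def find_mismatches(inventories):
    """inventories: {host: {setting: value}}.
    Return {setting: {host: value}} for every setting the hosts disagree on.
    A setting missing from a host is reported as None."""
    settings = set()
    for inventory in inventories.values():
        settings.update(inventory)
    mismatches = {}
    for setting in sorted(settings):
        values = {host: inv.get(setting) for host, inv in inventories.items()}
        if len(set(values.values())) > 1:
            mismatches[setting] = values
    return mismatches

# Hypothetical inventory of three machines.
inventories = {
    "workstation": {"3ds Max": "2020", "decimal_symbol": "."},
    "slave-01": {"3ds Max": "2020", "decimal_symbol": ","},
    "slave-02": {"3ds Max": "2019", "decimal_symbol": "."},
}
report = find_mismatches(inventories)
```

In this example `report` lists both settings: slave-02 runs a different 3ds Max version and slave-01 uses a different decimal symbol.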

 

The easiest way to troubleshoot this is to first find which Render Slave renders differently. A very useful render element called VRayDRbucket stores information for the Render Slave host names. Enabling it for Distributed Rendering jobs will allow tracking down the machine producing the problematic buckets.

Please note that the VRayDRbucket element only works with Bucket Image Sampler.

After identifying which machine renders differently, log on to that machine, start 3ds Max, load the same scene that produced errors or artifacts and render it locally. Very often, the problem will become immediately obvious after the scene is loaded or rendered.

 

Render Slaves join the rendering after some time


Sometimes the Render Slaves join the rendering after the Workstation machine has already completed a part of the image. This happens because the Render Slaves join the rendering only after the scene and assets are transferred and loaded, and pre-pass calculations (like the Light Cache, for example) are complete. Depending on scene complexity, asset size and Light Cache settings, this might take a while. Since the Workstation already has part of the information loaded into memory, it is always one step ahead of the Render Slaves.

There are several ways to make improvements:

  • Export heavy geometry objects to V-Ray Proxy (or .vrscene) files. Proxy geometry is not written to the scene file and this makes the scene size smaller, which saves render time and network transfer time.
  • Copy assets locally on each Render Slave. This can help tremendously, especially for complex scenes with lots of assets of large size since the network transfer will be minimal.

The Transfer Missing Assets option can be very useful, as it can re-use assets that have already been transferred and cached. In other words, each asset is transferred to the Render Slaves only once and is then cached there and re-used. Only new or updated assets are transferred on later jobs.

  • Network Speed has a significant impact on Distributed Rendering. The faster the network, the quicker V-Ray will transfer its data packages, which helps reduce render times.
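The "transfer once, then re-use" behaviour described for Transfer Missing Assets can be illustrated with a content-hash comparison. How V-Ray actually decides that an asset is new or updated is not described here, so the SHA-256 comparison below is purely an assumption, and the asset names are made up.

```python
import hashlib

def digest(data):
    """Content fingerprint of an asset's bytes."""
    return hashlib.sha256(data).hexdigest()

def assets_to_transfer(workstation_assets, slave_cache):
    """workstation_assets: {name: bytes}; slave_cache: {name: digest}.
    Return the sorted names that are missing from the cache or have changed."""
    return sorted(
        name
        for name, data in workstation_assets.items()
        if slave_cache.get(name) != digest(data)
    )

assets = {"wood.png": b"v1", "car.vrmesh": b"mesh"}
cache = {}

first = assets_to_transfer(assets, cache)       # nothing cached yet
cache.update({name: digest(assets[name]) for name in first})

assets["wood.png"] = b"v2"                      # one asset is edited
second = assets_to_transfer(assets, cache)      # only the edited one is sent
```

The first job sends both assets; the second job sends only the texture that changed.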



Low CPU utilization with Distributed Rendering


CPU utilization can be low in some circumstances:

  • Using the Progressive Image Sampler with a Ray Bundle Size value that is too small. The Ray Bundle Size determines the size of the job (or sub-task) which the Workstation sends to the Render Slaves. Lower values mean that V-Ray will send many smaller tasks, which may flood the network, and the Render Slaves will spend more time sending and receiving data packages than rendering. Higher values do exactly the opposite: the packages will be bigger, which minimizes network transfer and improves CPU utilization.
  • Using the Progressive Image Sampler with lots of Render Slaves. The Progressive Image Sampler has many advantages over the Bucket Sampler, but when it comes to Distributed Rendering, the Bucket Sampler provides superior scaling and is the recommended sampler type for DR.
  • Last bucket stuck syndrome. When rendering animations, V-Ray cannot start rendering the next frame before all the buckets are complete for the current one. This may slow down the whole process if there is an area of the image that requires a lot more sampling than the rest of it. As a result, one or more buckets can get “stuck” on a specific area while all other machines are idle, waiting for those few buckets to complete. Lowering the bucket size can help in this case. Avoid using extremely low values, because this can have the opposite effect.
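The "last bucket stuck" effect can be reproduced with a toy scheduling model. The sketch below is not how V-Ray schedules buckets; it assumes identical machines, made-up bucket costs in arbitrary time units, and no per-bucket overhead.

```python
import heapq

def simulate(bucket_costs, machines):
    """Greedy scheduling: each bucket goes to the machine that is free first.
    Returns (makespan, utilization), where utilization is busy time divided
    by the total machine time available until the frame finishes."""
    free_at = [0.0] * machines
    heapq.heapify(free_at)
    for cost in bucket_costs:
        start = heapq.heappop(free_at)
        heapq.heappush(free_at, start + cost)
    makespan = max(free_at)
    utilization = sum(bucket_costs) / (makespan * machines)
    return makespan, utilization

# 15 cheap buckets plus one expensive bucket that is handed out last.
coarse = [1.0] * 15 + [16.0]
# The same total work with the expensive area split into four smaller buckets.
fine = [1.0] * 15 + [4.0] * 4

coarse_time, coarse_util = simulate(coarse, machines=4)
fine_time, fine_util = simulate(fine, machines=4)
```

With four machines, the coarse layout finishes at 19 time units with roughly 41% utilization, while the finer layout finishes at 8 with about 97%. Because the model ignores per-bucket overhead, it cannot show the opposite effect of extremely small buckets mentioned above.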

 

 

Using machines with different hardware


Although V-Ray doesn’t have any hardware limitations, it’s important to note the following:

  • If a machine used in DR is a lot more powerful than the rest, this can in theory slow down the rendering
  • If there is a difference in the amount of RAM installed on the Render Slaves, it is theoretically possible that some machines with less RAM will fail to render very complex scenes

  • Differences in the Windows Virtual Memory (Pagefile) settings could also cause crashes on some of the Render Slaves when rendering complex scenes that require more RAM than is installed.



Optimal number of Render Slaves


It’s natural to assume that the more Render Slaves there are for rendering, the faster the rendering will be.

Although this is generally true, it may not always be the case. For example, if a scene is extremely large (in megabytes) and uses a lot of assets, but renders relatively quickly, it is technically possible that the first few machines to join the rendering will complete the whole image before the rest of the machines can even join.
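A back-of-the-envelope model makes this concrete. The sketch assumes that every machine renders at the same speed (one unit of work per second) and starts contributing at a known join time; both are simplifications, and the numbers are illustrative.

```python
def finish_time(work, join_times):
    """Time at which `work` units are done when machine i starts contributing
    one unit of work per second at join_times[i]."""
    joins = sorted(join_times)
    done = 0.0
    for active, start in enumerate(joins, start=1):
        # `active` machines are rendering from `start` until the next join.
        end = joins[active] if active < len(joins) else float("inf")
        capacity = (end - start) * active
        if done + capacity >= work:
            return start + (work - done) / active
        done += capacity
    raise ValueError("join_times must not be empty")

# Workstation starts at 0 s; three slaves need 90 s to receive and load the scene.
quick_scene = finish_time(60, [0, 90, 90, 90])
heavy_scene = finish_time(600, [0, 90, 90, 90])
```

For the 60-unit scene the Workstation finishes alone after 60 seconds, before any slave has joined. For the 600-unit scene the slaves join at 90 seconds and the render ends at 217.5 seconds instead of 600, so the extra machines pay off only when the render is long compared to the loading time.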

 

 

Other relevant information


  • Rendering Animations. Although it is possible to render Animations with Distributed Rendering, it’s recommended to use Network Rendering instead. Some calculations may be duplicated with distributed rendering in animations. For example, V-Ray needs to calculate the Light Cache separately for each machine, which is less efficient than calculating it only once.
  • Distributed Rendering does not require a 3ds Max license. You will be able to use it even if a 3ds Max license is not available.
  • Distributed Rendering depends on your Network Speed. The faster the network, the faster the rendering will be. WiFi networks are not recommended.
  • Maintenance. Distributed Rendering setups require some maintenance just like any other Network Rendering setup. Windows updates, for example, could disable the V-Ray Spawner service, power outages might put the machine in a boot state, etc.
    • Regular restarts are recommended. Sometimes Windows won’t work well if it hasn’t been restarted for months.
    • Regular clearing of the Windows Temp folder is recommended. V-Ray uses the Windows Temp folder for transferring assets. If the Temp folder is never cleaned, it can fill up the entire drive, which will affect the proper functioning of DR.
    • The Restart Servers On Render End option in the Distributed Rendering Settings can be very useful, since it automatically restarts the V-Ray Spawner after a job completes, which helps clear any leftover state.
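Before clearing the Temp folder, it can be useful to see how much space old files actually take. The sketch below only reports; it deletes nothing. The 30-day threshold is an arbitrary assumption, and only files at the top level of the folder are inspected.

```python
import os
import tempfile
import time

def stale_files(folder, max_age_days, now=None):
    """Return (path, size_bytes) for regular files directly inside `folder`
    that have not been modified for more than max_age_days."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    result = []
    for entry in os.scandir(folder):
        if entry.is_file(follow_symlinks=False):
            info = entry.stat(follow_symlinks=False)
            if info.st_mtime < cutoff:
                result.append((entry.path, info.st_size))
    return result

def report(folder=None, max_age_days=30):
    """Return (file_count, total_megabytes) of stale files in the temp folder."""
    folder = folder or tempfile.gettempdir()
    files = stale_files(folder, max_age_days)
    total_mb = sum(size for _, size in files) / (1024 * 1024)
    return len(files), total_mb
```

Running `report()` on a Render Slave gives a quick indication of whether a cleanup is due. Run it under the same account as the Spawner, since the temp folder location depends on the user.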