Darktide Performance Deep Dive & System Requirements
Hello!
I am Rikard Blomberg, Chief Technical Officer and co-founder of Fatshark. The first computer I did programming on (or even had access to) back in 1983 was an ABC80, on which anything resembling graphics was what we nowadays consider ASCII art. The development since has been mindblowing.
Today, we are releasing our extended system requirements for Warhammer 40,000: Darktide, which cover higher-end machines and those looking to enable NVIDIA-supported technologies such as RTX and DLSS 3. At the same time, we wanted to dive a bit deeper into the performance of Darktide and what to expect, and to provide guidance on how to get the optimal experience with your hardware.
Ever since starting to make first- or third-person games, going back to the days of Lead and Gold, performance in terms of end-user framerate and the game experience has always been on our mind. That said, I can’t name a game we released where we were fully satisfied with the performance. Why? Because as a game developer, you are very seldom one hundred percent happy with any specific aspect of a game since there is always a “tug-of-war” between them.
At Fatshark, we believe that the best games result from striking a balance such that no single aspect is given priority over all others. This does not mean that there is no room for improvement over time; rather, we must always be mindful of our priorities.
Games like Vermintide, Vermintide 2, and Darktide are all good examples of this approach. In no way are they perfect in terms of having a high and steady framerate, but that was the approach we chose rather than, for example, compromising too heavily on the number of enemies or the graphical quality of levels. We believe that most players can accept a short temporary loss of framerate to be able to get the overall gameplay we offer. We know this is a harder sell in a competitive game, but felt that this was a necessary trade-off for games in our category - cooperative play. But then - are we satisfied with the current level of performance in our games? No - we aren’t and will never be. It is, and will continue to be, an ongoing battle for as long as we have people on the project.
Given this context, one needs to understand that we put in an incredible amount of work to make our games as performant as possible, both before release and afterwards, when we can use actual player data and feedback to set our priorities.
Generally speaking, our games have faced similar challenges. We have a lot of stuff going on in the game world, many enemies, all having detailed rigs and attachments and an AI to guide them. We also have to rely heavily on physics simulation to handle primarily melee combat, where countless entities move and interact with each other.
With modern processors, there has always been the promise of parallelism using the many cores that have become commonplace in recent hardware. As any game developer would tell you, this is in no way a silver bullet. The thing with parallelism is that it requires things to be conducted in parallel instead of in a sequence, which means that they need to be independent of each other. Large parts of gameplay code - especially those involving the player and her direct interactions with the world - can be tough to parallelize because the rules governing those are typically highly interconnected as opposed to independent. In our games, the parallelized things are rather systems that one might consider more low-level, like updating animations, resource handling, dispatching of commands to the GPU, etc. But even with those things done, our games still tend to be heavy on the CPU side.
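To illustrate what independent work looks like, here is a minimal Python sketch. It is not engine code: the `update_animation` job and its made-up pose calculation are hypothetical stand-ins. The only point is that each job touches its own entity and nothing else, so the jobs can be handed to a pool of worker threads in any order. (Standard Python threads will not give a real speed-up for CPU-heavy work like this, so read it for the structure only.)

```python
from concurrent.futures import ThreadPoolExecutor


def update_animation(entity_id: int) -> tuple:
    """Hypothetical independent job: depends only on its own entity."""
    pose_angle = (entity_id * 7) % 360  # made-up stand-in for real animation work
    return entity_id, pose_angle


def update_all_animations(entity_ids, workers: int = 4) -> dict:
    """Run one job per entity on a thread pool and collect the results."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in input order, regardless of which
        # worker finished first.
        return dict(pool.map(update_animation, entity_ids))


if __name__ == "__main__":
    print(update_all_animations(range(4)))  # {0: 0, 1: 7, 2: 14, 3: 21}
```

Rules where one entity's outcome depends on another entity's result within the same frame do not fit this pattern without extra synchronization, which is exactly the difficulty with gameplay code.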
Let us take a step back here. What exactly do we mean by CPU-side, and GPU-side, and how does this relate to the framerate? Most of you, being kids of today, probably learned these things in kindergarten; I didn’t. Also, being a stiff old sog, I revert to my general principle that no question is too dumb to ask.
The framerate is the number of frames rendered to the target display per second. As a developer, though, it is more practical to talk about the frametime - which is the inverse of the framerate, i.e., the time it takes to render one full frame to the target display. We usually measure frame time in milliseconds. In other words, a framerate of 60 frames per second equals a frametime of roughly 16.7 milliseconds.
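As a quick illustration of that inverse relationship, here is a small Python sketch (the function names are made up for this example):

```python
def fps_to_ms(fps: float) -> float:
    """Frame time in milliseconds for a given framerate."""
    if fps <= 0:
        raise ValueError("framerate must be positive")
    return 1000.0 / fps


def ms_to_fps(frametime_ms: float) -> float:
    """Framerate in frames per second for a given frame time."""
    if frametime_ms <= 0:
        raise ValueError("frame time must be positive")
    return 1000.0 / frametime_ms


if __name__ == "__main__":
    for fps in (30, 60, 120):
        print(f"{fps} fps -> {fps_to_ms(fps):.1f} ms per frame")
    # 30 fps -> 33.3 ms per frame
    # 60 fps -> 16.7 ms per frame
    # 120 fps -> 8.3 ms per frame
```

Note that the relationship is not linear: going from 30 to 60 frames per second means shaving off about 16.7 milliseconds per frame, while going from 60 to 120 means finding only another 8.3.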
To achieve a frametime equal to or less than 16.7 milliseconds, everything the game does in one of its discrete updates (i.e. a frame) must be handled within that time. That means that the CPU in this time must handle all updates of logical entities, physics, collisions, AI, sending and receiving networking messages, constructing and dispatching instructions to the GPU, etc., etc. Similarly, the GPU must be able to handle all the instructions from the CPU, applying code to transform and place objects, tessellate surfaces, render shadows, apply lighting and postprocessing, etc. If one of the two parts (GPU or CPU) is slower than the other, it will be that one deciding the frametime/framerate.
To get the most out of any specific set of hardware, you would like the CPU and GPU to have similar frametimes so that neither part needs to wait for the other. Typically when optimizing, we look at the slower of the two and try to figure out what we can do to make it faster.
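A simplified way to think about this, sketched in Python: assume the CPU and GPU work fully overlaps, so the slower side alone sets the frame time. Real frames are messier than that, and the `tolerance_ms` threshold below is an arbitrary value chosen for the example.

```python
def effective_frametime_ms(cpu_ms: float, gpu_ms: float) -> float:
    """Simplified model: the slower side decides the frame time."""
    return max(cpu_ms, gpu_ms)


def bottleneck(cpu_ms: float, gpu_ms: float, tolerance_ms: float = 0.5) -> str:
    """Name the limiting side, or 'balanced' if the two are close."""
    if abs(cpu_ms - gpu_ms) <= tolerance_ms:
        return "balanced"
    return "CPU" if cpu_ms > gpu_ms else "GPU"


if __name__ == "__main__":
    cpu_ms, gpu_ms = 18.0, 11.0
    frame_ms = effective_frametime_ms(cpu_ms, gpu_ms)
    print(f"{bottleneck(cpu_ms, gpu_ms)} bound: "
          f"{frame_ms:.1f} ms, {1000.0 / frame_ms:.1f} fps")
    # CPU bound: 18.0 ms, 55.6 fps
```

In this example, making the GPU side faster would not change the framerate at all, since the CPU would still need 18 milliseconds per frame.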
As mentioned earlier, parallelization is a big thing on the CPU side. We try to identify as many tasks as possible that don’t have interdependence and let another thread on the CPU handle them, thus more or less removing that time from the total frametime. We also try to find things that maybe don’t need doing or should not be done all the time. Perhaps it is enough to do it every 10th frame instead of every single one. Then there is a plethora of more specific tricks, patterns to avoid, patterns to use, etc.
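The "every 10th frame" idea can be sketched as follows. This is a generic Python illustration, not how the engine schedules its work; `ThrottledTask` and its `offset` parameter are hypothetical names. The offset is there so that several throttled tasks do not all land on the same frame.

```python
class ThrottledTask:
    """Runs a callback only on every `interval`-th frame."""

    def __init__(self, callback, interval: int, offset: int = 0):
        if interval < 1:
            raise ValueError("interval must be at least 1")
        self.callback = callback
        self.interval = interval
        self.offset = offset % interval

    def tick(self, frame_index: int) -> bool:
        """Call once per frame; returns True if the callback ran."""
        if frame_index % self.interval == self.offset:
            self.callback(frame_index)
            return True
        return False


def run_frames(tasks, frame_count: int) -> None:
    """Tiny stand-in for a game loop: tick every task once per frame."""
    for frame_index in range(frame_count):
        for task in tasks:
            task.tick(frame_index)


if __name__ == "__main__":
    ran_on = []
    run_frames([ThrottledTask(ran_on.append, interval=10)], 30)
    print(ran_on)  # [0, 10, 20]
```

Spreading two such tasks with offsets 0 and 5 keeps their cost on different frames, which helps avoid a periodic spike in frame time.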
Reducing GPU frame times is usually easier for us than reducing CPU frame times. Specifically, we already expose a long list of render settings to the user that affect GPU times significantly.
Reducing resolution, turning off advanced visual effects and utilizing hardware specific scaling and anti-aliasing options are all things that the user can do themselves to reduce the GPU frame times and increase their framerate.
That said, we generally put a lot of effort into tweaking our content and rendering features to make sure we keep our frame times as low as possible on the GPU.
- Reduce the amount of draw calls we issue every frame.
  - Manually remove objects from levels
  - Improve culling in our levels to not draw things when we don’t see them
  - Create lower resolution mesh LODs with fewer materials on them
  - Combine multiple objects into larger single meshes.
  - Utilize instancing to combine draw calls
- Reduce time to render individual objects
  - Create lower resolution mesh LODs
  - Remove unnecessary features from shaders that may not be needed in certain circumstances
- Optimize and improve our advanced rendering features such as post effects, material/lighting models to make better use of the hardware.
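To show why instancing reduces the draw call count, here is a minimal Python sketch. It is not renderer code: `DrawItem` and the mesh and material names are hypothetical, and it assumes that objects sharing both mesh and material can be submitted as one instanced draw call.

```python
from collections import defaultdict
from typing import NamedTuple


class DrawItem(NamedTuple):
    mesh: str
    material: str
    position: tuple  # stand-in for a full world transform


def batch_for_instancing(items) -> dict:
    """Group items by (mesh, material); each group is one instanced draw."""
    batches = defaultdict(list)
    for item in items:
        batches[(item.mesh, item.material)].append(item.position)
    return dict(batches)


if __name__ == "__main__":
    scene = [DrawItem("crate", "metal", (i, 0, 0)) for i in range(100)]
    scene += [DrawItem("pillar", "stone", (i, 0, 5)) for i in range(3)]
    batches = batch_for_instancing(scene)
    print(f"{len(scene)} draw calls without instancing, {len(batches)} with")
    # 103 draw calls without instancing, 2 with
```

The same grouping explains why fewer materials per mesh helps: every extra material on an object splits it into another group.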
With every new game we make we have the ambition to push the capabilities of our technology and game engine a notch. This is not solely for the betterment of that single title but also for driving the overall level of our technology. A new feature that at the time seemed unnecessary or a strange priority might prove essential for future updates and games.
In terms of technology there were several areas that we wanted to improve or re-invent that were identified at the start of the Darktide development. Some of the major things were:
- We wanted to change and improve our network model (this is a large subject and hopefully will get a blog post of its own).
- Transition to a DX12 only rendering backend to utilize more of the new DX12 featureset.
- Improve the visual quality of our lighting as well as our lighting workflows.
- Newer, less code-focused way of authoring and handling content in the game such as weapons, cosmetics and other items.
- Build our levels from larger building-blocks that could be more effective across multiple levels.
- Improve our technical bundling strategy to optimize the size of game downloads and the disk size requirements.
To stay true to the world of Warhammer 40,000 one has to make the grim world of the far future dark. But how do you do dark, darkness just being the absence of light? Our answer to that question would probably be - you do it by making the few light sources there better, more living and eye-catching.
Thus, we needed to improve our ambient lighting model which in Vermintide 2 mostly consisted of very sparse baked probe information and global ambient light overrides. In essence we wanted to get global illumination into our scenes and make sure objects placed in our worlds felt like they contributed to the lighting in the scene accurately. With more accurate ambient lighting one also cannot forget about the specular lighting and reflections which play a big role in making our scenes feel coherent. It became obvious that we had to make big improvements here to make sure that we live up to everything that makes Warhammer 40,000 the grim dark world it should be.
Luckily there was a good candidate technology for solving this problem: ray tracing, and specifically RTXGI for global illumination (GI).
Ray tracing had just started to become a well-established component of rendering in games, especially with an explosion in hardware support. Ray tracing is also compelling from a rendering development standpoint because it provides a unified solution for complex corner cases that arise when doing more traditional rasterized rendering.
NVIDIA DLSS 3 AND RTX SUPPORT
Partnering up with NVIDIA we opted to support ray tracing in our renderer and ended up implementing both RTXGI and raytraced reflections to boot. This also lays the groundwork needed for us to continue experimenting with additional ray tracing features down the line which carries the promise of further improving things like shadows, transparency rendering and VFX (visual effects such as particles). We also decided to support other RTX features like DLSS and Reflex to further improve frame times and response times of the game.
Another added benefit of the RTXGI implementation we ended up going with is that we decided to replace our baked ambient light solution with baked RTXGI probe grids. This allows us to use RTX cards on our development machines to quickly bake GI that can be applied to our scenes, even for GPUs that do not have enough power to push advanced ray tracing features like this. You won’t get the added benefit of the GI being fully dynamic that you get if you have a powerful GPU in your machine, of course, but the static GI still retains the nice dark feeling in our scenes that would otherwise be very flat and boring.
Building very expensive features like ray tracing doesn’t make much sense without having performant super resolution features so we decided to also go with supporting DLSS and to throw in Reflex for good measure.
We actually started out implementing DLSS 2.1, but as the project got extended and we started slipping into DLSS 3 territory, we immediately jumped at the chance to integrate DLSS 3 into Darktide as well. With the new frame generation features, it is actually a very good fit for our games, as we very often find ourselves being CPU bound. Since we already had Reflex integrated to handle any concerns of increased latency, we were very happy to see DLSS 3 giving us huge improvements in frame rates across the board.
(*) Typical for most situations, but one might occasionally experience brief degradation in the most complex and intense scenes, primarily due to CPU-constraints.
OVERALL PERFORMANCE
Pushing the quality and quantity of things makes Darktide a demanding game, but we are nevertheless keen to support as wide a range of hardware as possible. To be able to do this we have to make many features, and the fidelity of those features, configurable. We acknowledge that understanding exactly what settings to choose to get the optimal experience might not be trivial for many players. Thus, we have tried to compile explanations, suggestions and tips below on how to think of and use the different settings in the game. We hope you find this helpful.
With Darktide being capable of over a hundred on-screen enemies, some settings tweaks may increase framerate specifically during combat. Reducing the maximum number of ragdolls (MAX RAGDOLLS) increases CPU performance (because fewer local physics objects are simulated), while reducing the number of weapon impact and blood decals (MAX WEAPON IMPACT DECALS, MAX BLOOD DECALS), as well as their lifetime (DECAL LIFETIME), increases GPU performance in similar situations.
A majority of the options mostly affect the GPU, so the ways of increasing framerate when constrained by the CPU tend to be more limited. For the CPU-constrained scenario there are but a few options to point at, namely:
- MAX RAGDOLLS (can have a huge impact depending on the number)
- SCATTER DENSITY
- LENS FLARES (only the highest option, “all lights”, has a CPU impact though)
- FIELD OF VIEW
SETTINGS MAINLY RELATED TO PERFORMANCE
DLSS / FSR
Image enhancement and upscaling techniques are highly recommended for increasing GPU performance. For GeForce RTX cards we’ve integrated NVIDIA DLSS, a technology framework that outputs high resolution frames using AI. We’ve also added support for two of AMD’s FidelityFX Super Resolution upscaling techniques that can be used by any GPU. Their different settings balance speed against quality, but note that the most performant presets are rather tailored for resolutions higher than 1080p. Only one of the three available techniques can be used at any one time.
Running the game at high resolutions without using one of these techniques will have a huge impact on performance.
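To give a feel for why upscaling helps, here is a small Python sketch of the arithmetic. The per-axis scale factors used in the example (1.5 and 2.0) are illustrative assumptions only, not values taken from the game or from any specific DLSS or FSR preset.

```python
def internal_resolution(output_width: int, output_height: int,
                        scale_per_axis: float) -> tuple:
    """Resolution actually rendered before upscaling to the output size."""
    if scale_per_axis < 1.0:
        raise ValueError("scale factor must be at least 1.0")
    return (round(output_width / scale_per_axis),
            round(output_height / scale_per_axis))


def shaded_pixel_fraction(scale_per_axis: float) -> float:
    """Fraction of output pixels that are rendered natively."""
    return 1.0 / (scale_per_axis ** 2)


if __name__ == "__main__":
    for scale in (1.5, 2.0):  # assumed example factors
        width, height = internal_resolution(3840, 2160, scale)
        share = shaded_pixel_fraction(scale)
        print(f"scale {scale}: renders {width}x{height}, "
              f"{share:.0%} of the 4K pixel count")
    # scale 1.5: renders 2560x1440, 44% of the 4K pixel count
    # scale 2.0: renders 1920x1080, 25% of the 4K pixel count
```

Because the saving grows with the square of the scale factor, the more aggressive modes start from very few pixels at low output resolutions, which is one reason they suit resolutions above 1080p better.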
NVIDIA DLSS Super Resolution
DLSS Super Resolution uses AI to output higher resolution frames without compromising image quality or responsiveness. This feature requires a GeForce RTX graphics card. The variants range from QUALITY (having the least visual impact) to ULTRA PERFORMANCE (giving the highest boost in performance), where the latter is best utilized when running the game in higher resolutions like 4K. There is also an AUTOMATIC option that tries to pick the best variant given the circumstances.
Available options in order of least visual impact and least performance gain to highest visual impact and most performance gain:
OFF ⇒ QUALITY ⇒ BALANCED ⇒ PERFORMANCE ⇒ ULTRA PERFORMANCE
AMD FidelityFX Super Resolution 1.0 (FSR 1.0)
AMD FidelityFX Super Resolution 1.0 is a cutting edge super-optimized spatial upscaling technology that produces impressive image quality at fast framerates for any GPU. The most performant settings reduce image quality and are recommended mostly for higher screen resolutions.
Available options in order of least visual impact and least performance gain to highest visual impact and most performance gain:
OFF ⇒ QUALITY ⇒ BALANCED ⇒ PERFORMANCE ⇒ ULTRA PERFORMANCE
AMD FidelityFX Super Resolution 2 (FSR 2)
AMD FidelityFX Super Resolution 2 is a cutting-edge temporal upscaling algorithm that produces high resolution frames from lower resolution inputs. The variants range from ULTRA QUALITY (having the least visual impact) to PERFORMANCE (giving the highest boost in performance), where the latter is best utilized when running the game in higher resolutions like 4K. FSR 2 generally works better across a wider variety of resolutions than its predecessor, but with the downside of being a bit more expensive, especially on older hardware.
Available options in order of least visual impact and least performance gain to highest visual impact and most performance gain:
OFF ⇒ ULTRA QUALITY ⇒ QUALITY ⇒ BALANCED ⇒ PERFORMANCE
NVIDIA REFLEX LOW LATENCY
Technology used to measure and improve the latency of the game. Having this enabled reduces system latency and increases PC responsiveness. It has a very limited effect on performance. The BOOST option overrides the power saving features in the GPU and can provide small additional increases in latency reduction.
Available options:
DISABLED ⇒ ENABLED ⇒ ENABLED + BOOST
Framerate Cap
Locks the framerate to a maximum value and might be used to achieve a more steady framerate. Only used in conjunction with NVIDIA Reflex Low Latency.
Available options:
30 ⇒ 60 ⇒ 120 ⇒ UNLIMITED (DISABLED)
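The general idea of a framerate cap can be sketched as a loop that waits out whatever is left of the target frame time. This is a generic, sleep-based Python illustration with hypothetical function names; it says nothing about how the game or NVIDIA Reflex actually paces frames.

```python
import time


def remaining_wait_s(frame_start_s: float, now_s: float, cap_fps) -> float:
    """Seconds left to wait so the frame lasts at least 1 / cap_fps."""
    if cap_fps is None:  # unlimited, cap disabled
        return 0.0
    target_s = 1.0 / cap_fps
    return max(0.0, target_s - (now_s - frame_start_s))


def run_capped(update, frame_count: int, cap_fps,
               clock=time.perf_counter, sleep=time.sleep) -> None:
    """Run `update` once per frame, sleeping if a frame finishes early."""
    for _ in range(frame_count):
        frame_start_s = clock()
        update()
        wait_s = remaining_wait_s(frame_start_s, clock(), cap_fps)
        if wait_s > 0.0:
            sleep(wait_s)


if __name__ == "__main__":
    start = time.perf_counter()
    run_capped(lambda: None, frame_count=30, cap_fps=60)
    print(f"30 frames took {time.perf_counter() - start:.2f} s")
```

A cap only ever adds waiting time, so it cannot raise the framerate; what it can do is trade peak framerate for more even frame pacing.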
ANTI-ALIASING
Anti-aliasing improves the appearance of jagged polygon edges, so they are smoothed out on the screen. TAA increases quality over FXAA but at a higher GPU usage.
Available options in order of increased load on the GPU, from least to most:
OFF ⇒ FXAA ⇒ TAA
SHARPEN
Used together with TAA to improve edge quality. Increases GPU usage slightly.
OFF ⇒ ON
FIELD OF VIEW
Controls the extent of the observable game world that is seen on the display at any given moment. The field of view shows two numbers: HFOV (Horizontal FOV) and VFOV (Vertical FOV). VFOV is the main setting and HFOV is calculated from the current VFOV and screen aspect ratio. Default VFOV is 65, which is equivalent to an HFOV of 91 on a 16:9 screen. When the screen grows wider the HFOV will adapt with the screen so the player does not need to change the FOV manually to fit their wide aspect ratio screens. Raising the field of view can have a significant negative impact on performance. Field of view has an impact on both GPU and CPU.
Can be set to a VFOV value between 45° (narrowest) and 85° (widest)
SETTINGS RELATED TO RAY TRACING
RAY TRACING PRESET
Enable DirectX Raytracing (DXR) for life-like reflections and global illumination.
Sets the individual ray tracing options according to the scheme below. Ray tracing can have a significant impact on GPU performance depending on the exact setting. It might also have a smaller effect on CPU performance.
RAY TRACED REFLECTIONS
Enable DirectX Raytracing (DXR) for life-like reflections. LOW preset combines DXR with SSR. The HIGH option has a significant impact on GPU performance and is only recommended on higher-end hardware.
Available options in order of increased load on the GPU, from least to most:
OFF ⇒ LOW ⇒ HIGH
RTX GLOBAL ILLUMINATION
Enable this option for life-like global illumination using ray tracing to give more accurate lighting within the game. Impacts mainly GPU performance.
Available options in order of increased load on the GPU, from least to most:
OFF ⇒ LOW ⇒ HIGH
GENERAL GRAPHICS SETTINGS AND PRESETS
GRAPHICS PRESET (GRAPHICS QUALITY)
Sets all the more detailed (advanced) settings according to three predefined schemes. This does not affect any settings related to field-of-view, ray tracing, performance etc. Using these presets is generally a good starting point for most users. For each of the three presets the corresponding settings can be found in the table below.
Changing any of the advanced settings after setting the GRAPHICS QUALITY preset will result in that being set to CUSTOM. Note that changing this setting from CUSTOM to any of the defined presets will result in all custom settings being reset.
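The preset behaviour described above can be modelled in a few lines of Python. This is only a sketch of the rules as stated here, not the game's settings code, and the values in `PRESETS` are hypothetical placeholders rather than the real preset contents.

```python
# Hypothetical placeholder values, NOT the game's actual preset table.
PRESETS = {
    "LOW": {"BLOOM": "OFF", "MOTION BLUR": "OFF", "LIGHT QUALITY": "LOW"},
    "MEDIUM": {"BLOOM": "ON", "MOTION BLUR": "OFF", "LIGHT QUALITY": "MEDIUM"},
    "HIGH": {"BLOOM": "ON", "MOTION BLUR": "ON", "LIGHT QUALITY": "HIGH"},
}


class GraphicsSettings:
    def __init__(self, preset: str = "MEDIUM"):
        self.apply_preset(preset)

    def apply_preset(self, preset: str) -> None:
        """Selecting a preset overwrites every advanced setting."""
        self.values = dict(PRESETS[preset])  # copy, so the preset stays intact
        self.preset = preset

    def set_advanced(self, name: str, value: str) -> None:
        """Changing any advanced setting switches the preset to CUSTOM."""
        if name not in self.values:
            raise KeyError(name)
        self.values[name] = value
        self.preset = "CUSTOM"


if __name__ == "__main__":
    settings = GraphicsSettings("HIGH")
    settings.set_advanced("MOTION BLUR", "OFF")
    print(settings.preset)  # CUSTOM
    settings.apply_preset("HIGH")
    print(settings.values["MOTION BLUR"])  # ON, the custom change is gone
```

The practical takeaway is to pick the preset first and tweak afterwards, since going back to a preset discards the tweaks.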
Available preset options in order of increased load on the GPU and CPU, from least to most:
LOW ⇒ MEDIUM ⇒ HIGH
ADVANCED SETTINGS
AMBIENT OCCLUSION QUALITY
Ambient Occlusion is a model for calculating indirect light in a scene. We use Combined Adaptive Compute Ambient Occlusion (CACAO) as our ambient occlusion model. Increasing AO quality creates a higher load on the GPU.
Available options in order of increased load on the GPU, from least to most:
OFF ⇒ LOW ⇒ MEDIUM ⇒ HIGH
LIGHT QUALITY
Affects the quality of light and shadows. Higher settings increase memory consumption and load on the GPU.
Available options in order of increased load on the GPU and memory consumption, from least to most:
LOW ⇒ MEDIUM ⇒ HIGH ⇒ EXTREME
VOLUMETRIC FOG QUALITY
Sets the visual quality of fog in the game. Increasing fog quality means a higher load on the GPU.
Available options in order of increased load on the GPU, from least to most:
LOW ⇒ MEDIUM ⇒ HIGH ⇒ EXTREME
DEPTH OF FIELD
Adds a camera focus effect with different quality settings. Increases GPU usage.
Available options in order of increased load on the GPU, from least to most:
OFF ⇒ MEDIUM ⇒ HIGH
BLOOM
Bright in-game glow effects through post-processing. Increases GPU usage slightly.
OFF ⇒ ON
SKIN SUBSURFACE SCATTERING
Realistic skin effect on characters through post-processing. Increases GPU usage.
OFF ⇒ ON
MOTION BLUR
Simulates a blur effect for movement through post-processing. Increases GPU usage.
OFF ⇒ ON
SCREEN SPACE REFLECTIONS
Generates in-game reflections through post-processing. Increases GPU usage.
Available options in order of increased load on the GPU, from least to most:
OFF ⇒ MEDIUM ⇒ HIGH
GLOBAL ILLUMINATION
Baked simulation of indirect bounce lighting. This is replaced by ray-traced global illumination when running with ray tracing and the setting does not have any effect in such scenarios.
LOW ⇒ HIGH
LENS QUALITY
Enables post-processing effects for both Lens Quality Colour Fringe and Distortion. Increases GPU usage.
OFF ⇒ ON
LENS FLARES
Adds lens flare effects for sunlight or all light sources. Increases GPU usage.
ALL LIGHTS has a small effect also on the CPU.
Available options in order of increased load on the GPU, from least to most:
OFF ⇒ SUNLIGHT ONLY ⇒ ALL LIGHTS
SCATTER DENSITY
Scales the in-game density of detailing elements such as debris. More scatter increases CPU usage.
Can be set to a value between 0.0 and 1.0
MAX RAGDOLLS
Decides how many enemy ragdolls are active. More ragdolls significantly increase CPU usage.
Can be set to a value between 3 and 50
MAX WEAPON IMPACT DECALS
Decides the amount of concurrent weapon impact decals. Increases GPU usage.
Can be set to a value between 5 and 100
MAX BLOOD DECALS
Decides the amount of concurrent blood decals. Increases GPU usage.
Can be set to a value between 5 and 100
DECAL LIFETIME
Decides the lifetime (in seconds) of decals drawn in the game. Increases GPU usage.
Can be set to a value between 10 and 60
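For anyone scripting or comparing configurations, the numeric ranges listed above can be collected in one place. The ranges in this Python sketch come from this post; the `clamp_setting` helper is a hypothetical convenience, and whether the game itself clamps out-of-range values this way is an assumption.

```python
# Allowed ranges as listed in this post: (minimum, maximum).
SETTING_RANGES = {
    "SCATTER DENSITY": (0.0, 1.0),
    "MAX RAGDOLLS": (3, 50),
    "MAX WEAPON IMPACT DECALS": (5, 100),
    "MAX BLOOD DECALS": (5, 100),
    "DECAL LIFETIME": (10, 60),  # seconds
}


def clamp_setting(name: str, value):
    """Return `value` limited to the documented range for `name`."""
    low, high = SETTING_RANGES[name]
    return max(low, min(high, value))


if __name__ == "__main__":
    print(clamp_setting("MAX RAGDOLLS", 80))     # 50
    print(clamp_setting("DECAL LIFETIME", 5))    # 10
    print(clamp_setting("SCATTER DENSITY", 0.5))  # 0.5
```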
IF HELP IS NEEDED
The great variance in consumer hardware and software makes it hard to predict the GPU and CPU performance for all player configurations. If you encounter performance issues we encourage reporting those through our forums so that they can be forwarded to the dev team. When reporting, please take some time to describe the issue, especially if it is gameplay situational or is dependent on a settings change, and include information on relevant hardware and setup. Taking a bit of extra time to provide information and context helps us greatly, and will hopefully help us figure out your problem much quicker.