Releases: netborg-afps/dxvk-low-latency
v2.7.1-3 low-latency
The Gpu Progress feature (#5) is now working as intended.
An inaccuracy that caused individual frames to have excessive latency has been fixed, so input should now feel noticeably more responsive. Pacing is also much smoother, particularly for lightweight games and VRR.
Furthermore, the VK_EXT_calibrated_timestamps extension is now used as a fallback when the device doesn't support VK_KHR_calibrated_timestamps. This makes the recently introduced feature (#9) available on older drivers.
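The fallback order described in this and the previous release can be sketched roughly as follows. This is a minimal illustration in Python, not the actual dxvk code: the function name and the returned label for the CPU-side fallback are hypothetical, and only the two extension names come from the release notes.

```python
# Extension names are taken from the release notes; everything else is hypothetical.
KHR = "VK_KHR_calibrated_timestamps"
EXT = "VK_EXT_calibrated_timestamps"
CPU_FALLBACK = "cpu-timestamp-after-vkWaitSemaphores"  # illustrative label only


def pick_timestamp_source(device_extensions):
    """Return the timestamp source to use, preferring KHR over EXT."""
    supported = set(device_extensions)
    if KHR in supported:
        return KHR
    if EXT in supported:
        # Older drivers may only expose the EXT variant.
        return EXT
    # Neither extension is available: use timestamps taken after vkWaitSemaphores.
    return CPU_FALLBACK
```

For example, a device that only reports VK_EXT_calibrated_timestamps would end up on the EXT path, while a device reporting neither extension would keep the older measurement method.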
v2.7.1-2 low-latency
Further improves smoothness by using the VK_KHR_calibrated_timestamps extension (#9) to measure, directly on the GPU itself, when submits have completed processing. The more precise GPU execution time measurements lead to better predictions and improve frame time and latency stability. Most recent drivers and devices should support VK_KHR_calibrated_timestamps. If a particular driver doesn't support it, the measurements fall back to the previous method of taking timestamps after vkWaitSemaphores.
The VRR mode no longer requires V-Sync to be active (#8) and now even pairs well with VK_PRESENT_MODE_IMMEDIATE_KHR. It works by implicitly deriving estimated V-Blanks, simulating how the VRR display would sync to the render-finished timestamps. This greatly improves compatibility with Wayland and other display servers. The change was motivated by the observation that the timestamps obtained from VK_KHR_present_wait were not precise enough, even on x11, and were completely broken on Xwayland, where VK_KHR_present_wait had no effect. The VRR mode should now work on every VRR setup.
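To give a feel for what deriving estimated V-Blanks from render-finished timestamps could look like, here is a small Python sketch. It rests on a simplifying assumption that is not stated in the release notes: a VRR display starts scanning out a frame as soon as it is finished, but never sooner than one refresh interval after the previous scan-out. The function name and parameters are hypothetical.

```python
def estimate_vblanks(render_finished_us, refresh_hz):
    """Estimate V-Blank times (in microseconds) from render-finished timestamps.

    Assumed model: each frame is shown when it is finished, but no earlier
    than one refresh interval after the previously estimated V-Blank.
    """
    interval_us = 1_000_000 / refresh_hz
    vblanks = []
    for finished in render_finished_us:
        if vblanks:
            earliest = vblanks[-1] + interval_us
        else:
            earliest = finished  # no history yet: show the first frame immediately
        vblanks.append(max(finished, earliest))
    return vblanks


# On a 100 Hz display the minimum spacing is 10000 us, so a frame finished
# at 4000 us has to wait until 10000 us, while a late frame is shown at once.
print(estimate_vblanks([0, 4_000, 25_000], refresh_hz=100))
```

Under this model no V-Sync or presentation feedback is needed, which is consistent with the mode working under VK_PRESENT_MODE_IMMEDIATE_KHR.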
V-Sync-based VRR synchronization to explicit V-Blank measurements may come back in a later release together with VK_EXT_present_timing, as may the V-Sync buffer statistics, which were removed for the reasons stated above.
v2.7.1 low-latency
GPU submissions are now tracked in real time (#5), so the decision of when to start a frame can be deferred to the latest possible point in time. This significantly improves smoothness and responsiveness, especially in the uncapped case.
Added the dxvk.hud = "latencydetails" option (#4), which provides insight into GPU buffer and V-Sync buffer statistics. It is helpful for fine-tuning the dxvk.lowLatencyOffset variable to minimize or even completely eliminate GPU buffering without affecting fps throughput too much, and for fine-tuning the VRR refresh rate to minimize V-Sync buffering in the VRR mode. The display is also useful for checking how external programs or specific system configurations affect input lag.
Removed the presentlatency hud option.
v2.7 low-latency
Added a new VRR frame pacing mode, which can be enabled with dxvk.framePace = "low-latency-vrr-360", replacing 360 with the refresh rate of your monitor. It automatically enables v-sync and adjusts the pacing so that frames finish after your monitor becomes ready to display the new frame, which prevents running into v-sync buffering.
This mode runs on x11 flip and Wine Wayland, but not on Xwayland, since v-blank information isn't available there. This is the first release where we analyzed Wayland performance, and at least on Nvidia, we can say for sure that x11 flip is still strongly recommended for low-latency gameplay. Hopefully we will get a true flip mode on Wayland in the future, as what we are currently getting from Wayland doesn't even come close to the crisp input feel that x11 flip provides.
Furthermore, latency has been improved by potentially sleeping a second time before starting a frame. This happens when new information becomes available after the first sleep and reveals that processing the previous frame has been slower than expected.
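The second sleep can be pictured as re-checking the prediction after waking up. The following Python sketch is only an illustration of that idea under assumed names: `predict_start`, `now` and `sleep` are hypothetical callables, and the real implementation may structure this differently.

```python
import time


def wait_for_frame_start(predict_start, now=time.monotonic, sleep=time.sleep,
                         max_sleeps=2):
    """Sleep until the predicted frame start, re-checking the prediction once.

    predict_start() returns the currently predicted start time (same clock
    as now()). If the prediction moved later while we slept, for example
    because the previous frame took longer than expected, sleep again.
    Returns the number of sleeps performed.
    """
    sleeps = 0
    while sleeps < max_sleeps:
        remaining = predict_start() - now()
        if remaining <= 0:
            break  # the predicted start time has been reached
        sleep(remaining)
        sleeps += 1
    return sleeps
```

Injecting the clock and sleep functions keeps the sketch testable without real waiting; with the defaults it uses the monotonic system clock.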
Present latency no longer impacts the hud's renderlatency, but can be shown separately via dxvk.hud = "presentlatency".
Removed DXVK_LOW_LATENCY_OFFSET and DXVK_LOW_LATENCY_ALLOW_CPU_FRAMES_OVERLAP environment variables as the respective config variables can be set up via DXVK_CONFIG as well.
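As a rough illustration of moving from the removed environment variables to DXVK_CONFIG, the Python helper below assembles a DXVK_CONFIG value. It assumes the syntax documented by upstream dxvk, where entries use the dxvk.conf form `key = value` and are separated by `;`. The helper name is hypothetical, and the values shown are just examples.

```python
def build_dxvk_config(options):
    """Join config entries into a DXVK_CONFIG value.

    Assumes the upstream dxvk syntax: dxvk.conf-style 'key = value'
    entries separated by ';'.
    """
    return "; ".join(f"{key} = {value}" for key, value in options.items())


# The two config variables that replace the removed environment variables.
config = build_dxvk_config({
    "dxvk.lowLatencyOffset": 0,
    "dxvk.lowLatencyAllowCpuFramesOverlap": "False",
})
print(config)
```

The resulting string would then be exported as DXVK_CONFIG before launching the game.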
v2.6.1 low-latency
Merged with dxvk 2.6.1
v2.6 low-latency
Merged with dxvk 2.6. Presentation delay is now included in render latency measurement when using immediate present mode.
Low-latency frame pacing 2.5.3.v3
This update contains frame pacing improvements, robustness refinements and generally increases the level of fine-tuning.
Latency has been further reduced, dramatically in some games, by speeding up the dxvk-internal flush heuristic, which was presumably tuned for bandwidth/fps, so that GPU submissions are delivered sooner. Synchronization now also incorporates the CPU timeline better, which increases frame time stability and contributes to better responsiveness.
GPU scheduling rework
Low-latency frame pacing is about predicting how the next frame will line up with the current frame being processed on the GPU. This prediction has been reworked to be more accurate and robust for a wide range of games.
In addition, it has been rebased on the current master, which recently added signals for when frames start and finish being processed on the CPU. These could help improve the pacing further in the future. For now, responsiveness in the CPU-limited case can be improved by setting dxvk.lowLatencyAllowCpuFramesOverlap = False in dxvk.conf or DXVK_LOW_LATENCY_ALLOW_CPU_FRAMES_OVERLAP=0 in the environment. By default, overlap is allowed for fps reasons.
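As a rough illustration of what lining up the next frame with the GPU's current work means, consider the Python sketch below. It is a deliberately simplified model and not the actual prediction used by this release: all names are hypothetical, and it assumes the only inputs are a predicted time at which the GPU becomes idle and a predicted CPU lead time between starting a frame and its first GPU submission.

```python
def frame_start_delay_us(gpu_busy_until_us, now_us, cpu_lead_us):
    """Microseconds to wait before starting the next frame.

    Simplified model: start the frame so that its first GPU submission
    arrives just as the GPU finishes the current frame. If the CPU is the
    bottleneck, the result is 0 and the frame starts immediately.
    """
    return max(0, gpu_busy_until_us - cpu_lead_us - now_us)


# GPU-limited: GPU is busy until t=10000 us and the CPU needs 3000 us of lead.
print(frame_start_delay_us(10_000, now_us=2_000, cpu_lead_us=3_000))
# CPU-limited: the GPU would already be idle, so there is nothing to wait for.
print(frame_start_delay_us(4_000, now_us=2_000, cpu_lead_us=3_000))
```

Starting later in the GPU-limited case is what shortens the time between sampling input and displaying the frame, without leaving the GPU idle if the prediction is accurate.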
Low-latency frame pacing for dxvk 2.5.3
This low-latency mode aims to reduce latency with minimal impact on fps. It is effective when operating in the GPU limit and efficient in the CPU limit as well.
Greatly reduces input lag variations when switching between CPU- and GPU-limit, and compared to the max-frame-latency approach, it has a much more stable input lag when GPU running times change dramatically, which can happen for example when rotating within a scene.
The current implementation tends to generate fluctuations that alternate frame by frame, depending on the CPU-time variations of the game and dxvk. This might be visible as a loss in smoothness, which is an area where this implementation can be further improved.
An interesting observation during playtesting was that not only the input lag was affected: the generated video also progressed more cleanly in time with regard to the wow and flutter effect.
This version renders with up to 67% latency reduction in the GPU limit compared to max-frame-latency-3 and with up to 35% latency reduction compared to max-frame-latency-1.
Optimized for VRR and VK_PRESENT_MODE_IMMEDIATE_KHR; proper optimization for v-sync may follow later. It also comes with its own fps limiter, which is typically used to prevent the game's fps from exceeding the monitor's refresh rate.
Usage
Add renderlatency to the DXVK_HUD variable to watch the render latency in real-time.
DXVK_FRAME_PACE has the options max-frame-latency, low-latency and min-latency. This release defaults to low-latency for demonstration purposes.
DXVK_LOW_LATENCY_OFFSET allows for fine-tuning the low-latency mode. Values are in microseconds. Positive values might improve responsiveness even further, although only very slightly, which may be relevant for edge cases. Negative values might improve fps. Defaults to 0.
These environment variables are also available in dxvk.conf as dxvk.framePace and dxvk.lowLatencyOffset.
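The variables above could be put together for a launcher script along these lines. This is a minimal Python sketch with a hypothetical helper name. It applies to this release only, since DXVK_LOW_LATENCY_OFFSET was removed in v2.7 in favor of DXVK_CONFIG.

```python
import os

FRAME_PACE_OPTIONS = ("max-frame-latency", "low-latency", "min-latency")


def dxvk_env(frame_pace="low-latency", offset_us=0, base=None):
    """Return a copy of the environment with the usage variables set."""
    if frame_pace not in FRAME_PACE_OPTIONS:
        raise ValueError(f"unknown frame pace mode: {frame_pace!r}")
    env = dict(os.environ if base is None else base)
    env["DXVK_HUD"] = "renderlatency"               # show render latency in the HUD
    env["DXVK_FRAME_PACE"] = frame_pace
    env["DXVK_LOW_LATENCY_OFFSET"] = str(offset_us)  # microseconds
    return env


# Hypothetical launch, assuming Wine and a game executable are present:
#   import subprocess
#   subprocess.run(["wine", "game.exe"], env=dxvk_env(offset_us=100))
```

Passing the environment explicitly keeps the settings scoped to the launched game instead of changing the calling shell.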
Benchmark
Below are latency measurements of what is to be expected when running games in the GPU-limit:
Test release
2.5.3-low-latency-framepacing-rc [d3d8/9] Proper (and age accurate) handling of d3d9.shaderModel = 0

