Releases: HansKristian-Work/vkd3d-proton
Version 3.0b (bugfix)
Another tiny bugfix release:
- Fix silly regression in synchronization when VK_KHR_unified_image_layouts is not supported.
- Update shader workaround hash for Wuthering Waves
Version 3.0a (bugfix)
Tiny bugfix release that addresses a silly performance regression in the new unified image layout path.
Version 3.0
A new major release, yay!
A few milestones have been reached over the last year, warranting a new major bump.
It's been quite a while since the last release due to new things coming up constantly.
These tags are mostly arbitrary anyway, and tend to be done when islands of calm and stability emerge.
Major items
DXBC shader backend rewrite
@doitsujin rewrote the entire DXBC backend, replacing our legacy vkd3d-shader path.
DXVK and vkd3d-proton now share the same DXBC frontend which gives us clean,
"readable" (as readable as DXBC can be) and lean IR to work with.
dxil-spirv standalone project now supports DXBC as well as a result.
Lots of games which used to be completely broken before due to bugs and missing features
in the legacy vkd3d-shader backend are now fixed. E.g. Red Dead Redemption 2 runs just fine now in D3D12 mode.
Some recently released DXBC based games also only work on the new path.
The number of regressions found over the last few months in DXBC games has been very small,
but it's possible there are still bugs in this area.
However, given that DXVK uses it now as well, it's been battle tested quite extensively already.
FSR4 support
We added support for AGS WMMA intrinsics through VK_KHR_cooperative_matrix and VK_KHR_shader_float8,
which is enough to support FSR4.
Note that these shaders are tightly coded for AMD GPUs with some implementation defined behavior
(particularly around matrix layouts), and they will not necessarily work on other GPU vendors.
There is also a quite hacky emulation path of this which relies on int8 and float16 cooperative matrix support,
which can run on older GPUs at significant performance cost (and some cost to theoretical correctness).
Note that the default "official" build of vkd3d-proton only exposes this feature when the native
VK_KHR_shader_float8 is properly supported, i.e. RDNA4+ only.
The emulation path is available when building from source with the appropriate build flags.
The decision to not include this emulation path by default is over my pay grade.
The aim is to be able to ship FSR4 in a more proper way in Proton.
Features
We've more or less caught up on the things we can feasibly implement,
so there isn't much exciting stuff happening on the feature front.
- Implemented experimental support for D3D12 work graphs. No real-world content ships this yet.
This implementation is far from complete,
but it works on "any" GPU since we emulate the feature with normal compute shaders.
Funnily enough, the performance of this emulation can massively outperform native driver implementations of the feature
in many scenarios we've tested (at the cost of some extra VRAM usage).
See docs/ for more details on implementation and some performance numbers.
- Expose AdvancedTextureOpsSupported by default from SM 6.7 if VK_KHR_maintenance8 is supported.
- Expose the recently added sparse TIER_4.
- Bump exposed D3D12SDKVersion to latest 618.
- Experimentally expose support for opacity micromaps.
There are some details which aren't quite compatible with the D3D12 API, but some basic demo content is working fine.
- Add support for AMD_anti_lag when exposed. The current implementation does not take frame-gen into account.
- Implement support for tight alignment from recent AgilitySDK.
- Add support for shared resource path on upstream Wine.
Performance
- Overhaul the texture copy batching situation.
The new batching logic should be able to improve performance in many more cases than before.
- Implemented support for VK_KHR_unified_image_layouts.
Image copy batching in particular can take advantage of this to avoid a lot of unnecessary barriers.
- Removed manual clear workaround on newer (6.15.9+) kernels on AMD, where an old kernel regression was finally fixed.
Kernels older than 6.10 are also not affected by this workaround.
- Use push descriptor path on Qualcomm GPUs over BDA for speed.
- Improve handling of GDeflate when decompression extension is not available.
We now ship our own fallback shader in GLSL instead of the more awkward HLSL shader that dstorage ships.
- Bump DGC scratch size on NVIDIA. Should avoid some massive perf drops in Halo Infinite on NVIDIA.
- Add performance optimization for The Last of Us Part 1 to prefer 2D tiling on 3D images.
Requires an update to Mesa as well to get the proper effect.
- Handle depth/stencil <-> color image copies better when VK_KHR_maintenance8 is supported.
- Make use of VK_EXT_zero_initialize_device_memory to avoid manual clears on allocation.
Fixes
- Emit render pass barriers as expected on tiled GPUs. Fixes misc rendering bugs reported on e.g. Turnip.
- For performance reasons, we deliberately skirt the spec a bit on desktop GPUs.
- Fixed a bunch of minor correctness problems exposed by new Vulkan-ValidationLayers.
- Adjust how PointSamplingAddressesNeverRoundUp is reported to match recent driver behaviors.
- Fix overflow bugs in massive (> 4GiB) sparse resource handling.
- Fix reporting of some esoteric format properties to better match native drivers.
- Fix handling of NULL acceleration structure descriptors.
- Fix some texturing bugs in Helldivers II on NVIDIA.
- Fix some bugs with memory type handling on very old NVIDIA GPUs.
- Fix bug when pixel shader includes root signature.
- Make ClearUAV barrier insertion the default now.
Too many games screw this up, and D3D12 drivers seem to do it by default.
- Fix shared fences when initial value is not 0. Fixes some Star Citizen issues.
- Fix rare deadlock scenario in Ninja Gaiden 4.
Fixes some long-standing issues with how we deal with fence rewinds.
- Fix some long-standing issues with how we deal with placed MSAA resources and alignment.
- Make sure we don't clear memory of imported resources.
This doesn't fix any known games, but you never know :V
- Improve correctness for many odd GS/HS/DS corner cases with primitive types and API validation.
- Fixes crashes when index buffer SizeInBytes = 0, but VA was invalid.
Seen in some Saber Interactive games.
- Fixes some potential deadlocks in VR interop APIs when multiple threads attempt to acquire Vulkan queue.
- Fixes 16-bit aligned structured buffer strides. Not observed in any real content, but you never know!
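The overflow fix for sparse resources larger than 4 GiB concerns a familiar class of bug: byte offsets computed in 32-bit arithmetic wrap once they pass 2^32. The sketch below is illustrative only and is not taken from the vkd3d-proton source. It assumes the 64 KiB tile size that D3D12 uses for tiled resources, and the function names are hypothetical.

```python
TILE_SIZE = 64 * 1024      # D3D12 tiled resources use 64 KiB tiles
UINT32_MASK = 0xFFFFFFFF
UINT64_MASK = 0xFFFFFFFFFFFFFFFF


def tile_offset_u32(tile_index):
    """Byte offset as a 32-bit multiply would produce it (wraps past 4 GiB)."""
    return (tile_index * TILE_SIZE) & UINT32_MASK


def tile_offset_u64(tile_index):
    """Byte offset computed with 64-bit arithmetic."""
    return (tile_index * TILE_SIZE) & UINT64_MASK


# 65536 tiles of 64 KiB is exactly 4 GiB, the first offset that no longer fits in 32 bits.
first_bad_tile = (1 << 32) // TILE_SIZE
for index in (first_bad_tile - 1, first_bad_tile, first_bad_tile + 1):
    print(index, tile_offset_u32(index), tile_offset_u64(index))
```

The two functions agree for every tile below the 4 GiB boundary and diverge from there on, which is why this kind of bug only shows up with very large resources.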
Workarounds
- Add FF VII rebirth sync bugs workarounds. Fixes some rare GPU hangs.
- Add misc AMD workarounds for Monster Hunter Wilds caused by bugged hardware around sparse SMEM.
- A proper hardware workaround in RADV is still pending.
- Workaround some Starfield bugs around NonUniformResourceIndex use.
- Add performance workarounds for extremely large tessellation factors used in misc new Koei Tecmo games.
- Add Wreckfest 2 workarounds for illegal texture placement aliasing. Fixes some broken textures.
- Add barrier in Satisfactory that game missed. Fixes some corrupt rendering especially on AMD.
- Ignore NOT_CLEARED flags on allocation in all games now. Native drivers seem to always clear regardless of the flag,
and e.g. Street Fighter 6 relies on NOT_CLEARED memory to actually be cleared :(
- Workaround some issues with RGB9E5 and alpha write masks observed in Ninja Gaiden 4.
- Add missing barrier in Death Stranding (the older build, not Director's Cut).
- Add missing barrier in Wuthering Waves.
- Workaround bugged uninitialized loop variable in Dune MMO.
- Disable UAV compression in Spider-Man Remastered. Fixes some weird RT issues on RDNA2.
- Add Root CBV robustness workaround for Gray Zone Warfare.
- Disables color compression in Rise of the Tomb Raider. Fixes some glitches due to game bug on AMD.
- Workaround some bugs in Port Royal benchmark.
- Workaround Mafia: Definitive Edition hanging GPU when using FSR on startup due to use-after-free.
- The workaround applies to all uses of FSR. It plausibly works around a hang in MGS: Delta as well, but it is not confirmed that this bug was the cause.
- Workaround Control RT path occasionally observing NaNs due to bad normalize() patterns.
- Workaround Final Fantasy Tactics Ivalice Chronicles illegally using dynamically indexed root constants.
Misc
- Added a lot more debug instrumentation as usual.
- Not user facing, so omitting details.
- Make it a bit easier to use vkd3d-proton in Linux-native projects.
- Remove DXVK_FRAME_RATE to align with DXVK's removal. Only VKD3D_FRAME_RATE remains (at least for now).
Version 2.14.1
This is a bug-fix release which resolves some regressions introduced in 2.14.
- Fix a crash on start-up which affected GPUs without sparse support. E.g. Intel iGPU or Turnip.
Crash could happen even if that GPU was the secondary GPU on the system.
- Fix a memory allocation issue affecting NVK.
- Fix a CPU performance regression issue affecting Horizon Zero Dawn Remastered on NVIDIA GPUs.
This fix might improve CPU performance in other games too, but unverified.
- Not a regression fix, but add a no_upload_hvv workaround for Arma Reforger to workaround weird asset loading behavior.
Version 2.14
Rolls up the usual collection of new features, performance improvements, bug fixes and the copious amount of game workarounds,
just in time for the holidays.
Features
- Implement DXGI frame statistics (exposed by DXVK DXGI).
- Implement a global frame rate limiter (see VKD3D_FRAME_RATE or DXVK_FRAME_RATE).
Also improves behavior of presentation with swap interval > 1 since we use frame limiter instead
of duplicated presents now. Also allows support for full-screen frame rate targets in DXGI which normally would imply a mode change.
- Implement support for planar video formats such as NV12.
- Implement D24 depth bias correctly now on AMD when VK_EXT_depth_bias_control is supported.
- Expose a new command interop interface that allows e.g. dxvk-nvapi to implement DLSS3 frame generation.
- Use VK_KHR_compute_shader_derivatives when available.
- Use VK_EXT_device_generated_commands when available. Expose execute indirect tier 1.1.
- Implement GPU upload heap from latest AgilitySDKs. Allows explicit control over ReBAR instead of heuristic based hacks in games that use the new API.
- Implement ID3DDestructionNotifier. Fixes some particular games that expect this to be supported.
Performance
- Reduce some VRAM bloat on RDNA2 and 3 GPUs when VK_MESA_image_alignment_control is exposed.
- Improve CPU overhead for games that query swapchain format support over and over.
- Remove old heuristic that preferred 2 frames of latency depending on BufferCount used.
The default on DXGI is 3, and using 2 caused some performance issues in various games with GPU starvation,
especially on Deck. VKD3D_SWAPCHAIN_LATENCY_FRAMES is still available as an override to force a tighter default.
- Rewrite queue submission logic to deal better with difficult submission patterns such as FSR3 3.1 Frame Generation.
On implementations with only one graphics queue, vkd3d-proton will now attempt to do basic software scheduling of GPU work.
This may regress GPU performance in some other cases and VKD3D_CONFIG=no_staggered_submit is a way to disable this code path.
One particularly big improvement is FF XVI on RADV with FSR 3 frame-gen, with almost doubled performance in some cases.
We are still awaiting a proper kernel-level fix for this problem to be fully resolved.
- Rewrite queue submission logic to use fewer "dummy" wait/signal submissions.
Works around pathological CPU overhead in amdgpu taking 20ms+ to submit work in some cases.
- Rewrite queue submission logic for sparse updates to be more efficient.
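The overrides named in this release are ordinary environment variables, so they can be set per game from a launcher script. Below is a minimal sketch in Python. Only the variable names and the no_staggered_submit flag come from these notes. The helper name build_vkd3d_env, the example values, and the value formats (plain integers, comma-separated VKD3D_CONFIG flags) are assumptions.

```python
import os


def build_vkd3d_env(base=None, frame_rate=None, latency_frames=None, config_flags=()):
    """Return a copy of the environment with vkd3d-proton overrides applied.

    Hypothetical helper: only the variable names come from the release notes.
    The value formats (integers, comma-separated flags) are assumptions.
    """
    env = dict(os.environ if base is None else base)
    if frame_rate is not None:
        env["VKD3D_FRAME_RATE"] = str(int(frame_rate))
    if latency_frames is not None:
        env["VKD3D_SWAPCHAIN_LATENCY_FRAMES"] = str(int(latency_frames))
    if config_flags:
        # Keep flags that are already set and append new ones without duplicates.
        merged = [f for f in env.get("VKD3D_CONFIG", "").split(",") if f]
        for flag in config_flags:
            if flag not in merged:
                merged.append(flag)
        env["VKD3D_CONFIG"] = ",".join(merged)
    return env


env = build_vkd3d_env(base={}, frame_rate=60, latency_frames=2,
                      config_flags=["no_staggered_submit"])
print(env)
```

The resulting dictionary can be passed as the env argument of subprocess.run when launching a game. Merging into an existing VKD3D_CONFIG rather than overwriting it keeps any flags the user has already set.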
Fixes and workarounds
- Rework various multi-sampling queries to be more spec correct.
- Workaround bugged MSAA behavior in World of Warcraft.
- Workaround buggy/questionable use of ID3D12PipelineLibrary in FF XVI.
- Always use native 16-bit integers for min16int. Fixes some real-world bugs where shaders expect min16int is always implemented as 16-bit.
- Workaround game bug leading to GPU hang in Dragon Age: Veilguard on RADV.
- Always emit proper floating-point environment modes in DXBC shaders. Fixes glitched eyes in Dragon Age: Veilguard on NV.
- Fix potential use-after-free bug for some sparse resource update cases.
- Correctly validate when application attempts to allocate a too large descriptor heap.
Fixes Stalker 2 entering into undefined behavior.
- A lot of misc fixes in dxil-spirv as usual.
- Workaround broken amdgpu zerovram behavior on 6.10+ kernels. Fixes random extreme glitchiness in Helldivers 2 on AMD.
- Workaround NV issue which led to GPU hang when loading a save file in Star Wars: Outlaws.
- Fix copying between BC <-> RGBA images in some cases.
- Add workaround for a game bug in The First Descendant which led to broken cubemap reflections in some cases.
- Workaround Skull & Bones crashing on startup on NV GPUs by disabling Reflex support.
- Workaround Hunt: Showdown missing precise qualifiers on vertex shaders, leading to glitched rendering.
- Workaround poor CPU performance in Red Dead Redemption.
Misc / Debug
- Add support for instruction_qa_checks. For deep debug, allows us to be notified when NaNs and Infs are generated in shaders.
For internal QA use.
- Add fine-grained control of QA behavior on a per-shader basis. For narrowing down issues.
- Remove a bunch of old and obsolete workarounds for NV drivers. New cutoff is 535 series.
- Bump exposed SDKVersion to 614 to match latest stable AgilitySDK.
- Add an optional code path to support DXBC via the official dxilconv library.
This code is not enabled in release builds,
and is currently only intended as a path to take advantage of QA instrumentation for DXBC shaders.
Version 2.13
Features
- Implement Shader Model 6.8 min-spec
- SV_StartInstanceLocation
- SV_StartVertexLocation
- WaveSize range
- Implement Vulkan texturing catch-up features (esoteric comparison sampling functions)
- Implement interop for OpenVR / OpenXR on Proton
- Correctly support NULL index buffers with VK_KHR_maintenance6.
- Implement VK_MESA_image_alignment_control. Reduces memory bloat on AMD cards in particular.
Fixes
- Reimplement VK_NV_low_latency2 to fix some issues with heavy stuttering caused by non-monotonic frame IDs.
Relies on a more recent dxvk-nvapi which can paper over API design issues in Reflex API.
Requires a more recent NVIDIA driver which fixes some bugs exposed in this new code.
On older NVIDIA drivers, it should run, but low-latency will not kick in as expected.
- Explicitly disable variable-rate shading when depth-stencil is written in shader.
Fixes glitched hair rendering in Hellblade 2.
- Correctly expose MSAA features for depth-stencil. Fixes Arma Reforger.
- Fix bugs in MSAA resolve implementation when dealing with custom resolve formats. Fixes Arma Reforger.
- Fix validation error in internal query resolve shader.
- Fix some bugs in wave-ops where helper lanes participated where they were not supposed to.
Fixes some WaveMatch / WaveMultiPrefix use-cases in the wild.
- Various dxil-spirv fixes to fix invalid control-flow as always.
Performance
- Tweak how we opt-in to ReBAR for UPLOAD heaps. Now, only > 8 GB cards will get it.
On 8 GB cards, we were regularly hitting the upper limits of what the GPU could hold in VRAM,
and using ReBAR would be detrimental to performance since there was risk of more important
memory being demoted to system memory. Works well together with VK_MESA_image_alignment_control
to free up significant amounts of VRAM. Performance gains from ReBAR on 8 GB were also found to be minimal
compared to the larger GPUs since we quickly exhausted the limited 512 MiB budget anyway.
- Sub-allocate small image heaps. Avoids heavy stutter in Ghost of Tsushima on desktop.
(Steam Deck code path does not seem to use small heaps to begin with).
- Improve performance with ROV when used with more complicated shader code patterns.
Workarounds
- Implement a crude workaround for depth-stencil sparse and MSAA sparse.
- Just allocates a committed resource instead. Not correct, but good enough band-aid.
- Allows SottR to run on RADV.
- Disable NV_dgcc on Halo Infinite on NV drivers.
- Workaround a missing barrier in AC: Mirage causing random corrupt geometry.
Misc
- Split vkd3d-proton shader cache up by .exe name when using a unified directory with VKD3D_SHADER_CACHE_PATH.
- Implement VK_EXT_device_address_binding_report.
Version 2.12
Features
- Implement support for NVIDIA Reflex through VK_NV_low_latency2. Thanks to NVIDIA for contributing implementation
- Implement D3D12 render pass API (tier 0)
- Implement ID3D12DeviceRemovedExtendedDataSettings stubs. Fixes some games that rely on this existing
- Implement VK_EXT_device_fault. Makes it possible to grab fault information and vendor binary if supported
- Implement VK_EXT_swapchain_maintenance1
- Allows seamless transition between V-Sync and tearing present modes without stutter
- Implemented on both Mesa and NV drivers
- Expose Shader Model 6.7 by default if VK_KHR_shader_maximal_reconvergence and VK_KHR_shader_quad_control are supported
- Add optimized descriptor copy path on Intel Arc GPUs that support VK_EXT_descriptor_buffer
- Implement fallback for compute shader derivatives on NVIDIA Pascal and older GPUs.
Allows exposing Shader Model 6.7 by default on Pascal as well (albeit with some known cases where it does not work).
The workaround is expected to work with any known use of SM 6.6 compute derivatives in the wild
Fixes
- Fix Atlas Fallen black screen due to edge case with MinLODClamp
- Correctly disable alpha-to-coverage if sample mask is exported
- Fix format feature reports for DXGI_FORMAT_UNKNOWN
- Relax root signature compatibility rules when compiling Ray Tracing pipelines.
Fixes GPU hang on NV in Warhammer: Darktide
- Fix GPU hang on NV in UE5 Lyra demo
- Explicitly validate stage IO signatures in PSO creation similar to native D3D12 runtime.
Fixes some scenarios where a game attempts to create an invalid pipeline that should have failed creation
on native D3D12
Workarounds
- Workaround crash in Resident Evil 4 RT mode when tessellation is enabled
- Workaround mesh shader glitches on NVIDIA in several UE5 titles
- Workaround GPU hang on NVIDIA in World of Warcraft when MSAA is enabled
- Disable RT by default in Persona 3 Reload on Deck
Performance
- Implement VK_NV_raw_access_chains. Significantly improves GPU performance on NV GPUs in some games.
Games using DXBC instead of DXIL are expected to see more improvements.
Not all games are expected to see an uplift
- Fix extremely poor GPU performance in some locations in Persona 3 Reload
Debug
- Add support for VKD3D_QUEUE_PROFILE, a simple system profiling method
- Includes VK_NV_low_latency2 support to debug NVIDIA Reflex sleeps
- Root signature blobs are also dumped when dumping shaders
- A simple CLI tool to inspect the root-sig blobs is included in programs/
- Misc improvements to breadcrumbs, debug ring, etc
- Pipeline creation failure now dumps PSO creation commands in log
Version 2.11.1
This release is a minor bug-fix release before the holidays.
- Implement COLOR -> STENCIL fallback copy on NVIDIA
- Implement SM 6.6 ResourceDescriptorHeap[] + UAV counters correctly on RADV
- Fix bugged implementation of DXBC resinfo instruction, affecting Avatar: Frontiers of Pandora
- Fix memory type used for DGC preprocess memory on NVIDIA (~5% performance, YMMV)
- Fix crash in Callisto Protocol when booting game with DXR support
More complete MSAA resolve implementation
- Add depth-stencil resolve
- Support typeless formats
- Add MIN/MAX resolve modes
- Implement missing code paths on NVIDIA
Workarounds
- Update workaround for GPU hang in CP77 when using DXR for patch 2.1.
- Remove workaround for NO_DGCC in Halo Infinite on NVIDIA.
- Workaround game bug in Pioneers of Pagonia causing GPU hangs on RADV.
Version 2.11
This release rolls up a bunch of features, perf improvements and bug fixes / workarounds as usual.
Features
DXR enabled by default
VKD3D_CONFIG=dxr is default now, and no longer needed.
There are some special cases where DXR is not enabled by default. The only such current example is
"Hellblade: Senua's Sacrifice" on Deck which force-enables DXR if it is supported, even on Deck.
New semantics are:
- dxr: Force-enable DXR, even when it is considered unsafe
- nodxr: Disable DXR
- dxr11: Removed. dxr already implied DXR 1.1 anyway
Sampler feedback
This feature was the last feature required for FL 12.2 and is implemented through emulation.
As demonstrated in the implementation docs, all
native implementations of this feature are fundamentally broken in some way.
There's also no known game that ships requiring this feature, so we just consider this a checkbox feature.
DX Ultimate (FL 12.2) now exposed by default
On RDNA2+ and Turing+ we can finally expose the DX Ultimate feature set!
Misc
- Implement a bunch of missing "Vulkan-on-D3D12" features
- IndependentFrontAndBackStencilRefMaskSupported
- TriangleFanSupported
- DynamicIndexBufferStripCutSupported
- DynamicDepthBiasSupported
- NonNormalizedCoordinateSamplersSupported
- MismatchingOutputDimensionsSupported
- PointSamplingAddressesNeverRoundUp
- RasterizerDesc2Supported
- Explicit line rasterization mode
- NarrowQuadrilateralLinesSupported
- AnisoFilterWithPointMipSupported
- Implement missing MSAD instruction in DXIL, allowing FSR3 to run
- Implement some esoteric DXR features
- Implement support for multiple mismatching global root signatures in DXR
- Fixes crash in Battlefield V
- Implement support for LOCAL_ON_EXTERNAL dependencies in DXR
- Fixes DXR in Warhammer: Darktide
- Implement support for ExecuteIndirect + Mesh shaders with state changes
- Currently unused by games
Performance
- Improve performance of NV_device_generated_commands and NV_device_generated_commands_compute by
reordering and batching command preprocessing
- We have observed 15% FPS gains in Halo Infinite on RADV
- 1-2% in Starfield in some test locations
- Needs pending Mesa work to land to take advantage of this improvement
- Tune memory allocation patterns for DGC preprocess buffers
- Avoids a lot of allocation churn
- Greatly reduces CPU overhead on NV
Workarounds
- Work around RADV bug causing GPU hang in RE4: Separate Ways DLC
- Work around RADV bug causing GPU hang in Lords of the Fallen
- Work around Witcher 3 bug causing broken shadows and GPU hangs when enabling DXR
- Work around Cyberpunk 2077 bug when RT is enabled, where game would cause spurious GPU hangs due to accessing descriptor heap out of bounds
- Work around Windjammers 2 bug causing random crashes on startup
- Add support for VK_EXT_image_compression_control to allow for more fine-grained workarounds for broken games running on RADV
- Enable NV_device_generated_commands_compute on latest NV beta drivers
- 545.x drivers are still disabled until a fix can be confirmed on shipping drivers
- Remove CURB_MEMORY_PSO_CACHE workaround on Mesa 23.2+
- Should reduce overhead in PSO creation
Fixes
- Misc dxil-spirv changes to fix various bugs in game shaders as usual
- Fix Jurassic World Evolution 2 crashing when enabling DXR
- Fix some deprecation warnings in Meson build system
- Some submodule locations moved, which may cause minor disruption
Version 2.10
This release rolls up a ton of bug fixes, game and driver workarounds, and other improvements.
Features
DirectStorage MetaCommands
We can now make use of NV_memory_decompression to implement
GPU accelerated GDeflate decompression in DirectStorage.
This is demonstrated to work in Ratchet & Clank: Rift Apart.
We also worked around an NV driver bug when using the fallback GDeflate shader.
The fallback works on RADV.
Enhanced Barriers
NOTE: This isn't all that well tested because there are no games shipping with this yet to our knowledge.
Device generated commands for compute
With NV_device_generated_commands_compute we can efficiently implement
Starfield's use of ExecuteIndirect which hammers multi-dispatch COMPUTE + root parameter changes.
Previously, we would rely on a very slow workaround.
NOTE: This feature is currently only enabled on RADV due to driver issues.
Misc
- Support Root Signature version 1.2
- Implement Shader Model 6.7
- Includes all SM 6.7 features like AdvancedTextureOps, WaveOpsIncludeHelperLanes
- Caveat: Technically not Vulkan spec compliant implementation, but works fine on at least NV and RADV. Currently implemented as an opt-in option for now in case some game relies on it to work
- Implement CreateSampler2
- Expose inverted viewport / height feature
- Implement RelaxedFormatCasting feature from Enhanced Barriers
- Implement support for adjacency topologies
- Support A8_UNORM format properly by using VK_KHR_maintenance5, allowing A8_UNORM UAVs to work correctly
- Handle range checked index buffers correctly with VK_KHR_maintenance5
New extension use
- VK_EXT_dynamic_rendering_unused_attachments
- VK_KHR_maintenance5
- VK_NV_device_generated_commands_compute
Performance
- Batch acceleration structure builds. Massively improves build performance on at least RADV.
- Massively improve ExecuteIndirect performance when using COMPUTE + root parameter changes when VK_NV_device_generated_commands_compute is enabled.
Fixes
- Fix root signature creation from DXIL library target (DXR) blobs
- Fix some dual source blending PSOs scenarios. Fixes Star Wars Battlefront II
- Implement wave operations in pixel shaders more strictly according to D3D12 rules
- Fix spurious hangs in Ashes of Singularity when using shared fences and wait-before-signal
- Fix PSO caching bug in mesh shaders. Fixes mesh shaders in Unreal Engine 5
- Fix udiv remainder in DXBC, which fixed some Xenia bugs
- Fix query heap tracking bug that was exposed by NV Streamline
- Various DXIL -> SPIR-V fixes as usual
- Rewrote descriptor set layouts to be more robust against application bugs
- Motivated by Armored Core VI bug (see below)
- Native D3D12 drivers are also robust against these application bugs :(
Workarounds
- Workaround bad ReBAR performance in Age of Wonders 4
- Remove workaround for KHR_present_wait on NV 535+ drivers
- Workaround Starfield memory corruption issue where it does not correctly query for 4 KiB alignment
- Disable ReBAR usage on Halo Infinite to workaround very poor CPU performance
- Workaround Street Fighter 6 bug causing spurious GPU hangs
- Also appears to have worked around GPU hangs in Resident Evil 2
- Workaround Armored Core VI bug causing GPU hang on Balteus fight in chapter 1
- Workaround "firefly" glitches in Resident Evil 4 caused by dubious min16float usage
- Workaround "firefly" glitches in Monster Hunter Rise caused by dubious shader requiring particular precise math
- Workaround Unreal Engine 5 breaking if mesh shaders are exposed, but not barycentrics
- Workaround NV driver bug with TIMESTAMP query heaps that could cause spurious GPU hangs
- Workaround broken CFG code generation in Xenia's DXBC emitter