@MarijnS95 MarijnS95 commented Jun 12, 2025

gpu-allocator creates heaps on which callers can allocate ranges and create "placed" resources like textures, buffers and acceleration structures. These individual resources, or the heaps as a whole, need to be made resident on the command buffer or even globally on an entire queue.

In the previous API those heaps had to be made resident on individual command encoders with useHeap(s):. Making an entire heap resident matches a bindless design well, as opposed to making every individual resource (either placed on the heap or allocated separately) resident with useResource(s):. Worse, this API only applies MTLResourceUsageRead (excluding RenderTarget and ShaderWrite textures), which would prevent any resource on the heap from being written.

Now with MTLResidencySet multiple heaps can be made resident with one call, eliminating the performance overhead of individually "using" all heaps on every command encoder. But without tracking this inside gpu-allocator, users of our crate would still have to manually rebuild this MTLResidencySet each time they change their allocations, without knowing when gpu-allocator created or destroyed a heap.

By managing a single, up-to-date MTLResidencySet in gpu-allocator, callers can simply call .commit() on this object right before they submit command buffers referencing resources on these heaps, as long as they have the residency set attached to the queue in question or "used" on the command buffer that is being submitted. This removes all the performance overhead of repeatedly creating MTLResidencySets, which would otherwise defeat the purpose of using them over plain useHeap(s): calls.
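
The bookkeeping described above can be modeled in plain Rust. This is a minimal sketch rather than the crate's implementation: `HeapId`, `ResidencySetModel` and `AllocatorModel` are hypothetical stand-ins for Metal heaps, `MTLResidencySet` and the allocator, and no Metal bindings are involved. The only behavior it borrows from Metal is that additions and removals are staged and take effect on `commit()`.

```rust
use std::collections::BTreeSet;

/// Hypothetical stand-in for a Metal heap handle.
type HeapId = u32;

/// Toy model of a residency set: changes are staged and only applied on `commit()`.
#[derive(Default)]
struct ResidencySetModel {
    committed: BTreeSet<HeapId>,
    pending_add: BTreeSet<HeapId>,
    pending_remove: BTreeSet<HeapId>,
}

impl ResidencySetModel {
    fn add(&mut self, heap: HeapId) {
        self.pending_remove.remove(&heap);
        self.pending_add.insert(heap);
    }

    fn remove(&mut self, heap: HeapId) {
        self.pending_add.remove(&heap);
        self.pending_remove.insert(heap);
    }

    /// Applies all staged changes; the caller does this right before submitting work.
    fn commit(&mut self) {
        for heap in std::mem::take(&mut self.pending_remove) {
            self.committed.remove(&heap);
        }
        self.committed.append(&mut self.pending_add);
    }

    fn is_resident(&self, heap: HeapId) -> bool {
        self.committed.contains(&heap)
    }
}

/// Toy allocator that keeps the set in sync whenever it creates or destroys a heap.
#[derive(Default)]
struct AllocatorModel {
    next_heap: HeapId,
    residency_set: ResidencySetModel,
}

impl AllocatorModel {
    fn create_heap(&mut self) -> HeapId {
        let heap = self.next_heap;
        self.next_heap += 1;
        self.residency_set.add(heap);
        heap
    }

    fn destroy_heap(&mut self, heap: HeapId) {
        self.residency_set.remove(heap);
    }
}
```

In this model the caller never rebuilds the set; it only calls `commit()` once per submission, which is the usage pattern the description argues for.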

@MarijnS95 MarijnS95 requested a review from Athosvk June 12, 2025 18:11
@MarijnS95 MarijnS95 force-pushed the metal-residency-set branch from c90ef32 to bf93064 Compare June 12, 2025 18:25
@MarijnS95 MarijnS95 requested a review from Jasper-Bekkers June 12, 2025 20:30
@MarijnS95 MarijnS95 force-pushed the metal-residency-set branch from bf93064 to 9f5dedf Compare June 12, 2025 20:32
src/metal/mod.rs Outdated
}
// Note that `block` will be destroyed on `drop` here
if mem_block.sub_allocator.is_empty()
&& (!mem_block.sub_allocator.supports_general_allocations()
Member

This complex conditional is significantly less readable than what was there before with the multiple if statements. Please revert.

Member Author

I wanted to deduplicate the bodies since more and more code gets added to the destruction path now. But since the original expression is confusing, I've factored it out into a let binding with an explanatory comment (on each of the 3 backends), in the hope of making it clearer what's going on.

In short, we'll only destroy an empty memory block if it is either dedicated, or non-dedicated and not the last such block. That way we always have one general (i.e. suballocatable) memory block available.
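
As a rough illustration of that rule, here is a minimal standalone sketch. It is not the merged code: `BlockState`, `should_destroy` and the `active_general_blocks` parameter are hypothetical names, and it assumes that "last block" means the last remaining general block.

```rust
/// Hypothetical summary of the state the destruction check looks at.
struct BlockState {
    is_empty: bool,
    supports_general_allocations: bool,
}

/// Returns whether a block may be destroyed.
/// `active_general_blocks` counts live general blocks, including this one if it is general.
fn should_destroy(block: &BlockState, active_general_blocks: usize) -> bool {
    // A dedicated block backs exactly one allocation and cannot be suballocated.
    let is_dedicated = !block.supports_general_allocations;
    // Keep at least one general block alive so later suballocations can reuse it.
    let is_not_last_general_block = active_general_blocks > 1;
    block.is_empty && (is_dedicated || is_not_last_general_block)
}
```

Naming the two sub-conditions keeps a single destruction body while still spelling out each case, which is the trade-off discussed in this thread.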

pub allocation_sizes: AllocationSizes,
/// Whether to create a [`MTLResidencySet`] containing all live heaps, that can be retrieved via
/// [`Allocator::residency_set()`]. Only supported on MacOS 15.0+ / iOS 18.0+.
pub create_residency_set: bool,
Member

Maybe it's a bit early for now, but if we add more of these platform-specific settings I don't think they belong in AllocatorCreateDesc; maybe we need some platform-specific traits.

Member Author

AllocatorCreateDesc is already platform-specific (defined in metal/mod.rs). gpu-allocator has relatively few platform-agnostic types and definitions (just the error, debug settings, allocation reports, and the allocator algorithms themselves). Everything else is platform-specific because it integrates with platform-specific primitives (i.e. having to pass the Device of the given API).

@MarijnS95 MarijnS95 force-pushed the metal-residency-set branch 2 times, most recently from 7f79300 to 7463571 Compare June 13, 2025 09:13
@MarijnS95 MarijnS95 requested a review from Jasper-Bekkers June 13, 2025 09:15
@MarijnS95 MarijnS95 force-pushed the metal-residency-set branch from 7463571 to 2136bcf Compare June 13, 2025 09:16
@MarijnS95 MarijnS95 merged commit 1ad50b6 into main Jun 20, 2025
26 checks passed
@MarijnS95 MarijnS95 deleted the metal-residency-set branch June 20, 2025 09:29