Contributor

@kylesayrs kylesayrs commented Jun 20, 2025

Purpose

  • Enable a utility for handling offloading of modules that are not leaf modules. For example, to attach parameters to an attention module for attention quantization, we need to know the offload device of the attention module, which is not a leaf module.

Changes

  • Generalize get_offloaded_device to support nested modules
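
To illustrate the idea of inferring an offload device for a non-leaf module, here is a minimal, dependency-free sketch. It is not the implementation in this PR: `ToyModule` and `infer_offload_device` are hypothetical stand-ins for `torch.nn.Module` and `get_offloaded_device`, and the traversal order and fallback rule are assumptions made for illustration only.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterator, Optional


@dataclass
class ToyModule:
    """Hypothetical stand-in for a torch.nn.Module (not a real API)."""

    # Device on which this module's own parameters execute
    execution_device: str = "cpu"
    # Device this module's own parameters are offloaded to, or None if not offloaded
    offload_device: Optional[str] = None
    # Whether this module directly owns parameters (False for pure containers)
    has_own_params: bool = False
    children: Dict[str, "ToyModule"] = field(default_factory=dict)

    def modules(self) -> Iterator["ToyModule"]:
        """Yield this module first, then every descendant depth-first."""
        yield self
        for child in self.children.values():
            yield from child.modules()


def infer_offload_device(module: ToyModule) -> str:
    """Assumed rule: return the offload device of the first submodule that
    owns offloaded parameters; otherwise fall back to the execution device
    of the first submodule that owns any parameters."""
    fallback = None
    for submodule in module.modules():
        if not submodule.has_own_params:
            continue  # containers such as an attention block own no parameters
        if submodule.offload_device is not None:
            return submodule.offload_device
        if fallback is None:
            fallback = submodule.execution_device
    if fallback is None:
        raise ValueError("module has no parameters to infer a device from")
    return fallback


# A non-leaf "attention" module whose leaf projections are offloaded to CPU
q_proj = ToyModule(execution_device="cuda:0", offload_device="cpu", has_own_params=True)
k_proj = ToyModule(execution_device="cuda:0", offload_device="cpu", has_own_params=True)
attention = ToyModule(children={"q_proj": q_proj, "k_proj": k_proj})

print(infer_offload_device(attention))
```

The point of the sketch is that a container module with no parameters of its own can still report a device by inspecting its descendants, which is the case the attention example in the Purpose section describes.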

Testing

  • Added tests for nested modules. Previous tests pass, and previous behavior is preserved.

Signed-off-by: Kyle Sayers <[email protected]>
@kylesayrs kylesayrs changed the title from "[Accelerate] Support inference of offload device for models" to "[Accelerate] Support get_offloaded_device for models" Jun 20, 2025
@kylesayrs kylesayrs marked this pull request as ready for review July 31, 2025 14:52