Merged
Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
👋 Hi! Thank you for contributing to llm-compressor. Please add the ready label when the PR is ready for review. Note: this is required to complete the testing suite; please only add the label once the PR is code complete and local testing has been performed.
dsikka reviewed Apr 9, 2025
kylesayrs (Collaborator, Author) commented: This is waiting on getting dedicated HF tokens to add to transformers tests.
andy-neuma reviewed Apr 16, 2025
andy-neuma (Collaborator) commented: let me know if the token works.
brian-dellabetta previously approved these changes Apr 16, 2025
This was referenced Apr 27, 2025
kylesayrs (Collaborator, Author) commented: FYI, this PR is waiting on gated access to the meta-llama/Llama-3.2-11B-Vision-Instruct and Llama4 models. EDIT: works now, thanks @andy-neuma!
kylesayrs added a commit that referenced this pull request Apr 29, 2025
## Purpose ##
* Reduce model support burden by skipping any modules which are not call graph ancestors of the sequential targets
* Rather than requiring the user to specify a list of ignored modules, only trace what is necessary to disjointly execute sequential targets
* In the future, the ignore field will be used to skip untraceable function/method names
* This does not alter functionality, because all ignored modules are already non-ancestors of sequential targets

## Changes ##
* Remove the `ignore` modules requirement (all ignored modules are already non-ancestors of sequential targets)
* Implement `get_sequential_ancestors`, which returns all ancestors of the sequential targets
* Modify the tracer to skip anything that is not a sequential ancestor or has offloaded modules
* The two sets rarely overlap; when they do, the module is skipped for safety and the user is warned

## Testing ##
* Added tests for `get_sequential_ancestors`
* #1335

## Follow ups ##
* #1390

---------

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
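The ancestor-collection idea above can be sketched generically. The following is a hypothetical illustration, not the actual llm-compressor implementation: the `Module` stand-in class and the exact signature of `get_sequential_ancestors` are assumptions made for the example.

```python
class Module:
    """Minimal stand-in for an nn.Module tree node (illustration only)."""
    def __init__(self, name, children=()):
        self.name = name
        self._children = list(children)

    def children(self):
        return self._children


def get_sequential_ancestors(model, targets):
    """Collect every module with a sequential target somewhere below it."""
    ancestors = set()

    def contains_target(module):
        found = False
        for child in module.children():
            if child in targets or contains_target(child):
                found = True
        if found:
            ancestors.add(module)
        return found

    contains_target(model)
    return ancestors


# Example: decoder layers are the sequential targets; the vision tower
# contains no targets, so it and its children are never ancestors.
layer = Module("decoder_layer")
text = Module("text_model", [layer])
vision = Module("vision_tower", [Module("vision_layer")])
model = Module("model", [text, vision])

ancestors = get_sequential_ancestors(model, {layer})
print(sorted(m.name for m in ancestors))  # ['model', 'text_model']
```

Under this scheme, the tracer only needs to trace `model` and `text_model`; the vision tower can be executed as an opaque call, which is why the `ignore` list becomes unnecessary.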
dsikka added a commit that referenced this pull request Apr 30, 2025
## Purpose ##
* When #1389 landed, modules being skipped by ignore were no longer being skipped. However, this requires that the sequential targets list be correct. Mllama defaults to targeting vision layers, and hence the vision tower was being traced, leading to errors.

```python
_no_split_modules = [
    "MllamaVisionEncoderLayer",
    "MllamaCrossAttentionDecoderLayer",
    "MllamaSelfAttentionDecoderLayer",
]
```

## Changes ##
* Only target text decoder layers, not vision decoder layers

## Testing ##
* #1335 passes

Signed-off-by: Kyle Sayers <kylesayrs@gmail.com>
Co-authored-by: Dipika Sikka <dipikasikka1@gmail.com>
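The fix described above amounts to selecting only the text decoder layers as sequential targets rather than inheriting every entry from `_no_split_modules`. A hedged sketch of that selection (the filtering rule is an assumption for illustration; the class names come from the PR description above):

```python
# Default no-split modules for Mllama, as quoted in the PR description.
_no_split_modules = [
    "MllamaVisionEncoderLayer",
    "MllamaCrossAttentionDecoderLayer",
    "MllamaSelfAttentionDecoderLayer",
]

# Hypothetical filter: targeting a vision encoder layer would pull the
# vision tower into the trace, so keep only the text decoder layers.
sequential_targets = [
    name for name in _no_split_modules if "Vision" not in name
]
print(sequential_targets)
# ['MllamaCrossAttentionDecoderLayer', 'MllamaSelfAttentionDecoderLayer']
```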
## Purpose ##

## Prerequisites ##
* `skip_weights_download` for developers and testing #1334

## Changes ##
* `trust_remote_code` argument to debugger
* `tests/llmcompressor/transformers/tracing/models.py`
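A `trust_remote_code` argument on a tracing debugger CLI could be wired up as follows. This is a hypothetical sketch: the option name mirrors the change listed above, but the parser layout, the `model_id` positional, and the flag spelling are assumptions, not the actual debugger interface.

```python
import argparse

# Hypothetical CLI sketch; only the trust_remote_code concept comes from
# the PR description, everything else here is assumed for illustration.
parser = argparse.ArgumentParser(description="Trace a model for debugging")
parser.add_argument("model_id", help="HF model stub to trace")
parser.add_argument(
    "--trust-remote-code",
    action="store_true",
    help="Pass trust_remote_code=True when loading the model",
)

args = parser.parse_args(
    ["meta-llama/Llama-3.2-11B-Vision-Instruct", "--trust-remote-code"]
)
print(args.model_id, args.trust_remote_code)  # ... True
```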