Qualcomm AI Engine Direct - Delegated mutable buffer #6727
Conversation
CI: artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/6727. As of commit 89af1e0 with merge base 86cb5d7: 1 new CI failure, 1 active SEV. (Status comment generated by Dr. CI, updated every 15 minutes.)
Hi @cccclai, thank you very much :)
cccclai left a comment:
This looks very solid, thanks!
@cccclai has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
Hey, it seems like this is breaking the CI.
There seems to be something wrong when delegating a mutable buffer without quantization.
It seems that the delegated mutable buffer is not removed from the output.
summary:
- Support copy op with QNN Reshape
- Consume mutable buffer in QNN Delegate
- Set the same memory address for I/O of mutable buffer at runtime
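The last bullet is the key runtime trick: a delegated mutable buffer (e.g. a KV cache) appears as both an input and an output of the delegated method, and by pointing both at the same backing memory, the delegate's writes persist into the next invocation without an extra copy. A minimal conceptual sketch of that aliasing, with hypothetical names (`Runtime`, `delegate_step`) that are not the actual ExecuTorch/QNN APIs:

```python
class Runtime:
    """Toy runtime: the mutable buffer is exposed as an input AND an output
    that alias one allocation, so in-place writes survive across calls."""

    def __init__(self, buffer_len):
        self.state = bytearray(buffer_len)  # single allocation for in and out

    def io_for_mutable_buffer(self):
        view = memoryview(self.state)
        # Input view and output view cover the same address range.
        return view, view

    def invoke(self, delegate):
        inp, out = self.io_for_mutable_buffer()
        delegate(inp, out)


def delegate_step(inp, out):
    # Stand-in for the backend: increment the state in place.
    for i in range(len(out)):
        out[i] = (inp[i] + 1) % 256


rt = Runtime(4)
rt.invoke(delegate_step)
rt.invoke(delegate_step)
print(list(rt.state))  # writes persist across invocations: [2, 2, 2, 2]
```

If the input and output were separate allocations instead, the second invocation would read stale zeros and the mutation would be lost, which is why the PR binds both I/O entries of a mutable buffer to one address.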
Force-pushed from 7f236c3 to da1df61.
This PR needs a
@pytorchbot label "topic: not user facing"
Didn't find following labels among repository labels: topic: not user facing |
Force-pushed from da1df61 to 89af1e0.
It seems like the label requirement is new... I'll check how to resolve it.
Hello, is this PR still needed? Assuming yes, but we're focusing on static llama now... |
Yes, I think we can close it. Thanks. |
…le buffer issue (#11782)
Summary:
- Add a parameter to support mutable buffer delegation in QNN Backend
- Set the same memory address for I/O of mutable buffer at runtime
- Ref: #6727
- Avoid annotating the input node because mutable buffers will be folded during the convert_pt2e process.
- Deprecated use_legacy_export in executorch llama

cc @cccclai @winskuo-quic @cbilgin
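The "avoid annotating the input node" bullet reflects that during the PT2E convert step, mutable buffers get folded, so a quantization annotation attached to the original buffer placeholder would target a node that no longer exists afterwards. A heavily simplified, torch-free sketch of such a skip rule (the graph representation and the helper `annotate_for_quantization` are hypothetical, not the real QNN quantizer):

```python
# Toy graph: each node records whether it is a mutable-buffer placeholder.
nodes = [
    {"name": "x",        "op": "placeholder",   "is_mutable_buffer": False},
    {"name": "kv_cache", "op": "placeholder",   "is_mutable_buffer": True},
    {"name": "matmul",   "op": "call_function", "is_mutable_buffer": False},
]


def annotate_for_quantization(graph):
    """Return the names of nodes that would receive quant annotations."""
    annotated = []
    for node in graph:
        # Skip mutable-buffer placeholders: the convert step folds them
        # away, so an annotation there would dangle.
        if node["op"] == "placeholder" and node["is_mutable_buffer"]:
            continue
        annotated.append(node["name"])
    return annotated


print(annotate_for_quantization(nodes))  # ['x', 'matmul']
```

This only illustrates the selection rule; the actual pass in the PR operates on the exported torch.fx graph.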
summary:
Tested the PR for llama 3.2 1B instruct with seq_len=512 on SM8650, and the mainline for comparison. (Benchmark screenshots not preserved.)