Qualcomm AI Engine Direct - Delegated mutable buffer #6727

shewu-quic · 2024-11-08T07:20:19Z

Summary:

- Support copy op with QNN Reshape
- Consume mutable buffer in QNN Delegate
- Set the same memory address for I/O of mutable buffer at runtime
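To illustrate the third point, here is a minimal sketch in plain Python of why giving a mutable buffer the same memory for its input and output can help. It is not the QNN or ExecuTorch implementation: `run_step` and `decode` are hypothetical names, and `memoryview` over an `array` merely stands in for a runtime tensor. When input and output alias, the update happens in place and nothing needs to be copied between steps.

```python
from array import array


def run_step(buf_in: memoryview, buf_out: memoryview, pos: int, value: float) -> int:
    """Hypothetical stand-in for one delegate execution on a mutable buffer.

    Returns the number of elements that had to be copied from input to output.
    """
    copied = 0
    if buf_in.obj is not buf_out.obj:
        # Input and output live in different memory: carry the old state over.
        buf_out[:] = buf_in
        copied = len(buf_in)
    # The actual mutation (for example, writing one cache slot).
    buf_out[pos] = value
    return copied


def decode(steps: int, shared: bool) -> tuple[list[float], int]:
    """Run `steps` updates; return the final buffer and total elements copied."""
    size = 4
    current = memoryview(array("f", [0.0] * size))
    # With shared=True the output is the very same memory as the input.
    spare = current if shared else memoryview(array("f", [0.0] * size))
    total_copied = 0
    for pos in range(steps):
        total_copied += run_step(current, spare, pos, float(pos + 1))
        if not shared:
            # The freshly written output becomes the next step's input.
            current, spare = spare, current
    return current.tolist(), total_copied
```

Both modes produce the same buffer contents; only the amount of copying differs, which is the cost that aliasing the I/O address is meant to avoid in this simplified model.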

Test the PR for llama 3.2 1B instruct with seq_len=512 on SM8650

Test the mainline

pytorch-bot · 2024-11-08T07:20:22Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/6727

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Active SEVs

There are 1 currently active SEVs. If your PR is affected, please view them below:

[DomainsOnly] Jobs fail with GLIBC version not found

❌ 1 New Failure

As of commit 89af1e0 with merge base 86cb5d7:

NEW FAILURE - The following job has failed:

Check Labels / Check labels (gh)
# This PR needs a release notes: label

This comment was automatically generated by Dr. CI and updates every 15 minutes.

shewu-quic · 2024-11-08T07:31:47Z

Hi @cccclai,
This PR delegates the mutable buffer and maintains it in the QNN backend.
I also added a condition to choose whether or not to consume the mutable buffer.
Please have a look.

Thank you very much :)

cccclai

This looks very solid, thanks!

facebook-github-bot · 2024-11-11T19:05:08Z

@cccclai has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

cccclai · 2024-11-11T22:09:49Z

Hey, this seems to break the CI job test-llama-runner-qnn-linux. Can you take a look?

shewu-quic · 2024-11-12T08:32:47Z

> Hey, this seems to break the CI job test-llama-runner-qnn-linux. Can you take a look?

Something seems to go wrong when the mutable buffer is delegated without quantization.
I am trying to find the root cause. If it is not easy to solve, I will disable the delegated mutable buffer in fp mode.

shewu-quic · 2024-11-12T09:35:33Z

It seems that the delegated mutable buffer is not removed from the output.
When I traced back, I found that the mutable buffer doesn't exist in `original_program.state_dict`, so it isn't added to `output_specs_to_delete`. Do you have any idea why?
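The failure mode described above can be sketched with a toy model. This is only an assumption about the shape of the logic, not the actual ExecuTorch code: `OutputSpec`, `specs_to_delete`, and the string kinds are hypothetical simplifications. The point is that if deletion is keyed on membership in the state dict, a buffer missing from it stays among the outputs.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class OutputSpec:
    """Hypothetical, simplified output spec of an exported program."""
    name: str
    kind: str  # "USER_OUTPUT" or "BUFFER_MUTATION" in this toy model
    target: Optional[str] = None  # buffer name for BUFFER_MUTATION outputs


def specs_to_delete(output_specs: List[OutputSpec],
                    state_dict: Dict[str, object]) -> List[OutputSpec]:
    """Select buffer-mutation outputs whose target buffer is in state_dict."""
    return [
        spec for spec in output_specs
        if spec.kind == "BUFFER_MUTATION" and spec.target in state_dict
    ]


specs = [
    OutputSpec("copy_out", "BUFFER_MUTATION", target="k_cache"),
    OutputSpec("logits", "USER_OUTPUT"),
]

# Buffer present in the state dict: its mutation output is scheduled for removal.
found = specs_to_delete(specs, {"k_cache": object()})

# Buffer absent from the state dict: nothing is selected, so the mutation
# output would remain in the program's outputs.
missing = specs_to_delete(specs, {})
```

In this model, `found` contains the `copy_out` spec while `missing` is empty, which mirrors the symptom reported in the comment.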

github-actions · 2024-11-19T03:23:26Z

This PR needs a `release notes:` label

If your changes are user facing and intended to be a part of release notes, please use a label starting with release notes:.

If not, please add the topic: not user facing label.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "topic: not user facing"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

shewu-quic · 2024-11-19T03:24:58Z

@pytorchbot label "topic: not user facing"

pytorch-bot · 2024-11-19T03:25:01Z

Didn't find following labels among repository labels: topic: not user facing

cccclai · 2024-11-19T03:57:19Z

It seems like the label thing is new...will check how to resolve it

cccclai · 2025-02-07T00:15:59Z

Hello, is this PR still needed? Assuming yes, but we're focusing on static llama now...

shewu-quic · 2025-02-07T06:14:51Z

> Hello, is this PR still needed? Assuming yes, but we're focusing on static llama now...

Yes, I think we can close it. Thanks.

@cccclai

…le buffer issue (#11782) Summary: - Add a parameter to support mutable buffer delegation in QNN Backend - Set the same memory address for I/O of mutable buffer at runtime - Ref: #6727 - Avoid annotating the input node because mutable buffers will be folded during the convert_pt2e process. - Deprecated use_legacy_export in executorch llama cc @cccclai @winskuo-quic @cbilgin

facebook-github-bot added the CLA Signed label Nov 8, 2024

shewu-quic mentioned this pull request Nov 8, 2024

Qualcomm AI Engine Direct - Optimize QNN embedding op for llama #6725

Closed

cccclai approved these changes Nov 11, 2024

Qualcomm AI Engine Direct - Delegated mutable buffer

2043e22

summary: - Support copy op with QNN Reshape - Consume mutable buffer in QNN Delegate - Set the same memory address for I/O of mutable buffer at runtime

shewu-quic force-pushed the dev1/hutton/delegated_mutable_buffer branch from 7f236c3 to da1df61 on November 19, 2024 03:23

workround for unexpected result in fp flow

89af1e0

shewu-quic force-pushed the dev1/hutton/delegated_mutable_buffer branch from da1df61 to 89af1e0 on November 19, 2024 03:26

shewu-quic closed this Feb 7, 2025

shewu-quic mentioned this pull request Jun 18, 2025

Qualcomm AI Engine Direct - Delegate mutable buffer and fix the mutable buffer issue #11782

Merged
