@haowhsu-quic
Collaborator

Summary

  • Rename folders in backends/qualcomm/runtime/backends
  • Add GPU backend infrastructure

Test plan

python backends/qualcomm/tests/test_qnn_delegate.py TestQNNFloatingPointOperator.test_qnn_backend_conv2d -b build-android/ -m SM8750 -s 5f396958 --online_prepare --backend gpu

@pytorch-bot

pytorch-bot bot commented Jul 2, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/12165

Note: Links to docs will display an error until the docs builds have been completed.

❌ 3 New Failures, 1 Cancelled Job

As of commit f21b2b8 with merge base 929ec94 (image):

NEW FAILURES - The following jobs have failed:

CANCELLED JOB - The following job was cancelled. Please retry:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Jul 2, 2025
@haowhsu-quic
Collaborator Author

@pytorchbot label "release notes: qualcomm"

@pytorch-bot pytorch-bot bot added the release notes: qualcomm Changes to the Qualcomm backend delegate label Jul 2, 2025
#include <executorch/backends/qualcomm/runtime/backends/htpbackend/HtpContext.h>
#include <executorch/backends/qualcomm/runtime/backends/htpbackend/HtpDevice.h>
#include <executorch/backends/qualcomm/runtime/backends/htpbackend/HtpGraph.h>
#include <executorch/backends/qualcomm/runtime/backends/gpu/GpuBackend.h>
Contributor

I'm slightly worried about the runtime size increase, since a small runtime is usually a requirement for production. Do we know how much the size increases with this PR? If I have a model that runs on HTP only, can the runtime include only HTP?

Collaborator Author

libqnn_executorch_backend.so grows from 630984 to 652672 bytes. We'll deprecate a few files in the next PR, which should hopefully reduce that number further.
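
For scale, the two byte counts quoted above can be turned into an absolute and relative increase. This is only a minimal sketch: the helper name `size_delta` is hypothetical, and the only inputs taken from this thread are the before and after sizes of libqnn_executorch_backend.so.

```python
def size_delta(before: int, after: int) -> tuple[int, float]:
    """Return the absolute growth in bytes and the relative growth in percent."""
    delta = after - before
    return delta, 100.0 * delta / before


# Byte counts reported for libqnn_executorch_backend.so in this PR.
delta, pct = size_delta(630984, 652672)
print(f"{delta} bytes ({delta / 1024:.1f} KiB), +{pct:.2f}%")
# prints: 21688 bytes (21.2 KiB), +3.44%
```

So the GPU infrastructure adds roughly 21 KiB to the shared library, before any of the planned deprecations.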

Contributor

Which files will be deprecated in the next PR?

Collaborator Author

I think it will be aot/ir and runtime/backend/CustomProtocol*. We have now switched to the QNN IR backend (DLC) for the online-prepare path, so qcir and the legacy code for multi-method compilation can be fully deprecated.
However, that would break backward compatibility, since we used to wrap the preprocess result with the custom protocol. We'll probably let you decide when the right time to apply the change is.

Collaborator Author

Hi, I was wrong about the impact of deprecating those files. We still need to keep the custom protocol implementation for the multi-graph path to work.
The change is now in #12583 and will guarantee BC.

@cccclai
Contributor

cccclai commented Jul 11, 2025

Sorry I need to spend a bit more time on this, because we don't have CI to test the pllm model and I'm worried it will cause breakage

@haowhsu-quic
Collaborator Author

Sorry I need to spend a bit more time on this, because we don't have CI to test the pllm model and I'm worried it will cause breakage

No worries, I think GA decoder models are far more important than this. This PR is mainly a proof of concept showing that we can extend the capabilities of the QNN backend.

@cccclai
Contributor

cccclai commented Jul 18, 2025

Can we prioritize adding stories.pte to CI to prevent BC breakage? Otherwise it's hard to catch failures.

@github-actions

Looks like this PR hasn't been updated in a while so we're going to go ahead and mark this as Stale.
Feel free to remove the Stale label if you feel this was a mistake.
If you are unable to remove the Stale label please contact a maintainer in order to do so.
If you want the bot to never mark this PR stale again, add the no-stale label.
Stale pull requests will automatically be closed after 30 days of inactivity.

@github-actions github-actions bot added the stale PRs inactive for over 60 days label Sep 16, 2025
@cccclai
Contributor

cccclai commented Sep 16, 2025

We should be good to continue with this PR. What do you think?
