QNN Llama Runner implement IRunner #13171

rohansjoshi · 2025-08-07T01:31:03Z

Summary:
This PR makes the runner for Qualcomm Llama models implement the IRunner interface.

With this change, static Llama models can run with the QNN backend inside the LlamaDemo Android app.

The default eval mode is switched to hybrid everywhere.
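
For readers unfamiliar with the pattern, here is a minimal sketch of why implementing a common runner interface lets an app host a backend-specific runner. Everything in it is hypothetical: `IRunnerSketch`, `EchoRunnerSketch`, and `collect_tokens` are illustrative names, and the method set is an assumption, not the actual IRunner declaration in ExecuTorch. Check the real header for the true signatures.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for a runner interface. The real IRunner
// has its own method names and signatures; this only shows the shape.
class IRunnerSketch {
 public:
  virtual ~IRunnerSketch() = default;
  virtual bool is_loaded() const = 0;
  virtual bool load() = 0;
  // Streams generated pieces of text to on_token; returns false on error.
  virtual bool generate(
      const std::string& prompt,
      int max_new_tokens,
      const std::function<void(const std::string&)>& on_token) = 0;
  virtual void stop() = 0;
};

// Hypothetical backend-specific runner. A real one would load a model and a
// tokenizer in load() and run inference in generate(); this one echoes the
// prompt word by word so the example stays self-contained.
class EchoRunnerSketch : public IRunnerSketch {
 public:
  bool is_loaded() const override { return loaded_; }

  bool load() override {
    loaded_ = true;
    return true;
  }

  bool generate(
      const std::string& prompt,
      int max_new_tokens,
      const std::function<void(const std::string&)>& on_token) override {
    assert(max_new_tokens >= 0);
    if (!loaded_ && !load()) {
      return false;
    }
    stopped_ = false;
    int emitted = 0;
    std::size_t pos = 0;
    while (emitted < max_new_tokens && !stopped_ && pos < prompt.size()) {
      std::size_t end = prompt.find(' ', pos);
      if (end == std::string::npos) {
        end = prompt.size();
      }
      if (end > pos) {  // skip empty pieces produced by repeated spaces
        on_token(prompt.substr(pos, end - pos));
        ++emitted;
      }
      pos = end + 1;
    }
    return true;
  }

  void stop() override { stopped_ = true; }

 private:
  bool loaded_ = false;
  bool stopped_ = false;
};

// App-side code depends only on the interface, so any runner that implements
// it can be plugged in without changing this function.
std::vector<std::string> collect_tokens(
    IRunnerSketch& runner, const std::string& prompt, int max_new_tokens) {
  std::vector<std::string> out;
  runner.generate(prompt, max_new_tokens, [&out](const std::string& piece) {
    out.push_back(piece);
  });
  return out;
}
```

In this sketch, `collect_tokens` plays the role of the demo app: it never names the concrete runner type, which is the property the interface is meant to provide.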

Differential Revision: D79759817

pytorch-bot · 2025-08-07T01:31:06Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/13171

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

❌ 2 New Failures, 1 Pending

As of commit 484a9f0 with merge base 6a875f9:

NEW FAILURES - The following jobs have failed:

Build documentation / build (buck2) / Build doc (gh)
At least one of the pre-conditions you specified did not hold
pull / android / run-emulator (gh)
The process '/usr/bin/sh' failed with exit code 255

This comment was automatically generated by Dr. CI and updates every 15 minutes.

facebook-github-bot · 2025-08-07T01:31:13Z

This pull request was exported from Phabricator. Differential Revision: D79759817

github-actions · 2025-08-07T01:31:47Z

This PR needs a `release notes:` label

If your change should be included in the release notes (i.e. would users of this library care about this change?), please use a label starting with release notes:. This helps us keep track and include your important work in the next release notes.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "release notes: none"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

shewu-quic

LGTM. Thank you for the efforts. It is really awesome 👍

cccclai

Looks good to me, also stamp on behalf of qcom team

facebook-github-bot · 2025-08-20T04:19:31Z

@rohansjoshi has imported this pull request. If you are a Meta employee, you can view this in D79759817.

mergennachin · 2025-08-20T14:05:06Z

examples/qualcomm/oss_scripts/llama/runner/runner.cpp

 // A llama 3.2 runner that includes preprocessing and post processing
 // logic. The module takes in a string as input and emits a string as output.

+#include <executorch/examples/models/llama/runner/runner.h>


Why do you need this particular runner?

rohansjoshi · 2025-08-20

I'm only using the function `load_llama_tokenizer` from executorch/examples/models/llama/runner/runner.h, not the runner defined there. I'm trying to reuse code from examples/models.
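
The reuse described in this reply can be sketched as follows: a header that declares both a runner class and a free helper function can be included just for the helper. All names below (`shared_sketch`, `load_tokenizer_sketch`, `GenericRunnerSketch`, `BackendRunnerSketch`) are hypothetical, and the helper's signature is an assumption; it is not the signature of `load_llama_tokenizer`.

```cpp
#include <memory>
#include <string>
#include <utility>

// Stand-in for a shared header that declares a runner class and a free
// helper side by side. Names and signatures are hypothetical.
namespace shared_sketch {

struct TokenizerSketch {
  std::string source_path;
};

// Hypothetical tokenizer-loading helper; returns nullptr on failure.
inline std::unique_ptr<TokenizerSketch> load_tokenizer_sketch(
    const std::string& path) {
  if (path.empty()) {
    return nullptr;
  }
  auto tokenizer = std::make_unique<TokenizerSketch>();
  tokenizer->source_path = path;
  return tokenizer;
}

// Declared in the same header, but never used by the backend runner below.
class GenericRunnerSketch {
 public:
  bool load() { return true; }
};

}  // namespace shared_sketch

namespace backend_sketch {

// Backend-specific runner that reuses only the free helper.
class BackendRunnerSketch {
 public:
  explicit BackendRunnerSketch(std::string tokenizer_path)
      : tokenizer_path_(std::move(tokenizer_path)) {}

  bool load() {
    tokenizer_ = shared_sketch::load_tokenizer_sketch(tokenizer_path_);
    return tokenizer_ != nullptr;
  }

  const shared_sketch::TokenizerSketch* tokenizer() const {
    return tokenizer_.get();
  }

 private:
  std::string tokenizer_path_;
  std::unique_ptr<shared_sketch::TokenizerSketch> tokenizer_;
};

}  // namespace backend_sketch
```

One design consequence worth noting: including a header for a single function still pulls in that header's other declarations and dependencies, which is likely what prompted the reviewer's question.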

rohansjoshi requested review from cccclai, kirklandsign, larryliu0820 and shoumikhin as code owners August 7, 2025 01:31

meta-cla bot added the CLA Signed label Aug 7, 2025

facebook-github-bot added the fb-exported label Aug 7, 2025

cccclai requested review from DannyYuyang-quic, haowhsu-quic, shewu-quic and winskuo-quic August 7, 2025 17:12

rohansjoshi force-pushed the export-D79759817 branch from ed495fc to 0fc31e2 Compare August 7, 2025 21:35

shewu-quic approved these changes Aug 8, 2025

cccclai approved these changes Aug 8, 2025

rohansjoshi force-pushed the export-D79759817 branch from 0fc31e2 to 07fce7f Compare August 8, 2025 18:00

rohansjoshi requested review from jackzhxng and lucylq as code owners August 8, 2025 18:00

rohansjoshi force-pushed the export-D79759817 branch from 07fce7f to 991dd98 Compare August 8, 2025 20:42

rohansjoshi force-pushed the export-D79759817 branch from 991dd98 to ce2893c Compare August 9, 2025 17:09

rohansjoshi force-pushed the export-D79759817 branch from ce2893c to e0cfcbd Compare August 9, 2025 17:16

rohansjoshi force-pushed the export-D79759817 branch from e0cfcbd to 533247c Compare August 12, 2025 04:04

rohansjoshi force-pushed the export-D79759817 branch from 533247c to ebb9c3a Compare August 12, 2025 20:34

rohansjoshi force-pushed the export-D79759817 branch from ebb9c3a to 938b552 Compare August 12, 2025 20:58

rohansjoshi force-pushed the export-D79759817 branch from 938b552 to 5b377a2 Compare August 19, 2025 18:33

rohansjoshi added 3 commits August 19, 2025 21:17

Rebased on top of recent changes

ea5d4d6

Reviewed By: cccclai

Fixed lint issues

484a9f0

rohansjoshi force-pushed the export-D79759817 branch from 5cabbd1 to 484a9f0 Compare August 20, 2025 04:18

rohansjoshi merged commit 53146a4 into pytorch:main Aug 20, 2025
103 of 106 checks passed

mergennachin reviewed Aug 20, 2025
