Phi3 runner uses TextLLMRunner #11551
Conversation
CI status: as of commit 7418d2e (merge base 042eb1a), 27 new failures were reported. Test artifacts and rendered results are at hud.pytorch.org/pr/pytorch/executorch/11551. (Links to docs will display an error until the docs builds have completed.)
This PR needs a
Force-pushed f240099 to 277fd57 (compare).
@larryliu0820 do you plan to have this as part of 0.7?

OK, rebasing.
Force-pushed 277fd57 to 50a8b60 (compare).
As titled, this PR switches phi-3-mini to run via `TextLLMRunner`. The eager model comes from Hugging Face; it does not use the KV cache as a custom op, since that is only supported in the optimum-executorch repo, so performance may not be optimal.
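The generation loop that a text LLM runner performs can be illustrated with a minimal, self-contained sketch: tokenize the prompt, then autoregressively feed the growing token sequence back into the model until an end-of-sequence token or a length limit is reached. All names below (`toy_model`, `generate`) are illustrative stand-ins, not the actual `TextLLMRunner` API.

```python
# Hypothetical sketch of a text-LLM runner's decode loop.
# toy_model stands in for a real model forward pass.

def toy_model(tokens):
    # Stand-in forward pass: "predicts" the next token as
    # (last token + 1) modulo a tiny vocabulary of 10.
    return (tokens[-1] + 1) % 10

def generate(prompt_tokens, max_new_tokens, eos_token=9):
    tokens = list(prompt_tokens)
    for _ in range(max_new_tokens):
        next_tok = toy_model(tokens)   # decode step: produce one token
        tokens.append(next_tok)
        if next_tok == eos_token:      # stop on end-of-sequence
            break
    return tokens

print(generate([3, 4], 4))  # -> [3, 4, 5, 6, 7, 8]
```

In the real runner the forward pass is an exported ExecuTorch program, and a KV cache (when available as a custom op) avoids recomputing attention over the prefix at every step; as noted above, that op path is not used here.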
Force-pushed 2f55b06 to 7418d2e (compare).
Merged in #12482
As titled.