[Executorch][llama] Change runner to decouple prompt length from sequence length #9350
Conversation
Change runner to decouple prompt length from sequence length. Following the previous diff, we can now utilize the entire KV cache to generate more tokens than the max prompt length allowed. Differential Revision: [D69073908](https://our.internmc.facebook.com/intern/diff/D69073908/)
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/9350
Note: Links to docs will display an error until the docs builds have been completed.
❗ 1 Active SEV — there is 1 currently active SEV. If your PR is affected, please view it below.
❌ 2 New Failures — as of commit d8a3dce with merge base 6daff83, the following jobs have failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.
This pull request was exported from Phabricator. Differential Revision: D69073908
Stack from ghstack (oldest at bottom):

Following the previous diff, we can now utilize the entire KV cache to generate more tokens than the max prompt length allowed.

Differential Revision: D69073908
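The decoupling described above can be sketched as follows. This is a hypothetical illustration, not the actual ExecuTorch runner code: the names `generate`, `max_seq_len`, `prompt_tokens`, and `step` are assumptions. The point it demonstrates is that, after this change, the generation budget is bounded by the remaining KV-cache capacity (`max_seq_len - len(prompt_tokens)`) rather than by a separate cap tied to the prompt length.

```python
# Hypothetical sketch: decouple prompt length from sequence length.
# max_seq_len models the KV-cache capacity baked into the exported model.

def generate(prompt_tokens, max_seq_len, step):
    """Generate tokens until the KV cache (max_seq_len positions) is full.

    `step` stands in for one decode step: given the tokens so far,
    it returns the next token id.
    """
    if len(prompt_tokens) >= max_seq_len:
        raise ValueError("prompt already fills the KV cache")
    tokens = list(prompt_tokens)
    # The budget is the *entire* remaining cache, not a function of the
    # prompt length itself.
    budget = max_seq_len - len(prompt_tokens)
    for _ in range(budget):
        tokens.append(step(tokens))
    return tokens

# Example: an 8-slot cache and a 3-token prompt leave 5 slots for generation.
out = generate([1, 2, 3], max_seq_len=8, step=lambda ts: ts[-1] + 1)
```

Here a short prompt no longer limits how many tokens can be generated; any cache slots not consumed by the prompt are available for decoding.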