@kimishpatel kimishpatel commented Feb 11, 2025

Stack from ghstack (oldest at bottom):

Summary:
Previous PR #7927 decoupled max_seq_length from the KV cache. That broke
the perf CI workflow. Fix that.

Test Plan:
Trigger it manually and check
apple perf: https://github.com/pytorch/executorch/actions/runs/13267110949
android perf: https://github.com/pytorch/executorch/actions/runs/13267110908

Reviewers:

Subscribers:

Tasks:

Tags:

cc @guangy10 @huydhn @kirklandsign @shoumikhin

pytorch-bot bot commented Feb 11, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/8374

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure

As of commit 553d875 with merge base 78752a0:

NEW FAILURE - The following job has failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot facebook-github-bot added the CLA Signed label Feb 11, 2025
kimishpatel added a commit that referenced this pull request Feb 11, 2025
ghstack-source-id: 3a09b1a
Pull Request resolved: #8374
@kimishpatel kimishpatel temporarily deployed to upload-benchmark-results February 11, 2025 16:48 — with GitHub Actions Inactive
@kimishpatel kimishpatel temporarily deployed to upload-benchmark-results February 11, 2025 17:13 — with GitHub Actions Inactive
@kimishpatel kimishpatel requested a review from guangy10 February 11, 2025 22:49
@kimishpatel kimishpatel added the release notes: misc and module: benchmark labels Feb 11, 2025
@guangy10
Contributor

The linked job in the PR summary doesn't run with the SpinQuant and QLora. You need to trigger the job using the model id on Hugging Face:

@mergennachin mergennachin self-requested a review February 11, 2025 23:11

kimishpatel commented Feb 12, 2025

> doesn't run with the SpinQuant and QLora

let me do this in follow up PR

Actually let me just do it here

@kimishpatel
Contributor Author

> The linked job in the PR summary doesn't run with the SpinQuant and QLora. You need to trigger the job using the model id on Hugging Face:

> need to trigger the job using the model id on Hugging Face:

What does this mean? Is there a description of how to trigger this? I followed the steps here: https://github.com/pytorch/executorch/tree/main/extension/benchmark

kimishpatel added a commit that referenced this pull request Feb 12, 2025
ghstack-source-id: cd637af
Pull Request resolved: #8374
@kimishpatel
Contributor Author

> doesn't run with the SpinQuant and QLora

> let me do this in follow up PR

> Actually let me just do it here

This is updated, but I think I am going to have to do one more round of scrubbing in subsequent PRs for the various incarnations of Llama.

guangy10 commented Feb 12, 2025

> The linked job in the PR summary doesn't run with the SpinQuant and QLora. You need to trigger the job using the model id on Hugging Face:

> need to trigger the job using the model id on Hugging Face:

> What does this mean? Is there a description of how to trigger this? I followed the steps here: https://github.com/pytorch/executorch/tree/main/extension/benchmark

You need to specify the models you want to benchmark against explicitly, separated by ",". In this case, they are "meta-llama/Llama-3.2-1B-Instruct-SpinQuant_INT4_EO8,meta-llama/Llama-3.2-1B-Instruct-QLORA_INT4_EO8". See the screenshot for an example.

Updated the screenshot. You need to run against your branch, not main.

Screenshot 2025-02-11 at 7 35 08 PM
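
A rough sketch of the manual trigger described above, for anyone who prefers the command line to the Actions UI. It assumes the GitHub CLI (`gh`) is installed and authenticated. The workflow file name `android-perf.yml` and the `models` input name are assumptions that this thread does not confirm, so check the workflow's `workflow_dispatch` inputs first. `build_dispatch_command` is a hypothetical helper, not part of the repository.

```python
import shlex


def build_dispatch_command(workflow: str, branch: str, models: list[str]) -> list[str]:
    """Build a `gh workflow run` invocation for a manual benchmark run."""
    # The models are passed as one comma-separated string, so strip
    # whitespace, drop empty entries, and join.
    cleaned = [m.strip() for m in models if m.strip()]
    if not cleaned:
        raise ValueError("at least one model id is required")
    return [
        "gh", "workflow", "run", workflow,
        "--ref", branch,                      # run against the PR branch, not main
        "-f", f"models={','.join(cleaned)}",  # input name `models` is an assumption
    ]


command = build_dispatch_command(
    workflow="android-perf.yml",       # assumed workflow file name
    branch="gh/kimishpatel/158/head",  # head branch of this PR
    models=[
        "meta-llama/Llama-3.2-1B-Instruct-SpinQuant_INT4_EO8",
        "meta-llama/Llama-3.2-1B-Instruct-QLORA_INT4_EO8",
    ],
)

# Print the command for review instead of executing it.
print(shlex.join(command))
```

Printing the command rather than running it keeps the sketch free of side effects; the printed line can be pasted into a shell once the workflow and input names are verified.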

@guangy10 guangy10 left a comment

The changes look good to me

@kimishpatel
Contributor Author

> The linked job in the PR summary doesn't run with the SpinQuant and QLora. You need to trigger the job using the model id on Hugging Face:

> need to trigger the job using the model id on Hugging Face:

> What does this mean? Is there a description of how to trigger this? I followed the steps here: https://github.com/pytorch/executorch/tree/main/extension/benchmark

> You need to specify the models you want to benchmark against explicitly, separated by ",". In this case, they are "meta-llama/Llama-3.2-1B-Instruct-SpinQuant_INT4_EO8,meta-llama/Llama-3.2-1B-Instruct-QLORA_INT4_EO8". See the screenshot for an example.

> Screenshot 2025-02-11 at 7 30 20 PM

Oh, you are right. I forgot about that step.

@kimishpatel kimishpatel temporarily deployed to upload-benchmark-results February 12, 2025 04:16 — with GitHub Actions Inactive
@kimishpatel kimishpatel temporarily deployed to upload-benchmark-results February 12, 2025 04:46 — with GitHub Actions Inactive
@kimishpatel kimishpatel changed the base branch from gh/kimishpatel/158/base to main February 12, 2025 15:02
@kimishpatel kimishpatel merged commit e137c22 into main Feb 12, 2025
70 of 71 checks passed
@kimishpatel kimishpatel deleted the gh/kimishpatel/158/head branch February 12, 2025 15:03