[Distributed] Separate prefill and decode #1162

kwen2501 · 2024-09-18T08:46:13Z

The prefill phase and the decoding phase now run separately.
The decoding phase leverages the KV cache, which improves speed.

Requires a PyTorch-side fix to work. The pin version will be changed accordingly.

Requires PyTorch PR pytorch/pytorch#136243 to land.
Otherwise, you may hit the issue described in pytorch/pytorch#136225.
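
For readers unfamiliar with the prefill/decode split, here is a minimal pure-Python sketch of the idea, not the code in this PR. `ToyDecoder`, `embed`, `attend`, and `full_recompute` are hypothetical names invented for illustration; the "model" is a single attention head over a made-up embedding. Prefill processes the whole prompt once and fills the KV cache; each decode step then handles one new token and reuses the cached keys and values instead of recomputing them.

```python
import math


def embed(token, dim=4):
    """Hypothetical deterministic embedding standing in for a real model."""
    return [math.sin(token * (d + 1)) for d in range(dim)]


def attend(query, keys, values):
    """Single-head scaled dot-product attention for one query vector."""
    scale = 1.0 / math.sqrt(len(query))
    scores = [scale * sum(q * k for q, k in zip(query, key)) for key in keys]
    peak = max(scores)  # subtract the max for a numerically stable softmax
    weights = [math.exp(s - peak) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    return [
        sum(w * v[d] for w, v in zip(weights, values)) / total
        for d in range(dim)
    ]


class ToyDecoder:
    """Toy decoder that keeps keys and values in a cache between calls."""

    def __init__(self):
        self.k_cache = []
        self.v_cache = []

    def prefill(self, prompt_tokens):
        # Prefill: compute K/V for every prompt token in one pass and
        # return the attention output at the last prompt position.
        vecs = [embed(t) for t in prompt_tokens]
        self.k_cache = list(vecs)
        self.v_cache = list(vecs)
        return attend(vecs[-1], self.k_cache, self.v_cache)

    def decode_step(self, token):
        # Decode: compute K/V for the single new token only, append them
        # to the cache, and attend over everything seen so far.
        vec = embed(token)
        self.k_cache.append(vec)
        self.v_cache.append(vec)
        return attend(vec, self.k_cache, self.v_cache)


def full_recompute(tokens):
    """Baseline without a cache: recompute K/V for all tokens every step."""
    vecs = [embed(t) for t in tokens]
    return attend(vecs[-1], vecs, vecs)


decoder = ToyDecoder()
decoder.prefill([3, 1, 4])
cached_out = decoder.decode_step(1)
baseline_out = full_recompute([3, 1, 4, 1])
```

In this toy, `cached_out` and `baseline_out` agree, but the cached path embeds only one token per decode step, while the baseline redoes the work for the whole sequence each time.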

pytorch-bot · 2024-09-18T08:46:17Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/torchchat/1162

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure, 1 Unrelated Failure

As of commit 3512101 with merge base e27e162:

NEW FAILURE - The following job has failed:

pull / test-mps-dtype / macos-job (gh)
RuntimeError: Command bash /Users/ec2-user/runner/_work/_temp/exec_script failed with exit code 1

BROKEN TRUNK - The following job failed but was also failing on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

pull / test-mps / macos-job (gh) (trunk failure)
RuntimeError: Command bash /Users/ec2-user/runner/_work/_temp/exec_script failed with exit code 1

This comment was automatically generated by Dr. CI and updates every 15 minutes.

lessw2020 · 2024-09-18T14:30:49Z

dist_run.py

-
-    # create schedule
-    schedule = ScheduleGPipe(stage, mbs)
+    # TODO: figure out how to set input_pos for each prompt in the batch then we


Removing this limitation is probably our most important next step.
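
One possible direction for lifting the same-length restriction, sketched here under stated assumptions and not taken from this PR: right-pad the prompts to a common length and track a separate write position per prompt. `pad_batch`, `decode_positions`, and `pad_id` are hypothetical names for illustration only, and how such positions would be fed to `input_pos` in `dist_run.py` is left open.

```python
def pad_batch(prompts, pad_id=0):
    """Right-pad token lists to a common length and record the true lengths."""
    max_len = max(len(p) for p in prompts)
    padded = [p + [pad_id] * (max_len - len(p)) for p in prompts]
    lengths = [len(p) for p in prompts]
    return padded, lengths


def decode_positions(prompt_lengths, step):
    """Position each prompt writes to at decode step `step` (0-based).

    Each prompt continues from its own length rather than from the
    padded batch length, so shorter prompts do not skip positions.
    """
    return [length + step for length in prompt_lengths]


padded, lengths = pad_batch([[5, 6, 7], [8]])
first_positions = decode_positions(lengths, step=0)
```

With the two prompts above, the first decode step would write at position 3 for the first prompt and position 1 for the second, instead of a single shared position for the whole batch.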

lessw2020

Great work, looks great. Left a comment, but removing the requirement that all prompts have the same length is likely the most important next step. Anyway, this PR is a big step forward, nice job!

facebook-github-bot added the CLA Signed label Sep 18, 2024

kwen2501 requested a review from lessw2020 September 18, 2024 08:46

lessw2020 reviewed Sep 18, 2024

lessw2020 approved these changes Sep 18, 2024

kwen2501 force-pushed the tp_not_sp branch from 3b896db to 611de83 Compare September 20, 2024 05:25

kwen2501 changed the base branch from tp_not_sp to main September 20, 2024 06:35

[Distributed] Separate prefill and decode

3512101

kwen2501 force-pushed the decode2 branch from b4630f2 to 3512101 Compare September 20, 2024 06:37

kwen2501 merged commit 8d01d9b into main Sep 20, 2024
49 of 51 checks passed

4 participants
