Update dependency accelerate to v1.12.0 #335
konflux-internal-p02[bot] wants to merge 1 commit into rhoai-3.4 from
Conversation
Signed-off-by: konflux-internal-p02 <170854209+konflux-internal-p02[bot]@users.noreply.github.com>
[APPROVALNOTIFIER] This PR is NOT APPROVED. This pull request has been approved by: konflux-internal-p02[bot]. The full list of commands accepted by this bot can be found here.

Details: needs approval from an approver in each of these files. Approvers can indicate their approval by writing
Hi @konflux-internal-p02[bot]. Thanks for your PR. I'm waiting for a red-hat-data-services member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test. Once the patch is verified, the new status will be reflected by the ok-to-test label. I understand the commands that are listed here.

Details: instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
This PR contains the following updates:
accelerate: ==1.11.0 -> ==1.12.0

Release Notes
huggingface/accelerate (accelerate)
v1.12.0: Deepspeed Ulysses/ALST
Deepspeed Ulysses/ALST integration
Deepspeed Ulysses/ALST is an efficient way of training on long sequences: it combines sequence parallelism with attention-head parallelism. You can learn more about this technology in the paper https://arxiv.org/abs/2506.13996 or the DeepSpeed tutorial https://www.deepspeed.ai/tutorials/ulysses-alst-sequence-parallelism/.
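As a rough illustration of the idea (this is not the accelerate API, just a conceptual sketch): sequence parallelism shards the sequence dimension across ranks, so each rank holds seq_len / sp_size tokens of every sample, and an all-to-all then regroups activations so each rank computes full-sequence attention for a subset of heads.

```python
# Conceptual illustration only, not the accelerate API: shard the
# sequence dimension of a batch across sequence-parallel ranks.
import torch

seq_len, sp_size = 8192, 4
hidden = torch.randn(1, seq_len, 512)   # one sample, full sequence
shards = hidden.chunk(sp_size, dim=1)   # one shard per sequence-parallel rank
assert shards[0].shape[1] == seq_len // sp_size  # 2048 tokens per rank
```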
To enable Deepspeed Ulysses, you first need to create a ParallelismConfig and set the sp-related args.
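A minimal sketch of that setup, assuming ParallelismConfig exposes sp_size and sp_backend arguments for Ulysses (the exact signature may differ; see the accelerate docs):

```python
# Sketch only: `sp_size` and `sp_backend` are assumed argument names,
# not verbatim from the release notes.
from accelerate import Accelerator
from accelerate.parallelism_config import ParallelismConfig

parallelism_config = ParallelismConfig(
    sp_size=4,             # assumed: number of sequence-parallel ranks
    sp_backend="ulysses",  # assumed: selects the Deepspeed Ulysses/ALST backend
)

# Ulysses runs on top of DeepSpeed, so this is expected to be launched
# via a DeepSpeed-enabled `accelerate launch` configuration.
accelerator = Accelerator(parallelism_config=parallelism_config)
```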
Then, you need to make sure to compute the correct loss as described in our docs:

```python
...
losses_per_rank = torch.distributed.nn.functional.all_gather(loss, group=sp_group)
good_tokens = (shift_labels != -100).view(-1).sum()
good_tokens_per_rank = torch.distributed.nn.functional.all_gather(good_tokens, group=sp_group)
total_loss = sum(
    losses_per_rank[rank] * good_tokens_per_rank[rank]
    for rank in range(sp_world_size)
    if good_tokens_per_rank[rank] > 0
)
total_good_tokens = sum(good_tokens_per_rank)
loss = total_loss / max(total_good_tokens, 1)
```

Thanks @S1ro1 for starting this work and @stas00 for finishing it. Also thanks to @kashif for adding docs and reviewing/testing this PR!
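Note the weighting in this snippet: each sequence-parallel rank holds a different shard of the sequence, so its loss is scaled by that rank's count of valid tokens (labels != -100) before averaging. A plain mean over ranks would over-weight ranks whose shard contains mostly ignored labels.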
This feature will also be available in the HF Trainer thanks to this PR from @stas00: huggingface/transformers#41832
Minor changes
cpu_ram_efficient_loading by @SunMarc in #3816

New Contributors
Full Changelog: huggingface/accelerate@v1.11.0...v1.12.0
Configuration
📅 Schedule: Branch creation - At any time (no schedule defined), Automerge - At any time (no schedule defined).
🚦 Automerge: Disabled by config. Please merge this manually once you are satisfied.
♻ Rebasing: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox.
🔕 Ignore: Close this PR and you won't be reminded about this update again.
To execute skipped test pipelines, write the comment /ok-to-test.

Documentation
Find out how to configure dependency updates in MintMaker documentation or see all available configuration options in Renovate documentation.