Update transformer to accelerator.device with its weight_dtype #10466
Conversation
By default, the transformer always gets transferred to the device in float32, even when we specify the dtype as bfloat16 or float16. This fixes it.
| "Mixed precision training with bfloat16 is not supported on MPS. Please use fp16 (recommended) or fp32 instead." | ||
| ) | ||
|  | ||
| transformer.to(accelerator.device, dtype=weight_dtype) | 
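For reference, a minimal, self-contained sketch of the pattern the PR proposes. The dtype mapping below is an assumption that mirrors the usual weight_dtype logic in the diffusers training scripts, and the tiny Linear module only stands in for the transformer being fine-tuned:

    import torch
    from accelerate import Accelerator

    # Assumes a machine where bf16 autocast is supported.
    accelerator = Accelerator(mixed_precision="bf16")

    # Assumed mapping of the --mixed_precision setting to a torch dtype,
    # mirroring the usual weight_dtype logic in the training scripts.
    weight_dtype = torch.float32
    if accelerator.mixed_precision == "fp16":
        weight_dtype = torch.float16
    elif accelerator.mixed_precision == "bf16":
        weight_dtype = torch.bfloat16

    # A tiny module stands in for the transformer being fine-tuned.
    transformer = torch.nn.Linear(8, 8)

    # The change under discussion: move the model to the accelerator device
    # in weight_dtype instead of the default float32.
    transformer.to(accelerator.device, dtype=weight_dtype)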
We should not be doing this, since we're fine-tuning the transformer. accelerator.prepare() handles the device placement for us (plus any additional hook placements that might be required).
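For context, a minimal sketch of the prepare()-based flow this comment refers to; the model, optimizer, and dataloader here are placeholders, not the script's real objects:

    import torch
    from accelerate import Accelerator

    accelerator = Accelerator()

    # Placeholders; in the training script these are the transformer being
    # fine-tuned, its optimizer, and the real dataloader.
    transformer = torch.nn.Linear(8, 8)
    optimizer = torch.optim.AdamW(transformer.parameters(), lr=1e-4)
    dataloader = torch.utils.data.DataLoader(torch.randn(16, 8), batch_size=4)

    # prepare() moves the model to the right device and installs any hooks
    # needed for distributed / mixed-precision training, so an explicit
    # transformer.to(accelerator.device, ...) is not needed here.
    transformer, optimizer, dataloader = accelerator.prepare(
        transformer, optimizer, dataloader
    )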
With accelerator.prepare(), the transformer gets transferred to the GPU in float32 by default. That takes 40 GB of memory at once!
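To make the memory point concrete, a back-of-the-envelope calculation; the ~10B parameter count is an illustrative assumption, not the actual size of the model in question:

    # Weights only, ignoring gradients, optimizer state, and activations.
    num_params = 10e9                 # illustrative ~10B-parameter transformer
    fp32_gb = num_params * 4 / 1e9    # 4 bytes per float32 weight  -> ~40 GB
    bf16_gb = num_params * 2 / 1e9    # 2 bytes per bfloat16 weight -> ~20 GB
    print(f"fp32: ~{fp32_gb:.0f} GB, bf16: ~{bf16_gb:.0f} GB")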
Well, it will be trained in mixed precision through autocast; this is handled by accelerate. If the user wants to cast it to a lower precision, that should be done through a CLI argument, not by default, IMO.
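A sketch of the opt-in CLI approach suggested here; the flag name --cast_transformer_dtype is hypothetical, not an existing option in the script:

    import argparse
    import torch

    # Hypothetical flag illustrating an opt-in lower-precision cast.
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--cast_transformer_dtype",
        type=str,
        default="no",
        choices=["no", "fp16", "bf16"],
        help="Optionally cast the transformer weights to a lower-precision dtype.",
    )
    args = parser.parse_args(["--cast_transformer_dtype", "bf16"])  # example invocation

    dtype_map = {"fp16": torch.float16, "bf16": torch.bfloat16}
    if args.cast_transformer_dtype != "no":
        weight_dtype = dtype_map[args.cast_transformer_dtype]
        # Only cast when the user explicitly opted in; otherwise leave the
        # model in float32 and let accelerate's autocast handle mixed precision.
        # transformer.to(accelerator.device, dtype=weight_dtype)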
Okay got it. Thanks.
For context, we do have the line transformer.to(accelerator.device, dtype=weight_dtype) in the LoRA training scripts, along with the flag --upcast_before_saving (which controls whether to upcast to fp32 before saving); it is False by default, indeed because of the memory requirements.
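And a small sketch of the upcast-before-saving idea described above; transformer and upcast_before_saving are placeholders mirroring the described behaviour, not the scripts' actual code:

    import torch

    # Placeholders: a tiny module stands in for the LoRA-wrapped transformer,
    # and the boolean mirrors the --upcast_before_saving flag (False by default).
    transformer = torch.nn.Linear(8, 8).to(torch.bfloat16)
    upcast_before_saving = False

    state_dict = transformer.state_dict()
    if upcast_before_saving:
        # Opt-in: store full-precision weights at the cost of a larger checkpoint.
        state_dict = {k: v.to(torch.float32) for k, v in state_dict.items()}

    torch.save(state_dict, "transformer_lora_example.bin")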
Okay got it, thanks!
Can we close this since it seems we can resolve the issue with a CLI arg?
Yes, just closed the PR!