Need an example for FSDP + FP16 training #44169

@quic-meetkuma

In my setup, I am trying to run FSDP with FP16 precision. Is there any limitation that prevents using FSDP with FP16? How can I convert my existing code to FSDP with FP16 precision? I believe the ShardedGradScaler from FSDP should be used. How does its implementation differ from the normal GradScaler? It would be great if someone could share a concise example.
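To make the request concrete, here is a minimal sketch of the pattern in question, not a confirmed recipe. It assumes PyTorch's native `FullyShardedDataParallel` with a `MixedPrecision` policy, that `torch.distributed.init_process_group(...)` has already been called, and that each rank has set its CUDA device. The helper names `fp16_policy`, `wrap_model`, and `train_step` are hypothetical. As far as I understand, the main practical difference from the regular `GradScaler` is that `ShardedGradScaler` also synchronizes the inf/NaN check across ranks, since each rank only holds a shard of the gradients.

```python
import torch
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP
from torch.distributed.fsdp import MixedPrecision
from torch.distributed.fsdp.sharded_grad_scaler import ShardedGradScaler


def fp16_policy() -> MixedPrecision:
    # Sharded parameters stay in FP32; FSDP casts them to FP16 for
    # forward/backward compute and reduces gradients in FP16.
    return MixedPrecision(
        param_dtype=torch.float16,
        reduce_dtype=torch.float16,
        buffer_dtype=torch.float16,
    )


def wrap_model(model: torch.nn.Module) -> FSDP:
    # Assumption: the process group is initialized and the CUDA device
    # for this rank is already selected.
    return FSDP(
        model,
        mixed_precision=fp16_policy(),
        device_id=torch.cuda.current_device(),
    )


def train_step(model, optimizer, scaler, batch, loss_fn, max_grad_norm=1.0):
    # Assumption: `model` is FSDP-wrapped and returns a single tensor.
    inputs, targets = batch
    optimizer.zero_grad(set_to_none=True)
    # Compute the loss in FP32 to reduce the risk of overflow.
    loss = loss_fn(model(inputs).float(), targets)
    # Scale the loss so small FP16 gradients do not underflow.
    scaler.scale(loss).backward()
    # Unscale before clipping so the threshold applies to true gradients.
    scaler.unscale_(optimizer)
    model.clip_grad_norm_(max_grad_norm)
    # step() skips the update if any rank saw inf/NaN gradients.
    scaler.step(optimizer)
    scaler.update()
    return loss.detach()
```

In a training loop this would be used roughly as `scaler = ShardedGradScaler()`, `model = wrap_model(model)`, then `train_step(model, optimizer, scaler, batch, loss_fn)` per batch, with the optimizer created after wrapping so it sees the sharded parameters.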
