[Feature]: FP8 for DS-R1 #8234

@nzmora-nvidia

Description

🚀 The feature, motivation and pitch

Alternatives

No response

Additional context

No response

Before submitting a new issue...

  • Make sure you already searched for relevant issues, and checked the documentation and examples for answers to frequently asked questions.

Metadata

Assignees

No one assigned

    Labels

    AutoDeploy: <NV> AutoDeploy Backend

    Low Precision: Lower-precision formats (INT8/INT4/FP8) for TRTLLM quantization (AWQ, GPTQ).

    Projects

    Status

    Backlog

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests
