Open
Labels
Frontend<NV> (Frontend of the LLM workflow), bug (Something isn't working)
Description
System Info
A hang is seen in all the dashboard models when using torch-opt. It has been isolated to the first rms_norm+allreduce fusion, in all strategies: when the first fusion is skipped, the hang is gone. The hang does not happen in torch-cudagraph mode. See #9847 for more details.
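To illustrate the isolation step described above (skipping only the first fusion site and leaving the rest fused), here is a minimal sketch. It is not the project's actual fusion pass: `apply_fusions`, `skip_indices`, and the site names are hypothetical, and a real pass would operate on graph nodes rather than strings.

```python
from typing import Callable, FrozenSet, List, Sequence


def apply_fusions(
    matches: Sequence[str],
    fuse: Callable[[str], str],
    skip_indices: FrozenSet[int] = frozenset(),
) -> List[str]:
    """Apply `fuse` to every match except those whose index is in `skip_indices`.

    Hypothetical helper: stands in for a pattern-matching pass that rewrites
    each rms_norm+allreduce site into a fused op.
    """
    result = []
    for index, match in enumerate(matches):
        if index in skip_indices:
            result.append(match)  # leave this site unfused
        else:
            result.append(fuse(match))
    return result


# Hypothetical match sites, listed in graph order.
sites = [
    "layer0.rms_norm+allreduce",
    "layer1.rms_norm+allreduce",
    "layer2.rms_norm+allreduce",
]

# Skip only the first fusion site, as in the experiment that removed the hang.
fused = apply_fusions(sites, fuse=lambda s: f"fused({s})", skip_indices=frozenset({0}))
print(fused)
```

Changing `skip_indices` to other single indices would show whether the hang is tied to the first site specifically or to whichever fusion happens to run first.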
Who can help?
No response
Information
- The official example scripts
- My own modified scripts
Tasks
- An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
- My own task or dataset (give details below)
Reproduction
na
Expected behavior
na
Actual behavior
na
Additional notes
na
Before submitting a new issue...
- Make sure you already searched for relevant issues, and checked the documentation and examples for answers to frequently asked questions.
Status
Backlog