Open
Description
I’m excited about the recent introduction of Domino and its impressive TP optimization.
While using DeepSpeed Domino to better overlap communication and computation in tensor parallelism (TP), I noticed that Domino uses forward_backward_no_pipelining() in schedules.py. Does that mean I can't use Domino (the TP optimization) together with pipeline parallelism (PP)?