Commit 92fb93f
rdma: Re-enable eager messages
While the eager protocol made no difference to nccl-tests
performance, it did have a sizeable impact on applications (a 30%
difference in step time for MaxText Llama2 70B on P5). So re-enable
the eager protocol, and make the early completion detection adapt
automatically to whether eager is enabled.
We need to come back to the early completion protocol and find a
way to have early completion and eager co-exist in the general
case, but that's future work.
Signed-off-by: Brian Barrett <bbarrett@amazon.com>
1 parent d809a88 commit 92fb93f
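The commit message says early completion detection now follows eager enablement automatically. A minimal sketch of that kind of resolution logic, not the code from this commit: the function name, the tri-state setting (-1 = auto, 0 = off, 1 = on), and the convention that a negative eager size means "eager disabled" are all assumptions made for illustration.

```c
#include <stdbool.h>

/*
 * Hypothetical helper (not from the commit): decide whether early
 * completion should be active.
 *
 * early_completion_setting: assumed tri-state, -1 = auto, 0 = off, 1 = on
 * eager_max_size:           assumed convention, < 0 means eager is disabled
 */
static bool resolve_early_completion(int early_completion_setting,
				     long eager_max_size)
{
	bool eager_enabled = (eager_max_size >= 0);

	if (early_completion_setting < 0) {
		/* Auto mode: use early completion only when eager is off,
		 * since the two do not yet co-exist in the general case. */
		return !eager_enabled;
	}

	/* Explicit setting: pass it through unchanged. Handling of an
	 * explicit "on" combined with eager is left out of this sketch. */
	return early_completion_setting != 0;
}
```

Under these assumptions, `resolve_early_completion(-1, 8192)` yields false (eager on, so early completion stays off) and `resolve_early_completion(-1, -1)` yields true.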
2 files changed: 12 additions, 2 deletions
[Diff, first file: line 345 modified (+1, -1)]
[Diff, second file: line 8112 replaced by new lines 8112-8122 (+11, -1)]