Commit 8839b04
Fix qpIndex selection in ncclIbIrecv for AINIC mode in net_ib_rocm

In AINIC mode, comm->base.qpIndex is intentionally not updated inside
the ncclIbIrecv recv-posting loop — it is deferred to ncclIbPostFifo
so that CTS messages are sent on the correct QPs. However, the loop
body was still using comm->base.qpIndex for QP selection, causing all
iterations to post receives on the same QP instead of distributing
them across all physical NICs in the merged device.

Introduce curQpIndex that reads from the local qpIndex variable (which
does advance each iteration) in AINIC mode, and from comm->base.qpIndex
in the standard path. This ensures round-robin QP selection works
correctly with both AINIC and non-AINIC configurations.

1 parent c0f9dd1 commit 8839b04
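
The selection logic described in the commit message can be sketched roughly as follows. This is a simplified model, not the actual RCCL code: `CommBase`, `selectQps`, and the `fixed` flag are hypothetical names introduced for illustration. It also assumes that the local `qpIndex` starts from `comm->base.qpIndex` and that the standard path writes the advanced index back inside the loop.

```cpp
#include <vector>

// Hypothetical, simplified stand-in for the comm->base fields involved.
struct CommBase {
  int nqps;     // number of QPs across the merged device
  int qpIndex;  // shared round-robin cursor (comm->base.qpIndex)
};

// Returns the QP index chosen for each of nPosts receive posts.
// fixed == false models the old behavior (always read base.qpIndex);
// fixed == true models the curQpIndex selection from this commit.
std::vector<int> selectQps(CommBase& base, int nPosts, bool ainicMode, bool fixed) {
  std::vector<int> chosen;
  int qpIndex = base.qpIndex;  // local cursor, advances every iteration (assumed init)
  for (int i = 0; i < nPosts; ++i) {
    // AINIC mode reads the local cursor; the standard path reads the shared one.
    int curQpIndex = (ainicMode && fixed) ? qpIndex : base.qpIndex;
    chosen.push_back(curQpIndex);
    qpIndex = (qpIndex + 1) % base.nqps;
    // Assumption: only the standard path updates the shared cursor here;
    // in AINIC mode the update is deferred (to ncclIbPostFifo in the real code).
    if (!ainicMode) base.qpIndex = qpIndex;
  }
  return chosen;
}
```

With four QPs and a cursor starting at 0, the old AINIC behavior in this model picks QP 0 for every post, while the fixed selection walks 0, 1, 2, 3 and leaves the shared cursor untouched for the later deferred update.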
2 files changed (+10 -8 lines)

File tree
- projects/rccl
- ext-src
- src/transport

First changed file (+7 -6, diff line contents not preserved):
- original line 668 replaced by new line 668
- original line 700 replaced by new lines 700-701
- original line 721 replaced by new line 722
- original line 765 replaced by new line 766
- original line 774 replaced by new line 775
- original line 795 replaced by new line 796

Second changed file (+3 -2, diff line contents not preserved):
- original line 2822 replaced by new lines 2822-2823
- original line 2829 replaced by new line 2830