Commit 8be9663
Fix verbs on new InfiniBand hardware/drivers (#2712)
Verbs started rejecting a retry_cnt of 20 because retry_cnt is only
a 3-bit field, so the maximum legal value is 7. The rejection caused a
retry using "QLOGIC" values, which also attempted to modify the mtu.
However, the mtu flag was not set, and the mtu could not be modified at
this point of setup anyway, so the value was only changed for future
nodes attempting to connect. The settings were therefore inconsistent,
resulting in "Work completion error in sendCq" Charm++ aborts
along with mlx5 driver "got completion with error" messages.
The fix is to always set retry_cnt to 7 and not try to change the mtu.

1 parent 7427330 commit 8be9663
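
The 3-bit limit described in the commit message can be checked with a little arithmetic. The following is a minimal sketch, not code from this commit: `RETRY_CNT_BITS`, `RETRY_CNT_MAX`, and `retry_cnt_is_legal` are hypothetical names introduced for illustration, and the sketch deliberately does not call libibverbs, so it compiles without InfiniBand headers.

```c
#include <assert.h>

/* Assumption: field width taken from the commit message ("a 3-bit field"). */
#define RETRY_CNT_BITS 3u

/* Largest value representable in RETRY_CNT_BITS bits: 2^3 - 1 = 7. */
#define RETRY_CNT_MAX ((1u << RETRY_CNT_BITS) - 1u)

/* Compile-time check that the arithmetic matches the commit message. */
static_assert(RETRY_CNT_MAX == 7u, "a 3-bit field tops out at 7");

/* Hypothetical helper: returns 1 if value fits in the 3-bit field, else 0. */
static int retry_cnt_is_legal(unsigned value)
{
    return value <= RETRY_CNT_MAX;
}
```

Under these assumptions, `retry_cnt_is_legal(20)` returns 0 and `retry_cnt_is_legal(7)` returns 1, which matches the behavior described above: the old value of 20 is rejected, while the new fixed value of 7 is the largest one accepted.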
1 file changed, +1 -8 lines changed

| Original file line number | Diff line number | Diff line change |
|---|---|---|
| 976 | 976 | |
| 977 | 977 | |
| 978 | 978 | |
| 979 | | - |
| 980 | 979 | |
| 981 | 980 | |
| 982 | | - |
| 983 | 981 | |
| | 982 | + |
| 984 | 983 | |
| 985 | 984 | |
| 986 | 985 | |
| | | |
| 999 | 998 | |
| 1000 | 999 | |
| 1001 | 1000 | |
| 1002 | | - |
| 1003 | | - |
| 1004 | 1001 | |
| 1005 | | - |
| 1006 | 1002 | |
| 1007 | | - |
| 1008 | | - |
| 1009 | 1003 | |
| 1010 | | - |
| 1011 | 1004 | |
| 1012 | 1005 | |
| 1013 | 1006 | |