Commit ed6648b
UCP: Allow eager inline sends for host memory when CUDA MDs are present

When CUDA memory domains are loaded, the memory type cache becomes
non-empty after any GPU allocation. Previously, ucp_proto_is_inline()
would conservatively disable inline (am_short) sends for all buffers
when the cache was non-empty, unless the user explicitly set the
memory type to HOST. This caused a performance regression for host
memory buffers on systems with CUDA/ROCm installed.

Fix by performing a memtype cache lookup when the cache is non-empty
to positively identify whether the buffer is host memory. If the
address is not found in the cache, it is host memory and inline send
is safe to use.

Fixes #42751

Parent: 6dda7bd
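
The decision described in the commit message can be sketched with a small standalone model. This is not the UCX implementation: every type and function name below (mem_type_t, memtype_cache_t, cache_lookup, inline_allowed_old, inline_allowed_new) is a hypothetical stand-in, the cache is modeled as a flat array of GPU address ranges, and the message-size thresholds that also gate inline sends are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; these are NOT UCX types. */
typedef enum { MEM_UNKNOWN, MEM_HOST, MEM_GPU } mem_type_t;

typedef struct {
    uintptr_t start;   /* first byte of a tracked GPU allocation */
    size_t    length;  /* size of that allocation in bytes */
} gpu_region_t;

typedef struct {
    const gpu_region_t *regions; /* tracked GPU allocations */
    size_t              count;   /* 0 means the cache is empty */
} memtype_cache_t;

/* Return true if [addr, addr + length) overlaps any tracked GPU region. */
static bool cache_lookup(const memtype_cache_t *cache, const void *addr,
                         size_t length)
{
    uintptr_t begin = (uintptr_t)addr;
    uintptr_t end   = begin + length;
    size_t i;

    assert(cache != NULL);
    for (i = 0; i < cache->count; ++i) {
        uintptr_t r_begin = cache->regions[i].start;
        uintptr_t r_end   = r_begin + cache->regions[i].length;
        if ((begin < r_end) && (r_begin < end)) {
            return true;
        }
    }
    return false;
}

/* Old policy as described: a non-empty cache disables inline sends
 * unless the user explicitly declared the buffer as host memory. */
static bool inline_allowed_old(const memtype_cache_t *cache,
                               mem_type_t user_type)
{
    return (cache->count == 0) || (user_type == MEM_HOST);
}

/* New policy as described: when the cache is non-empty and the memory
 * type is unknown, look the buffer up; a miss means host memory. */
static bool inline_allowed_new(const memtype_cache_t *cache,
                               mem_type_t user_type, const void *buffer,
                               size_t length)
{
    if ((cache->count == 0) || (user_type == MEM_HOST)) {
        return true;  /* fast path: nothing tracked, or user said HOST */
    }
    if (user_type != MEM_UNKNOWN) {
        return false; /* user declared a non-host memory type */
    }
    return !cache_lookup(cache, buffer, length);
}
```

In this model, a host buffer whose address range misses every tracked GPU region is allowed to use the inline path even after GPU allocations exist, which is the case the old policy rejected. The extra lookup is paid only when the cache is non-empty and the memory type was not specified.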
3 files changed: +30 -9 lines changed

| File | Original lines shown | New lines shown | Removed (-) | Added (+) |
|---|---|---|---|---|
| 1 (path not captured) | 915-921 | 915-921 | 918 | 918 |
| 2 (path not captured) | 649-661 | 649-682 | 652, 654-658 | 652-653, 655-679 |
| 3 (path not captured) | 154-167 | 154-167 | 157, 164 | 157, 164 |