UCP/WIREUP: Allow self endpoint to skip a device lane. #10953
Closed
What?
Fix a perftest failure on device tests when `cuda_ipc` is enabled but `rc_gda` is not.
Why?
Workaround: allow self endpoints to lack a device lane (`UCP_FEATURE_DEVICE`). This can happen with `UCX_TLS=^rc_gda`, because `cuda_ipc` does not support a same-process lane.
How
`ucx_perftest` uses a self endpoint to copy the sequence number (SN) on non-host memory (for instance, in ucp_put_lat). This self endpoint shares the worker that carries the device feature request, and therefore triggers the failure. Using a separate context and worker without `UCP_FEATURE_DEVICE` would mean registering the memory on both `context` and `context_self`, which does not seem impossible.
Repro: