ofi - add MCA parameters to not use FI_HMEM #11981
Merged
This commit adds two MCA parameters:
mtl_ofi_disable_hmem
btl_ofi_disable_hmem
to allow disabling the use of FI_HMEM in cases where the provider advertises support for HMEM that does not actually work, and the provider does not honor the OFI libfabric FI_HMEM_DISABLE_P2P environment variable.
As of the writing of this commit, this is the situation on certain systems, owing to limitations in kernel support for registration of accelerator memory. The OFI provider on such systems still advertises support for FI_HMEM with ZE but fails when trying to register memory. These MCA parameters allow turning off the use of FI_HMEM in such cases.
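As a minimal usage sketch, the two parameters can be set through Open MPI's usual `OMPI_MCA_<name>` environment variables or with `--mca` on the `mpirun` command line. The sketch assumes both parameters are booleans where `1` disables FI_HMEM; the commit message does not state their type or default, so check `ompi_info` on your build. `./my_app` is a hypothetical application name.

```shell
# Assumption: both parameters are booleans and 1 means "do not use FI_HMEM".
export OMPI_MCA_mtl_ofi_disable_hmem=1
export OMPI_MCA_btl_ofi_disable_hmem=1

# Equivalent per-run form (./my_app is a hypothetical application):
#   mpirun --mca mtl_ofi_disable_hmem 1 --mca btl_ofi_disable_hmem 1 ./my_app
```

Setting both covers the OFI MTL and the OFI BTL, whichever component is selected at run time.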
Related to ofiwg/libfabric#9315
Signed-off-by: Howard Pritchard [email protected]
(cherry picked from commit baf882a)