workaround for cxil_map write error #161
preview available: https://docs.tds.cscs.ch/161
#### `"cxil_map: write error"` when doing inter-node GPU-aware MPI communication

The following environment variable can be set to disable gdrcopy:
How about:
This error message is sometimes triggered by applications that use GPU Direct MPI calls and hit a bug in gdrcopy (a low-level library used to copy buffers to and from GPU memory).
Setting the following option will completely disable gdrcopy.
Note that this has a performance impact for small message sizes, so it should only be enabled on a case-by-case basis.
You could also mention that it has been used for ICON.
Co-authored-by: Mikael Simberg <[email protected]>
…into cxil_map-write-error
No description provided.