LAMMPS Documentation #96
Conversation
preview available: https://docs.tds.cscs.ch/96
Co-authored-by: Rocco Meli <[email protected]>
msimberg left a comment:
Minor formatting issue, which I think is not intentional?
The other question about launcher scripts can be discussed after merging or even offline.
docs/software/sciapps/lammps.md
Outdated
    export MPICH_GPU_SUPPORT_ENABLED=1
    ...
    numactl --cpunodebind=$NUMA_NODE --membind=$NUMA_NODE "$@"
Just out of curiosity: have you found using numactl necessary/useful to actually improve performance or is it there "just to be sure"? Not asking for a change, just asking to understand if we need to align other launcher scripts.
This launcher script is basically the same as the "single rank per gpu" case in the slurm docs, except for the memory binding. The CPU binding is already set by slurm, so setting it again with numactl should be redundant. The visible devices should be equivalent to what we set with --gpus-per-task=1 (assuming we stick with four tasks per node). If the additional --membind seems useful, we might want to recommend it in the slurm docs; conversely, if it doesn't seem to help, simply referring to the slurm docs would be simpler. First touch mostly takes care of the memory binding, except that --membind obviously has the added benefit of disallowing overallocation on a NUMA node, which may also be a good default to recommend for other applications.
What do you think?
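For reference, the wrapper under discussion can be sketched roughly as follows. This is a hypothetical reconstruction, not the exact script from the PR: in particular, deriving the NUMA node index from SLURM_LOCALID is an assumption (it holds only if tasks map 1:1 onto NUMA domains, e.g. four tasks on a four-GPU, four-domain node), and the diff does not show how NUMA_NODE is actually set.

```shell
#!/bin/bash
# Hypothetical launcher wrapper sketch (not the exact script from the PR).
# Assumption: one task per GPU and a 1:1 task-to-NUMA-domain mapping, so the
# SLURM local task ID can double as the NUMA node index.

# Enable GPU-aware MPI in Cray MPICH.
export MPICH_GPU_SUPPORT_ENABLED=1

# Assumed derivation; the real script may compute this differently.
NUMA_NODE=${SLURM_LOCALID:-0}

# Bind both the CPU and the memory allocations of the wrapped command to one
# NUMA node. --membind additionally forbids overallocating on another domain.
exec numactl --cpunodebind="$NUMA_NODE" --membind="$NUMA_NODE" "$@"
```

The script would be used as `srun ./wrapper.sh lmp -in input.lammps` (command names here are placeholders); as discussed above, the CPU binding may be redundant with what slurm already sets.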
yeah I think it's probably possible to avoid using this wrapper with sbatch/srun commands. I'll have a look into this today or tomorrow depending on availability!
Co-authored-by: Mikael Simberg <[email protected]>
Please wait on a merge - I want to first test to see if numactl can be removed.
I converted it to a draft, so that it can't accidentally be merged.
Ping @nickjbrowning? Reminder that if you need more time to check the
@msimberg I've just checked the numactl stuff and we can remove it. I recently had a ticket where something related came up, and the latest commit here reflects the current best practice w.r.t. Kokkos + GPUs.
    #SBATCH --gpus-per-node=4
    #SBATCH --gpus-per-task=1
    #SBATCH --gpu-bind=per_task:1
I think the --gpus-per-task=1 alone covers this. I've never seen the --gpus-per-node=4 + --gpu-bind=per_task:1 form before so can't say for sure if that does something different, but if the goal is to have four ranks and one GPU per task, then in my experience just having --gpus-per-task=1 is sufficient.
Suggested change: replace

    #SBATCH --gpus-per-node=4
    #SBATCH --gpus-per-task=1
    #SBATCH --gpu-bind=per_task:1

with

    #SBATCH --gpus-per-task=1
If you're unsure and would like to leave the other options there for now that's also ok by me. @RMeli?
I've also never seen this combination. If they are equivalent, I'd go for --gpus-per-task=1 for consistency in our documentation. If it does something different, maybe it is worth commenting/adding a note explaining the difference?
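If the options are indeed equivalent, the simplified batch header suggested above would look roughly like this. This is a sketch under stated assumptions: the node/task counts and the LAMMPS invocation (`lmp -in input.lammps`) are illustrative placeholders, not taken from the PR; only the `--gpus-per-task=1` line reflects the suggestion itself.

```shell
#!/bin/bash
#SBATCH --nodes=1             # assumption: single-node example
#SBATCH --ntasks-per-node=4   # assumption: one rank per GPU on a 4-GPU node
#SBATCH --gpus-per-task=1     # each task gets its own GPU; per the suggestion,
                              # this alone replaces --gpus-per-node=4 and
                              # --gpu-bind=per_task:1

# Placeholder command; the real input file and binary path will differ.
srun lmp -in input.lammps
```

With `--gpus-per-task=1`, slurm restricts each task's visible devices to its assigned GPU, which is what the extra binding flags were presumably meant to achieve.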
msimberg left a comment:
Thank you @nickjbrowning for pushing this through. I added a couple of minor comments, but not blocking in my opinion.
Co-authored-by: Mikael Simberg <[email protected]>
Hey guys, can we finally merge this?
@sekelle, @nickjbrowning was on holidays last week. I was waiting for the last conversation to be resolved before merging.
Since the documentation is now live, I'll merge this and then we can go back to refining the last few details.
Re-opening this PR. Will address the previous comments in future commits.